Articles by FavTutor

6 New Things ChatGPT Can Do with Upgraded Voice & Vision

by Dhruv Kudalkar
May 14, 2024

OpenAI just announced real-time conversational chat for ChatGPT, which can now understand both audio and video. With these advancements, it can tell how you are feeling from your facial expressions and adjust its tone of voice to match your emotional state. Let’s look at the new features added to ChatGPT!

6 New Audiovisual Advancements to ChatGPT

In their official Spring Update streamed on X, OpenAI’s employees demonstrated the new voice and vision capabilities of ChatGPT. Let’s take a look at a few examples!

1) Provide Real-Time Advice

ChatGPT can give users advice in real time and help them prepare for different situations. It guides you through each step as if it were having a conversation with you.

Mark Chen, a research lead at OpenAI, let ChatGPT know that he was doing a live demo and was feeling nervous. He asked it how he could calm his nerves. It told him to take a deep breath.

Live demo of GPT-4o realtime conversational speech pic.twitter.com/FON78LxAPL

— OpenAI (@OpenAI) May 13, 2024

Mark then took a deep breath in a hard and haphazard manner. ChatGPT recognized this and told him that he is not a vacuum cleaner, which showed that it can notice when a user is doing something wrong and correct them.

It then explained how to take a deep breath properly and asked him if he felt better. Mark followed the steps as instructed, felt a lot better, and thanked ChatGPT.

This example shows how ChatGPT can help users in difficult situations by providing them with advice about how to approach a situation better. It also guides users on how to perform certain tasks step-by-step.

2) Understand Emotions

ChatGPT is now capable of understanding emotions as well. Mark let it know that his fellow research lead Barret Zoph was having a hard time sleeping and asked it to tell him a bedtime story about robots and love.

Live demo of GPT-4o voice variation pic.twitter.com/b7lLJkhBt1

— OpenAI (@OpenAI) May 13, 2024

ChatGPT first narrated the story in a flat, less expressive manner, so they asked it to use more emotion and drama.

Barret then asked it to push the emotion to the maximum, well beyond what it had been doing, and it narrated the story with far more expression. CTO Mira Murati then asked it to narrate the story in a robotic tone, which it did successfully. It was also able to sing the story.

This shows ChatGPT’s new ability to change its tone based on the situation. For example, it can use a childish tone when speaking to a child or a more serious tone when narrating a news article.

3) Prompt with Live Videos

You can now also interact with ChatGPT using live video. Barret asked ChatGPT to help him with a linear equation that he wrote down on paper. He asked it only for hints and not the final solution.

Live demo of GPT-4o vision pic.twitter.com/m7iyixdTLY

— OpenAI (@OpenAI) May 13, 2024

He wrote down the equation and asked ChatGPT what it was, and ChatGPT read it correctly. This highlighted the upgraded vision capabilities, as it was able to grasp what was written from the real-time video provided by the user.

It then started guiding Barret through the steps. Barret also acted confused in order to test its mathematical skills, but it still guided him correctly.

Mark then told ChatGPT that he was weak at linear equations and asked whether they are used in the real world. It gave him some real-world scenarios where linear equations are used. They were really happy with the accurate responses.

This showed that it could follow instructions and help the user solve the mathematical problem, explain where such problems arise in the real world, and stay on track when the user tried to test it.

Thus, ChatGPT’s new vision capability will be extremely useful to users when they want to chat with it using real-time videos.

4) Real-Time Help for Developers

ChatGPT can help developers with coding problems in real time through its desktop app. It can hear the user, but it cannot see any code unless the code is highlighted. Once code is highlighted, it is sent to ChatGPT. Barret shared some code with ChatGPT and asked for a description of it.

Live demo of coding assistance and desktop app pic.twitter.com/GlSPDLJYsZ

— OpenAI (@OpenAI) May 13, 2024

It then provided an accurate and concise description of the code, along with appropriate answers to follow-up questions.

ChatGPT can also see real-time data on the desktop using its vision capabilities. Once the user clicks on the desktop button, it can see what is on the screen. Barret presented it with a plot and it accurately described the plot in a simple manner.

This shows how ChatGPT can now help users with their coding problems in real time. Users can share their code just by highlighting it, or share their screen with the click of a button. This should improve the coding experience and help users resolve their queries more quickly.

5) A Very Good Translator

ChatGPT can now be used as a translator with its real-time translation capabilities. Mark and Mira had a conversation in Italian and English and they asked ChatGPT to translate English to Italian and vice versa.

Live audience request for GPT-4o realtime translation pic.twitter.com/VSj5phFKM6

— OpenAI (@OpenAI) May 13, 2024

It excelled at this task and translated in both directions without trouble.

6) Detect How You Are Feeling

ChatGPT can also understand the sentiments of a user by looking at their face. Barret tried it out with a happy look and ChatGPT correctly guessed that Barret was happy and in a good mood.

Live audience request for GPT-4o vision capabilities pic.twitter.com/FPRXpZ2I9N

— OpenAI (@OpenAI) May 13, 2024

This shows that it can identify how a user is feeling from their expression. ChatGPT could use this to help improve a user’s mood by offering helpful advice.

OpenAI also released a few more use cases of these new capabilities in their official blog. These use cases include singing, interview preparation, math, teaching, games, real-time translations, jokes, customer service, and general knowledge. All these capabilities will be a part of OpenAI’s newly announced GPT-4o model.

Conclusion

The new voice and vision capabilities of ChatGPT aim to provide a personalized user experience like never before! The GPT-4o model will be available to all users, including those on the free plan, who will be able to use these features through the new model.

Dhruv Kudalkar

Hello, I'm Dhruv Kudalkar, a final year undergraduate student pursuing a degree in Information Technology. My research interests revolve around Generative AI and Natural Language Processing (NLP). I constantly explore new technologies and strive to stay up-to-date in these fields, driven by a passion for innovation and a desire to contribute to the ever-evolving landscape of intelligent systems.

About FavTutor

FavTutor is a trusted online tutoring service that connects students with expert tutors to provide guidance on Computer Science subjects like Java, Python, C, C++, SQL, Data Science, Statistics, etc.

Website listed on Ecomswap. © Copyright 2025 All Rights Reserved.
