OpenAI has confirmed that ChatGPT’s advanced voice features won’t roll out until later this year, but it has continued to share glimpses of what to expect. The latest demo showcases GPT-4o’s impressive language capabilities by teaching users Portuguese.
GPT-4o was unveiled at the OpenAI spring update earlier this year, along with its impressive advanced speech features. OpenAI also revealed vision and screen-sharing features that we now know won’t arrive until much later in the year, or possibly even early next year.
One of the big selling points of the original demo was GPT-4o’s ability to act as a live translation device. In some of the new demos, though, we see that it can also be an incredible language teacher, something I’ve experienced myself to a lesser extent with the current voice model.
In a new OpenAI video, a native English speaker who wants to learn Portuguese and a Spanish speaker with a basic knowledge of the language use ChatGPT to improve their skills. At various points they ask ChatGPT to speak more slowly or to explain terms, and it does so perfectly.
Learn languages with GPT-4o
What makes the new GPT-4o voice feature so exciting is that it is natively speech-to-speech. Unlike previous voice modes, which had to transcribe your speech to text, generate a text reply, and then convert that reply back to audio, this model simply understands what you are saying directly.
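The difference between the two approaches can be sketched roughly as follows. This is a conceptual illustration only; the function names are stand-ins, not OpenAI’s actual API, and the stubs simply make the contrast runnable.

```python
# Conceptual contrast: cascaded voice pipeline vs. native speech-to-speech.
# All functions below are simplified, hypothetical stand-ins.

def cascaded_voice_turn(audio_in: bytes) -> bytes:
    """Older approach: three separate stages. Tone, accent and pacing
    are lost at the transcription step, so the model never hears them."""
    text = speech_to_text(audio_in)    # transcription drops prosody
    reply = language_model(text)       # model reasons over text only
    return text_to_speech(reply)       # response re-synthesized as audio

def native_voice_turn(audio_in: bytes) -> bytes:
    """GPT-4o-style approach: one model maps audio directly to audio,
    so pronunciation and accent information survives end to end."""
    return speech_model(audio_in)

# --- minimal stubs so the sketch runs ---
def speech_to_text(audio: bytes) -> str:
    return "hello"

def language_model(text: str) -> str:
    return f"you said: {text}"

def text_to_speech(text: str) -> bytes:
    return text.encode()

def speech_model(audio: bytes) -> bytes:
    return b"audio reply informed by the speaker's actual pronunciation"
```

The point of the sketch is that in the cascaded version, everything after the first line works only with the transcript, which is why features like accent feedback were impossible before.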
The ability to understand speech and audio natively enables some exciting features, including working across multiple languages, adopting different accents, and changing the speed and liveliness of its voice, essentially making it the perfect teacher.
Thanks to these native audio capabilities, it can listen to you and analyze how you pronounce certain words, and even your accent. It can then give direct feedback based on what it actually heard, rather than evaluating a transcript.
In addition, GPT-4o has impressive reasoning and problem-solving skills, so it can spot your mistakes even when they are less obvious.
What else have we seen from GPT-4o?
There have been several demos of the new enhanced voice capabilities, including some that weren’t intended for release. One shows the model creating sound effects while telling a story; another shows it using several different voices.
In the official videos OpenAI shared on YouTube, we saw it being used as a math tutor. In that video it runs on an iPad with a split screen, so the AI can see the problem and offer advice and information on every step.
The enhanced speech mode, and especially the ability to understand speech natively, looks like one of the most significant advances in artificial intelligence since OpenAI put a chat interface on top of its GPT-3.5 model to launch ChatGPT in November 2022.