OpenAI announced earlier this week that most users will have to wait until the fall to get access to GPT-4o’s Advanced Voice feature, but a lucky few have apparently already gotten a sneak peek of what’s possible with the next-generation voice assistant.
Reddit user RozziTheCreator was one of the lucky few, sharing a recording of a new GPT-4o voice we’ve never heard before telling a horror story, complete with sound effects tied to the story, like thunder and footsteps.
It appears that this was a mistake. OpenAI told me in a statement that some users had inadvertently gained access to the model, but that this has now been corrected.
What can we hear in the leaked video?
Every clip of GPT-4o’s enhanced voice feature we’ve seen so far has come from OpenAI itself, and while it sounded fantastic, it was limited to bespoke demo scenarios.
RozziTheCreator’s new video seems to showcase this ability in a more natural way, including a sound effect feature we’ve never heard before.
Imagine there’s this small town where everyone knows everyone, like something out of a movie, and at the end of the street there’s this little house.
GPT-4o
I messaged RozziTheCreator about the experience, and they said, “It just suddenly appeared; it looked the same, the only difference was the voice.” The discovery happened late at night, when RozziTheCreator asked the chatbot a question: “Boom, I discovered the change.”
It only lasted a few minutes and was, according to RozziTheCreator, “full of errors,” so there wasn’t time to explore it in depth, but they did manage to capture a snippet of this remarkable story.
“It started going crazy, repeating itself and responding to things I hadn’t said,” RozziTheCreator said, before the chatbot reverted to the standard voice everyone else already knows.
In the video, GPT-4o can be heard telling the story in a relaxed manner, accompanied by sound effects. It says: “Imagine there’s this small town where everyone knows everyone, like something out of a movie, and at the end of the street there’s this little house.”
It continues with the story of two teenagers who explore the house during a storm with “nothing but a flashlight and their phones for light.”
So what went wrong with the rollout?
OpenAI is slowly rolling out a whole host of new features. The first Plus users were supposed to get GPT-4o’s enhanced voice feature this month, but it was delayed by safety issues and concerns about whether the infrastructure was in place to handle demand.
I asked OpenAI what happened to give RozziTheCreator access, and a spokesperson told me, “While we were testing the feature, we accidentally sent invitations to a small number of ChatGPT users. This was a bug and we have fixed it.”
They confirmed that the first Plus users will get access next month, but for most people it will be a while yet. The initial rollout is to “gather feedback and plan expansion based on what we learn.”
So, no GPT-4o voice yet, but this is the latest in a series of glimpses of GPT-4o seemingly straining against its limitations, eager to show its full capabilities. I’ve seen examples myself where it was directly analyzing audio files one minute and running them through code the next.
This has made me even more excited about the full feature set, and even more annoyed about the delay, understandable as it may be.