
Are you waiting for ChatGPT’s flirty voice mode? Your wait just got longer

Calvin Wankhede / Android Authority

In brief

  • ChatGPT’s new enhanced voice mode has been delayed by at least a month.
  • OpenAI is currently working on improving the safety and reliability of the model.
  • The feature will soon be available to select users as a limited alpha version, with the full release planned for late 2024.

Last month, I wrote that one of GPT-4o’s key features wouldn’t see the light of day for several weeks. The feature in question was an advanced voice conversation mode built into the ChatGPT smartphone app, with capabilities far beyond those of any personal assistant we’ve seen to date. But today, OpenAI announced that the feature won’t be available for at least another month.

In a recent tweet, OpenAI explained that it originally planned to roll out the feature to select users in late June, but the company has decided it needs another month to focus on safety. Specifically, in OpenAI’s own words, the company is “improving the model’s ability to detect and refuse certain content.”

OpenAI also cited infrastructure-related challenges as a reason for the delay. This is not surprising, as ChatGPT has suffered numerous outages in the last month alone. Even before this, I personally noticed stuttering and artifacts when using the regular voice conversation mode. GPT-4o is likely more computationally demanding, especially since OpenAI promises that it can respond to audio input in as little as 232 milliseconds.

Although OpenAI said it won’t open up access to the new voice mode until next month, a few users have reportedly already received an in-app invitation to test the feature. The page describes “Advanced Voice” as a new feature in “limited alpha.” Accepting the invitation doesn’t actually unlock the new voice mode, however, so the prompt may have appeared sooner than intended.

OpenAI’s tweet, meanwhile, suggests that alpha access will be made available to a small group of users next month, with general availability planned for fall, but the company warns that the release schedule will depend on meeting its internal safety and reliability standards.

What can ChatGPT’s advanced voice mode do?

We got our first look at GPT-4o’s new voice mode at OpenAI’s Spring Update event in early May. In the weeks that followed, the company released a series of demos in which ChatGPT not only facilitated quick, back-and-forth discussions, but was also able to modulate its voice to mimic sarcasm, laughter, and more. OpenAI has also claimed that the model will be able to detect emotions in the user’s voice and respond accordingly – a first for any chatbot.

Some sample videos also combined GPT-4o’s voice and visual capabilities, allowing the chatbot to answer questions about real-world situations. In one such demo, Sal Khan, founder of Khan Academy, showed how the feature can be used as a teaching tool for on-screen math problems.

According to OpenAI’s tweet, the new video and screen sharing features will be introduced separately from the voice mode. However, all of these advanced features are locked behind the company’s paid ChatGPT Plus subscription. Previously, the $20-per-month subscription only unlocked text-based access to the GPT-4o model, as well as additional features like custom GPTs.

Do you have a tip? Talk to us! Email our team at [email protected]. You can remain anonymous or get credit for the information — the choice is yours.
