
OpenAI delays release of new ChatGPT voice mode



It was probably the centerpiece of OpenAI’s big Spring Update event, at least alongside the new GPT-4o base model.

But the new voice mode the AI company demoed for ChatGPT, which enables a more natural, conversational, dare I say "human" experience with the AI chatbot, complete with emotional modulation and the ability to handle interruptions, is officially being pushed back by at least a month from its planned late-June rollout to a "small group" of ChatGPT Plus users.

This means we can’t expect Voice Mode until late July or early August at the earliest, and even then it will only be available to a “small group of users.” The reason? OpenAI says it needs even more time to ensure the new Voice Mode can “detect and reject certain content.”

As OpenAI posted today on its X account:


We’re sharing an update on the advanced Voice Mode we demoed during our Spring Update, which we continue to be very excited about:

We had planned to roll this out in alpha to a small group of ChatGPT Plus users in late June, but we still need a month to reach our launch threshold. For example, we are improving the model’s ability to detect and reject certain content. We are also working to improve the user experience and prepare our infrastructure to scale to millions of users while still ensuring real-time responses.

As part of our iterative deployment strategy, we will launch the alpha with a small group of users to gather feedback and expand based on what we learn. We plan for all Plus users to have access in the fall. Exact timelines will depend on meeting our high safety and reliability standards. We’re also working on rolling out the new video and screen sharing features we demoed separately and will keep you updated on that timeline.

ChatGPT’s advanced Voice Mode can understand and respond to emotions and non-verbal cues, bringing us closer to real-time conversations powered by artificial intelligence. Our mission is to bring you these new experiences thoughtfully.

The news comes as OpenAI tries to fend off the rise of competitors like Anthropic, which was founded by former OpenAI team members. While Anthropic doesn’t yet have a speech output or input mode, it just released a new base model, Claude 3.5 Sonnet, which appears to match or even outperform OpenAI’s GPT-4o model in various third-party benchmarks.

OpenAI also faces lawsuits and a growing tide of critics outside and inside the company concerned about the safety measures (or lack thereof) surrounding its pursuit of artificial general intelligence (AGI), or AI that outperforms humans at most economically valuable work. A group of OpenAI employees has also criticized management for what they saw as overly restrictive separation agreements and equity clawback provisions, most of which OpenAI has since rescinded.

Other critics outside the company include actress Scarlett Johansson, who attacked OpenAI after the Spring Update event over a demo of an AI voice named “Sky” that she said sounded like her, after she had declined to let the company use her likeness. OpenAI and its executives have since disputed her claim, citing evidence that the company recruited Sky’s voice actress independently of its outreach to Johansson. Nevertheless, the company has disabled that AI voice.

But despite the growing barrage of criticism and backlash, the company continues to impress and win new converts, from music video and commercial producers using its yet-to-be-released AI video model Sora to healthcare startup Color, which is integrating GPT-4o into a cancer screening app for doctors. The company is also signing up numerous new corporate accounts.