Meet Moshi, a new GPT-4o challenger with a surprisingly competitive speech mode – Technology News

French artificial intelligence developer Kyutai has reportedly unveiled Moshi, a real-time voice AI assistant, in the wake of OpenAI’s GPT-4o.

Kyutai Labs appears to be taking a new approach to AI chatbots following the many criticisms that OpenAI’s ChatGPT has received. Moshi is being touted as a rival to OpenAI’s GPT-4o. Initial reports suggest that many users were unhappy with GPT-4o’s “voice mode”. Against this backdrop, Kyutai appears to be offering a better “voice mode” with its Moshi AI chatbot.

Get to know “Moshi”

Moshi is reportedly named after “moshi moshi”, the Japanese phrase for answering a phone call. The French company is confident in its capabilities: it claims that Moshi’s voice mode is better than OpenAI’s highly anticipated GPT-4o Advanced Voice Mode.

According to Kyutai, Moshi can speak with different accents. In addition, Moshi is said to have about 70 different emotional and speaking styles. The AI can even process two audio streams simultaneously, allowing it to listen and speak at the same time. This means that you can expect a more “human” conversation from the AI chatbot.

Main features of “Moshi”

So what are the main features of Moshi? Below are the main features you can expect from the Moshi AI chatbot.

  • The Moshi AI chatbot can interpret the user’s tone of voice. In addition, it can add a layer of emotional intelligence to interactions.
  • Similar to other AI assistants, Moshi can be interrupted in the middle of its answer, mimicking the natural flow of conversation.
  • The company claims that, with a response time of just 200 milliseconds, Moshi responds faster than GPT-4o, whose reported latency ranges from 232 to 320 milliseconds.
  • The AI chatbot can also operate without an internet connection. This should help improve privacy and accessibility.
  • Moshi is designed to speak with different accents. The AI chatbot can also imitate 70 different emotional and speaking styles.

Further updates

Kyutai reportedly said that, during development, Moshi was fine-tuned on over 100,000 synthetic dialogues generated using text-to-speech (TTS) technology. Kyutai also plans to teach Moshi the nuances and tones of human communication. The company has also reportedly worked with a professional voice actor to improve Moshi’s voice quality.

Moshi is designed to enable lifelike voice conversations with users, much like Alexa or Google Assistant. Unlike those assistants, Moshi is powered by Kyutai’s Helium 7B model. In a demonstration video, the Kyutai team interacted with Moshi to show how it could serve as a trainer or companion. The demonstration also showcased its creativity by having it embody characters in role-playing games.

Moshi: The open source answer to the GPT-4 language model?

In addition, the company plans to develop an AI-powered audio identification, watermarking, and signature tracking system for future integration with Moshi.
