When voice assistants miss their target: The need for inclusive design in voice technology

This ad is interesting for two reasons. First, it shows how important voice technology is becoming for businesses. Voice technology has completely changed the way we interact with our devices and the digital world. It is estimated that there will be over 4.2 billion digital voice assistants in use worldwide by 2023, and this number is expected to rise to 8.4 billion by 2024 (Statista). As a result of this boom, brands are increasingly relying on accurate voice recognition to ensure discoverability and engagement.

Second, it is neither the users’ fault nor Hyundai’s that its dealers are difficult to find via voice search. As you might expect, users often have trouble with unusual or foreign brand names like Hyundai. Even here in Boston, MA, when I spontaneously asked colleagues how they pronounce Hyundai, their answers varied. In addition to mispronunciations, regional accents, dialects and even individual speech patterns can contribute to inaccuracies in voice recognition. It’s the responsibility of the companies developing voice technology to account for these variations.

The accuracy of voice search is an important factor

Voice search is becoming more integrated into the user experience. The Voicebot.ai US Smart Speaker Consumer Adoption Report 2022 found a significant increase in the variety and frequency of commands used with voice assistants, suggesting that users are increasingly relying on voice technology for a wider range of tasks. At the same time, the report also highlights that users often experience frustration with voice assistants, especially when it comes to understanding complex commands or accents.

This is confirmed by Applause’s own research. The 2023 Voice and AI Technologies Survey found that 30% of users are either somewhat or extremely dissatisfied with voice assistants. When asked about their general attitude towards the technology, the most common response was, “I would use voice assistants more if they were more responsive to the way I express myself.” A better understanding of voice variations and responses was among the top three suggestions for how voice assistants could be improved, along with more relevant and user-specific responses.

Good and bad strategies for accuracy and inclusivity

I’ve seen examples of companies offering workarounds for mispronunciations that don’t actually solve the real problem. While text input or visual menus give users who struggle with voice commands another way to search, there’s no guarantee that users won’t just give up if voice search fails the first time. The same goes for pronunciation guides; users realistically won’t read these in busy situations, and they’re just another potential source of frustration.

There’s simply no shortcut to combating mispronunciations in speech technology other than tackling the problem directly in the model. Brands can’t change their names, but developers can change how well their models account for variations in pronunciation.

Speech recognition solutions for brands

Speech recognition technology can be improved through several approaches:

  • Diverse training data: By training voice models on a wider range of accents, dialects and pronunciation variations, we can teach them to better understand the diversity of human language.
  • User feedback loops: Continuous user feedback makes it possible to identify and correct recurring recognition errors, so the system improves over time.
  • Thorough testing: Testing with different user groups can uncover potential pronunciation problems before they impact the user experience.
  • Regular updates: Language models should be updated regularly to incorporate new data and improve accuracy.
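
To make the testing point concrete, here is a minimal Python sketch of how recognition accuracy could be compared across user groups. It assumes you already have pairs of what each tester said and what the recognizer returned. The function names, group labels and transcripts are all hypothetical, and averaging a per-utterance word error rate is a simplification rather than a standard evaluation protocol.

```python
from collections import defaultdict


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        return 0.0 if not hyp else 1.0
    # Dynamic-programming edit distance, keeping one row at a time.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # reference word missing from the output
                curr[j - 1] + 1,     # extra word in the output
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


def wer_by_group(samples):
    """Average the per-utterance error rate for each tester group.

    samples: iterable of (group, reference, hypothesis) tuples.
    """
    rates = defaultdict(list)
    for group, reference, hypothesis in samples:
        rates[group].append(word_error_rate(reference, hypothesis))
    return {group: sum(values) / len(values) for group, values in rates.items()}


# Hypothetical test results; groups and transcripts are invented for illustration.
samples = [
    ("group_a", "find a hyundai dealer", "find a hyundai dealer"),
    ("group_b", "find a hyundai dealer", "find a honda dealer"),
    ("group_b", "find a hyundai dealer", "find a hyundai dealer"),
]
report = wer_by_group(samples)
```

A group whose average error rate sits well above the others points to accents or pronunciations that deserve more training data and more targeted testing.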

The future of voice search means inclusivity

Today, accurate pronunciation recognition is just the starting point for voice experiences. Research shows that AI can now classify emotional states with up to 85% accuracy by analyzing voice patterns, allowing voice assistants to respond in a more understanding and friendly manner. In addition, innovations such as multilingual support, voice biometrics for increased security, multimodal interactions, and personalized responses are shaping a future where voice technology is not only accurate, but also inclusive and user-centric.

Voice technology should work for everyone, regardless of accent, dialect or pronunciation. The Hyundai advert, while amusing, is a reminder that there is still much work to be done.