Microsoft’s AI can now perfectly imitate human voices

Microsoft has unveiled VALL-E 2, a text-to-speech AI model that replicates human speech with remarkable realism. According to the company, the technology achieves “human parity” and can mimic a speaker’s voice from just a few seconds of audio input.

Despite its groundbreaking capabilities, Microsoft has decided not to release VALL-E 2 publicly due to concerns about possible misuse, such as voice spoofing or identity fraud.

VALL-E 2 is a significant advance over its predecessor, using techniques such as Repetition Aware Sampling and Grouped Code Modeling to improve the naturalness and efficiency of the generated speech.

Microsoft researchers tested the model on well-known speech datasets and report that it outperforms earlier systems in robustness, naturalness and speaker similarity.

Microsoft describes VALL-E 2 strictly as a research project and has no immediate plans to integrate it into commercial products or make it more widely available.

The company stresses the importance of mitigating the risks associated with the misuse of artificial intelligence, citing recent incidents of AI-powered voice imitation and their implications for security and privacy.

Microsoft continues to face scrutiny over AI governance and privacy concerns, including its partnership with OpenAI. Decisions like holding off on publicly releasing VALL-E 2 reflect the company’s commitment to responsible use of AI.