
Human voice or deepfake? Our brain can tell the difference – for now

ZURICH: Fake or real? It is becoming increasingly difficult to tell whether a human or an AI-generated voice is speaking – at least consciously.

Researchers have observed that our brains respond differently to deepfake voices than to natural ones, although we may not be fully aware of it.

Fake voices seem to be less enjoyable to listen to, according to a study published in the journal Communications Biology.

Speech synthesis algorithms are now so powerful that the characteristics of voice clones are very close to those of natural speakers.

Voices imitated using deepfake technologies are used for telephone scams and to give virtual assistants the voice of a celebrity.

The team led by Claudia Roswandowitz from the University of Zurich analyzed how well human identity is preserved in voice clones. The researchers recorded the voices of four German-speaking men in 2020 and then used computer algorithms to generate deepfake voices of these speakers.

Deepfake voices are already nearly perfect

The next step was to test how good the imitation was – that is, how convincingly the identity had been cloned. Twenty-five test subjects were asked to decide whether two presented voices had the same identity or not.

In about two-thirds of the tests, the deepfake voices were correctly assigned to the respective speaker.

“This makes it clear that while current deepfake voices do not perfectly mimic identity, they have the potential to deceive people’s perceptions,” Roswandowitz said.

Using functional magnetic resonance imaging (fMRI), the researchers then investigated how individual areas of the brain react to fake and real voices.

According to the results, differences were found in two key areas: the nucleus accumbens and the auditory cortex. The researchers believe it is likely that both areas play an important role in whether a person recognizes a deepfake voice as fake or not.

“The nucleus accumbens is an important part of the brain’s reward system,” Roswandowitz said. When participants compared a deepfake with a natural voice, it was less active than when they compared two real voices.

In other words, when we listen to a false voice, our brain’s reward system is less activated.

The brain tries to compensate for deepfake errors

According to the study, there were also differences in activity in the auditory cortex, which is responsible for analyzing sounds.

This area was more involved in recognizing the identity of deepfake voices. “We suspect that this area responds to the imperfect acoustic imitation of deepfake voices and tries to compensate for the missing acoustic signal,” Roswandowitz said.

This compensation probably took place largely unnoticed. “Consciousness is signaled that something is different and more difficult to process, but this often remains below the threshold of perception.”

The rapid development of AI technologies has led to a massive increase in the creation and distribution of deepfakes, the researchers note.

So would deepfakes created today, four years later, fool listeners completely? Or would the results be similar?

“That’s a very interesting question,” says Roswandowitz. Newer AI-generated voices would probably have a slightly better sound quality.

Roswandowitz assumes that the differences in the activity of the auditory cortex would be smaller than at the time of the study.

This region reacts to the different sound quality. In the nucleus accumbens, however, she expects similar results. “It would be very interesting to investigate this.” – dpa