Is he losing his voice? Voice-over master Dean Compoginis sounds the alarm about AI

July 14, 2024

Short version

Dean Compoginis, a resident of Aptos, makes a good living as a voice actor. But artificial intelligence, which imitates the human voice more and more realistically, now poses an existential threat to people in his field, he says.

Dean Compoginis has a beautiful voice – full, expressive and with a certain musicality.

It’s so good, in fact, that he has been able to earn a good living from it, first as a radio presenter and then as a busy professional voice actor.

At 65, the Aptos resident is still in top form, doing voiceover work for commercials, video games and other formats. His voice has been heard in ads for the California Lottery and Burger King, as well as commercials for Comedy Central and “Family Guy.”

But he also feels that thanks to artificial intelligence he might soon lose control of his own voice.

Humans are primarily visual creatures. Our eyes are the main way we explore the world around us and, most importantly, the way we recognize what is familiar and judge what is trustworthy. That’s why much of the debate about AI as a tool for deception and fraud revolves around video and still images.

But what about the world of hearing? If AI has not yet succeeded in creating faces and images that can reliably deceive the human eye, how much closer is it to developing a voice that could deceive the ear?

For over a year, stories have been surfacing of scammers using AI-generated voices to convince unsuspecting people over the phone that their friends or family members are in trouble, in order to trick them into handing over money or confidential information.

Earlier this year, OpenAI, the developer of ChatGPT and one of the AI industry’s leading players, announced the development of Voice Engine, which needs just a 15-second sample to essentially clone a unique voice. The technology has only been released in preview so far, mainly, the company says, to “start a dialogue about the responsible use of synthetic voices.” (OpenAI also came under fire when one of its AI-generated “personal assistants” may have cloned the voice of movie star Scarlett Johansson.)

What has received less attention than the AI voice scams, however, is what the AI revolution is doing to the $4.4 billion voiceover industry. Since the dawn of recorded sound, people like Dean Compoginis have lent their real voices to everything from the mellifluous narration of old newsreels to cartoon characters to the work of perhaps the greatest movie trailer voiceover artist, Mr. “In-a-world…” himself, Don LaFontaine.

Nowadays, that work tends to include explainer videos, podcasts, e-learning videos and audiobook narration.

Aptos-based voice actor Dean Compoginis — Credit: Natasha Loudermilk / Santa Cruz Lookout

Compoginis has been part of that tradition for years, recording in his own home studio in Aptos, which proved even more convenient during the pandemic. He is one of about 45 voice artists from Santa Cruz and the surrounding area listed on Voices.com, one of the industry’s leading marketplace platforms.

Compoginis has enjoyed success in his business, but now he faces a grim future that threatens to turn his industry on its head.

“I think we’re just at the beginning of what AI can do in terms of actually replacing human creators,” he said.

Of course, it’s one thing to have AI providing fully computer-generated voices to convert text to speech that sound so human that they put real professionals out of work. There are already tools that do just that. But it’s quite another when you find that your own distinctive voice has been copied and used in a way you didn’t authorize.

A few years ago, Compoginis voiced a character in a video game so popular that it was downloaded millions of times. Not long ago, someone sent him a link to a website that promised to produce voiceovers in the character’s voice. Compoginis had never heard of the website and certainly had received no compensation from it. He felt like his voice was being hijacked.

“I thought, ‘Boy, this is what the SAG-AFTRA strike was about,’” he said.

He’s referring to the actors’ union strike that began exactly one year ago and lasted nearly four months, the longest strike in the history of the Screen Actors Guild and the American Federation of Television and Radio Artists, of which Compoginis is a member. The strike came about in part because of concerns about the increasing viability of AI and its ability to mimic real people. The settlement of the strike brought some protections designed to prevent Hollywood studios and producers from using AI-generated performances. But just as building a levee can’t stop the rising tide, the momentum of AI will continue to put pressure on the creative industry.

When the possibilities of AI first emerged in the creative industry, Compoginis heard from voice actors, directors, agents and others in the industry that “they will never replace the emotion and nuance of a real human voice.” But Compoginis soon realized, “It’s only a matter of time before that happens.”

Another element at play here is that AI-generated voices have already been normalized by their widespread use on social media. “The standards of what people accept as quality of content are sort of dropping,” he said. “Certainly, when younger generations get their hands on phones, they’re already used to hearing something that isn’t a human voice, and that’s totally acceptable. They don’t think twice about it. It seems like everything I see on Instagram – I don’t use TikTok, but I’m sure it’s the same – are the same five AI voices.”

It’s difficult to be outraged by an AI-generated voice that says, for example, “Your call is very important to us” while you wait on hold. Most of us can tell when it’s an obvious bot voice. But as the phone scams and the ScarJo scandal show, AI is also being used to create convincing replicas of unique voices. And those can fool even a professional’s ear.

“Sometimes I get fooled,” Compoginis said. “Sometimes I have to listen really closely before I finally think, ‘Oh, that’s an AI voice, because (no real human) would stress that syllable or whatever.’ But I think that in many cases, the person who is not doing it professionally doesn’t notice.”

Companies offering AI voiceovers could even take advantage of entry-level professionals looking to get into the voiceover field. Compoginis said there is plenty of work online for voice artists in what is known as text-to-speech (TTS) voice modeling, which builds artificial voices from raw recordings of real voices. Those artificial voices can then be used for everything from commercials to podcasts to audiobooks.

“You see places where they seem to pay a significant amount of money to spend twenty hours or so reading endless lists of words,” he said. “And then you sign your voice away and they can use it for whatever they want.”

Many people today – public figures, celebrities, podcasters, social media users – have already built up a nearly limitless catalog of their own recorded speech. Unscrupulous opportunists can draw on it to create a world in which, for example, a political candidate is heard saying something outrageous that he never actually said.

In an apocalyptic world where you can’t believe your own ears, there’s also the possibility that AI will fundamentally influence the evolution of language itself. Language changes over time because each generation of human speakers and writers pushes it into new areas through slang, jargon, memes, and other innovations. Where is the tipping point between the time when AI merely imitates language and the time when it influences and shapes language by creating new words and phrases?

But right now, Dean Compoginis and other professional voice artists are racing to stay one step ahead of technology and preserve what is unique and distinctive about humanity.

“One of the secrets (of good voiceover work) that’s hard to teach learners,” he said, “is that the magic is in the pauses, in the spaces between the beats. And there’s something about just slowing down or speeding up a little bit, whether it’s in music or in a great script, that adds a human quality to the whole thing. I’m sure there are AI scientists out there breaking that down right now. ‘How do we get the pauses in there so they sound perfect?’ you know? But right now, at least, it’s those pauses, spaces between the words. They’re what really make it resonate with us as humans.”
