New Resemble AI Software Turns 3-Minute Recordings into Synthetic Speech Profiles
Synthetic speech technology startup Resemble AI has debuted Resemble Clone, a new tool for creating a digital voice from a few short recordings. With Resemble Clone, a few minutes of someone’s voice can be used to generate custom speech that sounds like the person in the recording or a fictitious variation.
Resemble AI develops speech software to replicate or synthesize voices for a range of potential purposes. Resemble Clone is aimed specifically at the entertainment industry, which the company believes could be a major beneficiary of its technology. The software needs as little as three minutes of someone’s voice to begin creating an artificial profile. The longer the recording, however, the more natural the result sounds, the founders told Voicebot in an interview earlier this year. The voice doesn’t even have to be live; any voice recording can be used.
“It’s a shift in paradigm for [all] voice acting. It’s like how with advances in visual technology, actors do less work,” Resemble AI co-founder Zohaib Ahmed told Voicebot in the interview. “The need for this technology, for [synthetic] speech technology, is only growing bigger.”
The final result does sound like a human, but it can still be distinguished from a real voice. The highest-end deepfakes can be much harder to tell apart from the real thing. That’s part of why the Toronto- and San Francisco-based startup also offers Resemblyzer, an open-source tool that can be used to pick out deepfakes from real audio.
Synthetic Speech for a Real Market
Resemble AI is a young startup. It took part in the Betaworks Ventures Synthetic Camp accelerator program in New York this year, which came with a $200,000 investment. It is not alone, however, in seeing how artificially generated speech that sounds human could be useful. Other startups, as well as tech giants, are pursuing similar ideas.
On the startup side, Australian-born speech synthesis company Replica Studios recently closed a $2.5 million seed funding round led by The Venture Reality Fund. Meanwhile, VocaliD, which began as a voice prosthesis developer, now offers synthetic voices for call centers and voice apps all over the world.
The big names aren’t ignoring the space either. Amazon and Google are both working on variations of artificial speech generation. Google Assistant is experimenting with WaveNet technology, while neural text-to-speech (NTTS) is how Amazon enabled Alexa to imitate Samuel L. Jackson’s voice. Chinese firm Baidu is also developing voice cloning of its own.
Resemble AI and other startups will need to offer something unique to stand out from what the bigger names create. A tool like Resemble Clone could at least help make people more aware of the shape of the market as it evolves. People will want as much flexibility and as many options for customization as they can get, and Resemble AI will want to be the name they think of when working on their next film, video game, or voice app.