Resemble AI Raises $8M and Launches Deepfake Voice Detector
Resemble AI founders Zohaib Ahmed and Saqib Muhammad first showed off their generative AI-fueled voice cloning technology to the Voicebot team in the summer of 2019 during a voice marketing meetup in New York City. Four years and a couple of weeks later, Resemble AI has raised an $8 million Series A funding round led by Javelin Venture Partners with Comcast Ventures. And while deepfake recordings in 2019 were usually easy to tell apart from the real thing, Resemble now sees a need for Resemble Detect, its new tool for spotting generative AI-produced audio.
Resemble AI’s proprietary generative AI models can train a voice clone from just five minutes of a person’s speech. Improved technology has led to better-sounding voices from smaller recording samples, and the company has branched into applying its voice cloning to translation. The company’s ambitions for the entertainment industry have borne fruit as well. Last year’s Netflix documentary, The Andy Warhol Diaries, included Resemble’s AI-generated voice of the artist reading excerpts from his memoir for the film. One sign of how the conversation around synthetic media has evolved is that Ahmed and Muhammad called the generative AI underlying Resemble’s technology ‘deep learning software’ when first discussing it.
“I think we were always a generative AI company, even in 2019. It’s like how virtual reality was called VR or AR, and now I think the metaverse, everyone has adapted. I think we are a generative AI company at heart, though,” Ahmed told Voicebot in a new interview. “A lot has happened in four years. We have a million users, with a good portion paying, and we are scaling up on the business front. There are also new tech challenges and markets that are interesting to us, like dubbing. We’ve already seen customers in the wild using it for that.”
Resemble Detect is essentially a sophisticated AI ear, one that listens for the very subtle sonic artifacts inherent in any manipulated audio. Regardless of how the sound is adjusted, those clues remain, and Resemble Detect can use them to assess the likelihood that the audio is a deepfake. Ahmed claimed that Resemble Detect is up to 98% accurate in real-time identification of deepfake audio, though that falls to 87% when it has never encountered the voice or track before.
Resemble Detect augments the audio watermark feature Resemble introduced in February. The PerTh Watermarker allows Resemble AI to tag any audio produced by its software without compromising sound quality. The watermark is essentially a very soft sound that people won’t notice but that is encoded with information a computer can decode, both to identify the audio as synthetic and to trace it back to its original dataset.
“We’ve tested it with everything, voices, music; we tested it with the fake Drake song even. It works across all of it. We call it the antivirus for AI,” Ahmed said. “The updated watermark we can trace back to the source if the [synthetic voice] trained on our dataset. If Spotify used our technology to [watermark] their library and someone made a [voice clone] trained on a song in their catalog, the watermark would prove it trained on our dataset.”