Google’s Area 120 Lab Rolls Out Free AI-Powered Video Dubbing Translator Tool
Google has introduced a free tool named Aloud for quickly dubbing videos into multiple languages. Developers at Area 120, Google’s in-house incubator, built Aloud to produce a dubbed version of a video in a new language in less than an hour, while giving the creator a chance to correct errors.
Aloud combines several popular AI tasks into a single tool for creators on YouTube and other video platforms. The user can provide subtitles for a video, or the AI’s speech-to-text model can produce a transcript for the user to review. The tool then translates the text into one of the available languages, and the user picks a synthetic voice to read out the translation, which replaces the original audio in the video for publishing. Aloud can translate and dub a five-minute video in 10 minutes, according to its designers. As a transparency measure, creators who use Aloud have to mention that the dub is synthetic and reference the original video in the description, credits, or a pinned comment. The current early-access version of Aloud only dubs English videos into Spanish and Portuguese, but the Sri Lankan-born developers have Hindi and Bahasa Indonesia in the pipeline, along with several other languages.
“Dubbing used to take weeks worth of effort and a large budget. But with Aloud, you only need a few minutes,” Aloud co-founders Buddhika Kottahachchi and Sasakthi Abeysinghe explained in announcing the product. “We use advances in audio separation, machine translation and speech synthesis to reduce time-consuming and costly steps like translation, video editing and audio production. You do not even need to know any language other than the ones you already speak, and all of this is available at no cost to the creator.”
Google has been investigating AI translation and transcription for a long time. For instance, Google researchers produced an AI translation model last year named Translatotron 2 that is capable of translating and synthesizing human speech. That is an unrelated project: Aloud’s creators aren’t authors on the paper, and Translatotron 2 was specifically designed to produce translated audio only in the original speaker’s voice as a way of avoiding deepfakes. There’s also Google Translate’s real-time transcription option and the instant translation feature for Google Assistant on Android. Aloud, by contrast, focuses on video rather than audio alone. Its creators cite informational and educational content as their first focus.
“With dubbing, you can now reach previously unreachable portions of the world’s population. In our experiments, we have seen double-digit growth in views just by dubbing into one additional language,” Kottahachchi and Abeysinghe wrote. “Aloud doesn’t create new content — it only uses the original speech and translates it into a different language of your choice. We’re also working with YouTube to let creators add multiple audio tracks to their videos, a new feature that they started testing with a small group of creators late last year.”