Meta Unveils ‘Universal Translator’ AI Model SeamlessM4T
Meta has introduced SeamlessM4T, a new AI model capable of translating and transcribing speech across nearly 100 languages. The tech giant is pitching the model as a kind of universal real-time language translator, one that transcends the limitations of existing systems restricted to certain languages and forms of communication.
SeamlessM4T can perform speech-to-text, speech-to-speech, text-to-speech, and text-to-text translation for several dozen languages. Other speech translation systems have only worked for a fraction of the world’s approximately 7,000 languages, but Meta claims SeamlessM4T is on the road to a universal translator because it condenses almost 100 input and output languages into a single model. By combining these capabilities in one model, SeamlessM4T overcomes issues that plague pipelines that divide speech translation across multiple subsystems. Meta presents the unified design as a breakthrough in enabling seamless cross-lingual speech and text communication. You can see it in action in the video below.
The system implicitly detects source languages without needing a separate identifier model. It significantly improves performance for lower-resource languages while maintaining accuracy on high-resource ones like English and Spanish. Meta has made the model open source to encourage research and development by AI developers, along with a 270,000-hour multimodal dataset called SeamlessAlign and an array of supporting libraries and tools.
“Today we’re releasing SeamlessM4T, a new multimodal AI model that lets people who speak different languages communicate more effectively,” Meta CEO Mark Zuckerberg wrote in a Facebook post about the new model. “Over time, we’ll integrate these AI advances in translation and transcription into Facebook, Instagram, WhatsApp, Messenger, and Threads.”
The new model continues Meta’s work on using AI to translate languages. Last July, the company shared NLLB-200, a model that it claimed delivers particularly high-quality translations among 200 languages and that takes its name from Meta’s No Language Left Behind (NLLB) project. NLLB-200 came only a couple of years after a 100-language translation model, the first to work without requiring English as an intermediary between any of the other languages. And last fall, Meta released an oral-only AI translation model aimed at languages with limited or no written form. All of them can be traced to the two giant conversational AI datasets Meta released in 2021.