Microsoft Translator Now Counts More Than 100 Languages in Library
Microsoft Translator added another dozen languages and dialects to its library this week, giving the service more than 100 languages in total. The new additions include Uyghur, Macedonian, two forms of Mongolian, and the Inuktitut dialect of the Inuktut language spoken by the Inuit. More than 5.66 billion people should now be able to understand text and documents processed by the translator.
Encompassing more than 100 languages and dialects means Microsoft went well beyond the most commonly spoken tongues such as English, Chinese, Hindi, Spanish, and Arabic. Only 40,000 Inuit in Canada speak Inuktitut, for instance. Azure Cognitive Services also handles AI models for Optical Character Recognition (OCR), so any written text can be absorbed and translated into any of the supported languages. Google beat Microsoft to the milestone back in 2016, but Amazon is lagging at 71 tongues. Microsoft claims it has an advantage over its rivals thanks to its AI techniques, which power Translator in the mobile app, Microsoft Office, Bing, and Azure’s enterprise services.
“One hundred languages is a good milestone for us to achieve our ambition for everyone to be able to communicate regardless of the language they speak,” Microsoft Azure AI chief technology officer Xuedong Huang said. “Not only do we celebrate what we have done on translation – reach 100 languages – but also for speech and OCR as well. We want to remove language barriers.”
Translating speech and text is a hugely popular aspect of speech AI development. Google recently released Translatotron 2.0, a new version of its model that recreates a speaker’s voice in a different language, which will likely inform future versions of Google Translate and its real-time transcription and instant translation features for Google Assistant on Android. Alexa gained real-time translation late last year, expanding the multilingual mode it has been opening up to new languages. Translation services are also what motivated Zoom to acquire Kites as it works to make enterprise communications as universal as possible, without language barriers holding them back.
Microsoft Translator upgraded its own app not long ago, offering regional accents as an option. The Speech Regions feature lets users adjust how the text-to-speech voice pronounces words to match different common variations based on location. English comes in American, British, Australian, Canadian, Irish, and Indian flavors, while Spanish has voices that sound native to Mexico and Spain.
Microsoft’s multilingual AI model, known as Z-code, combines several languages based on linguistic families so that the models can learn from each other. French, Spanish, and Italian models could all teach each other because they are from the Romance family, for instance. This drastically cuts down on how much data is needed for a good translation, making the process faster as well as more accurate when there isn’t a huge dictionary or when the language is endangered. Microsoft marries Z-code to its other AI linguistic tech to form what the company refers to as its XYZ-code vision.
“We can leverage the commonality and use that shared transfer learning capability to improve the whole language family,” Huang said. “This is bringing people closer together. This is the capability already in production because of our XYZ-code vision.”
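To make the family-grouping idea concrete, here is a minimal Python sketch of how languages might be bucketed by linguistic family so that related languages can share training signal. It illustrates only the grouping step, not Z-code itself. The `LANGUAGE_FAMILY` table and the function names `group_by_family` and `transfer_partners` are hypothetical and do not come from Microsoft.

```python
from collections import defaultdict

# Hypothetical lookup table (an assumption for illustration only).
# The Romance entries follow the article's example; the others are
# added to show languages with no partner in this small table.
LANGUAGE_FAMILY = {
    "French": "Romance",
    "Spanish": "Romance",
    "Italian": "Romance",
    "Uyghur": "Turkic",
    "Macedonian": "Slavic",
}


def group_by_family(language_family):
    """Return a dict mapping each family to a sorted list of its languages."""
    groups = defaultdict(list)
    for language, family in language_family.items():
        groups[family].append(language)
    return {family: sorted(langs) for family, langs in groups.items()}


def transfer_partners(language, language_family):
    """Return the other languages in the same family as `language`."""
    family = language_family[language]
    return sorted(
        other
        for other, other_family in language_family.items()
        if other_family == family and other != language
    )


if __name__ == "__main__":
    print(group_by_family(LANGUAGE_FAMILY))
    print(transfer_partners("French", LANGUAGE_FAMILY))
```

In this toy table, French would be paired with Italian and Spanish, while Uyghur has no partner because no other Turkic language is listed. A production system would presumably draw on far richer signals than a static lookup.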