Google Cloud Upgrades Speech AI Models Already Used by Spotify Car Thing
Google Cloud has introduced new speech recognition models for its speech-to-text (STT) API. The models improve accuracy for 23 languages and 61 locales that Google supports for third-party voice assistants. Among those assistants is Spotify’s, which is built into the Spotify Car Thing device.
Google Speech AI
Google’s STT API has skyrocketed in popularity since its release in 2017. The API processes more than a billion minutes of speech every month, according to the company. The latest ‘conformer’ models implement a single neural network for speech recognition rather than separate models for sound, language, and pronunciation. The new approach is more efficient and accurate than its predecessor, even in loud or otherwise poor acoustic conditions. The improvements are evident immediately, though tuning the model improves performance further.
“With voice continuing to emerge as the new frontier in human-computer interaction, many enterprises may seek to level up their technology and present consumers with speech recognition systems that more reliably and accurately recognize what their users are saying,” Google speech team distinguished scientist Françoise Beaufays wrote in announcing the upgrade. “If you are building speech control interfaces where users speak to their smart devices and applications, these improvements can let your users speak to these interfaces more naturally and in longer sentences. Without having to worry about whether or not their speech will be accurately captured, your users can establish better relationships with the machines and applications they interact with—and with your businesses as the brand behind the experience.”
Though the new models were only just announced, Google Cloud has been testing them with a few clients, including Spotify. The streaming platform teamed with the tech giant to develop the voice assistant interface on its Car Thing device. Drivers can converse with the AI by saying, “Hey, Spotify.”
“Spotify worked closely with Google on bringing our brand new voice interface, ‘Hey Spotify,’ to customers across our mobile apps and Car Thing,” Spotify head of technology hardware Daniel Bromand said. “The increases in quality and especially noise robustness from the latest models, in addition to Spotify’s work on NLU and AI, are what make it possible to have these services work so well for so many users.”