Google Assistant Upgrade Lets Users Teach Name Pronunciation, Improves Contextual Understanding
Google Assistant will no longer be stuck awkwardly mispronouncing names that aren’t in its existing vocabulary. Users can now add the proper pronunciation to their contacts, directly teaching the voice assistant how to say a name, just as typing it out provides the proper spelling. The new feature arrives alongside upgraded contextual understanding for Google Assistant, which now responds based on the whole conversation rather than just the most recent sentence.
The pronunciation tool works like a limited extension of Voice Match, the process Google Assistant uses to identify people by their voice and speech patterns. As seen in the video above, you can add audio to a contact card that the voice assistant will use to learn how to say their name correctly, though it won’t keep the recording once it learns the pronunciation. It’s reminiscent of My Fair Lady, with the user as Professor Henry Higgins and Google Assistant as a virtual Eliza Doolittle.
“Understanding spoken language is difficult because it’s so contextual, and varies so much from person to person. And names can bring up other language hiccups — for instance, some names that are spelled the same are pronounced differently. It’s this kind of complexity that makes perfectly understanding the way we speak so difficult,” Google Assistant product management director Yury Pinsky wrote in a blog post. “Names matter, and it’s frustrating when you’re trying to send a text or make a call and Google Assistant mispronounces or simply doesn’t recognize a contact. We want Assistant to accurately recognize and pronounce people’s names as often as possible, especially those that are less common.”
Direct audio instruction for voice assistants has begun popping up recently. In December, Amazon introduced a Teachable AI feature for Alexa that lets users directly instruct Alexa about their preferences, with the voice assistant asking follow-up questions to refine its responses to a user’s phrasing. Meanwhile, Samsung Bixby debuted a feature more similar to Google’s this month, enabling users to tell the voice assistant their relationship with a contact and use it as an alternative to a given name. If you ask Bixby to call your grandmother, it will ask who that is, connecting the title to the contact. Bixby will remember the relationship for future requests, which could be especially useful when a child wants to call a relative, as well as simply streamlining the conversation.
Google Assistant has also improved its more indirect learning ability. The voice assistant revamped its Natural Language Understanding (NLU) engine to slot what a user says into a bigger picture, starting with timers and time-related tasks. Using context clues from the whole conversation, Google Assistant can run multiple timers and alarms and understand which of them a user is referring to in subsequent commands, even if the user doesn’t remember the exact name used when setting it up. Pinsky wrote that the voice assistant is 100% more accurate in responding to timer- and alarm-related requests, with more use cases to be added soon.
The voice assistant applies similar contextual thinking to any conversation with users, relying on the whole exchange to determine the subject. Google’s example, as seen on the right, shows the voice assistant understanding that after talking about Miami, a user asking about the “best beaches” means beaches around Miami. Google Assistant doesn’t need proper nouns repeated in every sentence, and it will even use whatever is on the screen of the smartphone or tablet to fill in the missing nouns. As a result, the conversation more closely mimics one between two humans.
“To get things done with the Google Assistant, it needs to understand you – it has to both recognize the words you’re saying, and also know what you mean. It should adapt to your way of talking, not require you to say exactly the right words in the right order,” Pinsky wrote. “There’s a lot of work to be done, and we look forward to continue advancing our conversational AI capabilities as we move toward more natural, fluid voice interactions that truly make every day a little easier.”