Google Assistant Takes Lead in Understanding Speakers with Accents
- Speech recognition testing lab Vocalize.ai evaluated the ability of Amazon Alexa, Apple Siri and Google Assistant operating within smart speakers to correctly recognize accented speech by fluent English speakers from the U.S., India and China.
- Google Assistant performed significantly better than both Alexa and Siri in three performance tests.
Vocalize.ai operates a test lab that measures the automated speech recognition (ASR) performance of smart speakers. Its most recent study evaluated how well Amazon Alexa, Apple Siri and Google Assistant recognize accented English. Vocalize chose to focus on non-native English speakers who are fluent in the language and speak with a moderate accent.
The company started with a set of three speakers and used a team at SRI to evaluate the accent of each subject and rate it on a six-point scale where “5 corresponds to native or near-native performance and score near 1 indicates a very strong and very hard to understand accent.” The test subjects included:
- English with US accent 5.3 (native, near native)
- English with Indian accent 3.7 (moderate accent)
- English with Chinese accent 3.2 (moderate accent)
Google Excels in Isolated Word Recognition of Accented Speech
The test evaluated the ability of each voice assistant to recognize 36 spondaic words played at a constant volume and at a constant distance between the test speaker and the microphone. Vocalize researchers concluded that each of the voice assistants performed well for the U.S. and Indian accents. However, Google was well ahead of its peers when recognizing the English speaker with the Chinese accent.
Speech-in-Noise and SRT Delivered Another Victory for Google
A common use case for smart speakers involves using the devices in the midst of ambient sounds. Google again stood out in this evaluation, which played background noise at the same time the user request was being made. For Google, the speech-in-noise (SIN) loss was 0%, 2% and 6% for the U.S., Indian and Chinese accent corpora, respectively. That rose to 8%, 11% and 14% for Siri and 11%, 15% and 19% for Alexa. The results for Alexa and Siri are consistent with moderate hearing loss in humans.
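To make the comparison easier to see, the SIN loss figures quoted above can be tabulated and summarized. The short Python sketch below is only an illustration and is not part of the Vocalize.ai study. The names `SIN_LOSS`, `summarize` and `SUMMARY` are invented for this example, and the mean and spread are simple arithmetic on the three published percentages, not metrics the report itself defines.

```python
# SIN loss (percent) per assistant and accent, as quoted in the article.
SIN_LOSS = {
    "Google Assistant": {"US": 0, "Indian": 2, "Chinese": 6},
    "Siri": {"US": 8, "Indian": 11, "Chinese": 14},
    "Alexa": {"US": 11, "Indian": 15, "Chinese": 19},
}


def summarize(losses):
    """Return (mean loss, max-minus-min spread) across the accents.

    Both values are illustrative summaries, not metrics from the study.
    """
    values = list(losses.values())
    return sum(values) / len(values), max(values) - min(values)


# Hypothetical summary table keyed by assistant name.
SUMMARY = {name: summarize(losses) for name, losses in SIN_LOSS.items()}

for name, (mean, spread) in SUMMARY.items():
    print(f"{name}: mean loss {mean:.1f}%, spread across accents {spread} points")
```

On these figures, Google's mean loss is well below Siri's and Alexa's, while the gap between the best and worst accent is similar for all three assistants.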
Google Continues to Lead in Speech Recognition
The speech recognition threshold (SRT) test evaluates how an ASR system performs across varying volume levels. Google again stood out, performing well across each of the accents. In this test, the absolute values on the Y-axis are useful, but the more important finding is the variance across accents. The report author concludes:
“Siri’s maximum range is 6dB, and for this evaluation, we consider anything greater than 5dB as significant. This suggests that people with a Chinese accent may have to intentionally speak louder to be understood by Siri.”
Vocalize points out that the U.N. estimates foreign-born U.S. residents comprise 15% of the total population. Recognizing accents is an important capability for voice assistants that aspire to serve everyone. Today, Google is leading its closest rivals according to the Vocalize.ai study.
Some data suggests there are regional variations in voice assistant performance even among native English speakers raised in the United States. That problem is exacerbated when you have consumers speaking in their second language. What are your thoughts? Is Google Assistant more consistently reliable than Amazon Echo or Apple HomePod? Let us know what you think on Twitter. You can review additional details about the study here.