Apple Researchers Are Improving Voice Wake Word Detection and Speaker Identification
Apple is publishing a series of research papers that lay out how it may improve Siri’s ability to correctly identify when it is being triggered, by whom, and in what language. Though each paper tackles a different subject, they all point to a shared goal of a voice assistant that only turns on when it is supposed to and responds to the speaker as an individual.
Ignoring the Noise, Identifying the Speaker
One of the studies looked at ways to reduce the number of times voice assistants are activated by accident. Speech or background noise that sounds close to the wake word often turns on a voice assistant, and it is an incredibly common issue. Siri specifically created an awkward moment for BBC meteorologist Tomasz Schafernaker when it interrupted and contradicted his predictions about the weather live on TV. Apple’s researchers created a graph neural network (GNN) AI model to try to better detect false alarms without making it harder to wake the voice assistant on purpose. The model cut the number of false wake-ups by 87%. Crucially, the voice assistant still woke up 99% of the time when the trigger phrase was deliberately spoken.
As voice assistants become more personalized, they need to figure out who is speaking to them after they detect their trigger word. Another of Apple’s studies looked at combining the two tasks, using the detection of the trigger word to identify who is saying it at the same time. Using 16,000 hours of audio from more than 100 people, the researchers built AI models to test the idea. The new models could detect the sounds of a wake word and tag the speaker at the same time, faster and more efficiently, without sacrificing accuracy.
Building Trust With Accuracy
Nearly two-thirds of voice assistant users report they have accidentally awakened their device over the course of a month, according to a recent survey. It’s the kind of small but persistent problem that turns people off the idea of having voice-activated technology. There’s also a privacy concern, especially when people might not know they have accidentally awoken their voice assistant. The suspicion about voice assistants listening when they shouldn’t hit new heights last summer when reports came out about contractors listening to conversations and sounds people didn’t know were recorded and transmitted.
Voice assistant platforms shifted the way those programs operate afterward, but being able to promise that voice assistants will only wake up when commanded is something all of the major platforms are interested in pursuing. For instance, Google is developing controls for users to customize how sensitive Google Assistant is to its wake word. Apple’s research isn’t going to be integrated into Siri immediately, but the investment in this science points to the goals the company has for its voice assistant.