Apple is Developing Stutter Detection for Siri
Apple researchers are working on ways to teach Siri to determine whether a speaker is stuttering and to compensate so the voice assistant can understand what they are saying. A new study showcases their work developing a database of relevant audio clips to train the AI accordingly.
The new study focuses on designing AI training tools for spotting when someone speaks with a stutter. The goal is to build a collection of audio clips of people who stutter speaking. That audio can then be fed into the automatic speech recognition models behind Siri and other voice assistants to teach them to recognize when someone with an atypical speech pattern is talking. From there, the AI can generate responses so that people with a stutter can engage with a voice assistant without being interrupted or misunderstood. At the moment, the only recourse Siri users have is the Hold to Talk feature, which keeps Siri in listening mode for as long as they want. That way, the voice assistant won’t interrupt before the user is done speaking, even if they have a stutter or other speech impediment.
“The ability to automatically detect stuttering events in speech could help speech pathologists track an individual’s fluency over time or help improve speech recognition systems for people with atypical speech patterns. Despite increasing interest in this area, existing public datasets are too small to build generalizable dysfluency detection systems and lack sufficient annotations,” Apple’s scientists wrote in a paper on their research. “In this work, we introduce Stuttering Events in Podcasts (SEP-28k), a dataset containing over 28k clips labeled with five event types including blocks, prolongations, sound repetitions, word repetitions, and interjections. Audio comes from public podcasts largely consisting of people who stutter interviewing other people who stutter.”
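Annotations like SEP-28k’s five event types are typically turned into label vectors before a detection model can train on them. The sketch below shows one common way to do that, multi-hot encoding, where each clip gets a vector with a 1 for every event type annotated in it. The function and event-name spellings here are hypothetical illustrations, not Apple’s actual pipeline or the dataset’s real field names.

```python
# Hypothetical sketch: encoding a clip's dysfluency annotations as a
# multi-hot vector, one slot per event type described in the paper.
EVENT_TYPES = ["block", "prolongation", "sound_repetition",
               "word_repetition", "interjection"]

def encode_labels(events):
    """Return a multi-hot vector: 1 where the clip was annotated
    with that event type, 0 elsewhere."""
    annotated = set(events)
    return [1 if e in annotated else 0 for e in EVENT_TYPES]

# A clip annotated with a sound repetition and an interjection:
vec = encode_labels(["sound_repetition", "interjection"])
# vec == [0, 0, 1, 0, 1]
```

A vector like this can serve as the target for a multi-label classifier, since a single clip may contain several dysfluency types at once.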
The study found that expanding the amount of training data improved the models’ stuttering detection by 28%. That said, the researchers point out that they tested just one approach, and that other models and designs could push the detection rate higher. They also note that other kinds of speech dysfluencies would need detection models of their own.
Apple’s research is part of a larger wave of work being done to make voice assistants more accessible to people with atypical speech. For instance, Israeli voice tech startup Voiceitt recently began working with Alexa to enable people with atypical and impaired speech to use the voice assistant. The startup’s mobile app listens and translates a user’s words into a form Alexa can understand, giving those with limited speaking function access to Alexa. Google is pursuing similar goals, including potential competition for Voiceitt in the form of Project Euphonia, a program for training voice assistants to understand what people with speech impairments are saying.