SoapBox Labs Is Schooling Voice AI to Understand Children
SoapBox Labs is teaching voice assistants to understand the way children speak. The Irish startup seeks to make voice technology accessible to children in a way that standard speech recognition software has yet to achieve.
Children Confuse AI
There’s an emerging video genre chronicling the hilarity of voice assistants misunderstanding children. As funny as many of these videos are, and it’s hard not to laugh at a toddler getting upset because Alexa won’t play Baby Shark, their prevalence points to the fact that most voice assistants simply are not built to understand children’s speech.
“My lightbulb moment was watching my child interact with technology,” SoapBox Labs CEO Patricia Scanlon told Voicebot in an interview. “To them, voice technology is a need more than a want. And they don’t like not being understood.”
Scanlon, a former Bell Labs researcher, founded SoapBox in 2013 to bridge this child-shaped gap in speech technology. After more than two decades working in speech recognition tech, she determined the biggest problem to be a lack of information. Most of the recordings used to build and train voice AIs are from adults. However, children speak differently. This limits the voice assistant’s ability to understand the quirks inherent in children’s speech.
“They elongate syllables, overenunciate words, and don’t speak the same way as adults. We gathered thousands of hours of children’s speech to build our own dataset because any system is only as good as the data added to it,” Scanlon said. “The big names in voice technology work okay, but once you have a child who deviates from adult speech, it becomes a problem. They don’t have or use the data on how children speak, so they struggle to understand them.”
Talking to Tots
SoapBox Labs marries its data with deep learning technology to create models and algorithms that voice platforms can leverage to speak with children. Scanlon said the focus up until recently has been on creating the technology, but that the company has started rolling out its tech as an API.
The applications of accurate speech recognition for children are broad. Creating voice apps or smart devices for education and entertainment is already a growing industry, whether it’s Amazon’s Echo Dot Kids Edition or toys that can move and talk to kids.
“2019 has been a great year because we don’t have to explain why it matters anymore at the board level,” Scanlon said. “They get it. Education about speech recognition [for kids] is happening naturally in the world.”
Most of SoapBox’s upcoming agreements are still under wraps, but a couple of its partnerships have been made public. For instance, Lingumi Labs is using SoapBox’s tech to teach English to young children in Taiwan. On a potentially much bigger level, SoapBox worked with Microsoft to add its speech recognition tech to the Azure cloud platform earlier this year to build automated language and reading tutors for children. SoapBox’s technology is used to help the children and the AI understand each other during lessons.
SoapBox has raised about $5.5 million in venture funding and grants for its research since launching in 2013. Scanlon said the current plan is to continue to expand its data set to accommodate new languages and accents, enabling more children to interact with voice technology successfully.
More Security and Technical Know-How Needed
The current and future demand for this kind of technology raises the question of why there aren’t more companies focusing on children’s speech or why the likes of Amazon and Google haven’t devoted more resources to the project. Scanlon said she thinks the reasons are straightforward.
“Looking back, I now know why there aren’t more companies competing with us. Number one is data security. We’ve been very focused on that by design from the beginning. We have a patent-pending plan to take security to the next level. Right now though, trust is eroded.”
Scanlon pointed to the ongoing concern over who is listening to recordings made by voice assistants as a global issue. Then there are the kid-specific privacy fears. Amazon is in the midst of several lawsuits over whether Alexa is violating children’s privacy, and many of the other big names in the industry are under scrutiny over whether they are violating the Children’s Online Privacy Protection Act (COPPA) or related laws.
“Secondly, it’s very hard to build speech recognition for children, it’s an underrated technical difficulty,” Scanlon said. “Other companies have made a lot of noise about smart products for kids, then the products never happen and they say it was too hard. Children’s speech recognition requires laser focus. People think it’s like another accent and it’s not.”