Sensory Debuts Child Speech Models for Kid-Focused Voice AI Developers
AI and speech technology creator Sensory has augmented its VoiceHub platform with speech recognition models specifically designed to understand children. Voice app and voice-enabled device developers can integrate the child speech models into their products through VoiceHub’s pipeline for adding custom wakewords and voice controls to smart devices.
The child speech models produced by Sensory emphasize the ways that kids speak differently from adults. Sensory analyzed its database of children’s speech to come up with a voice AI model that would be better suited for younger users. The company reported that the word error rate fell by a third with the new models. The speech recognition software has been further adapted to work with Sensory’s TrulyNatural continuous speech recognizer and TrulyHandsfree custom wakewords.
The idea is that companies working on kid-centric voice apps, toys, and other hardware would be able to assure potential customers that the custom voice assistant will understand what the child says. That’s on top of trying to attract parents with Sensory’s on-edge AI processing, which keeps data from being transmitted to the cloud, where it could raise privacy concerns.
“Sensory has some of the most talented technologists in the speech industry,” Sensory CEO Todd Mozer said. “We challenged the team to create a private and accurate recognizer for kid’s speech and they delivered. This opens up new and fun voice-enabled products for kids of all ages.”
Lack of data and technical difficulties have mainly kept developers from raising voice AI for kids to the sophistication of its adult counterparts. New AI reading helpers in Google Play and the new Alexa Voice Profiles for Kids are supposed to enable personalized experiences for children. In hardware, meanwhile, Alexa’s kid profiles can pair with the Echo Show 5 Kids. Both Amazon and Google are competing against brand-specific voice AIs aimed at third-party toy manufacturers who are designing voice-enabled smart toys. The rising demand has also been a boon for specialists like children’s speech recognition tech startup SoapBox Labs. The startup’s steady drumbeat of new features now includes Voice Activity Detection (VAD) and custom wake words. SoapBox built its services on a database of thousands of hours of children’s speech and on deep learning technology designed to understand the unique patterns and inflections of that speech.
As SoapBox Labs CEO Patricia Scanlon pointed out to Voicebot last summer when Google unveiled its kid-focused voice AI, there is a “massive and growing demand for voice tech that works for kids.” The issue is that even with a big dataset, “making voice tech work with the complexity of kids’ voice, accent, language, and behavior patterns is no small task.”