Listen Learner from Apple and CMU Raises Entirely New Privacy Concerns for Voice Assistants
Researchers from Apple and Carnegie Mellon University have created Listen Learner, a new approach that uses voice to teach artificial intelligence about the world without a lot of initial training. The goal is an AI that adapts to its context and understands the environment around it. The technology raises a few critical privacy questions that will need answers if it’s going to be integrated into voice assistants at home.
Listen Learner is built around the idea that a smart speaker keeps its microphone on, listening to all of the sounds around it. The AI learns what different sounds mean by asking its owner, as the video demonstrates. Like a small child, the AI wants to know what every sound around it means. The difference is that the AI remembers what it is told the first time. The researchers, who presented their paper at the most recent Conference on Human Factors in Computing Systems, describe how the AI would start by asking broadly what certain sounds are. Over time, its understanding would become more refined as it learns to distinguish between similar sounds, like different faucets or doors closing.
“We have presented Listen Learner, a system that seeks to enable high-accuracy, low-effort acoustic activity recognition using one-shot user labeling,” the researchers wrote in their paper. “We built a hardware and software implementation that gradually discovers new event classes from the environment with no user demonstration or training involved.”
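To make the idea of one-shot labeling concrete, here is a minimal Python sketch. It is not the researchers' implementation: the class name `OneShotLabeler`, the `ask_user` callback, the `radius` and `min_count` parameters, and the use of plain lists of numbers as sound features are all assumptions made for illustration. The sketch groups similar sounds together, asks the owner for a name once a group has been heard a few times, and reuses that answer afterward.

```python
import math
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Cluster:
    """A group of similar sounds (hypothetical structure for this sketch)."""
    centroid: List[float]
    count: int = 1
    label: Optional[str] = None


class OneShotLabeler:
    """Groups sound features and asks for a label once per group."""

    def __init__(self, ask_user: Callable[[List[float]], str],
                 radius: float = 1.0, min_count: int = 3):
        self.ask_user = ask_user    # called once per discovered group
        self.radius = radius        # how close a sound must be to join a group
        self.min_count = min_count  # sounds heard before the owner is asked
        self.clusters: List[Cluster] = []

    def observe(self, features: List[float]) -> Optional[str]:
        # Find the closest existing group, if there is one.
        nearest = min(self.clusters,
                      key=lambda c: math.dist(c.centroid, features),
                      default=None)
        if nearest is None or math.dist(nearest.centroid, features) > self.radius:
            # Nothing similar has been heard before: start a new group.
            self.clusters.append(Cluster(centroid=list(features)))
            return None

        # Fold the new sound into the group's running average.
        nearest.count += 1
        n = nearest.count
        nearest.centroid = [c + (x - c) / n
                            for c, x in zip(nearest.centroid, features)]

        # Ask the owner only once, after the sound has recurred enough.
        if nearest.label is None and nearest.count >= self.min_count:
            nearest.label = self.ask_user(features)
        return nearest.label
```

In this sketch a real system would pass in audio features and prompt the owner by voice; here the callback is any function that returns a name. The key design choice is that the question is asked a single time per group, which is what keeps the effort low for the owner.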
The paper points to the growing number of smart devices that are useful assistants but cannot connect their environment to the requests they receive from users. With Listen Learner, all it takes is a brief conversation, and the device will know for the future what that incidental sound means. It’s easy to envision how this could be a boon to users at home. For instance, the AI might note the occurrence or absence of sounds before a human does, informing them of a leaky faucet or the lack of power to the fridge before it becomes a bigger issue. It would arguably extend the smart home to far more than just the connected devices and smart appliances. Voice assistant developers would also get something from smart homes using Listen Learner or something similar. More information helps make for smarter systems, and with the microphone always on, the data feed would be enormous compared to having it on only when activated by a wake word.
But those benefits have to face a genuine question of privacy. Microphones that always record what is happening around them often make people uncomfortable. It was less than a year ago that Apple and every other voice assistant developer had to respond to reports that contractors were listening to audio recordings for quality control and improvement programs. Apple and others had to pause their contractor programs and revise how they operated in some form. Apple even apologized for its program. The paper’s authors recognize that issue and offer possible solutions.
“While our acoustic approach to activity recognition affords benefits such as improved classification accuracy and incremental learning capabilities, the capture and transmission of audio data, especially spoken content, should raise privacy concerns,” the authors wrote. “In an ideal implementation, all data would be retained on the sensing device (though significant compute would be required for local training).”
On-edge processing, where all of the device’s computing is handled locally rather than in the cloud, is more feasible than ever for Apple after it recently acquired AI startup Xnor.ai for a reported $200 million. Xnor.ai builds low-power machine learning technology that operates without requiring a connection to the cloud, exactly what the paper recommends. That approach is becoming more popular as technology develops. Speech tech developer Sensory debuted a customizable voice assistant specifically for smart home appliances that operates entirely without the cloud, and cloud-free operation is also a feature of the Picovoice platform.
That said, a smart speaker system that is always listening is not going to appeal to the more privacy-minded consumer base. Apple may have better luck applying the system to an enterprise vertical, where, theoretically, the only things it will hear are work-related. Even then, the sense of being spied on is going to be tough to overcome.