Yes, Humans Behind Alexa and Other Voice Assistants Are Listening to Some User Utterances. The Technology Requires It to Get Better.
Bloomberg carried the salacious headline Wednesday, “Amazon Workers Are Listening to What You Tell Alexa.” Despite the expertly crafted clickbait-style headline, it is a well-researched article that even includes interviews with Amazon employees in Romania, along with a subheadline that brings a bit more balance to the story: “A global team reviews audio clips in an effort to help the voice-activated assistant respond to commands.” But who reads subheadlines?
The opening paragraph carries over the sensationalism of the headline including the statement, “Millions…are reluctant to invite the devices and their powerful microphones into their homes out of concern that someone might be listening. Sometimes, someone is.” This ominous statement is followed by the very levelheaded commentary:
“Amazon.com Inc. employs thousands of people around the world to help improve the Alexa digital assistant powering its line of Echo speakers. The team listens to voice recordings captured in Echo owners’ homes and offices. The recordings are transcribed, annotated and then fed back into the software as part of an effort to eliminate gaps in Alexa’s understanding of human speech and help it better respond to commands.
“The Alexa voice review process, described by seven people who have worked on the program, highlights the often-overlooked human role in training software algorithms. In marketing materials, Amazon says Alexa ‘lives in the cloud and is always getting smarter.’ But like many software tools built to learn from experience, humans are doing some of the teaching.”
Yes, all of the voice assistant providers are reviewing some user utterance data. That is because humans, not machines, judge the quality of a voice interaction. A voice assistant can only ascertain the quality of an interaction from the number and type of logged errors, or the lack of them, and that doesn’t mean the user received what they wanted or had a rich experience. Humans are a critical quality check on the performance of all voice assistants and automated speech recognition (ASR) systems, and were long before Alexa launched in 2014. The key question is not whether humans listen to voice assistant interactions, but whether the providers of these solutions enforce safeguards that protect user privacy.
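The workflow the Bloomberg article describes, sampling a small slice of utterances, stripping identifying details, and routing them to human annotators, can be sketched roughly as follows. This is a generic illustration with hypothetical field names (`user_id`, `audio_ref`, `asr_transcript`), not any provider’s actual pipeline:

```python
import hashlib
import random

def anonymize(record):
    """Replace the account identifier with a one-way hash so reviewers
    never see who spoke (hypothetical field names, for illustration)."""
    token = hashlib.sha256(record["user_id"].encode()).hexdigest()[:12]
    return {
        "utterance_id": token,
        "audio_ref": record["audio_ref"],
        "asr_transcript": record["asr_transcript"],
    }

def sample_for_review(records, rate=0.01, seed=0):
    """Randomly route a small fraction of utterances to human annotators."""
    rng = random.Random(seed)
    return [anonymize(r) for r in records if rng.random() < rate]
```

Only a sampled, de-identified fraction ever reaches a reviewer; the annotated transcripts then become training data for the next model iteration.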
The Voice Industry Reaction
Voicebot reached out in person and on social media to gauge the reaction of people in the industry, and looked at that bastion of unconstrained consumer opinion, Reddit, to understand public sentiment. Not surprisingly, most people in the industry found the article interesting, thought the headline was clickbait-style, and were thoroughly unsurprised by what was reported, since they already knew this is how voice assistant quality assurance works. That’s not to say everyone was unconcerned, as you will see below.
Completely makes sense from a technical perspective. From a consumer perspective disclosure is appropriate. Most would still make the privacy trade for utility.
— Mark Phillips (@mark_voicefirst) April 11, 2019
It’s disappointing, since I do believe that privacy and the responsible handling of data are important topics, and the media should accurately frame these issues correctly so that consumers can better make decisions around privacy vs convenience
— Tim McElreath (@TimMcElreath) April 11, 2019
That is right it is a very unpopular opinion but I think it’s actually how probably 99% of the people really think.
that’s why #DeleteFacebook was nice to say but how many people do you know that really deleted Facebook Instagram and WhatsApp In 2018?
I probably know 1-3 🤔
— Gurvinder Singh (@Govithinks) April 11, 2019
Freddie Feldman, voice design director at Wolters Kluwer, commented:
“On my team, we find the article particularly interesting because we are in the midst of a big ‘IVR Call Recording/Auditing’ project here. We’re not able to just flip the switch and record calls/sessions as they do on the voice-assistant platforms. We have HIPAA and PHI to contend with. Listening to actual calls, in their entirety, is really necessary to ensure a positive and engaging experience for our users. When you hear the disclosure on an IVR call that says ‘this call may be recorded and monitored for quality control purposes,’…we really mean it. Personally, this auditing by Amazon/Google/Apple should come as no surprise to anyone and really shouldn’t be of much concern either. It’s all de-identified and you do agree to these terms when you set up the devices. Machine Learning isn’t magic, it needs to be constantly trained and it can’t do that on its own.”
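Feldman’s point that machine learning “needs to be constantly trained” is usually measured concretely: annotators produce a reference transcript, and the gap between it and what the ASR system heard is scored as word error rate (WER). A minimal, generic sketch of that standard metric (not any vendor’s implementation):

```python
def word_error_rate(reference, hypothesis):
    """WER: word-level edit distance between a human reference transcript
    and the ASR hypothesis, divided by the reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edits to turn the first i reference words
    # into the first j hypothesis words
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,          # deletion
                           dp[i][j - 1] + 1,          # insertion
                           dp[i - 1][j - 1] + cost)   # substitution
    return dp[len(ref)][len(hyp)] / max(len(ref), 1)
```

Driving this number down across accents, rooms, and phrasings is the whole point of the human review the article describes.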
Alexis Hue also weighed in from France:
Big headline but not much of a surprise tbh. If you want to train your model, you somehow need human touch. The real question is, ‘how well is it handled?’ when it comes to privacy.
However, not everyone was unfazed by the Bloomberg story.
In fairness, you wouldn’t be at these companies. I love the idea of a noble knight stopping the craziness, but that isn’t reality. Many strong, intelligent and ethical people (engineers/scientists) have left these companies because of these practices after failing to change them.
— Noelle 💕 (@NoelleLaCharite) April 11, 2019
A growing majority of people actually believe online privacy is in crisis: https://t.co/WMBUlaDkFr
— Michael Cholod (@MichaelCholod) April 11, 2019
The Consumer Reaction
Slightly more surprising was the low level of consumer outrage, or at least its absence on Reddit in forums of device owners. Maybe consumers are a bit more sophisticated about these things than Bloomberg headline writers assumed. Granted, this is a select group and not the general populace. However, this Reddit group has its share of Alexa superfans and disgruntled users, so we often see more negativity there. You could find some highly negative comments about Amazon and the story, but these tended to appear in fringe subreddits.
I am certain some people will suggest that this headline and story will fuel consumer outrage at big tech and surveillance commerce, and that users will then unceremoniously unplug their smart speakers. That is unlikely to happen based on past consumer behavior around privacy concerns. However, the article raises several critical questions. Should the voice assistant providers be more forthcoming about how they improve performance? Do they need more policy and technical safeguards to ensure consumer privacy? What is a company’s obligation if someone reviewing a user utterance believes a crime has been committed? Will consumers be more concerned if the utterance reviews take place in countries with different cultural backgrounds and legal frameworks for protecting consumer privacy?
There may not be a single privacy violation “death blow” that undermines consumer confidence in voice assistants. However, each incident chips away at confidence and could cumulatively inhibit adoption. We haven’t necessarily seen that with smartphones, but it appears there is some impact on specific social media providers. A good outcome of the sensationalist headline would be for each of the leading voice assistant providers to take a fresh look at what it is doing today to protect privacy and determine whether it needs to strengthen its practices or disclosure. Bob Stolzberg of VoiceXP agrees and said this week:
“I think it would be nice to see it open a wider conversation about how all of the platforms handle audio data. It’s no secret Google has done the same thing with audio data collected from their audio services besides Google Assistant in the past.”