On Voice AI Politeness Part II
This essay continues from On Voice AI Politeness Part I.
So, let us get back to our two core questions: How polite should a human being be with a voicebot, and how polite should a voicebot be with a human being?
In my opinion, the answer to the first question is straightforward. The human being should be able to behave in any way that they wish to behave, with their only concern being to have the voicebot do what they want it to do and do it as quickly or as slowly as they want it to do it. If the human wishes to use “Please” and “Thank you” and other politeness markers, then the voicebot should accommodate such markers. If not, then the voicebot should accommodate their absence. If the human wants to bark orders, they should be able to bark orders. If they are less pressed for time or are in the mood to be cordial, or they want to exercise their manners or their vocal cords, then they should be able to do that too.
As of this writing, both Amazon and Google have deployed “politeness mode” features that detect whether or not the user is “acting politely” and react accordingly. For instance, in Alexa’s “Magic Word” mode, if the user asks Alexa a question but does not include “Please,” Alexa responds with, “What’s the magic word?” and provides the answer only if the user then says, “Please.”
Google adopted a carrot strategy, in contrast to Amazon’s stick. In their “Pretty Please” mode, if a user says “Please” or “Thank you” in their requests to Google Assistant, they will be “rewarded” with what Google calls “delightful responses,” which are verbal acknowledgments of the user’s politeness. For example,
Human: “Hey Google, please set a timer for 5 minutes.”
Assistant: “Thanks for asking so nicely. Alright, 5 minutes. Starting now.”
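As a rough illustration of the mechanics behind such modes, here is a minimal sketch of a politeness-marker detector with a carrot (reward) and a stick (withhold) policy. The function names, marker list, and canned responses are my own invention for illustration, not Amazon’s or Google’s actual implementations:

```python
# Surface markers a hypothetical voicebot might treat as "polite."
POLITENESS_MARKERS = ("please", "thank you", "thanks")

def is_polite(utterance: str) -> bool:
    """Return True if the utterance contains a politeness marker."""
    text = utterance.lower()
    return any(marker in text for marker in POLITENESS_MARKERS)

def respond(utterance: str, answer: str, mode: str = "carrot") -> str:
    """Apply a carrot (reward) or stick (withhold) politeness policy."""
    if mode == "stick" and not is_polite(utterance):
        # Amazon-style: withhold the answer until the user is polite.
        return "What's the magic word?"
    if mode == "carrot" and is_polite(utterance):
        # Google-style: prepend an acknowledgment of the politeness.
        return "Thanks for asking so nicely. " + answer
    return answer
```

Note that the carrot policy, by construction, always produces a longer prompt than the plain answer, which is precisely the problem discussed below.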
The deployment of both of these features was a reaction by these companies to parents who became concerned that Amazon Alexa and Google Assistant were teaching their children impolite behavior: they watched as their children boorishly bossed their voicebot around by barking commands at it and felt that such behavior was turning the children into uncouth social creatures.
The problem with these features, in my opinion, is threefold. First, as we mentioned in Part I, forcing children to confer on voicebots the same degree of courtesy and politeness that they must observe when engaging humans may cheapen the value of these acts of politeness, for both givers and receivers of such expressions: if you say “Thank you” and “Please” to a machine, how much is that expression really worth to you, the giver, and to me, the receiver?
Second, in the case of Amazon’s Magic Word, having a robot withhold cooperation unless and until you behave in a respectful manner may teach the user (especially a child) to believe not only that these robots can occupy a moral high ground but that they can coerce users to change their behavior because they occupy that moral high ground. It is one thing for a system to refuse to let you in until you have typed the right username and password. There is no moral high ground there: there is only the necessity of protecting your data. In the case of the voicebot that insists on polite behavior, the insistence is purely about forcing you to comply with social practices as a precondition to engaging with you. If the user accepts this arrangement, why would they resist the next time the voicebot refuses to engage with them unless they subscribe to some ethic (e.g., the ethic of loyalty to the brand), value, or ideology?
Third, in the case of Google’s “Pretty Please,” where the strategy is to reward rather than to punish, the voicebot reacts to polite behavior by answering with language that acknowledges the user’s “good behavior.” In other words, the voicebot rewards the user with a prompt that is longer than the one they would have heard had they not behaved politely. As we will argue below, in the context of human-to-voicebot interactions, a long response is often less, not more, polite than a shorter (but not too short) one. In fact, it is not hard to imagine Google’s “Pretty Please” feature causing users to stop behaving politely because they find its longish expressions of praise and approbation tiresome, condescending, and, most crucially, needlessly wasteful of their time.
On the flip side, voicebots most definitely must behave politely towards human beings. But what we mean by politeness here is not the sprinkling of “please” and “thank you” but rather a behavior that goes to the heart of what politeness is all about.
Politeness, at its core, is about three things: respecting the other; treating the other as a unique person (that is, being smart and using information about the other and about the context of the conversation to help them); and being consistent with the other, which is to say, engaging with a degree of integrity.
A voicebot that treats users with respect is more likely to win the cooperation of its users than one that does not. Here are some instances of how the voicebot should behave respectfully towards the user:
Respect the user’s time: for instance, avoid having the user suffer through long prompts; proactively tell the user how long they need to wait for an agent; offer to the user the option to be called back; let the user interrupt.
Respect the user’s freedom: let the user opt out if they don’t want to interact with the voicebot; let them return to the voicebot while they wait for an agent.
Don’t lie to the user: for example, don’t tell them that you are going to route them to a human and then have them interact with the voicebot.
Don’t blame the user: in cases where an error occurs, the voicebot should always take the blame.
Never terminate an interaction unilaterally: the act of ending a conversation unilaterally is the ultimate act of disrespect in the context of a conversation. Always make sure that the decision to end the dialog is consensual.
Tell the user what you are going to do. For instance, the voicebot should always tell the user that it needs to pause the dialog interaction for a few seconds to execute a back-end action (e.g., retrieve something from the database); it should always tell the user that it is transferring them to an agent.
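To make the last few points concrete, here is a hypothetical dialog-manager fragment (all names invented for illustration) that announces what it is about to do before pausing, and that states the wait and offers a callback before transferring. The `say` parameter stands in for whatever text-to-speech output channel the platform provides:

```python
def announce_and_run(say, action_description, action):
    """Tell the user what is about to happen, then execute the action."""
    say(f"One moment, please. I'm {action_description}.")
    return action()

def transfer_to_agent(say, wait_minutes, offer_callback):
    """Respect the user's time: state the wait and offer a callback."""
    say(f"I'm transferring you to an agent. The current wait is about {wait_minutes} minutes.")
    if offer_callback:
        say("If you prefer, I can have an agent call you back instead.")
```

The point of the sketch is structural: the announcement always precedes the pause or the transfer, so the user is never left wondering what the voicebot is doing.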
As for treating the user as a unique person, the best way to win a user’s respect, and therefore their cooperation, is by acting intelligently. Here are some examples:
Know the user’s preferences: if the customer has selected English in previous interactions, don’t keep asking them what language they wish to use every time they engage. Note the language preference, remember it, and default to it.
Know the user’s level of expertise: treat frequent users who know the voicebot differently from first-time or infrequent users.
Anticipate the user’s requests: if a user has recently placed an order or submitted a ticket, chances are that they are calling to inquire about that order or ticket. Offer them its status the next time they call.
Detect and act on request spikes: if the voicebot is experiencing a sudden spike in interactions, have it adapt its behavior in light of that spike. For instance, if the first three weekdays of every month see a spike in users asking about their checking balance, then during those days have the voicebot volunteer the user’s balance before proceeding to the main menu.
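The first three behaviors above amount to keeping a small profile of the user and consulting it before speaking. The sketch below is one hypothetical way to do that; the `UserProfile` fields, thresholds, and prompts are assumptions for illustration, not any vendor’s actual design:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class UserProfile:
    language: str = "English"          # remembered from prior interactions
    call_count: int = 0                # crude proxy for the user's expertise
    open_order: Optional[str] = None   # a recent order or ticket, if any

def greeting(profile: UserProfile) -> str:
    """Tailor the opening prompt to what we already know about the user."""
    profile.call_count += 1
    if profile.open_order:
        # Anticipate the likely reason for the call.
        return f"Are you calling about order {profile.open_order}?"
    if profile.call_count > 3:
        # Frequent caller: skip the tutorial-style preamble.
        return "What can I do for you?"
    return "Welcome! You can ask about orders, billing, or support."
```

The language preference works the same way: because it is stored on the profile, the voicebot defaults to it rather than asking again on every call.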
Finally, consistency. Nothing unnerves a user more than an irrational machine. Every inconsistency in the voicebot’s behavior will occasion the user to ask, “Why is it behaving like this? Did I miss something, or is this thing just badly designed?” Such questioning can only hurt the user’s confidence in the voicebot’s ability to help them solve their problem. Consistency matters in at least three dimensions:
In language: be consistent in how you refer to objects, properties, and actions across prompts and menus. Don’t use “ticket” in one prompt and “case” in another, “incorrect” in one and “invalid” in another, or “log in” in one and “sign on” in another.
In modality: if the user can speak their answer in part of their engagement with the voicebot, don’t take that ability away from them in some other part unless you explain to them why you are taking it away from them.
Across contexts: if users are responding to an infomercial and the infomercial tells viewers that by calling the line they will reach a sales agent, then make sure that the voicebot does not offer options that have nothing to do with sales, e.g., offering to connect them to the help desk or to billing.
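Terminological consistency in particular lends itself to automated checking. The sketch below is a hypothetical lint pass (not part of any real toolkit) that scans a set of prompts for mixed synonym pairs like the ones just mentioned:

```python
# Synonym pairs that should never be mixed across one voicebot's prompts.
SYNONYM_PAIRS = [
    ("ticket", "case"),
    ("incorrect", "invalid"),
    ("log in", "sign on"),
]

def find_inconsistencies(prompts):
    """Return the synonym pairs whose two terms both appear somewhere
    in the prompt set, i.e., likely terminology inconsistencies."""
    found = []
    for a, b in SYNONYM_PAIRS:
        uses_a = any(a in p.lower() for p in prompts)
        uses_b = any(b in p.lower() for p in prompts)
        if uses_a and uses_b:
            found.append((a, b))
    return found
```

Running such a check over a voicebot’s full prompt inventory before deployment catches the “ticket” vs. “case” class of inconsistency mechanically, rather than relying on a reviewer’s ear.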
I will end by saying this: voicebots are an evolving technology that is here to stay, and these are still the early days; we can only guess where voicebots will be in, say, five years. When voicebots begin not only to detect and consider the emotional states of the humans they interact with, but also to behave emotionally themselves (not as a result of some emerging singularity, but because they have come to behave truly intelligently at the human level), we will probably have to reconsider the politeness equation. Until then, I think it is best to keep a clear head about how we should engage with voicebots and how voicebots should engage with us.