Video Shows Google Duplex Making a Real Restaurant Reservation – Says it is Google Calling and the Call Will Be Recorded, Full Analysis
One of the first demonstrations of Google Duplex at the I/O developer conference in May involved making a restaurant reservation. Sundar Pichai, Google’s CEO, said it was an actual call, but it was not clear if that was one successful call among 100 failures or if it represented a typical call. Google Duplex started rolling out this week to “select” Pixel smartphone owners in San Francisco, New York, Atlanta, and Phoenix. A reporter at VentureBeat tested it out and captured a video of the incoming call to Cafe Prague in San Francisco. You can watch the call below.
Natural Dialogue That Sounds Like Human Speech
We included the transcript below and numbered each line of dialogue to make the exchange, and our comments about it, easier to follow.
Google Duplex(1): “Hi, I am calling to make a reservation for a client. I’m calling from Google, so the call may be recorded. Can I book a table for tomorrow, please?”
Restaurant Hostess(1): “Yes. And what time would you like to reserve?”
Google Duplex(2): “At 7 p.m.”
Restaurant Hostess(2): “Okay. Perfect. How many people?”
Google Duplex(3): “It’s for two people.”
Restaurant Hostess(3): “Okay. Two people tomorrow at 7 p.m. What’s your name?”
Google Duplex(4): “The first name is Anna.”
Restaurant Hostess(4): “Anna. Okay. Thank you.”
Google Duplex(5): “Okay.”
Restaurant Hostess(5): “I’ll see you tomorrow. Thank you.”
Google Duplex(6): “Thank you very much. Thank you. Bye-bye.”
Restaurant Hostess(6): “Bye-bye.”
The exchange took about 37 seconds, and my first impression is that the voice sounded very humanlike. Both the quality of the voice and the speech pattern made it seem that a human was making the call. On a longer call, it would likely become more apparent that it is not a person, but in this limited exchange for a routine transaction, it has the same type of natural quality that Google demonstrated at I/O. The key difference here is that this demonstration was not pre-packaged. It was conducted live by real people in the wild.
In exchange (3) Google Duplex answers the question of how many are in the party by saying “It’s for two people.” The more efficient robot-like response would be “two” or “two people.” However, Duplex offers a complete sentence that includes a contraction. In exchange (5) Google Duplex says “Okay,” to a comment by the restaurant hostess that didn’t require a response but would typically evoke a natural comment of some sort from a human. And, in exchange (6), Google Duplex says, “Thank you” twice and ends the conversation with the polite and informal, “Bye-bye.”
This doesn’t appear to be the type of call that will infuriate recipients because they are speaking with a computer. The call was fast, efficient and polite, and it sounded a lot like other restaurant reservation conversations between humans. Granted, this could be different if there were complications, but those situations are also more challenging when humans are involved.
Duplex Says Call is From Google and it “may be recorded”
The other notable aspect of this call is that Duplex announces that the call is from Google and that it “may be recorded.” This is necessary because in 12 states, including California, recording a call requires consent from both parties. Google Duplex must record the call so it can apply the AI to analyze what is said in real time and formulate an appropriate response. In theory, this recording could be retained for only a few moments and then discarded, but from a legal standpoint, it does constitute a recording and therefore requires consent.
Google initially indicated to Voicebot that Duplex would not be available in the 12 states that require all-party consent, but later changed course and decided to preface the call with the indication that it “may be recorded.” Of course, the implementation demonstrated in this video does not ask for explicit consent, but instead offers notification and relies on implied consent when the call recipient continues the call. You may also notice that the notification was spoken quickly and was followed without pause by the request. It was clear that the hostess from Cafe Prague was a willing participant in the demonstration, so she was likely to continue no matter how the notification was delivered. However, because the notification is delivered so quickly and the request follows immediately, people are likely to pass over the recording point and simply respond to the question.
This formulation actually differs from the one included in Google Duplex advertisements earlier this year, which said, “This automated call will be recorded.” The current formulation does sound less officious and appears to get the job done.
An Assistant That Does Work for You
Google Duplex goes well beyond the idea of a voice assistant fetching driving directions, playing a song or answering a question. It actually goes out into the world and does something for you. Duplex acts as your agent interacting in the world on your behalf. Restaurant reservations may seem like a trivial step, but it is a first step that will lead to far greater capabilities over time. And, Google Duplex seems to work pretty well even in an uncontrolled environment. Let us know on Twitter if you’ve had a chance to try it out.