Correct Call-to-Action Recall by Users is Twice as High for Human Voices as Synthetic for Voice Apps
You may think that consumer preference for human over synthetic voices in voice apps is simply an aesthetic concern. Data from the new report, What Consumers Want in Voice App Design, suggests it may affect much more than style. Voices.com, Pulse Labs, and Voicebot.ai collaborated to design a study that could dispel some of the mystery behind voice user experience (VUX) design by attributing data to design choices.
In one test, we played dialogue to voice app users from two different synthetic voices using text-to-speech and one human voice. Each user heard only one of the examples. For the synthetic voices, we varied the length, with the shorter clip measuring 25 seconds and the longer clip 49 seconds. The human voice actor used the script for the longer content, and that recording also ran 49 seconds.
Across a panel of 240 consumers, correct call-to-action recall among voice app users who heard the human voice was more than double that for either of the synthetic voices. The longer synthetic voice dialogue actually performed a little better than the shorter variant.
You can download the full report, with 17 charts in 19 pages of analysis.
Brands Should Take Notice
Dylan Zwick, co-founder and chief product officer for Pulse Labs, immediately pointed out the importance of these findings for brands. “We didn’t know what to expect for this experiment and were surprised that using a human voice had such an impact on recall. That’s an important lesson for brands who want to make sure their voice products are remembered.”
As you dig deeper into the data, you can see that only 26.8% of users who heard the long dialogue synthetic voice even noticed there was a call-to-action (CTA). That means nearly three out of four of those users didn't notice the CTA at all. By comparison, 40.1% of users who heard the short dialogue synthetic voice and 42.5% of those who heard the human voice noticed it.
There may be an equivalent of banner blindness that arises in linear audio. We can call it CTA deafness. It is interesting that the longer synthetic voice seemed to suffer from this phenomenon more acutely. It is also intriguing that users who heard the short dialogue synthetic voice were nearly as likely as those who heard the human voice to recognize that a CTA had been communicated, yet so many of them recalled it incorrectly.
“As synthetic voice technology improves, we’re hopeful about its capacity to make voice production scale more cost-effectively, but in the short term, you can’t beat a good old fashioned voice over talent. Humans win again…for now,” said Brandon Kaplan, CEO of Skilled Creative, who also supported the study.