Voice Assistant’s Growth Flywheel

Improving Voice Assistant User Experiences

October 21, 2015 – Marty McFly walks into his home in Hill Valley and tells his large 100” flat-screen TV, “Art off. Ok, I want channels 18, 24, 63, 109, 87 and the weather channel.” The giant-screen TV immediately displays the channels Marty asked for by voice.

This was science fiction back in 1989 when “Back to the Future Part II” was released. It became a reality in 2015 when the Xfinity X1 voice remote was introduced. Voice assistants, as we know them today, have been around for 10 years, since the introduction of Siri in 2011. But have they really lived up to the expectations for mass adoption?

In the 21st century, 10 years is ample time for a new technology to either leap into mass adoption or fade into oblivion. Remember how the utility of the iPod and the entire portable MP3 player category was eclipsed by smartphones within a span of 10 years.

User experience on voice assistants leaves much to be desired

The early adopters of voice assistants, the tech enthusiasts (including me), still find voice assistants falling woefully short of providing an impeccable user experience. Major challenges include:

  • far-field voice assistants work far worse than near-field voice assistants
  • noisy environments make recognition harder
  • discovery of new functions or voice skills is still tedious
  • the voice assistants expect commands in a certain vocabulary and are still quite far from the level 5 AI assistant – “the assistant can help me figure out what I need.”

The other day I asked one of my voice assistants, “Start core training,” and it responded with, “I cannot help you with starting the car.”

Customers expect more for less

Here is a bold and provocative statement – customers do not want to pay for using a voice assistant. They expect the voice experience to be embedded into core products such as mobile phones, wearables and cars. Yes, they do need to buy a smart speaker, but it’s well known that smart speakers are heavily subsidized, and monetization potential is quite limited. On the other hand, building a voice assistant that is rich in experience for the customer (either completely on your own, using off-the-shelf technology components, or as a skill on the popular voice platforms) requires a substantial investment in budget, people and continuous fine-tuning.

In the early years of voice assistants, the topic of data privacy got a lot of attention due to the mishandling of private voice data by some of the popular voice assistant platforms, which later forced them to impose strict measures when it comes to data privacy.

The flywheel concept developed in the book “Good to Great” describes the process of relentlessly pushing a giant, heavy flywheel turn by turn, building momentum until a point of breakthrough beyond which the flywheel starts to spin on its own. All the above-mentioned challenges with voice assistants got me thinking – is there really a flywheel for 10X growth for voice assistants? To answer this question, we need to turn each of the challenges into an opportunity to start building momentum.

Invest in high-quality voice experiences

On the quality of voice experiences, there is a big ongoing debate about what the killer features of a voice assistant are. The answer is not so straightforward. It depends heavily on the location of usage and the type of device.

  • Audio content (music streaming, radio, podcasts) and Q&A dominate on smart speakers, which are mainly stationary at home
  • Navigation and calling/messaging others are popular on mobile-native and car-native voice assistants on the go
  • Interacting with TV content is the obvious use case for voice assistants on TV
  • Voice assistant usage on wearables and in mobile apps is just gaining traction
  • Voice assistants on telephone lines and in public places are still being explored

Alexa Skills and Google Actions, with hundreds of thousands of voice use cases, have failed to expand the reach and increase the active regular usage of voice assistants. The focus seems to have shifted to increasing engagement (voice skill quality) rather than expanding features (voice skill quantity). Despite the challenges, investing in high-quality killer features for your customer’s voice experience is key to getting the flywheel to start spinning.

Maximize distribution across as many customer touchpoints as possible

To get the flywheel to gain momentum, it’s important to bring your voice assistant to as many free-to-use touchpoints as possible. Existing distribution channels (e.g., the embedded voice assistant in a mobile app with millions of active users) grow the installed base faster than developing new distribution channels (e.g., the voice assistant on your own hardware).

Building and delivering a high-quality voice experience to customers is not a one-off project but requires continuous fine-tuning and improvements. This requires substantial investment in terms of budget and people. To justify the required investment, the cost of building and maintaining the voice assistant should be lowered. This can be achieved by utilizing near/offshore tech talent and by raising the efficiency of the underlying technology infrastructure through a combination of open source and a best-of-breed, multi-vendor tech stack strategy.

Guiding customers through the capabilities of the Voice Assistant

With a high-quality voice experience on multiple free-to-use touchpoints, you have only crossed half of the well, i.e., you have made your voice assistant accessible to your customer. The explosion of voice features and skills has led to a massive discovery problem. People usually get lost in the complexity of using a voice assistant: remembering the exact command, knowing whether a skill is first-party or third-party (not knowing the difference sometimes causes data privacy issues), and keeping up with feature improvements over time.

There was a time when one of the voice assistants that I use would neither recognize Indian names while calling nor recognize many Indian music titles while playing music. After the initial tries, I gave up using those features. But by coincidence, I discovered some months ago that Indian name recognition and music catalog recognition have improved so vastly that I now find myself calling contacts with Indian names and playing Indian titles more often. Hence, we can say assistants are on their way to meeting the requirements for the masses to use them on a regular basis.

Voice assistants are bad at selling themselves. You need to create massive product awareness. The best way to educate customers is through “in-the-moment recommendations” about features and how to use them. While your customer is habitually zapping through channels using the remote control, you could show a tip saying, “Hey, did you know that you could just push the voice button and talk?”
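
As a rough illustration of such an in-the-moment recommendation, here is a minimal Python sketch. Everything in it is an assumption made for the example: the `TipState` counters, the `on_channel_change` function and both thresholds are hypothetical and do not come from any real TV platform. It simply shows a tip after several consecutive button-based channel changes and caps how often a customer sees it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TipState:
    """Per-customer counters (hypothetical fields, for illustration only)."""
    consecutive_button_presses: int = 0
    tips_shown: int = 0


# Assumed thresholds; a real product would tune these from usage data.
PRESSES_BEFORE_TIP = 5
MAX_TIPS_PER_CUSTOMER = 3

VOICE_TIP = "Hey, did you know that you could just push the voice button and talk?"


def on_channel_change(state: TipState, used_voice: bool) -> Optional[str]:
    """Return the tip text to display, or None if no tip should be shown."""
    if used_voice:
        # The customer is already using voice: reset the counter, stay quiet.
        state.consecutive_button_presses = 0
        return None

    state.consecutive_button_presses += 1
    due = state.consecutive_button_presses >= PRESSES_BEFORE_TIP
    allowed = state.tips_shown < MAX_TIPS_PER_CUSTOMER
    if due and allowed:
        # Show the tip once, then start counting again.
        state.consecutive_button_presses = 0
        state.tips_shown += 1
        return VOICE_TIP
    return None
```

Calling `on_channel_change(state, used_voice=False)` on every channel change would surface the tip on the fifth consecutive button press, and the cap keeps the hint from turning into a nuisance.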

When building the feature set of your voice experiences, pay close attention to how each feature will be promoted to your customer. Devoting sufficient development capacity to in-product marketing of the voice experiences (notifications, reminders, on-screen tips), on top of the usual marketing channels (social media campaigns, email marketing, etc.), is key to creating stickiness for the customer.

Getting the flywheel to spin on its own

The combination of

  • high-quality voice experiences
  • on as many touchpoints as possible
  • with massive product awareness

will lead to the flywheel starting to gain momentum. This is the only way to get massive reach and engagement in the form of active usage. Answering the question raised at the beginning of the article, yes, there is a growth flywheel. The tricky part remains gaining acceptance from customers and businesses so that the flywheel continues to spin on its own.

With higher active usage, voice assistants will become a part of the customer’s daily life, thereby creating sticky consumer behavior that leads to brand awareness for the business. This, in turn, increases the likelihood of businesses continuing to invest in voice assistants to make them better, given the tangible ROI in terms of customer experience impact. A large customer base actively using the voice assistant helps you identify new features and improve performance, feeding into the “invest in high-quality voice experiences” stage of the flywheel.

Traditionally, enterprises tend to look for a solid business case before investing in voice experiences. The outcome of the business case could be a positive customer experience for millions of customers or a positive financial impact. Monetization through voice assistants has remained a pipe dream so far, but this could change with the wider adoption of voice commerce & voice advertising and with the adoption of business/enterprise use cases (e.g., customer care).

There is no “easy to copy” playbook for reaching a massive scale with voice assistants in the consumer domain. Piggybacking on the massive scale of their smartphone operating systems, Apple and Google have their voice assistants installed on billions of devices. But it’s no secret that an installed base does not equate to active regular usage of the voice assistants. Amazon and Google have achieved massive scale through hardware sales (smart speakers), but it’s well known that the scale comes at the cost of heavy hardware subsidies, and active usage is still low relative to the overall number of devices sold. Everyone needs to carve their own path towards scale, overcoming the challenges for adoption.

By actively lowering the production costs and by allocating the costs over the larger user base gained through increased active usage, the cost per user can be kept substantially lower. Lowering the cost per user makes further investments by the business to improve the voice assistant justifiable. Massive scale in terms of active usage leads to customer acceptance, while lower costs per user, combined with tangible customer experience and/or financial impact, lead to business acceptance.
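
The cost-per-user argument is simple arithmetic, which the short Python sketch below makes concrete. All figures are invented purely for illustration (they are not data from any real voice assistant), and `cost_per_active_user` is a hypothetical helper name.

```python
def cost_per_active_user(annual_cost: float, active_users: int) -> float:
    """Spread the yearly cost of running the assistant over its active users."""
    if active_users <= 0:
        raise ValueError("active_users must be positive")
    return annual_cost / active_users


# Hypothetical starting point: high cost, modest active usage.
before = cost_per_active_user(annual_cost=2_000_000, active_users=500_000)

# Hypothetical later state: production costs lowered, active usage quadrupled.
after = cost_per_active_user(annual_cost=1_500_000, active_users=2_000_000)

print(f"before: {before:.2f} per active user")  # before: 4.00 per active user
print(f"after:  {after:.2f} per active user")   # after:  0.75 per active user
```

In this made-up scenario, the two levers together cut the cost per active user from 4.00 to 0.75, and most of the drop comes from the larger active user base rather than from the cost reduction.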

Once we have reached the threshold of gaining customer and business acceptance, further investments in building more high-quality voice experiences will follow, making it a virtuous cycle. This sets the flywheel spinning and puts voice assistants on a growth trajectory.

Getting back to the question, have voice assistants lived up to expectations for mass adoption? The answer is no. Is there a flywheel that could result in 10X growth for voice assistants? The answer is yes. The next 5 years will tell whether voice assistants manage to break the “regular use stickiness” barrier and set the flywheel spinning.