The Top Voice AI Stories of 2021
Looking back at 2021, it was a different kind of year for voice AI than in the past. We could have said that in 2020 as well, but that was true on every societal and economic level due to the pandemic. However, we saw the seeds of change in the voice AI industry in late 2019. That change paused briefly and then accelerated in 2020, which set up 2021 to be the year of enterprise adoption.
Voice AI news, innovation, and investment were largely driven by consumer applications in the period 2016 – 2019. Amazon, Google, and Apple (and occasionally Samsung) dominated the headlines related to the technology. First, that was driven by the interest in devices such as smart speakers, smart displays, and voice interactive smart home products. That was followed by applications for media, driving, and other daily activities. However, 2021 was very different. Consumer voice interactive devices and applications, with just a couple of exceptions, were receiving incremental updates without much fanfare or material advances. The tech giants continued to invest billions of dollars in their voice assistant portfolios, but the attention clearly shifted.
The Voice AI Year of the Enterprise
You will see in our top stories countdown that enterprise voice AI innovation dominated the headlines, investment landscape, and attention from the industry. Much of this began during the pandemic but its origin was earlier. In 2020, the Voicebot Podcast ran a series about custom (brand-owned) assistants that were launching or already operational at the time. Most of those projects began in 2018 or 2019 and were predicted by Voicebot’s 1,000 assistants hypothesis and GOWN model in early 2019.
The pandemic also introduced new pressures to support functions and many digital interactions with customers. The volume of questions accelerated at a time when companies were ill-prepared to manage the increase, with workers unable to come to work and lacking access to the antiquated tools most contact centers were using a year ago. That led to the rise of chatbots for customer service and a few voice assistant projects founded on the promise of automating customer interactions. The pandemic was an accelerant for a trend that was materializing prior to 2020. We see the impact of that acceleration in the activity of 2021 and in what you should expect to see next year.
With that context in mind, here are my candidates for the top 10 voice AI stories of 2021. Let me know if you agree or disagree on Twitter (@bretkinsella). What did I miss? What did I just get wrong? Bring it on.
Number 10 – Amazon Introduces an Alexa Robot
While devices have not been the top story of voice AI in 2021, Amazon’s Astro Robot deserves a spot in the top 10. The in-home robot segment has struggled as smart speakers and ambient solutions usurped many of the use cases robots were supposed to fulfill. What is the rationale for an in-home robot when you have low-cost voice assistant enabled microphones in every room in the home? That remains to be seen. However, Amazon has long believed it needed a robot with eyes for Alexa to take the next step-change in intelligence accumulation.
The Astro Robot was not expected this year even though there had been earlier leaks about a secret project. Astro looks like a pretty capable and interesting product for the home, but it is not clear robots will become more than a novelty for device aficionados. Amazon has demonstrated strong capabilities in delivering products that consumers adore and doing it at a competitive price point. Maybe Amazon’s Astro will succeed where its forebears have failed.
Number 9 – Val Kilmer Gets a New Synthetic Voice
Val Kilmer lost his natural speaking voice due to throat cancer. Sonantic gave him a synthetic replica based on recordings from his career in film. This was just one of the more high-profile stories around synthetic voices, which were a big story overall in 2021 and merit the top 10 designation as a category. Veritone, Replica, and Supertone all had well-covered announcements in the space, and many startups raised funding.
The rise in virtual humans, the emergence of metaverses, speech quality improvement for a variety of commercial applications, and use as voice prostheses for people with vocal medical conditions all led to increased attention to this segment that had previously taken a backseat in voice AI. This looks like a market to watch in 2022 as well provided the companies in the space can figure out a sustainable growth model that so far has been elusive.
Number 8 – Cerence Introduces Drive 2.0 and Cloud Services
Cerence delivered a new platform for interactive services, including voice assistants, that was a significant departure from its legacy architecture. Drive 2.0 is an extensible platform that includes easy integration to Cerence and third-party cloud services and onboard applications. Automakers have been interested in a more flexible solution for integrating outside services for some time and were increasingly looking for integration with popular third-party consumer services.
While other competitors in the automotive voice assistant market also serve other industry and consumer segments, Cerence is 100% focused on automaker solutions. This year, the company also extended its solutions to two-wheelers to significantly expand the market for transportation-oriented voice solutions.
Number 7 – Ford Commits to Android Automotive OS
It looked like Cerence was about to once again wrap up the in-car voice assistant market that it previously dominated for two decades. SoundHound’s automotive momentum that began in 2018 appeared to be waning, as was Amazon’s Alexa adoption by automakers. Groupe PSA made a deal in 2020 to adopt Google Android Automotive OS (GAAS), but the automaker was not displacing its incumbent providers so the commitment was unclear. In February, Ford said it would include GAAS and Google Assistant across its product lines in 2023.
This was a big win for Google. It appears Cerence will still have a place in the Ford tech stack, but the deal was important for Google because it includes the OS plus Google Assistant for another big automaker. It also opens up competition once again for the role of Tier 1 supplier of automaker voice assistants. Amazon has not backed out of the market and SoundHound is still there, but GAAS looks like a much more credible competitor to Cerence entering 2022 than it did at the beginning of 2021.
Number 6 – Social Audio Explodes onto the Scene
Social audio was a big story in the early part of 2021, and Discord’s move to add the feature helped boost its valuation to $15 billion in September. But the biggest social audio story of the year was Clubhouse. It quickly amassed tens of millions of users and a $4 billion valuation.
Some people have questioned whether social audio is really a voice AI story. It certainly has a life outside of AI technology. However, the space is complementary and a new source of growth for voice AI tech providers. We saw Clubhouse implement spatial audio in 2021, widespread use of speech-to-text transcription, the introduction of a voice assistant by Tinkoff that helped manage social audio rooms, and sentiment analysis of conversations on the platforms. The social audio fervor of the first half of 2021 subsided later in the year, but it remains an important new social media channel and one that is tailor-made for voice AI technology solutions.
Number 5 – Smart Speaker Sales Flat in U.S. while UK Growth Continues
A widely overlooked story of 2021 was the abrupt slowdown in U.S. smart speaker sales. Voicebot’s Smart Speaker Consumer Adoption Report 2020 found that over 34% of U.S. adults had access to a smart speaker. The Smart Speaker Consumer Adoption Report for 2021 showed a rise of less than one percentage point, to 35%, a year later. Voice assistant impact goes well beyond smart speakers, but the devices have dominated news coverage and activity in the space since 2016.
This data preceded leaked internal Amazon memos that chronicled a precipitous drop in smart speaker sales and concerns among executives about device usage rates. At the same time, the UK rose to 38% ownership among the adult population. Smart speakers are an important new device in consumer life but they are no longer the driver of voice assistant adoption.
Number 4 – Baidu’s Xiaodu Valued at $5.1 Billion
Our highest-ranked consumer product and application story was Baidu’s Xiaodu division receiving a $5.1 billion valuation in an August financing event. Xiaodu is the Baidu equivalent of Amazon’s Alexa portfolio. It includes the Xiaodu voice assistant, smart displays, smart speakers, smart wireless earbuds, smart TVs, and other smart home devices based on the DuerOS platform.
“A deal with Midea, China’s largest smart device manufacturer, to allow Xiaodu’s smart speakers to control Midea’s more than 70 million appliances,” has further expanded the voice assistant technology’s reach. Baidu’s separation of its voice assistant portfolio into a separate operating company has offered a rare opportunity to assess the value of these product lines as the U.S. tech giants go to great lengths to obfuscate the operating costs and revenue.
Number 3 – SoundHound Announces $2 Billion SPAC
SoundHound was the first unicorn in the voice AI industry that achieved a $1 billion valuation in a private financing round. That occurred way back in 2018. In November 2021, the company announced it would go public using a special purpose acquisition company to bypass the IPO process. That event is expected to raise up to $244 million in new financing and set the company’s new valuation at $2.1 billion.
The market for enterprises to have their own custom voice assistant has led some companies to consider SoundHound as a white label option to get to market faster. SoundHound made waves three years ago with a number of automotive client contracts and more recently has made progress in media, most notably with recent Vizio and Netflix deals. The newly reorganized business also expects to shift its business model from licensing to more subscription and ad monetization over the next five years.
Number 2 – Verbit Surpasses $2 Billion Valuation
Transcription might not be the most exciting application of voice AI technology but it does represent a very large market. Verbit says that revenue in the industry surpasses $30 billion annually. That market size and torrid growth enabled Verbit to reach unicorn status and a $1 billion valuation during a $157 million funding round announced in June. Just five months later the company stepped up again to a $2 billion valuation and a $250 million financing.
This isn’t a technology story as much as a business model and application story. The application is transcription for specific industries such as higher education, law, and media. The business model goes beyond just providing technology to offer a complete service that integrates technology and human services to provide customers rapid turnaround transcription at 99.9% accuracy.
Number 1 – Microsoft Acquires Nuance
I know. Boring. When a giant enterprise software company buys another giant enterprise software company it can be hard to be excited. However, it’s hard to overlook the acquisition price of nearly $20 billion. Microsoft is the most important company of the technology shift to PCs that began in the 1980s. Nuance is the most important, and largest by revenue, company that grew up around the advent of voice technologies in the 1990s and early 2000s. Two legacy tech companies that have evolved significantly are now joining forces.
Aside from the deal size, this move is also significant because of the rationale behind it. After Nuance shed its automotive business with the spin-out of Cerence in 2019, along with a few other product lines, it was left largely with a healthcare and contact center practice. Microsoft made clear in its messaging around the acquisition that it was most interested in Nuance’s healthcare portfolio. Healthcare assumed center stage in voice AI during the pandemic and its adoption throughout healthcare is assured. Microsoft wants a piece of the action and didn’t want to build its business from scratch.