The voice industry visionaries have helped bring us to where voice assistants are today and are defining where we are headed. Our 2020 list is once again topped by Jeff Bezos who through force of will and billions of investment dollars has created a new market and prompted other tech giants to follow his lead, or at least accelerate their plans.
Adam Cheyer dropped down a couple of spots because he departed Samsung but stayed on our list due to his outsized influence to date and because he has remained engaged with the voice community. Our judges were clear that his place on the list should be secure at least through 2020. We did lose a couple of players largely because they shifted roles, departed the industry, or led a company that was acquired and now have a lower profile. At the same time, the large number of submissions from the Voicebot community prompted us to expand the list from 11 to 17 honorees. There were many well-qualified candidates that did not make the list. However, presented below is an even broader variety of top leaders and why they made the list in 2020.
To learn more about the selection methodology and honorees in the other categories, go to the Overview.
WHY HE MADE THE LIST // Google is so dominant in search, mobile devices, online video, advertising, and numerous other online services that it is hard to overstate the importance of Sundar Pichai taking up the competitive challenge set out by Amazon. In many ways, Google validated the voice assistant market for the home. Amazon Echo was no longer a novelty from that online shopping company once Google Home hit the market. It was a consumer segment with product choices. And, Google’s presence pushed Amazon to innovate and expand faster. It also effectively doubled the marketing spend around voice assistants, which quickly raised consumer awareness. Google acquired API.ai and transformed it into Dialogflow, created a broad portfolio of smart speaker devices, expanded Android Auto efforts in the car, and began refashioning search around the Assistant. That was followed by leveraging the global reach of Android and earlier voice initiatives for smartphones to introduce support for more than 30 languages and even expand into feature phones and other access channels. Of course, the introduction of Google Duplex further extended the types of services voice assistants could provide. The ultimate fast-follower in tech, Google was already reorienting around AI technology when Alexa started to get traction. Google Assistant gave the company a central focus for making everyday tasks simpler for consumers and offered a user interface that could easily integrate access to dozens of online services.
WHY HE MADE THE LIST // SoundHound built a global reputation for music recognition, but Keyvan Mohajer’s goal was always centered on speech recognition, natural language understanding, and creating a voice-interactive assistant that could understand you and get things done. After a decade of hiding within the popular SoundHound mobile app, the Hound assistant and Houndify development platform emerged as a white-label option for companies that want to launch their own assistants. Nuance had dominated the market for voice-interactive solutions for enterprises for a decade when Houndify showed up and started winning over auto manufacturers. SoundHound’s early success as the white-label voice assistant alternative resulted in over $200 million in venture capital funding and a billion-dollar valuation. Big companies bought into Mohajer’s vision, including automakers Mercedes, PSA Group, Honda, and Hyundai, along with Chinese appliance giant Midea, and French telecom provider Orange. More recently, mobile media app publishers Snapchat and Pandora have adopted SoundHound. The company had started pushing into hotels too before the global pandemic curtailed travel. SoundHound’s market position is under assault from tech giants and rising startups, so Mohajer’s biggest challenge yet might be to solidify and expand the company’s gains as the market for custom voice assistants accelerates.
WHY HE MADE THE LIST // Most of the people who exited the voice industry in 2020 fell off the Top Leaders list, but Adam Cheyer is an exception in multiple categories. Adam built his first voice assistant in the early 1990s while a Stanford Research Institute researcher and later was a co-founder of both Siri and Viv Labs. Siri was first an iOS app that wound up at Apple as a revolutionary feature in the iPhone 4s that launched the modern voice assistant era. Viv Labs became Samsung’s new Bixby in 2018. After two nine-figure exits, more than 25 years in the field, and a hand in two of the four leading consumer voice assistants globally, Adam has a rock-solid reputation among developers and remains influential even after departing Samsung in recent months.
WHY HE MADE THE LIST // Cerence just celebrated its first birthday with record new business bookings and revenue despite challenges brought on by the global pandemic and automotive industry slowdown. What was known as Nuance Automotive was spun out as a new publicly-traded company in 2019, and a voice giant was born. Sanjay Dhawan was not from the voice industry but had previous success running technology businesses and was tapped to lead the newly formed Cerence. Since then, Dhawan has made his mark by arresting SoundHound’s auto industry momentum and racking up several long-term contracts with global automakers. Cerence today has voice assistant technology in 325 million cars, owns over 1,400 patents, and supports more than 70 languages. That reach makes Cerence unique and among the top seven globally for voice assistant distribution. The company is also a leader in embedded voice solutions that can operate in the car without any cloud access and is integrating several complementary technologies into the in-car experience.
WHY SHE MADE THE LIST // Many people are aware that automatic speech recognition (ASR) often struggles with accented speech. What few people recognize is that ASR faces similar challenges when interpreting speech from children. The unexpected pronunciation and grammar combinations lead to higher rates of failed interactions than for adults. SoapBox Labs is focused on solving this problem. Dr. Patricia Scanlon earned a Ph.D. in speech recognition from University College Dublin in 2005 and later was a lecturer at Trinity College and researcher for Bell Laboratories and Alcatel-Lucent. She founded SoapBox Labs in 2013 after realizing that her own young children faced challenges interacting with voice assistants. SoapBox delivers customized ASR technology built for performance and privacy. Scanlon put children on the map for voice assistants, and the company is now working with several high-profile educational institutions and toymakers.
WHY HE MADE THE LIST // Sensory is the little giant of voice technology. Its solutions have shipped in more than three billion devices worldwide, and it may just be the wake word king. AT&T, Garmin, Google, GoPro, LG, Mattel, Microsoft, Samsung, Sony, Waze, and even some tech giants they cannot name publicly have all embedded Sensory’s low-power, small-memory-footprint, on-device voice recognition products. Todd Mozer founded Sensory in 1994 to help people communicate with digital devices as easily as they do with each other. It’s fair to say he was ahead of his time. Sensory has expanded its portfolio in 2020 with the introduction of VoiceHub, a tool that will create a custom wake word voice model for you to deploy on devices in a matter of hours.
WHY SHE MADE THE LIST // Dr. Rupal Patel wants everyone to have their own unique voice. She recognized years ago that the speechless, those who cannot communicate with their own voice, were benefiting immensely from voice prostheses. These technologies transformed text into synthetic speech, enabling real-time conversation for many people for the first time in years. However, there was a problem. Back in 2014, the voice prostheses all used similar-sounding robotic voices. That is not a way to make someone feel like an individual. Backed by a Ph.D. in speech language pathology, a degree in neuropsychology, and 16 years as a professor and academic at Harvard and Northeastern University, Patel founded VocalID in 2014 to provide unique voice prostheses that could be customized to each individual’s sound. That work has extended from serving patients to brands that want to have their own customized synthetic voice sound. VocalID can even clone the voice of an individual.
WHY HE MADE THE LIST // Igor Jablokov is best known for selling Yap to Amazon. Yap was a speech recognition business that, among other use cases, pioneered voice mail transcription for mobile wireless carriers. That meant users could read their messages instead of listening to them. The business was growing quickly in 2011 when Amazon swept in to acquire Yap for its automatic speech recognition (ASR) engine. Yap served as a foundational technology for the yet-to-be-launched Alexa voice assistant. After spending a few years as an Eisenhower Fellow, Jablokov founded Pryon in 2017 as a voice-interactive augmented intelligence platform for enterprises. Pryon was in stealth mode for its first couple of years but emerged in 2019 with a $20 million funding round and then with publicly available products in 2020. The company is building a true assistant for accessing information in enterprises, one that can be automatically trained on multiple data sources in a few hours or days using unsupervised learning techniques.
WHY HE MADE THE LIST // Dr. Ben Goertzel is not often cited as a voice industry leader, but that may be about to change. A leader in artificial intelligence for two decades, Goertzel led the software engineering team behind the world’s most famous robot, Sophia, from Hanson Robotics. Sophia included the entire NLP stack in addition to the facial expressions that gained global notoriety. He also has published scientific research papers related to voice interaction, including a treatise on unsupervised grammar induction earlier this year. A new collaboration with Hanson Robotics led to the launch of Awakening Health, which intends to offer a robot with a human face to care for isolated older adults. And, SingularityNET is a distributed network for facilitating interactions between AI agents. Some AI agents are likely to be voice assistants, not tied to Amazon, Apple, or Google, that will benefit from connections to other AI-based agents or services.
WHY HE MADE THE LIST // Oren Jacob has been mostly quiet since his company PullString was acquired by Apple in late 2018. However, his team’s work is expected to manifest as part of Siri’s most significant update since launching nearly a decade ago. Jacob began shaping how people think about voice interaction around the time of Siri’s launch on the iPhone in 2011. His 20-year career making movies at Pixar offered a different perspective about voice assistant interactions than that of the engineers and designers focused on utilitarian applications and efficient, short engagements. He co-founded PullString (originally named ToyTalk) in 2011 and went on to raise over $40 million in venture capital. Along the way, PullString worked on high-profile projects for Mattel, Amazon’s Grand Tour, and SpongeBob and launched the Converse SaaS solution for building, deploying, and maintaining conversational apps for Alexa and Google Assistant. Everyone is anxious to see how Jacob’s vision of deeply immersive voice experiences will be rolled out with Siri in 2021.
WHY HE MADE THE LIST // Audioburst is transforming consumer access to radio and podcast content. The company’s software ingests millions of minutes of audio content every month, and the AI engine transcribes the recordings, automatically annotates them with topical categories and other contextual markers, and segments them into short topic-specific “bursts” that range from a few seconds to a few minutes. From an hour-long podcast or even a brief radio news program, dozens of bursts, or short segments, might be extracted that users can listen to in sequence. Audioburst aggregates this content across thousands of podcasts and radio shows every day in real time, making it searchable moments after it is broadcast. Hirsh founded Audioburst in 2014 with the idea of transcribing and indexing all audio content to make search and discovery as easy for audio as it is for text. The natural medium for voice queries is audio responses, but most content today is visual or text-based. Voice assistants need content for users. One way to generate that content is to harvest the audio content already generated each day. Audioburst has raised $25 million from Samsung Ventures, Dentsu, and Hyundai, introduced a consumer app in 2020, and updated its podcaster tools.
WHY SHE MADE THE LIST // RAIN was one of the first agencies to help big consumer brands establish an Alexa presence. Early customers included P&G, the NFL, Campbell’s, and Warner Brothers. By the end of 2017, RAIN had already implemented Alexa skills for more than 30 companies. That’s when new CEO Nithya Thadani’s vision for voice took shape. While RAIN formed as a digital agency with a track record deploying for the web and mobile, Thadani was the first agency head to pivot the entire business to interactive voice applications, dropping previous clients and projects in other areas. RAIN continues to be a leader in voice, recently working with Nike on a campaign where consumers could order a new shoe model by speaking with Google Assistant. The agency says it has worked with 23 Fortune 100 companies.
WHY HE MADE THE LIST // Verbit has had a good year. It is one thing to grow by more than five times over 2019, but the company turned heads by taking down a $31 million funding round in January and another $60 million just ten months later in October. Automated transcription using AI-based speech-to-text technologies has been around for decades. And, the accuracy has gotten better, sometimes even as high as 85%. However, many customers in the legal and academic communities require near-perfect accuracy to meet business or regulatory standards. Their only option has been using human transcribers, which is costly and often plagued by long turnaround times. Tom Livne’s solution was to use both: build accurate AI with automated speech recognition that could streamline the transcription process and build up a network of thousands of human transcribers that could log into a platform and quickly edit transcriptions to bring them to 99.9% accuracy. Verbit is doing this faster and at a lower cost than legacy service providers. And, you can tell by the product roadmap that Verbit may not be satisfied merely disrupting the $30 billion transcription market.
WHY HE MADE THE LIST // 2020 has been a busy year for Deepgram. The company raised $12 million in funding in March and then added more capital with an investment from In-Q-Tel three months later. June saw the launch of a free enterprise platform, and in August, the company rolled out a tool that can teach other AIs how to recognize speech. Dr. Scott Stephenson comes to the voice industry indirectly by way of post-doc research into dark matter. While two miles underground, he created a rudimentary speech recognition and transcription tool to record his conversations with a colleague. Dissatisfied with what he’d built and what was in the market, Stephenson set out to apply the machine learning expertise he had honed in particle physics research. That led him and the Deepgram team to build ASR and NLU within a single platform based on a deep learning model that increases speed, accuracy, and processing efficiency. In most systems today, these core engines of NLP are separated. Data is handed off between them, which means some context is lost between the models. By keeping everything in one model, Stephenson says Deepgram has improved performance and cost.
WHY SHE MADE THE LIST // Jargon was among the second class of companies in Amazon’s Alexa Accelerator back in 2018. Initially, the company planned to provide an on-demand conversational translation service for voice apps, but the founders ultimately shifted focus to building a content management system for conversational AI. Milkana Brace saw that there was a proliferation of voice platforms and hypothesized that many organizations would need to support multiple platforms and modalities. Jargon helps companies “separate content from code” to enable more flexibility and easier management of the content voice apps and voice assistants utilize to respond to user requests. Jargon has raised funding from Amazon, Crosslink Capital, and Ubiquity Ventures, among others, including a $1.8 million seed round in 2019. Brace is a veteran of both Groupon and Expedia, two companies that know a lot about content management complexity.
WHY HE MADE THE LIST // Navigation is one of the top voice use cases in the car and on smartphones. However, location naming is inconsistent around the globe and filled with redundancy, homonyms, and a lack of precision. Chris Sheldrick co-founded what3words to address this problem by assigning a three-word location designation for every 3-meter square on the globe. This is a revolution in places like Mongolia, Brazil, and other countries where many dwellings don’t have an official address. It is also an innovative solution for converging on a location that might be a field without an address or resolving the issue of two places with a duplicate address. The service was originally built to operate by text input, but Sheldrick recognized that what3words was an optimal solution for navigation requests to voice assistants and added a voice API for customers in early 2020 with partner Speechmatics. what3words claims customers ranging from Mercedes, Ford, Mitsubishi, and Tata to the Mongolian Postal Service. The company has raised more than $70 million in funding to date. Voice assistants need high-value content and services for users, and what3words certainly fits well into a voice-first world.