Voice assistants exist today because of the endless hours and keen insights of many researchers, software developers, computational linguists, machine learning experts, mathematicians, and more. And, once the core technologies were developed, they still needed to be improved and then molded into applications that end-users could access and engage with reliably. The technology behind voice assistants has come so far because the technologists have brought us so much innovation. Many have also taken those innovations and formed them into useful applications that people now use billions of times every month.
Topping our list this year is a leading AI researcher who has focused his recent efforts on taking NLP to a scale never before attempted. Ilya Sutskever and OpenAI may not be the technology behind a generally available voice assistant today, but the architecture supporting GPT-3 may soon be incorporated into the high-performance virtual assistants of tomorrow. Not surprisingly, also topping our list are leading technology innovators from Amazon, Google, Apple, Samsung, and Nuance. They are joined by new entrants from Rasa, Stanford University, Deutsche Telekom, Mercedes, and others on our 2020 list. We have 17 honorees in the technologist category this year, which was necessary as innovation leadership in voice AI continues to accelerate.
WHY HE MADE THE LIST // Rohit Prasad has taken an increasingly public role each year as Amazon’s vice president and head scientist for Alexa. In 2019, he introduced Alexa Conversations and the ability for Alexa to coordinate complex scenarios across multiple first and third-party skills. Amazon showed even more ambition in 2020 with the demonstration of natural turn-taking and the ability for Alexa to learn a user’s preferences to deliver a more personalized experience. That was followed by the announcement of latent goals detection, which can lead to context-derived suggestions to users that anticipate needs before a user makes a request of Alexa. Prasad joined the Alexa team as director of machine learning in 2013 prior to the public launch of the voice assistant. Since that time, Amazon has transformed the voice assistant market and typically exceeded consumer expectations with the rapid delivery of new features, which sets the pace for competitors who then look to match Alexa’s progress in their own subsequent releases.
WHY HE MADE THE LIST // Brad Abrams is not the highest-profile Google executive working in voice, but he is the one whom software engineers most often cite as someone they respect and appreciate for both the quality of Google Assistant and his early engagement with the developer community. Abrams is Group Product Manager for the Actions teams and is responsible for developer experiences on Google Assistant. The Assistant has evolved quickly and added a lot of new features over the past year. One of the most notable is the strong focus on bringing Google Assistant functionality into third-party mobile apps for Android with App Actions. As the supplier of the world’s most widely adopted mobile OS, Google has over 1 billion device users it can now serve by linking the ease of use of Google Assistant with millions of mobile apps consumers use every day. Prior to working on Assistant, Abrams led efforts in Chrome and Google Cloud. He is also the author of several books, including the popular Framework Design Guidelines and .NET Framework Standard Library Annotated Reference.
WHY HE MADE THE LIST // John Giannandrea has mostly been quiet this year from a Siri perspective. He is Apple’s SVP of Machine Learning and AI Strategy and reports directly to CEO Tim Cook. Giannandrea joined Apple in 2018 from Google and was thought to be just what was needed to corral disparate and uncoordinated AI efforts. He now oversees the strategy for AI and machine learning across the company, which includes the development of Core ML and Siri technologies. Siri is quietly getting more features each year and enabling integration across devices, but the more notable updates have been cosmetic and not performance-related. It appears that 2020 was the year that Siri began to look different, and we are anticipating 2021 will be the year Siri starts to act differently and maybe even think differently. Apple has been accumulating startups with technology that could shore up Siri’s shortcomings, and as a former executive at Tellme Networks, Giannandrea likely knows just where the gaps reside. Plus, it is widely believed that Apple plans to introduce smart glasses in 2022. That product will need a more capable Siri to fulfill what are sure to be high expectations.
WHY HE MADE THE LIST // Nuance at one time was so dominant in speech technology that it had no peer. The company achieved this through its own innovation and also by acquiring nearly all of the leading speech technology companies that arose in the 1990s through the early 2000s. It even spun out its automotive business in 2019 to create Cerence, which is the dominant global supplier of interactive voice technology for cars. Today, Nuance is a giant in voice technology for healthcare, financial services, telecommunications, and government, spanning customer service, enterprise workflow, and custom solutions. Nuance is to speech technology in the enterprise what Amazon, Apple, and Google are to voice assistants for consumers. Overseeing this formidable technology stack, which includes over 2,700 patents, $200 million in annual R&D, and a team of nearly 1,500 that serves 85% of the Fortune 500 today, is Joe Petro, executive vice president and chief technology officer. Petro joined Nuance in 2009 after several years with Eclipsys and Aspect Development.
WHY HE MADE THE LIST // At some point, important software segments get an open-source alternative to the proprietary systems that created the initial demand and proved there was value to be had. For conversational AI, Rasa may be assuming that role and separating from competitors. Rasa has raised more than $40 million in venture capital, including a $26 million round earlier this year led by Andreessen Horowitz. Rasa is building infrastructure and tools that help anyone build high-performance and context-relevant voice and chat-based assistants. As co-founder and CTO, Alan Nichol oversees the development of the free but proprietary Rasa X software and the paid custom Rasa Enterprise in addition to cultivating the open-source community development around the core Rasa open-source framework, which has had over 3 million downloads since the company’s founding in 2016. Rasa initially became popular for chatbot development, but it has quickly expanded its role in voice assistant projects over the past two years. Nichol earned a Ph.D. in engineering from the University of Cambridge, where he focused on developing machine learning algorithms for atomic-scale matter simulations. He also has a degree in chemical physics from the University of Edinburgh and was a research engineer at P&G early in his career.
WHY HE MADE THE LIST // Dr. Larry Heck has been working in voice technology for nearly 30 years, starting at the famed SRI back in the 1990s in the STAR Laboratory after earning a Ph.D. in electrical engineering at Georgia Tech. He served as a vice president for R&D at Nuance, was on the advisory board for Yap when it was acquired by Amazon, and spent time as a principal research scientist at Google and chief scientist for Microsoft Speech. Across his career, Heck had already worked with three of the four most recognized voice assistant technologies, which went into creating Google Assistant, Cortana, and Alexa, before arriving at Samsung to become president and CEO of Viv Labs after the departure of Dag Kittlaus. He also holds the title of SVP for Bixby North America and Samsung Research America. Today, Heck is responsible for shepherding the development and adoption of the voice assistant promoted by the world’s largest seller of smartphones. Bixby is widely heralded for its innovative architecture and capabilities, and Heck is helping it navigate a consumer technology segment that has become fiercely competitive, with the advantage of knowing that Bixby will soon be available through 500 million devices shipped annually.
WHY HE MADE THE LIST // Jovo is a popular open-source framework for voice app development. In fact, many experienced voice app developers have standardized on Jovo over the past couple of years. Its appeal began with the ability to build a voice app once and prepare it for publication on both Amazon Alexa and Google Assistant. Version 3 of the Jovo Framework expanded support to include Bixby, Raspberry Pi, web apps, and the ability to support custom assistant features. Jovo for web just landed, expanding the capabilities even further into web and chatbot development. Jan König and Alex Swetlow co-founded Jovo in 2017 just before arriving at Betaworks’ Voice Camp accelerator. Since that time, the company has strived to give voice developers tools to enable efficiency, flexibility, and consistency. Jovo was recently recognized by Google for support of the updated Actions Builder. König earned a Master’s Degree in industrial engineering from Karlsruher Institut für Technologie. Swetlow also graduated from Karlsruhe University with a Master’s Degree in computer science.
WHY SHE MADE THE LIST // Dr. Monica Lam is a computer science professor at Stanford University and faculty director for the Open Virtual Assistant Lab (OVAL). The lab is best known for developing the privacy-focused, open-source Almond virtual assistant. That said, Lam suggested in a conversation with Voicebot that one of the bigger breakthroughs from the lab is the ability to substantially reduce the cost of data labeling, a critical function for machine learning algorithms to optimize. The virtual assistant technology stack includes the Genie semantic parser generator, which auto-generates training data based on far fewer inputs than traditional methods. Her Stanford profile says Lam’s “research mission is to disrupt the status quo where centralized monopoly platforms are prevalent, and consumer privacy is compromised.” Lam earned a BS in computer science from the University of British Columbia and a Ph.D. from Carnegie Mellon.
WHY HE MADE THE LIST // A mathematician by training, Jeff Adams has been developing speech and voice technologies for more than two decades, including stints at Kurzweil Applied Intelligence, Lernout & Hauspie, and Nuance, where he was director of language modeling for eight years. He then went on to become vice president of research at Yap, which was eventually acquired and established as Amazon’s speech and language team. That group built the core speech recognition engine for Alexa and the first far-field ASR, a key technology enabling the smart speaker product category to emerge. He departed Amazon the week before Alexa’s launch in 2014 to found Cobalt Speech & Language, a specialty technical consultancy that helps companies tackle tough challenges in voice technology. He is also a founder and CTO of both Canary Speech and Omnibot.ai and is listed as an author on 21 patents.
WHY HE MADE THE LIST // Dr. Chris Mitchell is helping elevate sound recognition as a peer of speech recognition. While everyone is talking to their voice assistants, the research team at Audio Analytic is building sound recognition models that can detect non-speech ambient noises ranging from breaking glass and safety alarms to babies crying, and can even recognize whether you are in a crowded bar or on a seaside beach. A recent deal with Qualcomm promises to bring Audio Analytic’s acoustic scene recognition to millions of smartphones in 2021. Mitchell holds 31 filed and granted patents and is an associate lecturer at Anglia Ruskin University. Following the completion of his Ph.D. in sound information systems and signal processing, Mitchell received a Kauffman/NCGE Fellowship to investigate the commercial implications of his research, which included attending Harvard Business School and a brief tour with Cisco Systems. He then founded Audio Analytic in 2010 and serves as CEO. Today, the technology is applied to identifying sounds that can indicate a health or safety risk for people within a building, and that is expanding to lifestyle applications. Audio Analytic raised a $12 million funding round in 2019 and over $20 million to date.
WHY HE MADE THE LIST // Many people think Amazon and Google have the smart speaker and media player market locked up with Alexa and Assistant devices. Deutsche Telekom shows with its Magenta smart speaker and assistant that there is still room for others to offer consumer value with specialized products. Deutsche Telekom is a media giant in Germany and enables voice control for television interactions and entertainment search. However, the company didn’t stop with just a speaker or voice remote. It also wanted consumers to have access to local information and services through Magenta and even integrated Alexa into its smart speaker to address a broader set of use cases. Reghu Ram Thanumalayan is SVP of the Magenta voice program at Deutsche Telekom and the technical leader overseeing the solution. He led the program pre-launch and was instrumental in integrating both Amazon and SoundHound technology while also building out a custom NLU. Thanumalayan earlier spent 13 years at SAP in software engineering and leadership roles.
WHY HE MADE THE LIST // Automakers were among the first to embrace voice interfaces 20 years ago, and Mercedes has been the recent leader in rolling out a full-featured voice assistant that even supports multiple other popular assistants. The company even made a one-minute Super Bowl commercial that highlighted the benefits of voice control for 40 seconds before ever showing a car and then immediately showed off how the Hey Mercedes assistant works. Juergen Schmerder is director of experience AI at Mercedes-Benz R&D North America and leads the team that enables drivers to talk to their cars. As the second most widely used voice interface, cars are an important category for the spread of the technology, and Mercedes is setting the standard for others to match. Before his current role, Schmerder was a director at Sonos, where he oversaw the platform and integrations, including onboarding voice assistants such as Alexa. Earlier in his career, he spent 15 years at SAP.
WHY THEY MADE THE LIST // The Two Voice Devs took advantage of a no-travel 2020 to launch a new podcast and YouTube channel to talk about all things voice development. Mark Tucker is one of the first Amazon Alexa Champions and has spent years engaging with voice developers and end-users around the technology. Tucker is currently a senior technical director at voice agency RAIN. Allen Firstenberg is a Google Developer Expert (GDE) for the Assistant, IoT, and wearables. He is very active in the Google Assistant community, offering frequent advice to developers facing voice app (and policy change) challenges. He was also one of the first Google Glass users and continues to wear his to this day. Firstenberg is a project guru for Objective Consulting. Together, Tucker and Firstenberg exemplify the most supportive elements of the voice dev community and treat the space as a collaborative endeavor.
WHY HE MADE THE LIST // John Kelvie is the founder and head llama of Bespoken, which he launched in 2016 to automate testing and monitoring for Alexa skills and Google Actions. The company is a partner with Amazon and Google and has expanded its services in recent years to include other conversational voice and chat applications ranging from custom assistants in the car to contact center chatbots and smart home products. Kelvie began building voice interactive ads for mobile streaming apps in 2013 as CTO of XAPPmedia and learned the ins and outs of numerous speech recognition systems. After creating some initial Alexa skills for XAPP clients, Kelvie immediately recognized that voice app developers were missing essential tools common in other development environments. That led to Bespoken and the first automated testing and monitoring solution for conversational AI apps.
WHY HE MADE THE LIST // BBC was one of the most enthusiastic adopters of Alexa skills and later Google Actions in the media industry. In fact, the team at BBC believes so strongly that voice assistant technology will transform media access, search, and consumption that it introduced a custom voice assistant called Beeb earlier this year. The Beeb also has a custom synthetic voice and distinct persona that reflects BBC’s sensibilities. Andy Webb is VP of product strategy and head of product for voice and AI at BBC. He led the team which developed an innovative architecture based on Microsoft technologies designed to serve up common user experience elements whether the BBC content is accessed via Alexa, Google Assistant, Siri, a web chatbot, or the Beeb custom assistant. While Mercedes is creating a reference implementation for automotive, BBC is performing the same role for the future of media with the Beeb and support for multiple voice channels.
WHY HE MADE THE LIST // Michael Myers is VP of Product at XAPPmedia and started out as lead developer for the voice interactive advertising products in 2013. He has experience deploying voice technologies from Nuance, AT&T, SRI, Alexa, Google Assistant, Microsoft Cortana, and many more. He even figured out how to build a custom NLU that would recognize the phonetic alphabet for aircraft pilots calling in flight plans and checking the weather. His team also built a solution that was able to bring over 1,000 Alexa skills live for media organizations and enterprises in under 18 months. More recently, XAPP has been focused on new technology, which the company refers to as machine teaching for conversational AI. It’s a set of workflow and automation tools that teach custom virtual assistants to understand and respond to users.