Over 100 Voice AI Predictions for 2021 from 50 Industry Leaders
This represents our fifth annual Voice AI predictions article and there is no question it is the most interesting and insightful to date. It is also the largest with over 100 predictions from 50 voice industry leaders. You will note that some of our guest contributors are confident enough to make multiple predictions. That may be smart. It increases the odds at least one of them will be correct. 😃
What is striking for our 2021 issue is the breadth of predictions and the interesting insights. The industry is simply more mature, has seen more, and has a better grasp on what is coming. I enjoyed reading this year’s predictions and am sure you will as well.
Custom Assistants, Mobile, Personalization, Multimodal, and Virtual Humans
Despite the breadth of topics covered, there are at least two topics that arose with meaningfully higher frequency than the others. Predictions related to the rise of custom assistants were mentioned by at least 11 contributors followed by an increased focus on voice solutions while on-the-go. Personalization, both in terms of the user preference and emotion recognition or empathy, and a rise in multimodal user experiences were next in line mentioned by about 10% of the contributors.
After that, a number of topics showed some popularity ranging from more rapid voice AI adoption in customer service (including virtual humans) to more growth in voice assistant features for audio media. It was interesting to see two guests (Audrey Arbeeny and Kirill Petrov) mention an expected rise of voice assistant and custom synthetic voices in games, a couple who are optimistic about Apple making a big Siri update this year (Brian Roemmele and Max Child), and how AR might spur voice adoption (Joan Palmiter Bajorek and Craig Sanders).
There were also some topics that were not as popular for predictions in 2021 as in earlier years. Voice commerce is almost completely absent but was mentioned by Bahubali Shete. Monetization comes up a few times but mostly in relation to disappointment about a lack of progress. How the big tech platforms (namely Alexa and Google Assistant) will change (Ralf Eggert and Steve Tingiris) was not particularly popular this year. However, discussion about how regulation might impact how Amazon and Google operate (Nithya Thadani and Malaika Paquiot) was new this year.
How to Read this Article
The predictions are organized by topic. Since many contributors listed more than one topic in their predictions, you might not agree with where I have placed them. I’m okay with that. This also means that you might find several of your favorite topics mentioned in other sections. You might just have to read a bit further or do a search to find what you are looking for. It provides more accurate context to keep the thoughts of each contributor together than to spread them out by topic. So, use the topic headers as loose guides as opposed to a map to precisely what is in each section.
We did very light editing of these predictions and even include many longer entries in their entirety because they were interesting. I recommend you read them all (particularly in the Everything Else category) and let me know which you think are just totally wrong. That would be fun. Or, you could tell me which are right but that might be less fun. @bretkinsella on Twitter is a good place to start. 😎
Custom Voice Assistants (Independent, Owned, etc)
Sanjay Dhawan, CEO, Cerence
I expect that we’ll see voice-powered interaction expand beyond the car, smartphone and smart home to new areas of mobility and other unexpected areas. I’m thinking voice-powered elevators, unique applications for voice in retail settings, and an increase in the usage of voice technologies in transit and travel. While there was already increasing consumer interest and affinity for these types of interactions, COVID-19 will accelerate adoption.
In addition, I expect that we’ll see better connection between mobility and consumers’ expanding digital lives. While advances in voice-powered technologies have been plentiful in recent years, there is still a gap to be bridged between the wide variety of services people interact with on a daily basis. From our viewpoint at Cerence, we’ll be focused on making sure the car doesn’t become a separate digital island but rather a place where a driver’s entire digital ecosystem is brought together with voice-powered access to everything they need, directly from the car.
Finally, as deep learning technology applied to voice AI continues to expand its capabilities, we will see voice assistants becoming more “humanized” in their interaction, going beyond purely conversational aspects based on the voice commands provided by the user and taking richer context into account. Voice assistants will be able to better grasp the state of the speaker using emotional cues as well as non-verbal expression modalities like gaze or gesture captured by the sensors and cameras that are starting to permeate more devices. Even further, voice assistants will leverage this context for proactivity, anticipating users’ needs and delivering relevant, helpful information and further elevating their role in users’ day-to-day lives.
Nithya Thadani, CEO, RAIN
In 2020, we’ve seen unprecedented scrutiny of big tech, but voice assistants have largely avoided the spotlight in this conversation. As cases play out in 2021 and the years to come, we predict that voice assistants will become a variable in antitrust litigation, or at a minimum, will be materially impacted as a byproduct of broader decisions regarding tech company power.
Placing greater restrictions on how voice assistant data is used or enriched across services could threaten quality of experience and personalization. It could also limit search and commerce experiences, which are largely dependent on an unrestricted flow of data to provide relevant results. The result? We expect to see continued investment by companies into owned voice assistants that exist outside of big tech ecosystems, allowing for greater control of data and the ability to operate across owned digital channels without restrictions.
David Low, Principal Consultant, Waracle
It’s been written about in Voicebot and elsewhere for 12-18 months, but I’m convinced 2021 will be the year of the domain specialist assistant and ultimately better forms of conversational service journey.
General purpose assistants can’t keep up with expectations either for users or brands, and need to allow the market to grow where expertise and demand exist. Contact centres are already geared up for service and for sectors like fintech, where a direct relationship between customer and provider is crucial for privacy and security, opening up a specific assistant would surely make existing devices far more useful and trusted by bridging that service tech with consumer devices. With things like the Voice Interoperability Initiative, I fully expect to see movement down this path in early 2021.
Ron Jaworski, CEO, TrinityAudio
There is no doubt the usage of voice assistants will increase as we will start seeing more and more personal assistants that are not based on Amazon Alexa, Google Assistant, and other major VAs. Projects like Almond and GPT-3 will enable voice enthusiasts to create their own voice assistant. I believe 2021 will be the turning point where we’ll go from the two leading voice assistants (I would not include Siri on the same level as Google Assistant and Alexa) to multiple voice assistants with a specific domain or expertise.
Braden Ream, CEO, Voiceflow
We’re going to see two major trends continue in 2021: the merging of chat and voice as one “conversational product/AI” team within enterprises, and the continued rise of independent assistants.
Milkana Brace, CEO, Jargon
We’ll see far more chat and voice custom assistants that support advanced multi-turn conversations. Assistants so far have primarily been deployed for Customer Support and while those will continue, we’ll see much wider use cases being enabled – from internal operations, to consumer-facing search and discovery, to commerce, to field support, etc. Conversations are the new platform that will redefine all digital experiences. We’ll also continue to see existing products and services adding a non-conversational voice layer for basic commands.
Kumar Rangarajan, CEO, Slang Labs
From a market perspective, In-App Custom Voice Assistants will see bigger growth and India will lead the way. 2020 saw the biggest e-commerce players in India like Flipkart, Amazon, Bigbasket, and JioMart either launch their In-App Voice Assistant or expand its use-case and language support. With the unprecedented growth in e-commerce during the pandemic year (the grocery segment, for example, saw a 4x jump during this year), and the tier-2 and 3 cities leading the way, this growth is fueled by newcomers to the online transaction world and all powered by mobile and web apps of the various brands. But this is still less than 10% of the overall market. 2021 is expected to see even bigger growth and e-comm players will look to provide better and newer layers of convenience to their customers, especially their first-time customers, to attract this deluge of incoming customers.
Multi-modal and multi-lingual In-App Voice Assistants will be a big part of that strategy (in addition to other strategies like adding more vernacular content) and more and more brands will start embracing this inside their apps. From a technology perspective, I expect the mechanism of adding In-App Voice Assistants will become as easy as adding Alexa skills. Voice Assistant as a Service will be a new buzz word in the market, with voice tech companies providing simpler and more developer friendly abstractions to add this tech into their apps.
Audrey Arbeeny, CEO, Audiobrain
I believe the gigantic rise in customized AI assistants, synthetic voices, and our ability to create realistic sounding voices for a fraction of the cost of recording traditional, authentic voices for hundreds of hours opens up not just huge developments but cultural changes.
One area in particular is the e-sports / game market. AI in gaming that can enable voice customization, custom character voices generated by the player, guidance, added enhancements to both the game experience and hardware will add to this already exploding culture, which will increase by billions the next couple of years. We did a lot of work in this area this year and on different platforms. That’s when I know something big is on its way. And, it was across the board: hardware, experience design, voice, music. It’s here.
Another area is advertising as we can create very realistic voices for brands now when only a handful had the funds to do this just a few years ago. So, in short, it’s all about the drop in cost and rise in personality to AI Voice Assistants.
Maarten Lens-FitzGerald, CEO, Open Voice
Beeb will launch and introduce the notion of non-Amazon, Apple, or Google voice assistants to the consumer as well as organizations who are gearing up to explore voice. This will feed the white label and other voice tooling industry like Rasa and Microsoft. As well as pushing the OVN and VII work around interoperability standards and guidelines.
Michael Zagorsek, COO, SoundHound
In 2021, spurred by the pandemic and the adoption of voice assistants for the home, consumers will be looking for easier, more convenient, and safer ways to interact. To that end, custom branded voice assistants will increasingly become accessible from a multitude of channels — breaking through traditional siloes where companies begin to offer the same voice assistant through their hardware device, their app, and over the phone.
Creating a cross-channel voice assistant will better connect customers to the brand while saving time and streamlining costs for organizations. The branded experience thus becomes the unified voice of the company, and customers will no longer experience fragmented services.
To achieve their goals, companies will begin to realize the benefits of specialized, customized voice assistants that better serve niche customer use cases and deliver more accurate responses through highly-accurate custom domain knowledge relevant to their business.
Voice Moves to Mobile Devices of All Sorts
Amir Hirsh, CEO, Audioburst
I think we’ll see a lot more usage of voice in apps and the automotive sector. Smart speakers rose in popularity this year while people were stuck in their homes. In 2021, as we (hopefully) start returning to workplaces and resuming our daily commutes, we’ll be expecting our mobile apps and cars to be voice-activated as well. Virtual assistants are like genies, they’re never going back into the bottle.
Amazon will begin to push, promote and streamline podcasts through Alexa, and we’ll see an increase in the popularity of listening to podcasts through their devices. Podcasts were a convenient third-party feature in Alexa, but following the September launch of a dedicated podcast repository within Amazon Music, I think they’ll begin to leverage Alexa in order to become one of the top three listening platforms.
Short-form audio clips will increasingly replace robotic text-to-speech responses to voice commands. Thanks to improved search accessibility, content discovery, and delivery, people will be able to hear originally-produced audio content responses instead of monotonous text-to-speech replies from their virtual assistants.
Susan Westwater, CEO, Pragmatic Digital
We are starting to see the rise of Voice beyond smart speakers and in 2021, I expect to see that continue, especially with mobile. Voice search will play a major role in demonstrating the value of Voice beyond the first-party utilities they know now. In order to be successful, Voice experiences must focus on user needs and enhance experiences that are relevant to the brand or business. I also expect to see more exploration into the “creative” side of Voice experiences so that we see experiences become more robust through the use of sonic branding, audio cues, and well-developed conversational copy.
I believe we will see a big increase in the use cases in voice on the go. This will be driven via enhanced functionality of hugely popular mobile apps like TikTok, the rise of dedicated apps for hearables, and developers embracing opportunities that functionalities like Google Assistant App Actions are creating.
Voice desperately needs a significant shift in consumer attention. We are on a steady rise, but there is this anxiousness for a leap moment. A lot of people say it will come with “the Killer App.” I personally believe this will happen because of convenience. It has to be easier to achieve goals via voice, ones that you previously had to use your phone for, and consumers have to start building habits around it. I believe that 2021 will be the year of building more habits in the use of voice on the go and it will in turn drive the usage patterns for voice in other situations.
Chris Sheldrick, CEO, what3words
People will start to use the microphones embedded in their Apple Watch (or other smartwatches) to much greater effect by making voice-powered mini-versions of phone apps. Having a microphone on your wrist is very convenient, especially as it’s so hard to type on a watch. At what3words we’ve seen fantastic take-up of 3WordGo (book Ubers by saying 3 words to your watch) and 3WordAuto (set your Mercedes destination by saying 3 words to your watch) and I am sure we’ll see many others making similar kinds of apps.
Roger Kibbe, Senior Developer Evangelist, Viv Labs
The voice first industry will keep going strong but where I predict massive increases in voice usage are in mobile apps. Voice is simply the very best or the most convenient input modality in so many cases that it will drive mobile app developers to adopt voice interfaces to augment their UI. I’d really like to see this happen on the desktop (I really want a voice enabled Excel and Photoshop) but that may be an outlier of a prediction for 2021.
Kirill Petrov, CEO, Just AI
Voice in mobile apps is the hottest trend right now and it will stay so because voice is a natural interface. Natural interfaces are about to displace swiping and typing. Voice-powered apps increase functionality saving us from complicated navigation, form-filling, overlaid menus, support, etc.
Voice Cloning: Machine learning tech and GPU power development commoditize custom voice creation and make the speech more emotional, which makes the computer-generated voice indistinguishable from the real one. Voice cloning becomes an indispensable tool for advertisers, filmmakers, game developers, and other content creators.
When talking about Conversational AI and gaming, one cannot fail to mention text-to-speech (TTS), synthetic voices, and generative neural networks that help developers create spoken and dynamic dialogue. In the upcoming year, developers will be able to use sophisticated neural networks to mimic human voices. In fact, looking a little bit ahead, neural networks will be able to even create appropriate NPC responses.
Smart TVs are an obvious placement for a voice assistant. With a smart assistant on your TV, you can easily browse channels, search for content, launch apps, change the sound mode, look up information, and much more, depending on the TV model.
We expect there will be more customized, more technologically advanced devices in 2021. Smart displays, like the Russian Sber portal or a Chinese smart screen Xiaodu, are already equipped with a suite of upgraded AI-powered functions, including far-field voice interaction, facial recognition, hand gesture control, and eye gesture detection. What’s next?
Personalization, Emotion Recognition, and Context
Pete Erickson, CEO, Modev
I predict that one of the major platforms will enable custom wake words. While this may seem like a simple prediction, it opens up a new world in terms of how voice assistants are used within the home. It also allows third party OEMs such as appliance companies and TV makers to customize the voice experience and align it with their brands. This will further support the move of the major provider to be a platform vs. a product.
Karol Stryja, Chief Commercial Officer, utter.one
As we got more distant from the people we like to spend time with over the last couple of months, I believe smart speakers will be able to play a huge role in the mindfulness space. Just imagine not only voice ID but also recognition of sentiments. – You seem to be sad today. How can I help you? That will be HUGE.
Multimodal Becomes Commonplace
Joan Palmiter Bajorek, Founder, Women in Voice / Head of User Research, NLX
In 2021, we will see more integrated, multimodal, user-centric experiences that are voice-enabled. Alexa skills and Google actions are great jumping-off points, but more evolved AR, video, behavioral analyses, etc. will be embedded, increasingly by default, across the board.
Tobias Dengel, CEO, WillowTree
2021 will be the year Multimodal gets real. Apps will become “voice powered” and the mic button will become ubiquitous in apps, allowing the user to “tell” the app what to do, e.g. “Get my usual” to the Starbucks app, or “Order me two pepperoni pizzas” to the Domino’s app.
Jan König, Co-founder, Jovo
The conversation is going to shift from voice to multimodal. Even though most voice-powered devices today offer a screen or other input/output capabilities, the design of those other modalities is usually an afterthought. In 2021, more companies will think about delivering truly multimodal experiences. Context-first, not voice-first.
Sarah Andrew Wilson, Chief Content Officer, Matchbox.io
What I’m about to say isn’t a mind-blowing prediction, but it’s a realistic one. Those of us in Voice AI like to talk about multi-modality to the point where it’s now a given for us — we assume the future is here, everything is multimodal and we’re all looking way beyond that idea. But the general public, as a group, hasn’t caught on yet. They’re getting there. And we’re all helping them get there, but it’s not a given that the general public understands and expects a multimodal experience in their daily lives. Yet.
So for 2021, my boring but realistic prediction is that the general public will become much more accustomed to choosing voice as one of several ways to interact with a product. Within the same product, people will discover that sometimes it’s easier to speak, while other times it’s easier to touch or type. They’ll realize that voice and the option to use voice isn’t just a novelty. And the result is that there will be even more products released that allow people to interact in the manner that suits them best.
Amy Stapleton, Co-founder, Chatables
Prediction A) An unknown company will gain international stardom by launching a voice-interactive digital human who convinces a majority of vaccine skeptics to get vaccinated. Prediction B) In an attempt to squelch misinformation, tech giants will quickly muzzle a voice-interactive digital human poised to convince a majority of vaccine skeptics to get vaccinated.
Oxana Gouliaeva, Innovation and Insights Manager, Accenture
With the on-going pandemic we will see even more vocal interactions between people, more audio content consumption, and an ever growing use of voice search, voice command and other voice-enabled features, such as smart home appliances, voice search on websites, tools reading out loud articles, emails, social media feed, etc.
While redefining their purpose and positioning to stay aligned with the values shift, brands will think audio assets/branding and pay more attention to the spoken audio content, such as podcasts.
The convergence of software agents, voice and conversational technologies, 3D, and VR/AR, supported by 5G, will lead to a further development of virtual characters acting as contextualized interactive storytellers and influencers on behalf of brands. This accelerated blending of the physical and digital worlds will be particularly visible in the gaming, entertainment and educational fields for a better engagement with Gen Z.
Customer Service and Enterprise Applications
Pat Higbie, CEO, xapp.ai
Enterprises are being squeezed by increased service requests and budget tightening combined with the need to improve customer service to remain competitive. Automated knowledge capture will emerge as a “game changer” in the adoption of conversational self service to address these issues across voice and chat channels with measurable return on investment.
Scott Stephenson, CEO, Deepgram
1. Real-time is finally real. Call center leaders have long acknowledged there is value in real-time monitoring of calls, but have not had the tools or talent to implement it. Just a year ago, companies adopting this technology were seen as cutting-edge, but this adoption cycle is accelerating. Looking ahead to 2021, I predict more companies will invest in and rely on this deployment method to ensure customer service, agent productivity and compliance issues aren’t slipping through the cracks.
2. The urgent shift to remote work has forced companies to quickly adapt, with most meetings, sales presentations, HR conversations, and more now taking place over video conferencing. As a result, we’ve seen more organizations invest in speech recognition software to transcribe conversations, identify insights and trends, and unlock value for the business. Looking ahead, I predict more enterprises will allocate budget to voice-enabled experiences—both agent and customer facing. At the same time, software providers will aggressively fund speech-related product developments to break through the noise and try to become the next big player in the CX technology space.
3. Call centers today are often composed of entry-level personnel who experience plenty of “bear with me” moments as they try to troubleshoot in real-time. The call center of the future, on the other hand, will automate the most mundane tasks—like resetting a password or changing an address—and appropriately direct more detailed questions to humans within those areas of expertise. I predict that by leaning on automation, call centers will have more budget to train teams on more nuanced tasks, improving agent retention and customer service for everyone as a whole.
Pete Haas, Founder, Conversation Curve
Expect large investment from organizations in the area of RPA ‘Robotic Process Automation’ for employee tasks that would have been considered too difficult to automate previously. Companies like Automation Anywhere and Blue Prism are well positioned to take advantage.
Bots will move beyond innovation groups within the enterprise. Companies that have been ‘experimenting’ for the last few years see the value, and will differentiate their products and services based on discoveries made during this learning period.
Malaika Paquiot, VP of Product, K4Connect
We will continue to see the impact of COVID-19 on work and that will result in the following developments in 2021:
1) People who can will continue to work remotely throughout 2021. The corresponding uptick in voice search and in the use of voice assistants and smart speakers, a result of the change in commuting habits, will continue, but at a slower pace as more of the workforce returns to work toward the end of the year.
2) As businesses look to cut costs, we will see increasing usage of deep fake technology for voices and so-called digital associates for marketing and education, corporate communications, and in assistive technology across industries. This will continue as these technologies continue to improve.
3) Increased usage of multimodal displays like the Amazon Echo Show in the healthcare industry. On the other hand, due to global civil unrest, and new U.S. leadership, expect to see greater scrutiny of AI technology, voice included, and its impact on marginalized communities. There will be an increased expectation that AI be used equitably and there may be regulations to ensure that is the case.
4) Finally, expect increased competition from Asian companies to fill the language gaps in voice assistants left by large players like Amazon, Apple and Google.
Giulio Caperdoni, Head of Innovation, Vidiemme
The increasing adoption of digital representation of sales agents: the digital double is capable of interpreting, integrating and supporting the agent and interacts with the end customer, replacing the agent in predefined tasks and processes, giving continuity to the relationship and relieving the agent from low value-added practices.
Voice and Audio – the Killer Combo
Dave Kemp, Director of Business Development, Oaktree Products / Host, FutureEar Radio
I continue to believe that Spotify will soon emerge as one of the most prominent and important companies relating to the voice industry. Spotify will enhance two of the most popular use cases for smart speakers and voice assistants – music and podcasts – by enabling much more robust search and curation. I strongly believe that media is the “killer use case” for voice assistants because assistants have the potential to provide a 10x better experience in accessing, consuming, discovering and sharing ambient media. With Spotify’s strong push into podcasts in 2020, I believe the next step to further strengthen its value proposition will be through voice assistant-enabled curation and discovery of its content catalog.
Steven Goldstein, CEO, Amplifi Media
We see continued growth of short form audio content which fits smart speaker use patterns. Ambient audio – streaming – will continue to do well but more content producers are recognizing the potential of short content as people look for quick hits in various locations such as kitchens and bathrooms.
Ambient, Omnichannel, and Device Independence
Jason Fields, Chief Strategy Officer, Voicify
I predict that the UX world will begin to expand their purview into voice significantly this year. Rooted in wide exposure to at-home use in 2020, I think UX professionals have awoken to the fact that their omni-channel designs must include voice/chat/AI or it is not a complete solution.
Dan Miller, Opus Research
Voice is fulfilling an important role in device-independent, “ambient” conversational services. That means that it is a ubiquitous, highly personal user interface at home, in cars and through smart phones.
Will Mayo, Founder and Chief Strategy Officer, Spoken Layer
I believe in 2021 smart assistants will begin their move to becoming truly ambient – right now assistants are tied to specific devices, or speakers or pieces of hardware. We are getting to a point where they will be able to be ambient in the same way that we experience wifi – no longer tied to a single device or router and then evolved to a mesh network. We are going to see assistants living outside their own hardware in 2021.
Voice Replaces Touch Due to Health Concerns
Brian Roemmele, Founder, Voicefirst.Expert
The challenges of the 2020 pandemic will unleash new use cases for Voice First systems. Everything from the obvious “touch-less” functions to much more friendly food ordering via voice will rise through 2021.
Apple will finally begin to show how Voice First will become the central user interface for an array of new Apple devices. From the Apple Visor AR/MR platform on to other devices telegraphed in 2021. The Cold Winter for Voice First I predicted here over two years ago will continue but begin to thaw.
The largest systemic issue is to solve discovery of “apps” and monetizing of “apps”. This is vital for a healthy, long-term developer ecosystem. The entire system must be replaced by a logical system that fits a Voice First ecosystem. If it does not take place, there will be a longer Cold Winter.
Privacy and security will continue to become a massive issue and current platforms continue to take the wrong advice on how this will play out. This will create a massive opportunity for independent companies that do not need any “cloud” to operate and they will begin to surface in 2021. Overall the Voice First sector will continue to be the fastest adopted technology in history even with the caveats.
John Campbell, Managing Director, Rabbit & Pork
I think we are going to see more companies invest in voice control out of the home, with the focus on being “Covid safe” so you don’t need to touch. For example, at points of sale, ticket machines and food ordering kiosks. It will be interesting to see if these voice first solutions work compared to the alternative solution of allowing the customer to control or connect to the kiosk using their phone.
Roy Lindemann, Co-founder and CMO, Readspeaker
COVID-19 is reshaping how companies interact with their customers by putting health considerations at the top of their priorities. I think that the pandemic will have a long-lasting effect on accelerating voice interactions to limit contact points that end users have with physical devices.
Alexis Hue, Founder & CEO, Voxalyze
I forecast two things happening: Alexa gaining even more ground into the home and at the same time starting to introduce voice ads. Potentially even interactive voice ads. This will have a major impact on skill developers as well as radio and broadcasters present on the platform as it will finally open up the opportunity of proper monetization.
Stas Tushinskiy, CEO, Instreamatic
I predict that voice advertising will evolve into a multi-step conversation between a brand and a consumer. We will witness the emergence of new voice ad supported verticals such as gaming and navigation.
Dr. Ben Goertzel, CEO, SingularityNET
In 2021, we may start to see neural-symbolic systems emerge as practical tools on the back end of home automation and voice assistant systems. We may see rapid acceleration in speech-to-speech machine translation, including for languages with relatively limited training resources, an advance with clear breakthrough implications for the developing world.
Voice assistants, robotic assistants and related tools will continue to invade further industries, including inroads into health tech and eldercare where the human resources involved have been massively overstretched due to COVID-19. The big question is whether 2021 is the year we see a breakthrough to open-domain NLP dialogue systems that actually understand what’s going on — or whether this will still be a couple more years into the future.
Dr. Patricia Scanlon, Founder and CEO, SoapBox Labs
2020 saw a massive growth in Kidtech: apps, web services and smart toys built specifically for kids. Interacting with technology is still a fairly clunky experience for kids as the UX attempts to accommodate for kids’ limited reading ability and complex thinking.
Kids are voice natives. They’ve grown up with smart speakers in their homes and have no inhibitions when it comes to using voice to interact with technology. Truly engaging and interactive experiences for kids need to be voice-enabled but to date, parental privacy concerns have held back adoption and in 2020 there are still few new voice toys and apps on the market.
Joe Petro, EVP and CTO, Nuance Communications
In more ways than not, the last 12 months have significantly altered the ways in which we interact, which has, in turn, changed our frame of reference, online habits, and created stronger preferences and expectations for digital experiences. In 2021, we will continue to see AI come down from the hype cycle, and the ability, claims, and aspirations of AI solutions will increasingly need to be backed up by evident progress and quantifiable outcomes. From this, we will see an alteration in the field that focuses more on specific problem solving and developing solutions that deliver real outcomes that translate into tangible ROI — not gimmicks or building technology for the sake of it.
Max Child, Co-founder, Volley
Apple will finally fix Siri. Not sure if that means “SiriOS”, but the acquisition of Giannandrea pays off with a major leap forward in day-to-day Siri quality.
Steve Tingiris, Founder, Dabble Lab
My prediction is that in 2021 we’ll see an increasing trend away from 3rd party skills, actions, capsules, etc, and a move to incorporate high-demand/high-value 3rd party functionality as 1st party functionality. This doesn’t necessarily mean the functionality won’t be provided by a 3rd party. But, from a user-perspective, it will appear to be native functionality.
Bahubali Shete, CEO, IOK Labs
Voice shopping will get into the mainstream. This will happen for two reasons. First, as voice adoption on smart speakers moves from innovators and early adopters to the early majority, and the new normal continues in 2021, more consumers will start using voice shopping from their kitchens. Second, as Google and Alexa start getting better at other native languages, that will generate more adoption and hence more usage for shopping.
John Kelvie, CEO and Co-founder, Bespoken
The most significant development is how companies like Rasa and Poly AI are putting close-to-the-metal NLP technology in the hands of developers. It’s the closest I’ve ever seen true research come to the mainstream in my 20+ year career in technology, and though I think the complexity has the potential to be overwhelming, it does allow developers to break through the performance glass ceilings they are bumping into with their current bots.
Hans Van Dam, Founder and CEO, Conversation Design Institute
As systems are trying to get more personalised and transactional, trust and security become bigger issues. Since we have no dedicated solutions for this, it will remain a challenge to truly unlock the potential of conversational AI. Companies are now committing themselves to AI Assistants and investing in technology and people. However, customers will not just hand over their details in a comfortable way if conversations and systems aren’t designed for trust and security. Solving the trust and security issue is a requirement to make AI Assistants personal and transactional at scale.
Ralf Eggert, CEO, Travello
Actually, it’s less of a prediction and more of a wish, but I’ll still formulate it as a prediction. In my opinion, the major platforms (Alexa, GA) will focus on quality in 2021 after the years of growth in first-party and third-party voice applications. There are simply so many skills and actions that were released years ago and are aging rather lovelessly without being improved or getting regular content updates. I think a major cleanup would be appropriate here. In addition, the Alexa Skill Store and the Google Action directory need a fundamental overhaul. For example, in the German Alexa Skill Store hardly any movement has been seen for a long time. Visible editorial support would be more than desirable here, both for the end users and for the operators of voice applications.
Ian Firth, VP of Products, Speechmatics
In 2021, AI Won’t Be Mapped on the Human Spectrum of Competence. We can have algorithms that crush any human at chess but are unable to make a cup of tea and computer programs that can perform mathematics millions of times faster than humans, but if asked who might win the next World Cup, they wouldn’t even understand the question. Their capabilities are not universal. We’ve reached a point with AI where we simultaneously overestimate and underestimate the power of algorithms.
When we overestimate them, we see human judgment relegated to an afterthought – a dangerous place to be. The use of a “mutant algorithm” in grading A-level results is the scandal du jour in the UK, despite the algorithm producing many results that simply violate common sense. When we underestimate algorithms, we see entire industries crumble because they didn’t see change on the horizon. How can the traditional taxi business compete when Uber’s algorithm can get you a ride in less than 3 minutes?
In 2021, expect engineers to avoid AI and algorithmic blunders by not trying to map algorithms onto the human spectrum of competence. Using AI technologies – such as any-context speech recognition – to enhance what humans can do and finding the right balance between AI automation and human knowledge for real world use cases – such as customer experience and web conferencing – will begin to shape the effective use of AI for the future.
Craig Sanders, Founder, HELPEN
2021 will be the year the dense, optimistic froth about the voice space is lifted. That will be a good thing. The voice outlook doesn’t become pessimistic, but it becomes more realistic (still incredible potential) with more eyes on the sector thanks to new audio theses from larger investment firms.
Voice/Conversational AI is brought along with audio and becomes a huge focus for many investors. Analysis to date has been overly rosy, ignored the concept of smart speaker churn (which is very real), and incorrectly compared smart speakers to that of the iPhone from a different time period where tech adoption and customer expectations were on completely different curves with flatter slopes.
Looking back on 2020, it will become obvious that, relative to other tech, user adoption and usage were disappointing given how close people were to their smart speakers during multiple lockdowns. It will eventually be viewed as one of the biggest missed opportunities for the platforms in the voice space.
New AR/VR systems will cause submissions of new voice only gaming skills to nearly disappear. Competition for attention is just too difficult. 3D > imagination. Towards the end of 2021 it will become apparent that what will catapult voice forward will not be based on stand alone smart speakers, but voice more as a core feature of AR based applications.
For consumers it will be visuals first, voice second but ironically that will take voice adoption much further by an order of magnitude. Stand alone speakers, especially the HomePod mini, will benefit.
Overall, it will become apparent that opportunities in the space were not taken advantage of by major players. All the grief that Apple took over the years will be forgotten. They played the long game, and their experiences for users and developers will win out.
Dr. Teri Fisher, Founder and Editor, Voice in Canada
After the Spanish flu of 1918, the world experienced an era of tremendous growth and prosperity known as the “Roaring 20’s.” Today, as we begin the 2020’s and can see the light at the end of the COVID tunnel, we are also poised to experience our own “Roaring 20’s.” Voice is going to be a key driver of this growth, as people become increasingly comfortable with using voice as their operating system (#VOICEismyOS). Whether it is to stay in touch with each other, access medical information, interact with the world in a touch-free environment, or simply keep ourselves entertained, voice is going to see a decade of rapid growth. The “Roaring 20’s” are here again, and how do you roar? You use your voice!