Why Embedded Domain-Specific Assistants Are Better, Faster, and Cheaper
Back in January, I wrote about voice assistants going embedded and how a variety of initiatives could help that movement. Now I’d like to talk more about Embedded Domain-Specific Assistants (EDSA) (if you have a better acronym than DSEA or EDSA, please let me know on Twitter). In fact, I want to let you in on something: EDSAs cost less, are more private, respond faster, are easier to install, and are more reliable and accurate than cloud-based voice assistants.
All of these are well-understood advantages. An embedded assistant is more private because your personal information doesn’t need to be saved and, even if it is, it never leaves your product, meaning the data can’t be harvested. It responds faster because no data connection is needed, and it’s more reliable because no internet connection is required to make it work. An embedded assistant can work right out of the box: nothing to download, no app required for your phone, no wifi hookups, and no external devices needed. It’s much simpler, and little installation is needed; just plug it in and it works.
Cloud-based speech recognizers can take advantage of more data and have virtually no size limitations. Adapting on device to individual voices can help level the playing field between embedded voice assistants and cloud-based voice assistants. But where the magic really starts to come in is being domain specific. The Google Assistant, Siri, Alexa, and other cloud assistants are trying to do everything. They use domain-based expertise to become good at certain functions. When Alexa came out, she was very good at music and setting timers but not a lot else; then domains were quickly added for cooking, math, and general Q&A, and she got better and better. Each of these domains might be very good, but choosing which domain should answer might not be so easy. A great example is a recent review of an Amazon smart oven:
“There is a button to air fryer items with exactly five menu choices. When I asked Alexa to connect to my device and air fryer chicken legs, she replied, “chickens have two legs.” There is no way to set a temperature on the air fryer”
This shows both a domain confusion (“chickens have two legs”) and a failure of the device to execute in-domain commands such as setting the temperature for the air fryer. A domain-specific embedded assistant for a microwave, by contrast, can respond to anything in its domain, and because there’s only one domain, no domain confusion can occur.
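To make the single-domain idea concrete, here is a minimal sketch of how a microwave assistant might map a transcribed utterance to an action. Everything in it is an assumption for illustration: the command phrases, the intent names, and the `parse` function are hypothetical and not taken from any real product, and a real assistant would use a speech recognizer and a richer grammar rather than a few regular expressions. The point is only that with one domain there is no routing step, so a request is either understood as a microwave command or rejected.

```python
import re

# Hypothetical command patterns for a single-domain microwave assistant.
# The phrases and intent names are illustrative assumptions only.
PATTERNS = [
    ("cook", re.compile(r"^(?:cook|heat|microwave) for (\d+) (second|minute)s?$")),
    ("add_time", re.compile(r"^add (\d+) (second|minute)s?$")),
    ("stop", re.compile(r"^(?:stop|cancel)$")),
]


def parse(utterance: str):
    """Map a transcribed utterance to (intent, seconds).

    Returns ("out_of_domain", None) when nothing matches, so the device
    never hands the request to some unrelated domain.
    """
    text = utterance.lower().strip()
    for intent, pattern in PATTERNS:
        match = pattern.match(text)
        if match is None:
            continue
        if not match.groups():
            # Commands such as "stop" carry no duration.
            return intent, None
        amount, unit = match.groups()
        seconds = int(amount) * (60 if unit == "minute" else 1)
        return intent, seconds
    return "out_of_domain", None


if __name__ == "__main__":
    for phrase in ["Cook for 2 minutes", "add 30 seconds", "stop",
                   "how many legs does a chicken have"]:
        print(phrase, "->", parse(phrase))
```

In this sketch an off-topic request falls through to `out_of_domain` instead of being answered by another domain, which is the behavior the oven review above was missing.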
How can an embedded assistant cost less? Isn’t cloud computing more efficient? Doesn’t taking that processing power off devices lower the cost? The answers may not be what you think, and there are some dirty little secrets most people don’t understand. Let’s look at two key cost points:
- On-device hardware. Putting the speech recognition and synthesis on device adds cost in memory, computing power, and fees for speech tech. These fees are usually built into a per-unit cost and might add $1-$5 per unit depending on volumes and design strategy. Of course, if you don’t have to connect to wifi to go to the cloud, this could save $1-$2. So the hardware cost of an embedded assistant could be about the same as, or slightly greater than, that of a cloud-based one.
- Cloud computing costs. This is what most people don’t really understand. And that’s understandable because it is very confusing. Here’s something that’s EASY to understand: Amazon (AWS), Microsoft (Azure), and Google make a ton of money in cloud services.
The cloud business is growing fast, with high margins and big revenues, so it is highly strategic to each of these companies. And that’s where voice processing comes in: as voice recognition grows, it offers the opportunity to bring more people into their cloud services. It’s very easy and inexpensive to get started with Azure, AWS, Google Assistant or Dialogflow, or even the great FREE tools provided by Google for using TensorFlow. Even data that used to be sacrosanct at Google is now given away at no charge. This lures a lot of companies in.
The win for these companies is if usage takes off, and that’s where the companies paying for the cloud service might lose out. With an embedded assistant, it’s typically a one-time fee that might add a buck or two to the cost of goods and increase the retail price by up to $10. But with cloud fees, either the user needs to buy a subscription or the manufacturer needs to cover the cost, which, in a heavily used device, can make the sale completely unprofitable. On top of that, the service is completely out of the manufacturer’s control, collects data on their customers, and often comes with rules, and rule changes, that the manufacturer has no say in. The rise of edge computing and neural net processors is making on-device processing faster and cheaper, and there are growing reasons to bring the assistant onto the device!
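The one-time-fee versus recurring-fee argument can be sketched as simple arithmetic. The snippet below is a rough illustration, not a pricing model: the $2 embedded fee echoes the "buck or two" above, but the cloud cost per request, the usage level, and the device lifetime are made-up placeholder values, and the names `CostAssumptions`, `lifetime_cloud_cost`, and `break_even_requests` are hypothetical. Plug in your own numbers.

```python
from dataclasses import dataclass


@dataclass
class CostAssumptions:
    embedded_fee_per_unit: float   # one-time fee in dollars ("a buck or two")
    cloud_cost_per_request: float  # dollars per request; placeholder, not a real price
    requests_per_day: float        # placeholder usage level
    device_lifetime_years: float   # placeholder lifetime


def lifetime_cloud_cost(a: CostAssumptions) -> float:
    """Total cloud fees for one device over its lifetime; grows with usage."""
    total_requests = a.requests_per_day * 365 * a.device_lifetime_years
    return total_requests * a.cloud_cost_per_request


def break_even_requests(a: CostAssumptions) -> float:
    """Number of requests after which cloud fees exceed the one-time embedded fee."""
    return a.embedded_fee_per_unit / a.cloud_cost_per_request


if __name__ == "__main__":
    # All values below are illustrative assumptions.
    example = CostAssumptions(
        embedded_fee_per_unit=2.0,
        cloud_cost_per_request=0.004,
        requests_per_day=10,
        device_lifetime_years=3,
    )
    print("Embedded, one-time:", example.embedded_fee_per_unit)
    print("Cloud, over lifetime:", round(lifetime_cloud_cost(example), 2))
    print("Break-even requests:", round(break_even_requests(example)))
```

The embedded cost is flat no matter how much the device is used, while the cloud cost scales with every request, so the heavier the usage, the sooner the cloud option becomes the more expensive one for whoever is paying the bill.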
Todd Mozer is CEO of Sensory.