Ruhi Sarikaya – Amazon

Amazon Alexa Features for Discovery, Context Maintenance and Memory Announced

Ruhi Sarikaya is director of applied science for the Alexa Machine Learning group. He delivered a keynote yesterday at the World Wide Web Conference in Lyon and announced in a blog post that Amazon will be rolling out three new Alexa features that will further enhance value to users and in some circumstances may help third-party developers. The new features are referred to as Skills Arbitration, Context Carryover and Memory.

Skills Arbitration as a Method of Skill Discovery

Skills arbitration is actually the combination of two features: skill discovery and auto-enablement. Mr. Sarikaya describes a scenario in which he needed information about removing a stain from his shirt, but was not aware of the relevant Alexa skill created by Tide.

“For example, using an Echo Show device, I recently asked: ‘Alexa, how do I remove an oil stain from my shirt?’ She replied: ‘Here is Tide Stain Remover.’ This beta experience was friction-free; the skill just walked me through the process of removing an oil stain from my shirt. Previously, I would have had to discover the skill on my own to use it.”

A key word in his final sentence tells you what this is really about: Alexa skill discovery. Arbitration is simply the technical approach Alexa will employ to select, hopefully, the right skill to address the user's need. Mr. Sarikaya mentions in his post that there are over 40,000 unique Alexa skills worldwide (N.B. 31,854 in the U.S. as of this morning), which means it is increasingly hard for users to find the skills they may be interested in or need at any given time. Amazon has been experimenting with this feature since at least September 2017, when Voicebot was the first to report on its existence. However, Skills Arbitration has been available sparingly and inconsistently to date.

Amazon Alexa Now Recommends Third Party Skills

Alexa skill discovery is not only an issue for users today. Independent Alexa skill developers have faced increasing difficulty in getting their skills noticed by users. It is hard to make your skill stand out when users must wade through 30,000 others when looking for what you provide. Arbitration has the potential to surface third-party skills like Tide in the example above and introduce them to a new set of users. The open question, of course, is which skill will be recommended when Alexa hears a request that multiple skills could potentially answer.

Many third-party developers are concerned that there will be a bias towards recommending (i.e. arbitrating to) Alexa’s first-party, native capabilities and second-party skills that are made by Amazon but technically exist outside of Alexa. Beyond these priorities, independent developers wonder if big brands will be favored through the skill arbitration process, or if developers will be able to pay to be recommended, giving the advantage to companies with the most resources.

This problem is potentially compounded by Alexa’s experimentation with skill auto-enablement, which biases future interactions for an invocation name or specific topic toward that skill. If a user is guided to a skill, then it is auto-enabled. After that enablement takes place, any other skills covering a similar topic could be permanently invisible to the user. This is further complicated by the fact that Amazon allows duplication of Alexa skill names, so when Alexa recommends and enables a particular skill, it does so to the detriment of all the other skills with a similar name.
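The shadowing concern can be illustrated with a toy sketch. This is not Amazon's arbitration logic, which is not public; the `ToyArbiter` class, the catalog contents and the "first candidate wins" ranking are all hypothetical, chosen only to show how an auto-enabled skill could keep rival skills from ever being surfaced.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ToyArbiter:
    """Hypothetical arbiter illustrating the developers' concern only."""

    # topic -> names of skills that claim to handle it (made-up catalog)
    catalog: Dict[str, List[str]]
    # topic -> skill auto-enabled for this user after a first recommendation
    enabled: Dict[str, str] = field(default_factory=dict)

    def route(self, topic: str) -> Optional[str]:
        # An already-enabled skill always wins, so rivals are never surfaced.
        if topic in self.enabled:
            return self.enabled[topic]
        candidates = self.catalog.get(topic, [])
        if not candidates:
            return None
        # Assumption: some ranking step put the preferred skill first.
        chosen = candidates[0]
        # Auto-enablement: the first recommendation sticks for this topic.
        self.enabled[topic] = chosen
        return chosen


arbiter = ToyArbiter(catalog={"stain removal": ["Skill A", "Skill B"]})
first = arbiter.route("stain removal")

# Even if the catalog ranking later flips, the enabled skill keeps winning.
arbiter.catalog["stain removal"].reverse()
second = arbiter.route("stain removal")
print(first, second)
```

In this sketch "Skill B" is never recommended to the user once "Skill A" has been enabled, which is the outcome independent developers worry about.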

Amazon is Experimenting with Alexa Skill Auto-Enablement

Alexa skill discovery is the most important barrier faced by third-party developers, so any feature that helps with this challenge is welcome. The caveats listed above are genuine concerns. However, skill discovery is a problem that Amazon must address to help both users and developers. If the company keeps in mind the potential detrimental impact on third-party developers, the feature will be widely welcomed by the broader Alexa community.

It is also worth noting that Google Assistant already has this feature. Google Assistant will recommend Assistant apps to users when they ask general questions that match certain keyword phrases. These keyword phrases are called Implicit Invocations (N.B. formerly known as implicit discoveries) and are a signal to Google Assistant’s search algorithm that a specific Assistant app may be able to answer the user’s question and do so with a voice-interactive experience. As a result, Amazon’s announcement of skills arbitration is not plowing new ground in the voice assistant space, but rather is closing a gap with Google.

Context Carryover

Context carryover is the concept of relating a current phrase spoken by a user to what was said previously. In this case, it is about what was said immediately prior to the spoken phrase. This feature already exists to a limited extent. Voicebot was the first to demonstrate this context carryover feature (i.e. context maintenance) in March with a video demonstration. At the time, I explained this feature by saying:

“Consider a query that starts, ‘Alexa, who is Adele?’ To learn about her hometown without context maintenance you would have to follow your original question with, ‘Where is Adele from?’ With context maintenance, you can simply say, ‘Where is she from?’”

I am assuming that Mr. Sarikaya is a Voicebot reader. The example he cites in his blog post was, “Alexa, what was Adele’s first album?” “Alexa, play it.” Was the focus on Adele among all singers a coincidence? How about his third example? He explains how the first Adele example referred to context carryover within a domain, namely music. His third example refers to context carryover between domains: “Alexa, how’s the weather in Portland?” → “How long does it take to get there?” The video above shows a similar shift, moving to a question about the distance between Tottenham and London, England. The real finding here is that context carryover has been available for some time, as Voicebot discovered in March.

The earlier Voicebot article also went on to distinguish context carryover, which is specific to a session, from context persistence, which would maintain context across sessions. Context persistence would be tied to a user’s history of interactions, or at least a recent history of interactions that goes beyond a single session. Both were differentiated from context personalization, which applies context to the personal information, experience and preferences of a user and is session independent. In each case, the voice assistant provides added value to the user in a different way. It is good to have Amazon announce context carryover as a feature that will be rolled out first in the U.S., U.K. and Germany. Even better will be the addition of context persistence and context personalization features in the future.

Alexa Skill Memory Feature

The final feature announced by Mr. Sarikaya is called Memory. You can ask Alexa to remember something such as a fact and when you ask later, she will repeat it back to you.

“For example, a customer might ask: ‘Alexa, remember that Sean’s birthday is June 20th.’ Alexa will reply: ‘Okay, I’ll remember that Sean’s birthday is June 20th.’ This memory feature is the first of many launches this year that will make Alexa more personalized.”

Presumably, when you say in the future, “Alexa, when is Sean’s birthday?” the response will be, “Sean’s birthday is June 20th.” This goes back to the concept of context personalization mentioned above. It is a manual personalization of a user’s Alexa profile that could complement a programmatic personalization based on past Alexa interactions and links with calendar and social media accounts among other data sources.
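A rough sketch of this remember-then-recall behavior follows. It is a toy under stated assumptions rather than a description of Amazon's implementation: the `ToyMemory` class and its two regular expressions are hypothetical and only handle phrasings shaped like the birthday example.

```python
import re
from typing import Dict, Optional


class ToyMemory:
    """Stores spoken facts of the form '<subject> is <value>' per user."""

    def __init__(self) -> None:
        self.facts: Dict[str, str] = {}

    def remember(self, utterance: str) -> Optional[str]:
        # Matches e.g. "remember that Sean's birthday is June 20th."
        match = re.match(
            r"remember that (.+?) is (.+?)\.?$", utterance, re.IGNORECASE
        )
        if match is None:
            return None
        subject, value = match.group(1), match.group(2)
        self.facts[subject.lower()] = value
        return f"Okay, I'll remember that {subject} is {value}."

    def recall(self, utterance: str) -> Optional[str]:
        # Matches e.g. "when is Sean's birthday?"
        match = re.match(r"when is (.+?)\??$", utterance, re.IGNORECASE)
        if match is None:
            return None
        subject = match.group(1)
        value = self.facts.get(subject.lower())
        if value is None:
            return None
        return f"{subject} is {value}."


memory = ToyMemory()
confirmation = memory.remember("remember that Sean's birthday is June 20th.")
answer = memory.recall("when is Sean's birthday?")
print(confirmation)
print(answer)
```

The point of the sketch is the manual nature of the personalization: nothing is learned unless the user states it, and recall is a simple lookup keyed on what was said.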

If you are thinking this sounds like a simpler implementation of Alexa Skill Blueprints, you are not alone. Capturing information like “Sean’s birthday” is one of the features of the Q&A Blueprint announced just last week. The big difference is you don’t need to establish an Alexa developer account or fiddle with online forms to save the information you would like to be able to recall. You simply say the information and the Alexa Memory feature will record it for you for future access. Granted, the Alexa Skill Blueprints offer far greater functionality than simply recalling a fact or two. However, Memory is precisely what Tobias Goebel pointed out in the Voicebot Slack forum when he questioned why Blueprints required manual data entry instead of simply allowing users to speak the information they wanted to convey to Alexa.

New Innovations Previewed Previously, but Not Yet Widely Available

What we have learned from this analysis is that each of the new Alexa features (Skills Arbitration, Context Carryover and Memory) has precursors that have been in testing for some time or are already available in another form. They have been tested, and we presume Amazon has had a chance to learn and improve the features before announcing general availability. Amazon is now packaging them up to make them easier for users to interact with and understand. Up to this point, it seems like Amazon was primarily focused on getting scale and breadth for Alexa, but these new features and others previously chronicled by Voicebot signal a shift to more depth around user experience and engagement.
