Amazon More Than Doubles Audio Clip Limit for Alexa Developers, Opens Up New Use Cases

Third-party Alexa developers have been limited to 90-second audio streams since launch back in 2014. Amazon recently updated the Alexa Skills Kit (ASK) and increased the limit to four minutes (240 seconds) of audio per interaction. Alexa developers say this opens up a new set of use cases and will be good for users. Dave Young, the founder of Speakway, commented:

“The new changes to SSML’s audio tag sound source files add increased capability in length and quality. Specifically, they’re upping the allowed content length to 240 seconds from 90, and increased the allowed sample rate to 22050Hz, 24000Hz, or 16000Hz, whereas previously it was only 16KHz. To use longer clips than 90 seconds, you previously had to use AudioPlayer, which takes the user out of session.

“For types of short-form audio skills where the content length is greater than 90 seconds (i.e. most song lengths), you can now let the user interact immediately after hearing the audio. This is a great help in opening up new engagement opportunities, as the user now remains in the skill and does not need to re-invoke it just to pick up where they left off. This could be extremely useful in musician, storytelling, narrative-style gaming, or messaging skills. You can also play several longer clips together without interruption or awkward list management if the user desires.”
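For developers who want to see what this looks like in practice, here is a minimal Python sketch of a skill response that plays a clip through the SSML audio tag and keeps the session open so the user can respond right away. The 240-second limit and the three sample rates are the figures quoted above; the function name, the example URL, and the idea of passing the clip's duration and sample rate as arguments are assumptions made for illustration, not part of Amazon's SDK.

```python
from xml.sax.saxutils import escape, quoteattr

MAX_AUDIO_SECONDS = 240                       # limit quoted in the article
ALLOWED_SAMPLE_RATES = {16000, 22050, 24000}  # sample rates quoted in the article


def build_audio_response(clip_url, clip_seconds, sample_rate_hz, prompt_text):
    """Return a response dict that plays a clip in-session, then prompts the user.

    Hypothetical helper: clip_seconds and sample_rate_hz are supplied by the
    caller; nothing here inspects the audio file itself.
    """
    if clip_seconds > MAX_AUDIO_SECONDS:
        raise ValueError(f"clip is {clip_seconds}s; limit is {MAX_AUDIO_SECONDS}s")
    if sample_rate_hz not in ALLOWED_SAMPLE_RATES:
        raise ValueError(f"unsupported sample rate: {sample_rate_hz} Hz")

    # quoteattr/escape keep the URL and prompt safe inside the SSML markup.
    ssml = f"<speak><audio src={quoteattr(clip_url)}/> {escape(prompt_text)}</speak>"
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "SSML", "ssml": ssml},
            "reprompt": {"outputSpeech": {"type": "PlainText", "text": prompt_text}},
            # False keeps the user in the skill, unlike a hand-off to AudioPlayer.
            "shouldEndSession": False,
        },
    }


# Example: a 200-second song followed by a question, all within one session.
response = build_audio_response(
    "https://example.com/audio/song.mp3", 200, 24000, "Did you like that one?"
)
```

The key detail is that `shouldEndSession` stays false, so the skill can handle the user's next utterance without being re-invoked.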

Maintaining User Sessions

In 2016, Amazon introduced Alexa’s AudioPlayer, which enabled longer-form content such as podcasts and streaming radio, but it came with a significant downside. Transitioning the user to AudioPlayer took them “out of session,” that is, out of the skill itself. The user could still be listening to your content delivered through an RSS feed, but was essentially in Alexa-land and not Skill-land. As a result, users had to re-invoke your skill to get back to skill-specific functionality. This made for an awkward user experience and imposed several creative limitations. Christiaan Quak, a founder of abqinteractive in the Netherlands, said in an email interview:

“We’re already re-recording a few chapters from the choose-your-own-adventure book I’m developing. I was really, really excited to hear this. Though, the new constraint is 120 seconds for the Google Assistant if you’re publishing cross-platform.”

How Long Should a Voice Interaction Be?

There is merit in keeping voice interactions short, most of the time. Even rudimentary user experience testing will quickly reveal that users get frustrated when they have to wait for a verbose Alexa skill to finish speaking before they can interact again. Most Alexa skills benefit from interactions that are only a few seconds long and should never even approach the old 90-second limit. However, it all depends on your use case. A musician can now play a full-length three-minute song and include an introduction or trailing message asking for fan feedback while maintaining the user session. Brevity is often your friend when designing voice user experiences, but the expanded Alexa audio limit offers new opportunities to engage users.
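The musician example comes down to simple arithmetic, which a short Python sketch can make concrete. The platform limits are the figures quoted in this article (240 seconds for Alexa, and 120 seconds for Google Assistant per Quak's comment), not values taken from platform documentation. The segment lengths are hypothetical, and the sketch assumes the introduction and feedback request are recorded clips that count toward the limit.

```python
# Limits in seconds as quoted in the article, not from platform documentation.
PLATFORM_LIMITS = {"Alexa": 240, "Google Assistant": 120}


def check_audio_budget(segment_seconds):
    """Map each platform to True if the combined segments fit its quoted limit."""
    total = sum(segment_seconds.values())
    return {name: total <= limit for name, limit in PLATFORM_LIMITS.items()}


# Hypothetical musician skill: recorded intro, three-minute song, feedback request.
segments = {"intro": 15, "song": 180, "feedback_request": 20}
budget = check_audio_budget(segments)
print(sum(segments.values()), budget)
```

With these assumed lengths, the interaction totals 215 seconds, which fits the new Alexa limit but would need trimming for a cross-platform release.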

