Spotify Patents Emotional Speech Recognition Tech for Song Recommendations
Spotify may someday suggest songs based on the emotion in your voice, judging from a patent newly granted to the popular streaming service. The “Identification of taste attributes from an audio signal” patent would augment Spotify’s existing speech recognition technology to account for not just what words are said, but how they are said and who is saying them, possibly making the resulting playlist more likely to be what the user is seeking.
The patent describes a way of determining several pieces of information about a listener from their voice and translating that data into a song or playlist recommendation. The AI would decide what to play next using not only “emotional state, gender, age, or accent,” but “environmental metadata” too. As a figure in the patent shows, those elements include where you are physically and the social environment around you. So whether you’re a happy young woman from Brooklyn having a party at a bus stop or an angry middle-aged man alone in a park, Spotify would have song ideas for you. It’s easy to imagine playing with the AI, too, by putting on different accents or adding background noise, though that might skew the personalized recommendations on your account. Regardless, the recommendation system is one of Spotify’s best selling points, so methods to improve it would be a natural focus for the company.
“In the field of on-demand media streaming services, it is common for a media streaming application to include features that provide personalized media recommendations to a user,” Spotify explained in the patent. “One challenge involving the foregoing approach is that it requires significant time and effort on the part of the user. In particular, the user is required to tediously input answers to multiple queries in order for the system to identify the user’s tastes. What is needed is an entirely different approach to collecting taste attributes of a user, particularly one that is rooted in technology so that the above-described human activity (e.g., requiring a user to provide input) is at least partially eliminated and performed more efficiently.”
The patent was originally filed in 2018. Spotify’s technical ability to identify and analyze the factors it describes has improved since then, which matters as the library of tracks keeps getting better. Nor has Spotify been sitting still as technology and culture have shifted the market for streaming content and voice technology. A potential voice assistant teased in a leak almost a year ago is presumably still in the works. It would supposedly add a wake word to the mobile app for content search and playback controls. The same presumably goes for the ideas in other new patents, including one for essentially making karaoke tracks by letting users put their vocals over music and one that would measure walking or running speed and play songs that match that pace.