Stability AI Releases Augmented Text-to-Music Engine Stable Audio 2 With Upload and Style Transfer Features

on April 4, 2024 at 8:00 am

Synthetic media startup Stability AI has released an upgraded version of its text-to-music model. The new Stable Audio 2 can produce music tracks of up to three minutes from written prompts and adds several new features to the tool.

Stable Audio 2

Stable Audio 2 enhances the original Stable Audio model released in September 2023, which the company billed as the first commercially viable generative AI music composition tool. Both iterations leverage latent diffusion technology, but Stable Audio 2 performs better and incorporates new features. One of those is audio-to-audio prompting, which lets users upload audio samples and transform them into different kinds of sounds based on written prompts. There are also more options for creating sound effects and an option to reproduce an audio sample in a different style. The sound effect enhancements cover all kinds of incidental and environmental sounds, such as fingers tapping on a keyboard, a crowd cheering, or the sounds of a city outside a concert.

“Our most advanced audio model yet expands the creative toolkit for artists and musicians with its new functionalities. With both text-to-audio and audio-to-audio prompting, users can produce melodies, backing tracks, stems, and sound effects, thus enhancing the creative process,” Stability AI explained in a blog post. “Stable Audio 2.0 sets itself apart from other state-of-the-art models as it can generate songs up to three minutes in length, complete with structured compositions that include an intro, development, and outro, as well as stereo sound effects.”

The improved model is designed to produce clearly structured, high-quality audio. It compresses complex audio waveforms into shorter, more manageable representations and then reshapes them into music that tries to capture the essence of human compositions. The goal is for the AI to grasp the nuances of music well enough to replicate its patterns and sequences. Stable Audio 2 was trained on a dataset of more than 800,000 pieces of audio, including music tracks, sound effects, and individual instrument sounds, all labeled with detailed descriptions.

To address potential copyright concerns, Stability AI collaborated with Audible Magic, a company specializing in identifying and blocking copyrighted content in real time. The tool is available on Stability AI’s website and will soon be available through the Stable Audio API. Those curious to hear more AI tracks can listen to the new Stable Radio streaming station on YouTube.

