Generative AI Music Startup Riffusion Raises $4M
The researchers behind the generative AI synthetic audio experiment Riffusion have formed a startup and raised $4 million in a seed funding round led by Greycroft. Riffusion’s platform can turn text descriptions of music into audio with an accompanying visual representation of the sound and is now available as a mobile app as well as via its website.
Musicians Seth Forsgren and Hayk Martiros released Riffusion at the beginning of the year as a text-to-audio and text-to-image combination. Users simply describe lyrics, musical styles, and other qualities through text or voice. Riffusion’s AI then instantly generates riffs – short, customizable song snippets with vocals and art. Riffusion is built on the Stable Diffusion synthetic image generator, fine-tuned for music so that it produces spectrograms: visual representations of sound that plot time on the horizontal axis and frequency on the vertical axis.
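To make the spectrogram representation concrete, here is a minimal sketch (not Riffusion’s actual pipeline, which works in the other direction by generating spectrogram images and converting them to audio) of how an audio signal maps onto the kind of image described above, using SciPy:

```python
import numpy as np
from scipy import signal

def audio_to_spectrogram(samples: np.ndarray, sample_rate: int = 44100) -> np.ndarray:
    """Return a 2-D magnitude spectrogram: frequency bins (rows) x time frames (columns)."""
    freqs, times, spec = signal.spectrogram(samples, fs=sample_rate, nperseg=512)
    # Log-scale the magnitudes so quiet detail is visible, as in spectrogram images.
    return np.log1p(spec)

# Toy input: one second of a 440 Hz sine wave (the note A4).
sr = 44100
t = np.linspace(0, 1, sr, endpoint=False)
spec = audio_to_spectrogram(np.sin(2 * np.pi * 440 * t), sr)

# Rows correspond to the vertical (frequency) axis, columns to the horizontal (time) axis;
# the energy concentrates in the frequency bin nearest 440 Hz.
print(spec.shape)
```

Riffusion’s model learns to generate images like this from text prompts, then inverts them back into playable audio.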
Riffusion and its open-source tool immediately became popular among music and AI enthusiasts, and users have already created over 500,000 riffs. The app provides building blocks to craft original songs or simply share fun moments. The user list includes many professional musical groups, including The Chainsmokers, who are also investors and advisors for the startup. Riffusion’s AI is designed to produce novel outputs based on user prompts, not to mimic famous artists, and the startup says its model generates unique, customizable riffs each time rather than producing deepfakes.
“We started Riffusion as a hobby project last year and were blown away by the creativity it unleashed with casual users and professional musicians,” Forsgren said in a statement. “We see Riffusion as a new instrument—one that anyone can play. Everyone loves music, but most of us just listen passively. Riffusion enables everyone to be an active participant in it.”
Text-to-music synthetic media is an area several generative AI developers are pursuing. Most notably, Meta recently released AudioCraft, a generative AI music and sound model that can turn a text prompt into nearly any kind of sound by combining Meta’s text-to-music model MusicGen and text-to-natural-sound tool AudioGen with EnCodec, a neural audio codec that compresses audio into compact tokens the models can generate and decode. Meanwhile, Google’s MusicLM generative AI music composer has been glimpsed only in a few demonstrations, reportedly held back over concerns about copyright infringement.