
Riffusion

Developed by Narsil
A real-time music generation model based on Stable Diffusion that generates spectrogram images from text prompts and converts them into audio clips
Downloads: 14
Release date: 12/15/2022

Model Overview

Riffusion is a latent text-to-image diffusion model capable of generating spectrograms from text prompts, which can then be converted into audio clips. The model is fine-tuned from Stable-Diffusion-v1-5 and is suitable for creative music generation and research purposes.
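Because Riffusion is a fine-tune of Stable-Diffusion-v1-5, it can be loaded like any other Stable Diffusion checkpoint. A minimal sketch, assuming the Hugging Face Hub repo id `riffusion/riffusion-model-v1` and the standard `diffusers` pipeline API (requires `torch` and `diffusers`; a GPU is strongly recommended):

```python
# Hedged sketch: load Riffusion through the generic diffusers
# StableDiffusionPipeline and render one spectrogram image from a prompt.
MODEL_ID = "riffusion/riffusion-model-v1"  # assumed Hub repo id

def generate_spectrogram(prompt: str, out_path: str = "spectrogram.png"):
    # Imports kept local so the sketch can be inspected without the
    # heavy dependencies installed.
    import torch
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained(
        MODEL_ID, torch_dtype=torch.float16
    )
    pipe = pipe.to("cuda")
    # The model emits a 512x512 image whose pixels encode a spectrogram,
    # which a separate step then converts to audio.
    image = pipe(prompt).images[0]
    image.save(out_path)
    return image

if __name__ == "__main__":
    generate_spectrogram("funky jazz saxophone solo")
```

The prompt and output filename above are illustrative, not part of the model card.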

Model Features

Real-time music generation
Generates music spectrograms from text prompts in real time and converts them into audio
Based on Stable Diffusion technology
Fine-tuned from the proven Stable-Diffusion-v1-5 model, ensuring reliable generation capabilities
Open license
Adopts the CreativeML OpenRAIL-M license, permitting commercial and research use

Model Capabilities

Text-to-audio generation
Music spectrogram generation
Real-time audio synthesis
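The spectrogram-to-audio capability hinges on phase reconstruction: a generated spectrogram contains only magnitudes, so the missing phase must be estimated before a waveform can be synthesized. A common technique for this, used by Riffusion-style pipelines, is the Griffin-Lim algorithm. A NumPy-only toy sketch (small frame sizes chosen for illustration, not Riffusion's actual parameters):

```python
import numpy as np

def stft(x, n_fft=256, hop=64):
    # Short-time Fourier transform with a Hann window.
    win = np.hanning(n_fft)
    frames = [x[i:i + n_fft] * win for i in range(0, len(x) - n_fft + 1, hop)]
    return np.array([np.fft.rfft(f) for f in frames]).T  # (bins, frames)

def istft(S, n_fft=256, hop=64):
    # Overlap-add inverse STFT with window-power normalization.
    win = np.hanning(n_fft)
    n = hop * (S.shape[1] - 1) + n_fft
    x = np.zeros(n)
    norm = np.zeros(n)
    for m in range(S.shape[1]):
        frame = np.fft.irfft(S[:, m])
        x[m * hop:m * hop + n_fft] += frame * win
        norm[m * hop:m * hop + n_fft] += win ** 2
    return x / np.maximum(norm, 1e-8)

def griffin_lim(mag, n_iter=32):
    # Start from random phase, then alternate between the time and
    # frequency domains, keeping the target magnitudes each round.
    rng = np.random.default_rng(0)
    phase = np.exp(2j * np.pi * rng.random(mag.shape))
    for _ in range(n_iter):
        x = istft(mag * phase)
        phase = np.exp(1j * np.angle(stft(x)))
    return istft(mag * phase)
```

Feeding the magnitude spectrogram of a 440 Hz sine through `griffin_lim` returns a waveform of the original length; production implementations (e.g. `librosa.griffinlim` or `torchaudio.transforms.GriffinLim`) add mel-scale handling and faster convergence.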

Use Cases

Creative arts
Music composition
Artists and musicians can use text prompts to generate unique music clips
Generates spectrograms that can be converted into audio
Education and research
Generative model research
Researchers can explore text-to-audio generative model technologies