
Riffusion

Developed by Narsil
A real-time music generation model based on Stable Diffusion that generates spectrogram images from text prompts and converts them into audio clips
Downloads: 14
Release date: 12/15/2022

Model Overview

Riffusion is a latent text-to-image diffusion model capable of generating spectrograms from text prompts, which can then be converted into audio clips. The model is fine-tuned from Stable-Diffusion-v1-5 and is suitable for creative music generation and research purposes.
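Because Riffusion is a fine-tune of Stable-Diffusion-v1-5, it can be loaded like any other Stable Diffusion checkpoint. A minimal sketch, assuming the Hugging Face Hub repo id `riffusion/riffusion-model-v1` and the standard `diffusers` pipeline API (requires `torch` and `diffusers`; a GPU is strongly recommended):

```python
# Hedged sketch: load Riffusion through the generic diffusers
# StableDiffusionPipeline and render one spectrogram image from a prompt.
MODEL_ID = "riffusion/riffusion-model-v1"  # assumed Hub repo id

def generate_spectrogram(prompt: str, out_path: str = "spectrogram.png"):
    # Imports kept local so the sketch can be inspected without the
    # heavy dependencies installed.
    import torch
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained(
        MODEL_ID, torch_dtype=torch.float16
    )
    pipe = pipe.to("cuda")
    # The model emits a 512x512 image whose pixels encode a spectrogram,
    # which a separate step then converts to audio.
    image = pipe(prompt).images[0]
    image.save(out_path)
    return image

if __name__ == "__main__":
    generate_spectrogram("funky jazz saxophone solo")
```

The prompt and output filename above are illustrative, not part of the model card.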

Model Features

Real-time music generation
Generates music spectrograms from text prompts in real time and converts them into audio
Based on Stable Diffusion technology
Fine-tuned from the proven Stable-Diffusion-v1-5 model, ensuring reliable generation capabilities
Open license
Adopts the CreativeML OpenRAIL-M license, permitting commercial and research use

Model Capabilities

Text-to-audio generation
Music spectrogram generation
Real-time audio synthesis
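The spectrogram-to-audio capability hinges on phase reconstruction: a generated spectrogram contains only magnitudes, so the missing phase must be estimated before a waveform can be synthesized. A common technique for this, used by Riffusion-style pipelines, is the Griffin-Lim algorithm. A NumPy-only toy sketch (small frame sizes chosen for illustration, not Riffusion's actual parameters):

```python
import numpy as np

def stft(x, n_fft=256, hop=64):
    # Short-time Fourier transform with a Hann window.
    win = np.hanning(n_fft)
    frames = [x[i:i + n_fft] * win for i in range(0, len(x) - n_fft + 1, hop)]
    return np.array([np.fft.rfft(f) for f in frames]).T  # (bins, frames)

def istft(S, n_fft=256, hop=64):
    # Overlap-add inverse STFT with window-power normalization.
    win = np.hanning(n_fft)
    n = hop * (S.shape[1] - 1) + n_fft
    x = np.zeros(n)
    norm = np.zeros(n)
    for m in range(S.shape[1]):
        frame = np.fft.irfft(S[:, m])
        x[m * hop:m * hop + n_fft] += frame * win
        norm[m * hop:m * hop + n_fft] += win ** 2
    return x / np.maximum(norm, 1e-8)

def griffin_lim(mag, n_iter=32):
    # Start from random phase, then alternate between the time and
    # frequency domains, keeping the target magnitudes each round.
    rng = np.random.default_rng(0)
    phase = np.exp(2j * np.pi * rng.random(mag.shape))
    for _ in range(n_iter):
        x = istft(mag * phase)
        phase = np.exp(1j * np.angle(stft(x)))
    return istft(mag * phase)
```

Feeding the magnitude spectrogram of a 440 Hz sine through `griffin_lim` returns a waveform of the original length; production implementations (e.g. `librosa.griffinlim` or `torchaudio.transforms.GriffinLim`) add mel-scale handling and faster convergence.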

Use Cases

Creative arts
Music composition
Artists and musicians can use text prompts to generate unique music clips
Generates spectrograms that can be converted into audio
Education and research
Generative model research
Researchers can explore text-to-audio generative model technologies