
MusicGen Medium

Developed by Facebook
MusicGen is a text-to-music model that generates high-quality music samples based on text descriptions or audio prompts, utilizing a 1.5-billion-parameter autoregressive Transformer architecture.
Downloads 1.5M
Release Date: 6/8/2023

Model Overview

A single-stage autoregressive Transformer model that generates 32 kHz music audio directly from text descriptions, supporting parallel codebook prediction and controllable music generation.
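The medium checkpoint can be driven through the Hugging Face transformers integration. Below is a minimal sketch, assuming the checkpoint name facebook/musicgen-medium and a prompt and token budget chosen purely for illustration; adjust both to taste.

```python
# Minimal sketch of text-to-music generation with the Hugging Face
# transformers MusicGen integration (checkpoint name "facebook/musicgen-medium"
# assumed); prompt and token budget are illustrative.
import scipy.io.wavfile
import torch
from transformers import AutoProcessor, MusicgenForConditionalGeneration

processor = AutoProcessor.from_pretrained("facebook/musicgen-medium")
model = MusicgenForConditionalGeneration.from_pretrained("facebook/musicgen-medium")

# Encode one or more text descriptions into model inputs.
inputs = processor(
    text=["80s pop track with bassy drums and synth"],
    padding=True,
    return_tensors="pt",
)

# Each second of audio corresponds to roughly 50 autoregressive steps,
# so 256 new tokens yields about 5 seconds of music.
with torch.no_grad():
    audio_values = model.generate(**inputs, max_new_tokens=256)

# The audio encoder's sampling rate (32 kHz for MusicGen) is needed to save the waveform.
sampling_rate = model.config.audio_encoder.sampling_rate
scipy.io.wavfile.write(
    "musicgen_sample.wav",
    rate=sampling_rate,
    data=audio_values[0, 0].cpu().numpy(),
)
```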

Model Features

Parallel Codebook Prediction
Achieves parallel codebook prediction by offsetting the codebooks with small delays, requiring only 50 autoregressive steps per second of audio (see the sketch after this list).
No Semantic Representation Needed
Unlike solutions such as MusicLM, it directly generates audio codebooks without intermediate semantic representations.
Multiple Parameter Versions
Offers 300M/1.5B/3.3B parameter versions and melody-guided variants.
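The delay pattern behind parallel codebook prediction can be illustrated in a few lines. The sketch below is purely conceptual, not the library's implementation: with K codebooks each offset by one step, a single autoregressive pass of roughly T + K - 1 steps covers all K streams, instead of the K × T steps needed for fully flattened decoding.

```python
# Illustrative sketch (not the library's code) of the codebook delay pattern.
import numpy as np

def delay_pattern(codes: np.ndarray, pad_token: int = -1) -> np.ndarray:
    """codes: (K, T) matrix of codebook indices; returns (K, T + K - 1)."""
    K, T = codes.shape
    out = np.full((K, T + K - 1), pad_token, dtype=codes.dtype)
    for k in range(K):
        out[k, k:k + T] = codes[k]  # shift codebook k right by k steps
    return out

# 4 codebooks (as with EnCodec at 32 kHz / 50 Hz frames), 6 toy frames
codes = np.arange(24).reshape(4, 6)
print(delay_pattern(codes))
```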

Model Capabilities

Generate music from text descriptions
Support style mixing (e.g., '80s hip-hop + funky house')
Produce high-quality 32kHz audio
Enable melody-guided generation (requires the melody variant model; see the sketch after this list)
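Melody guidance is exposed through Meta's audiocraft package rather than this text-only medium checkpoint. The sketch below assumes the facebook/musicgen-melody variant and a local reference_melody.wav file, both of which are assumptions for illustration.

```python
# Sketch of melody-guided generation with Meta's audiocraft package and the
# separate melody variant ("facebook/musicgen-melody" assumed); the medium
# text-only checkpoint described on this page does not take a melody prompt.
import torchaudio
from audiocraft.models import MusicGen
from audiocraft.data.audio import audio_write

model = MusicGen.get_pretrained("facebook/musicgen-melody")
model.set_generation_params(duration=8)  # seconds of audio to generate

# Load a reference melody; its chroma features condition the generation.
melody, sr = torchaudio.load("reference_melody.wav")

wav = model.generate_with_chroma(
    descriptions=["80s hip-hop with a funky house groove"],
    melody_wavs=melody[None],  # add a batch dimension
    melody_sample_rate=sr,
)

# audio_write adds the .wav extension and applies loudness normalization.
audio_write("melody_guided_sample", wav[0].cpu(), model.sample_rate, strategy="loudness")
```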

Use Cases

Music Creation
Background Music Generation
Generate customized opening music for podcasts/videos
Examples demonstrate the ability to produce audio with catchy rhythms
Style Experimentation
Blend musical elements from different eras and styles
Successfully generates hybrid styles like '80s hip-hop + funky house'
Content Production
Lo-Fi Work Music
Generate soothing tracks with chill electronic elements
Can produce background music suitable for focused work (example prompts for these use cases appear in the sketch below)
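As a rough sketch of how the use cases above map onto prompts, the batch below reuses the transformers pipeline from the overview; the guidance scale and token budget are illustrative assumptions, not values from this page.

```python
# Sketch of batching several use-case prompts (podcast intro, style mix, lo-fi
# focus music) through one generate call; checkpoint name, guidance scale and
# token budget are assumptions for illustration.
from transformers import AutoProcessor, MusicgenForConditionalGeneration

processor = AutoProcessor.from_pretrained("facebook/musicgen-medium")
model = MusicgenForConditionalGeneration.from_pretrained("facebook/musicgen-medium")

prompts = [
    "upbeat podcast opening jingle with a catchy rhythm",
    "80s hip-hop beat blended with funky house",
    "soothing lo-fi track with chill electronic elements for focused work",
]
inputs = processor(text=prompts, padding=True, return_tensors="pt")

# guidance_scale > 1 enables classifier-free guidance, trading diversity for
# closer adherence to the text prompt; ~500 tokens is about 10 s of audio.
audio_values = model.generate(**inputs, do_sample=True, guidance_scale=3, max_new_tokens=500)
print(audio_values.shape)  # (batch, channels, samples)
```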