Magnet Medium 30secs

Developed by facebook
MAGNeT is a text-to-music and text-to-sound model capable of generating high-quality audio samples from text descriptions.
Downloads 409
Release Time: 1/10/2024

Model Overview

MAGNeT is a masked generative non-autoregressive Transformer trained over a 32 kHz EnCodec tokenizer with 4 codebooks sampled at 50 Hz. It requires neither semantic token conditioning nor model cascading: a single non-autoregressive Transformer generates all 4 codebooks.
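
To make these figures concrete, the short sketch below works out how many tokens the model has to produce for one 30-second clip. The constants come from the overview above; the 30-second length is taken from the model name, and the helper name `token_grid` is only illustrative.

```python
# Token-budget arithmetic for one clip, using the figures stated in the overview.
FRAME_RATE_HZ = 50        # EnCodec token frames per second
NUM_CODEBOOKS = 4         # codebooks per frame
SAMPLE_RATE_HZ = 32_000   # audio sample rate
CLIP_SECONDS = 30         # clip length implied by the model name


def token_grid(seconds: int = CLIP_SECONDS) -> tuple[int, int]:
    """Return (frames, total_tokens) for a clip of the given length."""
    frames = FRAME_RATE_HZ * seconds
    return frames, frames * NUM_CODEBOOKS


frames, total_tokens = token_grid()
samples = SAMPLE_RATE_HZ * CLIP_SECONDS
print(frames, total_tokens, samples)  # 1500 6000 960000
```

In other words, roughly 960,000 audio samples are represented by a 1500 x 4 grid of discrete tokens, which is the grid the Transformer fills in.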

Model Features

Non-Autoregressive Generation
Uses a single non-autoregressive Transformer to generate all 4 codebooks, with no cascade of models
High-Quality Audio Generation
Capable of generating high-quality music and sound samples from text descriptions
Diverse Style Support
Generates music in a range of styles, such as hip-hop and EDM
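
As a rough illustration of what "masked, non-autoregressive" generation means, the toy sketch below fills a token sequence in a few parallel passes rather than one token at a time. It is a generic iterative masked-decoding loop with a cosine schedule, not MAGNeT's actual procedure, which this page does not describe. The `toy_predict` function is a random stand-in for the Transformer, and `MASK`, the vocabulary size, and the step count are all hypothetical.

```python
import math
import random

MASK = -1  # hypothetical sentinel for a position that has not been decoded yet


def toy_predict(tokens, vocab_size, rng):
    """Stand-in for the Transformer: propose (token, confidence) for every position."""
    return [(rng.randrange(vocab_size), rng.random()) for _ in tokens]


def masked_decode(length, vocab_size, steps, seed=0):
    """Fill `length` positions in `steps` parallel passes."""
    rng = random.Random(seed)
    tokens = [MASK] * length
    for step in range(1, steps + 1):
        proposals = toy_predict(tokens, vocab_size, rng)
        masked = [i for i, t in enumerate(tokens) if t == MASK]
        # Cosine schedule: how many positions should still be masked after this pass.
        keep_masked = int(length * math.cos(math.pi / 2 * step / steps))
        n_commit = max(0, len(masked) - keep_masked)
        # Commit the most confident proposals; the rest stay masked for the next pass.
        masked.sort(key=lambda i: proposals[i][1], reverse=True)
        for i in masked[:n_commit]:
            tokens[i] = proposals[i][0]
    return tokens
```

For example, `masked_decode(length=1500, vocab_size=2048, steps=20)` fills a sequence of 1500 positions in 20 passes, whereas an autoregressive decoder would need 1500 sequential steps for the same sequence. The numbers here are illustrative only.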

Model Capabilities

Text-to-Music Generation
Text-to-Sound Generation
30-second Audio Generation
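
A minimal usage sketch, assuming the model is loaded through the `audiocraft` library and published under the ID `facebook/magnet-medium-30secs` (inferred from the title and developer shown on this page, not stated explicitly). The wrapper `generate_clips` and the `out_prefix` parameter are illustrative names, not part of any library.

```python
def generate_clips(descriptions, out_prefix="magnet_clip"):
    """Generate one clip per text description and save each as a WAV file."""
    # Imported lazily so this file can be loaded without audiocraft installed.
    from audiocraft.models import MAGNeT
    from audiocraft.data.audio import audio_write

    # Assumed model ID; adjust it if the checkpoint is published under another name.
    model = MAGNeT.get_pretrained("facebook/magnet-medium-30secs")
    wavs = model.generate(descriptions)  # one waveform per description

    paths = []
    for idx, wav in enumerate(wavs):
        stem = f"{out_prefix}_{idx}"
        # audio_write appends the file extension; loudness normalization is optional.
        audio_write(stem, wav.cpu(), model.sample_rate, strategy="loudness")
        paths.append(stem + ".wav")
    return paths
```

Calling `generate_clips(["80s hip-hop beat", "energetic EDM"])` should write one WAV file per prompt. The first call downloads the model weights, and a GPU is advisable for reasonable generation times.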

Use Cases

Music Creation
Stylized Music Generation
Generate music in specific styles from text descriptions, e.g., 80s hip-hop style
Produces high-quality music clips matching the description
Background Music Production
Generate custom background music for podcasts, videos, etc.
Creates music that matches the content atmosphere
Research Applications
Generative Model Research
Used to explore and understand the limitations of generative models
Advances scientific development in audio generation