MAGNeT Medium 10secs

Developed by Facebook
MAGNeT is a text-to-music and text-to-sound model that can generate high-quality audio samples based on text descriptions.
Downloads 322
Release Time: 1/10/2024

Model Overview

MAGNeT is a masked generative non-autoregressive Transformer that operates on tokens from a 32 kHz EnCodec tokenizer with 4 codebooks sampled at 50 Hz. It needs neither semantic token conditioning nor model cascading: a single non-autoregressive Transformer generates all 4 codebooks.
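The figures above imply a fixed token budget per clip. The short sketch below works through that arithmetic; the 10-second duration is an assumption taken from the model name, and the function and constant names are illustrative only.

```python
SAMPLE_RATE_HZ = 32_000  # EnCodec audio sample rate stated above
FRAME_RATE_HZ = 50       # token frames per second, per codebook
NUM_CODEBOOKS = 4        # codebooks generated by the Transformer
CLIP_SECONDS = 10        # assumption: clip length implied by "10secs"


def token_budget(seconds=CLIP_SECONDS):
    """Return the frame, token, and sample counts for a clip of this length."""
    frames = FRAME_RATE_HZ * seconds
    return {
        "frames": frames,                                      # time steps per codebook
        "tokens": frames * NUM_CODEBOOKS,                      # tokens across all codebooks
        "audio_samples": SAMPLE_RATE_HZ * seconds,             # waveform samples per channel
        "samples_per_frame": SAMPLE_RATE_HZ // FRAME_RATE_HZ,  # audio samples covered by one frame
    }


print(token_budget())
# {'frames': 500, 'tokens': 2000, 'audio_samples': 320000, 'samples_per_frame': 640}
```

Under these assumptions, a 10-second clip corresponds to 500 frames per codebook, or 2,000 tokens in total.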

Model Features

Non-autoregressive generation
Uses a single non-autoregressive Transformer to generate all codebooks, with no model cascading.
High-quality audio generation
Generates high-quality audio samples from text descriptions.
Multi-codebook processing
Represents audio with 4 EnCodec codebooks sampled at 50 Hz.
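To give a feel for what masked, non-autoregressive generation means, here is a toy sketch of generic confidence-based iterative unmasking with a cosine schedule. It is not MAGNeT's actual decoding procedure: the `toy_predict` function is a hypothetical random stand-in for the Transformer, the vocabulary size and step count are placeholders, and the sketch handles a single codebook only.

```python
import math
import random

MASK = -1  # sentinel for a position that has not been decoded yet


def cosine_mask_ratio(step, total_steps):
    """Fraction of positions that should remain masked after `step` steps."""
    return math.cos(math.pi / 2 * step / total_steps)


def toy_predict(tokens, vocab_size, rng):
    """Hypothetical stand-in for the model: a (token, confidence) guess per position."""
    return [(rng.randrange(vocab_size), rng.random()) for _ in tokens]


def masked_decode(length, vocab_size=2048, total_steps=10, seed=0):
    """Fill a fully masked sequence in a fixed number of parallel steps."""
    rng = random.Random(seed)
    tokens = [MASK] * length
    for step in range(1, total_steps + 1):
        preds = toy_predict(tokens, vocab_size, rng)  # all positions predicted at once
        masked = [i for i, t in enumerate(tokens) if t == MASK]
        keep_masked = int(length * cosine_mask_ratio(step, total_steps))
        n_commit = len(masked) - keep_masked
        # Commit the most confident predictions; the rest stay masked for later steps.
        masked.sort(key=lambda i: preds[i][1], reverse=True)
        for i in masked[:n_commit]:
            tokens[i] = preds[i][0]
    return tokens


tokens = masked_decode(length=500)
print(len(tokens), MASK in tokens)
```

The point of the sketch is the step count: the 500 positions are filled in 10 parallel passes rather than one pass per token, which is the main contrast with autoregressive decoding.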

Model Capabilities

Text-to-music generation
Text-to-sound generation

Use Cases

Music creation
Generate music in a specific style
Generates music in a style specified by a text description, such as funky house music with an 80s hip-hop feel.
Produces a 10-second high-quality music sample.
Generate a relaxing song
Generates a relaxing song from a text description that names influences such as lo-fi, chill electronica, and slow tempo.
Produces a 10-second high-quality music sample.
Podcast production
Generate podcast opening music
Generates a catchy rhythm for a podcast opening from a text description.
Produces a 10-second high-quality music sample.
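The use cases above could be run along the following lines with the AudioCraft library. This is a minimal sketch, not an official recipe: it assumes `audiocraft` is installed, that the checkpoint is published as `facebook/magnet-medium-10secs` (inferred from the model title and developer), and the prompt wording and output file names are illustrative.

```python
# Illustrative prompts paraphrasing the three use cases above.
PROMPTS = [
    "funky house music with an 80s hip-hop feel",
    "a relaxing song influenced by lo-fi, chill electronica, and slow tempo",
    "a catchy rhythm for a podcast opening",
]


def generate_clips(descriptions, model_name="facebook/magnet-medium-10secs"):
    """Generate one clip per description and save each as a WAV file."""
    # Imported lazily so the prompt list can be inspected without audiocraft installed.
    from audiocraft.models import MAGNeT
    from audiocraft.data.audio import audio_write

    model = MAGNeT.get_pretrained(model_name)  # assumed checkpoint name
    wavs = model.generate(descriptions)        # one waveform per description
    for idx, wav in enumerate(wavs):
        # audio_write adds the file extension; "loudness" applies loudness normalization.
        audio_write(f"clip_{idx}", wav.cpu(), model.sample_rate, strategy="loudness")
    return len(wavs)


if __name__ == "__main__":
    print(generate_clips(PROMPTS), "clips written")
```

Running the script downloads the model weights on first use, so it needs network access and enough memory for a medium-sized checkpoint.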