
Tango2 Full

Developed by declare-lab
Tango 2 is an improved text-to-audio generation model built on Tango. It is aligned for audio generation using Direct Preference Optimization (DPO).
Downloads 63
Release Time: 4/13/2024

Model Overview

Tango 2 is a diffusion-based text-to-audio generation model. Starting from the Tango-full-ft checkpoint, it is aligned with DPO on Audio-alpaca, a dataset of paired text-audio preferences, and generates high-quality audio from text descriptions.
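To make the alignment step more concrete, here is a minimal sketch of the generic DPO objective for a single preference pair. It assumes log-likelihoods are available for the preferred and rejected outputs under both the model being trained and a frozen reference model. The function name, argument names, and the `beta` value are illustrative assumptions. Diffusion models such as Tango 2 use a variant of this loss expressed through denoising errors, which this sketch does not reproduce.

```python
import math


def dpo_loss(policy_chosen, policy_rejected, ref_chosen, ref_rejected, beta=0.1):
    """Generic DPO loss for one preference pair (illustrative, not Tango 2's code).

    All four inputs are log-likelihoods. `beta` is an assumed temperature that
    controls how far the policy may drift from the reference model.
    """
    # How much more the policy favours the chosen output than the reference does,
    # minus the same quantity for the rejected output.
    margin = (policy_chosen - ref_chosen) - (policy_rejected - ref_rejected)
    # -log(sigmoid(beta * margin)), written as a numerically stable softplus.
    x = -beta * margin
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


# No preference learned yet: policy and reference agree on both outputs.
no_preference = dpo_loss(-5.0, -5.0, -5.0, -5.0)
# Policy has shifted probability toward the chosen output.
aligned = dpo_loss(-4.0, -6.0, -5.0, -5.0)
# Policy has shifted probability toward the rejected output.
misaligned = dpo_loss(-6.0, -4.0, -5.0, -5.0)
```

The loss falls as the policy favours the preferred audio more strongly than the reference does, and rises when it favours the rejected audio.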

Model Features

Direct Preference Optimization (DPO)
Uses DPO for alignment training, improving the quality of generated audio and how closely it matches the text description
Expanded Training Dataset
Trained on an extended version of the Audio-alpaca dataset to improve generalization
High-Quality Audio Generation
Supports sampling with 100-200 steps to generate high-quality audio
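As a rough illustration of the 100-200 step range, the sketch below wraps a generation call so that the requested step count stays inside that range. The helper names `clamp_steps` and `generate_audio` and the stand-in `EchoModel` class are hypothetical. The only assumption about the real model is that it exposes a `generate(prompt, steps=...)` method, as the declare-lab `tango` package appears to do.

```python
def clamp_steps(steps, low=100, high=200):
    """Keep the sampling step count inside the 100-200 range quoted above."""
    return max(low, min(high, steps))


def generate_audio(model, prompt, steps=200):
    """Call model.generate with a step count limited to the supported range.

    `model` is any object with a generate(prompt, steps=...) method (assumed).
    """
    return model.generate(prompt, steps=clamp_steps(steps))


class EchoModel:
    """Stand-in that only records the call; it produces no audio."""

    def generate(self, prompt, steps):
        return {"prompt": prompt, "steps": steps}


# A request for 50 steps is raised to the lower bound of the range.
result = generate_audio(EchoModel(), "Rolling thunder with heavy rain", steps=50)
```

With the actual package installed, `EchoModel()` would presumably be replaced by something like `Tango("declare-lab/tango2-full")`. Treat that constructor call as an assumption to check against the project's README.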

Model Capabilities

Text-to-Audio Conversion
Batch Audio Generation
Scene Sound Effect Synthesis

Use Cases

Multimedia Production
Sound Effect Generation
Automatically generates scene-specific sound effects from text descriptions
Can generate high-quality sound effects such as thunder and cheering
Background Music Synthesis
Generates matching background music based on scene descriptions
Game Development
Game Sound Effect Production
Quickly generates various sound effects required for game scenes