Tango2
Developed by declare-lab
Tango 2 is an improved text-to-audio generation model built on Tango. It raises audio generation quality through Direct Preference Optimization (DPO) alignment training.
Downloads 147
Release Time: 4/13/2024
Model Overview
Tango 2 is a diffusion-based text-to-audio generation model that generates high-quality audio from text prompts. It is aligned with human audio preferences using Direct Preference Optimization (DPO).
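To make the DPO idea concrete, here is a minimal sketch of the generic DPO loss for a single preference pair. It is not Tango 2's training code: a diffusion model does not expose exact log-likelihoods, so the actual objective presumably substitutes diffusion-based quantities. The function name, argument names, and the default `beta` are illustrative assumptions.

```python
import math

def dpo_loss(policy_logp_w, policy_logp_l, ref_logp_w, ref_logp_l, beta=0.1):
    """Generic DPO loss for one (preferred, rejected) pair.

    policy_logp_* : log-probability of each sample under the model being trained
    ref_logp_*    : log-probability of each sample under the frozen reference model
    beta          : strength of the pull toward the reference (assumed value)
    """
    # How much more the policy favors the preferred sample than the reference does.
    margin = (policy_logp_w - ref_logp_w) - (policy_logp_l - ref_logp_l)
    z = beta * margin
    # -log(sigmoid(z)), written in a numerically stable form.
    if z >= 0:
        return math.log1p(math.exp(-z))
    return -z + math.log1p(math.exp(z))

# The loss shrinks as the policy ranks the preferred audio higher than the rejected one.
no_preference = dpo_loss(-5.0, -5.0, -5.0, -5.0)
good_ranking = dpo_loss(-3.0, -7.0, -5.0, -5.0)
```

With no margin between the two samples the loss equals log 2, and it decreases toward zero as the policy separates the preferred sample from the rejected one.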
Model Features
DPO Alignment Training
Uses the audio-alpaca dataset for Direct Preference Optimization to improve audio generation quality
High-Quality Audio Generation
Supports sampling with 100 to 200 steps; more steps produce more natural and realistic audio at the cost of longer run time
Batch Generation Capability
Can generate audio samples for a batch of text prompts in a single call
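The features above can be exercised with a short script. This is a sketch under assumptions: it presumes the `tango` package from the declare-lab repository and `soundfile` are installed, that `Tango.generate_for_batch` accepts a `steps` argument, and that output is 16 kHz audio, as described in the project README at the time of writing. The helpers `clip_filename` and `generate_clips` are hypothetical names introduced here for illustration.

```python
import re

def clip_filename(prompt, index=0):
    """Turn a text prompt into a safe .wav filename (hypothetical helper)."""
    slug = re.sub(r"[^a-z0-9]+", "_", prompt.lower()).strip("_")
    return f"{slug[:60]}_{index}.wav"

def generate_clips(prompts, steps=200, model_name="declare-lab/tango2"):
    """Generate one clip per prompt and write each to disk; returns the paths."""
    # Imported lazily so the filename helper works without the model installed.
    import soundfile as sf
    from tango import Tango

    tango = Tango(model_name)  # weights are downloaded on first use
    # Assumed signature: generate_for_batch(prompts, steps=...) returns one waveform per prompt.
    audios = tango.generate_for_batch(prompts, steps=steps)

    paths = []
    for i, (prompt, audio) in enumerate(zip(prompts, audios)):
        path = clip_filename(prompt, i)
        sf.write(path, audio, samplerate=16000)  # 16 kHz output is an assumption
        paths.append(path)
    return paths

# Example call (requires the model and a suitable GPU or CPU setup):
# generate_clips(["An audience cheering and clapping", "Rain falling on a tin roof"])
```

Choosing 200 steps favors quality over speed; dropping to 100 roughly halves sampling time.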
Model Capabilities
Text-to-Audio Conversion
High-Quality Audio Generation
Batch Audio Generation
Use Cases
Sound Effect Production
Environmental Sound Generation
Generate natural environmental sounds based on text descriptions
Produces realistic environmental sounds such as flowing water and wind
Event Sound Effect Generation
Generate sound effects for specific events such as applause or cheers
Creates vivid sound effects matching scene descriptions
Media Production
Film/TV Score Generation
Generate background music based on scene descriptions
Produces music segments that match the scene atmosphere