# Preference Alignment Training
DPO A5 Nlp
TRL is a library for training and fine-tuning Transformer-based language models with reinforcement learning.
Large Language Model
Transformers

EraCoding
26
1
Tango2 Full
Tango 2 is an improved text-to-audio generation model built on Tango and aligned using Direct Preference Optimization (DPO).
Audio Generation
Transformers English

declare-lab
63
9