# Reference-Free Reward Optimization
**Llama 3 Base 8B SFT IPO** (princeton-nlp)
SimPO is a simple preference optimization method that eliminates the need for a reference model, aiming to enhance model performance by simplifying the preference optimization process.
Large Language Model, Transformers

**Llama 3 Base 8B SFT** (princeton-nlp)
SimPO is a preference optimization method that eliminates the need for a reference model, simplifying the preference alignment process; a sketch of the objective appears below.
Large Language Model, Transformers
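
To make "reference-free" concrete, here is a minimal sketch of a SimPO-style loss: the implicit reward is the policy's length-normalized log-probability scaled by beta, and a target margin gamma separates chosen from rejected responses, so no reference model is ever queried. Function names and hyperparameter defaults are illustrative, not taken from the princeton-nlp training code.

```python
import torch
import torch.nn.functional as F

def simpo_style_loss(
    policy_chosen_logps: torch.Tensor,    # summed log-probs of chosen responses, shape (batch,)
    policy_rejected_logps: torch.Tensor,  # summed log-probs of rejected responses, shape (batch,)
    chosen_lengths: torch.Tensor,         # token counts of chosen responses, shape (batch,)
    rejected_lengths: torch.Tensor,       # token counts of rejected responses, shape (batch,)
    beta: float = 2.0,                    # illustrative value
    gamma: float = 1.0,                   # illustrative target reward margin
) -> torch.Tensor:
    """Length-normalized, reference-free preference loss in the style of SimPO."""
    # Implicit reward: average per-token log-probability under the policy, scaled by beta.
    chosen_rewards = beta * policy_chosen_logps / chosen_lengths
    rejected_rewards = beta * policy_rejected_logps / rejected_lengths
    # Bradley-Terry-style objective with a margin; no reference-model log-probs are needed.
    return -F.logsigmoid(chosen_rewards - rejected_rewards - gamma).mean()
```

Compared with DPO-style objectives, the reward here depends only on the policy being trained, which is what removes the reference model from the pipeline.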