Mistral-ORPO-β
Mistral-ORPO-β is a 7B-parameter language model fine-tuned from Mistral-7B with ORPO (Odds Ratio Preference Optimization), a method that learns preferences directly from preference data without a supervised fine-tuning warm-up phase.
Downloads: 18
Release Time: 3/12/2024
Model Overview
This is a 7B-parameter language model optimized with the ORPO method. It focuses on text generation tasks and performs strongly across multiple benchmarks.
Model Features
ORPO Optimization
Uses the Odds Ratio Preference Optimization method to learn preferences directly, with no supervised fine-tuning warm-up phase (see the sketch after this list).
Efficient Fine-Tuning
Achieves strong performance after fine-tuning on just 61k instances of the UltraFeedback dataset.
Multi-Task Performance
Outperforms comparable models on multiple benchmarks, including AlpacaEval and MT-Bench.
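To make the ORPO feature above concrete, here is a minimal PyTorch sketch of the odds-ratio loss term described in the ORPO paper. The function name, the length-normalized log-probability inputs, and the default λ value are illustrative assumptions, not the model's exact training code.

```python
import torch
import torch.nn.functional as F

def orpo_loss(chosen_logps, rejected_logps, nll_chosen, lam=0.1):
    """Illustrative ORPO objective (sketch, not the released training code).

    chosen_logps / rejected_logps: length-normalized log-probabilities
        log p(y|x) of the chosen and rejected responses under the policy.
    nll_chosen: standard next-token NLL on the chosen response.
    lam: weight of the odds-ratio term (the value here is an assumption).
    """
    # log odds(y|x) = log p - log(1 - p), computed stably in log space
    log_odds_chosen = chosen_logps - torch.log1p(-torch.exp(chosen_logps))
    log_odds_rejected = rejected_logps - torch.log1p(-torch.exp(rejected_logps))

    # odds-ratio term: push the chosen response's odds above the rejected one's
    ratio_loss = -F.logsigmoid(log_odds_chosen - log_odds_rejected)

    # total objective: SFT loss on the chosen response plus the weighted penalty
    return nll_chosen + lam * ratio_loss.mean()
```

Because the supervised term and the preference term are combined in a single loss, no separate SFT warm-up stage is needed.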
Model Capabilities
Text Generation
Dialogue Systems
Question Answering
Instruction Following (see the usage sketch below)
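The capabilities above can be exercised through a standard Hugging Face transformers chat workflow. A minimal usage sketch follows; the repository id kaist-ai/mistral-orpo-beta and the generation settings are assumptions, so adjust them to the actual published checkpoint.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "kaist-ai/mistral-orpo-beta"  # assumed Hub repository id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Build a single-turn chat prompt with the tokenizer's chat template
messages = [{"role": "user",
             "content": "Explain preference optimization in one sentence."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Generate and decode only the newly produced tokens
output = model.generate(inputs, max_new_tokens=128,
                        do_sample=True, temperature=0.7)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```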
Use Cases
Dialogue Systems
Intelligent Assistant
Can be used to build intelligent dialogue assistants.
Achieves a 91.16% win rate on AlpacaEval 1.0.
Educational Applications
Educational Q&A
Can be used for question-answering systems in the education field.
Achieves 63.26% accuracy on the MMLU test.