Smaug 34B V0.1
A large language model fine-tuned from jondurbin/bagel-34b-v0.2 and optimized for preference learning with the novel DPO-Positive (DPOP) technique
Downloads 2,694
Release Date: January 25, 2024
Model Overview
Smaug-34B-v0.1 is a 34B-parameter large language model that addresses shortcomings of standard DPO through DPOP, excelling in mathematical reasoning and general tasks.
Model Features
DPOP Optimization Technology
Addresses a failure mode of standard DPO on preference pairs with small edit distance, where the likelihood of the preferred completion can actually decrease during training, via the novel DPO-Positive loss function
Multi-Domain Performance Improvement
Outstanding performance on diverse datasets such as ARC, HellaSwag, and MetaMath
Open-Source Tech Stack
Training details and datasets are fully disclosed in the accompanying research paper, supporting community-driven optimization
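As a rough illustration (not the authors' implementation), the DPOP objective can be sketched as the standard DPO log-sigmoid margin with an extra penalty that fires whenever the policy's log-likelihood of the preferred completion drops below the reference model's. The hyperparameter values `beta` and `lam` below are illustrative placeholders, not the values used to train Smaug:

```python
import math

def dpop_loss(logp_w, logp_l, ref_logp_w, ref_logp_l, beta=0.3, lam=50.0):
    """Sketch of a DPO-Positive loss for one preference pair.

    logp_w / logp_l         : policy log-probs of chosen / rejected completions
    ref_logp_w / ref_logp_l : reference-model log-probs of the same completions
    beta, lam               : illustrative hyperparameters (assumptions)
    """
    # Standard DPO implicit-reward margin between chosen and rejected.
    margin = (logp_w - ref_logp_w) - (logp_l - ref_logp_l)
    # DPOP penalty: positive only when the policy assigns LESS probability
    # to the preferred completion than the reference model does.
    penalty = max(0.0, ref_logp_w - logp_w)
    # Negative log-sigmoid of the penalized margin.
    z = beta * (margin - lam * penalty)
    return -math.log(1.0 / (1.0 + math.exp(-z)))
```

When the policy keeps the chosen completion at least as likely as under the reference model, the penalty is zero and the loss reduces to standard DPO; otherwise the large `lam` term sharply increases the loss, pushing the preferred completion's likelihood back up.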
Model Capabilities
Complex text generation
Mathematical problem-solving
Common-sense reasoning
Open-domain question answering
Truthful answer generation
Use Cases
Education
Math Tutoring
Helps students solve grade-school math problems such as those in the GSM8K benchmark
GSM8K score of 72.18
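GSM8K scores like the 72.18 above are typically computed by exact-match comparison of final numeric answers; reference solutions end with a `#### <number>` line. A minimal sketch of this scoring convention (the exact evaluation harness used for Smaug's reported score is not specified here):

```python
import re

def extract_gsm8k_answer(text):
    """Pull the final numeric answer after the GSM8K '####' marker."""
    m = re.search(r"####\s*(-?[\d,]+(?:\.\d+)?)", text)
    # Strip thousands separators so '1,200' and '1200' compare equal.
    return m.group(1).replace(",", "") if m else None

def gsm8k_accuracy(predictions, references):
    """Fraction of items whose extracted final answers match exactly."""
    correct = sum(
        1 for p, r in zip(predictions, references)
        if extract_gsm8k_answer(p) == extract_gsm8k_answer(r)
    )
    return correct / len(references)
```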
Research
Preference Learning Research
Serves as a benchmark model for DPOP technology
Outperforms standard DPO in multiple tasks