
Llama 3 Bophades V3 8B

Developed by nbeerbower
A DPO fine-tuned model based on Llama-3-8B, focused on improving truthfulness and mathematical reasoning
Downloads: 44
Released: 5/2/2024

Model Overview

This model is an improved version of Llama-3-8B, fine-tuned using Direct Preference Optimization (DPO) on the truthy-dpo and orca_math_dpo datasets to improve the model's truthfulness and mathematical reasoning.
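The actual training code isn't shown on this page, but the DPO objective it describes can be sketched in a few lines. This is a minimal illustration of the per-pair loss, not the model's training script; the example log-probabilities and the beta value are made up for demonstration:

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one preference pair.

    Inputs are summed log-probabilities of the chosen and rejected
    completions under the trainable policy and the frozen reference
    model. The loss is -log(sigmoid(beta * margin)), where the margin
    measures how much more the policy prefers the chosen answer than
    the reference model does.
    """
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_margin - rejected_margin)
    # -log(sigmoid(x)) written stably as log(1 + e^{-x})
    return math.log1p(math.exp(-logits))

# The loss shrinks as the policy favors the chosen answer more
# strongly (relative to the reference) than the rejected one.
print(dpo_loss(-10.0, -12.0, -11.0, -11.5))
```

Minimizing this loss pushes the policy toward the accepted answer and away from the rejected one without needing a separate reward model.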

Model Features

Direct Preference Optimization (DPO)
Fine-tuned with DPO, which optimizes output quality by contrasting chosen and rejected answers for the same prompt
Multi-dataset fusion training
Trained on a combination of the truthy-dpo (truthfulness) and orca_math_dpo (mathematical reasoning) datasets
LoRA efficient fine-tuning
Uses Low-Rank Adaptation (LoRA) for parameter-efficient fine-tuning, reducing compute and memory requirements
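To make the LoRA point concrete, the sketch below shows the core idea in plain Python: the frozen weight W is augmented with a trainable low-rank product B @ A scaled by alpha / r. The matrix sizes, alpha, and rank here are illustrative, not this model's actual adapter configuration:

```python
def matmul(X, Y):
    """Plain-Python matrix multiply: X (m x k) @ Y (k x n)."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)]
            for row in X]

def lora_effective_weight(W, A, B, alpha=2.0, r=2):
    """Return W + (alpha / r) * B @ A, the merged LoRA weight.

    W is the frozen d_out x d_in base weight; only the adapters
    B (d_out x r) and A (r x d_in) are trained, so the trainable
    parameter count drops from d_out * d_in to r * (d_out + d_in).
    """
    scale = alpha / r
    BA = matmul(B, A)  # low-rank (rank <= r) update, d_out x d_in
    return [[w + scale * u for w, u in zip(w_row, u_row)]
            for w_row, u_row in zip(W, BA)]

# B is zero-initialized in LoRA, so training starts exactly at the
# base model's weights.
W = [[1.0, 2.0], [3.0, 4.0]]
B0 = [[0.0, 0.0], [0.0, 0.0]]     # d_out x r, zero-initialized
A0 = [[0.5, -0.5], [0.25, 0.75]]  # r x d_in
print(lora_effective_weight(W, A0, B0))
```

Because only A and B receive gradients, fine-tuning an 8B model this way needs far less memory than updating every weight, which is the efficiency the feature list refers to.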

Model Capabilities

Text generation
Question answering systems
Mathematical problem solving
Truthful answer generation

Use Cases

Education
Mathematical problem solving
Helps students understand and solve various mathematical problems
Fine-tuned with the orca_math_dpo dataset to enhance mathematical reasoning capabilities
Information retrieval
Truthful question answering system
Provides more reliable and truthful question-answering services
Fine-tuned with the truthy-dpo dataset to reduce generation of false information