Codellama-7b-hf ReFT GSM8k

Developed by lqtrung1998
Enhances the reasoning and generalization capabilities of large language models through reinforced fine-tuning. Built on a fine-tuned CodeLlama base, it is suited to code generation and comprehension tasks.
Downloads 38
Release Time : 1/29/2024

Model Overview

The ReFT (Reinforced Fine-Tuning) method improves the performance of large language models on mathematical reasoning tasks through reinforcement learning, here optimized for the GSM8k math word problem dataset.

Model Features

Reinforcement Fine-Tuning
Optimizes model performance on mathematical reasoning tasks through reinforcement learning.
Python SDP Chain-of-Thought
Trains the model on a Python-based chain-of-thought format, in which each reasoning chain is a short executable program rather than free-form text.
Re-ranking Mechanism
Equipped with a dedicated re-ranking model to evaluate the correctness of output reasoning chains.
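To illustrate the Python-based chain-of-thought format described above: the model is trained to emit a short program whose final variable holds the answer, instead of a natural-language derivation. The sketch below is a hand-written illustration of what such an output might look like for a typical GSM8k word problem (the variable names and comment style are illustrative, not the model's exact output format).

```python
# Problem: "Natalia sold 48 clips in April, and half as many in May.
#           How many clips did she sell altogether?"
#
# A Python-style chain-of-thought answers by computing, step by step:
clips_april = 48
clips_may = clips_april // 2       # "half as many" as in April
ans = clips_april + clips_may      # total across both months
print(ans)                         # -> 72
```

Because the reasoning chain is executable, its correctness can be checked by running it, which is also what makes re-ranking candidate chains tractable.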

Model Capabilities

Mathematical Problem Solving
Python Code Generation
Structured Reasoning
Chain-of-Thought Generation

Use Cases

Education
Math Problem Solving
Solves mathematical word problems from the GSM8k dataset.
Achieves 81.2% accuracy on the GSM8k test set.
Programming Assistance
Code Generation
Generates Python solution code based on mathematical problem descriptions.