Open RS1
Developed by knoveleng
A small-scale (1.5B-parameter) large language model fine-tuned with reinforcement learning to strengthen its reasoning capabilities
Downloads: 6,229
Release Time: 3/18/2025
Model Overview
This project explores enhancing the reasoning capabilities of small-scale large language models (LLMs) under resource-constrained conditions using reinforcement learning (RL). It employs the Group Relative Policy Optimization (GRPO) algorithm and is trained on a carefully selected compact mathematical reasoning dataset.
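The overall recipe can be approximated with the GRPO implementation in the TRL library. The sketch below is illustrative only: the dataset repo ID, base model checkpoint, column names, reward function, and hyperparameters are assumptions standing in for the project's actual configuration, not the authors' published setup.

```python
"""Minimal GRPO fine-tuning sketch using TRL.
Assumptions: dataset/model repo IDs, the "prompt"/"answer" column names,
and all hyperparameters are illustrative placeholders."""
from datasets import load_dataset
from trl import GRPOConfig, GRPOTrainer

# Assumed compact math-reasoning dataset with plain-text "prompt" and "answer" columns.
dataset = load_dataset("knoveleng/open-rs", split="train")

def accuracy_reward(completions, answer, **kwargs):
    """Reward 1.0 if the reference answer string appears in the completion, else 0.0.
    Assumes a standard (plain-text prompt) dataset; a real setup would parse and
    compare the final boxed answer instead of doing a substring match."""
    return [1.0 if str(a) in c else 0.0 for c, a in zip(completions, answer)]

config = GRPOConfig(
    output_dir="open-rs1-grpo",
    per_device_train_batch_size=4,
    num_generations=4,           # group size used for the relative advantage estimate
    max_completion_length=1024,
    learning_rate=1e-6,
)

trainer = GRPOTrainer(
    model="deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B",  # assumed 1.5B base checkpoint
    reward_funcs=accuracy_reward,
    args=config,
    train_dataset=dataset,
)
trainer.train()
```

GRPO samples a group of completions per prompt and scores each one against the group mean, which avoids training a separate value model and keeps the memory footprint small enough for a handful of GPUs.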
Model Features
Enhanced Efficient Reasoning
Reinforcement learning fine-tuning substantially improves reasoning, with AMC23 accuracy rising from 63% to 80% and AIME24 reaching 46.7%
Low-Cost Training
Trains on only 7,000 samples at a cost of $42, completing within 24 hours on 4 NVIDIA A40 GPUs
Resource Optimization
Designed for resource-constrained environments, significantly reducing computational costs compared to 7B models
Model Capabilities
Mathematical Reasoning
Text Generation
Logical Reasoning
Use Cases
Education
Mathematical Problem Solving
Solving a range of mathematical reasoning problems (see the inference sketch after the use cases)
AMC23 accuracy reaches 80%
Research
Small LLM Capability Validation
Validating the application of reinforcement learning on small-scale models
AIME24 score of 46.7%, surpassing the o1-preview model
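For the mathematical problem-solving use case, the model can be queried like any Hugging Face causal LM. The repo ID, prompt wording, and generation settings below are assumptions chosen for illustration rather than settings published with the model.

```python
"""Illustrative inference sketch; the repo ID, prompt, and sampling
parameters are assumptions, not the authors' recommended settings."""
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "knoveleng/Open-RS1"  # assumed Hugging Face repo ID
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Ask for a worked solution with the final answer in \boxed{} so it is easy to extract.
messages = [
    {"role": "user",
     "content": "Compute the number of positive divisors of 2024. "
                "Reason step by step and put the final answer in \\boxed{}."}
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=2048, do_sample=True, temperature=0.6)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```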