Nano Aha Moment 3b

Developed by McGill-NLP
A 3-billion-parameter language model trained with reinforcement learning to solve mathematical reasoning tasks, especially countdown games.
Downloads 55
Release Date: 3/31/2025

Model Overview

A language model based on Qwen2.5-3B and fine-tuned with GRPO (Group Relative Policy Optimization), designed for mathematical reasoning tasks, particularly countdown games.
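
GRPO (Group Relative Policy Optimization) samples several responses for the same prompt and scores each one relative to the rest of its group, rather than against a learned value function. The following is a minimal sketch of that normalization step only, not the project's training code. The function name, the `eps` constant, and the use of the population standard deviation are assumptions made for illustration.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize the rewards of one prompt's sampled responses.

    Each advantage is (reward - group mean) / (group std + eps).
    `eps` is a hypothetical stabilizer that avoids division by zero
    when every response in the group gets the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an assumption for this sketch
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four sampled answers to one countdown prompt: two correct (1.0), two wrong (0.0).
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
```

Correct responses end up with positive advantages and incorrect ones with negative advantages. When all responses in a group receive the same reward, every advantage is zero and that group contributes no learning signal.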

Model Features

Mathematical Reasoning Optimization
Trained with reinforcement learning specifically for mathematical reasoning tasks such as countdown games
Structured Reasoning Output
Writes its reasoning process inside <think> tags and its final answer inside <answer> tags
Efficient Training Techniques
Uses Flash Attention 2, DeepSpeed ZeRO Stage 2, and vLLM for efficient training and inference
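
Because the reasoning and the final answer arrive in separate tags, a response can be split and checked mechanically. The sketch below is an illustration under stated assumptions, not the model's own evaluation code. It assumes the response contains one `<think>` block followed by one `<answer>` block, that the answer is an arithmetic expression using only `+`, `-`, `*`, `/` and parentheses, and that each given number must be used exactly once. The helper names `split_response`, `evaluate`, and `check_countdown` are hypothetical.

```python
import ast
import operator
import re
from collections import Counter

RESPONSE_RE = re.compile(
    r"<think>(.*?)</think>\s*<answer>(.*?)</answer>", re.DOTALL
)

# Only the four basic arithmetic operators are accepted.
OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def split_response(text):
    """Return (reasoning, answer), or None if the tags are missing."""
    match = RESPONSE_RE.search(text)
    if match is None:
        return None
    return match.group(1).strip(), match.group(2).strip()


def evaluate(node):
    """Evaluate a parsed expression without calling eval()."""
    if isinstance(node, ast.Expression):
        return evaluate(node.body)
    if isinstance(node, ast.Constant) and type(node.value) is int:
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in OPS:
        return OPS[type(node.op)](evaluate(node.left), evaluate(node.right))
    raise ValueError("unsupported expression")


def check_countdown(answer, numbers, target):
    """True if `answer` uses each number exactly once and hits `target`."""
    try:
        tree = ast.parse(answer, mode="eval")
        used = [n.value for n in ast.walk(tree) if isinstance(n, ast.Constant)]
        if Counter(used) != Counter(numbers):
            return False
        return abs(evaluate(tree) - target) < 1e-9
    except (SyntaxError, ValueError, ZeroDivisionError):
        return False


# Made-up response text for illustration; not actual model output.
response = (
    "<think>100 - 75 = 25, and 25 + 4 = 29</think>\n"
    "<answer>(100 - 75) + 4</answer>"
)
reasoning, answer = split_response(response)
is_correct = check_countdown(answer, numbers=[100, 75, 4], target=29)
```

Walking the syntax tree instead of calling `eval` keeps the check limited to plain arithmetic, so a malformed or unexpected answer is rejected rather than executed.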

Model Capabilities

Mathematical Reasoning
Countdown Game Solving
Structured Reasoning Process Display

Use Cases

Education
Mathematical Thinking Training
Helps students practice solving mathematical problems such as countdown games
Can display complete problem-solving steps and reasoning
Gaming
Countdown Game Assistance
Helps players solve mathematical challenges in countdown games
Provides multiple possible solutions