MiMo 7B RL Zero

Developed by XiaomiMiMo
MiMo-7B is a series of language models from Xiaomi designed specifically for reasoning tasks. The series includes base, SFT, and RL models, and it performs strongly on mathematical and code reasoning.
Downloads 216
Release Time: 4/29/2025

Model Overview

The MiMo-7B series improves reasoning through optimized pre-training and post-training schemes, matching or surpassing the performance of larger models on mathematical and coding tasks.

Model Features

Pre-training Optimized for Reasoning
Adopts a three-stage data mixing strategy and multi-token prediction objectives to enhance model reasoning capabilities.
Innovative Post-training Scheme
Curates math and coding problems as RL training data and introduces a test-difficulty-driven code reward mechanism.
Efficient RL Infrastructure
Develops a seamless rollout engine to accelerate RL training and validation, reducing GPU idle time.
Multi-token Prediction Support
Supports speculative decoding with an acceptance rate of ~90%, accelerating the inference process.
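
To give a rough sense of what an acceptance rate of about 90% means for decoding speed, the sketch below estimates how many tokens are emitted per verification pass of the main model. It is an illustration, not the MiMo implementation: it assumes each drafted token is accepted independently with the same probability and that drafting stops at the first rejection. The function name `expected_tokens_per_step` and the `draft_tokens` parameter are hypothetical, and the number of draft tokens used per step is not stated here.

```python
def expected_tokens_per_step(acceptance_rate: float, draft_tokens: int = 1) -> float:
    """Expected tokens emitted per verification pass of the main model.

    Assumption (not from the model card): every drafted token is accepted
    independently with probability `acceptance_rate`, and all drafts after
    the first rejected one are discarded.
    """
    if not 0.0 <= acceptance_rate <= 1.0:
        raise ValueError("acceptance_rate must be in [0, 1]")
    if draft_tokens < 0:
        raise ValueError("draft_tokens must be non-negative")
    # The main model always contributes one token (the i = 0 term).
    # The i-th drafted token counts only if the first i drafts were all
    # accepted, which happens with probability acceptance_rate ** i.
    return sum(acceptance_rate ** i for i in range(draft_tokens + 1))


if __name__ == "__main__":
    # Hypothetical draft lengths; 0.9 is the acceptance rate quoted above.
    for k in (1, 2, 3):
        print(f"draft_tokens={k}: {expected_tokens_per_step(0.9, k):.3f} tokens per pass")
```

Under these assumptions, a single drafted token at a 90% acceptance rate yields about 1.9 tokens per pass. This is an upper bound on the speedup, since it ignores the cost of producing the draft tokens.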

Model Capabilities

Mathematical problem-solving
Code generation and understanding
Complex reasoning task handling
Multi-turn dialogue
Text generation

Use Cases

Education
Math Problem Solving
Solving high-school competition-level math problems
Achieves 68.2% accuracy on AIME competition questions
Programming Education
Helping students understand and generate programming code
Achieves 57.8% accuracy on LiveCodeBench tests
Software Development
Code Assistance
Assisting developers in writing and optimizing code