OpenMath-Nemotron-14B-Kaggle Open-Source Mathematical Reasoning Model - Free to Use for Efficient Resolution of Mathematical Problems

Home

Openmath Nemotron 14B Kaggle

Developed by nvidia

A mathematical reasoning model fine-tuned based on Qwen2.5-14B, achieving first place in the AIMO-2 Kaggle competition

Large Language Model

Transformers

English#Mathematical Reasoning #Competition-Level Precision #Code Execution Solving

Downloads 66

Release Time : 4/25/2025

Model Overview

This model was created by fine-tuning Qwen2.5-14B on a subset of the OpenMathReasoning dataset, focusing on mathematical reasoning tasks, and won first place in the AIMO-2 Kaggle competition.

Model Features

High-Performance Mathematical Reasoning

Achieves state-of-the-art results on multiple mathematical benchmarks

Code Execution Capability

Uses code execution to solve mathematical problems

Large-Scale Fine-Tuning

Extensive fine-tuning based on the OpenMathReasoning dataset

Competition-Validated

Won first place in the AIMO-2 Kaggle competition

Model Capabilities

Mathematical Problem Solving

Code Generation

Mathematical Reasoning

Competition-Level Math Problem Solving

Use Cases

Math Competitions

AIMO Competition

Participated in the AIMO-2 Kaggle Math Competition

Achieved first place

Academic Research

Mathematical Reasoning Research

Used for research in the field of mathematical reasoning

🚀 OpenMath-Nemotron-14B-Kaggle

OpenMath-Nemotron-14B-Kaggle is a remarkable model created by fine-tuning Qwen/Qwen2.5-14B on a subset of the OpenMathReasoning dataset. This model played a crucial role in our first-place submission to the AIMO-2 Kaggle competition!

Evaluation Results

OpenMath-Nemotron models have achieved state-of-the-art results on popular mathematical benchmarks. We present metrics as pass@1 (maj@64), where pass@1 represents the average accuracy across 64 generations, and maj@64 is the result of majority voting. For more details on the evaluation setup, please refer to our paper.

Property	Details
Model Type	Transformer decoder-only language model (Qwen2.5)
Training Data	A subset of OpenMathReasoning dataset

Model	AIME24	AIME25	HMMT-24-25	HLE-Math
DeepSeek-R1-Distill-Qwen-1.5B	26.8 (60.0)	21.4 (36.7)	14.2 (26.5)	2.9 (5.0)
OpenMath-Nemotron-1.5B CoT	61.6 (80.0)	49.5 (66.7)	39.9 (53.6)	5.4 (5.4)
OpenMath-Nemotron-1.5B TIR	52.0 (83.3)	39.7 (70.0)	37.2 (60.7)	2.5 (6.2)
+ Self GenSelect	83.3	70.0	62.2	7.9
+ 32B GenSelect	83.3	70.0	62.8	8.3
DeepSeek-R1-Distill-Qwen-7B	54.4 (80.0)	38.6 (53.3)	30.6 (42.9)	3.3 (5.2)
OpenMath-Nemotron-7B CoT	74.8 (80.0)	61.2 (76.7)	49.7 (57.7)	6.6 (6.6)
OpenMath-Nemotron-7B TIR	72.9 (83.3)	57.5 (76.7)	54.6 (66.3)	7.8 (10.8)
+ Self GenSelect	86.7	76.7	68.4	11.5
+ 32B GenSelect	86.7	76.7	69.9	11.9
DeepSeek-R1-Distill-Qwen-14B	65.8 (80.0)	48.4 (60.0)	40.1 (52.0)	4.2 (4.8)
OpenMath-Nemotron-14B-MIX (kaggle)	73.7 (86.7)	57.9 (73.3)	50.5 (64.8)	5.7 (6.5)
OpenMath-Nemotron-14B CoT	76.3 (83.3)	63.0 (76.7)	52.1 (60.7)	7.5 (7.6)
OpenMath-Nemotron-14B TIR	76.3 (86.7)	61.3 (76.7)	58.6 (70.9)	9.5 (11.5)
+ Self GenSelect	86.7	76.7	72.4	14.1
+ 32B GenSelect	90.0	76.7	71.9	13.7
QwQ-32B	78.1 (86.7)	66.5 (76.7)	55.9 (63.3)	9.0 (9.5)
DeepSeek-R1-Distill-Qwen-32B	66.9 (83.3)	51.8 (73.3)	39.9 (51.0)	4.8 (6.0)
OpenMath-Nemotron-32B CoT	76.5 (86.7)	62.5 (73.3)	53.0 (59.2)	8.3 (8.3)
OpenMath-Nemotron-32B TIR	78.4 (93.3)	64.2 (76.7)	59.7 (70.9)	9.2 (12.5)
+ Self GenSelect	93.3	80.0	73.5	15.7
DeepSeek-R1	79.1 (86.7)	64.3 (73.3)	53.0 (59.2)	10.5 (11.4)

🚀 Quick Start

Reproducing our results

The pipeline we used to produce the data and models is fully open-sourced!

We provide all instructions to fully reproduce our results, including data generation.

How to use the models?

This model will always use code execution to solve math problems. Therefore, we highly recommend running inference with our reference implementation in NeMo-Skills.

⚠️ Important Note

These models have not been instruction tuned on general data and thus might not provide good answers outside of the math domain.

📄 License

GOVERNING TERMS: Use of this model is governed by CC-BY-4.0. Additional Information: Apache License Version 2.0.

📚 Documentation

Deployment Geography

Global

Use Case

This model is intended to facilitate research in the area of mathematical reasoning.

Release Date

Huggingface 04/23/2025

Model Architecture

Architecture Type: Transformer decoder-only language model
Network Architecture: Qwen2.5
This model was developed based on Qwen2.5-1.5B
This model has 1.5B of model parameters.

Input

Input Type(s): Text
Input Format(s): String
Input Parameters: One-Dimensional (1D)
Other Properties Related to Input: Context length up to 131,072 tokens

Output

Output Type(s): Text
Output Format: String
Output Parameters: One-Dimensional (1D)
Other Properties Related to Output: Context length up to 131,072 tokens

Our AI models are designed and/or optimized to run on NVIDIA GPU-accelerated systems. By leveraging NVIDIA’s hardware (e.g., GPU cores) and software frameworks (e.g., CUDA libraries), the model achieves faster training and inference times compared to CPU-only solutions.

Software Integration

Runtime Engine(s):
- Tensor RT / Triton
Supported Hardware Microarchitecture Compatibility:
- NVIDIA Ampere
- NVIDIA Hopper
Preferred Operating System(s):
- Linux

Model Version(s)

Ethical Considerations

NVIDIA believes Trustworthy AI is a shared responsibility, and we have established policies and practices to enable development for a wide array of AI applications. When downloaded or used in accordance with our terms of service, developers should work with their internal model team to ensure this model meets the requirements for the relevant industry and use case and addresses unforeseen product misuse.

For more detailed information on ethical considerations for this model, please see the Model Card++ Explainability, Bias, Safety & Security, and Privacy Subcards.

Please report security vulnerabilities or NVIDIA AI Concerns here.

Citation

If you find our work useful, please consider citing us!

@article{moshkov2025aimo2,
  title   = {AIMO-2 Winning Solution: Building State-of-the-Art Mathematical Reasoning Models with OpenMathReasoning dataset},
  author  = {Ivan Moshkov and Darragh Hanley and Ivan Sorokin and Shubham Toshniwal and Christof Henkel and Benedikt Schifferer and Wei Du and Igor Gitman},
  year    = {2025},
  journal = {arXiv preprint arXiv:2504.16891}
}

Featured Recommended AI Models

Empowering the Future, Your AI Solution Knowledge Base

English 简体中文繁體中文にほんご