OpenMath-Nemotron-32B: an open-source math reasoning model with state-of-the-art results on multiple math benchmarks

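OpenMath-Nemotron-32B

Developed by NVIDIA

OpenMath-Nemotron-32B is a math reasoning model created by fine-tuning Qwen2.5-32B on the OpenMathReasoning dataset. It achieves state-of-the-art results on several math benchmarks.

Large language model

Transformers

English #Math reasoning #Competition-level accuracy #Tool-integrated reasoning

Downloads: 189

Released: 4/25/2025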

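Model overview

The model focuses on mathematical reasoning. It solves complex math problems using chain-of-thought (CoT) reasoning, tool-integrated reasoning (TIR), and related modes, and is suited to mathematics research and education.

Model features

Mathematical reasoning

Achieves state-of-the-art results on multiple math benchmarks, including competition problems from AIME and HMMT.

Multiple inference modes

Supports three inference modes: chain-of-thought (CoT), tool-integrated reasoning (TIR), and generative solution selection (GenSelect).

Commercially usable

The model is ready for commercial use and is released under an open license.

Reproducibility

The code, datasets, and training pipeline are all provided so that the results can be reproduced.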

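Model capabilities

Solving math problems

Complex reasoning

Multi-step calculation

Mathematical proofs

Answering competition math problems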

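Use cases

Education

Math competition training

Helps students prepare for math competitions such as AIME and HMMT.

Reaches 93.3% accuracy on the AIME24 test set (maj@64 in TIR mode, and also with Self GenSelect).

Teaching support

Gives teachers solution strategies and step-by-step answers.

Research

Mathematical reasoning research

Supports research on automated mathematical reasoning and problem solving.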

🚀 OpenMath-Nemotron-32B

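OpenMath-Nemotron-32B was created by fine-tuning Qwen/Qwen2.5-32B on the OpenMathReasoning dataset. The model is ready for commercial use.

Evaluation results

The OpenMath-Nemotron models achieve state-of-the-art results on popular math benchmarks. We report metrics as pass@1 (maj@64), where pass@1 is the average accuracy across 64 generations and maj@64 is the result of majority voting over those generations. See our paper for more details on the evaluation setup.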
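As a rough illustration of the two metrics described above, the sketch below computes pass@1 and a majority-vote answer for a single problem. The sample answers and the helper names `pass_at_1` and `majority_vote` are hypothetical, eight samples stand in for the 64 used in the table, and exact string matching is a simplifying assumption; the paper's evaluation setup remains the authoritative definition. Benchmark scores would then be averages of these per-problem values.

```python
from collections import Counter


def pass_at_1(answers, reference):
    """Average accuracy over all sampled answers for one problem."""
    return sum(a == reference for a in answers) / len(answers)


def majority_vote(answers):
    """Most frequent answer among the samples; ties go to the answer seen first."""
    return Counter(answers).most_common(1)[0][0]


# Hypothetical final answers from 8 generations of one problem.
samples = ["-16", "-16", "-7", "-16", "16", "-16", "-16", "-7"]
reference = "-16"

print(f"pass@1 = {pass_at_1(samples, reference):.3f}")
print(f"majority answer correct = {majority_vote(samples) == reference}")
```

Here five of the eight samples match the reference, so pass@1 is 0.625 for this problem, while the majority vote still recovers the correct answer. This is why the maj@64 figures in the table are typically higher than the pass@1 figures.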

Model	AIME24	AIME25	HMMT-24-25	HLE-Math
DeepSeek-R1-Distill-Qwen-1.5B	26.8 (60.0)	21.4 (36.7)	14.2 (26.5)	2.9 (5.0)
OpenMath-Nemotron-1.5B CoT	61.6 (80.0)	49.5 (66.7)	39.9 (53.6)	5.4 (5.4)
OpenMath-Nemotron-1.5B TIR	52.0 (83.3)	39.7 (70.0)	37.2 (60.7)	2.5 (6.2)
+ Self GenSelect	83.3	70.0	62.2	7.9
+ 32B GenSelect	83.3	70.0	62.8	8.3
DeepSeek-R1-Distill-Qwen-7B	54.4 (80.0)	38.6 (53.3)	30.6 (42.9)	3.3 (5.2)
OpenMath-Nemotron-7B CoT	74.8 (80.0)	61.2 (76.7)	49.7 (57.7)	6.6 (6.6)
OpenMath-Nemotron-7B TIR	72.9 (83.3)	57.5 (76.7)	54.6 (66.3)	7.8 (10.8)
+ Self GenSelect	86.7	76.7	68.4	11.5
+ 32B GenSelect	86.7	76.7	69.9	11.9
DeepSeek-R1-Distill-Qwen-14B	65.8 (80.0)	48.4 (60.0)	40.1 (52.0)	4.2 (4.8)
OpenMath-Nemotron-14B-MIX (kaggle)	73.7 (86.7)	57.9 (73.3)	50.5 (64.8)	5.7 (6.5)
OpenMath-Nemotron-14B CoT	76.3 (83.3)	63.0 (76.7)	52.1 (60.7)	7.5 (7.6)
OpenMath-Nemotron-14B TIR	76.3 (86.7)	61.3 (76.7)	58.6 (70.9)	9.5 (11.5)
+ Self GenSelect	86.7	76.7	72.4	14.1
+ 32B GenSelect	90.0	76.7	71.9	13.7
QwQ-32B	78.1 (86.7)	66.5 (76.7)	55.9 (63.3)	9.0 (9.5)
DeepSeek-R1-Distill-Qwen-32B	66.9 (83.3)	51.8 (73.3)	39.9 (51.0)	4.8 (6.0)
OpenMath-Nemotron-32B CoT	76.5 (86.7)	62.5 (73.3)	53.0 (59.2)	8.3 (8.3)
OpenMath-Nemotron-32B TIR	78.4 (93.3)	64.2 (76.7)	59.7 (70.9)	9.2 (12.5)
+ Self GenSelect	93.3	80.0	73.5	15.7
DeepSeek-R1	79.1 (86.7)	64.3 (73.3)	53.0 (59.2)	10.5 (11.4)

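We used a version of OpenMath-Nemotron-14B to win first place in the AIMO-2 Kaggle competition!

🚀 Quick start

Reproducing the results

The pipeline we used to produce the data and models is fully open-sourced!

We provide all instructions needed to fully reproduce our results, including data generation.

How to use the models

Our models can be used in 3 inference modes: chain-of-thought (CoT), tool-integrated reasoning (TIR), and generative solution selection (GenSelect).

💻 Usage examples

Basic usage

To run inference in CoT mode, you can use the following example code snippet.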

import torch
import transformers

model_id = "nvidia/OpenMath-Nemotron-32B"

# Load the model in bfloat16 and spread it across the available devices.
pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)

# The prompt asks for the final answer inside \boxed{} so it is easy to locate.
messages = [
    {
        "role": "user",
        "content": (
            "Solve the following math problem. Make sure to put the answer "
            "(and only answer) inside \\boxed{}.\n\n"
            "What is the minimum value of $a^2+6a-7$?"
        ),
    },
]

outputs = pipeline(
    messages,
    max_new_tokens=4096,
)
# The pipeline returns the whole conversation; the last message is the model's reply.
print(outputs[0]["generated_text"][-1]["content"])
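
Because the prompt asks for the final answer inside \boxed{}, the reply can be post-processed to pull that answer out. The helper below is a minimal sketch of one way to do this; `extract_boxed` is a hypothetical name, not part of transformers or NeMo-Skills, and the sample reply is hand-written for illustration rather than real model output.

```python
def extract_boxed(text):
    """Return the contents of the last \\boxed{...} in text, or None if absent."""
    marker = "\\boxed{"
    start = text.rfind(marker)
    if start == -1:
        return None
    i = start + len(marker)
    depth = 1  # we are inside the opening brace of \boxed{
    chars = []
    while i < len(text):
        ch = text[i]
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return "".join(chars)
        chars.append(ch)
        i += 1
    return None  # unbalanced braces: treat as no answer


# Hand-written stand-in for a model reply to the example problem.
reply = "Completing the square gives $(a+3)^2 - 16$, so the minimum is \\boxed{-16}."
print(extract_boxed(reply))
```

For the example problem, a^2+6a-7 = (a+3)^2 - 16, so the expected answer is -16, attained at a = -3. Tracking brace depth lets the helper handle nested answers such as \boxed{\frac{1}{2}}, which a simple regular expression would cut short.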

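To run inference in TIR or GenSelect mode, we highly recommend using our reference implementation in NeMo-Skills.

Please note that these models have not been instruction-tuned on general data and therefore might not give good answers outside the math domain.

📚 Detailed documentation

Citation

If you find our work useful, please consider citing us!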

@article{moshkov2025aimo2,
  title   = {AIMO-2 Winning Solution: Building State-of-the-Art Mathematical Reasoning Models with OpenMathReasoning dataset},
  author  = {Ivan Moshkov and Darragh Hanley and Ivan Sorokin and Shubham Toshniwal and Christof Henkel and Benedikt Schifferer and Wei Du and Igor Gitman},
  year    = {2025},
  journal = {arXiv preprint arXiv:2504.16891}
}

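Additional information

Property	Details
License / terms of use	Governing terms: use of this model is governed by CC-BY-4.0. Additional information: Apache License Version 2.0.
Deployment geography	Global
Use case	This model is intended to facilitate research in the area of mathematical reasoning.
Release date	Hugging Face, April 23, 2025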
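Model architecture	Architecture type: Transformer decoder-only language model; Network architecture: Qwen2.5; This model was developed based on Qwen2.5-32B; This model has about 32 billion model parameters.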
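Input	Input type: text; Input format: string; Input parameters: one-dimensional (1D); Other input-related properties: context length up to 131,072 tokens
Output	Output type: text; Output format: string; Output parameters: one-dimensional (1D); Other output-related properties: context length up to 131,072 tokens
Software integration	Runtime engine: TensorRT / Triton; Supported hardware microarchitecture compatibility: NVIDIA Ampere, NVIDIA Hopper; Preferred operating system: Linux
Model versions	OpenMath-Nemotron-1.5B, OpenMath-Nemotron-7B, OpenMath-Nemotron-14B, OpenMath-Nemotron-32B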
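
The Input and Output rows above give a context length of up to 131,072 tokens. The sketch below shows how a generation budget could be checked against that limit. It assumes that prompt tokens and generated tokens share one context window, as is usual for decoder-only models; `max_new_tokens_allowed` is a hypothetical helper, and the prompt length would come from the model's tokenizer in practice.

```python
CONTEXT_LIMIT = 131_072  # maximum context length stated in the table above


def max_new_tokens_allowed(prompt_tokens, context_limit=CONTEXT_LIMIT):
    """Largest generation budget that keeps prompt + output within the limit."""
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    return max(context_limit - prompt_tokens, 0)


# A 1,000-token prompt leaves ample room for the 4,096 new tokens
# requested in the CoT example.
budget = max_new_tokens_allowed(1_000)
print(budget)
print(budget >= 4096)
```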