OpenMath-Nemotron-32B: an open-source mathematical reasoning model with state-of-the-art results on multiple math benchmarks

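OpenMath-Nemotron-32B

Developed by NVIDIA

OpenMath-Nemotron-32B is a mathematical reasoning model created by fine-tuning Qwen2.5-32B on the OpenMathReasoning dataset. It achieves state-of-the-art results on several math benchmarks.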

Large language model

Transformers

English #Mathematical reasoning #Competition-level accuracy #Tool-integrated reasoning

Downloads: 189

Published: April 25, 2025

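Model overview

The model focuses on mathematical reasoning. It solves complex math problems using chain-of-thought (CoT) and tool-integrated reasoning (TIR), and is suited to mathematical research and education.

Model features

Mathematical reasoning

Achieves state-of-the-art results on several math benchmarks, including competition problems from AIME and HMMT

Multiple inference modes

Supports three inference modes: chain-of-thought (CoT), tool-integrated reasoning (TIR), and generative solution selection (GenSelect)

Commercially usable

The model is ready for commercial use and is released under an open license (CC-BY-4.0)

Reproducibility

The complete code, datasets, and training pipeline are provided so that the results can be reproduced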

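Model capabilities

Mathematical problem solving

Complex reasoning

Multi-step calculation

Mathematical proofs

Competition math solutions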

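Use cases

Education

Math competition training

Helps students prepare for math competitions such as AIME and HMMT

Reaches 93.3% accuracy on AIME24 with majority voting over 64 TIR generations (maj@64) or with Self GenSelect; the average single-generation TIR accuracy (pass@1) is 78.4%

Teaching support

Gives teachers solution approaches and step-by-step answers

Research

Mathematical reasoning research

Supports research on automated mathematical reasoning and problem solving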

🚀 OpenMath-Nemotron-32B

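OpenMath-Nemotron-32B was created by fine-tuning Qwen/Qwen2.5-32B on the OpenMathReasoning dataset. The model is ready for commercial use.

Evaluation results

The OpenMath-Nemotron models achieve state-of-the-art results on popular math benchmarks. Metrics are reported as pass@1 (maj@64), where pass@1 is the average accuracy across 64 generations and maj@64 is the accuracy of the majority-voted answer over those generations. For more details on the evaluation setup, see our paper.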
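
As a rough illustration of the two metrics, the sketch below computes pass@1 and maj@64 for a single problem from a list of sampled final answers. The function names and the toy data are assumptions made for this example. The authors' evaluation pipeline may normalise answers or break ties differently.

```python
from collections import Counter


def pass_at_1(answers, reference):
    """Average accuracy over all sampled answers for one problem."""
    return sum(a == reference for a in answers) / len(answers)


def maj_at_k(answers, reference):
    """1.0 if the most frequent sampled answer matches the reference, else 0.0."""
    majority_answer, _ = Counter(answers).most_common(1)[0]
    return float(majority_answer == reference)


# Toy data (hypothetical): 64 samples, 40 correct, 24 spread over two wrong answers.
samples = ["-16"] * 40 + ["-7"] * 14 + ["16"] * 10
p1 = pass_at_1(samples, "-16")
m64 = maj_at_k(samples, "-16")
print(f"pass@1 = {p1:.3f}, maj@64 = {m64:.1f}")
```

Benchmark-level numbers are averages of these per-problem values over all problems. Majority voting can turn a problem that is solved in only part of the samples into a fully correct one, which is why maj@64 is often higher than pass@1.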

Model	AIME24	AIME25	HMMT-24-25	HLE-Math
DeepSeek-R1-Distill-Qwen-1.5B	26.8 (60.0)	21.4 (36.7)	14.2 (26.5)	2.9 (5.0)
OpenMath-Nemotron-1.5B CoT	61.6 (80.0)	49.5 (66.7)	39.9 (53.6)	5.4 (5.4)
OpenMath-Nemotron-1.5B TIR	52.0 (83.3)	39.7 (70.0)	37.2 (60.7)	2.5 (6.2)
+ Self GenSelect	83.3	70.0	62.2	7.9
+ 32B GenSelect	83.3	70.0	62.8	8.3
DeepSeek-R1-Distill-Qwen-7B	54.4 (80.0)	38.6 (53.3)	30.6 (42.9)	3.3 (5.2)
OpenMath-Nemotron-7B CoT	74.8 (80.0)	61.2 (76.7)	49.7 (57.7)	6.6 (6.6)
OpenMath-Nemotron-7B TIR	72.9 (83.3)	57.5 (76.7)	54.6 (66.3)	7.8 (10.8)
+ Self GenSelect	86.7	76.7	68.4	11.5
+ 32B GenSelect	86.7	76.7	69.9	11.9
DeepSeek-R1-Distill-Qwen-14B	65.8 (80.0)	48.4 (60.0)	40.1 (52.0)	4.2 (4.8)
OpenMath-Nemotron-14B-MIX (kaggle)	73.7 (86.7)	57.9 (73.3)	50.5 (64.8)	5.7 (6.5)
OpenMath-Nemotron-14B CoT	76.3 (83.3)	63.0 (76.7)	52.1 (60.7)	7.5 (7.6)
OpenMath-Nemotron-14B TIR	76.3 (86.7)	61.3 (76.7)	58.6 (70.9)	9.5 (11.5)
+ Self GenSelect	86.7	76.7	72.4	14.1
+ 32B GenSelect	90.0	76.7	71.9	13.7
QwQ-32B	78.1 (86.7)	66.5 (76.7)	55.9 (63.3)	9.0 (9.5)
DeepSeek-R1-Distill-Qwen-32B	66.9 (83.3)	51.8 (73.3)	39.9 (51.0)	4.8 (6.0)
OpenMath-Nemotron-32B CoT	76.5 (86.7)	62.5 (73.3)	53.0 (59.2)	8.3 (8.3)
OpenMath-Nemotron-32B TIR	78.4 (93.3)	64.2 (76.7)	59.7 (70.9)	9.2 (12.5)
+ Self GenSelect	93.3	80.0	73.5	15.7
DeepSeek-R1	79.1 (86.7)	64.3 (73.3)	53.0 (59.2)	10.5 (11.4)

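We used a version of OpenMath-Nemotron-14B to win first place in the AIMO-2 Kaggle competition!

🚀 Quick start

Reproducing the results

The pipeline we used to produce the data and models is fully open-sourced!

We provide all the instructions needed to fully reproduce our results, including data generation.

How to use the models

Our models can be used in three inference modes: chain-of-thought (CoT), tool-integrated reasoning (TIR), and generative solution selection (GenSelect).

💻 Usage examples

Basic usage

To run inference in CoT mode, you can use the following example code snippet.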

import torch
import transformers

model_id = "nvidia/OpenMath-Nemotron-32B"

# Load the model in bfloat16 and spread it across the available devices.
pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)

# The instruction asks for the final answer inside \boxed{}.
messages = [
    {
        "role": "user",
        "content": "Solve the following math problem. Make sure to put the answer (and only answer) inside \\boxed{}.\n\n"
        + "What is the minimum value of $a^2+6a-7$?",
    },
]

outputs = pipeline(
    messages,
    max_new_tokens=4096,
)
# The last message in the returned conversation is the model's reply.
print(outputs[0]["generated_text"][-1]["content"])
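
Because the prompt asks for the final answer inside \boxed{}, a small helper can pull that answer out of the generated text. The sketch below is only an illustration and is not part of the model's tooling: `extract_boxed` is a hypothetical helper, and the sample reply is hand-written rather than real model output. It tracks brace depth so that nested expressions such as \frac{1}{2} are returned whole.

```python
def extract_boxed(text):
    """Return the contents of the last \\boxed{...} in text, or None if absent."""
    marker = "\\boxed{"
    start = text.rfind(marker)
    if start == -1:
        return None
    i = start + len(marker)
    depth = 1  # we are already inside the opening brace
    chars = []
    while i < len(text):
        ch = text[i]
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return "".join(chars)
        chars.append(ch)
        i += 1
    return None  # the opening brace was never closed


# Hand-written sample reply (not real model output) for the example problem.
reply = "Completing the square: $a^2+6a-7=(a+3)^2-16$, so the minimum is \\boxed{-16}."
answer = extract_boxed(reply)
print(answer)
```

With the pipeline from the snippet above, the helper would be applied to `outputs[0]["generated_text"][-1]["content"]`.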

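To run inference in TIR or GenSelect mode, we highly recommend using our reference implementation in NeMo-Skills.

Note that these models have not been instruction-tuned on general data, so they may not give good answers outside the math domain.

📚 Detailed documentation

Citation

If you find our work useful, please consider citing us!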

@article{moshkov2025aimo2,
  title   = {AIMO-2 Winning Solution: Building State-of-the-Art Mathematical Reasoning Models with OpenMathReasoning dataset},
  author  = {Ivan Moshkov and Darragh Hanley and Ivan Sorokin and Shubham Toshniwal and Christof Henkel and Benedikt Schifferer and Wei Du and Igor Gitman},
  year    = {2025},
  journal = {arXiv preprint arXiv:2504.16891}
}

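Additional information

Attribute	Details
License / terms of use	Governing terms: use of this model is governed by CC-BY-4.0. Additional information: Apache License Version 2.0.
Deployment geography	Global
Use case	This model is intended to facilitate research in the area of mathematical reasoning.
Release date	Hugging Face, April 23, 2025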
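Model architecture	Architecture type: Transformer decoder-only language model. Network architecture: Qwen2.5. This model was developed based on Qwen2.5-32B and has approximately 32 billion model parameters.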
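Input	Input type: text. Input format: string. Input parameters: one-dimensional (1D). Other properties related to input: context length up to 131,072 tokens.
Output	Output type: text. Output format: string. Output parameters: one-dimensional (1D). Other properties related to output: context length up to 131,072 tokens.
Software integration	Runtime engine: TensorRT / Triton. Supported hardware microarchitecture compatibility: NVIDIA Ampere, NVIDIA Hopper. Preferred operating system: Linux.
Model versions	OpenMath-Nemotron-1.5B, OpenMath-Nemotron-7B, OpenMath-Nemotron-14B, OpenMath-Nemotron-32B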