AceMath-RL-Nemotron-7B open-source math-solving model - free solver for algebra, geometry, calculus, and more

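AceMath-RL-Nemotron-7B

Developed by NVIDIA

A deep-learning system that automatically solves math problems, covering algebra, geometry, calculus, and other problem types

Large language model

Transformers

English · License: Other · #step-by-step-solving #math-reasoning #education-support

Downloads: 2,990

Released: 4/25/2025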

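Model Overview

The model is designed to understand math problems stated in natural language and to produce both the solution steps and the final answer through multi-step reasoning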

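Model Features

Multi-step reasoning

Breaks a complex problem into a sequence of reasoning steps, mirroring how a person works through a solution

Text and LaTeX input

Handles mathematical expressions written in plain text or in LaTeX

Explanation generation

Provides a detailed step-by-step explanation in addition to the answer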

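Model Capabilities

Solving algebraic equations

Geometric proofs

Calculus computations

Probability and statistics

Mathematical induction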

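Use Cases

Education support

Automated homework grading

Automatically checks whether each step of a student's solution is correct

Accuracy: 92.3% (MathEval benchmark)

Research support

Verifying formula derivations

Helps researchers check that a mathematical derivation is correct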

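🚀 AceMath-RL-Nemotron-7B Math Reasoning Model

AceMath-RL-Nemotron-7B is a math reasoning model trained entirely through reinforcement learning (RL), starting from DeepSeek-R1-Distill-Qwen-7B. It performs strongly on math reasoning tasks and also generalizes well to coding tasks.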

🚀 Quick Start

Introduction

[Figure: AIME 2024 accuracy]

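We are pleased to introduce AceMath-RL-Nemotron-7B, a math reasoning model trained entirely through reinforcement learning (RL), starting from DeepSeek-R1-Distill-Qwen-7B. It reaches 69.0% Pass@1 accuracy on AIME 2024 (13.5 percentage points above the base model) and 53.6% Pass@1 accuracy on AIME 2025 (14.4 points above the base model).

Interestingly, this math-focused RL training also improves the model's coding accuracy on LiveCodeBench, where it reaches 44.4% Pass@1 (6.8 points above the base model), which demonstrates how well large-scale RL training generalizes.

We share our training recipe, training logs, and data curation details in our blog post.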

Evaluation Results

We evaluate our model against competing reasoning models of comparable size on AIME 2024, AIME 2025, and GPQA.

Model	AIME 2024 (AVG@64)	AIME 2025 (AVG@64)	GPQA-Diamond (AVG@8)
DeepSeek-R1-Distill-Qwen-7B	55.5	39.2	49.1
Light-R1-7B-DS	59.1	44.3	49.4
AReaL-boba-RL-7B	61.9	48.3	47.6
Llama-Nemotron-Nano-v1 (8B)	63.8	47.1	54.1
Skywork-OR1-Math-7B-Preview	69.8	52.3	-
AceMath-RL-Nemotron-7B 🤗	69.0	53.6	52.1
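
The AVG@k column labels are read here as accuracy averaged over k sampled generations per problem and then over all problems. That reading is an assumption, and the sketch below is not the authors' evaluation harness; the function name `avg_at_k` and the toy data are hypothetical.

```python
from statistics import mean

def avg_at_k(correct):
    """Return mean per-sample accuracy in percent.

    correct[i][j] is True when the j-th sampled answer to problem i is right.
    """
    if not correct:
        raise ValueError("need at least one problem")
    # Accuracy of each problem over its k samples, then the mean over problems.
    per_problem = [mean(1.0 if ok else 0.0 for ok in samples) for samples in correct]
    return 100.0 * mean(per_problem)

# Toy data, made up for illustration: 2 problems, 4 samples each.
toy = [[True, True, False, True], [False, False, True, False]]
print(avg_at_k(toy))
```

Averaging over many samples, rather than scoring a single generation, reduces the variance of the score on small benchmarks such as AIME.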

We also evaluate our model on additional math benchmarks and on LiveCodeBench for a more comprehensive picture.

Model	GSM8K (AVG@1)	MATH500 (AVG@4)	Minerva Math (AVG@1)	GaoKao2023En (AVG@1)	Olympiad Bench (AVG@1)	College Math (AVG@1)	AMC23 (AVG@5)	LiveCodeBench (AVG@8)
DeepSeek-R1-Distill-Qwen-7B	92.7	92.8	57.4	82.3	58.2	56.7	89.0	37.6
AceMath-RL-Nemotron-7B 🤗	93.3	94.1	56.6	85.5	66.7	59.8	94.0	44.4
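
The gains quoted in the introduction can be reproduced from the two tables by subtracting the DeepSeek-R1-Distill-Qwen-7B row from the AceMath-RL-Nemotron-7B row. The short check below copies the scores from the tables; the helper name `gains` is illustrative only.

```python
# Scores in percent, copied from the two tables above.
base = {"AIME 2024": 55.5, "AIME 2025": 39.2, "LiveCodeBench": 37.6}
acemath = {"AIME 2024": 69.0, "AIME 2025": 53.6, "LiveCodeBench": 44.4}

def gains(new, old):
    """Difference in percentage points per benchmark, rounded to one decimal."""
    return {name: round(new[name] - old[name], 1) for name in new}

deltas = gains(acemath, base)
print(deltas)  # {'AIME 2024': 13.5, 'AIME 2025': 14.4, 'LiveCodeBench': 6.8}
```

These are differences in percentage points, not relative improvements.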

💻 Usage Example

Basic usage

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the tokenizer and model; device_map="auto" places the weights on the available device(s).
model_name = 'nvidia/AceMath-RL-Nemotron-7B'
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto", device_map="auto")

prompt = "Jen enters a lottery by picking $4$ distinct numbers from $S=\\{1,2,3,\\cdots,9,10\\}.$ $4$ numbers are randomly chosen from $S.$ She wins a prize if at least two of her numbers were $2$ of the randomly chosen numbers, and wins the grand prize if all four of her numbers were the randomly chosen numbers. The probability of her winning the grand prize given that she won a prize is $\\tfrac{m}{n}$ where $m$ and $n$ are relatively prime positive integers. Find $m+n$."
# No system prompt: the whole instruction goes in the user message.
messages = [{"role": "user", "content": prompt}]

# Render the chat template to a string that ends with the assistant turn prefix.
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
# Move the inputs to the device the model was placed on instead of hard-coding "cuda".
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=32768,
    do_sample=True,  # temperature and top_p only take effect when sampling
    temperature=0.6,
    top_p=0.95
)
# Drop the prompt tokens so that only the newly generated tokens are decoded.
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
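
Once `response` has been decoded, the final answer usually has to be pulled out of the surrounding reasoning text. The following is a minimal sketch, assuming the model closes its reasoning with a `</think>` tag and wraps the final answer in `\boxed{}`; both helper functions and the sample string are hypothetical and are not part of the model's API.

```python
from typing import Optional

THINK_END = "</think>"  # assumed closing tag of the reasoning section

def extract_boxed(text: str) -> Optional[str]:
    """Return the contents of the last \\boxed{...} in text, or None if absent."""
    marker = "\\boxed{"
    start = text.rfind(marker)
    if start == -1:
        return None
    depth = 1
    chars = []
    # Walk forward and track brace depth so nested LaTeX such as \tfrac{1}{2} stays intact.
    for ch in text[start + len(marker):]:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return "".join(chars)
        chars.append(ch)
    return None  # braces never balanced, for example in truncated output

def final_answer(response: str) -> Optional[str]:
    """Extract the boxed answer from the text after the reasoning section, if marked."""
    answer_part = response.split(THINK_END)[-1]
    return extract_boxed(answer_part)

# Hand-written sample string, not real model output.
sample = "Try \\boxed{7} first.</think>The probability is \\tfrac{1}{115}, so \\boxed{116}."
print(final_answer(sample))  # 116
```

Calling `final_answer(response)` on the decoded output returns `None` when no complete boxed answer is found, which can happen if generation stops at the `max_new_tokens` limit.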

💡 Usage Recommendations

⚠️ Important

Do not include a system prompt; put all instructions directly in the user prompt.
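
If existing code already builds conversations that contain a system message, one way to follow this guidance is to fold that text into the first user message before applying the chat template. This is a sketch under that assumption; `to_user_only` is a hypothetical helper, and joining with a newline is an arbitrary choice.

```python
def to_user_only(messages):
    """Fold any system messages into the first user message and drop them."""
    system_parts = [m["content"] for m in messages if m["role"] == "system"]
    rest = [dict(m) for m in messages if m["role"] != "system"]  # copies, inputs stay untouched
    if system_parts:
        for m in rest:
            if m["role"] == "user":
                m["content"] = "\n".join(system_parts + [m["content"]])
                break
        else:
            raise ValueError("no user message to carry the instructions")
    return rest

chat = [
    {"role": "system", "content": "Answer in English."},
    {"role": "user", "content": "Solve x^2 = 4."},
]
print(to_user_only(chat))
```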

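💡 Suggestion

We recommend the following prompt format for math questions:
<｜begin▁of▁sentence｜><｜User｜>{math question}\nPlease reason step by step, and put your final answer within \boxed{}.<｜Assistant｜><think>\n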
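
In practice the special tokens in this format are normally added by the tokenizer's chat template, so only the user content needs to be assembled by hand. Below is a minimal sketch of that step, with the instruction sentence written in English; `build_messages` and `SUFFIX` are hypothetical names, and whether the template also appends the trailing `<think>\n` depends on the tokenizer configuration, which is not confirmed here.

```python
# Instruction appended to every math question, following the recommended format.
SUFFIX = "\nPlease reason step by step, and put your final answer within \\boxed{}."

def build_messages(question: str) -> list:
    """Return a single-turn chat with no system message."""
    return [{"role": "user", "content": question.strip() + SUFFIX}]

messages = build_messages("What is 1 + 1?")
print(messages[0]["content"])
```

The resulting `messages` list can be passed to `tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)` as in the basic usage example.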

📞 Contact

Yang Chen (yachen@nvidia.com)
Zihan Liu (zihanl@nvidia.com)
Chankyu Lee (chankyul@nvidia.com)
Wei Ping (wping@nvidia.com)

📄 License

Your use of this model is governed by the NVIDIA Open Model License.

📚 Citation

@article{acemath2024,
  title={AceMath: Advancing Frontier Math Reasoning with Post-Training and Reward Modeling},
  author={Liu, Zihan and Chen, Yang and Shoeybi, Mohammad and Catanzaro, Bryan and Ping, Wei},
  journal={arXiv preprint},
  year={2024}
}