AceMath-RL-Nemotron-7B: a free, open-source math-solving model for algebra, geometry, calculus, and more


Acemath RL Nemotron 7B

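Developed by NVIDIA

A deep-learning system that automatically solves math problems, covering algebra, geometry, calculus, and other problem types

Large language model

Transformers

Language: English · License: Other · #step-by-step-solving #math-reasoning #education

Downloads: 2,990

Released: 4/25/2025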

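Model overview

The model is designed to understand math problems stated in natural language and to produce a worked solution and a final answer through multi-step reasoning.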

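Model features

Multi-step reasoning

Breaks a complex problem into a sequence of reasoning steps, much as a person would work through it

Text and LaTeX understanding

Handles mathematical expressions written in plain text or in LaTeX

Explanation generation

Provides a detailed step-by-step explanation in addition to the answer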

Model capabilities

Solving algebraic equations

Geometric proofs

Calculus

Probability and statistics

Mathematical induction

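Use cases

Education

Automatic homework grading

Automatically checks whether each step of a student's solution is correct

Accuracy: 92.3% (MathEval benchmark)

Research

Derivation checking

Helps researchers verify that a mathematical derivation is correct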

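🚀 AceMath-RL-Nemotron-7B math reasoning model

AceMath-RL-Nemotron-7B is a math reasoning model trained entirely through reinforcement learning (RL), starting from DeepSeek-R1-Distill-Qwen-7B. It performs strongly on math reasoning tasks and also generalizes well to coding tasks.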

🚀 Quick start

Introduction

[Figure: AIME 2024 accuracy]

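We are pleased to introduce AceMath-RL-Nemotron-7B, a math reasoning model trained entirely through reinforcement learning (RL), starting from DeepSeek-R1-Distill-Qwen-7B. It reaches 69.0% Pass@1 accuracy on AIME 2024 (+13.5 percentage points over the starting model) and 53.6% Pass@1 accuracy on AIME 2025 (+14.4 percentage points).

Interestingly, this math-focused RL training also improves the model's coding accuracy on LiveCodeBench, which reaches 44.4% Pass@1 (+6.8 percentage points), demonstrating the generalization ability of large-scale RL training.

We share our training recipe, training logs, and data curation details in our blog.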

Evaluation results

We evaluate our model against competitive reasoning models of comparable size on AIME 2024, AIME 2025, and GPQA.

Model	AIME 2024 (AVG@64)	AIME 2025 (AVG@64)	GPQA-Diamond (AVG@8)
DeepSeek-R1-Distill-Qwen-7B	55.5	39.2	49.1
Light-R1-7B-DS	59.1	44.3	49.4
AReaL-boba-RL-7B	61.9	48.3	47.6
Llama-Nemotron-Nano-v1 (8B)	63.8	47.1	54.1
Skywork-OR1-Math-7B-Preview	69.8	52.3	-
AceMath-RL-Nemotron-7B 🤗	69.0	53.6	52.1
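
The AVG@k columns can be read as the single-sample accuracy averaged over k sampled responses per problem, and then over all problems. The source does not spell out the formula, so the following is only a minimal sketch under that assumption; `avg_at_k` and the grading data are hypothetical.

```python
from typing import Sequence


def avg_at_k(correct: Sequence[Sequence[bool]]) -> float:
    """Return AVG@k in percent.

    correct[i][j] is True when the j-th sampled response to problem i
    was judged correct. Every problem must have the same number k of samples.
    """
    if not correct:
        raise ValueError("need at least one problem")
    k = len(correct[0])
    if k == 0 or any(len(row) != k for row in correct):
        raise ValueError("every problem needs the same non-zero number of samples")
    # Per-problem accuracy over the k samples, then the mean over problems.
    per_problem = [sum(row) / k for row in correct]
    return 100.0 * sum(per_problem) / len(per_problem)


# Hypothetical grading results: 2 problems, k = 4 samples each.
graded = [
    [True, True, False, True],    # 3 of 4 correct
    [False, False, True, False],  # 1 of 4 correct
]
score = avg_at_k(graded)
print(score)  # 50.0
```

Averaging over many samples (64 for AIME here) reduces the variance that a single sampled response would have on a benchmark with only a few dozen problems.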

In addition, we evaluate our model on other math benchmarks and on LiveCodeBench to give a more comprehensive picture.

Model	GSM8K (AVG@1)	MATH500 (AVG@4)	Minerva Math (AVG@1)	GaoKao2023En (AVG@1)	Olympiad Bench (AVG@1)	College Math (AVG@1)	AMC23 (AVG@5)	LiveCodeBench (AVG@8)
DeepSeek-R1-Distill-Qwen-7B	92.7	92.8	57.4	82.3	58.2	56.7	89.0	37.6
AceMath-RL-Nemotron-7B 🤗	93.3	94.1	56.6	85.5	66.7	59.8	94.0	44.4
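
The gains quoted in the introduction are absolute differences, in percentage points, between the AceMath-RL-Nemotron-7B row and the DeepSeek-R1-Distill-Qwen-7B row. The short sketch below recomputes them from the table values; the numbers are copied from the two tables, and the variable names are illustrative.

```python
# Scores in percent, copied from the two evaluation tables.
baseline = {  # DeepSeek-R1-Distill-Qwen-7B
    "AIME 2024": 55.5, "AIME 2025": 39.2, "GPQA-Diamond": 49.1,
    "GSM8K": 92.7, "MATH500": 92.8, "Minerva Math": 57.4,
    "GaoKao2023En": 82.3, "Olympiad Bench": 58.2, "College Math": 56.7,
    "AMC23": 89.0, "LiveCodeBench": 37.6,
}
acemath = {  # AceMath-RL-Nemotron-7B
    "AIME 2024": 69.0, "AIME 2025": 53.6, "GPQA-Diamond": 52.1,
    "GSM8K": 93.3, "MATH500": 94.1, "Minerva Math": 56.6,
    "GaoKao2023En": 85.5, "Olympiad Bench": 66.7, "College Math": 59.8,
    "AMC23": 94.0, "LiveCodeBench": 44.4,
}

# Absolute gain in percentage points, rounded to one decimal like the tables.
gains = {name: round(acemath[name] - baseline[name], 1) for name in baseline}

for name, gain in gains.items():
    print(f"{name:15s} {gain:+.1f}")
```

This reproduces the three headline gains (+13.5, +14.4, +6.8) and also shows that Minerva Math is the one benchmark where the score drops slightly (-0.8).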

💻 Usage example

Basic usage

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = 'nvidia/AceMath-RL-Nemotron-7B'
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto", device_map="auto")

prompt = "Jen enters a lottery by picking $4$ distinct numbers from $S=\\{1,2,3,\\cdots,9,10\\}.$ $4$ numbers are randomly chosen from $S.$ She wins a prize if at least two of her numbers were $2$ of the randomly chosen numbers, and wins the grand prize if all four of her numbers were the randomly chosen numbers. The probability of her winning the grand prize given that she won a prize is $\\tfrac{m}{n}$ where $m$ and $n$ are relatively prime positive integers. Find $m+n$."
messages = [{"role": "user", "content": prompt}]

text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
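# Move the inputs to whichever device device_map="auto" placed the model on
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=32768,
    do_sample=True,  # sample explicitly so that temperature and top_p take effect
    temperature=0.6,
    top_p=0.95
)
# Drop the prompt tokens so that only the newly generated tokens are decoded
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)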
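
Because the recommended prompt asks for the final answer inside \boxed{}, the answer can be pulled out of `response` with a little post-processing. The helper below is a hypothetical sketch, not part of the model's tooling. It assumes the reasoning may be wrapped in a `<think>...</think>` block and that the last `\boxed{...}` after it holds the final answer.

```python
from typing import Optional


def extract_boxed_answer(response: str) -> Optional[str]:
    """Return the contents of the last \\boxed{...} in response, or None."""
    # Keep only the text after the reasoning block, if one is present.
    answer_part = response.rsplit("</think>", 1)[-1]
    marker = "\\boxed{"
    start = answer_part.rfind(marker)
    if start == -1:
        return None
    i = start + len(marker)
    depth = 1
    chars = []
    # Walk forward, tracking nested braces such as \boxed{\frac{1}{2}}.
    while i < len(answer_part):
        ch = answer_part[i]
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return "".join(chars)
        chars.append(ch)
        i += 1
    return None  # braces never closed, e.g. the output was truncated


# Hypothetical model output, used only to exercise the helper.
sample = (
    "<think>\nsome reasoning that mentions \\boxed{0}\n</think>\n"
    "The answer is \\boxed{\\frac{1}{2}}."
)
print(extract_boxed_answer(sample))
```

Brace counting is used instead of a simple regular expression so that answers containing nested LaTeX groups are returned whole.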

💡 Usage recommendations

⚠️ Important

Do not include a system prompt; put all instructions directly in the user prompt.
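
When an existing pipeline already produces a system message, one way to follow this advice is to fold that message into the first user turn before calling `apply_chat_template`. A minimal sketch: `fold_system_into_user` is a hypothetical helper, and joining the parts with a blank line is an arbitrary choice.

```python
from typing import Dict, List

Message = Dict[str, str]


def fold_system_into_user(messages: List[Message]) -> List[Message]:
    """Return a copy of messages with system content prepended to the first user turn."""
    system_text = "\n\n".join(m["content"] for m in messages if m["role"] == "system")
    # Copy the remaining messages so the caller's list is left untouched.
    result = [dict(m) for m in messages if m["role"] != "system"]
    if system_text:
        for m in result:
            if m["role"] == "user":
                m["content"] = f"{system_text}\n\n{m['content']}"
                break
        else:
            # No user turn to attach to, so the instructions become one.
            result.insert(0, {"role": "user", "content": system_text})
    return result


messages = [
    {"role": "system", "content": "Answer in English."},
    {"role": "user", "content": "What is 7 * 8?"},
]
folded = fold_system_into_user(messages)
print(folded)
```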

💡 Tip

We recommend the following prompt format for math questions:
<｜begin▁of▁sentence｜><｜User｜>{math question}\nPlease reason step by step, and put your final answer within \boxed{}.<｜Assistant｜><think>\n
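
The format above can also be assembled as a plain string, for example when calling an inference server that accepts raw prompts. This is only a sketch: `build_math_prompt` is a hypothetical helper, the special tokens are copied from the template above, and whether `tokenizer.apply_chat_template` yields exactly the same string is an assumption worth checking against the tokenizer.

```python
INSTRUCTION = "Please reason step by step, and put your final answer within \\boxed{}."


def build_math_prompt(question: str) -> str:
    """Wrap a math question in the recommended single-turn prompt format."""
    return (
        "<｜begin▁of▁sentence｜><｜User｜>"
        f"{question}\n{INSTRUCTION}"
        "<｜Assistant｜><think>\n"
    )


prompt = build_math_prompt("What is $1+1$?")
print(prompt)
```

Since the string already starts with the begin-of-sentence token, it may be necessary to tokenize it with `add_special_tokens=False` so that the token is not added a second time.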

📞 Contact

Yang Chen (yachen@nvidia.com)
Zihan Liu (zihanl@nvidia.com)
Chankyu Lee (chankyul@nvidia.com)
Wei Ping (wping@nvidia.com)

📄 License

Your use of this model is governed by the NVIDIA Open Model License.

📚 Citation

@article{acemath2024,
  title={AceMath: Advancing Frontier Math Reasoning with Post-Training and Reward Modeling},
  author={Liu, Zihan and Chen, Yang and Shoeybi, Mohammad and Catanzaro, Bryan and Ping, Wei},
  journal={arXiv preprint},
  year={2024}
}