AceReason-Nemotron-7B-GGUF: an open-source, free-to-deploy model for efficient math and code reasoning

AceReason Nemotron 7B GGUF

Published by QuantFactory

AceReason-Nemotron-7B is a math and code reasoning model trained with reinforcement learning. Training starts from DeepSeek-R1-Distilled-Qwen-7B, and the model scores well on several benchmarks.

Large language model

Transformers

#MathReasoning #CodeGeneration #MultiBenchmarkGains

Downloads: 326

Release date: 6/13/2025

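Model overview

The model specializes in math and code reasoning tasks. Reinforcement learning training improves its performance, making it suited to complex math problems and programming challenges.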

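Model features

Reinforcement learning training

The model is trained entirely through reinforcement learning, which substantially improves its math and code reasoning.

Strong benchmark performance

It achieves marked gains on benchmarks including AIME 2024, AIME 2025, and LiveCodeBench v5 and v6.

Effective training recipe

Reinforcement learning is applied first to math prompts and then to code prompts to optimize performance.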

Model capabilities

Math problem solving

Code generation

Complex reasoning

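Use cases

Education

Solving math competition problems

Solves difficult math competition problems, such as those from AIME.

Scored 69.0% on AIME 2024.

Programming

Code generation and optimization

Generates and optimizes Python code to solve programming problems.

Scored 51.8% on LiveCodeBench v5.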

🚀 QuantFactory/AceReason-Nemotron-7B-GGUF

This model is a quantized version of nvidia/AceReason-Nemotron-7B, created with llama.cpp.

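🚀 Quick start

This section covers basic usage of the AceReason-Nemotron-7B-GGUF model. The example below loads the original nvidia/AceReason-Nemotron-7B weights with Transformers.

Loading the model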

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = 'nvidia/AceReason-Nemotron-7B'
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto", device_map="auto")

Running inference

prompt = "Jen enters a lottery by picking $4$ distinct numbers from $S=\\{1,2,3,\\cdots,9,10\\}.$ $4$ numbers are randomly chosen from $S.$ She wins a prize if at least two of her numbers were $2$ of the randomly chosen numbers, and wins the grand prize if all four of her numbers were the randomly chosen numbers. The probability of her winning the grand prize given that she won a prize is $\\tfrac{m}{n}$ where $m$ and $n$ are relatively prime positive integers. Find $m+n$."
messages = [{"role": "user", "content": prompt}]

text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
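# Move the inputs to whichever device device_map="auto" placed the model on.
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=32768,
    do_sample=True,  # temperature and top_p only take effect when sampling
    temperature=0.6,
    top_p=0.95
)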
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
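
The snippet above loads the original weights through Transformers. For the GGUF files themselves, one option is the llama-cpp-python bindings. The following is a minimal sketch under stated assumptions, not the publisher's documented method: the file name in `MODEL_PATH` is hypothetical, the `n_ctx` and `max_tokens` values are illustrative choices rather than recommendations from the source, and it assumes the GGUF file carries a chat template that llama-cpp-python can apply. The sampling values are copied from the example above.

```python
# Sketch: run a GGUF quantization with llama-cpp-python (pip install llama-cpp-python).
# MODEL_PATH is a hypothetical local file name; point it at the .gguf file you downloaded.
MODEL_PATH = "AceReason-Nemotron-7B.Q4_K_M.gguf"  # hypothetical file name

# Sampling settings copied from the Transformers example.
SAMPLING = {"temperature": 0.6, "top_p": 0.95}


def build_messages(question: str) -> list[dict]:
    """Wrap a question as a single-turn chat conversation."""
    return [{"role": "user", "content": question}]


def generate(question: str, model_path: str = MODEL_PATH, max_tokens: int = 4096) -> str:
    """Load the GGUF file and return the model's reply text."""
    # Imported lazily so the helpers above can be used without the package installed.
    from llama_cpp import Llama

    # n_ctx and n_gpu_layers are illustrative; -1 offloads all layers to the GPU if available.
    llm = Llama(model_path=model_path, n_ctx=8192, n_gpu_layers=-1)
    result = llm.create_chat_completion(
        messages=build_messages(question),
        max_tokens=max_tokens,
        **SAMPLING,
    )
    return result["choices"][0]["message"]["content"]
```

Calling `generate("your question")` returns the reply as a string. Long reasoning traces may need a larger `n_ctx` and `max_tokens` than the values assumed here.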

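✨ Main features

AceReason-Nemotron-7B is a math and code reasoning model trained entirely through reinforcement learning (RL), starting from DeepSeek-R1-Distilled-Qwen-7B. Highlights:

Scores 69.0% on AIME 2024 (+14.5%), 53.6% on AIME 2025 (+17.4%), 51.8% on LiveCodeBench v5 (+8%), and 44.1% on LiveCodeBench v6 (+7%).
Proposes a simple yet effective recipe: RL on math-only prompts first, then RL on code-only prompts.
Math-only RL substantially improves performance not only on math benchmarks but also on code reasoning tasks.
Extended code-only RL further improves code benchmark performance with minimal degradation in math results.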

📦 Installation

The original documentation does not give specific installation steps.

💻 Usage examples

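Advanced usage

# Recommended instruction for math questions
question = "your math question"
instruction = "Please reason step by step, and put your final answer within \\boxed{}."
# A blank line keeps the question and the instruction from running together.
final_prompt = "<｜User｜>" + question + "\n\n" + instruction + "<｜Assistant｜><think>\n"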
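
Because the math instruction asks for the final answer inside `\boxed{}`, the answer can be pulled out of the generated text afterwards. The helper below is a minimal sketch; `extract_boxed` is a hypothetical name, not part of any library, and it assumes the last `\boxed{...}` in the text is the final answer.

```python
def extract_boxed(text: str):
    """Return the contents of the last \\boxed{...} in text, or None if absent."""
    marker = "\\boxed{"
    start = text.rfind(marker)
    if start == -1:
        return None
    i = start + len(marker)
    depth = 1  # we are already inside the opening brace of \boxed{
    chars = []
    while i < len(text):
        ch = text[i]
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return "".join(chars)
        chars.append(ch)
        i += 1
    return None  # braces never closed


# Usage with a made-up response string:
print(extract_boxed("... so the answer is \\boxed{\\tfrac{1}{2}}."))
```

Counting braces, rather than using a simple regular expression, keeps nested LaTeX such as `\tfrac{1}{2}` intact.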

# Recommended instruction for code questions
question = "your code question"
starter_code = ""  # function header of the starter code, if any

code_instruction_nostartercode = """Write Python code to solve the problem. Please place the solution code in the following format:\n```python\n# Your solution code here\n```"""
code_instruction_hasstartercode = """Please place the solution code in the following format:\n```python\n# Your solution code here\n```"""
if starter_code != "":
    question += "\n\n" + "Solve the problem starting with the provided function header.\n\nFunction header:\n" + "```\n" + starter_code + "\n```"
    question += "\n\n" + code_instruction_hasstartercode
else:
    question += "\n\n" + code_instruction_nostartercode

final_prompt = "<｜User｜>" + question + "<｜Assistant｜><think>\n"
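
The code instruction asks the model to put its solution in a fenced `python` block, so the solution can be extracted from the response before it is run or tested. The sketch below assumes the reasoning ends with a `</think>` tag and that the last fenced block after it is the solution; `extract_python_solution` is a hypothetical helper name.

```python
import re

# Three backticks, built programmatically so this example stays readable.
FENCE = "`" * 3


def extract_python_solution(response: str):
    """Return the last fenced python block after the reasoning, or None."""
    # Keep only the text after the reasoning section, if the tag is present.
    answer = response.split("</think>")[-1]
    pattern = re.compile(FENCE + r"python\n(.*?)" + FENCE, re.DOTALL)
    blocks = pattern.findall(answer)
    return blocks[-1].strip() if blocks else None


# Usage with a made-up response string:
demo = "<think>plan</think>\nSolution:\n" + FENCE + "python\nprint(1)\n" + FENCE
print(extract_python_solution(demo))
```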

📚 Documentation

Results

AceReason-Nemotron-7B is evaluated against competing reasoning models of comparable size within the Qwen2.5 and Llama3.1 model families. The evaluation covers AIME 2024, AIME 2025, LiveCodeBench v5 (2024/08/01 - 2025/02/01), and LiveCodeBench v6 (2025/02/01 - 2025/05/01).

Model	AIME 2024 (avg@64)	AIME 2025 (avg@64)	LCB v5 (avg@8)	LCB v6 (avg@8)
QwQ-32B	79.5	65.8	63.4	-
DeepSeek-R1-671B	79.8	70.0	65.9	-
Llama-Nemotron-Ultra-253B	80.8	72.5	66.3	-
o3-mini (medium)	79.6	76.7	67.4	-
Light-R1-7B	59.1	44.3	40.6	36.4
Light-R1-14B	74	60.2	57.9	51.5
DeepCoder-14B (32K Inference)	71	56.1	57.9	50.4
OpenMath-Nemotron-7B	74.8	61.2	-	-
OpenCodeReasoning-Nemotron-7B	-	-	51.3	46.1
Llama-Nemotron-Nano-8B-v1	61.3	47.1	46.6	46.2
DeepSeek-R1-Distilled-Qwen-7B	55.5	39.0	37.6	34.1
DeepSeek-R1-Distilled-Qwen-14B	69.7	50.2	53.1	47.9
DeepSeek-R1-Distilled-Qwen-32B	72.6	54.9	57.2	-
AceReason-Nemotron-7B 🤖	69.0	53.6	51.8	44.1
AceReason-Nemotron-14B 🤖	78.6	67.4	61.1	54.9
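
The column headers use avg@64 and avg@8, which this page does not define. A common reading, assumed here, is that k responses are sampled per problem and the score is the accuracy averaged over those samples and then over problems. Under that assumption the metric can be sketched as follows; `avg_at_k` is a hypothetical name.

```python
def avg_at_k(correct: list[list[bool]]) -> float:
    """Average accuracy in percent.

    correct[i][j] is True when sample j for problem i was judged correct.
    Assumes every problem has at least one sample.
    """
    per_problem = [sum(samples) / len(samples) for samples in correct]
    return 100.0 * sum(per_problem) / len(per_problem)


# Two problems, two samples each: the first is solved half the time, the second always.
print(avg_at_k([[True, False], [True, True]]))  # 75.0
```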

Evaluation toolkit

See AceReason Evaluation for the evaluation code, scripts, and cached prediction files.

Correspondence

Yang Chen (yachen@nvidia.com)
Zhuolin Yang (zhuoliny@nvidia.com)
Zihan Liu (zihanl@nvidia.com)
Chankyu Lee (chankyul@nvidia.com)
Wei Ping (wping@nvidia.com)

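🔧 Technical details

Detailed technical information on the reinforcement learning training is available in the technical report 2505.16400-Technical_Report.

📄 License

Use of this model is governed by the NVIDIA Open Model License.

Citation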

@article{chen2025acereason,
  title={AceReason-Nemotron: Advancing Math and Code Reasoning through Reinforcement Learning},
  author={Chen, Yang and Yang, Zhuolin and Liu, Zihan and Lee, Chankyu and Xu, Peng and Shoeybi, Mohammad and Catanzaro, Bryan and Ping, Wei},
  journal={arXiv preprint arXiv:2505.16400},
  year={2025}
}