AceReason Nemotron 14B GGUF

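Developed by QuantFactory

AceReason-Nemotron-14B is a math and code reasoning model trained with reinforcement learning. It scores strongly on several math and code reasoning benchmarks.

Large Language Model

Transformers

#MathReasoningEnhancement #CodeGenerationOptimization #RLTrainedModel

Downloads: 326

Release date: 6/14/2025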

Model Overview

This model specializes in math and code reasoning tasks. It is trained with reinforcement learning and performs well at solving math and programming problems.

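Model Features

Reinforcement learning training

A math and code reasoning model trained entirely through reinforcement learning (RL)

Systematic study

The RL training process was studied systematically through extensive ablation experiments

Performance gains

Achieves strong results on math and code reasoning benchmarks

Staged training

RL training runs first on math-only prompts, then on code-only prompts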

Model Capabilities

Math problem solving

Code generation

Step-by-step reasoning

Complex problem solving

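Use Cases

Education

Math competition problem solving

Solves advanced math competition problems, such as those from AIME

Scores 78.6% on AIME 2024 and 67.4% on AIME 2025 (avg@64)

Programming

Coding problem solving

Generates Python solutions from a problem description

Scores 61.1% on LiveCodeBench v5 and 54.9% on LiveCodeBench v6 (avg@8)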

🚀 QuantFactory/AceReason-Nemotron-14B-GGUF

This is a quantized version of nvidia/AceReason-Nemotron-14B, created with llama.cpp.
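Because this repository ships GGUF files, one way to run them locally is through the llama-cpp-python bindings. The following is a minimal sketch, not an official recipe: it assumes llama-cpp-python is installed, the helper names `build_messages` and `generate` are hypothetical, and the GGUF file path is something you supply yourself after downloading a quantization. The temperature and top-p values are the ones this card reports for evaluation.

```python
from typing import Dict, List

# Sampling settings reported in this card for evaluation.
GENERATION_KWARGS = {"temperature": 0.6, "top_p": 0.95}


def build_messages(question: str) -> List[Dict[str, str]]:
    """Return a chat with a single user turn (the card advises against a system prompt)."""
    return [{"role": "user", "content": question}]


def generate(model_path: str, question: str, n_ctx: int = 32768) -> str:
    """Run one chat completion against a local GGUF file (hypothetical helper)."""
    # Imported lazily so build_messages() works without llama-cpp-python installed.
    from llama_cpp import Llama

    # n_ctx bounds prompt + generated tokens; n_gpu_layers=-1 offloads all layers if a GPU is available.
    llm = Llama(model_path=model_path, n_ctx=n_ctx, n_gpu_layers=-1)
    result = llm.create_chat_completion(
        messages=build_messages(question), **GENERATION_KWARGS
    )
    return result["choices"][0]["message"]["content"]
```

A call would look like `generate("/path/to/your-quantization.gguf", "...")`, where the path is a placeholder for whichever file you downloaded. Long reasoning traces need a large context window, so a smaller `n_ctx` may truncate the output.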

Original Model Card

library_name: transformers
license: other
license_name: nvidia-open-model-license
license_link: https://www.nvidia.com/en-us/agreements/enterprise-software/nvidia-open-model-license/
pipeline_tag: text-generation
language:
- en
tags:
- nvidia
- reasoning
- math
- code
- reinforcement learning
- pytorch

🚀 Quick Start

AceReason-Nemotron-14B is a math and code reasoning model trained entirely through reinforcement learning (RL), starting from DeepSeek-R1-Distilled-Qwen-14B. It achieves 78.6% on AIME 2024 (+8.9%), 67.4% on AIME 2025 (+17.4%), 61.1% on LiveCodeBench v5 (+8%), 54.9% on LiveCodeBench v6 (+7%), and a rating of 2024 on Codeforces (+543).

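✨ Key Features

📢 News

June 11, 2025: We released the evaluation toolkit at AceReason Evaluation. It includes:
- Scripts to run inference and scoring
- LiveCodeBench (avg@8): model prediction files and scores for each month (2023/5 - 2025/5)
- AIME24/25 (avg@64): model prediction files and scores
June 2, 2025: We released the RL training dataset for math at AceReason-Math.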
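The avg@8 and avg@64 labels above are not defined in this card. A common reading, assumed here, is that each problem is sampled k times and the reported score is the mean accuracy across all samples. The sketch below illustrates that reading on toy data; `avg_at_k` is a hypothetical helper, not part of the released toolkit.

```python
from statistics import mean
from typing import Sequence


def avg_at_k(correct: Sequence[Sequence[bool]]) -> float:
    """Mean accuracy in percent, averaged over the k samples of every problem.

    correct[i][j] is True when sample j for problem i was judged correct.
    """
    per_problem = [mean(1.0 if c else 0.0 for c in samples) for samples in correct]
    return 100.0 * mean(per_problem)


# Toy data: 2 problems with k = 4 samples each (not real model outputs).
toy_results = [
    [True, True, False, True],     # 3 of 4 correct
    [False, False, True, False],   # 1 of 4 correct
]
score = avg_at_k(toy_results)
print(f"avg@4 = {score:.1f}")
```

Averaging over many samples reduces the variance that sampling at temperature 0.6 introduces, which matters on small benchmarks such as AIME.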

📊 Results

We evaluate the model against competitive reasoning models of comparable size within the Qwen2.5 and Llama3.1 model families on AIME 2024, AIME 2025, LiveCodeBench v5 (2024/08/01 - 2025/02/01), and LiveCodeBench v6 (2025/02/01 - 2025/05/01). See the technical report for more detailed evaluation results.

Model	AIME 2024 (avg@64)	AIME 2025 (avg@64)	LCB v5 (avg@8)	LCB v6 (avg@8)
QwQ-32B	79.5	65.8	63.4	-
DeepSeek-R1-671B	79.8	70.0	65.9	-
Llama-Nemotron-Ultra-253B	80.8	72.5	66.3	-
o3-mini (medium)	79.6	76.7	67.4	-
Light-R1-14B	74	60.2	57.9	51.5
DeepCoder-14B (32K Inference)	71	56.1	57.9	50.4
OpenMath-Nemotron-14B	76.3	63.0	-	-
OpenCodeReasoning-Nemotron-14B	-	-	59.4	54.1
Llama-Nemotron-Super-49B-v1	67.5	60.0	45.5	-
DeepSeek-R1-Distilled-Qwen-14B	69.7	50.2	53.1	47.9
DeepSeek-R1-Distilled-Qwen-32B	72.6	54.9	57.2	-
AceReason-Nemotron-7B 🤖	69.0	53.6	51.8	44.1
AceReason-Nemotron-14B 🤖	78.6	67.4	61.1	54.9
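To read the table as gains over the starting checkpoint, the rows for DeepSeek-R1-Distilled-Qwen-14B and AceReason-Nemotron-14B can be subtracted column by column. The snippet below does only that arithmetic on the numbers copied from the table; the variable and function names are illustrative.

```python
# Scores copied from the table above (AIME: avg@64, LiveCodeBench: avg@8).
BENCHMARKS = ["AIME 2024", "AIME 2025", "LCB v5", "LCB v6"]
SCORES = {
    "DeepSeek-R1-Distilled-Qwen-14B": [69.7, 50.2, 53.1, 47.9],
    "AceReason-Nemotron-14B": [78.6, 67.4, 61.1, 54.9],
}


def gains(model: str, baseline: str) -> dict:
    """Per-benchmark difference (model minus baseline), rounded to one decimal."""
    return {
        name: round(m - b, 1)
        for name, m, b in zip(BENCHMARKS, SCORES[model], SCORES[baseline])
    }


delta = gains("AceReason-Nemotron-14B", "DeepSeek-R1-Distilled-Qwen-14B")
for name, value in delta.items():
    print(f"{name}: {value:+.1f} points")
```

Differences computed from the rounded table entries can deviate slightly from gains quoted elsewhere in the card, which may have been computed from unrounded scores.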

💻 Usage Examples

Basic Usage

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = 'nvidia/AceReason-Nemotron-14B'
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto", device_map="auto")

prompt = "Jen enters a lottery by picking $4$ distinct numbers from $S=\\{1,2,3,\\cdots,9,10\\}.$ $4$ numbers are randomly chosen from $S.$ She wins a prize if at least two of her numbers were $2$ of the randomly chosen numbers, and wins the grand prize if all four of her numbers were the randomly chosen numbers. The probability of her winning the grand prize given that she won a prize is $\\tfrac{m}{n}$ where $m$ and $n$ are relatively prime positive integers. Find $m+n$."
messages = [{"role": "user", "content": prompt}]

text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
# Move inputs to wherever device_map="auto" placed the model instead of hard-coding "cuda".
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=32768,
    do_sample=True,  # make sure temperature and top_p take effect
    temperature=0.6,
    top_p=0.95
)
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
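The decoded `response` typically contains a long reasoning trace followed by the final answer. The helpers below are a sketch for post-processing it. They assume the response keeps a literal `</think>` marker after decoding and that the final answer is wrapped in `\boxed{}` as the math instruction requests; both function names are hypothetical.

```python
from typing import Optional, Tuple


def split_reasoning(response: str) -> Tuple[str, str]:
    """Split a response into (reasoning, final_answer) at the last </think> tag."""
    reasoning, sep, answer = response.rpartition("</think>")
    if not sep:  # no tag found: treat the whole text as the answer
        return "", response.strip()
    return reasoning.strip(), answer.strip()


def extract_boxed(text: str) -> Optional[str]:
    """Return the contents of the last \\boxed{...}, handling nested braces."""
    marker = "\\boxed{"
    start = text.rfind(marker)
    if start == -1:
        return None
    depth, chars = 1, []
    for ch in text[start + len(marker):]:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return "".join(chars)
        chars.append(ch)
    return None  # braces never balanced


# Synthetic response used only to demonstrate the helpers (not real model output).
sample = "Counting the cases gives 1/115.\n</think>\nThe answer is $\\boxed{116}$."
reasoning, final = split_reasoning(sample)
print(extract_boxed(final))
```

For the lottery prompt used above, the expected value of m+n is 116, so a correct generation should yield that string from `extract_boxed`.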

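📚 Documentation

💡 Usage Recommendations

Do not include a system prompt; put all instructions directly in the user prompt.
For math questions, we recommend the following instruction: Please reason step by step, and put your final answer within \boxed{}.
For code questions, we recommend the following instruction:

question = ""  # the coding question
starter_code = ""  # function header from the starter code, if any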

code_instruction_nostartercode = """Write Python code to solve the problem. Please place the solution code in the following format:
```python
# Your solution code here
```"""
code_instruction_hasstartercode = """Please place the solution code in the following format:
```python
# Your solution code here
```"""
if starter_code != "":
    question += "\n\n" + "Solve the problem starting with the provided function header.\n\nFunction header:\n" + "```\n" + starter_code + "\n```"
    question += "\n\n" + code_instruction_hasstartercode
else:
    question += "\n\n" + code_instruction_nostartercode

final_prompt = "<｜User｜>" + question + "<｜Assistant｜><think>\n"
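The prompt-building steps above can also be wrapped in a reusable function, together with a helper that pulls the solution out of the model's reply. This is a sketch under stated assumptions: `build_code_prompt` and `extract_python_block` are hypothetical names, the instruction text is copied from the snippet above, and the extractor assumes the model follows the requested fenced format.

```python
import re
from typing import Optional

FENCE = "`" * 3  # three backticks, built programmatically so this example nests cleanly
USER_TAG = "<｜User｜>"
ASSISTANT_TAG = "<｜Assistant｜>"
FORMAT_HINT = (
    "Please place the solution code in the following format:\n"
    f"{FENCE}python\n# Your solution code here\n{FENCE}"
)


def build_code_prompt(question: str, starter_code: str = "") -> str:
    """Assemble the recommended code prompt, with or without a starter function header."""
    if starter_code:
        body = (
            question
            + "\n\nSolve the problem starting with the provided function header."
            + "\n\nFunction header:\n"
            + f"{FENCE}\n{starter_code}\n{FENCE}"
            + "\n\n" + FORMAT_HINT
        )
    else:
        body = question + "\n\nWrite Python code to solve the problem. " + FORMAT_HINT
    return USER_TAG + body + ASSISTANT_TAG + "<think>\n"


def extract_python_block(response: str) -> Optional[str]:
    """Return the last fenced python block in a response, or None if there is none."""
    pattern = re.compile(FENCE + r"python\n(.*?)" + FENCE, re.DOTALL)
    blocks = pattern.findall(response)
    return blocks[-1].strip() if blocks else None


prompt = build_code_prompt("Return the sum of two integers.", "def add(a, b):")
print(prompt)
```

Taking the last fenced block rather than the first is a design choice: reasoning traces often contain draft code before the final solution.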

Our evaluation uses vLLM==0.7.3 as the inference engine, with top-p=0.95, temperature=0.6, and max_tokens=32768.
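The settings above map directly onto vLLM's `SamplingParams`. The sketch below shows one way to wire them up for math questions. It assumes vLLM is installed and a suitable GPU is available; `make_math_prompt` and `run_vllm` are hypothetical helpers, the newline between the question and the instruction is an assumption, and this is not the authors' evaluation script.

```python
from typing import List

# Values reported in this card for evaluation.
SAMPLING = {"temperature": 0.6, "top_p": 0.95, "max_tokens": 32768}
MATH_INSTRUCTION = "Please reason step by step, and put your final answer within \\boxed{}."


def make_math_prompt(question: str) -> str:
    """Append the recommended math instruction and wrap it in the chat tags used in this card."""
    # Joining with a newline is an assumption; the card does not specify the separator.
    return "<｜User｜>" + question + "\n" + MATH_INSTRUCTION + "<｜Assistant｜><think>\n"


def run_vllm(questions: List[str], model: str = "nvidia/AceReason-Nemotron-14B") -> List[str]:
    """Generate one response per question with vLLM (requires vllm and a GPU)."""
    # Imported lazily so the prompt helper works without vLLM installed.
    from vllm import LLM, SamplingParams

    llm = LLM(model=model)
    outputs = llm.generate(
        [make_math_prompt(q) for q in questions],
        SamplingParams(**SAMPLING),
    )
    return [out.outputs[0].text for out in outputs]
```

To approximate avg@k style scoring, `n=k` can be added to `SamplingParams` so that each prompt returns k samples, in which case every entry of `out.outputs` should be scored rather than only the first.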

🔍 Evaluation Toolkit

For the evaluation code, scripts, and cached prediction files, see https://huggingface.co/nvidia/AceReason-Nemotron-14B/blob/main/README_EVALUATION.md.

📧 Contact

Yang Chen (yachen@nvidia.com), Zhuolin Yang (zhuoliny@nvidia.com), Zihan Liu (zihanl@nvidia.com), Chankyu Lee (chankyul@nvidia.com), Wei Ping (wping@nvidia.com)

📄 License

Use of this model is governed by the NVIDIA Open Model License.

📖 Citation

@article{chen2025acereason,
  title={AceReason-Nemotron: Advancing Math and Code Reasoning through Reinforcement Learning},
  author={Chen, Yang and Yang, Zhuolin and Liu, Zihan and Lee, Chankyu and Xu, Peng and Shoeybi, Mohammad and Catanzaro, Bryan and Ping, Wei},
  journal={arXiv preprint arXiv:2505.16400},
  year={2025}
}