AceMath-7B-Instruct Open-source Mathematical Reasoning Model - Free Deployment to Solve English Math Problems

AceMath 7B Instruct

Developed by NVIDIA

AceMath-7B-Instruct is an instruction-tuned model for mathematical reasoning developed by NVIDIA. It is fine-tuned from Qwen2.5-Math-7B-Base and solves English math problems through chain-of-thought (CoT) reasoning.

Large Language Model

Safetensors

English #Mathematical Reasoning #Chain-of-Thought Optimization #Multi-Stage Fine-Tuning


Release Time : 1/13/2025

Model Overview

The AceMath series of models are specifically designed for mathematical reasoning, including instruction models and reward models of varying scales. The instruction models excel at solving math problems through chain-of-thought reasoning, while the reward models focus on evaluating and scoring mathematical solutions.

Model Features

Specialized Math Optimization

Designed specifically for mathematical reasoning, enhancing problem-solving capabilities through a multi-stage supervised fine-tuning process.

Chain-of-Thought Reasoning

Excels at solving complex mathematical problems through chain-of-thought (CoT) reasoning.

Outstanding Performance

The 7B version outperforms the previous best-in-class Qwen2.5-Math-7B-Instruct across multiple math reasoning benchmarks (average pass@1: 67.2 vs. 62.9) and comes close to the 10× larger Qwen2.5-Math-72B-Instruct (68.2).

Complete Training Data Publicly Available

All training data is publicly available to support related research.

Model Capabilities

Mathematical problem solving

Chain-of-thought reasoning

English text generation

Use Cases

Education

Math Problem Solving

Helps students understand and solve complex mathematical problems.

Performs excellently on multiple mathematical reasoning benchmarks.

Research

Mathematical Reasoning Research

Supports research on mathematical reasoning and chain-of-thought.

Publicly available training data can be used for further research.

🚀 AceMath

AceMath is a family of frontier models designed for mathematical reasoning, offering high-performance solutions for math problems and evaluations.

🚀 Quick Start

AceMath is a family of models built for mathematical reasoning. It includes the instruction models AceMath-1.5B/7B/72B-Instruct and the reward models AceMath-7B/72B-RM, all of which are built on Qwen.

The AceMath-1.5B/7B/72B-Instruct models solve English math problems using chain-of-thought (CoT) reasoning. The AceMath-7B/72B-RM models are outcome reward models that evaluate and score math solutions.

The AceMath-1.5B/7B/72B-Instruct models are developed from the Qwen2.5-Math-1.5B/7B/72B-Base models through a multi-stage supervised fine-tuning (SFT) process: first with general-purpose SFT data, then with math-specific SFT data. All training data is being released to support further research in this field.

We recommend using the AceMath models only for solving math problems. For other tasks, we also release AceInstruct-1.5B/7B/72B, a series of general-purpose SFT models for code, math, and general knowledge tasks, built upon Qwen2.5-1.5B/7B/72B-Base.

For more information about AceMath, check our website and paper.

✨ Features

Powerful Math Reasoning: AceMath-Instruct models solve English math problems using CoT reasoning.
Accurate Evaluation: AceMath-RM models evaluate and score math solutions.
Multi-stage SFT: The Instruct models are developed through a multi-stage SFT process for better performance.
Data Release: All training data is released to support further research.

📦 All Resources

💻 AceMath Instruction Models

[AceMath-1.5B-Instruct](https://huggingface.co/nvidia/AceMath-1.5B-Instruct), [AceMath-7B-Instruct](https://huggingface.co/nvidia/AceMath-7B-Instruct), [AceMath-72B-Instruct](https://huggingface.co/nvidia/AceMath-72B-Instruct)

📊 AceMath Reward Models

[AceMath-7B-RM](https://huggingface.co/nvidia/AceMath-7B-RM), [AceMath-72B-RM](https://huggingface.co/nvidia/AceMath-72B-RM)

📈 Evaluation & Training Data

[AceMath-RewardBench](https://huggingface.co/datasets/nvidia/AceMath-RewardBench), [AceMath-Instruct Training Data](https://huggingface.co/datasets/nvidia/AceMath-Instruct-Training-Data), [AceMath-RM Training Data](https://huggingface.co/datasets/nvidia/AceMath-RM-Training-Data)

🌟 General Instruction Models

[AceInstruct-1.5B](https://huggingface.co/nvidia/AceInstruct-1.5B), [AceInstruct-7B](https://huggingface.co/nvidia/AceInstruct-7B), [AceInstruct-72B](https://huggingface.co/nvidia/AceInstruct-72B)

📈 Benchmark Results (AceMath-Instruct + AceMath-72B-RM)

AceMath Benchmark Results

We compare AceMath to leading proprietary and open-access math models in the above table. Our AceMath-7B-Instruct substantially outperforms the previous best-in-class Qwen2.5-Math-7B-Instruct (average pass@1: 67.2 vs. 62.9) on a variety of math reasoning benchmarks, and comes close to the performance of the 10× larger Qwen2.5-Math-72B-Instruct (67.2 vs. 68.2). Notably, our AceMath-72B-Instruct outperforms the state-of-the-art Qwen2.5-Math-72B-Instruct (71.8 vs. 68.2), GPT-4o (67.4), and Claude 3.5 Sonnet (65.6) by a margin. We also report the rm@8 accuracy (best of 8) achieved by our reward model, AceMath-72B-RM, which sets a new record on these reasoning benchmarks. This excludes OpenAI’s o1 model, which relies on scaled inference computation.
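To make the rm@8 (best of 8) metric concrete, here is a minimal sketch of best-of-n selection: sample several candidate solutions per problem, score each one, keep the highest-scoring candidate, and count how often it matches the reference answer. The names `select_best`, `rm_at_n`, and `toy_score` are hypothetical, and `toy_score` is a deliberately trivial stand-in, not the AceMath reward model or its API.

```python
from typing import Callable, List, Tuple

# A candidate is (solution text, final answer extracted from that text).
Candidate = Tuple[str, str]


def select_best(cands: List[Candidate], score: Callable[[str], float]) -> Candidate:
    """Return the candidate whose solution text gets the highest score."""
    if not cands:
        raise ValueError("need at least one candidate solution")
    return max(cands, key=lambda c: score(c[0]))


def rm_at_n(problems: List[Tuple[List[Candidate], str]],
            score: Callable[[str], float]) -> float:
    """Fraction of problems where the top-scored candidate's answer equals the reference."""
    hits = sum(select_best(cands, score)[1] == ref for cands, ref in problems)
    return hits / len(problems)


def toy_score(solution: str) -> float:
    # Hypothetical stand-in scorer: counts "=" signs in the text.
    # A real setup would call a reward model here instead.
    return float(solution.count("="))


# Two toy problems, each with two candidates and a reference answer.
problems = [
    ([("2+2 = 5", "5"), ("2+2 = 2*2 = 4", "4")], "4"),
    ([("3*3 = 9", "9"), ("3*3 is 6", "6")], "9"),
]
accuracy = rm_at_n(problems, toy_score)
print(accuracy)
```

With n = 8 candidates per problem and a real reward model as `score`, the same loop would give an rm@8-style accuracy; pass@1 corresponds to grading a single sampled solution with no reranking.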

💻 Usage Examples

Basic Usage

from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the tokenizer and model; device_map="auto" places the weights on the available device(s).
model_name = "nvidia/AceMath-7B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto", device_map="auto")

prompt = "Jen enters a lottery by picking $4$ distinct numbers from $S=\\{1,2,3,\\cdots,9,10\\}.$ $4$ numbers are randomly chosen from $S.$ She wins a prize if at least two of her numbers were $2$ of the randomly chosen numbers, and wins the grand prize if all four of her numbers were the randomly chosen numbers. The probability of her winning the grand prize given that she won a prize is $\\tfrac{m}{n}$ where $m$ and $n$ are relatively prime positive integers. Find $m+n$."
messages = [{"role": "user", "content": prompt}]

# Render the conversation with the model's chat template and append the assistant turn marker.
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
# Move the inputs to the model's device rather than hard-coding "cuda".
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=2048
)
# Drop the prompt tokens so that only the newly generated tokens are decoded.
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
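To sanity-check the model's output for the lottery prompt above, the following sketch computes the reference answer by direct counting and pulls a final answer out of a response string. It assumes the response wraps its final answer in `\boxed{...}`, which this page does not state; `lottery_reference_answer` and `extract_boxed` are hypothetical helper names, not part of any AceMath tooling.

```python
from fractions import Fraction
from math import comb
from typing import Optional


def lottery_reference_answer() -> int:
    """Compute m + n for P(grand prize | prize) = m/n by counting draws."""
    # Jen's 4 numbers are fixed; classify the C(10, 4) draws by how many match hers.
    exactly_2 = comb(4, 2) * comb(6, 2)
    exactly_3 = comb(4, 3) * comb(6, 1)
    exactly_4 = comb(4, 4) * comb(6, 0)
    prize = exactly_2 + exactly_3 + exactly_4  # draws with at least two matches
    p = Fraction(exactly_4, prize)  # Fraction reduces to lowest terms
    return p.numerator + p.denominator


def extract_boxed(text: str) -> Optional[str]:
    """Return the contents of the last \\boxed{...} in text, or None if absent."""
    marker = "\\boxed{"
    start = text.rfind(marker)
    if start == -1:
        return None
    depth = 1  # brace depth inside the box, to handle nested LaTeX like \frac{1}{2}
    out = []
    for ch in text[start + len(marker):]:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return "".join(out)
        out.append(ch)
    return None  # braces never balanced


expected = lottery_reference_answer()
sample = "Thus the probability is 1/115, so m+n = \\boxed{116}."  # made-up response text
print(expected, extract_boxed(sample))
```

Under the boxed-answer assumption, `extract_boxed(response) == str(expected)` would compare the generated `response` from the usage example against the counted answer.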

📚 Correspondence to

Zihan Liu (zihanl@nvidia.com), Yang Chen (yachen@nvidia.com), Wei Ping (wping@nvidia.com)

📚 Citation

If you find our work helpful, we’d appreciate it if you could cite us.

@article{acemath2024,
  title={AceMath: Advancing Frontier Math Reasoning with Post-Training and Reward Modeling},
  author={Liu, Zihan and Chen, Yang and Shoeybi, Mohammad and Catanzaro, Bryan and Ping, Wei},
  journal={arXiv preprint},
  year={2024}
}

📄 License

All models in the AceMath family are for non-commercial use only, subject to the [Terms of Use](https://openai.com/policies/row-terms-of-use/) of the data generated by OpenAI. We put the AceMath models under the license of [Creative Commons Attribution: Non-Commercial 4.0 International](https://spdx.org/licenses/CC-BY-NC-4.0).
