AceInstruct: Advanced SFT Models
AceInstruct is a family of advanced SFT models designed for coding, mathematics, and general-purpose tasks. It offers high-performance solutions across multiple domains, with performance comparable to Qwen2.5-Instruct.
Quick Start
For a quick start, you can use the following code to interact with the AceInstruct-72B model:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "AceInstruct-72B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto", device_map="auto")

prompt = "Tell me something about artificial intelligence."
messages = [{"role": "user", "content": prompt}]

# Build the chat-formatted prompt and tokenize it
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to("cuda")

# Generate, then strip the prompt tokens so only the new completion remains
generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=1024
)
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]
response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
```
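The decoded `response` string contains the model's reply, which you can print or append to `messages` as an assistant turn for multi-turn use. The smaller family members follow the same loading pattern; as a minimal sketch (assuming a recent `transformers` release whose text-generation `pipeline` accepts chat messages directly, and an `AceInstruct-1.5B` identifier that mirrors the naming above), the higher-level `pipeline` API can be used instead:

```python
from transformers import pipeline

# Minimal sketch: the model identifier mirrors the naming used above and may need
# to be replaced with the full repository id or a local path.
pipe = pipeline(
    "text-generation",
    model="AceInstruct-1.5B",
    torch_dtype="auto",
    device_map="auto",
)

messages = [{"role": "user", "content": "Tell me something about artificial intelligence."}]

# Recent text-generation pipelines apply the tokenizer's chat template to message lists
# and return the full conversation, with the new assistant turn appended at the end.
outputs = pipe(messages, max_new_tokens=256)
print(outputs[0]["generated_text"][-1]["content"])
```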
Features
- Versatile Application: AceInstruct is a versatile model family that can be applied to a wide range of domains, including coding, mathematics, and general knowledge tasks.
- Improved Performance: It is improved using Qwen and fine-tuned on Qwen2.5-Base. Benchmark evaluations show that it delivers performance comparable to Qwen2.5-Instruct.
- Multiple Sizes: The AceInstruct family includes models of different sizes (AceInstruct-1.5B, 7B, and 72B), catering to various resource requirements.
Documentation
Introduction
We introduce AceInstruct, a family of advanced SFT models for coding, mathematics, and general-purpose tasks. The AceInstruct family, which includes AceInstruct-1.5B, 7B, and 72B, is improved using Qwen.
These models are fine-tuned on Qwen2.5-Base using general SFT datasets, the same datasets used to train AceMath-Instruct. Unlike AceMath-Instruct, which is specialized for math questions, AceInstruct is versatile and can be applied to a wide range of domains. Benchmark evaluations across coding, mathematics, and general knowledge tasks demonstrate that AceInstruct delivers performance comparable to Qwen2.5-Instruct.
For more information about AceInstruct, check our website and paper.
Benchmark Results
| Benchmark | Qwen2.5-1.5B-Instruct | AceInstruct-1.5B | Qwen2.5-7B-Instruct | AceInstruct-7B | Qwen2.5-72B-Instruct | AceInstruct-72B |
|---|---|---|---|---|---|---|
| HumanEval | 61.60 | 73.17 | 84.80 | 85.37 | 86.60 | 89.63 |
| MBPP | 63.20 | 65.76 | 79.20 | 74.32 | 88.20 | 83.66 |
| GSM8K | 73.20 | 80.44 | 91.60 | 93.10 | 95.80 | 96.36 |
| MATH | 55.20 | 60.34 | 75.50 | 76.40 | 83.10 | 84.50 |
| MMLU | 58.37 | 58.17 | 74.51 | 74.68 | 84.67 | 83.88 |
| MMLU Pro | 32.40 | 33.78 | 56.30 | 54.50 | 71.10 | 66.10 |
| Average | 57.33 | 61.94 | 76.99 | 76.40 | 84.91 | 84.02 |
We compare AceInstruct to Qwen2.5-Instruct across coding, mathematics, and general knowledge tasks. We find that AceInstruct-1.5B outperforms Qwen2.5-1.5B-Instruct (61.94 vs. 57.33), while AceInstruct-7B and AceInstruct-72B perform similarly to Qwen2.5-7B-Instruct and Qwen2.5-72B-Instruct.
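The Average row appears to be the unweighted mean of the six benchmark scores in each column; a quick sanity check (a sketch for illustration, not part of the original evaluation code) reproduces the reported 61.94 and 57.33:

```python
# Scores from the table above: HumanEval, MBPP, GSM8K, MATH, MMLU, MMLU Pro
aceinstruct_1_5b = [73.17, 65.76, 80.44, 60.34, 58.17, 33.78]
qwen2_5_1_5b_instruct = [61.60, 63.20, 73.20, 55.20, 58.37, 32.40]

print(round(sum(aceinstruct_1_5b) / len(aceinstruct_1_5b), 2))            # 61.94
print(round(sum(qwen2_5_1_5b_instruct) / len(qwen2_5_1_5b_instruct), 2))  # 57.33
```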
All Resources
- AceMath Instruction Models
- AceMath Reward Models
- Evaluation & Training Data
- General Instruction Models
Correspondence to
Zihan Liu (zihanl@nvidia.com), Yang Chen (yachen@nvidia.com), Wei Ping (wping@nvidia.com)
Citation
If you find our work helpful, we'd appreciate it if you could cite us:
```bibtex
@article{acemath2024,
  title={AceMath: Advancing Frontier Math Reasoning with Post-Training and Reward Modeling},
  author={Liu, Zihan and Chen, Yang and Shoeybi, Mohammad and Catanzaro, Bryan and Ping, Wei},
  journal={arXiv preprint},
  year={2024}
}
```
License
All models in the AceInstruct family are for non-commercial use only, subject to the [Terms of Use](https://openai.com/policies/row-terms-of-use/) of the data generated by OpenAI. We put the AceInstruct models under the license of [Creative Commons Attribution-NonCommercial 4.0 International](https://spdx.org/licenses/CC-BY-NC-4.0).