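🚀 Skywork o1 Open Model Series
The Skywork o1 Open model series is developed by the Skywork team at Kunlun Inc. It incorporates o1-like slow thinking and reasoning capabilities. The series comprises three advanced models with strong reasoning performance.
🚀 Quick Start
For detailed instructions on using the PRM, see the Skywork-o1-PRM-Inference repository.
Hugging Face inference
- Clone the Skywork PRM inference repository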
git clone https://github.com/SkyworkAI/skywork-o1-prm-inference.git
cd skywork-o1-prm-inference
- Run PRM inference
from transformers import AutoTokenizer
from model_utils.prm_model import PRM_MODEL
from model_utils.io_utils import prepare_input, prepare_batch_input_for_model, derive_step_rewards
prm_model_path = "/path/to/prm_model"
tokenizer = AutoTokenizer.from_pretrained(prm_model_path, trust_remote_code=True)
datas = [
{
"problem" : "Janet\u2019s ducks lay 16 eggs per day. She eats three for breakfast every morning and bakes muffins for her friends every day with four. She sells the remainder at the farmers' market daily for $2 per fresh duck egg. How much in dollars does she make every day at the farmers' market?",
"response" : "To determine how much money Janet makes every day at the farmers' market, we need to follow these steps:\n1. **Calculate the total number of eggs laid by the ducks per day.**\n Janet's ducks lay 16 eggs per day.\n2. **Determine the number of eggs Janet uses each day.**\n - She eats 3 eggs for breakfast every morning.\n - She bakes muffins for her friends every day with 4 eggs.\n So, the total number of eggs used per day is:\n \\[\n 3 + 4 = 7 \\text{ eggs}\n \\]\n3. **Calculate the number of eggs Janet sells at the farmers' market each day.**\n Subtract the number of eggs used from the total number of eggs laid:\n \\[\n 16 - 7 = 9 \\text{ eggs}\n \\]\n4. **Determine how much money Janet makes from selling the eggs.**\n She sells each egg for $2, so the total amount of money she makes is:\n \\[\n 9 \\times 2 = 18 \\text{ dollars}\n \\]\nTherefore, the amount of money Janet makes every day at the farmers' market is $\\boxed{18}$."
},
{
"problem" : "Janet\u2019s ducks lay 16 eggs per day. She eats three for breakfast every morning and bakes muffins for her friends every day with four. She sells the remainder at the farmers' market daily for $2 per fresh duck egg. How much in dollars does she make every day at the farmers' market?",
"response" : "To find out how much Janet makes every day at the farmers' market, we need to follow these steps:\n1. Calculate the total number of eggs laid by the ducks: \n - Janet's ducks lay 16 eggs per day.\n2. Calculate the total number of eggs consumed by Janet and used for baking:\n - Janet eats 3 eggs for breakfast.\n - She bakes 4 eggs for muffins.\n - Total eggs used: 3 (breakfast) + 4 (baking) = 7 eggs.\n3. Calculate the remaining number of eggs for sale:\n - Total eggs laid: 16\n - Eggs used: 7\n - Remaining eggs: 16 - 7 = 9 eggs\n4. Calculate the total amount of money made at the farmers' market:\n - Price per egg: $2\n - Number of eggs sold: 9\n - Total money made: 9 * $2 = $18\nTherefore, Janet makes $\\boxed{18}$ dollars every day at the farmers' market."
}
]
processed_data = [prepare_input(d["problem"], d["response"], tokenizer=tokenizer, step_token="\n") for d in datas]
input_ids, steps, reward_flags = zip(*processed_data)
model = PRM_MODEL.from_pretrained(prm_model_path, device_map="auto").eval()
input_ids, attention_mask, reward_flags = prepare_batch_input_for_model(input_ids, reward_flags, tokenizer.pad_token_id)
_, _, rewards = model(input_ids=input_ids, attention_mask=attention_mask, return_probs=True)
step_rewards = derive_step_rewards(rewards, reward_flags)
print("step_rewards:",step_rewards[0])
print("step_rewards:",step_rewards[1])
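Once per-step rewards are available, one natural follow-up is to locate the step the PRM trusts least. The sketch below is illustrative only: it assumes each entry of step_rewards is a flat list of floats, one per reasoning step, which this README does not state explicitly. The helper name weakest_step and the sample values are hypothetical.

```python
from typing import List, Tuple


def weakest_step(step_rewards: List[float]) -> Tuple[int, float]:
    """Return (index, reward) of the lowest-scoring step in one response."""
    if not step_rewards:
        raise ValueError("step_rewards must contain at least one step")
    # Ties resolve to the earliest step, since min() keeps the first minimum.
    index = min(range(len(step_rewards)), key=step_rewards.__getitem__)
    return index, step_rewards[index]


# Hypothetical rewards for a four-step response (not real model output).
example_rewards = [0.92, 0.88, 0.31, 0.75]
index, reward = weakest_step(example_rewards)
print(f"lowest-scoring step: {index} (reward={reward:.2f})")
```

With real output, the same helper could be applied to step_rewards[0] or step_rewards[1], provided they have the assumed shape.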
vLLM server inference
- Install vLLM and the vLLM PRM plugin
pip install vllm==v0.6.4.post1
git clone https://github.com/SkyworkAI/skywork-o1-prm-inference.git
cd skywork-o1-prm-inference
pip install -e .
- Start the vLLM server
CUDA_VISIBLE_DEVICES=0,1,2,3 vllm serve /path/to/prm_model \
--host 0.0.0.0 \
--port 8081 \
--tensor-parallel-size 4 \
--gpu-memory-utilization 0.9 \
--enable-prefix-caching \
--dtype auto
- Send inference requests to the server
from openai import OpenAI
from transformers import AutoTokenizer
from model_utils.io_utils import prepare_input, derive_step_rewards_vllm
prm_model_path = "/path/to/prm_model"
tokenizer = AutoTokenizer.from_pretrained(prm_model_path, trust_remote_code=True)
datas = [
{
"problem" : "Janet\u2019s ducks lay 16 eggs per day. She eats three for breakfast every morning and bakes muffins for her friends every day with four. She sells the remainder at the farmers' market daily for $2 per fresh duck egg. How much in dollars does she make every day at the farmers' market?",
"response" : "To determine how much money Janet makes every day at the farmers' market, we need to follow these steps:\n1. **Calculate the total number of eggs laid by the ducks per day.**\n Janet's ducks lay 16 eggs per day.\n2. **Determine the number of eggs Janet uses each day.**\n - She eats 3 eggs for breakfast every morning.\n - She bakes muffins for her friends every day with 4 eggs.\n So, the total number of eggs used per day is:\n \\[\n 3 + 4 = 7 \\text{ eggs}\n \\]\n3. **Calculate the number of eggs Janet sells at the farmers' market each day.**\n Subtract the number of eggs used from the total number of eggs laid:\n \\[\n 16 - 7 = 9 \\text{ eggs}\n \\]\n4. **Determine how much money Janet makes from selling the eggs.**\n She sells each egg for $2, so the total amount of money she makes is:\n \\[\n 9 \\times 2 = 18 \\text{ dollars}\n \\]\nTherefore, the amount of money Janet makes every day at the farmers' market is $\\boxed{18}$."
},
{
"problem" : "Janet\u2019s ducks lay 16 eggs per day. She eats three for breakfast every morning and bakes muffins for her friends every day with four. She sells the remainder at the farmers' market daily for $2 per fresh duck egg. How much in dollars does she make every day at the farmers' market?",
"response" : "To find out how much Janet makes every day at the farmers' market, we need to follow these steps:\n1. Calculate the total number of eggs laid by the ducks: \n - Janet's ducks lay 16 eggs per day.\n2. Calculate the total number of eggs consumed by Janet and used for baking:\n - Janet eats 3 eggs for breakfast.\n - She bakes 4 eggs for muffins.\n - Total eggs used: 3 (breakfast) + 4 (baking) = 7 eggs.\n3. Calculate the remaining number of eggs for sale:\n - Total eggs laid: 16\n - Eggs used: 7\n - Remaining eggs: 16 - 7 = 9 eggs\n4. Calculate the total amount of money made at the farmers' market:\n - Price per egg: $2\n - Number of eggs sold: 9\n - Total money made: 9 * $2 = $18\nTherefore, Janet makes $\\boxed{18}$ dollars every day at the farmers' market."
}
]
# Preprocess the data
processed_data = [prepare_input(d["problem"], d["response"], tokenizer=tokenizer, step_token="\n") for d in datas]
input_ids, steps, reward_flags = zip(*processed_data)
openai_api_key = "EMPTY"
openai_api_base = "http://localhost:8081/v1"
client = OpenAI(
# defaults to os.environ.get("OPENAI_API_KEY")
api_key=openai_api_key,
base_url=openai_api_base,
)
models = client.models.list()
model = models.data[0].id
rewards = client.embeddings.create(
input=input_ids,
model=model,
)
step_rewards = derive_step_rewards_vllm(rewards, reward_flags)
print("step_rewards:",step_rewards[0])
print("step_rewards:",step_rewards[1])
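Loading the model can take a while after vllm serve starts, so it may help to confirm the server is ready before sending requests. The sketch below polls the OpenAI-compatible /v1/models endpoint, the same one that client.models.list() queries above. The helper names fetch_model_ids and wait_for_server, the retry count, and the delay are assumptions made for illustration and are not part of the Skywork repository.

```python
import json
import time
import urllib.error
import urllib.request
from typing import Callable, List


def fetch_model_ids(base_url: str, timeout: float = 5.0) -> List[str]:
    """Query <base_url>/models and return the ids of the served models."""
    with urllib.request.urlopen(f"{base_url}/models", timeout=timeout) as resp:
        payload = json.load(resp)
    return [entry["id"] for entry in payload["data"]]


def wait_for_server(
    base_url: str,
    fetch: Callable[[str], List[str]] = fetch_model_ids,
    retries: int = 30,
    delay: float = 2.0,
) -> List[str]:
    """Retry until the server answers, then return the served model ids."""
    for _ in range(retries):
        try:
            return fetch(base_url)
        except (urllib.error.URLError, ConnectionError, TimeoutError):
            # Server not reachable yet; wait and try again.
            time.sleep(delay)
    raise RuntimeError(f"server at {base_url} not ready after {retries} attempts")


# Usage with the server started above (requires it to be running):
# model_ids = wait_for_server("http://localhost:8081/v1")
```

The fetch parameter is injectable so the retry logic can be exercised without a running server.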
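✨ Key Features
- Slow thinking and reasoning: The Skywork o1 Open model series incorporates o1-like slow thinking and reasoning capabilities, and its outputs exhibit innate thinking, planning, and reflection.
- Multiple models: The series includes three models, Skywork o1 Open-Llama-3.1-8B, Skywork o1 Open-PRM-Qwen-2.5-1.5B, and Skywork o1 Open-PRM-Qwen-2.5-7B, to cover different use cases.
- Improved reasoning: The models show significant gains in reasoning on standard benchmarks.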
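📚 Documentation
Model information
The Skywork-o1-Open-PRM series is trained on Qwen2.5-Math-1.5B-Instruct and Qwen2.5-Math-7B-Instruct.
PRM evaluation
Evaluation settings
Math evaluation
We used the evaluation scripts from Qwen2.5-Math and followed their configuration to ensure consistency. The selected datasets are GSM8K, MATH, GaoKao, CN-Middle School 24, OlympiadBench, AMC-23, and AIME-24. GaoKao and CN-Middle School 24 are Chinese datasets; the rest are English. OlympiadBench, AIME-24, and AMC-23 are competition-level datasets.
Code evaluation
For code evaluation, we adopted the evaluation scripts from Qwen2.5-Coder and largely kept the same configuration. The selected datasets are HumanEval, MBPP, and LiveCodeBench; for LiveCodeBench we used version 2024.01-2024.11. Because earlier versions of evalplus had issues with tests and code sanitization, we used its latest version (0.3.1).
Evaluation base models
We evaluated the RMs on three base models: Qwen2.5-7B-Instruct, Llama3.1-8B-Instruct, and Skywork-o1-Open-8B. Responses were sampled from each base model to verify RM performance across different models. The sampling temperature was 0.7 for math problems and 1.0 for code tasks.
Compared RMs
- Qwen2.5-Math-RM-72B: an open-source ORM provided by the Qwen team.
- OpenR-MATH-psa-PRM-7B: an open-source PRM from the OpenR project.
- RLHFlow-Deepseek-Data-PRM-8B: an open-source PRM from the RLHFlow project.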
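Evaluation metrics
- Greedy Sampling Pass@1: generate a single answer with greedy decoding.
- Majority Voting@64: randomly sample 64 answers and take the most frequent one as the final answer.
- Best-of-N@64: rank the 64 answers by the output of the reward model (RM). The weighting differs between ORMs and PRMs:
  - For an ORM, only the reward of the final step is used.
  - For a PRM, the average reward across all steps is used.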
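The selection rules above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the evaluation code used for the reported results: the function names are hypothetical, each candidate is assumed to come with a list of per-step rewards, and the three sample answers and their rewards are invented for the example.

```python
from collections import Counter
from statistics import mean
from typing import Callable, List, Sequence


def orm_score(step_rewards: Sequence[float]) -> float:
    """ORM-style score: only the reward of the final step counts."""
    return step_rewards[-1]


def prm_score(step_rewards: Sequence[float]) -> float:
    """PRM-style score: average reward over all steps."""
    return mean(step_rewards)


def best_of_n(
    answers: List[str],
    rewards: List[Sequence[float]],
    score_fn: Callable[[Sequence[float]], float],
) -> str:
    """Return the answer whose reward sequence scores highest."""
    scores = [score_fn(r) for r in rewards]
    best = max(range(len(answers)), key=scores.__getitem__)
    return answers[best]


def majority_vote(answers: List[str]) -> str:
    """Return the most frequent answer among the samples."""
    return Counter(answers).most_common(1)[0][0]


# Invented candidates: three sampled answers with per-step rewards.
answers = ["18", "20", "18"]
rewards = [[0.9, 0.9, 0.6], [0.5, 0.6, 0.95], [0.7, 0.7, 0.7]]

print("majority vote:", majority_vote(answers))
print("best-of-N (ORM-style):", best_of_n(answers, rewards, orm_score))
print("best-of-N (PRM-style):", best_of_n(answers, rewards, prm_score))
```

In this toy case the two aggregation rules disagree: the final-step rule prefers the candidate with a strong last step, while the averaging rule prefers the candidate that is consistently well scored.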
Evaluation results
Math evaluation
Reward model | Method | GSM8K | MATH | GaoKao | CN-Middle School 24 | OlympiadBench | AIME-24 | AMC-23 | Avg. |
---|---|---|---|---|---|---|---|---|---|
N/A | Greedy Sampling Pass@1 | 91.6 | 78.1 | 63.6 | 67.3 | 43.1 | 13.3 | 55.0 | 58.9 |
N/A | Majority Voting@64 | 93.9 | 84.3 | 69.5 | 73.3 | 50.4 | 16.7 | 52.5 | 62.9 |
OpenR-MATH-psa-PRM-7B | Best-of-N@64 | 95.1 | 82.7 | 67.1 | 70.3 | 47.6 | 20.0 | 57.5 | 62.9 |
RLHFlow-Deepseek-Data-PRM-8B | Best-of-N@64 | 94.4 | 80.1 | 59.1 | 74.3 | 43.0 | 20.0 | 50.0 | 60.1 |
Qwen2.5-Math-RM-72B | Best-of-N@64 | 96.1 | 86.9 | 76.0 | 76.2 | 53.3 | 26.7 | 65.0 | 68.6 |
Skywork-o1-Open-PRM-1.5B | Best-of-N@64 | 94.5 | 85.0 | 65.6 | 73.3 | 49.9 | 16.7 | 62.5 | 63.9 |
Skywork-o1-Open-PRM-7B | Best-of-N@64 | 96.7 | 87.0 | 70.3 | 76.2 | 52.3 | 23.3 | 65.0 | 67.3 |
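The last column of the table appears to be the unweighted mean of the seven benchmark scores. The short check below recomputes it for three rows copied from the table; the interpretation of the column as a simple mean is an inference from the numbers, not something the table states.

```python
from statistics import mean

# (seven benchmark scores, reported average), copied from the table above.
rows = {
    "Greedy Sampling Pass@1": ([91.6, 78.1, 63.6, 67.3, 43.1, 13.3, 55.0], 58.9),
    "Skywork-o1-Open-PRM-1.5B": ([94.5, 85.0, 65.6, 73.3, 49.9, 16.7, 62.5], 63.9),
    "Skywork-o1-Open-PRM-7B": ([96.7, 87.0, 70.3, 76.2, 52.3, 23.3, 65.0], 67.3),
}

for name, (scores, reported) in rows.items():
    computed = round(mean(scores), 1)
    print(f"{name}: computed {computed}, reported {reported}")
```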
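Code evaluation
Because the compared PRMs were not trained on code-related tasks, this section reports only the performance of Skywork-o1-Open-PRM.
Reward model | Method | MBPP | MBPP+ | HumanEval | HumanEval+ | LiveCodeBench (2024.01-2024.11) |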
---|---|---|---|---|---|---|
N/A | Greedy Sampling Pass@1 | 79.9 | 65.9 | 82.9 | 78.7 | 26.0 |
Skywork-o1-Open-PRM-7B | Best-of-N@64 | 81.2 | 68.5 | 81.1 | 74.4 | 31.3 |
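🔧 Technical Details
TODO
- Add more results for step-wise RM evaluation methods.
- Adjust the RM architecture to improve compatibility with vLLM/sglang inference.
- Expand RM use cases by incorporating more types of reasoning tasks.
- Mitigate performance conflicts across different reasoning tasks.
📄 License
Community use of Skywork models is subject to the Skywork Community License. Skywork models support commercial use. If you plan to use Skywork models or their derivatives for commercial purposes, you must comply with the terms and conditions of the Skywork Community License.
Disclaimer
We hereby declare that Skywork models must not be used for any activity that threatens national or societal security or for any unlawful purpose. We also ask users not to deploy Skywork models in internet services without appropriate security reviews and records. We hope all users will observe this principle so that technological progress takes place in a regulated and lawful environment.
We have done our utmost to ensure that the data used in model training is compliant. However, despite these efforts, unforeseen risks and issues may still exist because of the complexity of the model and data. Therefore, we assume no responsibility for any problems arising from the use of the Skywork open-source models, including but not limited to data security issues, public opinion risks, or any risks and issues caused by the model being misled, abused, disseminated, or improperly used.
Citation
If you find our work helpful, please cite us using the following BibTeX entry: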
@misc{skyworkopeno12024,
title={Skywork-o1 Open Series},
author={He, Jujie and Wei, Tianwen and Yan, Rui and Liu, Jiacai and Wang, Chaojie and Gan, Yimeng and Tu, Shiwen and Liu, Chris Yuhao and Zeng, Liang and Wang, Xiaokun and Wang, Boyang and Li, Yongcong and Zhang, Fuxiang and Xu, Jiacheng and An, Bo and Liu, Yang and Zhou, Yahui},
year={2024},
month={November},
howpublished={\url{https://huggingface.co/Skywork}},
url={https://huggingface.co/Skywork},
}
Contact
If you have any questions, please feel free to reach out to us at {jujie.he, jiacai.liu}@kunlun-inc.com.



