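🚀 Math Reward Model (Mistral-7B)
This project is the process reward model (mistral-7b) used in Math-Shepherd. Given a question and a step-by-step solution, the model outputs logits, which can be post-processed into a score for each step.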
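🚀 Quick Start
Input format
The input is a question followed by a step-by-step solution in which each step ends with the special step tag ки, for example: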
Janet’s ducks lay 16 eggs per day. She eats three for breakfast every morning and bakes .... ? Step 1: Janet's ducks lay 16 eggs per day. ки
Step 2: She eats three for breakfast every morning, so she has 16 - 3 = 13 eggs left. ки
Step 3: She bakes muffins for her friends every day with four eggs, so she has 13 - 4 = 9 eggs left. ки
Step 4: She sells the remainder at the farmers' market daily for $2 per fresh duck egg, so she makes 9 * $2 = $18 every day at the farmers' market. The answer is: 18 ки
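The input format above can be assembled programmatically. The following is a minimal sketch, not part of the original project: `format_for_prm` is a hypothetical helper, and it assumes each step is numbered as "Step k:", closed with the step tag, and separated from the next step by a newline, as in the example above.

```python
STEP_TAG = "ки"  # special step tag described in the input format


def format_for_prm(question, steps):
    """Join a question and a list of step texts into one input string.

    Hypothetical helper: adds a "Step k: " prefix to every step and
    closes each step with the step tag.
    """
    tagged = [
        f"Step {i}: {step.strip()} {STEP_TAG}"
        for i, step in enumerate(steps, start=1)
    ]
    # Question and solution are separated by a single space,
    # steps are separated by newlines.
    return f"{question.strip()} " + "\n".join(tagged)


example = format_for_prm(
    "What is 2 + 3?",
    ["Add the two numbers: 2 + 3 = 5.", "The answer is: 5"],
)
print(example)
```

The number of step tags in the string determines how many scores can be read back after post-processing.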
Output format
The output is logits; you need to post-process them to obtain a score for each step.
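To make the post-processing concrete, here is a small sketch in plain Python that runs without the model. It assumes the score of a step is the softmax probability of the '+' token against the '-' token, read at the position of each step tag. The function name `step_scores_from_logits`, the token id 99, and all logit values are made up for illustration.

```python
import math


def step_scores_from_logits(token_ids, good_logits, bad_logits, step_tag_id):
    """Return one score per step tag position.

    Each score is the two-way softmax probability of the 'good' token,
    computed from the per-position logits of the 'good' and 'bad' tokens.
    """
    scores = []
    for tok, g, b in zip(token_ids, good_logits, bad_logits):
        if tok != step_tag_id:
            continue  # only positions holding the step tag carry a score
        m = max(g, b)  # subtract the max for numerical stability
        eg, eb = math.exp(g - m), math.exp(b - m)
        scores.append(eg / (eg + eb))
    return scores


# Made-up values: 99 stands in for the step tag's token id.
token_ids = [5, 7, 99, 8, 99]
good_logits = [0.0, 0.0, 2.0, 0.0, -1.0]
bad_logits = [0.0, 0.0, 0.0, 0.0, 1.0]
print(step_scores_from_logits(token_ids, good_logits, bad_logits, step_tag_id=99))
```

A score close to 1 means the '+' token dominates at that step; a score close to 0 means the '-' token does.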
💻 Usage Example
Basic usage
from transformers import AutoTokenizer
from transformers import AutoModelForCausalLM
import torch
good_token = '+'
bad_token = '-'
step_tag = 'ки'
tokenizer = AutoTokenizer.from_pretrained('peiyi9979/math-shepherd-mistral-7b-prm')
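# Token ids of '+' and '-'; [1:] drops the leading BOS token
candidate_tokens = tokenizer.encode(f"{good_token} {bad_token}")[1:]
# Token id of the step tag (last id of its encoding)
step_tag_id = tokenizer.encode(f"{step_tag}")[-1]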
model = AutoModelForCausalLM.from_pretrained('peiyi9979/math-shepherd-mistral-7b-prm').eval()
question = """Janet’s ducks lay 16 eggs per day. She eats three for breakfast every morning and bakes muffins for her friends every day with four. She sells the remainder at the farmers' market daily for $2 per fresh duck egg. How much in dollars does she make every day at the farmers' market?"""
output1 = """Step 1: Janet's ducks lay 16 eggs per day. ки
Step 2: She eats three for breakfast every morning, so she has 16 - 3 = 13 eggs left. ки
Step 3: She bakes muffins for her friends every day with four eggs, so she has 13 - 4 = 9 eggs left. ки
Step 4: She sells the remainder at the farmers' market daily for $2 per fresh duck egg, so she makes 9 * $2 = $18 every day at the farmers' market. The answer is: 18 ки"""
output2 = """Step 1: Janet's ducks lay 16 eggs per day. ки
Step 2: She eats three for breakfast every morning, so she has 16 - 3 = 13 eggs left. ки
Step 3: She bakes muffins for her friends every day with four eggs, so she has 13 - 4 = 9 eggs left. ки
Step 4: She sells the remainder at the farmers' market daily for $2 per fresh duck egg, so she makes 9 * $2 = $17 every day at the farmers' market. The answer is: 17 ки"""
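for output in [output1, output2]:
    # The question and the tagged solution are joined with a single space
    input_for_prm = f"{question} {output}"
    input_id = torch.tensor([tokenizer.encode(input_for_prm)])

    with torch.no_grad():
        # Keep only the logits of the two candidate tokens ('+', '-')
        logits = model(input_id).logits[:, :, candidate_tokens]
        # Softmax over the two candidates; index 0 is the probability of '+'
        scores = logits.softmax(dim=-1)[:, :, 0]
        # Read the score at each step-tag position
        step_scores = scores[input_id == step_tag_id]
        print(step_scores)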
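One possible way to use the per-step scores is to locate the first step the model considers doubtful. The sketch below is an illustration only, not something the project prescribes: `first_low_step` is a hypothetical helper, the 0.5 threshold is an arbitrary assumption, and the scores are made-up floats (a tensor such as `step_scores` can be converted with `.tolist()`).

```python
def first_low_step(step_scores, threshold=0.5):
    """Return the 1-based index of the first step scoring below threshold.

    Returns None when every step is at or above the threshold.
    The default threshold is an arbitrary choice for illustration.
    """
    for i, score in enumerate(step_scores, start=1):
        if score < threshold:
            return i
    return None


# Made-up scores for illustration only.
print(first_low_step([0.9, 0.8, 0.3, 0.95]))
print(first_low_step([0.9, 0.9]))
```

The threshold would need tuning on real data before being relied on.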