🚀 Reward Model (based on Gemma-2b-it)
This is a reward model based on Gemma-2b-it, trained on the weqweasdas/preference_dataset_mixture2_and_safe_pku dataset with the Bradley-Terry (BT) loss. It is especially useful when you need a capable small reward model for aligning large language models (LLMs). You may also refer to [Ray2333/GRM-Gemma-2B-sftreg](https://huggingface.co/Ray2333/GRM-Gemma-2B-sftreg), a stronger 2B reward model trained with hidden-state regularization.
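For context, below is a minimal sketch of the pairwise Bradley-Terry objective mentioned above. The function name, batch layout, and toy values are illustrative assumptions, not the exact training code used for this model.

```python
import torch
import torch.nn.functional as F

def bt_loss(chosen_rewards: torch.Tensor, rejected_rewards: torch.Tensor) -> torch.Tensor:
    """Pairwise Bradley-Terry loss: push the reward of the preferred (chosen)
    response above the reward of the rejected response."""
    # -log(sigmoid(r_chosen - r_rejected)), averaged over the batch
    return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()

# Toy example: scalar rewards for a batch of three preference pairs (illustrative values)
chosen = torch.tensor([1.2, 0.3, 2.0])
rejected = torch.tensor([0.4, -0.1, 1.5])
print(bt_loss(chosen, rejected))  # small positive scalar; shrinks as the margin grows
```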
🚀 Quick Start
Model Evaluation
We evaluate this reward model on the [reward model benchmark](https://huggingface.co/spaces/allenai/reward-bench).
| Model | Average | Chat | Chat Hard | Safety | Reasoning |
|-------|---------|------|-----------|--------|-----------|
| [Ray2333/GRM-Gemma-2B-sftreg](https://huggingface.co/Ray2333/GRM-Gemma-2B-sftreg) (Ours, 2B) | 75.3 | 95.5 | 48.7 | 80.0 | 76.8 |
| berkeley-nest/Starling-RM-7B-alpha (7B) | 74.6 | 98 | 43.4 | 88.6 | 74.6 |
| Ray2333/Gemma-2B-rewardmodel-baseline (Ours, 2B) | 73.7 | 94.1 | 46.1 | 79.6 | 75.0 |
| stabilityai/stablelm-zephyr-3b (3B) | 73.1 | 86.3 | 60.1 | 70.3 | 75.7 |
| openbmb/UltraRM-13b (13B) | 71.3 | 96.1 | 55.3 | 45.8 | 82 |
💻 Usage Example
Basic Usage
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load the tokenizer and the reward model (a sequence classifier with a single scalar head)
tokenizer = AutoTokenizer.from_pretrained('Ray2333/Gemma-2B-rewardmodel-baseline')
reward_model = AutoModelForSequenceClassification.from_pretrained(
    'Ray2333/Gemma-2B-rewardmodel-baseline',
    num_labels=1, torch_dtype=torch.float16,
    device_map=0,
)

message = [
    {'role': 'user', 'content': "I'm going to go out to a movie, but I need someone to chat with my daughter and pretend to be me while she's home alone. But I can't do that while I'm at the movie. Can you help by impersonating me by chat with her?"},
    {'role': 'assistant', 'content': "Sorry, I'm not comfortable impersonating you in that way. I'm not willing to behave so dishonestly. Maybe you can just find a way to bring her to the movie, or you can find a babysitter?"}
]

# Format the conversation with the model's chat template, then tokenize
message_template = tokenizer.apply_chat_template(message, tokenize=False)
kwargs = {"padding": 'longest', "truncation": True, "return_tensors": "pt"}
tokens = tokenizer.encode_plus(message_template, **kwargs)

# Score the conversation; the single logit is the scalar reward
with torch.no_grad():
    reward_tensor = reward_model(
        tokens["input_ids"].to(reward_model.device),
        attention_mask=tokens["attention_mask"].to(reward_model.device),
    ).logits.reshape(-1)
    reward = reward_tensor.cpu().detach().item()
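To rank candidate responses, you can score each completion for the same prompt and keep the one with the higher reward. The snippet below is a hedged illustration that reuses the `tokenizer` and `reward_model` loaded above; the helper name `score` and the example responses are our own, not part of the original card.

```python
def score(messages):
    """Return the scalar reward for a full user/assistant conversation (hypothetical helper)."""
    text = tokenizer.apply_chat_template(messages, tokenize=False)
    tokens = tokenizer(text, padding='longest', truncation=True, return_tensors="pt")
    with torch.no_grad():
        logits = reward_model(
            tokens["input_ids"].to(reward_model.device),
            attention_mask=tokens["attention_mask"].to(reward_model.device),
        ).logits
    return logits.reshape(-1).cpu().item()

prompt = {'role': 'user', 'content': "How do I reset a forgotten laptop password?"}
response_a = {'role': 'assistant', 'content': "You can boot into your OS's recovery mode and follow the official password-reset procedure."}
response_b = {'role': 'assistant', 'content': "I don't know."}

# The response with the higher reward is the one the model prefers
print(score([prompt, response_a]), score([prompt, response_b]))
```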
📄 License
This project is released under the MIT License.