🚀 Interpreting Language Model Preferences with Decision Trees
This project interprets language model preferences through the lens of decision trees, offering a new perspective and new tools for evaluating and optimizing language models.
🚀 Quick Start
Before using the model, make sure the following dependencies are installed:
```
transformers==4.45.2
torch>=2.5.0
flash-attn>=2.6.3
```
Note: this code requires a GPU with the NVIDIA Ampere architecture or newer.
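If you are unsure whether your GPU meets this requirement, the following minimal sketch checks the CUDA compute capability with PyTorch; Ampere and newer devices report a major version of 8 or above, which is what `flash_attention_2` and bfloat16 inference expect here.

```python
import torch

# Ampere GPUs (e.g., A100, RTX 30xx) report compute capability 8.x or higher.
if not torch.cuda.is_available():
    raise RuntimeError("No CUDA device detected; this model requires an NVIDIA GPU.")

major, minor = torch.cuda.get_device_capability(0)
print(f"Detected compute capability {major}.{minor}")
if major < 8:
    print("Warning: this GPU predates Ampere; flash_attention_2 may not be supported.")
```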
💻 Usage Examples
Basic Usage
```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch

model_name = "Decision-Tree-Reward-Gemma-2-27B"
repo_id = f"RLHFlow/{model_name}"
device = "cuda"

# Load the reward model; trust_remote_code is required for the custom model class.
model = AutoModelForSequenceClassification.from_pretrained(
    repo_id,
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
    attn_implementation="flash_attention_2",
    device_map=device,
)
tokenizer = AutoTokenizer.from_pretrained(repo_id, use_fast=True)

# Load the decision tree that turns per-attribute rewards into a preference.
model.load_decision_tree(repo_id, filename="decision_tree.pkl")

# Compare two candidate responses to the same prompt.
prompt = "Jane has 12 apples. She gives 4 apples to her friend Mark, then buys 1 more apple, and finally splits all her apples equally among herself and her 2 siblings. How many apples does each person get?"
response1 = "1. Jane starts with 12 apples and gives 4 to Mark. 12 - 4 = 8. Jane now has 8 apples.\n2. Jane buys 1 more apple. 8 + 1 = 9. Jane now has 9 apples.\n3. Jane splits the 9 apples equally among herself and her 2 siblings (3 people in total). 9 ÷ 3 = 3 apples each. Each person gets 3 apples."
response2 = "1. Jane starts with 12 apples and gives 4 to Mark. 12 - 4 = 8. Jane now has 8 apples.\n2. Jane buys 1 more apple. 8 + 1 = 9. Jane now has 9 apples.\n3. Jane splits the 9 apples equally among her 2 siblings (2 people in total). 9 ÷ 2 = 4.5 apples each. Each person gets 4 apples."
output = model.compare(prompt, response1, response2, tokenizer, device)

# Per-attribute rewards for each response, plus the decision tree's preference.
print("Response 1 rewards")
print(dict(zip(output["attributes"], output["rewards"][0])))
print("Response 2 rewards")
print(dict(zip(output["attributes"], output["rewards"][1])))
print("Model preference")
print(output["preference"])
```
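To compare responses across more than one prompt, a small loop over `model.compare` is enough. The sketch below is a hypothetical extension of the example above, not part of the original card: it reuses `prompt`, `response1`, and `response2` from the previous block, assumes `output["attributes"]` and `output["rewards"]` have exactly the structure shown in the print statements, and `eval_pairs` is made-up illustrative data.

```python
from collections import defaultdict

# Hypothetical evaluation set: (prompt, response_a, response_b) triples.
eval_pairs = [
    (prompt, response1, response2),
    # ... append more triples here
]

# Accumulate per-attribute rewards separately for the two response slots.
totals = {0: defaultdict(float), 1: defaultdict(float)}
for p, ra, rb in eval_pairs:
    out = model.compare(p, ra, rb, tokenizer, device)
    for slot in (0, 1):
        for attr, reward in zip(out["attributes"], out["rewards"][slot]):
            totals[slot][attr] += float(reward)

n = len(eval_pairs)
for slot, label in ((0, "Response A"), (1, "Response B")):
    print(label, "average rewards:", {attr: total / n for attr, total in totals[slot].items()})
```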
✨ Key Features
- Model rankings: shows how different models rank on the RewardBench leaderboard (January 2025), including overall scores and scores on each category.
- Code examples: detailed usage examples make it easy to get started with response comparison and reward evaluation.
📚 Documentation
Author Information
- Author: Min Li
- Blog: https://rlhflow.github.io/posts/2025-01-22-decision-tree-reward-model/
Model Information
Code Repository
- Repository: https://github.com/RLHFlow/RLHF-Reward-Modeling/tree/main/decision_tree
Technical Report
RewardBench Leaderboard (January 2025)
| Rank | Model | Base Model | Method | Overall Score | Chat | Chat Hard | Safety | Reasoning |
|------|-------|------------|--------|---------------|------|-----------|--------|-----------|
| 1 | Decision-Tree-Reward-Gemma-2-27B | Gemma-2-27B | Decision Tree | 95.4 | 96.9 | 91.4 | 93.9 | 99.2 |
| 2 | INF-QRM-Llama3.1-70B | Llama-3.1-70B | Sequence Classifier | 95.1 | 96.6 | 91.0 | 93.6 | 99.1 |
| 3 | Decision-Tree-Reward-Llama-3.1-8B | Llama-3.1-8B | Decision Tree | 94.5 | 96.6 | 89.5 | 93.2 | 98.6 |
| 4 | QRM-Gemma-2-27B | Gemma-2-27B | Sequence Classifier | 94.4 | 96.6 | 90.1 | 92.7 | 98.3 |
| 5 | Skywork-Reward-Gemma-2-27B-v0.2 | Gemma-2-27B | Sequence Classifier | 94.3 | 96.1 | 89.9 | 93.0 | 98.1 |
| 6 | Llama-3.1-Nemotron-70B-Reward | Llama-3.1-70B | Custom Classifier | 94.1 | 97.5 | 85.7 | 95.1 | 98.1 |
| 7 | Skywork-Reward-Gemma-2-27B | Gemma-2-27B | Sequence Classifier | 93.8 | 95.8 | 91.4 | 91.9 | 96.1 |
| 8 | TextEval-Llama3.1-70B | Llama-3.1-70B | Generative | 93.5 | 94.1 | 90.1 | 93.2 | 96.4 |
| 9 | MetaMetrics-RM-v1.0 | - | Custom Classifier | 93.4 | 98.3 | 86.4 | 90.8 | 98.2 |
| 10 | Skywork-Critic-Llama-3.1-70B | Llama-3.1-70B | Generative | 93.3 | 96.6 | 87.9 | 93.1 | 95.5 |
| 11 | QRM-Llama3.1-8B-v2 | Llama-3.1-8B | Sequence Classifier | 93.1 | 96.4 | 86.8 | 92.6 | 96.8 |
| 12 | Skywork-Reward-Llama-3.1-8B-v0.2 | Llama-3.1-8B | Sequence Classifier | 93.1 | 94.7 | 88.4 | 92.7 | 96.7 |
📄 License
⚠️ Important Notice
This model is fine-tuned from a Skywork model, so its use must follow the Skywork Community License: community use of Skywork models is governed by that license, and Skywork models do support commercial use. If you plan to use a Skywork model or any derivative of it for commercial purposes, you must comply with the terms and conditions of the Skywork Community License.
📋 TODO