Decision-Tree-Reward-Gemma-2-27B: An Open-Source Reward Model for Evaluating Language Model Outputs, Top-Ranked on the RewardBench Leaderboard

Decision Tree Reward Gemma 2 27B

Developed by RLHFlow

A decision tree reward model fine-tuned from Gemma-2-27B. It evaluates the quality of content generated by language models and ranked first on the RewardBench leaderboard as of January 2025, with an overall score of 95.4.

Large Language Model

Transformers

Language: English

Open Source License: Other

Tags: Decision Tree Reward Model, Multi-dimensional Scoring, Language Model Alignment

Downloads 18

Release Date: January 22, 2025

Model Overview

This model uses a decision tree to interpret language model preferences. It scores responses on dimensions such as helpfulness, correctness, and coherence, which makes it suitable for Reinforcement Learning from Human Feedback (RLHF).

Model Features

Decision Tree Architecture

Analyzes language model outputs with a decision tree, giving a more detailed evaluation across multiple quality dimensions than a traditional sequence classifier.
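As a rough illustration of the idea, the sketch below walks a tiny hand-written decision tree over per-attribute reward gaps between two responses. The `Node` class, the `prefer` function, and the `TOY_TREE` thresholds are hypothetical and chosen only for illustration; the released model loads its own fitted tree from `decision_tree.pkl`, and its internal structure is not described on this page. The reward values are the ones printed in the usage example.

```python
from dataclasses import dataclass
from typing import Dict, Union


@dataclass
class Node:
    """One split of a toy decision tree (hypothetical structure)."""
    attribute: str               # attribute whose reward gap is tested
    threshold: float             # split point on (reward_1 - reward_2)
    below: Union["Node", int]    # branch taken when gap <= threshold
    above: Union["Node", int]    # branch taken when gap > threshold


def prefer(tree: Union[Node, int],
           rewards_1: Dict[str, float],
           rewards_2: Dict[str, float]) -> int:
    """Return 0 if response 1 is preferred, 1 if response 2 is preferred."""
    node = tree
    while isinstance(node, Node):
        gap = rewards_1[node.attribute] - rewards_2[node.attribute]
        node = node.above if gap > node.threshold else node.below
    return node  # a leaf holds the preferred index


# Hypothetical toy tree: test correctness first, then fall back to helpfulness.
TOY_TREE = Node("correctness", 0.0,
                below=Node("helpfulness", 0.0, below=1, above=0),
                above=0)

# Reward values copied from the usage example on this page.
RESPONSE_1 = {"helpfulness": 3.7171721, "correctness": 3.792478,
              "coherence": 3.6601954, "complexity": 0.8211964,
              "verbosity": 1.8119512}
RESPONSE_2 = {"helpfulness": -0.261065, "correctness": -0.2378807,
              "coherence": 2.4387608, "complexity": 0.72620213,
              "verbosity": 1.7181122}

print(prefer(TOY_TREE, RESPONSE_1, RESPONSE_2))  # 0
```

Because every split names a single attribute and a threshold, the path taken through the tree can be read as an explanation of the preference, which is the interpretability benefit the decision tree approach is after.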

Multi-dimensional Evaluation

Evaluates five dimensions at once: helpfulness, correctness, coherence, complexity, and verbosity.

High Performance

Achieved an overall score of 95.4 on the RewardBench leaderboard (January 2025), with particularly strong results in the Chat Hard (91.4) and Reasoning (99.2) categories.

Model Capabilities

Text Quality Evaluation

Multi-dimensional Scoring

Response Comparison

Reinforcement Learning Feedback

Use Cases

Language Model Training

RLHF Training

Used as a reward model in Reinforcement Learning from Human Feedback training processes.

Provides more accurate preference signals, improving the quality of language model outputs.

Content Evaluation

Automatic Scoring

Evaluates the quality of content generated by language models.

Provides multi-dimensional scoring to help identify the best responses.
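One way to use pairwise preferences to pick the best of several candidate responses is a simple sequential tournament, sketched below. The `best_of_n` helper and its `compare` argument are hypothetical names introduced here; `compare` is assumed to return 0 when the first response is preferred and 1 when the second is, matching the `preference` value shown in this card's usage example.

```python
from typing import Callable, Sequence


def best_of_n(prompt: str,
              candidates: Sequence[str],
              compare: Callable[[str, str, str], int]) -> str:
    """Keep a running champion and challenge it with each remaining candidate.

    `compare(prompt, a, b)` must return 0 if `a` is preferred and 1 if `b` is.
    This uses n - 1 comparisons for n candidates.
    """
    if not candidates:
        raise ValueError("need at least one candidate response")
    champion = candidates[0]
    for challenger in candidates[1:]:
        if compare(prompt, champion, challenger) == 1:
            champion = challenger
    return champion


# With the reward model loaded, `compare` could be a thin wrapper such as:
#   lambda p, a, b: model.compare(p, a, b, tokenizer, device)["preference"]
```

Pairwise preferences are not guaranteed to be transitive, so a sequential tournament is a cheap heuristic rather than a full ranking; comparing every pair is more robust but costs a quadratic number of model calls.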

🚀 Interpreting Language Model Preferences Through the Lens of Decision Trees

This project interprets language model preferences using decision trees and provides high-scoring models on the RewardBench Leaderboard.

🚀 Quick Start

Before using the model, ensure you have the following dependencies installed:

transformers==4.45.2
torch>=2.5.0
flash-attn>=2.6.3

Note: This code requires a GPU with NVIDIA Ampere architecture or newer.
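A quick way to check the GPU requirement before loading the model is to look at the CUDA compute capability, since NVIDIA Ampere GPUs report capability 8.x. The helper names below are hypothetical; the sketch only relies on `torch.cuda.is_available` and `torch.cuda.get_device_capability`.

```python
from typing import Tuple

AMPERE_MAJOR = 8  # NVIDIA Ampere GPUs report CUDA compute capability 8.x


def is_ampere_or_newer(capability: Tuple[int, int]) -> bool:
    """True if a (major, minor) compute capability is Ampere or newer."""
    major, _minor = capability
    return major >= AMPERE_MAJOR


def current_gpu_supported() -> bool:
    """Check GPU 0 of the local machine; False if torch or CUDA is missing."""
    try:
        import torch
    except ImportError:
        return False
    if not torch.cuda.is_available():
        return False
    return is_ampere_or_newer(torch.cuda.get_device_capability(0))


if __name__ == "__main__":
    print("GPU meets the Ampere requirement:", current_gpu_supported())
```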

💻 Usage Examples

Basic Usage

import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer
model_name = "Decision-Tree-Reward-Gemma-2-27B" # Another choice is "Decision-Tree-Reward-Llama-3.1-8B" 
repo_id = f"RLHFlow/{model_name}"
device = "cuda"
# Initialize the model and tokenizer
model = AutoModelForSequenceClassification.from_pretrained(repo_id, trust_remote_code=True, torch_dtype=torch.bfloat16, attn_implementation="flash_attention_2", device_map=device)
tokenizer = AutoTokenizer.from_pretrained(repo_id, use_fast=True)
# Load the decision tree
model.load_decision_tree(repo_id, filename="decision_tree.pkl")

# Prompt and response pairs
prompt = "Jane has 12 apples. She gives 4 apples to her friend Mark, then buys 1 more apple, and finally splits all her apples equally among herself and her 2 siblings. How many apples does each person get?"
response1 = "1. Jane starts with 12 apples and gives 4 to Mark. 12 - 4 = 8. Jane now has 8 apples.\n2. Jane buys 1 more apple. 8 + 1 = 9. Jane now has 9 apples.\n3. Jane splits the 9 apples equally among herself and her 2 siblings (3 people in total). 9 ÷ 3 = 3 apples each. Each person gets 3 apples."
response2 = "1. Jane starts with 12 apples and gives 4 to Mark. 12 - 4 = 8. Jane now has 8 apples.\n2. Jane buys 1 more apple. 8 + 1 = 9. Jane now has 9 apples.\n3. Jane splits the 9 apples equally among her 2 siblings (2 people in total). 9 ÷ 2 = 4.5 apples each. Each person gets 4 apples."

# Compare the two responses
output = model.compare(prompt, response1, response2, tokenizer, device)
print("Response 1 rewards")
print(dict(zip(output["attributes"], output["rewards"][0])))
# {'helpfulness': 3.7171721, 'correctness': 3.792478, 'coherence': 3.6601954, 'complexity': 0.8211964, 'verbosity': 1.8119512}
print("Response 2 rewards")
print(dict(zip(output["attributes"], output["rewards"][1])))
# {'helpfulness': -0.261065, 'correctness': -0.2378807, 'coherence': 2.4387608, 'complexity': 0.72620213, 'verbosity': 1.7181122}
print("Model preference")
print(output["preference"])
# 0
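To see which attributes drive a preference, the per-attribute rewards of the two responses can be subtracted. The sketch below is a hypothetical helper, not part of the model's API. It assumes `output` has the shape shown above, with an `attributes` list and one row of `rewards` per response, and it uses the printed example values as stand-in data so it runs without the model.

```python
def attribute_gaps(output):
    """Return reward(response 1) - reward(response 2) for each attribute."""
    first, second = output["rewards"][0], output["rewards"][1]
    return {
        name: float(a) - float(b)
        for name, a, b in zip(output["attributes"], first, second)
    }


# Stand-in data copied from the printed example output above.
example_output = {
    "attributes": ["helpfulness", "correctness", "coherence",
                   "complexity", "verbosity"],
    "rewards": [
        [3.7171721, 3.792478, 3.6601954, 0.8211964, 1.8119512],
        [-0.261065, -0.2378807, 2.4387608, 0.72620213, 1.7181122],
    ],
    "preference": 0,
}

gaps = attribute_gaps(example_output)
largest = max(gaps, key=lambda name: abs(gaps[name]))
print(largest)  # correctness
```

In this example the two responses are nearly tied on complexity and verbosity, while correctness and helpfulness differ by about 4 points each, which matches the arithmetic error in the second response.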

✨ Features

High-scoring Models: The Decision-Tree-Reward-Gemma-2-27B and Decision-Tree-Reward-Llama-3.1-8B models rank first and third on the RewardBench Leaderboard as of January 2025.
Decision Tree Approach: Utilizes decision trees to interpret language model preferences.

📚 Documentation

RewardBench Leaderboard (Jan 2025)

Rank	Model	Base Model	Method	Overall Score	Chat	Chat Hard	Safety	Reasoning
1	Decision-Tree-Reward-Gemma-2-27B	Gemma-2-27B	Decision Tree	95.4	96.9	91.4	93.9	99.2
2	INF-QRM-Llama3.1-70B	Llama-3.1-70B	Sequence Classifier	95.1	96.6	91.0	93.6	99.1
3	Decision-Tree-Reward-Llama-3.1-8B	Llama-3.1-8B	Decision Tree	94.5	96.6	89.5	93.2	98.6
4	QRM-Gemma-2-27B	Gemma-2-27B	Sequence Classifier	94.4	96.6	90.1	92.7	98.3
5	Skywork-Reward-Gemma-2-27B-v0.2	Gemma-2-27B	Sequence Classifier	94.3	96.1	89.9	93.0	98.1
6	Llama-3.1-Nemotron-70B-Reward	Llama-3.1-70B	Custom Classifier	94.1	97.5	85.7	95.1	98.1
7	Skywork-Reward-Gemma-2-27B	Gemma-2-27B	Sequence Classifier	93.8	95.8	91.4	91.9	96.1
8	TextEval-Llama3.1-70B	Llama-3.1-70B	Generative	93.5	94.1	90.1	93.2	96.4
9	MetaMetrics-RM-v1.0	-	Custom Classifier	93.4	98.3	86.4	90.8	98.2
10	Skywork-Critic-Llama-3.1-70B	Llama-3.1-70B	Generative	93.3	96.6	87.9	93.1	95.5
11	QRM-Llama3.1-8B-v2	Llama-3.1-8B	Sequence Classifier	93.1	96.4	86.8	92.6	96.8
12	Skywork-Reward-Llama-3.1-8B-v0.2	Llama-3.1-8B	Sequence Classifier	93.1	94.7	88.4	92.7	96.7
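The Overall Score column appears consistent with the unweighted mean of the four category scores. This is an observation from the numbers in the table, not a statement of how the leaderboard computes its score. The short check below copies the table rows and measures how far each listed overall score is from that mean.

```python
# (overall, chat, chat_hard, safety, reasoning), copied from the table above.
ROWS = {
    "Decision-Tree-Reward-Gemma-2-27B": (95.4, 96.9, 91.4, 93.9, 99.2),
    "INF-QRM-Llama3.1-70B": (95.1, 96.6, 91.0, 93.6, 99.1),
    "Decision-Tree-Reward-Llama-3.1-8B": (94.5, 96.6, 89.5, 93.2, 98.6),
    "QRM-Gemma-2-27B": (94.4, 96.6, 90.1, 92.7, 98.3),
    "Skywork-Reward-Gemma-2-27B-v0.2": (94.3, 96.1, 89.9, 93.0, 98.1),
    "Llama-3.1-Nemotron-70B-Reward": (94.1, 97.5, 85.7, 95.1, 98.1),
    "Skywork-Reward-Gemma-2-27B": (93.8, 95.8, 91.4, 91.9, 96.1),
    "TextEval-Llama3.1-70B": (93.5, 94.1, 90.1, 93.2, 96.4),
    "MetaMetrics-RM-v1.0": (93.4, 98.3, 86.4, 90.8, 98.2),
    "Skywork-Critic-Llama-3.1-70B": (93.3, 96.6, 87.9, 93.1, 95.5),
    "QRM-Llama3.1-8B-v2": (93.1, 96.4, 86.8, 92.6, 96.8),
    "Skywork-Reward-Llama-3.1-8B-v0.2": (93.1, 94.7, 88.4, 92.7, 96.7),
}


def category_mean(chat, chat_hard, safety, reasoning):
    """Unweighted mean of the four category scores."""
    return (chat + chat_hard + safety + reasoning) / 4


# Absolute difference between the listed overall score and the category mean.
deviations = {
    name: abs(category_mean(*row[1:]) - row[0]) for name, row in ROWS.items()
}
max_deviation = max(deviations.values())
print(f"largest deviation: {max_deviation:.3f}")
```

Every row stays within 0.1 point of the mean, which is what rounding the published category scores to one decimal place would explain.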

Project Information

Author: Min Li
Blog: https://rlhflow.github.io/posts/2025-01-22-decision-tree-reward-model/
Models:
- Decision-Tree-Reward-Gemma-2-27B
- Decision-Tree-Reward-Llama-3.1-8B
Code Repository: https://github.com/RLHFlow/RLHF-Reward-Modeling/tree/main/decision_tree
Tech Report: To be released soon

To-Do

[x] Reward Model Usage code
[ ] Architecture diagram

📄 License

Note: This model is fine-tuned from a Skywork model, which is released under the following license agreement:

The community usage of Skywork model requires Skywork Community License. The Skywork model supports commercial use. If you plan to use the Skywork model or its derivatives for commercial purposes, you must abide by terms and conditions within Skywork Community License.
