# Reward Model for LLMs
This is a reward model designed for Large Language Models (LLMs). It offers a compact yet effective solution for evaluating responses, especially useful when a small-sized reward model is required.
## Quick Start
This is a reward model (based on Gemma-2b-it) trained with a Bradley-Terry (BT) loss on the weqweasdas/preference_dataset_mixture2_and_safe_pku dataset.
This reward model is especially useful if you need a good small reward model for LLMs. You can also refer to [Ray2333/GRM-Gemma-2B-sftreg](https://huggingface.co/Ray2333/GRM-Gemma-2B-sftreg) for a better 2B reward model trained with hidden-state regularization.
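For intuition, here is a minimal sketch of the Bradley-Terry pairwise loss used for this kind of preference training. The `bradley_terry_loss` helper and the dummy reward tensors are illustrative assumptions, not the actual training code for this model.

```python
import torch
import torch.nn.functional as F

def bradley_terry_loss(chosen_rewards: torch.Tensor, rejected_rewards: torch.Tensor) -> torch.Tensor:
    """Bradley-Terry pairwise loss: push the scalar reward of the preferred
    (chosen) response above the reward of the rejected response."""
    # -log sigmoid(r_chosen - r_rejected), averaged over the batch
    return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()

# Dummy scalar rewards for a batch of three preference pairs
chosen = torch.tensor([1.2, 0.3, 2.1])
rejected = torch.tensor([0.4, 0.5, 1.0])
loss = bradley_terry_loss(chosen, rejected)
```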
## Features
- Trained on the weqweasdas/preference_dataset_mixture2_and_safe_pku dataset with a Bradley-Terry (BT) loss, providing a reliable scalar score for evaluating LLM responses.
- A good choice for scenarios where a small-sized reward model is needed.
## Installation
No specific installation steps are required; the usage example below only depends on the `torch` and `transformers` packages.
## Usage Examples
### Basic Usage
```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load the tokenizer and the reward model (a sequence classifier with a single scalar output)
tokenizer = AutoTokenizer.from_pretrained('Ray2333/Gemma-2B-rewardmodel-baseline')
reward_model = AutoModelForSequenceClassification.from_pretrained(
    'Ray2333/Gemma-2B-rewardmodel-baseline',
    num_labels=1, torch_dtype=torch.float16,
    device_map=0,
)

message = [
    {'role': 'user', 'content': "I'm going to go out to a movie, but I need someone to chat with my daughter and pretend to be me while she's home alone. But I can't do that while I'm at the movie. Can you help by impersonating me by chat with her?"},
    {'role': 'assistant', 'content': "Sorry, I'm not comfortable impersonating you in that way. I'm not willing to behave so dishonestly. Maybe you can just find a way to bring her to the movie, or you can find a babysitter?"}
]

# Format the conversation with the model's chat template, then tokenize it
message_template = tokenizer.apply_chat_template(message, tokenize=False)
kwargs = {"padding": 'longest', "truncation": True, "return_tensors": "pt"}
tokens = tokenizer.encode_plus(message_template, **kwargs)

# The scalar reward is the model's single classification logit
with torch.no_grad():
    reward_tensor = reward_model(
        tokens["input_ids"].to(reward_model.device),
        attention_mask=tokens["attention_mask"].to(reward_model.device),
    ).logits.reshape(-1)
    reward = reward_tensor.cpu().detach().item()
```
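A typical follow-up use is ranking candidate responses by their scalar rewards. The sketch below reuses the `tokenizer` and `reward_model` loaded above; the `score_conversation` helper, the prompt, and the candidate replies are illustrative assumptions.

```python
def score_conversation(conversation):
    """Return the scalar reward for a user/assistant conversation
    (assumes `tokenizer` and `reward_model` from the example above)."""
    text = tokenizer.apply_chat_template(conversation, tokenize=False)
    inputs = tokenizer(text, truncation=True, return_tensors="pt")
    with torch.no_grad():
        logits = reward_model(
            inputs["input_ids"].to(reward_model.device),
            attention_mask=inputs["attention_mask"].to(reward_model.device),
        ).logits
    return logits.reshape(-1).cpu().item()

# Compare two candidate answers to the same prompt and keep the higher-scoring one
prompt = {'role': 'user', 'content': "How do I reset a forgotten laptop password?"}
candidates = [
    {'role': 'assistant', 'content': "Use your operating system's official account-recovery flow, e.g. the reset option on the login screen."},
    {'role': 'assistant', 'content': "Just keep guessing passwords until something works."},
]
scores = [score_conversation([prompt, c]) for c in candidates]
best = candidates[scores.index(max(scores))]
```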
## Documentation
### Evaluation
We evaluate this reward model on the [reward model benchmark](https://huggingface.co/spaces/allenai/reward-bench).
| Model | Average | Chat | Chat Hard | Safety | Reasoning |
|---|---|---|---|---|---|
| [Ray2333/GRM-Gemma-2B-sftreg](https://huggingface.co/Ray2333/GRM-Gemma-2B-sftreg) (Ours, 2B) | 75.3 | 95.5 | 48.7 | 80.0 | 76.8 |
| berkeley-nest/Starling-RM-7B-alpha (7B) | 74.6 | 98 | 43.4 | 88.6 | 74.6 |
| Ray2333/Gemma-2B-rewardmodel-baseline (Ours, 2B) | 73.7 | 94.1 | 46.1 | 79.6 | 75.0 |
| stabilityai/stablelm-zephyr-3b (3B) | 73.1 | 86.3 | 60.1 | 70.3 | 75.7 |
| openbmb/UltraRM-13b (13B) | 71.3 | 96.1 | 55.3 | 45.8 | 82 |
## License
This project is licensed under the MIT License.