Reward Model DeBERTa V3 Base
A reward model trained on human feedback data that predicts which answers humans prefer
Downloads 1,193
Release Time: 1/15/2023
Model Overview
This reward model is trained to predict which generated answer humans judge better for a given question. It is suitable for evaluating question-answering models and for reward scoring in reinforcement learning from human feedback (RLHF).
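As a minimal sketch of how such a reward model can be queried, the example below assumes the checkpoint is published on Hugging Face as `OpenAssistant/reward-model-deberta-v3-base` (the repository id is an assumption based on the model name) and scores a single question-answer pair with the standard `transformers` sequence-classification API:

```python
# Minimal sketch: score one (question, answer) pair with the reward model.
# Assumes the checkpoint id "OpenAssistant/reward-model-deberta-v3-base";
# substitute the actual repository id if it differs.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_name = "OpenAssistant/reward-model-deberta-v3-base"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
model.eval()

question = "Explain nuclear fusion like I am five."
answer = "Fusion is when two small atoms squeeze together into one bigger atom and let out energy."

# The tokenizer encodes the pair as a single sequence; the model emits one
# scalar logit, which serves as the preference score (higher = better).
inputs = tokenizer(question, answer, return_tensors="pt")
with torch.no_grad():
    score = model(**inputs).logits[0].item()
print(f"reward score: {score:.3f}")
```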
Model Features
Human Feedback Training
The model is trained on human preference comparisons, so its scores track which answers humans prefer (see the loss sketch after this list)
Multi-dataset Training
Trained on multiple preference datasets, including webgpt_comparisons, summarize_from_feedback, and synthetic-instruct-gptj-pairwise
Cross-domain Applicability
Suitable for scoring outputs across text generation tasks such as question answering and summarization
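To make the training signal concrete, here is a hedged sketch of the pairwise ranking loss commonly used to train reward models from human comparisons: for each (chosen, rejected) answer pair, the model is pushed to score the chosen answer higher. The card does not document the exact training objective, so treat this as the standard recipe rather than this model's confirmed setup.

```python
# Sketch of the pairwise preference loss commonly used for reward models:
# loss = -log(sigmoid(r_chosen - r_rejected)).
import torch
import torch.nn.functional as F

def pairwise_loss(chosen_scores: torch.Tensor, rejected_scores: torch.Tensor) -> torch.Tensor:
    # Encourage the score of the human-preferred answer to exceed the
    # score of the rejected answer.
    return -F.logsigmoid(chosen_scores - rejected_scores).mean()

# Toy example with hypothetical scores for a batch of three comparisons.
chosen = torch.tensor([1.2, 0.3, 2.0])
rejected = torch.tensor([0.1, 0.5, -1.0])
print(pairwise_loss(chosen, rejected))  # small when chosen consistently beats rejected
```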
Model Capabilities
Answer Quality Evaluation
Text Generation Scoring
Reinforcement Learning Reward Calculation
Use Cases
Question-Answering Systems
Question-Answering Model Evaluation
Evaluate the quality of answers generated by different question-answering models (see the comparison sketch below)
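A hedged sketch of this evaluation flow, reusing the `model` and `tokenizer` loaded in the overview example: score the same question against each candidate answer and rank by reward. The candidate answers and model names are purely illustrative.

```python
# Sketch: rank candidate answers from different QA models by reward score.
# Reuses `model` and `tokenizer` from the loading example above.
import torch

def reward(question: str, answer: str) -> float:
    inputs = tokenizer(question, answer, return_tensors="pt")
    with torch.no_grad():
        return model(**inputs).logits[0].item()

question = "What causes the seasons on Earth?"
candidates = {
    "model_a": "The tilt of Earth's axis changes how directly sunlight hits each hemisphere.",
    "model_b": "The Earth gets closer to the sun in summer.",  # a weaker answer
}
scores = {name: reward(question, ans) for name, ans in candidates.items()}
for name, s in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(name, round(s, 3))
```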
Reinforcement Learning
RLHF Reward Model
Serves as the reward function in reinforcement learning from human feedback (see the sketch below)
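As a final hedged sketch, the same scoring call can back a reward function for an RLHF trainer. The loop below reuses the `reward` helper defined above; the wiring is purely illustrative and does not target any specific RL library.

```python
# Sketch: use the reward model's scalar output as the RLHF reward signal.
# `reward` is the scoring helper defined above; the loop shown here is a
# placeholder, not a real RL library API.
prompts = ["Summarize the benefits of unit testing."]

for prompt in prompts:
    # In a real setup the policy model being trained generates this response.
    response = "Unit tests catch regressions early and document intended behavior."
    r = reward(prompt, response)
    # An RLHF algorithm such as PPO would use `r` to update the policy's weights.
    print(f"reward for policy response: {r:.3f}")
```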