AIbase

# RLHF reward model

RM Mistral 7B
A reward model based on Mistral-7B that scores response quality for Reinforcement Learning from Human Feedback (RLHF).
Large Language Model, Transformers
weqweasdas
552
22
RM Gemma 2B
A reward model based on google/gemma-2b-it for evaluating the quality of generated text.
Large Language Model, Transformers
weqweasdas
2,618
25
Gpt2 Large Helpful Reward Model
MIT
A GPT-2 Large model trained on the helpfulness portion of the Anthropic/hh-rlhf dataset, designed for detecting helpful responses or for use as a reward model in Reinforcement Learning from Human Feedback (RLHF).
Large Language Model, Transformers
Ray2333
2,935
11
Prometheus 13b V1.0
Apache-2.0
Prometheus is an evaluation-focused language model fine-tuned from Llama-2-Chat. It assesses text quality against custom criteria and is positioned as a cost-effective alternative to GPT-4 evaluation.
Large Language Model, Transformers, English
prometheus-eval
1,726
139
© 2025 AIbase