# RLHF Reward Model

Prometheus 7b V2.0

Prometheus 2 is a language model based on Mistral-Instruct, designed for fine-grained evaluation of model outputs and for use in reinforcement learning from human feedback (RLHF). It is intended as an alternative to GPT-4-based evaluation.

Large Language Model

Transformers, English

prometheus-eval

Hh Rlhf Rm Open Llama 3b

A reward model trained with the LMFlow framework on the helpful subset of the HH-RLHF dataset, using open_llama_3b as the base model. It is described as generalizing well.

Large Language Model

ToxicityModel is a fine-tuned RoBERTa model designed to assess the toxicity level of English sentences.

Text Classification

Transformers, English

© 2025 AIbase