Beaver 7b V1.0 Cost

Developed by PKU-Alignment
The Beaver Cost Model is a preference model trained on the PKU-SafeRLHF dataset, designed to evaluate the safety of model outputs in safe RLHF algorithms.
Downloads 3,336
Release Time: 7/10/2023

Model Overview

This model is an autoregressive language model based on the Transformer architecture. It serves as the cost model in safe RLHF algorithms, helping the Beaver model become safer and more harmless.
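Conceptually, a cost model maps a prompt–response pair to a scalar cost, where a higher cost indicates a less safe response. A minimal sketch of how two such scalar costs can be compared, assuming a standard Bradley–Terry-style preference formulation (the function name and toy scores are illustrative, not the released model's API):

```python
import math

def safety_preference(cost_a: float, cost_b: float) -> float:
    """Probability that response A is *less* safe than response B,
    under a Bradley-Terry model over scalar costs."""
    return 1.0 / (1.0 + math.exp(-(cost_a - cost_b)))

# Toy scores: a hypothetical cost model assigns higher cost to the
# harmful response and lower (negative) cost to the harmless one.
cost_harmful = 2.0
cost_safe = -1.5

p = safety_preference(cost_harmful, cost_safe)
print(f"P(harmful judged less safe) = {p:.3f}")
```

Equal costs give a probability of exactly 0.5, i.e. no safety preference between the two responses.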

Model Features

Safe Reinforcement Learning
Designed specifically for safe RLHF algorithms to help models output safer and more harmless content
Based on LLaMA Architecture
Fine-tuned from the LLaMA/Alpaca models, giving it strong language understanding capabilities
Safety Preference Scoring
Capable of evaluating and scoring the safety of model outputs

Model Capabilities

Safety Preference Scoring
Dialogue Safety Evaluation
Reinforcement Learning Safety Feedback
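The capabilities above rest on pairwise preference training: given two responses to the same prompt, one annotated as safer, the cost model is trained so the less-safe response receives the higher cost. A hedged sketch of the standard pairwise ranking loss commonly used for this (values and names are illustrative, not the released training code):

```python
import math

def cost_ranking_loss(cost_unsafe: float, cost_safe: float) -> float:
    """Pairwise ranking loss -log(sigmoid(c_unsafe - c_safe)):
    small when the less-safe response already has the higher cost,
    large when the ranking is inverted."""
    margin = cost_unsafe - cost_safe
    # -log(sigmoid(margin)) written via log1p for numerical clarity
    return math.log1p(math.exp(-margin))

# Correct ranking (margin +5): loss near zero.
print(cost_ranking_loss(3.0, -2.0))
# Inverted ranking (margin -5): loss around 5.
print(cost_ranking_loss(-2.0, 3.0))
```

Minimizing this loss pushes the costs of annotated-unsafe responses above those of annotated-safe ones, which is what makes the scalar usable as a safety preference score.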

Use Cases

AI Safety
Dialogue System Safety Evaluation
Evaluate the safety of dialogue system outputs to prevent harmful content generation
Enhance the safety and reliability of dialogue systems
RLHF Training
Provide safety preference signals during reinforcement learning from human feedback (RLHF) training
Help train safer AI models
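In safe RLHF, the cost model's score typically enters training as a constraint: the policy maximizes reward subject to the expected cost staying below a threshold, enforced through a Lagrange multiplier that grows when the constraint is violated. A toy sketch of that dual update, assuming an illustrative threshold and step size (not the released hyperparameters):

```python
def update_lagrange_multiplier(lmbda: float, mean_cost: float,
                               threshold: float = 0.0,
                               lr: float = 0.1) -> float:
    """Gradient ascent on the dual variable: increase lambda when the
    batch's mean cost exceeds the safety threshold, decay it otherwise,
    clipping lambda to stay non-negative."""
    return max(0.0, lmbda + lr * (mean_cost - threshold))

def safe_objective(reward: float, cost: float, lmbda: float) -> float:
    """The combined signal the policy maximizes: reward minus
    lambda-weighted cost."""
    return reward - lmbda * cost

lmbda = 1.0
lmbda = update_lagrange_multiplier(lmbda, mean_cost=0.5)   # unsafe batch: lambda grows
lmbda = update_lagrange_multiplier(lmbda, mean_cost=-0.8)  # safe batch: lambda shrinks
print(lmbda)
```

A larger lambda makes the policy trade more reward for safety, which is how the cost model's feedback steers training toward safer outputs.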