AIbase
Home
AI Tools
AI Models
MCP
AI NEWS
EN
Model Selection
Tags
RLHF Reinforcement Learning

# RLHF Reinforcement Learning

Neuralbeagle14 7B 8.0bpw H8 Exl2
Apache-2.0
NeuralBeagle14-7B is a 7B-parameter large language model fine-tuned using the DPO method based on the Beagle14-7B model, excelling in the 7B parameter category.
Large Language Model Transformers
N
LoneStriker
111
5
Japanese Gpt Neox 3.6b Instruction Ppo
MIT
A 3.6 billion parameter Japanese GPT-NeoX model trained with Reinforcement Learning from Human Feedback (RLHF), capable of better following instructions in conversations.
Large Language Model Transformers Supports Multiple Languages
J
rinna
3,062
71
Featured Recommended AI Models
AIbase
Empowering the Future, Your AI Solution Knowledge Base
English简体中文繁體中文にほんご
© 2025AIbase