Japanese GPT-NeoX 3.6B Instruction PPO
MIT
A 3.6-billion-parameter Japanese GPT-NeoX model trained with Reinforcement Learning from Human Feedback (RLHF) so that it follows instructions more closely in conversations.
Large Language Model
Transformers, Supports Multiple Languages