Japanese GPT-NeoX 3.6B Instruction PPO

Developed by rinna
A 3.6 billion parameter Japanese GPT-NeoX model trained with Reinforcement Learning from Human Feedback (RLHF), capable of better following instructions in conversations.
Downloads 3,062
Release Time: 5/30/2023

Model Overview

This model is based on rinna/japanese-gpt-neox-3.6b-instruction-sft-v2 and was further trained with Proximal Policy Optimization (PPO), a reinforcement learning method, to improve instruction following. It is suited to Japanese dialogue generation tasks.

Model Features

Reinforcement Learning Optimization
Trained with PPO reinforcement learning, achieving a 47% win rate in human evaluations compared to the SFT version
Japanese Instruction Optimization
Specifically optimized for understanding and generating Japanese instructions
Dialogue Format Support
Supports user-system dialogue format input, suitable for building dialogue systems
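
The user-system dialogue format mentioned above can be illustrated with a small prompt builder. This is a minimal sketch, not the model's documented API: the speaker labels "ユーザー" and "システム", the "<NL>" turn separator, and the helper name build_prompt are all assumptions made for illustration, since this overview does not spell out the exact format. Check the model's own card before relying on them.

```python
from typing import Dict, List

# Assumed conventions (not stated in this overview):
# speaker labels for the two roles and a special token that separates turns.
USER_LABEL = "ユーザー"
SYSTEM_LABEL = "システム"
TURN_SEPARATOR = "<NL>"


def build_prompt(turns: List[Dict[str, str]]) -> str:
    """Join dialogue turns into one prompt that ends with an open system turn.

    Each turn is a dict with "speaker" and "text" keys (hypothetical schema).
    """
    lines = [f"{turn['speaker']}: {turn['text']}" for turn in turns]
    # The trailing "<system label>: " invites the model to write the next reply.
    return TURN_SEPARATOR.join(lines) + TURN_SEPARATOR + f"{SYSTEM_LABEL}: "


if __name__ == "__main__":
    dialogue = [
        {"speaker": USER_LABEL, "text": "こんにちは。"},
        {"speaker": SYSTEM_LABEL, "text": "こんにちは。ご用件をどうぞ。"},
        {"speaker": USER_LABEL, "text": "おすすめの本を教えてください。"},
    ]
    print(build_prompt(dialogue))
```

The resulting string would then be tokenized and passed to the model for generation. Keeping the prompt assembly in one function makes it easy to adjust the labels or separator if the actual format differs.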

Model Capabilities

Japanese text generation
Instruction understanding and response
Dialogue system construction

Use Cases

Dialogue Systems
Customer Service Dialogue System
Used to build Japanese customer service dialogue systems
Capable of understanding user queries and providing relevant answers
Personal Assistant
Development of Japanese personal digital assistants
Capable of understanding and executing user instructions
Content Generation
Japanese Content Creation
Generating Japanese articles, stories, and other content
Capable of generating coherent Japanese text