🚀 Japanese GPT-1B
This repository provides a 1.3-billion-parameter Japanese GPT model trained by rinna Co., Ltd. It is intended for Japanese text generation tasks in natural language processing.
🚀 Quick Start
The following is a basic guide on how to use the model:
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the tokenizer and model
tokenizer = AutoTokenizer.from_pretrained("rinna/japanese-gpt-1b", use_fast=False)
model = AutoModelForCausalLM.from_pretrained("rinna/japanese-gpt-1b")

# Move the model to GPU if one is available
if torch.cuda.is_available():
    model = model.to("cuda")

text = "西田幾多郎は、"
token_ids = tokenizer.encode(text, add_special_tokens=False, return_tensors="pt")

# Generate 100 tokens with top-k / top-p sampling
with torch.no_grad():
    output_ids = model.generate(
        token_ids.to(model.device),
        max_length=100,
        min_length=100,
        do_sample=True,
        top_k=500,
        top_p=0.95,
        pad_token_id=tokenizer.pad_token_id,
        bos_token_id=tokenizer.bos_token_id,
        eos_token_id=tokenizer.eos_token_id,
        bad_words_ids=[[tokenizer.unk_token_id]]
    )

output = tokenizer.decode(output_ids.tolist()[0])
print(output)
```
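If GPU memory is limited, the model can also be loaded in half precision via the standard `torch_dtype` argument of `from_pretrained`. This is a minimal sketch of that variation; the exact memory savings and any effect on output quality are not documented for this model, so treat it as an optional assumption rather than the recommended setup:

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# Assumption: loading in float16 roughly halves GPU memory use; output quality
# in half precision is not documented for this model.
tokenizer = AutoTokenizer.from_pretrained("rinna/japanese-gpt-1b", use_fast=False)
model = AutoModelForCausalLM.from_pretrained(
    "rinna/japanese-gpt-1b",
    torch_dtype=torch.float16,  # half precision to reduce memory footprint
)
if torch.cuda.is_available():
    model = model.to("cuda")
```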
✨ Features
- Powerful Text Generation: Capable of generating high-quality Japanese text.
- Transformer-Based: Built on a 24-layer, 2048-hidden-size transformer architecture.
📦 Installation
The code examples in the quick start section assume you have installed the necessary libraries. You can install them using the following command:
```bash
pip install torch transformers
```
📚 Documentation
Model Architecture
The model is a 24-layer, 2048-hidden-size transformer-based language model.
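The layer count and hidden size can be checked directly from the model configuration. This is a minimal sketch; the exact attribute names (`n_layer`/`n_embd` vs. `num_hidden_layers`/`hidden_size`) depend on which config class the Hub resolves for this checkpoint, so the fallbacks below are an assumption:

```python
from transformers import AutoConfig

config = AutoConfig.from_pretrained("rinna/japanese-gpt-1b")

# GPT-2-style configs expose n_layer / n_embd; other configs use
# num_hidden_layers / hidden_size, hence the fallbacks.
num_layers = getattr(config, "n_layer", getattr(config, "num_hidden_layers", None))
hidden_size = getattr(config, "n_embd", getattr(config, "hidden_size", None))
print(f"layers={num_layers}, hidden_size={hidden_size}")  # expected: 24, 2048
```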
Training
The model was trained on Japanese C4, [Japanese CC-100](http://data.statmt.org/cc-100/ja.txt.xz), and Japanese Wikipedia to optimize a traditional language modelling objective. It reaches around 14 perplexity on a validation set sampled from the same data.
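For reference, perplexity on any text can be computed from the model's language-modelling loss. The following is a minimal sketch of that calculation; the sentence below is an arbitrary example, and the result will not match the reported validation figure, which was measured on rinna's own held-out data:

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("rinna/japanese-gpt-1b", use_fast=False)
model = AutoModelForCausalLM.from_pretrained("rinna/japanese-gpt-1b")
model.eval()

text = "西田幾多郎は、日本の哲学者である。"  # arbitrary example sentence
input_ids = tokenizer.encode(text, add_special_tokens=False, return_tensors="pt")

with torch.no_grad():
    # Passing labels makes the model return the average cross-entropy loss
    loss = model(input_ids, labels=input_ids).loss

print(f"perplexity = {torch.exp(loss).item():.2f}")
```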
Tokenization
The model uses a sentencepiece-based tokenizer. The vocabulary was first trained on a selected subset from the training data using the official sentencepiece training script, and then augmented with emojis and symbols.
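A quick way to inspect the SentencePiece segmentation is to tokenize a sample string and look at the resulting pieces. This is a small illustration only; the exact subword pieces produced for any given input depend on the trained vocabulary and are not guaranteed:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("rinna/japanese-gpt-1b", use_fast=False)

text = "西田幾多郎は、"
pieces = tokenizer.tokenize(text)            # SentencePiece subword pieces
ids = tokenizer.convert_tokens_to_ids(pieces)

print(pieces)                                # subword segmentation of the input
print(tokenizer.decode(ids))                 # round-trips back to the original text
```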
Release Date
January 26, 2022
How to Cite
```bibtex
@misc{rinna-japanese-gpt-1b,
    title = {rinna/japanese-gpt-1b},
    author = {Zhao, Tianyu and Sawada, Kei},
    url = {https://huggingface.co/rinna/japanese-gpt-1b}
}

@inproceedings{sawada2024release,
    title = {Release of Pre-Trained Models for the {J}apanese Language},
    author = {Sawada, Kei and Zhao, Tianyu and Shing, Makoto and Mitsui, Kentaro and Kaga, Akio and Hono, Yukiya and Wakatsuki, Toshiaki and Mitsuda, Koh},
    booktitle = {Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)},
    month = {5},
    year = {2024},
    pages = {13898--13905},
    url = {https://aclanthology.org/2024.lrec-main.1213},
    note = {\url{https://arxiv.org/abs/2404.01657}}
}
```
📄 License
This project is licensed under the MIT License.
Additional Information
| Property | Details |
|---|---|
| Model Type | Japanese GPT |
| Training Data | Japanese CC-100, Wikipedia, C4 |
| License | MIT |
| Tags | GPT, Text Generation, LM, NLP |