🚀 Fugaku-LLM
Fugaku-LLM is a Japanese domestic model pre-trained from scratch on the supercomputer Fugaku. Because it is trained from scratch on our own data, it offers high transparency and safety. The training data consists mainly of Japanese text, so the model delivers excellent performance in Japanese.
This model was developed by the Fugaku-LLM team. Links to the other models can be found in the Fugaku-LLM Model Index below.
🚀 Quick Start
The Fugaku-LLM model offers a reliable and efficient solution for language-related tasks, especially in Japanese. To get started, see the Usage Examples section below.
✨ Features
- High Transparency and Safety: Trained from scratch with proprietary data, ensuring high levels of transparency and safety.
- Excellent Japanese Performance: With training data primarily in Japanese, the model shows outstanding performance in Japanese language tasks.
📦 Installation
The README does not provide specific installation steps. For dependencies, refer to the libraries listed in the Model Details section, such as DeepSpeedFugaku and the llm-jp-tokenizer.
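The usage examples below rely on the standard Hugging Face stack, so a minimal environment sketch is shown here. This is an assumption on our part rather than an official installation procedure; note that `device_map="auto"` additionally requires the `accelerate` package.

```bash
# Assumed setup for the usage examples (not from the original README).
pip install torch transformers accelerate
```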
💻 Usage Examples
Basic Usage
Use the instruction-tuned model
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "Fugaku-LLM/Fugaku-LLM-13B-instruct"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path, torch_dtype=torch.bfloat16, device_map="auto")
model.eval()

# System message: "Below is an instruction that describes a task.
# Write a response that appropriately completes the request."
system_example = "以下は、タスクを説明する指示です。要求を適切に満たす応答を書きなさい。"
# Instruction: "Tell me the origin of the name of the supercomputer 'Fugaku'."
instruction_example = "スーパーコンピュータ「富岳」の名前の由来を教えてください。"
# Fixed prompt template expected by the instruction-tuned model.
prompt = f"{system_example}\n\n### 指示:\n{instruction_example}\n\n### 応答:\n"

input_ids = tokenizer.encode(prompt,
                             add_special_tokens=False,
                             return_tensors="pt")
tokens = model.generate(
    input_ids.to(device=model.device),
    max_new_tokens=128,
    do_sample=True,
    temperature=0.1,
    top_p=1.0,
    repetition_penalty=1.0,
    top_k=0,
)
out = tokenizer.decode(tokens[0], skip_special_tokens=True)
print(out)
```
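The prompt template shown above (a system message, the instruction after `### 指示:`, and an empty response slot after `### 応答:`) is the format the instruction-tuned model expects; build your own prompts the same way.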
Use the base model
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "Fugaku-LLM/Fugaku-LLM-13B"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path, torch_dtype=torch.bfloat16, device_map="auto")
model.eval()

# Plain completion prompt: "The name of the supercomputer 'Fugaku' ..."
prompt = "スーパーコンピュータ「富岳」という名称は"

input_ids = tokenizer.encode(prompt,
                             add_special_tokens=False,
                             return_tensors="pt")
tokens = model.generate(
    input_ids.to(device=model.device),
    max_new_tokens=128,
    do_sample=True,
    temperature=0.1,
    top_p=1.0,
    repetition_penalty=1.0,
    top_k=0,
)
out = tokenizer.decode(tokens[0], skip_special_tokens=True)
print(out)
```
📚 Documentation
Fugaku-LLM Model Index
| Model | Fugaku-LLM | Fugaku-LLM-instruct |
|-------|------------|---------------------|
| 13B   | Link       | Link                |
Model Details
| Property | Details |
|----------|---------|
| Developed by | Fugaku-LLM |
| Model Type | GPT-2 |
| Language(s) | Japanese, English |
| Library | DeepSpeedFugaku |
| Tokenizer | llm-jp-tokenizer, code10k_en20k_ja30k of v2.2 |
| License | Fugaku-LLM Terms of Use |
Model Performance
Instruction-tuned model
We evaluated our model using the Japanese MT benchmark in the same way as the Nejumi LLM Leaderboard Neo. We modified only the following parts of the FastChat code (a sketch of both changes follows this list):
- Added `add_special_tokens=False` when calling the tokenizer for the input prompt.
- Limited the number of generated tokens to less than 2048.
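A minimal sketch of both changes in plain Transformers terms (illustrative only; this is not the actual FastChat diff, and the prompt placeholder is hypothetical):

```python
# Illustrative sketch of the two evaluation-time changes (not the actual FastChat code).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "Fugaku-LLM/Fugaku-LLM-13B-instruct"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path, torch_dtype=torch.bfloat16, device_map="auto")

prompt = "..."  # placeholder: a Japanese MT benchmark question, formatted as in the usage example above

# Change 1: pass add_special_tokens=False when tokenizing the input prompt.
input_ids = tokenizer.encode(prompt, add_special_tokens=False, return_tensors="pt")

# Change 2: keep the number of generated tokens below 2048.
tokens = model.generate(input_ids.to(model.device), max_new_tokens=2047)
print(tokenizer.decode(tokens[0], skip_special_tokens=True))
```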
| Model Name | average | coding | extraction | humanities | math | reasoning | roleplay | stem | writing |
|------------|---------|--------|------------|------------|------|-----------|----------|------|---------|
| Fugaku-LLM-13B-instruct | 5.47 | 2.10 | 4.10 | 9.18 | 2.30 | 3.40 | 8.20 | 7.25 | 7.25 |
Training Datasets
Instruction Tuning
📄 License
The Fugaku-LLM Terms of Use are available in the LICENSE and LICENSE_ja files.
🔧 Technical Details
The README does not provide in-depth technical details about the model, such as its architecture or training algorithms.
⚠️ Important Note
Results produced with Fugaku-LLM may contain falsehoods, biases, content that infringes the rights of others, or content that does not meet the level of effectiveness or usefulness users expect.
💡 Usage Tip
When using Fugaku-LLM, verify the accuracy, legality, and ethical validity of its outputs yourself.
Acknowledgements
This achievement is based on the Government-Initiated Projects of Supercomputer Fugaku "Development of Distributed...".