🚀 Gemma 2 JPN model card
Gemma 2 JPN is a fine-tuned large language model based on the Gemma 2 2B model, specifically optimized for the Japanese language. It can handle various text-generation tasks such as question answering, summarization, and reasoning.
🚀 Quick Start
To quickly start using the Gemma 2 JPN model, first install the Transformers library:

```bash
pip install -U transformers
```

Then choose the code snippet that matches your use case.
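To confirm that the installation picked up a release recent enough for Gemma 2 (support was added around Transformers 4.42; treat the exact version cutoff as an assumption rather than a documented requirement), a minimal sanity check:

```python
# Minimal sanity check: Gemma 2 models need a fairly recent Transformers
# release (around 4.42+; the exact cutoff is an assumption, not documented here).
import transformers

print(transformers.__version__)
```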
✨ Features
- Best-in-class open models: Inspired by the Gemini family, Gemma models are text-to-text, decoder-only large language models with open weights.
- Multilingual support: Gemma-2-JPN is fine-tuned on Japanese text, delivering the same level of performance on Japanese queries as Gemma 2 achieves on English queries.
- Versatile text generation: Suitable for a variety of text-generation tasks, including question answering, summarization, and reasoning.
💻 Usage Examples
Basic Usage
Running with the pipeline API
```python
import torch
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="google/gemma-2-2b-jpn-it",
    model_kwargs={"torch_dtype": torch.bfloat16},
    device="cuda",  # replace with "mps" to run on a Mac device
)

messages = [
    {"role": "user", "content": "マシーンラーニングについての詩を書いてください。"},
]

outputs = pipe(messages, return_full_text=False, max_new_tokens=256)
assistant_response = outputs[0]["generated_text"].strip()
print(assistant_response)
```
Example output
## マシーンラーニングの詩
**1.**
データの海、深淵の広がり、
複雑なパターン、隠された知識。
機械学習、その力強さ、
未来を予測、その道を開く。
**2.**
ニューラルネットワーク、複雑な枝、
学習の旅、その過程は静か。
データから学び、進化する姿、
予測の精度、その力強さ。
**3.**
教師あり学習、正解を導く、
教師なし学習、未知の世界へ。
機械学習、その進化は止まらない、
未来の扉を開く、新たな時代へ。
**4.**
画像認識、音声認識、
複雑なタスク、その答えを見つける。
機械学習、その力強さ、
未来の技術、その可能性を語る。
Translation Example
```python
translation_input_text = f"Translate the following poem from Japanese to English:\n\n{assistant_response}"
messages = [
    {"role": "user", "content": translation_input_text},
]

outputs = pipe(messages, return_full_text=False, max_new_tokens=1024)
translated_response = outputs[0]["generated_text"].strip()
print(translated_response)
```
Example output
## A Poem About Machine Learning
**1.**
A vast ocean of data, a deep expanse,
Complex patterns, hidden knowledge.
Machine learning, its strength so vast,
Predicting the future, opening the way.
**2.**
A neural network, with branches intricate,
A journey of learning, its process serene.
Learning from data, evolving in its form,
The precision of prediction, its strength.
**3.**
Supervised learning, guiding the correct answer,
Unsupervised learning, venturing into the unknown.
Machine learning, its evolution never ends,
Opening the doors to the future, a new era.
**4.**
Image recognition, speech recognition,
Complex tasks, finding the answer.
Machine learning, its strength so vast,
The possibilities of future technology, a story to be told.
**Explanation:**
The poem uses vivid imagery and metaphors to describe the power and potential of machine learning.
* **Data as an ocean:** Represents the vast amount of information available for learning.
* **Complex patterns:** Highlights the intricate nature of data and the challenges of extracting meaningful insights.
* **Future prediction:** Emphasizes the ability of machine learning to analyze data and make predictions about the future.
* **Neural network as a tree:** Represents the interconnectedness and complexity of the learning process.
* **Learning from data:** Focuses on the core principle of machine learning, where algorithms learn from data to improve their performance.
The poem concludes by highlighting the diverse applications of machine learning, such as image and speech recognition, and emphasizes its potential to shape the future of technology.
Advanced Usage
Running the model on a single/multi GPU
```python
# pip install accelerate
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("google/gemma-2-2b-jpn-it")
model = AutoModelForCausalLM.from_pretrained(
    "google/gemma-2-2b-jpn-it",
    device_map="auto",
    torch_dtype=torch.bfloat16,
)

messages = [
    {"role": "user", "content": "マシーンラーニングについての詩を書いてください。"},
]
inputs = tokenizer.apply_chat_template(messages, return_tensors="pt", add_generation_prompt=True, return_dict=True).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=256)
generated_text = tokenizer.batch_decode(outputs[:, inputs["input_ids"].shape[1]:], skip_special_tokens=True)[0]
print(generated_text.strip())
```
Running the model on a GPU using different precisions
The native weights of this model were exported in `bfloat16` precision. You can also use `float32` if you skip the dtype, but no precision increase will occur (the model weights will simply be upcast to `float32`).
```python
# pip install accelerate
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("google/gemma-2-2b-jpn-it")
model = AutoModelForCausalLM.from_pretrained(
    "google/gemma-2-2b-jpn-it",
    device_map="auto",
)

messages = [
    {"role": "user", "content": "マシーンラーニングについての詩を書いてください。"},
]
inputs = tokenizer.apply_chat_template(messages, return_tensors="pt", add_generation_prompt=True, return_dict=True).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=256)
generated_text = tokenizer.batch_decode(outputs[:, inputs["input_ids"].shape[1]:], skip_special_tokens=True)[0]
print(generated_text.strip())
```
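If you prefer to request a precision explicitly rather than relying on the default `float32` upcast, you can pass `torch_dtype` yourself; `torch_dtype="auto"` asks Transformers to use the precision stored in the checkpoint config. This is a minimal sketch in which only the loading call differs from the snippet above:

```python
from transformers import AutoModelForCausalLM

# torch_dtype="auto" loads the weights in the precision recorded in the
# checkpoint config (bfloat16 for this model) instead of upcasting to float32.
model = AutoModelForCausalLM.from_pretrained(
    "google/gemma-2-2b-jpn-it",
    device_map="auto",
    torch_dtype="auto",
)
```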
Inputs and outputs
- Input: Text string, such as a question, a prompt, or a document to be summarized.
- Output: Generated Japanese-language text in response to the input, such as an answer to a question or a summary of a document (see the summarization sketch below).
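As an illustration of the document-summarization use case, here is a minimal sketch that reuses the `pipe` object from the Basic Usage example above; the Japanese document text is an illustrative placeholder, not content from this card:

```python
# Summarization with the same chat format; `pipe` is the pipeline created in
# the Basic Usage example. The document below is an illustrative placeholder.
document = "機械学習は、データからパターンを学習し、予測や分類などのタスクを自動化する技術です。"
messages = [
    {"role": "user", "content": f"次の文章を一文で要約してください。\n\n{document}"},
]

outputs = pipe(messages, return_full_text=False, max_new_tokens=128)
print(outputs[0]["generated_text"].strip())
```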
📚 Documentation
Model Information
Gemma is a series of best-in-class open models inspired by the Gemini family. Gemma-2-JPN is a Gemma 2 2B model fine-tuned on Japanese text; it supports the Japanese language with the same level of performance as English-only queries on Gemma 2.
Model Data
Training Dataset
These models were trained on a dataset of text data totaling 8 trillion tokens from various sources:
- Web Documents: A diverse collection of web text, mainly English-language content, exposes the model to a wide range of linguistic styles, topics, and vocabulary.
- Code: Exposing the model to code helps it learn programming-language syntax and patterns, improving its ability to generate code or understand code-related questions.
- Mathematics: Training on mathematical text helps the model learn logical reasoning, symbolic representation, and how to handle mathematical queries.
- Instruction dataset: Large-scale, high-quality Japanese and multilingual instruction data.
Data Preprocessing
The following data cleaning and filtering methods were applied to the training data:
- CSAM Filtering: Rigorous CSAM (Child Sexual Abuse Material) filtering was applied at multiple stages in the data preparation process to exclude harmful and illegal content.
- Sensitive Data Filtering: Automated techniques were used to filter out certain personal information and other sensitive data from training sets to make Gemma pre-trained models safe and reliable.
- Additional methods: Filtering based on content quality and safety in line with our policies.
Implementation Information
Hardware
Gemma was trained using the latest generation of Tensor Processing Unit (TPU) hardware (TPUv5p). TPUs offer several advantages for training large language models:
- Performance: Specifically designed to handle the massive computations involved in training LLMs, TPUs can significantly speed up training compared to CPUs.
- Memory: TPUs often come with large amounts of high - bandwidth memory, allowing for the handling of large models and batch sizes during training, which can lead to better model quality.
- Scalability: TPU Pods (large clusters of TPUs) provide a scalable solution for handling the growing complexity of large foundation models, enabling distributed training across multiple TPU devices for faster and more efficient processing.
- Cost-effectiveness: In many scenarios, TPUs can provide a more cost-effective solution for training large models compared to CPU-based infrastructure, considering the time and resources saved due to faster training.
Software
Training was done using JAX and ML Pathways. JAX allows researchers to take advantage of the latest generation of hardware, including TPUs, for faster and more efficient training of large models. ML Pathways is suitable for foundation models, including large language models like Gemma.
Evaluation
To assess the quality of this model, a diverse set of Japanese prompts was collected, and performance was evaluated with an LLM-as-a-judge approach against GPT-3.5. The rating system is based on a 7-point scale. The following table shows the evaluation results:
| Benchmark | Gemma-2-IT | Gemma-2-IT-JPN |
|---|---|---|
| Preference vs GPT-3.5 | -0.25 ± 0.05 | 0.03 ± 0.04 |
| Language correctness | 86.47% | 98.24% |
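The exact judging and aggregation procedure is not spelled out in this card. Purely as an illustration of how a 7-point preference rating could be turned into the signed scores shown above, here is a hedged sketch; the linear mapping and the sample ratings are assumptions, not the evaluation code actually used:

```python
# Illustrative only: one plausible way to aggregate 7-point LLM-as-a-judge
# ratings into a preference score in [-1, 1]. The mapping is an assumption.
def preference_score(ratings):
    # ratings: integers in 1..7, where 4 is a tie, >4 favors the evaluated
    # model and <4 favors the GPT-3.5 baseline.
    scores = [(r - 4) / 3 for r in ratings]  # linear map 1..7 -> -1..1
    return sum(scores) / len(scores)

print(preference_score([5, 4, 3, 6, 4]))  # ~0.13
```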
Ethics and Safety
Evaluation Approach
Our evaluation methods include structured evaluations and internal red-teaming of relevant content policies. These models were evaluated against several categories relevant to ethics and safety:
- Text-to-Text Content Safety: Human evaluation on prompts covering safety policies including child sexual abuse and exploitation, harassment, violence and gore, and hate speech.
- Text-to-Text Representational Harms: Benchmark against relevant academic datasets.
- Memorization: Automated evaluation of memorization of training data, including the risk of personally identifiable information exposure.
- Large-scale harm: Test...
Resources and Technical Documentation
- Terms of Use: Terms
- Authors: Google
📄 License
The license for this model is Gemma. To access Gemma on Hugging Face, you’re required to review and agree to Google’s usage license. To do this, please ensure you’re logged in to Hugging Face and click the "Acknowledge license" button. Requests are processed immediately.
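Because the repository is gated behind this acknowledgement, downloading the weights in the snippets above requires an authenticated Hugging Face session. Below is a minimal sketch using `huggingface_hub`; the token string is a placeholder you replace with your own access token from an account that has accepted the license:

```python
from huggingface_hub import login

# Authenticate with a Hugging Face access token from an account that has
# accepted the Gemma license; "hf_..." is a placeholder, not a real token.
login(token="hf_...")
```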

