Gemma-2-Baku-2B-IT Open-source Model - Optimized for Instruction Following, Suitable for Natural Language Processing Tasks

Gemma 2 Baku 2B IT

Developed by rinna

An instruction-tuned model based on Gemma 2 Baku 2B, optimized for instruction following and suited to Japanese natural language processing tasks.

Large Language Model

Transformers

Japanese #Japanese instruction fine-tuning #ORPO optimization #Chat vector enhancement

Downloads 2,555

Release Time: 10/2/2024

Model Overview

This model is obtained by adding a chat vector to Gemma 2 Baku 2B and then applying ORPO. It is tuned for instruction following and supports a range of natural language processing tasks.

Model Features

Instruction fine-tuning

Instruction-tuned from Gemma 2 Baku 2B to improve instruction following.

ORPO optimization

Applies Odds Ratio Preference Optimization (ORPO) to further refine the merged model.

Chat vector

Gains instruction-following ability through chat vector addition.

Model Capabilities

Text generation

Instruction following

Natural language processing

Use Cases

Question answering system

Person information query

Answer questions about specific people, such as 'Who is Nishida Kitaro?'

Generate a detailed description of the person

Dialogue system

Multi-round dialogue

Support continuous dialogue based on context

Generate coherent and contextually appropriate responses

🚀 Gemma 2 Baku 2B Instruct (rinna/gemma-2-baku-2b-it)

This model is an instruction-tuned variant of rinna/gemma-2-baku-2b, offering enhanced conversational capabilities.

🚀 Quick Start

Gemma 2 Baku 2B Instruct (rinna/gemma-2-baku-2b-it) is an instruction-tuned model based on rinna/gemma-2-baku-2b. It is fine-tuned using Chat Vector and Odds Ratio Preference Optimization (ORPO) and follows the gemma-2 chat format.
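
To make the gemma-2 chat format concrete, here is a small hand-rolled formatter. It is a sketch for inspection only: `format_gemma2_chat` is a hypothetical helper, and the turn markers (`<bos>`, `<start_of_turn>`, `<end_of_turn>`, with assistant turns labeled `model`) are assumed from the publicly documented Gemma template rather than taken from this model card. For real prompts, `tokenizer.apply_chat_template` remains the reliable source of the exact string.

```python
def format_gemma2_chat(messages, add_generation_prompt=True):
    """Render chat messages in a Gemma-2-style turn format (illustrative only)."""
    parts = ["<bos>"]
    for message in messages:
        # Assumption: assistant turns are labeled "model"; user turns stay "user".
        role = "model" if message["role"] == "assistant" else message["role"]
        content = message["content"].strip()
        parts.append(f"<start_of_turn>{role}\n{content}<end_of_turn>\n")
    if add_generation_prompt:
        # Open a model turn so generation continues as the assistant.
        parts.append("<start_of_turn>model\n")
    return "".join(parts)


chat = [
    {"role": "user", "content": "西田幾多郎とはどんな人物ですか？"},
]
print(format_gemma2_chat(chat))
```

For multi-round dialogue, earlier assistant replies would be appended to `chat` with the role `assistant` before the next user message, so each new prompt carries the full conversation history.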

✨ Features

Model Information

Property	Details
Thumbnail	https://github.com/rinnakk/japanese-pretrained-models/blob/master/rinna.png
License	gemma
Language	ja
Tags	gemma2, conversational
Base Model	google/gemma-2-2b, google/gemma-2-2b-it, rinna/gemma-2-baku-2b
Base Model Relation	merge
Pipeline Tag	text-generation
Library Name	transformers

Model Size and Tuning

Size	Continual Pre-Training	Instruction-Tuning
2B	Gemma 2 Baku 2B (rinna/gemma-2-baku-2b)	Gemma 2 Baku 2B Instruct (rinna/gemma-2-baku-2b-it)

Model Architecture

A 26-layer, 2304-hidden-size transformer-based language model. For details on the architecture, refer to the Gemma 2 Model Card.

Training

Model merging: The base model was given instruction-following capabilities through a chat vector addition process. The chat vector was obtained by subtracting the parameter vectors of google/gemma-2-2b from google/gemma-2-2b-it, as shown below:

rinna/gemma-2-baku-2b + 1.0 * (google/gemma-2-2b-it - google/gemma-2-2b)

The embedding layer was excluded when subtracting and adding the parameter vectors.
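
The merge formula above can be sketched as a per-parameter operation over state dicts. This is a minimal illustration, not rinna's actual merging script: `add_chat_vector` is a hypothetical helper, and identifying the embedding layer by the substring `embed_tokens` in the parameter name is an assumption. The toy example uses scalar weights so it runs without any model files; with real checkpoints the values would be tensors, and the same arithmetic applies.

```python
def add_chat_vector(target, pretrained, instruct, scale=1.0,
                    skip_substrings=("embed_tokens",)):
    """Return target + scale * (instruct - pretrained) for each non-skipped parameter."""
    merged = {}
    for name, weight in target.items():
        if any(s in name for s in skip_substrings):
            # Excluded parameters (assumed here: embeddings) are kept from the target as-is.
            merged[name] = weight
        else:
            merged[name] = weight + scale * (instruct[name] - pretrained[name])
    return merged


# Toy "state dicts" with scalar weights; real ones would map names to tensors.
baku = {"model.embed_tokens.weight": 0.5, "model.layers.0.mlp.weight": 1.0}
gemma_base = {"model.embed_tokens.weight": 0.1, "model.layers.0.mlp.weight": 2.0}
gemma_it = {"model.embed_tokens.weight": 0.9, "model.layers.0.mlp.weight": 2.25}

merged = add_chat_vector(baku, gemma_base, gemma_it, scale=1.0)
print(merged)
```

In the toy run, the embedding entry keeps the target model's value, while the other parameter shifts by the difference between the instruction-tuned and base Gemma weights.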

ORPO: ORPO was applied using a subset of rinna's internal dataset to further refine the performance of the merged model.
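
For readers unfamiliar with ORPO, the objective described by Hong et al. (see References) adds a weighted odds-ratio penalty to the usual supervised loss on the preferred response. The sketch below computes it for a single preference pair from length-averaged log-probabilities. The function names are hypothetical, and the weight `lam=0.1` is a placeholder: the hyperparameters rinna used are not stated in this card.

```python
import math


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def log_odds(avg_logp):
    """log(p / (1 - p)) for p = exp(avg_logp), where avg_logp < 0."""
    return avg_logp - math.log1p(-math.exp(avg_logp))


def orpo_loss(chosen_avg_logp, rejected_avg_logp, lam=0.1):
    """Supervised loss on the chosen response plus a weighted odds-ratio term."""
    sft = -chosen_avg_logp
    log_odds_ratio = log_odds(chosen_avg_logp) - log_odds(rejected_avg_logp)
    # -log(sigmoid(z)) is the same as softplus(-z).
    odds_ratio_term = softplus(-log_odds_ratio)
    return sft + lam * odds_ratio_term


# Toy pair: the chosen response is far more likely than the rejected one.
loss = orpo_loss(math.log(0.8), math.log(0.2))
print(round(loss, 4))
```

The penalty shrinks as the model assigns higher odds to the chosen response than to the rejected one, which is why no separate reference model is needed.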

Contributors

Release Date

October 3, 2024

📦 Installation

The model card gives no separate installation steps. The usage example below relies on the transformers library and PyTorch.

💻 Usage Examples

Basic Usage

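from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "rinna/gemma-2-baku-2b-it"
dtype = torch.bfloat16

# Load the tokenizer and model; eager attention is chosen per the note below.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="cuda",
    torch_dtype=dtype,
    attn_implementation="eager",
)

# Single-turn conversation: "What kind of person is Nishida Kitaro?"
chat = [
    { "role": "user", "content": "西田幾多郎とはどんな人物ですか？" },
]
# Render the conversation as a prompt string in the gemma-2 chat format.
prompt = tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)

# The templated prompt already carries its special tokens, so none are added here.
input_ids = tokenizer.encode(prompt, add_special_tokens=False, return_tensors="pt").to(model.device)
outputs = model.generate(
    input_ids,
    max_new_tokens=512,
)

# Decode only the newly generated tokens, skipping the prompt.
response = tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True)
print(response)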

⚠️ Important Note

It is recommended to use eager attention when conducting batch inference under bfloat16 precision. Currently, Gemma 2 yields NaN values for input sequences with padding when the default attention mechanism (torch.scaled_dot_product_attention) is employed in conjunction with bfloat16.

📚 Documentation

Tokenization

The model uses the original google/gemma-2-2b-it tokenizer.

How to Cite

@misc{rinna-gemma-2-baku-2b-it,
    title = {rinna/gemma-2-baku-2b-it},
    author = {Chen, Xinqi and Wakatsuki, Toshiaki and Sawada, Kei},
    url = {https://huggingface.co/rinna/gemma-2-baku-2b-it}
}

@inproceedings{sawada2024release,
    title = {Release of Pre-Trained Models for the {J}apanese Language},
    author = {Sawada, Kei and Zhao, Tianyu and Shing, Makoto and Mitsui, Kentaro and Kaga, Akio and Hono, Yukiya and Wakatsuki, Toshiaki and Mitsuda, Koh},
    booktitle = {Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)},
    month = {5},
    year = {2024},
    pages = {13898--13905},
    url = {https://aclanthology.org/2024.lrec-main.1213},
    note = {\url{https://arxiv.org/abs/2404.01657}}
}

References

@article{gemma-2-2024,
    title = {Gemma 2},
    url = {https://www.kaggle.com/models/google/gemma-2},
    publisher = {Kaggle},
    author = {Gemma Team},
    year = {2024}
}

@article{huang2023chat,
    title = {Chat Vector: A Simple Approach to Equip LLMs with Instruction Following and Model Alignment in New Languages},
    author = {Huang, Shih-Cheng and Li, Pin-Zu and Hsu, Yu-Chi and Chen, Kuang-Ming and Lin, Yu Tung and Hsiao, Shih-Kai and Tzong-Han Tsai, Richard and Lee, Hung-yi},
    year = {2023},
    url = {https://arxiv.org/abs/2310.04799}
}

@article{hong2024orpo,
  title = {ORPO: Monolithic Preference Optimization without Reference Model},
  author = {Hong, Jiwoo and Lee, Noah and Thorne, James},
  year = {2024},
  url = {https://arxiv.org/abs/2403.07691}
}

📄 License

Gemma Terms of Use
