Tanuki-8x8B-dpo-v1.0
Tanuki-8x8B-dpo-v1.0 is a large language model fine-tuned for dialogue. It has been evaluated with a human blind test and with Japanese MT-Bench.
Quick Start
Prerequisites
Inference with this model requires FlashAttention. Install it as follows:
pip install --no-build-isolation flash_attn
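Before loading the model, it can help to confirm that the package is importable in the active environment. The following is a minimal sketch using only the Python standard library; the helper name `has_module` is hypothetical and not part of this project.

```python
import importlib.util


def has_module(name: str) -> bool:
    """Return True if a top-level module called `name` can be found in this environment."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    if has_module("flash_attn"):
        print("flash_attn is available")
    else:
        print("flash_attn is missing; run: pip install --no-build-isolation flash_attn")
```

This only checks that the module can be located; it does not verify that the installed build matches the local CUDA and PyTorch versions.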
Inference with HuggingFace Transformers
from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer

# trust_remote_code=True allows the custom model code shipped in the repository to run
model = AutoModelForCausalLM.from_pretrained("weblab-GENIAC/Tanuki-8x8B-dpo-v1.0", device_map="auto", torch_dtype="auto", trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained("weblab-GENIAC/Tanuki-8x8B-dpo-v1.0")
# Print generated text to stdout as it is produced, without echoing the prompt
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)

messages = [
    {"role": "system", "content": "The following is an instruction that describes a task. Write a response that appropriately meets the requirements."},
    {"role": "user", "content": "Can a raccoon dog understand the Critique of Pure Reason?"}
]

# Render the messages with the chat template and move the token IDs to the model's device
input_ids = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
output_ids = model.generate(input_ids,
                            max_new_tokens=1024,
                            temperature=0.5,
                            streamer=streamer)
Inference with vLLM
Inference with vLLM requires a build that supports the model's custom architecture. Build the modified vLLM fork as follows:
git clone https://github.com/team-hatakeyama-phase2/vllm.git
cd vllm
LD_LIBRARY_PATH="" MAX_JOBS=16 pip install -e .
After the build finishes, run inference as follows:
from time import time
from vllm import LLM, SamplingParams

model_name = "weblab-GENIAC/Tanuki-8x8B-dpo-v1.0"
# tensor_parallel_size=2 shards the model across two GPUs
llm = LLM(model_name, trust_remote_code=True, tensor_parallel_size=2)
tokenizer = llm.get_tokenizer()

messages = [
    {"role": "system", "content": "The following is an instruction that describes a task. Write a response that appropriately meets the requirements."},
    {"role": "user", "content": "Can a raccoon dog understand the Critique of Pure Reason?"}
]

# Render the messages to a prompt string; vLLM tokenizes it internally
inputs_text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
print(f"inputs_text: {inputs_text}")

sampling_params = SamplingParams(temperature=0.0, max_tokens=1024, seed=1, repetition_penalty=1.1)

# Time the generation call only, not model loading
start = time()
outputs = llm.generate(inputs_text, sampling_params=sampling_params, use_tqdm=False)
end = time()

outputs_text = outputs[0].outputs[0].text
print(f"outputs_text: {outputs_text}")
print(f"Elapsed time: {(end - start):.4f} sec.")
Features
Tanuki-8x8B is a large language model in an 8x8B configuration (approximately 47B total parameters, of which about 13B are active), pre-trained from scratch on approximately 1.7T tokens. Tanuki-8x8B-dpo-v1.0 has been fine-tuned for dialogue through supervised fine-tuning (SFT) and Direct Preference Optimization (DPO).
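The gap between the "8x8B" name and the 47B total can be explored with some simple arithmetic. The sketch below assumes a mixture-of-experts layout in which every token uses a set of shared weights plus 2 of 8 experts. The top-2 routing and the split into shared and per-expert weights are assumptions made for illustration, not figures stated in this card, and `split_parameters` is a hypothetical helper.

```python
def split_parameters(total_b, active_b, num_experts=8, experts_per_token=2):
    """Solve the two equations (all values in billions of parameters):

        total  = shared + num_experts * expert
        active = shared + experts_per_token * expert
    """
    expert_b = (total_b - active_b) / (num_experts - experts_per_token)
    shared_b = total_b - num_experts * expert_b
    return shared_b, expert_b


# 47B total and 13B active come from the text above; 8 experts with top-2 routing is assumed
shared_b, expert_b = split_parameters(total_b=47.0, active_b=13.0)
print(f"per-expert weights: ~{expert_b:.2f}B, shared weights: ~{shared_b:.2f}B")
```

Under these assumptions the weights are shared in part across experts, which is why the total is well below 8 × 8B = 64B.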
Quantized Models
Documentation
Prompt Format
Tanuki-8x8B-dpo-v1.0 uses the Japanese Alpaca prompt format.
Single-turn, with the default system prompt:
<s>The following is an instruction that describes a task. Write a response that appropriately meets the requirements.
### Instruction:
Can a raccoon dog understand the Critique of Pure Reason?
### Response:
Multi-turn, with the default system prompt:
<s>The following is an instruction that describes a task. Write a response that appropriately meets the requirements.
### Instruction:
{Input for the first turn}
### Response:
{Response for the first turn}</s>
### Instruction:
{Input for the second turn}
### Response:
Use the default system prompt, "The following is an instruction that describes a task. Write a response that appropriately meets the requirements.", because the model has not been trained with other system prompts. Describe the details of the task in the user prompt instead.
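To make the layout concrete, here is a minimal sketch that assembles a prompt string in the shape shown above. The helper name `build_prompt` is hypothetical, and the line breaks simply follow the listing as it appears here; the tokenizer's own chat template is the authoritative source for exact whitespace, so `tokenizer.apply_chat_template` is the safer choice in practice.

```python
SYSTEM_PROMPT = (
    "The following is an instruction that describes a task. "
    "Write a response that appropriately meets the requirements."
)


def build_prompt(user_input, history=(), system_prompt=SYSTEM_PROMPT):
    """Assemble an Alpaca-style prompt.

    `history` is a sequence of (user_text, response_text) pairs from earlier turns.
    Line breaks are an assumption taken from the listing above.
    """
    lines = [f"<s>{system_prompt}"]
    for past_user, past_response in history:
        # Completed turns end with the </s> marker after the response
        lines += ["### Instruction:", past_user, "### Response:", f"{past_response}</s>"]
    # The final turn is left open so the model writes the response
    lines += ["### Instruction:", user_input, "### Response:"]
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_prompt("Can a raccoon dog understand the Critique of Pure Reason?"))
```

For a multi-turn conversation, pass earlier exchanges as `history=[("first input", "first response")]` and the new input as `user_input`.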
Benchmarks
Human Evaluation
A system simulating Chatbot Arena was built, and a blind test with human evaluators was conducted on it.
All evaluation data (approximately 2,000 cases) is publicly available.

Japanese MT-Bench
Evaluated by GPT-4 (gpt-4-0613; scores of -1 are excluded when calculating the average score).
| | Tanuki-8B-dpo-v1.0 | Tanuki-8x8B-dpo-v1.0 |
| --- | --- | --- |
| Average Score | 7.24 | 7.96 |
| Coding | 5.4 | 6.75 |
| Extraction | 6.65 | 6.90 |
| Humanities | 9.1 | 9.3 |
| Math | 3.9 | 5.75 |
| Reasoning | 5.75 | 7.35 |
| Role-play | 8.75 | 8.95 |
| STEM | 9.35 | 9.40 |
| Writing | 9.05 | 8.85 |
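The rule that scores of -1 are excluded from the average can be sketched as follows. The function name `average_score` and the sample scores are hypothetical, and treating -1 as a marker for a judgment that could not be scored is an assumption; this is not the evaluation script used for the table.

```python
def average_score(scores, error_value=-1):
    """Return the mean of `scores`, skipping entries equal to `error_value`."""
    valid = [s for s in scores if s != error_value]
    if not valid:
        raise ValueError("no valid scores to average")
    return sum(valid) / len(valid)


if __name__ == "__main__":
    # Hypothetical judge scores; the -1 entry is dropped before averaging
    example_scores = [8, -1, 6, 9]
    print(f"{average_score(example_scores):.2f}")
```

Because excluded entries change how many scores each category contributes, an overall average computed this way need not equal the plain mean of the per-category averages.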
Development Team
Kanehisa Hatakeyama [Leader], asaoka_tadashi, Atsushi Saito, Chattso-GPT, Chihiro Arata, Chihiro HIGUCHI, Daichi Kohmoto, Esty, Hideaki Hayashi, hiroaki shioya, Issei Fujimoto, Jie Zeng, Jinsei Shiraishi, K. Nishizawa, Kazutaka Nishimae, Kunihiro Watanabe, masaki okamura, Minami Someya, Mr. M, Nishi, Nishijima, p1atdev, Rumi Nakagawa, Ryota Mitsuhashi, Susumu Ota, takagi, Toshio Nishida, y_morinaga, Yuki Namiuchi, Yukie Kawano, Tsuneji Nagahara, Jun Kato, Atsushi Kawagoe, Kenta Iwata, Mitsuho Kikuchi, Masato Kumada, Shota Eguni, Toshiyuki Sano, Hiroki Yamaguchi, Yasutaka Nishiie, Masaharu Kawamura, Shun Katakami, Shiso Horie, Kanta Hayashi
License
This project is licensed under the Apache-2.0 license.