🚀 Tanuki-8B-dpo-v1.0
Tanuki-8B-dpo-v1.0 is a large-scale language model optimized for dialogue, developed by volunteers. Various quantized versions of the model are also provided, and it has been evaluated on multiple benchmarks.
🚀 Quick Start
To use Tanuki-8B-dpo-v1.0, follow the code example below.
```python
from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer

# Load the model and tokenizer from the Hugging Face Hub.
model = AutoModelForCausalLM.from_pretrained("weblab-GENIAC/Tanuki-8B-dpo-v1.0", device_map="auto", torch_dtype="auto")
tokenizer = AutoTokenizer.from_pretrained("weblab-GENIAC/Tanuki-8B-dpo-v1.0")

# Stream generated tokens to stdout, skipping the prompt and special tokens.
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)

messages = [
    {"role": "system", "content": "以下は、タスクを説明する指示です。要求を適切に満たす応答を書きなさい。"},
    {"role": "user", "content": "たぬきに純粋理性批判は理解できますか?"}
]

# Build the prompt with the chat template and move it to the model's device.
input_ids = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)

output_ids = model.generate(
    input_ids,
    max_new_tokens=1024,
    do_sample=True,  # sampling must be enabled for temperature to take effect
    temperature=0.5,
    streamer=streamer,
)
```
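Because the streamer is created with `skip_prompt=True` and `skip_special_tokens=True`, the model's reply is printed to stdout token by token as it is generated. Note that the `output_ids` returned by `generate` still contain the prompt tokens followed by the generated tokens, so slice off the prompt length before decoding if you need the reply as a plain string.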
✨ Features
- Model Overview: Tanuki-8B is a large-scale language model with approximately 8B parameters, pre-trained from scratch on about 1.3T tokens; Tanuki-8B-dpo-v1.0 is the version fine-tuned for dialogue using SFT and DPO. For more detailed information, please refer to the blog post.
- Quantized Models: various quantized versions of Tanuki-8B-dpo-v1.0 are also provided.
- Prompt Format: Tanuki-8B-dpo-v1.0 uses the Japanese Alpaca prompt format.
```
<s>以下は、タスクを説明する指示です。要求を適切に満たす応答を書きなさい。
### 指示:
たぬきに純粋理性批判は理解できますか?
### 応答:
```
- **Multi-turn**:
```
<s>以下は、タスクを説明する指示です。要求を適切に満たす応答を書きなさい。
### 指示:
{1ターン目の入力}
### 応答:
{1ターン目の応答}</s>
### 指示:
{2ターン目の入力}
### 応答:
```
It is recommended to use the default system prompt "以下は、タスクを説明する指示です。要求を適切に満たす応答を書きなさい。" as the model has not been trained on other system prompts. Please describe the task details in the user prompt.
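To check that the tokenizer's chat template reproduces the format above, the following minimal sketch can be used. It assumes the built-in chat template of `weblab-GENIAC/Tanuki-8B-dpo-v1.0` matches the Japanese Alpaca format shown here; the assistant reply and second-turn instruction are illustrative placeholders only.

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("weblab-GENIAC/Tanuki-8B-dpo-v1.0")

# Multi-turn conversation: the assistant's first reply is included so the
# template can close it with </s> before the second instruction, as shown above.
messages = [
    {"role": "system", "content": "以下は、タスクを説明する指示です。要求を適切に満たす応答を書きなさい。"},
    {"role": "user", "content": "たぬきに純粋理性批判は理解できますか?"},
    {"role": "assistant", "content": "(first-turn response)"},    # illustrative placeholder
    {"role": "user", "content": "(second-turn instruction)"},     # illustrative placeholder
]

# tokenize=False returns the raw prompt string so the template can be inspected.
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
print(prompt)
```

With `add_generation_prompt=True`, the printed string should end with the 「### 応答:」 header, which is the point from which the model continues generating.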
📚 Documentation
Benchmarks
Human Evaluation
A system mimicking Chatbot Arena was built, and a manual blind test was conducted (for details, see here). All evaluation data (about 2,000 cases) is publicly available.

Japanese MT-Bench
Evaluated by GPT-4 (gpt-4-0613); scores of -1 are excluded when calculating the average score.
| Category | Tanuki-8B-dpo-v1.0 | Tanuki-8x8B-dpo-v1.0 |
| --- | --- | --- |
| Average Score | 7.24 | 7.96 |
| Coding | 5.4 | 6.75 |
| Extraction | 6.65 | 6.90 |
| Humanities | 9.1 | 9.3 |
| Math | 3.9 | 5.75 |
| Reasoning | 5.75 | 7.35 |
| Roleplay | 8.75 | 8.95 |
| STEM | 9.35 | 9.40 |
| Writing | 9.05 | 8.85 |
🔧 Technical Details
The development of Tanuki-8B-dpo-v1.0 was carried out by volunteers (company employees, students, researchers, and others) recruited through a public call under the GENIAC Matsuo Lab LLM Development Project.
📄 License
This project is licensed under the Apache-2.0 license.