🚀 RakutenAI-7B-chat
RakutenAI-7B-chat is a systematic initiative that brings the latest techniques to the Japanese large language model field. The model achieves the best scores on Japanese language understanding benchmarks while maintaining competitive performance on English test sets compared to similar models such as OpenCalm, Elyza, Youri, Nekomata, and Swallow.
🚀 Quick Start
RakutenAI-7B-chat performs strongly on both Japanese and English language processing. If you are looking for a foundation model, see RakutenAI-7B; if you need an instruction-tuned model, see RakutenAI-7B-instruct.
✨ Key Features
- Excellent performance: RakutenAI-7B achieves the best scores on Japanese language understanding benchmarks while remaining competitive on English test sets.
- Advanced architecture: built on the Mistral model architecture, starting from the Mistral-7B-v0.1 pre-trained checkpoint.
- Extended vocabulary: extends Mistral's vocabulary from 32k to 48k tokens, providing a better character-per-token ratio for Japanese.
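The character-per-token ratio mentioned above can be measured directly. The sketch below is an illustration only (the helper and the example segmentations are hypothetical, not taken from the RakutenAI tokenizer): a larger vocabulary tends to cover the same Japanese text with fewer tokens, which raises this ratio and lowers per-request token counts.

```python
def chars_per_token(text: str, tokens: list[str]) -> float:
    """Average number of source characters covered by each token.

    A higher ratio means the tokenizer packs more text into each token,
    which reduces sequence length (and cost) for that language.
    """
    if not tokens:
        raise ValueError("token list must not be empty")
    return len(text) / len(tokens)


# Hypothetical segmentations for illustration: a smaller vocabulary
# might split a Japanese word into per-character pieces, while a
# larger one can keep it as a single token.
text = "走った"  # 3 characters
print(chars_per_token(text, ["走", "っ", "た"]))  # 3 chars / 3 tokens = 1.0
print(chars_per_token(text, ["走った"]))          # 3 chars / 1 token  = 3.0
```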
💻 Usage Examples
Basic Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "Rakuten/RakutenAI-7B-chat"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path, torch_dtype="auto", device_map="auto")
model.eval()

chat = [
    {"role": "system", "content": "A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions."},
    {"role": "user", "content": "How to make an authentic Spanish Omelette?"},
]

input_ids = tokenizer.apply_chat_template(chat, tokenize=True, add_generation_prompt=True, return_tensors="pt").to(device=model.device)
tokens = model.generate(
    input_ids,
    max_length=4096,
    do_sample=False,
    num_beams=1,
    pad_token_id=tokenizer.eos_token_id,
)
# Decode only the newly generated tokens, skipping the prompt.
out = tokenizer.decode(tokens[0][len(input_ids[0]):], skip_special_tokens=True)
print("ASSISTANT:\n" + out)
print()
```
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "Rakuten/RakutenAI-7B-chat"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path, torch_dtype="auto", device_map="auto")
model.eval()

requests = [
    "「馬が合う」はどう言う意味ですか",
    "How to make an authentic Spanish Omelette?",
]

system_message = "A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions. USER: {user_input} ASSISTANT:"

for req in requests:
    input_req = system_message.format(user_input=req)
    input_ids = tokenizer.encode(input_req, return_tensors="pt").to(device=model.device)
    tokens = model.generate(
        input_ids,
        max_new_tokens=1024,
        do_sample=True,
        pad_token_id=tokenizer.eos_token_id,
    )
    # Decode only the newly generated tokens, skipping the prompt.
    out = tokenizer.decode(tokens[0][len(input_ids[0]):], skip_special_tokens=True)
    print("USER:\n" + req)
    print("ASSISTANT:\n" + out)
    print()
```
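The two examples above differ in decoding strategy: the first is greedy (`do_sample=False`, `num_beams=1`), while the second samples from the model's output distribution (`do_sample=True`). As a minimal, self-contained illustration of what sampling does (pure Python, not transformers code; the function names are our own), temperature-scaled sampling converts logits into a probability distribution and draws from it:

```python
import math
import random


def temperature_softmax(logits, temperature=1.0):
    """Convert raw logits into a probability distribution.

    Lower temperature sharpens the distribution (closer to greedy
    decoding); higher temperature flattens it (more diverse samples).
    """
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_token(logits, temperature=1.0, rng=random):
    """Draw one token index from the temperature-scaled distribution."""
    probs = temperature_softmax(logits, temperature)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


logits = [2.0, 1.0, 0.1]
print(temperature_softmax(logits, temperature=0.5))  # sharper: top token dominates
print(temperature_softmax(logits, temperature=2.0))  # flatter: more spread out
```

With `do_sample=False`, generation instead always picks the argmax token, which is why the first example's output is deterministic.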
📚 Documentation
Model Details
Limitations and Bias
The RakutenAI-7B suite of models is capable of generating human-like text on a wide range of topics. However, like all large language models, they have limitations and may produce biased, inaccurate, or unsafe outputs. Please exercise caution and judgment when interacting with them.
📄 License
This model is licensed under the Apache License, Version 2.0.
🔧 Technical Details
The technical report is available on arXiv.
📚 Citation
To cite our work on the RakutenAI-7B suite of models, please use the following format:
```bibtex
@misc{rakutengroup2024rakutenai7b,
      title={RakutenAI-7B: Extending Large Language Models for Japanese},
      author={{Rakuten Group, Inc.} and Aaron Levine and Connie Huang and Chenguang Wang and Eduardo Batista and Ewa Szymanska and Hongyi Ding and Hou Wei Chou and Jean-François Pessiot and Johanes Effendi and Justin Chiu and Kai Torben Ohlhus and Karan Chopra and Keiji Shinzato and Koji Murakami and Lee Xiong and Lei Chen and Maki Kubota and Maksim Tkachenko and Miroku Lee and Naoki Takahashi and Prathyusha Jwalapuram and Ryutaro Tatsushima and Saurabh Jain and Sunil Kumar Yadav and Ting Cai and Wei-Te Chen and Yandi Xia and Yuki Nakayama and Yutaka Higashiyama},
      year={2024},
      eprint={2403.15484},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
```
⚠️ Important Note
This model may produce biased, inaccurate, or unsafe outputs. Please use it with caution.
💡 Usage Tips
Choose the variant that fits your needs: the foundation model (RakutenAI-7B), the instruction-tuned model (RakutenAI-7B-instruct), or this chat model.