🚀 LLaMAX
LLaMAX is a language model with strong multilingual capabilities that retains its instruction-following ability. It was built by collecting an extensive training set covering 102 languages, continually pre-training Llama2 on that data, and then fine-tuning instruction following with Alpaca, an English instruction-tuning dataset.
🚀 Quick Start
LLaMAX makes multilingual translation straightforward. Follow the steps below to run a translation.
Function that builds the translation prompt
```python
def Prompt_template(query, src_language, trg_language):
    instruction = f'Translate the following sentences from {src_language} to {trg_language}.'
    prompt = (
        'Below is an instruction that describes a task, paired with an input that provides further context. '
        'Write a response that appropriately completes the request.\n'
        f'### Instruction:\n{instruction}\n'
        f'### Input:\n{query}\n### Response:'
    )
    return prompt
```
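For reference, here is what the template produces for the sample used below (this preview is our addition, not part of the original card):

```python
# Preview the prompt for a Chinese -> English request.
prompt = Prompt_template("你好,今天是个好日子", "Chinese", "English")
print(prompt)
# Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.
# ### Instruction:
# Translate the following sentences from Chinese to English.
# ### Input:
# 你好,今天是个好日子
# ### Response:
```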
Code that runs the translation
```python
from transformers import AutoTokenizer, LlamaForCausalLM

# Replace the placeholders with the paths to your converted weights/tokenizer.
model = LlamaForCausalLM.from_pretrained(PATH_TO_CONVERTED_WEIGHTS)
tokenizer = AutoTokenizer.from_pretrained(PATH_TO_CONVERTED_TOKENIZER)

query = "你好,今天是个好日子"
prompt = Prompt_template(query, 'Chinese', 'English')
inputs = tokenizer(prompt, return_tensors="pt")

generate_ids = model.generate(inputs.input_ids, max_length=30)
tokenizer.batch_decode(generate_ids, skip_special_tokens=True, clean_up_tokenization_spaces=False)[0]
# => "Hello, today is a good day"
```
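Note that `generate` returns the prompt tokens followed by the completion, so a tight `max_length=30` leaves little headroom for longer inputs. A minimal variant (our sketch, assuming the model and tokenizer loaded above) that reserves room for the answer and decodes only the newly generated tokens:

```python
# Sketch: give the model room to answer and strip the echoed prompt.
inputs = tokenizer(prompt, return_tensors="pt")
generate_ids = model.generate(
    inputs.input_ids,
    attention_mask=inputs.attention_mask,
    max_new_tokens=128,
)
# generate() returns prompt + completion; keep only the new tokens.
new_tokens = generate_ids[0, inputs.input_ids.shape[1]:]
translation = tokenizer.decode(new_tokens, skip_special_tokens=True).strip()
print(translation)  # e.g. "Hello, today is a good day"
```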
✨ Key Features
🔥 Multilingual Translation with a Simple Prompt
LLaMAX supports translation between more than 100 languages and outperforms LLMs of a similar scale.
🔥 Strong Translation Performance
On the Flores-101 dataset, LLaMAX3-8B-Alpaca improves the average spBLEU score by more than 5 points over the LLaMA3-8B-Alpaca model.
System | Size | en-X (COMET) | en-X (BLEU) | zh-X (COMET) | zh-X (BLEU) | de-X (COMET) | de-X (BLEU) | ne-X (COMET) | ne-X (BLEU) | ar-X (COMET) | ar-X (BLEU) | az-X (COMET) | az-X (BLEU) | ceb-X (COMET) | ceb-X (BLEU)
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---
LLaMA3-8B-Alpaca | 8B | 67.97 | 17.23 | 64.65 | 10.14 | 64.67 | 13.62 | 62.95 | 7.96 | 63.45 | 11.27 | 60.61 | 6.98 | 55.26 | 8.52
LLaMAX3-8B-Alpaca | 8B | 75.52 | 22.77 | 73.16 | 14.43 | 73.47 | 18.95 | 75.13 | 15.32 | 72.29 | 16.42 | 72.06 | 12.41 | 68.88 | 15.85

System | Size | X-en (COMET) | X-en (BLEU) | X-zh (COMET) | X-zh (BLEU) | X-de (COMET) | X-de (BLEU) | X-ne (COMET) | X-ne (BLEU) | X-ar (COMET) | X-ar (BLEU) | X-az (COMET) | X-az (BLEU) | X-ceb (COMET) | X-ceb (BLEU)
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---
LLaMA3-8B-Alpaca | 8B | 77.43 | 26.55 | 73.56 | 13.17 | 71.59 | 16.82 | 46.56 | 3.83 | 66.49 | 10.20 | 58.30 | 4.81 | 52.68 | 4.18
LLaMAX3-8B-Alpaca | 8B | 81.28 | 31.85 | 78.34 | 16.46 | 76.23 | 20.64 | 65.83 | 14.16 | 75.84 | 15.45 | 70.61 | 9.32 | 63.35 | 12.66
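As a rough sketch of how spBLEU figures like those above are computed, the snippet below scores hypotheses against references with sacrebleu's SentencePiece-based Flores-101 tokenizer. This is our illustration, not the paper's evaluation code; it assumes a recent sacrebleu release (>= 2.2) where `tokenize="flores101"` is available, and the strings are placeholders.

```python
# Hedged sketch: corpus-level spBLEU with sacrebleu's Flores-101
# SentencePiece tokenizer (assumes sacrebleu >= 2.2).
import sacrebleu

hyps = ["Hello, today is a good day"]      # placeholder system outputs
refs = [["Hello, today is a good day."]]   # placeholder reference stream

spbleu = sacrebleu.corpus_bleu(hyps, refs, tokenize="flores101")
print(f"spBLEU: {spbleu.score:.2f}")
```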
📚 Documentation
Model Sources
- Paper: LLaMAX: Scaling Linguistic Horizons of LLM by Enhancing Translation Capabilities Beyond 100 Languages
- Link: https://arxiv.org/pdf/2407.05975
- Repository: https://github.com/CONE-MT/LLaMAX/
- Demo: https://huggingface.co/spaces/vilarin/LLaMAX3-Translator (thanks to @AnnioDance for their work)
Supported Languages
Afrikaans (af), Amharic (am), Arabic (ar), Armenian (hy), Assamese (as), Asturian (ast), Azerbaijani (az), Belarusian (be), Bengali (bn), Bosnian (bs), Bulgarian (bg), Burmese (my), Catalan (ca), Cebuano (ceb), Chinese Simplified (zho), Chinese Traditional (zho), Croatian (hr), Czech (cs), Danish (da), Dutch (nl), English (en), Estonian (et), Filipino (tl), Finnish (fi), French (fr), Fulah (ff), Galician (gl), Ganda (lg), Georgian (ka), German (de), Greek (el), Gujarati (gu), Hausa (ha), Hebrew (he), Hindi (hi), Hungarian (hu), Icelandic (is), Igbo (ig), Indonesian (id), Irish (ga), Italian (it), Japanese (ja), Javanese (jv), Kabuverdianu (kea), Kamba (kam), Kannada (kn), Kazakh (kk), Khmer (km), Korean (ko), Kyrgyz (ky), Lao (lo), Latvian (lv), Lingala (ln), Lithuanian (lt), Luo (luo), Luxembourgish (lb), Macedonian (mk), Malay (ms), Malayalam (ml), Maltese (mt), Maori (mi), Marathi (mr), Mongolian (mn), Nepali (ne), Northern Sotho (ns), Norwegian (no), Nyanja (ny), Occitan (oc), Oriya (or), Oromo (om), Pashto (ps), Persian (fa), Polish (pl), Portuguese (pt), Punjabi (pa), Romanian (ro), Russian (ru), Serbian (sr), Shona (sn), Sindhi (sd), Slovak (sk), Slovenian (sl), Somali (so), Sorani Kurdish (ku), Spanish (es), Swahili (sw), Swedish (sv), Tajik (tg), Tamil (ta), Telugu (te), Thai (th), Turkish (tr), Ukrainian (uk), Umbundu (umb), Urdu (ur), Uzbek (uz), Vietnamese (vi), Welsh (cy), Wolof (wo), Xhosa (xh), Yoruba (yo), Zulu (zu)
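Since `Prompt_template` expects language names rather than the codes above, a small lookup table can bridge the two when your pipeline works with codes. A minimal illustrative helper (our addition; it covers only a handful of the listed languages, using the codes exactly as they appear above):

```python
# Illustrative mapping from codes in the supported-language list to the
# English names the prompt template expects (extend as needed).
LANG_NAMES = {
    "af": "Afrikaans",
    "ar": "Arabic",
    "de": "German",
    "en": "English",
    "ja": "Japanese",
    "ne": "Nepali",
    "sw": "Swahili",
    "zho": "Chinese",
}

def prompt_for(query: str, src: str, trg: str) -> str:
    """Build a translation prompt from language codes."""
    return Prompt_template(query, LANG_NAMES[src], LANG_NAMES[trg])

# prompt_for("你好,今天是个好日子", "zho", "en")
```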
Model Index
We release multiple versions of the LLaMAX models; the links are listed below.

Model | LLaMAX | LLaMAX-Alpaca
---|---|---
Llama-2 | Link | Link
Llama-3 | Link | Link
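The released checkpoints can also be loaded directly from the Hugging Face Hub instead of local converted weights. A sketch, assuming the repo id `LLaMAX/LLaMAX3-8B-Alpaca` (verify the exact id against the links in the table above):

```python
# Hedged sketch: load a released checkpoint from the Hub. The repo id
# is an assumption; check the model-index links for the exact name.
from transformers import AutoTokenizer, LlamaForCausalLM

model_id = "LLaMAX/LLaMAX3-8B-Alpaca"
model = LlamaForCausalLM.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)
```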
Citation
If our model is helpful for your research, please cite the following paper:
```bibtex
@inproceedings{lu-etal-2024-llamax,
    title = "{LL}a{MAX}: Scaling Linguistic Horizons of {LLM} by Enhancing Translation Capabilities Beyond 100 Languages",
    author = "Lu, Yinquan and
      Zhu, Wenhao and
      Li, Lei and
      Qiao, Yu and
      Yuan, Fei",
    editor = "Al-Onaizan, Yaser and
      Bansal, Mohit and
      Chen, Yun-Nung",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2024",
    month = nov,
    year = "2024",
    address = "Miami, Florida, USA",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2024.findings-emnlp.631",
    doi = "10.18653/v1/2024.findings-emnlp.631",
    pages = "10748--10772",
    abstract = "Large Language Models (LLMs) demonstrate remarkable translation capabilities in high-resource language tasks, yet their performance in low-resource languages is hindered by insufficient multilingual data during pre-training. To address this, we conduct extensive multilingual continual pre-training on the LLaMA series models, enabling translation support across more than 100 languages. Through a comprehensive analysis of training strategies, such as vocabulary expansion and data augmentation, we develop LLaMAX. Remarkably, without sacrificing its generalization ability, LLaMAX achieves significantly higher translation performance compared to existing open-source LLMs (by more than 10 spBLEU points) and performs on-par with specialized translation model (M2M-100-12B) on the Flores-101 benchmark. Extensive experiments indicate that LLaMAX can serve as a robust multilingual foundation model. The code and the models are publicly available.",
}
```
📄 License
This model is released under the MIT License.



