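🚀 NLLB-200 Model
This project covers a translation model based on NLLB-200. It supports more than 200 languages and provides high-accuracy translation.
🚀 Quick Start
Before using the model, convert it to the CTranslate2 format with the following command: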
ct2-transformers-converter --model facebook/nllb-200-distilled-600M --quantization int8 --output_dir converted/nllb-200-distilled-600M-ct2-int8
For more details about the model, see here.
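Once the conversion has finished, the converted directory can be loaded from Python. The following is a minimal sketch, not a reference implementation. It assumes that the `ctranslate2` and `transformers` packages are installed, that the converted model is in the output directory used by the command above, and that the tokenizer of the original `facebook/nllb-200-distilled-600M` checkpoint is used. The helpers `translate` and `drop_language_token` are hypothetical names introduced for illustration.

```python
MODEL_DIR = "converted/nllb-200-distilled-600M-ct2-int8"  # output_dir of the conversion command
TOKENIZER_NAME = "facebook/nllb-200-distilled-600M"       # assumption: tokenizer of the source model


def drop_language_token(tokens, tgt_lang):
    """Remove the leading target-language token, if it is present."""
    if tokens and tokens[0] == tgt_lang:
        return tokens[1:]
    return list(tokens)


def translate(text, src_lang="eng_Latn", tgt_lang="jpn_Jpan"):
    """Translate one sentence with the converted model (hypothetical helper)."""
    # Imported lazily so that this module loads even without the heavy dependencies.
    import ctranslate2
    import transformers

    translator = ctranslate2.Translator(MODEL_DIR, device="cpu")
    tokenizer = transformers.AutoTokenizer.from_pretrained(
        TOKENIZER_NAME, src_lang=src_lang
    )

    # CTranslate2 works on token strings rather than token ids.
    source = tokenizer.convert_ids_to_tokens(tokenizer.encode(text))

    # The target language code is passed as the first token of the output.
    results = translator.translate_batch([source], target_prefix=[[tgt_lang]])
    target = drop_language_token(results[0].hypotheses[0], tgt_lang)

    return tokenizer.decode(tokenizer.convert_tokens_to_ids(target))
```

Calling `translate("Hello world!")` loads the model and tokenizer and returns the translated string. The source and target languages are given as codes from the language details list, such as `eng_Latn` and `jpn_Jpan`. In a real application the translator and tokenizer would normally be created once and reused across calls.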
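✨ Features
- Multilingual support: covers more than 200 languages.
- High-accuracy translation: scores well on evaluation metrics such as BLEU, spBLEU, and chrF++.
📦 Installation
Model conversion uses ct2-transformers-converter. The following command converts the model: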
ct2-transformers-converter --model facebook/nllb-200-distilled-600M --quantization int8 --output_dir converted/nllb-200-distilled-600M-ct2-int8
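When the conversion needs to run as part of a script, the same command can be assembled and launched from Python. This is a sketch under the assumption that `ct2-transformers-converter` is available on `PATH` (it is installed together with the `ctranslate2` Python package). The function names `build_convert_command` and `convert` are hypothetical. Only the flags shown in the command above are used.

```python
import subprocess


def build_convert_command(model, output_dir, quantization="int8"):
    """Return the ct2-transformers-converter invocation as an argument list."""
    return [
        "ct2-transformers-converter",
        "--model", model,
        "--quantization", quantization,
        "--output_dir", output_dir,
    ]


def convert(model, output_dir, quantization="int8"):
    """Run the converter; raises CalledProcessError if it exits with an error."""
    subprocess.run(build_convert_command(model, output_dir, quantization), check=True)


command = build_convert_command(
    "facebook/nllb-200-distilled-600M",
    "converted/nllb-200-distilled-600M-ct2-int8",
)
print(" ".join(command))
```

Building the argument list separately from running it makes the command easy to log or inspect before the download and conversion start.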
📚 Documentation
Supported languages
| Language code | Language |
| --- | --- |
| ace | Acehnese |
| acm | Mesopotamian Arabic |
| acq | Ta'izzi-Adeni Arabic |
| aeb | Tunisian Arabic |
| af | Afrikaans |
| ajp | South Levantine Arabic |
| ak | Akan |
| als | Tosk Albanian |
| am | Amharic |
| apc | North Levantine Arabic |
| ar | Modern Standard Arabic |
| ars | Najdi Arabic |
| ary | Moroccan Arabic |
| arz | Egyptian Arabic |
| as | Assamese |
| ast | Asturian |
| awa | Awadhi |
| ayr | Aymara |
| azb | South Azerbaijani |
| azj | North Azerbaijani |
| ba | Bashkir |
| bm | Bambara |
| ban | Balinese |
| be | Belarusian |
| bem | Bemba |
| bn | Bengali |
| bho | Bhojpuri |
| bjn | Banjar |
| bo | Tibetan |
| bs | Bosnian |
| bug | Buginese |
| bg | Bulgarian |
| ca | Catalan |
| ceb | Cebuano |
| cs | Czech |
| cjk | Chokwe |
| ckb | Central Kurdish |
| crh | Crimean Tatar |
| cy | Welsh |
| da | Danish |
| de | German |
| dik | Southwestern Dinka |
| dyu | Dyula |
| dz | Dzongkha |
| el | Greek |
| en | English |
| eo | Esperanto |
| et | Estonian |
| eu | Basque |
| ee | Ewe |
| fo | Faroese |
| fj | Fijian |
| fi | Finnish |
| fon | Fon |
| fr | French |
| fur | Friulian |
| fuv | Nigerian Fulfulde |
| gaz | West Central Oromo |
| gd | Scottish Gaelic |
| ga | Irish |
| gl | Galician |
| gn | Guarani |
| gu | Gujarati |
| ht | Haitian Creole |
| ha | Hausa |
| he | Hebrew |
| hi | Hindi |
| hne | Chhattisgarhi |
| hr | Croatian |
| hu | Hungarian |
| hy | Armenian |
| ig | Igbo |
| ilo | Iloko |
| id | Indonesian |
| is | Icelandic |
| it | Italian |
| jv | Javanese |
| ja | Japanese |
| kab | Kabyle |
| kac | Kachin |
| kam | Kamba |
| kn | Kannada |
| ks | Kashmiri |
| ka | Georgian |
| kk | Kazakh |
| kbp | Kabiyé |
| kea | Kabuverdianu |
| khk | Halh Mongolian |
| km | Khmer |
| ki | Kikuyu |
| rw | Kinyarwanda |
| ky | Kyrgyz |
| kmb | Kimbundu |
| kmr | Northern Kurdish |
| knc | Central Kanuri |
| kg | Kongo |
| ko | Korean |
| lo | Lao |
| lij | Ligurian |
| li | Limburgish |
| ln | Lingala |
| lt | Lithuanian |
| lmo | Lombard |
| ltg | Latgalian |
| lb | Luxembourgish |
| lua | Luba-Kasai |
| lg | Ganda |
| luo | Luo |
| lus | Mizo |
| lvs | Standard Latvian |
| mag | Magahi |
| mai | Maithili |
| ml | Malayalam |
| mar | Marathi |
| min | Minangkabau |
| mk | Macedonian |
| mt | Maltese |
| mni | Meitei |
| mos | Mossi |
| mi | Maori |
| my | Burmese |
| nl | Dutch |
| nn | Norwegian Nynorsk |
| nb | Norwegian Bokmål |
| npi | Nepali |
| nso | Northern Sotho |
| nus | Nuer |
| ny | Nyanja |
| oc | Occitan |
| ory | Odia |
| pag | Pangasinan |
| pa | Punjabi |
| pap | Papiamento |
| pbt | Southern Pashto |
| pes | Western Persian |
| plt | Plateau Malagasy |
| pl | Polish |
| pt | Portuguese |
| prs | Dari |
| quy | Quechua |
| ro | Romanian |
| rn | Kirundi |
| ru | Russian |
| sg | Sango |
| sa | Sanskrit |
| sat | Santali |
| scn | Sicilian |
| shn | Shan |
| si | Sinhala |
| sk | Slovak |
| sl | Slovenian |
| sm | Samoan |
| sn | Shona |
| sd | Sindhi |
| so | Somali |
| st | Southern Sotho |
| es | Spanish |
| sc | Sardinian |
| sr | Serbian |
| ss | Swati |
| su | Sundanese |
| sv | Swedish |
| swh | Swahili |
| szl | Silesian |
| ta | Tamil |
| taq | Tamasheq |
| tt | Tatar |
| te | Telugu |
| tg | Tajik |
| tl | Tagalog |
| th | Thai |
| ti | Tigrinya |
| tpi | Tok Pisin |
| tn | Tswana |
| ts | Tsonga |
| tk | Turkmen |
| tum | Tumbuka |
| tr | Turkish |
| tw | Twi |
| tzm | Central Atlas Tamazight |
| ug | Uyghur |
| uk | Ukrainian |
| umb | Umbundu |
| ur | Urdu |
| uzn | Uzbek |
| vec | Venetian |
| vi | Vietnamese |
| war | Waray |
| wo | Wolof |
| xh | Xhosa |
| ydd | Eastern Yiddish |
| yo | Yoruba |
| yue | Cantonese |
| zh | Chinese |
| zsm | Standard Malay |
| zu | Zulu |
Language details
ace_Arab, ace_Latn, acm_Arab, acq_Arab, aeb_Arab, afr_Latn, ajp_Arab, aka_Latn, amh_Ethi, apc_Arab, arb_Arab, ars_Arab, ary_Arab, arz_Arab, asm_Beng, ast_Latn, awa_Deva, ayr_Latn, azb_Arab, azj_Latn, bak_Cyrl, bam_Latn, ban_Latn, bel_Cyrl, bem_Latn, ben_Beng, bho_Deva, bjn_Arab, bjn_Latn, bod_Tibt, bos_Latn, bug_Latn, bul_Cyrl, cat_Latn, ceb_Latn, ces_Latn, cjk_Latn, ckb_Arab, crh_Latn, cym_Latn, dan_Latn, deu_Latn, dik_Latn, dyu_Latn, dzo_Tibt, ell_Grek, eng_Latn, epo_Latn, est_Latn, eus_Latn, ewe_Latn, fao_Latn, pes_Arab, fij_Latn, fin_Latn, fon_Latn, fra_Latn, fur_Latn, fuv_Latn, gla_Latn, gle_Latn, glg_Latn, grn_Latn, guj_Gujr, hat_Latn, hau_Latn, heb_Hebr, hin_Deva, hne_Deva, hrv_Latn, hun_Latn, hye_Armn, ibo_Latn, ilo_Latn, ind_Latn, isl_Latn, ita_Latn, jav_Latn, jpn_Jpan, kab_Latn, kac_Latn, kam_Latn, kan_Knda, kas_Arab, kas_Deva, kat_Geor, knc_Arab, knc_Latn, kaz_Cyrl, kbp_Latn, kea_Latn, khm_Khmr, kik_Latn, kin_Latn, kir_Cyrl, kmb_Latn, kon_Latn, kor_Hang, kmr_Latn, lao_Laoo, lvs_Latn, lij_Latn, lim_Latn, lin_Latn, lit_Latn, lmo_Latn, ltg_Latn, ltz_Latn, lua_Latn, lug_Latn, luo_Latn, lus_Latn, mag_Deva, mai_Deva, mal_Mlym, mar_Deva, min_Latn, mkd_Cyrl, plt_Latn, mlt_Latn, mni_Beng, khk_Cyrl, mos_Latn, mri_Latn, zsm_Latn, mya_Mymr, nld_Latn, nno_Latn, nob_Latn, npi_Deva, nso_Latn, nus_Latn, nya_Latn, oci_Latn, gaz_Latn, ory_Orya, pag_Latn, pan_Guru, pap_Latn, pol_Latn, por_Latn, prs_Arab, pbt_Arab, quy_Latn, ron_Latn, run_Latn, rus_Cyrl, sag_Latn, san_Deva, sat_Beng, scn_Latn, shn_Mymr, sin_Sinh, slk_Latn, slv_Latn, smo_Latn, sna_Latn, snd_Arab, som_Latn, sot_Latn, spa_Latn, als_Latn, srd_Latn, srp_Cyrl, ssw_Latn, sun_Latn, swe_Latn, swh_Latn, szl_Latn, tam_Taml, tat_Cyrl, tel_Telu, tgk_Cyrl, tgl_Latn, tha_Thai, tir_Ethi, taq_Latn, taq_Tfng, tpi_Latn, tsn_Latn, tso_Latn, tuk_Latn, tum_Latn, tur_Latn, twi_Latn, tzm_Tfng, uig_Arab, ukr_Cyrl, umb_Latn, urd_Arab, uzn_Latn, vec_Latn, vie_Latn, war_Latn, wol_Latn, xho_Latn, ydd_Hebr, yor_Latn, yue_Hant, zho_Hans, zho_Hant, zul_Latn
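Each code in the list above combines a three-letter language tag with a four-letter script tag, joined by an underscore (for example `eng_Latn` or `zho_Hans`). Some languages therefore appear more than once, with one entry per script. The sketch below shows one way to split the codes and group languages by script. The function names are hypothetical, and `sample` is a small subset of the list chosen for illustration.

```python
from collections import defaultdict


def split_code(code):
    """Split a code such as 'eng_Latn' into (language, script)."""
    language, script = code.split("_")
    return language, script


def group_by_script(codes):
    """Map each script tag to the sorted language tags written in it."""
    groups = defaultdict(list)
    for code in codes:
        language, script = split_code(code)
        groups[script].append(language)
    return {script: sorted(langs) for script, langs in groups.items()}


# A small subset of the codes listed above, for illustration only.
sample = ["ace_Arab", "ace_Latn", "eng_Latn", "jpn_Jpan", "zho_Hans", "zho_Hant"]
groups = group_by_script(sample)
print(groups["Latn"])  # ['ace', 'eng']
```

A check of this kind can also be used to validate a user-supplied language code before it is passed to the model.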
Tags
Datasets
Evaluation metrics
📄 License
This model is released under the cc-by-nc-4.0 license.