🚀 afro-xlmr-large
AfroXLMR-large was created by masked language model (MLM) adaptation of the XLM-R-large model on 17 African languages (Afrikaans, Amharic, Hausa, Igbo, Malagasy, Chichewa, Oromo, Nigerian Pidgin, Kinyarwanda, Kirundi, Shona, Somali, Sesotho, Swahili, Xhosa, Yoruba, and Zulu) covering the major African language families, along with 3 high-resource languages (Arabic, French, and English).
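A minimal usage sketch with the 🤗 Transformers `fill-mask` pipeline; the Hub checkpoint ID `Davlan/afro-xlmr-large` and the example sentence are assumptions, not taken from this card:

```python
from transformers import pipeline

# Assumed Hub checkpoint ID for this model: "Davlan/afro-xlmr-large".
unmasker = pipeline("fill-mask", model="Davlan/afro-xlmr-large")

# XLM-R-based models use "<mask>" as the mask token; the Swahili sentence is illustrative only.
print(unmasker("Mji mkuu wa Tanzania ni <mask>."))
```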
✨ Key Features
- MLM-adapted from XLM-R-large, supporting a wide range of African languages as well as high-resource languages.
- Delivers strong results on evaluation tasks across many languages.
📚 Documentation
Evaluation Results (F1 scores on the MasakhaNER dataset)
| Language | XLM-R-miniLM | XLM-R-base | XLM-R-large | afro-xlmr-large | afro-xlmr-base | afro-xlmr-small | afro-xlmr-mini |
|---|---|---|---|---|---|---|---|
| amh | 69.5 | 70.6 | 76.2 | 79.7 | 76.1 | 70.1 | 69.7 |
| hau | 74.5 | 89.5 | 90.5 | 91.4 | 91.2 | 91.4 | 87.7 |
| ibo | 81.9 | 84.8 | 84.1 | 87.7 | 87.4 | 86.6 | 83.5 |
| kin | 68.6 | 73.3 | 73.8 | 79.1 | 78.0 | 77.5 | 74.1 |
| lug | 64.7 | 79.7 | 81.6 | 86.7 | 82.9 | 83.2 | 77.4 |
| luo | 11.7 | 74.9 | 73.6 | 78.1 | 75.1 | 75.4 | 17.5 |
| pcm | 83.2 | 87.3 | 89.0 | 91.0 | 89.6 | 89.0 | 85.5 |
| swa | 86.3 | 87.4 | 89.4 | 90.4 | 88.6 | 88.7 | 86.0 |
| wol | 51.7 | 63.9 | 67.9 | 69.6 | 67.4 | 65.9 | 59.0 |
| yor | 72.0 | 78.3 | 78.9 | 85.2 | 82.1 | 81.3 | 75.1 |
| avg | 66.4 | 79.0 | 80.5 | 83.9 | 81.8 | 80.9 | 71.6 |
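The F1 scores above come from NER fine-tuning on MasakhaNER. Below is a minimal sketch of loading the model and one MasakhaNER language for token classification; the `masakhaner` dataset ID, its `yor` (Yoruba) config, and the `Davlan/afro-xlmr-large` checkpoint are assumptions, and full fine-tuning would follow the standard token-classification recipe (align labels to sub-word tokens, then train).

```python
from datasets import load_dataset
from transformers import AutoModelForTokenClassification, AutoTokenizer

# Assumed identifiers: the "masakhaner" dataset with a per-language config (here "yor")
# and the "Davlan/afro-xlmr-large" checkpoint on the Hugging Face Hub.
dataset = load_dataset("masakhaner", "yor")
label_list = dataset["train"].features["ner_tags"].feature.names

tokenizer = AutoTokenizer.from_pretrained("Davlan/afro-xlmr-large")
model = AutoModelForTokenClassification.from_pretrained(
    "Davlan/afro-xlmr-large", num_labels=len(label_list)
)

# Encode one pre-tokenized example and inspect the per-token logits shape.
encoded = tokenizer(
    dataset["train"][0]["tokens"], is_split_into_words=True, return_tensors="pt"
)
print(label_list)
print(model(**encoded).logits.shape)
```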
BibTeX Citation Information
```bibtex
@inproceedings{alabi-etal-2022-adapting,
title = "Adapting Pre-trained Language Models to {A}frican Languages via Multilingual Adaptive Fine-Tuning",
author = "Alabi, Jesujoba O. and
Adelani, David Ifeoluwa and
Mosbach, Marius and
Klakow, Dietrich",
booktitle = "Proceedings of the 29th International Conference on Computational Linguistics",
month = oct,
year = "2022",
address = "Gyeongju, Republic of Korea",
publisher = "International Committee on Computational Linguistics",
url = "https://aclanthology.org/2022.coling-1.382",
pages = "4336--4349",
abstract = "Multilingual pre-trained language models (PLMs) have demonstrated impressive performance on several downstream tasks for both high-resourced and low-resourced languages. However, there is still a large performance drop for languages unseen during pre-training, especially African languages. One of the most effective approaches to adapt to a new language is language adaptive fine-tuning (LAFT) {---} fine-tuning a multilingual PLM on monolingual texts of a language using the pre-training objective. However, adapting to target language individually takes large disk space and limits the cross-lingual transfer abilities of the resulting models because they have been specialized for a single language. In this paper, we perform multilingual adaptive fine-tuning on 17 most-resourced African languages and three other high-resource languages widely spoken on the African continent to encourage cross-lingual transfer learning. To further specialize the multilingual PLM, we removed vocabulary tokens from the embedding layer that corresponds to non-African writing scripts before MAFT, thus reducing the model size by around 50{\%}. Our evaluation on two multilingual PLMs (AfriBERTa and XLM-R) and three NLP tasks (NER, news topic classification, and sentiment classification) shows that our approach is competitive to applying LAFT on individual languages while requiring significantly less disk space. Additionally, we show that our adapted PLM also improves the zero-shot cross-lingual transfer abilities of parameter efficient fine-tuning methods.",
}
```
📄 License
This project is licensed under the MIT License.