🚀 afro-xlmr-mini
AfroXLMR-mini was created by MLM (masked language modeling) adaptation of the XLM-R-miniLM model on 17 African languages (Afrikaans, Amharic, Hausa, Igbo, Malagasy, Chichewa, Oromo, Nigerian Pidgin, Kinyarwanda, Kirundi, Shona, Somali, Sesotho, Swahili, Xhosa, Yoruba, and Zulu) plus 3 high-resource languages (Arabic, French, and English). These 17 African languages cover the major African language families.
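For readers who want to reproduce this style of adaptation, below is a minimal MLM fine-tuning sketch using `transformers`. The base checkpoint id, the corpus file, and all hyperparameters are illustrative assumptions, not the authors' exact setup.

```python
from datasets import load_dataset
from transformers import (AutoModelForMaskedLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

# Placeholder for the XLM-R-miniLM starting checkpoint; substitute the
# checkpoint you are actually adapting.
BASE = "microsoft/Multilingual-MiniLM-L12-H384"

tokenizer = AutoTokenizer.from_pretrained(BASE)
model = AutoModelForMaskedLM.from_pretrained(BASE)

# Hypothetical raw-text corpus covering the 20 adaptation languages,
# one example per line.
corpus = load_dataset("text", data_files={"train": "african_corpus.txt"})["train"]

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = corpus.map(tokenize, batched=True, remove_columns=["text"])

# Standard MLM objective: randomly mask 15% of input tokens.
collator = DataCollatorForLanguageModeling(tokenizer, mlm_probability=0.15)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="afro-xlmr-mini", num_train_epochs=3),
    train_dataset=tokenized,
    data_collator=collator,
)
trainer.train()
```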
✨ Key Features
- Adapted to a broad set of African and high-resource languages, making it well suited to cross-lingual transfer tasks (a usage sketch follows this list).
- Improves over its XLM-R-miniLM starting point on most downstream benchmarks, e.g. named entity recognition (see the MasakhaNER evaluation below).
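Since the adaptation objective is masked language modeling, the model can be tried directly with the `fill-mask` pipeline. A minimal sketch, assuming the checkpoint is hosted on the Hugging Face Hub as `Davlan/afro-xlmr-mini` (adjust the id if it lives elsewhere):

```python
from transformers import pipeline

# Hub id assumed; XLM-R-style checkpoints use "<mask>" as the mask token.
unmasker = pipeline("fill-mask", model="Davlan/afro-xlmr-mini")

# Swahili: "The president of Tanzania <mask> to visit America."
for pred in unmasker("Rais wa Tanzania <mask> kutembelea Marekani."):
    print(f'{pred["token_str"]!r}: {pred["score"]:.3f}')
```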
📚 Documentation
Evaluation results on MasakhaNER (F-score)
| Language | XLM-R-miniLM | XLM-R-base | XLM-R-large | afro-xlmr-base | afro-xlmr-small | afro-xlmr-mini |
|----------|--------------|------------|-------------|----------------|-----------------|----------------|
| amh | 69.5 | 70.6 | 76.2 | 76.1 | 70.1 | 69.7 |
| hau | 74.5 | 89.5 | 90.5 | 91.2 | 91.4 | 87.7 |
| ibo | 81.9 | 84.8 | 84.1 | 87.4 | 86.6 | 83.5 |
| kin | 68.6 | 73.3 | 73.8 | 78.0 | 77.5 | 74.1 |
| lug | 64.7 | 79.7 | 81.6 | 82.9 | 83.2 | 77.4 |
| luo | 11.7 | 74.9 | 73.6 | 75.1 | 75.4 | 17.5 |
| pcm | 83.2 | 87.3 | 89.0 | 89.6 | 89.0 | 85.5 |
| swa | 86.3 | 87.4 | 89.4 | 88.6 | 88.7 | 86.0 |
| wol | 51.7 | 63.9 | 67.9 | 67.4 | 65.9 | 59.0 |
| yor | 72.0 | 78.3 | 78.9 | 82.1 | 81.3 | 75.1 |
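For context, below is a minimal sketch of how a MasakhaNER-style token-classification head could be attached to the model with `transformers`. The label set follows the MasakhaNER PER/ORG/LOC/DATE scheme in BIO format; the Hub id and the rest of the setup are assumptions, not the exact evaluation recipe behind the numbers above.

```python
from transformers import AutoTokenizer, AutoModelForTokenClassification

# MasakhaNER annotates PER, ORG, LOC, and DATE entities in BIO format.
labels = ["O",
          "B-PER", "I-PER",
          "B-ORG", "I-ORG",
          "B-LOC", "I-LOC",
          "B-DATE", "I-DATE"]

# Hub id assumed, as above.
tokenizer = AutoTokenizer.from_pretrained("Davlan/afro-xlmr-mini")
model = AutoModelForTokenClassification.from_pretrained(
    "Davlan/afro-xlmr-mini",
    num_labels=len(labels),
    id2label=dict(enumerate(labels)),
    label2id={label: i for i, label in enumerate(labels)},
)
# The classification head is randomly initialized and must be fine-tuned
# on MasakhaNER-style data before the model produces meaningful tags.
```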
BibTeX Citation
@inproceedings{alabi-etal-2022-adapting,
title = "Adapting Pre-trained Language Models to {A}frican Languages via Multilingual Adaptive Fine-Tuning",
author = "Alabi, Jesujoba O. and
Adelani, David Ifeoluwa and
Mosbach, Marius and
Klakow, Dietrich",
booktitle = "Proceedings of the 29th International Conference on Computational Linguistics",
month = oct,
year = "2022",
address = "Gyeongju, Republic of Korea",
publisher = "International Committee on Computational Linguistics",
url = "https://aclanthology.org/2022.coling-1.382",
pages = "4336--4349",
abstract = "Multilingual pre-trained language models (PLMs) have demonstrated impressive performance on several downstream tasks for both high-resourced and low-resourced languages. However, there is still a large performance drop for languages unseen during pre-training, especially African languages. One of the most effective approaches to adapt to a new language is language adaptive fine-tuning (LAFT) {---} fine-tuning a multilingual PLM on monolingual texts of a language using the pre-training objective. However, adapting to target language individually takes large disk space and limits the cross-lingual transfer abilities of the resulting models because they have been specialized for a single language. In this paper, we perform multilingual adaptive fine-tuning on 17 most-resourced African languages and three other high-resource languages widely spoken on the African continent to encourage cross-lingual transfer learning. To further specialize the multilingual PLM, we removed vocabulary tokens from the embedding layer that corresponds to non-African writing scripts before MAFT, thus reducing the model size by around 50{\%}. Our evaluation on two multilingual PLMs (AfriBERTa and XLM-R) and three NLP tasks (NER, news topic classification, and sentiment classification) shows that our approach is competitive to applying LAFT on individual languages while requiring significantly less disk space. Additionally, we show that our adapted PLM also improves the zero-shot cross-lingual transfer abilities of parameter efficient fine-tuning methods.",
}
📄 License
This project is licensed under the MIT License.