EXLMR
Developed by Hailay
EXLMR extends XLM-R to new languages by expanding the tokenizer vocabulary, which mitigates out-of-vocabulary issues. It is optimized for low-resource Ethiopian languages.
Downloads 27
Release Time: 8/21/2024
Model Overview
EXLMR is built on XLM-RoBERTa. It improves support for low-resource languages such as Amharic and Tigrinya through a specialized initialization of the new vocabulary embeddings, while also improving on the original XLM-R's performance for high-resource languages.
Model Features
Vocabulary expansion
The vocabulary is expanded from 250,002 to 280,147 entries, which mitigates out-of-vocabulary issues for low-resource languages.
Cross-lingual optimization
Specialized optimization for underrepresented Ethiopian languages (e.g., Amharic, Tigrinya)
Embedding initialization
A specialized initialization method for the new vocabulary embeddings helps the model make effective use of the newly added tokens.
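The page does not say how the new embeddings are initialized, so the following is only a toy sketch of one common approach: each new token's embedding starts as the mean of the embeddings of the pieces it used to be split into. The function `extend_embeddings`, the `pieces_of` mapping, and the toy data are all hypothetical and are not taken from EXLMR itself. Only the two vocabulary sizes come from the description above.

```python
import random

OLD_VOCAB_SIZE = 250_002   # XLM-R vocabulary size, as stated on this page
NEW_VOCAB_SIZE = 280_147   # EXLMR vocabulary size, as stated on this page
ADDED_TOKENS = NEW_VOCAB_SIZE - OLD_VOCAB_SIZE  # 30,145 new entries


def extend_embeddings(old_emb, old_vocab, new_tokens, pieces_of, seed=0):
    """Return (vocab, embeddings) with one appended row per new token.

    Assumed strategy (not confirmed by the source): a new row is the mean of
    the rows of the old pieces the token previously split into. Tokens with
    no known pieces fall back to small Gaussian noise.
    """
    rng = random.Random(seed)
    dim = len(old_emb[0])
    vocab = dict(old_vocab)
    emb = [row[:] for row in old_emb]  # copy so the original rows stay intact
    for tok in new_tokens:
        if tok in vocab:
            continue  # already present, nothing to add
        ids = [old_vocab[p] for p in pieces_of.get(tok, []) if p in old_vocab]
        if ids:
            row = [sum(old_emb[i][d] for i in ids) / len(ids) for d in range(dim)]
        else:
            row = [rng.gauss(0.0, 0.02) for _ in range(dim)]
        vocab[tok] = len(emb)  # new ids continue after the old ones
        emb.append(row)
    return vocab, emb


# Toy data: a 3-entry vocabulary with 2-dimensional embeddings.
toy_vocab = {"ሰላ": 0, "ም": 1, "<unk>": 2}
toy_emb = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
vocab, emb = extend_embeddings(toy_emb, toy_vocab, ["ሰላም"], {"ሰላም": ["ሰላ", "ም"]})
print(ADDED_TOKENS, vocab["ሰላም"], emb[3])
```

With a real checkpoint, the same idea would typically be applied to the model's input embedding matrix after resizing it to the new vocabulary size. The original rows are left untouched, so behavior on existing tokens is unchanged at initialization.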
Model Capabilities
Multilingual text classification
Cross-lingual transfer learning
Zero-shot classification
Use Cases
Natural Language Processing
Multilingual text classification
Classify texts in low-resource languages such as Amharic and Tigrinya
Improved out-of-vocabulary handling compared to XLM-R
Cross-lingual QA systems
Build question-answering systems supporting Ethiopian languages
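To illustrate what improved out-of-vocabulary handling can mean in practice, here is a minimal sketch that compares the unknown-token rate before and after adding one word to a vocabulary. It uses a greedy longest-match splitter as a deliberately simplified stand-in; XLM-R's actual tokenizer is SentencePiece-based and behaves differently. The functions `greedy_tokenize` and `unk_rate` and both toy vocabularies are hypothetical.

```python
def greedy_tokenize(word, vocab, unk="<unk>"):
    """Split `word` into the longest vocabulary entries, left to right."""
    pieces, i = [], 0
    while i < len(word):
        for j in range(len(word), i, -1):  # try the longest candidate first
            if word[i:j] in vocab:
                pieces.append(word[i:j])
                i = j
                break
        else:
            pieces.append(unk)  # no entry covers this character
            i += 1
    return pieces


def unk_rate(words, vocab):
    """Fraction of produced pieces that are the unknown token."""
    pieces = [p for w in words for p in greedy_tokenize(w, vocab)]
    if not pieces:
        return 0.0
    return sum(p == "<unk>" for p in pieces) / len(pieces)


base_vocab = {"ሰ", "ላ"}                 # toy stand-in for the original vocabulary
extended_vocab = base_vocab | {"ሰላም"}   # toy stand-in for the expanded vocabulary
words = ["ሰላም"]

print(greedy_tokenize("ሰላም", base_vocab))      # ['ሰ', 'ላ', '<unk>']
print(greedy_tokenize("ሰላም", extended_vocab))  # ['ሰላም']
print(unk_rate(words, base_vocab), unk_rate(words, extended_vocab))
```

Running the same comparison over a held-out Amharic or Tigrinya word list, with the real tokenizers substituted for the toy splitter, would give a rough measure of how much the expanded vocabulary reduces unknown tokens and fragmentation.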