🚀 mLUKE
mLUKE (multilingual LUKE) is a multilingual extension of LUKE, which can be applied to tasks like named entity recognition, relation classification, and question answering.
Please check the official repository for more details and updates.
✨ Features
- Multilingual Support: It supports 24 languages: Arabic, Bengali, German, Greek, English, Spanish, Finnish, French, Hindi, Indonesian, Italian, Japanese, Korean, Dutch, Polish, Portuguese, Russian, Swedish, Swahili, Telugu, Thai, Turkish, Vietnamese, and Chinese.
- Model Architecture: This is the mLUKE base model with 12 hidden layers and a hidden size of 768. The total number of parameters is 279M.
- Initialization and Training: The model was initialized with the weights of XLM-RoBERTa (base) and trained using the December 2020 version of Wikipedia in 24 languages.
- Lightweight Version: It is a lightweight version of studio-ousia/mluke-base that omits the Wikipedia entity embeddings and keeps only special entities such as `[MASK]`.
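
A minimal usage sketch, assuming the Hugging Face Transformers and SentencePiece packages are installed (the sentence and entity span below are purely illustrative):

```python
from transformers import MLukeTokenizer, LukeModel

# Load the lite checkpoint; it ships word embeddings plus special entities such as [MASK].
tokenizer = MLukeTokenizer.from_pretrained("studio-ousia/mluke-base-lite")
model = LukeModel.from_pretrained("studio-ousia/mluke-base-lite")

# Illustrative input: mark a character-level span to be treated as an entity mention.
# When entity_spans is given without explicit entities, each span is filled
# with the special [MASK] entity.
text = "Tokyo is the capital of Japan."
entity_spans = [(0, 5)]  # character span covering "Tokyo"

inputs = tokenizer(text, entity_spans=entity_spans, return_tensors="pt")
outputs = model(**inputs)

print(outputs.last_hidden_state.shape)         # contextual word representations
print(outputs.entity_last_hidden_state.shape)  # contextual entity representations
```

Because the lite checkpoint has no Wikipedia entity vocabulary, each marked span is represented with the special `[MASK]` entity, which is the intended input format for downstream fine-tuning.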
📚 Documentation
Note
When you load the model with `AutoModel.from_pretrained` using the default configuration, you will see the following warning:
```
Some weights of the model checkpoint at studio-ousia/mluke-base-lite were not used when initializing LukeModel: [
    'luke.encoder.layer.0.attention.self.w2e_query.weight', 'luke.encoder.layer.0.attention.self.w2e_query.bias',
    'luke.encoder.layer.0.attention.self.e2w_query.weight', 'luke.encoder.layer.0.attention.self.e2w_query.bias',
    'luke.encoder.layer.0.attention.self.e2e_query.weight', 'luke.encoder.layer.0.attention.self.e2e_query.bias',
    ...]
```
These weights are the weights for entity-aware attention (as described in the LUKE paper). This is expected because `use_entity_aware_attention` is set to `false` by default, but the pretrained checkpoint contains these weights in case you enable `use_entity_aware_attention` and want them loaded into the model.
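
If you do want entity-aware attention, a minimal sketch of enabling it (assuming the Hugging Face Transformers library; the flag name matches the one in the warning above):

```python
from transformers import LukeConfig, LukeModel

# Override the checkpoint's default so the w2e/e2w/e2e query weights
# are loaded instead of being discarded.
config = LukeConfig.from_pretrained(
    "studio-ousia/mluke-base-lite",
    use_entity_aware_attention=True,
)
model = LukeModel.from_pretrained("studio-ousia/mluke-base-lite", config=config)
```

With this configuration, the warning about unused `w2e_query`/`e2w_query`/`e2e_query` weights should no longer appear.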
Citation
If you find mLUKE useful for your work, please cite the following paper:
```bibtex
@inproceedings{ri-etal-2022-mluke,
    title = "m{LUKE}: {T}he Power of Entity Representations in Multilingual Pretrained Language Models",
    author = "Ri, Ryokan and
      Yamada, Ikuya and
      Tsuruoka, Yoshimasa",
    booktitle = "Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)",
    year = "2022",
    url = "https://aclanthology.org/2022.acl-long.505",
}
```
📄 License
This model is licensed under the Apache-2.0 license.