XLM-mlm-enfr-1024 Open-source Model - A Practical Tool for Cross-lingual Tasks

Developed by FacebookAI

XLM-mlm-enfr-1024 is a Transformer model pre-trained on English-French masked language modeling objectives, supporting cross-lingual tasks.

Large Language Model

Transformers

Supports Multiple Languages #English-French Bilingual #Masked Language Modeling #Cross-lingual Pretraining

Downloads 344

Release Time: 3/2/2022

Model Overview

This model uses language embeddings to specify the language used during inference, primarily for masked language modeling tasks between English and French.

Model Features

Cross-lingual Capability

Supports cross-lingual task processing between English and French

Language Embeddings

Uses language embeddings to specify the language used during inference

Efficient Training

Utilizes float16 operations to accelerate training and reduce memory usage

Model Capabilities

English-French Masked Language Modeling

Cross-lingual Text Processing

Use Cases

Natural Language Processing

Text Filling

Predicts and fills missing parts in text

Cross-lingual Text Understanding

Processes and understands English and French texts

🚀 xlm-mlm-enfr-1024

xlm-mlm-enfr-1024 is a Transformer model pretrained on English-French text with a masked language modeling objective. It supports cross-lingual language processing.

🚀 Quick Start

This model uses language embeddings to specify the language used at inference. For further details, refer to the Hugging Face Multilingual Models for Inference docs.
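The language-embedding mechanism described above can be sketched roughly as follows. This is a minimal illustration, not the authors' code: it assumes the checkpoint is available on the Hub under the ID `FacebookAI/xlm-mlm-enfr-1024`, that `torch`, `transformers`, and `sacremoses` are installed, and it uses a made-up `[MASK]` placeholder that is swapped for the tokenizer's real mask token. The helper names `build_lang_ids` and `predict_masked` are hypothetical.

```python
def build_lang_ids(lang2id, lang, seq_len):
    """Return a batch of one sequence holding the same language ID per token."""
    return [[lang2id[lang]] * seq_len]


def predict_masked(text, lang, model_name="FacebookAI/xlm-mlm-enfr-1024", top_k=5):
    """Return the top_k candidate tokens for the first [MASK] in text.

    The model ID and the [MASK] placeholder are assumptions made for this sketch.
    """
    # Imported lazily so the pure-Python helper above works without these packages.
    import torch
    from transformers import XLMTokenizer, XLMWithLMHeadModel

    tokenizer = XLMTokenizer.from_pretrained(model_name)
    model = XLMWithLMHeadModel.from_pretrained(model_name)
    model.eval()

    # Swap the illustrative placeholder for the tokenizer's own mask token.
    text = text.replace("[MASK]", tokenizer.mask_token)
    input_ids = tokenizer(text, return_tensors="pt")["input_ids"]

    # One language ID per token position, looked up in the tokenizer's lang2id map.
    langs = torch.tensor(build_lang_ids(tokenizer.lang2id, lang, input_ids.shape[1]))

    with torch.no_grad():
        logits = model(input_ids, langs=langs).logits

    mask_positions = (input_ids[0] == tokenizer.mask_token_id).nonzero(as_tuple=True)[0]
    if len(mask_positions) == 0:
        raise ValueError("text must contain a [MASK] placeholder")

    top_ids = logits[0, mask_positions[0]].topk(top_k).indices.tolist()
    return tokenizer.convert_ids_to_tokens(top_ids)


# Example call (downloads the checkpoint on first use):
# predict_masked("Paris est la capitale de la [MASK] .", lang="fr")
```

The `langs` tensor has the same shape as `input_ids`, so every position carries the language ID of the text being processed. Passing `lang="en"` instead selects the English embedding.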

✨ Features

Cross-lingual Processing: Trained for English-French, enabling cross-lingual language tasks.
Masked Language Modeling: Can be used for masked language modeling tasks.

📚 Documentation

🔍 Model Details

The XLM model was proposed in Cross-lingual Language Model Pretraining by Guillaume Lample and Alexis Conneau. xlm-mlm-enfr-1024 is a Transformer pretrained using a masked language modeling (MLM) objective for English-French.

Model Description

Property	Details
Developed by	Guillaume Lample, Alexis Conneau, see associated paper
Model Type	Language model
Language(s) (NLP)	English-French
License	CC-BY-NC-4.0
Related Models	[xlm-clm-enfr-1024](https://huggingface.co/xlm-clm-enfr-1024), [xlm-clm-ende-1024](https://huggingface.co/xlm-clm-ende-1024), [xlm-mlm-ende-1024](https://huggingface.co/xlm-mlm-ende-1024), [xlm-mlm-enro-1024](https://huggingface.co/xlm-mlm-enro-1024)
Resources for more information	Associated paper, GitHub Repo, Hugging Face Multilingual Models for Inference docs

💼 Uses

Direct Use

The model is a language model and can be used for masked language modeling.

Downstream Use

To learn more about this task and potential downstream uses, see the Hugging Face [fill mask docs](https://huggingface.co/tasks/fill-mask) and the Hugging Face Multilingual Models for Inference docs.

Out-of-Scope Use

The model should not be used to intentionally create hostile or alienating environments for people.

⚠️ Bias, Risks, and Limitations

Significant research has explored bias and fairness issues with language models (see, e.g., [Sheng et al. (2021)](https://aclanthology.org/2021.acl-long.330.pdf) and Bender et al. (2021)).

Recommendations

Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model.

🏋️ Training

The model developers write:

In all experiments, we use a Transformer architecture with 1024 hidden units, 8 heads, GELU activations (Hendrycks and Gimpel, 2016), a dropout rate of 0.1 and learned positional embeddings. We train our models with the Adam optimizer (Kingma and Ba, 2014), a linear warm-up (Vaswani et al., 2017) and learning rates varying from 10^−4 to 5.10^−4.

See the associated paper for links, citations, and further details on the training data and training procedure.
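As a rough illustration of the linear warm-up mentioned in the quote, the sketch below ramps the learning rate linearly up to a peak value. The quote gives only the learning-rate range, so the `warmup_steps` value and the choice to hold the rate constant after warm-up are assumptions made for this example, not the authors' schedule.

```python
def learning_rate(step, peak_lr=1e-4, warmup_steps=4000):
    """Linear warm-up to peak_lr over warmup_steps optimizer steps.

    peak_lr=1e-4 is the lower end of the range quoted above; warmup_steps
    and the constant hold afterwards are hypothetical choices.
    """
    if step < warmup_steps:
        # Ramp from peak_lr / warmup_steps at step 0 up to peak_lr.
        return peak_lr * (step + 1) / warmup_steps
    return peak_lr


for step in (0, 1999, 3999, 10000):
    print(step, learning_rate(step))
```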

The model developers also write that:

If you use these models, you should use the same data preprocessing / BPE codes to preprocess your data.

See the associated [GitHub Repo](https://github.com/facebookresearch/XLM#ii-cross-lingual-language-model-pretraining-xlm) for further details.

📊 Evaluation

Testing Data, Factors & Metrics

The model developers evaluated the model on the WMT'14 English-French dataset using the [BLEU metric](https://huggingface.co/spaces/evaluate-metric/bleu). See the associated paper for further details on the testing data, factors and metrics.
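To give a feel for what BLEU measures, here is a simplified single-reference version: the geometric mean of clipped n-gram precisions (n = 1 to 4) multiplied by a brevity penalty. It is a sketch for intuition only. It uses whitespace tokenization and no smoothing, and it is not the evaluation script used by the model developers, so its scores are not comparable to the paper's.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all contiguous n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def simple_bleu(candidate, reference, max_n=4):
    """Single-reference BLEU with uniform weights and no smoothing."""
    cand, ref = candidate.split(), reference.split()
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts, ref_counts = ngrams(cand, n), ngrams(ref, n)
        # Clip each candidate n-gram count by its count in the reference.
        overlap = sum(min(count, ref_counts[gram]) for gram, count in cand_counts.items())
        total = sum(cand_counts.values())
        if overlap == 0 or total == 0:
            return 0.0
        log_precisions.append(math.log(overlap / total))
    # Penalize candidates that are shorter than the reference.
    brevity = 1.0 if len(cand) >= len(ref) else math.exp(1 - len(ref) / len(cand))
    return brevity * math.exp(sum(log_precisions) / max_n)


print(simple_bleu("le chat est sur le tapis", "le chat est sur le tapis"))
print(simple_bleu("a b c d e", "a b c d f"))
```

An exact match scores 1.0, and any candidate with no matching 4-gram scores 0.0 under this unsmoothed variant, which is why corpus-level BLEU is normally preferred for short sentences.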

Results

For xlm-mlm-enfr-1024 results, see Table 1 and Table 2 of the associated paper.

🌱 Environmental Impact

Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019).

Property	Details
Hardware Type	More information needed
Hours used	More information needed
Cloud Provider	More information needed
Compute Region	More information needed
Carbon Emitted	More information needed

🔧 Technical Specifications

The model developers write:

We implement all our models in PyTorch (Paszke et al., 2017), and train them on 64 Volta GPUs for the language modeling tasks, and 8 GPUs for the MT tasks. We use float16 operations to speed up training and to reduce the memory usage of our models.

See the associated paper for further details.
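The memory saving from float16 follows directly from storage size: a half-precision value takes 2 bytes instead of 4, at the cost of precision. The sketch below shows this with Python's standard `struct` module. The helper `param_memory_mib` and the parameter count passed to it are hypothetical and chosen only for illustration; they are not the size of this model.

```python
import struct


def param_memory_mib(n_params, fmt):
    """Memory in MiB needed to store n_params values in the given struct format."""
    return n_params * struct.calcsize(fmt) / (1024 ** 2)


half = struct.pack("<e", 0.1)    # IEEE 754 half precision (float16)
single = struct.pack("<f", 0.1)  # IEEE 754 single precision (float32)
print(len(half), len(single))

# Round-tripping through float16 loses precision relative to the Python float.
roundtrip = struct.unpack("<e", half)[0]
print(roundtrip)

# Illustrative parameter count only.
n_params = 100_000_000
print(param_memory_mib(n_params, "<f"), param_memory_mib(n_params, "<e"))
```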

📖 Citation

BibTeX:

@article{lample2019cross,
  title={Cross-lingual language model pretraining},
  author={Lample, Guillaume and Conneau, Alexis},
  journal={arXiv preprint arXiv:1901.07291},
  year={2019}
}

APA:

Lample, G., & Conneau, A. (2019). Cross-lingual language model pretraining. arXiv preprint arXiv:1901.07291.

👥 Model Card Authors

This model card was written by the team at Hugging Face.

📄 License

The model is licensed under CC-BY-NC-4.0.
