Xlm Clm Ende 1024

🚀 xlm-clm-ende-1024
This is a transformer model pretrained for English-German using a causal language modeling (CLM) objective. It can be used for causal language modeling tasks and provides a foundation for various downstream applications.
🚀 Quick Start
Use the following code to start using the model:
```python
import torch
from transformers import XLMTokenizer, XLMWithLMHeadModel

# Load the pretrained tokenizer and the model with its language modeling head
tokenizer = XLMTokenizer.from_pretrained("xlm-clm-ende-1024")
model = XLMWithLMHeadModel.from_pretrained("xlm-clm-ende-1024")

input_ids = torch.tensor([tokenizer.encode("Wikipedia was used to")])  # batch size of 1

# XLM expects one language ID per token; look up the ID for English
language_id = tokenizer.lang2id["en"]
langs = torch.tensor([language_id] * input_ids.shape[1])  # one ID per input token

# Reshape to (batch_size, sequence_length)
langs = langs.view(1, -1)  # shape [1, sequence_length], since the batch size is 1

outputs = model(input_ids, langs=langs)
```
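The forward pass above returns a score for every vocabulary entry at every input position. As a minimal sketch of what next-token prediction does with those scores, the snippet below picks the highest-scoring entry for one position using plain Python lists. The helper names `softmax` and `pick_next_token` and the toy scores are illustrative assumptions, not part of the `transformers` API.

```python
import math


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def pick_next_token(last_position_scores):
    """Greedy choice: return (index, probability) of the top-scoring entry."""
    probs = softmax(last_position_scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return best, probs[best]


# Toy scores over a hypothetical 4-entry vocabulary (not real model output).
token_id, prob = pick_next_token([0.1, 2.0, -1.0, 0.5])
```

With the model loaded above, the scores for the last position would presumably come from something like `outputs.logits[0, -1].tolist()`, and the chosen index could be mapped back to text with `tokenizer.decode`. Greedy selection is only one decoding strategy; sampling or beam search are common alternatives.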
✨ Features
- Direct Use: It is a language model that can be used for causal language modeling.
- Downstream Use: It can support various downstream tasks. For more details, refer to the Hugging Face Multilingual Models for Inference docs.
📚 Documentation
Model Details
The XLM model was proposed in Cross-lingual Language Model Pretraining by Guillaume Lample and Alexis Conneau. xlm-clm-ende-1024 is a transformer pretrained for English-German using a causal language modeling (CLM) objective (next token prediction).
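To make "next token prediction" concrete, the sketch below shows how a token sequence is typically turned into training pairs for a CLM objective: the token at each position is paired with the token that follows it, and the model conditions on everything up to that position. The function name `clm_pairs` and the token IDs are hypothetical, and this is a generic illustration rather than the authors' training code.

```python
def clm_pairs(token_ids):
    """Pair the token at each position with the token that follows it."""
    inputs = token_ids[:-1]   # everything except the last token
    targets = token_ids[1:]   # the same sequence shifted left by one
    return list(zip(inputs, targets))


# Hypothetical token IDs standing in for a tokenized sentence.
pairs = clm_pairs([5, 17, 42, 8])
# pairs == [(5, 17), (17, 42), (42, 8)]
```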
Property | Details |
---|---|
Developed by | Guillaume Lample, Alexis Conneau, see associated paper |
Model Type | Language model |
Language(s) (NLP) | English-German |
License | Unknown |
Related Models | [xlm-clm-enfr-1024](https://huggingface.co/xlm-clm-enfr-1024), [xlm-mlm-ende-1024](https://huggingface.co/xlm-mlm-ende-1024), [xlm-mlm-enfr-1024](https://huggingface.co/xlm-mlm-enfr-1024), [xlm-mlm-enro-1024](https://huggingface.co/xlm-mlm-enro-1024) |
Resources for more information | Associated paper, GitHub Repo, [Hugging Face Multilingual Models for Inference docs](https://huggingface.co/docs/transformers/v4.20.1/en/multilingual#xlm-with-language-embeddings) |
Uses
Direct Use
The model can be directly used for causal language modeling.
Downstream Use
To learn more about this task and potential downstream uses, see the [Hugging Face Multilingual Models for Inference](https://huggingface.co/docs/transformers/v4.20.1/en/multilingual#xlm-with-language-embeddings) docs.
Out-of-Scope Use
The model should not be used to intentionally create hostile or alienating environments for people.
Bias, Risks, and Limitations
Significant research has explored bias and fairness issues with language models (see, e.g., [Sheng et al. (2021)](https://aclanthology.org/2021.acl-long.330.pdf) and Bender et al. (2021)).
Recommendations
Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model.
Training
See the associated paper for details on the training data and training procedure.
Evaluation
Testing Data, Factors & Metrics
See the associated paper for details on the testing data, factors and metrics.
Results
For xlm-clm-ende-1024 results, see Table 2 of the associated paper.
Environmental Impact
Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019).
Property | Details |
---|---|
Hardware Type | More information needed |
Hours used | More information needed |
Cloud Provider | More information needed |
Compute Region | More information needed |
Carbon Emitted | More information needed |
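Since the fields above are not filled in, no figure can be given for this model. As a rough sketch of the kind of estimate such a calculator produces, emissions can be approximated as energy consumed multiplied by the carbon intensity of the electricity grid. The function below is a simplification under that assumption, not the calculator's exact method; its name and parameters are hypothetical.

```python
def estimate_co2_kg(gpu_power_watts, num_gpus, hours, grid_kg_co2_per_kwh):
    """Estimate emissions as energy used (kWh) times grid carbon intensity."""
    energy_kwh = gpu_power_watts * num_gpus * hours / 1000.0
    return energy_kwh * grid_kg_co2_per_kwh


# Placeholder inputs for illustration only; these are NOT measurements for this model.
example_kg = estimate_co2_kg(
    gpu_power_watts=300,      # assumed per-GPU power draw
    num_gpus=8,               # assumed GPU count
    hours=10,                 # assumed training time
    grid_kg_co2_per_kwh=0.4,  # assumed grid carbon intensity
)
```

A real estimate would need the actual hardware type, hours used, cloud provider, and compute region, which are exactly the values listed as missing in the table.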
Technical Specifications
The model developers write:
We implement all our models in PyTorch (Paszke et al., 2017), and train them on 64 Volta GPUs for the language modeling tasks, and 8 GPUs for the MT tasks. We use float16 operations to speed up training and to reduce the memory usage of our models.
See the associated paper for further details.
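The memory effect of float16 mentioned in the quote can be illustrated with simple arithmetic: a float16 value occupies 2 bytes where a float32 value occupies 4, so the raw storage for the same tensor is halved. The sketch below covers only that raw storage; actual training memory also depends on activations, optimizer state, and implementation details not described here. The function name and the matrix size are hypothetical.

```python
BYTES_PER_ELEMENT = {"float32": 4, "float16": 2}


def tensor_mebibytes(num_elements, dtype):
    """Raw storage for num_elements values of the given dtype, in MiB."""
    return num_elements * BYTES_PER_ELEMENT[dtype] / (1024 ** 2)


# A hypothetical 1024 x 1024 weight matrix.
n = 1024 * 1024
fp32_mib = tensor_mebibytes(n, "float32")  # 4.0
fp16_mib = tensor_mebibytes(n, "float16")  # 2.0
```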
Citation
BibTeX:
@article{lample2019cross,
title={Cross-lingual language model pretraining},
author={Lample, Guillaume and Conneau, Alexis},
journal={arXiv preprint arXiv:1901.07291},
year={2019}
}
APA:
- Lample, G., & Conneau, A. (2019). Cross-lingual language model pretraining. arXiv preprint arXiv:1901.07291.
Model Card Authors
This model card was written by the team at Hugging Face.

