xlm-clm-ende-1024 Open Source Model: English-German Bilingual Causal Language Model

Xlm Clm Ende 1024

Developed by FacebookAI

English-German bilingual Transformer model pretrained with causal language modeling objective

Large Language Model

Transformers

Supports Multiple Languages #English-German Bilingual #Causal Language Modeling #Cross-lingual Pretraining

Downloads 111

Release Time: 3/2/2022

Model Overview

This model is an English-German bilingual Transformer pretrained with the Causal Language Modeling (CLM) objective, primarily designed for language modeling tasks.

Model Features

Bilingual Support

Supports language modeling tasks in both English and German.

Causal Language Modeling

Pretrained with next-word prediction objective.

Cross-lingual Capabilities

Capable of handling cross-lingual text generation and understanding tasks.

Model Capabilities

Text Generation

Language Modeling

Cross-lingual Text Processing

Use Cases

Natural Language Processing

Text Generation

Generate coherent English or German text.

Language Model Fine-tuning

Use the pretrained model as a starting point and fine-tune it for downstream tasks.

🚀 xlm-clm-ende-1024

This is a Transformer model pretrained on English and German with a causal language modeling (CLM) objective. It can be used for causal language modeling tasks and provides a foundation for various downstream applications.

🚀 Quick Start

Use the following code to start using the model:

import torch
from transformers import XLMTokenizer, XLMWithLMHeadModel

tokenizer = XLMTokenizer.from_pretrained("xlm-clm-ende-1024")
model = XLMWithLMHeadModel.from_pretrained("xlm-clm-ende-1024")

input_ids = torch.tensor([tokenizer.encode("Wikipedia was used to")])  # batch size of 1

language_id = tokenizer.lang2id["en"]  # 0
langs = torch.tensor([language_id] * input_ids.shape[1])  # torch.tensor([0, 0, 0, ..., 0])

# We reshape it to be of size (batch_size, sequence_length)
langs = langs.view(1, -1)  # is now of shape [1, sequence_length] (we have a batch size of 1)

outputs = model(input_ids, langs=langs)
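
The `outputs` object returned above carries one row of vocabulary scores (logits) per input position; in a causal language model, the row at the last position scores candidates for the next token. The following is a minimal sketch of greedy selection from such scores, written in plain Python so it runs without downloading the model. The helper names `softmax` and `greedy_next_token` and the toy scores are illustrative assumptions, not part of the Transformers API.

```python
import math


def softmax(scores):
    """Convert raw scores (logits) into probabilities that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def greedy_next_token(logits):
    """Return (token_id, probability) of the most likely next token.

    `logits` is a list of per-position score lists with shape
    [sequence_length][vocab_size]; only the last position is used.
    """
    probs = softmax(logits[-1])
    token_id = max(range(len(probs)), key=probs.__getitem__)
    return token_id, probs[token_id]


# Toy scores: 3 positions over a 4-token vocabulary (hypothetical values).
toy_logits = [
    [0.1, 0.2, 0.3, 0.4],
    [1.0, 0.0, 0.0, 0.0],
    [0.5, 2.0, 0.1, -1.0],
]
next_id, next_prob = greedy_next_token(toy_logits)
print(next_id, round(next_prob, 3))
```

With the real model, the same idea would likely apply to `outputs.logits[0].tolist()` (assuming a Transformers release whose output object exposes `.logits`), and `tokenizer.decode([next_id])` would map the chosen id back to text.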

✨ Features

Direct Use: It is a language model that can be used for causal language modeling.
Downstream Use: It can support various downstream tasks. For more details, refer to the Hugging Face Multilingual Models for Inference docs.

📚 Documentation

Model Details

The XLM model was proposed in Cross-lingual Language Model Pretraining by Guillaume Lample and Alexis Conneau. xlm-clm-ende-1024 is a Transformer pretrained for English-German using a causal language modeling (CLM) objective (next-token prediction).

Property	Details
Developed by	Guillaume Lample, Alexis Conneau, see associated paper
Model Type	Language model
Language(s) (NLP)	English-German
License	Unknown
Related Models	[xlm-clm-enfr-1024](https://huggingface.co/xlm-clm-enfr-1024), [xlm-mlm-ende-1024](https://huggingface.co/xlm-mlm-ende-1024), [xlm-mlm-enfr-1024](https://huggingface.co/xlm-mlm-enfr-1024), [xlm-mlm-enro-1024](https://huggingface.co/xlm-mlm-enro-1024)
Resources for more information	Associated paper, GitHub Repo, [Hugging Face Multilingual Models for Inference docs](https://huggingface.co/docs/transformers/v4.20.1/en/multilingual#xlm-with-language-embeddings)
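
The causal language modeling objective named in the model details (next-token prediction) can be illustrated with a small worked example. This is a generic sketch of an averaged next-token cross-entropy, not the training code from the associated paper; the function name `clm_loss` and the toy sequences are assumptions made for illustration.

```python
import math


def clm_loss(logits, token_ids):
    """Average next-token cross-entropy for one sequence.

    logits[t] holds vocabulary scores produced after reading tokens 0..t,
    so it is scored against token_ids[t + 1] (the "shift by one").
    The scores at the final position have no target and are ignored.
    """
    losses = []
    for t in range(len(token_ids) - 1):
        scores = logits[t]
        peak = max(scores)
        # log of the softmax normalizer, computed stably
        log_norm = peak + math.log(sum(math.exp(s - peak) for s in scores))
        target = token_ids[t + 1]
        losses.append(log_norm - scores[target])  # equals -log p(target)
    return sum(losses) / len(losses)


# Hypothetical 3-token sequence over a 3-token vocabulary.
tokens = [0, 2, 1]

# Uniform scores: every token is equally likely at every step.
uniform = [[0.0, 0.0, 0.0] for _ in tokens]

# Scores that strongly favor the token that actually comes next.
confident = [[0.0, 0.0, 5.0], [0.0, 5.0, 0.0], [0.0, 0.0, 0.0]]

print(clm_loss(uniform, tokens))
print(clm_loss(confident, tokens))
```

Uniform scores give a loss of log(vocab_size), while scores that favor the correct next token drive the loss toward zero, which is the signal pretraining optimizes.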

Uses

Direct Use

The model can be directly used for causal language modeling.

Downstream Use

To learn more about this task and potential downstream uses, see the [Hugging Face Multilingual Models for Inference](https://huggingface.co/docs/transformers/v4.20.1/en/multilingual#xlm-with-language-embeddings) docs.

Out-of-Scope Use

The model should not be used to intentionally create hostile or alienating environments for people.

Bias, Risks, and Limitations

Significant research has explored bias and fairness issues with language models (see, e.g., [Sheng et al. (2021)](https://aclanthology.org/2021.acl-long.330.pdf) and Bender et al. (2021)).

Recommendations

Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model.

Training

See the associated paper for details on the training data and training procedure.

Evaluation

Testing Data, Factors & Metrics

See the associated paper for details on the testing data, factors and metrics.

Results

For xlm-clm-ende-1024 results, see Table 2 of the associated paper.

Environmental Impact

Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019).

Property	Details
Hardware Type	More information needed
Hours used	More information needed
Cloud Provider	More information needed
Compute Region	More information needed
Carbon Emitted	More information needed

Technical Specifications

The model developers write:

We implement all our models in PyTorch (Paszke et al., 2017), and train them on 64 Volta GPUs for the language modeling tasks, and 8 GPUs for the MT tasks. We use float16 operations to speed up training and to reduce the memory usage of our models.

See the associated paper for further details.
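
To give a sense of why float16 reduces memory usage, the sketch below compares the raw storage of the same number of values in half and single precision. It covers only bytes per stored value; total training memory also depends on activations, optimizer state, and other factors not described here. The value count is a hypothetical figure chosen for illustration and is not this model's parameter count.

```python
import struct

# Bytes per value, using the standard struct format codes
# 'e' (IEEE 754 half precision) and 'f' (single precision).
BYTES_FP16 = struct.calcsize("e")
BYTES_FP32 = struct.calcsize("f")


def storage_mib(num_values, bytes_per_value):
    """Size in MiB of a flat array of `num_values` numbers."""
    return num_values * bytes_per_value / (1024 ** 2)


# Hypothetical count, chosen only for illustration; not this model's size.
num_values = 100_000_000
print(storage_mib(num_values, BYTES_FP32))
print(storage_mib(num_values, BYTES_FP16))
```

Storing the same values in float16 takes half the space of float32, which is the saving the quoted passage refers to at the level of individual tensors.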

Citation

BibTeX:

@article{lample2019cross,
  title={Cross-lingual language model pretraining},
  author={Lample, Guillaume and Conneau, Alexis},
  journal={arXiv preprint arXiv:1901.07291},
  year={2019}
}

APA:

Lample, G., & Conneau, A. (2019). Cross-lingual language model pretraining. arXiv preprint arXiv:1901.07291.

Model Card Authors

This model card was written by the team at Hugging Face.
