Funnel Transformer xlarge model (B10-10-10 with decoder)
A model pretrained on English text with an ELECTRA-like objective, introduced in the Funnel-Transformer paper and first released in its official repository.
🚀 Quick Start
This is a model pretrained on English text using an objective similar to ELECTRA's. It was introduced in this paper and first released in this repository. The model is uncased: it makes no distinction between "english" and "English".
Disclaimer: The team releasing Funnel Transformer did not write a model card for this model, so this model card is written by the Hugging Face team.
✨ Features
Model description
Funnel Transformer is a transformer model pretrained on a large corpus of English data in a self-supervised fashion. It was pretrained on raw texts only, with no human labeling, using an automatic process to generate inputs and labels from those texts.
More precisely, a small language model corrupts the input texts and serves as a generator of inputs for this model. The pretraining objective is to predict which token is original and which one has been replaced, similar to GAN training.
This way, the model learns an inner representation of the English language, which can be used to extract features for downstream tasks. For example, if you have a dataset of labeled sentences, you can train a standard classifier using the features produced by the Funnel Transformer model as inputs.
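The corrupt-and-detect objective described above can be illustrated with a toy sketch. This is a hypothetical, assumption-level illustration only: the real setup uses a small trained language model as the generator, not random replacement, and this is not Funnel Transformer's actual training code.

```python
import random

def corrupt(tokens, vocab, replace_prob=0.3, rng=None):
    """Toy 'generator' stand-in: randomly replace some tokens and record
    which positions differ from the original (1 = replaced, 0 = original)."""
    rng = rng or random.Random(0)
    corrupted, labels = [], []
    for tok in tokens:
        if rng.random() < replace_prob:
            replacement = rng.choice(vocab)
            corrupted.append(replacement)
            labels.append(1 if replacement != tok else 0)
        else:
            corrupted.append(tok)
            labels.append(0)
    return corrupted, labels

tokens = ["the", "cat", "sat", "on", "the", "mat"]
vocab = ["the", "cat", "dog", "sat", "ran", "on", "mat", "rug"]
corrupted, labels = corrupt(tokens, vocab)
# The discriminator (the role Funnel Transformer plays during pretraining)
# is trained to predict `labels` from `corrupted`, i.e. to spot replacements.
```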
Intended uses & limitations
You can use the raw model to extract a vector representation of a given text, but it's mainly intended for fine-tuning on downstream tasks. Check the model hub for fine-tuned versions on tasks that interest you.
Note that this model is mainly for fine-tuning on tasks that use the whole sentence (potentially masked) for decision-making, such as sequence classification, token classification, or question answering. For text generation tasks, consider models like GPT2.
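As a sketch of the sequence-classification fine-tuning path, the example below builds a tiny randomly initialized Funnel model from a config so that it runs without downloading the multi-gigabyte xlarge checkpoint; in practice you would load the pretrained weights with FunnelForSequenceClassification.from_pretrained("funnel-transformer/xlarge"). The config values here are arbitrary small numbers chosen for illustration.

```python
import torch
from transformers import FunnelConfig, FunnelForSequenceClassification

# Tiny config so this runs quickly; real fine-tuning would use the
# pretrained "funnel-transformer/xlarge" checkpoint instead.
config = FunnelConfig(
    vocab_size=100, block_sizes=[2, 2], d_model=32,
    n_head=2, d_head=16, d_inner=64, num_labels=2,
)
model = FunnelForSequenceClassification(config)

input_ids = torch.randint(0, 100, (1, 8))      # one sequence of 8 token ids
outputs = model(input_ids=input_ids, labels=torch.tensor([1]))
# outputs.loss is the classification loss; outputs.logits has shape (1, 2)
```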
📦 Installation
No specific installation steps are provided in the original document.
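A typical setup, assuming pip and a recent Python, would install the Transformers library plus a backend (this is an inferred example, not from the original card; TensorFlow users would install tensorflow instead of torch):

```shell
pip install transformers torch
```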
💻 Usage Examples
Basic Usage
Here is how to use this model to get the features of a given text in PyTorch:
from transformers import FunnelTokenizer, FunnelModel

# Load the tokenizer and pretrained model
tokenizer = FunnelTokenizer.from_pretrained("funnel-transformer/xlarge")
model = FunnelModel.from_pretrained("funnel-transformer/xlarge")

text = "Replace me by any text you'd like."
encoded_input = tokenizer(text, return_tensors='pt')
output = model(**encoded_input)  # output.last_hidden_state holds the features
Advanced Usage
Here is how to use this model to get the features of a given text in TensorFlow:
from transformers import FunnelTokenizer, TFFunnelModel

# Load the tokenizer and pretrained model
tokenizer = FunnelTokenizer.from_pretrained("funnel-transformer/xlarge")
model = TFFunnelModel.from_pretrained("funnel-transformer/xlarge")

text = "Replace me by any text you'd like."
encoded_input = tokenizer(text, return_tensors='tf')
output = model(encoded_input)  # output.last_hidden_state holds the features
📚 Documentation
Training data
The Funnel Transformer model was pretrained on BookCorpus (a corpus of unpublished books), English Wikipedia (excluding lists, tables, and headers), ClueWeb, GigaWord, and Common Crawl.
BibTeX entry and citation info
@misc{dai2020funneltransformer,
title={Funnel-Transformer: Filtering out Sequential Redundancy for Efficient Language Processing},
author={Zihang Dai and Guokun Lai and Yiming Yang and Quoc V. Le},
year={2020},
eprint={2006.03236},
archivePrefix={arXiv},
primaryClass={cs.LG}
}
📄 License
This model is licensed under the Apache 2.0 license.