Funnel Transformer small model (B4-4-4 with decoder)
This is a pretrained model on the English language, using an objective similar to ELECTRA's. It was introduced in this paper and first released in this repository. This model is uncased: it does not distinguish between "english" and "English".
Disclaimer: The team releasing Funnel Transformer did not write a model card for this model, so this model card has been written by the Hugging Face team.
Features
Funnel Transformer is a transformers model pretrained on a large corpus of English data in a self-supervised fashion. A small language model corrupts the input texts and serves as a generator of inputs for this model. The pretraining objective is to predict which tokens are original and which have been replaced, similar to the discriminator in GAN training. This way, the model learns an inner representation of the English language that can be used to extract features for downstream tasks.
Documentation
Model description
Funnel Transformer is pretrained on a large English corpus in a self-supervised manner: it was trained on raw texts without human labeling, using an automatic process to generate inputs and labels from those texts. A small language model corrupts the input texts and acts as an input generator for this model. The pretraining goal is to distinguish original tokens from replaced ones, similar to the discriminator in GAN training. The learned inner representation of the English language can then be used to extract features for downstream tasks.
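The replaced-token-detection objective can be illustrated with a toy example (this is purely didactic, not the actual training code):

```python
# Toy illustration of the ELECTRA-style pretraining objective: a small
# generator model replaces some tokens, and the main model is trained to
# label each position as original (0) or replaced (1).
original = ["the", "chef", "cooked", "the", "meal"]
corrupted = ["the", "chef", "ate", "the", "meal"]  # generator swapped "cooked"

# Target labels for the discriminator: 1 wherever a token was replaced.
labels = [int(o != c) for o, c in zip(original, corrupted)]
print(labels)  # [0, 0, 1, 0, 0]
```

In real pretraining the replacements come from a trained generator rather than a hand-written list, so the corrupted tokens are plausible in context and the task is genuinely hard.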
Intended uses & limitations
You can use the raw model to extract a vector representation of a given text, but it is mostly intended for fine-tuning on a downstream task. Check the model hub for fine-tuned versions. Note that this model is primarily aimed at tasks that use the whole sentence (potentially masked) to make decisions, such as sequence classification, token classification, or question answering. For text generation, consider models like GPT2.
Usage Examples
Basic Usage
Here is how to use this model to get the features of a given text in PyTorch:
from transformers import FunnelTokenizer, FunnelModel
tokenizer = FunnelTokenizer.from_pretrained("funnel-transformer/small")
model = FunnelModel.from_pretrained("funnel-transformer/small")
text = "Replace me by any text you'd like."
encoded_input = tokenizer(text, return_tensors='pt')
output = model(**encoded_input)
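The model output's `last_hidden_state` has shape (batch, sequence length, hidden size). A common way to turn it into a single sentence vector is masked mean pooling; the sketch below uses dummy tensors as stand-ins for real model outputs, and assumes a hidden size of 768 (verify against `model.config.hidden_size` for your checkpoint):

```python
import torch

# Stand-ins for a real forward pass: hidden states for a 7-token sequence
# and an attention mask where 1 marks real tokens and 0 marks padding.
hidden = torch.randn(1, 7, 768)               # like output.last_hidden_state
mask = torch.tensor([[1, 1, 1, 1, 1, 0, 0]])  # like encoded_input["attention_mask"]

# Broadcast the mask over the hidden dimension, then average only over
# the non-padding positions.
mask_f = mask.unsqueeze(-1).float()           # (1, 7, 1)
embedding = (hidden * mask_f).sum(dim=1) / mask_f.sum(dim=1)
print(embedding.shape)  # torch.Size([1, 768])
```

Mean pooling with the attention mask avoids letting padding positions dilute the sentence vector.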
Advanced Usage
Here is the usage in TensorFlow:
from transformers import FunnelTokenizer, TFFunnelModel
tokenizer = FunnelTokenizer.from_pretrained("funnel-transformer/small")
model = TFFunnelModel.from_pretrained("funnel-transformer/small")
text = "Replace me by any text you'd like."
encoded_input = tokenizer(text, return_tensors='tf')
output = model(encoded_input)
License
This model is released under the Apache 2.0 license.
Training Data
The Funnel Transformer model was pretrained on the following datasets:
| Property | Details |
|----------|---------|
| Training Data | BookCorpus (a dataset of 11,038 unpublished books); English Wikipedia (excluding lists, tables and headers); ClueWeb (a dataset of 733,019,372 English web pages); GigaWord (an archive of newswire text data); Common Crawl (a dataset of raw web pages) |
BibTeX entry and citation info
@misc{dai2020funneltransformer,
title={Funnel-Transformer: Filtering out Sequential Redundancy for Efficient Language Processing},
author={Zihang Dai and Guokun Lai and Yiming Yang and Quoc V. Le},
year={2020},
eprint={2006.03236},
archivePrefix={arXiv},
primaryClass={cs.LG}
}