Funnel Transformer small model (B4-4-4 with decoder)
This is a pretrained model on the English language, using an objective similar to ELECTRA's. It was introduced in this paper and first released in this repository. This model is uncased: it does not distinguish between "english" and "English".
Disclaimer: The team releasing Funnel Transformer did not write a model card for this model, so this model card has been written by the Hugging Face team.
Features
Funnel Transformer is a transformers model pretrained on a large corpus of English data in a self-supervised fashion. A small language model corrupts the input texts and serves as a generator of inputs for this model. The pretraining objective is to predict which tokens are original and which have been replaced, similar to the discriminator in GAN training. This way, the model learns an inner representation of the English language that can be used to extract features for downstream tasks.
Documentation
Model description
Funnel Transformer is pretrained on a large English corpus in a self-supervised manner: it was trained on raw texts without human labeling, using an automatic process to generate inputs and labels from those texts. A small language model corrupts the input texts and acts as an input generator for this model. The pretraining goal is to distinguish original tokens from replaced ones, similar to the discriminator in GAN training. The learned inner representation of the English language can then be used to extract features for downstream tasks.
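The replaced-token-detection objective can be illustrated with a toy example (this is purely didactic, not the actual training code):

```python
# Toy illustration of the ELECTRA-style pretraining objective: a small
# generator model replaces some tokens, and the main model is trained to
# label each position as original (0) or replaced (1).
original = ["the", "chef", "cooked", "the", "meal"]
corrupted = ["the", "chef", "ate", "the", "meal"]  # generator swapped "cooked"

# Target labels for the discriminator: 1 wherever a token was replaced.
labels = [int(o != c) for o, c in zip(original, corrupted)]
print(labels)  # [0, 0, 1, 0, 0]
```

In real pretraining the replacements come from a trained generator rather than a hand-written list, so the corrupted tokens are plausible in context and the task is genuinely hard.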
Intended uses & limitations
You can use the raw model to extract a vector representation of a given text, but it is mostly intended for fine-tuning on a downstream task. Check the model hub for fine-tuned versions. Note that this model is primarily aimed at tasks that use the whole sentence (potentially masked) to make decisions, such as sequence classification, token classification, or question answering. For text generation, consider models like GPT2.
Usage Examples
Basic Usage
Here is how to use this model to get the features of a given text in PyTorch:
from transformers import FunnelTokenizer, FunnelModel
tokenizer = FunnelTokenizer.from_pretrained("funnel-transformer/small")
model = FunnelModel.from_pretrained("funnel-transformer/small")
text = "Replace me by any text you'd like."
encoded_input = tokenizer(text, return_tensors='pt')
output = model(**encoded_input)
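The model output's `last_hidden_state` has shape (batch, sequence length, hidden size). A common way to turn it into a single sentence vector is masked mean pooling; the sketch below uses dummy tensors as stand-ins for real model outputs, and assumes a hidden size of 768 (verify against `model.config.hidden_size` for your checkpoint):

```python
import torch

# Stand-ins for a real forward pass: hidden states for a 7-token sequence
# and an attention mask where 1 marks real tokens and 0 marks padding.
hidden = torch.randn(1, 7, 768)               # like output.last_hidden_state
mask = torch.tensor([[1, 1, 1, 1, 1, 0, 0]])  # like encoded_input["attention_mask"]

# Broadcast the mask over the hidden dimension, then average only over
# the non-padding positions.
mask_f = mask.unsqueeze(-1).float()           # (1, 7, 1)
embedding = (hidden * mask_f).sum(dim=1) / mask_f.sum(dim=1)
print(embedding.shape)  # torch.Size([1, 768])
```

Mean pooling with the attention mask avoids letting padding positions dilute the sentence vector.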
Advanced Usage
Here is the usage in TensorFlow:
from transformers import FunnelTokenizer, TFFunnelModel
tokenizer = FunnelTokenizer.from_pretrained("funnel-transformer/small")
model = TFFunnelModel.from_pretrained("funnel-transformer/small")
text = "Replace me by any text you'd like."
encoded_input = tokenizer(text, return_tensors='tf')
output = model(encoded_input)
License
This model is released under the Apache 2.0 license.
Training Data
The Funnel Transformer model was pretrained on the following datasets:
| Property | Details |
|----------|---------|
| Training Data | BookCorpus (a dataset of 11,038 unpublished books); English Wikipedia (excluding lists, tables and headers); ClueWeb (a dataset of 733,019,372 English web pages); GigaWord (an archive of newswire text data); Common Crawl (a dataset of raw web pages) |
BibTeX entry and citation info
@misc{dai2020funneltransformer,
title={Funnel-Transformer: Filtering out Sequential Redundancy for Efficient Language Processing},
author={Zihang Dai and Guokun Lai and Yiming Yang and Quoc V. Le},
year={2020},
eprint={2006.03236},
archivePrefix={arXiv},
primaryClass={cs.LG}
}