Funnel Transformer xlarge model (B10-10-10 with decoder)
A model pretrained on English text with an ELECTRA-like objective, introduced in the Funnel-Transformer paper and first released in its official repository.
🚀 Quick Start
This is a model pretrained on English text using an objective similar to ELECTRA's. It was introduced in this paper and first released in this repository. The model is uncased: it makes no distinction between "english" and "English".
Disclaimer: The team releasing Funnel Transformer did not write a model card for this model, so this model card is written by the Hugging Face team.
✨ Features
Model description
Funnel Transformer is a transformer model pretrained on a large corpus of English data in a self-supervised fashion. It was pretrained on raw texts only, with no human labeling, using an automatic process to generate inputs and labels from those texts.
More precisely, a small language model corrupts the input texts and serves as a generator of inputs for this model. The pretraining objective is to predict which token is original and which one has been replaced, similar to GAN training.
This way, the model learns an inner representation of the English language, which can be used to extract features for downstream tasks. For example, if you have a dataset of labeled sentences, you can train a standard classifier using the features produced by the Funnel Transformer model as inputs.
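The corrupt-and-detect objective described above can be illustrated with a toy sketch. This is a hypothetical, assumption-level illustration only: the real setup uses a small trained language model as the generator, not random replacement, and this is not Funnel Transformer's actual training code.

```python
import random

def corrupt(tokens, vocab, replace_prob=0.3, rng=None):
    """Toy 'generator' stand-in: randomly replace some tokens and record
    which positions differ from the original (1 = replaced, 0 = original)."""
    rng = rng or random.Random(0)
    corrupted, labels = [], []
    for tok in tokens:
        if rng.random() < replace_prob:
            replacement = rng.choice(vocab)
            corrupted.append(replacement)
            labels.append(1 if replacement != tok else 0)
        else:
            corrupted.append(tok)
            labels.append(0)
    return corrupted, labels

tokens = ["the", "cat", "sat", "on", "the", "mat"]
vocab = ["the", "cat", "dog", "sat", "ran", "on", "mat", "rug"]
corrupted, labels = corrupt(tokens, vocab)
# The discriminator (the role Funnel Transformer plays during pretraining)
# is trained to predict `labels` from `corrupted`, i.e. to spot replacements.
```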
Intended uses & limitations
You can use the raw model to extract a vector representation of a given text, but it's mainly intended for fine-tuning on downstream tasks. Check the model hub for fine-tuned versions on tasks that interest you.
Note that this model is mainly for fine-tuning on tasks that use the whole sentence (potentially masked) for decision-making, such as sequence classification, token classification, or question answering. For text generation tasks, consider models like GPT2.
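As a sketch of the sequence-classification fine-tuning path, the example below builds a tiny randomly initialized Funnel model from a config so that it runs without downloading the multi-gigabyte xlarge checkpoint; in practice you would load the pretrained weights with FunnelForSequenceClassification.from_pretrained("funnel-transformer/xlarge"). The config values here are arbitrary small numbers chosen for illustration.

```python
import torch
from transformers import FunnelConfig, FunnelForSequenceClassification

# Tiny config so this runs quickly; real fine-tuning would use the
# pretrained "funnel-transformer/xlarge" checkpoint instead.
config = FunnelConfig(
    vocab_size=100, block_sizes=[2, 2], d_model=32,
    n_head=2, d_head=16, d_inner=64, num_labels=2,
)
model = FunnelForSequenceClassification(config)

input_ids = torch.randint(0, 100, (1, 8))      # one sequence of 8 token ids
outputs = model(input_ids=input_ids, labels=torch.tensor([1]))
# outputs.loss is the classification loss; outputs.logits has shape (1, 2)
```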
📦 Installation
No specific installation steps are provided in the original document.
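A typical setup, assuming pip and a recent Python, would install the Transformers library plus a backend (this is an inferred example, not from the original card; TensorFlow users would install tensorflow instead of torch):

```shell
pip install transformers torch
```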
💻 Usage Examples
Basic Usage
Here is how to use this model to get the features of a given text in PyTorch:
from transformers import FunnelTokenizer, FunnelModel

# Load the tokenizer and pretrained model
tokenizer = FunnelTokenizer.from_pretrained("funnel-transformer/xlarge")
model = FunnelModel.from_pretrained("funnel-transformer/xlarge")

text = "Replace me by any text you'd like."
encoded_input = tokenizer(text, return_tensors='pt')
output = model(**encoded_input)  # output.last_hidden_state holds the features
Advanced Usage
Here is how to use this model to get the features of a given text in TensorFlow:
from transformers import FunnelTokenizer, TFFunnelModel

# Load the tokenizer and pretrained model
tokenizer = FunnelTokenizer.from_pretrained("funnel-transformer/xlarge")
model = TFFunnelModel.from_pretrained("funnel-transformer/xlarge")

text = "Replace me by any text you'd like."
encoded_input = tokenizer(text, return_tensors='tf')
output = model(encoded_input)  # output.last_hidden_state holds the features
📚 Documentation
Training data
The Funnel Transformer model was pretrained on BookCorpus (a corpus of unpublished books), English Wikipedia (excluding lists, tables, and headers), ClueWeb, GigaWord, and Common Crawl.
BibTeX entry and citation info
@misc{dai2020funneltransformer,
title={Funnel-Transformer: Filtering out Sequential Redundancy for Efficient Language Processing},
author={Zihang Dai and Guokun Lai and Yiming Yang and Quoc V. Le},
year={2020},
eprint={2006.03236},
archivePrefix={arXiv},
primaryClass={cs.LG}
}
📄 License
This model is licensed under the Apache 2.0 license.