Funnel Transformer intermediate model (B6-6-6 without decoder)
A model pretrained on English, similar to ELECTRA, useful for extracting features for downstream tasks.
Quick Start
This model is pretrained on English using an objective similar to ELECTRA's. It was introduced in this paper and first released in this repository. The model is uncased: it treats "english" and "English" the same.
Disclaimer: The team releasing Funnel Transformer did not write a model card for this model, so this model card has been written by the Hugging Face team.
Features
- Pretrained on a large corpus of English data in a self-supervised fashion.
- Learns an inner representation of the English language for downstream tasks.
- Outputs hidden states with a sequence length of one-fourth of the inputs (without decoder).
Documentation
Model description
Funnel Transformer is a transformers model pretrained on a large corpus of English data in a self-supervised manner. It was pretrained on raw texts only, with no human labeling, using an automatic process to generate inputs and labels from those texts.
Specifically, a small language model corrupts the input texts by replacing some tokens, and the pretraining objective is to predict which tokens are original and which were replaced, somewhat like the discriminator in GAN training.
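As a minimal, purely illustrative sketch of this corruption scheme (the token list, replacement vocabulary, and corruption probability below are invented for the example, not taken from the actual pretraining setup):

```python
import random

random.seed(0)

tokens = ["the", "cat", "sat", "on", "the", "mat"]
replacements = ["dog", "ran", "hat", "a"]  # stand-ins for generator outputs
corrupt_prob = 0.15  # assumed corruption rate for this sketch

corrupted, labels = [], []
for tok in tokens:
    if random.random() < corrupt_prob:
        # the small generator model swaps in a plausible token
        corrupted.append(random.choice(replacements))
        labels.append(1)  # token was replaced
    else:
        corrupted.append(tok)
        labels.append(0)  # token is original

# The pretrained model is trained to recover `labels` from `corrupted`.
```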
The model learns an inner representation of English, which can be used to extract features for downstream tasks. For example, you can train a standard classifier using the features produced by this model as inputs.
Note: This model does not contain the decoder, so it outputs hidden states with a sequence length of one-fourth of the inputs. It is suitable for tasks that only need a summary of the sentence (like sentence classification), but not for tasks that need one output per input token; in that case, use the intermediate model, which includes the decoder.
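The one-fourth figure comes from the encoder pooling the sequence between its blocks. A rough back-of-the-envelope calculation, assuming the B6-6-6 architecture's three blocks with stride-2 pooling between consecutive blocks, looks like this:

```python
import math

def funnel_output_length(n: int, num_blocks: int = 3, stride: int = 2) -> int:
    """Sequence length after the encoder, assuming stride-`stride`
    pooling between each pair of consecutive blocks."""
    for _ in range(num_blocks - 1):
        n = math.ceil(n / stride)
    return n

print(funnel_output_length(512))  # 128, i.e. one-fourth of the input
```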
Intended uses & limitations
You can use the raw model to extract vector representations of text, but it's mainly for fine-tuning on downstream tasks. Check the model hub for fine-tuned versions.
This model is mainly intended for fine-tuning on tasks that use the whole sentence, such as sequence classification, token classification, or question answering. For text generation, consider models like GPT-2 instead.
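To illustrate how the pooled hidden states could feed a sentence classifier, here is a framework-free sketch using random numbers in place of real model outputs (the shapes assume a hidden size of 768 and a pooled sequence of 128 tokens; the linear head is an illustration, not the card's recommended recipe):

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for FunnelBaseModel hidden states: (batch, seq_len / 4, hidden).
hidden_states = rng.standard_normal((2, 128, 768))

# Mean-pool over the (already shortened) sequence to get sentence features.
features = hidden_states.mean(axis=1)   # shape (2, 768)

# A randomly initialised linear head for 2 classes (illustrative only).
weights = rng.standard_normal((768, 2))
logits = features @ weights             # shape (2, 2)
```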
How to use
Usage Examples
Basic Usage
```python
# PyTorch
from transformers import FunnelTokenizer, FunnelBaseModel

tokenizer = FunnelTokenizer.from_pretrained("funnel-transformer/intermediate-base")
model = FunnelBaseModel.from_pretrained("funnel-transformer/intermediate-base")

text = "Replace me by any text you'd like."
encoded_input = tokenizer(text, return_tensors="pt")
output = model(**encoded_input)
```
```python
# TensorFlow
from transformers import FunnelTokenizer, TFFunnelBaseModel

tokenizer = FunnelTokenizer.from_pretrained("funnel-transformer/intermediate-base")
model = TFFunnelBaseModel.from_pretrained("funnel-transformer/intermediate-base")

text = "Replace me by any text you'd like."
encoded_input = tokenizer(text, return_tensors="tf")
output = model(encoded_input)
```
Training data
The model was pretrained on:
- BookCorpus, a dataset of unpublished books
- English Wikipedia (excluding lists, tables, and headers)
- Clue Web, a dataset of English web pages
- GigaWord, an archive of newswire text data
- Common Crawl, a dataset of raw web pages
BibTeX entry and citation info
@misc{dai2020funneltransformer,
title={Funnel-Transformer: Filtering out Sequential Redundancy for Efficient Language Processing},
author={Zihang Dai and Guokun Lai and Yiming Yang and Quoc V. Le},
year={2020},
eprint={2006.03236},
archivePrefix={arXiv},
primaryClass={cs.LG}
}
License
This model is licensed under the Apache 2.0 license.