# 🚀 bart-large-mnli
This is a checkpoint of bart-large fine-tuned on the MultiNLI (MNLI) dataset. It can be used for zero-shot text classification.
## ✨ Features
- Trained on MNLI: Fine-tuned on the MultiNLI (MNLI) dataset, a large-scale natural language inference corpus.
- Zero-shot classification: Can classify sequences into any set of candidate labels you specify, without additional training.
## 📦 Installation
The model is used through the `transformers` library, so install it first:

```bash
pip install transformers
```
## 💻 Usage Examples

### Basic Usage

#### With the zero-shot classification pipeline
The model can be loaded with the `zero-shot-classification` pipeline like so:

```python
from transformers import pipeline

classifier = pipeline("zero-shot-classification",
                      model="facebook/bart-large-mnli")
```
You can then use this pipeline to classify sequences into any of the class names you specify.

```python
sequence_to_classify = "one day I will see the world"
candidate_labels = ['travel', 'cooking', 'dancing']
classifier(sequence_to_classify, candidate_labels)
```
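The pipeline returns a dict containing the input sequence, the candidate labels sorted from most to least likely, and the corresponding scores (in this default single-label mode the scores are a softmax over the labels, so they sum to 1). The numbers below are illustrative placeholders, not actual model output:

```python
# Illustrative result structure (scores are made-up placeholders):
# {'sequence': 'one day I will see the world',
#  'labels': ['travel', 'dancing', 'cooking'],
#  'scores': [0.98, 0.01, 0.01]}
```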
If more than one candidate label can be correct, pass `multi_label=True` to calculate each class independently:

```python
candidate_labels = ['travel', 'cooking', 'dancing', 'exploration']
classifier(sequence_to_classify, candidate_labels, multi_label=True)
```
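With `multi_label=True`, each label is scored on its own as an entailment-vs-contradiction decision, so the scores no longer sum to 1 and several labels can receive high scores at once (for the sequence above, both 'travel' and 'exploration' could plausibly score highly).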
### Advanced Usage

#### With manual PyTorch
```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
nli_model = AutoModelForSequenceClassification.from_pretrained('facebook/bart-large-mnli').to(device)
tokenizer = AutoTokenizer.from_pretrained('facebook/bart-large-mnli')

# pose the sequence as the NLI premise and a candidate label as the hypothesis
premise = "one day I will see the world"
label = 'travel'
hypothesis = f'This example is {label}.'

# run the premise/hypothesis pair through the MNLI-trained model
x = tokenizer.encode(premise, hypothesis, return_tensors='pt',
                     truncation='only_first')
logits = nli_model(x.to(device))[0]

# drop "neutral" (index 1) and treat the probability of "entailment" (index 2)
# as the probability that the label is true for the sequence
entail_contradiction_logits = logits[:, [0, 2]]
probs = entail_contradiction_logits.softmax(dim=1)
prob_label_is_true = probs[:, 1]
```
## 📚 Documentation

### NLI-based Zero Shot Text Classification
Yin et al. proposed a method for using pre-trained NLI models as ready-made zero-shot sequence classifiers. The method works by posing the sequence to be classified as the NLI premise and constructing a hypothesis from each candidate label. For example, if we want to evaluate whether a sequence belongs to the class "politics", we could construct the hypothesis `This text is about politics.`. The probabilities for entailment and contradiction are then converted to label probabilities.
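As a sketch of how this framing maps onto the pipeline above, the hypothesis wording can be controlled through the pipeline's `hypothesis_template` argument; the sequence and labels below are illustrative:

```python
from transformers import pipeline

classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")

# Each candidate label is slotted into the template to form an NLI hypothesis
# (e.g. "This text is about politics."); the sequence itself is the premise.
classifier(
    "The parliament passed the new budget after a long debate.",  # illustrative sequence
    candidate_labels=["politics", "sports", "technology"],
    hypothesis_template="This text is about {}.",
)
```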
This method is surprisingly effective in many cases, particularly when used with larger pre-trained models like BART and RoBERTa. See this blog post for a more expansive introduction to this and other zero-shot methods.
## 📄 License
This project is licensed under the MIT license.