# 🚀 bart-large-mnli
This is a checkpoint of bart-large fine-tuned on the MultiNLI (MNLI) dataset. It can be used for zero-shot text classification.
## ✨ Features
- Trained on MNLI: Fine-tuned on the MultiNLI (MNLI) dataset, a large-scale natural language inference corpus.
- Zero-shot classification: Can classify sequences into any set of candidate labels you specify, without additional training.
## 📦 Installation
The model is used through the `transformers` library, so install it first:

```bash
pip install transformers
```
## 💻 Usage Examples

### Basic Usage

#### With the zero-shot classification pipeline
The model can be loaded with the `zero-shot-classification` pipeline like so:

```python
from transformers import pipeline

classifier = pipeline("zero-shot-classification",
                      model="facebook/bart-large-mnli")
```
You can then use this pipeline to classify sequences into any of the class names you specify.

```python
sequence_to_classify = "one day I will see the world"
candidate_labels = ['travel', 'cooking', 'dancing']
classifier(sequence_to_classify, candidate_labels)
```
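The pipeline returns a dict containing the input sequence, the candidate labels sorted from most to least likely, and the corresponding scores (in this default single-label mode the scores are a softmax over the labels, so they sum to 1). The numbers below are illustrative placeholders, not actual model output:

```python
# Illustrative result structure (scores are made-up placeholders):
# {'sequence': 'one day I will see the world',
#  'labels': ['travel', 'dancing', 'cooking'],
#  'scores': [0.98, 0.01, 0.01]}
```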
If more than one candidate label can be correct, pass `multi_label=True` to calculate each class independently:

```python
candidate_labels = ['travel', 'cooking', 'dancing', 'exploration']
classifier(sequence_to_classify, candidate_labels, multi_label=True)
```
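With `multi_label=True`, each label is scored on its own as an entailment-vs-contradiction decision, so the scores no longer sum to 1 and several labels can receive high scores at once (for the sequence above, both 'travel' and 'exploration' could plausibly score highly).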
### Advanced Usage

#### With manual PyTorch
```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
nli_model = AutoModelForSequenceClassification.from_pretrained('facebook/bart-large-mnli').to(device)
tokenizer = AutoTokenizer.from_pretrained('facebook/bart-large-mnli')

# pose the sequence as the NLI premise and a candidate label as the hypothesis
premise = "one day I will see the world"
label = 'travel'
hypothesis = f'This example is {label}.'

# run the premise/hypothesis pair through the MNLI-trained model
x = tokenizer.encode(premise, hypothesis, return_tensors='pt',
                     truncation='only_first')
logits = nli_model(x.to(device))[0]

# drop "neutral" (index 1) and treat the probability of "entailment" (index 2)
# as the probability that the label is true for the sequence
entail_contradiction_logits = logits[:, [0, 2]]
probs = entail_contradiction_logits.softmax(dim=1)
prob_label_is_true = probs[:, 1]
```
## 📚 Documentation

### NLI-based Zero Shot Text Classification
Yin et al. proposed a method for using pre-trained NLI models as ready-made zero-shot sequence classifiers. The method works by posing the sequence to be classified as the NLI premise and constructing a hypothesis from each candidate label. For example, if we want to evaluate whether a sequence belongs to the class "politics", we could construct the hypothesis `This text is about politics.`. The probabilities for entailment and contradiction are then converted to label probabilities.
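As a sketch of how this framing maps onto the pipeline above, the hypothesis wording can be controlled through the pipeline's `hypothesis_template` argument; the sequence and labels below are illustrative:

```python
from transformers import pipeline

classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")

# Each candidate label is slotted into the template to form an NLI hypothesis
# (e.g. "This text is about politics."); the sequence itself is the premise.
classifier(
    "The parliament passed the new budget after a long debate.",  # illustrative sequence
    candidate_labels=["politics", "sports", "technology"],
    hypothesis_template="This text is about {}.",
)
```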
This method is surprisingly effective in many cases, particularly when used with larger pre-trained models like BART and RoBERTa. See this blog post for a more expansive introduction to this and other zero-shot methods.
## 📄 License
This project is licensed under the MIT license.