e5-base-v2-mnli-anli Open-source Model - Free Deployment to Aid Zero-shot Classification and Natural Language Inference

E5 Base V2 Mnli Anli

Developed by mjwong

This model is a fine-tuned version of intfloat/e5-base-v2 on the GLUE (MNLI) and ANLI datasets, suitable for zero-shot classification and natural language inference tasks.

Text Classification

Transformers

English

Open Source License: MIT

#Zero-shot classification #Natural language inference #Multi-turn dialogue understanding

Downloads 6,598

Release Time: 7/23/2023

Model Overview

A sequence classification model built on the E5 text embedding model (which was produced by weakly supervised contrastive pre-training) and fine-tuned for natural language inference and zero-shot classification tasks.

Model Features

Zero-shot classification capability

Supports text classification without task-specific training

Natural language inference

Can determine the logical relationship between two sentences (entailment/neutral/contradiction)

Multi-dataset fine-tuning

Fine-tuned on GLUE (MNLI) and ANLI datasets to enhance reasoning capabilities

Model Capabilities

Text classification

Natural language inference

Zero-shot learning

Use Cases

Text analysis

Sentiment classification

Classify text sentiment without training

Topic classification

Identify the topic category of text

Logical reasoning

Text consistency judgment

Determine the logical relationship between two sentences

Reaches 0.812 / 0.809 accuracy on the MNLI dev sets (matched / mismatched) and 0.557 / 0.460 / 0.448 on the ANLI test sets (rounds 1 to 3)

🚀 e5-base-v2-mnli-anli

This model is a fine-tuned version of intfloat/e5-base-v2 for zero-shot classification, trained on the GLUE (MNLI) and ANLI datasets.

✨ Features

Based on the pre-trained model intfloat/e5-base-v2, fine-tuned on the GLUE (MNLI) and ANLI datasets.
Suitable for zero-shot classification tasks and can also be used for NLI tasks.


💻 Usage Examples

Basic Usage

With the zero-shot classification pipeline

The model can be loaded with the zero-shot classification pipeline like so:

from transformers import pipeline
classifier = pipeline("zero-shot-classification",
                      model="mjwong/e5-base-v2-mnli-anli")

You can then use this pipeline to classify sequences into any of the class names you specify.

sequence_to_classify = "one day I will see the world"
candidate_labels = ['travel', 'cooking', 'dancing']
classifier(sequence_to_classify, candidate_labels)
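
To make the mechanics a little more concrete: the zero-shot pipeline turns each candidate label into an NLI hypothesis (the default template in Transformers is "This example is {}.") and returns a dictionary with the input `sequence`, the `labels` sorted from most to least likely, and the matching `scores`. The sketch below illustrates both ideas without loading the model. The helper names `build_nli_pairs` and `top_label` are hypothetical, and the scores are made up for illustration, not actual model output.

```python
# Illustrative helpers; the names are hypothetical, not part of Transformers.

DEFAULT_TEMPLATE = "This example is {}."  # default hypothesis template of the pipeline


def build_nli_pairs(sequence, candidate_labels, template=DEFAULT_TEMPLATE):
    """Pair the input (premise) with one hypothesis per candidate label."""
    return [(sequence, template.format(label)) for label in candidate_labels]


def top_label(result):
    """Return the best (label, score) from a zero-shot pipeline result dict."""
    # The pipeline sorts labels by descending score, so index 0 is the winner.
    return result["labels"][0], result["scores"][0]


pairs = build_nli_pairs("one day I will see the world", ["travel", "cooking", "dancing"])

# Made-up scores, shaped like the pipeline's output dictionary.
example_result = {
    "sequence": "one day I will see the world",
    "labels": ["travel", "dancing", "cooking"],
    "scores": [0.9, 0.06, 0.04],
}
best_label, best_score = top_label(example_result)
```

Passing a real pipeline result to `top_label` works the same way, since only the `labels` and `scores` keys are read.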

If more than one candidate label can be correct, pass multi_label=True to score each class independently (older Transformers releases called this argument multi_class, which is deprecated):

candidate_labels = ['travel', 'cooking', 'dancing', 'exploration']
classifier(sequence_to_classify, candidate_labels, multi_label=True)
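
The difference between the two scoring modes can be sketched in a few lines. In the default mode, the entailment logits of all labels compete in a single softmax, so the scores sum to 1. When each class is scored independently, every label gets its own softmax over its contradiction and entailment logits, and the entailment probability is kept. This is a simplified illustration of the idea, not the library's actual code; the function names and the logits are made up.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def single_label_scores(entailment_logits):
    """Softmax across the entailment logits of all labels; scores sum to 1."""
    return softmax(entailment_logits)


def multi_label_scores(contradiction_logits, entailment_logits):
    """Per label, softmax over (contradiction, entailment); keep entailment."""
    return [
        softmax([c, e])[1]
        for c, e in zip(contradiction_logits, entailment_logits)
    ]


# Made-up logits for three candidate labels (illustration only).
entail = [2.0, 1.5, -1.0]
contra = [-1.0, -0.5, 2.0]

single = single_label_scores(entail)        # labels compete with each other
multi = multi_label_scores(contra, entail)  # each label is judged on its own
```

In the independent mode, two labels can both score above 0.5, which is why it suits inputs where several labels apply at once.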

Advanced Usage

With manual PyTorch

The model can also be applied to NLI tasks like so:

import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Use the GPU when one is available, otherwise fall back to the CPU.
device = torch.device("cuda") if torch.cuda.is_available() else torch.device("cpu")

model_name = "mjwong/e5-base-v2-mnli-anli"
tokenizer = AutoTokenizer.from_pretrained(model_name)
# Move the model to the same device as the inputs and switch to inference mode.
model = AutoModelForSequenceClassification.from_pretrained(model_name).to(device)
model.eval()

premise = "But I thought you'd sworn off coffee."
hypothesis = "I thought that you vowed to drink more coffee."

# Tokenize the pair and pass every tensor (input IDs, attention mask, ...) to the model.
inputs = tokenizer(premise, hypothesis, truncation=True, return_tensors="pt").to(device)
with torch.no_grad():
    output = model(**inputs)
prediction = torch.softmax(output["logits"][0], -1).tolist()
label_names = ["entailment", "neutral", "contradiction"]
prediction = {name: round(float(pred) * 100, 2) for pred, name in zip(prediction, label_names)}
print(prediction)

📚 Documentation

Model description

Text Embeddings by Weakly-Supervised Contrastive Pre-training. Liang Wang, Nan Yang, Xiaolong Huang, Binxing Jiao, Linjun Yang, Daxin Jiang, Rangan Majumder, Furu Wei, arXiv 2022

Eval results

The model was evaluated using the dev sets for MultiNLI and test sets for ANLI. The metric used is accuracy.

Property	Details
Model Type	e5-base-v2-mnli-anli
Training Data	GLUE (MNLI), ANLI

Datasets	mnli_dev_m	mnli_dev_mm	anli_test_r1	anli_test_r2	anli_test_r3
e5-base-v2-mnli-anli	0.812	0.809	0.557	0.460	0.448
e5-large-mnli	0.868	0.869	0.301	0.296	0.294
e5-large-mnli-anli	0.843	0.848	0.646	0.484	0.458
e5-large-v2-mnli	0.875	0.876	0.354	0.298	0.313
e5-large-v2-mnli-anli	0.846	0.848	0.638	0.474	0.479
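
For reference, accuracy here is the fraction of examples whose predicted label matches the gold label. A minimal sketch follows; the `accuracy` helper and the toy labels are illustrative only and are not taken from the actual evaluation.

```python
def accuracy(predictions, references):
    """Fraction of predictions that exactly match the reference labels."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(p == r for p, r in zip(predictions, references))
    return correct / len(references)


# Toy labels, made up for illustration (not real MNLI or ANLI data).
preds = ["entailment", "neutral", "contradiction", "neutral"]
golds = ["entailment", "neutral", "neutral", "neutral"]
score = accuracy(preds, golds)  # 3 of the 4 predictions match
```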

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 2e-05
train_batch_size: 16
eval_batch_size: 16
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
lr_scheduler_warmup_ratio: 0.1
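
As a rough illustration of what a linear scheduler with a warmup ratio of 0.1 implies, the sketch below computes the learning rate at a given step: it rises linearly from 0 to the base rate over the first 10% of steps, then decays linearly to 0. The function name and the total step count are assumptions for illustration; the actual number of training steps is not stated in this document.

```python
import math


def linear_schedule_lr(step, total_steps, base_lr=2e-05, warmup_ratio=0.1):
    """Learning rate at `step` for linear warmup followed by linear decay."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Warmup phase: ramp up from 0 to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Decay phase: ramp down from base_lr to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


TOTAL_STEPS = 1000  # hypothetical value, not from the original training run

# Sample the schedule at the start, mid-warmup, peak, mid-decay, and end.
lrs = [linear_schedule_lr(s, TOTAL_STEPS) for s in (0, 50, 100, 550, 1000)]
```

With these assumed values the rate peaks at step 100 (the end of warmup) and returns to zero at the final step.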

Framework versions

Transformers 4.28.1
PyTorch 1.12.1+cu116
Datasets 2.11.0
Tokenizers 0.12.1

📄 License

This model is released under the MIT license.
