# multilingual-e5-base-xnli
This model is a fine-tuned version of intfloat/multilingual-e5-base on the XNLI dataset and can be used for zero-shot classification tasks across multiple languages.
## 🚀 Quick Start
This model is a fine-tuned version of intfloat/multilingual-e5-base on the XNLI dataset.
## ✨ Features
- Multilingual Support: It supports multiple languages including English (en), Arabic (ar), Bulgarian (bg), German (de), Greek (el), Spanish (es), French (fr), Hindi (hi), Russian (ru), Swahili (sw), Thai (th), Turkish (tr), Urdu (ur), Vietnamese (vi), and Chinese (zh).
- Zero-Shot Classification: Can be used with the `zero-shot-classification` pipeline to classify sequences into user-specified class names.
## 📦 Installation
No model-specific installation steps are provided in the original document; the usage examples below only require the Hugging Face `transformers` library and PyTorch.
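Assuming a standard pip environment (the versions the author reports are listed under Framework versions below), a typical setup might be:

```bash
pip install transformers torch
```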
## 💻 Usage Examples
### Basic Usage
The model can be loaded with the `zero-shot-classification` pipeline like so:

```python
from transformers import pipeline

classifier = pipeline("zero-shot-classification",
                      model="mjwong/multilingual-e5-base-xnli")
```
You can then use this pipeline to classify sequences into any of the class names you specify.

```python
sequence_to_classify = "Angela Merkel ist eine Politikerin in Deutschland und Vorsitzende der CDU"
candidate_labels = ["politics", "economy", "entertainment", "environment"]
classifier(sequence_to_classify, candidate_labels)
```
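For reference, the pipeline returns a dictionary containing the original sequence, the candidate labels ranked from most to least likely, and their scores. A minimal sketch of reading out the top prediction, reusing the `classifier`, `sequence_to_classify`, and `candidate_labels` defined above:

```python
# The zero-shot pipeline returns {"sequence": ..., "labels": [...], "scores": [...]},
# with labels sorted from most to least likely.
result = classifier(sequence_to_classify, candidate_labels)
print(result["labels"][0], result["scores"][0])  # top label and its score
```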
If more than one candidate label can be correct, pass `multi_label=True` so that each class is scored independently:

```python
candidate_labels = ["politics", "economy", "entertainment", "environment"]
classifier(sequence_to_classify, candidate_labels, multi_label=True)
```
### Advanced Usage
The model can also be applied to NLI tasks like so:

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Run on GPU when available
device = torch.device("cuda") if torch.cuda.is_available() else torch.device("cpu")

model_name = "mjwong/multilingual-e5-base-xnli"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name).to(device)

premise = "But I thought you'd sworn off coffee."
hypothesis = "I thought that you vowed to drink more coffee."

# Encode the premise-hypothesis pair and run it through the model
inputs = tokenizer(premise, hypothesis, truncation=True, return_tensors="pt")
output = model(inputs["input_ids"].to(device))

# Convert logits to per-class probabilities (as percentages)
prediction = torch.softmax(output["logits"][0], -1).tolist()
label_names = ["entailment", "neutral", "contradiction"]
prediction = {name: round(float(pred) * 100, 2) for pred, name in zip(prediction, label_names)}
print(prediction)
```
## 📚 Documentation
### Model description
Text Embeddings by Weakly-Supervised Contrastive Pre-training.
Liang Wang, Nan Yang, Xiaolong Huang, Binxing Jiao, Linjun Yang, Daxin Jiang, Rangan Majumder, Furu Wei, arXiv 2022
### Eval results
The model was evaluated using the XNLI test sets in 15 languages: English (en), Arabic (ar), Bulgarian (bg), German (de), Greek (el), Spanish (es), French (fr), Hindi (hi), Russian (ru), Swahili (sw), Thai (th), Turkish (tr), Urdu (ur), Vietnamese (vi), and Chinese (zh). The metric used is accuracy.
| Model | en | ar | bg | de | el | es | fr | hi | ru | sw | th | tr | ur | vi | zh |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| [multilingual-e5-base-xnli](https://huggingface.co/mjwong/multilingual-e5-base-xnli) | 0.849 | 0.768 | 0.803 | 0.800 | 0.792 | 0.809 | 0.805 | 0.738 | 0.782 | 0.728 | 0.756 | 0.766 | 0.713 | 0.787 | 0.785 |
| [multilingual-e5-base-xnli-anli](https://huggingface.co/mjwong/multilingual-e5-base-xnli-anli) | 0.811 | 0.711 | 0.751 | 0.759 | 0.746 | 0.778 | 0.765 | 0.685 | 0.728 | 0.662 | 0.705 | 0.716 | 0.683 | 0.736 | 0.740 |
| [multilingual-e5-large-xnli](https://huggingface.co/mjwong/multilingual-e5-large-xnli) | 0.867 | 0.791 | 0.832 | 0.825 | 0.823 | 0.837 | 0.824 | 0.778 | 0.806 | 0.749 | 0.787 | 0.793 | 0.738 | 0.813 | 0.808 |
| [multilingual-e5-large-xnli-anli](https://huggingface.co/mjwong/multilingual-e5-large-xnli-anli) | 0.865 | 0.765 | 0.811 | 0.811 | 0.795 | 0.823 | 0.816 | 0.743 | 0.785 | 0.713 | 0.765 | 0.774 | 0.706 | 0.788 | 0.787 |
| [multilingual-e5-large-instruct-xnli](https://huggingface.co/mjwong/multilingual-e5-large-instruct-xnli) | 0.864 | 0.793 | 0.839 | 0.821 | 0.824 | 0.837 | 0.823 | 0.770 | 0.810 | 0.744 | 0.784 | 0.791 | 0.716 | 0.807 | 0.807 |
| [multilingual-e5-large-instruct-xnli-anli](https://huggingface.co/mjwong/multilingual-e5-large-instruct-xnli-anli) | 0.861 | 0.780 | 0.816 | 0.808 | 0.806 | 0.825 | 0.816 | 0.758 | 0.799 | 0.727 | 0.775 | 0.780 | 0.721 | 0.787 | 0.795 |
The model was also evaluated using the dev sets for MultiNLI and test sets for ANLI. The metric used is accuracy.
| Model | mnli_dev_m | mnli_dev_mm | anli_test_r1 | anli_test_r2 | anli_test_r3 |
|---|---|---|---|---|---|
| [multilingual-e5-base-xnli](https://huggingface.co/mjwong/multilingual-e5-base-xnli) | 0.835 | 0.837 | 0.287 | 0.276 | 0.301 |
| [multilingual-e5-base-xnli-anli](https://huggingface.co/mjwong/multilingual-e5-base-xnli-anli) | 0.814 | 0.811 | 0.588 | 0.437 | 0.439 |
| [multilingual-e5-large-xnli](https://huggingface.co/mjwong/multilingual-e5-large-xnli) | 0.865 | 0.865 | 0.312 | 0.316 | 0.300 |
| [multilingual-e5-large-xnli-anli](https://huggingface.co/mjwong/multilingual-e5-large-xnli-anli) | 0.863 | 0.863 | 0.623 | 0.456 | 0.455 |
| [multilingual-e5-large-instruct-xnli](https://huggingface.co/mjwong/multilingual-e5-large-instruct-xnli) | 0.867 | 0.866 | 0.341 | 0.330 | 0.323 |
| [multilingual-e5-large-instruct-xnli-anli](https://huggingface.co/mjwong/multilingual-e5-large-instruct-xnli-anli) | 0.862 | 0.862 | 0.615 | 0.459 | 0.462 |
### Training hyperparameters
The following hyperparameters were used during training (see the sketch after this list for how they might map onto the `transformers` Trainer API):
- learning_rate: 2e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 1
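As a rough illustration only (not the author's actual training script), these settings could be expressed with the Hugging Face `TrainingArguments` class as follows; the `output_dir` is a hypothetical placeholder and the dataset loading and `Trainer` call are omitted:

```python
from transformers import TrainingArguments

# Hypothetical mapping of the reported hyperparameters onto TrainingArguments
training_args = TrainingArguments(
    output_dir="multilingual-e5-base-xnli",  # placeholder output path
    learning_rate=2e-05,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="linear",
    warmup_ratio=0.1,
    num_train_epochs=1,
)
```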
### Framework versions
- Transformers 4.28.1
- Pytorch 1.12.1+cu116
- Datasets 2.11.0
- Tokenizers 0.12.1
## 🔧 Technical Details
The model is based on the pre-trained intfloat/multilingual-e5-base model and fine-tuned on the XNLI dataset. The paper Text Embeddings by Weakly-Supervised Contrastive Pre-training provides the theoretical background.
## 📄 License
This project is licensed under the MIT license.