drama-large-xnli-anli Open-source Zero-shot Classification Model - Supports Natural Language Inference Tasks in 15 Languages

Developed by mjwong

A zero-shot classification model based on facebook/drama-large and fine-tuned on the XNLI and ANLI datasets, supporting natural language inference tasks in 15 languages.

Release Time: 3/1/2025

Model Overview

This model is a fine-tuned version of facebook/drama-large for zero-shot classification and multilingual natural language inference. It is suited to cross-lingual text classification and reasoning scenarios.

Model Features

Multilingual Support

Supports zero-shot classification and natural language inference tasks in 15 languages.

High-performance Inference

Reaches 79.9% accuracy on the English XNLI test set and 53.4%, 44.6%, and 45.2% on ANLI test rounds 1 to 3; XNLI accuracy is highest in English and other high-resource languages.

Zero-shot Classification Capability

Can classify text into categories supplied at inference time, without task-specific training.

Model Capabilities

Zero-shot Text Classification

Multilingual Natural Language Inference

Cross-lingual Text Understanding

Multi-category Classification

Use Cases

Text Classification

Multilingual Content Classification

Classify text content in multiple languages, such as news classification, product review classification, etc.

Accuracy on topical classification is not reported; the XNLI results across the 15 languages are the closest available indicator.

Natural Language Inference

Cross-lingual Text Reasoning

Determine the logical relationship (entailment, neutral, or contradiction) between a premise and a hypothesis in any of the 15 supported languages.

Achieves 79.9% accuracy in English on the XNLI dataset, with accuracy ranging from 59.4% to 75.4% in other languages.

🚀 drama-large-xnli-anli

This model is a fine-tuned version of facebook/drama-large on the XNLI and ANLI datasets. It enables zero-shot classification across multiple languages and can also be used directly for natural language inference.

✨ Features

Multilingual Support: Supports 15 languages: English, Arabic, Bulgarian, German, Greek, Spanish, French, Hindi, Russian, Swahili, Thai, Turkish, Urdu, Vietnamese, and Chinese.
Zero-Shot Classification: Classifies text against candidate labels supplied at inference time, with no task-specific training.
Fine-Tuned on Diverse Datasets: Fine-tuned on the XNLI and ANLI datasets for natural language inference.

💻 Usage Examples

Basic Usage

The model can be loaded with the zero-shot classification pipeline like so:

from transformers import pipeline

# Load the fine-tuned checkpoint into the zero-shot classification pipeline
model = "mjwong/drama-large-xnli-anli"
classifier = pipeline("zero-shot-classification",
                      model=model)

You can then use this pipeline to classify sequences into any of the class names you specify.

sequence_to_classify = "one day I will see the world"
candidate_labels = ['travel', 'cooking', 'dancing']
classifier(sequence_to_classify, candidate_labels)

If more than one candidate label can be correct, pass multi_label=True to calculate each class independently (older releases of transformers called this argument multi_class, which is now deprecated):

candidate_labels = ['travel', 'cooking', 'dancing', 'exploration']
classifier(sequence_to_classify, candidate_labels, multi_label=True)
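To make the difference between the two modes concrete, here is a minimal sketch of how NLI logits are typically turned into label scores. It is an illustration, not the pipeline's actual source: the logits are made up, the function name score_labels is hypothetical, and the class order (entailment, neutral, contradiction) is assumed from the NLI example in this card.

```python
import math

def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

# Class order assumed from the NLI example in this card.
ENTAILMENT, NEUTRAL, CONTRADICTION = 0, 1, 2

def score_labels(nli_logits, multi_label=False):
    """Map each candidate label to a score.

    nli_logits maps a label to its [entailment, neutral, contradiction] logits.
    """
    labels = list(nli_logits)
    if multi_label:
        # Each label is judged on its own: entailment vs. contradiction only.
        scores = [
            softmax([nli_logits[l][ENTAILMENT], nli_logits[l][CONTRADICTION]])[0]
            for l in labels
        ]
    else:
        # Labels compete: softmax over the entailment logits of all labels.
        scores = softmax([nli_logits[l][ENTAILMENT] for l in labels])
    return dict(zip(labels, scores))

# Made-up logits for illustration; real values come from the model.
example_logits = {
    "travel": [3.0, 0.5, -2.0],
    "cooking": [-1.5, 0.2, 2.0],
    "exploration": [2.5, 0.4, -1.8],
}
single = score_labels(example_logits)
multi = score_labels(example_logits, multi_label=True)
print(single)
print(multi)
```

In the single-label case the scores sum to 1, so two plausible labels split the probability mass. In the multi-label case each score is independent, so several labels can score high at once.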

Advanced Usage

The model can also be applied on NLI tasks like so:

import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Use the GPU when one is available, otherwise fall back to the CPU
device = torch.device("cuda") if torch.cuda.is_available() else torch.device("cpu")

model_name = "mjwong/drama-large-xnli-anli"
tokenizer = AutoTokenizer.from_pretrained(model_name)
# Move the model to the same device as the inputs
model = AutoModelForSequenceClassification.from_pretrained(model_name).to(device)
model.eval()

premise = "But I thought you'd sworn off coffee."
hypothesis = "I thought that you vowed to drink more coffee."

# Encode the premise/hypothesis pair and move the tensors to the device
inputs = tokenizer(premise, hypothesis, truncation=True, return_tensors="pt").to(device)
with torch.no_grad():
    output = model(input_ids=inputs["input_ids"], attention_mask=inputs["attention_mask"])
prediction = torch.softmax(output["logits"][0], -1).tolist()
label_names = ["entailment", "neutral", "contradiction"]
# Convert the probabilities to percentages keyed by label name
prediction = {name: round(float(pred) * 100, 2) for pred, name in zip(prediction, label_names)}
print(prediction)
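The example above hard-codes the label order. Transformers configs usually expose an id2label mapping (model.config.id2label), so a slightly more defensive variant reads the names from there. The sketch below shows only the pairing step, using a hypothetical mapping and made-up probabilities; whether this checkpoint's config carries meaningful label names is an assumption to verify.

```python
def label_probabilities(probs, id2label):
    """Pair each probability with the label stored at the same class index."""
    if len(probs) != len(id2label):
        raise ValueError("expected one probability per label")
    return {id2label[i]: round(p * 100, 2) for i, p in enumerate(probs)}

# Hypothetical mapping; in practice this would be model.config.id2label.
id2label = {0: "entailment", 1: "neutral", 2: "contradiction"}

# Made-up softmax output for illustration.
result = label_probabilities([0.02, 0.08, 0.90], id2label)
predicted = max(result, key=result.get)
print(predicted, result)
```

Reading the mapping from the config avoids silently mislabeling outputs if a checkpoint uses a different class order.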

📚 Documentation

Model description

DRAMA: Diverse Augmentation from Large Language Models to Smaller Dense Retrievers. Xueguang Ma, Xi Victoria Lin, Barlas Oguz, Jimmy Lin, Wen-tau Yih, Xilun Chen, arXiv 2025

Eval results

The model was evaluated using the XNLI test sets on 15 languages: English (en), Arabic (ar), Bulgarian (bg), German (de), Greek (el), Spanish (es), French (fr), Hindi (hi), Russian (ru), Swahili (sw), Thai (th), Turkish (tr), Urdu (ur), Vietnamese (vi) and Chinese (zh). The metric used is accuracy.

Model	en	ar	bg	de	el	es	fr	hi	ru	sw	th	tr	ur	vi	zh
[drama-base-xnli-anli](https://huggingface.co/mjwong/drama-base-xnli-anli)	0.788	0.689	0.708	0.715	0.696	0.732	0.737	0.647	0.711	0.636	0.676	0.664	0.588	0.708	0.710
[drama-large-xnli-anli](https://huggingface.co/mjwong/drama-large-xnli-anli)	0.799	0.698	0.730	0.721	0.717	0.754	0.754	0.649	0.718	0.652	0.678	0.656	0.594	0.719	0.719
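For a quick read of the XNLI table, the following sketch computes the unweighted mean accuracy of each model and lists the languages where the large model trails the base model. The numbers are copied from the table; the averaging itself is an added convenience, not a figure reported by the model card.

```python
# XNLI test accuracy copied from the table above.
LANGS = ["en", "ar", "bg", "de", "el", "es", "fr", "hi",
         "ru", "sw", "th", "tr", "ur", "vi", "zh"]
BASE = [0.788, 0.689, 0.708, 0.715, 0.696, 0.732, 0.737, 0.647,
        0.711, 0.636, 0.676, 0.664, 0.588, 0.708, 0.710]
LARGE = [0.799, 0.698, 0.730, 0.721, 0.717, 0.754, 0.754, 0.649,
         0.718, 0.652, 0.678, 0.656, 0.594, 0.719, 0.719]

def mean(values):
    """Unweighted arithmetic mean."""
    return sum(values) / len(values)

# Per-language gain of the large model over the base model.
gains = {lang: round(l - b, 3) for lang, b, l in zip(LANGS, BASE, LARGE)}
regressions = [lang for lang, gain in gains.items() if gain < 0]

print(f"base mean:  {mean(BASE):.3f}")
print(f"large mean: {mean(LARGE):.3f}")
print("languages where large trails base:", regressions)
```

On these figures the large model improves on the base model in every language except Turkish, and the average gain is about one percentage point.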

The model was also evaluated using the dev sets for MultiNLI and test sets for ANLI. The metric used is accuracy.

Model	mnli_dev_m	mnli_dev_mm	anli_test_r1	anli_test_r2	anli_test_r3
[drama-base-xnli-anli](https://huggingface.co/mjwong/drama-base-xnli-anli)	0.781	0.787	0.500	0.420	0.440
[drama-large-xnli-anli](https://huggingface.co/mjwong/drama-large-xnli-anli)	0.794	0.796	0.534	0.446	0.452

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 2e-05
train_batch_size: 16
eval_batch_size: 16
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
lr_scheduler_warmup_ratio: 0.1
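As an illustration of what a linear scheduler with a 0.1 warmup ratio implies, the sketch below computes the learning rate at a given step: a linear ramp from 0 to the peak rate over the first 10% of steps, then a linear decay to 0. The total number of training steps is not stated in the model card, so TOTAL_STEPS is a hypothetical value, and the function is a plain-Python approximation rather than the training code actually used.

```python
import math

BASE_LR = 2e-05      # learning_rate from the list above
WARMUP_RATIO = 0.1   # lr_scheduler_warmup_ratio from the list above
TOTAL_STEPS = 1000   # hypothetical; not given in the model card

def linear_schedule_lr(step, total_steps=TOTAL_STEPS,
                       base_lr=BASE_LR, warmup_ratio=WARMUP_RATIO):
    """Learning rate at `step` under linear warmup followed by linear decay."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Warmup phase: ramp linearly from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Decay phase: fall linearly from base_lr down to 0 at the final step.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)

for step in (0, 50, 100, 550, 1000):
    print(step, linear_schedule_lr(step))
```

With these assumed 1000 steps, the rate peaks at step 100 and is back to zero at step 1000.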

Framework versions

Transformers 4.49.0
Pytorch 2.6.0+cu124
Datasets 3.2.0
Tokenizers 0.21.0

📄 License

The license for this model is cc-by-nc-4.0.

Additional Information

Property	Details
Model Type	Fine-tuned version of facebook/drama-large on XNLI and ANLI datasets
Training Data	XNLI, facebook/anli
Pipeline Tag	zero-shot-classification
Base Model	facebook/drama-large
