🚀 XLM-roBERTa-large-it-mnli
This model is fine-tuned for zero-shot text classification of Italian texts. It builds on the xlm-roberta-large base model and, thanks to pre-training on 100 languages, also shows some effectiveness in other languages.
🚀 Quick Start
With the zero-shot classification pipeline
The model can be loaded with the zero-shot-classification pipeline as follows:
```python
from transformers import pipeline

classifier = pipeline("zero-shot-classification",
                      model="Jiva/xlm-roberta-large-it-mnli",
                      device=0, use_fast=True, multi_label=True)
```
You can then classify in any of the languages the model supports. You can even pass the labels in one language and the sequence to classify in another:
```python
# we will classify the following Wikipedia entry about Sardinia
sequence_to_classify = "La Sardegna è una regione italiana a statuto speciale di 1 592 730 abitanti con capoluogo Cagliari, la cui denominazione bilingue utilizzata nella comunicazione ufficiale è Regione Autonoma della Sardegna / Regione Autònoma de Sardigna."
# we can specify candidate labels in Italian:
candidate_labels = ["geografia", "politica", "macchine", "cibo", "moda"]
classifier(sequence_to_classify, candidate_labels)
# {'labels': ['geografia', 'moda', 'politica', 'macchine', 'cibo'],
#  'scores': [0.38871392607688904, 0.22633370757102966, 0.19398456811904907, 0.13735772669315338, 0.13708525896072388]}
```
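For example, the candidate labels can be given in English while the sequence stays in Italian. A minimal sketch reusing the `classifier` and `sequence_to_classify` defined above; the English label set is an illustrative translation, not from the original card:

```python
# same Italian sequence as above, but candidate labels in English
candidate_labels_en = ["geography", "politics", "cars", "food", "fashion"]
classifier(sequence_to_classify, candidate_labels_en)
```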
The default hypothesis template is the English one, `This text is {}`. With this model, better results are achieved when providing a translated template:
```python
sequence_to_classify = "La Sardegna è una regione italiana a statuto speciale di 1 592 730 abitanti con capoluogo Cagliari, la cui denominazione bilingue utilizzata nella comunicazione ufficiale è Regione Autonoma della Sardegna / Regione Autònoma de Sardigna."
candidate_labels = ["geografia", "politica", "macchine", "cibo", "moda"]
hypothesis_template = "si parla di {}"
classifier(sequence_to_classify, candidate_labels, hypothesis_template=hypothesis_template)
# 'scores': [0.6068345904350281, 0.34715887904167175, 0.32433947920799255, 0.3068877160549164, 0.18744681775569916]
```
With manual PyTorch
```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

device = 'cuda' if torch.cuda.is_available() else 'cpu'
nli_model = AutoModelForSequenceClassification.from_pretrained('Jiva/xlm-roberta-large-it-mnli').to(device)
tokenizer = AutoTokenizer.from_pretrained('Jiva/xlm-roberta-large-it-mnli')

# pose the sequence as an NLI premise and the label as a hypothesis
premise = sequence_to_classify  # the Sardinia text from the examples above
label = 'geografia'
hypothesis = f'si parla di {label}.'

# run through the model fine-tuned on the translated MNLI
x = tokenizer.encode(premise, hypothesis, return_tensors='pt',
                     truncation='only_first')
logits = nli_model(x.to(device))[0]

# we throw away "neutral" (dim 1) and take the probability of
# "entailment" (dim 2) as the probability of the label being true
entail_contradiction_logits = logits[:, [0, 2]]
probs = entail_contradiction_logits.softmax(dim=1)
prob_label_is_true = probs[:, 1]
```
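To score every candidate label the way the pipeline does with `multi_label=True`, the same entailment-vs-contradiction step can be wrapped in a loop. A minimal sketch reusing the names defined above, with the label set from the earlier examples:

```python
# score each candidate label independently, as multi_label=True does
candidate_labels = ["geografia", "politica", "macchine", "cibo", "moda"]
scores = {}
for label in candidate_labels:
    hypothesis = f'si parla di {label}.'
    x = tokenizer.encode(premise, hypothesis, return_tensors='pt',
                         truncation='only_first')
    logits = nli_model(x.to(device))[0]
    # softmax over [contradiction, entailment]; column 1 is the entailment probability
    probs = logits[:, [0, 2]].softmax(dim=1)
    scores[label] = probs[:, 1].item()

# highest-scoring labels first
print(sorted(scores.items(), key=lambda kv: -kv[1]))
```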
✨ Features
- Zero-Shot Classification: Capable of zero-shot text classification of Italian texts, with some effectiveness in other languages due to pre-training on 100 languages.
- Based on a Strong Base Model: Built on [xlm-roberta-large](https://huggingface.co/xlm-roberta-large), which provides a solid foundation for the fine-tuning.
📦 Installation
The model runs with the Hugging Face `transformers` library and PyTorch, e.g. `pip install transformers torch`.
💻 Usage Examples
Basic Usage
```python
from transformers import pipeline

classifier = pipeline("zero-shot-classification",
                      model="Jiva/xlm-roberta-large-it-mnli",
                      device=0, use_fast=True, multi_label=True)

sequence_to_classify = "La Sardegna è una regione italiana a statuto speciale di 1 592 730 abitanti con capoluogo Cagliari, la cui denominazione bilingue utilizzata nella comunicazione ufficiale è Regione Autonoma della Sardegna / Regione Autònoma de Sardigna."
candidate_labels = ["geografia", "politica", "macchine", "cibo", "moda"]
classifier(sequence_to_classify, candidate_labels)
```
Advanced Usage
```python
sequence_to_classify = "La Sardegna è una regione italiana a statuto speciale di 1 592 730 abitanti con capoluogo Cagliari, la cui denominazione bilingue utilizzata nella comunicazione ufficiale è Regione Autonoma della Sardegna / Regione Autònoma de Sardigna."
candidate_labels = ["geografia", "politica", "macchine", "cibo", "moda"]
hypothesis_template = "si parla di {}"
classifier(sequence_to_classify, candidate_labels, hypothesis_template=hypothesis_template)
```
📚 Documentation
Model Description
This model takes [xlm-roberta-large](https://huggingface.co/xlm-roberta-large) and fine-tunes it on a subset of NLI data taken from an automatically translated version of the MNLI corpus. It is intended to be used for zero-shot text classification, such as with the Hugging Face ZeroShotClassificationPipeline.
Intended Usage
This model is intended for zero-shot text classification of Italian texts. Since the base model was pre-trained on 100 different languages, it has also shown some effectiveness in other languages; see the full list of pre-trained languages in appendix A of the XLM-RoBERTa paper. For English-only classification, it is recommended to use [bart-large-mnli](https://huggingface.co/facebook/bart-large-mnli) or [a distilled bart MNLI model](https://huggingface.co/models?filter=pipeline_tag%3Azero-shot-classification&search=valhalla).
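As an illustration (a minimal sketch, not part of the original card), the English-only setup looks the same, just with the recommended checkpoint:

```python
from transformers import pipeline

# for English-only inputs, facebook/bart-large-mnli is the recommended checkpoint
classifier_en = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")
classifier_en("Sardinia is an autonomous region of Italy.",
              candidate_labels=["geography", "politics", "food"])
```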
Training
Version 0.1
The model has now been retrained on the full training set. Around 1,000 sentence pairs were removed from the set because their translation was botched by the translation model.
| Property | Details |
|---|---|
| learning_rate | 4e-6 |
| optimizer | AdamW |
| batch_size | 80 |
| mcc | 0.77 |
| train_loss | 0.34 |
| eval_loss | 0.40 |
| stopped_at_step | 9754 |
Version 0.0
This model was pre-trained on a set of 100 languages, as described in the original paper. It was then fine-tuned on the task of NLI on an Italian translation of the MNLI dataset (85% of the train set only so far). The model used for translating the texts is Helsinki-NLP/opus-mt-en-it, with a max output sequence length of 120. The model was trained for 1 epoch with learning rate 4e-6 and batch size 80; it currently scores 82% accuracy on the remaining 15% of the training set.
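For reference, this is roughly how a single MNLI premise could be translated with that model. An illustrative sketch assuming the standard MarianMT API in `transformers`; the example sentence is made up and the actual translation script is not part of the card:

```python
from transformers import MarianMTModel, MarianTokenizer

# the card reports Helsinki-NLP/opus-mt-en-it as the translation model
mt_tokenizer = MarianTokenizer.from_pretrained('Helsinki-NLP/opus-mt-en-it')
mt_model = MarianMTModel.from_pretrained('Helsinki-NLP/opus-mt-en-it')

batch = mt_tokenizer(["The cat sat on the mat."], return_tensors='pt', padding=True)
# max_length=120 mirrors the max output sequence length reported above
generated = mt_model.generate(**batch, max_length=120)
print(mt_tokenizer.batch_decode(generated, skip_special_tokens=True))
```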
🔧 Technical Details
The model is based on the xlm-roberta-large architecture and is fine-tuned on NLI data from an automatically translated MNLI corpus. The translation model used is Helsinki-NLP/opus-mt-en-it.
📄 License
The model is released under the MIT license.
Model Performance in Version 0.1
| | matched-it acc | mismatched-it acc |
|---|---|---|
| XLM-roberta-large-it-mnli | 84.75 | 85.39 |