rubert-base-cased-nli-threeway Open Source Model - Free Deployment for Predicting Logical Relations in Russian Texts

Rubert Base Cased Nli Threeway

Developed by cointegrated

A Russian natural language inference model fine-tuned from DeepPavlov/rubert-base-cased, capable of predicting logical relationships (entailment/contradiction/neutral) between two texts

Text Classification

Transformers

Other

#Russian Natural Language Inference #Three-way Logical Relations #Zero-shot Classification

Downloads 144.68k

Release Time : 3/2/2022

Model Overview

This model is specifically designed for Russian natural language inference tasks, capable of determining the logical relationship between two text segments, supporting three classifications: entailment, contradiction, and neutral.

Model Features

Multi-dataset Training

Trained on multiple NLI datasets translated from English to Russian, including JOCI, MNLI, MPE, SICK, SNLI, etc.

Zero-shot Classification Capability

Can perform zero-shot short text classification (e.g., sentiment analysis) via natural language inference

Three-way Logical Relations

Can distinguish three types of logical relationships between texts: entailment, contradiction, and neutral

Model Capabilities

Natural Language Inference

Zero-shot Classification

Text Relation Analysis

Use Cases

Text Analysis

Logical Relation Judgment

Determine the logical relationship between two Russian texts (e.g., whether the second text follows from the first, contradicts it, or neither)

Can output probability distributions for the three relationships

Sentiment Analysis

Zero-shot Sentiment Classification

Perform sentiment analysis without training by defining positive/negative label texts

In the usage example, the model assigns a probability of about 0.94 to the negative label for a negative review

🚀 RuBERT for NLI (natural language inference)

This is the DeepPavlov/rubert-base-cased model, fine-tuned to predict the logical relationship between two short texts: entailment, contradiction, or neutral.

🚀 Quick Start

✨ Features

Fine-tuned from DeepPavlov/rubert-base-cased for natural language inference.
Can be used for zero-shot short text classification, such as sentiment analysis.

📦 Installation

Before using the model, install the required libraries. In a notebook, you can use the following command (drop the leading "!" when running it in a shell):

!pip install transformers sentencepiece --quiet

💻 Usage Examples

Basic Usage

How to run the model for NLI:

import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_checkpoint = 'cointegrated/rubert-base-cased-nli-threeway'
tokenizer = AutoTokenizer.from_pretrained(model_checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(model_checkpoint)
if torch.cuda.is_available():
    model.cuda()

text1 = 'Сократ - человек, а все люди смертны.'
text2 = 'Сократ никогда не умрёт.'
with torch.inference_mode():
    out = model(**tokenizer(text1, text2, return_tensors='pt').to(model.device))
    proba = torch.softmax(out.logits, -1).cpu().numpy()[0]
print({v: proba[k] for k, v in model.config.id2label.items()})
# {'entailment': 0.009525929, 'contradiction': 0.9332064, 'neutral': 0.05726764}
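To turn the probability dictionary printed above into a single decision, one option is to take the most probable label and report no decision when the model is not confident. The sketch below is only an illustration: the helper name pick_relation and the 0.5 threshold are assumptions, not part of the model or of the transformers API.

```python
def pick_relation(proba_by_label, min_confidence=0.5):
    """Return (label, probability) for the most probable relation.

    The label is None when the top probability is below min_confidence
    (a hypothetical threshold, chosen here only for illustration).
    """
    label, p = max(proba_by_label.items(), key=lambda kv: kv[1])
    if p < min_confidence:
        return None, p
    return label, p


# Probabilities copied from the example output above
example = {'entailment': 0.009525929, 'contradiction': 0.9332064, 'neutral': 0.05726764}
print(pick_relation(example))
# ('contradiction', 0.9332064)
```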

Advanced Usage

You can also use this model for zero-shot short text classification (by labels only), e.g. for sentiment analysis:

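# Reuses torch, model and tokenizer from the basic example above
def predict_zero_shot(text, label_texts, model, tokenizer, label='entailment', normalize=True):
    # Pair the text (premise) with every candidate label text (hypothesis)
    tokens = tokenizer([text] * len(label_texts), label_texts, truncation=True, return_tensors='pt', padding=True)
    with torch.inference_mode():
        result = torch.softmax(model(**tokens.to(model.device)).logits, -1)
    # Keep the probability of the chosen relation (entailment by default) for each label
    proba = result[:, model.config.label2id[label]].cpu().numpy()
    if normalize:
        # Rescale so that the scores over all labels sum to 1
        proba /= sum(proba)
    return proba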

classes = ['Я доволен', 'Я недоволен']
predict_zero_shot('Какая гадость эта ваша заливная рыба!', classes, model, tokenizer)
# array([0.05609814, 0.9439019 ], dtype=float32)
predict_zero_shot('Какая вкусная эта ваша заливная рыба!', classes, model, tokenizer)
# array([0.9059292 , 0.09407079], dtype=float32)
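The normalize step in predict_zero_shot divides each label's entailment probability by the sum over all labels, which is why the two scores in each output above add up to 1. A minimal numeric sketch of that step follows; the raw values 0.1 and 0.3 are made up for illustration and are not model outputs, and normalize_scores is a hypothetical helper.

```python
def normalize_scores(entailment_proba):
    """Rescale per-label entailment probabilities so that they sum to 1."""
    total = sum(entailment_proba)
    return [p / total for p in entailment_proba]


raw = [0.1, 0.3]  # hypothetical entailment probabilities for two label texts
scores = normalize_scores(raw)
print(scores)  # roughly one quarter and three quarters
```

With normalize=False the scores stay independent per label, which may suit cases where several labels can apply to the same text or none applies at all.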

Alternatively, you can use Hugging Face pipelines for inference.
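A minimal sketch of the pipeline route, assuming the zero-shot-classification pipeline from transformers. The helper names build_classifier, classify and demo are hypothetical. Passing hypothesis_template='{}' is a choice made here so that each Russian label text reaches the model unchanged instead of being wrapped in the pipeline's default English template.

```python
def build_classifier(model_name='cointegrated/rubert-base-cased-nli-threeway'):
    """Create a zero-shot classification pipeline (downloads the model on first use)."""
    from transformers import pipeline  # imported lazily so the helpers below load without it
    return pipeline('zero-shot-classification', model=model_name)


def classify(classifier, text, label_texts):
    """Score a text against candidate label texts, using each label as the hypothesis."""
    # '{}' inserts the label text as-is; the default template is an English sentence
    return classifier(text, candidate_labels=label_texts, hypothesis_template='{}')


def demo():
    """Example run; not called automatically because it needs the model weights."""
    classifier = build_classifier()
    result = classify(classifier, 'Какая гадость эта ваша заливная рыба!', ['Я доволен', 'Я недоволен'])
    print(result['labels'], result['scores'])
```

Calling demo() downloads the model and prints the labels sorted by score. The scores may differ somewhat from those of predict_zero_shot, since the pipeline applies its own normalization across labels.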

📚 Documentation

Sources

The model has been trained on a series of NLI datasets automatically translated to Russian from English.

Most datasets were taken from the repo of Felipe Salvatore: JOCI, MNLI, MPE, SICK, SNLI.

Some datasets were obtained from the original sources: ANLI, NLI-style FEVER, IMPPRES.

Performance

The table below shows ROC AUC (one class vs rest) for five models on the corresponding dev sets:

tiny: a small BERT predicting entailment vs not_entailment
twoway: a base-sized BERT predicting entailment vs not_entailment
threeway (this model): a base-sized BERT predicting entailment vs contradiction vs neutral
vicgalle-xlm: a large multilingual NLI model
facebook-bart: a large multilingual NLI model

Property	Details
Model Type	Fine-tuned DeepPavlov/rubert-base-cased for NLI and zero-shot classification
Training Data	Datasets: cointegrated/nli-rus-translated-v2021, including data from JOCI, MNLI, MPE, SICK, SNLI, ANLI, NLI-style FEVER, IMPPRES

model	add_one_rte	anli_r1	anli_r2	anli_r3	copa	fever	help	iie	imppres	joci	mnli	monli	mpe	scitail	sick	snli	terra	total
n_observations	387	1000	1000	1200	200	20474	3355	31232	7661	939	19647	269	1000	2126	500	9831	307	101128
tiny/entailment	0.77	0.59	0.52	0.53	0.53	0.90	0.81	0.78	0.93	0.81	0.82	0.91	0.81	0.78	0.93	0.95	0.67	0.77
twoway/entailment	0.89	0.73	0.61	0.62	0.58	0.96	0.92	0.87	0.99	0.90	0.90	0.99	0.91	0.96	0.97	0.97	0.87	0.86
threeway/entailment	0.91	0.75	0.61	0.61	0.57	0.96	0.56	0.61	0.99	0.90	0.91	0.67	0.92	0.84	0.98	0.98	0.90	0.80
vicgalle-xlm/entailment	0.88	0.79	0.63	0.66	0.57	0.93	0.56	0.62	0.77	0.80	0.90	0.70	0.83	0.84	0.91	0.93	0.93	0.78
facebook-bart/entailment	0.51	0.41	0.43	0.47	0.50	0.74	0.55	0.57	0.60	0.63	0.70	0.52	0.56	0.68	0.67	0.72	0.64	0.58
threeway/contradiction		0.71	0.64	0.61		0.97			1.00	0.77	0.92		0.89		0.99	0.98		0.85
threeway/neutral		0.79	0.70	0.62		0.91			0.99	0.68	0.86		0.79		0.96	0.96		0.83

For evaluation (and for training of the tiny and twoway models), some extra datasets were used: Add-one RTE, CoPA, IIE, and SCITAIL, taken from the repo of Felipe Salvatore and translated; HELP and MoNLI, taken from the original sources and translated; and Russian TERRa.
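The one-class-vs-rest ROC AUC reported in the table can be read as the probability that a randomly chosen example of the target class gets a higher score for that class than a randomly chosen example of any other class. The sketch below illustrates the metric on toy data. It is not the evaluation script behind the table, and the function names and the four toy rows are assumptions made for illustration.

```python
def roc_auc(scores, is_positive):
    """ROC AUC as the share of (positive, negative) pairs ranked correctly; ties count as 0.5."""
    pos = [s for s, y in zip(scores, is_positive) if y]
    neg = [s for s, y in zip(scores, is_positive) if not y]
    if not pos or not neg:
        raise ValueError('need at least one positive and one negative example')
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def one_vs_rest_auc(proba_rows, gold_labels, target):
    """Score the target class against all other classes taken together."""
    scores = [row[target] for row in proba_rows]
    is_positive = [gold == target for gold in gold_labels]
    return roc_auc(scores, is_positive)


# Toy predictions (hypothetical, not model outputs) and their gold labels
rows = [
    {'entailment': 0.8, 'contradiction': 0.1, 'neutral': 0.1},
    {'entailment': 0.6, 'contradiction': 0.1, 'neutral': 0.3},
    {'entailment': 0.7, 'contradiction': 0.2, 'neutral': 0.1},
    {'entailment': 0.1, 'contradiction': 0.8, 'neutral': 0.1},
]
gold = ['entailment', 'entailment', 'neutral', 'contradiction']
print(one_vs_rest_auc(rows, gold, 'entailment'))
# 0.75
```

A value of 0.5 corresponds to random ranking, which helps when reading the lower entries in the table.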
