🚀 Bloomz-3b-NLI Model
The Bloomz-3b-NLI model is fine-tuned from the Bloomz-3b-chat-dpo foundation model, trained on a language-agnostic Natural Language Inference (NLI) task.
🚀 Quick Start
The Bloomz-3b-NLI model is fine-tuned from the Bloomz-3b-chat-dpo foundation model. It is trained on a Natural Language Inference (NLI) task in a language-agnostic manner. The NLI task focuses on determining the semantic relationship between a hypothesis and a set of premises, typically presented as sentence pairs. The goal is to predict textual entailment (whether sentence A implies, contradicts, or has no relation to sentence B), which is a classification task.
Language-agnostic approach
Note that the languages of the hypothesis and the premise are drawn at random from English and French, so each of the four language combinations occurs with a 25% probability.
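To make the cross-language setup concrete, here is a minimal sketch of a single NLI prediction with a French premise and an English hypothesis. It assumes the checkpoint loads as a sequence-classification model and reads the label names (contradiction / neutral / entailment) from its config; the sentence pair is a made-up example, not taken from the training data.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "cmarkea/bloomz-3b-nli"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

# Hypothetical cross-language pair: French premise, English hypothesis.
premise = "Le film a reçu un accueil très favorable de la critique."
hypothesis = "Critics praised the movie."

inputs = tokenizer(premise, hypothesis, return_tensors="pt")
with torch.no_grad():
    probs = torch.softmax(model(**inputs).logits[0], dim=-1)

# Map class probabilities to the label names stored in the model config.
print({model.config.id2label[i]: round(p.item(), 4) for i, p in enumerate(probs)})
```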
Performance
| class | precision (%) | f1-score (%) | support |
|:-------------:|:-------------:|:------------:|:-------:|
| global | 81.96 | 81.07 | 5,010 |
| contradiction | 81.80 | 84.04 | 1,670 |
| entailment | 84.82 | 81.96 | 1,670 |
| neutral | 76.85 | 77.20 | 1,670 |
Benchmark
The benchmark covers two evaluation settings:
- Hypothesis and premise in French
- Hypothesis in French and premise in English (cross-language setting)
✨ Features
Zero-shot Classification
The main advantage of training such models lies in their zero-shot classification performance. This means the model can classify any text with any label without specific training. What differentiates the Bloomz-3b-NLI LLMs in this area is their ability to model and extract information from much more complex and lengthy text structures compared to models like BERT, RoBERTa, or CamemBERT.
The zero-shot classification task can be summarized by:
$$P(hypothesis=i\in\mathcal{C}|premise)=\frac{e^{P(premise=entailment\vert hypothesis=i)}}{\sum_{j\in\mathcal{C}}e^{P(premise=entailment\vert hypothesis=j)}}$$
Here, $i$ denotes a hypothesis built by filling a template (e.g., "This text is about {}.") with a candidate label from the set $\mathcal{C}$ of candidate labels ("cinema", "politics", etc.). The set of hypotheses is therefore {"This text is about cinema.", "This text is about politics.", ...}. Each hypothesis is scored against the premise, which is the sentence we want to classify.
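The formula above can be reproduced by hand: score each candidate hypothesis for entailment against the premise, then take a softmax over those entailment probabilities. The sketch below assumes the config's `id2label` mapping contains an "entailment" entry; the premise is a made-up example.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "cmarkea/bloomz-3b-nli"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

premise = "Le nouveau film de ce réalisateur sort en salles mercredi."  # hypothetical text to classify
template = "Ce texte parle de {}."
candidate_labels = ["cinéma", "politique"]

# Index of the "entailment" class, assumed to be present in the config.
entail_id = next(i for i, name in model.config.id2label.items()
                 if name.lower().startswith("entail"))

entail_probs = []
for label in candidate_labels:
    hypothesis = template.format(label)
    inputs = tokenizer(premise, hypothesis, return_tensors="pt")
    with torch.no_grad():
        class_probs = torch.softmax(model(**inputs).logits[0], dim=-1)
    entail_probs.append(class_probs[entail_id])  # P(entailment | premise, hypothesis_i)

# Softmax over the per-label entailment probabilities, as in the formula above.
scores = torch.softmax(torch.stack(entail_probs), dim=-1)
print(dict(zip(candidate_labels, scores.tolist())))
```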
Performance
The model is evaluated on a sentiment analysis task using reviews from the French film review site Allociné. The dataset contains 20,000 reviews labeled into two classes: positive and negative comments. We use the hypothesis template "Ce commentaire est {}." with the candidate classes "positif" and "negatif".
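For illustration, this evaluation setup translates into a pipeline call of the following form (the review text is a made-up example, not taken from the Allociné dataset):

```python
from transformers import pipeline

classifier = pipeline(
    task="zero-shot-classification",
    model="cmarkea/bloomz-3b-nli"
)
classifier(
    sequences="Un scénario brillant et des acteurs au sommet de leur art.",
    candidate_labels="positif, negatif",
    hypothesis_template="Ce commentaire est {}."
)
```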
💻 Usage Examples
Basic Usage
```python
from transformers import pipeline

classifier = pipeline(
    task='zero-shot-classification',
    model="cmarkea/bloomz-3b-nli"
)
result = classifier(
    sequences="Le style très cinéphile de Quentin Tarantino "
              "se reconnaît entre autres par sa narration postmoderne "
              "et non linéaire, ses dialogues travaillés souvent "
              "émaillés de références à la culture populaire, et ses "
              "scènes hautement esthétiques mais d'une violence "
              "extrême, inspirées de films d'exploitation, d'arts "
              "martiaux ou de western spaghetti.",
    candidate_labels="cinéma, technologie, littérature, politique",
    hypothesis_template="Ce texte parle de {}."
)
result
```
{"labels": ["cinéma",
"littérature",
"technologie",
"politique"],
"scores": [0.8745610117912292,
0.10403601825237274,
0.014962797053158283,
0.0064402492716908455]}
The same call also works when the premise is in English (cross-language setting):

```python
result = classifier(
    sequences="Quentin Tarantino's very cinephile style is "
              "recognized, among other things, by his postmodern and "
              "non-linear narration, his elaborate dialogues often "
              "peppered with references to popular culture, and his "
              "highly aesthetic but extremely violent scenes, inspired by "
              "exploitation films, martial arts or spaghetti western.",
    candidate_labels="cinéma, technologie, littérature, politique",
    hypothesis_template="Ce texte parle de {}."
)
result
```
{"labels": ["cinéma",
"littérature",
"technologie",
"politique"],
"scores": [0.9314399361610413,
0.04960821941494942,
0.013468802906572819,
0.005483036395162344]}
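The pipeline also accepts a list of sequences, which is convenient for scoring several texts in one call. A minimal sketch reusing the classifier defined above (the texts are placeholders):

```python
results = classifier(
    sequences=[
        "Premier texte à classer ...",   # placeholder texts
        "Deuxième texte à classer ...",
    ],
    candidate_labels="cinéma, technologie, littérature, politique",
    hypothesis_template="Ce texte parle de {}."
)
# One {"labels": ..., "scores": ...} dict is returned per input sequence.
```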
📄 License
The model uses the bigscience-bloom-rail-1.0 license.
| Property | Details |
|----------|---------|
| Model Type | Bloomz-3b-NLI |
| Training Data | xnli |
| License | bigscience-bloom-rail-1.0 |
| Languages | French, English |
| Pipeline Tag | zero-shot-classification |
| Base Model | cmarkea/bloomz-3b-dpo-chat |
📚 Documentation
Citation
```bibtex
@online{DeBloomzNLI,
  AUTHOR = {Cyrile Delestre},
  URL = {https://huggingface.co/cmarkea/bloomz-3b-nli},
  YEAR = {2024},
  KEYWORDS = {NLP ; Transformers ; LLM ; Bloomz},
}
```