# 🤗 DeBERTa-v3-Large-ZeroShot-V2.0-C
This model is designed for efficient zero-shot classification, enabling classification without training data and running on both GPUs and CPUs.
## 🚀 Quick Start
### Prerequisites
Make sure you have installed the `transformers` library together with `sentencepiece`. You can install both using the following command:
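```bash
pip install transformers[sentencepiece]
```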
### Example Code
```python
from transformers import pipeline

text = "Angela Merkel is a politician in Germany and leader of the CDU"
hypothesis_template = "This text is about {}"
classes_verbalized = ["politics", "economy", "entertainment", "environment"]

# Build the zero-shot classification pipeline with the DeBERTa-v3-large model
zeroshot_classifier = pipeline("zero-shot-classification", model="MoritzLaurer/deberta-v3-large-zeroshot-v2.0")
output = zeroshot_classifier(text, classes_verbalized, hypothesis_template=hypothesis_template, multi_label=False)
print(output)
```
`multi_label=False` forces the model to decide on only one class; `multi_label=True` enables the model to choose multiple classes.
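The pipeline returns a dictionary with the input sequence, the candidate labels sorted from most to least likely, and the corresponding scores. The values below only illustrate the structure and are not actual model output:

```python
# Illustrative output structure (scores are placeholders, not real predictions)
{
    "sequence": "Angela Merkel is a politician in Germany and leader of the CDU",
    "labels": ["politics", "economy", "entertainment", "environment"],
    "scores": [0.98, 0.01, 0.007, 0.003],
}
```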
## ✨ Features
### zeroshot-v2.0 Series of Models
Models in this series are designed for efficient zeroshot classification with the Hugging Face pipeline.
These models can do classification without training data and run on both GPUs and CPUs.
An overview of the latest zeroshot classifiers is available in my Zeroshot Classifier Collection.
The main update of this `zeroshot-v2.0` series of models is that several models are trained on fully commercially-friendly data for users with strict license requirements.
These models can do one universal classification task: determine whether a hypothesis is "true" or "not true" given a text (`entailment` vs. `not_entailment`).
This task format is based on the Natural Language Inference task (NLI).
The task is so universal that any classification task can be reformulated into this task by the Hugging Face pipeline.
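For illustration, the snippet below is a minimal sketch of how this underlying entailment task can be called directly, outside the pipeline. The exact label names and their order are read from `model.config.id2label` rather than assumed:

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_name = "MoritzLaurer/deberta-v3-large-zeroshot-v2.0"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

text = "Angela Merkel is a politician in Germany and leader of the CDU"
hypothesis = "This text is about politics"

# Score the (text, hypothesis) pair on the universal entailment task
inputs = tokenizer(text, hypothesis, return_tensors="pt", truncation=True)
with torch.no_grad():
    logits = model(**inputs).logits
probs = torch.softmax(logits, dim=-1)[0]
print({model.config.id2label[i]: round(p.item(), 3) for i, p in enumerate(probs)})
```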
## 📦 Installation
The installation mainly involves installing the `transformers` library (together with `sentencepiece`). You can use the following command:
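```bash
pip install transformers[sentencepiece]
```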
## 💻 Usage Examples
### Basic Usage
```python
from transformers import pipeline

text = "Angela Merkel is a politician in Germany and leader of the CDU"
hypothesis_template = "This text is about {}"
classes_verbalized = ["politics", "economy", "entertainment", "environment"]
zeroshot_classifier = pipeline("zero-shot-classification", model="MoritzLaurer/deberta-v3-large-zeroshot-v2.0")
output = zeroshot_classifier(text, classes_verbalized, hypothesis_template=hypothesis_template, multi_label=False)
print(output)
```
### Advanced Usage
You can set `multi_label=True` to enable the model to choose multiple classes:
```python
from transformers import pipeline

text = "Angela Merkel is a politician in Germany and leader of the CDU"
hypothesis_template = "This text is about {}"
classes_verbalized = ["politics", "economy", "entertainment", "environment"]
zeroshot_classifier = pipeline("zero-shot-classification", model="MoritzLaurer/deberta-v3-large-zeroshot-v2.0")
output = zeroshot_classifier(text, classes_verbalized, hypothesis_template=hypothesis_template, multi_label=True)
print(output)
```
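With `multi_label=True`, the pipeline scores each candidate label independently (one entailment check per hypothesis), so the returned scores do not need to sum to 1. Again, the values below only illustrate the structure and are not actual model output:

```python
# Illustrative output with multi_label=True (scores are placeholders, not real predictions)
{
    "sequence": "Angela Merkel is a politician in Germany and leader of the CDU",
    "labels": ["politics", "economy", "environment", "entertainment"],
    "scores": [0.99, 0.12, 0.03, 0.02],
}
```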
## 📚 Documentation
### Training Data
Models with a "`-c`" in the name are trained on two types of fully commercially-friendly data:

- Synthetic data generated with Mixtral-8x7B-Instruct-v0.1. I first created a list of 500+ diverse text classification tasks for 25 professions in conversations with Mistral-large, and the data was manually curated. I then used this as seed data to generate several hundred thousand texts for these tasks with Mixtral-8x7B-Instruct-v0.1. The final dataset is available in the synthetic_zeroshot_mixtral_v0.1 dataset, in the subset `mixtral_written_text_for_tasks_v4`. Data curation was done in multiple iterations and will be improved in future iterations.
- Two commercially-friendly NLI datasets (MNLI, FEVER-NLI). These datasets were added to increase generalization.
- Models without a "`-c`" in the name also included a broader mix of training data under a broader range of licenses: ANLI, WANLI, LingNLI, and all datasets in this list where `used_in_v1.1==True`.
### Metrics
The models were evaluated on 28 different text classification tasks with the f1_macro metric. The main reference point is `facebook/bart-large-mnli`, which is, at the time of writing (03.04.24), the most-used commercially-friendly zero-shot classifier.
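For reference, the snippet below is a minimal sketch of how a macro-averaged F1 score can be computed with scikit-learn; it is not the exact evaluation script used for the table, and the labels are hypothetical:

```python
# Hypothetical example: macro-averaged F1 over gold vs. predicted class labels
from sklearn.metrics import f1_score

y_true = ["politics", "economy", "politics", "environment"]
y_pred = ["politics", "politics", "politics", "environment"]

print(f1_score(y_true, y_pred, average="macro"))
```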

| | facebook/bart-large-mnli | roberta-base-zeroshot-v2.0-c | roberta-large-zeroshot-v2.0-c | deberta-v3-base-zeroshot-v2.0-c | deberta-v3-base-zeroshot-v2.0 (fewshot) | deberta-v3-large-zeroshot-v2.0-c | deberta-v3-large-zeroshot-v2.0 (fewshot) | bge-m3-zeroshot-v2.0-c | bge-m3-zeroshot-v2.0 (fewshot) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| all datasets mean | 0.497 | 0.587 | 0.622 | 0.619 | 0.643 (0.834) | 0.676 | 0.673 (0.846) | 0.59 | (0.803) |
| amazonpolarity (2) | 0.937 | 0.924 | 0.951 | 0.937 | 0.943 (0.961) | 0.952 | 0.956 (0.968) | 0.942 | (0.951) |
| imdb (2) | 0.892 | 0.871 | 0.904 | 0.893 | 0.899 (0.936) | 0.923 | 0.918 (0.958) | 0.873 | (0.917) |
| appreviews (2) | 0.934 | 0.913 | 0.937 | 0.938 | 0.945 (0.948) | 0.943 | 0.949 (0.962) | 0.932 | (0.954) |
| yelpreviews (2) | 0.948 | 0.953 | 0.977 | 0.979 | 0.975 (0.989) | 0.988 | 0.985 (0.994) | 0.973 | (0.978) |
| rottentomatoes (2) | 0.83 | 0.802 | 0.841 | 0.84 | 0.86 (0.902) | 0.869 | 0.868 (0.908) | 0.813 | (0.866) |
| emotiondair (6) | 0.455 | 0.482 | 0.486 | 0.459 | 0.495 (0.748) | 0.499 | 0.484 (0.688) | 0.453 | (0.697) |
| emocontext (4) | 0.497 | 0.555 | 0.63 | 0.59 | 0.592 (0.799) | 0.699 | 0.676 (0.81) | 0.61 | (0.798) |
| empathetic (32) | 0.371 | 0.374 | 0.404 | 0.378 | 0.405 (0.53) | 0.447 | 0.478 (0.555) | 0.387 | (0.455) |
| financialphrasebank (3) | 0.465 | 0.562 | 0.455 | 0.714 | 0.669 (0.906) | 0.691 | 0.582 (0.913) | 0.504 | (0.895) |
| banking77 (72) | 0.312 | 0.124 | 0.29 | 0.421 | 0.446 (0.751) | 0.513 | 0.567 (0.766) | 0.387 | (0.715) |
| massive (59) | 0.43 | 0.428 | 0.543 | 0.512 | 0.52 (0.755) | 0.526 | 0.518 (0.789) | 0.414 | (0.692) |
| wikitoxic_toxicaggreg (2) | 0.547 | 0.751 | 0.766 | 0.751 | 0.769 (0.904) | 0.741 | 0.787 (0.911) | 0.736 | (0.9) |
| wikitoxic_obscene (2) | 0.713 | 0.817 | 0.854 | 0.853 | 0.869 (0.922) | 0.883 | 0.893 (0.933) | 0.783 | (0.914) |
| wikitoxic_threat (2) | 0.295 | 0.71 | 0.817 | 0.813 | 0.87 (0.946) | 0.827 | 0.879 (0.952) | 0.68 | (0.947) |
| wikitoxic_insult (2) | 0.372 | 0.724 | 0.798 | 0.759 | 0.811 (0.912) | 0.77 | 0.779 (0.924) | 0.783 | (0.915) |
| wikitoxic_identityhate (2) | 0.473 | 0.774 | 0.798 | 0.774 | 0.765 (0.938) | 0.797 | 0.806 (0.948) | 0.761 | (0.931) |
| hateoffensive (3) | 0.161 | 0.352 | 0.29 | 0.315 | 0.371 (0.862) | 0.47 | 0.461 (0.847) | 0.291 | (0.823) |
| hatexplain (3) | 0.239 | 0.396 | 0.314 | 0.376 | 0.369 (0.765) | 0.378 | 0.389 (0.764) | 0.29 | (0.729) |
| biasframes_offensive (2) | 0.336 | 0.571 | 0.583 | 0.544 | 0.601 (0.867) | 0.644 | 0.656 (0.883) | 0.541 | (0.855) |
| biasframes_sex (2) | 0.263 | 0.617 | 0.835 | 0.741 | 0.809 (0.922) | 0.846 | 0.815 (0.946) | 0.748 | (0.905) |
| biasframes_intent (2) | 0.616 | 0.531 | 0.635 | 0.554 | 0.61 (0.881) | 0.696 | 0.687 (0.891) | 0.467 | (0.868) |
| agnews (4) | 0.703 | 0.758 | 0.745 | 0.68 | 0.742 (0.898) | 0.819 | 0.771 (0.898) | 0.687 | (0.892) |
| yahootopics (10) | 0.299 | 0.543 | 0.62 | 0.578 | 0.564 (0.722) | 0.621 | 0.613 (0.738) | 0.587 | (0.711) |
| trueteacher (2) | ... | ... | ... | ... | ... | ... | ... | ... | ... |
## 📄 License
This project is licensed under the MIT license.