deberta-v3-large-zeroshot-v1
This model is designed for zero-shot classification with the Hugging Face pipeline, offering enhanced zero-shot classification performance compared to the author's other zero-shot models on the Hugging Face Hub.
Quick Start
The model can perform zero-shot classification tasks. Here is a simple example using the zero-shot classification pipeline:
```python
from transformers import pipeline

classifier = pipeline("zero-shot-classification", model="MoritzLaurer/deberta-v3-large-zeroshot-v1")

sequence_to_classify = "Angela Merkel is a politician in Germany and leader of the CDU"
candidate_labels = ["politics", "economy", "entertainment", "environment"]
output = classifier(sequence_to_classify, candidate_labels, multi_label=False)
print(output)
```
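With `multi_label=False`, the scores over the candidate labels sum to one and a single best label is selected. Setting `multi_label=True` instead scores each label independently, which suits texts that may belong to several categories at once. A brief illustration, reusing the classifier from the snippet above:

```python
# Score each candidate label independently (the scores no longer sum to 1).
output_multi = classifier(sequence_to_classify, candidate_labels, multi_label=True)
print(output_multi["labels"], output_multi["scores"])
```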
Features
- Universal Task: The model determines whether a hypothesis is `true` or `not_true` given a text, based on the Natural Language Inference (NLI) task. Any classification task can be reformulated into this task (see the sketch below this list).
- Enhanced Performance: It performs substantially better at zero-shot classification than the author's other zero-shot models on the Hugging Face Hub.
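As mentioned above, any classification task can be cast as an entailment problem. In the pipeline this happens by inserting each candidate label into a hypothesis sentence, and the `hypothesis_template` argument controls that sentence. A short, illustrative sketch (the template wording and example text are assumptions, not a required format):

```python
from transformers import pipeline

classifier = pipeline("zero-shot-classification", model="MoritzLaurer/deberta-v3-large-zeroshot-v1")

text = "The quarterly figures exceeded analyst expectations."  # illustrative example text
labels = ["economy", "politics", "sports"]

# Each candidate label is inserted into the template to form an NLI hypothesis,
# e.g. "This example is about economy."
output = classifier(text, labels, hypothesis_template="This example is about {}.")
print(output["labels"][0], output["scores"][0])
```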
Installation
Installation mainly involves setting up the `transformers` library. You can install it with:

```bash
pip install transformers
```
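DeBERTa-v3 uses a SentencePiece-based tokenizer, so it can also help to install the `sentencepiece` package together with a reasonably recent Transformers release (see the debugging note at the end of this card), for example `pip install "transformers>=4.13" sentencepiece`.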
Documentation
Model description
The model is designed for zero-shot classification with the Hugging Face pipeline. It should perform significantly better at zero-shot classification than the author's other zero-shot models on the Hugging Face Hub: https://huggingface.co/MoritzLaurer.
The model can handle one universal task: determining whether a hypothesis is `true` or `not_true` given a text (also called `entailment` vs. `not_entailment`). This task format is based on the Natural Language Inference (NLI) task, and any classification task can be reformulated into it.
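Outside the pipeline, the same universal task can be run as a plain NLI-style forward pass: encode the text as premise and a label description as hypothesis, then read off the entailment probability. A minimal sketch (the hypothesis wording is illustrative, and the label order should be checked against `model.config.id2label`):

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_name = "MoritzLaurer/deberta-v3-large-zeroshot-v1"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

premise = "Angela Merkel is a politician in Germany and leader of the CDU"
hypothesis = "This text is about politics."  # illustrative hypothesis wording

inputs = tokenizer(premise, hypothesis, truncation=True, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# The model predicts two classes (entailment vs. not_entailment);
# use model.config.id2label to map indices to label names.
probs = torch.softmax(logits, dim=-1)[0]
print({model.config.id2label[i]: round(p.item(), 3) for i, p in enumerate(probs)})
```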
Training data
The model was trained on a mixture of 27 tasks and 310 classes reformatted into the universal format:
- 26 classification tasks with ~400k texts:
'amazonpolarity', 'imdb', 'appreviews', 'yelpreviews', 'rottentomatoes',
'emotiondair', 'emocontext', 'empathetic',
'financialphrasebank', 'banking77', 'massive',
'wikitoxic_toxicaggregated', 'wikitoxic_obscene', 'wikitoxic_threat', 'wikitoxic_insult', 'wikitoxic_identityhate',
'hateoffensive', 'hatexplain', 'biasframes_offensive', 'biasframes_sex', 'biasframes_intent',
'agnews', 'yahootopics',
'trueteacher', 'spam', 'wellformedquery'.
See details on each dataset here: https://docs.google.com/spreadsheets/d/1Z18tMh02IiWgh6o8pfoMiI_LH4IXpr78wd_nmNd5FaE/edit?usp=sharing
- Five NLI datasets with ~885k texts: "mnli", "anli", "fever", "wanli", "ling"
Note that compared to other NLI models, this model predicts two classes (`entailment` vs. `not_entailment`) instead of three classes (entailment/neutral/contradiction).
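The two-class output head and its label names can be confirmed from the model configuration, for example:

```python
from transformers import AutoConfig

config = AutoConfig.from_pretrained("MoritzLaurer/deberta-v3-large-zeroshot-v1")
print(config.id2label)  # expected to show the two classes (entailment vs. not_entailment)
```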
Details on data and training
The code for preparing the data and training & evaluating the model is fully open-source here: https://github.com/MoritzLaurer/zeroshot-classifier/tree/main
Technical Details
The model's training combines multiple classification tasks with NLI datasets. By reformulating the various classification tasks into the `entailment` vs. `not_entailment` format, the model can perform zero-shot classification.
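As a schematic illustration of that reformulation (this is not the author's actual preprocessing code, which is in the repository linked above): a single labelled classification example can be expanded into one `entailment` pair for the true class and `not_entailment` pairs for the remaining classes.

```python
# Schematic sketch: turn one classification example into NLI-style training pairs.
# The hypothesis template is illustrative; the real preprocessing lives in the linked repo.
def to_nli_pairs(text, true_label, all_labels, template="This text is about {}."):
    pairs = []
    for label in all_labels:
        pairs.append({
            "premise": text,
            "hypothesis": template.format(label),
            "label": "entailment" if label == true_label else "not_entailment",
        })
    return pairs

print(to_nli_pairs("The stock market fell sharply today.", "economy",
                   ["economy", "politics", "sports"]))
```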
License
The base model (DeBERTa-v3) is published under the MIT license. The datasets the model was fine-tuned on are published under a diverse set of licenses. The following spreadsheet provides an overview of the non-NLI datasets used for fine-tuning, containing information on licenses, the underlying papers, etc.: https://docs.google.com/spreadsheets/d/1Z18tMh02IiWgh6o8pfoMiI_LH4IXpr78wd_nmNd5FaE/edit?usp=sharing
In addition, the model was also trained on the following NLI datasets: MNLI, ANLI, WANLI, LING-NLI, FEVER-NLI.
Other Information
Limitations and bias
The model can only do text classification tasks. Please consult the original DeBERTa paper and the papers for the different datasets for potential biases.
Citation
If you use this model, please cite:
```bibtex
@article{laurer_less_2023,
  title = {Less {Annotating}, {More} {Classifying}: {Addressing} the {Data} {Scarcity} {Issue} of {Supervised} {Machine} {Learning} with {Deep} {Transfer} {Learning} and {BERT}-{NLI}},
  issn = {1047-1987, 1476-4989},
  shorttitle = {Less {Annotating}, {More} {Classifying}},
  url = {https://www.cambridge.org/core/product/identifier/S1047198723000207/type/journal_article},
  doi = {10.1017/pan.2023.20},
  language = {en},
  urldate = {2023-06-20},
  journal = {Political Analysis},
  author = {Laurer, Moritz and Van Atteveldt, Wouter and Casas, Andreu and Welbers, Kasper},
  month = jun,
  year = {2023},
  pages = {1--33},
}
```
Ideas for cooperation or questions?
If you have questions or ideas for cooperation, contact me at m{dot}laurer{at}vu{dot}nl or on [LinkedIn](https://www.linkedin.com/in/moritz-laurer/).
Debugging and issues
Note that DeBERTa-v3 was released on 06.12.21 and older versions of HF Transformers seem to have issues running the model (e.g., resulting in an issue with the tokenizer). Using Transformers>=4.13 might solve some issues.
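If you run into tokenizer errors, a quick first check is the installed Transformers version:

```python
import transformers

print(transformers.__version__)  # should be at least 4.13 for DeBERTa-v3
```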