DeBERTa-v3-large-zeroshot-v2.0-c Open-Source Model: Free Deployment for Efficient Zero-Shot Classification

Deberta V3 Large Zeroshot V2.0 C

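Developed by MoritzLaurer

A DeBERTa-v3-large model designed for efficient zero-shot classification, trained on fully commercially friendly synthetic data and NLI datasets, with support for GPU and CPU inference

Text Classification

Transformers

Language: English | License: MIT | #zero-shot classification #commercially friendly data #multi-industry

Downloads: 1,560

Release date: 3/20/2024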

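Model Overview

A zero-shot classification model based on the DeBERTa-v3-large architecture. It classifies text without task-specific training data by framing classification as a natural language inference (NLI) task, and it is suited to a wide range of domains.

Model Features

Commercially friendly data

Trained on synthetic data generated with Mixtral and on the commercially friendly MNLI and FEVER-NLI datasets, which meets strict licensing requirements.

Zero-shot classification

Performs text classification without training data: a hypothesis template converts any classification task into the NLI format.

High-performance architecture

Built on DeBERTa-v3-large; reaches an average F1 score of 0.676 across 28 text classification tasks, ahead of comparable baseline models.

Flexible templates

Supports custom hypothesis templates (hypothesis_template). As with prompt engineering for LLMs, rewording the template can improve classification results.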
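
To make the template mechanism concrete, the following minimal sketch shows how a hypothesis template and a list of verbalized classes expand into premise/hypothesis pairs, which is the input format of an NLI model. The helper name `build_nli_pairs` is hypothetical and only illustrates the idea; the Hugging Face pipeline performs this step internally.

```python
def build_nli_pairs(text, classes_verbalized, hypothesis_template="This text is about {}"):
    """Pair the input text (premise) with one hypothesis per candidate class."""
    # If formatting changes nothing, the template has no placeholder to fill.
    if hypothesis_template.format("x") == hypothesis_template:
        raise ValueError("hypothesis_template must contain a '{}' placeholder")
    return [(text, hypothesis_template.format(label)) for label in classes_verbalized]


pairs = build_nli_pairs(
    "Angela Merkel is a politician in Germany and leader of the CDU",
    ["politics", "economy"],
)
for premise, hypothesis in pairs:
    print(hypothesis)
```

Each pair is then scored for entailment, so changing the wording of the template changes the hypotheses the model actually sees.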

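Model Capabilities

Zero-shot text classification

Multi-class classification (single-label or multi-label)

Cross-domain classification (25+ industries)

Use Cases

Content classification

News topic classification

Automatically sorts news articles into topics such as politics, economy, and entertainment

Shows high accuracy in tests on synthetic data

Social media content moderation

Identifies categories of policy-violating content (hate speech, misinformation, and so on)

Business analytics

Customer feedback classification

Automatically groups user reviews by dimensions such as product features and service quality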

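🚀 DeBERTa-v3-large Zero-Shot Classifier v2.0-c

This project provides a series of zero-shot classification models built on the natural language inference (NLI) task. The models classify text without any training data and run on both GPUs and CPUs, which makes them an efficient and flexible option for text classification.

🚀 Quick Start

The models in this series are designed to be used with the Hugging Face pipeline for efficient zero-shot classification. An overview of the latest zero-shot classifiers is available in the Zeroshot Classifier Collection.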

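✨ Key Features

No training data required: the models classify text without any training data, which saves considerable time and resources.
Runs on GPU and CPU: inference works on both, so the models fit a wide range of deployments.
Commercially friendly: some models are trained only on commercially friendly data and meet strict licensing requirements.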

📦 Installation

Install the required library with the following command:

pip install transformers[sentencepiece]

💻 Usage Examples

Basic usage

#!pip install transformers[sentencepiece]
from transformers import pipeline
text = "Angela Merkel is a politician in Germany and leader of the CDU"
hypothesis_template = "This text is about {}"
classes_verbalized = ["politics", "economy", "entertainment", "environment"]
zeroshot_classifier = pipeline("zero-shot-classification", model="MoritzLaurer/deberta-v3-large-zeroshot-v2.0")  # change the model identifier here
output = zeroshot_classifier(text, classes_verbalized, hypothesis_template=hypothesis_template, multi_label=False)
print(output)
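
The pipeline returns a dictionary with the keys `sequence`, `labels`, and `scores`, with labels ordered from highest to lowest score. The sketch below shows one way to read such a result and apply a confidence cutoff. The helper `top_label`, the 0.5 threshold, and the score values in `example_output` are illustrative assumptions, not real model output.

```python
def top_label(output, min_score=0.5):
    """Return (label, score) for the best class, or (None, score) below the cutoff."""
    # Pick the maximum explicitly instead of relying on the order of the lists.
    label, score = max(zip(output["labels"], output["scores"]), key=lambda pair: pair[1])
    return (label, score) if score >= min_score else (None, score)


# Hand-written stand-in for a pipeline result (the scores are made up).
example_output = {
    "sequence": "Angela Merkel is a politician in Germany and leader of the CDU",
    "labels": ["politics", "economy", "environment", "entertainment"],
    "scores": [0.91, 0.05, 0.03, 0.01],
}
print(top_label(example_output))
```

A cutoff like this is useful when none of the candidate classes may apply; a suitable value depends on the task and should be checked on a few labeled examples.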

Advanced usage

from transformers import pipeline

text = "Angela Merkel is a politician in Germany and leader of the CDU"

# formulation 1
hypothesis_template = "This text is about {}"
classes_verbalized = ["politics", "economy", "entertainment", "environment"]

# formulation 2, depending on your use case
# (these assignments override formulation 1; comment out the pair you are not testing)
hypothesis_template = "The topic of this text is {}"
classes_verbalized = ["political activities", "economic policy", "entertainment or music", "environmental protection"]

# test different formulations
zeroshot_classifier = pipeline("zero-shot-classification", model="MoritzLaurer/deberta-v3-large-zeroshot-v2.0")  # change the model identifier here
output = zeroshot_classifier(text, classes_verbalized, hypothesis_template=hypothesis_template, multi_label=False)
print(output)
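
To compare several formulations side by side instead of overwriting variables, one option is a small loop like the sketch below. The helper `compare_formulations` and the stand-in `fake_classifier` are hypothetical; the stand-in only mimics the shape of the pipeline output with uniform scores, so the example runs without downloading a model.

```python
def compare_formulations(classifier, text, formulations):
    """Run each (template, classes) pair and collect the top prediction for each."""
    results = []
    for hypothesis_template, classes_verbalized in formulations:
        output = classifier(
            text,
            classes_verbalized,
            hypothesis_template=hypothesis_template,
            multi_label=False,
        )
        # The pipeline orders labels by descending score, so index 0 is the top class.
        results.append((hypothesis_template, output["labels"][0], output["scores"][0]))
    return results


formulations = [
    ("This text is about {}", ["politics", "economy", "entertainment", "environment"]),
    ("The topic of this text is {}", ["political activities", "economic policy",
                                      "entertainment or music", "environmental protection"]),
]


def fake_classifier(text, classes, hypothesis_template, multi_label):
    """Stand-in that returns the pipeline's output shape with uniform scores."""
    return {"sequence": text, "labels": list(classes), "scores": [1 / len(classes)] * len(classes)}


for row in compare_formulations(fake_classifier, "some text", formulations):
    print(row)
```

In real use, pass the `zeroshot_classifier` object created above in place of `fake_classifier`.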

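📚 Documentation

Model description

The main update in the zeroshot-v2.0 series is that several models are trained on fully commercially friendly data, for users with strict license requirements. The models perform one universal classification task: given a text, determine whether a hypothesis is "true" or "not true" (entailment vs. not_entailment). This task format is based on natural language inference (NLI), and the Hugging Face pipeline can reformulate any classification task into it.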
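
A rough sketch of how entailment logits can be turned into class scores, assuming the usual behavior of the zero-shot pipeline: with `multi_label=False` the entailment logits of all hypotheses are normalized together, and with `multi_label=True` each hypothesis is normalized on its own against its not_entailment logit. The function names and the logit values are made up for illustration.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def single_label_scores(entailment_logits):
    """Classes compete: softmax over the entailment logits of all hypotheses."""
    return softmax(entailment_logits)


def multi_label_scores(logit_pairs):
    """Classes are independent: softmax over (entailment, not_entailment) per hypothesis."""
    return [softmax([entail, not_entail])[0] for entail, not_entail in logit_pairs]


# Made-up logits for three hypotheses, given as (entailment, not_entailment).
logits = [(3.0, -2.0), (0.5, 0.5), (-1.0, 2.0)]
print(single_label_scores([entail for entail, _ in logits]))  # values sum to 1
print(multi_label_scores(logits))  # each value lies in [0, 1] on its own
```

This is why single-label scores always sum to 1, while multi-label scores can all be high or all be low for the same text.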

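Training data

Models with "-c" in the name are trained on two types of fully commercially friendly data:

Synthetic data: generated with Mixtral-8x7B-Instruct-v0.1. The final dataset is the mixtral_written_text_for_tasks_v4 subset of the synthetic_zeroshot_mixtral_v0.1 dataset.
Commercially friendly NLI datasets: MNLI and FEVER-NLI, added to improve generalization.

Models without "-c" in the name are additionally trained on a broader mix of data with a broader mix of licenses.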

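Metrics

The models were evaluated on 28 different text classification tasks with the f1_macro metric. The main reference point is facebook/bart-large-mnli, which was the most widely used commercially friendly zero-shot classifier at the time of writing (April 3, 2024).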
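
For readers unfamiliar with the metric, f1_macro is the unweighted mean of the per-class F1 scores, so rare classes count as much as frequent ones. The following is a minimal pure-Python sketch of that definition with toy labels; it is not the evaluation code used for the reported results.

```python
def f1_macro(y_true, y_pred):
    """Unweighted mean of per-class F1 over all classes seen in y_true or y_pred."""
    labels = sorted(set(y_true) | set(y_pred))
    f1_scores = []
    for label in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
        denominator = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never occurs.
        f1_scores.append(2 * tp / denominator if denominator else 0.0)
    return sum(f1_scores) / len(f1_scores)


# Toy example: one "politics" text is misclassified as "economy".
y_true = ["politics", "politics", "economy", "economy"]
y_pred = ["politics", "economy", "economy", "economy"]
print(round(f1_macro(y_true, y_pred), 3))
```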

Property	Details
Model type	DeBERTa-v3-large zero-shot classifier v2.0-c
Training data	Synthetic data and commercially friendly NLI datasets

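Model selection recommendations

DeBERTa-v3 vs. RoBERTa zero-shot classifiers: DeBERTa-v3 performs clearly better than RoBERTa but is somewhat slower. RoBERTa is directly compatible with Hugging Face's production inference TEI containers and with Flash Attention, which makes it a good choice for production use cases.
Commercial use: models with "-c" in the name are guaranteed to be trained only on commercially friendly data. Models without "-c" are trained on more data and perform better, but that data includes sources with non-commercial licenses. Users with strict legal requirements should use the "-c" models.
Multilingual/non-English use: use bge-m3-zeroshot-v2.0 or bge-m3-zeroshot-v2.0-c. Multilingual models perform worse than English-only models, so an alternative is to machine-translate the texts to English first, for example with a library such as EasyNMT, and then apply an English-only model.
Context window: the bge-m3 models can process up to 8192 tokens; the other models can process up to 512 tokens. Longer inputs make the model slower and reduce performance, so for texts of up to roughly 400 words (about one page), a DeBERTa model is the better choice.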
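
The recommendations above can be summarized as a small decision helper. This is only a sketch: the function `suggest_model` is hypothetical, the `MoritzLaurer/` prefix for the bge-m3 models is assumed by analogy with the identifier in the usage examples, and the RoBERTa variants are left out because the text does not name their identifiers.

```python
def suggest_model(multilingual=False, long_inputs=False, strict_commercial_license=False):
    """Map the selection advice to (model identifier, context limit in tokens)."""
    # bge-m3 is the suggested family for non-English text and for long inputs.
    use_bge_m3 = multilingual or long_inputs
    family = "bge-m3" if use_bge_m3 else "deberta-v3-large"
    # The "-c" suffix marks models trained only on commercially friendly data.
    suffix = "-c" if strict_commercial_license else ""
    # Context limits as stated in the text: 8192 tokens for bge-m3, 512 otherwise.
    max_tokens = 8192 if use_bge_m3 else 512
    return f"MoritzLaurer/{family}-zeroshot-v2.0{suffix}", max_tokens


print(suggest_model())
print(suggest_model(multilingual=True, strict_commercial_license=True))
```

The returned identifier can be passed as the `model` argument of the pipeline call shown in the usage examples.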

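Reproduction

The reproduction code is in the v2_synthetic_data directory of the project's code repository.

Limitations and bias

The model can only perform text classification tasks. Biases can come from the underlying foundation model, the human-written NLI training data, and the synthetic data generated by Mixtral.

License

The foundation model was published under the MIT license. The licenses of the training data vary by model; see above.

Citation

If you use this model in academic research, please cite the following paper: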

@misc{laurer_building_2023,
	title = {Building {Efficient} {Universal} {Classifiers} with {Natural} {Language} {Inference}},
	url = {http://arxiv.org/abs/2312.17543},
	doi = {10.48550/arXiv.2312.17543},
	abstract = {Generative Large Language Models (LLMs) have become the mainstream choice for fewshot and zeroshot learning thanks to the universality of text generation. Many users, however, do not need the broad capabilities of generative LLMs when they only want to automate a classification task. Smaller BERT-like models can also learn universal tasks, which allow them to do any text classification task without requiring fine-tuning (zeroshot classification) or to learn new tasks with only a few examples (fewshot), while being significantly more efficient than generative LLMs. This paper (1) explains how Natural Language Inference (NLI) can be used as a universal classification task that follows similar principles as instruction fine-tuning of generative LLMs, (2) provides a step-by-step guide with reusable Jupyter notebooks for building a universal classifier, and (3) shares the resulting universal classifier that is trained on 33 datasets with 389 diverse classes. Parts of the code we share has been used to train our older zeroshot classifiers that have been downloaded more than 55 million times via the Hugging Face Hub as of December 2023. Our new classifier improves zeroshot performance by 9.4\%.},
	urldate = {2024-01-05},
	publisher = {arXiv},
	author = {Laurer, Moritz and van Atteveldt, Wouter and Casas, Andreu and Welbers, Kasper},
	month = dec,
	year = {2023},
	note = {arXiv:2312.17543 [cs]},
	keywords = {Computer Science - Artificial Intelligence, Computer Science - Computation and Language},
}