DeBERTa-v3-large-zeroshot-v2.0-c open-source model: free deployment for efficient zero-shot classification


Deberta V3 Large Zeroshot V2.0 C

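Developed by MoritzLaurer

A DeBERTa-v3-large model designed for efficient zero-shot classification, trained on fully commercially friendly synthetic data and NLI datasets; supports inference on GPU and CPU.

Text classification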

Transformers

Language: English | License: MIT | #zero-shot-classification #commercially-friendly-data #multi-industry

Downloads: 1,560

Released: March 20, 2024

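Model overview

A zero-shot classification model based on the DeBERTa-v3-large architecture. It casts text classification as a natural language inference (NLI) task, so it can classify without training data across many domains.

Model features

Commercially friendly data

Trained on synthetic data generated with Mixtral and on the commercially friendly MNLI and FEVER-NLI datasets, meeting strict license requirements.

Zero-shot classification

Performs text classification without training data by using a hypothesis template to convert any classification task into NLI format.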
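
The conversion into NLI format can be sketched in plain Python. The helper names `build_hypotheses` and `build_nli_pairs` are hypothetical and only illustrate the idea: each candidate label is substituted into the hypothesis template, and the input text is paired with every resulting hypothesis. The real Hugging Face pipeline does this internally and may differ in details.

```python
def build_hypotheses(labels, hypothesis_template="This text is about {}"):
    """Return one NLI hypothesis per candidate label."""
    if "{}" not in hypothesis_template:
        raise ValueError("hypothesis_template must contain a '{}' placeholder")
    return [hypothesis_template.format(label) for label in labels]


def build_nli_pairs(text, labels, hypothesis_template="This text is about {}"):
    """Pair the input text (the premise) with every hypothesis."""
    hypotheses = build_hypotheses(labels, hypothesis_template)
    return [(text, hypothesis) for hypothesis in hypotheses]


pairs = build_nli_pairs(
    "Angela Merkel is a politician in Germany and leader of the CDU",
    ["politics", "economy"],
)
for premise, hypothesis in pairs:
    print(f"premise={premise!r} | hypothesis={hypothesis!r}")
```

The model then judges each premise/hypothesis pair separately, which is why any set of labels can be used without retraining.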

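High-performance architecture

Based on the DeBERTa-v3-large architecture; reaches an average F1 score of 0.676 across 28 text classification tasks, ahead of comparable baseline models.

Flexible templates

Supports custom hypothesis templates (hypothesis_template), similar to prompt engineering for LLMs, which can improve classification results.

Model capabilities

Zero-shot text classification

Multi-class classification (single-label / multi-label)

Cross-domain classification (25+ industries supported)

Use cases

Content classification

News topic classification

Automatically sorts news into topics such as politics, economy, and entertainment

Shows high accuracy in tests on synthetic data

Social media content moderation

Identifies categories of policy-violating content (hate speech, misinformation, etc.)

Business analytics

Customer feedback classification

Automatically assigns user reviews to dimensions such as product features and service quality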

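🚀 DeBERTa-v3-large zero-shot classifier v2.0-c

This project provides a series of zero-shot classification models built on the natural language inference (NLI) task. The models classify without training data, run on both GPU and CPU, and offer an efficient, flexible solution for text classification.

🚀 Quick start

The models in this series are designed for efficient zero-shot classification with the Hugging Face pipeline. An overview of the latest zero-shot classifiers is available in the Zeroshot Classifier Collection.

✨ Key features

No training data needed: the models classify without any training data, which saves considerable time and resources.
Runs on GPU and CPU: inference works on both, so the models fit a wide range of environments.
Commercially friendly: several models are trained on fully commercially friendly data and meet strict license requirements.

📦 Installation

Install the required library with the following command: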

pip install transformers[sentencepiece]

💻 Usage examples

Basic usage

#!pip install transformers[sentencepiece]
from transformers import pipeline
# Text to classify
text = "Angela Merkel is a politician in Germany and leader of the CDU"
# Template that turns each class name into an NLI hypothesis
hypothesis_template = "This text is about {}"
classes_verbalized = ["politics", "economy", "entertainment", "environment"]
zeroshot_classifier = pipeline("zero-shot-classification", model="MoritzLaurer/deberta-v3-large-zeroshot-v2.0")  # change the model identifier here, e.g. to the "-c" variant this page describes
# multi_label=False: scores are normalized across the classes so that one class wins
output = zeroshot_classifier(text, classes_verbalized, hypothesis_template=hypothesis_template, multi_label=False)
print(output)
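
The pipeline returns a dictionary with the keys `sequence`, `labels`, and `scores`, where labels are ordered by descending score. A minimal sketch of reading that result follows. The helper `top_label`, its `min_score` threshold, and the scores in `example_output` are hypothetical illustrations, not real model output.

```python
def top_label(output, min_score=0.5):
    """Return (label, score) for the best label, or None if it scores below min_score."""
    labels, scores = output["labels"], output["scores"]
    if not labels:
        return None
    best = max(range(len(scores)), key=scores.__getitem__)
    if scores[best] < min_score:
        return None
    return labels[best], scores[best]


# Hand-written example in the pipeline's output format; the scores are made up.
example_output = {
    "sequence": "Angela Merkel is a politician in Germany and leader of the CDU",
    "labels": ["politics", "economy", "environment", "entertainment"],
    "scores": [0.97, 0.01, 0.01, 0.01],
}

print(top_label(example_output))
print(top_label(example_output, min_score=0.99))
```

A threshold like this is one way to route low-confidence texts to manual review; a suitable value depends on the task and should be checked on labeled examples.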

Advanced usage

from transformers import pipeline
text = "Angela Merkel is a politician in Germany and leader of the CDU"
# formulation 1
hypothesis_template = "This text is about {}"
classes_verbalized = ["politics", "economy", "entertainment", "environment"]
# formulation 2 depending on your use-case
# (these assignments overwrite formulation 1; comment out the pair you do not want to test)
hypothesis_template = "The topic of this text is {}"
classes_verbalized = ["political activities", "economic policy", "entertainment or music", "environmental protection"]
# test different formulations
zeroshot_classifier = pipeline("zero-shot-classification", model="MoritzLaurer/deberta-v3-large-zeroshot-v2.0")  # change the model identifier here
output = zeroshot_classifier(text, classes_verbalized, hypothesis_template=hypothesis_template, multi_label=False)
print(output)
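
To compare several formulations in one run rather than editing the variables by hand, a small loop is enough. The sketch below assumes a hypothetical helper `compare_formulations` that accepts any callable with the same call signature as the pipeline. `fake_classifier` is a stand-in with uniform scores so the example runs without downloading a model; in practice you would pass `zeroshot_classifier` instead.

```python
def compare_formulations(classify, text, formulations):
    """Run each (template, labels) formulation and collect the top prediction."""
    results = []
    for template, labels in formulations:
        output = classify(text, labels, hypothesis_template=template, multi_label=False)
        # The pipeline orders labels by descending score, so index 0 is the winner.
        results.append((template, output["labels"][0], output["scores"][0]))
    return results


def fake_classifier(text, labels, hypothesis_template, multi_label=False):
    """Hypothetical stand-in for the pipeline: uniform scores, labels in input order."""
    score = 1.0 / len(labels)
    return {"sequence": text, "labels": list(labels), "scores": [score] * len(labels)}


formulations = [
    ("This text is about {}", ["politics", "economy"]),
    ("The topic of this text is {}", ["political activities", "economic policy"]),
]
text = "Angela Merkel is a politician in Germany and leader of the CDU"
for template, label, score in compare_formulations(fake_classifier, text, formulations):
    print(f"{template!r}: {label} ({score:.2f})")
```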

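📚 Detailed documentation

Model description

The main update in the zeroshot-v2.0 series is that several models are trained on fully commercially friendly data, for users with strict license requirements. The models perform one universal classification task: given a text, determine whether a hypothesis is "true" or "not true" (entailment vs. not_entailment). This task format is based on natural language inference (NLI), and the Hugging Face pipeline can reformulate any classification task into it.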
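
How per-hypothesis entailment decisions become class scores can be illustrated with a small worked example. The sketch below shows one common aggregation scheme, which to my understanding resembles what the Hugging Face pipeline does: with `multi_label=False`, a softmax over the entailment logits of all labels; with `multi_label=True`, an independent softmax over (entailment, not_entailment) for each label. The logits are hypothetical, and the actual pipeline may differ in details.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    shift = max(values)
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def single_label_scores(entailment_logits):
    """Softmax across labels: scores compete with each other and sum to 1."""
    return softmax(entailment_logits)


def multi_label_scores(logit_pairs):
    """Softmax within each label: every score is an independent probability."""
    return [softmax([ent, not_ent])[0] for ent, not_ent in logit_pairs]


# Hypothetical (entailment, not_entailment) logits for three labels.
logit_pairs = [(3.0, -2.0), (-1.0, 2.0), (2.5, -1.5)]
single = single_label_scores([ent for ent, _ in logit_pairs])
multi = multi_label_scores(logit_pairs)
print("single-label:", [round(s, 3) for s in single])
print("multi-label: ", [round(s, 3) for s in multi])
```

In the single-label case the first and third labels split the probability mass, while in the multi-label case both can receive a high score at the same time.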

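Training data

Models with "-c" in the name are trained on two types of fully commercially friendly data:

Synthetic data: generated with Mixtral-8x7B-Instruct-v0.1. The final dataset is available in the mixtral_written_text_for_tasks_v4 subset of the synthetic_zeroshot_mixtral_v0.1 dataset.
Commercially friendly NLI datasets: MNLI and FEVER-NLI, added to improve generalization.

Models without "-c" in the name also include a broader mix of training data with a more varied set of licenses.

Metrics

The models were evaluated on 28 different text classification tasks using the f1_macro metric. The main reference point is facebook/bart-large-mnli, which at the time of writing (April 3, 2024) is the most widely used commercially friendly zero-shot classifier.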
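
For readers unfamiliar with f1_macro: it is the unweighted mean of the per-class F1 scores, so small classes count as much as large ones. The following is a minimal pure-Python sketch on toy labels. The data is made up for illustration and has nothing to do with the 28 evaluation tasks; in practice a library implementation such as scikit-learn's `f1_score` with `average="macro"` would normally be used.

```python
def f1_macro(y_true, y_pred):
    """Unweighted mean of per-class F1 over all classes seen in y_true or y_pred."""
    classes = sorted(set(y_true) | set(y_pred))
    f1_scores = []
    for cls in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != cls and p == cls)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p != cls)
        denominator = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never occurs.
        f1_scores.append(2 * tp / denominator if denominator else 0.0)
    return sum(f1_scores) / len(f1_scores)


# Toy example: one "politics" text is misclassified as "economy".
y_true = ["politics", "politics", "economy", "economy"]
y_pred = ["politics", "economy", "economy", "economy"]
print(round(f1_macro(y_true, y_pred), 4))
```

Here the per-class F1 values are 2/3 for "politics" and 4/5 for "economy", and f1_macro is their mean.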

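Property	Details
Model type	DeBERTa-v3-large zero-shot classifier v2.0-c
Training data	Synthetic data and commercially friendly NLI datasets

Model selection guidance

DeBERTa-v3 zero-shot classifiers vs. RoBERTa zero-shot classifiers: DeBERTa-v3 performs clearly better than RoBERTa but is somewhat slower. RoBERTa is directly compatible with Hugging Face's production inference TEI containers and with Flash Attention, which makes it a good fit for production environments.
Commercial use: models with "-c" in the name are guaranteed to be trained only on commercially friendly data. Models without "-c" are trained on more data and perform better, but that data includes datasets with non-commercial licenses. Users with strict legal requirements should use the "-c" models.
Multilingual / non-English use: bge-m3-zeroshot-v2.0 or bge-m3-zeroshot-v2.0-c is recommended. Multilingual models perform worse than English-only models, so another option is to machine-translate the text into English with a library such as EasyNMT and then apply an English-only model.
Context window: the bge-m3 models can process up to 8192 tokens; the other models can process up to 512 tokens. Longer inputs make the models slower and reduce performance, so for texts of up to roughly 400 words (about one page), the DeBERTa models are recommended for better performance.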
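
For texts longer than the roughly 400 words mentioned above, one possible workaround is to split the input into word-based chunks and classify each chunk separately. This is an assumption for illustration, not something the model card prescribes, and the helper name `split_into_word_chunks` is hypothetical. Word count is only a rough proxy for the 512-token limit, since one word can map to several tokens.

```python
def split_into_word_chunks(text, max_words=400):
    """Split text on whitespace into chunks of at most max_words words."""
    if max_words <= 0:
        raise ValueError("max_words must be positive")
    words = text.split()
    return [
        " ".join(words[start:start + max_words])
        for start in range(0, len(words), max_words)
    ]


long_text = " ".join(["word"] * 1000)
chunks = split_into_word_chunks(long_text, max_words=400)
print([len(chunk.split()) for chunk in chunks])
```

How to combine the per-chunk predictions (for example by averaging scores or taking the most frequent label) is a separate design decision that depends on the task.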

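Reproduction

The reproduction code is available in the v2_synthetic_data directory.

Limitations and bias

The models can only perform text classification tasks. Biases can come from the underlying foundation model, the human-written NLI training data, and the synthetic data generated by Mixtral.

License

The foundation model is released under the MIT license. The license of the training data varies by model; see above.

Citation

If you use this model in academic research, please cite the following paper: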

@misc{laurer_building_2023,
	title = {Building {Efficient} {Universal} {Classifiers} with {Natural} {Language} {Inference}},
	url = {http://arxiv.org/abs/2312.17543},
	doi = {10.48550/arXiv.2312.17543},
	abstract = {Generative Large Language Models (LLMs) have become the mainstream choice for fewshot and zeroshot learning thanks to the universality of text generation. Many users, however, do not need the broad capabilities of generative LLMs when they only want to automate a classification task. Smaller BERT-like models can also learn universal tasks, which allow them to do any text classification task without requiring fine-tuning (zeroshot classification) or to learn new tasks with only a few examples (fewshot), while being significantly more efficient than generative LLMs. This paper (1) explains how Natural Language Inference (NLI) can be used as a universal classification task that follows similar principles as instruction fine-tuning of generative LLMs, (2) provides a step-by-step guide with reusable Jupyter notebooks for building a universal classifier, and (3) shares the resulting universal classifier that is trained on 33 datasets with 389 diverse classes. Parts of the code we share has been used to train our older zeroshot classifiers that have been downloaded more than 55 million times via the Hugging Face Hub as of December 2023. Our new classifier improves zeroshot performance by 9.4\%.},
	urldate = {2024-01-05},
	publisher = {arXiv},
	author = {Laurer, Moritz and van Atteveldt, Wouter and Casas, Andreu and Welbers, Kasper},
	month = dec,
	year = {2023},
	note = {arXiv:2312.17543 [cs]},
	keywords = {Computer Science - Artificial Intelligence, Computer Science - Computation and Language},
}