🚀 DeBERTa-v3-large-mnli-fever-anli-ling-wanli
This model can be used for zero-shot classification. It performs strongly on natural language inference (NLI) tasks and was fine-tuned on several high-quality NLI datasets, which significantly improves its performance.
🚀 Quick Start
This model is fine-tuned from Microsoft's DeBERTa-v3-large. DeBERTa-v3 combines several recent innovations and significantly outperforms classic masked language models such as BERT and RoBERTa; see the paper for details.
✨ Key Features
- The model was fine-tuned on the MultiNLI, Fever-NLI, Adversarial-NLI (ANLI), LingNLI and WANLI datasets, which together comprise 885,242 NLI hypothesis-premise pairs.
- As of 6 June 2022, it was the best-performing NLI model on the Hugging Face Hub and can be used for zero-shot classification.
- It significantly outperforms all other large models on the ANLI benchmark.
📦 Installation
The original documentation does not mention installation steps, so this section is omitted.
💻 Usage Examples
Basic usage (zero-shot classification pipeline)
from transformers import pipeline
classifier = pipeline("zero-shot-classification", model="MoritzLaurer/DeBERTa-v3-large-mnli-fever-anli-ling-wanli")
sequence_to_classify = "Angela Merkel is a politician in Germany and leader of the CDU"
candidate_labels = ["politics", "economy", "entertainment", "environment"]
output = classifier(sequence_to_classify, candidate_labels, multi_label=False)
print(output)
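The zero-shot pipeline also accepts a `hypothesis_template` argument (a standard option of the Transformers zero-shot classification pipeline, not something specific to this model card), which can be adapted to the label set. A minimal sketch, reusing the objects from the example above:
# Sketch: adapting the hypothesis template; the template string here is purely illustrative.
output = classifier(
    sequence_to_classify,
    candidate_labels,
    hypothesis_template="This text is about {}.",
    multi_label=False,
)
print(output)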
Advanced usage (NLI use case)
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch
device = torch.device("cuda") if torch.cuda.is_available() else torch.device("cpu")
model_name = "MoritzLaurer/DeBERTa-v3-large-mnli-fever-anli-ling-wanli"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name).to(device)  # move the model to the same device as the inputs
premise = "I first thought that I liked the movie, but upon second thought it was actually disappointing."
hypothesis = "The movie was not good."
inputs = tokenizer(premise, hypothesis, truncation=True, return_tensors="pt").to(device)
output = model(**inputs)  # pass input_ids and attention_mask together
prediction = torch.softmax(output["logits"][0], -1).tolist()
label_names = ["entailment", "neutral", "contradiction"]
prediction = {name: round(float(pred) * 100, 1) for pred, name in zip(prediction, label_names)}
print(prediction)
📚 Documentation
Training data
DeBERTa-v3-large-mnli-fever-anli-ling-wanli was trained on the MultiNLI, Fever-NLI, Adversarial-NLI (ANLI), LingNLI and WANLI datasets, which together comprise 885,242 NLI hypothesis-premise pairs. The SNLI dataset was explicitly excluded because of quality issues. More data does not necessarily make for better NLI models.
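For illustration, a minimal sketch of loading one of these corpora with the `datasets` library; only MultiNLI is shown, and the identifiers for the other corpora as well as the exact preprocessing used for this model are not specified in the card:
# Sketch: loading MultiNLI as an example of the hypothesis-premise pairs described above.
from datasets import load_dataset

mnli = load_dataset("multi_nli", split="train")
example = mnli[0]
print(example["premise"], example["hypothesis"], example["label"])  # 0=entailment, 1=neutral, 2=contradiction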
Training procedure
DeBERTa-v3-large-mnli-fever-anli-ling-wanli was trained with the Hugging Face Trainer using the hyperparameters below. Testing showed that longer training and more epochs hurt performance (overfitting).
from transformers import TrainingArguments

training_args = TrainingArguments(
    num_train_epochs=4,              # total number of training epochs
    learning_rate=5e-06,
    per_device_train_batch_size=16,  # batch size per device during training
    gradient_accumulation_steps=2,   # doubles the effective batch size to 32 while reducing memory requirements
    per_device_eval_batch_size=64,   # batch size for evaluation
    warmup_ratio=0.06,               # fraction of training steps used for learning-rate warmup
    weight_decay=0.01,               # strength of weight decay
    fp16=True,                       # mixed precision training
)
Evaluation results
The model was evaluated on the test sets of MultiNLI, ANLI, LingNLI and WANLI and the dev set of Fever-NLI, using accuracy as the metric. It achieves state-of-the-art performance on every dataset. Surprisingly, it outperforms the previous state of the art on ANLI (ALBERT-XXL) by 8.3%. Presumably this is because ANLI was created to fool masked language models like RoBERTa (or ALBERT), while DeBERTa-v3 uses a better pre-training objective (RTD) and disentangled attention, and was fine-tuned on higher-quality NLI data.
| Dataset | mnli_test_m | mnli_test_mm | anli_test | anli_test_r3 | ling_test | wanli_test |
| --- | --- | --- | --- | --- | --- | --- |
| Accuracy | 0.912 | 0.908 | 0.702 | 0.64 | 0.87 | 0.77 |
| Speed (texts/sec, A100 GPU) | 696.0 | 697.0 | 488.0 | 425.0 | 828.0 | 980.0 |
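As a rough illustration of the evaluation setup, the sketch below estimates accuracy on a small MultiNLI sample, reusing the tokenizer, model, and device from the usage example above; the matched validation split serves only as a stand-in, since the exact splits and batching behind the reported numbers are not detailed here:
from datasets import load_dataset
import torch

# Sketch only: a 100-example sample of the matched validation split, evaluated one pair at a time.
data = load_dataset("multi_nli", split="validation_matched").select(range(100))
correct = 0
for ex in data:
    enc = tokenizer(ex["premise"], ex["hypothesis"], truncation=True, return_tensors="pt").to(device)
    with torch.no_grad():
        pred = model(**enc).logits.argmax(-1).item()
    correct += int(pred == ex["label"])  # MultiNLI labels: 0=entailment, 1=neutral, 2=contradiction
print("accuracy on sample:", correct / len(data))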
🔧 Technical Details
The documentation does not provide further implementation details, so this section is omitted.
📄 License
This model is released under the MIT license.
⚠️ Important Notes
Note that DeBERTa-v3 was released on 6 December 2021, and older versions of HF Transformers appear to have problems running the model (e.g. issues with the tokenizer). Using Transformers >= 4.13 might solve some of them.
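A quick way to check the installed version (a sketch following the note above):
# Sketch: print the installed Transformers version; the note above reports tokenizer
# problems with versions older than 4.13.
import transformers
print(transformers.__version__)  # if below 4.13, consider: pip install -U "transformers>=4.13"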
💡 Usage Tips
Please consult the original DeBERTa-v3 paper and the literature on the different NLI datasets for more information on the training data and potential biases. The model will reproduce the statistical patterns present in its training data.
Citation
If you use this model, please cite: Laurer, Moritz, Wouter van Atteveldt, Andreu Salleras Casas, and Kasper Welbers. 2022. 'Less Annotating, More Classifying – Addressing the Data Scarcity Issue of Supervised Machine Learning with Deep Transfer Learning and BERT-NLI'. Preprint, June. Open Science Framework. https://osf.io/74b8k.
Ideas for cooperation or questions?
If you have ideas for cooperation or questions, please contact me at m{dot}laurer{at}vu{dot}nl or on LinkedIn.








