🚀 DeBERTa-v3-large-mnli-fever-anli-ling-wanli
This model can be used for zero-shot classification and performs strongly on natural language inference (NLI) tasks. It was fine-tuned on several high-quality NLI datasets, which significantly improves its performance.
🚀 Quick Start
This model is fine-tuned from Microsoft's DeBERTa-v3-large. DeBERTa-v3 combines several recent innovations and clearly outperforms classic masked language models such as BERT and RoBERTa; see the paper for details. For runnable code, see the 💻 Usage Examples section below.
✨ Main Features
- The model was fine-tuned on the MultiNLI, Fever-NLI, Adversarial-NLI (ANLI), LingNLI and WANLI datasets, which together contain 885,242 NLI hypothesis-premise pairs.
- As of 6 June 2022, it was the best-performing NLI model on the Hugging Face Hub and can be used for zero-shot classification.
- It significantly outperforms all other large models on the ANLI benchmark.
📦 Installation
The original documentation does not list explicit installation steps.
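A typical setup might look like the following; the sentencepiece dependency is an assumption based on the DeBERTa-v3 tokenizer, and the Transformers version follows the compatibility note further below.

```bash
pip install "transformers>=4.13" torch sentencepiece
```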
💻 Usage Examples
Basic usage
```python
from transformers import pipeline

classifier = pipeline("zero-shot-classification", model="MoritzLaurer/DeBERTa-v3-large-mnli-fever-anli-ling-wanli")
sequence_to_classify = "Angela Merkel is a politician in Germany and leader of the CDU"
candidate_labels = ["politics", "economy", "entertainment", "environment"]
output = classifier(sequence_to_classify, candidate_labels, multi_label=False)
print(output)
```
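The pipeline returns the candidate labels ranked by score; with `multi_label=False` the scores across labels sum to 1. A small follow-up sketch (the `multi_label=True` call is added here for illustration and is not from the original card):

```python
# output is a dict of the form
# {"sequence": ..., "labels": [... sorted by descending score ...], "scores": [...]}
print(output["labels"][0], output["scores"][0])  # top label and its score

# With multi_label=True each label is scored independently against the text
# (scores no longer sum to 1), useful when several labels can apply at once.
output_multi = classifier(sequence_to_classify, candidate_labels, multi_label=True)
print(output_multi)
```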
Advanced usage
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

device = torch.device("cuda") if torch.cuda.is_available() else torch.device("cpu")

model_name = "MoritzLaurer/DeBERTa-v3-large-mnli-fever-anli-ling-wanli"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name).to(device)  # move the model to the same device as the inputs

premise = "I first thought that I liked the movie, but upon second thought it was actually disappointing."
hypothesis = "The movie was not good."

inputs = tokenizer(premise, hypothesis, truncation=True, return_tensors="pt")
output = model(inputs["input_ids"].to(device))
prediction = torch.softmax(output["logits"][0], -1).tolist()
label_names = ["entailment", "neutral", "contradiction"]
prediction = {name: round(float(pred) * 100, 1) for pred, name in zip(prediction, label_names)}
print(prediction)
```
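Under the hood, the zero-shot pipeline frames each candidate label as an NLI hypothesis and uses the entailment probability as the label score. A minimal sketch of doing this by hand (the hypothesis template and helper function are illustrative, not part of the original card):

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_name = "MoritzLaurer/DeBERTa-v3-large-mnli-fever-anli-ling-wanli"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

def zero_shot_scores(text, labels, template="This example is about {}."):
    # Score each candidate label by the entailment probability of
    # premise = text, hypothesis = "This example is about <label>."
    scores = {}
    for label in labels:
        inputs = tokenizer(text, template.format(label), truncation=True, return_tensors="pt")
        with torch.no_grad():
            logits = model(**inputs).logits[0]
        probs = torch.softmax(logits, dim=-1)  # [entailment, neutral, contradiction]
        scores[label] = float(probs[0])
    return scores

print(zero_shot_scores(
    "Angela Merkel is a politician in Germany and leader of the CDU",
    ["politics", "economy", "entertainment", "environment"],
))
```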
📚 Documentation
Training data
DeBERTa-v3-large-mnli-fever-anli-ling-wanli was trained on the MultiNLI, Fever-NLI, Adversarial-NLI (ANLI), LingNLI and WANLI datasets, which together contain 885,242 NLI hypothesis-premise pairs. The SNLI dataset was explicitly excluded because of quality issues; more data does not necessarily make for better NLI models.
Training procedure
DeBERTa-v3-large-mnli-fever-anli-ling-wanli was trained with the Hugging Face Trainer using the hyperparameters below. Longer training and more epochs were found to hurt performance (overfitting).
```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./results",          # output directory (not specified in the original card)
    num_train_epochs=4,              # total number of training epochs
    learning_rate=5e-06,
    per_device_train_batch_size=16,  # batch size per device during training
    gradient_accumulation_steps=2,   # doubles the effective batch size to 32 while reducing memory requirements
    per_device_eval_batch_size=64,   # batch size for evaluation
    warmup_ratio=0.06,               # fraction of training steps used for learning-rate warmup
    weight_decay=0.01,               # strength of weight decay
    fp16=True                        # mixed precision training
)
```
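For context, a minimal sketch of how these arguments could be plugged into the Hugging Face Trainer, illustrated here with MultiNLI only; the actual model was trained on the combined MultiNLI/Fever-NLI/ANLI/LingNLI/WANLI data, and the preprocessing shown is an assumption:

```python
from datasets import load_dataset
from transformers import AutoTokenizer, AutoModelForSequenceClassification, Trainer

base_model = "microsoft/deberta-v3-large"
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForSequenceClassification.from_pretrained(base_model, num_labels=3)

# MultiNLI provides premise/hypothesis pairs with labels 0/1/2
# (entailment/neutral/contradiction).
dataset = load_dataset("multi_nli")

def tokenize(batch):
    return tokenizer(batch["premise"], batch["hypothesis"], truncation=True)

dataset = dataset.map(tokenize, batched=True)

trainer = Trainer(
    model=model,
    args=training_args,                          # the TrainingArguments defined above
    train_dataset=dataset["train"],
    eval_dataset=dataset["validation_matched"],
    tokenizer=tokenizer,                         # enables dynamic padding via the default data collator
)
trainer.train()
```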
Evaluation results
The model was evaluated on the MultiNLI, ANLI, LingNLI and WANLI test sets and the Fever-NLI dev set, using accuracy as the metric. It achieves state-of-the-art performance on each dataset. Surprisingly, it outperforms the previous state of the art on ANLI (ALBERT-XXL) by 8.3%. This is presumably because ANLI was created to fool masked language models like RoBERTa (or ALBERT), whereas DeBERTa-v3 uses a better pre-training objective (RTD) and disentangled attention, and was fine-tuned on higher-quality NLI data.
| Dataset | mnli_test_m | mnli_test_mm | anli_test | anli_test_r3 | ling_test | wanli_test |
| --- | --- | --- | --- | --- | --- | --- |
| Accuracy | 0.912 | 0.908 | 0.702 | 0.64 | 0.87 | 0.77 |
| Speed (texts/sec, A100 GPU) | 696.0 | 697.0 | 488.0 | 425.0 | 828.0 | 980.0 |
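As a rough illustration of how such an accuracy figure can be reproduced, here is a minimal sketch for MultiNLI, using the publicly available matched validation split as a stand-in for the test set; batching and other evaluation details are simplified assumptions:

```python
import torch
from datasets import load_dataset
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_name = "MoritzLaurer/DeBERTa-v3-large-mnli-fever-anli-ling-wanli"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name).eval()

# MultiNLI label ids (0 = entailment, 1 = neutral, 2 = contradiction)
# match the model's output order.
data = load_dataset("multi_nli", split="validation_matched")

correct = 0
for example in data:
    inputs = tokenizer(example["premise"], example["hypothesis"],
                       truncation=True, return_tensors="pt")
    with torch.no_grad():
        pred = model(**inputs).logits.argmax(-1).item()
    correct += int(pred == example["label"])

print(f"accuracy: {correct / len(data):.3f}")
```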
🔧 Technical Details
The original documentation does not provide enough implementation detail, so this section is skipped.
📄 License
This model is released under the MIT license.
⚠️ Important Note
DeBERTa-v3 was released on 6 December 2021, and older versions of HF Transformers seem to have issues running the model (e.g. problems with the tokenizer). Using Transformers >= 4.13 may resolve some of these issues.
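A quick way to check the installed version (a small sketch, not from the original card):

```python
from packaging import version
import transformers

# Transformers >= 4.13 avoids the tokenizer issues mentioned above.
assert version.parse(transformers.__version__) >= version.parse("4.13"), transformers.__version__
```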
💡 Usage Tips
Please consult the original DeBERTa-v3 paper and the literature on the different NLI datasets for more information on the training data and potential biases. The model will reproduce the statistical patterns present in its training data.
Citation
If you use this model, please cite: Laurer, Moritz, Wouter van Atteveldt, Andreu Salleras Casas, and Kasper Welbers. 2022. 'Less Annotating, More Classifying – Addressing the Data Scarcity Issue of Supervised Machine Learning with Deep Transfer Learning and BERT-NLI'. Preprint, June. Open Science Framework. https://osf.io/74b8k.
Ideas for cooperation or questions?
If you have questions or ideas for cooperation, contact me at m{dot}laurer{at}vu{dot}nl or on LinkedIn.








