🚀 DeBERTa-v3-large-mnli-fever-anli-ling-wanli
This model can be used for zero-shot classification and performs strongly on natural language inference (NLI) tasks. It was fine-tuned on several high-quality NLI datasets, which significantly improves its performance.
🚀 Quick Start
This model is fine-tuned from Microsoft's DeBERTa-v3-large. DeBERTa-v3 combines several recent innovations and clearly outperforms classic masked language models such as BERT and RoBERTa; see the paper for details. For runnable code, see the 💻 Usage Examples section below.
✨ Main Features
- The model was fine-tuned on the MultiNLI, Fever-NLI, Adversarial-NLI (ANLI), LingNLI and WANLI datasets, which together contain 885,242 NLI hypothesis-premise pairs.
- As of 6 June 2022, it was the best-performing NLI model on the Hugging Face Hub and can be used for zero-shot classification.
- It significantly outperforms all other large models on the ANLI benchmark.
📦 Installation
The original documentation does not list explicit installation steps.
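A typical setup might look like the following; the sentencepiece dependency is an assumption based on the DeBERTa-v3 tokenizer, and the Transformers version follows the compatibility note further below.

```bash
pip install "transformers>=4.13" torch sentencepiece
```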
💻 Usage Examples
Basic usage
```python
from transformers import pipeline

classifier = pipeline("zero-shot-classification", model="MoritzLaurer/DeBERTa-v3-large-mnli-fever-anli-ling-wanli")
sequence_to_classify = "Angela Merkel is a politician in Germany and leader of the CDU"
candidate_labels = ["politics", "economy", "entertainment", "environment"]
output = classifier(sequence_to_classify, candidate_labels, multi_label=False)
print(output)
```
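The pipeline returns the candidate labels ranked by score; with `multi_label=False` the scores across labels sum to 1. A small follow-up sketch (the `multi_label=True` call is added here for illustration and is not from the original card):

```python
# output is a dict of the form
# {"sequence": ..., "labels": [... sorted by descending score ...], "scores": [...]}
print(output["labels"][0], output["scores"][0])  # top label and its score

# With multi_label=True each label is scored independently against the text
# (scores no longer sum to 1), useful when several labels can apply at once.
output_multi = classifier(sequence_to_classify, candidate_labels, multi_label=True)
print(output_multi)
```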
Advanced usage
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

device = torch.device("cuda") if torch.cuda.is_available() else torch.device("cpu")

model_name = "MoritzLaurer/DeBERTa-v3-large-mnli-fever-anli-ling-wanli"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name).to(device)  # move the model to the same device as the inputs

premise = "I first thought that I liked the movie, but upon second thought it was actually disappointing."
hypothesis = "The movie was not good."

inputs = tokenizer(premise, hypothesis, truncation=True, return_tensors="pt")
output = model(inputs["input_ids"].to(device))
prediction = torch.softmax(output["logits"][0], -1).tolist()
label_names = ["entailment", "neutral", "contradiction"]
prediction = {name: round(float(pred) * 100, 1) for pred, name in zip(prediction, label_names)}
print(prediction)
```
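Under the hood, the zero-shot pipeline frames each candidate label as an NLI hypothesis and uses the entailment probability as the label score. A minimal sketch of doing this by hand (the hypothesis template and helper function are illustrative, not part of the original card):

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_name = "MoritzLaurer/DeBERTa-v3-large-mnli-fever-anli-ling-wanli"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

def zero_shot_scores(text, labels, template="This example is about {}."):
    # Score each candidate label by the entailment probability of
    # premise = text, hypothesis = "This example is about <label>."
    scores = {}
    for label in labels:
        inputs = tokenizer(text, template.format(label), truncation=True, return_tensors="pt")
        with torch.no_grad():
            logits = model(**inputs).logits[0]
        probs = torch.softmax(logits, dim=-1)  # [entailment, neutral, contradiction]
        scores[label] = float(probs[0])
    return scores

print(zero_shot_scores(
    "Angela Merkel is a politician in Germany and leader of the CDU",
    ["politics", "economy", "entertainment", "environment"],
))
```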
📚 Documentation
Training data
DeBERTa-v3-large-mnli-fever-anli-ling-wanli was trained on the MultiNLI, Fever-NLI, Adversarial-NLI (ANLI), LingNLI and WANLI datasets, which together contain 885,242 NLI hypothesis-premise pairs. The SNLI dataset was explicitly excluded because of quality issues; more data does not necessarily make for better NLI models.
Training procedure
DeBERTa-v3-large-mnli-fever-anli-ling-wanli was trained with the Hugging Face Trainer using the hyperparameters below. Longer training and more epochs were found to hurt performance (overfitting).
```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./results",          # output directory (not specified in the original card)
    num_train_epochs=4,              # total number of training epochs
    learning_rate=5e-06,
    per_device_train_batch_size=16,  # batch size per device during training
    gradient_accumulation_steps=2,   # doubles the effective batch size to 32 while reducing memory requirements
    per_device_eval_batch_size=64,   # batch size for evaluation
    warmup_ratio=0.06,               # fraction of training steps used for learning-rate warmup
    weight_decay=0.01,               # strength of weight decay
    fp16=True                        # mixed precision training
)
```
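For context, a minimal sketch of how these arguments could be plugged into the Hugging Face Trainer, illustrated here with MultiNLI only; the actual model was trained on the combined MultiNLI/Fever-NLI/ANLI/LingNLI/WANLI data, and the preprocessing shown is an assumption:

```python
from datasets import load_dataset
from transformers import AutoTokenizer, AutoModelForSequenceClassification, Trainer

base_model = "microsoft/deberta-v3-large"
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForSequenceClassification.from_pretrained(base_model, num_labels=3)

# MultiNLI provides premise/hypothesis pairs with labels 0/1/2
# (entailment/neutral/contradiction).
dataset = load_dataset("multi_nli")

def tokenize(batch):
    return tokenizer(batch["premise"], batch["hypothesis"], truncation=True)

dataset = dataset.map(tokenize, batched=True)

trainer = Trainer(
    model=model,
    args=training_args,                          # the TrainingArguments defined above
    train_dataset=dataset["train"],
    eval_dataset=dataset["validation_matched"],
    tokenizer=tokenizer,                         # enables dynamic padding via the default data collator
)
trainer.train()
```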
Evaluation results
The model was evaluated on the MultiNLI, ANLI, LingNLI and WANLI test sets and the Fever-NLI dev set, using accuracy as the metric. It achieves state-of-the-art performance on each dataset. Surprisingly, it outperforms the previous state of the art on ANLI (ALBERT-XXL) by 8.3%. This is presumably because ANLI was created to fool masked language models like RoBERTa (or ALBERT), whereas DeBERTa-v3 uses a better pre-training objective (RTD) and disentangled attention, and was fine-tuned on higher-quality NLI data.
| Dataset | mnli_test_m | mnli_test_mm | anli_test | anli_test_r3 | ling_test | wanli_test |
| --- | --- | --- | --- | --- | --- | --- |
| Accuracy | 0.912 | 0.908 | 0.702 | 0.64 | 0.87 | 0.77 |
| Speed (texts/sec, A100 GPU) | 696.0 | 697.0 | 488.0 | 425.0 | 828.0 | 980.0 |
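As a rough illustration of how such an accuracy figure can be reproduced, here is a minimal sketch for MultiNLI, using the publicly available matched validation split as a stand-in for the test set; batching and other evaluation details are simplified assumptions:

```python
import torch
from datasets import load_dataset
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_name = "MoritzLaurer/DeBERTa-v3-large-mnli-fever-anli-ling-wanli"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name).eval()

# MultiNLI label ids (0 = entailment, 1 = neutral, 2 = contradiction)
# match the model's output order.
data = load_dataset("multi_nli", split="validation_matched")

correct = 0
for example in data:
    inputs = tokenizer(example["premise"], example["hypothesis"],
                       truncation=True, return_tensors="pt")
    with torch.no_grad():
        pred = model(**inputs).logits.argmax(-1).item()
    correct += int(pred == example["label"])

print(f"accuracy: {correct / len(data):.3f}")
```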
🔧 Technical Details
The original documentation does not provide enough implementation detail, so this section is skipped.
📄 License
This model is released under the MIT license.
⚠️ Important Note
DeBERTa-v3 was released on 6 December 2021, and older versions of HF Transformers seem to have issues running the model (e.g. problems with the tokenizer). Using Transformers >= 4.13 may resolve some of these issues.
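A quick way to check the installed version (a small sketch, not from the original card):

```python
from packaging import version
import transformers

# Transformers >= 4.13 avoids the tokenizer issues mentioned above.
assert version.parse(transformers.__version__) >= version.parse("4.13"), transformers.__version__
```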
💡 Usage Tips
Please consult the original DeBERTa-v3 paper and the literature on the different NLI datasets for more information on the training data and potential biases. The model will reproduce the statistical patterns present in its training data.
Citation
If you use this model, please cite: Laurer, Moritz, Wouter van Atteveldt, Andreu Salleras Casas, and Kasper Welbers. 2022. 'Less Annotating, More Classifying – Addressing the Data Scarcity Issue of Supervised Machine Learning with Deep Transfer Learning and BERT-NLI'. Preprint, June. Open Science Framework. https://osf.io/74b8k.
Ideas for cooperation or questions?
If you have questions or ideas for cooperation, contact me at m{dot}laurer{at}vu{dot}nl or on LinkedIn.








