DeBERTa-v3-base-mnli-fever-docnli-ling-2c Open-Source Model

Developed by MoritzLaurer

A binary DeBERTa-v3 classifier trained on 8 NLI datasets, well suited to textual entailment judgments

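Text classification

Transformers

Language: English · License: MIT · Tags: #zero-shot text classification #long-text inference #trained on multiple NLI datasets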

Downloads: 234

Released: 3/2/2022

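Model overview

This is a natural language inference (NLI) model built on the DeBERTa-v3 architecture. It is trained specifically for binary NLI: given a premise and a hypothesis, it decides whether the premise entails the hypothesis.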

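Model features

Multi-dataset training

Trained on 1,279,665 hypothesis-premise pairs drawn from 8 NLI datasets (including the long-text DocNLI) to improve generalization

Improved pre-training objective

Uses the DeBERTa-v3 pre-training method, which substantially outperforms earlier DeBERTa versions

Long-text handling

Learns long-range reasoning from the DocNLI dataset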

Model capabilities

Zero-shot text classification

Natural language inference

Textual entailment

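Use cases

Content analysis

Movie review sentiment analysis

Decide whether a user review entails a positive assessment of the movie

In the example, the model correctly flags the mixed sentiment (87.3% not_entailment probability)

Information verification

Fact checking

Check whether a textual claim is consistent with known facts

Reaches 89.7% accuracy on the Fever-NLI binary dev set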

🚀 DeBERTa-v3-base-mnli-fever-docnli-ling-2c

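This model is intended for text classification and zero-shot classification. It is trained on several natural language inference (NLI) datasets and predicts one of two relations, "entailment" or "not_entailment".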

🚀 Quick start

The model can be used for simple zero-shot classification as well as for NLI use cases. Usage examples follow.

💻 Usage examples

Basic usage

from transformers import pipeline
classifier = pipeline("zero-shot-classification", model="MoritzLaurer/DeBERTa-v3-base-mnli-fever-docnli-ling-2c")
sequence_to_classify = "Angela Merkel is a politician in Germany and leader of the CDU"
candidate_labels = ["politics", "economy", "entertainment", "environment"]
output = classifier(sequence_to_classify, candidate_labels, multi_label=False)
print(output)
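
For intuition, the zero-shot pipeline treats each candidate label as an NLI hypothesis and ranks the labels by how strongly the text entails each one. The sketch below mimics only the single-label scoring step (`multi_label=False`) in plain Python, without loading the model. It assumes the pipeline's default hypothesis template `This example is {}.`; the helper names `build_hypotheses` and `single_label_scores` and the logit values are hypothetical, made up for illustration rather than produced by the model.

```python
import math

# Assumed default hypothesis template of the zero-shot pipeline
HYPOTHESIS_TEMPLATE = "This example is {}."


def build_hypotheses(candidate_labels, template=HYPOTHESIS_TEMPLATE):
    """Turn each candidate label into an NLI hypothesis sentence."""
    return [template.format(label) for label in candidate_labels]


def single_label_scores(entailment_logits):
    """Softmax over the entailment logits of all labels (scores sum to 1)."""
    shift = max(entailment_logits)  # subtract the max for numerical stability
    exps = [math.exp(x - shift) for x in entailment_logits]
    total = sum(exps)
    return [e / total for e in exps]


candidate_labels = ["politics", "economy", "entertainment", "environment"]
hypotheses = build_hypotheses(candidate_labels)

# Hypothetical entailment logits, one per (text, hypothesis) pair
entailment_logits = [4.2, 0.3, -1.5, -0.8]
scores = single_label_scores(entailment_logits)

# Rank labels from most to least entailed
ranked = sorted(zip(candidate_labels, scores), key=lambda pair: pair[1], reverse=True)
for label, score in ranked:
    print(f"{label}: {score:.3f}")
```

Because the scores are normalized across labels, they only express relative preference among the candidates supplied; adding or removing a candidate label changes every score.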

Advanced usage

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Use the GPU when one is available, otherwise fall back to the CPU
device = torch.device("cuda") if torch.cuda.is_available() else torch.device("cpu")

model_name = "MoritzLaurer/DeBERTa-v3-base-mnli-fever-docnli-ling-2c"
tokenizer = AutoTokenizer.from_pretrained(model_name)
# The model must be on the same device as its inputs
model = AutoModelForSequenceClassification.from_pretrained(model_name).to(device)
model.eval()

premise = "I first thought that I liked the movie, but upon second thought it was actually disappointing."
hypothesis = "The movie was good."

# Tokenize the pair and move every tensor (input_ids, attention_mask, ...) to the device
inputs = tokenizer(premise, hypothesis, truncation=True, return_tensors="pt").to(device)
with torch.no_grad():  # inference only, no gradients needed
    output = model(**inputs)
prediction = torch.softmax(output["logits"][0], -1).tolist()
label_names = ["entailment", "not_entailment"]
prediction = {name: round(float(pred) * 100, 1) for pred, name in zip(prediction, label_names)}
print(prediction)

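✨ Key features

As of its release (3/2/2022), this was the only model on the model hub trained on 8 NLI datasets, including DocNLI, whose long texts let the model learn long-range reasoning.
It is based on Microsoft's DeBERTa-v3-base; the v3 variant substantially outperforms earlier versions thanks to a different pre-training objective.
It is trained on binary NLI and predicts either "entailment" or "not_entailment".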

📦 Installation

The source documentation gives no installation steps; see the Hugging Face documentation for how to install Transformers.

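📚 Detailed documentation

Model description

The model was trained on 1,279,665 hypothesis-premise pairs from 8 NLI datasets: MultiNLI, Fever-NLI, LingNLI and DocNLI (which itself includes ANLI, QNLI, DUC, CNN/DailyMail and Curation).

The base model is Microsoft's DeBERTa-v3-base. The v3 variant of DeBERTa substantially outperforms earlier versions thanks to a different pre-training objective; see annex 11 of the original DeBERTa paper and the DeBERTa-V3 paper.

For higher performance at the cost of speed, consider https://huggingface.co/MoritzLaurer/DeBERTa-v3-large-mnli-fever-anli-ling-wanli.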

Training data

Training procedure

DeBERTa-v3-base-mnli-fever-docnli-ling-2c was trained with the Hugging Face Trainer using the following hyperparameters:

from transformers import TrainingArguments

# Note: the source does not list output_dir, which some transformers versions require
training_args = TrainingArguments(
    num_train_epochs=3,              # total number of training epochs
    learning_rate=2e-05,
    per_device_train_batch_size=32,  # batch size per device during training
    per_device_eval_batch_size=32,   # batch size for evaluation
    warmup_ratio=0.1,                # fraction of total training steps used for learning-rate warmup
    weight_decay=0.06,               # strength of weight decay
    fp16=True                        # mixed precision training
)
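
To make these hyperparameters concrete, the following sketch derives the approximate length of the training run from the reported 1,279,665 pairs. It rests on assumptions the source does not confirm: a single device, no gradient accumulation, and all pairs used for training. The helper `training_schedule` is hypothetical. Under those assumptions the run has 39,990 optimizer steps per epoch and 119,970 in total, of which roughly 12,000 are warmup.

```python
import math


def training_schedule(num_examples, batch_size, num_epochs, warmup_ratio):
    """Derive optimizer-step counts from dataset size and hyperparameters."""
    steps_per_epoch = math.ceil(num_examples / batch_size)  # last batch may be partial
    total_steps = steps_per_epoch * num_epochs
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    return {
        "steps_per_epoch": steps_per_epoch,
        "total_steps": total_steps,
        "warmup_steps": warmup_steps,
    }


# Assumptions: one device, no gradient accumulation, all pairs used for training
schedule = training_schedule(
    num_examples=1_279_665,
    batch_size=32,
    num_epochs=3,
    warmup_ratio=0.1,
)
print(schedule)
```

With several devices or gradient accumulation the effective batch size grows, and the step counts shrink proportionally.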

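Evaluation results

The model was evaluated on the binary test sets of MultiNLI and ANLI and on the binary dev set of Fever-NLI (two classes instead of three). The metric is accuracy.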

Dataset	Accuracy
mnli-m-2c	0.935
mnli-mm-2c	0.933
fever-nli-2c	0.897
anli-all-2c	0.710
anli-r3-2c	0.678
lingnli-2c	0.895
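
As a reminder of what the accuracy figures measure, here is a minimal sketch of binary accuracy over entailment probabilities. The function `binary_accuracy`, the 0.5 threshold, and the sample values are illustrative assumptions, not the author's evaluation script. With two classes, thresholding the entailment probability at 0.5 is equivalent to taking the argmax of the softmax output.

```python
def binary_accuracy(entailment_probs, gold_labels, threshold=0.5):
    """Fraction of examples whose thresholded prediction matches the gold label."""
    if len(entailment_probs) != len(gold_labels):
        raise ValueError("predictions and gold labels must have the same length")
    if not gold_labels:
        raise ValueError("cannot compute accuracy on an empty set")
    predictions = [
        "entailment" if prob >= threshold else "not_entailment"
        for prob in entailment_probs
    ]
    correct = sum(pred == gold for pred, gold in zip(predictions, gold_labels))
    return correct / len(gold_labels)


# Hypothetical entailment probabilities and gold labels for four pairs
probs = [0.91, 0.12, 0.55, 0.40]
gold = ["entailment", "not_entailment", "not_entailment", "not_entailment"]
accuracy = binary_accuracy(probs, gold)
print(accuracy)
```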

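🔧 Technical details

The model is trained on binary NLI; the "neutral" and "contradiction" classes in DocNLI are merged into a single "not_entailment" class.
Training used the Hugging Face Trainer with the hyperparameters listed under the training procedure (number of epochs, learning rate, batch size, and so on).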
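
The class merge described above can be sketched as a simple label mapping. The function name `to_binary_label` and the exact label strings are assumptions for illustration; actual datasets may encode labels as integers or use different spellings.

```python
def to_binary_label(label):
    """Map a three-class NLI label onto the two-class scheme."""
    normalized = label.strip().lower()
    if normalized == "entailment":
        return "entailment"
    if normalized in {"neutral", "contradiction"}:
        return "not_entailment"
    raise ValueError(f"Unknown NLI label: {label!r}")


# Hypothetical three-class labels, including inconsistent casing
three_class = ["entailment", "neutral", "contradiction", "Neutral"]
two_class = [to_binary_label(label) for label in three_class]
print(two_class)
```

For zero-shot classification this loses little, since only the entailment score of each candidate label is used for ranking.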

📄 License

This model is released under the MIT license.

Limitations and bias

For potential biases, consult the original DeBERTa paper and the literature on the individual NLI datasets.

Citation

If you use this model, please cite: Laurer, Moritz, Wouter van Atteveldt, Andreu Salleras Casas, and Kasper Welbers. 2022. ‘Less Annotating, More Classifying – Addressing the Data Scarcity Issue of Supervised Machine Learning with Deep Transfer Learning and BERT - NLI’. Preprint, June. Open Science Framework. https://osf.io/74b8k.