DeBERTa-v3-base-mnli-fever-anli: a free, open-source model for zero-shot classification and natural language inference

Deberta V3 Base Mnli Fever Anli

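Developed by MoritzLaurer

A DeBERTa-v3 model trained on the MultiNLI, Fever-NLI, and ANLI datasets, suited to zero-shot classification and natural language inference (NLI).

Text Classification

Transformers

Language: English | License: MIT | Tags: #zero-shot-classification #natural-language-inference #multi-task-training

Downloads: 613.93k

Release date: 3/2/2022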

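Model overview

The model performs well on natural language inference (NLI) and is particularly suited to zero-shot text classification. It is built on Microsoft's DeBERTa-v3-base architecture, which improves on earlier DeBERTa versions through a changed pre-training objective.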

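Model features

Multi-dataset training

Combines three datasets, MultiNLI, Fever-NLI, and ANLI, for a total of 763,913 NLI hypothesis-premise pairs.

Strong results on adversarial tests

On the adversarial ANLI benchmark, this base-size model outperforms most large models.

Improved pre-training

Uses DeBERTa-v3, whose revised pre-training objective gives a substantial improvement over earlier DeBERTa versions.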

Model capabilities

Zero-shot text classification

Natural language inference

Textual entailment

Multi-label classification

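Use cases

Content classification

News classification

Assigns news articles to predefined categories such as politics or economy without any task-specific training.

Reported accuracy: about 49.5% (ANLI test set)

Semantic analysis

Contradiction detection

Identifies whether statements within a text contradict each other.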

🚀 DeBERTa-v3-base-mnli-fever-anli

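The model performs well on text classification and zero-shot classification. It is trained on NLI datasets and handles natural language inference effectively, which makes it useful for research and applications in this area.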

🚀 Quick start

Simple zero-shot classification pipeline

# pip install transformers[sentencepiece]
from transformers import pipeline
classifier = pipeline("zero-shot-classification", model="MoritzLaurer/DeBERTa-v3-base-mnli-fever-anli")
sequence_to_classify = "Angela Merkel is a politician in Germany and leader of the CDU"
candidate_labels = ["politics", "economy", "entertainment", "environment"]
output = classifier(sequence_to_classify, candidate_labels, multi_label=False)
print(output)
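
To make the pipeline call less opaque, here is a minimal pure-Python sketch of how NLI-based zero-shot classification is commonly scored: each candidate label becomes a hypothesis sentence, and the entailment logits are then normalized. The template string mirrors the Transformers pipeline default. The logits are made-up illustrative numbers, not real model outputs, and `build_hypotheses`, `single_label_scores`, and `multi_label_scores` are hypothetical helper names.

```python
import math

# Label order assumed to match this model card: entailment, neutral, contradiction.
ENTAILMENT, NEUTRAL, CONTRADICTION = 0, 1, 2


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def build_hypotheses(labels, template="This example is {}."):
    """Turn each candidate label into an NLI hypothesis sentence."""
    return [template.format(label) for label in labels]


def single_label_scores(logits_per_label):
    """multi_label=False: softmax over the entailment logits of all labels."""
    return softmax([logits[ENTAILMENT] for logits in logits_per_label])


def multi_label_scores(logits_per_label):
    """multi_label=True: per label, softmax over contradiction vs. entailment."""
    return [
        softmax([logits[CONTRADICTION], logits[ENTAILMENT]])[1]
        for logits in logits_per_label
    ]


labels = ["politics", "economy", "entertainment", "environment"]
hypotheses = build_hypotheses(labels)

# Hypothetical (entailment, neutral, contradiction) logits, one row per label.
fake_logits = [
    [4.0, 0.5, -3.0],
    [-1.0, 1.0, 0.5],
    [-2.5, 0.0, 3.0],
    [-2.0, 0.5, 2.0],
]

single = single_label_scores(fake_logits)
multi = multi_label_scores(fake_logits)
for label, s, m in zip(labels, single, multi):
    print(f"{label:14s} single={s:.3f} multi={m:.3f}")
```

In the single-label case the scores compete and sum to 1. In the multi-label case each label is scored independently, so several labels can score high at once.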

NLI use case

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Use the GPU when available, otherwise fall back to the CPU.
device = torch.device("cuda") if torch.cuda.is_available() else torch.device("cpu")

model_name = "MoritzLaurer/DeBERTa-v3-base-mnli-fever-anli"
tokenizer = AutoTokenizer.from_pretrained(model_name)
# The model must be on the same device as its inputs.
model = AutoModelForSequenceClassification.from_pretrained(model_name).to(device)
model.eval()

premise = "I first thought that I liked the movie, but upon second thought it was actually disappointing."
hypothesis = "The movie was good."

# Encode the premise-hypothesis pair and move all tensors to the device.
inputs = tokenizer(premise, hypothesis, truncation=True, return_tensors="pt").to(device)
with torch.no_grad():  # inference only, no gradients needed
    output = model(**inputs)

# Convert logits to percentages per class.
prediction = torch.softmax(output["logits"][0], -1).tolist()
label_names = ["entailment", "neutral", "contradiction"]
prediction = {name: round(float(pred) * 100, 1) for pred, name in zip(prediction, label_names)}
print(prediction)
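
For the contradiction-detection use case, the probability dictionary printed above can be turned into a decision. The sketch below is one possible approach, not the model author's method: `find_contradictions`, the 50-point threshold, and the stub scorer `fake_nli` are all assumptions for illustration. In practice, `nli_fn` would wrap the tokenizer and model calls shown above.

```python
from itertools import combinations
from typing import Callable, Dict, List, Tuple

# An NLI scorer maps (premise, hypothesis) to percentages per label,
# in the same shape as the dictionary printed by the snippet above.
NliFn = Callable[[str, str], Dict[str, float]]


def find_contradictions(
    sentences: List[str], nli_fn: NliFn, threshold: float = 50.0
) -> List[Tuple[str, str, float]]:
    """Return sentence pairs whose contradiction score meets the threshold."""
    flagged = []
    for premise, hypothesis in combinations(sentences, 2):
        score = nli_fn(premise, hypothesis)["contradiction"]
        if score >= threshold:
            flagged.append((premise, hypothesis, score))
    return flagged


def fake_nli(premise: str, hypothesis: str) -> Dict[str, float]:
    """Stub scorer with made-up numbers; replace with a real model call."""
    if "disappointing" in premise and "good" in hypothesis:
        return {"entailment": 1.0, "neutral": 4.0, "contradiction": 95.0}
    return {"entailment": 10.0, "neutral": 85.0, "contradiction": 5.0}


statements = [
    "The movie was actually disappointing.",
    "The movie was good.",
    "The cinema was crowded.",
]
contradictions = find_contradictions(statements, fake_nli)
for premise, hypothesis, score in contradictions:
    print(f"{score:.1f}%: {premise!r} vs. {hypothesis!r}")
```

Because NLI is directional, a fuller version might also score each pair in reverse order and combine the two results.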

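✨ Key features

The model was trained on the MultiNLI, Fever-NLI, and Adversarial-NLI (ANLI) datasets, which together contain 763,913 NLI hypothesis-premise pairs.
This base model outperforms almost all large models on the ANLI benchmark.
The base model is Microsoft's DeBERTa-v3-base. The v3 variant of DeBERTa substantially outperforms earlier versions of the model thanks to a different pre-training objective.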

📦 Installation

Before using the model, install the transformers library with the following command:

pip install transformers[sentencepiece]
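
After installing, a quick check can confirm that the packages are importable before any model download starts. This is a small sketch using only the standard library. `missing_packages` is a hypothetical helper, and including `torch` in the list is an assumption based on the PyTorch snippets elsewhere in this card.

```python
import importlib.util


def missing_packages(names):
    """Return the subset of package names that cannot be imported."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# transformers and sentencepiece come from the install command above;
# torch is assumed because the NLI snippet in this card imports it.
required = ["transformers", "sentencepiece", "torch"]
missing = missing_packages(required)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are importable.")
```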

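📚 Documentation

Training data

DeBERTa-v3-base-mnli-fever-anli was trained on the MultiNLI, Fever-NLI, and Adversarial-NLI (ANLI) datasets, which together contain 763,913 NLI hypothesis-premise pairs.

Training procedure

DeBERTa-v3-base-mnli-fever-anli was trained with the Hugging Face Trainer using the following hyperparameters: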

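from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./results",          # placeholder path (an assumption, not part of the reported hyperparameters)
    num_train_epochs=3,              # total number of training epochs
    learning_rate=2e-05,             # initial learning rate for the optimizer
    per_device_train_batch_size=32,  # batch size per device during training
    per_device_eval_batch_size=32,   # batch size per device during evaluation
    warmup_ratio=0.1,                # fraction of total training steps used for learning-rate warmup
    weight_decay=0.06,               # strength of weight decay
    fp16=True,                       # mixed-precision training (needs a supported GPU)
)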
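
Since `warmup_ratio=0.1` is a fraction of the total number of optimizer steps rather than a step count, it can help to work out what it amounts to. The arithmetic below is a rough sketch under stated assumptions that the source does not confirm: all 763,913 pairs are used for training, there is a single device, and there is no gradient accumulation.

```python
import math

num_examples = 763_913  # NLI pairs reported in this card
batch_size = 32         # per_device_train_batch_size
num_epochs = 3          # num_train_epochs
warmup_ratio = 0.1      # warmup_ratio

# Assumptions: one device, no gradient accumulation, last partial batch kept.
steps_per_epoch = math.ceil(num_examples / batch_size)
total_steps = steps_per_epoch * num_epochs
warmup_steps = math.ceil(total_steps * warmup_ratio)

print(steps_per_epoch, total_steps, warmup_steps)
```

Under these assumptions the script prints `23873 71619 7162`, so roughly the first 7,000 steps would be spent ramping the learning rate up to 2e-05. With several GPUs the per-step batch grows and the step counts shrink accordingly.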