nli-MiniLM2-L6-H768: an open-source natural language inference model for classifying the relationship between sentence pairs


Nli MiniLM2 L6 H768

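Developed by cross-encoder

A pretrained natural language inference model based on the MiniLMv2 architecture. It classifies the relationship between two sentences as contradiction, entailment, or neutral.

Text classification

Transformers

Language: English
License: Apache-2.0
Tags: #zero-shot classification #natural language inference #sentence relationship analysis

Downloads: 10.02k

Released: 3/2/2022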

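Model overview

The model uses a cross-encoder architecture built specifically for natural language inference: both sentences are fed through the network together, and the model decides whether their logical relationship is contradiction, entailment, or neutral.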

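Model features

Efficient inference

The base model was distilled from RoBERTa-Large, which speeds up inference while retaining strong accuracy.

Trained on multiple datasets

Trained jointly on SNLI and MultiNLI, two widely used natural language inference datasets.

Zero-shot classification support

Can be used directly for zero-shot classification without additional training.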

Model capabilities

Natural language inference

Text relationship classification

Zero-shot classification

Sentence pair analysis

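Use cases

Text analysis

Content consistency checking

Detect whether two passages logically contradict each other

Contradictions between texts can be identified automatically

Knowledge verification

Verify whether a statement agrees with known facts

Determine whether a statement is entailed by a knowledge base

Intelligent customer service

Question matching

Judge how well a user's question matches an answer in the knowledge base

Improves the accuracy of automated question-answering systems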
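As a rough illustration of the content consistency use case, the sketch below checks every ordered pair in a list of statements and keeps the pairs whose highest score falls on the contradiction label. It assumes the label order `contradiction, entailment, neutral` from the usage examples on this page. The names `find_contradictions`, `score_pairs`, and `toy_scorer` are hypothetical; `toy_scorer` is a keyword-based stand-in so the example runs without downloading the model.

```python
from itertools import permutations

# Label order assumed from the usage examples on this page
LABELS = ["contradiction", "entailment", "neutral"]


def find_contradictions(statements, score_pairs):
    """Return the (premise, hypothesis) pairs whose top score is 'contradiction'.

    score_pairs is any callable mapping a list of (premise, hypothesis) tuples
    to one row of three scores per pair, e.g. a CrossEncoder's predict method.
    """
    # NLI is directional, so both (a, b) and (b, a) are checked
    pairs = list(permutations(statements, 2))
    if not pairs:
        return []
    flagged = []
    for pair, row in zip(pairs, score_pairs(pairs)):
        row = [float(x) for x in row]
        if row.index(max(row)) == LABELS.index("contradiction"):
            flagged.append(pair)
    return flagged


def toy_scorer(pairs):
    """Hypothetical stand-in for the model: flags 'open' versus 'closed'."""
    rows = []
    for a, b in pairs:
        clash = ("open" in a and "closed" in b) or ("closed" in a and "open" in b)
        rows.append([5.0, 0.0, 0.0] if clash else [0.0, 0.0, 5.0])
    return rows


statements = ["The store is open.", "The store is closed.", "It is raining."]
for premise, hypothesis in find_contradictions(statements, toy_scorer):
    print(f"{premise!r} contradicts {hypothesis!r}")
```

To use the real model, `score_pairs` could be the `predict` method of a `CrossEncoder` loaded as in the usage examples. The number of pairs grows quadratically with the number of statements, so this only suits small sets.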

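🚀 Cross-Encoder for Natural Language Inference

This model was trained with the Cross-Encoder class from SentenceTransformers. Given a sentence pair, it performs natural language inference and outputs a score for each of three labels: contradiction, entailment, and neutral.

🚀 Quick start

This model was trained with the SentenceTransformers Cross-Encoder class.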

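✨ Key features

Trained from the pretrained model nreimers/MiniLMv2-L6-H768-distilled-from-RoBERTa-Large.
Outputs scores for the labels "contradiction", "entailment", and "neutral" for a given sentence pair.
Supports zero-shot classification.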

📦 Installation

The original documentation gives no specific installation steps. Refer to the official SentenceTransformers and Transformers documentation.

💻 Usage examples

Basic usage

The pretrained model can be used as follows:

from sentence_transformers import CrossEncoder
model = CrossEncoder('cross-encoder/nli-MiniLM2-L6-H768')
# Each input is a (premise, hypothesis) pair; predict returns one row of three scores per pair
scores = model.predict([
    ('A man is eating pizza', 'A man eats something'),
    ('A black race car starts up in front of a crowd of people.', 'A man is driving down a lonely road.'),
])

# Convert scores to labels: the index of the highest score in each row selects the label
label_mapping = ['contradiction', 'entailment', 'neutral']
labels = [label_mapping[score_max] for score_max in scores.argmax(axis=1)]
print(labels)
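The three scores per pair are easier to read as probabilities. The following is a minimal sketch, assuming the scores returned by `predict` are unnormalized logits in the label order shown above. The helper names `softmax` and `describe` and the sample scores are made up for illustration; in practice the rows would come from `model.predict(...)`.

```python
import math

LABELS = ["contradiction", "entailment", "neutral"]


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def describe(scores):
    """Return (best_label, {label: probability}) for each row of scores."""
    results = []
    for row in scores:
        probs = softmax([float(x) for x in row])
        best = max(range(len(probs)), key=probs.__getitem__)
        results.append((LABELS[best], dict(zip(LABELS, probs))))
    return results


# Made-up scores for illustration only; replace with model.predict(...) output
example_scores = [[-2.0, 4.0, 0.5], [3.0, -1.0, 0.0]]
for label, probs in describe(example_scores):
    print(label, {name: round(p, 3) for name, p in probs.items()})
```

Normalizing this way leaves the predicted label unchanged and makes it possible to apply a confidence threshold before acting on a prediction.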

Advanced usage

The model can also be used directly with the Transformers library, without SentenceTransformers:

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

model = AutoModelForSequenceClassification.from_pretrained('cross-encoder/nli-MiniLM2-L6-H768')
tokenizer = AutoTokenizer.from_pretrained('cross-encoder/nli-MiniLM2-L6-H768')

# The first list holds the premises, the second list the matching hypotheses
features = tokenizer(['A man is eating pizza', 'A black race car starts up in front of a crowd of people.'], ['A man eats something', 'A man is driving down a lonely road.'],  padding=True, truncation=True, return_tensors="pt")

model.eval()  # disable dropout for inference
with torch.no_grad():
    scores = model(**features).logits  # one row of three scores per pair

# Map the highest-scoring index in each row to its label
label_mapping = ['contradiction', 'entailment', 'neutral']
labels = [label_mapping[score_max] for score_max in scores.argmax(dim=1)]
print(labels)

Zero-shot classification usage

The model can be used for zero-shot classification as follows:

from transformers import pipeline

classifier = pipeline("zero-shot-classification", model='cross-encoder/nli-MiniLM2-L6-H768')

sent = "Apple just announced the newest iPhone X"
candidate_labels = ["technology", "sports", "politics"]
res = classifier(sent, candidate_labels)
print(res)
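The pipeline call above hides how an NLI model becomes a classifier. A rough sketch of the idea, assuming the pipeline's default hypothesis template `"This example is {}."` and its single-label mode: each candidate label is turned into a hypothesis, the text serves as the premise, and the entailment scores are normalized across the labels. The names `zero_shot_rank`, `score_pairs`, and `toy_scorer` are hypothetical; `toy_scorer` is a keyword-based stand-in so the example runs without the model.

```python
import math


def zero_shot_rank(text, candidate_labels, score_pairs,
                   template="This example is {}.", entailment_index=1):
    """Rank candidate labels by how strongly the text entails each hypothesis.

    score_pairs is any callable mapping (premise, hypothesis) tuples to rows of
    [contradiction, entailment, neutral] scores, e.g. a CrossEncoder's predict.
    """
    pairs = [(text, template.format(label)) for label in candidate_labels]
    entail = [float(row[entailment_index]) for row in score_pairs(pairs)]
    # Softmax over the entailment scores of all candidate labels
    m = max(entail)
    exps = [math.exp(x - m) for x in entail]
    total = sum(exps)
    probs = [e / total for e in exps]
    return sorted(zip(candidate_labels, probs), key=lambda kv: kv[1], reverse=True)


def toy_scorer(pairs):
    """Hypothetical stand-in for the model, driven by a tiny keyword table."""
    related = {"technology": ("iphone", "apple")}
    rows = []
    for premise, hypothesis in pairs:
        hit = any(
            label in hypothesis and any(w in premise.lower() for w in words)
            for label, words in related.items()
        )
        rows.append([0.0, 4.0, 0.0] if hit else [2.0, -1.0, 1.0])
    return rows


ranked = zero_shot_rank(
    "Apple just announced the newest iPhone X",
    ["technology", "sports", "politics"],
    toy_scorer,
)
for label, prob in ranked:
    print(f"{label}: {prob:.3f}")
```

Because the labels are only inserted into a sentence template, the wording of the template and of the labels themselves can noticeably affect the results.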

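📚 Documentation

Training data

The model was trained on the SNLI and MultiNLI datasets. For a given sentence pair, it outputs three scores, one for each of the labels "contradiction", "entailment", and "neutral".

Performance

For evaluation results, see SBERT.net - Pretrained Cross-Encoder.

📄 License

This project is released under the Apache-2.0 license.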

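📋 Model information

Property	Details
Model type	Cross-encoder for natural language inference
Training data	SNLI and MultiNLI datasets
Base model	nreimers/MiniLMv2-L6-H768-distilled-from-RoBERTa-Large
Library name	sentence-transformers
Metric	Accuracy
Tags	transformers
Task type	Zero-shot classification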