roberta-large-fallacy-classification: open-source text classification model

Roberta Large Fallacy Classification

Developed by MidhunKanadan

A text classification model fine-tuned from roberta-large to identify 13 common types of logical fallacy

Text classification

Transformers

English

License: Apache-2.0

#LogicalFallacyDetection #ArgumentQualityAssessment #CriticalThinkingAid

Downloads: 26

Released: 11/9/2024

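Model overview

The model classifies the logical fallacies that appear in a text. It is suited to education, argument analysis, and content moderation.

Model features

Multi-class fallacy recognition

Recognizes 13 types of logical fallacy, including equivocation, faulty generalization, and false causality.

Fine-grained tuning

Class weights offset the imbalance in the training data, and fine-tuning uses a low learning rate (2e-6).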
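
The source does not say how the class weights were derived. As a minimal sketch, assuming the common inverse-frequency ("balanced") heuristic, weight = N / (K × count), they could be computed as below. The label counts are invented for illustration, and `balanced_class_weights` is a hypothetical helper, not part of the model's training code.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Return {label: weight} with weight = N / (K * count).

    N is the number of samples and K the number of distinct classes, so
    rare classes get weights above 1 and frequent classes below 1.
    Assumption: this is a common heuristic, not the author's confirmed method.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {label: n_samples / (n_classes * count) for label, count in counts.items()}


# Invented, deliberately imbalanced toy data (not the real dataset).
toy_labels = ["faulty generalization"] * 6 + ["ad hominem"] * 3 + ["equivocation"] * 1
weights = balanced_class_weights(toy_labels)

# Print from the most frequent class (lowest weight) to the rarest (highest weight).
for label, weight in sorted(weights.items(), key=lambda item: item[1]):
    print(f"{label}: {weight:.3f}")
```

Weights of this kind are typically passed to a weighted cross-entropy loss (for example, the `weight` argument of `torch.nn.CrossEntropyLoss`), so that mistakes on rare classes cost more than mistakes on frequent ones.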

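Efficient inference

Inputs are truncated to a maximum of 128 tokens, and inference is fast on a GPU.

Model capabilities

Text classification

Logical fallacy detection

Argument quality assessment

Use cases

Education

Critical thinking instruction

Teach logical reasoning and critical thinking by identifying common fallacies

Help students recognize and avoid logical errors in arguments

Content analysis

Argument validity assessment

Assess the validity of arguments in debates, papers, and articles

Use the per-label scores as a rough quantitative signal of argument quality

Content moderation

Identify logical flaws in online debates or social media discussions

Raise the quality of discussion and reduce misleading statements

AI enhancement

Dialogue system enhancement

Strengthen the logical reasoning of dialogue systems

Make AI dialogue more logical and persuasive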

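🚀 roberta-large fallacy classification model

This model is a fine-tuned version of roberta-large, trained on a logical fallacy classification dataset. It classifies the types of logical fallacy that appear in a text.

🚀 Quick start

The model can be called through the text-classification pipeline. For example: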

from transformers import pipeline

# device=0 runs on the first GPU; pass device=-1 to run on the CPU instead
pipe = pipeline("text-classification", model="MidhunKanadan/roberta-large-fallacy-classification", device=0)
text = "The rooster crows always before the sun rises, therefore the crowing rooster causes the sun to rise."
# The pipeline returns a list with one result dict per input text
result = pipe(text)[0]
print(f"Predicted Label: {result['label']}, Score: {result['score']:.4f}")

Expected output:

Predicted Label: false causality, Score: 0.9632

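✨ Key features

Fine-tuned from roberta-large to classify logical fallacies in text.
Classifies 13 types of logical fallacy.
Uses class weights to handle class imbalance in the dataset.
Tokenizes with truncation and padding (maximum length: 128).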
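
To make the truncation and padding step concrete, here is a minimal pure-Python sketch of what happens to one sequence of token IDs at a fixed length. It assumes RoBERTa's special-token IDs (`<s>` = 0, `<pad>` = 1, `</s>` = 2) and padding all the way to the maximum length, as `padding="max_length"` would. `truncate_and_pad` is a hypothetical helper; in practice the tokenizer performs this step itself.

```python
BOS_ID, PAD_ID, EOS_ID = 0, 1, 2  # RoBERTa's <s>, <pad>, </s> token IDs


def truncate_and_pad(content_ids, max_length=128):
    """Wrap content IDs in <s> ... </s>, truncating or padding to max_length.

    Returns (input_ids, attention_mask), both exactly max_length long.
    """
    # Reserve two positions for the <s> and </s> special tokens.
    kept = content_ids[: max_length - 2]
    input_ids = [BOS_ID] + kept + [EOS_ID]
    attention_mask = [1] * len(input_ids)

    # Right-pad; the mask marks padding positions with 0 so they are ignored.
    n_pad = max_length - len(input_ids)
    return input_ids + [PAD_ID] * n_pad, attention_mask + [0] * n_pad


# Made-up token IDs, purely for illustration.
short_ids, short_mask = truncate_and_pad(list(range(100, 110)))
long_ids, long_mask = truncate_and_pad(list(range(100, 400)))

print(len(short_ids), sum(short_mask))  # 10 content tokens + 2 special tokens kept
print(len(long_ids), sum(long_mask))    # content cut so the total fits the limit
```

Text beyond the limit is simply dropped, so a fallacy that only becomes apparent late in a long passage may be missed; splitting long inputs into shorter segments is one way to work around this.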

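📚 Documentation

Model details

Property	Details
Base model	`roberta-large`
Training dataset	Logical Fallacy Dataset
Number of classes	13
Learning rate	2e-6
Batch size	8 (with gradient accumulation; effective batch size 16)
Weight decay	0.01
Training epochs	15
Mixed precision (FP16)	Enabled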
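
These hyperparameters map naturally onto a training configuration. The sketch below records them in a plain dictionary whose keys borrow the parameter names of `transformers.TrainingArguments`. That mapping, the value of 2 for `gradient_accumulation_steps`, and single-device training are assumptions inferred from the reported batch size of 8 and effective batch size of 16; the source does not confirm them.

```python
# Hyperparameters from the model details table; key names follow
# transformers.TrainingArguments as an assumption about how they were set.
training_config = {
    "learning_rate": 2e-6,
    "per_device_train_batch_size": 8,
    "gradient_accumulation_steps": 2,  # assumed: 16 / 8, not stated in the source
    "weight_decay": 0.01,
    "num_train_epochs": 15,
    "fp16": True,
}

NUM_DEVICES = 1  # assumed single-GPU training


def effective_batch_size(config, num_devices=NUM_DEVICES):
    """Number of samples that contribute to each optimizer step."""
    return (
        config["per_device_train_batch_size"]
        * config["gradient_accumulation_steps"]
        * num_devices
    )


print(effective_batch_size(training_config))  # 16
```

Gradient accumulation sums gradients over several small batches before each optimizer step, which gives the larger effective batch without the memory cost of holding 16 samples at once.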

Supported fallacy types

The model classifies text into the following types of logical fallacy:

Equivocation
Faulty Generalization
Fallacy of Logic
Ad Populum
Circular Reasoning
False Dilemma
False Causality
Fallacy of Extension
Fallacy of Credibility
Fallacy of Relevance
Intentional
Appeal to Emotion
Ad Hominem

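Dataset

Dataset name: Logical Fallacy Classification Dataset
Source: Logical Fallacy Classification Dataset
Number of classes: 13 fallacy types (for example, ad hominem, appeal to emotion, and faulty generalization)

Applications

Education: teach logical reasoning and critical thinking by identifying common fallacies.
Argument analysis: assess the validity of arguments in debates, papers, and articles.
AI assistants: add critical-reasoning capabilities to conversational AI systems.
Content moderation: identify logical flaws in online debates or social media discussions.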
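
For argument analysis or moderation, one plausible pattern is to classify a text sentence by sentence and surface only the confident predictions. The sketch below is an illustration under assumptions: `flag_fallacies` and the 0.8 threshold are hypothetical, the sentence splitter is deliberately naive, and `stub_classify` is a stand-in that only mimics the list-of-dicts output shape of a `text-classification` pipeline so the example runs without downloading the model.

```python
import re


def flag_fallacies(text, classify, threshold=0.8):
    """Return (sentence, label, score) for sentences scored at or above threshold.

    `classify` takes a list of strings and returns one
    {"label": ..., "score": ...} dict per string.
    """
    # Naive split on sentence-ending punctuation; real text needs a better splitter.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    predictions = classify(sentences)
    return [
        (sentence, pred["label"], pred["score"])
        for sentence, pred in zip(sentences, predictions)
        if pred["score"] >= threshold
    ]


def stub_classify(sentences):
    """Stand-in classifier with made-up scores, for illustration only."""
    return [
        {"label": "ad populum", "score": 0.91}
        if "everyone" in s.lower()
        else {"label": "fallacy of logic", "score": 0.35}
        for s in sentences
    ]


essay = "Everyone believes it, so it must be true. The data were collected in 2020."
flagged = flag_fallacies(essay, stub_classify)
for sentence, label, score in flagged:
    print(f"{label} ({score:.2f}): {sentence}")
```

The 13 labels listed in this document do not include a "no fallacy" class, so every input receives some fallacy label. A score threshold like the one above is one way to avoid flagging sound sentences, though the right cut-off would need to be checked against real data.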

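💻 Usage examples

Basic usage

from transformers import pipeline

# device=0 runs on the first GPU; pass device=-1 to run on the CPU instead
pipe = pipeline("text-classification", model="MidhunKanadan/roberta-large-fallacy-classification", device=0)
text = "The rooster crows always before the sun rises, therefore the crowing rooster causes the sun to rise."
# The pipeline returns a list with one result dict per input text
result = pipe(text)[0]
print(f"Predicted Label: {result['label']}, Score: {result['score']:.4f}")

Advanced usage

The following code returns the prediction score for every label: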

import torch
import torch.nn.functional as F
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_path = "MidhunKanadan/roberta-large-fallacy-classification"
text = "The rooster crows always before the sun rises, therefore the crowing rooster causes the sun to rise."

# Use the GPU when one is available; otherwise fall back to the CPU
device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForSequenceClassification.from_pretrained(model_path).to(device)

# Truncate to the 128-token maximum length given in the model details
inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True, max_length=128).to(device)

with torch.no_grad():
    # Softmax turns the logits into one probability per label
    probs = F.softmax(model(**inputs).logits, dim=-1)
    results = {model.config.id2label[i]: score.item() for i, score in enumerate(probs[0])}

# Print scores for all labels, highest first
for label, score in sorted(results.items(), key=lambda x: x[1], reverse=True):
    print(f"{label}: {score:.4f}")

Expected output:

false causality: 0.9632
fallacy of logic: 0.0139
faulty generalization: 0.0054
intentional: 0.0029
fallacy of credibility: 0.0023
equivocation: 0.0022
fallacy of extension: 0.0020
ad hominem: 0.0019
circular reasoning: 0.0016
false dilemma: 0.0015
fallacy of relevance: 0.0013
ad populum: 0.0009
appeal to emotion: 0.0009
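
Once the per-label scores are available, a caller usually wants a short ranked summary rather than all 13 numbers. The following is a minimal sketch that reuses the scores printed above; `top_k` and the 0.5 confidence cut-off are hypothetical choices for illustration, not part of the model.

```python
# Scores copied from the expected output above.
scores = {
    "false causality": 0.9632,
    "fallacy of logic": 0.0139,
    "faulty generalization": 0.0054,
    "intentional": 0.0029,
    "fallacy of credibility": 0.0023,
    "equivocation": 0.0022,
    "fallacy of extension": 0.0020,
    "ad hominem": 0.0019,
    "circular reasoning": 0.0016,
    "false dilemma": 0.0015,
    "fallacy of relevance": 0.0013,
    "ad populum": 0.0009,
    "appeal to emotion": 0.0009,
}


def top_k(scores, k=3):
    """Return the k highest-scoring (label, score) pairs, best first."""
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)[:k]


best_label, best_score = top_k(scores, k=1)[0]
# Hypothetical cut-off: below it, treat the result as "no confident prediction".
is_confident = best_score >= 0.5

for label, score in top_k(scores):
    print(f"{label}: {score:.4f}")
print(f"best={best_label}, confident={is_confident}")
```

Because the scores come from a softmax they sum to 1, so a single dominant score such as 0.9632 leaves little probability mass for the remaining twelve labels.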