roberta_toxicity_classifier: An Open-Source Toxic Comment Classifier for Accurately Identifying Toxic Content in English Text

Roberta Toxicity Classifier

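Developed by s-nlp

A toxic comment classification model fine-tuned from RoBERTa-large and trained on the Jigsaw competition datasets to identify toxic content in English text.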

Text Classification

Transformers

English #ToxicCommentDetection #HighAccuracyClassification #MultiCompetitionTrainingData

Downloads: 80.61k

Release date: 3/2/2022

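Model Overview

This model classifies English comments by toxicity and identifies harmful content in text. It was trained on about 2 million examples and reaches an AUC-ROC of 0.98 and an F1 score of 0.76 on the test set of the first Jigsaw competition.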

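Model Features

High-Performance Classification

Reaches an AUC-ROC of 0.98 and an F1 score of 0.76 on the Jigsaw competition test set

Large-Scale Training Data

Trained on about 2 million English examples combined from three Jigsaw competitions

Built on RoBERTa

Fine-tuned from the pretrained RoBERTa-large model (RoBERTa: A Robustly Optimized BERT Pretraining Approach)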

Model Capabilities

Text toxicity classification

Harmful content detection

Natural language processing

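Use Cases

Content Moderation

Social media comment filtering

Automatically identifies and filters harmful comments on social media platforms

Reduces the amount of toxic content on the platform

Online community management

Helps forum and community moderators quickly identify inappropriate posts

Improves the quality of community content

Academic Research

Language toxicity research

Supports research into the features and patterns of toxicity in online language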
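The comment-filtering use case can be sketched as a simple threshold rule over a toxicity score. This is a minimal illustration, not part of the model's own tooling: `split_by_toxicity`, `score_fn`, `keyword_stub`, and the 0.5 threshold are all hypothetical names and values chosen for the example. In practice `score_fn` would return the classifier's probability for the toxic class, and the threshold would be tuned on held-out data.

```python
from typing import Callable, Iterable, List, Tuple


def split_by_toxicity(
    comments: Iterable[str],
    score_fn: Callable[[str], float],
    threshold: float = 0.5,
) -> Tuple[List[str], List[str]]:
    """Partition comments into (kept, flagged) by comparing each score to the threshold."""
    kept: List[str] = []
    flagged: List[str] = []
    for comment in comments:
        if score_fn(comment) >= threshold:
            flagged.append(comment)
        else:
            kept.append(comment)
    return kept, flagged


def keyword_stub(comment: str) -> float:
    """Placeholder scorer for demonstration only (hypothetical).

    A real deployment would return the classifier's toxic-class probability here.
    """
    return 0.9 if "idiot" in comment.lower() else 0.1


kept, flagged = split_by_toxicity(
    ["Thanks for the help!", "You are an idiot."], keyword_stub
)
print("kept:", kept)
print("flagged:", flagged)
```

Keeping the scorer as a parameter means the same filtering logic can be exercised with a stub in tests and with the real model in production.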

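🚀 Toxicity Classification Model

This model is trained for the toxicity classification task. The training data is a merge of the English parts of three datasets released by Jigsaw (Jigsaw 2018, Jigsaw 2019, Jigsaw 2020), containing about 2 million examples. We split it into two parts and fine-tuned a RoBERTa model on it (RoBERTa: A Robustly Optimized BERT Pretraining Approach). The classifier performs well on the test set of the first Jigsaw competition, reaching an AUC-ROC of 0.98 and an F1 score of 0.76.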
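The two reported metrics measure different things, which is why a high AUC-ROC can sit alongside a more modest F1. AUC-ROC is threshold-free and reflects how well toxic examples are ranked above neutral ones, while F1 depends on a chosen decision threshold and is sensitive to class imbalance. The sketch below computes both on made-up toy data. It is an illustration of the definitions, not the authors' evaluation script, and the labels, scores, and 0.5 threshold are assumptions.

```python
def f1_score(y_true, y_pred):
    """F1 = 2 * precision * recall / (precision + recall) for the positive (toxic) class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def auc_roc(y_true, scores):
    """Probability that a random toxic example outscores a random neutral one (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one example of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (made up): 1 = toxic, 0 = neutral; scores are toxic-class probabilities.
labels = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.2, 0.4, 0.6, 0.1, 0.8]
preds = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

print("AUC-ROC:", auc_roc(labels, scores))
print("F1:", f1_score(labels, preds))
```

The pairwise loop is quadratic and meant only to make the definition visible; for real evaluation a library routine would be the usual choice.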

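🚀 Quick Start

Model Information

Property	Details
Model type	Toxicity classification model
Base model	FacebookAI/roberta-large
Training data	A merge of the English parts of three Jigsaw datasets, about 2 million examples
License	OpenRAIL++

How to Use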

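import torch
from transformers import RobertaTokenizer, RobertaForSequenceClassification

# Load the tokenizer and the fine-tuned classifier from the Hugging Face Hub
tokenizer = RobertaTokenizer.from_pretrained('s-nlp/roberta_toxicity_classifier')
model = RobertaForSequenceClassification.from_pretrained('s-nlp/roberta_toxicity_classifier')

# Encode one sentence into a tensor of input IDs
batch = tokenizer.encode("You are amazing!", return_tensors="pt")

# Run the classifier without tracking gradients
with torch.no_grad():
    output = model(batch)

# idx 0 for neutral, idx 1 for toxic
probs = torch.softmax(output.logits, dim=-1)
print(probs)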
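The model returns raw logits, so a post-processing step is needed to get a label and a probability. The following is a minimal sketch in plain Python, assuming the label order stated in the comment above (index 0 neutral, index 1 toxic). The helper names `softmax` and `interpret` and the example logits are made up for illustration; real values could be obtained from `output.logits.tolist()`.

```python
import math

# Label order taken from the usage snippet: index 0 is neutral, index 1 is toxic.
LABELS = ("neutral", "toxic")


def softmax(logits):
    """Convert one row of raw logits into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def interpret(logits_rows):
    """Return (label, toxic_probability) for each row of [neutral, toxic] logits."""
    results = []
    for row in logits_rows:
        probs = softmax(row)
        best = max(range(len(probs)), key=probs.__getitem__)
        results.append((LABELS[best], probs[1]))
    return results


# Made-up logits for two inputs, used only to exercise the helpers.
example_logits = [[4.0, -4.0], [-1.0, 2.0]]
for label, p_toxic in interpret(example_logits):
    print(f"{label}: p(toxic) = {p_toxic:.3f}")
```

Returning the toxic-class probability alongside the label leaves room to apply a custom threshold later instead of always taking the argmax.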

📚 Detailed Documentation

Citation

To cite our work, please use the following citation:

@inproceedings{logacheva-etal-2022-paradetox,
    title = "{P}ara{D}etox: Detoxification with Parallel Data",
    author = "Logacheva, Varvara  and
      Dementieva, Daryna  and
      Ustyantsev, Sergey  and
      Moskovskiy, Daniil  and
      Dale, David  and
      Krotova, Irina  and
      Semenov, Nikita  and
      Panchenko, Alexander",
    booktitle = "Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)",
    month = may,
    year = "2022",
    address = "Dublin, Ireland",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2022.acl-long.469",
    pages = "6804--6818",
    abstract = "We present a novel pipeline for the collection of parallel data for the detoxification task. We collect non-toxic paraphrases for over 10,000 English toxic sentences. We also show that this pipeline can be used to distill a large existing corpus of paraphrases to get toxic-neutral sentence pairs. We release two parallel corpora which can be used for the training of detoxification models. To the best of our knowledge, these are the first parallel datasets for this task. We describe our pipeline in detail to make it fast to set up for a new language or domain, thus contributing to faster and easier development of new parallel resources. We train several detoxification models on the collected data and compare them with several baselines and state-of-the-art unsupervised approaches. We conduct both automatic and manual evaluations. All models trained on parallel data outperform the state-of-the-art unsupervised models by a large margin. This suggests that our novel datasets can boost the performance of detoxification systems.",
}