roberta_toxicity_classifier: An Open-Source Toxic Comment Classification Model


Roberta Toxicity Classifier

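Developed by s-nlp

A toxic comment classification model fine-tuned from RoBERTa-large. It was trained on datasets from the Jigsaw competitions and is used to identify toxic content in English text.

Text classification

Transformers

English #Toxic comment detection #High-accuracy classification #Trained on data from multiple competitions

Downloads: 80.61k

Release date: 3/2/2022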

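Model Overview

This model is designed specifically for toxicity classification of English comments and identifies harmful content in text. It was trained on about 2 million samples and reaches an AUC-ROC of 0.98 and an F1 score of 0.76 on the test set of the first Jigsaw competition.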

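Model Features

High-performance classification

Achieves an AUC-ROC of 0.98 and an F1 score of 0.76 on the test set of the first Jigsaw competition.

Large-scale training data

Trained on about 2 million English samples merged from three Jigsaw competitions.

Built on RoBERTa

Fine-tuned from the pretrained RoBERTa-large (Robustly Optimized BERT Pretraining Approach) model.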

Model Capabilities

Text toxicity classification

Harmful content detection

Natural language processing

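Use Cases

Content moderation

Social media comment filtering

Automatically identifies and filters harmful comments on social media platforms.

Reduces the amount of toxic content on the platform.

Online community management

Helps forum and community moderators identify inappropriate posts quickly.

Improves the quality of community content.

Academic research

Language toxicity research

Used to study the characteristics and patterns of toxicity in online language.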
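
The content moderation use cases above can be sketched as a simple threshold filter. This is only an illustration under stated assumptions: `score_fn` is a hypothetical callable that returns the probability that a comment is toxic (for example, a wrapper around this model), `demo_score` is a stand-in keyword scorer and not the real classifier, and the 0.5 threshold is an arbitrary choice that a real deployment would need to tune.

```python
from typing import Callable, Iterable, List, Tuple


def filter_comments(
    comments: Iterable[str],
    score_fn: Callable[[str], float],
    threshold: float = 0.5,
) -> Tuple[List[str], List[Tuple[str, float]]]:
    """Split comments into (kept, flagged) using a toxicity score.

    score_fn is a hypothetical scorer returning P(toxic) in [0, 1].
    Comments scoring at or above the threshold are flagged together
    with their score so a moderator can review them.
    """
    kept: List[str] = []
    flagged: List[Tuple[str, float]] = []
    for text in comments:
        score = score_fn(text)
        if score >= threshold:
            flagged.append((text, score))
        else:
            kept.append(text)
    return kept, flagged


def demo_score(text: str) -> float:
    # Stand-in scorer for demonstration only; NOT the real model.
    return 0.9 if "idiot" in text.lower() else 0.1


kept, flagged = filter_comments(
    ["You are amazing!", "You are an idiot."], demo_score
)
```

Returning the flagged comments with their scores, rather than silently dropping them, leaves room for human review of borderline cases.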

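🚀 Toxicity Classification Model

This model is trained for the toxicity classification task. The training data is the merged English portion of three Jigsaw datasets (Jigsaw 2018, Jigsaw 2019, Jigsaw 2020), about 2 million samples in total. The data was split into two parts, and a RoBERTa model (RoBERTa: A Robustly Optimized BERT Pretraining Approach) was fine-tuned on it. The classifier performs well on the test set of the first Jigsaw competition, reaching an AUC-ROC of 0.98 and an F1 score of 0.76.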

Attribute	Details
Model type	Toxicity classification model
Training data	Merged English portions of three Jigsaw datasets (about 2 million samples)
Base model	FacebookAI/roberta-large
Dataset	google/jigsaw_toxicity_pred
License	OpenRAIL++
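
For readers less familiar with the two metrics reported above, the following is a minimal sketch of how F1 and AUC-ROC can be computed from binary labels and toxicity scores. It is plain Python for illustration only: the function names, the toy data, and the 0.5 decision threshold are assumptions, and this is not the authors' evaluation code.

```python
def f1_score(y_true, y_pred):
    """F1 for the positive (toxic = 1) class: harmonic mean of precision and recall."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def auc_roc(y_true, scores):
    """AUC-ROC as the fraction of (toxic, neutral) pairs ranked correctly.

    A pair counts 1 if the toxic example has the higher score and 0.5 on a tie.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data, made up for illustration
y_true = [1, 0, 1, 0, 1]
scores = [0.9, 0.2, 0.6, 0.7, 0.4]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold
```

AUC-ROC depends only on how the scores rank toxic examples against neutral ones, while F1 depends on the chosen decision threshold, so the two numbers are not directly comparable.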

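🚀 Quick Start

This toxicity classification model is trained to identify harmful comments. The sections below describe how to use it.

✨ Key Features

Trained specifically for the toxic comment classification task.
Trained on a large dataset (about 2 million samples), reaching an AUC-ROC of 0.98 on the test set of the first Jigsaw competition.

📦 Installation

Using this model requires the transformers library together with PyTorch. Both can be installed with the following command.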

pip install transformers torch

💻 Usage Examples

Basic Usage

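import torch
from transformers import RobertaTokenizer, RobertaForSequenceClassification

# Load the tokenizer and the fine-tuned classifier from the Hugging Face Hub
tokenizer = RobertaTokenizer.from_pretrained('s-nlp/roberta_toxicity_classifier')
model = RobertaForSequenceClassification.from_pretrained('s-nlp/roberta_toxicity_classifier')

# Encode a single sentence into a tensor of input IDs
batch = tokenizer.encode("You are amazing!", return_tensors="pt")

# Inference only, so gradient tracking is not needed
with torch.no_grad():
    output = model(batch)

# output.logits has shape (1, 2): idx 0 for neutral, idx 1 for toxic
predicted_idx = output.logits.argmax(dim=-1).item()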
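
The model returns raw logits rather than labels. As a hedged sketch of one way to post-process them, the helpers below turn a pair of logits into a label and a probability, and `classify` runs a batch of texts through the model. The names `LABELS`, `softmax`, `label_from_logits`, and `classify` are illustrative and not part of the model's API; the label order follows the comment in the example above. The heavy imports are deferred into `classify` so the helpers can be used without loading the model.

```python
import math
from typing import List, Sequence, Tuple

# Index 0 = neutral, index 1 = toxic, following the comment in the basic example
LABELS = ("neutral", "toxic")


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def label_from_logits(logits: Sequence[float]) -> Tuple[str, float]:
    """Return the most likely label and its probability."""
    probs = softmax(logits)
    idx = max(range(len(probs)), key=probs.__getitem__)
    return LABELS[idx], probs[idx]


def classify(texts: List[str]) -> List[Tuple[str, float]]:
    """Classify a batch of texts (downloads the model on first use)."""
    # Imported lazily so the helpers above work without these packages
    import torch
    from transformers import RobertaTokenizer, RobertaForSequenceClassification

    name = "s-nlp/roberta_toxicity_classifier"
    tokenizer = RobertaTokenizer.from_pretrained(name)
    model = RobertaForSequenceClassification.from_pretrained(name)

    # Pad to a common length and truncate overly long inputs
    batch = tokenizer(texts, return_tensors="pt", padding=True, truncation=True)
    with torch.no_grad():
        logits = model(**batch).logits
    return [label_from_logits(row) for row in logits.tolist()]
```

Calling `classify(["You are amazing!"])` would return one `(label, probability)` pair per input text.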

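📚 Documentation

This model is trained to classify harmful comments. The training data is the merged English portion of three Jigsaw datasets, about 2 million samples in total. RoBERTa serves as the base model and was fine-tuned on this data.

📄 License

This model is released under the OpenRAIL++ license, which supports the development of technologies, in both industry and academia, that serve the public good.

📖 Citation

If you use this model, please cite the following.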

@inproceedings{logacheva-etal-2022-paradetox,
    title = "{P}ara{D}etox: Detoxification with Parallel Data",
    author = "Logacheva, Varvara  and
      Dementieva, Daryna  and
      Ustyantsev, Sergey  and
      Moskovskiy, Daniil  and
      Dale, David  and
      Krotova, Irina  and
      Semenov, Nikita  and
      Panchenko, Alexander",
    booktitle = "Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)",
    month = may,
    year = "2022",
    address = "Dublin, Ireland",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2022.acl-long.469",
    pages = "6804--6818",
    abstract = "We present a novel pipeline for the collection of parallel data for the detoxification task. We collect non-toxic paraphrases for over 10,000 English toxic sentences. We also show that this pipeline can be used to distill a large existing corpus of paraphrases to get toxic-neutral sentence pairs. We release two parallel corpora which can be used for the training of detoxification models. To the best of our knowledge, these are the first parallel datasets for this task. We describe our pipeline in detail to make it fast to set up for a new language or domain, thus contributing to faster and easier development of new parallel resources. We train several detoxification models on the collected data and compare them with several baselines and state-of-the-art unsupervised approaches. We conduct both automatic and manual evaluations. All models trained on parallel data outperform the state-of-the-art unsupervised models by a large margin. This suggests that our novel datasets can boost the performance of detoxification systems.",
}