toxic-bert open-source model: free to deploy, detects multiple types of harmful online comments

Toxic Bert

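Developed by unitary

A toxic comment classification system built on PyTorch Lightning and Hugging Face Transformers that can detect many types of harmful online content.

Text classification | Open-source license: Apache-2.0 | #multilingual toxicity detection #comment moderation #unbiased classification

Downloads: 609.57k

Release date: 3/2/2022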

Model overview

Detoxify is a collection of pretrained models for identifying and classifying toxic content in online comments, such as threats, insults, and hate speech.

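Model features

Multi-task toxicity detection

Detects multiple toxicity types at once, including threats, insults, and hate speech.

Bias control

Specifically optimized to reduce misclassification of comments that mention particular identity groups.

Multilingual support

Supports toxic content detection in 7 languages.

Ready-to-use models

Provides pretrained models that can be used without additional training.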

Model capabilities

Text toxicity classification

Multi-label classification

Multilingual text processing

Bias detection

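Use cases

Content moderation

Social media comment filtering

Automatically identifies harmful comments and marks them for human reviewers to check.

Detects multiple toxicity types, with a reported accuracy of 93% or higher.

Research and analysis

Online discourse research

Analyzes the distribution of toxic content across different platforms and communities.

Provides fine-grained toxicity classification data.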

🚀 🙊 Detoxify

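This project provides trained models and code that predict toxic comments on 3 Jigsaw challenges (Toxic Comment Classification, Unintended Bias in Toxic Comments, and Multilingual Toxic Comment Classification), built with ⚡ PyTorch Lightning and 🤗 Transformers.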

🚀 Quick start

Installation

# install detoxify
pip install detoxify

Running predictions

from detoxify import Detoxify

# each model takes either a string or a list of strings
results = Detoxify('original').predict('example text')
results = Detoxify('unbiased').predict(['example text 1', 'example text 2'])

# one sample sentence per language supported by the multilingual model
input_text = ['example text', 'exemple de texte', 'texto de ejemplo', 'testo di esempio', 'texto de exemplo', 'örnek metin', 'пример текста']
results = Detoxify('multilingual').predict(input_text)

# optional: display results nicely (requires `pip install pandas`)
import pandas as pd
print(pd.DataFrame(results, index=input_text).round(5))
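
If pandas is not available, the same per-text view can be built by hand. The sketch below assumes that `predict` on a list returns a dict mapping each label to a list of scores, which is what the `pd.DataFrame(results, index=input_text)` call above implies. The helper name `results_to_rows` and the scores in `fake_results` are made up for illustration and are not model output.

```python
def results_to_rows(texts, results):
    """Turn {label: [score, ...]} into one {"text": ..., label: score} dict per input."""
    rows = []
    for i, text in enumerate(texts):
        row = {"text": text}
        for label, scores in results.items():
            row[label] = round(float(scores[i]), 5)
        rows.append(row)
    return rows


# Hypothetical scores standing in for Detoxify('unbiased').predict(texts)
texts = ["example text 1", "example text 2"]
fake_results = {"toxicity": [0.0012, 0.9731], "insult": [0.0004, 0.8125]}

rows = results_to_rows(texts, fake_results)
for row in rows:
    print(row)
```

Each row keeps the original text next to its scores, which is convenient when writing results to a log or a JSON file.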

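✨ Key features

Provides toxic comment classification models for the 3 Jigsaw challenges.
Supports toxic comment classification across multiple languages.
A user-friendly, easy-to-use library.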

📦 Installation

Installing dependencies

# clone project
git clone https://github.com/unitaryai/detoxify

# create virtual env
python3 -m venv toxic-env
source toxic-env/bin/activate

# install project
pip install -e detoxify
cd detoxify

# for training
pip install -r requirements.txt

Downloading the data

If you do not have a Kaggle account, create one first, then download an API token and place it in ~/.kaggle.

# create data directory
mkdir jigsaw_data
cd jigsaw_data

# download data
kaggle competitions download -c jigsaw-toxic-comment-classification-challenge
kaggle competitions download -c jigsaw-unintended-bias-in-toxicity-classification
kaggle competitions download -c jigsaw-multilingual-toxic-comment-classification
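
Before running the download commands above, it can help to confirm that the API token is where the Kaggle CLI looks for it. This is a minimal sketch. The file name `kaggle.json` is the one the Kaggle CLI uses but is not stated in the text above, and `check_kaggle_token` is a hypothetical helper, not part of Detoxify or the Kaggle package.

```python
import stat
from pathlib import Path


def check_kaggle_token(kaggle_dir):
    """Return a list of problems found with the Kaggle API token (empty if none)."""
    token = Path(kaggle_dir) / "kaggle.json"  # assumed file name, see lead-in
    if not token.is_file():
        return [f"missing token file: {token}"]
    problems = []
    # The Kaggle CLI warns when the token is readable by other users.
    mode = stat.S_IMODE(token.stat().st_mode)
    if mode & (stat.S_IRGRP | stat.S_IROTH):
        problems.append(f"token readable by other users (mode {oct(mode)}); consider chmod 600")
    return problems


for problem in check_kaggle_token(Path.home() / ".kaggle"):
    print(problem)
```

The permission check only makes sense on POSIX systems; on Windows the mode bits do not carry the same meaning.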

💻 Usage examples

Basic usage

from detoxify import Detoxify
results = Detoxify('original').predict('example text')

Advanced usage

from detoxify import Detoxify
import pandas as pd

input_text = ['example text 1', 'example text 2']
results = Detoxify('unbiased').predict(input_text)
print(pd.DataFrame(results, index=input_text).round(5))
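
For long lists of comments, passing everything to `predict` in a single call may use a lot of memory. One possible approach is to split the list into batches and merge the results. In the sketch below, `predict_in_batches` and the default `batch_size=32` are assumptions for illustration, and `fake_predict` stands in for a real model so the example runs without downloading weights.

```python
def predict_in_batches(predict, texts, batch_size=32):
    """Call predict on slices of texts and merge the {label: [scores]} dicts."""
    merged = {}
    for start in range(0, len(texts), batch_size):
        batch = texts[start:start + batch_size]
        for label, scores in predict(batch).items():
            merged.setdefault(label, []).extend(scores)
    return merged


def fake_predict(batch):
    """Stand-in for Detoxify('unbiased').predict; the scores are not real."""
    return {"toxicity": [len(text) / 100 for text in batch]}


texts = ["a" * n for n in range(1, 6)]
merged = predict_in_batches(fake_predict, texts, batch_size=2)
print(merged)
```

With the real library, the call would look like `predict_in_batches(Detoxify('unbiased').predict, input_text)`, which builds the model once and reuses it for every batch.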

📚 Documentation

Model overview

Model name	Transformer type	Data source
`original`	`bert-base-uncased`	Toxic Comment Classification Challenge
`unbiased`	`roberta-base`	Unintended Bias in Toxicity Classification
`multilingual`	`xlm-roberta-base`	Multilingual Toxic Comment Classification

Running predictions

# load model via torch.hub
python run_prediction.py --input 'example' --model_name original

# load model from a checkpoint path
python run_prediction.py --input 'example' --from_ckpt_path model_path

# save results to a .csv file
python run_prediction.py --input test_set.txt --model_name original --save_to results.csv

# to see usage
python run_prediction.py --help
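
The `--input test_set.txt` form above takes a text file. The sketch below prepares such a file under the assumption that the script reads it as one comment per line; the text above does not state this, so check `python run_prediction.py --help` before relying on it. `write_input_file` is a hypothetical helper.

```python
from pathlib import Path


def write_input_file(comments, path):
    """Write one comment per line, collapsing embedded newlines and extra spaces."""
    lines = [" ".join(comment.split()) for comment in comments]
    Path(path).write_text("\n".join(lines) + "\n", encoding="utf-8")
    return len(lines)


count = write_input_file(["first comment", "second\ncomment"], "test_set.txt")
print(count)
```

Collapsing whitespace keeps a multi-line comment from being read as several separate inputs.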

Training

Toxic Comment Classification Challenge

python create_val_set.py
python train.py --config configs/Toxic_comment_classification_BERT.json

Unintended Bias in Toxicity Challenge

python train.py --config configs/Unintended_bias_toxic_comment_classification_RoBERTa.json

Multilingual Toxic Comment Classification

# stage 1
python train.py --config configs/Multilingual_toxic_comment_classification_XLMR.json

# stage 2
python train.py --config configs/Multilingual_toxic_comment_classification_XLMR_stage2.json

Monitoring training progress

tensorboard --logdir=./saved

Model evaluation

Toxic Comment Classification Challenge

python evaluate.py --checkpoint saved/lightning_logs/checkpoints/example_checkpoint.pth --test_csv test.csv

Unintended Bias in Toxicity Challenge

python evaluate.py --checkpoint saved/lightning_logs/checkpoints/example_checkpoint.pth --test_csv test.csv
# to get the final bias metric
python model_eval/compute_bias_metric.py

Multilingual Toxic Comment Classification

python evaluate.py --checkpoint saved/lightning_logs/checkpoints/example_checkpoint.pth --test_csv test.csv

🔧 Technical details

The library is built with 🤗 Transformers and ⚡ PyTorch Lightning.
The training data comes from the 3 Jigsaw challenges.
The multilingual model is trained on 7 languages: English, French, Spanish, Italian, Portuguese, Turkish, and Russian.
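
Given the language list above, one way to choose a model per comment is a small lookup on a language code. This is a sketch, not part of the library: `pick_model` is a hypothetical helper, the ISO 639-1 codes are my mapping of the seven named languages, and routing English to `original` rather than `unbiased` is an arbitrary choice.

```python
# ISO 639-1 codes for the seven languages listed for the multilingual model
MULTILINGUAL_LANGS = {"en", "fr", "es", "it", "pt", "tr", "ru"}


def pick_model(lang):
    """Return a Detoxify model name for a language code, or None if unsupported."""
    lang = lang.lower()
    if lang == "en":
        return "original"  # "unbiased" would be an equally valid choice
    if lang in MULTILINGUAL_LANGS:
        return "multilingual"
    return None


for code in ["en", "FR", "ja"]:
    print(code, pick_model(code))
```

Returning `None` for an unlisted language makes the caller decide what to do, instead of silently scoring text the model was not trained on.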

📄 License

This project is released under the Apache-2.0 license.

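⚠️ Important note

The models hosted on huggingface may currently return results that differ from those of the detoxify library (the source page links to details). To use the latest models, we recommend loading them from https://github.com/unitaryai/detoxify.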
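
To see how far two sets of scores for the same text diverge, a simple per-label comparison is enough. The sketch below assumes both results are available as `{label: score}` dicts with at least some labels in common; if the two sources name their labels differently, the names would need to be mapped first. `max_score_gap` is a hypothetical helper and the scores are invented for illustration.

```python
def max_score_gap(a, b):
    """Return (gap, label) for the largest absolute difference over shared labels."""
    shared = a.keys() & b.keys()
    if not shared:
        raise ValueError("no labels in common")
    return max((abs(a[label] - b[label]), label) for label in shared)


# Invented scores for the same text from two hypothetical sources
library_scores = {"toxicity": 0.91, "insult": 0.40}
hub_scores = {"toxicity": 0.88, "insult": 0.41}

gap, label = max_score_gap(library_scores, hub_scores)
print(label, round(gap, 4))
```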

💡 Usage tips

This library is intended for research use. It can be used for fine-tuning on datasets that reflect real-world demographics, or to help content moderators flag harmful content more quickly.
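
As an illustration of the moderation use mentioned above, scores can be turned into a review queue ordered by severity, so that human moderators see the most likely toxic comments first. This is a sketch under stated assumptions: `build_review_queue` is a hypothetical helper, the 0.5 threshold is arbitrary and would need tuning on real data, and the scores are invented.

```python
def build_review_queue(texts, scores, threshold=0.5):
    """Return (score, text) pairs at or above threshold, highest score first."""
    flagged = [(score, text) for text, score in zip(texts, scores) if score >= threshold]
    return sorted(flagged, reverse=True)


# Invented toxicity scores standing in for results["toxicity"]
comments = ["comment a", "comment b", "comment c"]
toxicity = [0.2, 0.9, 0.6]

queue = build_review_queue(comments, toxicity)
for score, text in queue:
    print(f"{score:.2f}  {text}")
```

The queue only prioritizes comments for a human decision; it does not remove anything on its own.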

Citation

@misc{Detoxify,
  title={Detoxify},
  author={Hanu, Laura and {Unitary team}},
  howpublished={Github. https://github.com/unitaryai/detoxify},
  year={2020}
}