Russian Toxicity Classifier
A toxic comment classifier for Russian text, fine-tuned from conversational RuBERT, with a reported test accuracy of 0.97.
Downloads 17.93k
Release Time : 3/2/2022
Model Overview
This model is a BERT-based classifier designed to identify toxic comments in Russian text. It was trained on a merged corpus built from two Russian toxic comment datasets and reaches high classification accuracy on the test set.
Model Features
High Accuracy
Achieved an accuracy of 0.97 on the test set, with an F1 score of 0.93 for toxic comments.
Multi-source Data Training
Trained on two combined Russian toxic comment datasets, one from 2ch.hk and one from ok.ru, which exposes the model to more than one platform's style of writing.
Based on Conversational RuBERT
Fine-tuned from DeepPavlov/rubert-base-cased-conversational, which makes it well suited to conversational text.
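The accuracy and F1 figures in the feature list measure different things, and on imbalanced data the F1 score for the toxic class is usually the lower of the two. The sketch below shows how both are computed from a confusion matrix. The counts are invented purely for illustration and are not the model's actual test results.

```python
from typing import Tuple


def accuracy_and_f1(tp: int, fp: int, fn: int, tn: int) -> Tuple[float, float]:
    """Return (accuracy, F1 for the positive class) from confusion counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, f1


# Hypothetical counts: 100 toxic and 400 non-toxic comments.
# These numbers are made up for the example, not taken from the model card.
accuracy, f1 = accuracy_and_f1(tp=93, fp=7, fn=7, tn=393)
print(f"accuracy={accuracy:.3f}  f1_toxic={f1:.3f}")
```

Because non-toxic comments dominate in this made-up split, the many correct non-toxic predictions lift accuracy, while F1 depends only on how the toxic class is handled.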
Model Capabilities
Russian Text Classification
Toxic Content Detection
Comment Content Analysis
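The capabilities above can be exercised with a short scoring helper. This is a minimal sketch, assuming the model is published on the Hugging Face Hub and that the transformers and torch packages are installed. The hub id in MODEL_ID and the position of the toxic label (toxic_index=1) are assumptions not stated on this page; check model.config.id2label before relying on them.

```python
import math
from typing import List

# Assumption: this hub id is not given on this page; replace it with the
# id of the checkpoint you actually use.
MODEL_ID = "s-nlp/russian_toxicity_classifier"


def softmax(logits: List[float]) -> List[float]:
    """Convert raw logits into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def toxicity_scores(
    texts: List[str], model_id: str = MODEL_ID, toxic_index: int = 1
) -> List[float]:
    """Return the toxic-class probability for each text.

    toxic_index is an assumption; confirm it against model.config.id2label.
    """
    # Imported lazily so softmax() stays usable without torch/transformers.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(model_id)
    model.eval()

    batch = tokenizer(
        texts, return_tensors="pt", padding=True, truncation=True, max_length=512
    )
    with torch.no_grad():
        logits = model(**batch).logits
    return [softmax(row)[toxic_index] for row in logits.tolist()]
```

Calling toxicity_scores(["Привет, как дела?"]) downloads the checkpoint on first use and returns one probability per input text.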
Use Cases
Content Moderation
Social Media Comment Filtering
Automatically identify and filter toxic comments on social media platforms
Reported test accuracy of 0.97, which can help reduce inappropriate content on platforms
Forum Content Management
Assist forum administrators in identifying and handling toxic remarks
F1 score of 0.93 on toxic comments, useful for flagging comments that need manual review
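Both use cases above come down to turning a toxicity probability into a moderation action. The sketch below shows one possible routing rule. The Decision type, the route function, and both thresholds are hypothetical and would need tuning on a platform's own data.

```python
from dataclasses import dataclass


@dataclass
class Decision:
    text: str
    score: float  # probability of the toxic class, in [0, 1]
    action: str   # "allow", "review", or "remove"


def route(
    text: str, score: float, review_at: float = 0.5, remove_at: float = 0.9
) -> Decision:
    """Map a toxicity probability to a moderation action.

    The default thresholds are illustrative assumptions, not values
    recommended by the model's authors.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be a probability between 0 and 1")
    if score >= remove_at:
        action = "remove"   # confident enough to filter automatically
    elif score >= review_at:
        action = "review"   # uncertain: send to a human moderator
    else:
        action = "allow"
    return Decision(text=text, score=score, action=action)
```

Keeping a middle "review" band sends uncertain cases to manual review, as in the forum use case, instead of forcing an automatic decision.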