🚀 Russian Inappropriate Messages Classification Model
This model is designed to detect inappropriate messages in the Russian language. It focuses on a derivative of toxicity, providing an additional layer of filtering after toxicity and obscenity checks.
🚀 Quick Start
The 'inappropriateness' concept in this model is not a substitute for toxicity but a derivative of it. The model can serve as an additional layer of inappropriateness filtering after toxicity and obscenity filtration. You can detect the exact sensitive topic by using another model. The proposed pipeline is shown in the scheme below.

You can also train one classifier for both toxicity and inappropriateness detection. The data to be mixed with toxic labelled samples could be found on our GitHub or on kaggle
✨ Features
Inappropraiteness Intuition
This model is trained on a dataset of inappropriate Russian messages. An inappropriate utterance generally has no obscene words or toxic intent but can harm the speaker's reputation. Find some sample for more intuition in the table below. Learn more about the concept of inappropriateness in this article presented at the workshop for Balto - Slavic NLP at the EACL - 2021 conference. Note that this article describes the first version of the dataset, while the model is trained on the extended version available on our GitHub or on kaggle. The dataset properties are the same as described in the article, only the size differs.
Text |
Inappropriate |
Comment |
Unsafe topic |
Ладно бы видного деятеля завалили а тут какого то ноунейм нигру преступника |
yes |
Оправдание убийства |
offline crime |
Это нарушение УКРФ! |
no |
Осуждение нарушения закона |
offline crime |
Религия - это насилие высшего уровня. Верующего принуждают к вере еще в младенчестве. |
yes |
Оскорбление чувств верующих |
religion |
Христиане, как вы относитесь к иноверцам, в частности к Гамалиилу и к уверовавшим в Иисуса фарисеям? |
no |
Обычные вопросы к верующим без намека на оскорбление их чувств |
religion |
Хорошо что проституция запрещена |
no |
Осуждение простуитуции |
prostitution |
Возьму бабки и сниму шлюх на выходных |
yes |
Популяризация проституции |
prostitution |
Metrics
The model was trained, validated, and tested only on samples with 100% confidence, achieving the following metrics on the test set:
|
precision |
recall |
f1 - score |
support |
0 |
0.92 |
0.93 |
0.93 |
7839 |
1 |
0.80 |
0.76 |
0.78 |
2726 |
accuracy |
|
|
0.89 |
10565 |
macro avg |
0.86 |
0.85 |
0.85 |
10565 |
weighted avg |
0.89 |
0.89 |
0.89 |
10565 |
📄 License
This project is licensed under the [Creative Commons Attribution - NonCommercial - ShareAlike 4.0 International License][cc - by - nc - sa].
[![CC BY - NC - SA 4.0][cc - by - nc - sa - image]][cc - by - nc - sa]
[cc - by - nc - sa]: http://creativecommons.org/licenses/by - nc - sa/4.0/
[cc - by - nc - sa - image]: https://i.creativecommons.org/l/by - nc - sa/4.0/88x31.png
📚 Documentation
Citation
If you find this repository helpful, feel free to cite our publication:
@inproceedings{babakov-etal-2021-detecting,
title = "Detecting Inappropriate Messages on Sensitive Topics that Could Harm a Company{'}s Reputation",
author = "Babakov, Nikolay and
Logacheva, Varvara and
Kozlova, Olga and
Semenov, Nikita and
Panchenko, Alexander",
booktitle = "Proceedings of the 8th Workshop on Balto - Slavic Natural Language Processing",
month = apr,
year = "2021",
address = "Kiyv, Ukraine",
publisher = "Association for Computational Linguistics",
url = "https://www.aclweb.org/anthology/2021.bsnlp-1.4",
pages = "26--36",
abstract = "Not all topics are equally {``}flammable{''} in terms of toxicity: a calm discussion of turtles or fishing less often fuels inappropriate toxic dialogues than a discussion of politics or sexual minorities. We define a set of sensitive topics that can yield inappropriate and toxic messages and describe the methodology of collecting and labelling a dataset for appropriateness. While toxicity in user - generated data is well - studied, we aim at defining a more fine - grained notion of inappropriateness. The core of inappropriateness is that it can harm the reputation of a speaker. This is different from toxicity in two respects: (i) inappropriateness is topic - related, and (ii) inappropriate message is not toxic but still unacceptable. We collect and release two datasets for Russian: a topic - labelled dataset and an appropriateness - labelled dataset. We also release pre - trained classification models trained on this data.",
}
📞 Contacts
If you have any questions please contact Nikolay