distilbert-base-german-cased-toxic-comments Open Source Model - Free Detection of Malicious Content in German Comments

Distilbert Base German Cased Toxic Comments

Developed by ml6team

A text classification model based on German DistilBERT for detecting toxic content in German comments

Text Classification

Transformers

German #German Toxic Detection #Social Content Moderation #DistilBERT Fine-tuning

Downloads 356

Release Time : 3/2/2022

Model Overview

This model detects toxic or potentially harmful German comments. It is a German DistilBERT model fine-tuned on a combination of five German toxic content datasets.

Model Features

Multi-dataset Fusion Training

Combines five German toxic content datasets whose labels cover abuse, profanity, offensive language, and hate speech

Efficient Lightweight Architecture

Based on DistilBERT, a distilled version of BERT, which lowers inference cost compared with a full-size BERT model

German-specific Optimization

Built on the German base model distilbert-base-german-cased and fine-tuned only on German data, so it is not suited to other languages

Model Capabilities

German Text Classification

Toxic Content Detection

Hate Speech Recognition

Offensive Language Analysis

Use Cases

Content Moderation

Social Media Comment Filtering

Automatically identifies and flags toxic comments on social media platforms

Reported test-set accuracy of 78.50% (precision 70.27%, recall 39.22%), which can reduce manual review workload

Forum Content Management

Detects hate speech and offensive content in online forums

Reported test-set F1 score of 50.34; the training data covers several forms of malicious expression, including profanity and hate speech

User Behavior Analysis

User Risk Scoring

Analyzes user behavior risk level based on historical comments
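One way such a risk score could be derived is to average the toxic-class probability over a user's past comments. This is a minimal sketch rather than anything described by the model authors: the helper names `toxic_probability` and `user_risk_score`, the label string `"toxic"`, and the stub classifier are all assumptions made for illustration.

```python
from typing import Callable, Dict, Sequence

# Shape of one result from a transformers text-classification pipeline.
Result = Dict[str, object]


def toxic_probability(result: Result, toxic_label: str = "toxic") -> float:
    """Probability of the toxic class, given the top label and its score.

    Valid for a two-class model, where the other class has probability 1 - score.
    """
    score = float(result["score"])
    return score if result["label"] == toxic_label else 1.0 - score


def user_risk_score(
    comments: Sequence[str],
    classify: Callable[[str], Result],
    toxic_label: str = "toxic",  # assumed label name, verify against the model config
) -> float:
    """Mean toxic probability over a user's comments; 0.0 if there are none."""
    if not comments:
        return 0.0
    probabilities = [toxic_probability(classify(c), toxic_label) for c in comments]
    return sum(probabilities) / len(probabilities)


def stub_classifier(text: str) -> Result:
    """Hypothetical stand-in for the real model so the sketch runs offline."""
    if "!" in text:
        return {"label": "toxic", "score": 0.9}
    return {"label": "non_toxic", "score": 0.8}


history = ["Schönen Tag noch", "Halt den Mund!"]
print(user_risk_score(history, stub_classifier))
```

With the real model, `classify` could be a thin wrapper around a loaded pipeline, for example `lambda text: pipe(text)[0]`. A plain mean is only one possible aggregation; a maximum or a recency-weighted average may suit some moderation policies better.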

🚀 German Toxic Comment Classification

This model is designed to detect toxic or potentially harmful comments in German.

🚀 Quick Start

This model detects toxic or potentially harmful German comments. It is a German DistilBERT model fine-tuned on a combination of five German datasets covering toxicity, profanity, offensive language, and hate speech.

✨ Features

Detects toxicity in German comments.
Based on a fine-tuned German DistilBERT model.

📦 Installation

The original document gives no installation steps. The usage example below requires the transformers library and a backend such as PyTorch.

💻 Usage Examples

Basic Usage

from transformers import pipeline

model_hub_url = 'https://huggingface.co/ml6team/distilbert-base-german-cased-toxic-comments'
model_name = 'ml6team/distilbert-base-german-cased-toxic-comments'

toxicity_pipeline = pipeline('text-classification', model=model_name, tokenizer=model_name)

comment = "Ein harmloses Beispiel"
result = toxicity_pipeline(comment)[0]
print(f"Comment: {comment}\nLabel: {result['label']}, score: {result['score']}")
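For moderation workloads, comments are usually scored in batches and flagged against a threshold. The sketch below shows one way to do that. It assumes the toxic class is labelled `"toxic"` (check `model.config.id2label` to confirm) and uses a hypothetical threshold of 0.5; `load_toxicity_pipeline` and `flag_comments` are illustrative helpers, not part of the model card.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Result = Dict[str, object]


def load_toxicity_pipeline(
    model_name: str = "ml6team/distilbert-base-german-cased-toxic-comments",
):
    """Load the classifier; transformers is imported lazily so the rest runs without it."""
    from transformers import pipeline

    return pipeline("text-classification", model=model_name, tokenizer=model_name)


def flag_comments(
    comments: Sequence[str],
    classify_batch: Callable[[List[str]], List[Result]],
    toxic_label: str = "toxic",  # assumed label name
    threshold: float = 0.5,      # hypothetical cut-off, tune on your own data
) -> List[Tuple[str, float]]:
    """Return (comment, score) pairs whose top label is toxic with score >= threshold."""
    results = classify_batch(list(comments))
    flagged = []
    for comment, result in zip(comments, results):
        score = float(result["score"])
        if result["label"] == toxic_label and score >= threshold:
            flagged.append((comment, score))
    return flagged


# Usage with the real model (downloads the weights on first call):
# toxicity_pipeline = load_toxicity_pipeline()
# print(flag_comments(["Ein harmloses Beispiel"], toxicity_pipeline))
```

Passing a list to the pipeline returns one result per input, which is why `classify_batch` can be the pipeline object itself. Raising the threshold trades recall for precision.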

📚 Documentation

Model Description

This model was created to detect toxic or potentially harmful comments. We fine-tuned the German DistilBERT model distilbert-base-german-cased on a combination of five German datasets containing toxicity, profanity, offensive language, or hate speech.

Intended Uses & Limitations

This model can be used to detect toxicity in German comments. However, the definition of toxicity is vague and the model might not be able to detect all instances of toxicity. It will not be able to detect toxicity in languages other than German.

Limitations and Bias

The model was trained on a combination of datasets that contain examples gathered from different social networks and internet communities. This represents only a narrow subset of possible instances of toxicity, and instances in other domains might not be detected reliably.

Training Data

The training dataset combines the following five datasets:

GermEval18 [[dataset](https://github.com/uds-lsv/GermEval-2018-Data)]
- Labels: abuse, profanity, toxicity
GermEval21 [dataset]
- Labels: toxicity
IWG Hatespeech dataset [paper, [dataset](https://github.com/UCSM-DUE/IWG_hatespeech_public)]
- Labels: hate speech
Detecting Offensive Statements Towards Foreigners in Social Media (2017) by Breitschneider and Peters [[dataset](http://ub-web.de/research/)]
- Labels: hate
HASOC: 2019 Hate Speech and Offensive Content [dataset]
- Labels: offensive, profanity, hate

The datasets contain different labels, ranging from profanity through hate speech to toxicity. In the combined dataset, these labels were subsumed under toxic and non-toxic. The combined dataset contains 23,515 examples in total. Note that the datasets vary substantially in the number of examples.
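The label merging described above amounts to a lookup from each source label to a binary target. The following sketch is an assumption about how that could look, not the authors' preprocessing code: the set contents are taken from the label lists above, while the exact strings used in each dataset's files (and the name `to_binary_label`) are hypothetical.

```python
# Source labels listed for the five datasets; everything else counts as non-toxic.
TOXIC_SOURCE_LABELS = {
    "abuse",
    "profanity",
    "toxicity",
    "hate speech",
    "hate",
    "offensive",
}


def to_binary_label(source_label: str) -> str:
    """Map a dataset-specific label to 'toxic' or 'non_toxic' (assumed target names)."""
    normalized = source_label.strip().lower()
    return "toxic" if normalized in TOXIC_SOURCE_LABELS else "non_toxic"


for label in ["Profanity", "hate speech", "other"]:
    print(label, "->", to_binary_label(label))
```

Real dataset files often encode labels differently (for example as upper-case tags or 0/1 flags), so a per-dataset normalization step would be needed before this lookup.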

Training Procedure

The training and test sets were created using the predefined train/test splits where available, and otherwise an 80%/20% split of the examples. This resulted in 17,072 training examples and 6,443 test examples. The model was trained for 2 epochs with the following arguments:

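from transformers import TrainingArguments

# batch_size is defined elsewhere in the training script; its value is not stated.
training_args = TrainingArguments(
    per_device_train_batch_size=batch_size,
    per_device_eval_batch_size=batch_size,
    num_train_epochs=2,
    evaluation_strategy="steps",        # evaluate every N steps rather than per epoch
    logging_strategy="steps",
    logging_steps=100,
    save_total_limit=5,                 # keep at most 5 checkpoints on disk
    learning_rate=2e-5,
    weight_decay=0.01,
    metric_for_best_model='accuracy',   # checkpoint selection criterion
    load_best_model_at_end=True
)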
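Setting `metric_for_best_model='accuracy'` only works if the `Trainer` receives a `compute_metrics` function that returns a key named `accuracy`. The original function is not shown, so the following is a minimal sketch of what it might look like. It assumes class index 1 is the toxic class, which is not confirmed by the source, and it is written in plain Python so it runs without extra dependencies.

```python
def compute_metrics(eval_pred, positive_class=1):
    """Compute accuracy, precision, recall and F1 from (logits, labels).

    positive_class=1 is an assumption about which index means 'toxic'.
    """
    logits, labels = eval_pred
    # Predicted class is the index of the largest logit in each row.
    predictions = [max(range(len(row)), key=lambda i: row[i]) for row in logits]
    labels = [int(label) for label in labels]

    pairs = list(zip(predictions, labels))
    tp = sum(1 for p, y in pairs if p == positive_class and y == positive_class)
    fp = sum(1 for p, y in pairs if p == positive_class and y != positive_class)
    fn = sum(1 for p, y in pairs if p != positive_class and y == positive_class)
    correct = sum(1 for p, y in pairs if p == y)

    accuracy = correct / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Tiny offline check with made-up logits and labels.
example_logits = [[0.1, 0.9], [0.8, 0.2], [0.3, 0.7], [0.6, 0.4]]
example_labels = [1, 0, 0, 1]
print(compute_metrics((example_logits, example_labels)))
```

The function would be passed as `Trainer(..., compute_metrics=compute_metrics)`. The `Trainer` prefixes the returned keys with `eval_`, and `metric_for_best_model='accuracy'` is resolved against `eval_accuracy`.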

Evaluation Results

Model evaluation was done on 1/10th of the dataset, which served as the test dataset.

Accuracy	F1 Score	Recall	Precision
78.50	50.34	39.22	70.27
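The reported F1 score can be cross-checked from the precision and recall in the table, since F1 is their harmonic mean. The short calculation below uses only the figures above; the variable names are illustrative.

```python
# Values from the evaluation table, in percent.
precision = 70.27
recall = 39.22

# F1 is the harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)
print(f"F1 = {f1:.2f}")  # F1 = 50.34

# Share of toxic test comments the model did not flag.
missed = 100 - recall
print(f"Missed toxic comments: {missed:.2f}%")  # Missed toxic comments: 60.78%
```

The result matches the reported 50.34. The gap between precision and recall indicates that a flagged comment is usually toxic, while a majority of toxic comments in the test set go unflagged.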

🔧 Technical Details

The model is based on fine-tuning the German DistilBERT model distilbert-base-german-cased on specific German datasets. The training and test set creation, as well as the training arguments, are described in the relevant sections.

📄 License

No license information is provided in the original document.
