BERT Base Uncased HateXplain

Developed by Hate-speech-CNERG
HateXplain is a text classification model that labels text as hate speech, offensive content, or normal content. It was trained on posts from Gab and Twitter, and uses human-annotated rationales to improve performance.
Downloads 3,831
Release Time: 3/2/2022

Model Overview

This model performs three-class classification (hate speech, offensive content, normal content) on social media text. Its training data includes human-annotated rationales, which are intended to improve both detection accuracy and interpretability.

Model Features

Multi-category classification
Simultaneously identifies hate speech, offensive content, and normal content in text.
Interpretability enhancement
Training data includes human-annotated rationales that make the model's decisions easier to interpret.
Cross-platform data
Trained on posts from both Gab and Twitter, which is intended to improve generalization across platforms.

Model Capabilities

Text classification
Hate speech detection
Content safety filtering
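
As an illustration of the classification capability, the following is a minimal sketch of how the model might be loaded and queried with the Hugging Face transformers library. The Hub identifier in MODEL_ID is an assumption based on the developer and model names shown on this page, and the helper names load_classifier and top_label are hypothetical. The label strings returned are whatever the model's configuration defines, so check them before relying on specific values.

```python
from typing import Dict, List

# Assumed Hugging Face Hub identifier, inferred from the developer and model names.
MODEL_ID = "Hate-speech-CNERG/bert-base-uncased-hatexplain"


def load_classifier():
    """Build a text-classification pipeline.

    Requires the `transformers` package and a backend such as PyTorch.
    The import is done lazily so `top_label` can be used without them.
    """
    from transformers import pipeline

    # top_k=None asks the pipeline to return a score for every class.
    return pipeline("text-classification", model=MODEL_ID, top_k=None)


def top_label(scores: List[Dict]) -> str:
    """Return the label with the highest score from one prediction.

    `scores` is a list of {"label": ..., "score": ...} dicts, which is the
    per-input format the pipeline produces when all class scores are requested.
    """
    return max(scores, key=lambda item: item["score"])["label"]


# Example usage (downloads the model on first call, so it is left commented out):
# classifier = load_classifier()
# scores = classifier(["an example social media post"])[0]
# print(top_label(scores))
```

Passing a list of texts keeps the output shape predictable (one list of scores per input); the exact return structure for a single string can vary between transformers versions.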

Use Cases

Content moderation
Social media content filtering
Automatically identifies and flags hate speech and offensive content on platforms.
Reduces manual review workload and improves efficiency in identifying harmful content.
Academic research
Hate speech pattern analysis
Used to study linguistic features and propagation patterns of online hate speech.
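
For the content moderation use case, one possible way to turn the three class scores into a flagging decision is sketched below. This is an illustrative assumption, not a documented part of the model: the label names in HARMFUL_LABELS, the should_flag helper, and the 0.5 default threshold are all hypothetical and would need tuning against real moderation data.

```python
from typing import Dict

# Hypothetical label names; check the model's configuration for the actual strings.
HARMFUL_LABELS = {"hate speech", "offensive"}


def should_flag(scores: Dict[str, float], threshold: float = 0.5) -> bool:
    """Decide whether a post should be sent for human review.

    `scores` maps each class label to its predicted probability.
    The post is flagged when the combined probability of the harmful
    classes is at least `threshold` (an assumed, tunable value).
    """
    harmful = sum(p for label, p in scores.items() if label in HARMFUL_LABELS)
    return harmful >= threshold


# Example usage with made-up scores:
# should_flag({"hate speech": 0.1, "offensive": 0.5, "normal": 0.4})
```

Summing the two harmful classes favors recall: a post split between "hate speech" and "offensive" is still flagged even if neither class alone dominates. A stricter policy could instead require a single class to exceed the threshold.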