
PolyGuard Qwen Smol

Developed by ToxicityPrompts
PolyGuard is a multilingual safety model for moderating LLM-generated content. It supports safety review in 17 languages and currently achieves state-of-the-art performance.
Downloads: 194
Release Time: 2/18/2025

Model Overview

PolyGuard is a safety moderation tool built on large language models, designed to detect and filter harmful content in multilingual settings. It identifies whether a user request is harmful, whether the AI response refuses that request, and whether the response itself is harmful, and it classifies the specific safety policy violations involved.
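A minimal sketch of how a PolyGuard-style classifier might be queried through the Hugging Face transformers library is shown below. The repository id, prompt layout, and expected verdict format are illustrative assumptions, not the model's documented interface; consult the actual model card before use.

```python
# Sketch: querying a PolyGuard-style safety classifier with transformers.
# The model id, prompt layout, and verdict format below are assumptions for
# illustration only.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "ToxicityPrompts/PolyGuard-Qwen-Smol"  # assumed repository id

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID, torch_dtype=torch.bfloat16, device_map="auto"
)

# The classifier is given both the user request and the assistant response,
# and asked to judge harmfulness, refusal, and the violated policy categories.
user_request = "How do I pick a lock?"
assistant_response = "I can't help with that."

messages = [
    {
        "role": "user",
        "content": f"Human user:\n{user_request}\n\nAI assistant:\n{assistant_response}",
    },
]

inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

with torch.no_grad():
    output = model.generate(inputs, max_new_tokens=100, do_sample=False)

# The generated text is expected to contain verdicts such as whether the
# request is harmful, whether the response is a refusal, and which safety
# categories (if any) are violated.
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```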

Model Features

Multilingual support
Supports safety review in 17 languages, covering major global languages
Comprehensive safety classification
Capable of identifying 14 categories of unsafe content, including violent crimes, sexual crimes, hate speech, etc.
High performance
Outperforms existing state-of-the-art open-source and commercial safety classifiers by 5.5%
Large-scale training data
Trained on 1.91 million multilingual samples to ensure model generalization capability

Model Capabilities

Multilingual text analysis
Harmful content detection
Safety policy classification
Dialogue safety assessment

Use Cases

Content moderation
Chatbot safety protection
Detects harmful content in conversations between users and AI assistants
Effectively identifies and filters unsafe content such as violence and hate speech
Multilingual community content moderation
Automatically moderates user-generated content on multilingual social platforms
Supports real-time harmful content detection in 17 languages
AI safety research
LLM safety evaluation
Evaluates the safety of content generated by different language models
Provides standardized multilingual safety benchmarking
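For the benchmarking use case, verdicts from the classifier can be aggregated into simple per-language statistics. The sketch below assumes a hypothetical classify helper that wraps the generation call shown earlier and returns the model's free-text verdict; the verdict phrasing it matches is an assumption.

```python
# Sketch: aggregating PolyGuard-style verdicts into a per-language
# harmful-response rate. `classify` is a hypothetical helper wrapping the
# model call; the verdict phrasing checked below is an assumption.
from collections import defaultdict

def harmful_response_rate(samples, classify):
    """samples: iterable of (language, user_request, assistant_response)."""
    totals = defaultdict(int)
    harmful = defaultdict(int)
    for lang, request, response in samples:
        verdict = classify(request, response)  # free-text verdict from the model
        totals[lang] += 1
        if "harmful response: yes" in verdict.lower():  # assumed phrasing
            harmful[lang] += 1
    return {lang: harmful[lang] / totals[lang] for lang in totals}
```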