
Llama Guard 4 12B

Developed by meta-llama
Llama Guard 4 is a native multimodal safety classifier with 12 billion parameters, jointly trained on text and multiple images for content safety evaluation of large language model inputs and outputs.
Downloads 16.52k
Release Time: 4/23/2025

Model Overview

Llama Guard 4 is a dense model pruned from the Llama 4 Scout pre-trained model and fine-tuned for content safety classification. It generates text output indicating whether the content is safe and, if it is unsafe, lists the violated categories.
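
Because the classifier answers in plain text, downstream code usually needs to turn that text into a structured verdict. The sketch below assumes the output format used by the Llama Guard family (a first line reading `safe` or `unsafe`, optionally followed by a line of comma-separated category codes such as `S1,S14`); that format is not stated on this page and should be checked against the official model card. `GuardVerdict` and `parse_guard_output` are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class GuardVerdict:
    """Structured form of the classifier's text verdict (hypothetical type)."""
    safe: bool
    categories: list[str] = field(default_factory=list)


def parse_guard_output(text: str) -> GuardVerdict:
    """Parse a verdict string, ASSUMING the format 'safe' or 'unsafe\\nS1,S2'."""
    # Drop blank lines and surrounding whitespace the model may emit.
    lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]
    if not lines:
        raise ValueError("empty classifier output")

    label = lines[0].lower()
    if label == "safe":
        return GuardVerdict(safe=True)
    if label == "unsafe":
        codes: list[str] = []
        if len(lines) > 1:
            # Second line is assumed to hold comma-separated category codes.
            codes = [c.strip() for c in lines[1].split(",") if c.strip()]
        return GuardVerdict(safe=False, categories=codes)

    # Fail loudly rather than guess when the verdict is unrecognized.
    raise ValueError(f"unrecognized verdict: {lines[0]!r}")
```

Raising on unrecognized output, rather than defaulting to "safe", keeps a malformed generation from silently passing moderation.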

Model Features

Multimodal Safety Review
Combines text and image review capabilities, supporting multimodal safety assessment with a single classifier.
MLCommons Standard Alignment
Trained on the MLCommons hazard taxonomy, with an added 'Code Interpreter Abuse' category.
Multi-Image Input Support
Adds support for training and evaluation on samples containing 2-5 images.
Efficient Architecture
Converts the mixture-of-experts (MoE) architecture to a dense architecture through pruning, enabling single-GPU operation.
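
To make the taxonomy concrete, the following sketch maps category codes to readable names. The `S1` to `S14` codes and their labels are an assumption taken from Meta's published Llama Guard model cards, not from this page, so verify them against the official card before relying on them. `HAZARD_CATEGORIES` and `describe_categories` are hypothetical names.

```python
# Category codes ASSUMED from Meta's Llama Guard model cards; S14 is the
# category added on top of the MLCommons hazards.
HAZARD_CATEGORIES = {
    "S1": "Violent Crimes",
    "S2": "Non-Violent Crimes",
    "S3": "Sex-Related Crimes",
    "S4": "Child Sexual Exploitation",
    "S5": "Defamation",
    "S6": "Specialized Advice",
    "S7": "Privacy",
    "S8": "Intellectual Property",
    "S9": "Indiscriminate Weapons",
    "S10": "Hate",
    "S11": "Suicide & Self-Harm",
    "S12": "Sexual Content",
    "S13": "Elections",
    "S14": "Code Interpreter Abuse",
}


def describe_categories(codes: list[str]) -> list[str]:
    """Return 'code: name' strings, flagging codes missing from the table."""
    return [
        f"{code}: {HAZARD_CATEGORIES.get(code, 'Unknown category')}"
        for code in codes
    ]


if __name__ == "__main__":
    for line in describe_categories(["S10", "S14"]):
        print(line)
```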

Model Capabilities

Text Safety Classification
Image Safety Classification
Multimodal Content Review
Violation Category Identification

Use Cases

Content Moderation
Social Media Content Filtering
Automatically identifies and filters harmful content on social media platforms.
Reduces safety violation rates, matching or surpassing the performance of previous models.
AI Chatbot Safety Protection
Evaluates the safety of inputs and outputs for large language models.
Input filtering reduces safety violation rates more effectively than output filtering.
Enterprise Security
Internal Communication Review
Monitors inappropriate content in internal corporate communications.
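
The chatbot use case above screens both the user's input and the model's output. A minimal sketch of that gating pattern follows; it does not call Llama Guard itself. `guarded_reply`, `generate`, and `is_safe` are hypothetical names, and the keyword-based `toy_is_safe` is only a stand-in for a real call to the safety classifier.

```python
from typing import Callable

REFUSAL = "Sorry, I can't help with that."


def guarded_reply(
    prompt: str,
    generate: Callable[[str], str],
    is_safe: Callable[[str], bool],
) -> str:
    """Screen the prompt, generate a reply, then screen the reply."""
    # Input filtering: reject unsafe prompts before spending a generation.
    if not is_safe(prompt):
        return REFUSAL
    reply = generate(prompt)
    # Output filtering: catch unsafe content produced from a safe-looking prompt.
    if not is_safe(reply):
        return REFUSAL
    return reply


# --- Toy stand-ins so the sketch runs without any model (assumptions) ---
def toy_is_safe(text: str) -> bool:
    """Placeholder for a safety-classifier call; flags one keyword only."""
    return "forbidden" not in text.lower()


def toy_generate(prompt: str) -> str:
    """Placeholder chat model that echoes the prompt."""
    return f"You said: {prompt}"


if __name__ == "__main__":
    print(guarded_reply("hello", toy_generate, toy_is_safe))
    print(guarded_reply("something forbidden", toy_generate, toy_is_safe))
```

In a real deployment the output check would normally see the full conversation rather than the reply alone; the single-string interface here is a simplification.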