L

Llama Prompt Guard 2 22M

Developed by meta-llama
Llama Prompt Guard 2 86M is a text classification model designed to detect prompt injection and jailbreak attacks, serving as the second-generation product in the Prompt Guard series.
Downloads 2,376
Release Time : 4/28/2025

Model Overview

This model aims to help developers detect and prevent prompt attacks against LLMs, including prompt injection and jailbreak attacks. It can identify malicious prompts and protect LLM applications from such attacks.

Model Features

Improved Performance
Significant performance enhancement compared to the first-generation model, reducing false positives on out-of-distribution data.
Adversarial Attack Resistance
Improved tokenization strategies to mitigate adversarial tokenization attacks, such as whitespace manipulation and fragmented tokenization.
Multilingual Support
Capable of detecting prompt attacks in multiple languages, including both English and non-English.
Simplified Classification
Focuses on binary classification, labeling prompts as 'benign' or 'malicious,' simplifying the usage process.

Model Capabilities

Malicious Prompt Detection
Multilingual Text Classification
Prompt Attack Protection

Use Cases

LLM Security
Preventing Prompt Injection
Detects and blocks malicious prompts attempting to manipulate LLMs into executing unintended instructions.
Effectively identifies both known and unknown prompt injection patterns.
Preventing Jailbreak Attacks
Identifies malicious instructions attempting to bypass LLM's built-in safety restrictions.
High accuracy in detecting various jailbreak techniques.
AI Application Security
API Protection
Deployed at the front end of LLM APIs to filter malicious requests.
Reduces API abuse and security incidents.
Featured Recommended AI Models
AIbase
Empowering the Future, Your AI Solution Knowledge Base
© 2025AIbase