P

Prompt Injection Defender Large V0 Onnx

Developed by testsavantai
TestSavantAI models are a set of fine-tuned classifiers specifically designed to defend against prompt injection and jailbreak attacks targeting large language models (LLMs).
Downloads 3,225
Release Time : 11/27/2024

Model Overview

This model adopts the BERT architecture, focusing on detecting and intercepting malicious prompts to protect LLMs from prompt injection and jailbreak attacks.

Model Features

Guard Effectiveness Score (GES)
An innovative evaluation metric combining Attack Success Rate (ASR) and False Rejection Rate (FRR)
Multi-size Variants
Offers models of different specifications to balance performance and computational efficiency
ONNX Support
Provides ONNX versions for easier deployment and optimized inference performance

Model Capabilities

Malicious Prompt Detection
Jailbreak Attack Defense
Text Classification

Use Cases

AI Security
Prompt Injection Defense
Detects and intercepts malicious prompts attempting to bypass LLM security restrictions
Effectively reduces the success rate of prompt injection attacks
Jailbreak Attack Protection
Prevents users from gaining unauthorized access to LLMs through specially crafted prompts
Reduces the risk of LLM misuse
Featured Recommended AI Models
AIbase
Empowering the Future, Your AI Solution Knowledge Base
© 2025AIbase