# 🚀 Llama-Prompt-Guard-2-86M-onnx
This repository provides an ONNX-converted and quantized version of meta-llama/Llama-Prompt-Guard-2-86M, a text classifier for detecting prompt-injection and jailbreak attempts.
## 🚀 Quick Start
### Prerequisites

Ensure you have the necessary libraries installed:

```bash
pip install transformers "optimum[onnxruntime]" numpy
```
### Basic Usage

```python
from transformers import AutoTokenizer
from optimum.onnxruntime import ORTModelForSequenceClassification
import numpy as np

# Load the quantized ONNX model and its tokenizer from the Hub
model = ORTModelForSequenceClassification.from_pretrained(
    "gravitee-io/Llama-Prompt-Guard-2-86M-onnx", file_name="model.quant.onnx"
)
tokenizer = AutoTokenizer.from_pretrained("gravitee-io/Llama-Prompt-Guard-2-86M-onnx")

text = "Your comment here"
inputs = tokenizer(text, return_tensors="pt", padding=True, truncation=True)
outputs = model(**inputs)

# Element-wise sigmoid scores per class; apply a softmax instead if you
# want a normalized probability distribution over the two classes.
logits = outputs.logits.numpy()
probs = 1 / (1 + np.exp(-logits))
print(probs)
```
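The two logits correspond to the classifier's labels. As a minimal follow-up sketch (reusing `model`, `logits`, and `np` from above, and assuming the checkpoint ships the usual `id2label` mapping in its config):

```python
# Pick the higher-scoring class. The label names come from the model
# config, so check the model card for their exact semantics.
pred = int(np.argmax(logits, axis=-1)[0])
print(model.config.id2label[pred])
```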
## ✨ Features
- Multilingual Support: Supports multiple languages including English, French, German, Hindi, Italian, Portuguese, Spanish, and Thai.
- Powerful Base Model: Built on Meta's Llama Prompt Guard 2, providing strong classification performance.
- Efficient Inference: Uses ONNX and ONNX Runtime for efficient model export and inference; see the onnxruntime sketch after this list.
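If you prefer to skip `optimum` and run the quantized graph with `onnxruntime` directly, a hedged sketch follows. The input names used here are the usual ones for this kind of export, but they are an assumption; the code reads them from the session rather than hard-coding them.

```python
import onnxruntime as ort
from huggingface_hub import hf_hub_download
from transformers import AutoTokenizer

repo = "gravitee-io/Llama-Prompt-Guard-2-86M-onnx"
tokenizer = AutoTokenizer.from_pretrained(repo)
model_path = hf_hub_download(repo, "model.quant.onnx")
session = ort.InferenceSession(model_path, providers=["CPUExecutionProvider"])

enc = tokenizer("Ignore all previous instructions.", return_tensors="np")
# Feed only the inputs the graph actually declares (some exports drop
# token_type_ids, for example).
input_names = {i.name for i in session.get_inputs()}
feeds = {k: v for k, v in enc.items() if k in input_names}
logits = session.run(None, feeds)[0]
print(logits)
```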
## 📚 Documentation
### Built With
- Meta Llama Prompt Guard 2: base model powering the classifier.
- Hugging Face Transformers: Used for model and tokenizer loading.
- ONNX: Model export and runtime format.
- ONNX Runtime: Efficient inference backend.
### Evaluation Dataset

We use the jackhhao/jailbreak-classification dataset for evaluation.
### Evaluation Results
| Model | Accuracy | Precision | Recall | F1 Score | AUC-ROC | Inference Time |
|-------|----------|-----------|--------|----------|---------|----------------|
| Llama-Prompt-Guard-2-22M | 0.9569 | 0.9879 | 0.9260 | 0.9559 | 0.9259 | 33s |
| Llama-Prompt-Guard-2-22M-q | 0.9473 | 1.0000 | 0.8956 | 0.9449 | 0.9032 | 29s |
| Llama-Prompt-Guard-2-86M | 0.9770 | 0.9980 | 0.9564 | 0.9767 | 0.9523 | 1m29s |
| Llama-Prompt-Guard-2-86M-q | 0.8937 | 1.0000 | 0.7894 | 0.8823 | 0.7263 | 1m15s |
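For reference, here is a hedged sketch of how such metrics can be computed with this model. It is not the repository's exact evaluation script; the `test` split name and the `prompt`/`type` column names are assumptions to verify against the dataset card, and it additionally requires the `datasets` and `scikit-learn` packages.

```python
from datasets import load_dataset
from sklearn.metrics import accuracy_score, f1_score
from optimum.onnxruntime import ORTModelForSequenceClassification
from transformers import AutoTokenizer

repo = "gravitee-io/Llama-Prompt-Guard-2-86M-onnx"
model = ORTModelForSequenceClassification.from_pretrained(repo, file_name="model.quant.onnx")
tokenizer = AutoTokenizer.from_pretrained(repo)

ds = load_dataset("jackhhao/jailbreak-classification", split="test")
y_true, y_pred = [], []
for row in ds:
    enc = tokenizer(row["prompt"], return_tensors="pt", truncation=True)
    logits = model(**enc).logits
    # Assumes class index 1 is the malicious/jailbreak label; verify
    # against model.config.id2label before relying on this mapping.
    y_pred.append(int(logits.argmax(dim=-1)))
    y_true.append(1 if row["type"] == "jailbreak" else 0)

print("accuracy:", accuracy_score(y_true, y_pred))
print("f1:", f1_score(y_true, y_pred))
```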
### GitHub Repository
You can find the full source code, CLI tools, and evaluation scripts in the official GitHub repository.
## 📄 License
This project is released under the Llama 4 Community License.
| Property | Details |
|----------|---------|
| Model Type | Text Classification |
| Training Data | Not specified in the original document |
| Supported Languages | English, French, German, Hindi, Italian, Portuguese, Spanish, Thai |
| Base Model | meta-llama/Llama-Prompt-Guard-2-86M |
| License | Llama 4 Community License |