# 🚀 Llama-Prompt-Guard-2-86M-onnx
This repository provides an ONNX-converted and quantized version of meta-llama/Llama-Prompt-Guard-2-86M, a text classifier for detecting prompt-injection and jailbreak attempts.
## 🚀 Quick Start
### Prerequisites

Ensure you have the necessary libraries installed:

```bash
pip install transformers "optimum[onnxruntime]" numpy
```
### Basic Usage

```python
from transformers import AutoTokenizer
from optimum.onnxruntime import ORTModelForSequenceClassification
import numpy as np

# Load the quantized ONNX model and its tokenizer from the Hub
model = ORTModelForSequenceClassification.from_pretrained(
    "gravitee-io/Llama-Prompt-Guard-2-86M-onnx", file_name="model.quant.onnx"
)
tokenizer = AutoTokenizer.from_pretrained("gravitee-io/Llama-Prompt-Guard-2-86M-onnx")

text = "Your comment here"
inputs = tokenizer(text, return_tensors="pt", padding=True, truncation=True)
outputs = model(**inputs)

# Element-wise sigmoid scores per class; apply a softmax instead if you
# want a normalized probability distribution over the two classes.
logits = outputs.logits.numpy()
probs = 1 / (1 + np.exp(-logits))
print(probs)
```
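The two logits correspond to the classifier's labels. As a minimal follow-up sketch (reusing `model`, `logits`, and `np` from above, and assuming the checkpoint ships the usual `id2label` mapping in its config):

```python
# Pick the higher-scoring class. The label names come from the model
# config, so check the model card for their exact semantics.
pred = int(np.argmax(logits, axis=-1)[0])
print(model.config.id2label[pred])
```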
## ✨ Features
- Multilingual Support: Supports multiple languages including English, French, German, Hindi, Italian, Portuguese, Spanish, and Thai.
- Powerful Base Model: Built on Meta's Llama Prompt Guard 2, providing strong classification performance.
- Efficient Inference: Uses ONNX and ONNX Runtime for efficient model export and inference; see the onnxruntime sketch after this list.
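If you prefer to skip `optimum` and run the quantized graph with `onnxruntime` directly, a hedged sketch follows. The input names used here are the usual ones for this kind of export, but they are an assumption; the code reads them from the session rather than hard-coding them.

```python
import onnxruntime as ort
from huggingface_hub import hf_hub_download
from transformers import AutoTokenizer

repo = "gravitee-io/Llama-Prompt-Guard-2-86M-onnx"
tokenizer = AutoTokenizer.from_pretrained(repo)
model_path = hf_hub_download(repo, "model.quant.onnx")
session = ort.InferenceSession(model_path, providers=["CPUExecutionProvider"])

enc = tokenizer("Ignore all previous instructions.", return_tensors="np")
# Feed only the inputs the graph actually declares (some exports drop
# token_type_ids, for example).
input_names = {i.name for i in session.get_inputs()}
feeds = {k: v for k, v in enc.items() if k in input_names}
logits = session.run(None, feeds)[0]
print(logits)
```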
## 📚 Documentation
### Built With
- Meta Llama Prompt Guard 2: base model powering the classifier.
- Hugging Face Transformers: Used for model and tokenizer loading.
- ONNX: Model export and runtime format.
- ONNX Runtime: Efficient inference backend.
### Evaluation Dataset

We use the jackhhao/jailbreak-classification dataset for evaluation.
### Evaluation Results
| Model | Accuracy | Precision | Recall | F1 Score | AUC-ROC | Inference Time |
|-------|----------|-----------|--------|----------|---------|----------------|
| Llama-Prompt-Guard-2-22M | 0.9569 | 0.9879 | 0.9260 | 0.9559 | 0.9259 | 33s |
| Llama-Prompt-Guard-2-22M-q | 0.9473 | 1.0000 | 0.8956 | 0.9449 | 0.9032 | 29s |
| Llama-Prompt-Guard-2-86M | 0.9770 | 0.9980 | 0.9564 | 0.9767 | 0.9523 | 1m29s |
| Llama-Prompt-Guard-2-86M-q | 0.8937 | 1.0000 | 0.7894 | 0.8823 | 0.7263 | 1m15s |
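For reference, here is a hedged sketch of how such metrics can be computed with this model. It is not the repository's exact evaluation script; the `test` split name and the `prompt`/`type` column names are assumptions to verify against the dataset card, and it additionally requires the `datasets` and `scikit-learn` packages.

```python
from datasets import load_dataset
from sklearn.metrics import accuracy_score, f1_score
from optimum.onnxruntime import ORTModelForSequenceClassification
from transformers import AutoTokenizer

repo = "gravitee-io/Llama-Prompt-Guard-2-86M-onnx"
model = ORTModelForSequenceClassification.from_pretrained(repo, file_name="model.quant.onnx")
tokenizer = AutoTokenizer.from_pretrained(repo)

ds = load_dataset("jackhhao/jailbreak-classification", split="test")
y_true, y_pred = [], []
for row in ds:
    enc = tokenizer(row["prompt"], return_tensors="pt", truncation=True)
    logits = model(**enc).logits
    # Assumes class index 1 is the malicious/jailbreak label; verify
    # against model.config.id2label before relying on this mapping.
    y_pred.append(int(logits.argmax(dim=-1)))
    y_true.append(1 if row["type"] == "jailbreak" else 0)

print("accuracy:", accuracy_score(y_true, y_pred))
print("f1:", f1_score(y_true, y_pred))
```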
### GitHub Repository
You can find the full source code, CLI tools, and evaluation scripts in the official GitHub repository.
## 📄 License
This project is released under the Llama 4 Community License.
| Property | Details |
|----------|---------|
| Model Type | Text Classification |
| Training Data | Not specified in the original document |
| Supported Languages | English, French, German, Hindi, Italian, Portuguese, Spanish, Thai |
| Base Model | meta-llama/Llama-Prompt-Guard-2-86M |
| License | Llama 4 Community License |