Llama-Phishsense-1B Open-Source Phishing Email Detection Model

Llama Phishsense 1B

Developed by AcuteShrewdSecurity

A phishing email detection model fine-tuned from Llama-Guard-3-1B, capable of efficiently identifying phishing attacks.

Text Classification

Transformers

English · #Phishing Email Detection · #LoRA Fine-tuning · #Enterprise-grade Protection

Release Time: 10/12/2024

Model Overview

Llama-Phishsense-1B is an AI model specifically designed for detecting phishing emails. By fine-tuning Llama-Guard-3-1B, it accurately classifies emails as phishing (TRUE) or non-phishing (FALSE).

Model Features

Efficient Phishing Detection

Capable of real-time automatic detection of phishing patterns in emails, providing high-precision classification results.

Lightweight Design

Utilizes LoRA fine-tuning technology, making the model compact and resource-friendly, suitable for various deployment environments.

Enterprise-grade Protection

Specially optimized for enterprise email environments, capable of identifying highly customized phishing attacks.

Model Capabilities

Phishing Email Identification

Text Classification

Security Threat Detection

Use Cases

Enterprise Security

Enterprise Email System Integration

Integrate the model into enterprise email systems as an additional layer of phishing protection.

Reduces phishing attack success rates and protects sensitive information

Personal Protection

Personal Email Security Scanning

Used to scan personal inboxes for suspicious emails.

Early identification of phishing attempts to protect personal data

🚀 Revolutionize Phishing Protections with the Shrewd's Llama-Phishsense-1B!

Phishing attacks are constantly evolving, posing threats to both businesses and individuals. The Shrewd's AcuteShrewdSecurity/Llama-Phishsense-1B is an AI-powered defense system that can proactively identify phishing threats and safeguard your inbox. It's a finetuned Llama-Guard-3-1B model, small enough for widespread use and trained to detect phishing.

PS: See the Launch Post and the paper.

🚀 Quick Start

Using the Llama-Phishsense-1B is as simple as running a few lines of Python code. You’ll need to load both the base model and the LoRA adapter, and you're ready to classify emails in seconds!

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

# Function to load the model and tokenizer
def load_model():
    tokenizer = AutoTokenizer.from_pretrained("AcuteShrewdSecurity/Llama-Phishsense-1B")
    base_model = AutoModelForCausalLM.from_pretrained("AcuteShrewdSecurity/Llama-Phishsense-1B")
    model_with_lora = PeftModel.from_pretrained(base_model, "AcuteShrewdSecurity/Llama-Phishsense-1B")
    
    # Move model to GPU if available
    if torch.cuda.is_available():
        model_with_lora = model_with_lora.to('cuda')
    
    return model_with_lora, tokenizer

# Function to make a single prediction
def predict_email(model, tokenizer, email_text):
    prompt = f"Classify the following text as phishing or not. Respond with 'TRUE' or 'FALSE':\n\n{email_text}\nAnswer:"
    inputs = tokenizer(prompt, return_tensors="pt")

    # Move inputs to GPU if available
    if torch.cuda.is_available():
        inputs = {key: value.to('cuda') for key, value in inputs.items()}

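    with torch.no_grad():
        # Greedy decoding: temperature has no effect when do_sample=False, so it is omitted
        output = model.generate(**inputs, max_new_tokens=5, do_sample=False)

    # Decode only the newly generated tokens, so an "Answer:" inside the email body cannot break parsing
    prompt_length = inputs["input_ids"].shape[1]
    response = tokenizer.decode(output[0][prompt_length:], skip_special_tokens=True).strip()
    return response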

# Load model and tokenizer
model, tokenizer = load_model()

# Example email text
email_text = "Urgent: Your account has been flagged for suspicious activity. Please log in immediately."
prediction = predict_email(model, tokenizer, email_text)
print(f"Model Prediction for the email: {prediction}")
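
The prediction comes back as free text, so downstream code usually needs to turn it into a boolean. The following is a minimal sketch of that step, not part of the original model card; the helper name `parse_prediction` is hypothetical, and it assumes the model answers with a leading 'TRUE' or 'FALSE' as described above.

```python
from typing import Optional


def parse_prediction(response: str) -> Optional[bool]:
    """Map the model's raw answer to True (phishing), False (non-phishing), or None.

    Hypothetical helper: assumes the answer starts with 'TRUE' or 'FALSE'.
    """
    tokens = response.strip().upper().split()
    if not tokens:
        return None  # empty generation
    first = tokens[0].strip(".,:;!'\"")  # tolerate trailing punctuation
    if first == "TRUE":
        return True
    if first == "FALSE":
        return False
    return None  # unexpected output; a caller might route this to manual review
```

Returning `None` for anything unexpected keeps a malformed generation from being silently treated as "safe".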

✨ Features

Why Phishing is a Growing Threat

Phishing is no longer just an individual concern; it's an enterprise-level threat. Many cyberattacks start with phishing emails aimed at stealing valuable data. Malicious actors create increasingly deceptive messages, making it difficult for even the most vigilant people to tell real from fraudulent emails. The consequences include billions in financial losses, compromised accounts, and reputational damage.

The Solution: AI-Powered Phishing Detection

Traditional security systems struggle to keep up with modern phishing tactics. The Llama-Phishsense-1B is designed to:

Automatically detect phishing patterns in real-time.
Protect your organization from costly breaches.
Empower people to confidently manage their inboxes, knowing they're protected.

Why You Should Use This Model

1. Protect Against Corporate Enterprise Phishing

In a corporate setting, phishing emails can seem legitimate and easily bypass traditional filters. Attackers target specific individuals, especially those in finance, HR, or IT. The AcuteShrewdSecurity/Llama-Phishsense-1B can be integrated into your corporate email system as an additional layer of protection:

Mitigate the risks of targeted phishing attacks.
Prevent unauthorized access to sensitive information.
Reduce downtime associated with recovering from successful phishing exploits.
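
As a rough illustration of what such an integration might look like, the sketch below parses a raw RFC 5322 message with Python's standard `email` package and hands the subject and plain-text body to a classifier. The function names `email_to_text` and `is_phishing` are hypothetical, and `classify` stands in for any callable that returns the model's 'TRUE'/'FALSE' string (for example, a wrapper around `predict_email` from the Quick Start). Real mail pipelines would also need to handle HTML-only messages, attachments, and headers, which this sketch ignores.

```python
from email import message_from_string, policy
from typing import Callable


def email_to_text(raw_message: str) -> str:
    """Extract the subject and plain-text body from a raw email message."""
    msg = message_from_string(raw_message, policy=policy.default)
    subject = msg.get("Subject", "")
    body_part = msg.get_body(preferencelist=("plain",))
    body = body_part.get_content() if body_part is not None else ""
    return f"Subject: {subject}\n\n{body.strip()}"


def is_phishing(raw_message: str, classify: Callable[[str], str]) -> bool:
    """Return True when the classifier's answer starts with 'TRUE' (phishing)."""
    answer = classify(email_to_text(raw_message))
    return answer.strip().upper().startswith("TRUE")
```

Passing the classifier in as a callable keeps the mail-parsing code independent of how the model is loaded or served.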

2. Individual Use Case

For individuals, protecting personal information is more important than ever. Phishing emails from seemingly legitimate services can slip through basic email filters. This model:

Identifies phishing attempts before you open the email.
Provides a clear 'TRUE' (phishing) or 'FALSE' (non-phishing) prediction for each email.
Gives peace of mind knowing your private data is secure.

3. Offer Phishing Protection as a Service

For security professionals and IT providers, integrating Llama-Phishsense-1B into your security offerings can provide clients with reliable, AI-driven protection:

Add this model to your existing cybersecurity stack.
Increase client satisfaction by offering a proven phishing detection system.
Help clients avoid costly breaches and maintain operational efficiency.

📚 Documentation

Model Description

The Llama-Phishsense-1B is a fine-tuned version of meta-llama/Llama-Guard-3-1B, specifically enhanced for phishing detection in corporate email environments. Through advanced LoRA-based fine-tuning, it classifies emails as either "TRUE" (phishing) or "FALSE" (non-phishing), offering lightweight yet powerful protection against email scams.

Key Features:

Base Model: meta-llama/Llama-Guard-3-1B and meta-llama/Llama-3.2-1B
LoRA Fine-tuning: Efficient adaptation using Low-Rank Adaptation for quick, resource-friendly deployment.
Task: Binary email classification—phishing (TRUE) or non-phishing (FALSE).
Dataset: A custom-tailored phishing email dataset, featuring real-world phishing and benign emails.
Model Size: 1 Billion parameters, ensuring robust performance without overburdening resources.
Architecture: Causal Language Model with LoRA-adapted layers for speed and efficiency.

Why Choose This Model?

Phishing is a leading cause of security breaches. The Llama-Phishsense-1B model is designed to address it:

Highly Accurate: The model has achieved outstanding results in real-world evaluations, with an F1-score of 0.99 on balanced datasets.
Fast and Efficient: LoRA fine-tuning trains only a small set of adapter weights, and the 1B-parameter base keeps inference costs low, so you get strong protection without slowing down your systems.
Accessible to Everyone: Whether you're an IT team or a solo email user, this tool is designed for easy integration and use.

Training and Fine-tuning

LoRA Configuration:

Rank: r=16
Alpha: lora_alpha=32
Dropout: lora_dropout=0.1
Adapted on the q_proj and v_proj transformer layers for efficient fine-tuning.
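
To give a feel for how small this adapter is, the sketch below estimates the number of trainable LoRA parameters for the configuration above. The layer dimensions (hidden size 2048, key/value projection width 512, 16 decoder layers) are assumptions based on the published Llama-3.2-1B architecture and are not stated in this document; treat the result as an estimate.

```python
def lora_param_count(in_features: int, out_features: int, rank: int) -> int:
    """LoRA adds two matrices per adapted layer: A (rank x in) and B (out x rank)."""
    return rank * (in_features + out_features)


RANK, ALPHA = 16, 32  # from the LoRA configuration above

# ASSUMED dimensions (Llama-3.2-1B style); not given in the model card.
HIDDEN = 2048      # input width of q_proj and v_proj, output width of q_proj
KV_DIM = 512       # output width of v_proj under grouped-query attention
NUM_LAYERS = 16

per_layer = lora_param_count(HIDDEN, HIDDEN, RANK) + lora_param_count(HIDDEN, KV_DIM, RANK)
total = per_layer * NUM_LAYERS
scaling = ALPHA / RANK  # LoRA scales the low-rank update by alpha / r

print(f"Trainable LoRA parameters: {total:,}")
print(f"LoRA scaling factor: {scaling}")
```

Under these assumptions the adapter holds roughly 1.7 million trainable parameters, a small fraction of the 1-billion-parameter base.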

Training Data:

The model was fine-tuned on a balanced dataset of phishing and non-phishing emails (30k each), selected from ealvaradob/phishing-dataset to ensure real-world applicability.
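
One simple way to draw a balanced subset like this is sketched below. It is an illustration only, not the authors' actual selection procedure: the function name `balanced_sample` and the record layout (a dict with a `label` field where 1 means phishing and 0 means benign) are assumptions.

```python
import random
from typing import Dict, List


def balanced_sample(records: List[Dict], n_per_class: int, seed: int = 0) -> List[Dict]:
    """Draw n_per_class phishing and n_per_class benign records, then shuffle.

    ASSUMPTION: each record has a 'label' field, 1 = phishing, 0 = benign.
    """
    rng = random.Random(seed)  # fixed seed for reproducibility
    phishing = [r for r in records if r["label"] == 1]
    benign = [r for r in records if r["label"] == 0]
    if min(len(phishing), len(benign)) < n_per_class:
        raise ValueError("Not enough examples in one of the classes")
    sample = rng.sample(phishing, n_per_class) + rng.sample(benign, n_per_class)
    rng.shuffle(sample)
    return sample
```

Calling `balanced_sample(records, 30_000)` would yield 60,000 examples with an even class split, matching the proportions described above.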

Optimizer:

AdamW Optimizer: Weight decay of 0.01 with a learning rate of 1e-3.

Training Configuration:

Mixed-precision (FP16): Enables faster training without sacrificing accuracy.
Gradient accumulation steps: 10.
Batch size: 10 per device.
Number of epochs: 10.
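
These settings imply an effective batch size and a number of optimizer steps, which the short calculation below works out. It assumes a single device and that all 60,000 examples (30k per class) are used for training; neither is stated in the model card.

```python
import math

NUM_EXAMPLES = 30_000 * 2   # balanced dataset, 30k per class
PER_DEVICE_BATCH = 10       # batch size per device
GRAD_ACCUM = 10             # gradient accumulation steps
EPOCHS = 10
NUM_DEVICES = 1             # ASSUMPTION: device count is not given in the model card

# Gradients from GRAD_ACCUM mini-batches are summed before each optimizer update.
effective_batch = PER_DEVICE_BATCH * GRAD_ACCUM * NUM_DEVICES
steps_per_epoch = math.ceil(NUM_EXAMPLES / effective_batch)
total_steps = steps_per_epoch * EPOCHS

print(f"Effective batch size: {effective_batch}")
print(f"Optimizer steps per epoch: {steps_per_epoch}")
print(f"Total optimizer steps: {total_steps}")
```

With more devices the effective batch grows proportionally and the step counts shrink accordingly.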

Performance (Before and After finetuning)

Our model has demonstrated its effectiveness across multiple datasets (evaluated on zefang-liu/phishing-email-dataset and on custom-created examples):

Metric	Base Model (meta-llama/Llama-Guard-3-1B)	Finetuned Model (AcuteShrewdSecurity/Llama-Phishsense-1B)	Performance Gain (Finetuned vs Base)
Accuracy	0.52	0.97	0.45
Precision	0.52	0.96	0.44
Recall	0.53	0.98	0.45

On the validation dataset (which includes custom expert-designed phishing cases), the model still performs admirably:

Metric	Base Model (meta-llama/Llama-Guard-3-1B)	Finetuned Model (AcuteShrewdSecurity/Llama-Phishsense-1B)	Performance Gain (Finetuned vs Base)
Accuracy	0.31	0.98	0.67
Precision	0.99	1.00	0.01
Recall	0.31	0.98	0.67
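
The tables report precision and recall but not F1. As a quick cross-check, the sketch below derives F1 (the harmonic mean of precision and recall) from the fine-tuned model's reported figures; the function name `f1_score` is just an illustrative helper, and the inputs are the rounded values from the two tables above.

```python
def f1_score(precision: float, recall: float) -> float:
    """F1 is the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Fine-tuned model, first table (precision 0.96, recall 0.98)
test_f1 = f1_score(0.96, 0.98)

# Fine-tuned model, validation table (precision 1.00, recall 0.98)
val_f1 = f1_score(1.00, 0.98)

print(f"F1 (first table):      {test_f1:.2f}")
print(f"F1 (validation table): {val_f1:.2f}")
```

The validation figures work out to an F1 of about 0.99, consistent with the F1-score quoted earlier, while the first table gives about 0.97.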

A comparison with some relevant models is shown in the figure below.

[Figure: comparison with relevant models]

The paper can be found here. Please send feedback to b1oo@shrewdsecurity.com.

📄 License

The license for this model is llama3.2.
