Spam Mail Classifier

Developed by Goodmotion

A text classification model fine-tuned from microsoft/Multilingual-MiniLM-L12-H384 that classifies email subjects as spam (SPAM) or non-spam (NOSPAM).

Text Classification

Transformers

Open Source License: Apache-2.0

Tags: #Multilingual Email Classification #Lightweight Transformer #Spam Email Detection

Downloads 943

Release Time: 12/9/2024

Model Overview

This model detects spam from email subject lines and works on multilingual text.

Model Features

Multilingual Support

Built on Multilingual-MiniLM, it detects spam in email subjects written in multiple languages.

Lightweight Model

Uses the MiniLM architecture (12 layers, hidden size 384) to reduce compute and memory requirements compared with larger Transformer models.

Easy to Use

Loads through the standard Transformers API (AutoTokenizer and AutoModelForSequenceClassification), so it can be integrated into existing systems quickly.

Model Capabilities

Text Classification

Spam Email Detection

Multilingual Text Processing

Use Cases

Email Management

Spam Email Filtering

Automatically identify and filter spam emails

Improves email processing efficiency and reduces spam interference

Email Classification System

Automatically classify emails as spam or normal

Optimizes email management processes

Security Protection

Phishing Email Detection

Flag potential phishing or fraudulent emails from their subject lines

Enhances email security
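
As a rough illustration of the filtering use case, the sketch below splits incoming subjects into an inbox and a spam folder. The `route_subjects` name, the `classify` argument, and the 0.9 threshold are assumptions made for this example and are not part of the model. `classify` stands in for any function that returns a (label, confidence) pair, and `keyword_stub` is a toy replacement for the model so the sketch runs without downloads.

```python
from typing import Callable, Iterable, List, Tuple

# Hypothetical type: any function mapping a subject line to (label, confidence).
Classifier = Callable[[str], Tuple[str, float]]


def route_subjects(
    subjects: Iterable[str],
    classify: Classifier,
    threshold: float = 0.9,  # assumed cut-off, tune it on your own mail
) -> Tuple[List[str], List[str]]:
    """Split subjects into (inbox, spam) using the classifier's verdict.

    A subject goes to spam only when the label is SPAM and the confidence
    reaches the threshold; everything else stays in the inbox.
    """
    inbox: List[str] = []
    spam: List[str] = []
    for subject in subjects:
        label, confidence = classify(subject)
        if label == "SPAM" and confidence >= threshold:
            spam.append(subject)
        else:
            inbox.append(subject)
    return inbox, spam


def keyword_stub(subject: str) -> Tuple[str, float]:
    """Toy stand-in for the model so the sketch runs offline."""
    if "prize" in subject.lower():
        return "SPAM", 0.97
    return "NOSPAM", 0.88


inbox, spam = route_subjects(
    ["Claim your free prize now!", "Meeting rescheduled to 3 PM"],
    keyword_stub,
)
print("Inbox:", inbox)
print("Spam:", spam)
```

Keeping the threshold explicit means low-confidence SPAM verdicts stay in the inbox, which is usually the safer failure mode for a mail filter.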

🚀 SPAM Mail Classifier

This model is fine-tuned from microsoft/Multilingual-MiniLM-L12-H384 to classify email subjects as SPAM or NOSPAM, providing multilingual spam detection on subject lines.

✨ Features

Fine-tuned from microsoft/Multilingual-MiniLM-L12-H384 for text classification.
Distinguishes two classes: SPAM and NOSPAM.
Supports multilingual text, making it suitable for a wide range of users.

📦 Installation

To use this model, install the transformers library. The examples below also need PyTorch, because they request PyTorch tensors (return_tensors="pt"). Both can be installed with the following command:

pip install transformers torch
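
Before loading the model, it can help to confirm that the required packages are importable. The helper below is a small sketch using only the standard library. The `missing_packages` name is made up for this example, and `torch` is included on the assumption that you run the PyTorch-based examples in this document.

```python
import importlib.util
from typing import Iterable, List


def missing_packages(names: Iterable[str]) -> List[str]:
    """Return the top-level packages from `names` that cannot be imported."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# The usage examples rely on transformers and, for return_tensors="pt", torch.
missing = missing_packages(["transformers", "torch"])
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are installed.")
```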

💻 Usage Examples

Basic Usage

from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load the tokenizer and the fine-tuned classifier from the Hugging Face Hub
model_name = "Goodmotion/spam-mail-classifier"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# Tokenize one subject line and return PyTorch tensors
text = "Félicitations ! Vous avez gagné un iPhone."
inputs = tokenizer(text, return_tensors="pt")

# Forward pass; logits has shape (1, 2), one raw score per class
outputs = model(**inputs)
print(outputs.logits)
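
The `pipeline` helper in transformers offers a shorter path to the same model. The sketch below wraps it and normalizes the output. Two assumptions are worth flagging: the index order (0 = NOSPAM, 1 = SPAM) is taken from the label list in the advanced example, and the fallback names `LABEL_0` and `LABEL_1` are what transformers emits when a model configuration carries no custom label names, which may or may not be the case for this model. All function names here are illustrative.

```python
from typing import Any, Dict, Tuple

# Assumed mapping for generic label names; index order taken from the
# labels = ["NOSPAM", "SPAM"] list used in the advanced example.
GENERIC_LABELS = {"LABEL_0": "NOSPAM", "LABEL_1": "SPAM"}


def normalize_prediction(prediction: Dict[str, Any]) -> Tuple[str, float]:
    """Turn one pipeline result ({"label": ..., "score": ...}) into (label, score)."""
    raw_label = str(prediction["label"])
    label = GENERIC_LABELS.get(raw_label, raw_label.upper())
    return label, float(prediction["score"])


def load_classifier(model_name: str = "Goodmotion/spam-mail-classifier"):
    """Build a text-classification pipeline (downloads the model on first use)."""
    from transformers import pipeline  # imported lazily so the helpers stay light

    return pipeline("text-classification", model=model_name)


def classify_subject(classifier, subject: str) -> Tuple[str, float]:
    """Classify a single subject line with a pipeline-like callable."""
    prediction = classifier(subject)[0]  # a pipeline returns one dict per input
    return normalize_prediction(prediction)
```

With the model available, `classify_subject(load_classifier(), "Félicitations ! Vous avez gagné un iPhone.")` returns a label and a score between 0 and 1.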

Advanced Usage

import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_name = "Goodmotion/spam-mail-classifier"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

texts = [
    "Join us for a webinar on AI innovations",
    "Urgent: Verify your account immediately.",
    "Meeting rescheduled to 3 PM",
    "Happy Birthday!",
    "Limited time offer: Act now!",
    "Claim your free prize now!",
    "You have unclaimed rewards waiting!",
    "Weekly newsletter from Tech World",
    "Update on the project status",
    "Lunch tomorrow at 12:30?",
    "Get rich quick with this amazing opportunity!",
    "Invoice for your recent purchase",
    "Don't forget: Gym session at 6 AM",
    "bonjour comment allez vous ?",
    "Documents suite à notre rendez-vous",
    "Valentin Dupond mentioned you in a comment",
    "Bolt x Supabase = 🤯",
    "Modification site web de la société",
    "Image de mise en avant sur les articles",
    "Bring new visitors to your site",
    "Le Cloud Éthique sans bullshit",
    "Remix Newsletter #25: React Router v7",
    "Votre essai auprès de X va bientôt prendre fin",
    "Introducing a Google Docs integration, styles and more in Claude.ai",
    "Carte de crédit sur le point d’expirer sur Cloudflare",
]
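# Tokenize the whole batch; pad to the longest subject and cap at 128 tokens
inputs = tokenizer(texts, padding=True, truncation=True, max_length=128, return_tensors="pt")

# Inference only: disable gradient tracking to save memory
with torch.no_grad():
    outputs = model(**inputs)

# Convert the logits to probabilities with softmax
logits = outputs.logits
probabilities = torch.softmax(logits, dim=1)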

# Decode the classes for each text
labels = ["NOSPAM", "SPAM"]  # Mapping of indices to labels
results = [
    {"text": text, "label": labels[torch.argmax(prob).item()], "confidence": prob.max().item()}
    for text, prob in zip(texts, probabilities)
]

# Display the results
for result in results:
    print(f"Text: {result['text']}")
    print(f"Result: {result['label']} (Confidence: {result['confidence']:.2%})\n")
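
The advanced example tokenizes every subject in a single call, which is fine for a few dozen strings but can exhaust memory on a large mailbox. One common workaround is to process fixed-size chunks. The sketch below shows only the chunking logic in plain Python. `batched`, `classify_in_batches`, and the batch size of 32 are assumptions for illustration, `classify_batch` stands in for a function that runs the tokenizer and model on one list of texts, and `exclamation_stub` is a toy replacement for the model.

```python
from typing import Callable, Iterator, List, Sequence, Tuple

Prediction = Tuple[str, float]  # (label, confidence)


def batched(items: Sequence[str], size: int) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of `items` with at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def classify_in_batches(
    texts: Sequence[str],
    classify_batch: Callable[[Sequence[str]], List[Prediction]],
    batch_size: int = 32,  # assumed default, adjust to available memory
) -> List[Prediction]:
    """Run `classify_batch` chunk by chunk and keep results in input order."""
    results: List[Prediction] = []
    for chunk in batched(texts, batch_size):
        results.extend(classify_batch(chunk))
    return results


def exclamation_stub(chunk: Sequence[str]) -> List[Prediction]:
    """Toy stand-in for the model: flags subjects that end with '!'."""
    return [("SPAM", 0.9) if text.endswith("!") else ("NOSPAM", 0.9) for text in chunk]


subjects = ["Claim your free prize now!", "Lunch tomorrow at 12:30?", "Happy Birthday!"]
print(classify_in_batches(subjects, exclamation_stub, batch_size=2))
```

Because results are appended chunk by chunk, the output list lines up index for index with the input list.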

📚 Documentation

Model Details

Property	Details
Model Type	Fine-tuned from microsoft/Multilingual-MiniLM-L12-H384
Fine-tuned for	Text classification
Number of classes	2 (SPAM, NOSPAM)
Languages	Multilingual
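
Properties such as the number of classes can also be read from the model configuration. The sketch below assumes the configuration object exposes `num_labels` and `id2label`, as transformers configuration classes do. Whether this model's `id2label` holds SPAM and NOSPAM or generic names is not stated in the source, so the helper only reports what it finds. `describe_labels` is a name made up for this example, and the stand-in configuration is hypothetical data.

```python
from types import SimpleNamespace
from typing import Dict


def describe_labels(config) -> Dict[int, str]:
    """Return the index-to-label mapping, checking it matches num_labels."""
    mapping = {int(index): str(name) for index, name in config.id2label.items()}
    if len(mapping) != config.num_labels:
        raise ValueError("id2label does not cover every class")
    return mapping


# Stand-in config so the sketch runs offline; with the real model, use
#   from transformers import AutoConfig
#   config = AutoConfig.from_pretrained("Goodmotion/spam-mail-classifier")
example_config = SimpleNamespace(num_labels=2, id2label={0: "NOSPAM", 1: "SPAM"})
print(describe_labels(example_config))
```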

📄 License

This model is licensed under the Apache-2.0 license.
