🚀 SPAM Mail Classifier
This model is fine-tuned from microsoft/Multilingual-MiniLM-L12-H384 to classify email subjects as SPAM or NOSPAM, providing multilingual spam detection on email subject lines.
✨ Features
- Fine-tuned from microsoft/Multilingual-MiniLM-L12-H384 for text classification.
- Distinguishes between two classes: SPAM and NOSPAM.
- Supports multilingual text; the usage examples include both English and French subject lines.
📦 Installation
To use this model, install the transformers library together with PyTorch, which the usage examples rely on (they import torch and request PyTorch tensors):
pip install transformers torch
💻 Usage Examples
Basic Usage
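from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load the tokenizer and the fine-tuned classifier from the Hugging Face Hub
model_name = "Goodmotion/spam-mail-classifier"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# Tokenize a single email subject (French: "Congratulations! You won an iPhone.")
text = "Félicitations ! Vous avez gagné un iPhone."
inputs = tokenizer(text, return_tensors="pt")

# Run the model and print the raw logits (one score per class)
outputs = model(**inputs)
print(outputs.logits)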
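The Basic Usage example stops at raw logits. As a minimal sketch of how one row of logits could be turned into a label and a confidence value, the helper below applies a softmax in plain Python. The function name `logits_to_prediction` and the sample logits are hypothetical, and the label order (index 0 = NOSPAM, index 1 = SPAM) is assumed from the `labels` list in the Advanced Usage example.

```python
import math

# Assumed label order, taken from the Advanced Usage example:
# index 0 = NOSPAM, index 1 = SPAM.
LABELS = ["NOSPAM", "SPAM"]


def logits_to_prediction(logits):
    """Convert one row of raw logits into a (label, confidence) pair."""
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Pick the index of the highest probability.
    best = max(range(len(probs)), key=probs.__getitem__)
    return LABELS[best], probs[best]


# Hypothetical logits for illustration only; with the real model these
# would come from outputs.logits[0].tolist().
label, confidence = logits_to_prediction([-1.2, 2.3])
print(f"{label} ({confidence:.2%})")
```

With the real model, `torch.softmax` does the same job directly on the tensor, as the Advanced Usage example shows.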
Advanced Usage
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification
model_name = "Goodmotion/spam-mail-classifier"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
texts = [
    'Join us for a webinar on AI innovations',
    'Urgent: Verify your account immediately.',
    'Meeting rescheduled to 3 PM',
    'Happy Birthday!',
    'Limited time offer: Act now!',
    'Join us for a webinar on AI innovations',
    'Claim your free prize now!',
    'You have unclaimed rewards waiting!',
    'Weekly newsletter from Tech World',
    'Update on the project status',
    'Lunch tomorrow at 12:30?',
    'Get rich quick with this amazing opportunity!',
    'Invoice for your recent purchase',
    'Don\'t forget: Gym session at 6 AM',
    'Join us for a webinar on AI innovations',
    'bonjour comment allez vous ?',
    'Documents suite à notre rendez-vous',
    'Valentin Dupond mentioned you in a comment',
    'Bolt x Supabase = 🤯',
    'Modification site web de la société',
    'Image de mise en avant sur les articles',
    'Bring new visitors to your site',
    'Le Cloud Éthique sans bullshit',
    'Remix Newsletter #25: React Router v7',
    'Votre essai auprès de X va bientôt prendre fin',
    'Introducing a Google Docs integration, styles and more in Claude.ai',
    'Carte de crédit sur le point d’expirer sur Cloudflare'
]
# Tokenize the whole batch, padding to a common length and truncating long subjects
inputs = tokenizer(texts, padding=True, truncation=True, max_length=128, return_tensors="pt")

# Inference only, so gradient tracking is disabled
with torch.no_grad():
    outputs = model(**inputs)

# Convert logits to per-class probabilities
logits = outputs.logits
probabilities = torch.softmax(logits, dim=1)

# Class index 0 maps to NOSPAM, index 1 to SPAM
labels = ["NOSPAM", "SPAM"]
results = [
    {"text": text, "label": labels[torch.argmax(prob).item()], "confidence": prob.max().item()}
    for text, prob in zip(texts, probabilities)
]

for result in results:
    print(f"Text: {result['text']}")
    print(f"Result: {result['label']} (Confidence: {result['confidence']:.2%})\n")
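Each entry in `results` carries a confidence value, so low-confidence predictions can be set aside for manual review instead of being acted on automatically. The sketch below shows one possible way to do that. The `triage` helper, the 0.90 threshold, and the sample confidence values are all hypothetical and are not part of the model or of the transformers library.

```python
def triage(results, min_confidence=0.90):
    """Split classifier results into spam, nospam, and review buckets.

    `results` is a list of dicts shaped like the ones built in the
    Advanced Usage example: {"text", "label", "confidence"}.
    The 0.90 default is an arbitrary illustrative threshold.
    """
    buckets = {"spam": [], "nospam": [], "review": []}
    for result in results:
        if result["confidence"] < min_confidence:
            # Not confident enough either way: leave it for a human.
            buckets["review"].append(result)
        elif result["label"] == "SPAM":
            buckets["spam"].append(result)
        else:
            buckets["nospam"].append(result)
    return buckets


# Hypothetical results for illustration only; these confidence values
# are made up and are not actual model output.
sample = [
    {"text": "Claim your free prize now!", "label": "SPAM", "confidence": 0.98},
    {"text": "Meeting rescheduled to 3 PM", "label": "NOSPAM", "confidence": 0.95},
    {"text": "Bring new visitors to your site", "label": "SPAM", "confidence": 0.62},
]

buckets = triage(sample)
for name, items in buckets.items():
    print(name, [item["text"] for item in items])
```

A suitable threshold depends on how costly a misclassified email is for the application, so it would need to be tuned on real data.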
📚 Documentation
Model Details
| Property | Details |
| --- | --- |
| Model Type | Fine-tuned from microsoft/Multilingual-MiniLM-L12-H384 |
| Fine-tuned for | Text classification |
| Number of classes | 2 (SPAM, NOSPAM) |
| Languages | Multilingual |
📄 License
This model is licensed under the Apache-2.0 license.