NeuroSpaceX/ruSpamNS_big
A binary classification model for spam detection in Russian Telegram messages, based on the DeepPavlov/distilrubert-small-cased-conversational model.
Quick Start
The ruSpamNS_big model classifies Russian Telegram messages as spam or not spam. It is intended for messages from chats, channels, and groups.
- Supported Platforms: Telegram (Chats, Channels, Groups)
- Base Model: ruBERT, fine-tuned on Russian spam detection datasets
- Number of Samples: 4 million Russian Telegram messages
- Author: @NeuroSpaceX
- Telegram Bot: @ruSpamNS_bot
⨠Features
- High - Performance Spam Detection: Capable of accurately identifying spam messages in Russian Telegram conversations.
- Fine - Tuned Model: Fine - tuned on a large amount of Russian Telegram data to improve detection accuracy.
Usage Examples
Basic Usage
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Placeholder: replace with your own Hugging Face access token
hf_token = "hf_your_personal_token"
model_name = "NeuroSpaceX/ruSpamNS_big"

tokenizer = AutoTokenizer.from_pretrained(model_name, token=hf_token)
model = AutoModelForSequenceClassification.from_pretrained(
    model_name,
    token=hf_token,
    attn_implementation="eager"
)
model.eval()  # inference mode: disables dropout

# Sample message: "Hi, we need employees for remote work, 300 dollars a week"
text = "Привет, нужны сотрудники на удаленку, 300 долларов в неделю"
inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True)

# No gradients are needed for inference
with torch.no_grad():
    logits = model(**inputs).logits

# .item() requires a single output logit; sigmoid maps it to a spam probability
spam_score = torch.sigmoid(logits).item()
is_spam = spam_score >= 0.5
print(f"Is spam: {is_spam} (Spam score: {spam_score:.2%})")
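The Basic Usage example scores one message and applies a fixed 0.5 cutoff. A minimal sketch of how the same steps might be extended to several messages at once follows. The helper names (`sigmoid`, `batched`, `score_messages`), the default batch size, and the adjustable `threshold` are illustrative assumptions rather than part of the model's published interface, and the sketch assumes, as the example above implies, that the model emits one logit per message.

```python
import math
from typing import Iterator, List, Sequence, Tuple


def sigmoid(logit: float) -> float:
    """Map a raw logit to a probability, written to avoid overflow."""
    if logit >= 0:
        return 1.0 / (1.0 + math.exp(-logit))
    z = math.exp(logit)
    return z / (1.0 + z)


def batched(items: Sequence[str], batch_size: int) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be >= 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def score_messages(
    texts: Sequence[str],
    tokenizer,
    model,
    threshold: float = 0.5,   # assumption: same cutoff as the basic example
    batch_size: int = 16,     # assumption: arbitrary default
) -> List[Tuple[float, bool]]:
    """Return (spam_score, is_spam) for each message, in input order.

    Assumes the model returns logits of shape (batch, 1).
    """
    import torch  # imported here so the helpers above work without torch

    results: List[Tuple[float, bool]] = []
    for batch in batched(texts, batch_size):
        inputs = tokenizer(
            list(batch), return_tensors="pt", truncation=True, padding=True
        )
        with torch.no_grad():
            logits = model(**inputs).logits
        # Flatten (batch, 1) to one logit per message
        for logit in logits.reshape(-1).tolist():
            score = sigmoid(logit)
            results.append((score, score >= threshold))
    return results
```

With the `tokenizer` and `model` objects from the basic example, `score_messages(["...", "..."], tokenizer, model, threshold=0.8)` would return one `(score, is_spam)` pair per message. Raising the threshold trades missed spam for fewer false positives; a suitable value depends on the chat being moderated.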
Documentation
The model is fine-tuned on the NeuroSpaceX/v17 dataset, which contains a large number of Russian Telegram messages. It can be used to detect spam in Telegram scenarios such as:
- Chat messages
- Channel posts
- Group messages
License
This model is licensed under the CC-BY-NC-ND-3.0 license.
- Attribution: You must give appropriate credit, provide a link to the license, and indicate if changes were made.
- Non-Commercial: You may not use the material for commercial purposes.
- No Derivatives: If you remix, transform, or build upon the material, you may not distribute the modified material.
- ShareAlike: If you distribute the material, you must license it under the same terms as the original.
© NeuroSpaceX 2025
Important Note
To use this model, you must first obtain permission from @NeuroSpaceX. You will need to provide information such as your full name and intended use, and agree to the terms of use and attribution requirements.
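The usage example passes a Hugging Face token when loading the model, so the token presumably has to belong to an account that has been granted access. One way to keep that token out of source code is to read it from an environment variable. The sketch below assumes the variable is named `HF_TOKEN`; the helper `read_hf_token` is a hypothetical name introduced for illustration.

```python
import os
from typing import Mapping, Optional


def read_hf_token(
    env: Optional[Mapping[str, str]] = None,
    var_name: str = "HF_TOKEN",  # assumption: variable name chosen for this sketch
) -> str:
    """Return the access token from the environment, or raise a clear error."""
    source = os.environ if env is None else env
    token = source.get(var_name, "").strip()
    if not token:
        raise RuntimeError(
            f"Set {var_name} to a Hugging Face access token for an account "
            "that has been granted access to the model."
        )
    return token


# Usage (requires approved access to the model):
# from transformers import AutoTokenizer
# tokenizer = AutoTokenizer.from_pretrained(
#     "NeuroSpaceX/ruSpamNS_big", token=read_hf_token()
# )
```

Failing early with an explicit message makes a missing token easier to diagnose than an authorization error raised during download.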
Usage Tip
For best results, use Russian-language input text taken from Telegram, which matches the data the model was fine-tuned on.
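One rough way to act on this tip is to check, before calling the model, that a message is written mostly in Cyrillic. The sketch below is a heuristic under stated assumptions: the function names and the 0.5 cutoff are illustrative, and a script check cannot tell Russian apart from other languages written in Cyrillic (Ukrainian, Bulgarian, and others).

```python
def cyrillic_ratio(text: str) -> float:
    """Return the share of alphabetic characters in the basic Cyrillic block."""
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return 0.0  # no letters at all (digits, emoji, punctuation only)
    cyrillic = sum(1 for ch in letters if "\u0400" <= ch <= "\u04FF")
    return cyrillic / len(letters)


def looks_russian(text: str, min_ratio: float = 0.5) -> bool:
    """Heuristic pre-check; min_ratio is an assumed, untuned cutoff."""
    return cyrillic_ratio(text) >= min_ratio
```

Messages that fail the check could be skipped or routed to a different filter instead of being scored by the model.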