twitter-bitcoin-spam-detection open-source model - Easily identify Bitcoin-related spam/robot tweets

Twitter Bitcoin Spam Detection

Developed by sandiumenge

A BERTweet-base fine-tuned classification model for identifying spam/bot tweets related to Bitcoin

Text Classification

Transformers

Open Source License: MIT #Bitcoin Tweet Detection #Spam Filtering #BERTweet Fine-tuning

Downloads 95

Release Time: 4/9/2025

Model Overview

This model is specifically designed to detect spam content in tweets related to Bitcoin or cryptocurrencies, capable of distinguishing between 'human', 'spam', and 'bot' tweets, addressing the bias of general spam detection models against Bitcoin-related content.

Model Features

Bitcoin Domain Specialization

Optimized for Bitcoin-related tweets to avoid misjudgment of cryptocurrency content by general models

Three-class Classification Capability

Can distinguish between normal human tweets, spam content, and bot-generated content

Integrated Tweet Preprocessing

Supports standardized processing of tweet-specific elements like user mentions and links

Model Capabilities

Text Classification

Spam Content Detection

Tweet Content Analysis

Use Cases

Social Media Management

Cryptocurrency Community Management

Automatically filter spam/scam content in Bitcoin discussion areas

Accuracy 87.55%, F1-score 87.67%

Security Monitoring

Detect tweets related to phishing websites and scam information

Precision 87.92%
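
The filtering use cases above could be sketched roughly as follows. This is a minimal illustration, not part of the original model card: `filter_tweets`, `stub_classifier`, and the `min_score` threshold are hypothetical names and values chosen for the example. The stub stands in for the real classifier so the sketch runs offline; in practice `classify` would be the `pipeline(...)` object from the usage example, which returns one `{'label': ..., 'score': ...}` dictionary per input text.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# A prediction in the format shown in the usage example: {'label': ..., 'score': ...}
Prediction = Dict[str, object]


def filter_tweets(
    tweets: Sequence[str],
    classify: Callable[[List[str]], List[Prediction]],
    blocked_labels: Tuple[str, ...] = ("spam", "bot"),
    min_score: float = 0.8,  # hypothetical threshold; tune it on your own data
) -> Tuple[List[str], List[str]]:
    """Split tweets into (kept, flagged) using the predicted label and score."""
    predictions = classify(list(tweets))
    kept: List[str] = []
    flagged: List[str] = []
    for tweet, pred in zip(tweets, predictions):
        # Flag only confident predictions of a blocked label; keep everything else.
        is_blocked = pred["label"] in blocked_labels and float(pred["score"]) >= min_score
        (flagged if is_blocked else kept).append(tweet)
    return kept, flagged


def stub_classifier(texts: List[str]) -> List[Prediction]:
    """Hypothetical stand-in for the real pipeline so the sketch runs offline."""
    return [
        {"label": "spam", "score": 0.98} if "free" in text.lower()
        else {"label": "human", "score": 0.91}
        for text in texts
    ]


kept, flagged = filter_tweets(
    ["Claim your FREE BTC now!", "Bitcoin closed higher today."],
    stub_classifier,
)
print(kept)     # ['Bitcoin closed higher today.']
print(flagged)  # ['Claim your FREE BTC now!']
```

Keeping low-confidence predictions instead of dropping them is a design choice for this sketch: with a reported accuracy of about 87.5%, some misclassifications are expected, so a moderation workflow may prefer to route uncertain tweets to human review.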

🚀 Twitter Bitcoin-related spam detection

This model classifies Bitcoin- or crypto-related tweets as "human", "spam", or "bot", and aims to remove the bias that general-purpose spam detectors show against such tweets.

🚀 Quick Start

This model classifies tweets about Bitcoin or crypto topics as "human", "spam", or "bot". Many spam detection models already exist, but they tend to label Bitcoin-related tweets as "spam" almost by default, because such tweets are so often associated with phishing sites or scams. This model is trained on a Bitcoin-related dataset to counter that bias, so it can work effectively with Bitcoin-related tweets.

The model is a fine-tuned version of [vinai/bertweet-base](https://huggingface.co/vinai/bertweet-base), a RoBERTa-based model pre-trained on 850M English Tweets, and it is trained for spam classification on a Bitcoin-related dataset.

✨ Features

Classify Bitcoin- or crypto-related tweets into "human", "spam", or "bot".
Reduce the bias in classifying Bitcoin-related tweets.
Based on a fine-tuned version of [vinai/bertweet-base](https://huggingface.co/vinai/bertweet-base).

💻 Usage Examples

Basic Usage

BERTweet was trained on normalized tweets such as:

tweet = "DHEC confirms HTTPURL via @USER :crying_face:"

For better results, normalize the text before applying spam detection. Normalization converts user mentions and web/URL links into the special tokens @USER and HTTPURL, respectively, and applies a few other preprocessing steps. To do so, copy or download the TweetNormalizer provided in the BERTweet GitHub repository:

# Run in a notebook: the leading "!" executes a shell command
!git clone https://github.com/VinAIResearch/BERTweet.git
!pip install emoji

from transformers import pipeline
import sys

# Optional, to improve accuracy: make TweetNormalizer importable
sys.path.append('/content/BERTweet')  # Or whichever directory the repository was cloned into
from TweetNormalizer import normalizeTweet

classifier = pipeline("text-classification",
                      model="sandiumenge/twitter-bitcoin-spam-detection",
                      tokenizer="sandiumenge/twitter-bitcoin-spam-detection",
                      truncation=True,
                      padding=True,
                      max_length=128
)

tweet = "I'm winning iPhone XS，BTC，ETH and other Awards. Join with us!@freecoinhunt https://t.co/VIUwLmdy4n"
normalized_tweet = normalizeTweet(tweet)
# "I 'm winning iPhone XS ， BTC ， ETH and other Awards . Join with us ! @USER HTTPURL"

print(classifier(normalized_tweet))

>> [{'label': 'spam', 'score': 0.9803344011306763}]
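
If cloning the BERTweet repository is not an option, the two substitutions described above (mentions to @USER, links to HTTPURL) can be approximated with regular expressions. The sketch below is a simplified stand-in, not the official `normalizeTweet`: it skips tokenization, emoji conversion, and the other preprocessing steps, so its output differs from the official normalizer's. The function name `simple_normalize` and both patterns are assumptions made for this example.

```python
import re

# Hypothetical patterns for this sketch; the official TweetNormalizer works differently.
URL_RE = re.compile(r"(?:https?://|www\.)\S+")
MENTION_RE = re.compile(r"@\w+")


def simple_normalize(tweet: str) -> str:
    """Replace links with HTTPURL and user mentions with @USER, then tidy whitespace."""
    text = URL_RE.sub(" HTTPURL ", tweet)    # links first, so URLs containing "@" stay intact
    text = MENTION_RE.sub(" @USER ", text)   # then user mentions
    return " ".join(text.split())            # collapse the padding added above


print(simple_normalize("Join with us!@freecoinhunt https://t.co/VIUwLmdy4n"))
# Join with us! @USER HTTPURL
```

Note that this pattern would also rewrite the domain part of an email address, so the official normalizer remains the better choice whenever it is available.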

Evaluation Results

The model achieves the following results on the evaluation set:

Loss: 0.4793
Accuracy: 0.8755
F1: 0.8767
Precision: 0.8792
Recall: 0.8755
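
The reported recall (0.8755) is identical to the reported accuracy, which is what support-weighted averaging over the three classes would produce. Micro-averaging would also make precision and F1 equal to accuracy, and they are not. Whether the author actually used weighted averaging is an assumption, since the model card does not say. The sketch below computes the four metrics on a made-up toy example (the labels are illustrative, not the model's evaluation data) to show why weighted recall and accuracy coincide.

```python
from collections import Counter
from typing import Dict, Sequence


def weighted_metrics(y_true: Sequence[str], y_pred: Sequence[str]) -> Dict[str, float]:
    """Accuracy plus support-weighted precision, recall, and F1 (assumed averaging mode)."""
    total = len(y_true)
    support = Counter(y_true)  # number of true examples per class
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / total

    precision = recall = f1 = 0.0
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        predicted = sum(p == label for p in y_pred)
        actual = support[label]
        p_l = tp / predicted if predicted else 0.0
        r_l = tp / actual if actual else 0.0
        f_l = 2 * p_l * r_l / (p_l + r_l) if (p_l + r_l) else 0.0
        weight = actual / total  # each class contributes in proportion to its support
        precision += weight * p_l
        recall += weight * r_l
        f1 += weight * f_l

    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Toy labels for illustration only
y_true = ["human", "human", "spam", "spam", "bot", "bot"]
y_pred = ["human", "spam", "spam", "spam", "bot", "human"]
metrics = weighted_metrics(y_true, y_pred)
print(metrics)
```

Weighted recall sums (support / total) × (true positives / support) over the classes, which simplifies to the total number of correct predictions divided by the total number of examples. That is exactly the accuracy.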

![image/png](https://cdn-uploads.huggingface.co/production/uploads/67d83fa60947234bc71f8125/6rtAypyAAs5YxDj0hl0D7.png)

📄 License

This project is licensed under the MIT license.

Property	Details
Model Type	Fine-tuned version of [vinai/bertweet-base](https://huggingface.co/vinai/bertweet-base)
Training Data	[sandiumenge/bitcoin-tweets-spam-emotion-sentiment](sandiumenge/bitcoin-tweets-spam-emotion-sentiment)
Metrics	accuracy, f1, precision, recall
Pipeline Tag	text-classification
Model Name	twitter-bitcoin-spam-detection
