Virtus Open-Source Binary Classification Model - Efficiently Detect Deepfake Images with an Accuracy of Up to 99.2%

Virtus

Developed by agasta

A Vision Transformer-based binary classification model specifically designed for detecting deepfake images, with an accuracy rate of 99.2%

Image Classification

Transformers

Open Source License: MIT #Deepfake Detection #High-Precision Classification #Vision Transformer

Downloads 970

Release Time: 4/14/2025

Model Overview

Virtus is a fine-tuned Vision Transformer model specifically designed to distinguish between real and deepfake images. The model was trained on a balanced dataset containing 190,000 images and achieves extremely high detection accuracy.

Model Features

High Accuracy

Achieves 99.2% accuracy on the held-out test split, effectively identifying deepfake images

Balanced Dataset

Trained on a class-balanced dataset of roughly 190,000 images so that neither class dominates training

Data Augmentation

Utilizes various data augmentation techniques such as random rotation and sharpness adjustment to enhance generalization

Distilled Architecture

Based on the distilled version of Vision Transformer (DeiT) architecture, combining efficiency with high performance

Model Capabilities

Image Classification

Deepfake Detection

Facial Authenticity Analysis

Use Cases

Security Detection

Social Media Content Moderation

Automatically identifies deepfake images on social media (reported test accuracy: 99.2%)

Identity Verification Systems

Serves as an additional verification layer for biometric systems

Education & Research

Digital Media Literacy Tool

Helps students identify synthetic media

🚀 Virtus

Virtus is a fine-tuned Vision Transformer (ViT) model tailored for binary image classification. It is specifically designed to differentiate between real and deepfake images, achieving an accuracy of approximately 99.2% on a balanced dataset of over 190,000 images.

🚀 Quick Start

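from transformers import AutoImageProcessor, AutoModelForImageClassification
from PIL import Image
import torch

# Load the fine-tuned model and its matching image processor from the Hub
model = AutoModelForImageClassification.from_pretrained("agasta/virtus")
processor = AutoImageProcessor.from_pretrained("agasta/virtus")

# Convert to RGB so grayscale or RGBA files match the 3-channel input
image = Image.open("path_to_image.jpg").convert("RGB")
inputs = processor(images=image, return_tensors="pt")

# Inference only, so skip gradient tracking
with torch.no_grad():
    outputs = model(**inputs)

# Pick the highest-scoring class and map its index to a label
predicted_class = outputs.logits.argmax(-1).item()
print(model.config.id2label[predicted_class])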
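
The Quick Start snippet prints only the winning label. When a score is also useful, the logits can be passed through a softmax to obtain class probabilities. The following is a minimal standard-library sketch, not part of the original README; the `example_id2label` mapping and the example logits are hypothetical, and the real mapping should be read from `model.config.id2label`.

```python
import math

def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    # Subtract the largest logit first for numerical stability.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def top_prediction(logits, id2label):
    """Return (label, probability) for the highest-scoring class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return id2label[best], probs[best]

# Hypothetical mapping and logits, for illustration only.
# With the real model, one would use:
#   logits = outputs.logits[0].tolist()
#   id2label = model.config.id2label
example_id2label = {0: "Real", 1: "Fake"}
label, prob = top_prediction([2.0, -1.0], example_id2label)
print(f"{label} ({prob:.3f})")
```

The probability is only as reliable as the model's calibration on the data at hand, so it is best treated as a relative score.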

✨ Features

High Accuracy: Achieves ~99.2% accuracy on a balanced dataset of over 190,000 images.
Binary Classification: Specifically trained for distinguishing between real and deepfake images.
Versatile Use: Can be deployed in image analysis pipelines or integrated into applications for media authenticity detection.

📦 Installation

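The original README does not list installation steps. The Quick Start code imports the transformers, torch, and PIL (Pillow) packages, so these need to be available in the Python environment.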

📚 Documentation

Model Details

Model Description

Virtus is based on facebook/deit-base-distilled-patch16-224 and was fine-tuned on a binary classification task using a large dataset of real and fake facial images. The training process involved class balancing, data augmentation, and evaluation using accuracy and F1 score.

Developed by: [Agasta](https://github.com/Itz-Agasta)
Funded by: None
Shared by: Agasta
Model type: Vision Transformer (ViT) for image classification
Language(s): N/A (vision model)
License: MIT
Finetuned from model: [facebook/deit-base-distilled-patch16-224](https://huggingface.co/facebook/deit-base-distilled-patch16-224)

Model Sources

Repository: https://huggingface.co/agasta/virtus

Uses

Direct Use

This model can be used to predict whether an input image is a real or a deepfake. It can be deployed in image analysis pipelines or integrated into applications that require media authenticity detection.

Downstream Use

Virtus may be used in broader deepfake detection systems, educational tools for detecting synthetic media, or pre-screening systems for online platforms.

Out-of-Scope Use

Detection of deepfakes in videos or audio
General object classification tasks outside of the real/fake binary domain

Bias, Risks, and Limitations

The dataset, while balanced, may still carry biases in facial features, lighting conditions, or demographics. The model is also not robust to non-standard input sizes or heavily occluded faces.

Recommendations

💡 Usage Tip

Use only on face images similar in nature to the training set.

Do not use for critical or high-stakes decisions without human verification.

Regularly re-evaluate performance with updated data.
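
One way to act on the human-verification recommendation is to route predictions through a small triage step. The sketch below assumes each prediction comes with a class probability (for example, a softmax over the model's logits). The `triage` function, the `high_stakes` flag, and the 0.9 threshold are all hypothetical choices for illustration, not values from the model card.

```python
def triage(label, confidence, high_stakes=False, threshold=0.9):
    """Decide whether a prediction is accepted automatically or sent to a person.

    `threshold` is a hypothetical cut-off; tune it on your own validation data.
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be a probability in [0, 1]")
    # High-stakes cases always go to a reviewer, regardless of confidence.
    if high_stakes or confidence < threshold:
        action = "human_review"
    else:
        action = "auto"
    return {"label": label, "confidence": confidence, "action": action}

print(triage("Fake", 0.97))
print(triage("Fake", 0.97, high_stakes=True))
print(triage("Real", 0.62))
```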

Training Details

Training Data

The dataset consisted of 190,335 self-collected real and deepfake face images, with RandomOverSampler used to balance the two classes. The data was split into 60% training and 40% testing, maintaining class stratification.
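
To make the two steps concrete, here is a minimal standard-library sketch of random oversampling followed by a stratified 60/40 split. It is a stand-in for illustration, not the author's code: the model card names RandomOverSampler (from imbalanced-learn) but does not show how it was called, and the sketch assumes balancing happens before the split, which is the order the paragraph describes. The function names, seeds, and toy labels are hypothetical.

```python
import random
from collections import defaultdict

def random_oversample(labels, seed=0):
    """Return sample indices in which every class is as frequent as the largest one.

    Minority-class indices are duplicated by sampling with replacement.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    target = max(len(idxs) for idxs in by_class.values())
    balanced = []
    for idxs in by_class.values():
        balanced.extend(idxs)
        balanced.extend(rng.choices(idxs, k=target - len(idxs)))
    return balanced

def stratified_split(indices, labels, train_frac=0.6, seed=0):
    """Split indices into train/test lists while keeping class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx in indices:
        by_class[labels[idx]].append(idx)
    train, test = [], []
    for idxs in by_class.values():
        rng.shuffle(idxs)
        cut = round(len(idxs) * train_frac)
        train.extend(idxs[:cut])
        test.extend(idxs[cut:])
    return train, test

# Toy data: 7 "real" and 3 "fake" samples (made up, not the Virtus dataset).
toy_labels = ["real"] * 7 + ["fake"] * 3
balanced = random_oversample(toy_labels)
train_idx, test_idx = stratified_split(balanced, toy_labels)
print(len(balanced), len(train_idx), len(test_idx))
```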

Training Procedure

Preprocessing

Images resized to 224x224
Augmentations: Random rotation, sharpness adjustments, normalization
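
The preprocessing steps above could be expressed with torchvision roughly as follows. This is a sketch under stated assumptions: the card gives the 224x224 size and names the augmentations, but the rotation range, sharpness factor, and normalization statistics below are placeholders chosen for illustration. In practice the mean and standard deviation are better read from the checkpoint's image processor. torchvision is imported lazily inside the function because it is an extra dependency.

```python
IMAGE_SIZE = (224, 224)             # stated in the model card
ROTATION_DEGREES = 15               # assumption: not given in the card
SHARPNESS_FACTOR = 2.0              # assumption: not given in the card
IMAGE_MEAN = (0.485, 0.456, 0.406)  # assumption: ImageNet statistics
IMAGE_STD = (0.229, 0.224, 0.225)   # assumption: ImageNet statistics

def build_transforms(train=True):
    """Build a torchvision pipeline; augmentations are applied only when train=True."""
    from torchvision import transforms  # requires torchvision to be installed

    steps = [transforms.Resize(IMAGE_SIZE)]
    if train:
        steps += [
            transforms.RandomRotation(ROTATION_DEGREES),
            transforms.RandomAdjustSharpness(SHARPNESS_FACTOR, p=0.5),
        ]
    steps += [
        transforms.ToTensor(),
        transforms.Normalize(mean=IMAGE_MEAN, std=IMAGE_STD),
    ]
    return transforms.Compose(steps)
```

Calling `build_transforms(train=False)` gives the deterministic resize-and-normalize path that would be used at evaluation time.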

Training Hyperparameters

Property	Details
Epochs	2
Learning rate	1e-6
Train batch size	32
Eval batch size	8
Weight decay	0.02
Optimizer	AdamW (via Trainer API)
Mixed precision	Not used
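
For readers who want to reproduce the setup, the table maps onto Hugging Face `TrainingArguments` fields roughly as shown below. The values are copied from the table; the `output_dir` name is hypothetical, and any setting the table does not mention is left at the library default. The Trainer's default optimizer is AdamW, which matches the table, so it is not set explicitly. The import is done lazily because it requires transformers to be installed.

```python
# Values copied from the hyperparameter table; keys are TrainingArguments field names.
HPARAMS = {
    "num_train_epochs": 2,
    "learning_rate": 1e-6,
    "per_device_train_batch_size": 32,
    "per_device_eval_batch_size": 8,
    "weight_decay": 0.02,
    "fp16": False,  # mixed precision was not used
}

def build_training_args(output_dir="virtus-finetune"):
    """Create TrainingArguments from HPARAMS; output_dir is a hypothetical path."""
    from transformers import TrainingArguments  # requires transformers

    return TrainingArguments(output_dir=output_dir, **HPARAMS)
```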

Evaluation

Testing Data

The held-out 40% portion of the same dataset (stratified 60:40 split) was used for evaluation.

Metrics

Accuracy
F1 Score (macro)
Confusion matrix
Classification report
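
As a reference for how these metrics relate to each other, here is a small standard-library sketch that derives accuracy and macro F1 from a confusion matrix. It is an illustration only: the card does not say which library computed the reported numbers, and the labels below are made up.

```python
def confusion_matrix(y_true, y_pred, labels):
    """Rows are true classes, columns are predicted classes."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix

def accuracy(matrix):
    """Fraction of samples on the diagonal of the confusion matrix."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total

def macro_f1(matrix):
    """Unweighted mean of the per-class F1 scores."""
    scores = []
    for i in range(len(matrix)):
        tp = matrix[i][i]
        fp = sum(matrix[r][i] for r in range(len(matrix))) - tp
        fn = sum(matrix[i]) - tp
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)

# Made-up predictions for illustration, not Virtus outputs.
y_true = ["Real", "Real", "Real", "Fake", "Fake", "Fake"]
y_pred = ["Real", "Real", "Fake", "Fake", "Fake", "Real"]
cm = confusion_matrix(y_true, y_pred, labels=["Real", "Fake"])
print(cm, round(accuracy(cm), 4), round(macro_f1(cm), 4))
```

Macro averaging weights both classes equally, which is why accuracy and macro F1 end up close to each other on a balanced test split.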

Results

Property	Details
Accuracy	99.20%
F1 Score (macro)	0.9920

Environmental Impact

Property	Details
Hardware Type	NVIDIA Tesla V100 (Kaggle Notebook GPU)
Hours used	~2.3 hours
Cloud Provider	Kaggle
Compute Region	Unknown
Carbon Emitted	Can be estimated via MLCO2 Calculator
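
A rough figure can also be worked out by hand as energy used multiplied by the carbon intensity of the electricity grid. The sketch below shows that arithmetic; it is not the MLCO2 Calculator's exact method and not a measured result. Only the 2.3 hours comes from the table. The 300 W power draw (the rated board power of a V100 SXM2; the PCIe card is rated at 250 W) and the 0.4 kg CO2eq/kWh grid intensity are placeholder assumptions, since the compute region is unknown.

```python
def estimate_co2_kg(hours, power_watts, kg_co2_per_kwh, pue=1.0):
    """Rough emissions estimate: energy in kWh times grid carbon intensity.

    `pue` (power usage effectiveness) scales for data-center overhead; 1.0 means none.
    """
    energy_kwh = hours * (power_watts / 1000.0) * pue
    return energy_kwh * kg_co2_per_kwh

# hours is from the table; power_watts and kg_co2_per_kwh are placeholders.
estimate = estimate_co2_kg(hours=2.3, power_watts=300, kg_co2_per_kwh=0.4)
print(f"{estimate:.3f} kg CO2eq")
```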

Technical Specifications

Model Architecture and Objective

The model is a distilled Vision Transformer (DeiT) designed for image classification with a binary objective: classify images as Real or Fake.

Compute Infrastructure

Hardware: 1x NVIDIA Tesla V100 GPU
Software: PyTorch, Hugging Face Transformers, Datasets, Accelerate

Citation

BibTeX:

@misc{virtus2025,
  title={Virtus: Deepfake Detection using Vision Transformers},
  author={Agasta},
  year={2025},
  howpublished={\url{https://huggingface.co/agasta/virtus}},
}

APA: Agasta. (2025). Virtus: Deepfake Detection using Vision Transformers. Hugging Face. https://huggingface.co/agasta/virtus

Model Card Contact

For questions or feedback, reach out via [GitHub](https://github.com/Itz-Agasta), open an issue on the [model repository](https://github.com/Itz-Agasta/Lopt/tree/main/models/image), or email me at rupam.golui@proton.me.

📄 License

This project is licensed under the MIT license.
