ai-image-detect-distilled Open-source Image Classification Model - Free Deployment to Detect Differences between AI and Real Images

AI Image Detect Distilled

Developed by jacoballessio

A lightweight image classification model based on the ViT architecture, designed to distinguish AI-generated images from real images

Image Classification

Transformers

Open Source License: MIT #Multi-source AI detection #Real vs. fake discrimination #Lightweight ViT

Downloads 7,054

Release Time: 7/1/2024

Model Overview

By distilling three independently trained sub-models, this model can distinguish AI-generated images (e.g., Midjourney, Stable Diffusion) from real images, focusing on identifying subtle differences in generated images

Model Features

Multi-model distillation

Incorporates knowledge from three sub-models targeting different AI generation technologies to improve detection universality

Data matching strategy

Uses BLIP descriptions to match generated images with real images, ensuring fair comparison

Lightweight and efficient

The distilled small ViT model has only 11.8 million parameters, maintaining high performance while reducing computational requirements

Real-world scenario adaptation

Reaches 72% accuracy on a custom real-world test set of self-captured and online-sourced images, making it suitable for screening common internet images

Model Capabilities

AI-generated image detection

Real image verification

Detection across multiple generation techniques

Image classification

Use Cases

Content moderation

Social media AI content detection

Identify AI-generated images on social media platforms

Helps platforms flag potential fake content

Digital forensics

News image authenticity verification

Verify whether news images are AI-generated

Assists in news authenticity verification

🚀 AI Detection Model

An AI detection model for image classification, capable of distinguishing between real and AI-generated images.

🚀 Quick Start

This README provides detailed information about the AI Image Detect Distilled model, including its architecture, training process, data sources, performance, and future directions.

✨ Features

Multi-model Distillation: Combines the features learned by three separate models into a small ViT model for efficient detection.
Diverse Data Sources: Draws on multiple datasets and matches real images with AI-generated ones so that the two sets are as similar as possible.
Performance: Reaches 74% accuracy on the validation set and 72% on a custom real-world set, and is reported to outperform other popular detection models.

📦 Installation

No installation steps are provided in the original document.

💻 Usage Examples

No code examples are provided in the original document.
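
As a minimal sketch, assuming the checkpoint is published on the Hugging Face Hub under the ID jacoballessio/ai-image-detect-distilled (inferred from the developer and model name above, not confirmed here) and that it loads with the standard transformers image-classification pipeline, inference might look like the following. The helper names load_detector and top_prediction are hypothetical. The label strings the model returns are not documented, so the code reads them from the output instead of hard-coding them.

```python
from typing import Any, Callable, Dict, List

# Assumed Hub ID, inferred from the developer and model name; verify before use.
MODEL_ID = "jacoballessio/ai-image-detect-distilled"

Detector = Callable[[Any], List[Dict[str, Any]]]


def load_detector(model_id: str = MODEL_ID) -> Detector:
    """Build a transformers image-classification pipeline (downloads weights)."""
    # Imported lazily so the helpers below work without transformers installed.
    from transformers import pipeline

    return pipeline("image-classification", model=model_id)


def top_prediction(detector: Detector, image: Any) -> Dict[str, Any]:
    """Return the highest-scoring {"label", "score"} entry for one image.

    `image` can be anything the detector accepts; the transformers pipeline
    takes a file path, a URL, or a PIL image.
    """
    predictions = detector(image)
    return max(predictions, key=lambda p: p["score"])


# Example (needs transformers, torch, pillow, and network access):
#   detector = load_detector()
#   print(top_prediction(detector, "photo.jpg"))  # "photo.jpg" is a placeholder
```

Passing the detector in as an argument keeps top_prediction independent of the model download, so it can be exercised with a stub classifier.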

📚 Documentation

Model Architecture and Training

Three separate models were initially trained:

Midjourney vs. Real Images
Stable Diffusion vs. Real Images
Stable Diffusion Fine-tunings vs. Real Images

The data preparation process was as follows:

Used Google's Open Images Dataset for real images
Described real images using BLIP (Bootstrapping Language-Image Pre-training)
Generated Stable Diffusion images using BLIP descriptions
Found similar Midjourney images based on BLIP descriptions

This approach ensured that real and AI-generated images were as similar as possible, differing only in their origin.
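
The source does not say how "similar" Midjourney images were found. As one possible sketch, assuming similarity is measured as token overlap (Jaccard) between a real image's BLIP caption and each Midjourney prompt, the pairing step could look like this. The function names, the cutoff value, and the sample captions are all hypothetical.

```python
import re
from typing import Dict, Optional, Set


def tokens(text: str) -> Set[str]:
    """Lowercase a caption and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a: str, b: str) -> float:
    """Token-set overlap between two captions, in [0, 1]."""
    ta, tb = tokens(a), tokens(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def best_match(caption: str, prompts: Dict[str, str], min_score: float = 0.3) -> Optional[str]:
    """Return the ID of the generated image whose prompt best matches `caption`.

    Returns None when no prompt reaches `min_score` (an arbitrary cutoff).
    """
    best_id, best_score = None, min_score
    for image_id, prompt in prompts.items():
        score = jaccard(caption, prompt)
        if score >= best_score:
            best_id, best_score = image_id, score
    return best_id


# Hypothetical data: one BLIP caption of a real photo, two Midjourney prompts.
blip_caption = "a dog running on a beach"
midjourney_prompts = {
    "mj_001": "a dog running along the beach at sunset",
    "mj_002": "portrait of an astronaut in a neon city",
}
matched = best_match(blip_caption, midjourney_prompts)
```

An embedding-based similarity would capture paraphrases that token overlap misses; the set-based version is used here only because it needs no external model.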

The three models were then distilled into a small ViT model with 11.8 million parameters, combining their learned features for more efficient detection.
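
The source does not describe the distillation objective. A common choice, assumed here purely for illustration, is to average the temperature-softened output distributions of the teachers and train the student to minimize the KL divergence to that average. The sketch below computes that loss for a single image in plain Python; the logits, the temperature, and all function names are hypothetical.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float], temperature: float = 1.0) -> List[float]:
    """Temperature-scaled softmax, shifted by the max for numerical stability."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def average_teachers(teacher_logits: Sequence[Sequence[float]], temperature: float) -> List[float]:
    """Mean of the teachers' softened class distributions."""
    probs = [softmax(logits, temperature) for logits in teacher_logits]
    n = len(probs)
    return [sum(p[i] for p in probs) / n for i in range(len(probs[0]))]


def distillation_loss(
    student_logits: Sequence[float],
    teacher_logits: Sequence[Sequence[float]],
    temperature: float = 2.0,
) -> float:
    """KL(teacher average || student) on softened distributions, scaled by T^2."""
    target = average_teachers(teacher_logits, temperature)
    student = softmax(student_logits, temperature)
    kl = sum(t * math.log(t / s) for t, s in zip(target, student) if t > 0)
    return kl * temperature ** 2


# Hypothetical [real, ai] logits from three teachers and one student.
teachers = [[0.2, 1.9], [-0.5, 2.4], [1.1, 0.7]]
loss = distillation_loss([0.1, 1.5], teachers)
```

In a real training loop this term would typically be computed on batches with a tensor library and often combined with a cross-entropy term on the ground-truth labels.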

Data Sources

Google's Open Images Dataset
Ivan Sivkov's Midjourney Dataset
TANREI(NAMA)'s Stable Diffusion Prompts Dataset

Performance

Validation Set: 74% accuracy. It was held out from the training data to assess generalization.
Custom Real-World Set: 72% accuracy. Composed of self-captured and online-sourced images, it is designed to be more representative of images found on the internet.
Comparative Analysis: Outperformed other popular AI detection models by 5 percentage points on both sets. Other models achieved 69% and 67% on the validation and real-world sets, respectively.
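
To make the metric concrete: accuracy is the fraction of images whose predicted class matches the ground truth, and a gap between two models is quoted in percentage points, that is, the plain difference of the two percentages. The following is a minimal sketch with made-up labels, not the author's evaluation code; the function names and the toy data are hypothetical.

```python
from typing import Sequence


def accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Fraction of predictions that equal the ground-truth label."""
    if len(y_true) != len(y_pred):
        raise ValueError("label sequences must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def point_gap(acc_a: float, acc_b: float) -> float:
    """Difference between two accuracies, in percentage points."""
    return (acc_a - acc_b) * 100


# Hypothetical toy labels for eight images.
truth = ["ai", "real", "ai", "real", "ai", "real", "ai", "real"]
model_a = ["ai", "real", "ai", "ai", "ai", "real", "real", "real"]
model_b = ["ai", "ai", "ai", "ai", "real", "real", "real", "real"]

acc_a = accuracy(truth, model_a)  # 6 of 8 correct
acc_b = accuracy(truth, model_b)  # 4 of 8 correct
gap = point_gap(acc_a, acc_b)
```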

Key Insights

Strong generalization on validation data (74% accuracy).
Good adaptability to diverse, real-world images (72% accuracy).
Consistent outperformance of other popular models.
A 2-point accuracy drop from the validation to the real-world set indicates room for improvement.
Comprehensive training on multiple AI generation techniques contributes to model versatility.
Focus on subtle differences in image generation rather than content disparities.

Future Directions

Expand the dataset with more diverse, real-world examples to bridge the performance gap.
Improve generalization to internet-sourced images.
Conduct error analysis on misclassified samples to identify patterns.
Integrate new AI image generation techniques as they emerge.
Consider fine-tuning for specific domains where detection accuracy is critical.

🔧 Technical Details

The model architecture involves distilling three separate models into a small ViT model with 11.8 million parameters. The data preparation process carefully aligns real and AI-generated images to ensure similarity in appearance.

📄 License

This project is licensed under the MIT license.
