Sumsub-ffs-synthetic-1.0_sd_200 Open Source Model - Accurately Identify Stable Diffusion Synthesized Images

Developed by Sumsub

AI-generated image detection model developed by Sumsub, specifically designed to identify synthetic images created by tools like Stable Diffusion

Image Classification

PyTorch

#Deepfake Detection #StableDiffusion Specialized #High-precision Forgery Identification

Release Time: 8/15/2023

Model Overview

This model detects synthetic images generated by Stable Diffusion, helping to identify AI-generated fake content online. It belongs to a set of detectors that also targets tools such as Midjourney.

Model Features

High-precision Detection

High detection accuracy for images generated by different versions of Stable Diffusion (1.4/1.5/2.1)

Data Augmentation Training

Uses data augmentation techniques such as rotation, cropping, Mixup, and CutMix to improve model performance

Multi-dataset Validation

Validates model performance on multiple public datasets to ensure generalization capability

Model Capabilities

AI-generated Image Detection

Deepfake Recognition

Synthetic Image Classification

Real vs. Fake Image Discrimination

Use Cases

Content Moderation

Social Media Fake Content Identification

Detects AI-generated fake images circulating on social media

Can identify famous forged images such as the 'Puffer Jacket Pope'

News Verification

News Image Authenticity Verification

Verifies the authenticity of images used in news reports

Can detect fake news images such as the 'Pentagon Explosion'

🚀 For Fake's Sake: a set of models for detecting generated and synthetic images

🚀 Quick Start

Many people on the internet have recently been tricked by fake images of Pope Francis wearing a coat or of Donald Trump's arrest. To help combat this issue, we provide detectors for such images generated by popular tools like Midjourney and Stable Diffusion.

✨ Features

Provides detectors for images generated by popular tools like Midjourney and Stable Diffusion.
Helps users avoid being tricked by fake images.

📦 Installation

Use the code below to get started with the model:

git lfs install
git clone https://huggingface.co/Sumsub/Sumsub-ffs-synthetic-1.0_sd_200 sumsub_synthetic_sd_200

You may need these prerequisites installed:

pip install -r requirements.txt
pip install "git+https://github.com/rwightman/pytorch-image-models"
pip install "git+https://github.com/huggingface/huggingface_hub"

💻 Usage Examples

Basic Usage

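# Run this from the directory that contains the cloned sumsub_synthetic_sd_200
# folder, so that the import below resolves.
from sumsub_synthetic_sd_200.pipeline import PreTrainedPipeline
from PIL import Image

# Build the pipeline from the cloned repository folder.
pipe = PreTrainedPipeline("sumsub_synthetic_sd_200/")

# Open an image file from the repository's images folder.
img = Image.open("sumsub_synthetic_sd_200/images/2.jpg")

# Classify the image and print the prediction.
result = pipe(img)
print(result)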
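To check more than one file, the same call can be wrapped in a small loop. The helper below is a minimal sketch, not part of the repository: the name `classify_folder`, the list of file suffixes and the optional `load_image` parameter are assumptions made for illustration. It accepts any callable as `pipe`, so it makes no assumption about the shape of the model output.

```python
from pathlib import Path

# Hypothetical list of suffixes to treat as images; adjust as needed.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".webp"}


def classify_folder(pipe, folder, load_image=None):
    """Run `pipe` on every image file in `folder` and return {filename: result}."""
    if load_image is None:
        # Imported lazily so the helper itself does not require Pillow.
        from PIL import Image
        load_image = Image.open

    results = {}
    for path in sorted(Path(folder).iterdir()):
        # Skip sub-directories and files that do not look like images.
        if not path.is_file() or path.suffix.lower() not in IMAGE_SUFFIXES:
            continue
        results[path.name] = pipe(load_image(path))
    return results
```

With the pipeline from the basic example, `classify_folder(pipe, "sumsub_synthetic_sd_200/images")` would return one result per image file in that folder.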

📚 Documentation

Model Details

Model Description

Property	Details
Developed by	Sumsub AI team
Model Type	Image classification
License	CC-By-SA-3.0
Types	diffusions_200m (Size: 200M parameters; Description: designed to detect photos created using different versions of Stable Diffusion (1.4, 1.5, 2.1))
Finetuned from model	convnext_large_mlp.clip_laion2b_soup_ft_in12k_in1k_384

Demo

The demo page can be found here.

Training Details

Training Data

The models were trained on the following datasets:

Stable Diffusion datasets:

Real photos: MS COCO.
AI photos: aiornot HuggingFace contest data, Stable Diffusion Wordnet Dataset.

Training Procedure

To improve performance metrics, we used data augmentations such as rotation, cropping, Mixup and CutMix. Each model was trained for up to 30 epochs with early stopping and a batch size of 32.
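As a rough illustration of the two mixing augmentations named above, the sketch below implements Mixup and CutMix on plain Python lists. It is not the training code used for this model: the function names, the list-based image representation and the fixed mixing weight `lam` are assumptions made for the example. In practice the weight is usually sampled from a Beta distribution and the operations run on tensors.

```python
import random


def mixup(x_a, y_a, x_b, y_b, lam):
    """Blend two flat samples and their one-hot labels with weight lam in [0, 1]."""
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x_a, x_b)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y_a, y_b)]
    return x, y


def cutmix_box(height, width, lam, rng):
    """Pick a rectangle that covers at most about (1 - lam) of the image area."""
    cut_ratio = (1.0 - lam) ** 0.5
    cut_h, cut_w = int(height * cut_ratio), int(width * cut_ratio)
    cy, cx = rng.randrange(height), rng.randrange(width)
    top, bottom = max(cy - cut_h // 2, 0), min(cy + cut_h // 2, height)
    left, right = max(cx - cut_w // 2, 0), min(cx + cut_w // 2, width)
    return top, bottom, left, right


def cutmix(img_a, y_a, img_b, y_b, lam, rng):
    """Paste a box from img_b into a copy of img_a; weight labels by pasted area."""
    height, width = len(img_a), len(img_a[0])
    top, bottom, left, right = cutmix_box(height, width, lam, rng)
    mixed = [row[:] for row in img_a]  # copy so img_a is left untouched
    for r in range(top, bottom):
        mixed[r][left:right] = img_b[r][left:right]
    # The box may be clipped at the border, so recompute the label weight.
    lam_adj = 1.0 - (bottom - top) * (right - left) / (height * width)
    y = [lam_adj * a + (1.0 - lam_adj) * b for a, b in zip(y_a, y_b)]
    return mixed, y


# Toy 8x8 single-channel "images": label [1, 0] = real, [0, 1] = synthetic.
real = [[0.0] * 8 for _ in range(8)]
fake = [[1.0] * 8 for _ in range(8)]
_, soft_label = cutmix(real, [1.0, 0.0], fake, [0.0, 1.0], 0.5, random.Random(0))
print(soft_label)
```

In both cases the label becomes a soft target whose weights match how much of each source image ends up in the mixed sample.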

Evaluation

For evaluation we used the following datasets:

Stable Diffusion datasets:

DiffusionDB: a set of 2 million images generated by Stable Diffusion using prompts and hyperparameters specified by real users.
Kaggle SD Faces: a set of 4k human face images generated using Stable Diffusion 1.4.
Stable Diffusion Wordnet Dataset: a set of 200k images generated by Stable Diffusion.

Realistic images:

MS COCO: a set of 120k real-world images.

Metrics

Model	Dataset	Accuracy
diffusions_200M	Kaggle SD Faces	0.989
diffusions_200M	DiffusionDB	0.926
diffusions_200M	Stable Diffusion Wordnet Dataset	0.946
diffusions_200M	MS COCO	0.941
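
Each benchmark above appears to contain a single class: the three Stable Diffusion sets are described as generated images and MS COCO as real photos. Under that assumption, accuracy on a synthetic set can be read as the detection rate, and accuracy on MS COCO as the rate at which real photos are correctly accepted. The short calculation below sketches how the share of flagged images that are truly synthetic then depends on how common synthetic images are in the stream being checked. The function name and the prevalence values are hypothetical.

```python
def flagged_precision(synthetic_acc, real_acc, prevalence):
    """Share of flagged images that are truly synthetic (Bayes' rule).

    synthetic_acc: accuracy on an all-synthetic benchmark (true positive rate)
    real_acc:      accuracy on an all-real benchmark (true negative rate)
    prevalence:    assumed fraction of synthetic images in the input stream
    """
    true_pos = synthetic_acc * prevalence
    false_pos = (1.0 - real_acc) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


# Accuracies taken from the table: DiffusionDB (synthetic) and MS COCO (real).
for prevalence in (0.5, 0.1, 0.01):  # hypothetical prevalence values
    p = flagged_precision(0.926, 0.941, prevalence)
    print(f"prevalence={prevalence:.2f}  precision={p:.3f}")
```

With these inputs, precision falls as synthetic images become rarer, because false alarms on real photos start to outnumber true detections.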

Limitations

⚠️ Important Note

It should be noted that achieving 100% accuracy is not possible. Therefore, the model output should only be used as an indication that an image may have been (but not definitely) artificially generated.

Our models may face challenges in accurately predicting the class for real-world examples that are extremely vibrant and of exceptionally high quality. In such cases, the richness of colors and fine details may lead to misclassifications due to the complexity of the input. This could potentially cause the model to focus on visual aspects that are not necessarily indicative of the true class.
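
One way to act on the note above is to turn the raw output into a cautious verdict instead of a yes/no answer. The sketch below assumes the pipeline returns a list of dictionaries with "label" and "score" keys, which is a common convention for image-classification pipelines but is not confirmed by this page. The function name `triage`, the label name "synthetic" and the 0.5 threshold are placeholders; replace them with the labels the model actually emits and a threshold tuned on your own data.

```python
def triage(result, synthetic_label="synthetic", flag_at=0.5):
    """Map a classifier result to a cautious verdict.

    result: assumed to be a list of {"label": str, "score": float} dicts.
    synthetic_label: placeholder name of the "AI-generated" class.
    flag_at: placeholder score threshold for sending an image to review.
    """
    scores = {item["label"]: float(item["score"]) for item in result}
    if synthetic_label not in scores:
        raise KeyError(f"label {synthetic_label!r} not in result: {sorted(scores)}")

    score = scores[synthetic_label]
    if score >= flag_at:
        verdict = "possibly AI-generated, needs human review"
    else:
        verdict = "no strong sign of generation"
    return {"synthetic_score": score, "verdict": verdict}


# Hand-written example input in the assumed format (not real model output).
example = [{"label": "synthetic", "score": 0.83}, {"label": "real", "score": 0.17}]
print(triage(example))
```

Both verdicts are worded as indications, in line with the guidance that the output should never be treated as proof.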

Citation

If you find this useful, please cite as:

@misc{sumsubaiornot, 
    publisher = {Sumsub},
    url       = {https://huggingface.co/Sumsub/Sumsub-ffs-synthetic-1.0_sd_200},
    year      = {2023},
    author    = {Savelyev, Alexander and Toropov, Alexey and Goldman-Kalaydin, Pavel and Samarin, Alexey},
    title     = {For Fake's Sake: a set of models for detecting deepfakes, generated images and synthetic images}
}

References

Stöckl, Andreas. (2022). Evaluating a Synthetic Image Dataset Generated with Stable Diffusion. 10.48550/arXiv.2211.01777.
Lin, Tsung-Yi & Maire, Michael & Belongie, Serge & Hays, James & Perona, Pietro & Ramanan, Deva & Dollár, Piotr & Zitnick, C. (2014). Microsoft COCO: Common Objects in Context.
Howard, Andrew & Zhu, Menglong & Chen, Bo & Kalenichenko, Dmitry & Wang, Weijun & Weyand, Tobias & Andreetto, Marco & Adam, Hartwig. (2017). MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications.
Liu, Zhuang & Mao, Hanzi & Wu, Chao-Yuan & Feichtenhofer, Christoph & Darrell, Trevor & Xie, Saining. (2022). A ConvNet for the 2020s.
Wang, Zijie & Montoya, Evan & Munechika, David & Yang, Haoyang & Hoover, Benjamin & Chau, Polo. (2022). DiffusionDB: A Large-scale Prompt Gallery Dataset for Text-to-Image Generative Models. 10.48550/arXiv.2210.14896.

📄 License

This project is licensed under CC-By-SA-3.0.
