Sumsub-ffs-synthetic-1.0_sd_200 open-source model - accurate detection of Stable Diffusion synthetic images

Sumsub Ffs Synthetic 1.0 Sd 200

Developed by Sumsub

An AI-generated image detection model developed by Sumsub, built to identify synthetic images produced by tools such as Stable Diffusion.

Image classification

PyTorch

#DeepfakeDetection #StableDiffusionSpecific #HighAccuracyDetection

Downloads: 21

Release date: 8/15/2023

Model overview

This model detects synthetic images generated by AI tools such as Midjourney and Stable Diffusion, helping to identify deepfake content online.

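Model features

High-accuracy detection

High detection accuracy on images generated by different versions of Stable Diffusion (1.4, 1.5, 2.1).

Training with data augmentation

Uses data augmentation techniques such as rotation, cropping, Mixup, and CutMix to improve model performance.

Validation on multiple datasets

Performance is validated on several public datasets to check how well the model generalizes.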

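Model capabilities

AI-generated image detection

Deepfake identification

Synthetic image classification

Real vs. fake image discrimination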

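Use cases

Content moderation

Identifying fake content on social media

Detects AI-generated fake images circulating on social media.

Can identify well-known fabricated images such as the "Pope in a puffer jacket".

News verification

Verifying the authenticity of news images

Verifies the authenticity of images used in news reports.

Can detect fabricated news images such as the "Pentagon explosion".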

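🚀 For Fake's Sake: a set of models for detecting generated and synthetic images

Recently, many people online have been misled by fake images of Pope Francis wearing a coat or of Donald Trump being arrested. To help address this problem, we provide detectors for such images generated by popular tools like Midjourney and Stable Diffusion.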

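✨ Main features

Provides detectors for images generated by tools such as Midjourney and Stable Diffusion.
Identifies fake images, helping to reduce the spread of misleading images online.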

📦 Installation guide

Use the following commands to get started with the model:

git lfs install
git clone https://huggingface.co/Sumsub/Sumsub-ffs-synthetic-1.0_sd_200 sumsub_synthetic_sd_200

You may need to install the following prerequisites:

pip install -r requirements.txt
pip install "git+https://github.com/rwightman/pytorch-image-models"
pip install "git+https://github.com/huggingface/huggingface_hub"
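
After installing, it can help to confirm that the packages the usage example relies on are importable. The following is a minimal sketch, not part of the original model card; the module list is an assumption inferred from the commands above and from the PyTorch and PIL usage on this page, and `missing_modules` is a hypothetical helper name.

```python
from importlib.util import find_spec

# Assumed module names: "timm" is the import name of pytorch-image-models,
# "PIL" is provided by Pillow. Adjust to match the actual requirements.txt.
REQUIRED_MODULES = ("torch", "timm", "huggingface_hub", "PIL")


def missing_modules(names=REQUIRED_MODULES):
    """Return the top-level module names that cannot be found."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All expected modules are importable.")
```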

💻 Usage example

Basic usage

from sumsub_synthetic_sd_200.pipeline import PreTrainedPipeline
from PIL import Image

pipe = PreTrainedPipeline("sumsub_synthetic_sd_200/")

img = Image.open("sumsub_synthetic_sd_200/images/2.jpg")

result = pipe(img)
print(result)
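
The page does not show what `result` looks like. The sketch below assumes it is a list of dictionaries with `label` and `score` keys, a common shape for image-classification pipelines; both that shape and the label names `synthetic` and `real` are assumptions for illustration, so check the actual output before relying on them. `top_prediction` is a hypothetical helper.

```python
def top_prediction(result):
    """Return (label, score) for the highest-scoring entry.

    Assumes `result` is a non-empty list of {"label": str, "score": float}.
    """
    best = max(result, key=lambda item: item["score"])
    return best["label"], best["score"]


# Hypothetical output for illustration only; real labels and scores may differ.
example_result = [
    {"label": "synthetic", "score": 0.93},
    {"label": "real", "score": 0.07},
]

label, score = top_prediction(example_result)
print(f"{label}: {score:.2f}")
```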

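📚 Detailed documentation

Model details

Model description

Developed by: Sumsub AI team
Model type: image classification
License: CC-BY-SA-3.0
Variant: diffusions_200m (size: 200M parameters; description: designed to detect photos created with different versions of Stable Diffusion: 1.4, 1.5, 2.1)
Fine-tuned from model: convnext_large_mlp.clip_laion2b_soup_ft_in12k_in1k_384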

Demo

A demo page can be found here.

Training details

Training data

The models were trained on the following datasets:

Stable Diffusion datasets:

Real photos: MS COCO.
AI photos: aiornot HuggingFace competition data, Stable Diffusion Wordnet dataset.

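Training procedure

To improve performance metrics, we used data augmentation methods such as rotation, cropping, Mixup, and CutMix. Each model was trained for 30 epochs with early stopping and a batch size of 32.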
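
For readers unfamiliar with Mixup, here is a minimal sketch of the idea on flattened samples: two inputs and their one-hot labels are blended with the same weight drawn from a Beta distribution. It is illustrative only and is not the authors' training code; the `alpha` value and the function names are assumptions, and the page does not state which implementation or hyperparameters were used.

```python
import random


def sample_lambda(alpha=0.2, rng=random):
    """Draw the mixing weight from Beta(alpha, alpha). alpha is an assumed value."""
    return rng.betavariate(alpha, alpha)


def mixup(x_a, x_b, y_a, y_b, lam):
    """Blend two flattened samples and their one-hot labels with weight lam."""
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x_a, x_b)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y_a, y_b)]
    return x, y


# Toy example: two 2-pixel "images" with one-hot labels [real, synthetic].
lam = sample_lambda()
mixed_x, mixed_y = mixup([0.0, 1.0], [1.0, 1.0], [1.0, 0.0], [0.0, 1.0], lam)
print(mixed_x, mixed_y)
```

CutMix follows the same label-blending rule but pastes a rectangular patch from one image into the other instead of averaging pixels, with the label weight set by the patch area.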

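Evaluation

For evaluation, we used the following datasets:

Stable Diffusion datasets:

DiffusionDB: a set of 2 million images generated by Stable Diffusion using prompts and hyperparameters specified by real users.
Kaggle SD Faces: a set of 4,000 face images generated with Stable Diffusion 1.4.
Stable Diffusion Wordnet dataset: a set of 200,000 images generated by Stable Diffusion.

Real images:

MS COCO: a set of 120,000 real-world images.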

Metrics

Model	Dataset	Accuracy
diffusions_200M	Kaggle SD Faces	0.989
diffusions_200M	DiffusionDB	0.926
diffusions_200M	Stable Diffusion Wordnet Dataset	0.946
diffusions_200M	MS COCO	0.941
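
Each evaluation set listed above appears to contain a single class (only generated images, or only real ones). Under that assumption, accuracy on a generated-image set is the share of fakes caught, and accuracy on MS COCO is the share of real photos left unflagged. The sketch below does the arithmetic; `error_rates` is a hypothetical helper and the single-class reading is an interpretation, not something the page states.

```python
# Accuracy values copied from the metrics table.
ACCURACY = {
    "Kaggle SD Faces": 0.989,
    "DiffusionDB": 0.926,
    "Stable Diffusion Wordnet Dataset": 0.946,
    "MS COCO": 0.941,
}

# Assumption: MS COCO is the only all-real set; the others are all-synthetic.
REAL_ONLY = {"MS COCO"}


def error_rates(accuracy, real_only):
    """Map each dataset to (kind of error, 1 - accuracy rounded to 3 places)."""
    report = {}
    for name, acc in accuracy.items():
        kind = "false positive rate" if name in real_only else "miss rate"
        report[name] = (kind, round(1.0 - acc, 3))
    return report


for name, (kind, rate) in error_rates(ACCURACY, REAL_ONLY).items():
    print(f"{name}: {kind} = {rate:.3f}")
```

Read this way, roughly 6 in 100 real MS COCO photos would be flagged as generated, which matters when most of the images being screened are real.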

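Limitations

Note that 100% accuracy cannot be achieved. The model output should therefore be treated only as an indication that an image may have been (but was not definitely) artificially generated.
Our models may struggle to predict the correct class for real-world examples that are extremely vivid and of exceptionally high quality. In such cases, the rich colors and fine details may lead to misclassification because of the complexity of the input, and the model may focus on visual aspects that do not necessarily indicate the true class.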
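
One way to act on these limitations is to treat the score as a signal for triage rather than a verdict, sending borderline cases to a person. The sketch below assumes the model yields a probability-like score for the generated class; the `triage` function and both thresholds are hypothetical placeholders that would need tuning on your own data.

```python
def triage(synthetic_score, flag_at=0.9, clear_at=0.1):
    """Map a score in [0, 1] to a coarse decision with a manual-review band.

    The thresholds are illustrative placeholders, not recommended values.
    """
    if not 0.0 <= synthetic_score <= 1.0:
        raise ValueError("synthetic_score must be between 0 and 1")
    if flag_at <= clear_at:
        raise ValueError("flag_at must be greater than clear_at")
    if synthetic_score >= flag_at:
        return "likely synthetic"
    if synthetic_score <= clear_at:
        return "likely real"
    return "needs human review"


for score in (0.97, 0.55, 0.03):
    print(score, "->", triage(score))
```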

🔧 Technical details

Citation

If you find this project useful, please cite it as follows:

@misc{sumsubaiornot, 
    publisher = {Sumsub},
    url       = {https://huggingface.co/Sumsub/Sumsub-ffs-synthetic-1.0_sd_200},
    year      = {2023},
    author    = {Savelyev, Alexander and Toropov, Alexey and Goldman-Kalaydin, Pavel and Samarin, Alexey},
    title     = {For Fake's Sake: a set of models for detecting deepfakes, generated images and synthetic images}
}

References

Stöckl, Andreas. (2022). Evaluating a Synthetic Image Dataset Generated with Stable Diffusion. 10.48550/arXiv.2211.01777.
Lin, Tsung-Yi & Maire, Michael & Belongie, Serge & Hays, James & Perona, Pietro & Ramanan, Deva & Dollár, Piotr & Zitnick, C.. (2014). Microsoft COCO: Common Objects in Context.
Howard, Andrew & Zhu, Menglong & Chen, Bo & Kalenichenko, Dmitry & Wang, Weijun & Weyand, Tobias & Andreetto, Marco & Adam, Hartwig. (2017). MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications.
Liu, Zhuang & Mao, Hanzi & Wu, Chao-Yuan & Feichtenhofer, Christoph & Darrell, Trevor & Xie, Saining. (2022). A ConvNet for the 2020s.
Wang, Zijie & Montoya, Evan & Munechika, David & Yang, Haoyang & Hoover, Benjamin & Chau, Polo. (2022). DiffusionDB: A Large-scale Prompt Gallery Dataset for Text-to-Image Generative Models. 10.48550/arXiv.2210.14896.