CleanDIFT Open-Source Feature Extraction Tool - Freely Process Clean Images to Extract Noiseless Features

CleanDIFT

Developed by CompVis

CleanDIFT is a novel diffusion model feature extraction method that extracts noise-free and time-step-independent features by directly processing clean input images.

Image Generation | Open Source | License: MIT | #Noise-free feature extraction #Diffusion model optimization #Single GPU fast training

Release Time: 12/4/2024

Model Overview

CleanDIFT adapts a diffusion model so that features can be extracted directly from clean images. This removes the usual requirement of noisy image inputs, along with the per-task tuning of the noise level that comes with it.

Model Features

Noise-free feature extraction

Directly processes clean input images, avoiding the information loss caused by the noisy inputs that diffusion models normally require.

Efficient training

Training can be completed in just 30 minutes on a single GPU.

Compatibility

Fully compatible with the diffusers library, allowing easy replacement of the U-Net part in existing Stable Diffusion models.

Model Capabilities

Image feature extraction

Semantic correspondence detection

Depth estimation

Semantic segmentation

Image classification

Use Cases

Computer vision

Semantic correspondence detection

Uses extracted features to detect semantic correspondence points between images.

Depth estimation

Performs monocular depth estimation based on extracted features.

Semantic segmentation

Performs pixel-level semantic segmentation using the features.
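
As an illustration of the semantic correspondence use case, the sketch below matches one location in a source feature map to its most similar location in a target feature map using cosine similarity. It is a toy, pure-Python example under stated assumptions: the feature maps are small hand-written nested lists rather than real CleanDIFT features, and the helper names `cosine_similarity` and `match_point` are hypothetical, not part of the CleanDIFT code base.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def match_point(source_features, target_features, query):
    """Find the (row, col) in the target map most similar to source[query]."""
    query_row, query_col = query
    query_vec = source_features[query_row][query_col]
    best_pos, best_sim = None, -2.0  # cosine similarity is always >= -1
    for r, row in enumerate(target_features):
        for c, vec in enumerate(row):
            sim = cosine_similarity(query_vec, vec)
            if sim > best_sim:
                best_pos, best_sim = (r, c), sim
    return best_pos, best_sim

# Toy 2x2 feature maps with 3-dimensional features (made-up values).
source = [[[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
          [[0.0, 0.0, 1.0], [1.0, 1.0, 0.0]]]
target = [[[0.0, 0.9, 0.1], [0.1, 0.0, 1.0]],
          [[2.0, 0.1, 0.0], [0.5, 0.5, 0.5]]]

position, similarity = match_point(source, target, (0, 0))
print("best match:", position, "similarity:", round(similarity, 3))
```

With real features, the same nearest-neighbour idea would typically be applied to tensors in a vectorised way; the loop form here is only meant to make the matching rule explicit.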

🚀 CleanDIFT Model Card

Diffusion models can learn powerful world representations, which are valuable for various tasks. However, they usually require noisy input images. CleanDIFT enables diffusion models to work directly with clean input images, extracting noise-free and timestep-independent features.

🚀 Quick Start

Diffusion models learn powerful world representations that have proven valuable for tasks like semantic correspondence detection, depth estimation, semantic segmentation, and classification. However, diffusion models require noisy input images, which destroys information and introduces the noise level as a hyperparameter that needs to be tuned for each task.
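
To make the cost of noisy inputs concrete, the sketch below computes how strongly the clean image and the Gaussian noise are weighted at a few timesteps of the standard DDPM forward process, x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps. The linear beta schedule (1e-4 to 0.02 over 1000 steps) and the helper names are illustrative assumptions, not the schedule or code used by CleanDIFT or Stable Diffusion.

```python
import math

def alpha_bar_schedule(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative product of (1 - beta_t) for a linear beta schedule (assumed)."""
    alpha_bars = []
    running = 1.0
    for i in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * i / (num_steps - 1)
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars

def signal_and_noise_scale(alpha_bars, t):
    """Weights of the clean image and of the noise in x_t at timestep t."""
    return math.sqrt(alpha_bars[t]), math.sqrt(1.0 - alpha_bars[t])

alpha_bars = alpha_bar_schedule()
for t in (0, 100, 300, 600, 999):
    signal, noise = signal_and_noise_scale(alpha_bars, t)
    print(f"t={t:4d}  signal scale={signal:.3f}  noise scale={noise:.3f}")
```

The signal weight shrinks monotonically as t grows, so a feature extractor that needs noisy inputs has to pick a point on this curve for each task. Working from the clean image removes that choice.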

We introduce CleanDIFT, a novel method to extract noise-free, timestep-independent features by enabling diffusion models to work directly with clean input images. The approach is efficient, training on a single GPU in just 30 minutes. We publish these models alongside our paper "CleanDIFT: Diffusion Features without Noise".

We provide checkpoints for Stable Diffusion 1.5 and Stable Diffusion 2.1.

✨ Features

Noise-free Feature Extraction: Enables diffusion models to work with clean input images, extracting noise-free and timestep-independent features.
Efficiency: Can be trained on a single GPU in just 30 minutes.
Model Compatibility: Provides checkpoints for Stable Diffusion 1.5 and 2.1, fully compatible with the diffusers library.

📦 Installation

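No dedicated installation steps are given. The usage example below imports diffusers and huggingface_hub and loads a .safetensors checkpoint, so those packages (plus safetensors) need to be available in your Python environment.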

💻 Usage Examples

Basic Usage

For detailed examples on how to extract features with CleanDIFT and how to use them for downstream tasks, please refer to the notebooks provided here.

Advanced Usage

Our checkpoints are fully compatible with the diffusers library. If you already have a pipeline using SD 1.5 or SD 2.1 from diffusers, you can simply replace the U-Net state dict:

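from diffusers import UNet2DConditionModel
from huggingface_hub import hf_hub_download
from safetensors.torch import load_file  # provides load_file used below

# Load the stock SD 2.1 U-Net (architecture and original weights)
unet = UNet2DConditionModel.from_pretrained("stabilityai/stable-diffusion-2-1", subfolder="unet")
# Download the CleanDIFT checkpoint from the Hugging Face Hub
ckpt_pth = hf_hub_download(repo_id="CompVis/cleandift", filename="cleandift_sd21_unet.safetensors")
# Read the checkpoint into a state dict and overwrite the U-Net weights
state_dict = load_file(ckpt_pth)
unet.load_state_dict(state_dict, strict=True)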
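
With strict=True, load_state_dict raises an error unless the checkpoint's keys match the model's keys exactly, which is what makes the swap above a safe drop-in replacement. If a load fails, it can help to see which keys differ before retrying. The following is a minimal sketch of such a check; the helper name `compare_state_dict_keys` and the toy key names are hypothetical and only serve as an illustration.

```python
def compare_state_dict_keys(model_keys, checkpoint_keys):
    """Return (missing, unexpected) key lists, mirroring what a strict load checks."""
    model_keys, checkpoint_keys = set(model_keys), set(checkpoint_keys)
    # Keys the model expects but the checkpoint does not provide.
    missing = sorted(model_keys - checkpoint_keys)
    # Keys the checkpoint provides but the model does not know.
    unexpected = sorted(checkpoint_keys - model_keys)
    return missing, unexpected

# Toy key names for illustration only.
model_keys = ["conv_in.weight", "conv_in.bias", "conv_out.weight"]
checkpoint_keys = ["conv_in.weight", "conv_in.bias", "extra.weight"]

missing, unexpected = compare_state_dict_keys(model_keys, checkpoint_keys)
print("missing:", missing)
print("unexpected:", unexpected)
```

With the real objects from the snippet above, the call would be compare_state_dict_keys(unet.state_dict().keys(), state_dict.keys()); two empty lists mean a strict load should not fail on key names.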

📚 Documentation

The main documentation can be found in the notebooks provided here.

📄 License

This project is licensed under the MIT license.

📚 Citation

@misc{stracke2024cleandiftdiffusionfeaturesnoise,
      title={CleanDIFT: Diffusion Features without Noise}, 
      author={Nick Stracke and Stefan Andreas Baumann and Kolja Bauer and Frank Fundel and Björn Ommer},
      year={2024},
      eprint={2412.03439},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2412.03439}, 
}
