NitroFusion open-source image generation model: high-fidelity single-step diffusion for creating high-quality images with ease

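NitroFusion

Developed by ChenDY

A high-fidelity single-step diffusion image generation model trained with dynamic adversarial training

#Image generation #Single-step text-to-image generation #Adversarial diffusion distillation #High-fidelity images

Downloads: 490

Released: 11/30/2024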

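Model Overview

NitroFusion is a text-to-image generation model based on adversarial diffusion distillation. It generates high-quality images in 1 to 4 steps and comes in two styles: realistic and vibrant.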

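Model Features

Single-step generation

Generates high-quality images with just one inference step

Dual style options

Two style models are available: photorealistic and highly saturated color

Dynamic adversarial training

Timestep shifting improves the results of multi-step inference

Efficient inference

Image generation completes in 1 to 4 steps, which greatly speeds up generation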

Model Capabilities

Text-to-image generation

Fast image synthesis

Stylized image generation

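Use Cases

Creative design

Concept art production

Quickly generate design concept images

One step yields a usable sketch; four steps yield a refined piece

Content creation

Social media images

Batch-generate promotional images in a consistent style

Supports rapid iteration over different style options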

🚀 NitroFusion

NitroFusion is a project that achieves high-fidelity single-step diffusion through dynamic adversarial training. It aims to generate high-quality images quickly in the text-to-image domain.

Property	Details
Base Model	tianweiy/DMD2, ByteDance/Hyper-SD, stabilityai/stable-diffusion-xl-base-1.0
Pipeline Tag	text-to-image
Library Name	diffusers
Tags	text-to-image, stable-diffusion, sdxl, adversarial diffusion distillation

NitroFusion: High-Fidelity Single-Step Diffusion through Dynamic Adversarial Training

Dar-Yen Chen, Hmrishav Bandyopadhyay, Kai Zou, Yi-Zhe Song

[arXiv Paper] [Project Page]

📢 News

January 6, 2025: The ComfyUI checkpoints nitrosd-realism_comfyui.safetensors and nitrosd-vibrant_comfyui.safetensors were released, along with a workflow.
December 4, 2024: The paper was published on arXiv and the project page went live.
November 30, 2024: A single-step text-to-image demo was released on 🤗 Hugging Face Space.
November 29, 2024: Two checkpoints, NitroSD-Realism and NitroSD-Vibrant, were released.

🌐 Online Demo

The NitroFusion single-step text-to-image demo is available on 🤗 Hugging Face Space.

📋 Model Overview

nitrosd-realism_unet.safetensors: generates photorealistic images with rich detail.
nitrosd-vibrant_unet.safetensors: generates images with vivid, highly saturated colors.
Both models support 1 to 4 inference steps.

💻 Usage Examples

Basic Usage

First, implement a scheduler with timestep shift for multi-step inference.

from diffusers import LCMScheduler


class TimestepShiftLCMScheduler(LCMScheduler):
    """LCMScheduler variant that exposes rescaled (shifted) timesteps via self.timesteps."""

    def __init__(self, *args, shifted_timestep=250, **kwargs):
        super().__init__(*args, **kwargs)
        self.register_to_config(shifted_timestep=shifted_timestep)

    def set_timesteps(self, *args, **kwargs):
        super().set_timesteps(*args, **kwargs)
        # Keep the original schedule and expose a rescaled copy as self.timesteps.
        self.origin_timesteps = self.timesteps.clone()
        self.shifted_timesteps = (self.timesteps * self.config.shifted_timestep / self.config.num_train_timesteps).long()
        self.timesteps = self.shifted_timesteps

    def step(self, model_output, timestep, sample, generator=None, return_dict=True):
        # Resolve the step index while self.timesteps still holds the shifted values.
        if self.step_index is None:
            self._init_step_index(timestep)
        # Run the parent step with the original schedule, then restore the shifted one.
        self.timesteps = self.origin_timesteps
        output = super().step(model_output, timestep, sample, generator, return_dict)
        self.timesteps = self.shifted_timesteps
        return output
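
To make the shift concrete, here is a minimal pure-Python sketch of the rescaling performed in `set_timesteps`. It is an illustration only: the function name `shift_timesteps` is hypothetical, the 4-step schedule `[999, 749, 499, 249]` is an assumed example rather than values read from the scheduler, and `num_train_timesteps=1000` is assumed to match the base model's scheduler config.

```python
def shift_timesteps(timesteps, shifted_timestep=250, num_train_timesteps=1000):
    """Rescale each timestep by shifted_timestep / num_train_timesteps.

    Integer floor division mirrors the truncation done by `.long()` in the
    scheduler for non-negative timesteps.
    """
    return [t * shifted_timestep // num_train_timesteps for t in timesteps]


# Assumed unshifted 4-step schedule (illustrative, not read from the scheduler).
origin = [999, 749, 499, 249]

print(shift_timesteps(origin, shifted_timestep=250))  # [249, 187, 124, 62]
print(shift_timesteps(origin, shifted_timestep=500))  # [499, 374, 249, 124]
```

With a shift of 250, every timestep is compressed into the range below 250; with 500, into the range below 500.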

Next, use the diffusers pipeline.

import torch
from diffusers import DiffusionPipeline, UNet2DConditionModel
from huggingface_hub import hf_hub_download
from safetensors.torch import load_file
# Load model.
base_model_id = "stabilityai/stable-diffusion-xl-base-1.0"
repo = "ChenDY/NitroFusion"
# NitroSD-Realism
ckpt = "nitrosd-realism_unet.safetensors"
unet = UNet2DConditionModel.from_config(base_model_id, subfolder="unet").to("cuda", torch.float16)
unet.load_state_dict(load_file(hf_hub_download(repo, ckpt), device="cuda"))
scheduler = TimestepShiftLCMScheduler.from_pretrained(base_model_id, subfolder="scheduler", shifted_timestep=250)
scheduler.config.original_inference_steps = 4
# # NitroSD-Vibrant
# ckpt = "nitrosd-vibrant_unet.safetensors"
# unet = UNet2DConditionModel.from_config(base_model_id, subfolder="unet").to("cuda", torch.float16)
# unet.load_state_dict(load_file(hf_hub_download(repo, ckpt), device="cuda"))
# scheduler = TimestepShiftLCMScheduler.from_pretrained(base_model_id, subfolder="scheduler", shifted_timestep=500)
# scheduler.config.original_inference_steps = 4
pipe = DiffusionPipeline.from_pretrained(
    base_model_id,
    unet=unet,
    scheduler=scheduler,
    torch_dtype=torch.float16,
    variant="fp16",
).to("cuda")
prompt = "a photo of a cat"
image = pipe(
    prompt=prompt,
    num_inference_steps=1,  # NitroSD-Realism and -Vibrant both support 1-4 inference steps.
    guidance_scale=0,
).images[0]
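
In the snippet above, the two checkpoints differ only in the UNet weights file and the `shifted_timestep` passed to the scheduler (250 for NitroSD-Realism, 500 for NitroSD-Vibrant). As a minimal sketch, the hypothetical helper below collects those per-style values in one place, so switching styles does not require commenting code in and out. `NITRO_STYLES` and `nitro_settings` are illustrative names, not part of NitroFusion or diffusers.

```python
# Hypothetical helper; names are illustrative and not part of NitroFusion.
# The values are copied from the loading snippet above.
NITRO_STYLES = {
    "realism": {"ckpt": "nitrosd-realism_unet.safetensors", "shifted_timestep": 250},
    "vibrant": {"ckpt": "nitrosd-vibrant_unet.safetensors", "shifted_timestep": 500},
}


def nitro_settings(style, num_inference_steps=1):
    """Return the checkpoint name, scheduler shift, and pipeline kwargs for a style."""
    if style not in NITRO_STYLES:
        raise ValueError(f"unknown style {style!r}; expected one of {sorted(NITRO_STYLES)}")
    if not 1 <= num_inference_steps <= 4:
        raise ValueError("both checkpoints support 1 to 4 inference steps")
    settings = dict(NITRO_STYLES[style])
    settings["pipe_kwargs"] = {
        "num_inference_steps": num_inference_steps,
        "guidance_scale": 0,
    }
    return settings
```

For example, `cfg = nitro_settings("vibrant", 4)` gives `cfg["ckpt"]` for `hf_hub_download`, `cfg["shifted_timestep"]` for `TimestepShiftLCMScheduler.from_pretrained`, and `cfg["pipe_kwargs"]` to unpack into the `pipe(...)` call.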

🛠️ Using ComfyUI

Download nitrosd-realism_comfyui.safetensors and nitrosd-vibrant_comfyui.safetensors and place them in ComfyUI/models/checkpoints.
Clone the ComfyUI-TimestepShiftModel repository into ComfyUI/custom_nodes.
Try out the workflow!
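
The download step could be scripted along the following lines. This is a sketch under two assumptions that the text above does not confirm: that the ComfyUI checkpoints are hosted in the same `ChenDY/NitroFusion` Hugging Face repository as the UNet weights, and that ComfyUI is installed at the path you pass in. The function names are hypothetical.

```python
from pathlib import Path

# File names as given in the release notes.
COMFYUI_CHECKPOINTS = [
    "nitrosd-realism_comfyui.safetensors",
    "nitrosd-vibrant_comfyui.safetensors",
]


def checkpoint_targets(comfyui_root):
    """Return the paths where the checkpoints should end up inside ComfyUI."""
    ckpt_dir = Path(comfyui_root) / "models" / "checkpoints"
    return [ckpt_dir / name for name in COMFYUI_CHECKPOINTS]


def download_checkpoints(comfyui_root, repo="ChenDY/NitroFusion"):
    """Download both checkpoints into ComfyUI/models/checkpoints.

    Assumption: `repo` hosts the ComfyUI checkpoint files.
    """
    # Imported lazily so the path helper works without huggingface_hub installed.
    from huggingface_hub import hf_hub_download

    for target in checkpoint_targets(comfyui_root):
        target.parent.mkdir(parents=True, exist_ok=True)
        hf_hub_download(repo, target.name, local_dir=target.parent)
```

Calling `download_checkpoints("ComfyUI")` would then fetch both files; the custom node repository still needs to be cloned separately.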

📄 License

NitroSD-Realism is released under [cc-by-nc-4.0](https://creativecommons.org/licenses/by-nc-sa/4.0/deed.en), following its base model DMD2. NitroSD-Vibrant is released under [openrail++](https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0/blob/main/LICENSE.md).