Stable Diffusion 3.5 Medium: open-source text-to-image model with free multi-style image generation

Sd35m Sfwbooru Lycoris

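Developed by bghira

A LyCORIS adapter for Stable Diffusion 3.5 Medium, a diffusion-based text-to-image and image-to-image model. It supports image generation in many styles, including fantasy, sci-fi, and cyberpunk.

Category: Image generation. License: Other. Tags: #multi-style text-to-image #high-detail rendering #cyberpunk creation

Downloads: 595

Release date: 3/25/2025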

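Model overview

This is a diffusion-based image generation model that produces high-quality images from text prompts and supports a range of styles and use cases.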

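Model features

High-quality image generation

Generates high-resolution, highly detailed images across many styles and scenes.

Multi-style support

Supports fantasy, sci-fi, cyberpunk, medieval, and other styles.

Text-to-image and image-to-image

Generates images from text prompts, and can also modify and enhance existing images.

LoRA and LyCORIS support

Supports lightweight fine-tuning techniques such as LoRA and LyCORIS, which makes the model easier to customize and optimize.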

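Model capabilities

Text-to-image generation

Image-to-image generation

High-resolution image generation

Multi-style image generation

LoRA fine-tuning support

LyCORIS fine-tuning support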

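Use cases

Art creation

Fantasy art

Generates fantasy-style images such as enchanted forests and dragons.

Highly detailed, high-resolution fantasy artwork.

Sci-fi scenes

Generates sci-fi images such as futuristic cities and space battles.

Futuristic sci-fi scene images.

Game design

Character design

Generates concept art for game characters such as cyborgs and elves.

A varied set of character design images.

Environment design

Generates concept art for game environments such as a medieval market or an abandoned amusement park.

Richly detailed environment design images.

Advertising and marketing

Advertising assets

Generates imagery for advertisements, such as neon signs and retro diners.

Eye-catching advertising images.

Product showcase

Generates product showcase images such as vintage vehicles and antique shops.

High-quality product showcase images.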

🚀 sd35m-sfwbooru-lycoris

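This is a LyCORIS adapter derived from stabilityai/stable-diffusion-3.5-medium. It adapts the base model so that generated images better match specific needs.

🚀 Quick start

Inference example

The following Python code runs inference with this adapter: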

import torch
from diffusers import DiffusionPipeline
from lycoris import create_lycoris_from_weights


def download_adapter(repo_id: str):
    import os
    from huggingface_hub import hf_hub_download
    adapter_filename = "pytorch_lora_weights.safetensors"
    cache_dir = os.environ.get('HF_PATH', os.path.expanduser('~/.cache/huggingface/hub/models'))
    cleaned_adapter_path = repo_id.replace("/", "_").replace("\\", "_").replace(":", "_")
    path_to_adapter = os.path.join(cache_dir, cleaned_adapter_path)
    path_to_adapter_file = os.path.join(path_to_adapter, adapter_filename)
    os.makedirs(path_to_adapter, exist_ok=True)
    hf_hub_download(
        repo_id=repo_id, filename=adapter_filename, local_dir=path_to_adapter
    )

    return path_to_adapter_file
    
model_id = 'stabilityai/stable-diffusion-3.5-medium'
adapter_repo_id = 'bghira/sd35m-sfwbooru-lycoris'
adapter_filename = 'pytorch_lora_weights.safetensors'
adapter_file_path = download_adapter(repo_id=adapter_repo_id)
pipeline = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.bfloat16) # loading directly in bf16
lora_scale = 1.0
wrapper, _ = create_lycoris_from_weights(lora_scale, adapter_file_path, pipeline.transformer)
wrapper.merge_to()

prompt = "A photo-realistic image of a cat"
negative_prompt = 'blurry, cropped, ugly'

## Optional: quantise the model to save on vram.
## Note: The model was not quantised during training, so it is not necessary to quantise it during inference time.
#from optimum.quanto import quantize, freeze, qint8
#quantize(pipeline.transformer, weights=qint8)
#freeze(pipeline.transformer)
    
# Pick the best available device once and reuse it for the pipeline and the generator.
device = 'cuda' if torch.cuda.is_available() else 'mps' if torch.backends.mps.is_available() else 'cpu'
pipeline.to(device) # the pipeline is already in its target precision level
model_output = pipeline(
    prompt=prompt,
    negative_prompt=negative_prompt,
    num_inference_steps=30,
    generator=torch.Generator(device=device).manual_seed(42),
    width=1024,
    height=1024,
    guidance_scale=3.2,
    skip_guidance_layers=[7, 8, 9],
).images[0]

model_output.save("output.png", format="PNG")
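
To see where `download_adapter` places the weights, the path logic can be reproduced without touching the network. This is a minimal sketch: `adapter_local_path` is a hypothetical helper name, and it only mirrors the string handling from the example above (it does not call `hf_hub_download`).

```python
import os

ADAPTER_FILENAME = "pytorch_lora_weights.safetensors"


def adapter_local_path(repo_id: str, cache_dir: str) -> str:
    """Mirror the path logic of download_adapter without downloading anything."""
    # Slashes, backslashes and colons are replaced so the repo id
    # becomes a single directory name under cache_dir.
    cleaned = repo_id.replace("/", "_").replace("\\", "_").replace(":", "_")
    return os.path.join(cache_dir, cleaned, ADAPTER_FILENAME)


# Example: where the adapter from this page would land under a given cache dir.
print(adapter_local_path("bghira/sd35m-sfwbooru-lycoris", "hf-cache"))
```

Because the repo id is flattened into one directory name, adapters from different repositories do not collide inside the same cache directory.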

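Validation settings

CFG: 3.2
CFG rescale: 0.0
Steps: 30
Sampler: FlowMatchEulerDiscreteScheduler
Seed: 42
Resolution: 1024x1024
Skip-layer guidance: skip_guidance_layers=[7, 8, 9]

Note: the validation settings are not necessarily the same as the training settings.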
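
The validation settings above map onto the keyword arguments used in the inference example. A minimal sketch of that mapping as a plain dictionary follows. The names `VALIDATION_KWARGS` and `SEED` are assumptions for illustration, and CFG rescale and the sampler are left out because the example call does not pass them explicitly.

```python
# Hypothetical helper: collect the validation settings in one place so they can be
# unpacked into the pipeline call, e.g. pipeline(prompt=..., **VALIDATION_KWARGS).
VALIDATION_KWARGS = {
    "num_inference_steps": 30,          # Steps
    "guidance_scale": 3.2,              # CFG
    "width": 1024,                      # Resolution (width)
    "height": 1024,                     # Resolution (height)
    "skip_guidance_layers": [7, 8, 9],  # Skip-layer guidance
}

# The seed is not a pipeline keyword; the example passes it through
# torch.Generator(...).manual_seed(SEED).
SEED = 42

for name, value in VALIDATION_KWARGS.items():
    print(f"{name} = {value}")
```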

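Training settings

Property	Details
Training epochs	3
Training steps	220250
Learning rate	5e-06
Learning rate schedule	Cosine
Warmup steps	500000
Max grad value	0.01
Effective batch size	6
Micro-batch size	6
Gradient accumulation steps	1
Number of GPUs	1
Gradient checkpointing	Enabled
Prediction type	flow-matching (extra parameters=['shift=3'])
Optimizer	optimi-lion
Trainable parameter precision	Pure BF16
Base model precision	`no_change`
Caption dropout probability	10.0%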
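
The batch-size and warmup figures in the table can be cross-checked with a little arithmetic. The sketch below assumes that the effective batch size is micro-batch size x gradient accumulation steps x GPU count, and that warmup ramps the learning rate linearly from zero. Neither assumption is stated in the table, and the variable names are illustrative.

```python
# Values copied from the training settings table.
micro_batch_size = 6
grad_accum_steps = 1
num_gpus = 1
train_steps = 220_250
warmup_steps = 500_000
peak_lr = 5e-06

# Assumption: effective batch = micro-batch x accumulation steps x GPUs.
effective_batch = micro_batch_size * grad_accum_steps * num_gpus

# Samples processed over the whole run (not unique images: epochs revisit the data).
samples_seen = train_steps * effective_batch

# Assumption: linear warmup from 0 to peak_lr over warmup_steps.
warmup_fraction = min(1.0, train_steps / warmup_steps)
lr_at_end = peak_lr * warmup_fraction

print(effective_batch)           # 6, matching the table
print(samples_seen)              # 1321500
print(f"{warmup_fraction:.4f}")  # share of the warmup completed at the last step
print(f"{lr_at_end:.4e}")        # learning rate at the last step under linear warmup
```

Under these assumptions the run ends at roughly 44% of the configured warmup, so the learning rate would still be below its 5e-06 peak when training stopped.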

LyCORIS config

{
    "algo": "lokr",
    "multiplier": 1.0,
    "full_matrix": true,
    "linear_alpha": 1,
    "factor": 16,
    "apply_preset": {
        "target_module": [
            "Attention"
        ],
        "module_algo_map": {
            "Attention": {
                "factor": 6
            }
        }
    }
}
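
In the configuration above, `factor` appears twice: 16 at the top level and 6 under `module_algo_map` for `Attention`. A minimal sketch of how such a per-module override could be resolved is given below. `effective_setting` is a hypothetical helper, not part of the LyCORIS API, and it assumes that entries in `module_algo_map` take precedence over the top-level values.

```python
import json

# The LyCORIS config from this page, embedded as a string for a self-contained example.
CONFIG_JSON = """
{
    "algo": "lokr",
    "multiplier": 1.0,
    "full_matrix": true,
    "linear_alpha": 1,
    "factor": 16,
    "apply_preset": {
        "target_module": ["Attention"],
        "module_algo_map": {"Attention": {"factor": 6}}
    }
}
"""


def effective_setting(config: dict, module_class: str, key: str):
    """Return the value of `key` for a module class, preferring per-module overrides."""
    overrides = config.get("apply_preset", {}).get("module_algo_map", {})
    module_overrides = overrides.get(module_class, {})
    # Assumption: a per-module entry wins over the top-level default.
    return module_overrides.get(key, config.get(key))


config = json.loads(CONFIG_JSON)
print(effective_setting(config, "Attention", "factor"))    # 6 (per-module override)
print(effective_setting(config, "Attention", "algo"))      # lokr (top-level default)
print(effective_setting(config, "FeedForward", "factor"))  # 16 (no override defined)
```

Since `target_module` lists only `Attention`, the top-level `factor` of 16 would matter only for module classes that are targeted but have no entry of their own; `FeedForward` here is just a placeholder class name.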