simpletuner-lora open-source model: free for text-to-image and image-to-image generation

Simpletuner Lora

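Developed by ShanZard

A LyCORIS adapter based on stabilityai/stable-diffusion-3.5-large, aimed at text-to-image and image-to-image tasks.

Image generation | Open-source license: Other | #SD3 LyCORIS adapter #high-resolution text-to-image #flow-matching optimization

Downloads: 67

Released: 4/15/2025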

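Model overview

This is a LyCORIS adapter for Stable Diffusion 3.5 Large, intended mainly for text-to-image and image-to-image generation. It is a LoRA-style fine-tune, and its main validation prompt during training was "A photo-realistic image of a cat".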

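Model features

LyCORIS adapter

Fine-tunes the base model efficiently with LyCORIS, preserving the base model's performance while reducing compute requirements.

LoRA support

Works as a LoRA-style adapter that can be adapted quickly to a specific style or subject.

High-quality image generation

Built on Stable Diffusion 3.5 Large and able to generate photo-realistic images at high resolution (1024x1024).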

Model capabilities

Text-to-image generation

Image-to-image translation

Stylized image generation

High-resolution image generation

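Use cases

Creative content generation

Photo-realistic animal images

Generate photo-realistic animal photos from a text description, such as the cat image in the example.

Produces photo-realistic images at 1024x1024 resolution.

Art creation

Stylized image creation

Use the LoRA adapter to generate images in a specific artistic style quickly.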

🚀 simpletuner-lora

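simpletuner-lora is a LyCORIS adapter based on stabilityai/stable-diffusion-3.5-large, usable for image generation tasks such as text-to-image.

🚀 Quick start

This project is a LyCORIS adapter derived from stabilityai/stable-diffusion-3.5-large.

The main validation prompt used during training was: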

A photo-realistic image of a cat

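✨ Key features

Derived from the Stable Diffusion model stabilityai/stable-diffusion-3.5-large; usable for text-to-image, image-to-image, and similar tasks.
Documents the training and validation settings in detail, which makes the run easier to reproduce and adjust.
Includes example inference code for getting started quickly.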

📚 Detailed documentation

Validation settings

CFG: 3.0
CFG Rescale: 0.0
Steps: 20
Sampler: FlowMatchEulerDiscreteScheduler
Seed: 42
Resolution: 1024x1024
Skip-layer guidance:

Note: the validation settings are not necessarily the same as the training settings.
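
As a rough sketch, the validation settings listed above can be collected into keyword arguments for a diffusers pipeline call. The helper name `validation_kwargs` is hypothetical; the keyword names are the ones used in the inference example later in this document, and the seed is kept separate because it is passed through a `torch.Generator` rather than as a plain keyword.

```python
# Hypothetical helper: gathers the validation settings into pipeline keyword arguments.
def validation_kwargs(prompt: str) -> dict:
    return {
        "prompt": prompt,
        "num_inference_steps": 20,  # Steps
        "guidance_scale": 3.0,      # CFG
        "width": 1024,              # Resolution (width)
        "height": 1024,             # Resolution (height)
    }


SEED = 42  # intended for torch.Generator(...).manual_seed(SEED)

kwargs = validation_kwargs("A photo-realistic image of a cat")
print(kwargs)
```

These values reproduce the validation run only; other prompts may work better with different settings.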

The text encoder was not trained; at inference time you can reuse the base model's text encoder.

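Training settings

Training epochs: 1
Training steps: 10000
Learning rate: 0.0001
- Learning rate schedule: polynomial
- Warmup steps: 100
Max grad value: 2.0
Effective batch size: 2
- Micro-batch size: 1
- Gradient accumulation steps: 1
- Number of GPUs: 2
Gradient checkpointing: enabled
Prediction type: flow-matching (extra parameters=['shift=3'])
Optimizer: adamw_bf16
Trainable parameter precision: pure BF16
Base model precision: no_change
Caption dropout probability: 10.0%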
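
Two of the settings above reduce to simple arithmetic, sketched here. The effective batch size is the product of micro-batch size, gradient accumulation steps, and GPU count. The `shifted_sigma` function uses the static time-shift formula found in diffusers' `FlowMatchEulerDiscreteScheduler`; that the trainer applies `shift=3` in exactly this form is an assumption. The function names are illustrative, and the image count of ~14102 is the approximate figure from the dataset section.

```python
def effective_batch_size(micro_batch: int, grad_accum_steps: int, num_gpus: int) -> int:
    """Number of samples that contribute to one optimizer step."""
    return micro_batch * grad_accum_steps * num_gpus


def shifted_sigma(sigma: float, shift: float) -> float:
    """Time-shift a noise level in [0, 1]; shift > 1 pushes it toward 1 (more noise)."""
    return shift * sigma / (1.0 + (shift - 1.0) * sigma)


batch = effective_batch_size(micro_batch=1, grad_accum_steps=1, num_gpus=2)
samples_seen = 10_000 * batch      # training steps x effective batch size
passes = samples_seen / 14_102     # approximate passes over ~14102 images

print(batch, samples_seen, round(passes, 2))  # 2 20000 1.42
print(shifted_sigma(0.5, shift=3.0))          # 0.75
```

With `shift=3`, a mid-schedule noise level of 0.5 maps to 0.75, so more of the schedule is spent at high noise.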

LyCORIS config

{
    "algo": "lokr",
    "multiplier": 1.0,
    "full_matrix": true,
    "linear_alpha": 1,
    "factor": 16,
    "apply_preset": {
        "target_module": [
            "Attention",
            "FeedForward"
        ],
        "module_algo_map": {
            "Attention": {
                "factor": 16
            },
            "FeedForward": {
                "factor": 8
            }
        }
    }
}
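
To give a feel for what `"algo": "lokr"` and `"factor"` imply, here is a minimal pure-Python sketch of a Kronecker-product weight update and its parameter count. It assumes that with `full_matrix` set, both Kronecker factors are stored as full matrices. `split_dim` is a simplified stand-in for LyCORIS's own factorization logic, and the 2432 layer width is an illustrative value, not one read from the model.

```python
def split_dim(dim: int, factor: int) -> tuple[int, int]:
    """Split dim into (small, large) with small * large == dim and small <= factor."""
    small = max(d for d in range(1, factor + 1) if dim % d == 0)
    return small, dim // small


def lokr_param_count(out_dim: int, in_dim: int, factor: int) -> int:
    """Parameters held by two full Kronecker factors W1 (a x c) and W2 (b x d)."""
    a, b = split_dim(out_dim, factor)
    c, d = split_dim(in_dim, factor)
    return a * c + b * d


def kron(A, B):
    """Kronecker product of two matrices given as lists of rows."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]


# Illustrative square layer of width 2432 (assumed value).
print(lokr_param_count(2432, 2432, factor=16))  # 23360
print(2432 * 2432)                              # 5914624 weights in the full matrix
print(kron([[1, 2], [3, 4]], [[0, 1], [1, 0]]))
```

The update applied to a layer is the Kronecker product of the two small factors, which is why the adapter file stays small relative to the base model.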

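Datasets

pseudo-camera-10k-sd3

Repeats: 0
Total number of images: ~14102
Total number of aspect buckets: 1
Resolution: 1.048576 megapixels
Cropped: yes
Crop style: center
Crop aspect: square
Used for regularisation data: no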
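
The crop settings above (center style, square aspect) and the megapixel figure can be illustrated with a short sketch. The function names are hypothetical and this is not the trainer's own preprocessing code; the box uses the (left, top, right, bottom) order that Pillow's `Image.crop` expects.

```python
def center_square_crop_box(width: int, height: int) -> tuple[int, int, int, int]:
    """Return (left, top, right, bottom) of the largest centered square."""
    side = min(width, height)
    left = (width - side) // 2
    top = (height - side) // 2
    return left, top, left + side, top + side


def megapixels(width: int, height: int) -> float:
    """Pixel area expressed in millions of pixels."""
    return width * height / 1_000_000


print(center_square_crop_box(1920, 1080))  # (420, 0, 1500, 1080)
print(megapixels(1024, 1024))              # 1.048576
```

A 1024x1024 image has exactly 1,048,576 pixels, which is where the 1.048576 megapixel resolution comes from.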

💻 Usage examples

Basic usage

import torch
from diffusers import DiffusionPipeline
from lycoris import create_lycoris_from_weights


def download_adapter(repo_id: str) -> str:
    """Download the adapter weights from the Hugging Face Hub and return the local file path."""
    import os
    from huggingface_hub import hf_hub_download
    adapter_filename = "pytorch_lora_weights.safetensors"
    # Use HF_PATH if it is set, otherwise fall back to the default Hugging Face cache location.
    cache_dir = os.environ.get('HF_PATH', os.path.expanduser('~/.cache/huggingface/hub/models'))
    # Turn the repo id into a directory name that is safe on any filesystem.
    cleaned_adapter_path = repo_id.replace("/", "_").replace("\\", "_").replace(":", "_")
    path_to_adapter = os.path.join(cache_dir, cleaned_adapter_path)
    path_to_adapter_file = os.path.join(path_to_adapter, adapter_filename)
    os.makedirs(path_to_adapter, exist_ok=True)
    hf_hub_download(
        repo_id=repo_id, filename=adapter_filename, local_dir=path_to_adapter
    )

    return path_to_adapter_file
    
model_id = 'stabilityai/stable-diffusion-3.5-large'
adapter_repo_id = 'ShanZard/simpletuner-lora'
adapter_file_path = download_adapter(repo_id=adapter_repo_id)
pipeline = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.bfloat16) # loading directly in bf16
lora_scale = 1.0
wrapper, _ = create_lycoris_from_weights(lora_scale, adapter_file_path, pipeline.transformer)
wrapper.merge_to()

prompt = "A photo-realistic image of a cat"
negative_prompt = 'blurry, cropped, ugly'

## Optional: quantise the model to save on vram.
## Note: The model was not quantised during training, so it is not necessary to quantise it during inference time.
#from optimum.quanto import quantize, freeze, qint8
#quantize(pipeline.transformer, weights=qint8)
#freeze(pipeline.transformer)
    
pipeline.to('cuda' if torch.cuda.is_available() else 'mps' if torch.backends.mps.is_available() else 'cpu') # the pipeline is already in its target precision level
model_output = pipeline(
    prompt=prompt,
    negative_prompt=negative_prompt,
    num_inference_steps=20,
    generator=torch.Generator(device='cuda' if torch.cuda.is_available() else 'mps' if torch.backends.mps.is_available() else 'cpu').manual_seed(42),
    width=1024,
    height=1024,
    guidance_scale=3.0,
).images[0]

model_output.save("output.png", format="PNG")
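
The example above repeats the same device-selection expression twice. A small helper, sketched below, keeps the CUDA, then MPS, then CPU preference order in one place. `pick_device` is a hypothetical name; it takes plain booleans so that it can be exercised without a GPU.

```python
def pick_device(cuda_available: bool, mps_available: bool) -> str:
    """Prefer CUDA, then Apple MPS, then fall back to CPU."""
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"


print(pick_device(cuda_available=False, mps_available=True))  # mps
```

In the script above it could be called as `pick_device(torch.cuda.is_available(), torch.backends.mps.is_available())` and the result reused for both `pipeline.to(...)` and `torch.Generator(device=...)`.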

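Advanced usage

Building on the basic usage, you can adjust the following parameters as needed:

prompt and negative_prompt: the text describing what the image should and should not contain.
num_inference_steps: the number of inference steps, which trades generation speed against image quality.
guidance_scale: the guidance scale, which controls how closely the image follows the prompt.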

# Example: change the prompts and the number of inference steps
prompt = "A beautiful landscape with a lake and mountains"
negative_prompt = 'low quality, blurry'
num_inference_steps = 30

model_output = pipeline(
    prompt=prompt,
    negative_prompt=negative_prompt,
    num_inference_steps=num_inference_steps,
    generator=torch.Generator(device='cuda' if torch.cuda.is_available() else 'mps' if torch.backends.mps.is_available() else 'cpu').manual_seed(42),
    width=1024,
    height=1024,
    guidance_scale=3.0,
).images[0]

model_output.save("output_advanced.png", format="PNG")
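
To compare several settings side by side, one option is to build a small grid of `num_inference_steps` and `guidance_scale` values and pass each combination to the pipeline. The sketch below only builds the grid; `sweep_settings` is a hypothetical helper, the candidate values are arbitrary, and the commented loop assumes the `pipeline`, `prompt`, and `negative_prompt` objects from the examples above.

```python
from itertools import product


def sweep_settings(steps_options, guidance_options):
    """Return one keyword-argument dict per (steps, guidance) combination."""
    return [
        {"num_inference_steps": steps, "guidance_scale": guidance}
        for steps, guidance in product(steps_options, guidance_options)
    ]


settings = sweep_settings([20, 30], [2.0, 3.0, 4.0])
for extra in settings:
    print(extra)

# With the pipeline from the basic usage loaded, each entry could be used as:
# for i, extra in enumerate(settings):
#     image = pipeline(prompt=prompt, negative_prompt=negative_prompt,
#                      width=1024, height=1024, **extra).images[0]
#     image.save(f"sweep_{i}.png", format="PNG")
```

Keeping the seed fixed across the grid makes the differences between images attributable to the settings rather than to the initial noise.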