Open-source image generation model simpletuner-sdxl-lora-test - Efficiently complete various image generation tasks

Simpletuner Sdxl Lora Test

Developed by bghira

LyCORIS adapter based on Stable Diffusion XL 1.0, focused on image generation tasks

Image Generation | Open Source | License: Openrail | #SDXL-LyCORIS Adapter #Text-to-Image Optimization #High-Resolution Image Generation

Downloads 108

Release Time: 4/15/2025

Model Overview

This is a LyCORIS adapter for Stable Diffusion XL 1.0, intended for text-to-image and image-to-image tasks. It was validated during training with a photo-realistic cat prompt.

Model Features

LyCORIS Adapter

Fine-tuned with a LyCORIS (LoKr) adapter, so the base model weights stay unchanged during training and the style-specific changes live in the adapter

High-Quality Image Generation

Supports 1024x1024 image generation; validation used a photo-realistic cat prompt at this resolution

Efficient Training Configuration

Trained with the bnb-lion8bit optimizer and pure BF16 trainable parameters to reduce memory use during fine-tuning

Model Capabilities

Text-to-Image

Image-to-Image

High-Resolution Image Generation

Specific Style Image Generation

Use Cases

Creative Image Generation

Realistic Animal Photo Generation

Generate high-quality realistic cat photos

The validation prompt used during training was a photo-realistic image of a cat

Artistic Creation

Generate creative artworks based on text prompts

🚀 simpletuner-sdxl-lora-test

This project is a LyCORIS adapter derived from stabilityai/stable-diffusion-xl-base-1.0. It aims to generate photo-realistic images, and validation during training focused on cat images.

🚀 Quick Start

The main validation prompt used during training was:

A photo-realistic image of a cat

✨ Features

Specific Validation Settings: Defined CFG, CFG Rescale, Steps, Sampler, Seed, and Resolution for validation.
Training Details: Lists training epochs, steps, learning rate, and other training-related parameters.
Dataset Information: Provides details about the datasets used for training.
Inference Code: Offers a complete Python code example for inference.

📦 Installation

No specific installation steps are provided in the original README.

💻 Usage Examples

Basic Usage

import torch
from diffusers import DiffusionPipeline
from lycoris import create_lycoris_from_weights


def download_adapter(repo_id: str):
    import os
    from huggingface_hub import hf_hub_download
    adapter_filename = "pytorch_lora_weights.safetensors"
    cache_dir = os.environ.get('HF_PATH', os.path.expanduser('~/.cache/huggingface/hub/models'))
    cleaned_adapter_path = repo_id.replace("/", "_").replace("\\", "_").replace(":", "_")
    path_to_adapter = os.path.join(cache_dir, cleaned_adapter_path)
    path_to_adapter_file = os.path.join(path_to_adapter, adapter_filename)
    os.makedirs(path_to_adapter, exist_ok=True)
    hf_hub_download(
        repo_id=repo_id, filename=adapter_filename, local_dir=path_to_adapter
    )

    return path_to_adapter_file
    
model_id = 'stabilityai/stable-diffusion-xl-base-1.0'
adapter_repo_id = 'bghira/simpletuner-sdxl-lora-test'
adapter_filename = 'pytorch_lora_weights.safetensors'
adapter_file_path = download_adapter(repo_id=adapter_repo_id)
pipeline = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.bfloat16) # loading directly in bf16
lora_scale = 1.0
wrapper, _ = create_lycoris_from_weights(lora_scale, adapter_file_path, pipeline.unet)
wrapper.merge_to()

prompt = "A photo-realistic image of a cat"
negative_prompt = 'blurry, cropped, ugly'

## Optional: quantise the model to save on vram.
## Note: The model was not quantised during training, so it is not necessary to quantise it during inference time.
#from optimum.quanto import quantize, freeze, qint8
#quantize(pipeline.unet, weights=qint8)
#freeze(pipeline.unet)
    
# Pick the best available device once and reuse it for the pipeline and the generator.
device = 'cuda' if torch.cuda.is_available() else 'mps' if torch.backends.mps.is_available() else 'cpu'
pipeline.to(device) # the pipeline is already in its target precision level
model_output = pipeline(
    prompt=prompt,
    negative_prompt=negative_prompt,
    num_inference_steps=20,
    generator=torch.Generator(device=device).manual_seed(42),
    width=1024,
    height=1024,
    guidance_scale=4.2,
    guidance_rescale=0.0,
).images[0]

model_output.save("output.png", format="PNG")
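
To make the caching behaviour of download_adapter easier to follow, here is a minimal sketch that reproduces only its path-building step, without any network access. The helper name adapter_cache_path and the cache directory /tmp/hf-cache are hypothetical and used purely for illustration; the replacement rules and the default file name are taken from the example above.

```python
import os


def adapter_cache_path(repo_id: str, cache_dir: str,
                       filename: str = "pytorch_lora_weights.safetensors") -> str:
    """Mirror the path logic of download_adapter (hypothetical helper, no download)."""
    # Replace characters that are unsafe in a directory name, as the example above does.
    cleaned = repo_id.replace("/", "_").replace("\\", "_").replace(":", "_")
    return os.path.join(cache_dir, cleaned, filename)


# "/tmp/hf-cache" is an arbitrary example directory, not a path from the README.
example_path = adapter_cache_path("bghira/simpletuner-sdxl-lora-test", "/tmp/hf-cache")
print(example_path)
```

In the original function the cache directory comes from the HF_PATH environment variable when it is set, and falls back to ~/.cache/huggingface/hub/models otherwise.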

📚 Documentation

Validation settings

CFG: 4.2
CFG Rescale: 0.0
Steps: 20
Sampler: ddim
Seed: 42
Resolution: 1024x1024

Note: The validation settings are not necessarily the same as the training settings.
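
As a rough illustration of how these settings line up with the inference example, the sketch below translates them into the keyword names that example passes to the pipeline. The dictionary keys and the helper to_pipeline_kwargs are hypothetical names chosen for this sketch; only the values and the pipeline keyword names come from this page.

```python
# Validation settings copied from the list above (key names are illustrative).
validation_settings = {
    "cfg": 4.2,
    "cfg_rescale": 0.0,
    "steps": 20,
    "seed": 42,
    "resolution": "1024x1024",
}


def to_pipeline_kwargs(settings: dict) -> dict:
    """Translate the settings into the keyword names used in the inference example."""
    width, height = (int(side) for side in settings["resolution"].split("x"))
    return {
        "num_inference_steps": settings["steps"],
        "guidance_scale": settings["cfg"],
        "guidance_rescale": settings["cfg_rescale"],
        "width": width,
        "height": height,
    }


pipeline_kwargs = to_pipeline_kwargs(validation_settings)
print(pipeline_kwargs)
```

The seed is not a pipeline keyword; the inference example supplies it through torch.Generator(...).manual_seed(42). The inference example also does not set a sampler, so it runs with the pipeline's default scheduler. Attaching a DDIM scheduler, for instance with diffusers' DDIMScheduler.from_config(pipeline.scheduler.config), would be one way to match the listed sampler, but that step is an assumption and not part of the original README.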

You can find some example images in the following gallery:

The text encoder was not trained. You may reuse the base model text encoder for inference.

Training settings

Training epochs: 1
Training steps: 390
Learning rate: 3e-07
- Learning rate schedule: constant
- Warmup steps: 100
Max grad value: 2.0
Effective batch size: 3
- Micro-batch size: 1
- Gradient accumulation steps: 1
- Number of GPUs: 3
Gradient checkpointing: True
Prediction type: epsilon (extra parameters=['training_scheduler_timestep_spacing=trailing', 'inference_scheduler_timestep_spacing=trailing'])
Optimizer: bnb-lion8bit
Trainable parameter precision: Pure BF16
Base model precision: no_change
Caption dropout probability: 0.1%
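
The batch-size figures above are related by a simple product, which the following sketch spells out. The function name effective_batch_size is hypothetical, and the sample count is only an upper bound that assumes every one of the 390 steps processed a full batch.

```python
def effective_batch_size(micro_batch_size: int, grad_accum_steps: int, num_gpus: int) -> int:
    """Samples contributing to one optimizer step across all devices."""
    return micro_batch_size * grad_accum_steps * num_gpus


# Values from the training settings: micro-batch 1, accumulation 1, 3 GPUs.
batch = effective_batch_size(micro_batch_size=1, grad_accum_steps=1, num_gpus=3)

# Upper bound on samples processed, assuming all 390 steps used a full batch.
max_samples_seen = batch * 390

print(batch, max_samples_seen)
```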

LyCORIS Config:

{
    "bypass_mode": true,
    "algo": "lokr",
    "multiplier": 1.0,
    "linear_dim": 10000,
    "linear_alpha": 1,
    "factor": 12,
    "apply_preset": {
        "target_module": [
            "Attention",
            "FeedForward"
        ],
        "module_algo_map": {
            "Attention": {
                "factor": 12
            },
            "FeedForward": {
                "factor": 6
            }
        }
    }
}
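
To give a feel for what the factor values mean, here is a simplified sketch of a Kronecker-style (LoKr) split of a weight matrix. It is not the LyCORIS implementation: split_dimension is a hypothetical stand-in that picks the largest divisor not exceeding the factor, the 1280x1280 layer shape is an illustrative assumption, and the sketch assumes that the very large linear_dim (10000) leaves the second Kronecker block as a full matrix, which the README does not state.

```python
def split_dimension(dimension: int, factor: int) -> tuple[int, int]:
    """Split `dimension` into (small, large) with small * large == dimension.

    Simplified stand-in: uses the largest divisor of `dimension` that is <= factor.
    """
    limit = min(factor, dimension)
    small = max(d for d in range(1, limit + 1) if dimension % d == 0)
    return small, dimension // small


def kronecker_param_count(out_features: int, in_features: int, factor: int) -> int:
    """Parameters in A and B when W (out x in) is modelled as kron(A, B)."""
    out_small, out_large = split_dimension(out_features, factor)
    in_small, in_large = split_dimension(in_features, factor)
    # A has shape (out_small, in_small); B has shape (out_large, in_large).
    return out_small * in_small + out_large * in_large


# Illustrative 1280x1280 layer (an assumed shape, not taken from the README).
full_params = 1280 * 1280
attention_params = kronecker_param_count(1280, 1280, factor=12)    # "Attention" factor
feedforward_params = kronecker_param_count(1280, 1280, factor=6)   # "FeedForward" factor

print(full_params, attention_params, feedforward_params)
```

Under these assumptions, a larger factor gives a larger first block and a much smaller second block, so the Attention setting (factor 12) yields fewer trainable parameters per layer than the FeedForward setting (factor 6).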

Datasets

signs-discovery

Repeats: 0
Total number of images: ~423
Total number of aspect buckets: 5
Resolution: 1.048576 megapixels
Cropped: False
Crop style: None
Crop aspect: None
Used for regularisation data: No

signs-discovery-512

Repeats: 0
Total number of images: ~420
Total number of aspect buckets: 4
Resolution: 0.262144 megapixels
Cropped: False
Crop style: None
Crop aspect: None
Used for regularisation data: No
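
The resolution values are pixel areas expressed in megapixels. As a quick check, the sketch below converts them to the side length of a square image with the same area. The helper name square_side_from_megapixels is hypothetical, and since the images are sorted into several aspect buckets, the square size is only an area equivalent and not the shape of every training image.

```python
import math


def square_side_from_megapixels(megapixels: float) -> int:
    """Side length (in pixels) of a square whose area equals the given megapixels."""
    pixel_area = round(megapixels * 1_000_000)
    return math.isqrt(pixel_area)


side_full = square_side_from_megapixels(1.048576)   # signs-discovery
side_small = square_side_from_megapixels(0.262144)  # signs-discovery-512

print(side_full, side_small)
```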

🔧 Technical Details

The project uses a LyCORIS adapter based on the Stable Diffusion XL base model. It has specific training and validation settings, and the text encoder of the base model is reused for inference. The training process involves multiple parameters such as learning rate, batch size, and gradient checkpointing.

📄 License

The project is licensed under the creativeml-openrail-m license.
