SDXL Open-source AI Model - Freely Deploy and Easily Generate Realistic Cat Photos

Sdxl

Developed by ControlNetLoRA

A ControlNet PEFT LoHa model based on stabilityai/stable-diffusion-xl-base-1.0, mainly used for generating realistic cat photos.

Image Generation · Open Source · License: OpenRAIL · #ControlNet Fine-tuning · #LoHa Adapter · #High-resolution Image Generation

Downloads 314

Release Time: 4/15/2025

Model Overview

This image generation model is a ControlNet PEFT LoHa adapter that produces high-quality realistic images from text prompts and is particularly good at cat photos.

Model Features

ControlNet PEFT LoHa Technology

A ControlNet PEFT LoHa adapter derived from the stable-diffusion-xl-base-1.0 model. Only the adapter was trained; the text encoder was left untrained, so the base model's text encoder can be reused at inference.

Realistic Image Generation

Generates high-quality realistic images and is especially suited to cat photos.

Detailed Training Parameter Settings

The full training configuration is documented under Training settings, including learning rate, optimizer, batch size, precision, and LoRA rank.

Model Capabilities

Text-to-Image Generation

Realistic Image Generation

Image Style Conversion

Use Cases

Image Generation

Generate Realistic Cat Photos

Generate high-quality realistic cat photos according to text prompts.

Validation images were generated at 1024x1024 in a photo-realistic style.

🚀 simpletuner-controlnet-sdxl-lora-test

This project presents a ControlNet PEFT LoHa derived from stabilityai/stable-diffusion-xl-base-1.0, which can generate photo-realistic images, such as images of cats.

🚀 Quick Start

This is a ControlNet PEFT LoHa derived from stabilityai/stable-diffusion-xl-base-1.0. The main validation prompt used during training was:

A photo-realistic image of a cat

✨ Features

The text encoder was not trained. You may reuse the base model text encoder for inference.
It can generate photo-realistic images from the given prompts.

📦 Installation

The original model card lists no installation steps. The usage example imports torch and diffusers, and its optional quantisation step additionally uses optimum.quanto.
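
As a quick pre-flight check before running the usage example, the sketch below reports which Python modules are not importable in the current environment. The module lists are assumptions: torch and diffusers come from the usage example's imports, optimum.quanto from its optional quantisation step, and peft is included on the assumption that `load_lora_weights` needs it in recent diffusers releases. The helper name `missing_modules` is hypothetical.

```python
import importlib.util

# Assumed requirements, inferred from the usage example rather than stated by the model card.
REQUIRED_MODULES = ["torch", "diffusers", "peft"]
OPTIONAL_MODULES = ["optimum.quanto"]  # only needed for the optional quantisation step


def missing_modules(names):
    """Return the subset of `names` that cannot be located for import."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # Raised when a parent package (e.g. `optimum`) is itself missing.
            found = False
        if not found:
            missing.append(name)
    return missing


print("missing required:", missing_modules(REQUIRED_MODULES))
print("missing optional:", missing_modules(OPTIONAL_MODULES))
```

Any module reported as missing would need to be installed with your usual package manager before the usage example can run.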

💻 Usage Examples

Basic Usage

import torch
from diffusers import DiffusionPipeline

model_id = 'stabilityai/stable-diffusion-xl-base-1.0'
adapter_id = 'bghira/simpletuner-controlnet-sdxl-lora-test'
pipeline = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.bfloat16) # loading directly in bf16
pipeline.load_lora_weights(adapter_id)

prompt = "A photo-realistic image of a cat"
negative_prompt = 'blurry, cropped, ugly'

# Optional: quantise the model to save on VRAM.
# Note: the model was not quantised during training, so quantising at inference time is not required.
# from optimum.quanto import quantize, freeze, qint8
# quantize(pipeline.unet, weights=qint8)
# freeze(pipeline.unet)

# Pick the best available device once and reuse it for the pipeline and the generator.
device = 'cuda' if torch.cuda.is_available() else 'mps' if torch.backends.mps.is_available() else 'cpu'
pipeline.to(device)  # the pipeline is already in its target precision level
model_output = pipeline(
    prompt=prompt,
    negative_prompt=negative_prompt,
    num_inference_steps=20,
    generator=torch.Generator(device=device).manual_seed(42),
    width=1024,
    height=1024,
    guidance_scale=4.2,
    guidance_rescale=0.0,
).images[0]

model_output.save("output.png", format="PNG")

📚 Documentation

Validation settings

CFG: 4.2
CFG Rescale: 0.0
Steps: 20
Sampler: ddim
Seed: 42
Resolution: 1024x1024
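
To make the mapping from these settings to a pipeline call explicit, here is a minimal sketch. `VALIDATION_SETTINGS`, `validation_call_kwargs`, and `use_ddim_sampler` are hypothetical names introduced for illustration. The scheduler swap is an assumption: the Basic Usage example keeps whatever scheduler the base pipeline ships with, so matching the listed `ddim` sampler (with the `trailing` timestep spacing noted in the training settings) would need a step like this.

```python
# The validation settings listed above, as plain data.
VALIDATION_SETTINGS = {
    "cfg": 4.2,
    "cfg_rescale": 0.0,
    "steps": 20,
    "sampler": "ddim",
    "seed": 42,
    "resolution": "1024x1024",
}


def validation_call_kwargs(settings=VALIDATION_SETTINGS):
    """Translate the settings into keyword arguments for the pipeline call."""
    width, height = (int(side) for side in settings["resolution"].split("x"))
    return {
        "num_inference_steps": settings["steps"],
        "guidance_scale": settings["cfg"],
        "guidance_rescale": settings["cfg_rescale"],
        "width": width,
        "height": height,
    }


def use_ddim_sampler(pipeline):
    """Swap the pipeline's scheduler for DDIM (assumption: this matches 'Sampler: ddim')."""
    from diffusers import DDIMScheduler  # imported lazily so the helpers above work without diffusers

    pipeline.scheduler = DDIMScheduler.from_config(
        pipeline.scheduler.config, timestep_spacing="trailing"
    )
    return pipeline


print(validation_call_kwargs())
```

The seed is not part of the call arguments here because the usage example passes it through a `torch.Generator` instead.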

Note: The validation settings are not necessarily the same as the training settings. The example image gallery from the original model card is not reproduced here.

Training settings

Property	Details
Training epochs	4
Training steps	100
Learning rate	0.0001
Learning rate schedule	constant
Warmup steps	0
Max grad value	2.0
Effective batch size	1
Micro-batch size	1
Gradient accumulation steps	1
Number of GPUs	1
Gradient checkpointing	True
Prediction type	epsilon (extra parameters=['training_scheduler_timestep_spacing=trailing', 'inference_scheduler_timestep_spacing=trailing'])
Optimizer	bnb-lion8bit
Trainable parameter precision	Pure BF16
Base model precision	`no_change`
Caption dropout probability	0.1%
LoRA Rank	128
LoRA Alpha	128.0
LoRA Dropout	0.1
LoRA initialisation style	default
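
Two of the values in this table are derived quantities, and a short sketch shows how they relate to the others. It assumes the common conventions that the effective batch size is the product of micro-batch size, gradient accumulation steps, and GPU count, and that the adapter update is scaled by alpha divided by rank; neither formula is stated in the model card, and the function names are hypothetical.

```python
def effective_batch_size(micro_batch_size: int, grad_accum_steps: int, num_gpus: int) -> int:
    """Samples contributing to each optimizer step (assumed convention)."""
    return micro_batch_size * grad_accum_steps * num_gpus


def lora_scale(alpha: float, rank: int) -> float:
    """Scaling applied to the low-rank update (assumed alpha / rank convention)."""
    return alpha / rank


# Values taken from the training settings table.
print(effective_batch_size(micro_batch_size=1, grad_accum_steps=1, num_gpus=1))
print(lora_scale(alpha=128.0, rank=128))
```

Under these assumptions, an alpha equal to the rank means the adapter update is applied without additional scaling.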

Datasets

antelope-data

Property	Details
Repeats	0
Total number of images	24
Total number of aspect buckets	1
Resolution	1.048576 megapixels
Cropped	True
Crop style	center
Crop aspect	square
Used for regularisation data	No
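
The dataset figures can be cross-checked against the training settings with a little arithmetic. The sketch below assumes that a square crop at 1.048576 megapixels corresponds to 1024x1024 pixels and that a repeat count of 0 means each image is seen once per epoch; both are interpretations, not statements from the model card, and the function names are hypothetical.

```python
import math


def resolution_megapixels(width: int, height: int) -> float:
    """Pixel count expressed in megapixels (1 MP = 1,000,000 pixels)."""
    return width * height / 1_000_000


def completed_epochs(total_steps: int, num_images: int, batch_size: int, repeats: int = 0) -> int:
    """Whole passes over the dataset finished within `total_steps` optimizer steps."""
    samples_per_epoch = num_images * (repeats + 1)  # assumed meaning of "repeats"
    steps_per_epoch = math.ceil(samples_per_epoch / batch_size)
    return total_steps // steps_per_epoch


# 1024x1024 square crop, as implied by the resolution and crop aspect rows.
print(resolution_megapixels(1024, 1024))

# 100 training steps over 24 images at an effective batch size of 1.
print(completed_epochs(total_steps=100, num_images=24, batch_size=1))
```

With 24 steps per epoch, 100 steps cover four full epochs plus part of a fifth, which is consistent with the 4 training epochs reported in the training settings.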

🔧 Technical Details

The model is a ControlNet PEFT LoHa derived from stabilityai/stable-diffusion-xl-base-1.0. It was trained for 100 steps (4 epochs) at an effective batch size of 1 on a single 24-image dataset, center-cropped to square at roughly 1 megapixel, and validated at 1024x1024 with 20 steps and CFG 4.2. The text encoder was not trained, so the base model text encoder can be reused for inference.

📄 License

The license is creativeml-openrail-m.
