Alternative Stable Diffusion XL Base 1.0 Models
This repository offers alternative or tuned versions of Stable Diffusion XL Base 1.0 in .safetensors format for text-to-image generation tasks.
Features
- Multiple Model Variants: Offers different versions of Stable Diffusion XL Base 1.0, including merged models with specific VAEs and inpainting capabilities.
- Inpainting Calculation Formula: Provides a formula to create an SDXL inpainting checkpoint from any SDXL checkpoint.
Available Models
sd_xl_base_1.0_fp16_vae.safetensors
This file contains the weights of sd_xl_base_1.0.safetensors, merged with the weights of sdxl_vae.safetensors from MadeByOllin's SDXL FP16 VAE repository.
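As a rough illustration of what such a merge could look like, the sketch below overwrites the VAE tensors in a checkpoint's state dict with those from a standalone VAE file. The function name `replace_vae` and the `first_stage_model.` key prefix are assumptions made for illustration (that prefix is commonly used for the VAE in single-file Stable Diffusion checkpoints); this is not necessarily how the file in this repository was produced.

```python
def replace_vae(checkpoint, vae, prefix="first_stage_model."):
    """Return a copy of `checkpoint` whose VAE tensors come from `vae`.

    `checkpoint` and `vae` are plain dicts mapping key names to tensors.
    `prefix` is an assumed key prefix for VAE weights inside the checkpoint.
    """
    merged = dict(checkpoint)  # shallow copy; the input dict is not modified
    for key, tensor in vae.items():
        target = prefix + key
        # Only overwrite keys that already exist in the checkpoint.
        if target in merged:
            merged[target] = tensor
    return merged
```

With real files, the two dicts could be read with `safetensors.torch.load_file` and the result written with `safetensors.torch.save_file`.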
sd_xl_base_1.0_inpainting_0.1.safetensors
This file contains the weights of sd_xl_base_1.0_fp16_vae.safetensors merged with the weights from diffusers/stable-diffusion-xl-1.0-inpainting-0.1.
Usage Examples
Creating an SDXL Inpainting Checkpoint
Using the .safetensors files here, you can calculate an inpainting model using the formula A + (B - C), where:
- A is sd_xl_base_1.0_inpainting_0.1.safetensors
- B is your fine-tuned checkpoint
- C is sd_xl_base_1.0_fp16_vae.safetensors
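The formula A + (B - C) can be sketched in Python as follows. This is a minimal illustration, not ENFUGUE's implementation: the function names `add_difference` and `merge_files` are hypothetical, and the sketch assumes all three files use the same key names. Inpainting UNets take extra input channels, so a tensor in A may differ in shape from the B - C delta; the sketch simply keeps A's tensor in that case, which is a simplifying assumption.

```python
def add_difference(a, b, c):
    """Return a new dict holding A + (B - C) for every key in A.

    Values may be torch tensors or plain numbers (anything supporting
    + and -). Keys missing from B or C are copied from A unchanged.
    """
    merged = {}
    for key, a_val in a.items():
        if key not in b or key not in c:
            merged[key] = a_val
            continue
        delta = b[key] - c[key]  # what the fine-tune changed relative to base
        if getattr(a_val, "shape", None) != getattr(delta, "shape", None):
            # Assumption: shape mismatches (e.g. extra inpainting input
            # channels) are left as in A rather than partially patched.
            merged[key] = a_val
        else:
            merged[key] = a_val + delta
    return merged


def merge_files(a_path, b_path, c_path, out_path):
    """Load three .safetensors files, apply A + (B - C), and save the result."""
    # Requires the `safetensors` and `torch` packages.
    from safetensors.torch import load_file, save_file

    a, b, c = load_file(a_path), load_file(b_path), load_file(c_path)
    save_file(add_difference(a, b, c), out_path)
```

A call might look like `merge_files("sd_xl_base_1.0_inpainting_0.1.safetensors", "my_finetune.safetensors", "sd_xl_base_1.0_fp16_vae.safetensors", "my_finetune_inpainting.safetensors")`, where `my_finetune.safetensors` is a placeholder name for your own checkpoint.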
This calculation can be performed in ENFUGUE's Web UI.

Important Note
You must specifically use the two files present in this repository for this to work. The Diffusers team trained XL Inpainting using FP16 XL VAE, so using a different XL base will result in an incorrect delta being applied to the inpainting checkpoint, and the resulting VAE will be nonsensical.
Documentation
Model Description
| Property | Details |
| --- | --- |
| Developed by | The Diffusers team |
| Repackaged by | Benjamin Paine |
| Model Type | Diffusion-based text-to-image generative model |
| License | CreativeML Open RAIL++-M License |
| Model Description | This is a model that can be used to generate and modify images based on text prompts. It is a Latent Diffusion Model that uses two fixed, pretrained text encoders (OpenCLIP-ViT/G and CLIP-ViT/L). |
Uses
Direct Use
The model is intended for research purposes only. Possible research areas and tasks include:
- Generation of artworks and use in design and other artistic processes.
- Applications in educational or creative tools.
- Research on generative models.
- Safe deployment of models which have the potential to generate harmful content.
- Probing and understanding the limitations and biases of generative models.
Out-of-Scope Use
The model was not trained to produce factual or true representations of people or events, so using it to generate such content is out of scope for its abilities.
Limitations and Bias
Limitations
- The model does not achieve perfect photorealism.
- The model cannot render legible text.
- The model struggles with more difficult tasks which involve compositionality, such as rendering an image corresponding to "A red cube on top of a blue sphere".
- Faces and people in general may not be generated properly.
- The autoencoding part of the model is lossy.
- When the strength parameter is set to 1 (i.e. starting inpainting from a fully masked image), the quality of the image is degraded. The model retains the non-masked contents of the image, but images look less sharp. We're investigating this and working on the next version.
Bias
While the capabilities of image generation models are impressive, they can also reinforce or exacerbate social biases.
License
The models in this repository are licensed under the CreativeML Open RAIL++-M License.