🚀 Sygil Diffusion Model
A fine-tuned Stable Diffusion model with multi-language support and namespace control for high-quality image generation.
🚀 Quick Start
Installation
Use the 🤗 Diffusers library to run Sygil Diffusion simply and efficiently.
pip install diffusers transformers accelerate scipy safetensors
Basic Usage
Running the pipeline (by default it uses the DDIM scheduler; in this example we swap it for DPMSolverMultistepScheduler):
import torch
from diffusers import StableDiffusionPipeline, DPMSolverMultistepScheduler

model_id = "Sygil/Sygil-Diffusion"

# Load the pipeline in half precision to reduce VRAM usage
pipe = StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
# Swap the default DDIM scheduler for DPMSolverMultistepScheduler
pipe.scheduler = DPMSolverMultistepScheduler.from_config(pipe.scheduler.config)
pipe = pipe.to("cuda")

prompt = "a beautiful illustration of a fantasy forest"
image = pipe(prompt).images[0]
image.save("fantasy_forest_illustration.png")
⚠️ Important Note
- Although it is not a dependency, we highly recommend installing xformers for memory-efficient attention (better performance).
- If you have low GPU RAM available, add pipe.enable_attention_slicing() after sending the pipeline to CUDA to reduce VRAM usage (at the cost of speed), as shown in the sketch below.
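A minimal sketch of both options together, assuming xformers is installed and reusing the pipe from the Quick Start example (both helpers are standard diffusers pipeline methods):
```python
pipe = pipe.to("cuda")
# Memory-efficient attention via xformers (requires `pip install xformers`)
pipe.enable_xformers_memory_efficient_attention()
# Slice the attention computation to lower peak VRAM, at some cost in speed
pipe.enable_attention_slicing()
```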
✨ Features
Multi-language Support
This model can understand languages other than English. Currently, it partially understands prompts in Chinese, Japanese, and Spanish. More training is underway so that the model understands those languages as fully as it does English prompts.
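For example, a prompt in one of the partially supported languages can be passed to the pipeline as-is. This is a sketch reusing the pipe from the Quick Start example; results may be less reliable than with English prompts while training continues:
```python
# Spanish for "a beautiful illustration of a fantasy forest"
image = pipe("una hermosa ilustración de un bosque de fantasía").images[0]
image.save("bosque_de_fantasia.png")
```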
Namespace Control
This model is a fine-tune of Stable Diffusion, trained on the Imaginary Network Expanded Dataset. Its big advantage is allowing the use of multiple namespaces (labeled tags) to control various parts of the final generation. While current models are usually prone to “context errors” and need substantial negative prompting to set them on the right track, the namespaces in this model (e.g. “species:seal” or “studio:dc”) stop it from misinterpreting a seal as the singer Seal, or DC Comics as Washington, D.C.
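A short sketch of namespaced prompting, reusing the pipe from the Quick Start example; “species:seal” comes from the text above, while the rest of the prompt is illustrative:
```python
# "species:seal" pins the animal, avoiding the singer Seal
prompt = "species:seal, swimming in the ocean, photorealistic"
image = pipe(prompt).images[0]
```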
Diverse Image Generation
As the model is fine-tuned on a wide variety of content, it can generate many types of images and compositions. It easily outperforms the original model on portraits, architecture, reflections, fantasy, concept art, anime, landscapes, and much more, without being hyper-specialized like other community fine-tunes currently available.
💻 Usage Examples
Advanced Usage
You can use the tags and namespaces found in the Dataset Explorer to get better results. The prompt engineering techniques needed are slightly different from other fine-tunes and the original Stable Diffusion model, so while you can still use your favorite prompts, for best results you may need to tweak them to make use of namespaces.
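As a starting point, a namespaced prompt can be combined with the usual pipeline parameters. This is a sketch reusing the pipe from the Quick Start example; the tags are illustrative, so check the Dataset Explorer for the actual vocabulary:
```python
import torch

# Illustrative tags only; replace with namespaces from the Dataset Explorer
prompt = "studio:dc, character portrait, concept art, highly detailed"
generator = torch.Generator("cuda").manual_seed(42)  # reproducible output
image = pipe(
    prompt,
    num_inference_steps=30,
    guidance_scale=7.5,
    generator=generator,
).images[0]
```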
📚 Documentation
Available Checkpoints
| Category | Checkpoint | Training Details |
|----------|------------|------------------|
| Stable | Sygil Diffusion v0.1 | Trained on Stable Diffusion 1.5 for 800,000 steps. |
| Stable | Sygil Diffusion v0.2 | Resumed from Sygil Diffusion v0.1 and trained for a total of 1.77 million steps. |
| Stable | Sygil Diffusion v0.3 | Resumed from Sygil Diffusion v0.2 and trained for a total of 2.01 million steps. |
| Stable | Sygil Diffusion v0.4 | Resumed from Sygil Diffusion v0.3 and trained for a total of 2.37 million steps. |
| Beta | No active beta right now. | - |
Note: Checkpoints in the Beta section are updated daily, or at least 3-4 times a week (usually the equivalent of 1-2 training sessions), until they are stable enough to be promoted to a proper release, typically every 1-2 weeks. The beta checkpoints are usable as-is, but only the latest version is kept in the repo; older checkpoints are removed whenever a new one is uploaded, to keep the repo clean. The Hugging Face Inference API and the diffusers library always use the latest beta checkpoint in the diffusers format. In special cases, such as when a model uses a different Stable Diffusion base (e.g. Stable Diffusion 1.5 vs. 2.1), we may create additional repositories to keep a copy of the diffusers model.
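If you prefer a fixed stable release over the moving beta, from_pretrained accepts a revision argument that pins a specific branch or tag. This is a sketch: the revision name "v0.4" is hypothetical and depends on how releases are actually tagged in the repo:
```python
import torch
from diffusers import StableDiffusionPipeline

# "v0.4" is a hypothetical revision name; check the repo for the real tags
pipe = StableDiffusionPipeline.from_pretrained(
    "Sygil/Sygil-Diffusion",
    revision="v0.4",
    torch_dtype=torch.float16,
)
```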
Training Details
| Property | Details |
|----------|---------|
| Training Data | Imaginary Network Expanded Dataset |
| Hardware | 1 x Nvidia RTX 3050 8GB GPU |
| Hours Trained | Approximately 857 hours |
| Optimizer | AdamW |
| Adam Beta 1 | 0.9 |
| Adam Beta 2 | 0.999 |
| Adam Weight Decay | 0.01 |
| Adam Epsilon | 1e-8 |
| Gradient Checkpointing | True |
| Gradient Accumulation Steps | 400 |
| Batch Size | 1 |
| Learning Rate | 1e-7 |
| Learning Rate Scheduler | cosine_with_restarts |
| Learning Rate Warmup Steps | 10,000 |
| LoRA UNet Learning Rate | 1e-7 |
| LoRA Text Encoder Learning Rate | 1e-7 |
| Resolution | 512 pixels |
| Total Training Steps | 2,370,200 |
Note: After v0.3 was released, the learning rate scheduler was changed from constant to cosine_with_restarts, which keeps the learning rate close to optimal while minimizing the loss. When each training session finishes, the learning rate shown for the last few steps of that session is used as the starting rate for the next session, so it decreases at a steady rate over time. When a lot of data is added to the training dataset at once, the learning rate is reset to 1e-7, and the scheduler then brings it back down as the model learns from the new data. This prevents overfitting, and also avoids a learning rate so low that the model stops learning anything new for a while.
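A minimal sketch of how such a schedule can be set up with the get_scheduler helper from diffusers; the optimizer hyperparameters match the table above, but the model parameters are placeholders, not the actual training setup:
```python
import torch
from diffusers.optimization import get_scheduler

# Placeholder parameters standing in for the real UNet/text encoder weights
params = [torch.nn.Parameter(torch.zeros(1))]
optimizer = torch.optim.AdamW(
    params, lr=1e-7, betas=(0.9, 0.999), weight_decay=0.01, eps=1e-8
)

# Warm up for 10,000 steps, then follow a cosine curve with restarts,
# matching the scheduler and warmup values in the table above
lr_scheduler = get_scheduler(
    "cosine_with_restarts",
    optimizer=optimizer,
    num_warmup_steps=10_000,
    num_training_steps=2_370_200,
)
```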
🔧 Technical Details
The model is fine-tuned on the Imaginary Network Expanded Dataset. The use of namespaces in the model helps to control various parts of the final generation and avoid “context errors”. The learning rate adjustment strategy is designed to optimize the training process and prevent overfitting.
📄 License
This model is open access and available to all, with a CreativeML Open RAIL++-M License further specifying rights and usage. Please read the full license here.
Developed by: ZeroCool94 at Sygil-Dev
Community Contributions
This model card is based on the Stable Diffusion v1 and DALL-E Mini model cards.
Showcase

If you find my work useful, please consider supporting me on GitHub Sponsors!
This model is still in its infancy, and it is meant to be constantly updated and trained with more data as time goes by. Feel free to give us feedback on our Discord Server or in the discussions section on Hugging Face. We plan to improve it with more, better tags in the future, so any help is always welcome 😛
