🚀 Niji Diffusion XL Base 1.0
Anime-styled SDXL model fine-tuned with the niji-v5 dataset.
🚀 Quick Start
This is an anime-styled SDXL (stable-diffusion-xl-base-1.0) model, created by merging a LoRA fine-tuned on the niji-v5 dataset into the base model.
✨ Features
- Based on the SDXL base model, adjusted to an anime style.
- Fine-tuned with the niji-v5 dataset, so it can generate anime-style images.
💻 Usage Examples
Basic Usage
Generate images using niji-diffusion-xl-base-1.0.safetensors with stable-diffusion-webui and the following parameters (a diffusers-based sketch follows the prompt examples below):
⚠️ Important Note
Since the model has only been trained on roughly 100 to 13000 images in total, writing many items in the prompt may cause the generated image to drift away from the niji style. Writing many items in the negative prompt seems to be fine.
Prompt:
masterpiece, best quality, high quality, absurdres, 1girl, flower
Negative prompt:
worst quality, low quality, medium quality, deleted, lowres, comic, bad anatomy, bad hands, text, error, missing fingers, extra digit, fewer digits, cropped, jpeg artifacts, signature, watermark, username, blurry
PNG info:
Steps: 28, Sampler: Euler a, CFG scale: 7, Seed: 1, Size: 1536x1024, Model hash: 791d0c791e, Model: sd_xl_niji_1.0, Clip skip: 2, ENSD: 31337, Token merging ratio: 0.5, Eta: 0.67, Version: v1.5.1

Prompt:
1girl
Prompt:
1girl, tokyo
Prompt:
1girl, steampunk
Prompt:
1girl, fantasy
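
To run the merged checkpoint outside the webui, a rough equivalent with 🤗 diffusers is sketched below. This is an illustrative sketch, not part of the original card: webui-specific settings from the PNG info (Clip skip, ENSD, Token merging ratio, Eta) are not reproduced, so outputs will not match exactly.

```python
# Hedged sketch: loading the merged safetensors with diffusers instead of stable-diffusion-webui.
import torch
from diffusers import StableDiffusionXLPipeline, EulerAncestralDiscreteScheduler

pipe = StableDiffusionXLPipeline.from_single_file(
    "niji-diffusion-xl-base-1.0.safetensors",  # local path to the downloaded checkpoint
    torch_dtype=torch.float16,
)
# "Euler a" in the webui corresponds to the Euler ancestral scheduler
pipe.scheduler = EulerAncestralDiscreteScheduler.from_config(pipe.scheduler.config)
pipe.to("cuda")

image = pipe(
    prompt="masterpiece, best quality, high quality, absurdres, 1girl, flower",
    negative_prompt=(
        "worst quality, low quality, medium quality, deleted, lowres, comic, bad anatomy, "
        "bad hands, text, error, missing fingers, extra digit, fewer digits, cropped, "
        "jpeg artifacts, signature, watermark, username, blurry"
    ),
    num_inference_steps=28,
    guidance_scale=7.0,
    width=1536,
    height=1024,
    generator=torch.Generator("cuda").manual_seed(1),
).images[0]
image.save("niji_sample.png")
```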

📚 Documentation
Model Creation Method
- Refer to "Simple☆Copy Machine Learning Method (Surely the Beginner's Edition)", perform LoRA DreamBooth on "Blur", and merge the LoRA model negatively into the SDXL model.
- Select 100 pictures with detailed backgrounds and hair from niji - v5, perform LoRA fine - tuning on the model created in step 1, and merge the LoRA model into the SDXL model.
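
Conceptually, both steps amount to adding a scaled low-rank delta to the base weights; a negative ratio removes the learned concept (step 1), a positive ratio adds it (step 2). The minimal sketch below is illustrative only; the actual merges were done with existing tools (sd-scripts / sd-webui-supermerger), and the names and shapes here are assumptions.

```python
# Minimal sketch of merging a LoRA into a base weight tensor at a given ratio.
import torch

def merge_lora_into_weight(base_weight: torch.Tensor,
                           lora_down: torch.Tensor,   # shape [rank, in_features]
                           lora_up: torch.Tensor,     # shape [out_features, rank]
                           alpha: float,
                           ratio: float) -> torch.Tensor:
    """Return base_weight + ratio * (alpha / rank) * (lora_up @ lora_down)."""
    rank = lora_down.shape[0]
    delta = lora_up @ lora_down              # reconstruct the full weight delta from the low-rank pair
    return base_weight + ratio * (alpha / rank) * delta

# Toy example: a negative ratio (e.g. -0.05) subtracts a concept such as "blur",
# a positive ratio (e.g. 1.0) adds one such as the niji-v5 style.
w = torch.randn(320, 320)
down, up = torch.randn(16, 320), torch.randn(320, 16)
w_merged = merge_lora_into_weight(w, down, up, alpha=16.0, ratio=-0.05)
```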
Future Model Improvements
We would like to distribute this as a LoRA model, but when trained at 512 dim (rank) the LoRA file is about 3 GB, so for this release it has been merged into the SDXL model instead.
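
For intuition on why the file gets so large: each adapted layer adds rank × (in_features + out_features) parameters, so the LoRA size grows linearly with the rank. The sketch below uses a toy layer list, not the real SDXL module inventory.

```python
# Rough estimate of LoRA file size; layer shapes and counts here are hypothetical.
def lora_size_bytes(layer_shapes, rank, bytes_per_param=2):  # 2 bytes per fp16 parameter
    """Estimate LoRA size for a list of (in_features, out_features) layers."""
    params = sum(rank * (fan_in + fan_out) for fan_in, fan_out in layer_shapes)
    return params * bytes_per_param

toy_layers = [(1280, 1280)] * 400            # hypothetical adapted layers
print(lora_size_bytes(toy_layers, rank=512) / 2**30, "GiB")  # grows linearly with rank
```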
Thoughts
It was very difficult to adjust it properly, and I had to start over many times. I want to make a video about the creation method later.
Acknowledgments
We sincerely thank those who created and distributed the models, training data, and training tools.
Libraries
- [sd-scripts](https://github.com/kohya-ss/sd-scripts/tree/sdxl) 4072f723c12822e2fa1b2e076cc1f90b8f4e30c9
- [bitsandbytes](https://github.com/jllllll/bitsandbytes-windows-webui) 0.39.1
- PyTorch 2.0.0+cu117
- xformers 0.0.19
Update History
- August 14, 2023
We visually selected about 1000 images from nijijourney that look good in an anime or illustration style and trained the model on them. The following records what we did, although we do not know which parts were actually effective; the hyperparameters are listed in the table below (a sketch of how block_lr maps onto per-block learning rates follows the update history). Hierarchical (per-block) merging with v11 using [sd-webui-supermerger](https://github.com/hako-mikan/sd-webui-supermerger), at ratios similar to block_lr that seemed to give good images, did not finalize the result in one pass. Finally, we merged "blur" at about -0.05 and the LECO-trained [anime](https://civitai.com/models/128125/anime-leco) LoRA at 1.0 via LoRA merging to finish the model.
| Property | Details |
| --- | --- |
| GPU | RTX3090 24GB |
| optimizer_type | PagedLion8bit |
| optimizer_args | weight_decay=0.01, betas=.9,.999 |
| block_lr | 0,1e-08,1e-08,1e-08,1e-08,1e-07,1e-07,1e-07,1e-06,1e-06,1e-05,1e-05,1e-05,1e-06,1e-06,1e-07,1e-07,1e-07,1e-08,1e-08,1e-08,1e-08,0 |
| lr_scheduler | cosine |
| lr_warmup_steps | 100 |
| gradient_checkpointing | |
| mixed_precision | bf16 |
| full_bf16 | |
| max_token_length | 225 |
| min_snr_gamma | 5 |
| noise_offset | 0.0357 |
| max_train_epochs | 3 |
| batch_size | 12 |
| enable_bucket | true |
| resolution | [1024,1024] |
- August 11, 2023
We trained the model with 12000 images, mixing in the previously used nijijourney images. The optimizer was Lion (4e-06, cosine, weight_decay=0.015, betas=.9,.999).
- August 7, 2023
We performed full fine-tuning with about 4500 images from nijijourney and replaced the VAE with one that does not break in fp16. The learning rate of 1e-07 seemed too low and the images did not change much; we plan to increase it next time.
- August 1, 2023
We performed LoRA fine-tuning with about 100 images from nijijourney.
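
For reference, the block_lr hyperparameter in the August 14 table assigns one learning rate to each of the 23 UNet block groups, with 0 effectively freezing a group. The sketch below shows the idea as optimizer parameter groups; the block partitioning and the optimizer call are illustrative assumptions, not the exact sd-scripts internals.

```python
# Hedged sketch of per-block learning rates (block_lr) as optimizer parameter groups.
import torch

block_lr = [0, 1e-08, 1e-08, 1e-08, 1e-08, 1e-07, 1e-07, 1e-07, 1e-06, 1e-06,
            1e-05, 1e-05, 1e-05, 1e-06, 1e-06, 1e-07, 1e-07, 1e-07, 1e-08,
            1e-08, 1e-08, 1e-08, 0]  # 23 values, as in the table above

def make_param_groups(unet_block_groups, block_lr):
    """Build one optimizer parameter group per UNet block, each with its own lr."""
    groups = []
    for block, lr in zip(unet_block_groups, block_lr):
        if lr == 0:
            continue  # lr 0: leave the block out so it is never updated
        groups.append({"params": list(block.parameters()), "lr": lr})
    return groups

# The groups would then be handed to the optimizer, e.g. (illustrative):
# optimizer = bitsandbytes.optim.PagedLion8bit(
#     make_param_groups(blocks, block_lr), weight_decay=0.01, betas=(0.9, 0.999))
```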
📄 License
- [CreativeML Open RAIL++-M License](https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0/blob/main/LICENSE.md)