🚀 Niji Diffusion XL Base 1.0
Anime-styled SDXL model fine-tuned with the niji-v5 dataset.
🚀 Quick Start
This is an anime-styled SDXL (stable-diffusion-xl-base-1.0) model, created by merging a LoRA fine-tuned on the niji-v5 dataset into the base model.
✨ Features
- Based on the SDXL base model, adjusted to an anime style.
- Fine-tuned with the niji-v5 dataset, so it can generate anime-style images.
💻 Usage Examples
Basic Usage
Generate images using niji-diffusion-xl-base-1.0.safetensors with stable-diffusion-webui and the following parameters (a diffusers-based sketch follows the prompt examples below):
⚠️ Important Note
Since the model has only been trained on roughly 100 to 13000 images in total, writing many items in the prompt may cause the generated image to drift away from the niji style. Writing many items in the negative prompt seems to be fine.
Prompt:
masterpiece, best quality, high quality, absurdres, 1girl, flower
Negative prompt:
worst quality, low quality, medium quality, deleted, lowres, comic, bad anatomy, bad hands, text, error, missing fingers, extra digit, fewer digits, cropped, jpeg artifacts, signature, watermark, username, blurry
PNG info:
Steps: 28, Sampler: Euler a, CFG scale: 7, Seed: 1, Size: 1536x1024, Model hash: 791d0c791e, Model: sd_xl_niji_1.0, Clip skip: 2, ENSD: 31337, Token merging ratio: 0.5, Eta: 0.67, Version: v1.5.1

Prompt:
1girl
Prompt:
1girl, tokyo
Prompt:
1girl, steampunk
Prompt:
1girl, fantasy
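
To run the merged checkpoint outside the webui, a rough equivalent with 🤗 diffusers is sketched below. This is an illustrative sketch, not part of the original card: webui-specific settings from the PNG info (Clip skip, ENSD, Token merging ratio, Eta) are not reproduced, so outputs will not match exactly.

```python
# Hedged sketch: loading the merged safetensors with diffusers instead of stable-diffusion-webui.
import torch
from diffusers import StableDiffusionXLPipeline, EulerAncestralDiscreteScheduler

pipe = StableDiffusionXLPipeline.from_single_file(
    "niji-diffusion-xl-base-1.0.safetensors",  # local path to the downloaded checkpoint
    torch_dtype=torch.float16,
)
# "Euler a" in the webui corresponds to the Euler ancestral scheduler
pipe.scheduler = EulerAncestralDiscreteScheduler.from_config(pipe.scheduler.config)
pipe.to("cuda")

image = pipe(
    prompt="masterpiece, best quality, high quality, absurdres, 1girl, flower",
    negative_prompt=(
        "worst quality, low quality, medium quality, deleted, lowres, comic, bad anatomy, "
        "bad hands, text, error, missing fingers, extra digit, fewer digits, cropped, "
        "jpeg artifacts, signature, watermark, username, blurry"
    ),
    num_inference_steps=28,
    guidance_scale=7.0,
    width=1536,
    height=1024,
    generator=torch.Generator("cuda").manual_seed(1),
).images[0]
image.save("niji_sample.png")
```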

📚 Documentation
Model Creation Method
- Refer to "Simple☆Copy Machine Learning Method (Surely the Beginner's Edition)", perform LoRA DreamBooth on "Blur", and merge the LoRA model negatively into the SDXL model.
- Select 100 pictures with detailed backgrounds and hair from niji - v5, perform LoRA fine - tuning on the model created in step 1, and merge the LoRA model into the SDXL model.
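
Conceptually, both steps amount to adding a scaled low-rank delta to the base weights; a negative ratio removes the learned concept (step 1), a positive ratio adds it (step 2). The minimal sketch below is illustrative only; the actual merges were done with existing tools (sd-scripts / sd-webui-supermerger), and the names and shapes here are assumptions.

```python
# Minimal sketch of merging a LoRA into a base weight tensor at a given ratio.
import torch

def merge_lora_into_weight(base_weight: torch.Tensor,
                           lora_down: torch.Tensor,   # shape [rank, in_features]
                           lora_up: torch.Tensor,     # shape [out_features, rank]
                           alpha: float,
                           ratio: float) -> torch.Tensor:
    """Return base_weight + ratio * (alpha / rank) * (lora_up @ lora_down)."""
    rank = lora_down.shape[0]
    delta = lora_up @ lora_down              # reconstruct the full weight delta from the low-rank pair
    return base_weight + ratio * (alpha / rank) * delta

# Toy example: a negative ratio (e.g. -0.05) subtracts a concept such as "blur",
# a positive ratio (e.g. 1.0) adds one such as the niji-v5 style.
w = torch.randn(320, 320)
down, up = torch.randn(16, 320), torch.randn(320, 16)
w_merged = merge_lora_into_weight(w, down, up, alpha=16.0, ratio=-0.05)
```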
Future Model Improvements
We would like to distribute this as a LoRA model, but when trained at 512 dim (rank) the LoRA file is about 3 GB, so for this release it has been merged into the SDXL model instead.
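
For intuition on why the file gets so large: each adapted layer adds rank × (in_features + out_features) parameters, so the LoRA size grows linearly with the rank. The sketch below uses a toy layer list, not the real SDXL module inventory.

```python
# Rough estimate of LoRA file size; layer shapes and counts here are hypothetical.
def lora_size_bytes(layer_shapes, rank, bytes_per_param=2):  # 2 bytes per fp16 parameter
    """Estimate LoRA size for a list of (in_features, out_features) layers."""
    params = sum(rank * (fan_in + fan_out) for fan_in, fan_out in layer_shapes)
    return params * bytes_per_param

toy_layers = [(1280, 1280)] * 400            # hypothetical adapted layers
print(lora_size_bytes(toy_layers, rank=512) / 2**30, "GiB")  # grows linearly with rank
```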
Thoughts
It was very difficult to adjust it properly, and I had to start over many times. I want to make a video about the creation method later.
Acknowledgments
We sincerely thank those who created and distributed the models, training data, and training tools.
Libraries
- [sd-scripts](https://github.com/kohya-ss/sd-scripts/tree/sdxl) 4072f723c12822e2fa1b2e076cc1f90b8f4e30c9
- [bitsandbytes](https://github.com/jllllll/bitsandbytes-windows-webui) 0.39.1
- PyTorch 2.0.0+cu117
- xformers 0.0.19
Update History
- August 14, 2023
We visually selected about 1000 images from nijijourney that look good in an anime or illustration style and trained the model on them. The following records what we did, although we do not know which parts were actually effective; the hyperparameters are listed in the table below (a sketch of how block_lr maps onto per-block learning rates follows the update history). Hierarchical (per-block) merging with v11 using [sd-webui-supermerger](https://github.com/hako-mikan/sd-webui-supermerger), at ratios similar to block_lr that seemed to give good images, did not finalize the result in one pass. Finally, we merged "blur" at about -0.05 and the LECO-trained [anime](https://civitai.com/models/128125/anime-leco) LoRA at 1.0 via LoRA merging to finish the model.
| Property | Details |
| --- | --- |
| GPU | RTX3090 24GB |
| optimizer_type | PagedLion8bit |
| optimizer_args | weight_decay=0.01, betas=.9,.999 |
| block_lr | 0,1e-08,1e-08,1e-08,1e-08,1e-07,1e-07,1e-07,1e-06,1e-06,1e-05,1e-05,1e-05,1e-06,1e-06,1e-07,1e-07,1e-07,1e-08,1e-08,1e-08,1e-08,0 |
| lr_scheduler | cosine |
| lr_warmup_steps | 100 |
| gradient_checkpointing | |
| mixed_precision | bf16 |
| full_bf16 | |
| max_token_length | 225 |
| min_snr_gamma | 5 |
| noise_offset | 0.0357 |
| max_train_epochs | 3 |
| batch_size | 12 |
| enable_bucket | true |
| resolution | [1024,1024] |
- August 11, 2023
We trained the model with 12000 images, mixing in the previously used nijijourney images. The optimizer was Lion (4e-06, cosine, weight_decay=0.015, betas=.9,.999).
- August 7, 2023
We performed full fine-tuning with about 4500 images from nijijourney and replaced the VAE with one that does not break in fp16. The learning rate of 1e-07 seemed too low and the images did not change much; we plan to increase it next time.
- August 1, 2023
We performed LoRA fine-tuning with about 100 images from nijijourney.
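
For reference, the block_lr hyperparameter in the August 14 table assigns one learning rate to each of the 23 UNet block groups, with 0 effectively freezing a group. The sketch below shows the idea as optimizer parameter groups; the block partitioning and the optimizer call are illustrative assumptions, not the exact sd-scripts internals.

```python
# Hedged sketch of per-block learning rates (block_lr) as optimizer parameter groups.
import torch

block_lr = [0, 1e-08, 1e-08, 1e-08, 1e-08, 1e-07, 1e-07, 1e-07, 1e-06, 1e-06,
            1e-05, 1e-05, 1e-05, 1e-06, 1e-06, 1e-07, 1e-07, 1e-07, 1e-08,
            1e-08, 1e-08, 1e-08, 0]  # 23 values, as in the table above

def make_param_groups(unet_block_groups, block_lr):
    """Build one optimizer parameter group per UNet block, each with its own lr."""
    groups = []
    for block, lr in zip(unet_block_groups, block_lr):
        if lr == 0:
            continue  # lr 0: leave the block out so it is never updated
        groups.append({"params": list(block.parameters()), "lr": lr})
    return groups

# The groups would then be handed to the optimizer, e.g. (illustrative):
# optimizer = bitsandbytes.optim.PagedLion8bit(
#     make_param_groups(blocks, block_lr), weight_decay=0.01, betas=(0.9, 0.999))
```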
📄 License
- [CreativeML Open RAIL++-M License](https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0/blob/main/LICENSE.md)