🚀 Cool Japan Diffusion 2.1.1 Beta Model Card
Cool Japan Diffusion is a specialized text-to-image generation model based on Stable Diffusion, fine-tuned to excel in creating anime, manga, and game-related imagery.
⚠️ Important Note
Note for users in China: China plans to impose legal restrictions on image-generating AI.
The English version is here.
🚀 Quick Start
Cool Japan Diffusion (for learning purposes) is a model that fine-tunes Stable Diffusion to specialize in expressing Cool Japan through anime, manga, games, and the like. It has no particular relation to the Cabinet Office's Cool Japan Strategy.
✨ Features
License
The license for this model is based on the original CreativeML Open RAIL++-M License, with commercial use prohibited except for certain exceptions. The commercial-use prohibition was added out of concern that the model could have a negative impact on the creative industry. If this concern is alleviated, the next version will revert to the original license and allow commercial use. The Japanese translation of the original license can be found here. If you belong to a for-profit company, please consult your legal department. If you are using the model out of personal interest, following general common sense should keep you out of trouble. As stated in the license, anyone who modifies this model must inherit this license.
Legal and Ethical Considerations
This model was created in Japan, so Japanese law applies. The author claims that the training of this model is legal based on Article 30-4 of the Copyright Act. Regarding the distribution of this model, the author claims that it does not constitute a principal or accessory offense under the Copyright Act or Article 175 of the Penal Code. For more details, please refer to the opinion of lawyer Kakinuma. However, as stated in the license, please handle the outputs of this model in accordance with various laws and regulations.
The author believes that the act of distributing this model is not ethical because the author did not obtain permission from the copyright holders of the works used for training. However, legally, permission from the copyright holders is not required for training, just like search engines. Therefore, please consider that this distribution also serves the purpose of investigating the ethical aspects rather than just the legal ones.
📦 Installation
If you want to have a quick and easy experience, please use this Space. The detailed instructions on how to handle this model can be found here. You can download the model from here.
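If you prefer to fetch the weights with a script rather than the download link, the snippet below is a minimal sketch using the huggingface_hub library; the repository id is the same one used in the Diffusers example further down, and the local folder name is an arbitrary choice.

```python
# Minimal sketch: download the model repository with huggingface_hub.
# The repo id matches the Diffusers example below; the local folder name is arbitrary.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(
    repo_id="aipicasso/cool-japan-diffusion-2-1-1-beta",
    local_dir="cool-japan-diffusion",
)
print(f"Model files downloaded to: {local_dir}")
```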
💻 Usage Examples
Basic Usage
This model can be used in the same way as Stable Diffusion v2. Here are two common usage patterns:
Web UI
Please follow the instructions in this manual.
Diffusers
First, run the following script to install the library:
pip install --upgrade git+https://github.com/huggingface/diffusers.git transformers accelerate scipy
Then, run the following script to generate an image:
from diffusers import StableDiffusionPipeline, EulerAncestralDiscreteScheduler
import torch

model_id = "aipicasso/cool-japan-diffusion-2-1-1-beta"

# Load the Euler Ancestral scheduler and the pipeline in half precision
scheduler = EulerAncestralDiscreteScheduler.from_pretrained(model_id, subfolder="scheduler")
pipe = StableDiffusionPipeline.from_pretrained(model_id, scheduler=scheduler, torch_dtype=torch.float16)
pipe = pipe.to("cuda")

prompt = "anime, a portrait of a girl with black short hair and red eyes, kimono, full color illustration, official art, 4k, detailed"
negative_prompt = "(((deformed))), blurry, ((((bad anatomy)))), bad pupil, disfigured, poorly drawn face, mutation, mutated, (extra limb), (ugly), (poorly drawn hands), bad hands, fused fingers, messy drawing, broken legs censor, low quality, ((mutated hands and fingers:1.5), (long body :1.3), (mutation, poorly drawn :1.2), ((bad eyes)), ui, error, missing fingers, fused fingers, one hand with more than 5 fingers, one hand with less than 5 fingers, one hand with more than 5 digit, one hand with less than 5 digit, extra digit, fewer digits, fused digit, missing digit, bad digit, liquid digit, long body, uncoordinated body, unnatural body, lowres, jpeg artifacts, 2d, 3d, cg, text"

# Generate a 512x512 image with 20 sampling steps and save it
image = pipe(prompt, negative_prompt=negative_prompt, width=512, height=512, num_inference_steps=20).images[0]
image.save("girl.png")
💡 Usage Tips
- Using xformers seems to speed up generation.
- If GPU memory is limited, call `pipe.enable_attention_slicing()`.
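Both options can be applied to the pipeline from the example above; the sketch below assumes the xformers package may or may not be installed and falls back gracefully if it is missing.

```python
# Optional speed/memory tweaks for the pipeline built above (a sketch, not required).

# Slice the attention computation to reduce peak GPU memory usage
pipe.enable_attention_slicing()

# Use xformers memory-efficient attention if the xformers package is installed
try:
    pipe.enable_xformers_memory_efficient_attention()
except Exception as e:
    print(f"xformers not available, continuing without it: {e}")
```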
Expected Use Cases
- Contests
- Submissions to AI Art Grand Prix. Make sure to disclose all the data used for fine-tuning and meet the review criteria. If you have any requests for the contest, please contact me on Hugging Face's Community.
- Reporting on Image Generation AI
- Both public broadcasters and for-profit companies can use it. The author believes that the "right to know" information about image synthesis AI will not have a negative impact on the creative industry and respects the freedom of the press.
- Introduction of Cool Japan
- Explain what Cool Japan is to people from other countries. Many international students are attracted to Japan by Cool Japan but are often disappointed to find that it is not as "cool" in Japan as they expected. Alfred Increment hopes that people from other countries can be more proud of their own cultures that are admired by others.
- Research and Development
- Using the model on Discord for prompt engineering, fine-tuning (including additional training such as DreamBooth), merging with other models, studying the compatibility between the Latent Diffusion Model and Cool Japan, investigating the performance of this model with metrics such as FID, and checking the independence of this model from other models using checksums or hash functions (a minimal sketch of such a check follows this list).
- Education
- Graduation projects for art college students and vocational school students, graduation theses and assignment projects for university students, and teachers can use it to convey the current situation of image generation AI.
- Self-expression
- Express your emotions and thoughts on SNS.
- Use Cases on Hugging Face's Community
- Ask questions in Japanese or English.
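For the independence check mentioned under Research and Development, the sketch below computes SHA-256 checksums of weight files so two checkpoints can be compared byte for byte. The local directory name is a hypothetical placeholder for wherever you downloaded the model.

```python
# Minimal sketch: SHA-256 checksums of model weight files, so that two checkpoints
# can be compared for byte-level independence.
# "cool-japan-diffusion" is a hypothetical local path to a downloaded repository.
import hashlib
from pathlib import Path

def file_sha256(path: Path, chunk_size: int = 1 << 20) -> str:
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

model_dir = Path("cool-japan-diffusion")
weight_files = sorted(model_dir.rglob("*.safetensors")) + sorted(model_dir.rglob("*.bin"))
for weight_file in weight_files:
    print(f"{file_sha256(weight_file)}  {weight_file.relative_to(model_dir)}")
```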
Unexpected Use Cases
- Presenting generated images as though they depict facts.
- Using it in monetized content on YouTube or other platforms.
- Directly providing it as a commercial service.
- Causing trouble for teachers.
- Other actions that may have a negative impact on the creative industry.
Prohibited or Malicious Use Cases
- Do not publish digital forgeries, as doing so may violate the Copyright Act. In particular, do not publish images of existing characters, as this may also violate the Copyright Act. Note that it seems possible to generate characters that were not used for training. (This tweet itself is permitted for research purposes.)
- Do not perform Image-to-Image on others' works without permission as it may violate the Copyright Act.
- Do not distribute pornographic materials as it may violate Article 175 of the Penal Code.
- Do not spread false information as it may be subject to the crime of interfering with business operations.
📚 Documentation
Model Details
Property | Details |
---|---|
Developer | Robin Rombach, Patrick Esser, Alfred Increment |
Model Type | Diffusion model-based text-to-image generation model |
Language | Japanese |
License | CreativeML Open RAIL++-M-NC License |
Model Description | This model can generate appropriate images according to the prompt. The algorithms used are Latent Diffusion Model and OpenCLIP-ViT/H. |
References | Rombach et al., "High-Resolution Image Synthesis With Latent Diffusion Models," CVPR 2022, pp. 10684-10695 (full BibTeX entry under References below) |
Model Limitations and Biases
Model Limitations
The limitations are not well understood yet.
Biases
This model has the same biases as Stable Diffusion. Please be cautious.
Training
Training Data
- VAE: Approximately 600,000 distinct data items that comply with Japanese domestic law, excluding unauthorized reprint sites such as Danbooru. Data augmentation allows an effectively unlimited number of samples to be derived from them.
- U-Net: 800,000 data pairs that comply with Japanese domestic law, excluding unauthorized reprint sites such as Danbooru.
Training Process
The VAE and U-Net of Stable Diffusion were fine-tuned (a sketch of a single training step follows the list below).
- Hardware: RTX 3090
- Optimizer: AdamW
- Gradient Accumulations: 1
- Batch Size: 1
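As an illustration of the setup above (AdamW, batch size 1, no gradient accumulation), the sketch below shows what a single U-Net fine-tuning step could look like with Stable Diffusion 2.x components loaded through Diffusers. It is not the author's actual training script: the base checkpoint id, learning rate, and epsilon-prediction objective are assumptions, and the VAE fine-tuning is not shown.

```python
# A minimal sketch of one U-Net fine-tuning step (not the author's actual script).
# Assumptions: Stable Diffusion 2.x components via Diffusers, epsilon prediction,
# AdamW, batch size 1, no gradient accumulation; checkpoint id and lr are placeholders.
import torch
import torch.nn.functional as F
from diffusers import AutoencoderKL, UNet2DConditionModel, DDPMScheduler
from transformers import CLIPTextModel, CLIPTokenizer

base = "stabilityai/stable-diffusion-2-1-base"  # assumed base checkpoint
tokenizer = CLIPTokenizer.from_pretrained(base, subfolder="tokenizer")
text_encoder = CLIPTextModel.from_pretrained(base, subfolder="text_encoder").cuda().eval()
vae = AutoencoderKL.from_pretrained(base, subfolder="vae").cuda().eval()
unet = UNet2DConditionModel.from_pretrained(base, subfolder="unet").cuda().train()
noise_scheduler = DDPMScheduler.from_pretrained(base, subfolder="scheduler")
optimizer = torch.optim.AdamW(unet.parameters(), lr=1e-5)

def train_step(pixel_values: torch.Tensor, caption: str) -> float:
    """One step on a single image-caption pair (batch size 1)."""
    with torch.no_grad():
        # Encode the image into VAE latents and the caption into text embeddings
        latents = vae.encode(pixel_values.cuda()).latent_dist.sample() * vae.config.scaling_factor
        ids = tokenizer(caption, padding="max_length", truncation=True,
                        max_length=tokenizer.model_max_length, return_tensors="pt").input_ids.cuda()
        text_embeddings = text_encoder(ids)[0]
    # Add noise at a random timestep and train the U-Net to predict that noise
    noise = torch.randn_like(latents)
    t = torch.randint(0, noise_scheduler.config.num_train_timesteps,
                      (latents.shape[0],), device=latents.device)
    noisy_latents = noise_scheduler.add_noise(latents, noise, t)
    pred = unet(noisy_latents, t, encoder_hidden_states=text_embeddings).sample
    loss = F.mse_loss(pred.float(), noise.float())
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    return loss.item()
```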
Evaluation Results
No evaluation results are available at this time.
Environmental Impact
The environmental impact is minimal.
- Hardware Type: RTX 3090
- Usage Time (in hours): 500
- Cloud Service Provider: None
- Training Location: Japan
- Carbon Emissions: Not significant
References
@InProceedings{Rombach_2022_CVPR, author = {Rombach, Robin and Blattmann, Andreas and Lorenz, Dominik and Esser, Patrick and Ommer, Bj\"orn}, title = {High-Resolution Image Synthesis With Latent Diffusion Models}, booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)}, month = {June}, year = {2022}, pages = {10684-10695} }
This model card was written by Alfred Increment based on Stable Diffusion v2.