Cool Japan Diffusion 2.1.0 Beta Model Card
Cool Japan Diffusion is a model fine-tuned from Stable Diffusion and tailored to represent Cool Japan elements such as anime, manga, and games.
Note: Starting January 10, 2023, image-generating AI is subject to legal restrictions in China. (This is a warning for users in China.)
The English version is here.
Quick Start
To try the model quickly on a PC, enter text in the form at the upper right of the page to generate images. On a smartphone, scroll back to the top of the page and generate from there. For detailed instructions on using this model, please refer to this user manual. You can download the model here.
⨠Features
Cool Japan Diffusion is a text-to-image generation model based on a diffusion model. It is fine-tuned from Stable Diffusion and specializes in representing Cool Japan elements such as anime, manga, and games.
Installation
Diffusers
First, run the following script to install the library:
pip install --upgrade git+https://github.com/huggingface/diffusers.git transformers accelerate scipy
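To confirm that the install succeeded, a minimal sketch like the following can list the installed versions. It uses only the standard library's `importlib.metadata`. The helper name `installed_versions` is hypothetical, and the package list simply mirrors the pip command above.

```python
from importlib import metadata


def installed_versions(packages=("diffusers", "transformers", "accelerate", "scipy")):
    """Return a {package: version} dict; the version is None if the package is missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The package is not installed in the current environment.
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'NOT INSTALLED'}")
```

Any package reported as missing can be reinstalled with the pip command above.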
Usage Examples
Basic Usage
Use this model the same way as Stable Diffusion v2. There are two usage patterns:
Web UI
Generate images by following this user manual.
Diffusers
Use Hugging Face's Diffusers library. After installing the library as described above, run the following script to generate an image:
from diffusers import StableDiffusionPipeline, EulerDiscreteScheduler
import torch
model_id = "aipicasso/cool-japan-diffusion-2-1-0-beta"
scheduler = EulerDiscreteScheduler.from_pretrained(model_id, subfolder="scheduler")
pipe = StableDiffusionPipeline.from_pretrained(model_id, scheduler=scheduler, torch_dtype=torch.float16)
pipe = pipe.to("cuda")
prompt = "anime, a portrait of a girl with black short hair and red eyes, kimono, full color illustration, official art, 4k, detailed"
negative_prompt = "low quality, bad face, bad anatomy, bad hand, lowres, jpeg artifacts, 2d, 3d, cg, text"
image = pipe(prompt, negative_prompt=negative_prompt).images[0]
image.save("girl.png")
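The script above samples fresh noise on every run, so each run gives a different image. The following is a minimal sketch of one way to make runs repeatable by passing a seeded `torch.Generator`. The helper names `build_call_kwargs` and `generate` are hypothetical. The keyword names (`num_inference_steps`, `guidance_scale`, `generator`) are parameters of the `StableDiffusionPipeline` call, but the values chosen here are illustrative assumptions, not settings recommended by the model authors.

```python
def build_call_kwargs(prompt, negative_prompt, steps=50, guidance_scale=7.5):
    """Collect keyword arguments for a StableDiffusionPipeline call.

    The default values are illustrative assumptions, not tuned for this model.
    """
    return {
        "prompt": prompt,
        "negative_prompt": negative_prompt,
        "num_inference_steps": steps,
        "guidance_scale": guidance_scale,
    }


def generate(pipe, seed=0, device="cuda", **call_kwargs):
    """Run the pipeline with a seeded generator and return the first image."""
    # Imported lazily so build_call_kwargs can be used without torch installed.
    import torch

    # A fixed seed fixes the initial noise, which makes the result repeatable
    # on the same hardware and library versions.
    generator = torch.Generator(device=device).manual_seed(seed)
    return pipe(generator=generator, **call_kwargs).images[0]


# Example (requires `pipe`, `prompt`, and `negative_prompt` from the script above):
# kwargs = build_call_kwargs(prompt, negative_prompt, steps=30)
# image = generate(pipe, seed=42, **kwargs)
# image.save("girl_seed42.png")
```

Keeping the seed fixed while changing only the prompt or the step count makes it easier to see what each change does.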
Important Note
- Using xformers appears to speed up generation.
- If GPU memory is limited, call `pipe.enable_attention_slicing()`.
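The two notes above can be combined into one helper, sketched below under stated assumptions. `enable_xformers_memory_efficient_attention()` and `enable_attention_slicing()` are methods on Diffusers pipelines. The helper name `apply_memory_optimizations` and its flags are hypothetical. Because xformers is an optional dependency, the sketch treats a failure to enable it as non-fatal.

```python
def apply_memory_optimizations(pipe, use_xformers=True, low_memory=False):
    """Enable optional speed and memory settings; return the names that were applied."""
    applied = []
    if use_xformers:
        try:
            pipe.enable_xformers_memory_efficient_attention()
            applied.append("xformers")
        except Exception as err:
            # xformers may be missing or unsupported on this machine; continue without it.
            print(f"xformers not enabled: {err}")
    if low_memory:
        # Computes attention in slices, trading some speed for lower peak GPU memory.
        pipe.enable_attention_slicing()
        applied.append("attention_slicing")
    return applied


# Example (requires the `pipe` object from the script above):
# print(apply_memory_optimizations(pipe, low_memory=True))
```

Whether either setting helps depends on the GPU and library versions, so it is worth timing a generation with and without them.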
Advanced Usage
Intended Use Cases
- Contests
  - Submissions to the AI Art Grand Prix. We will disclose all the data used for fine-tuning and ensure that the submission meets the review criteria. We will also apply in advance for confirmation. If you have any requests regarding the contest, please let me know on Hugging Face's Community.
- Reporting on image-generating AI
  - This is permitted for for-profit enterprises as well as public broadcasters. We believe that the public's "right to know" about image-synthesizing AI does not harm the creative industry, and we respect freedom of the press.
- Introduction of Cool Japan
  - Explain what Cool Japan is to people from other countries. Alfred Increment has noticed that many international students are drawn to Japan by Cool Japan but feel disappointed when they find that Cool Japan is considered "uncool" in Japan. We hope people in Japan will take more pride in their own culture, which people elsewhere admire.
- Research and Development
  - Model use on Discord: prompt engineering, fine-tuning (also known as additional learning, for example with DreamBooth), and merging with other models.
  - Latent Diffusion Model and Cool Japan: investigate how well the Latent Diffusion Model suits Cool Japan content.
  - Model performance evaluation: examine the performance of this model using metrics such as FID.
  - Model independence check: check that this model is independent of models other than Stable Diffusion, using checksums or hash functions.
- Education
  - Graduation projects for art college and vocational school students.
  - Graduation theses and course assignments for university students.
  - Teachers can use it to convey the current state of image-generating AI.
- Self-expression
  - Express your emotions and thoughts on social media.
- Use Cases on Hugging Face's Community
  - Please ask questions in Japanese or English.
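The model independence check listed under Research and Development mentions checksums and hash functions. A minimal sketch of that idea, using only the standard library's `hashlib`, is shown below. The helper names and the file names in the commented example are hypothetical. Note that a matching SHA-256 digest shows two files are byte-identical, while differing digests only show that they are not byte-identical. On its own, this says nothing more about how two models are related.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read until f.read() returns an empty bytes object (end of file).
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_file_contents(path_a, path_b):
    """True if the two files have identical SHA-256 digests."""
    return sha256_of_file(path_a) == sha256_of_file(path_b)


# Example with hypothetical file names:
# print(sha256_of_file("cool-japan-unet.safetensors"))
# print(same_file_contents("cool-japan-unet.safetensors", "other-model-unet.safetensors"))
```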
Unintended Use Cases
- Do not present generated content as fact.
- Do not use it for monetized content on platforms like YouTube.
- Do not offer it directly as a commercial service.
- Do not cause trouble for teachers.
- Avoid other actions that may have a negative impact on the creative industry.
Documentation
Model Details
Property | Details |
---|---|
Developer | Robin Rombach, Patrick Esser, Alfred Increment |
Model Type | Text-to-image generation model based on a diffusion model |
Language | Japanese |
License | CreativeML Open RAIL++-M-NC License |
Model Description | This model generates images that match the given prompts. It uses the Latent Diffusion Model and OpenCLIP-ViT/H. |
Supplementary | - |
References | Rombach et al., "High-Resolution Image Synthesis With Latent Diffusion Models," CVPR 2022, pp. 10684-10695 (full BibTeX entry under References) |
Prohibited or Malicious Use Cases
- Do not publish digital forgeries (risk of violating copyright law). In particular, do not publish images of existing characters (risk of violating copyright law). It appears that characters absent from the training data can also be generated. (This tweet itself is permitted for research purposes.)
- Do not perform image-to-image generation on others' works without permission (risk of violating copyright law).
- Do not distribute pornographic materials (risk of violating Article 175 of the Criminal Code).
- Do not state non-factual information as fact (risk of being charged with obstruction of business).
Model Limitations and Biases
Model Limitations
The limitations of this model are not yet well understood.
Biases
It has the same biases as Stable Diffusion. Please be careful.
Training
Training Data
- VAE: About 600,000 data items that comply with Japanese domestic law, excluding unauthorized reposting sites such as Danbooru (data augmentation can produce an effectively unlimited number of images).
- U-Net: 400,000 data pairs that comply with Japanese domestic law, excluding unauthorized reposting sites such as Danbooru.
Training Process
We fine-tuned the VAE and U-Net of Stable Diffusion.
- Hardware: RTX 3090
- Optimizer: AdamW
- Gradient Accumulations: 1
- Batch Size: 1
Evaluation Results
No evaluation results are provided in the original document.
Environmental Impact
There is almost no impact.
- Hardware Type: RTX 3090
- Usage Time (in hours): 300
- Cloud Provider: None
- Training Location: Japan
- Carbon Emissions: Not much
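The figures above allow a rough upper-bound estimate of the energy used. The sketch below assumes a single RTX 3090 drawing its rated board power of 350 W for the full 300 hours. The card does not state the GPU count or the measured power draw, so both are assumptions. The grid carbon intensity is a placeholder argument, not a measured value for Japan, and should be replaced with a figure from a trusted source.

```python
RTX_3090_BOARD_POWER_W = 350  # rated board power; real draw during training is usually lower


def training_energy_kwh(gpu_power_watts, hours, gpu_count=1):
    """Energy in kWh, assuming every GPU draws gpu_power_watts for the whole run."""
    return gpu_power_watts * hours * gpu_count / 1000.0


def emissions_kg_co2(energy_kwh, grid_kg_co2_per_kwh):
    """Emissions in kg CO2 for a given grid carbon intensity (kg CO2 per kWh)."""
    return energy_kwh * grid_kg_co2_per_kwh


if __name__ == "__main__":
    energy = training_energy_kwh(RTX_3090_BOARD_POWER_W, hours=300)
    print(f"Energy upper bound: {energy} kWh")  # 105.0 kWh
    # 0.5 kg CO2/kWh is a placeholder assumption used only to show the arithmetic.
    print(f"Emissions at 0.5 kg CO2/kWh: {emissions_kg_co2(energy, 0.5)} kg")  # 52.5 kg
```

Under these assumptions the run used at most about 105 kWh.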
References
@InProceedings{Rombach_2022_CVPR, author = {Rombach, Robin and Blattmann, Andreas and Lorenz, Dominik and Esser, Patrick and Ommer, Bj{\"o}rn}, title = {High-Resolution Image Synthesis With Latent Diffusion Models}, booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)}, month = {June}, year = {2022}, pages = {10684--10695} }
*This model card was written by Alfred Increment based on Stable Diffusion v2.
License
The license is the original CreativeML Open RAIL++-M License with an added prohibition on commercial use, subject to certain exceptions. We added this prohibition out of concern that commercial use may harm the creative industry. If that concern is resolved, the next version will revert to the original license and allow commercial use. The Japanese translation of the original license can be found here. If you work for a for-profit enterprise, please consult your legal department. If you use the model out of personal interest, you generally need not worry as long as you follow common sense. As stated in the license, any modified version of this model must carry the same license.