🚀 Picasso Diffusion 1.1 Model Card
Picasso Diffusion 1.1 is an image-generation AI specialized in AI art, developed with approximately 7,000 GPU hours.
[Sample image. Title: "Welcome to Scientific Fact World."]
🚀 Quick Start
For a quick and easy trial, please use this Space. The model can be downloaded in safetensors format or ckpt format.
✨ Features
Picasso Diffusion is an image-generation AI specialized in AI art, developed with about 7,000 GPU hours.
📦 Installation
Web UI
As with Stable Diffusion v2, put the model file (ckpt or safetensors format) and the configuration file (yaml format) into the model folder. For detailed installation instructions, please refer to this article. Installing xformers and enabling the --xformers --disable-nan-check options is recommended. Otherwise, enable the --no-half option.
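The file placement described above can be checked with a short script. This is a minimal sketch, not part of the original instructions: the helper name `checkpoints_missing_config` is hypothetical, and it assumes the yaml configuration file sits next to the weights and shares the same base name, which is an assumption about how Stable Diffusion v2 style models are usually set up in the Web UI.

```python
from pathlib import Path

# Weight file extensions mentioned in the Web UI instructions.
CHECKPOINT_SUFFIXES = {".ckpt", ".safetensors"}


def checkpoints_missing_config(model_dir):
    """Return names of checkpoints in model_dir that have no same-named .yaml file."""
    folder = Path(model_dir)
    missing = []
    for path in sorted(folder.iterdir()):
        if path.suffix.lower() in CHECKPOINT_SUFFIXES:
            # Assumption: the config shares the checkpoint's base name.
            if not path.with_suffix(".yaml").exists():
                missing.append(path.name)
    return missing
```

Calling the function on your model folder returns a list of checkpoint files that still need a configuration file; an empty list means every checkpoint has one.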
Diffusers
Use 🤗's Diffusers library. First, run the following script to install the library:
pip install --upgrade git+https://github.com/huggingface/diffusers.git transformers accelerate scipy
💻 Usage Examples
Basic Usage
from diffusers import StableDiffusionPipeline, EulerAncestralDiscreteScheduler
import torch

# Hugging Face Hub repository that hosts the model
model_id = "alfredplpl/picasso-diffusion-1-1"
# Load the Euler Ancestral scheduler configuration shipped with the model
scheduler = EulerAncestralDiscreteScheduler.from_pretrained(model_id, subfolder="scheduler")
# Load the pipeline in half precision and move it to the GPU
pipe = StableDiffusionPipeline.from_pretrained(model_id, scheduler=scheduler, torch_dtype=torch.float16)
pipe = pipe.to("cuda")

prompt = "anime, masterpiece, a portrait of a girl, good pupil, 4k, detailed"
negative_prompt="deformed, blurry, bad anatomy, bad pupil, disfigured, poorly drawn face, mutation, mutated, extra limb, ugly, poorly drawn hands, bad hands, fused fingers, messy drawing, broken legs censor, low quality, mutated hands and fingers, long body, mutation, poorly drawn, bad eyes, ui, error, missing fingers, fused fingers, one hand with more than 5 fingers, one hand with less than 5 fingers, one hand with more than 5 digit, one hand with less than 5 digit, extra digit, fewer digits, fused digit, missing digit, bad digit, liquid digit, long body, uncoordinated body, unnatural body, lowres, jpeg artifacts, 3d, cg, text, japanese kanji"
# Generate with 20 denoising steps and save the first image
images = pipe(prompt, negative_prompt=negative_prompt, num_inference_steps=20).images
images[0].save("girl.png")
⚠️ Important Note
- Using xformers can speed up the process.
- If GPU memory is limited, call pipe.enable_attention_slicing().
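The two notes above can be combined into one small helper. This is a sketch under stated assumptions: `apply_memory_options` and its `low_vram` and `use_xformers` parameters are hypothetical names introduced here, while `enable_attention_slicing()` and `enable_xformers_memory_efficient_attention()` are methods of Diffusers pipelines. The helper falls back gracefully if xformers cannot be enabled.

```python
def apply_memory_options(pipe, low_vram=False, use_xformers=False):
    """Turn on optional speed and memory features on a Diffusers pipeline."""
    if use_xformers:
        try:
            # Requires the xformers package; raises if it is unavailable.
            pipe.enable_xformers_memory_efficient_attention()
        except Exception as err:
            print(f"xformers not enabled: {err}")
    if low_vram:
        # Computes attention in slices to lower peak GPU memory use.
        pipe.enable_attention_slicing()
    return pipe
```

With the pipeline from Basic Usage, `pipe = apply_memory_options(pipe, low_vram=True)` would be called once after `pipe.to("cuda")` and before generating images.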
Expected Use Cases
- Self-expression: Use this AI to express your uniqueness.
- Reporting on image-generation AI: This is permitted for for-profit enterprises as well as public broadcasters. The reason is that the public's "right to know" about image-synthesis AI is judged not to have a negative impact on the creative industry, and freedom of the press is respected.
- Research and development:
  - Model usage on Discord:
    - Prompt engineering.
    - Fine-tuning (also known as additional learning), such as DreamBooth.
    - Merging with other models.
  - Investigating the performance of this model using metrics like FID.
  - Checking the independence of this model from models other than Stable Diffusion using checksums or hash functions.
- Education:
  - Graduation projects for art college students and vocational school students.
  - Graduation theses and assignment projects for university students.
  - Teachers can use it to convey the current situation of image-generation AI.
- Use cases described in the Hugging Face Community: Please ask questions in Japanese or English.
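For the research use case of comparing model files with checksums or hash functions, a file digest can be computed as follows. This is a minimal sketch, not the author's procedure: `sha256_of_file` is a hypothetical helper name, and SHA-256 is just one reasonable choice of hash. Note that a file hash only shows whether two files are byte-identical, so it is at most a starting point for such a check.

```python
import hashlib


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read until f.read() returns an empty bytes object (end of file).
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Two downloaded checkpoints can then be compared with `sha256_of_file(path_a) == sha256_of_file(path_b)`, where both paths are placeholders for local files.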
Unintended Use Cases
- Representing things as facts.
- Using it in monetized content on platforms like YouTube.
- Directly providing it as a commercial service.
- Causing trouble for teachers.
- Other actions that may have a negative impact on the creative industry.
Prohibited or Malicious Use Cases
- Do not publish digital forgeries (may violate copyright laws).
  - In particular, do not publish existing characters (may violate copyright laws).
- Do not perform image-to-image operations on others' works without permission (may violate copyright laws).
- Do not distribute pornographic materials (may violate Article 175 of the Criminal Law).
- Do not violate so-called industry etiquette.
- Do not present non-factual information as fact (may constitute the crime of interfering with business operations).
  - Avoid spreading fake news.
📚 Documentation
Model Details
Property | Details |
---|---|
Model Type | Diffusion-model-based text-to-image generation model |
Language | Japanese |
License | CreativeML Open RAIL++-M-NC License |
Model Description | This model can generate appropriate images according to prompts. The algorithms are Latent Diffusion Model and OpenCLIP-ViT/H. |
Supplementary Notes | |
References | Rombach, Blattmann, Lorenz, Esser, and Ommer, "High-Resolution Image Synthesis With Latent Diffusion Models", CVPR 2022, pp. 10684-10695 |
Model Limitations and Biases
Model Limitations
Diffusion models and large-scale language models still have many unknown aspects, and their limitations are not yet clear.
Biases
Diffusion models and large-scale language models still have many unknown aspects, and their biases are not yet clear.
Training
Training Data
The model was trained on data and models that comply with Japanese domestic law. Unauthorized re-posting sites such as Danbooru were excluded.
Training Process
- Hardware: A100 80GB, V100
Evaluation Results
We are seeking evaluations from third parties.
Environmental Impact
- Hardware Type: A100 80GB, V100
- Usage Time (in hours): 7000
- Training Location: Japan
References
@InProceedings{Rombach_2022_CVPR,
  author    = {Rombach, Robin and Blattmann, Andreas and Lorenz, Dominik and Esser, Patrick and Ommer, Bj{\"o}rn},
  title     = {High-Resolution Image Synthesis With Latent Diffusion Models},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  month     = {June},
  year      = {2022},
  pages     = {10684-10695}
}
*This model card is based on the Stable Diffusion v2 model card.
📄 License
With some exceptions, the license adds a prohibition on commercial use to the original CreativeML Open RAIL++-M License. The prohibition was added out of concern that commercial use may have a negative impact on the creative industry. If you work for a for-profit enterprise, please consult your legal department. If you use the model for a personal hobby, you can use it without much concern as long as you follow general common sense.
This model was created in Japan, so Japanese law applies. We claim that training this model is legal under Article 30-4 of the Copyright Law. We also claim that distributing this model does not make us a principal offender or an accessory offender under the Copyright Law or Article 175 of the Criminal Law. For more details, please refer to the opinion of lawyer Kakinuma. However, as stated in the license, please handle the outputs of this model in accordance with the applicable laws and regulations.