🚀 ProteusV0.4: The Style Update Lightning Edition
This project is a text-to-image model. The ProteusV0.4 update focuses on enhancing stylistic capabilities, similar to Midjourney's approach, rather than improving prompt comprehension. The methods used do not infringe on any copyrighted material.
✨ Features
Proteus Overview
Proteus is a sophisticated enhancement over OpenDalleV1.1. It leverages the core functionalities of OpenDalleV1.1 to deliver better results. Key improvements include higher responsiveness to prompts and enhanced creative abilities.
Training Process
- Data Source: It was fine-tuned using about 220,000 GPTV-captioned images from copyright-free stock images (including some anime), which were then normalized.
- Optimization Method: DPO (Direct Preference Optimization) was employed with a collection of 10,000 carefully selected high-quality, AI-generated image pairs.
- LoRA Models: Numerous LoRA (Low-Rank Adaptation) models are trained independently and then selectively incorporated into the main model via dynamic application methods. These methods target specific segments within the model without interfering with other areas during the learning phase.
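As a rough illustration of the two techniques named in the list above, the sketch below implements the generic DPO preference loss and a plain LoRA weight merge in pure Python. It is not the author's training code. The function names `dpo_loss` and `merge_lora`, the `beta` and `alpha` parameters, and the toy numbers are all assumptions, and the loss shown is the generic log-likelihood form rather than whatever diffusion-specific variant was used. The "dynamic application methods" mentioned above are not described in enough detail to reproduce here.

```python
import math


def dpo_loss(policy_chosen, policy_rejected, ref_chosen, ref_rejected, beta=0.1):
    """Generic DPO loss for one preference pair.

    Assumption: the inputs are log-likelihoods of the preferred (chosen) and
    rejected samples under the model being trained and a frozen reference.
    """
    margin = (policy_chosen - ref_chosen) - (policy_rejected - ref_rejected)
    # -log(sigmoid(beta * margin)), written as log(1 + exp(-beta * margin))
    return math.log1p(math.exp(-beta * margin))


def merge_lora(weight, lora_b, lora_a, alpha=1.0):
    """Return weight + (alpha / rank) * (lora_b @ lora_a) using nested lists."""
    rank = len(lora_a)
    scale = alpha / rank
    merged = []
    for i in range(len(weight)):
        row = []
        for j in range(len(weight[0])):
            delta = sum(lora_b[i][k] * lora_a[k][j] for k in range(rank))
            row.append(weight[i][j] + scale * delta)
        merged.append(row)
    return merged


# Toy example: a 2x2 base weight with a rank-1 update (hypothetical values).
base = [[1.0, 0.0], [0.0, 1.0]]
lora_b = [[1.0], [2.0]]  # shape 2x1
lora_a = [[0.5, 0.5]]    # shape 1x2
merged = merge_lora(base, lora_b, lora_a, alpha=1.0)

# The loss drops below log(2) once the trained model prefers the chosen sample
# more strongly than the reference does.
neutral = dpo_loss(0.0, 0.0, 0.0, 0.0)
improved = dpo_loss(-1.0, -3.0, -2.0, -2.0)
```

Because a merge like this only adds a low-rank delta to the selected weight matrix, the remaining weights are untouched, which is one plausible reading of "without interfering with other areas".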
Performance Advantages
Proteus shows significant improvements in depicting intricate facial characteristics and lifelike skin textures. It also maintains good proficiency across various aesthetic domains, especially surrealism, anime, and cartoon-style visualizations. So far, it has been fine-tuned or trained on a total of more than 400,000 images.
📦 Installation
No specific installation steps are provided. The usage example below imports the torch and diffusers Python packages, so both need to be available in your environment.
💻 Usage Examples
Settings for ProteusV0.4 - Lightning
Use these settings for the best results with ProteusV0.4 - Lightning:
- CFG Scale: Use a CFG scale of 1 to 2.
- Steps: 4 to 10 steps for more detail, 8 steps for faster results.
- Sampler: Euler
- Scheduler: normal
- Resolution: 1280x1280 or 1024x1024
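One way to keep these settings consistent across scripts is a small helper that checks values against the ranges listed above and returns keyword arguments for the pipeline call. This is a minimal sketch, not part of the model's tooling: `RECOMMENDED` and `build_call_kwargs` are hypothetical names, and the keyword names are taken from the diffusers call in the Basic Usage example.

```python
# Recommended ranges copied from the settings list above.
RECOMMENDED = {
    "cfg_scale": (1.0, 2.0),
    "steps": (4, 10),
    "resolutions": {(1024, 1024), (1280, 1280)},
}


def build_call_kwargs(cfg_scale=2.0, steps=8, width=1024, height=1024):
    """Validate the settings and map them onto diffusers-style keyword names."""
    low, high = RECOMMENDED["cfg_scale"]
    if not low <= cfg_scale <= high:
        raise ValueError(f"cfg_scale {cfg_scale} is outside {low} to {high}")
    low, high = RECOMMENDED["steps"]
    if not low <= steps <= high:
        raise ValueError(f"steps {steps} is outside {low} to {high}")
    if (width, height) not in RECOMMENDED["resolutions"]:
        raise ValueError(f"resolution {width}x{height} is not recommended")
    return {
        "guidance_scale": cfg_scale,
        "num_inference_steps": steps,
        "width": width,
        "height": height,
    }


# Defaults mirror the values used in the Basic Usage example.
call_kwargs = build_call_kwargs()
```

The resulting dictionary could then be unpacked into the pipeline call, for example `pipe(prompt, **call_kwargs)`.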
Please also consider using these keywords to improve your prompts: best quality, HD, ~*~aesthetic~*~.
If you are having trouble coming up with prompts, you can use this GPT I put together to help you refine the prompt. [Click here](https://chat.openai.com/g/g-RziQNoydR-diffusion-master)
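If you would rather append the suggested quality keywords programmatically, a helper along these lines may be enough. It is only a sketch: `QUALITY_KEYWORDS` and `add_quality_keywords` are hypothetical names, and splitting the prompt on commas is an assumption about how prompts are written.

```python
# Keywords suggested in the settings section above.
QUALITY_KEYWORDS = ["best quality", "HD", "~*~aesthetic~*~"]


def add_quality_keywords(prompt, keywords=QUALITY_KEYWORDS):
    """Append each keyword that is not already present in the prompt."""
    parts = [part.strip() for part in prompt.split(",") if part.strip()]
    existing = {part.lower() for part in parts}
    for keyword in keywords:
        if keyword.lower() not in existing:
            parts.append(keyword)
            existing.add(keyword.lower())
    return ", ".join(parts)


# "best quality" is already present, so only the other two keywords are added.
enhanced = add_quality_keywords("a red fox, best quality")
```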
Basic Usage
import torch
from diffusers import (
    StableDiffusionXLPipeline,
    EulerAncestralDiscreteScheduler,
    AutoencoderKL
)

# Load the fp16-fix SDXL VAE in half precision
vae = AutoencoderKL.from_pretrained(
    "madebyollin/sdxl-vae-fp16-fix",
    torch_dtype=torch.float16
)

# Load the ProteusV0.4-Lightning pipeline with that VAE
pipe = StableDiffusionXLPipeline.from_pretrained(
    "dataautogpt3/ProteusV0.4-Lightning",
    vae=vae,
    torch_dtype=torch.float16
)
# Swap in the Euler Ancestral scheduler, reusing the pipeline's scheduler config
pipe.scheduler = EulerAncestralDiscreteScheduler.from_config(pipe.scheduler.config)
pipe.to('cuda')

# Define the prompt and the negative prompt
prompt = "black fluffy gorgeous dangerous cat animal creature, large orange eyes, big fluffy ears, piercing gaze, full moon, dark ambiance, best quality, extremely detailed"
negative_prompt = "nsfw, bad quality, bad anatomy, worst quality, low quality, low resolutions, extra fingers, blur, blurry, ugly, wrongs proportions, watermark, image artifacts, lowres, ugly, jpeg artifacts, deformed, noisy image"

# Generate one 1024x1024 image with 8 steps at guidance scale 2
image = pipe(
    prompt,
    negative_prompt=negative_prompt,
    width=1024,
    height=1024,
    guidance_scale=2,
    num_inference_steps=8
).images[0]
📄 License
This project is licensed under the GPL-3.0 license.
💡 Usage Tip
You can support the work by donating on Buy Me a Coffee or following on Twitter.