ProteusV0.2 open-source text-to-image model: strengthened core capabilities, better prompt understanding and style rendering

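ProteusV0.2

Developed by dataautogpt3

ProteusV0.2 is an advanced version based on OpenDalleV1.1. By strengthening its core capabilities, it delivers strong text-to-image results, with notable gains in prompt understanding and style rendering.

Image generation | License: GPL-3.0 | #surreal-anime-blend #high-precision-face-rendering #DPO-optimized-quality

Downloads: 20.62k

Released: 1/19/2024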

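Model Overview

ProteusV0.2 is a text-to-image model focused on generating high-quality images in multiple visual styles, including surrealism, anime, and cartoon. A merge with RealCartoonXL and Direct Preference Optimization (DPO) improve the model's prompt responsiveness and creativity.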

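Model Characteristics

Responsive prompting

The model understands prompts markedly better and generates images that match the description more accurately.

Strong style rendering

Supports surrealism, anime, cartoon, and other visual styles, with stylistic expressiveness approaching MJ6.

Complex facial feature optimization

Dynamically applied LoRA models markedly improve the rendering of facial features and realistic skin texture.

Direct Preference Optimization (DPO)

Optimized with curated pairs of AI-generated images, which improves the model's overall performance.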

Model Capabilities

Text-to-image generation

Multi-style image generation

High-quality detail rendering

Complex scene generation

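Use Cases

Artistic creation

Anime character design

Generate anime character images in a specific style, such as a black fluffy feline creature or a pixel-art space girl.

High-quality, richly detailed anime character images.

Portrait generation

Generate portraits with a specific style and mood, such as an old man with a long white beard or a banker sitting alone in a bar late at night.

Emotionally rich portraits with a distinctive style.

Film and TV concept design

Cinematic film stills

Generate stills with a motion-picture film look, such as a woman in a kimono on a subway train in Japan.

Stills with a Kodak motion-picture film look, shallow depth of field, and vignetting.

Sci-fi scene design

Generate space-grunge scenes, such as Jeanne d'Arc in a worn olive-green cotton frock.

A dirty, noisy look with a sci-fi vibe and very fine detail.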

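🚀 ProteusV0.2

ProteusV0.2 is a powerful text-to-image model that substantially improves on OpenDalleV1.1. It understands prompts better and performs well across many image styles, especially surrealism, anime, and cartoon.

🚀 Quick Start

Model characteristics

ProteusV0.2 is merged with RealCartoonXL at a weight of only 0.5%, which was enough to fix its inability to understand tags related to anime or cartoon styles. Compared with version 0.1, version 0.2 brings subtle but significant improvements: it surpasses MJ6 in prompt understanding and comes close to it in stylistic ability.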
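The source does not say how the 0.5% merge was performed. As a minimal sketch, assuming a plain linear interpolation between two checkpoints with identical parameter names, the idea can be illustrated as follows. The function name `merge_state_dicts` and the toy parameter names are hypothetical, and real checkpoints hold tensors rather than Python lists.

```python
from typing import Dict, List


def merge_state_dicts(
    base: Dict[str, List[float]],
    other: Dict[str, List[float]],
    other_weight: float,
) -> Dict[str, List[float]]:
    """Linearly interpolate two checkpoints that share the same keys.

    merged = (1 - other_weight) * base + other_weight * other
    This is an assumed merge rule, not one confirmed by the model card.
    """
    if not 0.0 <= other_weight <= 1.0:
        raise ValueError("other_weight must be in [0, 1]")
    if base.keys() != other.keys():
        raise ValueError("checkpoints must have identical parameter names")

    merged: Dict[str, List[float]] = {}
    for name, base_values in base.items():
        other_values = other[name]
        if len(base_values) != len(other_values):
            raise ValueError(f"shape mismatch for {name}")
        merged[name] = [
            (1.0 - other_weight) * b + other_weight * o
            for b, o in zip(base_values, other_values)
        ]
    return merged


# Toy stand-ins for the two models (hypothetical values).
base_model = {"layer.weight": [1.0, 2.0], "layer.bias": [0.0]}
cartoon_model = {"layer.weight": [3.0, 0.0], "layer.bias": [2.0]}

# 0.5% of the second model, as stated in the text above.
merged_model = merge_state_dicts(base_model, cartoon_model, other_weight=0.005)
```

With such a small weight the merged parameters stay very close to the base model, which is consistent with the description of a light-touch merge.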

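Model strengths

Proteus is a sophisticated enhancement of OpenDalleV1.1 that builds on its core functionality to deliver better results. The main improvements are greater responsiveness to prompts and enhanced creative capability. To achieve this, it was fine-tuned on roughly 220,000 GPTV-captioned images from copyright-free stock imagery (including some anime), which were then normalized. In addition, Direct Preference Optimization (DPO) was applied using a curated set of 10,000 pairs of high-quality AI-generated images.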
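To make the DPO step more concrete, here is a minimal sketch of the standard DPO loss for a single preference pair. It assumes you already have log-probabilities for the preferred and rejected samples under the trained model and under a frozen reference model. The source does not state which DPO variant or `beta` value was used, and diffusion models typically use an adapted objective, so the function and its default are illustrative only.

```python
import math


def dpo_loss(
    policy_logp_chosen: float,
    policy_logp_rejected: float,
    ref_logp_chosen: float,
    ref_logp_rejected: float,
    beta: float = 0.1,  # assumed value, not taken from the model card
) -> float:
    """Standard DPO loss for one (chosen, rejected) pair.

    loss = -log(sigmoid(beta * ((pi_c - ref_c) - (pi_r - ref_r))))
    """
    chosen_gain = policy_logp_chosen - ref_logp_chosen
    rejected_gain = policy_logp_rejected - ref_logp_rejected
    margin = beta * (chosen_gain - rejected_gain)

    # -log(sigmoid(m)) equals log(1 + exp(-m)); branch for numerical stability.
    if margin > 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# When the policy equals the reference, the margin is 0 and the loss is log(2).
neutral = dpo_loss(-5.0, -5.0, -5.0, -5.0)

# Raising the chosen sample's likelihood relative to the reference lowers the loss.
improved = dpo_loss(-3.0, -5.0, -5.0, -5.0)
```

The loss rewards the model for assigning relatively more probability to the preferred image of each pair than the reference model does, which is how a set of curated image pairs can steer output quality.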

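Model results

In pursuit of the best performance, numerous LoRA (Low-Rank Adaptation) models were trained independently and then selectively incorporated into the main model through dynamic application methods. These techniques target specific parts of the model during training while avoiding interference with other areas. As a result, Proteus shows marked improvements in rendering intricate facial features and realistic skin textures, while retaining strong performance across aesthetic domains, notably surrealism, anime, and cartoon-style imagery.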
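The source does not describe its "dynamic application" method in detail. As a rough sketch of the underlying idea, the following shows the usual LoRA update, W' = W + (alpha / r) * B A, applied only to selected layers so that other layers are left untouched. The layer names `face_attn` and `background_attn` and the helper names are hypothetical, and plain Python lists stand in for tensors.

```python
from typing import Dict, List, Set, Tuple

Matrix = List[List[float]]
# An adapter is (lora_up, lora_down, alpha); lora_up is (out x rank), lora_down is (rank x in).
Adapter = Tuple[Matrix, Matrix, float]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain-Python matrix product of a (m x k) and b (k x n)."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def merge_lora(weight: Matrix, lora_up: Matrix, lora_down: Matrix, alpha: float) -> Matrix:
    """Return weight + (alpha / rank) * lora_up @ lora_down."""
    rank = len(lora_down)
    scale = alpha / rank
    delta = matmul(lora_up, lora_down)
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]


def apply_selected(
    weights: Dict[str, Matrix],
    adapters: Dict[str, Adapter],
    selected: Set[str],
) -> Dict[str, Matrix]:
    """Merge adapters into the selected layers only; copy every other layer unchanged."""
    patched: Dict[str, Matrix] = {}
    for name, weight in weights.items():
        if name in selected and name in adapters:
            patched[name] = merge_lora(weight, *adapters[name])
        else:
            patched[name] = [row[:] for row in weight]
    return patched


identity = [[1.0, 0.0], [0.0, 1.0]]
weights = {"face_attn": identity, "background_attn": identity}
adapters = {"face_attn": ([[0.5], [1.0]], [[1.0, 2.0]], 1.0)}  # rank-1 toy adapter

patched = apply_selected(weights, adapters, selected={"face_attn"})
```

Only `face_attn` receives the low-rank delta here; `background_attn` is copied as-is, which mirrors the stated goal of improving specific areas without disturbing the rest of the model.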

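✨ Key Features

Style merging: merged with RealCartoonXL to fix the understanding of anime and cartoon style tags.
Better prompt understanding: prompt understanding that surpasses MJ6.
Multi-style support: performs well across aesthetic domains such as surrealism, anime, and cartoon.
Detail refinement: marked improvements in rendering complex facial features and skin texture.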

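📦 Recommended Settings

Use the following settings for the best results with ProteusV0.2:

CFG Scale: use a CFG scale of 7 to 8.
Steps: 20 to 60 steps; use more steps for finer detail, or 20 for faster results.
Sampler: DPM++ 2M SDE
Scheduler: Karras
Resolution: 1280x1280 or 1024x1024

In addition, adding the following keywords to the prompt is recommended to improve results: best quality, HD, ~*~aesthetic~*~.

If you have trouble coming up with prompts, you can use this GPT I put together to help refine them: click to open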
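As a small illustration of the settings above, the sketch below builds a keyword-argument dictionary whose keys match the arguments used with `StableDiffusionXLPipeline` in the usage example (`guidance_scale`, `num_inference_steps`, `width`, `height`). The helper name `build_generation_kwargs` is hypothetical, and clamping values into the recommended ranges is an assumption made for demonstration; the ranges are recommendations, not hard limits.

```python
from typing import Any, Dict

# Recommended values taken from the settings list above.
CFG_RANGE = (7.0, 8.0)
STEP_RANGE = (20, 60)
RESOLUTIONS = (1024, 1280)
QUALITY_KEYWORDS = "best quality, HD, ~*~aesthetic~*~"


def build_generation_kwargs(
    prompt: str,
    cfg_scale: float = 7.5,
    steps: int = 20,
    resolution: int = 1024,
) -> Dict[str, Any]:
    """Clamp inputs to the recommended ranges and append the quality keywords."""
    cfg = min(max(cfg_scale, CFG_RANGE[0]), CFG_RANGE[1])
    n_steps = min(max(steps, STEP_RANGE[0]), STEP_RANGE[1])
    size = resolution if resolution in RESOLUTIONS else RESOLUTIONS[0]
    return {
        "prompt": f"{prompt}, {QUALITY_KEYWORDS}",
        "guidance_scale": cfg,
        "num_inference_steps": n_steps,
        "width": size,
        "height": size,
    }


# Out-of-range inputs are pulled back into the recommended ranges.
kwargs = build_generation_kwargs("a cat on a rooftop", cfg_scale=12.0, steps=10, resolution=1280)
```

The resulting dictionary could be passed to a loaded pipeline as `pipe(**kwargs)`. The sampler and scheduler choice (DPM++ 2M SDE with Karras sigmas) is configured on the pipeline's scheduler rather than through these arguments.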

💻 Usage Examples

Basic usage

import torch
from diffusers import (
    StableDiffusionXLPipeline, 
    KDPM2AncestralDiscreteScheduler,
    AutoencoderKL
)

# Load VAE component
vae = AutoencoderKL.from_pretrained(
    "madebyollin/sdxl-vae-fp16-fix", 
    torch_dtype=torch.float16
)

# Configure the pipeline
pipe = StableDiffusionXLPipeline.from_pretrained(
    "dataautogpt3/ProteusV0.2", 
    vae=vae,
    torch_dtype=torch.float16
)
pipe.scheduler = KDPM2AncestralDiscreteScheduler.from_config(pipe.scheduler.config)
pipe.to('cuda')

# Define prompts and generate image
prompt = "black fluffy gorgeous dangerous cat animal creature, large orange eyes, big fluffy ears, piercing gaze, full moon, dark ambiance, best quality, extremely detailed"
negative_prompt = "nsfw, bad quality, bad anatomy, worst quality, low quality, low resolutions, extra fingers, blur, blurry, ugly, wrongs proportions, watermark, image artifacts, lowres, ugly, jpeg artifacts, deformed, noisy image"

image = pipe(
    prompt, 
    negative_prompt=negative_prompt, 
    width=1024,
    height=1024,
    guidance_scale=7.5,
    num_inference_steps=50
).images[0]

📄 License

This project is licensed under GPL-3.0.

Support the Author

If you find this project helpful, please consider supporting the author.

Example Showcase

| Input prompt | Output image link |
| ---- | ---- |
| black fluffy gorgeous dangerous cat animal creature, large orange eyes, big fluffy ears, piercing gaze, full moon, dark ambiance, best quality, extremely detailed | [ComfyUI_03087_.png](ComfyUI_03087_.png) |
| (impressionistic realism by csybgh), a 50 something male, working in banking, very short dyed dark curly balding hair, Afro-Asiatic ancestry, talks a lot but listens poorly, stuck in the past, wearing a suit, he has a certain charm, bronze skintone, sitting in a bar at night, he is smoking and feeling cool, drunk on plum wine, masterpiece, 8k, hyper detailed, smokey ambiance, perfect hands AND fingers | [GEN8-iTXcAA-okN.jpeg](GEN8-iTXcAA-okN.jpeg) |
| high quality pixel art, a pixel art silhouette of an anime space-themed girl in a space-punk steampunk style, lying in her bed by the window of a spaceship, smoking, with a rustic feel. The image should embody epic portraiture and double exposure, featuring an isolated landscape visible through the window. The colors should primarily be dynamic and action-packed, with a strong use of negative space. The entire artwork should be in pixel art style, emphasizing the characters shape and set against a white background. Silhouette | [ComfyUI_03060_.png](ComfyUI_03060_.png) |
| The image features an older man, a long white beard and mustache, He has a stern expression, giving the impression of a wise and experienced individual. The mans beard and mustache are prominent, adding to his distinguished appearance. The close-up shot of the mans face emphasizes his facial features and the intensity of his gaze. | [ComfyUI_03017_.png](ComfyUI_03017_.png) |
| Super Closeup Portrait, action shot, Profoundly dark whitish meadow, glass flowers, Stains, space grunge style, Jeanne d'Arc wearing White Olive green used styled Cotton frock, Wielding thin silver sword, Sci-fi vibe, dirty, noisy, Vintage monk style, very detailed, hd | [ComfyUI_03045.png](ComfyUI_03045.png) |
| cinematic film still of Kodak Motion Picture Film: (Sharp Detailed Image) An Oscar winning movie for Best Cinematography a woman in a kimono standing on a subway train in Japan Kodak Motion Picture Film Style, shallow depth of field, vignette, highly detailed, high budget, bokeh, cinemascope, moody, epic, gorgeous, film grain, grainy | [3.png](3.png) |
| in the style of artgerm, comic style,3D model, mythical seascape, negative space, space quixotic dreams, temporal hallucination, psychedelic, mystical, intricate details, very bright neon colors, (vantablack background:1.5), pointillism, pareidolia, melting, symbolism, very high contrast, chiaroscuro | [ComfyUI_03061_.png](ComfyUI_03061_.png) |
| 1980s anime portrait of a character glitching. His face is separated from his body by heavy static. His face is deformed by pain. Dream-like, analog horror, glitch, terrifying | [ComfyUI_03092_.png](ComfyUI_03092_.png) |
| (("Proteus"):text_logo:1) | [ComfyUI_03297_.png](ComfyUI_03297_.png) |
| dan seagrave, dante, Abandon All Hope, Ye Who Enter Here, hell religious art purgatory zdzislaw Beksinski, abyss inferno, lost, wanderer | [ComfyUI_03483_.png](ComfyUI_03483_.png) |