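🚀 Ld-Art Model
Ld-Art is a text-to-image model: given a text prompt, it generates high-quality images, and it performs especially well on close-up shots of faces.
🚀 Quick start
Environment setup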
import torch
from diffusers import DiffusionPipeline

# Load the FLUX.1-dev base model in bfloat16.
base_model = "black-forest-labs/FLUX.1-dev"
pipe = DiffusionPipeline.from_pretrained(base_model, torch_dtype=torch.bfloat16)

# Attach the LoRA weights; prompts should include the trigger word.
lora_repo = "strangerzonehf/Realism-v3-Flux"
trigger_word = "Realism v3"
pipe.load_lora_weights(lora_repo)

# Move the pipeline to the GPU.
device = torch.device("cuda")
pipe.to(device)
Trigger word
Include the trigger word Realism v3 in your prompt to trigger image generation.
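One way to avoid forgetting the trigger word is to prepend it in code. The sketch below is only an illustration: `with_trigger` is a hypothetical helper (it is not part of this model card or of diffusers), and it does nothing but string handling.

```python
TRIGGER_WORD = "Realism v3"


def with_trigger(prompt: str, trigger: str = TRIGGER_WORD) -> str:
    """Return the prompt with the trigger word in front, unless it already starts with it."""
    text = prompt.strip()
    if not text:
        # An empty prompt becomes just the trigger word.
        return trigger
    if text.lower().startswith(trigger.lower()):
        # Already triggered; leave the prompt unchanged.
        return text
    return f"{trigger}, {text}"


print(with_trigger("a close-up shot of a face"))
```

The result can then be passed to the pipeline as the prompt.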
Model download
The weights for this model are provided in Safetensors format.
Download them from the "Files and versions" tab.
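The weights can also be fetched from a script instead of the web page. This is a minimal sketch, assuming the `huggingface_hub` package is installed; `download_lora_weights` is a hypothetical helper name, and the repository id is the one used in the quick start. Note that `pipe.load_lora_weights(lora_repo)` already downloads the weights on first use, so this step is optional.

```python
def download_lora_weights(repo_id: str = "strangerzonehf/Realism-v3-Flux") -> str:
    """Download only the .safetensors files of the repo and return the local folder path."""
    # Imported lazily so that defining this helper does not require the package.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_id, allow_patterns=["*.safetensors"])


# Example (needs network access):
# local_dir = download_lora_weights()
```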
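✨ Key features
- Text-to-image generation: the model generates an image that matches the input text description.
- Support for varied scenes: it can generate close-up face images in different styles and against different backgrounds.
📚 Detailed documentation
Image examples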

Below are some text inputs and their corresponding image outputs:
- Input: 'Realism v3, a close-up shot of a womans face is seen. Her eyes are a piercing blue, and her eyebrows are a darker shade of brown. Her lips are a lighter shade of pink, and she has a slight smile on her face. Her hair is a dark brown, with a few wispy bangs on the top of her head. The background is blurred, and the womans skin is a light beige.'
Output: image link
- Input: 'Realism v3, a close-up shot of a womans face is seen. She is wearing a black turtleneck, her hair cascades over her shoulders, adding a touch of warmth to her face. Her eyes are a piercing blue, her eyebrows are a darker shade of brown, and her lips are a lighter shade of pink. Her hair is pulled back in a ponytail, framing her forehead. The backdrop is a plain white wall.'
Output: image link
- Input: 'Realism v3, A close-up shot of a womans face, taken from a low-angle perspective. The womans eyes are blue, her eyebrows are a darker shade of brown, and her lips are a lighter shade of red. She is wearing a gray turtleneck, and has a pair of silver earrings on her left ear. Her hair is pulled back in a ponytail, and she has a slight smile on her face. The background is dark, and the womans hair is a dark brown.'
Output: image link
- Input: 'Realism v3, A close-up eye-level shot of a red-haired man with a goatee and mustache. He is wearing a navy blue short-sleeved t-shirt, and his hair cascades over his shoulders. His eyes are a piercing blue, and he has a slight smile on his face. His hair is a vibrant shade of red, and the background is a light gray.'
Output: image link
Image parameters
| Parameter | Details |
|---|---|
| Base model | black-forest-labs/FLUX.1-dev |
| Instance prompt | Realism v3 |
| License | creativeml-openrail-m |
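Image generation parameters
| Parameter | Value | Parameter | Value |
|---|---|---|---|
| LR scheduler | constant | Noise offset | 0.03 |
| Optimizer | AdamW | Multires noise discount | 0.1 |
| Network dim | 64 | Multires noise iterations | 10 |
| Network alpha | 32 | Repeat & steps | 19 & 3100 |
| Epochs | 27 | Save every N epochs | 1 |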
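As a small worked example of how two of these values relate: many LoRA trainers multiply the learned low-rank update by alpha divided by the network dimension (rank). Assuming that convention applies here (the card does not name the trainer, so this is an assumption), the network dim of 64 and network alpha of 32 imply a scale of 0.5:

```python
network_dim = 64    # LoRA rank, from the parameter table
network_alpha = 32  # LoRA alpha, from the parameter table

# Assumed convention: the learned update is scaled by alpha / rank.
lora_scale = network_alpha / network_dim
print(lora_scale)  # 0.5
```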
Labeling
Images were labeled with florence2-en (natural language, English).
Total images used for training
33
Best dimensions and inference
| Dimensions | Aspect ratio | Recommendation |
|---|---|---|
| 1280 x 832 | 3:2 | Best |
| 1024 x 1024 | 1:1 | Default |
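The recommended sizes can be applied when calling the pipeline. The following is a rough sketch rather than the author's own script: `RECOMMENDED_SIZES` and `generate` are hypothetical names, and `pipe` is assumed to be the pipeline object created in the quick start, which accepts `prompt`, `width`, and `height` and returns an object with an `images` list.

```python
# Sizes taken from the table, keyed by aspect ratio.
RECOMMENDED_SIZES = {
    "3:2": (1280, 832),    # marked "best"
    "1:1": (1024, 1024),   # marked "default"
}


def generate(pipe, prompt: str, aspect_ratio: str = "3:2"):
    """Run the pipeline at one of the recommended sizes and return the first image."""
    width, height = RECOMMENDED_SIZES[aspect_ratio]
    result = pipe(prompt=prompt, width=width, height=height)
    return result.images[0]
```

With a loaded pipeline, `generate(pipe, "Realism v3, ...")` would return an image that can be written to disk with its `save` method.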
Inference range
📄 License
This model is released under the creativeml-openrail-m license.