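🚀 Ld-Art Model
Ld-Art is a text-to-image model: given a text prompt, it generates high-quality images and performs especially well on scenes such as close-up shots of faces.
🚀 Quick Start
Environment setup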
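import torch
from diffusers import DiffusionPipeline  # DiffusionPipeline is provided by the diffusers package
# Load the FLUX.1-dev base model in bfloat16.
base_model = "black-forest-labs/FLUX.1-dev"
pipe = DiffusionPipeline.from_pretrained(base_model, torch_dtype=torch.bfloat16)
# Load the LoRA weights on top of the base model.
lora_repo = "strangerzonehf/Realism-v3-Flux"
trigger_word = "Realism v3"
pipe.load_lora_weights(lora_repo)
# Move the pipeline to the GPU.
device = torch.device("cuda")
pipe.to(device)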
Using the trigger word
Include Realism v3 in your prompt to trigger image generation.
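
The trigger word can be prepended to any description before it is passed to the pipeline. Below is a minimal sketch; the helper name `build_prompt` is hypothetical (it is not part of this model card or of diffusers) and only performs string handling.

```python
TRIGGER_WORD = "Realism v3"  # trigger word stated in this model card


def build_prompt(description: str, trigger_word: str = TRIGGER_WORD) -> str:
    """Return the description prefixed with the trigger word (hypothetical helper)."""
    description = description.strip()
    # Do not duplicate the trigger word if the caller already included it.
    if description.lower().startswith(trigger_word.lower()):
        return description
    return f"{trigger_word}, {description}"


prompt = build_prompt("a close-up shot of a man's face, blurred background")
print(prompt)
```

The resulting string can then be used as the prompt argument when calling the pipeline created in the setup step.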
Model download
The weights for this model are provided in Safetensors format.
Download them from the "Files and versions" tab.
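✨ Key Features
- Text-to-image generation: given a text description, the model generates a matching image.
- Support for varied scenes: it can generate close-up facial portraits in different styles and against different backgrounds.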
📚 Documentation
Image examples

Below are some example text inputs and their corresponding image outputs:
- Input: 'Realism v3, a close-up shot of a womans face is seen. Her eyes are a piercing blue, and her eyebrows are a darker shade of brown. Her lips are a lighter shade of pink, and she has a slight smile on her face. Her hair is a dark brown, with a few wispy bangs on the top of her head. The background is blurred, and the womans skin is a light beige.'
Output: image link
- Input: 'Realism v3, a close-up shot of a womans face is seen. She is wearing a black turtleneck, her hair cascades over her shoulders, adding a touch of warmth to her face. Her eyes are a piercing blue, her eyebrows are a darker shade of brown, and her lips are a lighter shade of pink. Her hair is pulled back in a ponytail, framing her forehead. The backdrop is a plain white wall.'
Output: image link
- Input: 'Realism v3, A close-up shot of a womans face, taken from a low-angle perspective. The womans eyes are blue, her eyebrows are a darker shade of brown, and her lips are a lighter shade of red. She is wearing a gray turtleneck, and has a pair of silver earrings on her left ear. Her hair is pulled back in a ponytail, and she has a slight smile on her face. The background is dark, and the womans hair is a dark brown.'
Output: image link
- Input: 'Realism v3, A close-up eye-level shot of a red-haired man with a goatee and mustache. He is wearing a navy blue short-sleeved t-shirt, and his hair cascades over his shoulders. His eyes are a piercing blue, and he has a slight smile on his face. His hair is a vibrant shade of red, and the background is a light gray.'
Output: image link
Image parameters

| Parameter | Details |
| --- | --- |
| Base model | black-forest-labs/FLUX.1-dev |
| Instance prompt | Realism v3 |
| License | creativeml-openrail-m |

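Image generation parameters

| Parameter | Value | Parameter | Value |
| --- | --- | --- | --- |
| LR scheduler | constant | Noise offset | 0.03 |
| Optimizer | AdamW | Multires noise discount | 0.1 |
| Network dim | 64 | Multires noise iterations | 10 |
| Network alpha | 32 | Repeats & steps | 19 & 3100 |
| Epochs | 27 | Save every N epochs | 1 |
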
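
In common LoRA implementations, the learned update is scaled by the network alpha divided by the network dimension (rank). Assuming this training run followed that convention, which the card does not state, the values in the table above give the following scale. The function name `lora_scale` is illustrative only.

```python
def lora_scale(network_alpha: float, network_dim: int) -> float:
    """Scaling factor alpha / rank, as used by common LoRA implementations."""
    if network_dim <= 0:
        raise ValueError("network_dim must be positive")
    return network_alpha / network_dim


# Values from the table above: network alpha = 32, network dim = 64.
scale = lora_scale(32, 64)
print(scale)  # 0.5
```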
Labeling
Captions were produced with florence2-en (natural language, English).
Total images used for training
33
Best dimensions and inference

| Dimensions | Aspect ratio | Recommendation |
| --- | --- | --- |
| 1280 x 832 | 3:2 | Best |
| 1024 x 1024 | 1:1 | Default |

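
The recommended sizes can be passed to the pipeline as `width` and `height`. The sketch below assumes `pipe` is the pipeline object from the setup step; `RECOMMENDED_SIZES`, `pick_size`, and `generate` are hypothetical names introduced only for illustration.

```python
# Sizes from the table above, keyed by aspect ratio as (width, height).
RECOMMENDED_SIZES = {
    "3:2": (1280, 832),   # listed as best
    "1:1": (1024, 1024),  # listed as default
}


def pick_size(aspect_ratio: str = "3:2") -> tuple[int, int]:
    """Look up the (width, height) pair for a listed aspect ratio."""
    try:
        return RECOMMENDED_SIZES[aspect_ratio]
    except KeyError:
        raise ValueError(f"no listed size for aspect ratio {aspect_ratio!r}") from None


def generate(pipe, prompt: str, aspect_ratio: str = "3:2"):
    """Run the pipeline at a listed size and return the first image."""
    width, height = pick_size(aspect_ratio)
    return pipe(prompt, width=width, height=height).images[0]


# Example usage (requires the loaded pipeline and a GPU, so it is left commented out):
# image = generate(pipe, "Realism v3, a close-up shot of a womans face")
# image.save("output.png")
```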
Inference range
📄 License
This model is released under the creativeml-openrail-m license.