🚀 pixart-900m-1024-ft
This project is a full-rank fine-tune of ptx0/pixart-900m-1024-ft-large. It generates images from text prompts.
🚀 Quick start
You can run inference as follows:
import torch
from diffusers import DiffusionPipeline

model_id = 'pixart-900m-1024-ft'
prompt = 'ethnographic photography of teddy bear at a picnic, ears tucked behind a cozy hoodie looking darkly off to the stormy picnic skies'
negative_prompt = 'blurry, cropped, ugly'

# Pick the best available device: CUDA first, then Apple MPS, then CPU
device = 'cuda' if torch.cuda.is_available() else 'mps' if torch.backends.mps.is_available() else 'cpu'

pipeline = DiffusionPipeline.from_pretrained(model_id)
pipeline.to(device)

image = pipeline(
    prompt=prompt,
    negative_prompt=negative_prompt,
    num_inference_steps=25,
    # A fixed seed makes the result reproducible
    generator=torch.Generator(device=device).manual_seed(1641421826),
    width=1152,
    height=768,
    guidance_scale=4.5,
    guidance_rescale=0.0,
).images[0]
image.save("output.png", format="PNG")
✨ Key features
📚 Documentation
Validation settings
- CFG: 4.5
- CFG Rescale: 0.0
- Steps: 25
- Sampler: None
- Seed: 42
- Resolutions: 1024x1024, 1344x768, 916x1152
Note: The validation settings are not necessarily the same as the training settings.
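The validation settings above can be turned into pipeline arguments programmatically. The following is a minimal sketch, not part of the original model card: `parse_resolutions` and `validation_kwargs` are hypothetical helper names, and the sketch assumes each resolution is written as width x height.

```python
# Hypothetical helpers (not from the model card): turn the validation
# settings listed above into keyword arguments for a pipeline call.

VALIDATION_RESOLUTIONS = "1024x1024,1344x768,916x1152"


def parse_resolutions(spec):
    """Parse 'WxH,WxH,...' into a list of (width, height) tuples.

    Assumption: each entry is written as width x height.
    """
    sizes = []
    for item in spec.split(","):
        width, height = item.strip().lower().split("x")
        sizes.append((int(width), int(height)))
    return sizes


def validation_kwargs(width, height):
    """Build call arguments from the listed CFG (4.5) and step count (25)."""
    return {
        "width": width,
        "height": height,
        "guidance_scale": 4.5,
        "num_inference_steps": 25,
    }


if __name__ == "__main__":
    for w, h in parse_resolutions(VALIDATION_RESOLUTIONS):
        print(validation_kwargs(w, h))
```

Each dictionary could then be unpacked into a call such as `pipeline(prompt=prompt, **validation_kwargs(w, h))`, with the generator seeded to 42 to match the listed seed.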
You can find some example images in the following gallery:
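Training settings
| Property | Details |
|---|---|
| Training epochs | 7 |
| Training steps | 100000 |
| Learning rate | 1e-06 |
| Effective batch size | 192 |
| Micro-batch size | 24 |
| Gradient accumulation steps | 1 |
| Number of GPUs | 8 |
| Prediction type | epsilon |
| Rescaled betas zero SNR | False |
| Optimizer | AdamW, stochastic bf16 |
| Precision | Pure BF16 |
| Xformers | Not used |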
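The effective batch size in the table follows from the other three batch-related values. As a small illustrative check (the function name `effective_batch_size` is hypothetical, and the formula is the usual data-parallel convention rather than something stated in the model card):

```python
def effective_batch_size(micro_batch_size, gradient_accumulation_steps, num_gpus):
    """Samples consumed per optimizer step under plain data parallelism."""
    return micro_batch_size * gradient_accumulation_steps * num_gpus


if __name__ == "__main__":
    # Values taken from the training settings table.
    print(effective_batch_size(micro_batch_size=24,
                               gradient_accumulation_steps=1,
                               num_gpus=8))  # 192
```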
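Datasets
photo-concept-bucket
| Property | Details |
|---|---|
| Repeats | 0 |
| Total number of images | ~567552 |
| Total number of aspect-ratio buckets | 1 |
| Resolution | 1.0 megapixels |
| Cropped | True |
| Crop style | random |
| Crop aspect | square |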
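To illustrate what a random crop with a square aspect means in practice, here is a minimal sketch. It is an assumption about how such settings could be interpreted, not the trainer's actual implementation: `random_square_crop_box` is a hypothetical helper that takes the largest square that fits and places it at a random offset, and it ignores any resizing to the 1.0 megapixel target.

```python
import random


def random_square_crop_box(width, height, rng=random):
    """Return (left, top, right, bottom) for a randomly placed square crop.

    Assumption: the square uses the full shorter side of the image.
    """
    side = min(width, height)
    left = rng.randint(0, width - side)   # randint is inclusive on both ends
    top = rng.randint(0, height - side)
    return (left, top, left + side, top + side)


if __name__ == "__main__":
    rng = random.Random(0)  # seeded so repeated runs give the same box
    print(random_square_crop_box(1536, 1024, rng))
```

The returned tuple uses the same (left, upper, right, lower) order that Pillow's `Image.crop` expects.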
💻 Usage examples
Basic usage
import torch
from diffusers import DiffusionPipeline

model_id = 'pixart-900m-1024-ft'
prompt = 'ethnographic photography of teddy bear at a picnic, ears tucked behind a cozy hoodie looking darkly off to the stormy picnic skies'
negative_prompt = 'blurry, cropped, ugly'

# Pick the best available device: CUDA first, then Apple MPS, then CPU
device = 'cuda' if torch.cuda.is_available() else 'mps' if torch.backends.mps.is_available() else 'cpu'

pipeline = DiffusionPipeline.from_pretrained(model_id)
pipeline.to(device)

image = pipeline(
    prompt=prompt,
    negative_prompt=negative_prompt,
    num_inference_steps=25,
    # A fixed seed makes the result reproducible
    generator=torch.Generator(device=device).manual_seed(1641421826),
    width=1152,
    height=768,
    guidance_scale=4.5,
    guidance_rescale=0.0,
).images[0]
image.save("output.png", format="PNG")
Advanced usage
You can adjust parameters such as prompt, negative_prompt, num_inference_steps, width, height, guidance_scale, and guidance_rescale to obtain images of different styles and quality.
import torch
from diffusers import DiffusionPipeline

model_id = 'pixart-900m-1024-ft'
prompt = 'A beautiful sunset over the ocean'
negative_prompt = 'ugly, blurry'

# Pick the best available device: CUDA first, then Apple MPS, then CPU
device = 'cuda' if torch.cuda.is_available() else 'mps' if torch.backends.mps.is_available() else 'cpu'

pipeline = DiffusionPipeline.from_pretrained(model_id)
pipeline.to(device)

image = pipeline(
    prompt=prompt,
    negative_prompt=negative_prompt,
    num_inference_steps=30,
    generator=torch.Generator(device=device).manual_seed(42),
    width=1280,
    height=720,
    guidance_scale=5.0,
    guidance_rescale=0.1,
).images[0]
image.save("custom_output.png", format="PNG")
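When comparing several parameter values, it can help to generate the combinations up front and record them in the output file names. The sketch below is illustrative only: `sweep_settings` and `output_name` are hypothetical helpers, the value ranges are arbitrary, and the pipeline call is left as a comment so the block runs without a GPU or model download.

```python
from itertools import product


def sweep_settings(guidance_scales, step_counts):
    """Yield one dict of call arguments per (guidance_scale, steps) pair."""
    for scale, steps in product(guidance_scales, step_counts):
        yield {"guidance_scale": scale, "num_inference_steps": steps}


def output_name(settings):
    """Build a file name that records the settings used for the image."""
    return "cfg{guidance_scale}_steps{num_inference_steps}.png".format(**settings)


if __name__ == "__main__":
    for settings in sweep_settings([3.5, 4.5, 5.5], [25, 30]):
        print(output_name(settings), settings)
        # With a loaded pipeline, each combination could be rendered as:
        # image = pipeline(prompt=prompt, **settings).images[0]
        # image.save(output_name(settings), format="PNG")
```

Keeping the prompt and seed fixed across the sweep makes the images easier to compare, since only the swept parameters change.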
📄 License
This project is released under the creativeml-openrail-m license.