🚀 pixart-900m-1024-ft
This model is a full-rank fine-tune of ptx0/pixart-900m-1024-ft-large. It generates high-quality images from text descriptions.
🚀 Quick Start
Run inference with the following script:
import torch
from diffusers import DiffusionPipeline

model_id = 'pixart-900m-1024-ft'
prompt = 'ethnographic photography of teddy bear at a picnic, ears tucked behind a cozy hoodie looking darkly off to the stormy picnic skies'
negative_prompt = 'blurry, cropped, ugly'

# Pick the best available device: CUDA first, then Apple MPS, then CPU.
device = 'cuda' if torch.cuda.is_available() else 'mps' if torch.backends.mps.is_available() else 'cpu'

pipeline = DiffusionPipeline.from_pretrained(model_id)
pipeline.to(device)

image = pipeline(
    prompt=prompt,
    negative_prompt=negative_prompt,
    num_inference_steps=25,
    # A fixed seed makes the result reproducible.
    generator=torch.Generator(device=device).manual_seed(1641421826),
    width=1152,
    height=768,
    guidance_scale=4.5,
    guidance_rescale=0.0,
).images[0]
image.save("output.png", format="PNG")
✨ Key Features
📚 Documentation
Validation settings
- CFG: 4.5
- CFG Rescale: 0.0
- Steps: 25
- Sampler: None
- Seed: 42
- Resolutions: 1024x1024, 1344x768, 916x1152
Note: the validation settings are not necessarily the same as the training settings.
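As a rough sketch, the validation settings above could be turned into one set of pipeline keyword arguments per resolution. The helper name `validation_kwargs` is hypothetical. Mapping CFG to `guidance_scale`, CFG Rescale to `guidance_rescale`, and Steps to `num_inference_steps` follows the Quick Start call and is an assumption about the validation run, as is reading each resolution string as width x height.

```python
# Validation settings copied from the list above.
VALIDATION = {
    "guidance_scale": 4.5,
    "guidance_rescale": 0.0,
    "num_inference_steps": 25,
    "seed": 42,
    "resolutions": "1024x1024,1344x768,916x1152",
}


def validation_kwargs(settings):
    """Return one kwargs dict per resolution (assumes 'WIDTHxHEIGHT' strings)."""
    runs = []
    for item in settings["resolutions"].split(","):
        width, height = (int(v) for v in item.strip().split("x"))
        runs.append({
            "width": width,
            "height": height,
            "guidance_scale": settings["guidance_scale"],
            "guidance_rescale": settings["guidance_rescale"],
            "num_inference_steps": settings["num_inference_steps"],
        })
    return runs


runs = validation_kwargs(VALIDATION)
for kwargs in runs:
    print(kwargs["width"], kwargs["height"])
```

Each dict could then be unpacked into the pipeline call from the Quick Start, for example `pipeline(prompt=prompt, **kwargs)`, together with a generator seeded from `VALIDATION["seed"]`.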
You can find some example images in the following gallery:
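Training settings
| Property | Details |
|---|---|
| Training epochs | 7 |
| Training steps | 100000 |
| Learning rate | 1e-06 |
| Effective batch size | 192 |
| Micro-batch size | 24 |
| Gradient accumulation steps | 1 |
| Number of GPUs | 8 |
| Prediction type | epsilon |
| Rescaled betas zero SNR | False |
| Optimizer | AdamW, stochastic bf16 |
| Precision | Pure BF16 |
| Xformers | Not used |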
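The batch-size values listed above are consistent with the usual relation: effective batch size = micro-batch size × gradient accumulation steps × number of GPUs. The short check below only restates that arithmetic; the variable names are illustrative.

```python
# Values taken from the training settings above.
micro_batch_size = 24
gradient_accumulation_steps = 1
num_gpus = 8

# Samples contributing to each optimizer step across all devices.
effective_batch_size = micro_batch_size * gradient_accumulation_steps * num_gpus
print(effective_batch_size)  # 192
```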
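Datasets
photo-concept-bucket
| Property | Details |
|---|---|
| Repeats | 0 |
| Total number of images | ~567552 |
| Total number of aspect buckets | 1 |
| Resolution | 1.0 megapixels |
| Cropped | True |
| Crop style | random |
| Crop aspect | square |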
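To illustrate what a random crop with a square aspect means, here is a minimal sketch that picks a random square crop box inside an image of a given size. The function `random_square_crop_box` is hypothetical and is not taken from the training code; how the trainer actually crops and then resizes to roughly 1.0 megapixel is not described here.

```python
import random


def random_square_crop_box(width, height, rng=random):
    """Return (left, top, right, bottom) for a random square crop.

    The square side equals the shorter image edge; its position along
    the longer edge is chosen uniformly at random.
    """
    side = min(width, height)
    left = rng.randint(0, width - side)
    top = rng.randint(0, height - side)
    return (left, top, left + side, top + side)


# Seeded generator so the example is repeatable.
box = random_square_crop_box(1344, 768, random.Random(0))
print(box)
```

The returned tuple uses the same (left, upper, right, lower) order that Pillow's `Image.crop` accepts.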
💻 Usage Examples
Basic usage
import torch
from diffusers import DiffusionPipeline

model_id = 'pixart-900m-1024-ft'
prompt = 'ethnographic photography of teddy bear at a picnic, ears tucked behind a cozy hoodie looking darkly off to the stormy picnic skies'
negative_prompt = 'blurry, cropped, ugly'

# Pick the best available device: CUDA first, then Apple MPS, then CPU.
device = 'cuda' if torch.cuda.is_available() else 'mps' if torch.backends.mps.is_available() else 'cpu'

pipeline = DiffusionPipeline.from_pretrained(model_id)
pipeline.to(device)

image = pipeline(
    prompt=prompt,
    negative_prompt=negative_prompt,
    num_inference_steps=25,
    # A fixed seed makes the result reproducible.
    generator=torch.Generator(device=device).manual_seed(1641421826),
    width=1152,
    height=768,
    guidance_scale=4.5,
    guidance_rescale=0.0,
).images[0]
image.save("output.png", format="PNG")
Advanced usage
You can adjust parameters such as prompt, negative_prompt, num_inference_steps, width, height, guidance_scale, and guidance_rescale to obtain images of different styles and quality.
import torch
from diffusers import DiffusionPipeline

model_id = 'pixart-900m-1024-ft'
prompt = 'A beautiful sunset over the ocean'
negative_prompt = 'ugly, blurry'

# Pick the best available device: CUDA first, then Apple MPS, then CPU.
device = 'cuda' if torch.cuda.is_available() else 'mps' if torch.backends.mps.is_available() else 'cpu'

pipeline = DiffusionPipeline.from_pretrained(model_id)
pipeline.to(device)

image = pipeline(
    prompt=prompt,
    negative_prompt=negative_prompt,
    num_inference_steps=30,
    generator=torch.Generator(device=device).manual_seed(42),
    width=1280,
    height=720,
    guidance_scale=5.0,
    guidance_rescale=0.1,
).images[0]
image.save("custom_output.png", format="PNG")
📄 License
This project is released under the creativeml-openrail-m license.