Pixart-900m-1024-ft-v0.6: open-source image generation model

Pixart 900m 1024 Ft V0.6

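Developed by terminusresearch

An image generation model produced by full-rank fine-tuning of ptx0/pixart-900m-1024-ft-large, focused on high-quality image generation

Image generation | License: Openrail | #high-resolution image generation #ethnographic-style photography #aspect-ratio adaptive

Downloads: 4,111

Released: 6/17/2024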

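Model overview

This is an image generation model based on the PixArt architecture. It has been full-rank fine-tuned and generates high-resolution images from text prompts.

Model features

High-resolution image generation

Generates high-resolution images at 1024x1024 and at several other aspect ratios

Full-rank fine-tuning

Full-rank fine-tuned from the base model to improve generation quality

Multi-resolution support

Supports resolutions including 1024x1024, 1344x768, and 916x1152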
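
To make the listed resolutions easier to compare, the following sketch computes the reduced aspect ratio and pixel count of each one. It assumes the resolutions are written as width x height; the `describe` helper is a hypothetical name introduced only for this illustration.

```python
from math import gcd

# Resolutions listed above, assumed to be (width, height) pairs.
RESOLUTIONS = [(1024, 1024), (1344, 768), (916, 1152)]


def describe(width: int, height: int) -> dict:
    """Return the reduced aspect ratio and the size in megapixels."""
    divisor = gcd(width, height)
    return {
        "aspect_ratio": f"{width // divisor}:{height // divisor}",
        "megapixels": round(width * height / 1_000_000, 2),
    }


for width, height in RESOLUTIONS:
    print(f"{width}x{height}", describe(width, height))
```

All three resolutions come out close to one megapixel, which matches the 1.0 megapixel training resolution reported for the dataset.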

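Model capabilities

Text-to-image generation

High-resolution image generation

Image generation at multiple aspect ratios

Use cases

Creative design

Concept art

Generate concept art images from detailed text descriptions

For example, the teddy bear picnic scene used in the sample prompt

Commercial visual content

Quickly generate visual content for marketing, advertising, and other commercial uses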

🚀 pixart-900m-1024-ft

This project is a full-rank fine-tune of ptx0/pixart-900m-1024-ft-large. It generates high-quality images from input text descriptions.

🚀 Quick start

Run inference as follows:

import torch
from diffusers import DiffusionPipeline

model_id = 'pixart-900m-1024-ft'
prompt = 'ethnographic photography of teddy bear at a picnic, ears tucked behind a cozy hoodie looking darkly off to the stormy picnic skies'
negative_prompt = 'blurry, cropped, ugly'

# Pick the best available device: CUDA first, then Apple MPS, then CPU.
device = 'cuda' if torch.cuda.is_available() else 'mps' if torch.backends.mps.is_available() else 'cpu'

# Load the pipeline once and move it to the selected device.
pipeline = DiffusionPipeline.from_pretrained(model_id)
pipeline.to(device)

image = pipeline(
    prompt=prompt,
    negative_prompt=negative_prompt,
    num_inference_steps=25,
    # A fixed seed makes the result reproducible on the same device.
    generator=torch.Generator(device=device).manual_seed(1641421826),
    width=1152,
    height=768,
    guidance_scale=4.5,
    guidance_rescale=0.0,
).images[0]
image.save("output.png", format="PNG")

✨ Key features

Full-rank fine-tune of ptx0/pixart-900m-1024-ft-large.
The text encoder was not trained, so the base model's text encoder can be reused for inference.
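
Because the text encoder is unchanged, one way to reuse it is to load it from the base repository and pass it to the pipeline as a component override. The sketch below is illustrative only: it assumes the base repository follows the standard diffusers layout with a `text_encoder` subfolder containing a T5 encoder, and the function name is hypothetical.

```python
def load_pipeline_with_base_text_encoder(
    model_id: str = "pixart-900m-1024-ft",
    base_model_id: str = "ptx0/pixart-900m-1024-ft-large",
):
    """Load the fine-tuned pipeline with the base model's text encoder."""
    # Imported lazily because both libraries are heavy dependencies.
    from diffusers import DiffusionPipeline
    from transformers import T5EncoderModel

    # Assumption: the base repo stores its encoder under "text_encoder".
    text_encoder = T5EncoderModel.from_pretrained(
        base_model_id, subfolder="text_encoder"
    )
    # Components passed as keyword arguments replace the ones in the repo.
    return DiffusionPipeline.from_pretrained(model_id, text_encoder=text_encoder)
```

Sharing one encoder instance this way avoids downloading or holding a second copy of the weights when both models are used in the same process.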

📚 Documentation

Validation settings

CFG: 4.5
CFG Rescale: 0.0
Steps: 25
Sampler: None
Seed: 42
Resolutions: 1024x1024,1344x768,916x1152

Note: the validation settings are not necessarily the same as the training settings.
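
As a rough illustration of how these validation settings map onto the pipeline call used in this document, the sketch below builds one keyword-argument dictionary per validation resolution. The helper names are hypothetical, and reading each resolution as width x height is an assumption.

```python
# Validation settings copied from the list above.
VALIDATION = {
    "guidance_scale": 4.5,
    "guidance_rescale": 0.0,
    "num_inference_steps": 25,
    "seed": 42,
    "resolutions": "1024x1024,1344x768,916x1152",
}


def parse_resolutions(spec: str) -> list[tuple[int, int]]:
    """Parse 'WxH,WxH,...' into (width, height) tuples (width-first is assumed)."""
    pairs = []
    for item in spec.split(","):
        width, height = item.strip().lower().split("x")
        pairs.append((int(width), int(height)))
    return pairs


def build_call_kwargs(settings: dict) -> list[dict]:
    """Build one set of pipeline keyword arguments per validation resolution."""
    return [
        {
            "width": width,
            "height": height,
            "num_inference_steps": settings["num_inference_steps"],
            "guidance_scale": settings["guidance_scale"],
            "guidance_rescale": settings["guidance_rescale"],
        }
        for width, height in parse_resolutions(settings["resolutions"])
    ]


for kwargs in build_call_kwargs(VALIDATION):
    print(kwargs)
```

The seed is left out of the dictionaries on purpose: at call time it would be turned into a `torch.Generator(...).manual_seed(VALIDATION["seed"])` and passed as `generator`.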

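Training settings

Property	Details
Training epochs	7
Training steps	100000
Learning rate	1e-06
Effective batch size	192
Micro-batch size	24
Gradient accumulation steps	1
Number of GPUs	8
Prediction type	epsilon
Rescaled betas zero SNR	False
Optimizer	AdamW, stochastic bf16
Precision	Pure BF16
Xformers	Not used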
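
The effective batch size in the table follows from the other three values. A minimal sketch of the arithmetic, using the common convention that the effective batch is the per-GPU micro-batch multiplied by the accumulation steps and the GPU count (the function name is hypothetical):

```python
def effective_batch_size(micro_batch: int, grad_accum_steps: int, num_gpus: int) -> int:
    """Samples contributing to each optimizer step across all GPUs."""
    return micro_batch * grad_accum_steps * num_gpus


# Values from the training settings table: 24 * 1 * 8
print(effective_batch_size(micro_batch=24, grad_accum_steps=1, num_gpus=8))  # 192
```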

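Dataset

photo-concept-bucket

Property	Details
Repeats	0
Total number of images	~567552
Total number of aspect-ratio buckets	1
Resolution	1.0 megapixels
Cropped	True
Crop style	random
Crop aspect ratio	square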
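
The crop settings above (random style, square aspect ratio) can be pictured with a small sketch that chooses a random square region inside an image. This is an illustration of the idea only, not the trainer's actual implementation; the function name and the box convention `(left, top, right, bottom)` are assumptions.

```python
import random


def random_square_crop_box(width: int, height: int, rng: random.Random) -> tuple:
    """Pick a random square crop that fits inside a width x height image."""
    side = min(width, height)
    # The offset is random along the longer axis and 0 along the shorter one.
    left = rng.randint(0, width - side)
    top = rng.randint(0, height - side)
    return (left, top, left + side, top + side)


rng = random.Random(0)  # seeded so the example is repeatable
print(random_square_crop_box(1600, 900, rng))
```

The resulting box would then be resized to the training resolution, which the table gives as roughly one megapixel.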

💻 Usage examples

Basic usage

import torch
from diffusers import DiffusionPipeline

model_id = 'pixart-900m-1024-ft'
prompt = 'ethnographic photography of teddy bear at a picnic, ears tucked behind a cozy hoodie looking darkly off to the stormy picnic skies'
negative_prompt = 'blurry, cropped, ugly'

# Pick the best available device: CUDA first, then Apple MPS, then CPU.
device = 'cuda' if torch.cuda.is_available() else 'mps' if torch.backends.mps.is_available() else 'cpu'

# Load the pipeline once and move it to the selected device.
pipeline = DiffusionPipeline.from_pretrained(model_id)
pipeline.to(device)

image = pipeline(
    prompt=prompt,
    negative_prompt=negative_prompt,
    num_inference_steps=25,
    # A fixed seed makes the result reproducible on the same device.
    generator=torch.Generator(device=device).manual_seed(1641421826),
    width=1152,
    height=768,
    guidance_scale=4.5,
    guidance_rescale=0.0,
).images[0]
image.save("output.png", format="PNG")

Advanced usage

Adjust parameters such as prompt, negative_prompt, num_inference_steps, width, height, guidance_scale, and guidance_rescale to change the style and quality of the generated images.

import torch
from diffusers import DiffusionPipeline

model_id = 'pixart-900m-1024-ft'
# Custom prompts
prompt = 'A beautiful sunset over the ocean'
negative_prompt = 'ugly, blurry'

# Pick the best available device: CUDA first, then Apple MPS, then CPU.
device = 'cuda' if torch.cuda.is_available() else 'mps' if torch.backends.mps.is_available() else 'cpu'

pipeline = DiffusionPipeline.from_pretrained(model_id)
pipeline.to(device)

image = pipeline(
    prompt=prompt,
    negative_prompt=negative_prompt,
    num_inference_steps=30,  # more inference steps
    generator=torch.Generator(device=device).manual_seed(42),
    width=1280,  # adjusted image width
    height=720,  # adjusted image height
    guidance_scale=5.0,  # adjusted guidance scale
    guidance_rescale=0.1,  # adjusted guidance rescale
).images[0]
image.save("custom_output.png", format="PNG")