Auraflow Open-Source Model - Free, High-Quality Text-to-Image Generation


Auraflow

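Developed by ControlNetLoRA

A ControlNet PEFT LoRA model based on terminusresearch/auraflow-v0.3, for high-quality text-to-image generation

Image Generation | License: Apache-2.0 | #ControlNet fine-tuning #text-to-image generation #LoRA adapter

Downloads: 142

Released: 6/17/2025

Model Overview

This is a LoRA adapter model built on the ControlNet architecture. It is intended mainly for text-to-image generation and can produce high-quality images in a variety of scenarios.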

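Model Features

Efficient fine-tuning

Uses LoRA for parameter-efficient fine-tuning, which significantly reduces the resources needed for training

High-quality image generation

Generates high-quality images at 1024x1024 resolution

Low-resource inference

Supports int8 quantization, which lowers VRAM requirements at inference time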

Model Capabilities

Text-to-image generation

High-resolution image generation

Prompt-based image synthesis

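Use Cases

Creative content generation

Art creation

Generate artwork from text descriptions

High-quality images at 1024x1024 resolution

Concept design

Quickly generate product concept images

Visuals that match the text description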

🚀 auraflow-controlnet-lora-test

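This is a ControlNet PEFT LoRA derived from terminusresearch/auraflow-v0.3. The project is intended mainly for text-to-image generation and can produce high-quality images in a variety of scenarios.

🚀 Quick Start

Validation settings

CFG: 4.0
CFG rescale: 0.0
Steps: 16
Sampler: FlowMatchEulerDiscreteScheduler
Seed: 42
Resolution: 1024x1024

Note: the validation settings are not necessarily the same as the training settings.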
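
To make the CFG and sampler settings above more concrete, here is a minimal pure-Python sketch of how a guidance scale of 4.0 is conventionally applied and how an Euler flow-matching sampler advances a sample by one step. The function names `apply_cfg` and `euler_step` and the toy numbers are illustrative assumptions, not code taken from the pipeline; the real computation runs on tensors inside diffusers. A CFG rescale of 0.0 means no rescaling is applied, so it is omitted here.

```python
from typing import List


def apply_cfg(uncond: List[float], cond: List[float], guidance_scale: float) -> List[float]:
    """Classifier-free guidance: uncond + scale * (cond - uncond), element-wise."""
    return [u + guidance_scale * (c - u) for u, c in zip(uncond, cond)]


def euler_step(sample: List[float], velocity: List[float],
               sigma: float, sigma_next: float) -> List[float]:
    """One Euler step of a flow-matching sampler: x_next = x + (sigma_next - sigma) * v."""
    dt = sigma_next - sigma
    return [x + dt * v for x, v in zip(sample, velocity)]


# Toy two-element "predictions" (hypothetical values, chosen for easy arithmetic).
uncond_pred = [0.0, 1.0]
cond_pred = [1.0, 1.0]

# Guidance scale 4.0, as in the validation settings.
guided = apply_cfg(uncond_pred, cond_pred, guidance_scale=4.0)

# Advance a toy sample from sigma=1.0 to sigma=0.5 using the guided prediction.
stepped = euler_step([2.0, 2.0], guided, sigma=1.0, sigma_next=0.5)

print(guided)
print(stepped)
```

With a guidance scale of 1.0 the guided prediction equals the conditional one; larger values push the result further from the unconditional prediction.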


The text encoder was not trained. You can reuse the base model's text encoder for inference.

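Training settings

Training epochs: 15
Training steps: 450
Learning rate: 0.0001
- Learning rate schedule: constant
- Warmup steps: 500
Max grad value: 2.0
Effective batch size: 1
- Micro-batch size: 1
- Gradient accumulation steps: 1
- Number of GPUs: 1
Gradient checkpointing: enabled
Prediction type: flow_matching (extra parameters=['shift=3.0', 'controlnet_enabled'])
Optimizer: adamw_bf16
Trainable parameter precision: pure BF16
Base model precision: int8-torchao
Caption dropout probability: 0.0%
LoRA rank: 64
LoRA alpha: 64.0
LoRA dropout: 0.1
LoRA initialization style: default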
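
A few of the numbers above are derived from one another, and the following sketch shows the usual arithmetic behind them. It assumes the conventional definitions: effective batch size as micro-batch size times accumulation steps times GPU count, LoRA scaling as alpha divided by rank, and the static timestep shift in the form used by diffusers' FlowMatchEulerDiscreteScheduler. Whether the trainer applies `shift=3.0` in exactly this form is an assumption, and the helper names are hypothetical.

```python
def effective_batch_size(micro_batch: int, grad_accum_steps: int, num_gpus: int) -> int:
    """Samples contributing to each optimizer step."""
    return micro_batch * grad_accum_steps * num_gpus


def lora_scale(alpha: float, rank: int) -> float:
    """Conventional LoRA scaling factor applied to the low-rank update."""
    return alpha / rank


def shift_sigma(sigma: float, shift: float) -> float:
    """Static flow-matching shift: sigma' = shift * sigma / (1 + (shift - 1) * sigma)."""
    return shift * sigma / (1.0 + (shift - 1.0) * sigma)


# Values taken from the training settings listed above.
batch = effective_batch_size(micro_batch=1, grad_accum_steps=1, num_gpus=1)
scale = lora_scale(alpha=64.0, rank=64)

# With shift=3.0 the midpoint sigma=0.5 moves toward the noisy end of the schedule.
shifted_mid = shift_sigma(0.5, shift=3.0)

print(batch, scale, shifted_mid)
```

The endpoints are unchanged by the shift (0 maps to 0 and 1 maps to 1); only the interior of the schedule is redistributed toward higher noise levels.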

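Datasets

antelope-data-256

Repeats: 0
Total number of images: 29
Total number of aspect buckets: 1
Resolution: 0.065536 megapixels
Cropped: yes
Crop style: center
Crop aspect: square
Used for regularisation data: no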
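
The megapixel figure above is easier to read as an edge length. The sketch below converts it, assuming one megapixel means 1,000,000 pixels and that the square crop applies to the full area; the trainer's exact convention is an assumption here, and `square_side_from_megapixels` is a hypothetical helper.

```python
import math


def square_side_from_megapixels(megapixels: float) -> int:
    """Edge length in pixels of a square image with the given area in megapixels."""
    pixels = round(megapixels * 1_000_000)
    return math.isqrt(pixels)


# Training resolution from the dataset settings.
train_side = square_side_from_megapixels(0.065536)

# Compare with the 1024x1024 validation resolution.
area_ratio = (1024 * 1024) / (train_side * train_side)

print(train_side, area_ratio)
```

Under these assumptions the training images are 256x256, which matches the `256` suffix in the dataset name, while validation renders at 16 times that pixel area.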

Inference

import torch
from diffusers import DiffusionPipeline

model_id = 'terminusresearch/auraflow-v0.3'
adapter_id = 'bghira/auraflow-controlnet-lora-test'
pipeline = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.bfloat16) # loading directly in bf16
pipeline.load_lora_weights(adapter_id)

prompt = "A photo-realistic image of a cat"
negative_prompt = 'ugly, cropped, blurry, low-quality, mediocre average'

# Optional: quantise the model to save on VRAM.
# Note: the model was quantised during training, so it is recommended to do the same at inference time.
from optimum.quanto import quantize, freeze, qint8
quantize(pipeline.transformer, weights=qint8)
freeze(pipeline.transformer)

# Pick the best available device once and reuse it for the pipeline and the generator.
device = 'cuda' if torch.cuda.is_available() else 'mps' if torch.backends.mps.is_available() else 'cpu'
pipeline.to(device) # the pipeline is already in its target precision level
model_output = pipeline(
    prompt=prompt,
    negative_prompt=negative_prompt,
    num_inference_steps=16,
    generator=torch.Generator(device=device).manual_seed(42),
    width=1024,
    height=1024,
    guidance_scale=4.0,
).images[0]

model_output.save("output.png", format="PNG")
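
As a rough illustration of why the optional int8 quantisation step saves VRAM, the sketch below estimates weight storage at 2 bytes per parameter (bf16) versus 1 byte per parameter (int8). The parameter count is a hypothetical placeholder, not the actual size of this model, and the estimate ignores quantisation scales, activations, and the text encoder and VAE.

```python
def weight_memory_gib(num_params: int, bytes_per_param: float) -> float:
    """Approximate memory needed to store the weights alone, in GiB."""
    return num_params * bytes_per_param / 2**30


# Hypothetical parameter count, used only to illustrate the ratio.
HYPOTHETICAL_PARAMS = 1_000_000_000

bf16_gib = weight_memory_gib(HYPOTHETICAL_PARAMS, 2)  # bf16: 2 bytes per weight
int8_gib = weight_memory_gib(HYPOTHETICAL_PARAMS, 1)  # int8: 1 byte per weight

print(f"bf16: {bf16_gib:.2f} GiB, int8: {int8_gib:.2f} GiB")
```

Whatever the real parameter count, the weight footprint of the quantised transformer is roughly half that of its bf16 counterpart under these assumptions.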