Auraflow open-source model: free, high-quality text-to-image generation

Auraflow

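Developed by ControlNetLoRA

A ControlNet PEFT LoRA model based on terminusresearch/auraflow-v0.3, for high-quality text-to-image generation

Image generation | License: Apache-2.0 | #ControlNet fine-tuning #Text-to-image generation #LoRA adapter

Downloads: 142

Released: 6/17/2025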

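Model Overview

This is a LoRA adapter built on the ControlNet architecture. It targets text-to-image generation and is intended to produce high-quality images across a range of scenarios.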
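To make the "LoRA adapter" idea concrete, here is a minimal sketch of why a low-rank adapter is small compared with the layer it adapts. It assumes the usual LoRA formulation, where a weight matrix of shape d_out x d_in is left frozen and two trainable matrices of shapes d_out x r and r x d_in are added. The rank of 64 matches the training settings on this page. The layer width of 3072 is a hypothetical value chosen for illustration, not a figure taken from the model.

```python
def lora_param_counts(d_in: int, d_out: int, rank: int) -> tuple[int, int]:
    """Return (full_params, lora_params) for one linear layer, ignoring biases."""
    full = d_in * d_out            # parameters in the frozen base weight
    lora = rank * (d_in + d_out)   # parameters in the two low-rank factors
    return full, lora


# Hypothetical layer width: the real layer shapes are not stated on this page.
full, lora = lora_param_counts(d_in=3072, d_out=3072, rank=64)
print(f"base weight: {full} params, LoRA factors: {lora} params")
print(f"trainable fraction for this layer: {lora / full:.2%}")
```

The saving grows with layer width, since the base weight scales with d_in * d_out while the adapter scales with d_in + d_out.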

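Model Features

Efficient fine-tuning

Uses LoRA for parameter-efficient fine-tuning, which significantly reduces the resources needed for training

High-quality image generation

Generates high-quality images at 1024x1024 resolution

Low-resource inference

Supports int8 quantization to reduce VRAM requirements at inference time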
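As a rough illustration of the int8 saving, the sketch below estimates the memory taken by the transformer weights alone at different precisions. The parameter count is an assumption used as a placeholder, so substitute the real figure for the checkpoint you load. The estimate ignores activations, the text encoder, the VAE, and the small overhead of quantization scales.

```python
def weight_memory_gib(n_params: float, bytes_per_param: float) -> float:
    """Approximate memory for model weights only, in GiB."""
    return n_params * bytes_per_param / 2**30


# Assumption: placeholder parameter count, not a value stated on this page.
N_PARAMS = 6.8e9

bf16_gib = weight_memory_gib(N_PARAMS, 2)  # bfloat16 stores 2 bytes per weight
int8_gib = weight_memory_gib(N_PARAMS, 1)  # int8 stores 1 byte per weight
print(f"bf16 weights: {bf16_gib:.1f} GiB, int8 weights: {int8_gib:.1f} GiB")
```

Under these assumptions, int8 halves the weight footprint relative to bf16, which is where most of the inference-time saving comes from.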

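Model Capabilities

Text-to-image generation

High-resolution image generation

Prompt-based image synthesis

Use Cases

Creative content generation

Artistic creation

Generate artwork from text descriptions

High-quality images at 1024x1024 resolution

Concept design

Quickly generate product concept images

Visuals that match the text description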

🚀 auraflow-controlnet-lora-test

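This is a ControlNet PEFT LoRA derived from terminusresearch/auraflow-v0.3. The project targets text-to-image generation and is intended to produce high-quality images across a range of scenarios.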

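🚀 Quick Start

Validation settings

CFG: 4.0
CFG rescale: 0.0
Steps: 16
Sampler: FlowMatchEulerDiscreteScheduler
Seed: 42
Resolution: 1024x1024

Note: The validation settings are not necessarily the same as the training settings.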
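For readers unfamiliar with the CFG value, here is a minimal sketch of the standard classifier-free guidance combination. It assumes the common formulation in which the prediction for the negative (or empty) prompt is pushed toward the prediction for the positive prompt by the guidance scale. The function name and the toy vectors are hypothetical. A CFG rescale of 0.0 is taken to mean that no rescaling is applied afterwards.

```python
def apply_cfg(uncond: list[float], cond: list[float], guidance_scale: float) -> list[float]:
    """Combine unconditional and conditional predictions element-wise."""
    return [u + guidance_scale * (c - u) for u, c in zip(uncond, cond)]


# Toy predictions standing in for the model's outputs at one denoising step.
uncond = [0.0, 1.0, -1.0]
cond = [1.0, 1.0, 0.0]

guided = apply_cfg(uncond, cond, guidance_scale=4.0)
print(guided)
```

With a scale of 1.0 the result equals the conditional prediction. Larger values exaggerate the difference between the two predictions.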

You can find some example images in the model's gallery.

The text encoder was not trained. You can reuse the base model's text encoder for inference.

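Training settings

Training epochs: 15
Training steps: 450
Learning rate: 0.0001
- Learning rate schedule: constant
- Warmup steps: 500
Max grad value: 2.0
Effective batch size: 1
- Micro-batch size: 1
- Gradient accumulation steps: 1
- Number of GPUs: 1
Gradient checkpointing: enabled
Prediction type: flow_matching (extra parameters=['shift=3.0', 'controlnet_enabled'])
Optimizer: adamw_bf16
Trainable parameter precision: pure BF16
Base model precision: int8-torchao
Caption dropout probability: 0.0%
LoRA rank: 64
LoRA alpha: 64.0
LoRA dropout: 0.1
LoRA initialization style: default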
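The prediction type above lists shift=3.0. One plausible reading, assumed here, is the static timestep shift used by flow-matching schedulers, which maps a noise level sigma to shift * sigma / (1 + (shift - 1) * sigma). The sketch below applies that mapping to a linearly spaced 16-step schedule. Both the linear spacing and the helper names are simplifications for illustration, and whether the trainer applies the shift in exactly this way is not confirmed by this page.

```python
def shift_sigma(sigma: float, shift: float) -> float:
    """Static shift: pushes intermediate noise levels toward 1.0 when shift > 1."""
    return shift * sigma / (1.0 + (shift - 1.0) * sigma)


def shifted_schedule(num_steps: int, shift: float) -> list[float]:
    """Linearly spaced sigmas from 1.0 down to 1/num_steps, then shifted."""
    sigmas = [1.0 - i / num_steps for i in range(num_steps)]
    return [shift_sigma(s, shift) for s in sigmas]


schedule = shifted_schedule(num_steps=16, shift=3.0)
print([round(s, 3) for s in schedule])
```

A shift above 1.0 leaves the endpoints 0 and 1 unchanged but spends more of the schedule at high noise levels, which is commonly preferred for higher resolutions.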

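Datasets

antelope-data-256

Repeats: 0
Total number of images: 29
Total number of aspect buckets: 1
Resolution: 0.065536 megapixels
Cropped: yes
Crop style: center
Crop aspect: square
Used for regularisation data: no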
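The dataset figures can be cross-checked against the training settings with a little arithmetic. The sketch below assumes that each image is seen once per epoch plus once per repeat, and that one megapixel means 1,000,000 pixels. The helper names are hypothetical.

```python
import math


def steps_per_epoch(num_images: int, repeats: int, effective_batch_size: int) -> int:
    """Optimizer steps needed to see every sample once."""
    # Assumption: each image contributes (1 + repeats) samples per epoch.
    samples = num_images * (1 + repeats)
    return math.ceil(samples / effective_batch_size)


def square_side(megapixels: float) -> int:
    """Side length in pixels of a square image with the given area."""
    return round(math.sqrt(megapixels * 1_000_000))


per_epoch = steps_per_epoch(num_images=29, repeats=0, effective_batch_size=1)
epochs_completed = 450 // per_epoch   # 450 is the listed training step count
side = square_side(0.065536)

print(per_epoch, epochs_completed, side)
```

Under these assumptions, 29 steps per epoch gives 15 full epochs within 450 steps, matching the listed epoch count, and 0.065536 megapixels corresponds to 256x256 training images, consistent with the dataset name.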

Inference

import torch
from diffusers import DiffusionPipeline

model_id = 'terminusresearch/auraflow-v0.3'
adapter_id = 'bghira/auraflow-controlnet-lora-test'
pipeline = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.bfloat16) # loading directly in bf16
pipeline.load_lora_weights(adapter_id)

prompt = "A photo-realistic image of a cat"
negative_prompt = 'ugly, cropped, blurry, low-quality, mediocre average'

# Optional: quantise the model to save on VRAM.
# Note: the model was quantised during training, so it is recommended to do the same at inference time.
from optimum.quanto import quantize, freeze, qint8
quantize(pipeline.transformer, weights=qint8)
freeze(pipeline.transformer)
    
# Pick the best available device once and reuse it for the pipeline and the generator.
device = 'cuda' if torch.cuda.is_available() else 'mps' if torch.backends.mps.is_available() else 'cpu'
pipeline.to(device)  # the pipeline is already in its target precision level
model_output = pipeline(
    prompt=prompt,
    negative_prompt=negative_prompt,
    num_inference_steps=16,
    generator=torch.Generator(device=device).manual_seed(42),
    width=1024,
    height=1024,
    guidance_scale=4.0,
).images[0]

model_output.save("output.png", format="PNG")