flux-lora-training open-source model: free text-to-image and image-to-image generation

Flux Lora Training

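Developed by Forezeztgump

A standard PEFT LoRA derived from FLUX.1-dev, focused on text-to-image and image-to-image tasks.

Image generation · License: other · #high-resolution image generation #flow-matching optimization #LoRA fine-tuning

Downloads: 94

Published: 4/15/2025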

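Model overview

This model is a standard PEFT LoRA derived from black-forest-labs/FLUX.1-dev, intended mainly for text-to-image and image-to-image tasks. The main validation prompt used during training was 'photo of $kora the cat sleeping on a windowsill.'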

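Model features

LoRA fine-tuning

The base model is fine-tuned with PEFT LoRA, so only a small set of adapter weights is trained and distributed, which keeps the model lightweight.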
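To make "lightweight" concrete, the sketch below counts the trainable parameters that a rank-16 LoRA adds to a single linear layer. The rank is the one listed in the training settings; the 3072-wide square layer is an illustrative assumption and not a figure taken from this model card.

```python
def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Parameters added by LoRA: A is (rank x d_in), B is (d_out x rank)."""
    return rank * d_in + d_out * rank


def full_param_count(d_in: int, d_out: int) -> int:
    """Weight parameters of the frozen dense layer (bias ignored)."""
    return d_in * d_out


# Hypothetical layer width, chosen for illustration only.
D_IN = D_OUT = 3072
RANK = 16  # LoRA rank listed in the training settings

added = lora_param_count(D_IN, D_OUT, RANK)
frozen = full_param_count(D_IN, D_OUT)
ratio = added / frozen

print(f"LoRA adds {added:,} trainable parameters "
      f"on top of {frozen:,} frozen ones ({ratio:.2%}).")
```

Under these assumptions the adapter adds about 1% of the layer's weight count, which is why a LoRA checkpoint is much smaller than the base model.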

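High-resolution output

Supports 1024x1024 high-resolution image generation.

Flow-matching prediction

Trained with the flow-matching prediction type and validated with the FlowMatchEulerDiscreteScheduler sampler.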
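The idea behind flow-matching sampling can be sketched in a few lines of plain Python. This is a toy illustration under stated assumptions, not the scheduler's actual implementation: the sample is a short list of floats, the sigmas are spaced linearly (the real scheduler may shift them), and an "oracle" function that already knows the clean sample stands in for the neural network's velocity prediction.

```python
def interpolate(x0, noise, sigma):
    """Point on the straight path between clean data (sigma=0) and noise (sigma=1)."""
    return [(1 - sigma) * a + sigma * n for a, n in zip(x0, noise)]


def velocity_target(x0, noise):
    """Velocity the model is trained to predict along that path: noise - data."""
    return [n - a for a, n in zip(x0, noise)]


def euler_sample(noise, velocity_fn, num_steps):
    """Integrate from sigma=1 (pure noise) down to sigma=0 with Euler steps."""
    sigmas = [1 - i / num_steps for i in range(num_steps + 1)]  # assumed linear spacing
    x = list(noise)
    for sigma, sigma_next in zip(sigmas, sigmas[1:]):
        v = velocity_fn(x, sigma)
        # sigma_next < sigma, so each step moves the sample towards the data.
        x = [xi + (sigma_next - sigma) * vi for xi, vi in zip(x, v)]
    return x


# Toy data: a 3-element "image" and a fixed noise sample.
x0 = [0.5, -1.0, 2.0]
noise = [1.5, 0.25, -0.75]


def oracle(x, sigma):
    """Hypothetical perfect model; a real pipeline calls the transformer here."""
    return velocity_target(x0, noise)


result = euler_sample(noise, oracle, num_steps=15)
print(result)
```

Because the path between noise and data is a straight line, a perfect velocity prediction lets the Euler loop recover the clean sample up to floating-point error; with a learned model the number of steps trades speed against accuracy.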

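Model capabilities

Text-to-image generation

Image-to-image generation

Use cases

Creative image generation

Pet image generation

Generates images of a specific breed of pet from a text description.

The example shows the result for 'photo of $kora the cat sleeping on a windowsill.'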

🚀 flux-lora-training

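This is a standard PEFT LoRA based on black-forest-labs/FLUX.1-dev that can be used for text-to-image generation.

🚀 Quick start

This project is a standard PEFT LoRA derived from black-forest-labs/FLUX.1-dev.

The main validation prompt used during training was: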

photo of $kora the cat sleeping on a windowsill.

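✨ Key features

Built with standard PEFT LoRA.
Suitable for text-to-image, image-to-image, and similar tasks.

📦 Installation

The documentation does not describe installation steps; refer to the installation instructions for the base model, black-forest-labs/FLUX.1-dev.

💻 Usage examples

Basic usage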

import torch
from diffusers import DiffusionPipeline

model_id = 'black-forest-labs/FLUX.1-dev'
adapter_id = 'Forezeztgump/flux-lora-training'
pipeline = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.bfloat16) # loading directly in bf16
pipeline.load_lora_weights(adapter_id)

prompt = "photo of $kora the cat sleeping on a windowsill."


## Optional: quantise the model to save on vram.
## Note: The model was not quantised during training, so it is not necessary to quantise it during inference time.
#from optimum.quanto import quantize, freeze, qint8
#quantize(pipeline.transformer, weights=qint8)
#freeze(pipeline.transformer)
    
# Pick the best available device once and reuse it for the pipeline and the generator.
device = 'cuda' if torch.cuda.is_available() else 'mps' if torch.backends.mps.is_available() else 'cpu'
pipeline.to(device)  # the pipeline is already at its target precision (bf16)
model_output = pipeline(
    prompt=prompt,
    num_inference_steps=15,
    generator=torch.Generator(device=device).manual_seed(42),
    width=1024,
    height=1024,
    guidance_scale=3.5,
).images[0]

model_output.save("output.png", format="PNG")
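The model card also lists image-to-image generation but gives no example for it. The following is a minimal sketch, assuming a recent diffusers release that ships FluxImg2ImgPipeline and that the adapter loads into it the same way as in the basic example. The function name run_img2img, the input path, and the strength value of 0.6 are illustrative assumptions, not settings from the original card.

```python
def run_img2img(init_image_path, prompt, strength=0.6, steps=15, seed=42):
    """Sketch: repaint an existing image with the LoRA applied.

    `strength` controls how far the result may drift from the input image;
    0.6 is an assumed starting point, not a value from the model card.
    """
    # Heavy imports are kept inside the function so that defining it is cheap.
    import torch
    from diffusers import FluxImg2ImgPipeline
    from diffusers.utils import load_image

    device = 'cuda' if torch.cuda.is_available() else 'cpu'

    pipeline = FluxImg2ImgPipeline.from_pretrained(
        'black-forest-labs/FLUX.1-dev', torch_dtype=torch.bfloat16
    )
    pipeline.load_lora_weights('Forezeztgump/flux-lora-training')
    pipeline.to(device)

    # Match the 1024x1024 resolution used for validation.
    init_image = load_image(init_image_path).resize((1024, 1024))

    return pipeline(
        prompt=prompt,
        image=init_image,
        strength=strength,
        num_inference_steps=steps,
        guidance_scale=3.5,
        generator=torch.Generator(device=device).manual_seed(seed),
    ).images[0]


# Example call (hypothetical input file):
# run_img2img("input.png", "photo of $kora the cat sleeping on a windowsill.").save("output_img2img.png")
```

Lower strength values keep the result closer to the input image, while values near 1.0 behave more like plain text-to-image generation.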

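📚 Detailed documentation

Validation settings

CFG: 3.5
CFG Rescale: 0.0
Steps: 15
Sampler: FlowMatchEulerDiscreteScheduler
Seed: 42
Resolution: 1024x1024
Skip-layer guidance:

Note: the validation settings are not necessarily the same as the training settings.

You can find some example images in the following gallery:

The text encoder was not trained. You can reuse the base model's text encoder for inference.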

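Training settings

Training epochs: 384
Training steps: 5000
Learning rate: 0.0001
- Learning rate schedule: constant_with_warmup
- Warmup steps: 100
Max grad value: 1.0
Effective batch size: 4
- Micro-batch size: 4
- Gradient accumulation steps: 1
- Number of GPUs: 1
Gradient checkpointing: True
Prediction type: flow-matching (extra parameters=['flow_schedule_auto_shift', 'shift=0.0', 'flux_guidance_mode=constant', 'flux_guidance_value=1.0', 'flow_matching_loss=compatible', 'flux_lora_target=all+ffs'])
Optimizer: adamw_bf16
Trainable parameter precision: Pure BF16
Base model precision: no_change
Caption dropout probability: 0.0%
LoRA Rank: 16
LoRA Alpha: None
LoRA Dropout: 0.1
LoRA initialisation style: default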
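Two of the settings above are derived quantities, and a short sketch shows how they fit together. It assumes the usual definitions: the effective batch size is the product of micro-batch size, gradient accumulation steps, and GPU count, and constant_with_warmup ramps the learning rate linearly from zero before holding it constant. The function names are illustrative and do not come from the training tool.

```python
def effective_batch_size(micro_batch: int, grad_accum_steps: int, num_gpus: int) -> int:
    """Samples contributing to one optimizer step."""
    return micro_batch * grad_accum_steps * num_gpus


def constant_with_warmup(step: int, base_lr: float, warmup_steps: int) -> float:
    """Assumed schedule shape: linear ramp from 0 to base_lr, then constant."""
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    return base_lr


# Values taken from the training settings listed above.
batch = effective_batch_size(micro_batch=4, grad_accum_steps=1, num_gpus=1)
samples_seen = 5000 * batch  # training steps x effective batch size

print("effective batch size:", batch)
print("samples seen over training:", samples_seen)
for step in (0, 50, 100, 5000):
    print(step, constant_with_warmup(step, base_lr=1e-4, warmup_steps=100))
```

With these settings the learning rate reaches its full value of 0.0001 at step 100 and stays there for the remaining 4900 steps.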