Ben-Brand-LoRA Open-Source Model - Free Deployment for Text-to-Image Generation and Artistic Style Transfer

Ben Brand LoRA

Developed by davidrd123

A PEFT LoRA model trained on FLUX.1-dev for text-to-image generation in a specific artistic style.

Category: image generation | License: other | #FLUX.1 style adaptation #High-resolution image generation #Artistic text-to-image

Downloads: 253

Released: February 19, 2025

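Model overview

This is a standard PEFT LoRA model trained on the FLUX.1-dev base model. It is intended for text-to-image generation and produces images in a specific artistic style from a text description.

Model features

Artistic style transfer

Generates images in a specific artistic style from a text description, such as the b3nbr4nd style shown in the examples.

High-resolution output

Supports image generation at resolutions up to 1024x1024.

Efficient fine-tuning

Uses LoRA for parameter-efficient fine-tuning: only a small number of parameters are trained to achieve the style transfer.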
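To make the "small number of parameters" point concrete, here is a minimal sketch of the usual LoRA bookkeeping. A frozen weight matrix of shape d_out x d_in gets two trainable low-rank factors, so the adapter trains rank * (d_in + d_out) parameters instead of d_in * d_out. The rank of 64 is the value listed in this model's training settings; the 3072 x 3072 layer shape and the function names are illustrative assumptions, not figures from the model card.

```python
def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters in one LoRA adapter: A is (rank x d_in), B is (d_out x rank)."""
    return rank * (d_in + d_out)


def full_param_count(d_in: int, d_out: int) -> int:
    """Parameters in the frozen dense weight the adapter attaches to (bias ignored)."""
    return d_in * d_out


# Assumption: a single square 3072 x 3072 projection, chosen only for illustration.
D_IN, D_OUT = 3072, 3072
RANK = 64  # LoRA rank listed in the training settings

lora = lora_param_count(D_IN, D_OUT, RANK)
full = full_param_count(D_IN, D_OUT)
print(f"LoRA: {lora:,} trainable vs {full:,} frozen ({lora / full:.2%})")
```

Under these assumptions the adapter adds roughly 4% of the layer's parameter count, and the base weight itself stays frozen.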

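Model capabilities

Text-to-image generation

Artistic style transfer

High-resolution image generation

Use cases

Creative design

Concept art creation

Quickly generate concept art from a text description

For example, the giant green snake coiled around an obelisk shown in the examples

Stylized image generation

Render an ordinary description as an image in a specific artistic style

For example, images in the b3nbr4nd style

Game development

Game scene concept design

Quickly generate concept images for game scenes

For example, the partially buried ancient ruins scene shown in the examples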
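For the use cases above, a small helper can keep the style token consistent across prompts. This is only a sketch: the token b3nbr4nd is taken from the examples this page mentions, but the "In the style of" phrasing and the function name build_prompt are assumptions. Check the model's sample prompts for the wording it was actually trained on.

```python
STYLE_TOKEN = "b3nbr4nd"  # style name mentioned in the page's examples


def build_prompt(description: str, style_token: str = STYLE_TOKEN) -> str:
    """Prefix a plain description with the style token (hypothetical phrasing)."""
    description = description.strip()
    if not description:
        raise ValueError("description must not be empty")
    return f"In the style of {style_token}, {description}"


print(build_prompt("a giant green snake coiled around an obelisk"))
```

The resulting string can be passed as the prompt argument in the inference example.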

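🚀 Ben-Brand-LoRA

Ben-Brand-LoRA is a standard PEFT LoRA derived from black-forest-labs/FLUX.1-dev. No validation prompt was used during training.

🚀 Quick start

Inference example

The following example code runs inference with this LoRA: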

import torch
from diffusers import DiffusionPipeline

model_id = 'black-forest-labs/FLUX.1-dev'
adapter_id = 'davidrd123/Ben-Brand-LoRA'

# Load the base model directly in bf16, then attach the LoRA weights.
pipeline = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.bfloat16)
pipeline.load_lora_weights(adapter_id)

prompt = "An astronaut is riding a horse through the jungles of Thailand."

# Optional: quantise the model to save on VRAM.
# Note: the model was quantised during training, so it is recommended to do the same at inference time.
from optimum.quanto import quantize, freeze, qint8
quantize(pipeline.transformer, weights=qint8)
freeze(pipeline.transformer)

# Pick the best available device; the pipeline is already at its target precision.
device = 'cuda' if torch.cuda.is_available() else 'mps' if torch.backends.mps.is_available() else 'cpu'
pipeline.to(device)

image = pipeline(
    prompt=prompt,
    num_inference_steps=20,
    generator=torch.Generator(device=device).manual_seed(42),
    width=1024,
    height=1024,
    guidance_scale=3.0,
).images[0]
image.save("output.png", format="PNG")

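✨ Key features

A standard PEFT LoRA derived from the black-forest-labs/FLUX.1-dev model.
The text encoder was not trained, so the base model's text encoder can be reused at inference time.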

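📦 Installation

The source documentation does not give installation steps. Based on the imports in the inference example, make sure the following libraries are installed:

torch
diffusers
optimum.quanto (published on PyPI as optimum-quanto; only needed for the optional quantisation step)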
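Because no official installation steps are given, a quick check of which modules Python can locate may help before running the example. The sketch below uses only the standard library; the helper name is_installed and the REQUIRED list are illustrative, and the list simply mirrors the imports in the inference example.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if Python can locate the named module."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        # For dotted names such as "optimum.quanto", find_spec imports the
        # parent package first and raises if that parent is missing.
        return False


REQUIRED = ["torch", "diffusers", "optimum.quanto"]

missing = [name for name in REQUIRED if not is_installed(name)]
print("Missing modules:", ", ".join(missing) if missing else "none")
```

Any module reported as missing needs to be installed into the active environment before the pipeline code will import.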

💻 Usage examples

Basic usage

import torch
from diffusers import DiffusionPipeline

model_id = 'black-forest-labs/FLUX.1-dev'
adapter_id = 'davidrd123/Ben-Brand-LoRA'

# Load the base model directly in bf16, then attach the LoRA weights.
pipeline = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.bfloat16)
pipeline.load_lora_weights(adapter_id)

prompt = "An astronaut is riding a horse through the jungles of Thailand."

# Pick the best available device; the pipeline is already at its target precision.
device = 'cuda' if torch.cuda.is_available() else 'mps' if torch.backends.mps.is_available() else 'cpu'
pipeline.to(device)

image = pipeline(
    prompt=prompt,
    num_inference_steps=20,
    generator=torch.Generator(device=device).manual_seed(42),
    width=1024,
    height=1024,
    guidance_scale=3.0,
).images[0]
image.save("output.png", format="PNG")

Advanced usage

import torch
from diffusers import DiffusionPipeline

model_id = 'black-forest-labs/FLUX.1-dev'
adapter_id = 'davidrd123/Ben-Brand-LoRA'

# Load the base model directly in bf16, then attach the LoRA weights.
pipeline = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.bfloat16)
pipeline.load_lora_weights(adapter_id)

prompt = "An astronaut is riding a horse through the jungles of Thailand."

# Optional: quantise the model to save on VRAM.
# Note: the model was quantised during training, so it is recommended to do the same at inference time.
from optimum.quanto import quantize, freeze, qint8
quantize(pipeline.transformer, weights=qint8)
freeze(pipeline.transformer)

# Pick the best available device; the pipeline is already at its target precision.
device = 'cuda' if torch.cuda.is_available() else 'mps' if torch.backends.mps.is_available() else 'cpu'
pipeline.to(device)

image = pipeline(
    prompt=prompt,
    num_inference_steps=20,
    generator=torch.Generator(device=device).manual_seed(42),
    width=1024,
    height=1024,
    guidance_scale=3.0,
).images[0]
image.save("output.png", format="PNG")

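📚 Detailed documentation

Validation settings

Setting	Details
CFG	`3.0`
CFG Rescale	`0.0`
Steps	`20`
Sampler	`FlowMatchEulerDiscreteScheduler`
Seed	`42`
Resolution	`1024x1024`
Skip-layer guidance	None

Note: the validation settings are not necessarily the same as the training settings.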
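The validation settings above map onto the keyword arguments used in the inference example. The following is a minimal sketch of that mapping, assuming the same pipeline call as in that example. The names VALIDATION_KWARGS, SEED, and parse_resolution are illustrative. CFG Rescale and skip-layer guidance are left out because the inference example passes no argument for them.

```python
def parse_resolution(text: str) -> tuple[int, int]:
    """Split a 'WIDTHxHEIGHT' string such as '1024x1024' into integers."""
    width, height = text.lower().split("x")
    return int(width), int(height)


width, height = parse_resolution("1024x1024")

# Values copied from the validation settings table.
VALIDATION_KWARGS = {
    "num_inference_steps": 20,  # Steps
    "guidance_scale": 3.0,      # CFG
    "width": width,             # Resolution
    "height": height,
}
SEED = 42  # intended for torch.Generator(device=...).manual_seed(SEED)

print(VALIDATION_KWARGS)
```

With a loaded pipeline, these could be applied as pipeline(prompt=prompt, generator=generator, **VALIDATION_KWARGS).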

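Training settings

Setting	Details
Training epochs	2
Training steps	3750
Learning rate	0.00015 (schedule: constant; warmup steps: 100)
Max gradient norm	0.1
Effective batch size	6 (micro-batch size: 2; gradient accumulation steps: 3; number of GPUs: 1)
Gradient checkpointing	True
Prediction type	flow-matching (extra parameters=['shift=3', 'flux_guidance_mode=constant', 'flux_guidance_value=1.0', 'flow_matching_loss=compatible', 'flux_lora_target=all'])
Optimizer	adamw_bf16
Trainable parameter precision	Pure BF16
Caption dropout probability	10.0%
LoRA rank	64
LoRA alpha	None
LoRA dropout	0.1
LoRA initialization style	Default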
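The effective batch size in the table follows from the other three batch figures. The sketch below reproduces that arithmetic and estimates how many samples the run processed. The sample estimate assumes that each of the 3750 training steps is one optimizer step, which the source does not state explicitly.

```python
MICRO_BATCH_SIZE = 2
GRAD_ACCUM_STEPS = 3
NUM_GPUS = 1
TRAIN_STEPS = 3750


def effective_batch_size(micro_batch: int, accum_steps: int, num_gpus: int) -> int:
    """Samples contributing to a single optimizer step."""
    return micro_batch * accum_steps * num_gpus


batch = effective_batch_size(MICRO_BATCH_SIZE, GRAD_ACCUM_STEPS, NUM_GPUS)
# Assumption: one training step equals one optimizer step.
samples_seen = batch * TRAIN_STEPS
print(f"effective batch size = {batch}, samples processed = {samples_seen}")
```

This gives an effective batch size of 6, matching the table, and 22,500 samples processed under the stated assumption.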

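Datasets

Dataset name	Repeats	Total images	Total aspect-ratio buckets	Resolution	Cropped	Crop style	Crop aspect ratio	Used for regularization data
ben-brand-256	10	98	3	0.065536 megapixels	No	None	None	No
ben-brand-crop-256	10	98	1	0.065536 megapixels	Yes	Center crop	Square	No
ben-brand-512	10	98	3	0.262144 megapixels	No	None	None	No
ben-brand-crop-512	10	98	1	0.262144 megapixels	Yes	Center crop	Square	No
ben-brand-768	10	98	3	0.589824 megapixels	No	None	None	No
ben-brand-crop-768	10	98	1	0.589824 megapixels	Yes	Center crop	Square	No
ben-brand-1024	10	98	4	1.048576 megapixels	No	None	None	No
ben-brand-crop-1024	10	98	1	1.048576 megapixels	Yes	Center crop	Square	No
ben-brand-1440	10	98	2	2.0736 megapixels	No	None	None	No
ben-brand-crop-1440	10	98	1	2.0736 megapixels	Yes	Center crop	Square	No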
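The resolution column is a pixel-area figure, and each value equals the area of a square whose edge length is the number in the dataset name. The short sketch below checks that relationship. Reading the name suffix as an edge length in pixels is an inference from the numbers, not something the source states, and square_megapixels is an illustrative helper.

```python
def square_megapixels(side_px: int) -> float:
    """Area of a side_px x side_px image in megapixels (1 MP = 1,000,000 px)."""
    return side_px * side_px / 1_000_000


# Edge lengths taken from the dataset name suffixes.
for side in (256, 512, 768, 1024, 1440):
    print(f"{side:>4} px -> {square_megapixels(side)} megapixels")
```

For the uncropped datasets, which have several aspect-ratio buckets, the same area presumably applies to non-square shapes as well.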