Ben-Brand-LoRA open-source model: free deployment for text-to-image generation and art-style transfer

Ben Brand LoRA

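Developed by davidrd123

A PEFT LoRA trained on FLUX.1-dev for text-to-image generation in a specific art style.

Category: image generation. Open-source license: other. Tags: #FLUX.1 style adaptation #high-resolution image generation #artistic text-to-image

Downloads: 253

Released: 2/19/2025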

Model overview

A standard PEFT LoRA trained on the FLUX.1-dev base model. It targets text-to-image generation and renders text descriptions in a specific art style.

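Model features

Art style transfer

Generates images in a specific art style from a text description, such as the b3nbr4nd painting style used in the examples.

High-resolution output

Generates images at up to 1024x1024.

Efficient fine-tuning

Uses LoRA for parameter-efficient fine-tuning: only a small number of added parameters are trained to capture the style.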
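To make the parameter-efficiency claim concrete, the sketch below counts the weights that a rank-64 LoRA adds to one linear layer under the standard low-rank formulation, where a frozen weight W is adjusted by the product of two small matrices B and A. The rank of 64 comes from the training settings on this page; the 3072x3072 layer size and the function names are assumptions chosen for illustration, not values taken from the model card.

```python
def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Parameters in the two low-rank factors A (rank x d_in) and B (d_out x rank)."""
    return rank * d_in + d_out * rank


def full_param_count(d_in: int, d_out: int) -> int:
    """Parameters in the frozen dense weight matrix W (d_out x d_in)."""
    return d_in * d_out


# Hypothetical layer size, chosen only for illustration.
D_IN, D_OUT, RANK = 3072, 3072, 64

lora = lora_param_count(D_IN, D_OUT, RANK)
full = full_param_count(D_IN, D_OUT)
ratio = lora / full
print(f"LoRA params: {lora}, dense params: {full}, ratio: {ratio:.4f}")
```

Under these assumptions the adapter trains roughly 4% as many weights as the dense layer it modifies, which is why the base model can stay frozen.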

Model capabilities

Text-to-image generation

Art style transfer

High-resolution image generation

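Use cases

Creative design

Concept art

Quickly generate concept art from a text description, such as the example image of a giant green snake coiled around an obelisk.

Stylised image generation

Render an ordinary description in a specific art style, such as the b3nbr4nd painting style.

Game development

Game scene concept design

Quickly generate concept images of game scenes, such as the example scene of partially buried ancient ruins.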
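The use cases above all rely on steering the output towards the b3nbr4nd style. A minimal sketch of building such a prompt follows. The helper name `with_style_token` and the phrasing "in the style of a ... painting" are assumptions made for illustration; the wording the LoRA was actually trained on is not stated on this page, so check the model's sample prompts before relying on it.

```python
def with_style_token(description: str, token: str = "b3nbr4nd") -> str:
    """Append a style phrase containing the token to a plain description.

    The phrase template is an assumption, not the documented trigger wording.
    """
    # Drop surrounding whitespace and a trailing full stop before appending.
    description = description.strip().rstrip(".")
    return f"{description}, in the style of a {token} painting"


prompt = with_style_token("A giant green snake coiled around an obelisk.")
print(prompt)
```

The resulting string can be passed as the `prompt` argument in the inference example.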

🚀 Ben-Brand-LoRA

Ben-Brand-LoRA is a standard PEFT LoRA derived from black-forest-labs/FLUX.1-dev. No validation prompt was used during training.

🚀 Quick start

Inference example

The following code runs inference with this LoRA:

import torch
from diffusers import DiffusionPipeline

model_id = 'black-forest-labs/FLUX.1-dev'
adapter_id = 'davidrd123/Ben-Brand-LoRA'

# Pick the best available device once and reuse it below.
device = 'cuda' if torch.cuda.is_available() else 'mps' if torch.backends.mps.is_available() else 'cpu'

pipeline = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.bfloat16)  # loading directly in bf16
pipeline.load_lora_weights(adapter_id)

prompt = "An astronaut is riding a horse through the jungles of Thailand."

# Optional: quantise the model to save on VRAM.
# Note: the model was quantised during training, so doing the same at inference time is recommended.
from optimum.quanto import quantize, freeze, qint8
quantize(pipeline.transformer, weights=qint8)
freeze(pipeline.transformer)

pipeline.to(device)  # the pipeline is already in its target precision level
image = pipeline(
    prompt=prompt,
    num_inference_steps=20,
    generator=torch.Generator(device=device).manual_seed(42),
    width=1024,
    height=1024,
    guidance_scale=3.0,
).images[0]
image.save("output.png", format="PNG")

✨ Key features

A standard PEFT LoRA derived from the black-forest-labs/FLUX.1-dev model.
The text encoder was not trained, so the base model's text encoder can be reused for inference.

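📦 Installation

The model card gives no installation steps. Judging from the imports in the inference example, the following libraries are required:

torch
diffusers
optimum.quanto (the import name; the PyPI package is optimum-quanto)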
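Since no installation steps are given, it can help to confirm that the listed libraries are importable before downloading the model. The sketch below does that with the standard library only; the helper name `missing_modules` is hypothetical, and the list covers only the three libraries named above, so other dependencies that your diffusers version needs may still be missing.

```python
import importlib.util

# Import names taken from the inference example.
REQUIRED = ["torch", "diffusers", "optimum.quanto"]


def missing_modules(names):
    """Return the subset of import names that cannot be located."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except ModuleNotFoundError:
            # Raised for dotted names when the parent package is absent.
            found = False
        if not found:
            missing.append(name)
    return missing


print("Missing:", missing_modules(REQUIRED) or "none")
```

Note that looking up a dotted name such as `optimum.quanto` imports its parent package as a side effect.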

💻 Usage examples

Basic usage

import torch
from diffusers import DiffusionPipeline

model_id = 'black-forest-labs/FLUX.1-dev'
adapter_id = 'davidrd123/Ben-Brand-LoRA'

# Pick the best available device once and reuse it below.
device = 'cuda' if torch.cuda.is_available() else 'mps' if torch.backends.mps.is_available() else 'cpu'

pipeline = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.bfloat16)  # loading directly in bf16
pipeline.load_lora_weights(adapter_id)

prompt = "An astronaut is riding a horse through the jungles of Thailand."

pipeline.to(device)  # the pipeline is already in its target precision level
image = pipeline(
    prompt=prompt,
    num_inference_steps=20,
    generator=torch.Generator(device=device).manual_seed(42),
    width=1024,
    height=1024,
    guidance_scale=3.0,
).images[0]
image.save("output.png", format="PNG")

Advanced usage

The advanced variant is the basic example plus optional int8 quantisation of the transformer. Add the following lines after pipeline.load_lora_weights(adapter_id) and before the pipeline is moved to the device:

# Optional: quantise the model to save on VRAM.
# Note: the model was quantised during training, so doing the same at inference time is recommended.
from optimum.quanto import quantize, freeze, qint8
quantize(pipeline.transformer, weights=qint8)
freeze(pipeline.transformer)

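📚 Detailed documentation

Validation settings

Setting	Value
CFG	`3.0`
CFG Rescale	`0.0`
Steps	`20`
Sampler	`FlowMatchEulerDiscreteScheduler`
Seed	`42`
Resolution	`1024x1024`
Skip-layer guidance	None

Note: the validation settings are not necessarily the same as the training settings.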
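The validation settings correspond to arguments already used in the inference example: CFG is `guidance_scale`, steps is `num_inference_steps`, and the resolution splits into `width` and `height`. A small sketch of that mapping follows; the names `VALIDATION_SETTINGS` and `to_pipeline_kwargs` are hypothetical helpers introduced here, not part of diffusers.

```python
# Values copied from the validation settings table.
VALIDATION_SETTINGS = {
    "cfg": 3.0,
    "steps": 20,
    "seed": 42,
    "resolution": "1024x1024",
}


def to_pipeline_kwargs(settings: dict) -> dict:
    """Translate table entries into the keyword arguments of the inference example.

    The seed is left out on purpose: the example passes it through a
    torch.Generator rather than as a plain keyword argument.
    """
    width, height = (int(v) for v in settings["resolution"].split("x"))
    return {
        "num_inference_steps": settings["steps"],
        "guidance_scale": settings["cfg"],
        "width": width,
        "height": height,
    }


kwargs = to_pipeline_kwargs(VALIDATION_SETTINGS)
print(kwargs)
```

The result can be unpacked into the pipeline call alongside `prompt` and `generator`.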

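Training settings

Setting	Value
Training epochs	2
Training steps	3750
Learning rate	0.00015
Learning rate schedule	constant
Warmup steps	100
Max grad norm	0.1
Effective batch size	6
Micro-batch size	2
Gradient accumulation steps	3
Number of GPUs	1
Gradient checkpointing	True
Prediction type	flow-matching (extra parameters=['shift=3', 'flux_guidance_mode=constant', 'flux_guidance_value=1.0', 'flow_matching_loss=compatible', 'flux_lora_target=all'])
Optimiser	adamw_bf16
Trainable parameter precision	Pure BF16
Caption dropout probability	10.0%
LoRA rank	64
LoRA alpha	None
LoRA dropout	0.1
LoRA initialisation style	default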
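Two of the figures above follow from simple arithmetic, sketched below: the effective batch size is the product of micro-batch size, gradient accumulation steps and GPU count, and the learning rate is constant after a warmup phase. The linear shape of the warmup ramp is an assumption, since the table only says "constant" with 100 warmup steps; the constant and function names are likewise illustrative.

```python
# Values copied from the training settings table.
MICRO_BATCH_SIZE = 2
GRAD_ACCUM_STEPS = 3
NUM_GPUS = 1
TRAINING_STEPS = 3750
BASE_LR = 0.00015
WARMUP_STEPS = 100

# Samples that contribute to each optimiser step.
effective_batch_size = MICRO_BATCH_SIZE * GRAD_ACCUM_STEPS * NUM_GPUS

# Total samples processed over the whole run.
samples_seen = effective_batch_size * TRAINING_STEPS


def lr_at(step: int) -> float:
    """Learning rate at a given step; the linear warmup ramp is an assumption."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / WARMUP_STEPS
    return BASE_LR


print(effective_batch_size, samples_seen, lr_at(50), lr_at(1000))
```

The computed effective batch size agrees with the value of 6 reported in the table.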

Datasets

Dataset	Repeats	Total images	Total aspect buckets	Resolution	Cropped	Crop style	Crop aspect	Used for regularisation data
ben-brand-256	10	98	3	0.065536 megapixels	No	None	None	No
ben-brand-crop-256	10	98	1	0.065536 megapixels	Yes	center	square	No
ben-brand-512	10	98	3	0.262144 megapixels	No	None	None	No
ben-brand-crop-512	10	98	1	0.262144 megapixels	Yes	center	square	No
ben-brand-768	10	98	3	0.589824 megapixels	No	None	None	No
ben-brand-crop-768	10	98	1	0.589824 megapixels	Yes	center	square	No
ben-brand-1024	10	98	4	1.048576 megapixels	No	None	None	No
ben-brand-crop-1024	10	98	1	1.048576 megapixels	Yes	center	square	No
ben-brand-1440	10	98	2	2.0736 megapixels	No	None	None	No
ben-brand-crop-1440	10	98	1	2.0736 megapixels	Yes	center	square	No
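The megapixel values in the resolution column line up with the number in each dataset name if that number is read as the edge length, in pixels, of a square with the same area. That reading is an assumption inferred from the figures, not something the page states. A short check:

```python
def megapixels(edge: int) -> float:
    """Pixel area of an edge x edge square, in megapixels (1 MP = 1,000,000 pixels)."""
    return edge * edge / 1_000_000


# Edge lengths taken from the dataset name suffixes.
for edge in (256, 512, 768, 1024, 1440):
    print(f"{edge}: {megapixels(edge)} megapixels")
```

For the uncropped datasets the aspect buckets vary in shape, so the figure describes the target pixel area rather than a fixed width and height.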