Open-source reddy-v3 model: free to use, supports text-to-image and image-to-image generation, and specializes in highly detailed, artistically styled images of women

Reddy V3

Developed by Unmapped2895

A PEFT LoRA model based on FLUX.1-dev for text-to-image and image-to-image generation. It excels at producing highly detailed, artistically styled images of women.

Image Generation Open Source License:Other #Female image generation #High-detail photography #Lingerie fashion design

Downloads 70

Release Time : 4/4/2025

Model Overview

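This is a standard PEFT LoRA model based on FLUX.1-dev, used mainly for text-to-image and image-to-image generation. It produces high-quality, richly detailed images and is particularly strong at images of women and fashion-photography-style scenes.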

Model Features

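High detail

Generates richly detailed images and performs especially well on the details of people and clothing.

Diverse art styles

Handles a range of art styles, from photorealistic photography to fantasy.

Optimized for female subjects

Specifically tuned for generating images of women, rendering a variety of body types and clothing details accurately.

LoRA adaptation

Uses PEFT LoRA, so it can be flexibly adjusted on top of the base model while keeping inference efficient.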
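
To make the efficiency point concrete, the sketch below counts trainable parameters for a low-rank adapter versus a full weight update. The rank of 32 comes from this model's training settings; the 3072x3072 layer shape is a hypothetical example chosen for illustration, not a figure from the source.

```python
def full_update_params(d_out: int, d_in: int) -> int:
    """Parameters needed to fine-tune a dense weight matrix directly."""
    return d_out * d_in


def lora_params(d_out: int, d_in: int, rank: int) -> int:
    """Parameters in a LoRA adapter: B is (d_out x rank), A is (rank x d_in)."""
    return rank * (d_out + d_in)


# Hypothetical layer shape (an assumption for illustration only).
D_OUT, D_IN = 3072, 3072
RANK = 32  # LoRA rank reported in the training settings

full = full_update_params(D_OUT, D_IN)
lora = lora_params(D_OUT, D_IN, RANK)
print(f"full update: {full:,} parameters")
print(f"rank-{RANK} LoRA: {lora:,} parameters ({lora / full:.2%} of full)")
```

For this assumed shape the adapter trains roughly one forty-eighth of the parameters a full update would touch.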

Model Capabilities

Text-to-image generation

Image-to-image generation

High-resolution image generation

Stylized image generation

Image generation of people

Use Cases

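Art creation

Fashion photography

Generates high-quality fashion-photography-style images, including lingerie and fashion scenes.

Can produce fashion images with professional photographic quality.

Character design

Creates character images for games and illustration, especially female character designs.

Can produce richly detailed character images, including features such as clothing and hairstyles.

Commercial applications

Advertising asset generation

Quickly generates the image assets needed for product showcases and advertising.

Can produce professional-grade images that meet commercial requirements.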

🚀 reddy-v3

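This project is a standard PEFT LoRA based on black-forest-labs/FLUX.1-dev. It is used mainly for text-to-image generation and covers a variety of image generation scenarios.

🚀 Quick Start

To use the model, load the base pipeline, attach the LoRA weights, and run inference as in the code below.

Inference code example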

import torch
from diffusers import DiffusionPipeline

model_id = 'black-forest-labs/FLUX.1-dev'
adapter_id = 'Unmapped2895/reddy-v3'
pipeline = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.bfloat16) # loading directly in bf16
pipeline.load_lora_weights(adapter_id)

prompt = "Realistic wide shot photo of woman posing in a luxurious satin lingerie set, featuring a plunging bra, delicate thong and a classic garter belt with black stockings. The satin lingerie shimmers softly in the light, and the cut emphasizes both sophistication and a hint of allure. The lingerie is detailed with fine lace edges, highlighting her alluring figure. She elegantly styled hair as if getting ready for a formal event. The photo has a cinematic quality with rays of light and dramatic play of shadow and light"


## Optional: quantise the model to save on vram.
## Note: The model was quantised during training, and so it is recommended to do the same during inference time.
from optimum.quanto import quantize, freeze, qint8
quantize(pipeline.transformer, weights=qint8)
freeze(pipeline.transformer)
    
pipeline.to('cuda' if torch.cuda.is_available() else 'mps' if torch.backends.mps.is_available() else 'cpu') # the pipeline is already in its target precision level
model_output = pipeline(
    prompt=prompt,
    num_inference_steps=20,
    generator=torch.Generator(device='cuda' if torch.cuda.is_available() else 'mps' if torch.backends.mps.is_available() else 'cpu').manual_seed(42),
    width=832,
    height=1216,
    guidance_scale=3.5,
).images[0]

model_output.save("output.png", format="PNG")
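
The device expression appears twice in the script above, once for `pipeline.to(...)` and once for the generator. One way to avoid the repetition is a small helper. `select_device` below is a hypothetical name, not part of torch or diffusers, and it takes plain booleans so it can be exercised without a GPU.

```python
def select_device(cuda_available: bool, mps_available: bool) -> str:
    """Return the preferred device string: CUDA first, then Apple MPS, then CPU."""
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"


# Example with hard-coded availability flags (no torch needed to run this).
print(select_device(cuda_available=False, mps_available=True))
```

In the script it would be called as `select_device(torch.cuda.is_available(), torch.backends.mps.is_available())`, with the result reused for both the pipeline and the generator.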

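✨ Key Features

Supports text-to-image and image-to-image generation.
Covers a variety of image generation scenarios and can produce diverse images.

📦 Installation

The original README does not provide installation commands, so this section is omitted.

💻 Usage Examples

Basic usage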

import torch
from diffusers import DiffusionPipeline

model_id = 'black-forest-labs/FLUX.1-dev'
adapter_id = 'Unmapped2895/reddy-v3'
pipeline = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.bfloat16) # loading directly in bf16
pipeline.load_lora_weights(adapter_id)

prompt = "Realistic wide shot photo of woman posing in a luxurious satin lingerie set, featuring a plunging bra, delicate thong and a classic garter belt with black stockings. The satin lingerie shimmers softly in the light, and the cut emphasizes both sophistication and a hint of allure. The lingerie is detailed with fine lace edges, highlighting her alluring figure. She elegantly styled hair as if getting ready for a formal event. The photo has a cinematic quality with rays of light and dramatic play of shadow and light"

pipeline.to('cuda' if torch.cuda.is_available() else 'mps' if torch.backends.mps.is_available() else 'cpu')
model_output = pipeline(
    prompt=prompt,
    num_inference_steps=20,
    generator=torch.Generator(device='cuda' if torch.cuda.is_available() else 'mps' if torch.backends.mps.is_available() else 'cpu').manual_seed(42),
    width=832,
    height=1216,
    guidance_scale=3.5,
).images[0]

model_output.save("output.png", format="PNG")

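Advanced usage

# Example: quantize the model to save VRAM.
# Continues from the basic example: `pipeline` and `prompt` are already defined.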
from optimum.quanto import quantize, freeze, qint8
quantize(pipeline.transformer, weights=qint8)
freeze(pipeline.transformer)

pipeline.to('cuda' if torch.cuda.is_available() else 'mps' if torch.backends.mps.is_available() else 'cpu')
model_output = pipeline(
    prompt=prompt,
    num_inference_steps=20,
    generator=torch.Generator(device='cuda' if torch.cuda.is_available() else 'mps' if torch.backends.mps.is_available() else 'cpu').manual_seed(42),
    width=832,
    height=1216,
    guidance_scale=3.5,
).images[0]

model_output.save("output.png", format="PNG")
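
The model card lists image-to-image generation but only shows text-to-image code. The following is a minimal sketch of how that might look, assuming the installed diffusers version provides `FluxImg2ImgPipeline` and that the adapter loads into it the same way. The function name `run_img2img`, the input path argument, and the `strength` default of 0.6 are illustrative assumptions, not settings from the source.

```python
def run_img2img(init_image_path: str, prompt: str, strength: float = 0.6):
    """Sketch: image-to-image with the FLUX.1-dev base plus the reddy-v3 LoRA.

    Heavy imports are done lazily so defining the function has no side effects.
    """
    import torch
    from diffusers import FluxImg2ImgPipeline
    from diffusers.utils import load_image

    device = (
        "cuda" if torch.cuda.is_available()
        else "mps" if torch.backends.mps.is_available()
        else "cpu"
    )

    pipe = FluxImg2ImgPipeline.from_pretrained(
        "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
    )
    pipe.load_lora_weights("Unmapped2895/reddy-v3")
    pipe.to(device)

    # Resize the input to the validation resolution used elsewhere in this card.
    init_image = load_image(init_image_path).resize((832, 1216))

    result = pipe(
        prompt=prompt,
        image=init_image,
        strength=strength,  # assumed value; lower keeps more of the input image
        num_inference_steps=20,
        guidance_scale=3.5,
        width=832,
        height=1216,
        generator=torch.Generator(device=device).manual_seed(42),
    )
    return result.images[0]
```

A call such as `run_img2img("input.png", prompt).save("output_img2img.png")` would write the result to disk; both file names are placeholders.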

📚 Documentation

Validation settings

Setting	Value
CFG	`3.5`
CFG Rescale	`0.0`
Steps	`20`
Sampler	`FlowMatchEulerDiscreteScheduler`
Seed	`42`
Resolution	`832x1216`
Skip-layer guidance	Not specified
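
These settings correspond to the arguments passed to the pipeline in the inference code. As a small sketch, the helpers below turn the table values into a keyword-argument dictionary. `parse_resolution` and `validation_kwargs` are hypothetical names introduced here, and the key names follow the `pipeline(...)` call in the inference example.

```python
def parse_resolution(resolution: str) -> tuple[int, int]:
    """Split a 'WIDTHxHEIGHT' string such as '832x1216' into integers."""
    width, height = resolution.lower().split("x")
    return int(width), int(height)


def validation_kwargs(cfg: float, steps: int, resolution: str) -> dict:
    """Build keyword arguments matching the pipeline call in the inference example."""
    width, height = parse_resolution(resolution)
    return {
        "guidance_scale": cfg,
        "num_inference_steps": steps,
        "width": width,
        "height": height,
    }


kwargs = validation_kwargs(cfg=3.5, steps=20, resolution="832x1216")
print(kwargs)
```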

Training settings

Setting	Value
Training epochs	10
Training steps	2000
Learning rate	0.0001
Learning rate schedule	constant
Warmup steps	500
Max grad value	2.0
Effective batch size	1
Micro-batch size	1
Gradient accumulation steps	1
Number of GPUs	1
Gradient checkpointing	True
Prediction type	flow-matching (extra parameters=['shift=3', 'flux_guidance_mode=constant', 'flux_guidance_value=1.0', 'flow_matching_loss=compatible', 'flux_lora_target=all'])
Optimizer	adamw_bf16
Trainable parameter precision	Pure BF16
Base model precision	`int8-quanto`
Caption dropout probability	1.0%
LoRA rank	32
LoRA alpha	None
LoRA dropout	0.1
LoRA initialization style	Default
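
The batch-related rows are consistent with one another: effective batch size is conventionally the product of micro-batch size, gradient accumulation steps, and GPU count. The sketch below checks that and also reports what share of the run is spent in warmup. The function names are illustrative.

```python
def effective_batch_size(micro_batch: int, grad_accum_steps: int, num_gpus: int) -> int:
    """Samples contributing to each optimizer step across all devices."""
    return micro_batch * grad_accum_steps * num_gpus


def warmup_fraction(warmup_steps: int, total_steps: int) -> float:
    """Share of training steps spent ramping up the learning rate."""
    return warmup_steps / total_steps


# Values taken from the training settings table.
print(effective_batch_size(micro_batch=1, grad_accum_steps=1, num_gpus=1))
print(f"{warmup_fraction(warmup_steps=500, total_steps=2000):.0%}")
```

With these values, a quarter of the 2000 steps fall within the warmup period.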

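Datasets

reddy-v2-512

Repeats: 10
Total number of images: 13
Total number of aspect buckets: 1
Resolution: 0.262144 megapixels
Crop: False
Crop style: None
Crop aspect: None
Used as regularization data: No

reddy-v2-1024

Repeats: 10
Total number of images: 5
Total number of aspect buckets: 1
Resolution: 1.048576 megapixels
Crop: False
Crop style: None
Crop aspect: None
Used as regularization data: No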
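
The resolutions are given as pixel areas in megapixels. Assuming square images, which the dataset names suggest but the source does not state outright, the areas correspond to sides of 512 and 1024 pixels. The sketch also estimates how many samples one epoch draws, under the assumption that `repeats` counts additional passes on top of the first. That interpretation is not confirmed by the source.

```python
import math


def square_side(megapixels: float) -> int:
    """Side length in pixels of a square image with the given area in megapixels."""
    pixels = round(megapixels * 1_000_000)
    return math.isqrt(pixels)


def samples_per_epoch(num_images: int, repeats: int) -> int:
    """Samples per epoch, ASSUMING each image is seen once plus `repeats` more times."""
    return num_images * (1 + repeats)


print(square_side(0.262144), square_side(1.048576))
total = samples_per_epoch(13, 10) + samples_per_epoch(5, 10)
print(total)
```

Under that assumption, 198 samples per epoch over 10 epochs gives 1,980 steps at batch size 1, which is close to the 2,000 training steps reported.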

🔧 Technical Details

This model applies a PEFT LoRA to the base model [black-forest-labs/FLUX.1-dev](https://huggingface.co/black-forest-labs/FLUX.1-dev).
During training, a specific validation prompt was used to check outputs while various settings were tuned.
At inference time, quantizing the model reduces VRAM usage.
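
As a rough illustration of the saving, the sketch below estimates weight memory at two precisions. The 12-billion parameter count is an assumption about the FLUX.1-dev transformer and is not stated in this document. The estimate covers weights only, not activations, the text encoders, or the VAE.

```python
BYTES_PER_PARAM = {"bf16": 2, "int8": 1}


def weight_memory_gb(num_params: int, precision: str) -> float:
    """Approximate weight storage in gigabytes (10**9 bytes) at a given precision."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9


# Assumed parameter count for illustration; not taken from the source.
ASSUMED_PARAMS = 12_000_000_000

for precision in ("bf16", "int8"):
    print(f"{precision}: ~{weight_memory_gb(ASSUMED_PARAMS, precision):.0f} GB")
```

Moving from two bytes per weight to one halves the weight footprint, whatever the true parameter count is.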