SD3.5-Large-Anime-LoRA Open-Source Model: Efficient Generation of High-Quality Anime-Style Images

SD3.5 Large Anime LoRA

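Developed by prithivMLmods

An anime-style LoRA adapter built on the stable-diffusion-3.5-large model that generates high-quality anime-style images.

Category: Image generation. Open source. License: Openrail. Tags: #anime-style-generation #LoRA-fine-tuned-diffusion-model #high-resolution-cartoon-rendering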

Downloads: 1,529

Release date: 10/29/2024

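Model Overview

This model is a LoRA adapter for stable-diffusion-3.5-large, trained specifically for anime-style image generation. It generates images in a particular anime style from text descriptions.

Model Features

Anime style optimization

Trained specifically on the Anime 35 style to produce high-quality anime-style images.

LoRA integration

Fine-tunes the base model with LoRA, adding a specific style while preserving the base model's capabilities.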
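
To make the idea concrete, here is a minimal sketch of how a LoRA adapter changes a single weight matrix: the frozen base weight W is combined with a scaled low-rank product, W + (alpha / rank) * B A. The matrix sizes, the helper names, and the random initialisation are illustrative assumptions, not details of how this adapter was trained.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_effective_weight(w, lora_a, lora_b, alpha, rank):
    """Return W + (alpha / rank) * (B @ A), the weight a fused LoRA layer would use."""
    scale = alpha / rank
    delta = matmul(lora_b, lora_a)  # (d_out x rank) @ (rank x d_in) -> d_out x d_in
    return [
        [wv + scale * dv for wv, dv in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


# Toy sizes, chosen only for illustration (hypothetical, not from the model).
d_out, d_in, rank, alpha = 8, 8, 2, 1.0
rng = random.Random(0)
w = [[rng.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]         # frozen base weight
lora_a = [[rng.gauss(0, 0.01) for _ in range(d_in)] for _ in range(rank)]  # trainable
lora_b = [[0.0] * rank for _ in range(d_out)]                              # trainable, zero-initialised

# With B at zero the adapter starts as a no-op, so the base model is untouched.
w_start = lora_effective_weight(w, lora_a, lora_b, alpha, rank)
print(w_start == w)

# Only A and B are trained, which is far fewer values than the full matrix.
base_params = d_out * d_in
lora_params = rank * (d_in + d_out)
print(base_params, lora_params)
```

Because only the two small matrices are trained, the base weights stay intact, which is why the adapter can add a style without discarding what the base model already knows.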

High-resolution output

Supports high-resolution generation at sizes such as 960×1280.
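
When trying other sizes, it can help to snap the requested width and height to values the pipeline is likely to accept. The sketch below assumes both sides must be multiples of 16 (an 8x VAE downscale times a patch size of 2, which is my understanding of the SD3 pipeline in diffusers, not something stated on this page). The function name is hypothetical.

```python
def snap_resolution(width: int, height: int, multiple: int = 16) -> tuple[int, int]:
    """Round each side down to the nearest multiple of `multiple`.

    The default of 16 is an assumption about the SD3 pipeline, not a documented
    property of this LoRA.
    """
    if width < multiple or height < multiple:
        raise ValueError(f"both sides must be at least {multiple} pixels")
    return (width // multiple) * multiple, (height // multiple) * multiple


# 960x1280, the size used on this page, is already aligned and passes through unchanged.
print(snap_resolution(960, 1280))
# An arbitrary request is rounded down to the nearest aligned size.
print(snap_resolution(1000, 1300))
```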

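Model Capabilities

Text-to-image generation

Anime-style image generation

High-resolution image generation

Style-specific image generation

Use Cases

Anime production

Anime character design

Generates anime character images in a specific style from text descriptions.

Can generate anime characters with a specified hair color, outfit, and facial expression.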
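
As an illustration of the character-design use case, a small helper can assemble a prompt from the attributes mentioned above. The "Anime 35" prefix is borrowed from the example prompt elsewhere on this page; whether the adapter strictly requires it is an assumption, and the function name and phrasing template are hypothetical.

```python
def build_character_prompt(subject: str, hair: str, outfit: str, expression: str,
                           trigger: str = "Anime 35") -> str:
    """Join character attributes into one comma-separated prompt string."""
    parts = [
        f"{trigger} depiction of {subject}",  # style prefix (assumed trigger phrase)
        f"{hair} hair",
        f"wearing {outfit}",
        f"{expression} expression",
    ]
    return ", ".join(parts)


prompt = build_character_prompt(
    subject="a young woman",
    hair="silver",
    outfit="a red school uniform",
    expression="cheerful",
)
print(prompt)
```

The resulting string can be passed as the prompt argument of the pipeline call shown in the usage section.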

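Anime scene generation

Generates anime-style background scenes.

Can generate anime-style scenes such as cafés and beaches.

Concept design

Character concept design

Quickly generates character concept art.

Helps designers visualize character concepts quickly.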

🚀 SD3.5-Large-Anime-LoRA

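This is a LoRA model for image generation, based on Stable Diffusion 3.5 Large. It generates anime-style images.

🚀 Quick Start

This model is still in training. It is not the final version, so it may produce artifacts and perform poorly in some cases.

✨ Key Features

Anime-style image generation: the model generates anime-style images.
Support for varied subjects: it can generate images of people, landscapes, characters, and more.

📦 Installation

Use the following code to set up the model. It requires the torch and diffusers packages.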

import torch
from diffusers import StableDiffusion3Pipeline

# Load the SD3.5 Large base model in bfloat16
pipe = StableDiffusion3Pipeline.from_pretrained("stabilityai/stable-diffusion-3.5-large", torch_dtype=torch.bfloat16)
# Load the anime LoRA weights and fuse them into the base model at full strength
pipe.load_lora_weights("prithivMLmods/SD3.5-Large-Anime-LoRA", weight_name="SD3.5-Large-Anime-LoRA.safetensors")
pipe.fuse_lora(lora_scale=1.0)
pipe.to("cuda")

prompt = "Man in the style of dark beige and brown, uhd image, youthful protagonists, nonrepresentational photography"
negative_prompt = "(lowres, low quality, worst quality)"

# Generate a single 960x1280 image and save it to disk
image = pipe(prompt=prompt,
             negative_prompt=negative_prompt,
             num_inference_steps=24,
             guidance_scale=4.0,
             width=960, height=1280,
            ).images[0]
image.save("example.jpg")

💻 Usage Examples

Basic usage

import torch
from diffusers import StableDiffusion3Pipeline

# Load the SD3.5 Large base model in bfloat16
pipe = StableDiffusion3Pipeline.from_pretrained("stabilityai/stable-diffusion-3.5-large", torch_dtype=torch.bfloat16)
# Load the anime LoRA weights and fuse them into the base model at full strength
pipe.load_lora_weights("prithivMLmods/SD3.5-Large-Anime-LoRA", weight_name="SD3.5-Large-Anime-LoRA.safetensors")
pipe.fuse_lora(lora_scale=1.0)
pipe.to("cuda")

# The prompt opens with "Anime 35", the style this adapter was trained on
prompt = "Anime 35 depiction of a man with dreadlocks stands in front of a backdrop of rocky mountains. The mans face is adorned with bright yellow eyes, a brown scarf, and a brown jacket. His jacket is draped over a brown belt with a silver buckle. His dreadlocks are pulled back, adding a pop of color to the scene. The sky is a deep blue, dotted with fluffy white clouds."
negative_prompt = "(lowres, low quality, worst quality)"

# Generate a single 960x1280 image and save it to disk
image = pipe(prompt=prompt,
             negative_prompt=negative_prompt,
             num_inference_steps=24,
             guidance_scale=4.0,
             width=960, height=1280,
            ).images[0]
image.save("example.jpg")
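
Diffusion sampling is random by default, so two runs of the same prompt produce different images. A possible way to make runs repeatable is to pass a seeded generator to the pipeline, as sketched below. `GenerationSettings` and `generate` are hypothetical helpers, not part of diffusers; the default values are copied from the example above, and `pipe` is assumed to be a pipeline that has already been loaded as shown on this page.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class GenerationSettings:
    """Sampling settings copied from the example on this page."""
    num_inference_steps: int = 24
    guidance_scale: float = 4.0
    width: int = 960
    height: int = 1280


def generate(pipe, prompt: str, seed: int,
             settings: GenerationSettings = GenerationSettings(),
             negative_prompt: str = "(lowres, low quality, worst quality)"):
    """Run an already-loaded pipeline with a fixed seed so the result can be reproduced."""
    import torch  # imported lazily so the settings can be inspected without torch installed

    # A seeded generator fixes the initial noise that sampling starts from.
    generator = torch.Generator(device="cpu").manual_seed(seed)
    return pipe(prompt=prompt,
                negative_prompt=negative_prompt,
                generator=generator,
                **asdict(settings)).images[0]


# The settings can be checked or overridden without touching the GPU.
print(asdict(GenerationSettings()))
```

A call such as generate(pipe, prompt, seed=42) should then start from the same initial noise each time on the same hardware and library versions.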

📚 Documentation

Model description

prithivMLmods/SD3.5-Large-Anime-LoRA

Image Processing Parameters

Attribute	Details
LR scheduler	constant
Noise offset	0.03
Optimizer	AdamW
Multires noise discount	0.1
Network dim	64
Multires noise iterations	10
Network alpha	32
Repeats & steps	25 & 2.7K
Epochs	15
Save frequency	Every epoch
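
For reference, the table can be captured as a small configuration object. This is only a sketch: the field names are hypothetical and do not correspond to the flags of any particular training tool. The derived `lora_scale` follows the common LoRA convention of scaling the low-rank update by alpha / dim, which is assumed here rather than confirmed by the source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoraTrainingConfig:
    """Values from the parameter table; field names are illustrative only."""
    lr_scheduler: str = "constant"
    noise_offset: float = 0.03
    optimizer: str = "AdamW"
    multires_noise_discount: float = 0.1
    network_dim: int = 64
    multires_noise_iterations: int = 10
    network_alpha: int = 32
    repeats: int = 25
    epochs: int = 15

    @property
    def lora_scale(self) -> float:
        """alpha / dim, the factor conventionally applied to the low-rank update."""
        return self.network_alpha / self.network_dim


config = LoraTrainingConfig()
print(config.lora_scale)  # 32 / 64 = 0.5
```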