Wan Flat Color 1.3b V2: open-source style model for generating flat-color images with no visible lineart


Wan Flat Color 1.3b V2

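Developed by motimalu

A style model trained on images with no visible lineart, flat colors, and minimal depth.

Image generation  License: Apache-2.0  #flat color, no lineart  #anime stylization  #VTuber generation

Downloads: 49

Released: 3/13/2025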

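Model overview

This model is a LoRA that generates images in a flat-color style with no visible lineart. It is particularly suited to anime-style character design.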

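Model features

Flat-color style

Generates images with flat colors and no visible lineart

LoRA adapter

Fine-tuned as a LoRA, so the target style is added while the base model's capabilities are retained

High-quality output

Can produce high-quality, cinematic-looking frames, particularly for anime-style character design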

Model capabilities

Text-to-image generation

Stylized image generation

Anime character design

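Use cases

Digital art creation

VTuber character design

Generate anime-style character art for virtual YouTubers

For example, the Hoshimachi Suisei and Sakura Miko characters in the examples

Anime scene creation

Create anime scenes in a specific style

For example, a starry-sky background or a scene under a cherry blossom tree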

🚀 Flat Color Style Model

This flat color style model focuses on generating images and videos with no visible lineart, flat colors, and little sense of depth.

🚀 Quick start

Trigger words

Include flat color in the prompt to trigger the style.
Include no lineart in the prompt to trigger the style.
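
As a minimal sketch of how the two trigger words might be applied in practice, the helper below prepends whichever of them is missing from a comma-separated prompt. The function name build_prompt and the constant TRIGGER_WORDS are hypothetical and used only for illustration; the model card itself only states that the two phrases act as triggers.

```python
# Trigger words named in the model card.
TRIGGER_WORDS = ("flat color", "no lineart")


def build_prompt(description: str) -> str:
    """Prepend any missing trigger words to a comma-separated prompt.

    Hypothetical helper for illustration; not part of the model or ComfyUI.
    """
    tags = [t.strip() for t in description.split(",") if t.strip()]
    present = {t.lower() for t in tags}
    missing = [w for w in TRIGGER_WORDS if w not in present]
    return ", ".join(missing + tags)


print(build_prompt("1girl, blue hair, starry sky"))
# flat color, no lineart, 1girl, blue hair, starry sky
```

Both example prompts later in this page start with the same two phrases, so placing them first matches how the author wrote their own prompts.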

Model download

The weights for this model are provided in Safetensors format and can be downloaded from the Files & versions tab.

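✨ Key features

Distinctive style: trained on images with no visible lineart, flat colors, and little sense of depth, and reproduces that look in generated images and videos.
Broad applicability: suitable for use cases such as VTuber character art and anime-style video.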

📦 Installation

When loading the LoRA, use the LoraLoaderModelOnly node together with the fp16 base model wan2.1_t2v_1.3B_fp16.safetensors.
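
The loading step above can be sketched as a fragment of a ComfyUI API-format workflow, written here as a Python dict. This is an illustration under assumptions, not the author's workflow: it assumes the base model is loaded with the UNETLoader ("Load Diffusion Model") node, the node ids "1" and "2" are arbitrary, and the LoRA file name is a hypothetical placeholder for whatever file you downloaded.

```python
import json

# Hypothetical placeholder: use the actual file name of the downloaded LoRA.
LORA_FILE = "wan_flat_color_1.3b_v2.safetensors"

workflow_fragment = {
    # Assumption: the fp16 base model is loaded with the UNETLoader node.
    "1": {
        "class_type": "UNETLoader",
        "inputs": {
            "unet_name": "wan2.1_t2v_1.3B_fp16.safetensors",
            "weight_dtype": "default",
        },
    },
    # LoraLoaderModelOnly patches only the diffusion model (no CLIP input).
    "2": {
        "class_type": "LoraLoaderModelOnly",
        "inputs": {
            "model": ["1", 0],  # output 0 of node "1"
            "lora_name": LORA_FILE,
            "strength_model": 1.0,
        },
    },
}

print(json.dumps(workflow_fragment, indent=2))
```

The remaining nodes of a text-to-video graph (text encoder, sampler, VAE decode) are omitted; the sampler would take its model input from node "2" rather than directly from the base model loader.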

💻 Usage examples

Basic usage

Text-to-video preview examples are available at ComfyUI_examples/wan/#text-to-video.

Some example inputs and their outputs:

Example 1

Prompt: flat color, no lineart, blending, negative space, artist:[john kafka|ponsuke kaikai|hara id 21|yoneyama mai|fuzichoco],  1girl, hoshimachi suisei, virtual youtuber, blue hair, side ponytail, cowboy shot, black shirt, star print, off shoulder, outdoors, starry sky, wariza, looking up, half-closed eyes, black sky,  live2d animation, upper body, high quality cinematic video of a woman sitting under the starry night sky. The Camera is steady, This is a cowboy shot. The animation is smooth and fluid.
Negative prompt: bad quality video,色調豔麗，過曝，靜態，細節模糊不清，字幕，風格，作品，畫作，畫面，靜止，整體發灰，最差質量，低質量，JPEG壓縮殘留，醜陋的，殘缺的，多餘的手指，畫得不好的手部，畫得不好的臉部，畸形的，毀容的，形態畸形的肢體，手指融合，靜止不動的畫面，雜亂的背景，三條腿，背景人很多，倒著走
Output: [images/ComfyUI_00455_.webp](images/ComfyUI_00455_.webp)

Example 2

Prompt: flat color, no lineart, blending, negative space, artist:[john kafka|ponsuke kaikai|hara id 21|yoneyama mai|fuzichoco],  1girl, sakura miko, pink hair, cowboy shot, white shirt, floral print, off shoulder, outdoors, cherry blossom, tree shade, wariza, looking up, falling petals, half-closed eyes, white sky, clouds,  live2d animation, upper body, high quality cinematic video of a woman sitting under a sakura tree. Dreamy and lonely, the camera close-ups on the face of the woman as she turns towards the viewer. The Camera is steady, This is a cowboy shot. The animation is smooth and fluid.
Negative prompt: bad quality video,色調豔麗，過曝，靜態，細節模糊不清，字幕，風格，作品，畫作，畫面，靜止，整體發灰，最差質量，低質量，JPEG壓縮殘留，醜陋的，殘缺的，多餘的手指，畫得不好的手部，畫得不好的臉部，畸形的，毀容的，形態畸形的肢體，手指融合，靜止不動的畫面，雜亂的背景，三條腿，背景人很多，倒著走
Output: [images/ComfyUI_00469_.webp](images/ComfyUI_00469_.webp)

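📚 Detailed documentation

Model description

This model was trained on top of the Wan-AI/Wan2.1-T2V-1.3B-Diffusers base model. This content is reposted from CivitAI.

Training configuration

The model was trained with diffusion-pipe. The full training configuration files follow: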

dataset.toml

# Resolution settings
resolutions = [512]

# Aspect ratio bucketing settings
enable_ar_bucket = true
min_ar = 0.5
max_ar = 2.0
num_ar_buckets = 7

# Frame buckets (1 means still images)
frame_buckets = [1]

[[directory]] # Images
# Path to the directory containing images and their corresponding caption files
path = '/mnt/d/huanvideo/training_data/images'
num_repeats = 5
resolutions = [720]
frame_buckets = [1] # Use 1 frame for images

[[directory]] # Videos
# Path to the directory containing videos and their corresponding caption files
path = '/mnt/d/huanvideo/training_data/videos'
num_repeats = 5
resolutions = [512] # Set video resolution to 512 (e.g. 244p)
frame_buckets = [6, 28, 31, 32, 36, 42, 43, 48, 50, 53]
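
To make the aspect ratio settings concrete, the sketch below computes the bucket values implied by min_ar, max_ar, and num_ar_buckets. It assumes the buckets are spaced evenly in log space between the two limits, which is how the diffusion-pipe example configuration describes them; the function name ar_buckets is hypothetical and this is not diffusion-pipe's own code.

```python
import math


def ar_buckets(min_ar: float, max_ar: float, num_buckets: int) -> list[float]:
    """Return num_buckets aspect ratios evenly spaced in log space.

    Assumption: mirrors the spacing described for diffusion-pipe's
    min_ar / max_ar / num_ar_buckets options. Requires num_buckets >= 2.
    """
    log_min, log_max = math.log(min_ar), math.log(max_ar)
    step = (log_max - log_min) / (num_buckets - 1)
    return [math.exp(log_min + i * step) for i in range(num_buckets)]


# Values from dataset.toml above.
buckets = ar_buckets(min_ar=0.5, max_ar=2.0, num_buckets=7)
print([round(b, 3) for b in buckets])
```

With these settings the middle bucket is the square aspect ratio (1.0), and each bucket is wider than the previous one by a constant factor of about 1.26, so portrait and landscape inputs are treated symmetrically.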

config.toml

# Output directory and dataset config file
output_dir = '/mnt/d/wan/training_output'
dataset = 'dataset.toml'

# Training settings
epochs = 50
micro_batch_size_per_gpu = 1
pipeline_stages = 1
gradient_accumulation_steps = 4
gradient_clipping = 1.0
warmup_steps = 100

# Evaluation settings
eval_every_n_epochs = 5
eval_before_first_step = true
eval_micro_batch_size_per_gpu = 1
eval_gradient_accumulation_steps = 1

# Miscellaneous settings
save_every_n_epochs = 5
checkpoint_every_n_minutes = 30
activation_checkpointing = true
partition_method = 'parameters'
save_dtype = 'bfloat16'
caching_batch_size = 1
steps_per_print = 1
video_clip_mode = 'single_middle'

[model]
type = 'wan'
ckpt_path = '../Wan2.1-T2V-1.3B'
dtype = 'bfloat16'
# The transformer can use fp8 when training a LoRA
transformer_dtype = 'float8'
timestep_sample_method = 'logit_normal'

[adapter]
type = 'lora'
rank = 32
dtype = 'bfloat16'

[optimizer]
type = 'adamw_optimi'
lr = 5e-5
betas = [0.9, 0.99]
weight_decay = 0.02
eps = 1e-8
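
As a rough worked example of what the batch settings in config.toml imply, the sketch below computes the effective batch size per optimizer step. It assumes the usual relation for gradient accumulation with data parallelism, where the number of data-parallel replicas is the GPU count divided by pipeline_stages. The GPU count is not stated in the source, so num_gpus=1 is an assumption, and the function name is hypothetical.

```python
def effective_batch_size(
    micro_batch_size_per_gpu: int,
    gradient_accumulation_steps: int,
    num_gpus: int = 1,        # assumption: not given in the source
    pipeline_stages: int = 1,
) -> int:
    """Samples contributing to one optimizer step.

    Assumes data-parallel replicas = num_gpus // pipeline_stages.
    """
    replicas = num_gpus // pipeline_stages
    return micro_batch_size_per_gpu * gradient_accumulation_steps * replicas


# Values from config.toml above, on a single GPU.
print(effective_batch_size(1, 4))  # 4
```

Under that single-GPU assumption, the learning rate of 5e-5 is applied once per 4 samples, and warmup_steps = 100 covers the first 400 samples seen.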