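🚀 3D Dissolve Effect Video Generation Model Fine-tuning Project
This project is a fine-tune of the THUDM/CogVideoX-5b model on the finetrainers/3dgs-dissolve dataset. We also provide a LoRA variant of the parameters, which can be viewed here.
🚀 Quick Start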
Code repository: https://github.com/a-r-r-o-w/finetrainers
⚠️ Important Note
This is an experimental checkpoint, and it is known to generalize poorly.
💻 Usage Examples
Basic Usage
from diffusers import CogVideoXTransformer3DModel, DiffusionPipeline
from diffusers.utils import export_to_video
import torch

# Load the fine-tuned transformer and plug it into the base CogVideoX-5b pipeline.
transformer = CogVideoXTransformer3DModel.from_pretrained(
    "finetrainers/3dgs-v0", torch_dtype=torch.bfloat16
)
pipeline = DiffusionPipeline.from_pretrained(
    "THUDM/CogVideoX-5b", transformer=transformer, torch_dtype=torch.bfloat16
).to("cuda")

# The prompt begins with the 3D_dissolve identifier token.
prompt = """
3D_dissolve In a 3D appearance, a bookshelf filled with books is surrounded by a burst of red sparks, creating a dramatic and explosive effect against a black background.
"""
negative_prompt = "inconsistent motion, blurry motion, worse quality, degenerate outputs, deformed outputs"

# Generate 81 frames at 512x768 and write them out at 25 fps.
video = pipeline(
    prompt=prompt,
    negative_prompt=negative_prompt,
    num_frames=81,
    height=512,
    width=768,
    num_inference_steps=50,
).frames[0]
export_to_video(video, "output.mp4", fps=25)
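As a quick sanity check on the settings above, the length of the exported clip follows directly from `num_frames` and `fps`. The helper below is a minimal sketch; `clip_duration_seconds` is a hypothetical name introduced here for illustration and is not part of diffusers.

```python
def clip_duration_seconds(num_frames: int, fps: int) -> float:
    """Return the playback length of a clip in seconds.

    Hypothetical helper for illustration; not part of diffusers.
    """
    if num_frames <= 0 or fps <= 0:
        raise ValueError("num_frames and fps must be positive")
    return num_frames / fps


# Values taken from the basic usage example: 81 frames exported at 25 fps.
duration = clip_duration_seconds(num_frames=81, fps=25)
print(f"{duration:.2f} s")  # 3.24 s
```

With the values used in this document, each generated clip lasts a little over three seconds.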
Advanced Usage
from diffusers import DiffusionPipeline
from diffusers.utils import export_to_video
import torch
pipeline = DiffusionPipeline.from_pretrained("THUDM/CogVideoX-5b", torch_dtype=torch.bfloat16).to("cuda")
# The LoRA paths below are local paths from the original example; replace them with the location of your own LoRA files.
pipeline.load_lora_weights("/fsx/sayak/finetrainers/cogvideox-crush/extracted_crush_smol_lora_64.safetensors", adapter_name="crush")
pipeline.load_lora_weights("/fsx/sayak/finetrainers/cogvideox-3dgs/extracted_3dgs_lora_64.safetensors", adapter_name="3dgs")
pipeline
prompts = ["""
In a 3D appearance, a small bicycle is seen surrounded by a burst of fiery sparks, creating a dramatic and intense visual effect against the dark background.
The video showcases a dynamic explosion of fiery particles in a 3D appearance, with sparks and embers scattering across the screen against a stark black background.
""",
"""
In a 3D appearance, a bookshelf filled with books is surrounded by a burst of red sparks, creating a dramatic and explosive effect against a black background.
""",
]
negative_prompt = "inconsistent motion, blurry motion, worse quality, degenerate outputs, deformed outputs, bad physique"
id_token = "3D_dissolve"
# Generate one video per prompt, prefixing each prompt with the identifier token.
for i, prompt in enumerate(prompts):
    video = pipeline(
        prompt=f"{id_token} {prompt}",
        negative_prompt=negative_prompt,
        num_frames=81,
        height=512,
        width=768,
        num_inference_steps=50,
        generator=torch.manual_seed(0),  # fixed seed for reproducible outputs
    ).frames[0]
    export_to_video(video, f"output_{i}.mp4", fps=25)
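The loop above builds each prompt as `f"{id_token} {prompt}"`, so the newlines left by the triple-quoted strings end up inside the final prompt. The sketch below shows one way to prefix the identifier token while collapsing that whitespace. `build_prompt` is a hypothetical helper introduced for illustration, not part of diffusers or finetrainers, and the source does not say whether this normalization affects generation quality.

```python
def build_prompt(id_token: str, prompt: str) -> str:
    """Prefix the identifier token and collapse stray whitespace.

    Hypothetical helper for illustration; not part of diffusers or finetrainers.
    """
    body = " ".join(prompt.split())  # drops the newlines left by triple quotes
    if not body:
        return id_token
    if body.split(" ", 1)[0] == id_token:
        return body  # token already present, avoid adding it twice
    return f"{id_token} {body}"


raw_prompt = """
In a 3D appearance, a bookshelf filled with books is surrounded by a burst of red sparks, creating a dramatic and explosive effect against a black background.
"""
print(build_prompt("3D_dissolve", raw_prompt))
```

The result could then be passed as `prompt=build_prompt(id_token, prompt)` in place of the f-string.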
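📚 Detailed Documentation
Model Information
Example Outputs
| Prompt | Output video link |
| --- | --- |
| 3D_dissolve A small tiger wearing colorful winter clothing appears in a 3D appearance, surrounded by dynamic red sparks. The sparks swirl around the tiger and gradually dissipate into a burst of red sparks, leaving a pitch-black background. | ./assets/output_0.mp4 |
| 3D_dissolve A small car rendered in a 3D appearance drives through a swirling vortex of fiery particles. As it moves forward, its surroundings turn into a dynamic display of red sparks that finally dissipate into a burst of red sparks, creating a captivating visual effect against the dark background. | ./assets/output_1.mp4 |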
Tags
- text-to-video
- diffusers-training
- diffusers
- cogvideox
- cogvideox-diffusers
- template:sd-lora
LoRA
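We extracted a rank-64 LoRA from the fine-tuned checkpoint (the script is available here). This LoRA can be used to emulate the same kind of effect.
Training logs are available on WandB here.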
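The extraction script itself is not reproduced in this document. As a rough illustration of the general idea, a LoRA can be recovered from a fine-tuned weight by taking a truncated SVD of the difference between the fine-tuned and base weights. The sketch below assumes that approach and uses NumPy on a toy matrix. `extract_lora` and all of the shapes are hypothetical and are not taken from the finetrainers script.

```python
import numpy as np


def extract_lora(w_base: np.ndarray, w_tuned: np.ndarray, rank: int):
    """Approximate (w_tuned - w_base) with a rank-`rank` product lora_b @ lora_a.

    Hypothetical sketch of SVD-based extraction; not the finetrainers script.
    """
    delta = w_tuned - w_base
    u, s, vt = np.linalg.svd(delta, full_matrices=False)
    # Keep the top `rank` singular directions and split the singular values
    # evenly between the two factors.
    sqrt_s = np.sqrt(s[:rank])
    lora_b = u[:, :rank] * sqrt_s          # shape (out_features, rank)
    lora_a = sqrt_s[:, None] * vt[:rank]   # shape (rank, in_features)
    return lora_a, lora_b


rng = np.random.default_rng(0)
w_base = rng.normal(size=(128, 96))
# Toy "fine-tuned" weight whose update has rank 4 by construction.
update = rng.normal(size=(128, 4)) @ rng.normal(size=(4, 96))
w_tuned = w_base + update

lora_a, lora_b = extract_lora(w_base, w_tuned, rank=4)
error = np.linalg.norm(w_base + lora_b @ lora_a - w_tuned)
print(lora_a.shape, lora_b.shape, error < 1e-8)
```

In this toy case the update is low-rank by construction, so the reconstruction is essentially exact. For a real fine-tune the weight difference is generally not low-rank, and a rank-64 factorization would only approximate it.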
📄 License
This project is released under a license categorized as "other"; see here for the specific license terms.