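🚀 3D Dissolve Effect Video Generation Model Fine-Tuning Project
This project is a fine-tune of the THUDM/CogVideoX-5b model on the finetrainers/3dgs-dissolve dataset. A LoRA variant of the parameters is also provided and can be found here.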
🚀 Quick Start
Code repository: https://github.com/a-r-r-o-w/finetrainers
⚠️ Important note
This is an experimental checkpoint, and it is known to generalize poorly.
📦 Installation Guide
The source documentation gives no specific installation steps, so this section is omitted.
💻 Usage Examples
Basic usage
import torch
from diffusers import CogVideoXTransformer3DModel, DiffusionPipeline
from diffusers.utils import export_to_video

# Load the fine-tuned transformer and swap it into the base CogVideoX-5b pipeline.
transformer = CogVideoXTransformer3DModel.from_pretrained(
    "finetrainers/3dgs-v0", torch_dtype=torch.bfloat16
)
pipeline = DiffusionPipeline.from_pretrained(
    "THUDM/CogVideoX-5b", transformer=transformer, torch_dtype=torch.bfloat16
).to("cuda")

# The prompt starts with the "3D_dissolve" trigger token.
prompt = """
3D_dissolve In a 3D appearance, a bookshelf filled with books is surrounded by a burst of red sparks, creating a dramatic and explosive effect against a black background.
"""
negative_prompt = "inconsistent motion, blurry motion, worse quality, degenerate outputs, deformed outputs"

video = pipeline(
    prompt=prompt,
    negative_prompt=negative_prompt,
    num_frames=81,
    height=512,
    width=768,
    num_inference_steps=50,
).frames[0]
export_to_video(video, "output.mp4", fps=25)
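As a quick sanity check on the settings above, the clip length follows directly from the frame count and the export frame rate. The sketch below only repeats that arithmetic with the two values from the example; it does not call the pipeline.

```python
# Values copied from the basic usage example above.
num_frames = 81  # passed to the pipeline call
fps = 25         # passed to export_to_video

# Playback duration of the exported clip, in seconds.
duration_seconds = num_frames / fps
print(f"{num_frames} frames at {fps} fps = {duration_seconds:.2f} s")
```

With these settings the exported video runs a little over three seconds.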
Advanced usage
import torch
from diffusers import DiffusionPipeline
from diffusers.utils import export_to_video

# Load the base model, then attach the two extracted LoRA adapters by name.
pipeline = DiffusionPipeline.from_pretrained("THUDM/CogVideoX-5b", torch_dtype=torch.bfloat16).to("cuda")
pipeline.load_lora_weights("/fsx/sayak/finetrainers/cogvideox-crush/extracted_crush_smol_lora_64.safetensors", adapter_name="crush")
pipeline.load_lora_weights("/fsx/sayak/finetrainers/cogvideox-3dgs/extracted_3dgs_lora_64.safetensors", adapter_name="3dgs")

prompts = ["""
In a 3D appearance, a small bicycle is seen surrounded by a burst of fiery sparks, creating a dramatic and intense visual effect against the dark background.
The video showcases a dynamic explosion of fiery particles in a 3D appearance, with sparks and embers scattering across the screen against a stark black background.
""",
"""
In a 3D appearance, a bookshelf filled with books is surrounded by a burst of red sparks, creating a dramatic and explosive effect against a black background.
""",
]
negative_prompt = "inconsistent motion, blurry motion, worse quality, degenerate outputs, deformed outputs, bad physique"

# Trigger token that is prepended to every prompt.
id_token = "3D_dissolve"

for i, prompt in enumerate(prompts):
    video = pipeline(
        prompt=f"{id_token} {prompt}",
        negative_prompt=negative_prompt,
        num_frames=81,
        height=512,
        width=768,
        num_inference_steps=50,
        generator=torch.manual_seed(0),  # same seed for every prompt
    ).frames[0]
    export_to_video(video, f"output_{i}.mp4", fps=25)
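Because the prompts above are triple-quoted strings that begin and end with a newline, the string built by `f"{id_token} {prompt}"` contains line breaks right after the trigger token. Whether this affects generation is not stated in the source. If a single-line prompt is preferred, a small helper like the one below could be used; `build_prompt` is a hypothetical name introduced here for illustration and is not part of diffusers or finetrainers.

```python
def build_prompt(id_token: str, prompt: str) -> str:
    """Prefix the trigger token and collapse all runs of whitespace
    (including newlines) in the prompt into single spaces."""
    body = " ".join(prompt.split())
    return f"{id_token} {body}"


# Shortened example prompt written in the same triple-quoted style as above.
raw_prompt = """
In a 3D appearance, a bookshelf filled with books is surrounded by a burst of red sparks.
"""
final_prompt = build_prompt("3D_dissolve", raw_prompt)
print(final_prompt)
```

In the loop above, `prompt=build_prompt(id_token, prompt)` would then take the place of the f-string.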
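📚 Documentation
Model information
Example outputs
| Prompt | Output video link |
| --- | --- |
| 3D_dissolve A small tiger in a colorful winter outfit appears in a 3D appearance, surrounded by dynamic red sparks. The sparks swirl around the tiger and gradually dissipate into a burst of red sparks, leaving a pitch-black background. | ./assets/output_0.mp4 |
| 3D_dissolve A small car rendered in a 3D appearance drives through a swirling vortex of fiery particles. As it moves forward, its surroundings turn into a dynamic display of red sparks that finally dissipates into a burst of red sparks, creating a striking visual effect against the dark background. | ./assets/output_1.mp4 |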
Tags
- text-to-video
- diffusers-training
- diffusers
- cogvideox
- cogvideox-diffusers
- template:sd-lora
LoRA
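We extracted a rank-64 LoRA from the fine-tuned checkpoint (the script is available here). This LoRA can be used to reproduce the same kind of effect.
Training logs are available on WandB here.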
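The extraction script itself is only linked, not reproduced here. As a rough illustration of what extracting a low-rank adapter from a fine-tuned checkpoint can look like, the sketch below applies a truncated SVD to the difference between a fine-tuned and a base weight matrix. This is a common approach and is an assumption on our part, not a description of the authors' script. The function name `extract_lora` and the toy matrices are hypothetical, and NumPy stands in for real checkpoint tensors.

```python
import numpy as np


def extract_lora(w_base: np.ndarray, w_tuned: np.ndarray, rank: int):
    """Return (lora_down, lora_up) with lora_up @ lora_down approximating
    w_tuned - w_base, using a truncated SVD (assumed method)."""
    delta = w_tuned - w_base
    u, s, vt = np.linalg.svd(delta, full_matrices=False)
    rank = min(rank, s.shape[0])
    sqrt_s = np.sqrt(s[:rank])
    lora_up = u[:, :rank] * sqrt_s            # shape (out_features, rank)
    lora_down = sqrt_s[:, None] * vt[:rank]   # shape (rank, in_features)
    return lora_down, lora_up


# Toy check: a weight update that is exactly rank 4 should be recovered
# almost exactly by a rank-4 extraction.
rng = np.random.default_rng(0)
w_base = rng.standard_normal((32, 48))
true_delta = rng.standard_normal((32, 4)) @ rng.standard_normal((4, 48))
w_tuned = w_base + true_delta

down, up = extract_lora(w_base, w_tuned, rank=4)
max_err = float(np.abs(up @ down - true_delta).max())
print(down.shape, up.shape)
```

For a real checkpoint the same idea would be applied per layer with rank 64, and the update would generally not be exactly low rank, so the extracted LoRA is an approximation of the full fine-tune.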
📄 License
The license for this project is listed as "other"; see here for the specific license information.