crush-smol-v0 open-source model: free generation of fun videos of objects being crushed by a hydraulic press

Crush Smol V0

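Developed by finetrainers

A fine-tuned version of THUDM/CogVideoX-5b trained on the crush-smol dataset, focused on generating videos of objects being crushed by a hydraulic press.

Text-to-Video | License: Other | #HydraulicPressEffects #ObjectCrushSimulation #HighMotionVideoGeneration

Downloads: 94

Released: 1/27/2025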

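Model Overview

This is a text-to-video generation model that is particularly good at producing high-quality clips of objects being crushed by a large metal cylinder.

Model Features

Hydraulic press crush effect

Video generation tuned specifically for scenes in which an object is crushed by a hydraulic press.

High-quality video output

Generates smooth video at 512x768 resolution and 25 fps.

LoRA support

A rank-64 LoRA variant is provided for lightweight deployment and use.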
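To illustrate why a rank-64 LoRA is lightweight, the sketch below compares the parameter count of a full square weight matrix with that of a rank-64 low-rank pair. The hidden size of 3072 is an illustrative assumption, not a value stated on this page, and the helper name `lora_param_count` is hypothetical.

```python
def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Parameters in a LoRA pair: A has shape (rank, d_in), B has shape (d_out, rank)."""
    return rank * (d_in + d_out)


# Assumption: a single 3072 x 3072 projection matrix, chosen only for illustration.
hidden_size = 3072
rank = 64

full_params = hidden_size * hidden_size
lora_params = lora_param_count(hidden_size, hidden_size, rank)
ratio = lora_params / full_params

print(f"full: {full_params}, LoRA: {lora_params}, ratio: {ratio:.4f}")
```

Under these assumptions the adapter stores roughly 4% of the parameters of the matrix it modifies, which is the sense in which a LoRA file is cheaper to ship than a full checkpoint.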

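Model Capabilities

Text-to-video conversion

Scene-specific video generation

Simulation of physical phenomena

Use Cases

Special-effects video production

Hydraulic press crush effect

Generate effect videos of various objects being crushed by a hydraulic press.

The examples show realistic results for a candle, a light bulb, and a burger being crushed.

Educational demonstrations

Demonstrating physical phenomena

Teaching material that illustrates how objects deform under pressure.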

🚀 Fine-tuned model based on THUDM/CogVideoX-5b

This project is a fine-tune of [THUDM/CogVideoX-5b](https://huggingface.co/THUDM/CogVideoX-5b) on the [finetrainers/crush-smol](https://huggingface.co/datasets/finetrainers/crush-smol) dataset. A LoRA variant of the parameters is also provided.

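Project Information

| Property | Details |
|---|---|
| Base model | THUDM/CogVideoX-5b |
| Training dataset | finetrainers/crush-smol |
| Library | diffusers |
| License | Other (see the [license](https://huggingface.co/THUDM/CogVideoX-5b/blob/main/LICENSE)) |
| Example prompt | DIFF_crush A red candle is placed on a metal platform, and a large metal cylinder descends from above, flattening the candle as if it were under a hydraulic press. The candle is crushed into a flat, round shape, leaving a pile of debris around it. |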

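Examples

Some example prompts and their output videos:

- **Prompt**: DIFF_crush A red candle is placed on a metal platform, and a large metal cylinder descends from above, flattening the candle as if it were under a hydraulic press. The candle is crushed into a flat, round shape, leaving a pile of debris around it.
  **Output video**: [View](./assets/output_0.mp4)
- **Prompt**: DIFF_crush A bulb is placed on a wooden platform, and a large metal cylinder descends from above, crushing the bulb as if it were under a hydraulic press. The bulb is crushed into a flat, round shape, leaving a pile of debris around it.
  **Output video**: [View](./assets/output_1.mp4)
- **Prompt**: DIFF_crush A thick burger is placed on a dining table, and a large metal cylinder descends from above, crushing the burger as if it were under a hydraulic press. The bulb is crushed, leaving a pile of debris around it.
  **Output video**: [View](./assets/output_2.mp4)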
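All of the example prompts share one template and begin with the token `DIFF_crush`, which appears to act as the trigger for the crush effect. The following is a minimal sketch of a prompt builder that follows that template; the function `build_crush_prompt` and its parameters are hypothetical helpers for illustration and are not part of the model or of diffusers.

```python
TRIGGER = "DIFF_crush"  # prefix shared by every example prompt on this page


def build_crush_prompt(subject: str, noun: str, surface: str) -> str:
    """Fill the template used by the example prompts.

    subject: how the object is introduced, e.g. "A red candle"
    noun:    how it is referred to afterwards, e.g. "candle"
    surface: where it rests, e.g. "a metal platform"
    """
    return (
        f"{TRIGGER} {subject} is placed on {surface}, and a large metal cylinder "
        f"descends from above, crushing the {noun} as if it were under a hydraulic press. "
        f"The {noun} is crushed into a flat, round shape, leaving a pile of debris around it."
    )


prompt = build_crush_prompt("A red candle", "candle", "a metal platform")
print(prompt)
```

Since the checkpoint is described as generalizing poorly, staying close to the wording of the example prompts is probably the safer choice when trying new objects.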

Project Code

The project code is available on [GitHub](https://github.com/a-r-r-o-w/finetrainers).

⚠️ Important: this is an experimental checkpoint, and its poor generalization is a known limitation.

🚀 Quick Start

Inference Code

The following example runs inference with the fine-tuned model:

```python
import torch
from diffusers import CogVideoXTransformer3DModel, DiffusionPipeline
from diffusers.utils import export_to_video

# Load the fine-tuned transformer weights in bfloat16.
transformer = CogVideoXTransformer3DModel.from_pretrained(
    "finetrainers/crush-smol-v0", torch_dtype=torch.bfloat16
)
# Build the base CogVideoX-5b pipeline with the fine-tuned transformer swapped in.
pipeline = DiffusionPipeline.from_pretrained(
    "THUDM/CogVideoX-5b", transformer=transformer, torch_dtype=torch.bfloat16
).to("cuda")

# Prompts start with the DIFF_crush token, as in the examples.
prompt = """
DIFF_crush A thick burger is placed on a dining table, and a large metal cylinder descends from above, crushing the burger as if it were under a hydraulic press. The bulb is crushed, leaving a pile of debris around it.
"""
negative_prompt = "inconsistent motion, blurry motion, worse quality, degenerate outputs, deformed outputs"

# Generate 81 frames at 512x768 and keep the first (only) video in the batch.
video = pipeline(
    prompt=prompt,
    negative_prompt=negative_prompt,
    num_frames=81,
    height=512,
    width=768,
    num_inference_steps=50,
).frames[0]
export_to_video(video, "output.mp4", fps=25)
```
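As a quick sanity check on the settings used above, the clip length follows directly from `num_frames` and the export `fps`. The helper `clip_duration_seconds` below is a hypothetical name introduced only for this arithmetic.

```python
def clip_duration_seconds(num_frames: int, fps: int) -> float:
    """Playback duration of a clip with num_frames frames shown at fps frames per second."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return num_frames / fps


# Values taken from the inference example: 81 frames exported at 25 fps, 512x768 per frame.
duration = clip_duration_seconds(81, 25)
pixels_per_frame = 512 * 768

print(f"{duration:.2f} s, {pixels_per_frame} pixels per frame")
# prints "3.24 s, 393216 pixels per frame"
```

So each generated clip is a little over three seconds long; lowering `fps` at export time stretches the same 81 frames over a longer playback time without changing what the model generates.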

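Training Logs

The training logs are available on WandB: [View](https://wandb.ai/sayakpaul/finetrainers-cogvideox/runs/ngcsyhom)

LoRA

We extracted a rank-64 LoRA from the fine-tuned checkpoint using an extraction script. This LoRA can be used to reproduce the same effect:

Code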

```python
import torch
from diffusers import DiffusionPipeline
from diffusers.utils import export_to_video

# Load the unmodified base pipeline, then apply the extracted rank-64 LoRA on top of it.
pipeline = DiffusionPipeline.from_pretrained("THUDM/CogVideoX-5b", torch_dtype=torch.bfloat16).to("cuda")
# The LoRA file is loaded from this model's repository (the original snippet pointed at finetrainers/cakeify-v0).
pipeline.load_lora_weights("finetrainers/crush-smol-v0", weight_name="extracted_crush_smol_lora_64.safetensors")

prompt = """
DIFF_crush A thick burger is placed on a dining table, and a large metal cylinder descends from above, crushing the burger as if it were under a hydraulic press. The bulb is crushed, leaving a pile of debris around it.
"""
negative_prompt = "inconsistent motion, blurry motion, worse quality, degenerate outputs, deformed outputs"

# Same sampling settings as the full fine-tuned checkpoint.
video = pipeline(
    prompt=prompt,
    negative_prompt=negative_prompt,
    num_frames=81,
    height=512,
    width=768,
    num_inference_steps=50,
).frames[0]
export_to_video(video, "output_lora.mp4", fps=25)
```
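The page does not show how the LoRA was extracted from the fine-tuned checkpoint. One common approach, sketched here as an assumption and not as the authors' script, is to take the difference between a fine-tuned and a base weight matrix and keep its top singular directions via a truncated SVD. The function `extract_lora` and the toy matrices are hypothetical; NumPy is used so the example runs without downloading any model.

```python
import numpy as np


def extract_lora(w_base: np.ndarray, w_finetuned: np.ndarray, rank: int):
    """Approximate (w_finetuned - w_base) as lora_b @ lora_a using a truncated SVD."""
    delta = w_finetuned - w_base
    u, s, vt = np.linalg.svd(delta, full_matrices=False)
    sqrt_s = np.sqrt(s[:rank])
    lora_b = u[:, :rank] * sqrt_s         # shape (d_out, rank)
    lora_a = sqrt_s[:, None] * vt[:rank]  # shape (rank, d_in)
    return lora_a, lora_b


# Toy setup: a base matrix plus an update that is exactly rank 4.
rng = np.random.default_rng(0)
w_base = rng.standard_normal((32, 48))
true_delta = rng.standard_normal((32, 4)) @ rng.standard_normal((4, 48))
w_finetuned = w_base + true_delta

lora_a, lora_b = extract_lora(w_base, w_finetuned, rank=4)
reconstructed = w_base + lora_b @ lora_a
max_error = float(np.max(np.abs(reconstructed - w_finetuned)))
print(f"max reconstruction error: {max_error:.2e}")
```

In this toy case the update has rank 4, so a rank-4 extraction recovers it up to floating-point error. A real fine-tuning update is generally not exactly low rank, so a rank-64 extraction is an approximation of the full checkpoint and may not match its outputs exactly.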