cakeify-v0 open-source video generation model: turn everyday objects into creative cake videos for free

Cakeify V0

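Developed by finetrainers

A video generation model finetuned from THUDM/CogVideoX-5b on the cakeify-smol dataset, focused on generating creative videos in which everyday objects turn into cakes.

Task: text-to-video
License: other
Tags: #object-to-cake video generation #hyper-realistic prop transformation #creative visual effects

Downloads: 24

Released: January 22, 2025

Model Overview

The model generates videos from text prompts and is specialized for showing ordinary objects transforming into hyper-realistic cakes.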

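Model Features

Creative object transformation

Turns everyday objects (such as teacups or bars of soap) into hyper-realistic cakes and shows the transformation as it happens

Video generation

Generates 768×512 videos of 81 frames, the settings used in the inference example

LoRA support

A rank-64 LoRA variant extracted from the finetuned checkpoint is provided; it approximates the same effect while reducing the size of the weights that need to be stored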
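
To give a sense of why a rank-64 LoRA is lighter than a full set of finetuned weights: for each adapted weight matrix of shape d_out × d_in, a LoRA stores two low-rank factors with rank × (d_in + d_out) parameters in total. The sketch below compares the two counts for a single square layer. The width of 3072 is an assumed value used only for illustration, and the function names are hypothetical.

```python
def full_param_count(d_in: int, d_out: int) -> int:
    """Parameters in a dense weight matrix of shape (d_out, d_in)."""
    return d_in * d_out


def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Parameters in the two LoRA factors: A is (rank, d_in), B is (d_out, rank)."""
    return rank * (d_in + d_out)


width = 3072  # assumed layer width, for illustration only
rank = 64     # rank of the LoRA variant described above

full = full_param_count(width, width)
lora = lora_param_count(width, width, rank)
print(f"full: {full}, lora: {lora}, ratio: {full // lora}x")
# full: 9437184, lora: 393216, ratio: 24x
```

The ratio applies per adapted layer under these assumptions; the overall saving depends on which layers the LoRA covers.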

Model Capabilities

Text-to-video generation

Creative visual transformation

768×512 video output

Style-specific video generation

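Use Cases

Creative content production

Object-to-cake videos

Creative videos in which everyday objects such as teacups or bars of soap turn into cakes

Generates a video covering the whole sequence: the object is cut open, the cake interior is revealed, and the transformation completes

Advertising creative

Eye-catching advertising videos for food or creative products

Produces unexpected product reveals

Social media content

Short-video ideas

Creative short videos of about 3-5 seconds for social media platforms (the inference example yields 81 frames at 25 fps, about 3.2 seconds)

Produces visual content aimed at high engagement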
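
The clip length follows directly from the frame count and the export frame rate. As a quick check, the sketch below computes the duration for the values used in this page's inference example (`num_frames=81`, `fps=25`); the helper name is hypothetical.

```python
def clip_duration_seconds(num_frames: int, fps: int) -> float:
    """Duration of a clip with `num_frames` frames played back at `fps`."""
    if num_frames <= 0 or fps <= 0:
        raise ValueError("num_frames and fps must be positive")
    return num_frames / fps


duration = clip_duration_seconds(num_frames=81, fps=25)
print(f"{duration:.2f} s")  # 3.24 s
```

Exporting the same 81 frames at a lower frame rate would lengthen the clip at the cost of slower apparent motion.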

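🚀 Cakeify finetuned video model

This project is a finetune of the THUDM/CogVideoX-5b model on the finetrainers/cakeify-smol dataset. It generates creative videos in which everyday objects turn into realistic prop cakes. A LoRA variant of the parameters is also provided; see the LoRA section of this page.

Code repository: https://github.com/a-r-r-o-w/finetrainers

🚀 Quick Start

Inference example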

from diffusers import CogVideoXTransformer3DModel, DiffusionPipeline
from diffusers.utils import export_to_video
import torch

# Load the finetuned transformer weights from the cakeify-v0 repository.
transformer = CogVideoXTransformer3DModel.from_pretrained(
    "finetrainers/cakeify-v0", torch_dtype=torch.bfloat16
)
# Build the base CogVideoX-5b pipeline with the finetuned transformer swapped in.
pipeline = DiffusionPipeline.from_pretrained(
    "THUDM/CogVideoX-5b", transformer=transformer, torch_dtype=torch.bfloat16
).to("cuda")

prompt = """
PIKA_CAKEIFY On a gleaming glass display stand, a sleek black purse quietly commands attention. Suddenly, a knife appears and slices through the purse, revealing a fluffy vanilla sponge at its core. Immediately, it turns into a hyper-realistic prop cake, delighting the senses with its playful juxtaposition of the everyday and the extraordinary.
"""
negative_prompt = "inconsistent motion, blurry motion, worse quality, degenerate outputs, deformed outputs"

# Generate 81 frames at 768x512 and take the first video in the returned batch.
video = pipeline(
    prompt=prompt,
    negative_prompt=negative_prompt,
    num_frames=81,
    height=512,
    width=768,
    num_inference_steps=50,
).frames[0]
# Write the frames to an MP4 file at 25 frames per second.
export_to_video(video, "output.mp4", fps=25)
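
The example prompt opens with the token PIKA_CAKEIFY and then describes, in order, the object on display, the knife cut, the sponge revealed inside, and the final prop cake. Assuming the token works as a trigger phrase for the finetuned behavior (this page does not say so explicitly), prompts for other objects could be assembled with a small helper like the one below. The function name and its parameters are hypothetical, introduced only for illustration.

```python
TRIGGER = "PIKA_CAKEIFY"  # token that opens the example prompt on this page


def build_cakeify_prompt(
    setting: str,
    description: str,
    noun: str,
    filling: str = "a fluffy vanilla sponge",
) -> str:
    """Fill the structure of the example prompt with a different object.

    setting:     where the object sits, e.g. "On a gleaming glass display stand"
    description: the object with its article, e.g. "a sleek black purse"
    noun:        the bare object name used for the cut, e.g. "purse"
    filling:     what the knife reveals inside
    """
    return (
        f"{TRIGGER} {setting}, {description} quietly commands attention. "
        f"Suddenly, a knife appears and slices through the {noun}, "
        f"revealing {filling} at its core. "
        "Immediately, it turns into a hyper-realistic prop cake, delighting the senses "
        "with its playful juxtaposition of the everyday and the extraordinary."
    )


prompt = build_cakeify_prompt(
    setting="On a polished wooden table",
    description="a white porcelain teacup",
    noun="teacup",
)
print(prompt)
```

The resulting string can be passed as `prompt` to the pipeline call in place of the hand-written one. Keeping the object name consistent between the description and the cut avoids describing two different objects in one prompt.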

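Training logs are available on WandB.

✨ Key Features

Creative video generation: shows everyday objects turning into realistic prop cakes on video.
LoRA variant: the finetuned parameters are also provided as a LoRA that can be loaded onto the base model.

📦 Model Information

Property	Details
Base model	THUDM/CogVideoX-5b
Training dataset	finetrainers/cakeify-smol
Library	diffusers
License	other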

💻 Usage Examples

Basic Usage

The inference example under Quick Start above is the basic usage example.

Advanced Usage

To use the LoRA variant instead of the full finetuned transformer, see the code in the LoRA section below.

🔧 LoRA

We extracted a rank-64 LoRA from the finetuned checkpoint using a script. This LoRA can be used to emulate the same kind of effect:

Code

from diffusers import DiffusionPipeline
from diffusers.utils import export_to_video
import torch

# Load the unmodified base pipeline, then apply the extracted rank-64 LoRA on top of it.
pipeline = DiffusionPipeline.from_pretrained("THUDM/CogVideoX-5b", torch_dtype=torch.bfloat16).to("cuda")
pipeline.load_lora_weights("finetrainers/cakeify-v0", weight_name="extracted_cakeify_lora_64.safetensors")

prompt = """
PIKA_CAKEIFY On a gleaming glass display stand, a sleek black purse quietly commands attention. Suddenly, a knife appears and slices through the purse, revealing a fluffy vanilla sponge at its core. Immediately, it turns into a hyper-realistic prop cake, delighting the senses with its playful juxtaposition of the everyday and the extraordinary.
"""
negative_prompt = "inconsistent motion, blurry motion, worse quality, degenerate outputs, deformed outputs"

# Same generation settings as the full finetuned model: 81 frames at 768x512.
video = pipeline(
    prompt=prompt,
    negative_prompt=negative_prompt,
    num_frames=81,
    height=512,
    width=768,
    num_inference_steps=50,
).frames[0]
export_to_video(video, "output_lora.mp4", fps=25)
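
When the LoRA is loaded this way, diffusers also allows the adapter to be named and its strength adjusted. The sketch below wraps the loading step in a hypothetical helper: `load_cakeify_lora_pipeline`, `lora_scale`, `low_vram`, and the adapter name `cakeify` are names introduced here, not part of the model card. It assumes a recent diffusers release with peft installed. The `low_vram` branch uses model CPU offload and VAE tiling, which trade speed for lower GPU memory use; whether that is needed depends on the hardware.

```python
def load_cakeify_lora_pipeline(lora_scale: float = 1.0, low_vram: bool = False):
    """Return a CogVideoX-5b pipeline with the cakeify LoRA applied (hypothetical helper)."""
    if lora_scale < 0:
        # Choice made for this sketch: reject negative strengths early.
        raise ValueError("lora_scale must be non-negative")

    # Imported lazily so the helper can be defined without the GPU stack present.
    import torch
    from diffusers import DiffusionPipeline

    pipeline = DiffusionPipeline.from_pretrained(
        "THUDM/CogVideoX-5b", torch_dtype=torch.bfloat16
    )
    pipeline.load_lora_weights(
        "finetrainers/cakeify-v0",
        weight_name="extracted_cakeify_lora_64.safetensors",
        adapter_name="cakeify",  # adapter name chosen for this sketch
    )
    # Scale the adapter's contribution; 1.0 applies it at full strength.
    pipeline.set_adapters(["cakeify"], [lora_scale])

    if low_vram:
        # Keep submodules on the CPU until they are needed, and decode the video in tiles.
        pipeline.enable_model_cpu_offload()
        pipeline.vae.enable_tiling()
    else:
        pipeline.to("cuda")
    return pipeline
```

A call such as `load_cakeify_lora_pipeline(lora_scale=0.8)` returns a pipeline that can be invoked with the same prompt and generation settings as in the code above. Lowering the scale is one way to soften the effect if the LoRA output looks overdone, though the best value is a matter of experiment.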