cakeify-v0: an open-source video generation model that turns everyday objects into creative cake videos, free to use

Cakeify V0

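Developed by finetrainers

A video generation model fine-tuned from THUDM/CogVideoX-5b on the cakeify-smol dataset, focused on creative videos in which everyday objects turn into cakes

Text-to-video · License: other · #object-to-cake video generation #hyper-realistic prop transformation #creative visual effects

Downloads: 24

Released: 1/22/2025

Model overview

The model generates video from a text prompt and is tuned for one effect: an ordinary object transforming into a hyper-realistic cake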

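Model features

Creative object transformation

Turns everyday objects (such as teacups and bars of soap) into hyper-realistic cakes and shows the transformation as it happens

High-quality video generation

Generates 81-frame videos at 768×512 resolution

LoRA support

A rank-64 LoRA variant is provided as a lighter-weight way to reproduce the same effect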

Model capabilities

Text-to-video generation

Creative visual transformation

High-resolution video output

Style-specific video generation

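Use cases

Creative content production

Object-to-cake videos

Creative videos that turn everyday objects such as teacups and bars of soap into cakes

Generates the full sequence: the object is cut open, the cake inside is revealed, and the transformation completes

Advertising creative

Eye-catching advertising videos for food or creative products

Produces unexpected product reveals

Social media content

Short-video ideas

3-5 second creative short videos for social media platforms

Visual content designed for high engagement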
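
The 3-5 second figure follows from the frame count and the export frame rate. Below is a minimal sketch of that arithmetic, using the `num_frames=81` and `fps=25` settings that appear in the inference code on this page. The helper names `clip_duration_seconds` and `fps_for_duration` are hypothetical and not part of diffusers.

```python
def clip_duration_seconds(num_frames: int, fps: float) -> float:
    """Playback length of a clip exported at a fixed frame rate."""
    if num_frames <= 0 or fps <= 0:
        raise ValueError("num_frames and fps must be positive")
    return num_frames / fps


def fps_for_duration(num_frames: int, seconds: float) -> float:
    """Frame rate at which num_frames frames play for the given duration."""
    if num_frames <= 0 or seconds <= 0:
        raise ValueError("num_frames and seconds must be positive")
    return num_frames / seconds


if __name__ == "__main__":
    # Settings taken from the inference example: 81 frames exported at 25 fps.
    print(f"{clip_duration_seconds(81, 25):.2f} s")
    # Frame rate needed to stretch the same 81 frames to 5 seconds.
    print(f"{fps_for_duration(81, 5):.1f} fps")
```

With the settings used on this page, 81 frames at 25 fps play for about 3.24 seconds, at the low end of the 3-5 second range.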

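🚀 Cakeify fine-tuned video model

This project is a fine-tune of THUDM/CogVideoX-5b on the finetrainers/cakeify-smol dataset. It generates creative videos in which everyday objects turn into realistic prop cakes. A LoRA variant of the weights is also provided; see here for details.

Code repository: https://github.com/a-r-r-o-w/finetrainers

🚀 Quick start

Inference code example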

from diffusers import CogVideoXTransformer3DModel, DiffusionPipeline
from diffusers.utils import export_to_video
import torch

# Load the fine-tuned transformer, then swap it into the base CogVideoX-5b pipeline.
transformer = CogVideoXTransformer3DModel.from_pretrained(
    "finetrainers/cakeify-v0", torch_dtype=torch.bfloat16
)
pipeline = DiffusionPipeline.from_pretrained(
    "THUDM/CogVideoX-5b", transformer=transformer, torch_dtype=torch.bfloat16
).to("cuda")

# The example prompt begins with the PIKA_CAKEIFY token.
prompt = """
PIKA_CAKEIFY On a gleaming glass display stand, a sleek black purse quietly commands attention. Suddenly, a knife appears and slices through the purse, revealing a fluffy vanilla sponge at its core. Immediately, it turns into a hyper-realistic prop cake, delighting the senses with its playful juxtaposition of the everyday and the extraordinary.
"""
negative_prompt = "inconsistent motion, blurry motion, worse quality, degenerate outputs, deformed outputs"

# Generate 81 frames at 768x512 and export them at 25 fps.
video = pipeline(
    prompt=prompt,
    negative_prompt=negative_prompt,
    num_frames=81,
    height=512,
    width=768,
    num_inference_steps=50
).frames[0]
export_to_video(video, "output.mp4", fps=25)
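
The example prompt follows a recognizable pattern: the `PIKA_CAKEIFY` token, a scene containing the object, a knife cutting into it, and the reveal as a prop cake. The sketch below turns that pattern into a template for other objects. The function `build_cakeify_prompt` and its parameters are hypothetical, and treating `PIKA_CAKEIFY` as a required trigger token is an assumption based only on its position in the example prompt.

```python
# Assumption: the token is copied from the start of the example prompt.
TRIGGER_TOKEN = "PIKA_CAKEIFY"


def build_cakeify_prompt(scene: str, subject: str, noun: str, filling: str) -> str:
    """Fill the example prompt's structure with a different object.

    scene:   where the object sits, e.g. "On a gleaming glass display stand"
    subject: full description of the object, e.g. "a sleek black purse"
    noun:    short name used when the knife cuts it, e.g. "purse"
    filling: what the cut reveals, e.g. "a fluffy vanilla sponge"
    """
    return (
        f"{TRIGGER_TOKEN} {scene}, {subject} quietly commands attention. "
        f"Suddenly, a knife appears and slices through the {noun}, "
        f"revealing {filling} at its core. "
        "Immediately, it turns into a hyper-realistic prop cake."
    )


prompt = build_cakeify_prompt(
    scene="On a rustic wooden table",
    subject="a white ceramic teacup",
    noun="teacup",
    filling="a moist chocolate sponge",
)
print(prompt)
```

The resulting string can be passed as `prompt` to the pipeline call in place of the purse example. Using one `noun` for both the description and the cut keeps the object consistent throughout the prompt.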

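Training logs are available on WandB here.

✨ Key features

Creative video generation: generates videos in which everyday objects turn into realistic prop cakes.
LoRA variant support: a LoRA variant of the weights is provided for use with the base model.

📦 Model information

Property	Details
Base model	THUDM/CogVideoX-5b
Training dataset	finetrainers/cakeify-smol
Library	diffusers
License	other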

💻 Usage examples

Basic usage

The inference code shown earlier is the basic usage example.

Advanced usage

# Example: using the LoRA variant
from diffusers import DiffusionPipeline
from diffusers.utils import export_to_video
import torch

# Load the unmodified base pipeline, then apply the extracted rank-64 LoRA on top of it.
pipeline = DiffusionPipeline.from_pretrained("THUDM/CogVideoX-5b", torch_dtype=torch.bfloat16).to("cuda")
pipeline.load_lora_weights("finetrainers/cakeify-v0", weight_name="extracted_cakeify_lora_64.safetensors")

prompt = """
PIKA_CAKEIFY On a gleaming glass display stand, a sleek black purse quietly commands attention. Suddenly, a knife appears and slices through the purse, revealing a fluffy vanilla sponge at its core. Immediately, it turns into a hyper-realistic prop cake, delighting the senses with its playful juxtaposition of the everyday and the extraordinary.
"""
negative_prompt = "inconsistent motion, blurry motion, worse quality, degenerate outputs, deformed outputs"

# Same generation settings as the basic example.
video = pipeline(
    prompt=prompt,
    negative_prompt=negative_prompt,
    num_frames=81,
    height=512,
    width=768,
    num_inference_steps=50
).frames[0]
export_to_video(video, "output_lora.mp4", fps=25)

🔧 LoRA

We extracted a rank-64 LoRA from the fine-tuned checkpoint (script here). This LoRA can be used to reproduce the same effect; the code is the same as the LoRA example shown earlier.
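
To give a sense of why a rank-64 LoRA is much smaller than the weights it approximates, the sketch below counts parameters for a single linear layer. A LoRA of rank r stands in for a d_out × d_in weight update using two factors of shapes d_out × r and r × d_in. The 3072 × 3072 layer size is an illustrative assumption, not a figure taken from this page, and `lora_param_count` and `full_param_count` are hypothetical helpers.

```python
def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Parameters in the two low-rank factors: (d_out x rank) + (rank x d_in)."""
    return rank * (d_in + d_out)


def full_param_count(d_in: int, d_out: int) -> int:
    """Parameters in a dense weight matrix (bias ignored)."""
    return d_in * d_out


# Illustrative layer size (assumption), combined with the rank of 64 stated above.
d_in = d_out = 3072
rank = 64

lora = lora_param_count(d_in, d_out, rank)
full = full_param_count(d_in, d_out)
print(f"LoRA: {lora:,}  dense: {full:,}  ratio: {lora / full:.2%}")
```

For this assumed layer, the two LoRA factors hold roughly 4% of the parameters of the dense matrix. The actual saving for the released file depends on which layers the extraction script covers.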