🚀 CogVideoX-5b Fine-tuned Model
This project is a fine-tuned version of the THUDM/CogVideoX-5b model on the finetrainers/crush-smol dataset. It provides a LoRA variant of the parameters and offers text-to-video generation capabilities.
✨ Features
- Fine-tuned Model: Based on the THUDM/CogVideoX-5b model and fine-tuned on the finetrainers/crush-smol dataset.
- LoRA Variant: A 64-rank LoRA is extracted from the fine-tuned checkpoint, which can be used to emulate similar effects.
- Text-to-Video Generation: Generate videos based on text prompts.
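How the 64-rank LoRA was extracted is not documented here. One common way to pull a low-rank adapter out of a fine-tuned checkpoint is a truncated SVD of the weight difference, and the sketch below illustrates that idea on a single synthetic matrix with NumPy. The function name `extract_lora`, the matrix shapes, and the rank of 8 are illustrative assumptions, not the procedure used for this checkpoint.

```python
import numpy as np


def extract_lora(w_base, w_tuned, rank):
    """Approximate (w_tuned - w_base) by a rank-`rank` product lora_up @ lora_down.

    Hypothetical helper for illustration only; not part of diffusers.
    """
    delta = w_tuned - w_base
    # Truncated SVD gives the best rank-r approximation in the Frobenius norm.
    u, s, vt = np.linalg.svd(delta, full_matrices=False)
    rank = min(rank, s.shape[0])
    sqrt_s = np.sqrt(s[:rank])
    lora_up = u[:, :rank] * sqrt_s            # shape (out_features, rank)
    lora_down = sqrt_s[:, None] * vt[:rank]   # shape (rank, in_features)
    return lora_down, lora_up


rng = np.random.default_rng(0)
w_base = rng.standard_normal((128, 96))

# Synthetic "fine-tuned" weight: the base plus an update of exact rank 8.
true_update = rng.standard_normal((128, 8)) @ rng.standard_normal((8, 96))
w_tuned = w_base + true_update

lora_down, lora_up = extract_lora(w_base, w_tuned, rank=8)

# Applying the adapter to the base weight emulates the fine-tuned weight.
w_emulated = w_base + lora_up @ lora_down
max_err = float(np.abs(w_emulated - w_tuned).max())
print("max abs error:", max_err)
```

In a real checkpoint the weight difference is rarely exactly low-rank, so the extracted adapter only approximates the fine-tuned weights. That is consistent with the wording above that the LoRA can "emulate similar effects" rather than reproduce the full fine-tune.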
📦 Installation
Installation depends on the diffusers library. You can install it using the following command:
pip install diffusers
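Before running the examples, it can help to confirm that the packages they rely on are importable. The sketch below uses only the standard library. `missing_packages` is a hypothetical helper, and listing `transformers` is an assumption (the usage examples import only `torch` and `diffusers` directly).

```python
import importlib.util


def missing_packages(names):
    """Return the top-level module names from `names` that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# torch and diffusers are imported by the usage examples;
# transformers is assumed to be needed for the pipeline's text encoder.
REQUIRED = ["torch", "diffusers", "transformers"]

missing = missing_packages(REQUIRED)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are importable.")
```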
💻 Usage Examples
Basic Usage
from diffusers import CogVideoXTransformer3DModel, DiffusionPipeline
from diffusers.utils import export_to_video
import torch
# Load the fine-tuned transformer weights.
transformer = CogVideoXTransformer3DModel.from_pretrained(
    "finetrainers/crush-smol-v0", torch_dtype=torch.bfloat16
)
# Build the base pipeline, swapping in the fine-tuned transformer.
pipeline = DiffusionPipeline.from_pretrained(
    "THUDM/CogVideoX-5b", transformer=transformer, torch_dtype=torch.bfloat16
).to("cuda")
prompt = """
DIFF_crush A thick burger is placed on a dining table, and a large metal cylinder descends from above, crushing the burger as if it were under a hydraulic press. The burger is crushed, leaving a pile of debris around it.
"""
negative_prompt = "inconsistent motion, blurry motion, worse quality, degenerate outputs, deformed outputs"
video = pipeline(
    prompt=prompt,
    negative_prompt=negative_prompt,
    num_frames=81,
    height=512,
    width=768,
    num_inference_steps=50,
).frames[0]
export_to_video(video, "output.mp4", fps=25)
Advanced Usage (Using LoRA)
from diffusers import DiffusionPipeline
from diffusers.utils import export_to_video
import torch
pipeline = DiffusionPipeline.from_pretrained("THUDM/CogVideoX-5b", torch_dtype=torch.bfloat16).to("cuda")
# Load the extracted rank-64 LoRA on top of the base model.
pipeline.load_lora_weights("finetrainers/crush-smol-v0", weight_name="extracted_crush_smol_lora_64.safetensors")
prompt = """
DIFF_crush A thick burger is placed on a dining table, and a large metal cylinder descends from above, crushing the burger as if it were under a hydraulic press. The burger is crushed, leaving a pile of debris around it.
"""
negative_prompt = "inconsistent motion, blurry motion, worse quality, degenerate outputs, deformed outputs"
video = pipeline(
    prompt=prompt,
    negative_prompt=negative_prompt,
    num_frames=81,
    height=512,
    width=768,
    num_inference_steps=50,
).frames[0]
export_to_video(video, "output_lora.mp4", fps=25)
📄 License
This project is licensed under the license.
⚠️ Important Note
This is an experimental checkpoint, and it is known to generalize poorly.