Pika Dissolve Fine-tuned Model
This project provides a fine-tuned version of THUDM/CogVideoX-5b, trained on the modal-labs/dissolve dataset. It generates high-quality videos from text prompts, with a focus on the "dissolve" effect.
Quick Start
Prerequisites
Make sure you have the necessary libraries installed, such as diffusers and torch.
Inference Code
from diffusers import CogVideoXTransformer3DModel, DiffusionPipeline
from diffusers.utils import export_to_video
import torch

# Load the fine-tuned transformer weights in bfloat16.
transformer = CogVideoXTransformer3DModel.from_pretrained(
    "sayakpaul/pika-dissolve-v0", torch_dtype=torch.bfloat16
)

# Swap the fine-tuned transformer into the base CogVideoX-5b pipeline.
pipeline = DiffusionPipeline.from_pretrained(
    "THUDM/CogVideoX-5b", transformer=transformer, torch_dtype=torch.bfloat16
).to("cuda")

prompt = """
PIKA_DISSOLVE A slender glass vase, brimming with tiny white pebbles, stands centered on a polished ebony dais. Without warning, the glass begins to dissolve from the edges inward. Wisps of translucent dust swirl upward in an elegant spiral, illuminating each pebble as they drop onto the dais. The gently drifting dust eventually settles, leaving only the scattered stones and faint traces of shimmering powder on the stage.
"""
negative_prompt = "inconsistent motion, blurry motion, worse quality, degenerate outputs, deformed outputs"
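
# Generate 81 frames at 768x512 and keep the first video in the returned batch.
video = pipeline(
    prompt=prompt,
    negative_prompt=negative_prompt,
    num_frames=81,
    height=512,
    width=768,
    num_inference_steps=50,
).frames[0]

# Write the frames to an MP4 file at 25 frames per second.
export_to_video(video, "output_vase.mp4", fps=25)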
Features
- Text-to-Video Generation: Generate videos from text prompts, with a focus on the "dissolve" effect.
- Fine-tuned Model: Built on CogVideoX-5b and fine-tuned on the modal-labs/dissolve dataset.
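Installation
Installation consists of setting up the required Python libraries. You can install them using pip:
pip install diffusers torch
Depending on your environment, the CogVideoX pipeline may also need transformers and sentencepiece for its text encoder, and export_to_video may need imageio with imageio-ffmpeg to write MP4 files.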
Usage Examples
Basic Usage
The inference code in Quick Start shows the basic workflow: load the fine-tuned transformer, plug it into the CogVideoX-5b pipeline, and generate a video from a text prompt.
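Every example prompt in this card starts with the token PIKA_DISSOLVE, which suggests it acts as a trigger word for the fine-tuned effect. That reading is an assumption; the card does not state it explicitly. Under that assumption, a minimal sketch of a helper that adds the prefix to a plain scene description could look like this (TRIGGER and build_prompt are hypothetical names, not part of diffusers or the model):

```python
# Prefix shared by every example prompt in this card (assumed to be a trigger token).
TRIGGER = "PIKA_DISSOLVE"


def build_prompt(description: str) -> str:
    """Return the description prefixed with TRIGGER, without duplicating the prefix."""
    # Collapse newlines and repeated spaces so triple-quoted strings work too.
    text = " ".join(description.split())
    if not text:
        raise ValueError("description must not be empty")
    if text.startswith(TRIGGER):
        return text
    return f"{TRIGGER} {text}"


print(build_prompt("A clay tea cup on a pedestal slowly dissolves into fine dust."))
```

The returned string can be passed as the prompt argument of the pipeline call shown in Quick Start.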
Advanced Usage
You can adjust parameters such as num_frames, height, width, and num_inference_steps to customize the output video. For example, increasing num_frames makes the video longer at a fixed frame rate (the Quick Start settings of 81 frames at 25 fps give a clip of about 3.2 seconds), while num_inference_steps trades generation time against output quality: more steps generally improve quality but take longer.
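The link between num_frames and clip length is simple arithmetic: duration equals the frame count divided by the frame rate used at export. The sketch below keeps the Quick Start values as defaults and computes the resulting duration. DEFAULTS, make_settings, and clip_seconds are hypothetical helpers introduced only for illustration.

```python
# Values copied from the Quick Start pipeline call.
DEFAULTS = {"num_frames": 81, "height": 512, "width": 768, "num_inference_steps": 50}
FPS = 25  # frame rate passed to export_to_video in Quick Start


def make_settings(**overrides):
    """Return a copy of DEFAULTS with the given overrides applied."""
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unsupported settings: {sorted(unknown)}")
    return {**DEFAULTS, **overrides}


def clip_seconds(num_frames, fps=FPS):
    """Playback duration in seconds of a clip exported at a fixed frame rate."""
    return num_frames / fps


# Fewer denoising steps for a faster draft render; everything else unchanged.
settings = make_settings(num_inference_steps=30)
print(settings)
print(f"{clip_seconds(settings['num_frames']):.2f} s")  # 81 / 25 = 3.24 s
```

The resulting dictionary can be unpacked into the pipeline call, for example pipeline(prompt=prompt, negative_prompt=negative_prompt, **settings).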
License
This project uses the license specified in https://huggingface.co/THUDM/CogVideoX-5b/blob/main/LICENSE.
Information Table
| Property | Details |
| --- | --- |
| Model Type | Fine-tuned CogVideoX-5b |
| Training Data | modal-labs/dissolve |
Widget Examples
| Prompt | Output Video |
| --- | --- |
| PIKA_DISSOLVE A meticulously detailed, tea cup, sits centrally on a dark brown circular pedestal. The cup, seemingly made of clay, begins to dissolve from the bottom up. The disintegration process is rapid but not explosive, with a cloud of fine, light tan dust forming and rising in a swirling, almost ethereal column that expands outwards before slowly descending. The dust particles are individually visible as they float, and the overall effect is one of delicate disintegration rather than shattering. Finally, only the empty pedestal and the intricately patterned marble floor remain. | output_cup.mp4 |
| PIKA_DISSOLVE Resting quietly atop an ancient stone altar, a delicately carved wooden mask starts to crumble from its outer edges. The intricate patterns crack and give way, releasing a fine, smoke-like plume of mahogany-hued particles that dance upwards, then disperse gradually into the hushed atmosphere. As the dust descends, the once captivating mask is reduced to an outline on the weathered altar. | output_altar.mp4 |
| PIKA_DISSOLVE A slender glass vase, brimming with tiny white pebbles, stands centered on a polished ebony dais. Without warning, the glass begins to dissolve from the edges inward. Wisps of translucent dust swirl upward in an elegant spiral, illuminating each pebble as they drop onto the dais. The gently drifting dust eventually settles, leaving only the scattered stones and faint traces of shimmering powder on the stage. | output_vase.mp4 |
| PIKA_DISSOLVE On a narrow marble ledge, a gracefully folded paper crane rests, its surface marked by delicate ink lines. It starts to fragment from the tail feathers outward, releasing a cloud of feather-light pulp fibers. Suspended for a moment in a magical swirl, the fibers drift back down, cloaking the ledge in a near-transparent veil of white. Then the ledge stands empty, the crane's faint silhouette lingering in memory. | output_marble.mp4 |