Ltxvideo-Disney Open Source Model - Freely Deploy and Generate Black-and-White Disney-Style Video Content

Ltxvideo Disney

Developed by bghira

A LyCORIS adapter trained on Lightricks/LTX-Video that specializes in generating black-and-white Disney-style video content.

Text-to-Video | Open Source | License: Other | #Black and white Disney style #LyCORIS adapter #Flow matching video generation

Downloads 18

Release Time: 3/21/2025

Model Overview

This model is a text-to-video adapter, particularly adept at generating black and white Disney scenes in the style of 'Steamboat Willie'.

Model Features

Black and white Disney style

Especially skilled at generating black and white Disney scenes in the style of 'Steamboat Willie'.

LyCORIS adapter

A LyCORIS adapter trained on Lightricks/LTX-Video, so the style can be applied without retraining the full base model.

Flow matching prediction

Trained with the flow-matching prediction type; validation used the FlowMatchEulerDiscreteScheduler sampler.

Model Capabilities

Text-to-video

Image-to-video

Video-to-video

Use Cases

Creative content generation

Black and white animation style video creation

Generate animated video content with a retro black and white Disney style.

Examples showcase black and white Disney scenes in the style of 'Steamboat Willie'.

Anime-style action scenes

Generate videos of anime characters performing actions in urban environments.

Examples showcase smooth movements of anime protagonists in a neon-lit midnight city.

🚀 ltxvideo-disney

This project is a LyCORIS adapter derived from Lightricks/LTX-Video. It is designed to generate black-and-white Disney-style video scenes for text-to-video tasks.

🚀 Quick Start

This is a LyCORIS adapter derived from Lightricks/LTX-Video.

The main validation prompt used during training was:

A black and white disney scene in the style of Steamboat Willie

✨ Features

Disney-Style Generation: Generates black-and-white Disney-style video scenes in the style of Steamboat Willie.
Customizable Inference: Lets users adjust inference parameters such as CFG, steps, and sampler.

📦 Installation

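The original README does not list installation steps. The usage example below imports torch, diffusers, lycoris, huggingface_hub, and optimum.quanto, so those packages must be available in your Python environment.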

💻 Usage Examples

Basic Usage

import torch
from diffusers import DiffusionPipeline
from lycoris import create_lycoris_from_weights


def download_adapter(repo_id: str):
    import os
    from huggingface_hub import hf_hub_download
    adapter_filename = "pytorch_lora_weights.safetensors"
    cache_dir = os.environ.get('HF_PATH', os.path.expanduser('~/.cache/huggingface/hub/models'))
    cleaned_adapter_path = repo_id.replace("/", "_").replace("\\", "_").replace(":", "_")
    path_to_adapter = os.path.join(cache_dir, cleaned_adapter_path)
    path_to_adapter_file = os.path.join(path_to_adapter, adapter_filename)
    os.makedirs(path_to_adapter, exist_ok=True)
    hf_hub_download(
        repo_id=repo_id, filename=adapter_filename, local_dir=path_to_adapter
    )

    return path_to_adapter_file
    
model_id = 'Lightricks/LTX-Video'
adapter_repo_id = 'bghira/ltxvideo-disney'
adapter_filename = 'pytorch_lora_weights.safetensors'
adapter_file_path = download_adapter(repo_id=adapter_repo_id)
pipeline = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.bfloat16) # loading directly in bf16
lora_scale = 1.0
wrapper, _ = create_lycoris_from_weights(lora_scale, adapter_file_path, pipeline.transformer)
wrapper.merge_to()

prompt = "A black and white disney scene in the style of Steamboat Willie"
negative_prompt = 'ugly, cropped, blurry, low-quality, mediocre average'

# Optional: quantise the model to save on VRAM.
# Note: The model was quantised during training, so it is recommended to do the same at inference time.
from optimum.quanto import quantize, freeze, qint8
quantize(pipeline.transformer, weights=qint8)
freeze(pipeline.transformer)
    
# Pick the best available device once and reuse it for the pipeline and the random generator.
device = 'cuda' if torch.cuda.is_available() else 'mps' if torch.backends.mps.is_available() else 'cpu'
pipeline.to(device) # the pipeline is already in its target precision level
model_output = pipeline(
    prompt=prompt,
    negative_prompt=negative_prompt,
    num_inference_steps=25,
    generator=torch.Generator(device=device).manual_seed(42),
    width=768,
    height=512,
    guidance_scale=3.8,
).frames[0]

from diffusers.utils.export_utils import export_to_gif
export_to_gif(model_output, "output.gif", fps=25)

📚 Documentation

Validation settings

Property	Details
CFG	`3.8`
CFG Rescale	`0.0`
Steps	`25`
Sampler	`FlowMatchEulerDiscreteScheduler`
Seed	`42`
Resolution	`768x512`

Note: The validation settings are not necessarily the same as the training settings.

The text encoder was not trained. You may reuse the base model text encoder for inference.

Training settings

Property	Details
Training epochs	2666
Training steps	8000
Learning rate	5e-05
Learning rate schedule	cosine
Warmup steps	400000
Max grad value	0.0
Effective batch size	24
Micro-batch size	8
Gradient accumulation steps	1
Number of GPUs	3
Gradient checkpointing	True
Prediction type	flow-matching (extra parameters=['training_scheduler_timestep_spacing=trailing', 'inference_scheduler_timestep_spacing=trailing'])
Optimizer	adamw_bf16
Trainable parameter precision	Pure BF16
Base model precision	`int8-quanto`
Caption dropout probability	10.0%
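
The batch-size and epoch figures in the table can be cross-checked with a little arithmetic. The sketch below assumes that the effective batch size is the micro-batch size times the gradient accumulation steps times the GPU count, and that one epoch takes ceil(images / effective batch size) optimizer steps. Both are assumptions about how the trainer counts, and the dataset size of 69 is the approximate figure listed under Datasets.

```python
import math

# Values copied from the training-settings and dataset tables.
micro_batch_size = 8
gradient_accumulation_steps = 1
num_gpus = 3
training_steps = 8000
dataset_images = 69  # listed as "~69", so treat as approximate

# Assumption: effective batch = micro-batch x accumulation steps x GPUs.
effective_batch_size = micro_batch_size * gradient_accumulation_steps * num_gpus

# Assumption: a partial final batch still counts as one step.
steps_per_epoch = math.ceil(dataset_images / effective_batch_size)

# Completed epochs after the full run.
epochs = training_steps // steps_per_epoch

print(effective_batch_size, steps_per_epoch, epochs)
```

Under these assumptions the values come out as 24, 3, and 2666, which is consistent with the effective batch size and epoch count in the table.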

LyCORIS Config:

{
    "bypass_mode": true,
    "algo": "lokr",
    "multiplier": 1.0,
    "full_matrix": true,
    "linear_dim": 10000,
    "linear_alpha": 1,
    "factor": 4,
    "apply_preset": {
        "target_module": [
            "Attention",
            "FeedForward"
        ],
        "module_algo_map": {
            "FeedForward": {
                "factor": 4
            },
            "Attention": {
                "factor": 2
            }
        }
    }
}
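
The `lokr` algorithm represents each weight update as a Kronecker product of two smaller matrices. The following is a minimal pure-Python sketch of that idea, not LyCORIS's implementation. The `kron` and `lokr_param_count` helpers and the 2048x2048 layer size are hypothetical, and the sketch assumes each dimension splits cleanly into `factor` times `dim / factor`, with both Kronecker factors stored as full matrices.

```python
def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    return [
        [a_val * b_val for a_val in a_row for b_val in b_row]
        for a_row in a
        for b_row in b
    ]


def lokr_param_count(out_dim, in_dim, factor):
    """Parameters in the two Kronecker factors (hypothetical helper).

    Assumes out_dim and in_dim are both divisible by factor.
    """
    small = factor * factor
    large = (out_dim // factor) * (in_dim // factor)
    return small + large


# Tiny worked example: two 2x2 factors give a 4x4 update.
w1 = [[1, 2], [3, 4]]
w2 = [[0, 1], [1, 0]]
delta_w = kron(w1, w2)
for row in delta_w:
    print(row)

# Hypothetical 2048x2048 linear layer, using the factors from the config.
print("full matrix:", 2048 * 2048)
print("factor=4 (FeedForward):", lokr_param_count(2048, 2048, 4))
print("factor=2 (Attention):", lokr_param_count(2048, 2048, 2))
```

Under these assumptions, a smaller factor leaves the larger Kronecker factor bigger, so for a layer of the same shape the Attention setting (factor 2) carries more trainable parameters than the FeedForward setting (factor 4).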

Datasets

disney-black-and-white

Property	Details
Repeats	0
Total number of images	~69
Total number of aspect buckets	1
Resolution	0.2304 megapixels
Cropped	False
Crop style	None
Crop aspect	None
Used for regularisation data	No
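
The dataset resolution is given as a pixel area rather than as a width and height. As a rough illustration, the snippet below converts frame sizes to megapixels. The `megapixels` helper is hypothetical, and 480x480 is only one example size whose area equals 0.2304 megapixels, not a size stated in the source.

```python
def megapixels(width: int, height: int) -> float:
    """Pixel area expressed in millions of pixels."""
    return width * height / 1_000_000


# Validation resolution from the validation settings table.
print(megapixels(768, 512))

# One example frame size with the dataset's listed area (an assumption).
print(megapixels(480, 480))
```

The validation resolution of 768x512 works out to roughly 0.39 megapixels, which is larger than the 0.2304-megapixel training area.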

🔧 Technical Details

Exponential Moving Average (EMA)

SimpleTuner generates a safetensors variant of the EMA weights and a pt file.

The safetensors file is intended to be used for inference, and the pt file is for continuing finetuning.

The EMA model may provide a more well-rounded result, but it will typically feel undertrained compared to the full model, because it is a running decayed average of the model weights.
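
To illustrate why a running decayed average lags behind the latest weights, here is a minimal sketch of an EMA update on a single scalar weight. The `ema_update` helper and the decay of 0.5 are hypothetical choices made for readability. Real EMA decays are typically much closer to 1, and this is not SimpleTuner's code.

```python
def ema_update(ema: float, weight: float, decay: float) -> float:
    """Blend the previous average with the newest weight."""
    return decay * ema + (1.0 - decay) * weight


# A weight that jumps from 0.0 to 1.0 and then stays there.
weights = [0.0, 1.0, 1.0, 1.0]

ema = weights[0]  # start the average at the initial weight
history = [ema]
for w in weights[1:]:
    ema = ema_update(ema, w, decay=0.5)
    history.append(ema)

print(history)  # [0.0, 0.5, 0.75, 0.875]
```

The average approaches the final weight of 1.0 but does not catch up within the run, which matches the undertrained feel described above.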

📄 License

The license is listed as Other.
