🚀 Diffusers - Text-to-Video
AnimateDiff is a method that lets you generate videos with pre-existing Stable Diffusion text-to-image models. It inserts motion module layers into a frozen text-to-image model and trains them on video clips to extract a motion prior. These motion modules are placed after the ResNet and Attention blocks in the Stable Diffusion UNet and introduce coherent motion across image frames. To support them, the concepts of a MotionAdapter and a UNetMotionModel are introduced, providing a convenient way to use these motion modules with existing Stable Diffusion models.
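As a rough illustration of how these pieces fit together, the sketch below combines a standard Stable Diffusion UNet with a MotionAdapter to build a UNetMotionModel. This is only a sketch: exact argument names such as motion_adapter may vary across diffusers versions, and in practice the AnimateDiffPipeline shown in the usage example below performs this wiring for you.

from diffusers import MotionAdapter, UNet2DConditionModel, UNetMotionModel

# Motion module weights trained on video clips
adapter = MotionAdapter.from_pretrained("guoyww/animatediff-motion-adapter-v1-5-3")

# Frozen text-to-image UNet from an existing Stable Diffusion checkpoint
unet2d = UNet2DConditionModel.from_pretrained(
    "SG161222/Realistic_Vision_V5.1_noVAE", subfolder="unet"
)

# Insert the motion modules into the UNet's blocks, producing a UNet
# that operates on batches of video frames
motion_unet = UNetMotionModel.from_unet2d(unet2d, motion_adapter=adapter)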
✨ Features
- Leverage Existing Models: Utilize pre-trained Stable Diffusion text-to-image models to generate videos.
- Motion Modules: Insert motion module layers to introduce coherent motion across frames.
- Convenient Integration: Use MotionAdapter and UNetMotionModel to easily integrate motion modules with existing models.
📦 Installation
AnimateDiff is part of the 🤗 Diffusers library. Installing diffusers together with torch, transformers, and accelerate (for example, pip install diffusers transformers accelerate) is typically all that is needed to run the examples below.
💻 Usage Examples
Basic Usage
import torch
from diffusers import AnimateDiffPipeline, EulerAncestralDiscreteScheduler, MotionAdapter
from diffusers.utils import export_to_gif

# Load the motion adapter that holds the pretrained motion module weights
adapter = MotionAdapter.from_pretrained("guoyww/animatediff-motion-adapter-v1-5-3")

# Load a Stable Diffusion text-to-image checkpoint into an AnimateDiff pipeline
model_id = "SG161222/Realistic_Vision_V5.1_noVAE"
pipe = AnimateDiffPipeline.from_pretrained(model_id, motion_adapter=adapter)

# Use a scheduler with a linear beta schedule, loaded from the model's scheduler config
scheduler = EulerAncestralDiscreteScheduler.from_pretrained(
    model_id,
    subfolder="scheduler",
    beta_schedule="linear",
)
pipe.scheduler = scheduler

# Memory-saving features: decode the VAE in slices and offload idle submodules to the CPU
pipe.enable_vae_slicing()
pipe.enable_model_cpu_offload()

# Generate 16 frames from the text prompt
output = pipe(
    prompt=(
        "masterpiece, bestquality, highlydetailed, ultradetailed, sunset, "
        "orange sky, warm lighting, fishing boats, ocean waves seagulls, "
        "rippling water, wharf, silhouette, serene atmosphere, dusk, evening glow, "
        "golden hour, coastal landscape, seaside scenery"
    ),
    negative_prompt="bad quality, worse quality",
    num_frames=16,
    guidance_scale=7.5,
    num_inference_steps=25,
    generator=torch.Generator("cpu").manual_seed(42),
)

# Save the generated frames as an animated GIF
frames = output.frames[0]
export_to_gif(frames, "animation.gif")
📚 Documentation
AnimateDiff allows you to generate videos using pre-existing Stable Diffusion text-to-image models. The Basic Usage example above walks through the typical workflow with motion modules:
First, load the motion adapter and the Stable Diffusion text-to-image model. Then, set up the scheduler and enable the memory-saving features. Finally, generate the video frames from the prompt and save them as a GIF.
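The frames do not have to be saved as a GIF; diffusers.utils also provides export_to_video for writing an MP4 file. A minimal sketch, reusing the frames list produced in the example above; the fps value is an arbitrary choice, and export_to_video requires a video backend such as OpenCV or imageio to be installed:

from diffusers.utils import export_to_video

# Write the same frames to an MP4 file instead of a GIF
export_to_video(frames, "animation.mp4", fps=8)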
🔧 Technical Details
AnimateDiff achieves video generation by inserting motion module layers into a frozen text-to-image model. These motion modules are placed after the ResNet and Attention blocks in the Stable Diffusion UNet. Training them on video clips extracts a motion prior that introduces coherent motion across image frames. The MotionAdapter and UNetMotionModel classes make it convenient to use these motion modules with existing Stable Diffusion models.
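To see where the motion modules end up, you can inspect the pipeline's UNet after loading a MotionAdapter. The sketch below assumes the checkpoints from the usage example above; the motion_modules attribute name reflects current diffusers releases and may differ in other versions.

from diffusers import AnimateDiffPipeline, MotionAdapter, UNetMotionModel

adapter = MotionAdapter.from_pretrained("guoyww/animatediff-motion-adapter-v1-5-3")
pipe = AnimateDiffPipeline.from_pretrained(
    "SG161222/Realistic_Vision_V5.1_noVAE", motion_adapter=adapter
)

# The pipeline replaces the original 2D UNet with a UNetMotionModel
print(isinstance(pipe.unet, UNetMotionModel))  # True

# Print the motion module containers to see which UNet blocks hold them,
# e.g. down_blocks.0.motion_modules, up_blocks.1.motion_modules, ...
for name, _ in pipe.unet.named_modules():
    if name.endswith("motion_modules"):
        print(name)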
Visual Example
Prompt: "masterpiece, bestquality, sunset."