🚀 Allegro - Text-to-Video Generation
Allegro is an open-source text-to-video generation model. It offers versatile content creation and high-quality output while remaining small and efficient. The model weights and code are publicly available, so the community can explore and build on its capabilities.
🚀 Quick Start
Step 1: Install the necessary requirements
- Ensure Python >= 3.10, PyTorch >= 2.4, CUDA >= 12.4.
- It is recommended to use Anaconda to create a new environment (Python >= 3.10) for running the following example:
conda create -n rllegro python=3.10 -y
- Install the Python dependencies:
pip install git+https://github.com/huggingface/diffusers.git torch==2.4.1 transformers==4.40.1 accelerate sentencepiece imageio imageio-ffmpeg beautifulsoup4
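Before running inference, it can help to confirm that the environment meets the minimum versions listed above. The following is a minimal sketch and not part of the Allegro repository: `parse_version`, `meets_minimum`, and `check_environment` are hypothetical helpers, and the PyTorch and CUDA checks only run if `torch` can be imported.

```python
import sys


def parse_version(text):
    """Turn '2.4.1+cu124' into (2, 4, 1); stops at the first non-numeric part."""
    parts = []
    for piece in text.split("+")[0].split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def meets_minimum(found, minimum):
    """Return True if version string `found` is at least `minimum`."""
    return parse_version(found) >= parse_version(minimum)


def check_environment():
    """Collect (requirement, found, ok) rows for the Step 1 requirements."""
    py = ".".join(str(n) for n in sys.version_info[:3])
    rows = [("Python >= 3.10", py, meets_minimum(py, "3.10"))]
    try:
        import torch  # optional: only checked when installed
    except ImportError:
        rows.append(("PyTorch >= 2.4", "not installed", False))
        return rows
    found = str(torch.__version__)
    rows.append(("PyTorch >= 2.4", found, meets_minimum(found, "2.4")))
    cuda = torch.version.cuda  # None on CPU-only builds
    cuda_ok = bool(cuda) and meets_minimum(cuda, "12.4")
    rows.append(("CUDA >= 12.4", cuda or "none", cuda_ok))
    return rows


if __name__ == "__main__":
    for name, found, ok in check_environment():
        print(f"{'OK' if ok else 'FAIL'}  {name}: found {found}")
```

Note that `torch.version.cuda` reports the CUDA version PyTorch was built against, which may differ from the driver or toolkit installed on the machine.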
Step 2: Run inference
import torch
from diffusers import AutoencoderKLAllegro, AllegroPipeline
from diffusers.utils import export_to_video
# Load the VAE in FP32; the rest of the pipeline runs in BF16.
vae = AutoencoderKLAllegro.from_pretrained("rhymes-ai/Allegro-T2V-40x720P", subfolder="vae", torch_dtype=torch.float32)
pipe = AllegroPipeline.from_pretrained(
    "rhymes-ai/Allegro-T2V-40x720P", vae=vae, torch_dtype=torch.bfloat16
)
pipe.to("cuda")
# Decode the video in tiles to reduce peak VAE memory.
pipe.vae.enable_tiling()
prompt = "A seaside harbor with bright sunlight and sparkling seawater, with many boats in the water. From an aerial view, the boats vary in size and color, some moving and some stationary. Fishing boats in the water suggest that this location might be a popular spot for docking fishing boats."
positive_prompt = """
(masterpiece), (best quality), (ultra-detailed), (unwatermarked),
{}
emotional, harmonious, vignette, 4k epic detailed, shot on kodak, 35mm photo,
sharp focus, high budget, cinemascope, moody, epic, gorgeous
"""
negative_prompt = """
nsfw, lowres, bad anatomy, bad hands, text, error, missing fingers, extra digit, fewer digits, cropped, worst quality,
low quality, normal quality, jpeg artifacts, signature, watermark, username, blurry.
"""
# Insert the lower-cased user prompt into the positive prompt template.
prompt = positive_prompt.format(prompt.lower().strip())
video = pipe(
    prompt,
    negative_prompt=negative_prompt,
    guidance_scale=7.5,
    max_sequence_length=512,
    num_inference_steps=100,
    generator=torch.Generator(device="cuda:0").manual_seed(42),
).frames[0]
# Save at the model's native 15 FPS.
export_to_video(video, "output.mp4", fps=15)
Use pipe.enable_sequential_cpu_offload() to offload the model to the CPU and reduce GPU memory use, at the cost of significantly longer inference time.
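The memory/speed trade-off can be wrapped in a small helper. This is a sketch under assumptions: `place_pipeline` is a hypothetical name, not part of diffusers or Allegro. It uses only the pipeline methods already shown in this section (`to("cuda")`, `enable_sequential_cpu_offload()`, and `vae.enable_tiling()`), and it assumes that offloading is used instead of moving the whole pipeline to the GPU.

```python
def place_pipeline(pipe, low_memory=False):
    """Choose where the pipeline weights live.

    low_memory=False: keep everything on the GPU (fastest, most GPU memory).
    low_memory=True:  sequential CPU offload (least GPU memory, much slower).
    """
    if low_memory:
        # Weights stay on the CPU and are moved to the GPU only while needed.
        pipe.enable_sequential_cpu_offload()
    else:
        pipe.to("cuda")
    # VAE tiling is enabled in both modes, as in the Step 2 example.
    pipe.vae.enable_tiling()
    return pipe
```

With the pipeline from Step 2, calling `place_pipeline(pipe, low_memory=True)` in place of `pipe.to("cuda")` and `pipe.vae.enable_tiling()` would select the offload path.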
Step 3: (Optional) Interpolate the video to 30 FPS
It is recommended to use [EMA-VFI](https://github.com/MCG-NJU/EMA-VFI) to interpolate the video from 15 FPS to 30 FPS.
For better visual quality, please use imageio to save the video.
Step 4: For faster inference
For faster inference options such as Context Parallel and PAB, please refer to our [GitHub repo](https://github.com/rhymes-ai/Allegro).
✨ Features
- Open Source: Full [model weights](https://huggingface.co/rhymes-ai/Allegro) and [code](https://github.com/rhymes-ai/Allegro) available to the community, under the Apache 2.0 license!
- Versatile Content Creation: Capable of generating a wide range of content, from close-ups of humans and animals to diverse dynamic scenes.
- High-Quality Output: Generate detailed 2 to 6-second videos at 15 FPS with 368x640 and 720x1280 resolution, which can be interpolated to 30 FPS with [EMA-VFI](https://github.com/MCG-NJU/EMA-VFI).
- Small and Efficient: Features a 175M parameter VideoVAE and a 2.8B parameter VideoDiT model. Supports multiple precisions (FP32, BF16, FP16) and uses 9.3 GB of GPU memory in BF16 mode with CPU offloading. Context length is 79.2K, equivalent to 88 frames.
📦 Model Info
| Property | Details |
|---|---|
| Model | Allegro-T2V-40x720P |
| Description | Text-to-Video Generation Model |
| Download | [Hugging Face](https://huggingface.co/rhymes-ai/Allegro-T2V-40x720P) |
| Parameter - VAE | 175M |
| Parameter - DiT | 2.8B |
| Inference Precision - VAE | FP32/TF32/BF16/FP16 (best in FP32/TF32) |
| Inference Precision - DiT/T5 | BF16/FP32/TF32 |
| Context Length | 36K |
| Resolution | 720 x 1280 |
| Frames | 40 |
| Video Length | 3 seconds @ 15 FPS |
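
The context lengths, frame counts, and durations quoted in this README can be cross-checked with a little arithmetic. The sketch below assumes, and this README does not state, that the VideoVAE compresses video 4x in time and 8x8 in space and that the DiT groups latents into 2x2 spatial patches. `context_length` and `duration_seconds` are hypothetical helpers written for this illustration. Under those assumptions the calculation reproduces both the 36K figure for 40 frames and the 79.2K figure for 88 frames.

```python
def context_length(frames, height, width,
                   t_compress=4, s_compress=8, patch=2):
    """Number of DiT tokens for one video, under the assumed ratios."""
    latent_t = frames // t_compress   # assumed temporal compression
    latent_h = height // s_compress   # assumed spatial compression
    latent_w = width // s_compress
    return latent_t * (latent_h // patch) * (latent_w // patch)


def duration_seconds(frames, fps=15):
    """Clip length in seconds at the native frame rate."""
    return frames / fps


if __name__ == "__main__":
    for frames in (40, 88):
        tokens = context_length(frames, 720, 1280)
        seconds = duration_seconds(frames)
        print(f"{frames} frames: {tokens} tokens, {seconds:.2f} s")
```

For 720 x 1280 input this gives 36,000 tokens for 40 frames and 79,200 tokens for 88 frames. At 15 FPS, 40 frames last about 2.7 seconds, which the table rounds to 3 seconds.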
📚 Gallery

For more demos and corresponding prompts, see the [Allegro Gallery](https://rhymes.ai/allegro_gallery).
📄 License
This repo is released under the Apache 2.0 License.