Text2Video-Zero: Open-Source Text-to-Video Tool for Edge-Guided Video Generation in GTA-5 Style

Text2video Zero Controlnet Canny Gta5

Developed by PAIR

Text2Video-Zero is a zero-shot text-to-video tool that supports edge-guided GTA-5 style video generation through ControlNet.

Text-to-Video | Open Source | License: OpenRAIL | #Zero-shot video generation #GTA-5 stylization #Edge-guided control

Downloads 38

Release Time : 3/24/2023

Model Overview

This model combines DreamBooth and ControlNet technologies to generate GTA-5 style videos or images based on text prompts and edge conditions, supporting zero-shot video generation and editing.

Model Features

Zero-shot video generation

Generate video content from text without additional training

Edge condition control

Achieve Canny edge-guided video/image generation through ControlNet

GTA-5 artistic style

Generated videos and images carry the visual style of the GTA-5 game

Multi-condition support

Supports combined control generation with multiple conditions like text, pose, and edges

Model Capabilities

Text-to-video

Text-to-image

Video editing

Stylized generation

Edge-guided generation

Use Cases

Creative content generation

GTA-5 style video creation

Generate short video content with GTA-5 artistic style based on text descriptions

Generates dynamic scenes that match the game's artistic style

Edge-guided image generation

Use Canny edge maps to control the generation of GTA-5 style images with specific compositions

Apply stylized effects while maintaining edge structures

Video editing

Stylized video conversion

Convert regular videos into GTA-5 artistic style

Apply style transformation while preserving original video dynamics
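
The edge-guided use cases above condition each frame on an edge map. As a rough illustration of what such a map contains, the sketch below thresholds the Sobel gradient magnitude of a grayscale image in pure Python. This is a simplified stand-in, not the Canny detector itself (Canny adds smoothing, non-maximum suppression, and hysteresis thresholding). The function name `sobel_edge_map` and the default threshold are assumptions made for illustration. In practice a library routine such as OpenCV's `cv2.Canny` would typically be used instead.

```python
def sobel_edge_map(gray, threshold=100):
    """Return a binary edge map (0 or 255) for a 2-D list of grayscale values.

    Simplified stand-in for Canny: Sobel gradient magnitude plus one threshold.
    Border pixels are left at 0 because they lack a full 3x3 neighbourhood.
    """
    height, width = len(gray), len(gray[0])
    edges = [[0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            # Horizontal Sobel response: right column minus left column.
            gx = (gray[y - 1][x + 1] + 2 * gray[y][x + 1] + gray[y + 1][x + 1]
                  - gray[y - 1][x - 1] - 2 * gray[y][x - 1] - gray[y + 1][x - 1])
            # Vertical Sobel response: bottom row minus top row.
            gy = (gray[y + 1][x - 1] + 2 * gray[y + 1][x] + gray[y + 1][x + 1]
                  - gray[y - 1][x - 1] - 2 * gray[y - 1][x] - gray[y - 1][x + 1])
            if (gx * gx + gy * gy) ** 0.5 >= threshold:
                edges[y][x] = 255
    return edges
```

For an image whose left half is black and whose right half is white, this marks only the two columns on either side of the boundary (in the interior rows) as edges, which is the kind of structural outline the generator is asked to preserve.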

🚀 Text2Video-Zero Model Card - ControlNet Canny GTA-5 Style

Text2Video-Zero is a zero-shot text-to-video generator with several video generation and editing modes. This model provides GTA-5 style weights for Text2Video-Zero with edge guidance.

🚀 Quick Start

Text2Video-Zero is a zero-shot text-to-video generator. It supports zero-shot text-to-video generation, Video Instruct Pix2Pix (instruction-guided video editing), text- and pose-conditioned video generation, text- and Canny-edge-conditioned video generation, and video generation conditioned on text, Canny edges, and DreamBooth weights. For more information about this work, please see our paper and our demo. Our code works with any Stable Diffusion base model.

This model provides DreamBooth weights for the GTA-5 style, to be used with edge guidance (via ControlNet) in Text2Video-Zero.
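
As a minimal sketch of how these weights might be used, the example below follows the pattern the diffusers library documents for Text2Video-Zero with ControlNet: cross-frame attention plus one shared initial latent for all frames. Several details are assumptions rather than facts stated in this card: the two Hugging Face repository ids, the 512x512 frame size implied by the 64x64 latent, and the helper names `frame_prompts` and `generate_frames`. The import path for `CrossFrameAttnProcessor` may also differ between diffusers versions. Running `generate_frames` requires a CUDA GPU and downloads the weights.

```python
DREAMBOOTH_KEYWORD = "gtav style"  # trigger phrase listed in this model card

# Assumed Hugging Face repo ids; neither is stated in this card.
MODEL_ID = "PAIR/text2video-zero-controlnet-canny-gta5"
CONTROLNET_ID = "lllyasviel/sd-controlnet-canny"


def frame_prompts(prompt, num_frames, keyword=DREAMBOOTH_KEYWORD):
    """Repeat one prompt per frame, appending the DreamBooth keyword if absent."""
    if keyword.lower() not in prompt.lower():
        prompt = f"{prompt}, {keyword}"
    return [prompt] * num_frames


def generate_frames(prompt, edge_frames, device="cuda"):
    """Stylize a list of PIL edge-map frames; returns a list of PIL images."""
    # Heavy imports are local so frame_prompts works without them installed.
    import torch
    from diffusers import ControlNetModel, StableDiffusionControlNetPipeline
    from diffusers.pipelines.text_to_video_synthesis.pipeline_text_to_video_zero import (
        CrossFrameAttnProcessor,
    )

    controlnet = ControlNetModel.from_pretrained(CONTROLNET_ID, torch_dtype=torch.float16)
    pipe = StableDiffusionControlNetPipeline.from_pretrained(
        MODEL_ID, controlnet=controlnet, torch_dtype=torch.float16
    ).to(device)

    # Cross-frame attention lets every frame attend to the first frame,
    # which keeps appearance consistent across the clip.
    pipe.unet.set_attn_processor(CrossFrameAttnProcessor(batch_size=2))
    pipe.controlnet.set_attn_processor(CrossFrameAttnProcessor(batch_size=2))

    # One shared initial latent for all frames (64x64 assumes 512x512 output).
    latents = torch.randn((1, 4, 64, 64), device=device, dtype=torch.float16)
    latents = latents.repeat(len(edge_frames), 1, 1, 1)

    prompts = frame_prompts(prompt, len(edge_frames))
    return pipe(prompt=prompts, image=edge_frames, latents=latents).images
```

The prompt helper is kept separate from the pipeline call so the keyword handling can be checked without a GPU. The returned frames can then be written to a video file with any image-sequence writer.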

✨ Features

Weights for Text2Video-Zero

We converted the original weights into the diffusers format and made them usable with ControlNet edge guidance, following https://github.com/lllyasviel/ControlNet/discussions/12.

Original Weights

The DreamBooth weights for the GTA-5 style were taken from CIVITAI.

📚 Documentation

Model Details (Weights for Text2Video-Zero)

Property	Details
Developed by	Levon Khachatryan, Andranik Movsisyan, Vahram Tadevosyan, Roberto Henschel, Zhangyang Wang, Shant Navasardyan and Humphrey Shi
Model Type	DreamBooth text-to-image and text-to-video generation model with edge control for Text2Video-Zero
Language(s)	English
License	CreativeML OpenRAIL-M
Model Description	This model provides edge guidance and GTA-5 style for Text2Video-Zero. It can also be used with ControlNet in a text-to-image setup with edge guidance.
DreamBooth Keyword	gtav style
Resources for more information	GitHub, Paper, CIVITAI.
Cite as	@article{text2video-zero, title={Text2Video-Zero: Text-to-Image Diffusion Models are Zero-Shot Video Generators}, author={Khachatryan, Levon and Movsisyan, Andranik and Tadevosyan, Vahram and Henschel, Roberto and Wang, Zhangyang and Navasardyan, Shant and Shi, Humphrey}, journal={arXiv preprint arXiv:2303.13439}, year={2023} }

Model Details (Original Weights)

Property	Details
Developed by	Quiet_Joker (Username listed on CIVITAI)
Model Type	DreamBooth text-to-image generation model
Language(s)	English
License	CreativeML OpenRAIL-M
Model Description	This model was created with DreamBooth to generate GTA-5 style images from text prompts.
DreamBooth Keyword	gtav style
Resources for more information	CIVITAI.

📄 License

This model is released under the CreativeML OpenRAIL-M license.

🔧 Technical Details

Beware that Text2Video-Zero may output content that reinforces or exacerbates societal biases, as well as realistic faces, pornography, and violence. Text2Video-Zero in this demo is meant only for research purposes.

📚 Citation

@article{text2video-zero,
  title={Text2Video-Zero: Text-to-Image Diffusion Models are Zero-Shot Video Generators},
  author={Khachatryan, Levon and Movsisyan, Andranik and Tadevosyan, Vahram and Henschel, Roberto and Wang, Zhangyang and Navasardyan, Shant and Shi, Humphrey},
  journal={arXiv preprint arXiv:2303.13439},
  year={2023}
}
