🚀 Unofficial Diffusers-format weights for LTX-Video
This project provides unofficial Diffusers-format weights for https://huggingface.co/Lightricks/LTX-Video (version 0.9.1). The weights can be used for both text-to-video and image-to-video generation.
🚀 Quick start
Environment setup
Make sure the torch and diffusers libraries are installed and that you have a CUDA-capable GPU.
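Before loading the weights, it can help to confirm these prerequisites from Python. The following is a minimal sketch, not part of the project itself: `check_environment` is a hypothetical helper name, and it only reports whether the packages can be imported and whether torch sees a CUDA device.

```python
import importlib.util


def check_environment(packages=("torch", "diffusers")):
    """Report which packages are importable and whether CUDA is usable."""
    # find_spec returns None when a top-level package cannot be found.
    report = {name: importlib.util.find_spec(name) is not None for name in packages}

    cuda_ok = False
    if report.get("torch"):
        # Imported lazily so the check itself still runs when torch is missing.
        import torch

        cuda_ok = torch.cuda.is_available()
    report["cuda"] = cuda_ok
    return report
```

Calling `check_environment()` returns a dictionary such as `{"torch": ..., "diffusers": ..., "cuda": ...}` with one boolean per entry, so a missing package or an unavailable GPU shows up before the much slower model download starts.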
Text-to-video
The following example generates a video from a text prompt:
import torch
from diffusers import LTXPipeline
from diffusers.utils import export_to_video

# Load the converted weights in bfloat16 and move the pipeline to the GPU.
pipe = LTXPipeline.from_pretrained("a-r-r-o-w/LTX-Video-0.9.1-diffusers", torch_dtype=torch.bfloat16)
pipe.to("cuda")

prompt = "A woman with long brown hair and light skin smiles at another woman with long blonde hair. The woman with brown hair wears a black jacket and has a small, barely noticeable mole on her right cheek. The camera angle is a close-up, focused on the woman with brown hair's face. The lighting is warm and natural, likely from the setting sun, casting a soft glow on the scene. The scene appears to be real-life footage"
negative_prompt = "worst quality, inconsistent motion, blurry, jittery, distorted"

# Generate 161 frames at 704x480; .frames[0] selects the first video in the batch.
video = pipe(
    prompt=prompt,
    negative_prompt=negative_prompt,
    width=704,
    height=480,
    num_frames=161,
    num_inference_steps=50,
    decode_timestep=0.03,
    decode_noise_scale=0.025,
).frames[0]

# Write the frames to an MP4 file at 24 frames per second.
export_to_video(video, "output.mp4", fps=24)
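The example above requests 161 frames and exports them at 24 fps, which together determine the length of the clip. As a small illustration of that arithmetic (`clip_duration_seconds` is a hypothetical helper, not a diffusers function):

```python
def clip_duration_seconds(num_frames: int, fps: int) -> float:
    """Return the playback length of a clip with num_frames frames at fps."""
    if num_frames <= 0 or fps <= 0:
        raise ValueError("num_frames and fps must be positive")
    return num_frames / fps


# Settings from the example: 161 frames exported at 24 fps.
duration = clip_duration_seconds(161, 24)
print(f"{duration:.2f} s")  # 161 / 24 is roughly 6.71
```

Changing either `num_frames` in the pipeline call or `fps` in `export_to_video` changes the clip length accordingly.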
Image-to-video
The following example generates a video from an input image combined with a text prompt:
import torch
from diffusers import LTXImageToVideoPipeline
from diffusers.utils import export_to_video, load_image

# Load the converted weights in bfloat16 and move the pipeline to the GPU.
pipe = LTXImageToVideoPipeline.from_pretrained("a-r-r-o-w/LTX-Video-0.9.1-diffusers", torch_dtype=torch.bfloat16)
pipe.to("cuda")

# The conditioning image that the generated video starts from.
image = load_image(
    "https://huggingface.co/datasets/a-r-r-o-w/tiny-meme-dataset-captioned/resolve/main/images/8.png"
)
prompt = "A young girl stands calmly in the foreground, looking directly at the camera, as a house fire rages in the background. Flames engulf the structure, with smoke billowing into the air. Firefighters in protective gear rush to the scene, a fire truck labeled '38' visible behind them. The girl's neutral expression contrasts sharply with the chaos of the fire, creating a poignant and emotionally charged scene."
negative_prompt = "worst quality, inconsistent motion, blurry, jittery, distorted"

# Generate 161 frames at 704x480; .frames[0] selects the first video in the batch.
video = pipe(
    image=image,
    prompt=prompt,
    negative_prompt=negative_prompt,
    width=704,
    height=480,
    num_frames=161,
    num_inference_steps=50,
    decode_timestep=0.03,
    decode_noise_scale=0.025,
).frames[0]

# Write the frames to an MP4 file at 24 frames per second.
export_to_video(video, "output.mp4", fps=24)
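✨ Key features
- Text-to-video: generates a video from a text description.
- Image-to-video: generates a video from an input image, guided by a text prompt.
- CUDA acceleration: runs inference on the GPU to speed up video generation.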
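📦 Installation
Make sure you have a Python environment with pip available for package management. Install the required libraries with the command below. The pipelines also load a T5 text encoder through the transformers library, so you will likely need to install that package as well.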
pip install torch diffusers
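💻 Usage examples
Basic usage
The text-to-video and image-to-video examples above show the basic usage of these weights. You can adapt the prompt, negative prompt, video dimensions, number of frames, and other parameters to your needs.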
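When changing the video size or frame count, note that the upstream Lightricks model card describes LTX-Video as working on resolutions divisible by 32 and on frame counts of the form 8k + 1 (the 704x480 and 161 used above fit this pattern). Assuming that constraint also applies to these converted weights, a hypothetical helper such as `snap_generation_size` could round requested values down to the nearest valid ones:

```python
def snap_generation_size(width: int, height: int, num_frames: int):
    """Round values down to fit the assumed LTX-Video constraints.

    Assumption (from the upstream model card, not from this project):
    width and height are multiples of 32, num_frames has the form 8 * k + 1.
    """
    if width < 32 or height < 32 or num_frames < 1:
        raise ValueError("need width >= 32, height >= 32 and num_frames >= 1")
    snapped_width = (width // 32) * 32
    snapped_height = (height // 32) * 32
    snapped_frames = ((num_frames - 1) // 8) * 8 + 1
    return snapped_width, snapped_height, snapped_frames


# The settings from the examples are already valid and stay unchanged.
print(snap_generation_size(704, 480, 161))  # (704, 480, 161)
# Other requests are rounded down.
print(snap_generation_size(720, 500, 160))  # (704, 480, 153)
```

The returned tuple can be passed on as `width`, `height`, and `num_frames` in the pipeline call.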
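Advanced usage
In practice, you can experiment with different prompt combinations and tune the number of inference steps (num_inference_steps), the decode timestep (decode_timestep), and the decode noise scale (decode_noise_scale) to improve the quality of the generated video. The resulting videos can then be used elsewhere, for example shared on social media or brought into a video editor.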
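One way to organize such experiments is a small parameter sweep. The sketch below only builds the keyword-argument combinations. `build_sweep` is a hypothetical helper, and the values other than 50, 0.03, and 0.025 are arbitrary illustrations, not recommended settings.

```python
from itertools import product


def build_sweep(steps_options, decode_timestep_options, decode_noise_scale_options):
    """Return one keyword-argument dict per combination of the given options."""
    return [
        {
            "num_inference_steps": steps,
            "decode_timestep": timestep,
            "decode_noise_scale": noise_scale,
        }
        for steps, timestep, noise_scale in product(
            steps_options, decode_timestep_options, decode_noise_scale_options
        )
    ]


sweep = build_sweep([30, 50], [0.03, 0.05], [0.025])
for index, settings in enumerate(sweep):
    print(index, settings)
    # With a loaded pipeline `pipe` and a `prompt`, as in the examples above,
    # each combination could be rendered and saved like this:
    # video = pipe(prompt=prompt, **settings).frames[0]
    # export_to_video(video, f"sweep_{index}.mp4", fps=24)
```

Saving each result under its own file name makes it easier to compare the outputs side by side.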