Open-source LTX-Video-0.9.1-diffusers model: text-to-video and image-to-video generation

LTX Video 0.9.1 Diffusers

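Developed by newgenai79

Unofficial Diffusers-format text-to-video and image-to-video model, converted from Lightricks' LTX-Video model

Text-to-video #text-driven video generation #image-to-video #high-frame-rate video

Downloads: 33

Release date: 3/1/2025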

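Model overview

This model generates high-quality video from a text description or an input image, and supports custom video parameters such as resolution, frame count, and number of generation steps.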

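Model features

High-quality video generation

Generates high-quality video with continuous motion and fine detail.

Dual-modal input

Accepts either a text description or an image as the conditioning input for video generation.

Parameter customization

Video resolution, frame count, number of inference steps, and other parameters are adjustable.

Negative prompt support

Supports negative prompts to steer generation away from unwanted video artifacts.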

Model capabilities

Text-to-video generation

Image-to-video generation

Video style control

Video content customization

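Use cases

Creative content generation

Short-form video production

Quickly generate short video content from a script or concept.

Generates continuous video at 24 fps.

Concept visualization

Visualize scenes and concepts described in text.

Generates visual content that matches the description.

Film and video production support

Previsualization

Generate preliminary visual references for film and video projects.

Quickly present a director's vision.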

🚀 Unofficial Diffusers-format LTX-Video weights

This project provides unofficial Diffusers-format weights for https://huggingface.co/Lightricks/LTX-Video (version 0.9.1). It supports both text-to-video and image-to-video generation.

🚀 Quick start

With this pipeline you can generate video from text or from an image. Concrete usage is shown below.

💻 Usage examples

Basic usage

Text-to-video

import torch
from diffusers import LTXPipeline
from diffusers.utils import export_to_video

# Load the converted weights in bfloat16 and move the pipeline to the GPU
pipe = LTXPipeline.from_pretrained("a-r-r-o-w/LTX-Video-0.9.1-diffusers", torch_dtype=torch.bfloat16)
pipe.to("cuda")

prompt = "A woman with long brown hair and light skin smiles at another woman with long blonde hair. The woman with brown hair wears a black jacket and has a small, barely noticeable mole on her right cheek. The camera angle is a close-up, focused on the woman with brown hair's face. The lighting is warm and natural, likely from the setting sun, casting a soft glow on the scene. The scene appears to be real-life footage"
negative_prompt = "worst quality, inconsistent motion, blurry, jittery, distorted"

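video = pipe(
    prompt=prompt,
    negative_prompt=negative_prompt,
    width=704,  # output width in pixels
    height=480,  # output height in pixels
    num_frames=161,  # about 6.7 seconds at 24 fps
    num_inference_steps=50,
    decode_timestep=0.03,  # timestep at which the VAE decodes the latents
    decode_noise_scale=0.025,  # noise mixed into the latents before decoding
).frames[0]  # first video in the returned batch
# Write the frames to an MP4 file at 24 fps
export_to_video(video, "output.mp4", fps=24)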
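
The resolution and frame count used above can be derived from a desired clip size and length. The following is a minimal sketch, assuming (per the upstream LTX-Video guidance, not stated on this page) that width and height should be multiples of 32 and that the frame count should have the form 8k + 1. The helper names `snap_resolution`, `frames_for_duration`, and `clip_seconds` are hypothetical and are not part of diffusers.

```python
def snap_resolution(value: int, multiple: int = 32) -> int:
    """Round a pixel dimension to the nearest positive multiple of `multiple`."""
    snapped = (value + multiple // 2) // multiple * multiple
    return max(multiple, snapped)


def frames_for_duration(seconds: float, fps: int = 24, step: int = 8) -> int:
    """Return the frame count of the form step * k + 1 closest to seconds * fps."""
    target = seconds * fps
    k = max(1, round((target - 1) / step))
    return step * k + 1


def clip_seconds(num_frames: int, fps: int = 24) -> float:
    """Playback length in seconds of a clip exported at `fps`."""
    return num_frames / fps


# Values that reproduce the settings used in the example above
width = snap_resolution(704)            # 704
height = snap_resolution(480)           # 480
num_frames = frames_for_duration(6.7)   # 161
```

The results can be passed as `width`, `height`, and `num_frames` to the pipeline call. Using the same `fps` value in `export_to_video` keeps the playback length consistent with the requested duration.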

Image-to-video

import torch
from diffusers import LTXImageToVideoPipeline
from diffusers.utils import export_to_video, load_image

# Load the image-to-video pipeline in bfloat16 and move it to the GPU
pipe = LTXImageToVideoPipeline.from_pretrained("a-r-r-o-w/LTX-Video-0.9.1-diffusers", torch_dtype=torch.bfloat16)
pipe.to("cuda")

image = load_image(
    "https://huggingface.co/datasets/a-r-r-o-w/tiny-meme-dataset-captioned/resolve/main/images/8.png"
)
prompt = "A young girl stands calmly in the foreground, looking directly at the camera, as a house fire rages in the background. Flames engulf the structure, with smoke billowing into the air. Firefighters in protective gear rush to the scene, a fire truck labeled '38' visible behind them. The girl's neutral expression contrasts sharply with the chaos of the fire, creating a poignant and emotionally charged scene."
negative_prompt = "worst quality, inconsistent motion, blurry, jittery, distorted"

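video = pipe(
    image=image,  # conditioning image for the video
    prompt=prompt,
    negative_prompt=negative_prompt,
    width=704,  # output width in pixels
    height=480,  # output height in pixels
    num_frames=161,  # about 6.7 seconds at 24 fps
    num_inference_steps=50,
    decode_timestep=0.03,  # timestep at which the VAE decodes the latents
    decode_noise_scale=0.025,  # noise mixed into the latents before decoding
).frames[0]  # first video in the returned batch
# Write the frames to an MP4 file at 24 fps
export_to_video(video, "output.mp4", fps=24)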
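
When the input image does not share the 704x480 aspect ratio used above, it may be worth cropping it first. The following is a minimal sketch, assuming the pipeline resizes the conditioning image to `width` x `height` without preserving its aspect ratio (an assumption, not something this page states). The helper `center_crop_box` is hypothetical and only computes the crop rectangle.

```python
def center_crop_box(
    src_w: int, src_h: int, dst_w: int = 704, dst_h: int = 480
) -> tuple[int, int, int, int]:
    """Return (left, top, right, bottom) of the largest centered crop of a
    src_w x src_h image that matches the dst_w : dst_h aspect ratio."""
    # Compare aspect ratios by cross-multiplying to stay in integer arithmetic
    if src_w * dst_h > src_h * dst_w:
        # Source is wider than the target: keep full height, trim the sides
        crop_h = src_h
        crop_w = src_h * dst_w // dst_h
    else:
        # Source is taller than (or equal to) the target: keep full width
        crop_w = src_w
        crop_h = src_w * dst_h // dst_w
    left = (src_w - crop_w) // 2
    top = (src_h - crop_h) // 2
    return left, top, left + crop_w, top + crop_h


# A 1920x1080 source loses 168 pixels on each side
box = center_crop_box(1920, 1080)  # (168, 0, 1752, 1080)
```

The returned tuple uses the (left, upper, right, lower) order that Pillow's `Image.crop` expects, so with the image returned by `load_image` it could be applied as `image.crop(center_crop_box(*image.size))` before calling the pipeline.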