Multiple-Input-Resshift-VFI: Open-Source Video Frame Interpolation Tool with Multi-Frame Input and Uncertainty Estimation

Multiple Input Resshift VFI

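Developed by vfontech

A diffusion-based video frame interpolation tool that supports multi-frame input and includes uncertainty estimation.

Video Processing

Safetensors

Language: English  License: MIT  Tags: #video-frame-interpolation #diffusion-model #multi-frame-input

Downloads: 207

Released: 4/18/2025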

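Model Overview

This is a video frame interpolation model implemented in PyTorch and built on the ResShift diffusion architecture. It generates an intermediate frame from two input images and is suited to animation and video processing.

Model Features

Multi-Input Frame Interpolation

Generates an intermediate frame from two input images, producing smooth transitions between video frames.

Uncertainty Estimation

The model includes an uncertainty estimate for the frames it generates.

Diffusion Model Architecture

Uses the ResShift diffusion architecture to produce high-quality interpolation results.

Model Capabilities

Video frame interpolation

Animation processing

Image generation

Uncertainty estimation

Use Cases

Video Processing

Video Frame Rate Upscaling

Generates intermediate frames by interpolation to raise a video's frame rate and make motion smoother.

Animation Production

Generates smooth in-between frames during animation production, reducing the amount of manual drawing.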
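
The frame rate upscaling use case above can be sketched in a few lines. This is a minimal illustration, not the project's own pipeline: `double_frame_rate` and `blend` are hypothetical names, and `blend` is a toy linear mix of numbers standing in for the diffusion model so that the example runs without a GPU.

```python
from typing import Callable, List, TypeVar

Frame = TypeVar("Frame")


def double_frame_rate(
    frames: List[Frame],
    interpolate: Callable[[Frame, Frame, float], Frame],
) -> List[Frame]:
    """Insert one interpolated frame between every pair of neighbouring frames.

    `interpolate(a, b, tau)` is a hypothetical callable standing in for the
    model call; tau=0.5 requests the temporal midpoint.
    """
    if not frames:
        return []
    out = [frames[0]]
    for prev, nxt in zip(frames, frames[1:]):
        out.append(interpolate(prev, nxt, 0.5))  # synthesized middle frame
        out.append(nxt)                          # original frame
    return out


# Toy stand-in: linear blending of numbers instead of a diffusion model.
def blend(a: float, b: float, tau: float) -> float:
    return (1 - tau) * a + tau * b


doubled = double_frame_rate([0.0, 2.0, 4.0], blend)
print(doubled)  # [0.0, 1.0, 2.0, 3.0, 4.0]
```

A sequence of N frames becomes 2N - 1 frames, so playing the result back at twice the original frame rate keeps roughly the same duration.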

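🚀 🤖 Multi‑Input ResShift Diffusion VFI

The Multi-Input ResShift Diffusion VFI model targets video frame interpolation: given existing frames from animation or video, it generates the frames in between. It also provides an uncertainty estimate for the generated frames.

🚀 Quick Start

Environment Setup

First, clone the source code from GitHub: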

git clone https://github.com/VicFonch/Multi-Input-Resshift-Diffusion-VFI.git

Create a conda environment and install all dependencies:

conda create -n multi-input-resshift python=3.12
conda activate multi-input-resshift
pip install -r requirements.txt

⚠️ Important

Make sure your system is compatible with CUDA 12.4. If it is not, install the CuPy build that matches your current CUDA version.
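
As a rough aid for the note above, the sketch below maps a CUDA version string to a CuPy wheel name. The helper `cupy_wheel_for` is hypothetical and covers only the `cupy-cuda11x` and `cupy-cuda12x` wheels; for any other CUDA version, check the CuPy installation guide rather than relying on this table.

```python
def cupy_wheel_for(cuda_version: str) -> str:
    """Map a CUDA version string such as "12.4" to a CuPy wheel name.

    Hypothetical helper: only CUDA 11.x (11.2 or newer) and 12.x are listed.
    """
    major = int(cuda_version.split(".")[0])
    wheels = {11: "cupy-cuda11x", 12: "cupy-cuda12x"}
    if major not in wheels:
        raise ValueError(
            f"No wheel listed here for CUDA {cuda_version}; "
            "see the CuPy installation guide"
        )
    return wheels[major]


print("pip install", cupy_wheel_for("12.4"))  # pip install cupy-cuda12x
```

If PyTorch is already installed, `torch.version.cuda` reports the CUDA version PyTorch was built against (it is `None` for CPU-only builds) and can serve as the input string.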

Inference Example

import os
from PIL import Image
import matplotlib.pyplot as plt

from torchvision.transforms import Compose, ToTensor, Resize, Normalize
from utils.utils import denorm
from model.hub import MultiInputResShiftHub

# Load the pretrained weights and move the model to the GPU.
model = MultiInputResShiftHub.from_pretrained("vfontech/Multiple-Input-Resshift-VFI").cuda()
model.eval()

# Build the paths with os.path.join so they work on Windows, Linux, and macOS.
img0_path = os.path.join("_data", "example_images", "frame1.png")
img2_path = os.path.join("_data", "example_images", "frame3.png")

# Normalize with mean = std = 0.5 maps pixel values from [0, 1] to [-1, 1].
mean = std = [0.5]*3
transforms = Compose([
    Resize((256, 448)),
    ToTensor(),
    Normalize(mean=mean, std=std),
])

# Load both input frames, add a batch dimension, and move them to the GPU.
img0 = transforms(Image.open(img0_path).convert("RGB")).unsqueeze(0).cuda()
img2 = transforms(Image.open(img2_path).convert("RGB")).unsqueeze(0).cuda()
tau = 0.5

# Generate the intermediate frame from the two inputs.
img1 = model.reverse_process([img0, img2], tau)

# Show the first input, the generated frame, and the second input side by side.
# denorm is applied with the same mean and std before plotting.
plt.figure(figsize=(10, 5))
plt.subplot(1, 3, 1)
plt.imshow(denorm(img0, mean=mean, std=std).squeeze().permute(1, 2, 0).cpu().numpy())
plt.subplot(1, 3, 2)
plt.imshow(denorm(img1, mean=mean, std=std).squeeze().permute(1, 2, 0).cpu().numpy())
plt.subplot(1, 3, 3)
plt.imshow(denorm(img2, mean=mean, std=std).squeeze().permute(1, 2, 0).cpu().numpy())
plt.show()
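
The example above generates a single frame with `tau = 0.5`. If `tau` denotes the relative temporal position of the output between the two inputs, and if the model accepts values other than 0.5 (both are assumptions, not confirmed by the text above), several evenly spaced frames could be requested as sketched here. `intermediate_taus` is a hypothetical helper.

```python
from typing import List


def intermediate_taus(factor: int) -> List[float]:
    """Return evenly spaced tau values strictly between 0 and 1.

    `factor` is the frame-rate multiplier: 2 yields one midpoint,
    4 yields three in-between positions.
    """
    if factor < 2:
        raise ValueError("factor must be at least 2")
    return [i / factor for i in range(1, factor)]


taus = intermediate_taus(4)
print(taus)  # [0.25, 0.5, 0.75]

# Assumed usage with the objects from the inference example (not executed here):
# frames = [model.reverse_process([img0, img2], tau) for tau in taus]
```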