🚀 [ECCV 2024] VFusion3D: Learning Scalable 3D Generative Models from Video Diffusion Models
VFusion3D is a large, feed-forward 3D generative model. It is trained with a small amount of 3D data and a large volume of synthetic multi-view data, exploring scalable 3D generative/reconstruction models as a step towards a 3D foundation.
Project page, Paper link
VFusion3D: Learning Scalable 3D Generative Models from Video Diffusion Models
Junlin Han, Filippos Kokkinos, Philip Torr
GenAI, Meta and TVG, University of Oxford
European Conference on Computer Vision (ECCV), 2024
🚀 Quick Start
Getting started with VFusion3D is super easy! 🤗 Here’s how you can use the model with Hugging Face:
📦 Installation
Install Dependencies (Optional)
Depending on your needs, you may want to enable specific features like mesh generation or video rendering. We've got you covered with these additional packages:
!pip --quiet install imageio[ffmpeg] PyMCubes trimesh rembg[gpu,cli] kiui
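Since these packages are optional, it can help to check which ones are importable before asking for a mesh or video export. The snippet below is a minimal sketch using only the standard library; the mapping from module name to feature is an assumption made for illustration, not something the project documents (note that PyMCubes is imported as `mcubes`).

```python
from importlib.util import find_spec

# Import names of the optional packages listed above.
# The feature descriptions are illustrative assumptions.
OPTIONAL_MODULES = {
    "imageio": "video rendering",
    "mcubes": "mesh extraction (installed as PyMCubes)",
    "trimesh": "mesh handling",
    "rembg": "background removal",
    "kiui": "utility helpers",
}


def missing_modules(modules):
    """Return the module names that cannot be found in this environment."""
    return [name for name in modules if find_spec(name) is None]


missing = missing_modules(OPTIONAL_MODULES)
if missing:
    for name in missing:
        print(f"Missing optional module '{name}': {OPTIONAL_MODULES[name]}")
else:
    print("All optional modules are available.")
```

If anything is reported as missing, rerunning the install command above should resolve it.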
💻 Usage Examples
Basic Usage
import requests
import torch
from io import BytesIO
from PIL import Image
from transformers import AutoModel, AutoProcessor

# Load the model and its processor from the Hugging Face Hub
model = AutoModel.from_pretrained("jadechoghari/vfusion3d", trust_remote_code=True)
processor = AutoProcessor.from_pretrained("jadechoghari/vfusion3d")

# Download an example image and preprocess it
image_url = 'https://sm.ign.com/ign_nordic/cover/a/avatar-gen/avatar-generations_prsz.jpg'
response = requests.get(image_url)
image = Image.open(BytesIO(response.content))
image, source_camera = processor(image)

# Default: generate planes
output_planes = model(image, source_camera)
print("Planes shape:", output_planes.shape)

# Generate planes and export a mesh
output_planes, mesh_path = model(image, source_camera, export_mesh=True)
print("Planes shape:", output_planes.shape)
print("Mesh saved at:", mesh_path)

# Generate planes and export a video
output_planes, video_path = model(image, source_camera, export_video=True)
print("Planes shape:", output_planes.shape)
print("Video saved at:", video_path)
- Default (Planes): By default, VFusion3D outputs planes—ideal for further 3D operations.
- Export Mesh: Want a 3D mesh? Just set export_mesh=True, and you'll get a .obj file ready to roll. You can also customize the mesh resolution by adjusting the mesh_size parameter.
- Export Video: Fancy a 3D video? Set export_video=True, and you'll receive a beautifully rendered video from multiple angles. You can tweak render_size and fps to get the video just right.
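The options above can be combined into keyword arguments for the model call. The following is a minimal sketch: `export_options` is a hypothetical helper (not part of VFusion3D), it assumes `mesh_size`, `render_size`, and `fps` are accepted as keyword arguments alongside `export_mesh` and `export_video`, and the values 256 and 24 are purely illustrative rather than recommended settings.

```python
def export_options(export_mesh=False, export_video=False,
                   mesh_size=None, render_size=None, fps=None):
    """Hypothetical helper: collect keyword arguments for a model call.

    Only options relevant to the requested export are included, and any
    value left as None is omitted so the model's own default applies.
    """
    kwargs = {}
    if export_mesh:
        kwargs["export_mesh"] = True
        if mesh_size is not None:
            kwargs["mesh_size"] = int(mesh_size)
    if export_video:
        kwargs["export_video"] = True
        if render_size is not None:
            kwargs["render_size"] = int(render_size)
        if fps is not None:
            kwargs["fps"] = int(fps)
    return kwargs


# Illustrative values only.
mesh_kwargs = export_options(export_mesh=True, mesh_size=256)
video_kwargs = export_options(export_video=True, render_size=256, fps=24)
print(mesh_kwargs)
print(video_kwargs)

# With `model`, `image`, and `source_camera` from the basic usage example
# (assumption: the extra options are passed as keyword arguments):
# output_planes, mesh_path = model(image, source_camera, **mesh_kwargs)
# output_planes, video_path = model(image, source_camera, **video_kwargs)
```

Leaving an option unset keeps it out of the call entirely, so no default values need to be guessed here.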
Check out our demo app to see VFusion3D in action! 🤗
✨ Features
3D Generation Results
User Study Results
📚 Documentation
Acknowledgement
- The inference code of VFusion3D borrows heavily from OpenLRM.
Citation
If you find this work useful, please cite us:
@article{han2024vfusion3d,
title={VFusion3D: Learning Scalable 3D Generative Models from Video Diffusion Models},
author={Junlin Han and Filippos Kokkinos and Philip Torr},
journal={European Conference on Computer Vision (ECCV)},
year={2024}
}
📄 License
- The majority of VFusion3D is licensed under CC-BY-NC; however, portions of the project are available under separate license terms: OpenLRM as a whole is licensed under the Apache License, Version 2.0, while certain components are covered by NVIDIA's proprietary license.
- The model weights of VFusion3D are also licensed under CC-BY-NC.