Shap-E
Shap-E is a diffusion-based model that generates 3D assets from text prompts or 2D images. It was introduced in the paper Shap-E: Generating Conditional 3D Implicit Functions by Heewoo Jun and Alex Nichol of OpenAI.
The original repository of Shap-E can be accessed here: https://github.com/openai/shap-e.
The authors of Shap-E did not write this model card; they provide their own model card in the original repository.
Quick Start
Shap-E generates 3D assets from text prompts or 2D images. The sections below cover its features, installation, and a basic usage example.
⨠Features
- Text-to-3D Generation: Generate 3D images from text prompts.
- Image-to-3D Generation: Sample 3D images from synthetic 2D images.
- Multiple Output Representations: Directly generate the parameters of implicit functions that can be rendered as both textured meshes and neural radiance fields.
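The text-to-3D path from the feature list can be sketched as follows. This is a minimal sketch, not the authors' reference code: it assumes a released version of diffusers that ships `ShapEPipeline`, that the text-conditioned checkpoint is published as `openai/shap-e`, and that a CUDA device is available. The helper names `slugify` and `generate_from_text`, and the guidance scale of 15.0, are illustrative choices.

```python
from pathlib import Path


def slugify(prompt: str) -> str:
    """Turn a prompt into a filesystem-friendly stem, e.g. 'A shark' -> 'a_shark'."""
    cleaned = "".join(ch.lower() if ch.isalnum() else "_" for ch in prompt)
    return "_".join(part for part in cleaned.split("_") if part)


def generate_from_text(prompt: str, out_dir: str = ".", device: str = "cuda") -> str:
    """Render one text-conditioned sample as a turntable GIF and return its path."""
    # Heavy imports stay inside the function so the helpers above import cheaply.
    import torch
    from diffusers import ShapEPipeline
    from diffusers.utils import export_to_gif

    # Assumption: "openai/shap-e" is the text-to-3D checkpoint.
    pipe = ShapEPipeline.from_pretrained("openai/shap-e").to(device)
    generator = torch.Generator(device=device).manual_seed(0)

    # `.images` holds one list of rendered frames per generated sample.
    frames_per_sample = pipe(
        prompt,
        guidance_scale=15.0,  # illustrative value, not taken from this card
        num_inference_steps=64,
        frame_size=256,
        generator=generator,
    ).images

    out_path = Path(out_dir) / f"{slugify(prompt)}_3d.gif"
    return export_to_gif(frames_per_sample[0], str(out_path))
```

Calling `generate_from_text("a shark")` would download the checkpoint on first use and write `a_shark_3d.gif` to the current directory.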
Installation
First, ensure you have installed all the necessary dependencies:
pip install transformers accelerate -q
pip install git+https://github.com/huggingface/diffusers
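Before downloading any weights, it can help to confirm that the packages are importable. The check below is a small sketch using only the standard library. The `REQUIRED` tuple is an assumption based on the install commands and the usage code in this card (which also imports `torch`), and the helper names are illustrative.

```python
import importlib.util

# Assumption: these are the packages the usage example relies on.
REQUIRED = ("torch", "diffusers", "transformers", "accelerate")


def missing_packages(names=REQUIRED):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def report(names=REQUIRED):
    """Summarize the check as a single human-readable line."""
    missing = missing_packages(names)
    if missing:
        return "Missing packages: " + ", ".join(missing)
    return "All required packages are importable."
```

Running `print(report())` in the target environment shows which packages, if any, still need to be installed.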
Usage Examples
Basic Usage
Once the dependencies are installed, you can use the following code to generate 3D images from a 2D image:
import torch
from diffusers import ShapEImg2ImgPipeline
from diffusers.utils import export_to_gif, load_image
# Load the image-conditioned checkpoint and move it to the GPU.
ckpt_id = "openai/shap-e-img2img"
pipe = ShapEImg2ImgPipeline.from_pretrained(ckpt_id).to("cuda")
img_url = "https://hf.co/datasets/diffusers/docs-images/resolve/main/shap-e/corgi.png"
image = load_image(img_url)
generator = torch.Generator(device="cuda").manual_seed(0)
batch_size = 4
guidance_scale = 3.0
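images = pipe(
    image,
    num_images_per_prompt=batch_size,
    generator=generator,
    guidance_scale=guidance_scale,
    num_inference_steps=64,
    frame_size=256,  # released diffusers versions name this argument `frame_size`
    output_type="pil",
).images
# `images` holds one list of rendered frames per sample; export the first sample as a GIF.
gif_path = export_to_gif(images[0], "corgi_sampled_3d.gif")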
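Because Shap-E outputs can also be decoded as meshes, the same image-to-3D call can write mesh files instead of rendered frames. The sketch below assumes a released diffusers version in which the pipeline accepts `output_type="mesh"` and `diffusers.utils` provides `export_to_ply`. The helper names `sample_paths` and `export_meshes` and the `corgi` file stem are illustrative.

```python
from pathlib import Path


def sample_paths(stem: str, count: int, suffix: str, out_dir: str = "."):
    """Return one output path per sample, e.g. corgi_0.ply, corgi_1.ply."""
    return [Path(out_dir) / f"{stem}_{i}{suffix}" for i in range(count)]


def export_meshes(image_url: str, batch_size: int = 2, out_dir: str = ".", device: str = "cuda"):
    """Sample `batch_size` meshes from one reference image and save each as a PLY file."""
    # Heavy imports stay inside the function so the helper above imports cheaply.
    import torch
    from diffusers import ShapEImg2ImgPipeline
    from diffusers.utils import export_to_ply, load_image

    pipe = ShapEImg2ImgPipeline.from_pretrained("openai/shap-e-img2img").to(device)

    # With output_type="mesh", `.images` holds one mesh object per sample.
    meshes = pipe(
        load_image(image_url),
        num_images_per_prompt=batch_size,
        guidance_scale=3.0,
        num_inference_steps=64,
        frame_size=256,
        output_type="mesh",
        generator=torch.Generator(device=device).manual_seed(0),
    ).images

    paths = sample_paths("corgi", len(meshes), ".ply", out_dir)
    return [export_to_ply(mesh, str(path)) for mesh, path in zip(meshes, paths)]
```

The resulting PLY files can be opened in common 3D tools; keeping one file per sample makes it easy to compare the variations produced by a single seed.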
Documentation
Released checkpoints
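The authors released the following checkpoints:
- openai/shap-e: generates a 3D image from a text prompt
- openai/shap-e-img2img: samples a 3D image from a synthetic 2D image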
Results
| Reference | Sampled 3D Image (One) | Sampled 3D Image (Two) |
| --- | --- | --- |
| Reference corgi image in 2D | Sampled image in 3D (one) | Sampled image in 3D (two) |
Training details
Refer to the original paper.
Known limitations and potential biases
Refer to the original model card.
License
This project is licensed under the MIT license.
Citation
@misc{jun2023shape,
title={Shap-E: Generating Conditional 3D Implicit Functions},
author={Heewoo Jun and Alex Nichol},
year={2023},
eprint={2305.02463},
archivePrefix={arXiv},
primaryClass={cs.CV}
}