Shap-e-img2img Open-source 3D Image Generation Model - Generate 3D Resources for Free Based on Text or 2D Images

Shap E Img2img

Developed by OpenAI

Shap-E is a diffusion-based 3D image generation model capable of generating 3D assets from text prompts or 2D images.

Category: 3D Vision. License: MIT (open source). Tags: #Text-to-3D #Multi-representation Output #Implicit Function Rendering

Downloads 380

Release Time: 7/4/2023

Model Overview

Shap-E is a conditional generative model that directly generates parameters of implicit functions, which can be rendered as textured meshes and neural radiance fields. It supports 3D content generation from text or images.

Model Features

Multi-representation Output

Directly generates parameters of implicit functions that can be rendered as textured meshes and neural radiance fields.

Fast Generation

Can generate complex and diverse 3D assets in seconds.

Two-stage Training

First trains an encoder to map 3D assets to implicit function parameters, then trains a conditional diffusion model.
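
To make "generating the parameters of an implicit function" concrete, here is a toy sketch. It is not Shap-E's actual architecture: a flat parameter vector is unpacked into the weights of a tiny MLP that maps a 3D point to a density and an RGB color. All names and sizes (`HIDDEN`, `SHAPES`, `unpack`, `query`) are illustrative assumptions, and the parameter vector is random here, whereas in Shap-E it would come from the conditional diffusion model.

```python
import numpy as np

HIDDEN = 16  # hypothetical hidden width, chosen only for illustration

# Shapes of the toy MLP: 3D point -> hidden layer -> (density, r, g, b)
SHAPES = [(3, HIDDEN), (HIDDEN,), (HIDDEN, 4), (4,)]
NUM_PARAMS = sum(int(np.prod(shape)) for shape in SHAPES)


def unpack(params):
    """Split a flat parameter vector into the MLP's weight arrays."""
    arrays, start = [], 0
    for shape in SHAPES:
        size = int(np.prod(shape))
        arrays.append(params[start:start + size].reshape(shape))
        start += size
    return arrays


def query(params, points):
    """Evaluate the implicit function at an (N, 3) array of points."""
    w1, b1, w2, b2 = unpack(params)
    hidden = np.maximum(points @ w1 + b1, 0.0)   # ReLU layer
    out = hidden @ w2 + b2
    density = np.log1p(np.exp(out[:, 0]))        # softplus keeps density >= 0
    color = 1.0 / (1.0 + np.exp(-out[:, 1:]))    # sigmoid keeps RGB in [0, 1]
    return density, color


rng = np.random.default_rng(0)
params = rng.normal(size=NUM_PARAMS)             # stand-in for a generated latent
points = rng.uniform(-1.0, 1.0, size=(5, 3))     # five query points in the unit cube
density, color = query(params, points)
print(density.shape, color.shape)
```

Because the function can be queried at any point, a renderer can sample it along camera rays or on a grid, which is the general reason one set of parameters can back more than one output representation.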

Model Capabilities

Text-to-3D

Image-to-3D

Generate Textured Meshes

Generate Neural Radiance Fields

Use Cases

3D Content Creation

Text-to-3D Model Generation

Quickly generates 3D model assets from text prompts.

Can generate complex and diverse 3D assets

2D Image-to-3D Model Conversion

Converts 2D images into 3D models.

The example converts a corgi image into a 3D model.

🚀 Shap-E

Shap-E is a diffusion-based model that generates 3D assets from text prompts or 2D images. It was introduced in the paper "Shap-E: Generating Conditional 3D Implicit Functions" by Heewoo Jun and Alex Nichol of OpenAI.

The original repository of Shap-E can be accessed here: https://github.com/openai/shap-e.

The authors of Shap-E did not write this model card. They provide a separate model card in the original repository.

🚀 Quick Start

Shap-E generates 3D assets from text prompts or 2D images. To get started, install the dependencies listed under Installation, then run the example under Usage Examples.

✨ Features

Text-to-3D Generation: Generate 3D images from text prompts.
Image-to-3D Generation: Sample 3D images from synthetic 2D images.
Multiple Output Representations: Directly generate the parameters of implicit functions that can be rendered as both textured meshes and neural radiance fields.
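
As a sketch of how the mesh representation might be requested through diffusers, the helper below passes `output_type="mesh"` to the pipeline and writes the first sample with `export_to_ply`. It assumes a recent diffusers release and that `pipe` is an already loaded `ShapEImg2ImgPipeline`. The helper name `image_to_ply` and its `export_fn` parameter are hypothetical, introduced here so the function can be exercised without loading the model.

```python
def image_to_ply(pipe, image, out_path="corgi.ply", export_fn=None):
    """Sample one mesh from `image` and write it to `out_path` as a PLY file.

    `pipe` is assumed to be a loaded ShapEImg2ImgPipeline. `export_fn` is a
    hypothetical hook for testing; it defaults to diffusers' export_to_ply.
    """
    if export_fn is None:
        from diffusers.utils import export_to_ply  # imported lazily
        export_fn = export_to_ply

    meshes = pipe(
        image,
        guidance_scale=3.0,
        num_inference_steps=64,
        output_type="mesh",  # request mesh output instead of rendered frames
    ).images
    # Export only the first sampled mesh.
    return export_fn(meshes[0], out_path)
```

The resulting PLY file can be opened in common 3D tools for inspection or conversion.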

📦 Installation

First, ensure you have installed all the necessary dependencies:

pip install transformers accelerate -q
pip install git+https://github.com/huggingface/diffusers
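
Before running the example, it can help to confirm that the packages are importable. The following is a minimal sketch using only the standard library; the `REQUIRED` list and the `missing_packages` helper are assumptions made for illustration (`torch` is included because the usage example imports it).

```python
import importlib.util

# Packages the usage example relies on (an assumed list, adjust as needed).
REQUIRED = ["torch", "transformers", "accelerate", "diffusers"]


def missing_packages(names):
    """Return the names that cannot be found on the import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages(REQUIRED)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are importable.")
```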

💻 Usage Examples

Basic Usage

Once the dependencies are installed, you can use the following code to generate 3D images from a 2D image:

import torch
from diffusers import ShapEImg2ImgPipeline
from diffusers.utils import export_to_gif, load_image

# Load the image-conditioned Shap-E checkpoint onto the GPU.
ckpt_id = "openai/shap-e-img2img"
pipe = ShapEImg2ImgPipeline.from_pretrained(ckpt_id).to("cuda")

# Reference 2D image used as the conditioning input.
img_url = "https://hf.co/datasets/diffusers/docs-images/resolve/main/shap-e/corgi.png"
image = load_image(img_url)

# Fix the seed so the samples are reproducible.
generator = torch.Generator(device="cuda").manual_seed(0)
batch_size = 4
guidance_scale = 3.0

# Each entry of `images` is the list of rendered frames for one sample.
images = pipe(
    image,
    num_images_per_prompt=batch_size,
    generator=generator,
    guidance_scale=guidance_scale,
    num_inference_steps=64,
    frame_size=256,
    output_type="pil",
).images

# Save the frames of the first sample as an animated GIF.
gif_path = export_to_gif(images[0], "corgi_sampled_3d.gif")
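
The `guidance_scale` argument controls classifier-free guidance. As a rough sketch of the standard formulation (assumed here, not copied from the Shap-E source), each denoising step blends an unconditional prediction with an image-conditioned one. The function name `apply_guidance` and the two-element arrays are hypothetical stand-ins for the real model outputs.

```python
import numpy as np


def apply_guidance(pred_uncond, pred_cond, guidance_scale):
    """Standard classifier-free guidance: move the prediction away from the
    unconditional output and toward the conditioned one."""
    return pred_uncond + guidance_scale * (pred_cond - pred_uncond)


# Toy predictions standing in for the denoiser's outputs.
pred_uncond = np.array([0.0, 1.0])
pred_cond = np.array([1.0, 1.0])

print(apply_guidance(pred_uncond, pred_cond, 1.0))  # scale 1.0 returns pred_cond
print(apply_guidance(pred_uncond, pred_cond, 3.0))  # conditional direction amplified
```

Higher values generally make samples follow the input image more closely, usually at some cost in diversity.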

📚 Documentation

Released checkpoints

The authors released the following checkpoints:

openai/shap-e: produces a 3D image from a text input prompt
openai/shap-e-img2img: samples a 3D image from a synthetic 2D image
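
Choosing between the two checkpoints can be expressed as a small lookup. This is only a sketch: the checkpoint IDs come from the list above, the pipeline class names are the ones diffusers exposes for Shap-E (assumed to match the installed version), and `CHECKPOINTS` and `select_checkpoint` are hypothetical names.

```python
# Map the kind of conditioning input to (checkpoint ID, pipeline class name).
CHECKPOINTS = {
    "text": ("openai/shap-e", "ShapEPipeline"),
    "image": ("openai/shap-e-img2img", "ShapEImg2ImgPipeline"),
}


def select_checkpoint(input_kind):
    """Return (checkpoint_id, pipeline_class_name) for 'text' or 'image'."""
    try:
        return CHECKPOINTS[input_kind]
    except KeyError:
        raise ValueError(
            f"input_kind must be one of {sorted(CHECKPOINTS)}"
        ) from None


ckpt_id, pipeline_name = select_checkpoint("image")
print(ckpt_id, pipeline_name)
```

The class can then be looked up with `getattr(diffusers, pipeline_name)` and loaded with `from_pretrained(ckpt_id)`.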

Results

[Figure: the reference 2D corgi image, followed by two sampled 3D renderings]

Training details

Refer to the original paper.

Known limitations and potential biases

Refer to the original model card.

📄 License

This project is licensed under the MIT license.

📖 Citation

@misc{jun2023shape,
      title={Shap-E: Generating Conditional 3D Implicit Functions}, 
      author={Heewoo Jun and Alex Nichol},
      year={2023},
      eprint={2305.02463},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}
