openlrm-mix-base-1.1: Open-Source Model for Generating 3D Models from a Single Image

Openlrm Mix Base 1.1

Developed by zxhezexin

OpenLRM is an open-source implementation of the LRM paper that generates a 3D model from a single image. It is released in several model sizes.

3D Vision

Transformers

#Single Image to 3D #Tri-plane Decoding #DINOv2 Encoding

Downloads 10.25k

Release Time: 3/4/2024

Model Overview

OpenLRM is a deep learning model for image-to-3D conversion, capable of reconstructing high-quality 3D objects from a single input image. The series includes multiple versions based on different training data and architecture scales.

Model Features

Multi-scale Options

Offers small, base, and large variants to fit different compute budgets

Dual-data Training

The "mix" variants are trained on a mixture of Objaverse and MVImgNet to improve generalization; the "obj" variants are trained on Objaverse alone

Efficient 3D Representation

Uses tri-plane representation for efficient 3D object encoding and decoding

DINOv2 Image Encoding

Employs the DINOv2 model with register tokens as the image encoder to improve feature extraction
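
To make the tri-plane idea above concrete, here is a minimal pure-Python sketch of how a 3D point can be turned into a feature vector: the point is projected onto three axis-aligned feature planes (XY, XZ, YZ), each plane is sampled bilinearly, and the three results are combined. The function names, the [-1, 1] coordinate convention, and the use of concatenation to combine the planes are assumptions made for illustration; they are not taken from the OpenLRM code.

```python
from typing import List, Sequence

# A feature plane stored as plane[row][col] -> list of channel values.
Plane = List[List[List[float]]]


def bilinear(plane: Plane, u: float, v: float) -> List[float]:
    """Bilinearly interpolate `plane` at (u, v), both clamped to [-1, 1]."""
    h, w = len(plane), len(plane[0])
    u = min(max(u, -1.0), 1.0)
    v = min(max(v, -1.0), 1.0)
    # Map [-1, 1] onto continuous pixel coordinates [0, size - 1].
    x = (u + 1.0) * 0.5 * (w - 1)
    y = (v + 1.0) * 0.5 * (h - 1)
    x0, y0 = int(x), int(y)
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    tx, ty = x - x0, y - y0
    out = []
    for c in range(len(plane[0][0])):
        top = plane[y0][x0][c] * (1 - tx) + plane[y0][x1][c] * tx
        bottom = plane[y1][x0][c] * (1 - tx) + plane[y1][x1][c] * tx
        out.append(top * (1 - ty) + bottom * ty)
    return out


def sample_triplane(xy: Plane, xz: Plane, yz: Plane,
                    point: Sequence[float]) -> List[float]:
    """Return the per-point feature built from the three planes."""
    x, y, z = point
    # Project the point onto each plane by dropping one coordinate.
    # Concatenating the three samples is an assumption of this sketch.
    return bilinear(xy, x, y) + bilinear(xz, x, z) + bilinear(yz, y, z)
```

The appeal of this layout is that storage grows with the square of the plane resolution rather than the cube, while any continuous 3D location can still be queried.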

Model Capabilities

Single-view 3D Reconstruction

3D Object Generation

Multi-view Rendering

Use Cases

3D Content Creation

Rapid 3D Modeling

Generate 3D models quickly from a single product photo

Useful for e-commerce displays or AR applications

Game Asset Creation

Convert concept art into 3D game assets

Accelerates game development workflows

Education & Research

3D Reconstruction Research

Serves as a benchmark model in the field of 3D reconstruction

Useful for algorithm comparison and improvement

🚀 Model Card for OpenLRM V1.1

This model card provides detailed information about OpenLRM V1.1, an open-source implementation of the LRM paper.

🚀 Quick Start

The information presented here corresponds to Version 1.1 of OpenLRM.

✨ Features

Image-to-3D Pipeline: Converts a single input image into a 3D model.
Multiple Model Variants: Small, base, and large variants are available, each trained either on Objaverse alone or on Objaverse plus MVImgNet.

📚 Documentation

Model Details

Training data

Model	Training Data
[openlrm-obj-small-1.1](https://huggingface.co/zxhezexin/openlrm-obj-small-1.1)	Objaverse
[openlrm-obj-base-1.1](https://huggingface.co/zxhezexin/openlrm-obj-base-1.1)	Objaverse
[openlrm-obj-large-1.1](https://huggingface.co/zxhezexin/openlrm-obj-large-1.1)	Objaverse
[openlrm-mix-small-1.1](https://huggingface.co/zxhezexin/openlrm-mix-small-1.1)	Objaverse + MVImgNet
[openlrm-mix-base-1.1](https://huggingface.co/zxhezexin/openlrm-mix-base-1.1)	Objaverse + MVImgNet
[openlrm-mix-large-1.1](https://huggingface.co/zxhezexin/openlrm-mix-large-1.1)	Objaverse + MVImgNet
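
The six checkpoints follow one naming pattern, `openlrm-{data}-{scale}-1.1` under the `zxhezexin` account, so a repository ID can be built from the training data and the scale. The helper below is a small sketch of that pattern; the function and constant names are hypothetical and not part of any OpenLRM tooling.

```python
# Training data behind each prefix, as listed in the table above.
DATASETS = {"obj": "Objaverse", "mix": "Objaverse + MVImgNet"}
SCALES = ("small", "base", "large")


def repo_id(data: str, scale: str, version: str = "1.1") -> str:
    """Build the Hugging Face repository ID for one OpenLRM checkpoint."""
    if data not in DATASETS:
        raise ValueError(f"unknown data prefix {data!r}; expected one of {sorted(DATASETS)}")
    if scale not in SCALES:
        raise ValueError(f"unknown scale {scale!r}; expected one of {SCALES}")
    return f"zxhezexin/openlrm-{data}-{scale}-{version}"


# Enumerate all six checkpoints together with their training data.
ALL_CHECKPOINTS = {repo_id(d, s): DATASETS[d] for d in DATASETS for s in SCALES}
```

The resulting string is the kind of value a downloader such as `huggingface_hub.snapshot_download(repo_id=...)` expects, should you want to fetch the weights.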

Model architecture (version==1.1)

Type	Layers	Feat. Dim	Attn. Heads	Triplane Dim.	Input Res.	Image Encoder	Size
small	12	512	8	32	224	dinov2_vits14_reg	446M
base	12	768	12	48	336	dinov2_vitb14_reg	1.04G
large	16	1024	16	80	448	dinov2_vitb14_reg	1.81G
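
Two derived quantities follow directly from the architecture figures above: the per-head attention width (Feat. Dim divided by Attn. Heads) and the number of image patches the encoder sees (the "14" in the encoder names is the ViT patch size, so a 336-pixel input yields a 24 x 24 grid). The sketch below encodes the table as data and computes both; the class and field names are hypothetical, and the patch count deliberately excludes any class or register tokens, since how those are handled is not stated here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArchSpec:
    layers: int
    feat_dim: int
    attn_heads: int
    triplane_dim: int
    input_res: int
    image_encoder: str

    @property
    def head_dim(self) -> int:
        """Width of each attention head."""
        return self.feat_dim // self.attn_heads

    def patch_tokens(self, patch_size: int = 14) -> int:
        """Number of image patches for a square input (patch tokens only)."""
        side = self.input_res // patch_size
        return side * side


# Values copied from the architecture table for version 1.1.
ARCH = {
    "small": ArchSpec(12, 512, 8, 32, 224, "dinov2_vits14_reg"),
    "base": ArchSpec(12, 768, 12, 48, 336, "dinov2_vitb14_reg"),
    "large": ArchSpec(16, 1024, 16, 80, 448, "dinov2_vitb14_reg"),
}

for name, spec in ARCH.items():
    print(name, spec.head_dim, spec.patch_tokens())
```

All three variants work out to 64-wide attention heads, while the patch count grows from 256 to 576 to 1024 as the input resolution increases.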

Training settings

Type	Rend. Res.	Rend. Patch	Ray Samples
small	192	64	96
base	288	96	96
large	384	128	128
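
To give a feel for the rendering cost these settings imply, the sketch below computes the number of rays and sampled points per supervised patch. It assumes that "Rend. Patch" is the side length of a square crop of the rendered image used for supervision and that "Ray Samples" is the number of points sampled along each ray; both readings are interpretations of the column names, not confirmed details of the training code.

```python
# (rend_res, rend_patch, ray_samples) copied from the training settings table.
TRAIN = {
    "small": (192, 64, 96),
    "base": (288, 96, 96),
    "large": (384, 128, 128),
}


def render_budget(scale: str) -> dict:
    """Rays and point samples for one supervised patch of one view."""
    res, patch, samples = TRAIN[scale]
    rays = patch * patch                   # one ray per patch pixel
    return {
        "rays": rays,
        "points": rays * samples,          # decoder queries per patch
        "patch_fraction": rays / (res * res),
    }


for scale in TRAIN:
    print(scale, render_budget(scale))
```

Under these assumptions every variant supervises a patch covering one ninth of the full render, and the point count per patch rises from about 0.39 million (small) to about 2.1 million (large).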

Notable Differences from the Original Paper

We do not use the deferred back-propagation technique from the original paper.
We use random background colors during training.
The image encoder is based on the DINOv2 model with register tokens.
The triplane decoder contains 4 layers in our implementation.
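
As an illustration of the random-background point above, the sketch below alpha-composites a foreground pixel over a randomly drawn background color using the standard "over" formula, out = alpha * fg + (1 - alpha) * bg. How OpenLRM actually samples the color (per image, per batch, or from a fixed palette) is not stated here, so the uniform per-call sampling and all names below are assumptions.

```python
import random
from typing import Tuple

RGB = Tuple[float, float, float]


def random_background(rng: random.Random) -> RGB:
    """Draw a background color uniformly from [0, 1)^3 (an assumption)."""
    return (rng.random(), rng.random(), rng.random())


def composite(fg: RGB, alpha: float, bg: RGB) -> RGB:
    """Blend a foreground pixel over a background with opacity `alpha`."""
    r, g, b = (alpha * f + (1.0 - alpha) * k for f, k in zip(fg, bg))
    return (r, g, b)


rng = random.Random(0)          # seeded so the example is repeatable
bg = random_background(rng)
opaque = composite((1.0, 0.0, 0.0), 1.0, bg)   # fully opaque: foreground wins
empty = composite((1.0, 0.0, 0.0), 0.0, bg)    # fully transparent: background shows
```

Varying the background in this way discourages a model from relying on a fixed white or black backdrop to separate the object from empty space.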

📄 License

The model weights are released under the Creative Commons Attribution-NonCommercial 4.0 International License.
They are provided for research purposes only, and CANNOT be used commercially.

Disclaimer

This model is an open-source implementation and is NOT the official release of the original research paper. While it aims to reproduce the original results as faithfully as possible, there may be variations due to model implementation, training data, and other factors.

Ethical Considerations

⚠️ Important Note

This model should be used responsibly and ethically, and should not be used for malicious purposes. Users should be aware of potential biases in the training data, and the model should not be used in circumstances that could lead to harm or unfair treatment of individuals or groups.

Usage Considerations

💡 Usage Tip

The model is provided "as is" without warranty of any kind. Users are responsible for ensuring that their use complies with all relevant laws and regulations. The developers and contributors of this model are not liable for any damages or losses arising from the use of this model.

This model card is subject to updates and modifications. Users are advised to check for the latest version regularly.
