Model Card for OpenLRM V1.1
This model card describes the OpenLRM project, an open-source implementation of the LRM paper. The information here corresponds to Version 1.1.
Documentation
Overview
- This model card is for the OpenLRM project, an open-source implementation of the LRM paper.
- The information in this model card corresponds to Version 1.1.
Model Details
Training data
| Model | Training Data |
| --- | --- |
| [openlrm-obj-small-1.1](https://huggingface.co/zxhezexin/openlrm-obj-small-1.1) | Objaverse |
| [openlrm-obj-base-1.1](https://huggingface.co/zxhezexin/openlrm-obj-base-1.1) | Objaverse |
| [openlrm-obj-large-1.1](https://huggingface.co/zxhezexin/openlrm-obj-large-1.1) | Objaverse |
| [openlrm-mix-small-1.1](https://huggingface.co/zxhezexin/openlrm-mix-small-1.1) | Objaverse + MVImgNet |
| [openlrm-mix-base-1.1](https://huggingface.co/zxhezexin/openlrm-mix-base-1.1) | Objaverse + MVImgNet |
| [openlrm-mix-large-1.1](https://huggingface.co/zxhezexin/openlrm-mix-large-1.1) | Objaverse + MVImgNet |

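
The six checkpoint names follow a regular pattern: a training-data tag (`obj` or `mix`), a size, and the version. The sketch below builds the repository id from those parts. The helper names `repo_id` and `download_checkpoint` are hypothetical and not part of OpenLRM. The download step assumes the third-party `huggingface_hub` package is installed and only fetches the files; it does not load the model.

```python
"""Sketch: map (training data, size) to an OpenLRM V1.1 checkpoint repository."""

HF_NAMESPACE = "zxhezexin"
VERSION = "1.1"

# Tag used in the repository name -> training data, as listed in the table.
TRAINING_DATA = {
    "obj": "Objaverse",
    "mix": "Objaverse + MVImgNet",
}
SIZES = ("small", "base", "large")


def repo_id(data: str, size: str) -> str:
    """Return the Hugging Face repository id for one of the six checkpoints."""
    if data not in TRAINING_DATA:
        raise ValueError(f"unknown training data tag: {data!r}")
    if size not in SIZES:
        raise ValueError(f"unknown model size: {size!r}")
    return f"{HF_NAMESPACE}/openlrm-{data}-{size}-{VERSION}"


def download_checkpoint(data: str, size: str) -> str:
    """Download the repository snapshot and return its local directory.

    Assumption: `huggingface_hub` is installed. It is imported lazily so that
    `repo_id` stays usable without it.
    """
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_id(data, size))


if __name__ == "__main__":
    # List every checkpoint together with its training data.
    for data, description in TRAINING_DATA.items():
        for size in SIZES:
            print(f"{repo_id(data, size)}: {description}")
```

Calling `download_checkpoint("mix", "base")` would fetch the `openlrm-mix-base-1.1` repository into the local Hugging Face cache.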
Model architecture (version==1.1)
| Type | Layers | Feat. Dim | Attn. Heads | Triplane Dim. | Input Res. | Image Encoder | Size |
| --- | --- | --- | --- | --- | --- | --- | --- |
| small | 12 | 512 | 8 | 32 | 224 | dinov2_vits14_reg | 446M |
| base | 12 | 768 | 12 | 48 | 336 | dinov2_vitb14_reg | 1.04G |
| large | 16 | 1024 | 16 | 80 | 448 | dinov2_vitb14_reg | 1.81G |

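
A few derived quantities make the architecture table easier to compare across sizes. The sketch below records the table as data and computes the per-head attention width and the number of image patches seen by the encoder. It rests on two assumptions that the table does not state: that the feature dimension is split evenly across attention heads, and that the encoder patch size is 14 pixels (the `14` in the DINOv2 model names). The patch count excludes the class and register tokens. The `LRMArch` class is hypothetical and is not OpenLRM's own configuration object.

```python
from dataclasses import dataclass

# Assumption: patch size taken from the "14" in dinov2_vit*14_reg.
DINOV2_PATCH_SIZE = 14


@dataclass(frozen=True)
class LRMArch:
    """One row of the V1.1 architecture table (hypothetical container)."""

    layers: int
    feat_dim: int
    attn_heads: int
    triplane_dim: int
    input_res: int
    image_encoder: str

    @property
    def head_dim(self) -> int:
        # Assumes the usual even split of the feature dimension across heads.
        return self.feat_dim // self.attn_heads

    @property
    def patch_tokens(self) -> int:
        # Image patches only; class and register tokens are not counted.
        per_side = self.input_res // DINOV2_PATCH_SIZE
        return per_side * per_side


ARCHS = {
    "small": LRMArch(12, 512, 8, 32, 224, "dinov2_vits14_reg"),
    "base": LRMArch(12, 768, 12, 48, 336, "dinov2_vitb14_reg"),
    "large": LRMArch(16, 1024, 16, 80, 448, "dinov2_vitb14_reg"),
}

if __name__ == "__main__":
    for name, arch in ARCHS.items():
        print(f"{name}: head_dim={arch.head_dim}, patch_tokens={arch.patch_tokens}")
```

Under these assumptions every size uses 64-dimensional heads, and the number of image patches grows from 256 (small) to 576 (base) to 1024 (large).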
Training settings
| Type | Rend. Res. | Rend. Patch | Ray Samples |
| --- | --- | --- | --- |
| small | 192 | 64 | 96 |
| base | 288 | 96 | 96 |
| large | 384 | 128 | 128 |

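
The training settings translate into a per-view rendering budget. The sketch below works through that arithmetic under one reading of the table, which is an assumption: "Rend. Patch" is the side length of a square crop rendered for supervision out of a "Rend. Res." square image, with one ray per pixel and "Ray Samples" points evaluated along each ray. The function names are illustrative only.

```python
from fractions import Fraction

# Values copied from the training settings table.
TRAINING = {
    "small": {"rend_res": 192, "rend_patch": 64, "ray_samples": 96},
    "base": {"rend_res": 288, "rend_patch": 96, "ray_samples": 96},
    "large": {"rend_res": 384, "rend_patch": 128, "ray_samples": 128},
}


def rays_per_patch(cfg: dict) -> int:
    """One ray per pixel of the square rendered patch (assumed)."""
    return cfg["rend_patch"] ** 2


def samples_per_patch(cfg: dict) -> int:
    """Total points evaluated along all rays of one patch."""
    return rays_per_patch(cfg) * cfg["ray_samples"]


def patch_coverage(cfg: dict) -> Fraction:
    """Fraction of the full-resolution image covered by one patch."""
    return Fraction(cfg["rend_patch"] ** 2, cfg["rend_res"] ** 2)


if __name__ == "__main__":
    for name, cfg in TRAINING.items():
        print(
            f"{name}: rays={rays_per_patch(cfg)}, "
            f"samples={samples_per_patch(cfg)}, "
            f"coverage={patch_coverage(cfg)}"
        )
```

With this reading, each size renders a patch one third as wide as the full image, so a patch covers one ninth of the pixels, while the number of sampled points per patch grows from 393,216 (small) to 2,097,152 (large).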
Notable Differences from the Original Paper
- We do not use the deferred back-propagation technique described in the original paper.
- We use random background colors during training.
- The image encoder is based on the DINOv2 model with register tokens.
- The triplane decoder contains 4 layers in our implementation.
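
As an illustration of the random-background item above, the sketch below composites RGBA pixels over a randomly drawn solid color. This is not OpenLRM's training code. The uniform color sampling, the standard alpha-over formula, and the function names are all assumptions made for the example. Pixels are plain tuples of floats in [0, 1] to keep the example dependency-free.

```python
import random


def random_background(rng: random.Random) -> tuple:
    """Draw one RGB color uniformly from [0, 1)^3 (assumed sampling scheme)."""
    return (rng.random(), rng.random(), rng.random())


def composite_on_background(rgba_pixels, background):
    """Alpha-composite RGBA pixels over a solid RGB background.

    Each output channel is alpha * foreground + (1 - alpha) * background.
    """
    out = []
    for r, g, b, a in rgba_pixels:
        out.append(
            tuple(a * c + (1.0 - a) * bg for c, bg in zip((r, g, b), background))
        )
    return out


if __name__ == "__main__":
    rng = random.Random(0)  # seeded only to make the demo repeatable
    pixels = [
        (0.2, 0.4, 0.6, 1.0),  # opaque object pixel: kept as is
        (0.0, 0.0, 0.0, 0.0),  # empty pixel: replaced by the background
        (1.0, 1.0, 1.0, 0.5),  # half-transparent pixel: blended
    ]
    bg = random_background(rng)
    print(composite_on_background(pixels, bg))
```

Drawing a new background for each training sample means the empty region of a rendered view does not always have the same color.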
License
Disclaimer
This model is an open-source implementation and is NOT the official release of the original research paper. While it aims to reproduce the original results as faithfully as possible, there may be variations due to model implementation, training data, and other factors.
Ethical Considerations
- This model should be used responsibly and ethically, and should not be used for malicious purposes.
- Users should be aware of potential biases in the training data.
- The model should not be used in circumstances that could lead to harm or unfair treatment of individuals or groups.
Usage Considerations
- The model is provided "as is" without warranty of any kind.
- Users are responsible for ensuring that their use complies with all relevant laws and regulations.
- The developers and contributors of this model are not liable for any damages or losses arising from the use of this model.
This model card is subject to updates and modifications. Users are advised to check for the latest version regularly.