LVM Checkpoints
LVM is an innovative visual pretraining model that achieves large-scale visual learning by converting visual data into visual sentences and making predictions in an autoregressive manner.
Downloads: 247
Release date: June 13, 2024
Model Overview
LVM is a visual pretraining model that achieves large-scale visual learning by converting various visual data into visual sentences and predicting the next token in an autoregressive manner. The model is compatible with both GPU and TPU hardware platforms.
Model Features
Visual Sequence Modeling
Converts diverse visual data into visual sentence sequences for autoregressive next-token prediction
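The idea of a "visual sentence" can be sketched as follows: each image is mapped by a discrete tokenizer to a fixed-length sequence of codebook ids, and the per-image sequences are concatenated into one long token stream that a transformer can model left to right. This is a minimal illustration only; the token count, codebook size, and `tokenize_image` stand-in below are assumptions, not the actual LVM tokenizer (which is a learned VQ model).

```python
import numpy as np

rng = np.random.default_rng(0)

TOKENS_PER_IMAGE = 256   # assumption: each image maps to 256 discrete tokens
VOCAB_SIZE = 8192        # assumption: visual codebook size

def tokenize_image(image):
    """Stand-in for a learned VQ tokenizer: map an image to codebook ids.

    Real LVM uses a trained vector-quantized tokenizer; here we just
    subsample and hash pixel values so the example is self-contained.
    """
    flat = image.reshape(-1)
    idx = np.linspace(0, flat.size - 1, TOKENS_PER_IMAGE).astype(int)
    return ((flat[idx] * 255).astype(np.int64) * 2654435761) % VOCAB_SIZE

def build_visual_sentence(images):
    """Concatenate per-image token sequences into one 'visual sentence'."""
    return np.concatenate([tokenize_image(img) for img in images])

images = [rng.random((64, 64, 3)) for _ in range(4)]
sentence = build_visual_sentence(images)
print(sentence.shape)  # (1024,) -- 4 images x 256 tokens each
```

Once visual data is flattened into such a stream, training reduces to standard next-token prediction over the token vocabulary, the same objective used for language models.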
Large-scale Training
Trained on a deeply cleaned dataset of 1.2 billion images
Hardware Compatibility
Supports both GPU and TPU hardware platforms
Parameter Scale
This release provides a 7-billion-parameter version, larger than the 3-billion-parameter model described in the original paper
Model Capabilities
Image Sequence Modeling
Visual Token Prediction
Large-scale Visual Learning
Use Cases
Computer Vision
Visual Content Generation
Autoregressive prediction over visual sequences can be used for image generation tasks
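Generation with such a model is ordinary greedy autoregressive decoding: given a prefix of visual tokens (e.g. the tokens of a few context images), the model emits one token at a time until a full image's worth of tokens has been produced. The sketch below uses a dummy `next_token_logits` in place of the real transformer, and the token/vocabulary sizes are assumptions, so it only illustrates the decoding loop, not the model itself.

```python
import numpy as np

VOCAB_SIZE = 8192        # assumption: visual codebook size
TOKENS_PER_IMAGE = 256   # assumption: tokens needed for one generated image

def next_token_logits(prefix):
    """Stand-in for the LVM transformer forward pass.

    The real model returns logits over the visual codebook; this dummy
    derives deterministic pseudo-logits from the last token so the loop runs.
    """
    seed = int(prefix[-1]) if len(prefix) else 0
    return np.random.default_rng(seed).random(VOCAB_SIZE)

def generate_image_tokens(prefix, n_tokens=TOKENS_PER_IMAGE):
    """Greedy decoding: repeatedly append the argmax next token."""
    tokens = list(prefix)
    for _ in range(n_tokens):
        tokens.append(int(np.argmax(next_token_logits(tokens))))
    return np.array(tokens[len(prefix):])

rng = np.random.default_rng(42)
prompt = rng.integers(0, VOCAB_SIZE, size=3 * TOKENS_PER_IMAGE)  # 3 context images
new_tokens = generate_image_tokens(prompt)
print(new_tokens.shape)  # (256,) -- one image's worth of new tokens
```

The generated token sequence would then be passed through the tokenizer's decoder to reconstruct pixels; in practice, sampling (temperature or top-k) is usually preferred over pure argmax.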
Visual Understanding
The large-scale pretrained model can improve performance on a variety of visual understanding tasks