Vit Base Patch14 Reg4 Dinov2.lvd142m

Developed by timm
A Vision Transformer (ViT) image feature model with registers, pre-trained with the self-supervised DINOv2 method on the LVD-142M dataset.
Downloads 40.95k
Release Time: 10/30/2023

Model Overview

This model is an image feature extraction backbone based on the Vision Transformer (ViT) architecture, extended with register tokens. It is primarily used for image classification and feature extraction.

Model Features

Register Enhancement
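The model adds four learnable register tokens (the "reg4" in its name) to the input sequence. Registers give the transformer a place to hold global information, which is intended to reduce artifacts in its attention and feature maps.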
Self-supervised Pre-training
Pre-trained using the DINOv2 self-supervised learning method on the LVD-142M dataset.
Large Input Size Support
Supports large image inputs of 518×518 pixels, which the 14-pixel patch size divides into a 37×37 grid of patches.
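
The input size and the model name together determine how many tokens the transformer processes. As a rough sketch, assuming the name encodes a 14-pixel patch size ("patch14") and four register tokens ("reg4"), and assuming a single class token as in a standard ViT, the sequence length at 518×518 can be worked out as below. The function name vit_sequence_length is illustrative and not part of any library.

```python
def vit_sequence_length(image_size: int, patch_size: int,
                        num_registers: int, num_class_tokens: int = 1) -> int:
    """Count the tokens a ViT processes for a square input image."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be a multiple of patch_size")
    patches_per_side = image_size // patch_size
    num_patches = patches_per_side ** 2
    # Class token(s) and registers are extra tokens alongside the patch tokens.
    return num_class_tokens + num_registers + num_patches


# 518 / 14 = 37 patches per side, so 37 * 37 = 1369 patch tokens.
print(vit_sequence_length(518, 14, num_registers=4))  # 1 + 4 + 1369 = 1374
```

Because self-attention cost grows with the square of the sequence length, a 518×518 input is considerably more expensive than a 224×224 one at the same patch size.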

Model Capabilities

Image feature extraction
Image classification
Generating image embeddings
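
One common way to put image embeddings to work is similarity search: embed every image once, then rank a gallery by cosine similarity to a query. The sketch below illustrates only that comparison step. The 4-dimensional vectors are made-up toy values (real embeddings from a ViT-Base model have 768 dimensions), and the names cosine_similarity and rank_by_similarity are hypothetical helpers, not part of the model or of timm.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def rank_by_similarity(query, gallery):
    """Return gallery keys sorted from most to least similar to the query."""
    scores = {name: cosine_similarity(query, vec) for name, vec in gallery.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Toy embeddings, invented for illustration only.
query = [0.9, 0.1, 0.0, 0.2]
gallery = {
    "car_1": [0.0, 0.9, 0.4, 0.0],
    "cat_2": [0.8, 0.2, 0.1, 0.1],
}
print(rank_by_similarity(query, gallery))  # ['cat_2', 'car_1']
```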

Use Cases

Computer Vision
Image Classification
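Can be used for general image classification, typically by training a classifier head on top of the extracted features, since self-supervised pre-training uses no class labels.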
Feature Extraction
Can serve as a backbone network to provide feature representations for downstream vision tasks.
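
For the backbone use case, a minimal sketch of extracting one pooled embedding with timm might look like the following. It assumes timm, PyTorch, and Pillow are installed, that the timm identifier is the lower-case form of the model name, and that pretrained=True is able to download the weights. The helper embed_image and its image_path argument are hypothetical names chosen for this example.

```python
MODEL_NAME = "vit_base_patch14_reg4_dinov2.lvd142m"  # assumed timm identifier


def embed_image(image_path: str, pretrained: bool = True):
    """Return a pooled embedding tensor of shape (1, num_features) for one image."""
    # Imported lazily so that defining this helper does not itself
    # require the heavy dependencies to be installed.
    import timm
    import torch
    from PIL import Image

    # num_classes=0 drops any classifier head so the model returns features.
    model = timm.create_model(MODEL_NAME, pretrained=pretrained, num_classes=0)
    model.eval()

    # Build the preprocessing (resize, crop, normalize) from the model's own config.
    data_config = timm.data.resolve_model_data_config(model)
    transform = timm.data.create_transform(**data_config, is_training=False)

    image = Image.open(image_path).convert("RGB")
    batch = transform(image).unsqueeze(0)  # add a batch dimension

    with torch.no_grad():
        return model(batch)
```

Calling embed_image("photo.jpg") on a local file should give one vector per image, which a downstream classifier or retrieval index can then consume. For per-patch features rather than a pooled vector, timm models also expose model.forward_features, which returns the unpooled token sequence.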