Vit Small Patch14 Dinov2.lvd142m

Developed by timm
A Vision Transformer (ViT) image feature model, pre-trained with the self-supervised DINOv2 method on the LVD-142M dataset
Downloads 35.85k
Release Time: 5/9/2023

Model Overview

This is a small Vision Transformer (ViT) model intended for image feature extraction. It was pre-trained on the LVD-142M dataset with the DINOv2 self-supervised learning method, so it produces general-purpose image representations without relying on task-specific labels.

Model Features

Self-supervised learning
Uses the DINOv2 self-supervised learning method to learn image features without manual annotation
Efficient architecture
Small ViT architecture with a moderate parameter count (22.1M), which keeps computation costs low
Large-scale pre-training
Pre-trained on the large-scale LVD-142M dataset, so it learns broadly applicable visual features
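To make the compute cost concrete, the arithmetic below sketches how many tokens the Transformer processes per image. The patch size of 14 comes from the model name. The single extra class token and the example input sizes are assumptions, and `num_tokens` is a hypothetical helper.

```python
def num_tokens(image_size: int, patch_size: int = 14, extra_tokens: int = 1) -> int:
    """Sequence length for a square image: patches per side squared, plus extra tokens."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be a multiple of patch_size")
    per_side = image_size // patch_size
    return per_side * per_side + extra_tokens


# 224 / 14 = 16 patches per side -> 16 * 16 + 1 = 257 tokens
# 518 / 14 = 37 patches per side -> 37 * 37 + 1 = 1370 tokens
```

Self-attention cost grows roughly with the square of this sequence length, so the input resolution matters as much as the parameter count when estimating inference cost.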

Model Capabilities

Image feature extraction
Image classification
Visual representation learning

Use Cases

Computer vision
Image classification
Extracts features that a separate classifier (for example, a linear layer) can be trained on for image classification
Visual search
Extracts image features for similar-image retrieval
Downstream vision tasks
Serves as a pre-trained backbone that can be fine-tuned for various downstream vision tasks
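The visual search use case above can be sketched as a nearest-neighbour lookup over feature vectors. This is an illustrative sketch using only the standard library. It assumes the features have already been extracted by the model, and the helper names `cosine_similarity` and `top_k` are hypothetical. Cosine similarity is a common choice for such retrieval, not something this page prescribes.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two feature vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("zero vector has no direction")
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


def top_k(query: Sequence[float],
          gallery: Sequence[Sequence[float]],
          k: int = 3) -> List[Tuple[int, float]]:
    """Return (index, score) pairs for the k gallery vectors most similar to the query."""
    scores = [(i, cosine_similarity(query, g)) for i, g in enumerate(gallery)]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:k]
```

In practice, the gallery would hold one feature vector per indexed image, and larger collections would usually use a vectorized or approximate nearest-neighbour index instead of this linear scan.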