# Self-supervised visual features

Dinov2 Small ONNX
ONNX format version of DINOv2-small, suitable for vision tasks
Transformers
onnx-community
14
0
Vit Large Patch14 Dinov2.lvd142m
Apache-2.0
A vision Transformer (ViT)-based image feature model, pre-trained on the LVD-142M dataset using the self-supervised DINOv2 method.
Image Classification Transformers
pcuenq
18
0
Dinov2 With Registers Small Imagenet1k 1 Layer
Apache-2.0
A Vision Transformer model trained with DINOv2 and extended with register tokens, which reduce artifacts in the attention maps and improve performance.
Image Classification Transformers
facebook
445
2
Dinov2.giant.patch 14
Apache-2.0
DINOv2 is a visual feature extraction model developed by the Facebook Research team that learns strong image representations through self-supervised learning.
refiners
26
0
Dinov2.base.patch 14
Apache-2.0
DINOv2 is a self-supervised visual feature extraction model developed by Facebook Research, capable of generating robust visual feature representations.
refiners
18
0
Dinov2.small.patch 14
Apache-2.0
DINOv2 is a visual feature extraction model developed by Facebook Research that generates robust visual features without supervised learning.
refiners
23
0
Vit Small Patch14 Reg4 Dinov2.lvd142m
Apache-2.0
A Vision Transformer (ViT) image feature model with registers, pre-trained using the self-supervised DINOv2 method on the LVD-142M dataset.
Image Classification Transformers
timm
15.98k
5
Vit Large Patch14 Reg4 Dinov2.lvd142m
Apache-2.0
A Vision Transformer (ViT) image feature model with registers, pre-trained using the self-supervised DINOv2 method on the LVD-142M dataset.
Image Classification Transformers
timm
119.48k
7
Vit Giant Patch14 Reg4 Dinov2.lvd142m
Apache-2.0
A vision Transformer (ViT) image feature model with registers, pretrained using the self-supervised DINOv2 method on the LVD-142M dataset.
Image Classification Transformers
timm
917
1
Vit Base Patch14 Reg4 Dinov2.lvd142m
Apache-2.0
A Vision Transformer (ViT) image feature model with registers, pre-trained using the self-supervised DINOv2 method on the LVD-142M dataset.
Image Classification Transformers
timm
40.95k
10
Dinov2 Small Imagenet1k 1 Layer
Apache-2.0
A small vision Transformer model trained using the DINOv2 method, suitable for image feature extraction and classification tasks
Image Classification Transformers
facebook
50.86k
2
Dinov2 Small
Apache-2.0
A small-scale vision Transformer model trained using the DINOv2 method, extracting image features through self-supervised learning
Image Classification Transformers
facebook
5.0M
31
Dinov2 Giant
Apache-2.0
A vision Transformer model trained using the DINOv2 method for self-supervised image feature extraction
Image Classification Transformers
facebook
117.56k
41
Dinov2 Large
Apache-2.0
A vision Transformer model trained using the DINOv2 method, extracting robust visual features from massive image data through self-supervised learning
Image Classification Transformers
facebook
558.78k
79
Dinov2 Base
Apache-2.0
A Vision Transformer model trained using the DINOv2 method, extracting image features through self-supervised learning
Image Classification Transformers
facebook
1.9M
126
Vit Small Patch14 Dinov2.lvd142m
Apache-2.0
A Vision Transformer (ViT)-based image feature model pre-trained using the self-supervised DINOv2 method on the LVD-142M dataset
Image Classification Transformers
timm
35.85k
3
Vit Large Patch14 Dinov2.lvd142m
Apache-2.0
A self-supervised image feature model based on Vision Transformer (ViT), pre-trained using the DINOv2 method on the LVD-142M dataset, suitable for image classification and feature extraction tasks.
Image Classification Transformers
timm
32.01k
11
Vit Giant Patch14 Dinov2.lvd142m
Apache-2.0
A giant Vision Transformer (ViT)-based image feature extraction model, pre-trained using the self-supervised DINOv2 method on the LVD-142M dataset
Image Classification Transformers
timm
6,911
0
Vit Base Patch14 Dinov2.lvd142m
Apache-2.0
A Vision Transformer (ViT)-based image feature model, pre-trained using the self-supervised DINOv2 method on the LVD-142M dataset
Image Classification Transformers
timm
50.71k
4
Vit Large Patch16 224.mae
A large-scale image feature extraction model based on the Vision Transformer (ViT), pre-trained on the ImageNet-1k dataset using the self-supervised Masked Autoencoder (MAE) method
Image Classification Transformers
timm
960
1
Vit Base Patch16 224.mae
A Vision Transformer (ViT)-based image feature extraction model, pre-trained on the ImageNet-1k dataset using the self-supervised Masked Autoencoder (MAE) method
Image Classification Transformers
timm
23.63k
2