# Image Feature Extraction
Openvision Vit Base Patch8 160
Apache-2.0
A fully open, cost-effective advanced visual encoder from the OpenVision family, focused on multimodal learning.
Image Classification
Transformers

UCSC-VLAA
26
0
Openvision Vit Small Patch8 384
Apache-2.0
OpenVision is a fully open, cost-effective family of advanced vision encoders focused on multimodal learning.
Multimodal Fusion
UCSC-VLAA
21
0
Openvision Vit Small Patch16 224
Apache-2.0
OpenVision is a fully open, cost-effective family of advanced vision encoders focused on multimodal learning.
Image Enhancement
UCSC-VLAA
17
0
Openvision Vit Tiny Patch16 160
Apache-2.0
OpenVision is a fully open, cost-effective advanced visual encoder family focused on multimodal learning.
Multimodal Fusion
Transformers

UCSC-VLAA
30
0
Sam2 Hiera Tiny.fb R896 2pt1
Apache-2.0
SAM2 model based on the HieraDet image encoder, focusing on image feature extraction tasks.
Object Detection
Transformers

timm
37
0
Sam2 Hiera Small.fb R896
Apache-2.0
SAM2 model based on the HieraDet image encoder, focused on image feature extraction tasks.
Image Segmentation
Transformers

timm
142
0
Sam2 Hiera Base Plus.fb R896 2pt1
Apache-2.0
SAM2 model weights based on the HieraDet image encoder, focused on image feature extraction tasks.
Image Segmentation
Transformers

timm
148
0
Sam2 Hiera Base Plus.fb R896
Apache-2.0
SAM2 model based on the HieraDet image encoder, focused on image feature extraction tasks.
Image Segmentation
Transformers

timm
764
0
Mambavision T2 1K
Other
Described as the first hybrid computer vision model to combine the strengths of Mamba and Transformer. It redesigns the Mamba formulation to improve visual feature modeling and adds self-attention modules to the Mamba architecture to better capture long-range spatial dependencies.
Image Classification
Transformers

nvidia
597
4
Vit Base Patch16 224.orig In21k
Apache-2.0
An image classification model based on the Vision Transformer (ViT), pretrained on ImageNet-21k, suitable for feature extraction and fine-tuning.
Image Classification
Transformers

timm
23.07k
1
Eva02 Small Patch14 224.mim In22k
MIT
EVA02 feature/representation model, pretrained on ImageNet-22k via masked image modeling, suitable for image classification and feature extraction tasks.
Image Classification
Transformers

timm
705
0
Eva02 Base Patch14 224.mim In22k
MIT
EVA02 base-size visual representation model, pretrained on ImageNet-22k via masked image modeling, suitable for image classification and feature extraction tasks.
Image Classification
Transformers

timm
2,834
6
Face Discriminator 2
Apache-2.0
A face classification model fine-tuned from ResNet-50, achieving 94.16% accuracy on the evaluation set.
Image Classification
Transformers

petrznel
23
0
Google Vit Base Patch16 224 Cartoon Face Recognition
Apache-2.0
A cartoon face recognition model fine-tuned from the Google Vision Transformer (ViT) architecture for image classification tasks.
Face-related
Transformers

jayanta
62
2
Vit Small Patch8 224.dino
Apache-2.0
A self-supervised image feature extraction model based on the Vision Transformer (ViT), trained using the DINO method.
Image Classification
Transformers

timm
8,904
2
Vit Large Patch32 224.orig In21k
Apache-2.0
An image classification model based on the Vision Transformer (ViT) architecture, pretrained on the ImageNet-21k dataset, suitable for feature extraction and fine-tuning.
Image Classification
Transformers

timm
771
0
Vit Base Patch16 224.dino
Apache-2.0
A Vision Transformer (ViT) image feature model trained with the self-supervised DINO method, suitable for image classification and feature extraction tasks.
Image Classification
Transformers

timm
33.45k
5
Vit Base Patch8 224.dino
Apache-2.0
A Vision Transformer (ViT) image feature model trained with the self-supervised DINO method, suitable for image classification and feature extraction tasks.
Image Classification
Transformers

timm
9,287
1
Dino Resnet 50
A ResNet-50 model pretrained using the DINO self-supervised learning method, suitable for visual feature extraction tasks.
Image Classification
Transformers

Ramos-Ramos
106
0
Regnet Y 006
Apache-2.0
RegNet is an image classification model designed through neural architecture search, trained on the ImageNet-1k dataset.
Image Classification
Transformers

facebook
18
0
Regnet X 040
Apache-2.0
An efficient RegNet vision model designed via neural architecture search and trained on ImageNet-1k.
Image Classification
Transformers

facebook
69
1