
ViT-MSN Base

Developed by Facebook
A Vision Transformer model pre-trained with the Masked Siamese Networks (MSN) method, well suited to few-shot image classification tasks.
Downloads: 694
Release Time: 9/9/2022

Model Overview

This model is pre-trained with the Masked Siamese Networks (MSN) method to learn intrinsic image representations, making it particularly suitable for downstream tasks with only a few labeled samples.
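A minimal feature-extraction sketch, assuming the checkpoint is available on the Hugging Face Hub as facebook/vit-msn-base and using the transformers library's AutoImageProcessor and ViTMSNModel classes; the test image URL is illustrative:

```python
from PIL import Image
import requests
import torch
from transformers import AutoImageProcessor, ViTMSNModel

# Illustrative test image (any RGB image works).
url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)

processor = AutoImageProcessor.from_pretrained("facebook/vit-msn-base")
model = ViTMSNModel.from_pretrained("facebook/vit-msn-base")

inputs = processor(images=image, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# One embedding per token: the [CLS] token plus one token per 16x16 patch.
last_hidden_state = outputs.last_hidden_state
print(last_hidden_state.shape)  # e.g. torch.Size([1, 197, 768]) for a 224x224 input
```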

Model Features

Few-shot learning
Through MSN pre-training, the model achieves good performance even with very few labeled samples (a fine-tuning sketch follows this list)
Joint embedding architecture
Matches the prototypes of representations from masked image views to those of the unmasked view, yielding more robust features
Transformer-based
Adopts the Vision Transformer architecture, processing each input image as a sequence of fixed-size patches
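A hedged few-shot fine-tuning sketch, assuming the transformers ViTMSNForImageClassification class; the 5-way label count and the dummy batch are illustrative placeholders, and the classification head is randomly initialized on top of the pre-trained backbone:

```python
import torch
from transformers import ViTMSNForImageClassification

# Attach a fresh classification head for a hypothetical 5-way few-shot task.
model = ViTMSNForImageClassification.from_pretrained(
    "facebook/vit-msn-base",
    num_labels=5,
)

# Dummy batch of two 224x224 RGB images, just to show the output shape;
# in practice, prepare real images with AutoImageProcessor.
dummy_pixels = torch.randn(2, 3, 224, 224)
with torch.no_grad():
    logits = model(pixel_values=dummy_pixels).logits
print(logits.shape)  # torch.Size([2, 5])
```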

Model Capabilities

Image feature extraction
Few-shot image classification

Use Cases

Computer vision
Image classification
Performs image classification tasks with limited labeled data
Excels in few-shot and very few-shot scenarios
Feature extraction
Serves as a backbone network that extracts image features for downstream tasks (see the linear-probe sketch below)
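A linear-probe sketch of using the model as a frozen backbone for few-shot classification; the tiny synthetic support set and the scikit-learn classifier are illustrative assumptions, not part of the released model:

```python
import numpy as np
import torch
from PIL import Image
from sklearn.linear_model import LogisticRegression
from transformers import AutoImageProcessor, ViTMSNModel

processor = AutoImageProcessor.from_pretrained("facebook/vit-msn-base")
backbone = ViTMSNModel.from_pretrained("facebook/vit-msn-base").eval()

def embed(pil_images):
    """Return one [CLS] embedding per image as a NumPy array."""
    inputs = processor(images=pil_images, return_tensors="pt")
    with torch.no_grad():
        hidden = backbone(**inputs).last_hidden_state
    return hidden[:, 0].numpy()  # [CLS] token embedding

# Tiny synthetic "support set" so the sketch runs end to end; in practice these
# would be a handful of labeled real images per class.
images = [Image.fromarray(np.uint8(np.random.rand(224, 224, 3) * 255)) for _ in range(6)]
labels = [0, 0, 1, 1, 2, 2]

clf = LogisticRegression(max_iter=1000).fit(embed(images), labels)
# New images are classified with clf.predict(embed(new_images)).
```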