
ViT-MSN Large

Developed by Facebook
A Vision Transformer pretrained with the Masked Siamese Networks (MSN) method; it excels in few-shot scenarios.
Downloads: 48
Release date: 9/9/2022

Model Overview

This Vision Transformer model is pretrained with the Masked Siamese Networks (MSN) method. It is particularly well suited to image classification tasks with limited labeled data: pretraining learns intrinsic image representations that transfer to downstream tasks.
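A minimal sketch of using the model as a feature extractor, assuming the Hugging Face `transformers` library. The tiny config sizes here are illustrative so the snippet runs without downloading weights; in practice you would load the pretrained checkpoint with `ViTMSNModel.from_pretrained("facebook/vit-msn-large")`.

```python
import torch
from transformers import ViTMSNConfig, ViTMSNModel

# Illustrative tiny config (not the real vit-msn-large sizes); in practice:
#   model = ViTMSNModel.from_pretrained("facebook/vit-msn-large")
config = ViTMSNConfig(
    hidden_size=64,
    num_hidden_layers=2,
    num_attention_heads=4,
    intermediate_size=128,
    image_size=32,
    patch_size=8,
)
model = ViTMSNModel(config)
model.eval()

pixel_values = torch.randn(1, 3, 32, 32)  # one fake RGB image
with torch.no_grad():
    outputs = model(pixel_values=pixel_values)

# Per-token features: [CLS] + (32/8)^2 = 16 patch tokens, hidden size 64
features = outputs.last_hidden_state
```

The `last_hidden_state` tensor (or just its `[CLS]` token) can then be fed to a lightweight classifier for downstream tasks.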

Model Features

Few-shot Learning Capability
Maintains strong performance even when labeled data is scarce, thanks to the MSN pretraining method
Joint Embedding Architecture
Trains by matching the representation of a masked image view to that of the unmasked view via a set of learnable prototypes
Transfer Learning Friendly
The pretrained representations transfer readily to a variety of downstream vision tasks
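The joint-embedding idea above can be sketched in a few lines of PyTorch. This is an assumed simplification of MSN's matching step, not the paper's full training loop: embeddings of a masked view and an unmasked (anchor) view are each softly assigned to learnable prototypes via cosine similarity, and a cross-entropy between the two assignments serves as the matching loss.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Learnable prototypes (random here for illustration)
num_prototypes, dim = 8, 16
prototypes = F.normalize(torch.randn(num_prototypes, dim), dim=-1)

def prototype_assignment(z, temperature=0.1):
    """Cosine similarity of embeddings to each prototype, as a soft distribution."""
    z = F.normalize(z, dim=-1)
    return F.softmax(z @ prototypes.T / temperature, dim=-1)

z_masked = torch.randn(4, dim)   # embeddings of masked views (batch of 4)
z_anchor = torch.randn(4, dim)   # embeddings of the corresponding full views

p = prototype_assignment(z_masked)
q = prototype_assignment(z_anchor)

# Cross-entropy between the two soft assignments: the matching objective
loss = -(q * p.clamp_min(1e-9).log()).sum(dim=-1).mean()
```

Because supervision comes from prototype assignments rather than pixel reconstruction, the encoder learns semantic features that remain useful with very few labels.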

Model Capabilities

Image Feature Extraction
Few-shot Image Classification
Visual Representation Learning

Use Cases

Computer Vision
Few-shot Image Classification
Classifies images using only a small number of labeled samples
Performs exceptionally well in few-shot and extreme low-shot regimes
Visual Feature Extraction
Serves as a base encoder for extracting image features