
ViT-MSN Large 7

Developed by Facebook
This Vision Transformer model is pre-trained with the MSN method and excels in few-shot scenarios, making it well suited for tasks such as image classification.
Downloads: 67
Release Time: 9/9/2022

Model Overview

A Vision Transformer model pre-trained with Masked Siamese Networks (MSN). It learns image representations by matching the prototypes of masked image patches with those of the unmasked patches, which makes it particularly suitable for scenarios with limited labeled data.
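For feature extraction, the checkpoint can be loaded through the Hugging Face transformers library. The sketch below is a minimal example; the Hub id facebook/vit-msn-large-7 is an assumption inferred from the model name and developer above.

```python
# Minimal feature-extraction sketch; the checkpoint id is an assumption
# inferred from the model name above.
import torch
from PIL import Image
from transformers import AutoImageProcessor, ViTMSNModel

checkpoint = "facebook/vit-msn-large-7"  # assumed Hub id
processor = AutoImageProcessor.from_pretrained(checkpoint)
model = ViTMSNModel.from_pretrained(checkpoint)

image = Image.open("example.jpg")  # any local RGB image
inputs = processor(images=image, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

# One embedding per patch plus the [CLS] token; the [CLS] vector is a
# common choice as a global image feature for downstream tasks.
features = outputs.last_hidden_state[:, 0]
print(features.shape)
```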

Model Features

Few-shot learning capability
Uses the MSN pre-training method to maintain strong performance even when labeled data is scarce
Joint embedding architecture
Learns image representations by matching prototypes of masked and unmasked patches (a toy sketch of this objective follows this list)
Large-scale pre-training
Pre-trained on ImageNet-1k dataset to learn general visual features
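To make the joint-embedding feature concrete, here is a toy sketch of prototype matching in PyTorch. The prototype count, embedding size, and temperatures are illustrative assumptions; the full MSN recipe additionally uses a target encoder updated by exponential moving average and an entropy regularizer, which are omitted here.

```python
# Toy sketch of MSN-style prototype matching (illustrative only).
import torch
import torch.nn.functional as F

num_prototypes, dim = 1024, 256                      # assumed sizes
prototypes = F.normalize(torch.randn(num_prototypes, dim), dim=-1)

def soft_assignments(z: torch.Tensor, temperature: float) -> torch.Tensor:
    """Soft assignment of L2-normalized embeddings to the prototypes."""
    z = F.normalize(z, dim=-1)
    return F.softmax(z @ prototypes.T / temperature, dim=-1)

z_masked = torch.randn(8, dim)    # encoder output for masked (anchor) views
z_unmasked = torch.randn(8, dim)  # encoder output for unmasked (target) views

# Sharper, gradient-free targets from the unmasked view; the masked view is
# trained to predict the unmasked view's prototype assignment.
targets = soft_assignments(z_unmasked, temperature=0.025).detach()
anchors = soft_assignments(z_masked, temperature=0.1)

loss = -(targets * torch.log(anchors + 1e-8)).sum(dim=-1).mean()
print(loss.item())
```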

Model Capabilities

Image feature extraction
Image classification
Few-shot learning

Use Cases

Computer vision
Image classification
Performing image classification with limited labeled data (a minimal fine-tuning sketch follows this section)
Performs particularly well in few-shot and extreme few-shot settings
Feature extraction
Serves as a backbone network to extract image features for downstream tasks
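The sketch below outlines low-shot fine-tuning with the transformers ViTMSNForImageClassification class. The checkpoint id, label count, learning rate, and the images/labels inputs are placeholder assumptions standing in for a small labeled dataset.

```python
# Minimal low-shot fine-tuning sketch; checkpoint id, label count, and
# hyperparameters are illustrative assumptions.
import torch
from transformers import AutoImageProcessor, ViTMSNForImageClassification

checkpoint = "facebook/vit-msn-large-7"  # assumed Hub id
num_labels = 10                          # placeholder label count

processor = AutoImageProcessor.from_pretrained(checkpoint)
model = ViTMSNForImageClassification.from_pretrained(checkpoint, num_labels=num_labels)

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)

def train_step(images, labels):
    # `images` is a small list of PIL images and `labels` their integer
    # classes, standing in for a few labeled examples per class.
    inputs = processor(images=images, return_tensors="pt")
    outputs = model(**inputs, labels=torch.tensor(labels))
    outputs.loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    return outputs.loss.item()
```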