Swinv2 Tiny Patch4 Window16 256

Developed by Microsoft
Swin Transformer v2 is a vision Transformer model that achieves efficient image classification through hierarchical feature maps and local window self-attention mechanisms.
Downloads: 403.69k
Release Time: 6/14/2022

Model Overview

This model was pre-trained on the ImageNet-1k dataset at 256x256 resolution and is suitable for image classification tasks. It improves training stability through residual post-normalization and scaled cosine attention, and supports transferring models pre-trained at low resolution to higher-resolution inputs.
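A minimal sketch of using this architecture with the Hugging Face transformers library. The configuration values follow the model name (patch size 4, window size 16, 256x256 input); note that this builds a randomly initialized model for illustration rather than downloading the pre-trained checkpoint, so the predictions are meaningless until real weights are loaded.

```python
import torch
from transformers import Swinv2Config, Swinv2ForImageClassification

# Configuration mirroring the tiny variant named in this card (an assumption
# based on the model name; the real checkpoint would be loaded with
# from_pretrained instead of a fresh config).
config = Swinv2Config(image_size=256, patch_size=4, window_size=16, num_labels=1000)
model = Swinv2ForImageClassification(config)
model.eval()

# A dummy 256x256 RGB image batch in place of real preprocessed pixels.
pixel_values = torch.randn(1, 3, 256, 256)
with torch.no_grad():
    logits = model(pixel_values).logits

print(logits.shape)  # one score per ImageNet-1k class: torch.Size([1, 1000])
```

For real inference, replace the config/model construction with `Swinv2ForImageClassification.from_pretrained("microsoft/swinv2-tiny-patch4-window16-256")` and preprocess images with the matching image processor.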

Model Features

Hierarchical Feature Maps
Constructs hierarchical feature maps by merging image patches, suitable for processing images at different resolutions.
Local Window Self-Attention
Computes self-attention only within local windows, so computational complexity grows linearly with input image size, improving efficiency over global attention.
Training Stability Improvements
Uses residual post-normalization and scaled cosine attention to significantly enhance training stability.
Transfer Learning Support
Supports transferring models pre-trained at low resolution to high-resolution inputs through a log-spaced continuous position bias method.
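The window self-attention feature above hinges on partitioning the feature map into non-overlapping windows and attending only within each window. A small NumPy sketch (shapes chosen to match this model: a 256px input with patch size 4 gives a 64x64 stage-1 feature map, split into 16x16 windows; the function name and layout are illustrative, not the library's internals):

```python
import numpy as np

def window_partition(x, ws):
    """Split an (H, W, C) feature map into non-overlapping ws x ws windows.

    Returns (num_windows, ws*ws, C); self-attention is then computed
    independently inside each window, so cost scales with the number of
    windows (i.e. linearly with image area) rather than quadratically
    with the total token count.
    """
    H, W, C = x.shape
    x = x.reshape(H // ws, ws, W // ws, ws, C)
    return x.transpose(0, 2, 1, 3, 4).reshape(-1, ws * ws, C)

# Stage-1 feature map for a 256x256 input with patch size 4 and 96 channels.
feat = np.random.rand(64, 64, 96)
windows = window_partition(feat, 16)
print(windows.shape)  # (16, 256, 96): 16 windows of 256 tokens each
```

Global attention over all 64*64 = 4096 tokens would cost on the order of 4096^2 token pairs; windowed attention costs 16 * 256^2, a 16x reduction here, and the gap widens as resolution grows.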

Model Capabilities

Image Classification
Visual Feature Extraction

Use Cases

Computer Vision
ImageNet Image Classification
Classifies images into one of the 1,000 ImageNet categories.
Produces accurate single-label predictions suitable as a classification baseline or feature-extraction backbone.