
Swinv2 Small Patch4 Window8 256

Developed by Microsoft
Swin Transformer v2 is a vision Transformer model that achieves efficient image processing through hierarchical feature maps and local window self-attention mechanisms.
Downloads: 1,836
Release Time: 6/15/2022

Model Overview

This model was pre-trained on the ImageNet-1k dataset at 256x256 resolution and is suitable for image classification tasks.
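Below is a minimal usage sketch, assuming the checkpoint is published on the Hugging Face Hub under the id microsoft/swinv2-small-patch4-window8-256; the image file name is a placeholder.

```python
import torch
from PIL import Image
from transformers import AutoImageProcessor, AutoModelForImageClassification

model_id = "microsoft/swinv2-small-patch4-window8-256"  # assumed Hub checkpoint id

# The processor resizes and normalizes the image to the 256x256 input the model expects.
processor = AutoImageProcessor.from_pretrained(model_id)
model = AutoModelForImageClassification.from_pretrained(model_id)

image = Image.open("example.jpg")  # placeholder: any RGB image

inputs = processor(images=image, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Map the highest-scoring logit to one of the 1,000 ImageNet-1k labels.
predicted_idx = logits.argmax(-1).item()
print(model.config.id2label[predicted_idx])
```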

Model Features

Hierarchical feature maps
Constructs hierarchical feature maps by merging image patches at deeper layers, improving feature extraction efficiency.
Local window self-attention
Computes self-attention only within non-overlapping local windows, so computational complexity scales linearly with input image size (see the sketch after this list).
Training stability improvements
Employs residual post-normalization and scaled cosine attention to improve training stability.
High-resolution transfer capability
Uses log-spaced continuous position bias to effectively support transfer from low-resolution to high-resolution inputs.
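To illustrate the linear-complexity point, the sketch below partitions a feature map into non-overlapping windows in the style of the Swin family; the function name and shapes are illustrative, not the model's actual implementation. Attention is then computed per window, so the cost grows with the number of windows (proportional to H x W) rather than with (H x W) squared.

```python
import torch

def window_partition(x: torch.Tensor, window_size: int) -> torch.Tensor:
    """Split a (B, H, W, C) feature map into (num_windows * B, window_size, window_size, C)."""
    B, H, W, C = x.shape
    x = x.view(B, H // window_size, window_size, W // window_size, window_size, C)
    return x.permute(0, 1, 3, 2, 4, 5).contiguous().view(-1, window_size, window_size, C)

# A 64x64 feature map with window size 8 yields 64 windows of 8x8 tokens each.
feature_map = torch.randn(1, 64, 64, 96)
windows = window_partition(feature_map, window_size=8)
print(windows.shape)  # torch.Size([64, 8, 8, 96])

# Self-attention inside each 8x8 window touches only 64 tokens, so the total cost
# scales with (number of windows) * 64^2, i.e. linearly in H * W.
```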

Model Capabilities

Image classification
Visual feature extraction
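For the feature-extraction capability, a hedged sketch: loading the backbone without the classification head via transformers' AutoModel (the checkpoint id is assumed as above) returns per-patch hidden states and a pooled image embedding.

```python
import torch
from PIL import Image
from transformers import AutoImageProcessor, AutoModel

model_id = "microsoft/swinv2-small-patch4-window8-256"  # assumed Hub checkpoint id

processor = AutoImageProcessor.from_pretrained(model_id)
backbone = AutoModel.from_pretrained(model_id)  # Swin v2 encoder without the classifier head

image = Image.open("example.jpg")  # placeholder image path
inputs = processor(images=image, return_tensors="pt")

with torch.no_grad():
    outputs = backbone(**inputs)

# last_hidden_state: one embedding per final-stage patch token;
# pooler_output: a single pooled vector usable as a global image descriptor.
print(outputs.last_hidden_state.shape)
print(outputs.pooler_output.shape)
```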

Use Cases

Computer vision
Image classification
Classifies images into one of the 1000 ImageNet categories.
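Assuming the same checkpoint id, the transformers pipeline API offers a shorter route to the top-ranked ImageNet labels; the image path is again a placeholder.

```python
from transformers import pipeline

# The image-classification pipeline wraps the processor and model shown above.
classifier = pipeline(
    "image-classification",
    model="microsoft/swinv2-small-patch4-window8-256",  # assumed checkpoint id
)

# Returns the top predictions as [{"label": ..., "score": ...}, ...].
predictions = classifier("example.jpg", top_k=5)  # placeholder image path
for pred in predictions:
    print(f"{pred['label']}: {pred['score']:.3f}")
```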