
Swinv2 Tiny Patch4 Window8 256

Developed by Microsoft
Swin Transformer v2 is a vision Transformer model pre-trained on ImageNet-1k, featuring hierarchical feature maps and local window self-attention mechanisms with linear computational complexity.
Downloads 25.04k
Release Date: 6/14/2022

Model Overview

This model is the tiny version of Swin Transformer v2, designed for image classification tasks, pre-trained at 256x256 resolution, and can serve as a general backbone for computer vision tasks.
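A minimal classification sketch using the Hugging Face `transformers` library. The checkpoint name `microsoft/swinv2-tiny-patch4-window8-256` is assumed from the model title; verify it on the Hub before use, and substitute a real photo for the placeholder image.

```python
import torch
from PIL import Image
from transformers import AutoImageProcessor, Swinv2ForImageClassification

# Checkpoint name assumed from the model title above.
ckpt = "microsoft/swinv2-tiny-patch4-window8-256"
processor = AutoImageProcessor.from_pretrained(ckpt)
model = Swinv2ForImageClassification.from_pretrained(ckpt)

image = Image.new("RGB", (256, 256))  # placeholder; use a real image in practice
inputs = processor(images=image, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits  # shape (1, 1000): one score per ImageNet-1k class

label = model.config.id2label[logits.argmax(-1).item()]
print(label)
```

The processor handles resizing to 256x256 and normalization, so arbitrary input sizes are accepted.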

Model Features

Hierarchical Feature Maps
Builds hierarchical feature maps by merging image patches in deeper layers, making it suitable for visual tasks at different scales.
Local Window Self-Attention
Computes self-attention only within local windows, achieving linear computational complexity relative to input image size.
Residual Post-Normalization
Uses a residual post-normalization scheme combined with scaled cosine attention to improve training stability.
Position Bias Transfer
Employs log-spaced continuous position bias to effectively transfer low-resolution pre-trained models to high-resolution tasks.
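The linear-complexity claim above comes from restricting attention to fixed-size windows: each window of `ws*ws` tokens costs O((ws²)²), and there are H·W/ws² windows, so the total is O(H·W·ws²), linear in the number of tokens. A small PyTorch sketch of the window-partitioning step (an illustration, not the model's actual source code):

```python
import torch

def window_partition(x: torch.Tensor, window_size: int) -> torch.Tensor:
    """Split a (B, H, W, C) feature map into non-overlapping windows.

    Returns (B * num_windows, window_size * window_size, C); self-attention
    is then computed independently inside each window.
    """
    B, H, W, C = x.shape
    x = x.view(B, H // window_size, window_size, W // window_size, window_size, C)
    return x.permute(0, 1, 3, 2, 4, 5).reshape(-1, window_size * window_size, C)

# An 8x8 feature map with 4x4 windows yields 4 windows of 16 tokens each.
x = torch.randn(1, 8, 8, 96)
windows = window_partition(x, window_size=4)
print(windows.shape)  # torch.Size([4, 16, 96])
```

Because `window_size` is a constant, doubling the image area doubles the number of windows (and the cost) rather than quadrupling it, which is what makes the model practical at high resolutions.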

Model Capabilities

Image Classification
Visual Feature Extraction
Computer Vision Task Backbone

Use Cases

Computer Vision
Image Classification
Classifies input images into one of 1000 ImageNet categories.
Performs well on the ImageNet-1k dataset.
Visual Feature Extraction
Serves as a pre-trained feature extractor for other computer vision tasks.
Can be used for downstream tasks like object detection and semantic segmentation.
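For backbone-style use, the base `Swinv2Model` class (without the classification head) exposes the final-stage token features. A hedged sketch, again assuming the `microsoft/swinv2-tiny-patch4-window8-256` checkpoint; the expected feature shape follows from a 4x patch embedding and three 2x patch mergings (256 → 64 → 32 → 16 → 8 per side) with the tiny variant's final hidden size of 768:

```python
import torch
from PIL import Image
from transformers import AutoImageProcessor, Swinv2Model

ckpt = "microsoft/swinv2-tiny-patch4-window8-256"  # checkpoint name assumed from the title
processor = AutoImageProcessor.from_pretrained(ckpt)
model = Swinv2Model.from_pretrained(ckpt)

image = Image.new("RGB", (256, 256))  # placeholder input
inputs = processor(images=image, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

# Final-stage tokens: an 8x8 grid of 768-d features for a 256x256 input.
features = outputs.last_hidden_state
print(features.shape)  # torch.Size([1, 64, 768])
```

Downstream detection or segmentation heads typically consume the intermediate stage outputs as well (available via `output_hidden_states=True`), reshaped back into 2D feature maps.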