# High-Resolution Image Classification
## Mambavision L3 512 21K
**nvidia** · License: Other · Task: Image Classification · Library: Transformers · Downloads: 7,548 · Likes: 49

MambaVision is the first hybrid computer vision model combining the strengths of Mamba and Transformer. It enhances visual feature modeling by redesigning the Mamba formulation and incorporates self-attention modules in the final layers of the Mamba architecture to improve long-range spatial dependency modeling.
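Mamba layers are built on a linear state-space recurrence. The toy sketch below (plain NumPy, with fixed `A`, `B`, `C` rather than the input-dependent parameters a real selective scan uses) illustrates the kind of sequential scan such layers compute; it is not MambaVision's actual implementation:

```python
import numpy as np

def ssm_scan(x, A, B, C):
    """Minimal state-space recurrence: h_t = A @ h_{t-1} + B * x_t, y_t = C @ h_t.
    Illustrative only; Mamba's selective scan makes A, B, C input-dependent."""
    h = np.zeros(A.shape[0])
    ys = []
    for x_t in x:
        h = A @ h + B * x_t   # update hidden state from input
        ys.append(C @ h)      # read out the output at this step
    return np.array(ys)

# Toy 1-D impulse input with a 2-dimensional hidden state.
x = np.array([1.0, 0.0, 0.0])
A = np.array([[0.5, 0.0],
              [0.0, 0.9]])   # decay rates of the two state channels
B = np.array([1.0, 1.0])
C = np.array([1.0, 1.0])
y = ssm_scan(x, A, B, C)     # impulse response decays over time
```

The impulse response decays geometrically per state channel, which is how such recurrences summarize long contexts at linear cost.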
## Mambavision L2 512 21K
**nvidia** · License: Other · Task: Image Classification · Library: Transformers · Downloads: 2,678 · Likes: 3

The first hybrid computer vision model combining the advantages of Mamba and Transformer, enhancing visual feature modeling by redesigning the Mamba formulation.
## Efficientnet B7
**google** · License: Apache-2.0 · Task: Image Classification · Library: Transformers · Downloads: 6,522 · Likes: 17

EfficientNet is a convolutional neural network that achieves high-accuracy image classification efficiently by uniformly scaling network depth, width, and input resolution.
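The "uniform scaling" above is EfficientNet's compound scaling rule: one coefficient φ scales depth, width, and resolution together. A minimal sketch using the base coefficients reported in the EfficientNet paper (α = 1.2, β = 1.1, γ = 1.15, chosen so that α·β²·γ² ≈ 2, i.e. each increment of φ roughly doubles FLOPs):

```python
# Compound scaling constants from the EfficientNet paper (for the B0 baseline).
ALPHA, BETA, GAMMA = 1.2, 1.1, 1.15

def compound_scale(phi):
    """Return (depth, width, resolution) multipliers for coefficient phi."""
    return ALPHA ** phi, BETA ** phi, GAMMA ** phi

# phi = 2: ~2x the FLOPs of phi = 1, split across all three dimensions.
d, w, r = compound_scale(2)
```

Scaling all three dimensions jointly is the paper's key observation; scaling only one (e.g. depth) saturates quickly.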
## Swinv2 Large Patch4 Window12to24 192to384 22kto1k Ft
**microsoft** · License: Apache-2.0 · Task: Image Classification · Library: Transformers · Downloads: 3,048 · Likes: 4

Swin Transformer v2 is a vision Transformer model pre-trained on ImageNet-21k and fine-tuned on ImageNet-1k at 384x384 resolution, featuring hierarchical feature maps and local window self-attention mechanisms.
## Swinv2 Large Patch4 Window12to16 192to256 22kto1k Ft
**microsoft** · License: Apache-2.0 · Task: Image Classification · Library: Transformers · Downloads: 812 · Likes: 4

Swin Transformer v2 is a vision Transformer model that handles efficient image classification and dense recognition tasks through hierarchical feature maps and local window self-attention mechanisms.
## Swinv2 Base Patch4 Window12to16 192to256 22kto1k Ft
**microsoft** · License: Apache-2.0 · Task: Image Classification · Library: Transformers · Downloads: 459 · Likes: 1

Swin Transformer v2 is a vision Transformer model that achieves efficient image classification through hierarchical feature maps and local window-based self-attention mechanisms.
## Swinv2 Base Patch4 Window16 256
**microsoft** · License: Apache-2.0 · Task: Image Classification · Library: Transformers · Downloads: 1,853 · Likes: 3

Swin Transformer v2 is a vision Transformer model that handles efficient image classification and dense recognition tasks through hierarchical feature maps and local window self-attention mechanisms.
## Swinv2 Tiny Patch4 Window8 256
**microsoft** · License: Apache-2.0 · Task: Image Classification · Library: Transformers · Downloads: 25.04k · Likes: 10

Swin Transformer v2 is a vision Transformer model pre-trained on ImageNet-1k, featuring hierarchical feature maps and local window self-attention mechanisms with linear computational complexity.
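The local window self-attention shared by these Swin v2 variants rests on splitting the feature map into non-overlapping windows and attending only within each one, which is where the linear complexity in image area comes from. A minimal NumPy sketch of that partition (illustrative only, not the library's implementation):

```python
import numpy as np

def window_partition(x, window_size):
    """Split an (H, W, C) feature map into non-overlapping windows, returning
    an array of shape (num_windows, window_size, window_size, C).
    Attention is then computed inside each window independently, so the
    cost grows linearly with the number of windows (i.e. with image area)."""
    H, W, C = x.shape
    x = x.reshape(H // window_size, window_size,
                  W // window_size, window_size, C)
    # Group the two window-grid axes together, then flatten them.
    return x.transpose(0, 2, 1, 3, 4).reshape(-1, window_size, window_size, C)

# An 8x8 single-channel feature map partitioned into four 4x4 windows.
feat = np.arange(8 * 8, dtype=float).reshape(8, 8, 1)
windows = window_partition(feat, 4)
```

Global self-attention would instead cost quadratically in the number of tokens, which is why windowing matters at high resolutions like 384x384.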
## Cvt 21 384 22k
**microsoft** · License: Apache-2.0 · Task: Image Classification · Library: Transformers · Downloads: 134 · Likes: 3

CvT-21 is a vision model combining convolutional and Transformer architectures, pre-trained on ImageNet-22k and fine-tuned on ImageNet-1k.
## Cvt 13 384
**microsoft** · License: Apache-2.0 · Task: Image Classification · Library: Transformers · Downloads: 27 · Likes: 0

CvT-13 is a vision transformer model pre-trained on the ImageNet-1k dataset, improving the performance of traditional vision transformers by introducing convolutional operations.
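One place CvT introduces convolutions is token embedding: instead of a plain linear patch projection, tokens are produced by convolution-style strided windows over the feature map. A toy single-channel sketch (averaging kernel, no learned weights — not CvT's actual layers):

```python
import numpy as np

def conv_token_embed(img, kernel, stride):
    """Slide a kernel x kernel window over an (H, W) image with the given
    stride and reduce each window to one token value (here a simple mean,
    standing in for a learned convolution). Returns a flat token sequence."""
    H, W = img.shape
    tokens = []
    for i in range(0, H - kernel + 1, stride):
        for j in range(0, W - kernel + 1, stride):
            tokens.append(img[i:i + kernel, j:j + kernel].mean())
    return np.array(tokens)

# A 4x4 image with a 2x2 window and stride 2 yields four tokens.
img = np.arange(16, dtype=float).reshape(4, 4)
tokens = conv_token_embed(img, kernel=2, stride=2)
```

Because the windows can overlap (stride smaller than kernel), such embeddings retain local spatial structure that a non-overlapping patch projection discards.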
## Swin Base Patch4 Window12 384 In22k
**microsoft** · License: Apache-2.0 · Task: Image Classification · Library: Transformers · Downloads: 2,431 · Likes: 1

Swin Transformer is a hierarchical vision Transformer based on shifted windows, specifically designed for image classification tasks.
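The "shifted windows" that give Swin Transformer its name come from cyclically shifting the feature map between successive blocks, so that the next block's window partition straddles the previous one's window borders and information can flow between windows. A minimal sketch of the shift using `np.roll`:

```python
import numpy as np

def cyclic_shift(x, shift):
    """Cyclically roll an (H, W, C) feature map by `shift` pixels along both
    spatial axes, as done between windowed-attention blocks in Swin.
    The inverse shift (+shift) restores the original layout afterwards."""
    return np.roll(x, shift=(-shift, -shift), axis=(0, 1))

# A 4x4 map shifted by half a 4-wide window (shift = 2): the pixel that was
# at (2, 2) moves to (0, 0), so new windows span old window boundaries.
feat = np.arange(16, dtype=float).reshape(4, 4, 1)
shifted = cyclic_shift(feat, 2)
```

In the full model, an attention mask keeps the wrapped-around regions from attending to spatially distant pixels; this sketch shows only the shift itself.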