MambaVision L3 256 21K

Developed by NVIDIA
The first hybrid computer vision model to combine the strengths of Mamba and Transformers. It redesigns the Mamba formulation to model visual features more efficiently and adds self-attention blocks to the final layers of the Mamba architecture to better capture long-range spatial dependencies.
Downloads 510
Release Time: 3/24/2025

Model Overview

MambaVision is a hybrid Mamba-Transformer vision backbone network, designed for image classification and feature extraction, pre-trained on the ImageNet-21K dataset and fine-tuned on ImageNet-1K.

Model Features

Hybrid architecture
Combines Mamba's efficient sequence modeling with the Transformer's ability to capture long-range dependencies to improve visual feature extraction.
Hierarchical structure
Adopts a hierarchical design to meet the needs of diverse visual tasks, supporting multi-stage feature extraction.
Performance optimization
Sets a new state-of-the-art Pareto frontier for the trade-off between Top-1 accuracy and throughput.
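
To make the hybrid layout concrete, here is a minimal sketch of how the blocks in one stage might be ordered, with Mamba-style mixer blocks first and self-attention blocks in the final layers. The function name `hybrid_stage_layout` and the parameters `depth` and `num_attention` are hypothetical and chosen for illustration only; the actual per-stage depths and attention counts of this model are not given on this page.

```python
def hybrid_stage_layout(depth, num_attention):
    """Return the block types for one stage, in execution order.

    Illustrative assumption: the first (depth - num_attention) blocks are
    Mamba-style mixers and the final num_attention blocks are self-attention.
    """
    if not 0 <= num_attention <= depth:
        raise ValueError("num_attention must be between 0 and depth")
    return ["mamba"] * (depth - num_attention) + ["attention"] * num_attention


# Hypothetical stage with 8 blocks, the last 4 of which use self-attention.
layout = hybrid_stage_layout(depth=8, num_attention=4)
print(layout)
```

Placing attention at the end of a stage means the globally mixing blocks operate on features that the sequence-modeling blocks have already processed, which matches the description of self-attention being added in the final layers.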

Model Capabilities

Image classification
Visual feature extraction
Multi-stage feature map output

Use Cases

Computer vision
Image classification
Classifies input images by identifying the main object in each image.
Achieves 87.3% Top-1 accuracy on ImageNet-1K.
Feature extraction
Extracts multi-stage feature maps from images for downstream visual tasks.
Supports output of feature maps at 4 stages, suitable for visual analysis at different granularities.
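
As a rough illustration of what feature maps at 4 stages look like in a hierarchical backbone, the sketch below computes the spatial size of each stage's output. It assumes a 256-pixel square input (suggested by the "256" in the model name) and downsampling strides of 4, 8, 16, and 32, which are common for hierarchical vision backbones but are not stated on this page. The helper `stage_feature_sizes` is hypothetical.

```python
def stage_feature_sizes(image_size, strides=(4, 8, 16, 32)):
    """Return the (height, width) of the feature map at each stage.

    The strides are an assumption typical of hierarchical backbones,
    not values confirmed for this model.
    """
    return [(image_size // s, image_size // s) for s in strides]


sizes = stage_feature_sizes(256)
for stage, (h, w) in enumerate(sizes, start=1):
    print(f"stage {stage}: {h} x {w}")
```

Under these assumptions the early stages keep fine spatial detail while the later stages are coarser, which is why multi-stage outputs suit downstream tasks that analyze images at different granularities.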