MobileViT Small

Developed by Matthijs
MobileViT is a lightweight, low-latency vision Transformer model that combines the advantages of CNNs and Transformers, suitable for mobile devices.
Downloads 39
Release Date: 5/11/2022

Model Overview

MobileViT is a lightweight convolutional neural network for image classification. It combines MobileNetV2-style layers with Transformer blocks, so that convolutions handle local processing and the Transformer blocks handle global processing.

Model Features

Lightweight Design
Optimized for mobile devices with low latency and high efficiency.
Combines CNN and Transformer
Integrates the local feature extraction capability of CNNs with the global processing power of Transformers.
No Positional Encoding Required
The architecture does not need the positional encodings that Transformers traditionally rely on.

Model Capabilities

Image Classification
Multi-scale Feature Extraction
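
The image classification capability listed above can be tried with a short script. The following is a minimal sketch, assuming the Hugging Face transformers, torch, and Pillow packages are installed and that the weights are published under the checkpoint id "apple/mobilevit-small" (an assumption, this page does not name a checkpoint). The function names classify_image and rank_predictions are illustrative and not part of any library.

```python
def rank_predictions(probs, id2label, k=5):
    """Return the k most probable (label, probability) pairs, best first."""
    ranked = sorted(enumerate(probs), key=lambda pair: pair[1], reverse=True)
    return [(id2label[index], prob) for index, prob in ranked[:k]]


def classify_image(image_path, checkpoint="apple/mobilevit-small", k=5):
    """Classify one image file and return its top-k labels.

    The checkpoint id is an assumption; replace it with the one you use.
    """
    # Heavy dependencies are imported lazily so the helper above
    # stays usable without them.
    import torch
    from PIL import Image
    from transformers import AutoImageProcessor, AutoModelForImageClassification

    processor = AutoImageProcessor.from_pretrained(checkpoint)
    model = AutoModelForImageClassification.from_pretrained(checkpoint)
    model.eval()

    image = Image.open(image_path).convert("RGB")
    inputs = processor(images=image, return_tensors="pt")

    with torch.no_grad():
        logits = model(**inputs).logits  # shape: (1, number of labels)

    probs = logits.softmax(dim=-1)[0].tolist()
    return rank_predictions(probs, model.config.id2label, k)
```

Calling classify_image("example.jpg") with a local image (the file name is a placeholder) should return a list of label and probability pairs. The first call downloads the weights, so it needs network access.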

Use Cases

Computer Vision
Object Recognition
Identify object categories in images
Achieves 78.4% top-1 accuracy on ImageNet-1k
Scene Classification
Classify image scenes
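
For readers unfamiliar with the metric quoted above, top-1 accuracy is the fraction of images whose single highest-scoring predicted class equals the ground-truth class. A minimal sketch follows, using made-up class ids rather than real ImageNet-1k data; the function name top1_accuracy is illustrative.

```python
def top1_accuracy(predictions, labels):
    """Fraction of samples whose top prediction matches the true label."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("at least one sample is required")
    correct = sum(pred == label for pred, label in zip(predictions, labels))
    return correct / len(labels)


# Toy data: four samples, three predicted correctly.
predicted_classes = [3, 1, 2, 2]
true_classes = [3, 1, 0, 2]
print(top1_accuracy(predicted_classes, true_classes))  # prints 0.75
```

A reported figure of 78.4% therefore means that roughly 784 out of every 1,000 evaluation images received the correct class as the model's first choice.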