MobileViT Small

Developed by Apple
MobileViT is a lightweight, low-latency hybrid vision model that combines the strengths of CNNs and Transformers, making it suitable for mobile devices.
Downloads 894.23k
Release Time: 5/30/2022

Model Overview

MobileViT is a lightweight convolutional neural network that combines MobileNetV2-style layers with Transformer modules for image classification tasks.

Model Features

Lightweight Design
The model has only 5.6M parameters, making it suitable for mobile deployment.
Combining CNN and Transformer
Integrates the local feature extraction capability of CNNs with the global modeling ability of Transformers.
No Positional Encoding Required
The model design eliminates the need for positional encoding, simplifying implementation.
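
To put the 5.6M parameter count in deployment terms, the short sketch below estimates the size of the weights alone at a few common numeric precisions. It is plain arithmetic under stated assumptions: 1 MB is taken as 10^6 bytes, and activations, runtime buffers, and file-format overhead are ignored. The names `PARAMS`, `BYTES_PER_PARAM`, and `weight_megabytes` are illustrative only.

```python
# Parameter count quoted in the feature list above.
PARAMS = 5_600_000

# Storage cost per parameter for a few common precisions.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "int8": 1}


def weight_megabytes(num_params: int, dtype: str) -> float:
    """Approximate size of the weights alone, in MB (1 MB = 10**6 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e6


footprint = {dtype: weight_megabytes(PARAMS, dtype) for dtype in BYTES_PER_PARAM}
for dtype, megabytes in footprint.items():
    print(f"{dtype}: {megabytes:.1f} MB")
# float32: 22.4 MB
# float16: 11.2 MB
# int8: 5.6 MB
```

Real on-device memory use will be higher than these figures, since inference also needs space for intermediate activations.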

Model Capabilities

Image Classification
Visual Feature Extraction
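
The image classification capability can be exercised roughly as follows. This is a minimal sketch, not an official recipe. It assumes the checkpoint is published on the Hugging Face Hub as `apple/mobilevit-small` (an identifier not stated on this page) and that `torch`, `transformers`, and `Pillow` are installed. The function name `classify_image` and the `top_k` parameter are illustrative.

```python
from typing import List, Tuple

# Assumed Hugging Face Hub identifier for this checkpoint (not stated on this page).
MODEL_ID = "apple/mobilevit-small"


def classify_image(image_path: str, top_k: int = 5) -> List[Tuple[str, float]]:
    """Return the top_k (label, probability) pairs for one image file."""
    # Heavy dependencies are imported lazily, so defining this function
    # does not require them to be installed.
    import torch
    from PIL import Image
    from transformers import AutoImageProcessor, AutoModelForImageClassification

    processor = AutoImageProcessor.from_pretrained(MODEL_ID)
    model = AutoModelForImageClassification.from_pretrained(MODEL_ID)
    model.eval()

    # The processor handles resizing and pixel scaling for the model.
    image = Image.open(image_path).convert("RGB")
    inputs = processor(images=image, return_tensors="pt")

    with torch.no_grad():
        logits = model(**inputs).logits  # one score per ImageNet class

    probs = logits.softmax(dim=-1)[0]
    scores, indices = probs.topk(top_k)
    return [(model.config.id2label[i.item()], s.item()) for s, i in zip(scores, indices)]
```

Calling `classify_image("photo.jpg")` (with a hypothetical local file) downloads the weights on first use and returns the five most probable ImageNet labels with their probabilities.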

Use Cases

Computer Vision
ImageNet Image Classification
Classify images into one of the 1000 ImageNet categories.
Top-1 accuracy 78.4%, Top-5 accuracy 94.1%
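
For readers unfamiliar with the metrics, Top-1 accuracy is the fraction of images whose true label is the highest-scoring class, and Top-5 accuracy is the fraction whose true label is among the five highest-scoring classes. The sketch below computes the general top-k version on a tiny set of made-up scores. The data is hypothetical and purely illustrative, not model output, and `top_k_accuracy` is an illustrative helper name.

```python
from typing import Sequence


def top_k_accuracy(scores: Sequence[Sequence[float]], labels: Sequence[int], k: int) -> float:
    """Fraction of samples whose true label is among the k highest-scoring classes."""
    hits = 0
    for row, label in zip(scores, labels):
        # Indices of the k largest scores in this row.
        top_k = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        hits += label in top_k
    return hits / len(labels)


# Hypothetical scores for 4 images over 6 classes (illustrative only).
scores = [
    [0.10, 0.50, 0.20, 0.10, 0.05, 0.05],  # true label 1: hit at k=1
    [0.30, 0.10, 0.40, 0.10, 0.05, 0.05],  # true label 0: miss at k=1, hit at k=2
    [0.05, 0.05, 0.10, 0.10, 0.30, 0.40],  # true label 5: hit at k=1
    [0.40, 0.30, 0.10, 0.10, 0.05, 0.05],  # true label 4: miss at k=1 and k=2
]
labels = [1, 0, 5, 4]

top1 = top_k_accuracy(scores, labels, k=1)
top2 = top_k_accuracy(scores, labels, k=2)
print(top1, top2)  # 0.5 0.75
```

The reported 78.4% and 94.1% correspond to k=1 and k=5 evaluated over 1000 classes.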