MobileNet V2 Open-Source Vision Model - Optimized for Mobile Devices, Efficiently Completes Image Classification

Mobilenet V2 1.0 224

Developed by Google

MobileNet V2 is a lightweight vision model optimized for mobile devices, excelling in image classification tasks.

Image Classification

Transformers

Open Source License: Other #Lightweight Image Classification #Mobile-Optimized #Low-Latency Inference

Downloads 69.47k

Release Time: 11/10/2022

Model Overview

MobileNet V2 is a lightweight convolutional neural network model pre-trained on the ImageNet-1k dataset, suitable for image classification tasks. The model employs inverted residual structures and linear bottleneck designs to significantly reduce computational load and parameter count while maintaining high accuracy.
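
To make the efficiency claim concrete, the sketch below counts convolution weights in one inverted residual block and compares them with a full 3x3 convolution applied at the same expanded width. It is a back-of-the-envelope illustration, not the model's actual implementation: the function names, the expansion factor of 6, and the 32-channel example are assumptions chosen for illustration, and biases and batch-norm parameters are ignored.

```python
def full_conv_weights(channels_in: int, channels_out: int, kernel: int = 3) -> int:
    """Weights in a standard kernel x kernel convolution."""
    return kernel * kernel * channels_in * channels_out


def inverted_residual_weights(
    channels_in: int, channels_out: int, expansion: int = 6, kernel: int = 3
) -> dict:
    """Per-layer weight counts for one inverted residual block.

    Layout assumed here: 1x1 expansion -> depthwise kernel x kernel conv
    -> 1x1 linear projection back down to a narrow bottleneck.
    """
    hidden = expansion * channels_in
    return {
        "expand_1x1": channels_in * hidden,
        "depthwise": kernel * kernel * hidden,  # one filter per channel
        "project_1x1": hidden * channels_out,   # linear: no activation after it
    }


# Illustrative sizes only (assumed, not taken from the model card).
block = inverted_residual_weights(channels_in=32, channels_out=32)
hidden_width = 6 * 32

print("inverted residual block:", sum(block.values()), block)
print("full 3x3 conv at width", hidden_width, ":",
      full_conv_weights(hidden_width, hidden_width))
```

With these illustrative sizes the whole block needs 14,016 weights, of which the depthwise step contributes 1,728, while a full 3x3 convolution over the same 192 channels would need 331,776.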

Model Features

Lightweight Design

Optimized for mobile and embedded devices with reduced computational load and parameter count.

Efficient Architecture

Utilizes inverted residual structures and linear bottleneck designs to enhance computational efficiency.

High Accuracy

Maintains competitive accuracy compared to larger models while remaining lightweight.

Model Capabilities

Image Classification

Visual Feature Extraction

Use Cases

Computer Vision

Object Recognition

Identify object categories in images.

Classifies images into the 1,000 ImageNet-1k categories (the model also outputs an extra background class).

Mobile Vision Applications

Suitable for real-time vision applications on mobile devices like smartphones.

Operates with low latency and low power consumption.

🚀 MobileNet V2

A MobileNet V2 model pre-trained on ImageNet-1k at 224x224 resolution for efficient image classification.

🚀 Quick Start

The MobileNet V2 model is pre-trained on ImageNet-1k at a resolution of 224x224. It was introduced in "MobileNetV2: Inverted Residuals and Linear Bottlenecks" by Mark Sandler, Andrew Howard, Menglong Zhu, Andrey Zhmoginov, and Liang-Chieh Chen, and first released in this repository.

Disclaimer: The team releasing MobileNet V2 did not write a model card for this model, so this model card has been written by the Hugging Face team.

✨ Features

Model description

From the original README:

MobileNets are small, low-latency, low-power models parameterized to meet the resource constraints of a variety of use cases. They can be built upon for classification, detection, embeddings and segmentation similar to how other popular large-scale models, such as Inception, are used. MobileNets can be run efficiently on mobile devices [...] MobileNets trade off between latency, size and accuracy while comparing favorably with popular models from the literature.

The checkpoints are named mobilenet_v2_depth_size, for example mobilenet_v2_1.0_224, where 1.0 is the depth multiplier and 224 is the resolution of the input images the model was trained on.
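
The naming scheme can be decoded mechanically. The following is a minimal sketch, assuming names follow exactly the `mobilenet_v2_<depth>_<size>` pattern described above, optionally prefixed with an owner such as `google/`. The helper `parse_checkpoint_name` and the `CheckpointSpec` type are hypothetical and not part of any library.

```python
import re
from typing import NamedTuple


class CheckpointSpec(NamedTuple):
    depth_multiplier: float
    image_size: int


# Assumed pattern: mobilenet_v2_<depth multiplier>_<input resolution>
_NAME_PATTERN = re.compile(r"^mobilenet_v2_(\d+(?:\.\d+)?)_(\d+)$")


def parse_checkpoint_name(name: str) -> CheckpointSpec:
    """Split a checkpoint name into its depth multiplier and input resolution."""
    short_name = name.rsplit("/", 1)[-1]  # drop an optional "owner/" prefix
    match = _NAME_PATTERN.match(short_name)
    if match is None:
        raise ValueError(f"not a mobilenet_v2_<depth>_<size> name: {name!r}")
    return CheckpointSpec(float(match.group(1)), int(match.group(2)))


spec = parse_checkpoint_name("google/mobilenet_v2_1.0_224")
print(spec.depth_multiplier, spec.image_size)
```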

Intended uses & limitations

You can use the raw model for image classification. See the model hub to look for fine-tuned versions on a task that interests you.

💻 Usage Examples

Basic Usage

Here is how to use this model to classify an image from the COCO 2017 dataset into one of the 1,000 ImageNet classes:

import requests
import torch
from PIL import Image
from transformers import AutoImageProcessor, AutoModelForImageClassification

# Download a sample image from the COCO 2017 validation set
url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)

# Load the image processor and the pre-trained classifier
preprocessor = AutoImageProcessor.from_pretrained("google/mobilenet_v2_1.0_224")
model = AutoModelForImageClassification.from_pretrained("google/mobilenet_v2_1.0_224")

# Preprocess the image into a batch of PyTorch tensors
inputs = preprocessor(images=image, return_tensors="pt")

# Inference only, so skip gradient tracking
with torch.no_grad():
    outputs = model(**inputs)
logits = outputs.logits

# The model outputs 1001 logits: "background" at index 0 plus the 1000 ImageNet classes
predicted_class_idx = logits.argmax(-1).item()
print("Predicted class:", model.config.id2label[predicted_class_idx])
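
The example above reports only the single best class. A common follow-up is to turn the logits into probabilities and list the top few candidates. The sketch below does this with a numerically stable softmax over a plain Python list, so it runs without PyTorch. The `top_k` helper and the toy labels are hypothetical. With the real model, one would presumably pass `logits[0].tolist()` and `model.config.id2label`.

```python
import math
from typing import Mapping, Sequence


def top_k(logits: Sequence[float], id2label: Mapping[int, str], k: int = 5):
    """Return the k most likely (label, probability) pairs from raw logits."""
    peak = max(logits)
    # Subtracting the maximum keeps exp() from overflowing on large logits.
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    ranked = sorted(range(len(logits)), key=lambda i: exps[i], reverse=True)[:k]
    return [(id2label[i], exps[i] / total) for i in ranked]


# Toy stand-ins for the model's logits and label map (illustrative only).
toy_logits = [0.0, 2.0, 1.0]
toy_labels = {0: "background", 1: "cat", 2: "dog"}

for label, probability in top_k(toy_logits, toy_labels, k=2):
    print(f"{label}: {probability:.3f}")
```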

⚠️ Important Note

This model actually predicts 1,001 classes: the 1,000 ImageNet classes plus an extra "background" class at index 0. Currently, both the image processor and the model support PyTorch.
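
When comparing predictions against tools that use 1,000-way ImageNet labels, the extra class shifts every index by one. The sketch below shows one way to handle that. It assumes the remaining 1,000 indices follow the usual ImageNet-1k order directly after the background slot, which this card does not state explicitly. Both helper functions are hypothetical.

```python
from typing import Optional, Sequence

BACKGROUND_INDEX = 0   # stated in the note above
NUM_MODEL_CLASSES = 1001


def to_imagenet_index(model_index: int) -> Optional[int]:
    """Map a 1001-way model index to a 0-based ImageNet-1k index.

    Returns None for the background class. Assumes a simple shift by one.
    """
    if not 0 <= model_index < NUM_MODEL_CLASSES:
        raise ValueError(f"index out of range: {model_index}")
    if model_index == BACKGROUND_INDEX:
        return None
    return model_index - 1


def argmax_ignoring_background(logits: Sequence[float]) -> int:
    """Best class index in the model's own numbering, skipping index 0."""
    return max(range(1, len(logits)), key=lambda i: logits[i])


# Toy logits where background scores highest but is skipped (illustrative only).
toy_logits = [9.0, 1.0, 3.0, 2.0]
best = argmax_ignoring_background(toy_logits)
print(best, to_imagenet_index(best))
```

Whether to ignore the background class is a design choice: keeping it lets the model signal that no ImageNet object is present, while skipping it always forces one of the 1,000 labels.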

📚 Documentation

BibTeX entry and citation info

@inproceedings{mobilenetv22018,
  title={MobileNetV2: Inverted Residuals and Linear Bottlenecks},
  author={Mark Sandler and Andrew Howard and Menglong Zhu and Andrey Zhmoginov and Liang-Chieh Chen},
  booktitle={CVPR},
  year={2018}
}

📄 License

License: other

Property	Details
Tags	vision, image-classification
Datasets	imagenet-1k
