Deeplabv3 MobileNet V2 1.0 513 Open-Source Semantic Segmentation Model - Lightweight Implementation for Precise Image Segmentation

Deeplabv3 Mobilenet V2 1.0 513

Developed by Google

A lightweight semantic segmentation model based on the MobileNetV2 architecture combined with a DeepLabV3+ segmentation head, pre-trained on the PASCAL VOC dataset

Image Segmentation

Transformers

Open Source License: Other

#Lightweight Semantic Segmentation #Mobile Optimization #Low-Power Model

Downloads 3,129

Release Time: 11/10/2022

Model Overview

This model uses MobileNetV2 as the backbone network, combined with a DeepLabV3+ segmentation head. It is designed for semantic image segmentation, with an emphasis on small size and efficiency.

Model Features

Lightweight Design

The MobileNetV2 architecture is optimized for mobile devices, reducing computational resource requirements while maintaining performance

Efficient Segmentation

Combined with the DeepLabV3+ segmentation head, it enables precise semantic segmentation

Pre-trained Model

Pre-trained on the PASCAL VOC dataset at 513x513 resolution, ready for direct use or fine-tuning

Model Capabilities

Image Segmentation

Semantic Understanding

Object Region Recognition

Use Cases

Computer Vision

Scene Understanding

Identify and segment different objects and regions in an image

Accurately marks the boundaries of various objects in the image

Autonomous Driving

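With fine-tuning on a driving dataset, the model can be used for road scene analysis to identify roads, vehicles, pedestrians, etc. The PASCAL VOC checkpoint itself covers classes such as car, bus, and person, but has no road class.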
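As a rough illustration of what marking object boundaries means on a segmentation result, the sketch below flags every pixel whose 4-neighbourhood contains a different class label. The helper name `boundary_pixels` and the toy label map are hypothetical and are not part of the model or the Transformers library; a real mask would come from the model's post-processing step.

```python
from typing import List

Mask = List[List[int]]

def boundary_pixels(mask: Mask) -> Mask:
    """Return a 0/1 grid marking pixels that touch a differently labelled 4-neighbour."""
    height, width = len(mask), len(mask[0])
    edges = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                ny, nx = y + dy, x + dx
                inside = 0 <= ny < height and 0 <= nx < width
                if inside and mask[ny][nx] != mask[y][x]:
                    edges[y][x] = 1
                    break
    return edges

# Toy 4x4 label map (hypothetical): 0 = background, 15 = one object region.
toy_mask = [
    [0, 0, 0, 0],
    [0, 15, 15, 0],
    [0, 15, 15, 0],
    [0, 0, 0, 0],
]
for row in boundary_pixels(toy_mask):
    print(row)
```

Pixels on both sides of a label change are marked, so the outline is two pixels thick; keeping only marked pixels that belong to the object class would give a one-pixel contour.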

🚀 MobileNetV2 with DeepLabV3+

A MobileNetV2 model pre-trained on PASCAL VOC at a resolution of 513x513, suitable for semantic image segmentation.
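To give a feel for what the 513x513 input size implies inside the network, the sketch below computes the side length of the feature map a strided backbone would produce. It assumes "same"-style padding, so each dimension shrinks to ceil(size / stride). The output strides 8 and 16 are values commonly used with DeepLab models and are an assumption here, not something this model card states.

```python
import math

def feature_map_size(input_size: int, output_stride: int) -> int:
    """Side length after downsampling by `output_stride` with 'same'-style padding."""
    return math.ceil(input_size / output_stride)

# Output strides 8 and 16 are assumed for illustration only.
for stride in (8, 16):
    side = feature_map_size(513, stride)
    print(f"output stride {stride}: {side}x{side} feature map")
```

Under these assumptions the segmentation head predicts on a much coarser grid than the input image, which is why predictions are typically upsampled before being compared with the original pixels.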

🚀 Quick Start

You can use the raw model for semantic segmentation. Check out the model hub to find fine-tuned versions for tasks that interest you.

✨ Features

The MobileNetV2 backbone is small, with low latency and low power consumption, so it runs efficiently on mobile devices.
It trades off latency, size, and accuracy, and compares favorably with popular models in the literature.
A DeepLabV3+ head is added to the MobileNetV2 backbone for semantic segmentation.
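Much of MobileNetV2's small footprint comes from depthwise separable convolutions, which split a standard convolution into a per-channel spatial filter followed by a 1x1 channel mixer. The sketch below compares multiply-accumulate counts for the two forms. The layer dimensions are hypothetical and chosen only for illustration; they are not taken from this checkpoint.

```python
def standard_conv_macs(k: int, c_in: int, c_out: int, h: int, w: int) -> int:
    """Multiply-accumulates for a standard k x k convolution on an h x w output."""
    return k * k * c_in * c_out * h * w

def depthwise_separable_macs(k: int, c_in: int, c_out: int, h: int, w: int) -> int:
    """Depthwise k x k filter per input channel, then a 1x1 pointwise convolution."""
    depthwise = k * k * c_in * h * w
    pointwise = c_in * c_out * h * w
    return depthwise + pointwise

# Hypothetical layer: 3x3 kernel, 32 -> 64 channels, 65x65 output grid.
layer = dict(k=3, c_in=32, c_out=64, h=65, w=65)
standard = standard_conv_macs(**layer)
separable = depthwise_separable_macs(**layer)
print(f"standard:  {standard:,} MACs")
print(f"separable: {separable:,} MACs")
print(f"reduction: {standard / separable:.2f}x")
```

The saving grows with the number of output channels and the kernel size, which is the trade-off between size, latency, and accuracy that the feature list describes.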

📦 Installation

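The original README does not list installation steps. The usage example below imports transformers, PIL (Pillow), and requests, and asks for PyTorch tensors (return_tensors="pt"), so those packages need to be installed.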

💻 Usage Examples

Basic Usage

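from transformers import AutoImageProcessor, AutoModelForSemanticSegmentation
from PIL import Image
import requests
import torch

# Download a sample image from the COCO validation set.
url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)

# Load the image processor and the model from the Hugging Face Hub.
preprocessor = AutoImageProcessor.from_pretrained("google/deeplabv3_mobilenet_v2_1.0_513")
model = AutoModelForSemanticSegmentation.from_pretrained("google/deeplabv3_mobilenet_v2_1.0_513")

# Preprocess the image into a batch of PyTorch tensors.
inputs = preprocessor(images=image, return_tensors="pt")

# Inference only, so gradient tracking is not needed.
with torch.no_grad():
    outputs = model(**inputs)

# A list with one class-index map (tensor) per input image.
predicted_mask = preprocessor.post_process_semantic_segmentation(outputs)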

Currently, both the image processor (formerly called the feature extractor) and the model support PyTorch.
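Each map returned by the post-processing step holds one class index per pixel, so a natural next step is to see which classes appear and how much of the image they cover. The sketch below does this on a plain nested list (a tensor can be converted with `.tolist()`). The `VOC_LABELS` list follows the standard PASCAL VOC ordering and is an assumption here; if the checkpoint's config provides an `id2label` mapping, prefer that. The helper `class_coverage` and the toy mask are hypothetical.

```python
from collections import Counter
from typing import Dict, List

# Standard PASCAL VOC class order (assumed; check the model config in practice).
VOC_LABELS = [
    "background", "aeroplane", "bicycle", "bird", "boat", "bottle", "bus",
    "car", "cat", "chair", "cow", "diningtable", "dog", "horse", "motorbike",
    "person", "pottedplant", "sheep", "sofa", "train", "tvmonitor",
]

def class_coverage(mask: List[List[int]], labels: List[str] = VOC_LABELS) -> Dict[str, float]:
    """Fraction of pixels assigned to each class that appears in the mask."""
    counts = Counter(pixel for row in mask for pixel in row)
    total = sum(counts.values())
    return {labels[class_id]: n / total for class_id, n in sorted(counts.items())}

# Toy 4x4 mask standing in for a real prediction (hypothetical values).
toy_mask = [
    [0, 0, 8, 8],
    [0, 0, 8, 8],
    [0, 15, 15, 0],
    [0, 15, 15, 0],
]
print(class_coverage(toy_mask))
```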

📚 Documentation

Model description

From the original README:

MobileNets are small, low-latency, low-power models parameterized to meet the resource constraints of a variety of use cases. They can be built upon for classification, detection, embeddings and segmentation similar to how other popular large scale models, such as Inception, are used. MobileNets can be run efficiently on mobile devices [...] MobileNets trade off between latency, size and accuracy while comparing favorably with popular models from the literature.

The model in this repo adds a DeepLabV3+ head to the MobileNetV2 backbone for semantic segmentation.
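The DeepLabV3+ paper cited at the end of this card builds its head around atrous (dilated) convolution, which spaces the kernel taps apart so the receptive field grows without adding weights. The toy 1-D sketch below illustrates only that idea; the function `atrous_conv1d` is hypothetical and is not the model's actual implementation.

```python
from typing import List

def atrous_conv1d(signal: List[float], kernel: List[float], rate: int = 1) -> List[float]:
    """'Valid' 1-D cross-correlation with kernel taps spaced `rate` samples apart."""
    span = (len(kernel) - 1) * rate + 1  # receptive field of one output value
    return [
        sum(weight * signal[start + i * rate] for i, weight in enumerate(kernel))
        for start in range(len(signal) - span + 1)
    ]

signal = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
kernel = [1.0, 0.0, -1.0]

# Same three weights, but rate=2 looks across 5 samples instead of 3.
print(atrous_conv1d(signal, kernel, rate=1))
print(atrous_conv1d(signal, kernel, rate=2))
```

With `rate=1` this is an ordinary convolution; raising the rate widens the context each output sees while the parameter count stays fixed, which is useful for dense prediction tasks such as segmentation.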

Intended uses & limitations

You can use the raw model for semantic segmentation. See the model hub to look for fine-tuned versions on a task that interests you.

📄 License

The license for this model is "other".

BibTeX entry and citation info

@inproceedings{deeplabv3plus2018,
  title={Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation},
  author={Liang-Chieh Chen and Yukun Zhu and George Papandreou and Florian Schroff and Hartwig Adam},
  booktitle={ECCV},
  year={2018}
}

Property	Details
Tags	vision, image-segmentation
Datasets	pascal-voc
