Segformer-b5 Open-source Semantic Segmentation Model - Free Deployment, Optimized for Cityscapes Dataset


Segformer B5 1024x1024 City 160k

Developed by smp-hub

Semantic segmentation model based on Segformer architecture, optimized for Cityscapes dataset

Image Segmentation

Safetensors

Open Source License: Other
Tags: High-Resolution Semantic Segmentation, Urban Street Scene Parsing, Segformer Architecture

Downloads 445

Release Time: 11/29/2024

Model Overview

This is a semantic segmentation model based on the Segformer architecture, specifically trained on the Cityscapes street scene dataset, capable of pixel-level classification of different objects and regions in street view images.

Model Features

Efficient Segmentation

Uses the Segformer architecture, which pairs a hierarchical Transformer encoder (here mit_b5) with a lightweight MLP decoder for efficient and accurate semantic segmentation

Pre-trained Support

Provides pre-trained weights for direct inference or fine-tuning

High-Resolution Processing

Targets high-resolution street scene images; the checkpoint name indicates a 1024x1024 training resolution

Model Capabilities

Street scene image segmentation

Pixel-level classification

Semantic understanding

Use Cases

Autonomous Driving

Road Scene Understanding

Identifying key elements such as roads, vehicles, and pedestrians

Urban Management

Infrastructure Analysis

Identifying and analyzing urban infrastructure distribution

🚀 Segformer Model Card

This is a model card for the Segformer model, which is used for image segmentation tasks. It provides details on how to load the trained model, its initialization parameters, and the dataset used.

🚀 Quick Start

You can quickly start using the Segformer model by following these steps:

Install the required libraries:

pip install -U segmentation_models_pytorch albumentations

Run the inference code:

import torch
import requests
import numpy as np
import albumentations as A
import segmentation_models_pytorch as smp

from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"

# Load pretrained model and preprocessing function
checkpoint = "smp-hub/segformer-b5-1024x1024-city-160k"
model = smp.from_pretrained(checkpoint).eval().to(device)
preprocessing = A.Compose.from_pretrained(checkpoint)

# Load image
url = "https://huggingface.co/datasets/hf-internal-testing/fixtures_ade20k/resolve/main/ADE_val_00000001.jpg"
image = Image.open(requests.get(url, stream=True).raw).convert("RGB")  # model expects 3 channels

# Preprocess image
np_image = np.array(image)
normalized_image = preprocessing(image=np_image)["image"]
input_tensor = torch.as_tensor(normalized_image)
input_tensor = input_tensor.permute(2, 0, 1).unsqueeze(0)  # HWC -> BCHW
input_tensor = input_tensor.to(device)

# Perform inference
with torch.no_grad():
    output_mask = model(input_tensor)

# Postprocess mask
mask = torch.nn.functional.interpolate(
    output_mask, size=(image.height, image.width), mode="bilinear", align_corners=False
)
mask = mask.argmax(1).cpu().numpy()  # argmax over predicted classes (channels dim)
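
After the argmax, `mask` has shape (1, H, W) and holds one class index per pixel. The sketch below shows one way to turn those indices into readable labels and per-class pixel shares. It assumes that output channel i corresponds to Cityscapes train ID i, which this card does not state explicitly; `class_shares` is a hypothetical helper and the toy mask stands in for `mask[0]`.

```python
import numpy as np

# Assumption: channel order follows the 19 standard Cityscapes train IDs.
CITYSCAPES_CLASSES = [
    "road", "sidewalk", "building", "wall", "fence", "pole",
    "traffic light", "traffic sign", "vegetation", "terrain", "sky",
    "person", "rider", "car", "truck", "bus", "train",
    "motorcycle", "bicycle",
]


def class_shares(mask: np.ndarray) -> dict:
    """Return the fraction of pixels assigned to each class present in the mask."""
    counts = np.bincount(mask.ravel(), minlength=len(CITYSCAPES_CLASSES))
    total = mask.size
    return {
        name: float(count) / total
        for name, count in zip(CITYSCAPES_CLASSES, counts)
        if count > 0
    }


# Toy 2x3 mask standing in for `mask[0]` from the inference code.
toy_mask = np.array([[0, 0, 13], [0, 10, 13]])
shares = class_shares(toy_mask)
print(shares)
```

With a real prediction, `class_shares(mask[0])` would give a quick summary of how much of the image each class covers.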

💻 Usage Examples

Basic Usage

The code in the "Quick Start" section shows the basic usage: loading a pre-trained Segformer model, preprocessing an image, performing inference, and post-processing the output mask.
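
To inspect the post-processed mask visually, a palette lookup can turn class indices into an RGB image. This is a minimal sketch, not part of the original card: it assumes the model's class indices follow the Cityscapes train-ID order and uses the conventional Cityscapes colours; `colorize` and `overlay` are hypothetical helper names.

```python
import numpy as np

# Assumption: row i is the conventional Cityscapes colour for train ID i.
PALETTE = np.array(
    [
        (128, 64, 128),   # road
        (244, 35, 232),   # sidewalk
        (70, 70, 70),     # building
        (102, 102, 156),  # wall
        (190, 153, 153),  # fence
        (153, 153, 153),  # pole
        (250, 170, 30),   # traffic light
        (220, 220, 0),    # traffic sign
        (107, 142, 35),   # vegetation
        (152, 251, 152),  # terrain
        (70, 130, 180),   # sky
        (220, 20, 60),    # person
        (255, 0, 0),      # rider
        (0, 0, 142),      # car
        (0, 0, 70),       # truck
        (0, 60, 100),     # bus
        (0, 80, 100),     # train
        (0, 0, 230),      # motorcycle
        (119, 11, 32),    # bicycle
    ],
    dtype=np.uint8,
)


def colorize(mask: np.ndarray) -> np.ndarray:
    """Map an (H, W) integer class mask to an (H, W, 3) uint8 RGB image."""
    return PALETTE[mask]


def overlay(image: np.ndarray, mask: np.ndarray, alpha: float = 0.5) -> np.ndarray:
    """Blend the colourised mask over an (H, W, 3) uint8 RGB image."""
    color = colorize(mask).astype(np.float32)
    blended = (1.0 - alpha) * image.astype(np.float32) + alpha * color
    return blended.round().astype(np.uint8)


# Toy example: a 1x2 mask (road, car) over a black image.
toy_mask = np.array([[0, 13]])
toy_image = np.zeros((1, 2, 3), dtype=np.uint8)
colored = colorize(toy_mask)
blended = overlay(toy_image, toy_mask)
```

With the Quick Start variables, `overlay(np.array(image), mask[0])` would produce an image that could be saved with `Image.fromarray(...)`.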

📦 Installation

To use this model, you need to install the following libraries:

pip install -U segmentation_models_pytorch albumentations

📚 Documentation

Model init parameters

The following are the initialization parameters for the model:

model_init_params = {
    "encoder_name": "mit_b5",
    "encoder_depth": 5,
    "encoder_weights": None,
    "decoder_segmentation_channels": 768,
    "in_channels": 3,
    "classes": 19,
    "activation": None,
    "aux_params": None
}
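
As a hedged sketch, these parameters might be used to rebuild the architecture with randomly initialised weights, for instance before fine-tuning with a different number of classes. It assumes `smp.Segformer` accepts the keyword arguments exactly as named in the dictionary; `build_segformer` and its `num_classes` argument are hypothetical helpers introduced here for illustration.

```python
model_init_params = {
    "encoder_name": "mit_b5",
    "encoder_depth": 5,
    "encoder_weights": None,  # no pretrained encoder weights are downloaded
    "decoder_segmentation_channels": 768,
    "in_channels": 3,
    "classes": 19,
    "activation": None,
    "aux_params": None,
}


def build_segformer(num_classes=None):
    """Instantiate an untrained Segformer from the card's init parameters.

    Assumption: smp.Segformer takes these keyword arguments unchanged.
    """
    # Imported lazily because the library is a heavy dependency.
    import segmentation_models_pytorch as smp

    params = dict(model_init_params)
    if num_classes is not None:
        params["classes"] = num_classes  # e.g. for a custom label set
    return smp.Segformer(**params)
```

Calling `build_segformer()` would return a model with the same layout as the checkpoint but without its trained weights; for inference, `smp.from_pretrained` as shown in the Quick Start remains the simpler route.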

Dataset

The model is trained on the Cityscapes dataset.

More Information

Library: https://github.com/qubvel/segmentation_models.pytorch
Docs: https://smp.readthedocs.io/en/latest/
License: https://github.com/NVlabs/SegFormer/blob/master/LICENSE

This model has been pushed to the Hub using the PyTorchModelHubMixin integration.

📄 License

The license information can be found at: https://github.com/NVlabs/SegFormer/blob/master/LICENSE
