UPerNet-Swin-Small: Open-Source Semantic Segmentation Model for ADE20K Scene Parsing

UPerNet Swin Small

Developed by smp-hub

UPerNet semantic segmentation model built on the Swin Transformer Small architecture, suited to scene parsing tasks such as ADE20K.

Image Segmentation

Safetensors

Open Source License: MIT
Tags: #Semantic Segmentation #Swin-Transformer Encoder #ADE20K Dataset

Downloads 100

Release Time : 4/12/2025

Model Overview

This model pairs the UPerNet architecture with a Swin-Small encoder for semantic segmentation. It is aimed at scene parsing and general image segmentation.

Model Features

Swin Transformer Backbone

Uses Swin-Small as the encoder; its hierarchical shifted-window attention captures features at multiple scales.

UPerNet Decoder Architecture

Employs the Unified Perceptual Parsing Network (UPerNet) as the decoder to achieve efficient multi-scale feature fusion

Pre-trained Support

Ships with pre-trained weights that load directly from the Hugging Face Hub.

ADE20K Optimization

Trained and evaluated on the ADE20K scene parsing dataset, with 150 semantic classes.

Model Capabilities

Image Semantic Segmentation

Scene Parsing

Pixel-Level Classification

Multi-Scale Feature Extraction

Use Cases

Computer Vision

Scene Understanding

Performs pixel-level recognition and segmentation of various objects in complex scenes

Outputs segmentation masks covering 150 object classes.

Autonomous Driving Perception

Parses various elements in road scenes (vehicles, pedestrians, roads, etc.)

Remote Sensing Image Analysis

Classifies and segments ground objects in satellite/aerial images

🚀 UPerNet Model Card

This model card provides details about the UPerNet model for image segmentation, including how to load the trained model, its initialization parameters, and the associated dataset.

🚀 Quick Start

Load trained model

You can start using the pre-trained model by following these steps.

Install requirements.

pip install -U segmentation_models_pytorch albumentations

Run inference.

import torch
import requests
import numpy as np
import albumentations as A
import segmentation_models_pytorch as smp

from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"

# Load pretrained model and preprocessing function
checkpoint = "smp-hub/upernet-swin-small"
model = smp.from_pretrained(checkpoint).eval().to(device)
preprocessing = A.Compose.from_pretrained(checkpoint)

# Load image
url = "https://huggingface.co/datasets/hf-internal-testing/fixtures_ade20k/resolve/main/ADE_val_00000001.jpg"
image = Image.open(requests.get(url, stream=True).raw)

# Preprocess image
np_image = np.array(image)
normalized_image = preprocessing(image=np_image)["image"]
input_tensor = torch.as_tensor(normalized_image)
input_tensor = input_tensor.permute(2, 0, 1).unsqueeze(0)  # HWC -> BCHW
input_tensor = input_tensor.to(device)

# Perform inference
with torch.no_grad():
    output_mask = model(input_tensor)

# Postprocess mask
mask = output_mask.argmax(1).cpu().numpy()  # argmax over predicted classes (channels dim)
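
Once the argmax step has produced an array of class indices, a common next step is to check which classes were predicted and how much of the image each covers. The following is a minimal sketch, not part of the original model card: `summarize_mask` is a hypothetical helper name, and `demo_mask` is a made-up stand-in for the real prediction so the example runs without downloading the model.

```python
import numpy as np


def summarize_mask(mask: np.ndarray, num_classes: int = 150) -> dict:
    """Return {class_index: fraction of pixels} for every class present in `mask`.

    `mask` is expected to hold non-negative integer class indices, for example
    the (1, H, W) array produced by taking argmax over the class dimension.
    """
    counts = np.bincount(mask.ravel(), minlength=num_classes)
    total = counts.sum()
    return {int(c): float(counts[c] / total) for c in np.nonzero(counts)[0]}


# Made-up stand-in for a real prediction (batch of 1, 2 x 3 pixels).
demo_mask = np.array([[[0, 0, 2], [2, 2, 5]]])
summary = summarize_mask(demo_mask)
for class_index, fraction in summary.items():
    print(f"class {class_index}: {fraction:.1%} of pixels")
```

Mapping the indices to human-readable names requires the ADE20K label list, which this page does not include.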

✨ Model init parameters

The following are the initialization parameters for the UPerNet model:

model_init_params = {
    "encoder_name": "tu-swin_small_patch4_window7_224",
    "encoder_depth": 5,
    "encoder_weights": None,
    "decoder_channels": 512,
    "decoder_use_norm": "batchnorm",
    "in_channels": 3,
    "classes": 150,
    "activation": None,
    "upsampling": 4,
    "aux_params": None,
    "img_size": 512
}
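
To see how `img_size` and `upsampling` relate, the arithmetic below sketches the spatial sizes involved. It rests on two assumptions that the parameter list does not state: the four Swin stages emit features at strides 4, 8, 16 and 32 (patch size 4 followed by three 2x patch-merging steps), and the decoder output sits at the finest of those scales before the final upsampling. `feature_map_sizes` is a hypothetical helper used only for illustration.

```python
# Assumption: per-stage output strides of a Swin backbone with patch size 4.
SWIN_STAGE_STRIDES = (4, 8, 16, 32)


def feature_map_sizes(img_size: int, strides=SWIN_STAGE_STRIDES) -> list:
    """Side length of each stage's feature map for a square input."""
    return [img_size // stride for stride in strides]


img_size = 512    # "img_size" in model_init_params
upsampling = 4    # "upsampling" in model_init_params

sizes = feature_map_sizes(img_size)
# Assumption: the decoder output is at the finest (stride-4) scale, so the
# final upsampling factor brings it back to the input resolution.
output_size = sizes[0] * upsampling

print("stage feature map sizes:", sizes)
print("output mask side length:", output_size)
```

Under these assumptions, an upsampling factor of 4 is what makes the output mask match the 512 x 512 input.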

📦 Dataset

The model uses the ADE20K dataset for training and evaluation.

📚 Documentation

Library: https://github.com/qubvel/segmentation_models.pytorch
Docs: https://smp.readthedocs.io/en/latest/

This model has been pushed to the Hub using the PyTorchModelHubMixin integration.

📄 License

This project is licensed under the MIT license.
