
Dpt Large Ade

Developed by Intel
A Dense Prediction Transformer (DPT) model fine-tuned on the ADE20K dataset for semantic segmentation.
Downloads 3,497
Release Time: 3/2/2022

Model Overview

The model uses a Vision Transformer (ViT) as its backbone and adds a neck and a segmentation head on top of it. Given an input image, it predicts a semantic class for every pixel.
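
As an illustration of the backbone, neck, and head pipeline described above, here is a minimal inference sketch using the Hugging Face `transformers` library. It is not taken from this page: the Hub id `Intel/dpt-large-ade`, the helper name `segment`, and the choice of bilinear upsampling are assumptions, and the code requires `torch`, `transformers`, and `Pillow` to be installed.

```python
MODEL_ID = "Intel/dpt-large-ade"  # assumed Hugging Face Hub id for this checkpoint


def segment(image_path, model_id=MODEL_ID):
    """Return a (height, width) tensor of predicted class indices for one image."""
    # Imported lazily so the heavy dependencies are only needed when called.
    import torch
    from PIL import Image
    from transformers import DPTForSemanticSegmentation, DPTImageProcessor

    processor = DPTImageProcessor.from_pretrained(model_id)
    model = DPTForSemanticSegmentation.from_pretrained(model_id)
    model.eval()

    image = Image.open(image_path).convert("RGB")
    inputs = processor(images=image, return_tensors="pt")

    with torch.no_grad():
        logits = model(**inputs).logits  # shape: (1, num_classes, h, w)

    # Resize the logits to the original image size (PIL gives width, height),
    # then take the highest-scoring class for each pixel.
    upsampled = torch.nn.functional.interpolate(
        logits, size=image.size[::-1], mode="bilinear", align_corners=False
    )
    return upsampled.argmax(dim=1)[0]
```

Calling `segment("street.jpg")` (a hypothetical file name) would download the weights on first use and return a label map with one class index per pixel.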

Model Features

High-performance semantic segmentation
Achieves 49.02% mIoU on the ADE20K dataset, a state-of-the-art result at the time the model was published.
Vision Transformer-based
Utilizes a Vision Transformer (ViT) as the backbone network combined with the Dense Prediction Transformer (DPT) architecture to deliver high-quality segmentation results.
Fine-tunability
The architecture can be fine-tuned on smaller datasets, where it has also been reported to reach state-of-the-art results.
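
The mIoU figure quoted above is the mean, over classes, of intersection-over-union between predicted and ground-truth labels. The sketch below shows that calculation on flat lists of per-pixel labels. It is a simplified illustration, not the benchmark's official evaluation code; the function name `mean_iou` and the `ignore_index` parameter are assumptions made for the example.

```python
def mean_iou(pred, target, num_classes, ignore_index=None):
    """Mean IoU over the classes that appear in pred or target."""
    intersection = [0] * num_classes
    pred_count = [0] * num_classes
    target_count = [0] * num_classes

    for p, t in zip(pred, target):
        if t == ignore_index:
            continue  # skip unlabeled pixels
        pred_count[p] += 1
        target_count[t] += 1
        if p == t:
            intersection[p] += 1

    ious = []
    for c in range(num_classes):
        union = pred_count[c] + target_count[c] - intersection[c]
        if union > 0:  # classes absent from both are left out of the mean
            ious.append(intersection[c] / union)

    return sum(ious) / len(ious) if ious else float("nan")


# Six pixels, three classes: per-class IoU is 1/3, 2/3, and 1/2.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
print(mean_iou(pred, target, num_classes=3))
```

In a full evaluation, intersections and unions are usually accumulated over the whole validation set before the per-class ratios are taken, rather than averaged image by image.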

Model Capabilities

Image semantic segmentation
High-resolution image processing
Multi-class object recognition

Use Cases

Computer Vision
Scene parsing
Parses the objects and background regions in complex scenes, which makes it suitable for applications such as autonomous driving and robot navigation.
Achieved 49.02% mIoU on the ADE20K dataset.
Image editing
Can be used in image editing tools to help users quickly separate different elements in an image.