# 🌳 Model Card for Restor's SegFormer-based TCD models
This is a semantic segmentation model designed to delineate tree cover in high-resolution (10 cm/px) aerial images, useful for ecological research and land-use analysis.
## 🚀 Quick Start
You can see a brief example of inference in this Colab notebook. For end-to-end usage, we direct users to our prediction and training pipeline, which also supports tiled prediction over arbitrarily large images, reporting outputs, etc.
## ✨ Features
- This semantic segmentation model accurately delineates tree cover in high-resolution aerial images.
- It provides a per-pixel classification of tree/no-tree.
- It supports tiled prediction over large images through the provided pipeline.
## 📚 Documentation

### Model Details

#### Model Description
This semantic segmentation model was trained on global aerial imagery and can accurately delineate tree cover in similar images. It provides per-pixel classification of tree/no-tree rather than detecting individual trees.

- Developed by: Restor / ETH Zurich
- Funded by: This project was made possible via a [Google.org impact grant](https://blog.google/outreach-initiatives/sustainability/restor-helps-anyone-be-part-ecological-restoration/)
- Model type: Semantic segmentation (binary class)
- License: Model training code is provided under an Apache 2.0 license. NVIDIA has released SegFormer under its own research license; users should check the terms of this license before deploying. This model was trained on CC BY-NC imagery.
- Finetuned from model: SegFormer family
SegFormer is a variant of the Pyramid Vision Transformer v2 model, sharing many structural features and adding a semantic segmentation decode head. Functionally, it is similar to a Feature Pyramid Network (FPN), as output predictions are based on combining features from different stages of the network at different spatial resolutions.
#### Model Sources

- Repository: https://github.com/restor-foundation/tcd
- Paper: We will release a preprint shortly.
### Uses

#### Direct Use

This model is suitable for inference on a single image tile. For large orthomosaics, a higher-level framework is required; our repository provides a comprehensive reference implementation of such a pipeline and has been tested on extremely large (country-scale) images. The model predicts over the entire input image; to restrict predictions to specific regions, users should perform region-of-interest analysis, which our linked pipeline repository supports via shapefiles.
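To illustrate the idea behind tiled prediction, here is a minimal sketch that computes overlapping tile windows covering a large image. The function name and parameters are hypothetical; the actual pipeline in the tcd repository additionally handles blending, georeferencing, and reporting.

```python
def tile_windows(width, height, tile_size=1024, overlap=256):
    """Yield (x, y, w, h) windows covering an image, with overlap so that
    tile-edge artifacts can later be blended away. Illustrative sketch only."""
    stride = tile_size - overlap
    xs = list(range(0, max(width - tile_size, 0) + 1, stride))
    ys = list(range(0, max(height - tile_size, 0) + 1, stride))
    # Ensure the right and bottom edges are always covered.
    if xs[-1] + tile_size < width:
        xs.append(width - tile_size)
    if ys[-1] + tile_size < height:
        ys.append(height - tile_size)
    for y in ys:
        for x in xs:
            yield (x, y, min(tile_size, width), min(tile_size, height))

# A 4000 x 3000 px orthomosaic is covered by overlapping 1024 px tiles.
windows = list(tile_windows(4000, 3000))
```

Each window can then be cropped, run through the model, and the per-tile masks merged back into a full-size prediction.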
#### Out-of-Scope Use

- The model was trained on 10 cm/px imagery. Performance may vary at other resolutions; for routine prediction at a different resolution, fine-tuning on a resampled dataset is recommended.
- It predicts only the likelihood of per-pixel tree-canopy cover, not biomass, canopy height, or other derived quantities.
- As-is, it is not suitable for carbon credit estimation.
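To gauge how imagery at a different ground sample distance (GSD) would need to be resampled to match the model's native 10 cm/px, a simple helper can be used. This is a hypothetical illustration, not part of the tcd toolchain:

```python
def resample_size(width_px, height_px, gsd_cm, target_gsd_cm=10.0):
    """Return the pixel dimensions to resample an image to so that its
    ground sample distance matches the model's native 10 cm/px.
    Hypothetical helper for illustration."""
    scale = gsd_cm / target_gsd_cm
    return round(width_px * scale), round(height_px * scale)

# A 2048 x 2048 px image captured at 5 cm/px covers half the ground
# per pixel, so it should be downsampled by a factor of two.
new_size = resample_size(2048, 2048, gsd_cm=5)
```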
### Bias, Risks, and Limitations

- The main limitation is false positives over objects that look like trees, such as large bushes, shrubs, or ground cover.
- The training dataset was annotated by non-experts, which may lead to incorrect labels and biases in model output. We are working to re-evaluate the training data.
- We provide cross-validation results, but no guarantees on accuracy. Users should perform independent testing for critical use cases.
### Training Details

#### Training Data

The training dataset can be found here. Image labels are largely released under a CC BY 4.0 license, with smaller subsets of CC BY-NC and CC BY-SA imagery.
#### Training Procedure

We used a 5-fold cross-validation process to adjust hyperparameters during training, then trained on the "full" training set and evaluated on a holdout set. The model in the `main` branch is the release version.

We used PyTorch Lightning as the training framework, invoked with the following command:

```shell
tcd-train semantic segformer-mit-b3 data.output= ... data.root=/mnt/data/tcd/dataset/holdout data.tile_size=1024
```
#### Preprocessing

This repository contains a pre-processor configuration for use with the `transformers` library. You can load it as follows:

```python
from transformers import AutoImageProcessor

processor = AutoImageProcessor.from_pretrained('restor/tcd-segformer-mit-b3')
```

Note that we do not resize input images and assume that normalisation is performed in this processing step.
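Since the processor does not resize, preprocessing essentially reduces to scaling and normalisation. A minimal NumPy sketch of the equivalent operation, assuming the standard ImageNet statistics (see "Training Hyperparameters" for the normalisation choice):

```python
import numpy as np

# Standard ImageNet channel statistics (an assumption; the exact values
# used are defined by the processor configuration).
IMAGENET_MEAN = np.array([0.485, 0.456, 0.406])
IMAGENET_STD = np.array([0.229, 0.224, 0.225])

def normalise(image_uint8):
    """Scale an HxWx3 uint8 RGB image to [0, 1] and apply channel-wise
    ImageNet normalisation. No resizing is performed."""
    x = image_uint8.astype(np.float32) / 255.0
    return (x - IMAGENET_MEAN) / IMAGENET_STD
```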
#### Training Hyperparameters

- Image size: 1024 px square
- Learning rate: initially 1e-4, decaying to 1e-5
- Learning rate schedule: reduce on plateau
- Optimizer: AdamW
- Augmentation: random crop to 1024x1024, arbitrary rotation, flips, colour adjustments
- Number of epochs: 75 during cross-validation; 50 for final models
- Normalisation: ImageNet statistics
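The learning-rate range and schedule above interact: reduce-on-plateau lowers the rate whenever the validation loss stops improving. A toy sketch of that logic (illustrative only; training actually uses the built-in PyTorch scheduler via PyTorch Lightning, and the default values below are assumptions):

```python
class ReduceOnPlateau:
    """Minimal sketch of a reduce-on-plateau schedule: when validation
    loss fails to improve for `patience` epochs, multiply the learning
    rate by `factor`, never going below `min_lr`."""

    def __init__(self, lr=1e-4, factor=0.1, patience=5, min_lr=1e-5):
        self.lr, self.factor = lr, factor
        self.patience, self.min_lr = patience, min_lr
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        if val_loss < self.best:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
            if self.bad_epochs > self.patience:
                self.lr = max(self.lr * self.factor, self.min_lr)
                self.bad_epochs = 0
        return self.lr
```

With `lr=1e-4` and `min_lr=1e-5`, this reproduces the 1e-4 to 1e-5 range listed above.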
#### Speeds, Sizes, Times

The model can be evaluated on a CPU, but large tile sizes require a lot of RAM; in general, 1024 px inputs are recommended. All models were trained on a single GPU with 24 GB VRAM (NVIDIA RTX 3090) attached to a 32-core machine with 64 GB RAM. Smaller models take under half a day to train, while the largest take just over a day.
### Evaluation

#### Testing Data

The training dataset is available here. The model in the `main` branch was trained on the `train` images and tested on the `test` (holdout) images.

#### Metrics

We report F1, accuracy, and IoU on the holdout dataset, as well as results from a 5-fold cross-validation split. Cross-validation is visualised as min/max error bars in the plots.
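For reference, these metrics reduce to simple pixel counts for a binary tree/no-tree mask. A minimal sketch (the function name is hypothetical; it assumes the target contains at least one positive pixel):

```python
import numpy as np

def binary_seg_metrics(pred, target):
    """Compute F1 (Dice), IoU, and accuracy for binary segmentation masks.
    `pred` and `target` are boolean arrays of the same shape."""
    tp = np.logical_and(pred, target).sum()
    fp = np.logical_and(pred, ~target).sum()
    fn = np.logical_and(~pred, target).sum()
    iou = tp / (tp + fp + fn)
    f1 = 2 * tp / (2 * tp + fp + fn)
    accuracy = (pred == target).mean()
    return {"f1": f1, "iou": iou, "accuracy": accuracy}
```

Note that F1 and IoU ignore true negatives, which is why they are more informative than plain accuracy for sparse tree cover.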
#### Results

### Environmental Impact

- Hardware Type: NVIDIA RTX 3090
- Hours used: < 36
- Carbon Emitted: 5.44 kg CO2-equivalent per model
Carbon emissions were estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019). This estimate does not account for experimentation time or failed training runs.
### Citation

We will provide a preprint version of our paper shortly. In the meantime, please cite as:

```bibtex
@unpublished{restortcd,
  author = "Veitch-Michaelis, Josh and Cottam, Andrew and Schweizer, Daniella and Broadbent, Eben N. and Dao, David and Zhang, Ce and Almeyda Zambrano, Angelica and Max, Simeon",
  title  = "OAM-TCD: A globally diverse dataset of high-resolution tree cover maps",
  note   = "In prep.",
  month  = "06",
  year   = "2024"
}
```
### Model Card Authors

Josh Veitch-Michaelis, 2024; on behalf of the dataset authors.

### Model Card Contact

Please contact josh [at] restor.eco for questions or further information.
## 📄 License

Model training code is provided under an Apache 2.0 license. NVIDIA has released SegFormer under its own research license. This model was trained on CC BY-NC imagery.