Nat Base In1k 224

Developed by shi-labs
NAT-Base is a hierarchical Vision Transformer trained on ImageNet-1K at 224x224 resolution. It uses neighborhood attention for image classification.
Downloads 6
Release Time: 11/18/2022

Model Overview

NAT is a hierarchical Vision Transformer built on Neighborhood Attention (NA) and designed for image classification. NA is a restricted form of self-attention in which each token attends only to its nearest neighboring pixels rather than to the whole image. This keeps the attention pattern flexible while preserving translational equivariance.
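The restriction described above can be illustrated with a small, dependency-free sketch. It is a simplified 1D version for illustration only, not the NATTEN implementation: the function names are hypothetical, the kernel size is assumed to be odd, and the window is assumed to shift inward at the borders so that every token still sees the same number of neighbors.

```python
import math


def neighborhood_indices(i, length, kernel_size):
    """Return the kernel_size positions nearest to token i on a 1D line.

    Assumption: near a border the window is shifted inward instead of
    being padded, so each token always attends to kernel_size neighbors.
    """
    if kernel_size > length:
        raise ValueError("kernel_size must not exceed the sequence length")
    start = i - kernel_size // 2
    start = max(0, min(start, length - kernel_size))
    return list(range(start, start + kernel_size))


def neighborhood_attention_1d(queries, keys, values, kernel_size):
    """Scaled dot-product attention restricted to each token's neighborhood."""
    length = len(queries)
    scale = 1.0 / math.sqrt(len(queries[0]))
    outputs = []
    for i in range(length):
        idx = neighborhood_indices(i, length, kernel_size)
        # Attention scores against the neighbors only, not the full sequence.
        scores = [
            scale * sum(q * k for q, k in zip(queries[i], keys[j])) for j in idx
        ]
        # Numerically stable softmax over the neighborhood.
        peak = max(scores)
        weights = [math.exp(s - peak) for s in scores]
        total = sum(weights)
        # Weighted sum of the neighbors' value vectors.
        outputs.append([
            sum(w / total * values[j][d] for w, j in zip(weights, idx))
            for d in range(len(values[0]))
        ])
    return outputs
```

For a sequence of 8 tokens and a kernel size of 3, token 4 attends to positions 3, 4, and 5, while token 0 attends to positions 0, 1, and 2. The cost per token depends on the kernel size rather than on the sequence length, which is the practical appeal of restricting attention this way.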

Model Features

Neighborhood Attention Mechanism
It uses a sliding-window attention pattern in which each token's receptive field is limited to its nearest neighboring pixels, which preserves translational equivariance.
Efficient Implementation
The neighborhood attention mechanism is efficiently implemented in PyTorch through the NATTEN library.
Hierarchical Structure
It uses a hierarchical Vision Transformer architecture, suitable for processing visual features at different scales.
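
To make the idea of multiple scales concrete, the sketch below computes the side length of the feature map at each stage of a hierarchical backbone. The numbers are assumptions for illustration, not values stated on this page: a 224-pixel input, an initial 4x downsampling step, four stages, and a 2x downsampling between consecutive stages. The function name is hypothetical.

```python
def stage_resolutions(image_size=224, stem_stride=4, num_stages=4):
    """Side length of the square feature map entering each stage.

    Assumptions: the stem reduces resolution by stem_stride, and the
    resolution is halved between consecutive stages.
    """
    sizes = []
    size = image_size // stem_stride
    for _ in range(num_stages):
        sizes.append(size)
        size //= 2
    return sizes


print(stage_resolutions())  # [56, 28, 14, 7]
```

Early stages work on fine, high-resolution maps and later stages on coarse ones, which is what lets a hierarchical model represent features at several scales.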

Model Capabilities

Image Classification
Visual Feature Extraction

Use Cases

Computer Vision
ImageNet Image Classification
Classify an image into one of the 1,000 ImageNet categories.
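
A classifier of this kind typically outputs one score (logit) per category, and the predicted class is the one with the highest score. The following minimal sketch shows that post-processing step in plain Python. The function name, the toy logits, and the three-entry label list are hypothetical stand-ins; a real run would use 1,000 logits and the ImageNet label list.

```python
import math


def top_k_predictions(logits, labels, k=5):
    """Convert raw logits to probabilities and return the k most likely labels."""
    # Numerically stable softmax.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Rank class indices by probability, highest first.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    return [(labels[i], probs[i]) for i in ranked]


# Toy example with three hypothetical classes instead of 1,000.
for label, prob in top_k_predictions([0.1, 2.0, -1.0], ["cat", "dog", "fish"], k=2):
    print(f"{label}: {prob:.3f}")
```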