Hiera Base Plus 224 Hf

Developed by facebook
Hiera is a hierarchical vision Transformer that is fast, powerful, and simple. Its authors report that it outperforms prior state-of-the-art models across a wide range of image and video tasks while running significantly faster.
Downloads 15
Release Time: 5/12/2024

Model Overview

Hiera is an efficient hierarchical vision Transformer designed for image classification, feature extraction, and masked image modeling. By removing redundant modules and pretraining with a masked autoencoder (MAE), it achieves high performance on multiple image and video recognition tasks.

Model Features

Efficient Hierarchical Design
Uses a hierarchical structure that lowers spatial resolution and increases the number of feature channels from stage to stage, which significantly improves computational efficiency.
Simplified Architecture
Strips redundant modules from existing hierarchical vision Transformers, improving accuracy while keeping the architecture concise.
MAE Training
Uses masked autoencoder (MAE) pretraining so that the model learns spatial biases from data, rather than having them added by hand through architectural components.
High Performance
Surpasses state-of-the-art performance in multiple image and video recognition tasks while maintaining fast inference speed.
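
To make the hierarchical design concrete, the following is a minimal sketch of how token resolution and channel width could evolve across stages. It is not taken from the model's source code. The stride-4 patch stem, the four stages, the halving of resolution and doubling of channels at each stage, and the initial width of 112 channels for the Base Plus variant are all assumptions used for illustration, and `stage_shapes` is a hypothetical helper.

```python
def stage_shapes(image_size=224, stem_stride=4, embed_dim=112, num_stages=4):
    """Return (stage, height, width, channels) for each stage.

    Assumptions (illustrative, not read from the checkpoint):
    - the stem downsamples the image by `stem_stride`
    - every later stage halves the resolution and doubles the channels
    """
    shapes = []
    side = image_size // stem_stride
    channels = embed_dim
    for stage in range(1, num_stages + 1):
        shapes.append((stage, side, side, channels))
        side //= 2       # fewer tokens in deeper stages
        channels *= 2    # wider features in deeper stages
    return shapes


for stage, height, width, channels in stage_shapes():
    print(f"stage {stage}: {height}x{width} tokens, {channels} channels")
```

Under these assumptions, most tokens are processed in the early, narrow stages and only a small grid reaches the wide final stage, which is the usual source of the efficiency gain in hierarchical models.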

Model Capabilities

Image Classification
Feature Extraction
Masked Image Modeling
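
Masked image modeling, as used in MAE training, hides a random subset of the image and asks the model to reconstruct it from the visible part. The sketch below shows only the random masking step in plain Python. The 7x7 grid of mask units (49 units for a 224x224 image) and the 0.6 mask ratio are assumptions chosen for illustration, and `random_mask` is a hypothetical helper, not part of any library.

```python
import random


def random_mask(num_units, mask_ratio=0.6, seed=0):
    """Return a list of booleans; True marks a masked (hidden) unit.

    `num_units` and `mask_ratio` are illustrative assumptions.
    """
    rng = random.Random(seed)  # seeded for reproducibility
    num_masked = int(num_units * mask_ratio)
    masked = set(rng.sample(range(num_units), num_masked))
    return [index in masked for index in range(num_units)]


mask = random_mask(num_units=49, mask_ratio=0.6)
visible = [index for index, is_masked in enumerate(mask) if not is_masked]
print(f"{len(visible)} of {len(mask)} units stay visible to the encoder")
```

During pretraining only the visible units would be passed to the encoder, and a lightweight decoder would be trained to predict the masked ones.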

Use Cases

Computer Vision
Image Classification
Classifies input images to identify the main objects or scenes within them.
Performs well on benchmarks such as ImageNet-1K.
Feature Extraction
Extracts multi-level feature representations from images, useful for downstream vision tasks.
Can extract feature maps at different stages, supporting a variety of vision applications.
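
For feature extraction, a minimal sketch using the Hugging Face transformers library might look like the following. The Hub identifier `facebook/hiera-base-plus-224-hf` is inferred from the model name and developer listed on this page and should be treated as an assumption, as should the availability of a transformers release that includes Hiera support. `extract_features` is a hypothetical helper, and the heavy imports are placed inside it so the module can be loaded without them.

```python
MODEL_ID = "facebook/hiera-base-plus-224-hf"  # assumed Hub identifier


def extract_features(image_path, model_id=MODEL_ID):
    """Return the final token features and the per-layer hidden states."""
    # Imported here so that defining the function has no heavy dependencies.
    import torch
    from PIL import Image
    from transformers import AutoImageProcessor, AutoModel

    processor = AutoImageProcessor.from_pretrained(model_id)
    model = AutoModel.from_pretrained(model_id)
    model.eval()

    image = Image.open(image_path).convert("RGB")
    inputs = processor(images=image, return_tensors="pt")

    with torch.no_grad():
        outputs = model(**inputs, output_hidden_states=True)

    # last_hidden_state: features from the final stage
    # hidden_states: tuple of intermediate features, one entry per layer
    return outputs.last_hidden_state, outputs.hidden_states
```

Calling `extract_features("example.jpg")` with a local image path (a placeholder here) would download the weights on first use. The intermediate hidden states are what a downstream task such as detection or segmentation would typically consume.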