
Eva02 Large Patch14 224.mim In22k

Developed by timm
EVA02 feature/representation model pretrained on ImageNet-22k via masked image modeling. It uses the Vision Transformer architecture and is suited to image classification and feature extraction tasks.
Downloads 280
Release Time: 3/31/2023

Model Overview

The EVA-02 model is an image feature extraction model based on the Vision Transformer architecture, pretrained on the ImageNet-22k dataset via masked image modeling (MIM), supporting image classification and feature embedding tasks.
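A minimal sketch of loading the model as a feature extractor with the timm library. The model identifier is inferred from the page title and should be checked against the timm model listing; the helper names build_extractor and embed are hypothetical, chosen only for illustration. Running it requires timm, torch, and Pillow, plus network access to download the weights.

```python
# Identifier inferred from the page title (assumption; verify before use).
MODEL_NAME = "eva02_large_patch14_224.mim_in22k"


def build_extractor(model_name=MODEL_NAME, pretrained=True):
    """Create the model without a classifier head, plus its preprocessing."""
    # Imported lazily so this file can be loaded without timm installed.
    import timm

    # num_classes=0 removes the classifier, so the model returns pooled features.
    model = timm.create_model(model_name, pretrained=pretrained, num_classes=0)
    model.eval()

    # Build the preprocessing pipeline that matches the pretrained weights.
    data_config = timm.data.resolve_model_data_config(model)
    transform = timm.data.create_transform(**data_config, is_training=False)
    return model, transform


def embed(model, transform, image):
    """Return a (1, num_features) embedding tensor for one PIL image."""
    import torch

    with torch.no_grad():
        return model(transform(image).unsqueeze(0))
```

Typical use would be model, transform = build_extractor() followed by embed(model, transform, Image.open(path).convert("RGB")), where path is a local image file.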

Model Features

Masked Image Modeling Pretraining
Uses EVA-CLIP as the MIM teacher model for pretraining, enhancing the model's feature extraction capabilities.
Optimized Transformer Architecture
Incorporates techniques such as mean pooling, the SwiGLU activation function, and rotary position embeddings (RoPE) to improve model performance.
High-Precision Feature Extraction
Pretrained on large-scale datasets like ImageNet-22k, capable of extracting high-quality image features.
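The three architectural techniques named above can be illustrated in a few lines of plain Python. This is a simplified sketch, not the timm implementation: the function names and weight layout are hypothetical, the RoPE variant shown is one-dimensional and rotates adjacent pairs (the actual model may use a 2D variant and a different pairing), and normalization layers and biases are omitted.

```python
import math


def silu(x):
    # SiLU (Swish): x * sigmoid(x)
    return x / (1.0 + math.exp(-x))


def matvec(weights, x):
    # weights is a list of rows; each row has len(x) entries.
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def swiglu(x, w_gate, w_value, w_out):
    # SwiGLU feed-forward: (SiLU(W_gate x) * (W_value x)) projected by W_out.
    gate = [silu(g) for g in matvec(w_gate, x)]
    value = matvec(w_value, x)
    hidden = [g * v for g, v in zip(gate, value)]
    return matvec(w_out, hidden)


def mean_pool(tokens):
    # Average the patch token vectors into a single image-level vector.
    n = len(tokens)
    return [sum(column) / n for column in zip(*tokens)]


def rope_rotate(x, position, base=10000.0):
    # Rotate each adjacent pair (x[i], x[i+1]) by an angle that depends on
    # the token position and on the pair's index within the vector.
    d = len(x)
    out = []
    for i in range(0, d, 2):
        theta = position * base ** (-i / d)
        c, s = math.cos(theta), math.sin(theta)
        out.extend([x[i] * c - x[i + 1] * s, x[i] * s + x[i + 1] * c])
    return out
```

Because RoPE only rotates pairs of coordinates, it leaves the vector norm unchanged and encodes position purely in the angle, which is why attention scores between rotated queries and keys depend on relative position.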

Model Capabilities

Image Classification
Image Feature Extraction
Visual Representation Learning

Use Cases

Computer Vision
Image Classification
Uses the pretrained model for image classification, supporting multi-category recognition.
Achieves high accuracy on ImageNet-1k (see performance comparison table).
Feature Embedding
Extracts image feature vectors for downstream tasks like object detection and image retrieval.
Generates high-quality image feature representations.
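As an illustration of the image retrieval use case, the sketch below ranks a small gallery of embedding vectors by cosine similarity to a query vector. The vectors here are toy values standing in for real model outputs, and the function names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    # Cosine of the angle between two vectors; 0.0 if either has zero length.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query, gallery, k=3):
    # gallery maps an image name to its embedding vector.
    scored = [(name, cosine_similarity(query, vec)) for name, vec in gallery.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy 2-D "embeddings"; real embeddings would come from the model.
gallery = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
results = top_k([1.0, 0.0], gallery, k=2)
```

For larger galleries, the same ranking is usually done with a vectorized matrix product or an approximate nearest-neighbor index rather than a Python loop.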