Aimv2 1B Patch14 448

Developed by Apple
AIMv2 is a series of vision models pretrained with multimodal autoregressive objectives, achieving outstanding performance across multiple vision understanding benchmarks.
Downloads 71
Release Time: 10/29/2024

Model Overview

AIMv2 is an efficient vision model pretrained with multimodal autoregressive objectives, excelling in tasks such as image classification and object detection.

Model Features

Multimodal Autoregressive Pretraining
Pretrained with multimodal autoregressive objectives, enhancing the model's generalization and performance.
High Performance
Outperforms models like CLIP and SigLIP across multiple vision understanding benchmarks.
Efficient Scaling
A simple, straightforward pretraining method allows efficient scaling to larger models.

Model Capabilities

Image feature extraction
Image classification
Multimodal understanding
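
To give a rough sense of what image feature extraction produces for this variant, the sketch below works out the patch grid implied by the model name. It assumes that "Patch14" means 14x14-pixel patches and "448" means a 448x448-pixel input, which is the usual vision-transformer naming convention but is not stated on this page. `patch_grid` is a hypothetical helper written for illustration, not part of any library.

```python
def patch_grid(image_size: int = 448, patch_size: int = 14) -> tuple[int, int]:
    """Return (patches_per_side, total_patches) for a square input image."""
    # Assumption: the image is split into non-overlapping square patches.
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    per_side = image_size // patch_size
    return per_side, per_side * per_side


per_side, total = patch_grid()
print(per_side, total)  # 32 1024
```

Under these assumptions the encoder would produce one feature vector for each of 1,024 patches per image, before counting any extra tokens the model may add.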

Use Cases

Computer Vision
Image Classification
Performs image classification tasks on datasets like ImageNet-1k.
Accuracy: 89.0%
Open-Vocabulary Object Detection
Outperforms DINOv2 in open-vocabulary object detection tasks.
Referring Expression Understanding
Outperforms DINOv2 in referring expression understanding tasks.
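
The ImageNet-1k accuracy listed above is presumably top-1 accuracy, although the page does not say so. As a minimal sketch of how such a figure is computed, the following counts how often the predicted class matches the true label. The function name and the toy predictions are illustrative only and are not taken from the AIMv2 evaluation.

```python
def top1_accuracy(predicted: list[int], labels: list[int]) -> float:
    """Fraction of samples whose predicted class equals the true label."""
    if len(predicted) != len(labels) or not labels:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(p == y for p, y in zip(predicted, labels))
    return correct / len(labels)


# Toy data (made up for illustration): 3 of 4 predictions are correct.
preds = [3, 1, 2, 2]
labels = [3, 1, 0, 2]
print(f"{top1_accuracy(preds, labels):.1%}")  # 75.0%
```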