AIMv2 3B Patch14 336

Developed by Apple
AIMv2 is a series of vision models pretrained with multimodal autoregressive objectives, achieving excellent performance across multiple multimodal understanding benchmarks.
Downloads 23
Release Time: 10/29/2024

Model Overview

AIMv2 is an efficient vision model pretrained with multimodal autoregressive objectives, excelling in tasks such as image classification and object detection.
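The name "Patch14 336" suggests a ViT-style encoder that takes 336x336-pixel inputs and splits them into 14x14-pixel patches. That reading of the name is an assumption, since this page does not spell it out. Under that assumption, the number of patch tokens per image can be worked out with a small sketch (the helper name `patch_grid` is purely illustrative):

```python
def patch_grid(image_size: int, patch_size: int) -> tuple[int, int]:
    """Return (patches per side, total patches) for a square image.

    Assumes non-overlapping square patches, as in a standard ViT.
    """
    if image_size % patch_size != 0:
        raise ValueError("image_size must be a multiple of patch_size")
    per_side = image_size // patch_size
    return per_side, per_side * per_side


# Values read off the model name: 336-pixel input, 14-pixel patches.
per_side, total = patch_grid(336, 14)
print(per_side, total)  # 24 576
```

Each image would therefore be represented as a 24x24 grid, or 576 patch tokens, before any pooling.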

Model Features

Multimodal Autoregressive Pretraining
Pretrained with multimodal autoregressive objectives, which strengthens the model's multimodal understanding
High Performance
Outperforms models like CLIP, SigLIP, and DINOv2 on multiple benchmarks
Large-Scale Scalability
A simple, direct pretraining method that scales effectively with training size

Model Capabilities

Image feature extraction
Image classification
Open-vocabulary object detection
Referring expression comprehension
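Image feature extraction, the first capability listed, might look roughly like the sketch below using the Hugging Face `transformers` library. Several things here are assumptions rather than facts stated on this page: the Hub id `apple/aimv2-3B-patch14-336`, the need for `trust_remote_code=True`, and the presence of a `last_hidden_state` field on the model output. The helper name `extract_features` is hypothetical.

```python
MODEL_ID = "apple/aimv2-3B-patch14-336"  # assumed Hugging Face Hub id


def extract_features(image, model_id: str = MODEL_ID):
    """Return per-patch features for one PIL image.

    Imports are done lazily so that merely defining this function does not
    require transformers/torch or trigger a multi-gigabyte download.
    """
    import torch
    from transformers import AutoImageProcessor, AutoModel

    processor = AutoImageProcessor.from_pretrained(model_id)
    # trust_remote_code is an assumption: some transformers releases load
    # AIMv2 through custom code shipped with the checkpoint.
    model = AutoModel.from_pretrained(model_id, trust_remote_code=True)
    model.eval()

    inputs = processor(images=image, return_tensors="pt")
    with torch.no_grad():  # inference only, no gradients needed
        outputs = model(**inputs)
    # Assumed output layout: (batch, num_patches, hidden_dim).
    return outputs.last_hidden_state
```

Calling `extract_features(img)` on a `PIL.Image` would download the weights on first use. The returned patch features could then be pooled and passed to a downstream classifier or detector head.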

Use Cases

Computer Vision
Image Classification
High-precision image classification on datasets like ImageNet
ImageNet-1k accuracy 89.2%
Fine-Grained Classification
Classification on domain-specific datasets such as stanford-cars
stanford-cars accuracy 96.6%
Medical Imaging
Pathology Image Analysis
Analysis on medical imaging datasets like camelyon17
camelyon17 accuracy 93.2%
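The page does not say how the accuracy figures above were computed. Assuming they are standard top-1 accuracy (the share of samples whose predicted class matches the label), the metric can be sketched as follows. The function name and the toy data are illustrative only.

```python
def top1_accuracy(predictions: list[int], labels: list[int]) -> float:
    """Percentage of samples whose predicted class equals the true label."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("at least one sample is required")
    correct = sum(p == y for p, y in zip(predictions, labels))
    return 100.0 * correct / len(labels)


# Toy example: 3 of 4 predictions match their labels.
preds = [0, 2, 1, 1]
labels = [0, 2, 1, 0]
print(top1_accuracy(preds, labels))  # 75.0
```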
© 2025 AIbase