
# Open-Vocabulary Detection

Aimv2 3B Patch14 448
AIMv2 is a series of vision models pretrained with multimodal autoregressive objectives, demonstrating excellent performance across multiple visual understanding benchmarks.
Image Classification
apple
161
12
Aimv2 1B Patch14 448
AIMv2 is a series of vision models pretrained with multimodal autoregressive objectives, achieving outstanding performance across multiple visual understanding benchmarks.
Image Classification
apple
71
0
Aimv2 Large Patch14 336
AIMv2 is a series of vision models pretrained with multimodal autoregressive objectives, excelling in various vision tasks.
Image Classification
apple
6,177
3
Aimv2 1B Patch14 224
AIMv2 is a series of vision models pretrained with multimodal autoregressive objectives, excelling in various vision tasks.
Image Classification
apple
299
7
Vitamin XL 256px
MIT
ViTamin-XL-256px is a vision-language model based on the ViTamin architecture, designed for efficient visual feature extraction and multimodal tasks, supporting high-resolution image processing.
Text-to-Image Transformers
jienengchen
655
1