Vit Gigantic Patch14 Clip 224.metaclip 2pt5b
Developed by timm
A vision model trained on the MetaCLIP-2.5B dataset that can be loaded from both the OpenCLIP and timm frameworks
Downloads 444
Release Time: 10/23/2024
Model Overview
This model is a large-scale Vision Transformer (ViT) image encoder that uses 14x14-pixel patches at 224x224 input resolution and is trained CLIP-style on image-text pairs. It is primarily used for zero-shot image classification and can be loaded from either OpenCLIP or timm.
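The sketch below shows one way the two loading paths might look. The Hub ID is an assumption inferred from the model name and should be checked against the model page before use. With OpenCLIP, both the image and text towers are expected to load; with timm, the model is expected to act as an image feature extractor only. Imports are placed inside the functions so the snippet can be read and imported without either library installed, and nothing is downloaded until a function is called.

```python
# Assumption: Hub ID inferred from the model name; verify it on the model page.
HUB_ID = "timm/vit_gigantic_patch14_clip_224.metaclip_2pt5b"


def load_with_open_clip(hub_id: str = HUB_ID):
    """Load the CLIP model, its image preprocessing, and its tokenizer via OpenCLIP."""
    import open_clip  # lazy import: only needed when this path is used

    model, preprocess = open_clip.create_model_from_pretrained(f"hf-hub:{hub_id}")
    tokenizer = open_clip.get_tokenizer(f"hf-hub:{hub_id}")
    return model.eval(), preprocess, tokenizer


def load_with_timm(hub_id: str = HUB_ID):
    """Load the image encoder via timm (assumed to expose the image tower only)."""
    import timm  # lazy import: only needed when this path is used

    model = timm.create_model(f"hf_hub:{hub_id}", pretrained=True)
    return model.eval()
```

The gigantic variant has large weights, so the first call to either function will likely trigger a multi-gigabyte download.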
Model Features
Dual-framework compatibility
Loads in both OpenCLIP and timm, so it fits into pipelines built on either library
Large-scale pretraining
Trained on the MetaCLIP-2.5B dataset (roughly 2.5 billion image-text pairs), which yields general-purpose visual representations
Zero-shot learning
Classifies images against text labels supplied at inference time, with no domain-specific fine-tuning required
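To make the zero-shot feature concrete, here is a minimal sketch of the scoring step that CLIP-style models typically use: normalize the image and text embeddings, take scaled cosine similarities, and apply a softmax over the candidate labels. The three-dimensional vectors are made-up stand-ins for real model outputs, and the logit scale of 100 is an assumed typical value, not one read from this checkpoint.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def zero_shot_probs(image_emb, text_embs, logit_scale=100.0):
    """Softmax over scaled cosine similarities between one image and N label texts.

    logit_scale=100.0 is an assumed typical CLIP temperature, not this model's value.
    """
    img = l2_normalize(image_emb)
    logits = [
        logit_scale * sum(a * b for a, b in zip(img, l2_normalize(t)))
        for t in text_embs
    ]
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical stand-in embeddings; real ones would come from the model's two encoders.
labels = ["a photo of a cat", "a photo of a dog", "a photo of a car"]
text_embs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
image_emb = [0.9, 0.1, 0.0]

probs = zero_shot_probs(image_emb, text_embs)
best = labels[probs.index(max(probs))]
print(best)
```

Because the labels are just text, new categories can be added by appending prompts to the list; no weights change.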
Model Capabilities
Image classification
Visual feature extraction
Cross-modal understanding
Use Cases
Computer vision
Zero-shot image classification
Classify images into categories not seen during training by describing each category in text
Content moderation
Identify inappropriate content in images
Cross-modal applications
Image search
Search for relevant images based on text descriptions
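The image search use case can be sketched as ranking precomputed image embeddings by cosine similarity to a text-query embedding. The file names and vectors below are hypothetical placeholders; in practice the image embeddings would come from the model's image encoder and the query embedding from its text encoder.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def search(query_emb, index, top_k=2):
    """Return the names of the top_k images most similar to the query embedding."""
    ranked = sorted(
        index.items(),
        key=lambda item: cosine(query_emb, item[1]),
        reverse=True,
    )
    return [name for name, _ in ranked[:top_k]]


# Hypothetical image index: name -> embedding (stand-ins for image-encoder outputs).
index = {
    "beach.jpg": [0.9, 0.1, 0.0],
    "forest.jpg": [0.1, 0.9, 0.1],
    "city.jpg": [0.0, 0.2, 0.9],
}
query_emb = [0.8, 0.3, 0.0]  # stand-in for the text-encoder output of a query

results = search(query_emb, index)
print(results)
```

For a large collection, the linear scan here would usually be replaced by an approximate nearest-neighbor index, while the similarity measure stays the same.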