vit_base_patch32_clip_224.metaclip_400m
Developed by timm
A vision-language model trained on the MetaCLIP-400M dataset, supporting zero-shot image classification tasks
Downloads: 2,406
Release Date: 10/23/2024
Model Overview
This model ships the same ViT-B/32 CLIP weights in two packagings: it can be loaded through OpenCLIP as a full image-and-text model, or through timm as a standalone image encoder. Its primary use is zero-shot image classification.
Model Features
Dual Framework Support
Compatible with both OpenCLIP and timm frameworks, offering flexible usage options
Zero-shot Learning Capability
Can perform image classification tasks without task-specific training
Fast Inference
Built on the ViT-B/32 architecture, whose 32x32 patch size keeps the token sequence short and inference fast
Model Capabilities
Zero-shot Image Classification
Image Feature Extraction
Cross-modal Understanding
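The zero-shot classification mechanic behind these capabilities reduces to cosine similarity in the shared image-text embedding space. A framework-free sketch with random stand-in embeddings (the embedding width, label count, and temperature value here are illustrative assumptions):

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Stand-ins for real encoder outputs: one image embedding, and one text
# embedding per candidate label (e.g. "a photo of a {label}" prompts).
image_feat = F.normalize(torch.randn(1, 512), dim=-1)
text_feats = F.normalize(torch.randn(3, 512), dim=-1)

# CLIP scales cosine similarities by a learned temperature (around 100
# after training) before taking a softmax over the candidate labels.
logit_scale = 100.0
probs = (logit_scale * image_feat @ text_feats.T).softmax(dim=-1)

print(probs)  # a probability distribution over the 3 candidate labels
```

Because the label set only enters through the text prompts, swapping in new class names requires no retraining, which is what makes the classification "zero-shot".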
Use Cases
Computer Vision
General Image Classification
Classify images into arbitrary, user-specified categories without task-specific training
Performs well across a broad range of image classification tasks
Content Moderation
Identify inappropriate content in images
Multimodal Applications
Image-Text Matching
Score how well an image matches a text description by comparing their embeddings
© 2025 AIbase