Vit Base Patch16 Clip 224.laion400m E32
Developed by timm
Vision Transformer model trained on the LAION-400M dataset, compatible with both open_clip and timm frameworks
Downloads 5,751
Release Time: 10/23/2024
Model Overview
This Vision Transformer model can be loaded with either open_clip or timm and is designed primarily for zero-shot image classification. It uses the ViT-B-16 architecture (base-size transformer, 16x16-pixel patches, 224x224-pixel input) and was trained on the large-scale LAION-400M dataset.
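As a rough illustration of what the "Patch16" and "224" in the model name imply, the sketch below works out how many tokens the transformer processes per image. It assumes the usual ViT convention of square, non-overlapping patches plus one class token. The helper name `vit_sequence_length` is hypothetical and is not part of timm or open_clip.

```python
def vit_sequence_length(image_size: int = 224, patch_size: int = 16,
                        class_token: bool = True) -> int:
    """Number of tokens a ViT sees for a square image cut into square patches."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    patches_per_side = image_size // patch_size     # 224 // 16 = 14
    num_patches = patches_per_side ** 2              # 14 * 14 = 196
    return num_patches + (1 if class_token else 0)   # 196 patches + 1 class token


print(vit_sequence_length())  # 197
```

Under these assumptions, each 224x224 image becomes 196 patch tokens plus one class token.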
Model Features
Dual Framework Compatibility
Supports both open_clip and timm frameworks, offering more flexible usage options
Large-scale Training Data
Trained on the LAION-400M dataset, covering a wide range of visual concepts
Zero-shot Classification Capability
Capable of performing image classification tasks without task-specific fine-tuning
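The zero-shot classification described above can be sketched as follows: embed the image and one text prompt per candidate label, compare them by cosine similarity, and apply a softmax over the scaled similarities. This is a minimal sketch, not the model's own implementation. The scoring function is plain Python so it runs anywhere. The loader function shows one assumed way to obtain embeddings with open_clip, where the hub identifier `hf-hub:timm/vit_base_patch16_clip_224.laion400m_e32`, the prompt template, and the `logit_scale` of 100 are all assumptions rather than values stated on this page.

```python
import math
from typing import Dict, List, Sequence


def l2_normalize(v: Sequence[float]) -> List[float]:
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]


def zero_shot_probs(image_emb: Sequence[float],
                    text_embs: Dict[str, Sequence[float]],
                    logit_scale: float = 100.0) -> Dict[str, float]:
    """Softmax over scaled cosine similarities between one image and each label."""
    img = l2_normalize(image_emb)
    logits = {}
    for label, emb in text_embs.items():
        txt = l2_normalize(emb)
        logits[label] = logit_scale * sum(a * b for a, b in zip(img, txt))
    peak = max(logits.values())  # subtract the max for numerical stability
    exps = {label: math.exp(value - peak) for label, value in logits.items()}
    total = sum(exps.values())
    return {label: e / total for label, e in exps.items()}


def embed_with_open_clip(image_path: str, labels: List[str]):
    """Assumed open_clip usage; needs open_clip_torch, torch, Pillow and a weight download."""
    import torch
    import open_clip
    from PIL import Image

    name = "hf-hub:timm/vit_base_patch16_clip_224.laion400m_e32"  # assumed hub id
    model, preprocess = open_clip.create_model_from_pretrained(name)
    tokenizer = open_clip.get_tokenizer(name)
    model.eval()
    image = preprocess(Image.open(image_path).convert("RGB")).unsqueeze(0)
    text = tokenizer([f"a photo of a {label}" for label in labels])  # assumed prompt
    with torch.no_grad():
        image_emb = model.encode_image(image)[0].tolist()
        text_embs = model.encode_text(text).tolist()
    return image_emb, dict(zip(labels, text_embs))


# Toy 2-dimensional embeddings, made up for illustration only.
toy_probs = zero_shot_probs([1.0, 0.0], {"cat": [0.9, 0.1], "dog": [0.1, 0.9]})
print(max(toy_probs, key=toy_probs.get))  # cat
```

With real embeddings, the output of `embed_with_open_clip("photo.jpg", ["cat", "dog"])` could be passed straight to `zero_shot_probs`. New categories are added by extending the label list, with no retraining.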
Model Capabilities
Zero-shot Image Classification
Visual Feature Extraction
Image-Text Alignment
Use Cases
Image Understanding
Zero-shot Image Classification
Classify images of new categories without category-specific training
Image Retrieval
Retrieve relevant images based on text queries
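Text-to-image retrieval, as described above, amounts to ranking precomputed image embeddings by their cosine similarity to the embedding of the text query. The sketch below shows only that ranking step in plain Python. The function names, file names, and 3-dimensional vectors are hypothetical; real embeddings would come from the model's image and text encoders.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)


def retrieve(query_emb: Sequence[float],
             image_index: Dict[str, Sequence[float]],
             top_k: int = 3) -> List[Tuple[str, float]]:
    """Return the top_k (image_id, score) pairs, best match first."""
    scored = [(image_id, cosine_similarity(query_emb, emb))
              for image_id, emb in image_index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Made-up embeddings standing in for encoder outputs.
index = {
    "beach.jpg": [0.9, 0.1, 0.0],
    "forest.jpg": [0.1, 0.9, 0.1],
    "city.jpg": [0.0, 0.2, 0.9],
}
query = [0.8, 0.2, 0.0]  # stands in for the embedding of a text query

for image_id, score in retrieve(query, index, top_k=2):
    print(image_id, round(score, 3))
```

For large collections, this linear scan would normally be replaced by an approximate nearest-neighbor index, but the scoring rule stays the same.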
Multimodal Applications
Image Captioning
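Select descriptive text labels for images by scoring candidate labels against each image; the model ranks supplied text rather than generating new text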