
Vit Base Patch16 Clip 224.laion400m E31

Developed by timm
Vision Transformer model trained on the LAION-400M dataset, suited to zero-shot image classification
Downloads 1,469
Release Time: 10/23/2024

Model Overview

This Vision Transformer model can be loaded with either the OpenCLIP or the timm framework. It uses the ViT-B-16 architecture and was trained on the LAION-400M dataset, primarily for zero-shot image classification.
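A minimal sketch of the two loading paths. It assumes the checkpoint is published on the Hugging Face Hub as `timm/vit_base_patch16_clip_224.laion400m_e31` (inferred from the page title, not stated on this page) and that the `open_clip_torch` and `timm` packages are installed. The helper names `load_with_open_clip` and `load_with_timm` are hypothetical. Imports are deferred, so defining the helpers downloads nothing.

```python
# Assumed identifiers: inferred from the page title, not confirmed by this page.
TIMM_NAME = "vit_base_patch16_clip_224.laion400m_e31"
HUB_ID = "hf-hub:timm/" + TIMM_NAME


def load_with_open_clip(hub_id=HUB_ID):
    """Return (model, preprocess, tokenizer) for image-text use via OpenCLIP."""
    import open_clip  # deferred: requires the open_clip_torch package

    model, preprocess = open_clip.create_model_from_pretrained(hub_id)
    tokenizer = open_clip.get_tokenizer(hub_id)
    return model.eval(), preprocess, tokenizer


def load_with_timm(name=TIMM_NAME):
    """Return (model, transform) for image-only feature extraction via timm."""
    import timm  # deferred: requires the timm package

    # num_classes=0 drops the classifier head so the model returns pooled features.
    model = timm.create_model(name, pretrained=True, num_classes=0)
    cfg = timm.data.resolve_model_data_config(model)
    transform = timm.data.create_transform(**cfg, is_training=False)
    return model.eval(), transform
```

The OpenCLIP path provides both the image and text encoders, which zero-shot classification needs. The timm path loads only the image tower, which is enough for feature extraction.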

Model Features

Dual-framework compatibility: Works with both OpenCLIP and timm, so it fits into either workflow
Zero-shot learning capability: Classifies images into categories it was never explicitly trained on, given only text labels
Large-scale pre-training: Trained on the LAION-400M dataset of roughly 400 million image-text pairs, which gives it strong general-purpose visual representations

Model Capabilities

Zero-shot image classification
Image feature extraction
Cross-modal representation learning
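The way a CLIP-style model turns embeddings into zero-shot predictions can be sketched in plain Python. The vectors below are small stand-ins for real image and text embeddings, not outputs of this model. The prompt wording and the `logit_scale` of 100 are assumptions that follow common CLIP practice.

```python
import math


def normalize(v):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def zero_shot_probs(image_emb, text_embs, logit_scale=100.0):
    """Softmax over scaled cosine similarities between one image and each label prompt."""
    img = normalize(image_emb)
    logits = []
    for text_emb in text_embs:
        txt = normalize(text_emb)
        logits.append(logit_scale * sum(a * b for a, b in zip(img, txt)))
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Stand-in data: in practice these come from the model's image and text encoders.
labels = ["a photo of a cat", "a photo of a dog", "a photo of a car"]
image_emb = [0.9, 0.1, 0.0]
text_embs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

probs = zero_shot_probs(image_emb, text_embs)
best = labels[probs.index(max(probs))]
print(best)
```

Because the categories are supplied as text at inference time, changing the label set requires only new prompts, not retraining.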

Use Cases

Computer vision
Open-domain image classification: Classify images into arbitrary categories without retraining
Image retrieval: Retrieve relevant images from text descriptions
Multimodal applications
Image-text matching: Score how well an image matches a text description
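Image retrieval and image-text matching both reduce to comparing embeddings in the shared space. The sketch below ranks a small gallery against a text query by cosine similarity. The file names and vectors are hypothetical stand-ins for embeddings a model like this one would produce, and `rank_images` is an illustrative helper, not a library function.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_images(text_emb, image_embs, top_k=2):
    """Return the top_k (name, score) pairs, best match first."""
    scored = [(cosine(text_emb, emb), name) for name, emb in image_embs.items()]
    scored.sort(reverse=True)
    return [(name, score) for score, name in scored[:top_k]]


# Stand-in data: precomputed image embeddings and one text-query embedding.
gallery = {
    "beach.jpg": [0.8, 0.2, 0.1],
    "city.jpg": [0.1, 0.9, 0.2],
    "forest.jpg": [0.2, 0.1, 0.9],
}
query_emb = [0.7, 0.3, 0.0]

results = rank_images(query_emb, gallery)
for name, score in results:
    print(name, round(score, 3))
```

The same score serves as a matching degree for a single image-text pair. For large galleries, image embeddings are typically computed once and stored so that only the query needs encoding at search time.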