
ViT-L-16 SigLIP 384

Developed by timm
SigLIP (Sigmoid Loss for Language-Image Pre-training) model trained on the WebLI dataset for zero-shot image classification tasks.
Downloads 3,008
Release Time: 10/16/2023

Model Overview

This is a contrastive image-text model pre-trained with a sigmoid loss for language-image alignment; it supports zero-shot image classification without task-specific fine-tuning.
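For example, zero-shot classification can be run through the OpenCLIP API. The sketch below assumes the checkpoint is published on the Hugging Face Hub as `timm/ViT-L-16-SigLIP-384` and uses a placeholder image path and label set:

```python
import torch
import torch.nn.functional as F
from PIL import Image
from open_clip import create_model_from_pretrained, get_tokenizer

# Assumed hub id for this checkpoint; adjust if the repository name differs.
model, preprocess = create_model_from_pretrained('hf-hub:timm/ViT-L-16-SigLIP-384')
tokenizer = get_tokenizer('hf-hub:timm/ViT-L-16-SigLIP-384')
model.eval()

image = preprocess(Image.open('example.jpg').convert('RGB')).unsqueeze(0)  # placeholder image
labels = ['a photo of a cat', 'a photo of a dog', 'a photo of a car']      # placeholder classes
text = tokenizer(labels)

with torch.no_grad():
    image_features = F.normalize(model.encode_image(image), dim=-1)
    text_features = F.normalize(model.encode_text(text), dim=-1)
    # SigLIP scores each image-text pair with an independent sigmoid, not a batch-wide softmax.
    probs = torch.sigmoid(image_features @ text_features.T * model.logit_scale.exp() + model.logit_bias)

print({label: float(p) for label, p in zip(labels, probs[0])})
```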

Model Features

Sigmoid Loss Function
Uses a sigmoid loss for language-image pre-training, which scores each image-text pair independently instead of requiring a batch-wide softmax normalization, and performs particularly well at smaller batch sizes (a minimal sketch of the loss appears after this list).
Zero-shot Classification Capability
Supports zero-shot image classification: new categories can be recognized from text prompts without task-specific fine-tuning.
Large-scale Vision Transformer
Built on the ViT-L/16 architecture at 384×384 input resolution, providing strong image feature extraction.
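The pairwise sigmoid loss can be written in a few lines; this is an illustrative PyTorch sketch of the formulation, not the original training code:

```python
import torch
import torch.nn.functional as F

def siglip_loss(image_emb: torch.Tensor, text_emb: torch.Tensor,
                logit_scale: torch.Tensor, logit_bias: torch.Tensor) -> torch.Tensor:
    """Pairwise sigmoid loss over a batch of N matching image-text pairs.

    image_emb, text_emb: (N, D) L2-normalized embeddings.
    """
    logits = logit_scale * image_emb @ text_emb.T + logit_bias        # (N, N) pair scores
    # +1 on the diagonal (matching pairs), -1 everywhere else (non-matching pairs).
    labels = 2 * torch.eye(logits.size(0), device=logits.device) - 1
    # Each pair is an independent binary decision; no batch-wide softmax is needed.
    return -F.logsigmoid(labels * logits).sum() / logits.size(0)
```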

Model Capabilities

Zero-shot Image Classification
Image-Text Contrastive Learning
Image Feature Extraction
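Image embeddings can also be extracted with the vision tower alone through timm. The model name below (`vit_large_patch16_siglip_384.webli`) is an assumption; verify it against `timm.list_models('*siglip*')`:

```python
import timm
import torch
from PIL import Image

# Assumed timm model name for the SigLIP image tower.
model = timm.create_model('vit_large_patch16_siglip_384.webli', pretrained=True, num_classes=0)
model.eval()

# Build the preprocessing pipeline (384x384 resize, normalization) from the model's config.
cfg = timm.data.resolve_model_data_config(model)
transform = timm.data.create_transform(**cfg, is_training=False)

img = Image.open('example.jpg').convert('RGB')          # placeholder image
with torch.no_grad():
    embedding = model(transform(img).unsqueeze(0))       # pooled image embedding, shape (1, 1024)
```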

Use Cases

Computer Vision
Image Classification
Classify images into new, unseen categories without any task-specific training
Image Retrieval
Retrieve relevant images based on text descriptions
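A text-to-image retrieval pass reuses the same encoders: embed the gallery images and the query text, then rank by similarity. The gallery paths and query below are placeholders, and the hub id is the same assumption as above:

```python
import torch
import torch.nn.functional as F
from PIL import Image
from open_clip import create_model_from_pretrained, get_tokenizer

model, preprocess = create_model_from_pretrained('hf-hub:timm/ViT-L-16-SigLIP-384')
tokenizer = get_tokenizer('hf-hub:timm/ViT-L-16-SigLIP-384')
model.eval()

paths = ['img_0.jpg', 'img_1.jpg', 'img_2.jpg']          # placeholder gallery
images = torch.stack([preprocess(Image.open(p).convert('RGB')) for p in paths])

with torch.no_grad():
    gallery = F.normalize(model.encode_image(images), dim=-1)
    query = F.normalize(model.encode_text(tokenizer(['a red bicycle leaning against a wall'])), dim=-1)

# Rank gallery images by cosine similarity to the text query.
scores = (query @ gallery.T).squeeze(0)
ranking = scores.argsort(descending=True)
print([paths[i] for i in ranking])
```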