vit_so400m_patch14_siglip_378.v2_webli
A SigLIP 2-based Vision Transformer image encoder designed for image feature extraction, pretrained on the WebLI dataset.
Release date: February 21, 2025
Model Overview
This is a Vision Transformer based on the SigLIP 2 architecture, containing only the image encoder, and is intended for image feature extraction tasks. It is implemented with the timm library and is functionally equivalent to the image tower of the ViT-SO400M-14-SigLIP2-378 model on Hugging Face.
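A minimal sketch of loading the encoder with timm and extracting a pooled image embedding. The model name is assumed from the page title (`vit_so400m_patch14_siglip_378.v2_webli`), and the input image path is a placeholder; adjust both for your environment.

```python
# Minimal sketch: load the image encoder via timm and extract a pooled embedding.
# Model name assumed from the page title; "example.jpg" is a hypothetical input.
import timm
import torch
from PIL import Image

model = timm.create_model(
    "vit_so400m_patch14_siglip_378.v2_webli",
    pretrained=True,
    num_classes=0,  # drop the classification head so the forward pass returns pooled features
)
model.eval()

# Build preprocessing from the model's pretrained config (378x378 input, SigLIP normalization).
data_config = timm.data.resolve_model_data_config(model)
transform = timm.data.create_transform(**data_config, is_training=False)

image = Image.open("example.jpg").convert("RGB")
with torch.no_grad():
    features = model(transform(image).unsqueeze(0))
print(features.shape)  # expected: torch.Size([1, 1152]) for the SO400M width
```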
Model Features
SigLIP 2 Architecture
Utilizes the improved SigLIP 2 architecture with enhanced semantic understanding and localization capabilities
Dense Feature Extraction
Extracts dense, per-patch feature representations from images (see the sketch after this list)
Large-scale Pretraining
Pretrained on the large-scale WebLI dataset
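A minimal sketch of dense feature extraction, assuming the same timm model name as above: `forward_features` returns unpooled patch tokens rather than a single pooled vector.

```python
# Minimal sketch: extract dense (per-patch) features with forward_features.
# Model name assumed from the page title; a random tensor stands in for a real image batch.
import timm
import torch

model = timm.create_model("vit_so400m_patch14_siglip_378.v2_webli", pretrained=True)
model.eval()

with torch.no_grad():
    tokens = model.forward_features(torch.randn(1, 3, 378, 378))

# 378 / 14 = 27 patches per side -> 27 * 27 = 729 tokens, each 1152-dimensional
# (SigLIP ViTs use attention pooling in the head, so there is no class token here).
print(tokens.shape)  # expected: torch.Size([1, 729, 1152])
```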
Model Capabilities
Image Feature Extraction
Visual Semantic Understanding
Image Localization
Use Cases
Computer Vision
Image Retrieval
Uses extracted image embeddings to find and rank similar images (see the retrieval sketch at the end of this section)
Visual Localization
Identifies and locates specific objects or regions in images
Multimodal Applications
Vision-Language Tasks
Serves as a visual encoder for tasks like image-text matching
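A minimal sketch of the image retrieval use case, assuming the same timm model name as above: embed a small gallery and a query with the pooled features, then rank the gallery by cosine similarity. All file names are placeholders.

```python
# Minimal sketch: similarity-based image retrieval with pooled embeddings.
# Model name assumed from the page title; image paths are hypothetical.
import timm
import torch
import torch.nn.functional as F
from PIL import Image

model = timm.create_model(
    "vit_so400m_patch14_siglip_378.v2_webli", pretrained=True, num_classes=0
).eval()
data_config = timm.data.resolve_model_data_config(model)
transform = timm.data.create_transform(**data_config, is_training=False)

def embed(paths):
    # Stack preprocessed images into a batch and return L2-normalized embeddings.
    batch = torch.stack([transform(Image.open(p).convert("RGB")) for p in paths])
    with torch.no_grad():
        return F.normalize(model(batch), dim=-1)

gallery_paths = ["cat.jpg", "dog.jpg", "car.jpg"]  # hypothetical gallery
gallery = embed(gallery_paths)
query = embed(["query.jpg"])                       # hypothetical query image

# Cosine similarity between the query and each gallery image; higher means more similar.
scores = (query @ gallery.T).squeeze(0)
best = scores.argmax().item()
print(f"Most similar image: {gallery_paths[best]} (score {scores[best]:.3f})")
```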