ColQwen2 2B v1.0

Developed by tsystems
A visual retrieval model based on Qwen2-VL-2B-Instruct and the ColBERT strategy. It generates multi-vector representations of text and images.
Downloads 700
Release Time: 12/24/2024

Model Overview

ColQwen is a novel architecture based on vision-language models that indexes documents efficiently from their visual features. It accepts image input at dynamic resolution and preserves the aspect ratio.
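To make the idea of a patch budget with a preserved aspect ratio concrete, here is a minimal sketch. It is not the model's actual preprocessing code: the function name estimate_patch_grid, the 28-pixel patch size, and the uniform down-scaling of oversized images are all assumptions for illustration. Only the cap of 1024 patches comes from this page.

```python
import math


def estimate_patch_grid(width, height, patch_px=28, max_patches=1024):
    """Estimate a (rows, cols, total) patch grid under a patch budget.

    Assumptions (hypothetical, not taken from the model card):
    - each patch covers a patch_px x patch_px pixel square;
    - images over budget are scaled down uniformly, so the aspect
      ratio is approximately preserved.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")

    cols = max(1, round(width / patch_px))
    rows = max(1, round(height / patch_px))

    if rows * cols > max_patches:
        # One scale factor for both axes keeps the aspect ratio.
        scale = math.sqrt(max_patches / (rows * cols))
        cols = max(1, math.floor(cols * scale))
        rows = max(1, math.floor(rows * scale))
        # Guard for extreme aspect ratios, where clamping to 1 above
        # could leave the grid over budget.
        cols = min(cols, max_patches)
        rows = min(rows, max_patches // cols)

    return rows, cols, rows * cols


# A small image fits without any scaling.
print(estimate_patch_grid(280, 560))  # (20, 10, 200)

# A page-sized image is scaled down to stay within the budget.
rows, cols, total = estimate_patch_grid(1240, 1754)
print(rows, cols, total)
```

The second call shows the trade-off: a larger budget keeps more of the page's detail, at the cost of more vectors to store and compare per page.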

Model Features

Dynamic resolution processing
Accepts dynamic-resolution image input without resizing and generates up to 1024 image patches at maximum resolution.
Multi-vector representation
Adopts the ColBERT strategy to generate multi-vector representations for both text and images, improving retrieval efficiency.
Efficient training
Trained with LoRA adapters and the paged_adamw_8bit optimizer, with training distributed across 8xH100 GPUs.
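The multi-vector representation above is typically scored with ColBERT-style late interaction: each query vector is matched to its most similar document vector, and the per-vector maxima are summed. The sketch below shows that scoring rule in plain Python on toy 2-D vectors. The names dot, maxsim_score, and rank_pages are hypothetical helpers, not part of any library, and real embeddings are much higher-dimensional and produced by the model itself.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def maxsim_score(query_vecs, doc_vecs):
    """Late-interaction score: sum over query vectors of the maximum
    dot product against any document vector."""
    if not query_vecs or not doc_vecs:
        raise ValueError("both inputs need at least one vector")
    return sum(max(dot(q, d) for d in doc_vecs) for q in query_vecs)


def rank_pages(query_vecs, pages):
    """Return (page_name, score) pairs sorted by descending score."""
    scored = [(name, maxsim_score(query_vecs, vecs))
              for name, vecs in pages.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy data: two query token vectors, two candidate pages.
query = [[1.0, 0.0], [0.0, 1.0]]
pages = {
    "page_a": [[1.0, 0.0], [0.0, 1.0]],  # matches both query vectors
    "page_b": [[0.5, 0.5]],              # partial match for each
}

print(rank_pages(query, pages))  # [('page_a', 2.0), ('page_b', 1.0)]
```

Because every query vector is compared with every patch vector of a page, the scoring cost grows with the number of patches per page.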

Model Capabilities

Visual document retrieval
Multimodal embedding
Image feature extraction
Text feature extraction

Use Cases

Document retrieval
PDF document retrieval
Quickly retrieve relevant content from a large number of PDF documents
Experiments show that increasing the number of image patches significantly improves performance