ColQwen2 v0.1

Developed by vidore
A visual retrieval model built on Qwen2-VL-2B-Instruct and the ColBERT strategy. It indexes documents efficiently from their visual features.
Downloads 21.25k
Release Date: 9/26/2024

Model Overview

ColQwen2 is a vision-language model that extends the Qwen2-VL-2B architecture and adopts a ColBERT-style multi-vector representation strategy for efficient visual document retrieval.

Model Features

Dynamic Image Resolution Support
Accepts input images at dynamic resolutions without resizing them. The maximum resolution is capped so that each image produces at most 768 image patches.
Multi-vector Representation
Uses a ColBERT-style strategy that represents both text and images as sets of vectors rather than as a single embedding.
Efficient Retrieval
Indexes documents from their visual features, which makes it particularly suitable for PDF document retrieval.
LoRA Adaptation
Applies Low-Rank Adaptation (LoRA) to the Transformer layers and the projection layers of the language model to make training more efficient.
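
To make the multi-vector idea concrete, here is a minimal sketch of ColBERT-style late-interaction (MaxSim) scoring in plain Python. It is an illustration under stated assumptions, not the model's actual code: the function names `dot` and `maxsim_score` are hypothetical, and the 2-dimensional vectors are made up. In practice the vectors would come from the model, one per query token and one per image patch.

```python
from typing import Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def maxsim_score(query_vecs: Sequence[Vector], page_vecs: Sequence[Vector]) -> float:
    """For each query vector, keep its best match among the page vectors, then sum."""
    return sum(max(dot(q, p) for p in page_vecs) for q in query_vecs)


# Toy embeddings, invented for illustration only.
query = [[1.0, 0.0], [0.0, 1.0]]
page_a = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
page_b = [[0.5, 0.5]]

print(maxsim_score(query, page_a))  # 2.0
print(maxsim_score(query, page_b))  # 1.0
```

Because every query vector is matched against every page vector, a page scores highly when each part of the query finds a strong match somewhere on the page, which is what a single pooled embedding cannot express.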

Model Capabilities

Visual Document Retrieval
Multimodal Representation Learning
Cross-modal Matching
Image Understanding
Text Understanding

Use Cases

Document Retrieval
Academic Literature Retrieval
Quickly retrieves relevant content from academic PDF documents using visual features.
Enterprise Document Management
Efficiently indexes and manages internal corporate PDF document libraries.
Cross-modal Search
Image-Text Association Search
Retrieves related images from text queries, or related text descriptions from image queries.
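
The retrieval use cases listed here all reduce to ranking indexed pages against a query. The sketch below shows one way that ranking could look, assuming ColBERT-style MaxSim scoring over precomputed multi-vector embeddings. The names `maxsim` and `rank_pages`, the page identifiers, and the toy vectors are all hypothetical. A real index would hold embeddings produced by the model.

```python
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]


def maxsim(query_vecs: Sequence[Vector], page_vecs: Sequence[Vector]) -> float:
    """Sum, over query vectors, of the best dot product against any page vector."""
    return sum(
        max(sum(x * y for x, y in zip(q, p)) for p in page_vecs)
        for q in query_vecs
    )


def rank_pages(
    query_vecs: Sequence[Vector],
    index: Dict[str, Sequence[Vector]],
    k: int = 3,
) -> List[Tuple[str, float]]:
    """Score every indexed page and return the k best as (page_id, score)."""
    scored = [(page_id, maxsim(query_vecs, vecs)) for page_id, vecs in index.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]


# Toy index: page identifier -> list of patch vectors (invented values).
index = {
    "report.pdf#page=1": [[1.0, 0.0], [0.0, 1.0]],
    "report.pdf#page=2": [[0.5, 0.5]],
    "paper.pdf#page=7": [[0.0, 0.8]],
}
query = [[1.0, 0.0], [0.0, 1.0]]

top = rank_pages(query, index, k=2)
print(top)
```

This brute-force scan is only meant to show the data flow. For a large document library, the page embeddings would normally be stored in a vector index rather than a dictionary.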