Colqwen2.5 3b Multilingual V1.0 Merged

Developed by tsystems
A multilingual visual retrieval model based on Qwen2.5-VL-3B-Instruct and the ColBERT strategy. It supports dynamic input image resolution and generates ColBERT-style multi-vector representations of text and images.
Downloads 70
Release Time: 3/9/2025

Model Overview

This model applies a novel architecture and training strategy built on Vision-Language Models (VLMs). It indexes documents efficiently from their visual features and produces multilingual, multimodal embeddings.

Model Features

Multilingual Support
Supports visual document retrieval in multiple languages including English, French, Spanish, Italian, and German
Dynamic Image Resolution
Accepts input images at dynamic resolution without changing their aspect ratio; the maximum resolution setting is 768 image patches
Efficient Retrieval
Utilizes ColBERT-style multi-vector representations for efficient document retrieval
Multimodal Embedding
Supports joint embedding of text and images for cross-modal retrieval
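
ColBERT-style multi-vector retrieval is usually scored with late interaction (MaxSim): each query vector is matched to its most similar document vector, and those maxima are summed. The following is a minimal sketch of that scoring rule in NumPy, not this model's own code. The random arrays stand in for real query and page embeddings, and the 128-dimensional size, the helper names, and the page labels are assumptions made for illustration.

```python
import numpy as np


def l2_normalize(x: np.ndarray) -> np.ndarray:
    """Scale every row vector to unit length."""
    return x / np.linalg.norm(x, axis=-1, keepdims=True)


def maxsim_score(query_vecs: np.ndarray, doc_vecs: np.ndarray) -> float:
    """Late-interaction score between one query and one document.

    query_vecs: (n_query_tokens, dim), doc_vecs: (n_doc_vectors, dim).
    """
    sim = query_vecs @ doc_vecs.T        # pairwise dot products
    return float(sim.max(axis=1).sum())  # best match per query vector, summed


# Toy stand-ins for embeddings (assumed shapes, random values).
rng = np.random.default_rng(0)
query = l2_normalize(rng.normal(size=(4, 128)))
page_a = l2_normalize(rng.normal(size=(50, 128)))
# page_b reuses most of page_a but also contains the query vectors themselves,
# so every query vector finds a perfect match (dot product 1.0).
page_b = np.vstack([page_a[:46], query])

scores = {
    "page_a": maxsim_score(query, page_a),
    "page_b": maxsim_score(query, page_b),
}
best = max(scores, key=scores.get)
print(best, scores)
```

Because a page is represented by many vectors rather than one, a query term can match a small region of the page without being diluted by the rest of it, which is the main motivation for the multi-vector design.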

Model Capabilities

Multilingual visual document retrieval
Text-to-image retrieval
Multimodal embedding
Dynamic image processing
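
Dynamic image processing under a patch budget can be pictured with a small sketch. The function below shrinks an image so that its patch grid stays within a maximum patch count while roughly keeping the aspect ratio. It is an illustration only and not the model's actual preprocessing: the function name and the 28-pixel patch side are assumptions, and only the 768-patch limit comes from the description above.

```python
import math


def fit_to_patch_budget(width: int, height: int,
                        patch: int = 28, max_patches: int = 768) -> tuple[int, int]:
    """Return (new_width, new_height) as multiples of `patch` whose patch
    grid holds at most `max_patches` cells. `patch=28` is an assumed value."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")

    # Snap each side to a whole number of patches (at least one).
    cols = max(1, round(width / patch))
    rows = max(1, round(height / patch))

    if cols * rows > max_patches:
        # Scale both sides by the same factor so the aspect ratio is kept
        # up to rounding.
        scale = math.sqrt(max_patches / (cols * rows))
        cols = max(1, math.floor(cols * scale))
        rows = max(1, math.floor(rows * scale))
        # Extreme aspect ratios: one side is pinned at one patch,
        # so cap the other side explicitly.
        cols = min(cols, max_patches // rows)
        rows = min(rows, max_patches // cols)

    return cols * patch, rows * patch


# A page scan that is too large, and a small image that already fits.
print(fit_to_patch_budget(1654, 2339))
print(fit_to_patch_budget(280, 140))
```

Small images pass through at their snapped size, while large scans are reduced until the grid fits the budget, so the number of image vectors per page stays bounded.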

Use Cases

Document Retrieval
Multilingual PDF Document Retrieval
Retrieve relevant images or document fragments from multilingual PDF documents based on text queries
Efficient retrieval of relevant document content with support for multiple languages
Visual Question Answering System
Retrieve the images or document content relevant to a question in a visual question answering system
Improves the accuracy and efficiency of the question answering system
Cross-Modal Retrieval
Text-to-Image Retrieval
Retrieve relevant images based on text descriptions
Achieves efficient cross-modal retrieval