
Colqwenstella 2b Multilingual

Developed by Metric-AI
A multilingual visual document retriever that combines the Qwen2 vision model with stella_en_1.5B_v5. It ranks first among models with ≤2B parameters on the ViDoRe benchmark.
Downloads 175
Release Time: 2/11/2025

Model Overview

A multilingual visual document retrieval model that pairs Qwen2's vision component with stella_en_1.5B_v5 as the embedding model. It supports multiple languages and cross-modal retrieval, in which text queries are matched against document images.

Model Features

Multilingual Support
Supports visual document retrieval in five languages: English, French, Spanish, Italian, and German
Efficient Training
Uses LoRA for parameter-efficient fine-tuning, which makes training feasible on 4×A100 GPUs
High Performance
Ranked first among models with ≤2B parameters and eighth overall on the ViDoRe benchmark
Multimodal Fusion
Combines a vision model with a text embedding model to enable cross-modal retrieval
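
To make the LoRA point under Efficient Training concrete, here is a minimal sketch of the general LoRA idea: the pretrained weight matrix stays frozen, and only two small low-rank matrices are trained. The dimensions, rank, and scaling value below are illustrative assumptions, not this model's actual training configuration, and the helper names (`lora_forward`, `trainable_fraction`) are hypothetical.

```python
import random

def matvec(M, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, x)) for row in M]

def lora_forward(W, A, B, x, alpha):
    """Compute y = W x + (alpha / r) * B (A x).

    W (d_out x d_in) is the frozen pretrained weight.
    A (r x d_in) and B (d_out x r) are the small trainable adapters.
    """
    r = len(A)
    base = matvec(W, x)
    update = matvec(B, matvec(A, x))
    scale = alpha / r
    return [b + scale * u for b, u in zip(base, update)]

def trainable_fraction(d_out, d_in, r):
    """Adapter parameters divided by the parameters of the full matrix."""
    return r * (d_in + d_out) / (d_out * d_in)

# Toy sizes chosen for illustration only (assumed, not from the model card).
d_in, d_out, r = 8, 6, 2
rng = random.Random(0)
W = [[rng.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]
A = [[rng.gauss(0, 0.01) for _ in range(d_in)] for _ in range(r)]
B = [[0.0] * r for _ in range(d_out)]  # zero init: the adapter starts as a no-op
x = [rng.gauss(0, 1) for _ in range(d_in)]

y = lora_forward(W, A, B, x, alpha=4.0)
```

Because `B` starts at zero, the adapted layer initially reproduces the frozen layer exactly. For a hypothetical square layer of width 1536 and rank 8, `trainable_fraction(1536, 1536, 8)` comes to roughly 1% of the full matrix, which is why this kind of fine-tuning fits on a small number of GPUs.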

Model Capabilities

Multilingual text understanding
Visual document analysis
Cross-modal retrieval
Multimodal embedding
Multilingual embedding

Use Cases

Document Retrieval
Cross-language Document Retrieval
Retrieve relevant visual documents using queries in different languages
Ranked first among models with ≤2B parameters on the ViDoRe benchmark
Visual Question Answering System
Question answering over document images
Enterprise Applications
Enterprise Knowledge Base Retrieval
Retrieve relevant visual content from corporate document libraries
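
The retrieval use cases above can be sketched as ranking page images against a text query. This listing does not state how the model scores a query against a page; the sketch below assumes ColBERT-style late interaction (MaxSim), which the "Col" prefix in the model name suggests but which is an assumption here. The vectors are made-up toy values, and `maxsim_score` and `rank_pages` are hypothetical helper names, not part of any released API.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def maxsim_score(query_vecs, doc_vecs):
    """Late-interaction score: for each query token vector, take its best
    match among the page's patch vectors, then sum those maxima."""
    return sum(max(dot(q, d) for d in doc_vecs) for q in query_vecs)

def rank_pages(query_vecs, pages):
    """Return (page_name, score) pairs sorted from most to least relevant."""
    scored = [(name, maxsim_score(query_vecs, vecs)) for name, vecs in pages.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Toy multi-vector embeddings (assumed values, two dimensions for readability).
query = [[1.0, 0.0], [0.0, 1.0]]
pages = {
    "invoice.png": [[0.9, 0.1], [0.1, 0.8]],
    "memo.png": [[0.2, 0.1], [0.3, 0.2]],
}

ranking = rank_pages(query, pages)
```

In a real pipeline the query vectors would come from the text side of the model and the page vectors from the vision side. Since both live in the same embedding space, the query language does not need to match the language of the document.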