
MonoQwen2-VL-v0.1

Developed by lightonai
MonoQwen2-VL-v0.1 is a multimodal re-ranker, fine-tuned from Qwen2-VL-2B, that evaluates the relevance of an image to a query.
Downloads 547
Release Time: 10/25/2024

Model Overview

This model is fine-tuned with LoRA for point-wise relevance judgment between an image and a query. For each image-query pair it generates a True or False response, from which a relevance score can be calculated. It is suitable for re-ranking or filtering retrieval results.
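The page does not say how the relevance score is derived from the True/False response. A minimal sketch, assuming the score is the probability of "True" under a two-way softmax over the logits of the "True" and "False" tokens; the function name and the example logit values are hypothetical:

```python
import math

def relevance_score(true_logit: float, false_logit: float) -> float:
    """Probability of "True" under a softmax over the two token logits.

    Assumption: the model exposes one logit for the "True" token and one
    for the "False" token at the answer position.
    """
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(true_logit, false_logit)
    e_true = math.exp(true_logit - m)
    e_false = math.exp(false_logit - m)
    return e_true / (e_true + e_false)

# Hypothetical logits for a single image-query pair.
print(relevance_score(2.0, -1.0))
```

Equal logits give a score of 0.5, and the score approaches 1 as the "True" logit grows relative to the "False" logit.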

Model Features

Multimodal re-ranking
Evaluates the relevance between an image and a text query and generates a True or False response.
LoRA fine-tuning
Fine-tuned from the Qwen2-VL-2B model with LoRA, a parameter-efficient method, for the relevance judgment task.
High performance
Performs strongly on the ViDoRe benchmark, improving the ndcg@5 score of first-stage retrieval results.
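For readers unfamiliar with the metric, ndcg@5 rewards rankings that place relevant items near the top of the first five results. A minimal sketch, assuming the linear-gain variant of DCG and assuming the input list holds every judged item for the query; the relevance labels are hypothetical and are not ViDoRe data:

```python
import math

def dcg_at_k(relevances, k=5):
    """Discounted cumulative gain over the first k ranked items (linear gain)."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))

def ndcg_at_k(relevances, k=5):
    """DCG@k divided by the DCG@k of the ideal (descending) ordering."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0

# Hypothetical binary labels for one query, listed in ranked order.
before = [0, 1, 0, 1, 0]  # ranking from a first-stage retriever
after = [1, 1, 0, 0, 0]   # the same items after re-ranking
print(ndcg_at_k(before), ndcg_at_k(after))
```

Moving the two relevant items to the top raises the score to its maximum of 1.0, which is the kind of improvement a re-ranker aims for.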

Model Capabilities

Image and text relevance assessment
Multimodal retrieval result re-ranking
True/False response generation

Use Cases

Information retrieval
Document retrieval re-ranking
Re-ranks the candidates returned by a first-stage retriever (such as DSE or ColPali) to improve retrieval quality.
On the ViDoRe benchmark, the average ndcg@5 score increases by 4.7%.
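The re-ranking step itself amounts to sorting candidates by their relevance scores. A minimal sketch, assuming each candidate from the first-stage retriever has already been scored by the re-ranker; the function name, page identifiers, and scores are hypothetical:

```python
def rerank(candidates, scores, top_k=None):
    """Sort candidates by descending relevance score, optionally keeping top_k."""
    if len(candidates) != len(scores):
        raise ValueError("candidates and scores must have the same length")
    ranked = sorted(zip(candidates, scores), key=lambda pair: pair[1], reverse=True)
    return ranked if top_k is None else ranked[:top_k]

# Hypothetical page identifiers in first-stage order, with re-ranker scores.
pages = ["page_03", "page_17", "page_08"]
scores = [0.21, 0.94, 0.57]
print(rerank(pages, scores, top_k=2))
```

Because scoring is point-wise, each image-query pair is scored independently, so the cost grows linearly with the number of candidates passed to the re-ranker.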
Image filtering
Image relevance filtering
Filters out images irrelevant to the query by applying a threshold to the relevance score, improving retrieval efficiency.
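Threshold filtering can be sketched in a few lines. The cutoff of 0.5 below is an assumption for illustration, not a value recommended by the page, and the image names and scores are hypothetical:

```python
def filter_by_threshold(candidates, scores, threshold=0.5):
    """Keep only candidates whose relevance score meets the threshold."""
    if len(candidates) != len(scores):
        raise ValueError("candidates and scores must have the same length")
    return [c for c, s in zip(candidates, scores) if s >= threshold]

# Hypothetical images with re-ranker scores.
images = ["img_a.png", "img_b.png", "img_c.png"]
scores = [0.91, 0.12, 0.50]
print(filter_by_threshold(images, scores))
```

A higher threshold keeps fewer images and favors precision; a lower one keeps more and favors recall, so the value is best tuned on held-out queries.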