# 🚀 Model Evaluation Results
This document presents the evaluation results of a model based on Qwen/Qwen2-VL-2B-Instruct, covering multiple tasks and datasets.
## 📚 Documentation

### Model Information
| Property | Details |
|----------|---------|
| Base Model | Qwen/Qwen2-VL-2B-Instruct |
| Supported Languages | English, Chinese |
| Tags | mteb, sentence-transformers, transformers, Qwen2-VL, sentence-similarity, vidore |
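
Since the tags list sentence-transformers, the model presumably exposes the standard `encode()` interface. A minimal usage sketch follows; note that `"your-org/your-model"` is a placeholder, since the card names only the base model, not the fine-tuned repository id:

```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.util import cos_sim

# Placeholder id: the card does not state the fine-tuned model's repository.
# trust_remote_code may or may not be required, depending on how it is packaged.
model = SentenceTransformer("your-org/your-model", trust_remote_code=True)

sentences = ["The cat sits on the mat.", "A feline rests on a rug."]
embeddings = model.encode(sentences)

# 2x2 cosine-similarity matrix between the two sentences.
print(cos_sim(embeddings, embeddings))
```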
### Evaluation Results
The model has been evaluated on a range of tasks, including Semantic Textual Similarity (STS), Classification, Retrieval, Clustering, and Reranking. Detailed results are listed below.
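
As a rough guide to reproducing any single entry, the sketch below assumes the API of recent versions of the `mteb` Python package and again uses a placeholder model id:

```python
import mteb
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("your-org/your-model")  # placeholder id

# Pick one task from the tables below, e.g. Banking77.
tasks = mteb.get_tasks(tasks=["Banking77Classification"])
evaluation = mteb.MTEB(tasks=tasks)
evaluation.run(model, output_folder="results")
```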
#### 1. STS Tasks
| Dataset | cos_sim_pearson | cos_sim_spearman | euclidean_pearson | euclidean_spearman | manhattan_pearson | manhattan_spearman |
|---------|-----------------|------------------|-------------------|--------------------|-------------------|--------------------|
| C-MTEB/AFQMC (validation) | 61.03190209456061 | 67.54853383020948 | 65.38958681599493 | 67.54853383020948 | 65.25341659273157 | 67.34190190683134 |
| C-MTEB/ATEC (test) | 50.83794357648487 | 54.03230997664373 | 55.2072028123375 | 54.032311102613264 | 55.05163232251946 | 53.81272176804127 |
| mteb/biosses-sts (test) | 89.18568151905953 | 86.47666922475281 | 87.25416218056225 | 86.47666922475281 | 87.04960508086356 | 86.73992823533615 |
| C-MTEB/BQ (test) | 75.7464284612374 | 77.71894224189296 | 77.63454068918787 | 77.71894224189296 | 77.58744810404339 | 77.63293552726073 |
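
The six columns follow the usual MTEB STS convention: a similarity score is computed per sentence pair and correlated (Pearson/Spearman) with the human gold scores. A sketch of the metric definitions, as an illustration rather than the exact evaluation code:

```python
import numpy as np
from scipy.stats import pearsonr, spearmanr

def sts_metrics(emb1: np.ndarray, emb2: np.ndarray, gold: np.ndarray) -> dict:
    """Correlate pairwise similarity scores with gold STS annotations."""
    # Cosine similarity per sentence pair.
    cos = np.sum(emb1 * emb2, axis=1) / (
        np.linalg.norm(emb1, axis=1) * np.linalg.norm(emb2, axis=1)
    )
    # Negated distances, so that larger = more similar, matching the gold scale.
    euclid = -np.linalg.norm(emb1 - emb2, axis=1)
    manhattan = -np.sum(np.abs(emb1 - emb2), axis=1)
    return {
        "cos_sim_pearson": pearsonr(cos, gold)[0] * 100,
        "cos_sim_spearman": spearmanr(cos, gold)[0] * 100,
        "euclidean_pearson": pearsonr(euclid, gold)[0] * 100,
        "euclidean_spearman": spearmanr(euclid, gold)[0] * 100,
        "manhattan_pearson": pearsonr(manhattan, gold)[0] * 100,
        "manhattan_spearman": spearmanr(manhattan, gold)[0] * 100,
    }
```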
#### 2. Classification Tasks
| Dataset | Accuracy | AP | F1 |
|---------|----------|----|----|
| mteb/amazon_counterfactual (en, test) | 72.55223880597015 | 35.01515316721116 | 66.44086070814382 |
| mteb/amazon_polarity (test) | 96.75819999999999 | 95.51009242092881 | 96.75713119357414 |
| mteb/amazon_reviews_multi (en, test) | 61.971999999999994 | - | 60.50745575187704 |
| mteb/amazon_reviews_multi (zh, test) | 53.49 | - | 51.576550662258434 |
| mteb/banking77 (test) | 80.2435064935065 | - | 79.44078343737895 |
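
Accuracy, AP, and F1 presumably come from the standard MTEB classification protocol, which fits a linear probe on frozen embeddings; this is an assumption, and the F1 averaging mode is not stated in the card. A minimal sketch:

```python
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, average_precision_score, f1_score

def linear_probe_scores(train_emb, train_y, test_emb, test_y):
    """Fit a linear classifier on frozen embeddings and score the test split."""
    clf = LogisticRegression(max_iter=1000)
    clf.fit(train_emb, train_y)
    pred = clf.predict(test_emb)
    out = {
        "accuracy": accuracy_score(test_y, pred) * 100,
        # Macro averaging is assumed; the card does not specify.
        "f1": f1_score(test_y, pred, average="macro") * 100,
    }
    # AP appears only for the binary tasks above, consistent with this check.
    if len(set(test_y)) == 2:
        proba = clf.predict_proba(test_emb)[:, 1]
        out["ap"] = average_precision_score(test_y, proba) * 100
    return out
```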
#### 3. Retrieval Tasks
| Dataset | map_at_1 | map_at_10 | map_at_100 | map_at_1000 | map_at_3 | map_at_5 | mrr_at_1 | mrr_at_10 | mrr_at_100 | mrr_at_1000 | mrr_at_3 | mrr_at_5 | ndcg_at_1 | ndcg_at_10 | ndcg_at_100 | ndcg_at_1000 | ndcg_at_3 | ndcg_at_5 | precision_at_1 | precision_at_10 | precision_at_100 | precision_at_1000 | precision_at_3 | precision_at_5 | recall_at_1 | recall_at_10 | recall_at_100 | recall_at_1000 | recall_at_3 | recall_at_5 |
|---------|----------|-----------|------------|-------------|----------|----------|-----------|------------|-------------|--------------|-----------|-----------|-----------|------------|-------------|--------------|-----------|-----------|--------------|---------------|----------------|-----------------|--------------|--------------|-----------|------------|-------------|--------------|--------------|--------------|--------------|--------------|
| mteb/arguana (test) | 36.272999999999996 | 52.782 | 53.339999999999996 | 53.342999999999996 | 48.4 | 50.882000000000005 | 36.984 | 53.052 | 53.604 | 53.607000000000006 | 48.613 | 51.159 | 36.272999999999996 | 61.524 | 63.796 | 63.869 | 52.456 | 56.964000000000006 | 36.272999999999996 | 8.926 | 0.989 | 0.1 | 21.407999999999998 | 15.049999999999999 | 36.272999999999996 | 89.25999999999999 | 98.933 | 99.502 | 64.225 | 75.249 |
| BeIR/cqadupstack (Android, test) | 30.623 | 40.482 | 41.997 | 42.135 | 37.754 | 39.031 | 37.482 | 46.311 | 47.211999999999996 | 47.27 | 44.157999999999994 | 45.145 | 37.482 | 46.142 | 51.834 | 54.164 | 42.309000000000005 | 43.485 | 37.482 | 8.455 | 1.3780000000000001 | 0.188 | 20.172 | 13.705 | 30.623 | 56.77100000000001 | 80.034 | 94.62899999999999 | 44.663000000000004 | 48.692 |
| BeIR/cqadupstack (English, test) | 27.941 | 38.437 | 39.625 | 39.753 | 35.388999999999996 | 37.113 | 34.522000000000006 | 43.864999999999995 | 44.533 | 44.580999999999996 | 41.55 | 42.942 | 34.522000000000006 | 44.330000000000005 | 48.61 | 50.712999999999994 | 39.834 | 42.016 | 34.522000000000006 | 8.471 | 1.3379999999999999 | 0.182 | 19.363 | 13.898 | 27.941 | 55.336 | 73.51100000000001 | 86.636 | 42.54 | 48.392 |
| BeIR/cqadupstack (Gaming, test) | 32.681 | 45.48 | 46.542 | 46.604 | 42.076 | 44.076 | 37.492 | 48.746 | 49.485 | 49.517 | 45.998 | 47.681000000000004 | 37.492 | 51.778999999999996 | 56.294 | 57.58 | 45.856 | 48.968 | 37.492 | 8.620999999999999 | - | - | - | - | 32.681 | - | - | - | - | - |
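
The columns are the standard rank-based retrieval metrics at cutoffs 1/3/5/10/100/1000. As a reminder of the definitions, here is a small illustrative sketch of recall@k and precision@k over cosine-ranked documents (not the exact BEIR/MTEB evaluation code):

```python
import numpy as np

def recall_precision_at_k(query_emb, doc_embs, relevant_ids, k=10):
    """Rank documents by cosine similarity and score the top-k cutoff."""
    scores = doc_embs @ query_emb / (
        np.linalg.norm(doc_embs, axis=1) * np.linalg.norm(query_emb)
    )
    top_k = set(np.argsort(-scores)[:k].tolist())
    hits = len(top_k & set(relevant_ids))
    return hits / len(relevant_ids), hits / k  # recall@k, precision@k
```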
#### 4. Clustering Tasks
| Dataset | v_measure |
|---------|-----------|
| mteb/arxiv-clustering-p2p (test) | 52.45236368396085 |
| mteb/arxiv-clustering-s2s (test) | 46.83781937870832 |
| mteb/biorxiv-clustering-p2p (test) | 44.68220155432257 |
| mteb/biorxiv-clustering-s2s (test) | 40.666150477589284 |
| C-MTEB/CLSClusteringP2P (test) | 44.23533333311907 |
| C-MTEB/CLSClusteringS2S (test) | 43.01114481307774 |
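
v_measure is scikit-learn's V-measure between predicted clusters and gold labels; MTEB's clustering tasks typically obtain the clusters with MiniBatchKMeans, which is assumed in this sketch:

```python
from sklearn.cluster import MiniBatchKMeans
from sklearn.metrics import v_measure_score

def v_measure(embeddings, labels):
    """Cluster embeddings and compare the assignment to the gold labels."""
    km = MiniBatchKMeans(n_clusters=len(set(labels)))
    pred = km.fit_predict(embeddings)
    return v_measure_score(labels, pred) * 100
```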
#### 5. Reranking Tasks
| Dataset | MAP | MRR |
|---------|-----|-----|
| mteb/askubuntudupquestions-reranking (test) | 60.653430349851746 | 74.28736314470387 |
| C-MTEB/CMedQAv1-reranking (test) | 86.4349853821696 | 88.80150793650795 |
| C-MTEB/CMedQAv2-reranking (test) | 87.56417400982208 | 89.85813492063491 |
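
MRR (mean reciprocal rank) averages, over queries, the reciprocal rank of the first relevant candidate; MAP averages precision over all relevant positions. A definition-level sketch of MRR:

```python
def mean_reciprocal_rank(ranked_relevance):
    """ranked_relevance: one 0/1 relevance list per query, in ranked order."""
    total = 0.0
    for flags in ranked_relevance:
        # Reciprocal rank of the first relevant item, or 0 if none is relevant.
        total += next((1.0 / i for i, rel in enumerate(flags, start=1) if rel), 0.0)
    return total / len(ranked_relevance) * 100

# Example: hits at rank 2 and rank 1 -> (0.5 + 1.0) / 2 = 75.0
print(mean_reciprocal_rank([[0, 1, 0], [1, 0, 0]]))
```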
## 📄 License
This project is licensed under the Apache-2.0 license.