🚀 jina-embeddings-v2-base-de
jina-embeddings-v2-base-de is a bilingual text embedding model designed for a range of natural language processing tasks, including text classification, retrieval, clustering, and more. It supports both German and English and has been evaluated on multiple datasets; detailed performance metrics are provided below.
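A minimal usage sketch with sentence-transformers, assuming the model is hosted on the Hugging Face Hub as `jinaai/jina-embeddings-v2-base-de` and, like other Jina v2 checkpoints, needs `trust_remote_code=True` to load its custom modeling code:

```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.util import cos_sim

# Assumed Hub repo id; trust_remote_code=True is assumed here because
# Jina v2 checkpoints ship custom modeling code alongside the weights.
model = SentenceTransformer(
    "jinaai/jina-embeddings-v2-base-de", trust_remote_code=True
)

sentences = [
    "How is the weather today?",
    "Wie ist das Wetter heute?",  # German rendering of the same question
]
embeddings = model.encode(sentences)

# A cross-lingual paraphrase pair should score high on cosine similarity.
print(cos_sim(embeddings[0], embeddings[1]))
```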
📚 Documentation
Tags
- sentence-transformers
- feature-extraction
- sentence-similarity
- mteb
- transformers
- transformers.js
Supported Languages
- German (de)
- English (en)
Inference
Inference is disabled for this model.
License
This model is licensed under the Apache 2.0 (`apache-2.0`) license.
Model Index
jina-embeddings-v2-base-de has the following evaluation results across tasks and datasets:
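The figures below come from the MTEB benchmark. A hedged reproduction sketch using the `mteb` package's classic runner interface (task names follow the MTEB registry; the Hub repo id and loading flags are assumed as above):

```python
from mteb import MTEB
from sentence_transformers import SentenceTransformer

# Assumed repo id and loading flags, as in the usage sketch above.
model = SentenceTransformer(
    "jinaai/jina-embeddings-v2-base-de", trust_remote_code=True
)

# Run a single task from the tables below; results are written as JSON.
evaluation = MTEB(tasks=["Banking77Classification"])
results = evaluation.run(model, output_folder="results")
```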
Classification Tasks
| Dataset | Language | Accuracy | AP | F1 |
| --- | --- | --- | --- | --- |
| MTEB AmazonCounterfactualClassification | en | 73.76119402985076 | 35.99577188521176 | 67.50397431543269 |
| MTEB AmazonCounterfactualClassification | de | 68.9186295503212 | 79.73307115840507 | 66.66245744831339 |
| MTEB AmazonPolarityClassification | - | 77.52215 | 71.85051037177416 | 77.4171096157774 |
| MTEB AmazonReviewsClassification | en | 38.498 | - | 38.058193386555956 |
| MTEB AmazonReviewsClassification | de | 37.717999999999996 | - | 37.22674371574757 |
| MTEB Banking77Classification | - | 83.93506493506493 | - | 83.91014949949302 |
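MTEB classification scores measure how well a simple classifier performs on top of frozen embeddings. A minimal sketch of that protocol with scikit-learn (a simplification of MTEB's actual pipeline; the helper name and macro averaging are illustrative assumptions):

```python
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score

def linear_probe_scores(model, train_texts, train_labels, test_texts, test_labels):
    """Fit a linear probe on frozen sentence embeddings and score it."""
    X_train = model.encode(train_texts)
    X_test = model.encode(test_texts)
    clf = LogisticRegression(max_iter=1000)
    clf.fit(X_train, train_labels)
    preds = clf.predict(X_test)
    return {
        "accuracy": accuracy_score(test_labels, preds),
        "f1": f1_score(test_labels, preds, average="macro"),  # assumed averaging
    }
```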
Retrieval Tasks
| Dataset | MAP@1 | MAP@10 | MAP@100 | MAP@1000 | MRR@1 | MRR@10 | MRR@100 | MRR@1000 | NDCG@1 | NDCG@10 | NDCG@100 | NDCG@1000 | Precision@1 | Precision@10 | Precision@100 | Precision@1000 | Recall@1 | Recall@10 | Recall@100 | Recall@1000 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| MTEB ArguAna | 25.319999999999997 | 40.351 | 41.435 | 41.443000000000005 | 25.746999999999996 | 40.515 | 41.606 | 41.614000000000004 | 25.319999999999997 | 49.332 | 53.909 | 54.089 | 25.319999999999997 | 7.831 | 0.9820000000000001 | 0.1 | 25.319999999999997 | 78.307 | 98.222 | 99.57300000000001 |
| MTEB CQADupstackAndroidRetrieval | 30.830999999999996 | 41.355 | 42.791000000000004 | 42.918 | 38.484 | 47.593 | 48.388 | 48.439 | 38.484 | 47.27 | 52.568000000000005 | 54.729000000000006 | 38.484 | 8.927 | 1.425 | 0.19 | 30.830999999999996 | 57.87799999999999 | 80.124 | 94.208 |
| MTEB CQADupstackEnglishRetrieval | 25.782 | 34.492 | 35.521 | 35.638 | 32.357 | 39.965 | 40.644000000000005 | 40.695 | 32.357 | 39.644 | 43.851 | 46.211999999999996 | 32.357 | 7.344 | 1.201 | 0.168 | 25.782 | 49.132999999999996 | 67.24 | 83.045 |
| MTEB CQADupstackGamingRetrieval | 35.778999999999996 | 47.038000000000004 | 48.064 | 48.128 | 41.254000000000005 | 50.556999999999995 | 51.296 | 51.331 | 41.254000000000005 | 52.454 | 56.776 | 58.181000000000004 | 41.254000000000005 | 8.464 | 1.157 | 0.133 | 35.778999999999996 | 64.85300000000001 | 83.98400000000001 | 94.18299999999999 |
| MTEB CQADupstackGisRetrieval | 21.719 | 29.326999999999998 | 30.314000000000004 | 30.397000000000002 | 23.503 | 31.225 | 32.096000000000004 | 32.159 | 23.503 | 33.842 | 39.038000000000004 | 41.214 | 23.503 | 5.266 | - | - | 21.719 | - | - | - |
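As a reference for reading the table, MRR@k and NDCG@k can be computed per query from a ranked result list as follows (a minimal sketch with binary relevance; MTEB averages these per-query values over the full query set and supports graded relevance):

```python
import math

def mrr_at_k(ranked_ids, relevant_ids, k=10):
    """Reciprocal rank of the first relevant hit within the top k."""
    for rank, doc_id in enumerate(ranked_ids[:k], start=1):
        if doc_id in relevant_ids:
            return 1.0 / rank
    return 0.0

def ndcg_at_k(ranked_ids, relevant_ids, k=10):
    """NDCG with binary gains: DCG of the ranking over the ideal DCG."""
    dcg = sum(
        1.0 / math.log2(rank + 1)
        for rank, doc_id in enumerate(ranked_ids[:k], start=1)
        if doc_id in relevant_ids
    )
    ideal = sum(
        1.0 / math.log2(rank + 1)
        for rank in range(1, min(k, len(relevant_ids)) + 1)
    )
    return dcg / ideal if ideal > 0 else 0.0
```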
Clustering Tasks
| Dataset | V-Measure |
| --- | --- |
| MTEB ArxivClusteringP2P | 41.43100588255654 |
| MTEB ArxivClusteringS2S | 32.08988904593667 |
| MTEB BiorxivClusteringP2P | 34.970675877585144 |
| MTEB BiorxivClusteringS2S | 28.779230269190954 |
| MTEB BlurbsClusteringP2P | 35.490175601567216 |
| MTEB BlurbsClusteringS2S | 16.16638280560168 |
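V-measure is the harmonic mean of the homogeneity and completeness of a predicted clustering against gold labels. A minimal sketch, assuming a k-means clustering of the embeddings along the lines of MTEB's clustering protocol:

```python
from sklearn.cluster import MiniBatchKMeans
from sklearn.metrics import v_measure_score

def v_measure_for_embeddings(embeddings, gold_labels):
    """Cluster embeddings with k-means and score against gold labels."""
    n_clusters = len(set(gold_labels))
    predicted = MiniBatchKMeans(n_clusters=n_clusters).fit_predict(embeddings)
    return v_measure_score(gold_labels, predicted)
```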
Reranking Task
| Dataset | MAP | MRR |
| --- | --- | --- |
| MTEB AskUbuntuDupQuestions | 60.55514765595906 | 73.51393835465858 |
STS Task
| Dataset | Cosine Similarity Pearson | Cosine Similarity Spearman | Euclidean Pearson | Euclidean Spearman | Manhattan Pearson | Manhattan Spearman |
| --- | --- | --- | --- | --- | --- | --- |
| MTEB BIOSSES | 79.6723823121172 | 76.90596922214986 | 77.87910737957918 | 76.66319260598262 | 77.37039493457965 | 76.09872191280964 |
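Each STS column is a Pearson or Spearman correlation between gold similarity ratings and a similarity derived from the embeddings; distances are negated so that larger values mean more similar. A minimal sketch with SciPy (the helper name is illustrative):

```python
from scipy.stats import pearsonr, spearmanr
from scipy.spatial.distance import cosine, euclidean, cityblock

def sts_scores(emb_a, emb_b, gold):
    """Correlate gold similarity ratings with three embedding similarities."""
    cos = [1.0 - cosine(a, b) for a, b in zip(emb_a, emb_b)]
    euc = [-euclidean(a, b) for a, b in zip(emb_a, emb_b)]  # negated distance
    man = [-cityblock(a, b) for a, b in zip(emb_a, emb_b)]  # negated distance
    return {
        "cosine_pearson": pearsonr(gold, cos)[0],
        "cosine_spearman": spearmanr(gold, cos)[0],
        "euclidean_pearson": pearsonr(gold, euc)[0],
        "euclidean_spearman": spearmanr(gold, euc)[0],
        "manhattan_pearson": pearsonr(gold, man)[0],
        "manhattan_spearman": spearmanr(gold, man)[0],
    }
```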
Bitext Mining Task
| Dataset | Accuracy | F1 | Precision | Recall |
| --- | --- | --- | --- | --- |
| MTEB BUCC (de-en) | 98.97703549060543 | 98.86569241475296 | 98.81002087682673 | 98.97703549060543 |
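Bitext mining pairs each sentence in one language with its nearest neighbor in the other language's embedding space. A minimal sketch of that matching step (the full BUCC protocol typically adds margin-based scoring, omitted here):

```python
import numpy as np

def mine_bitext(emb_de, emb_en):
    """Match each German sentence to its most similar English sentence."""
    # Normalize rows so that the dot product equals cosine similarity.
    de = emb_de / np.linalg.norm(emb_de, axis=1, keepdims=True)
    en = emb_en / np.linalg.norm(emb_en, axis=1, keepdims=True)
    sims = de @ en.T
    return sims.argmax(axis=1)  # index of the best English match per German sentence

# Accuracy against gold alignments (gold[i] = index of the true English match):
# accuracy = (mine_bitext(emb_de, emb_en) == gold).mean()
```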