MMLW Retrieval E5 Small

Developed by sdadas
MMLW (muszę mieć lepszą wiadomość, roughly "I must have a better message") is a neural text encoder for Polish, optimized for information retrieval. It converts queries and passages into 384-dimensional vectors.
Downloads 34
Release Time: 10/18/2023

Model Overview

This model is a Polish sentence transformer primarily used for feature extraction and sentence similarity calculation, especially suitable for information retrieval tasks.

Model Features

Multilingual knowledge distillation
Trained on 60 million Polish-English text pairs, using English FlagEmbeddings as the teacher model for knowledge distillation.
Contrastive loss fine-tuning
Fine-tuned on the Polish version of the MS MARCO training set with a contrastive loss, using large batch sizes to make contrastive training more efficient.
Prefix enhancement
Specific prefixes must be added when encoding text ('query: ' for queries, 'passage: ' for passages) to optimize retrieval performance.
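
A minimal sketch of how the prefixes might be applied before encoding. It assumes the model is loaded through the sentence-transformers library under the Hugging Face ID sdadas/mmlw-retrieval-e5-small, which this page does not state. The helper names add_prefix and encode_texts are hypothetical, as are the sample Polish sentences.

```python
from typing import List

QUERY_PREFIX = "query: "
PASSAGE_PREFIX = "passage: "


def add_prefix(texts: List[str], prefix: str) -> List[str]:
    """Prepend the required prefix, skipping texts that already carry it."""
    return [t if t.startswith(prefix) else prefix + t for t in texts]


def encode_texts(texts: List[str], prefix: str,
                 model_id: str = "sdadas/mmlw-retrieval-e5-small"):
    """Encode prefixed texts (assumed model ID; downloads the model on first use)."""
    # Imported lazily so the prefix helper works without the library installed.
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(model_id)
    return model.encode(add_prefix(texts, prefix))


# Prefixing alone needs no model download.
queries = add_prefix(["Jak dożyć 100 lat?"], QUERY_PREFIX)
passages = add_prefix(["Trzeba zdrowo się odżywiać i uprawiać sport."], PASSAGE_PREFIX)
print(queries[0])
print(passages[0])
```

In a real pipeline the model would be loaded once and reused across calls. encode_texts reloads it on every call only to keep the sketch short.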

Model Capabilities

Text encoding
Sentence similarity calculation
Information retrieval
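
As a rough illustration of sentence similarity and retrieval over encoded vectors, the sketch below ranks passages by cosine similarity to a query. It assumes cosine similarity is the scoring function, which this page does not state. Toy 3-dimensional vectors stand in for the model's 384-dimensional output, and the function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_passages(query_vec: Sequence[float],
                  passage_vecs: Sequence[Sequence[float]]) -> List[Tuple[int, float]]:
    """Return (passage index, score) pairs, highest similarity first."""
    scores = [(i, cosine_similarity(query_vec, p)) for i, p in enumerate(passage_vecs)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy vectors; real embeddings would come from the encoder.
query_vec = [1.0, 0.0, 0.0]
passage_vecs = [[0.0, 1.0, 0.0], [2.0, 0.0, 0.0], [1.0, 1.0, 0.0]]
ranking = rank_passages(query_vec, passage_vecs)
print([index for index, _ in ranking])
```

Because cosine similarity ignores vector length, the passage pointing in the same direction as the query ranks first even though its magnitude differs.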

Use Cases

Information retrieval
Q&A systems
Used to match user queries with relevant answer passages
Effectively identifies semantically related question-answer pairs
Document retrieval
Retrieving relevant content from large document collections
Achieved an NDCG@10 score of 52.34 on the Polish Information Retrieval Benchmark (PIRB)
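
For readers unfamiliar with the metric, here is a minimal sketch of NDCG@10 for a single query. It uses the common linear-gain formulation with a log2 rank discount. Which variant PIRB uses, and how it averages across queries and tasks, is not stated here, so treat this as an illustration only. The sketch also assumes the input list covers every judged document for the query, and the function names are hypothetical.

```python
import math
from typing import Sequence


def dcg_at_k(relevances: Sequence[float], k: int) -> float:
    """Discounted cumulative gain over the top-k results (rank 1 has discount log2(2) = 1)."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(relevances: Sequence[float], k: int = 10) -> float:
    """DCG of the given ranking divided by the DCG of the ideal ordering."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    if ideal == 0.0:
        return 0.0  # no relevant documents at all
    return dcg_at_k(relevances, k) / ideal


# Relevance labels in the order the system ranked the documents.
ranked_relevances = [1, 0, 1]
print(round(ndcg_at_k(ranked_relevances), 4))
```

A perfect ranking yields 1.0. A figure such as 52.34 is presumably this value averaged over queries and reported on a 0 to 100 scale.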