E5 Large En Ru
This is a vocabulary-pruned version of the intfloat/multilingual-e5-large model, retaining only Russian and English tokens while maintaining the original model's performance.
Downloads 712
Release Time: 9/18/2023
Model Overview
E5-large-en-ru is a text embedding model for Russian and English, derived from intfloat/multilingual-e5-large by vocabulary pruning. It is suited to tasks such as information retrieval and semantic similarity calculation.
Model Features
Vocabulary Optimization
Pruning retains only Russian and English tokens, significantly reducing model size while maintaining performance.
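The general idea of vocabulary pruning can be illustrated with a toy sketch: keep only the embedding rows of the tokens you want and renumber their ids. This is not the procedure used to build this model, which the page does not describe. The function name prune_vocab and the toy vocabulary and embeddings are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def prune_vocab(
    vocab: Dict[str, int],
    embeddings: List[List[float]],
    keep_tokens: Set[str],
) -> Tuple[Dict[str, int], List[List[float]]]:
    """Keep only the tokens in keep_tokens and renumber their ids.

    Hypothetical helper for illustration: real tokenizers also carry
    special tokens and merge rules that need to be preserved.
    """
    # Preserve the original relative order of the surviving tokens.
    kept = sorted((tok for tok in vocab if tok in keep_tokens), key=lambda t: vocab[t])
    new_vocab = {tok: new_id for new_id, tok in enumerate(kept)}
    # Copy the embedding row of each surviving token into its new position.
    new_embeddings = [embeddings[vocab[tok]] for tok in kept]
    return new_vocab, new_embeddings


# Toy data (made up): four tokens, two-dimensional embeddings.
toy_vocab = {"<s>": 0, "hello": 1, "bonjour": 2, "привет": 3}
toy_embeddings = [[0.0, 0.0], [0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]

pruned_vocab, pruned_embeddings = prune_vocab(
    toy_vocab, toy_embeddings, keep_tokens={"<s>", "hello", "привет"}
)
```

Because only the embedding table shrinks and the surviving rows are copied unchanged, the pruned model produces the same vectors for text that uses only retained tokens.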
High-Performance Retrieval
Retrieval metrics on the SberQuAD benchmark are comparable to those of the original model.
Multi-Task Adaptation
Distinguishes task types (query, passage, and symmetric tasks) via prefixes added to the input text.
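A minimal sketch of the prefixing step, assuming this model follows the convention of the upstream multilingual-e5 family: "query: " for search queries, "passage: " for documents, and "query: " on both sides for symmetric tasks such as semantic similarity. The helper name add_e5_prefix and the role labels are hypothetical and not part of any library.

```python
# Assumed prefix convention, taken from the upstream E5 model family.
E5_PREFIXES = {
    "query": "query: ",
    "passage": "passage: ",
    "symmetric": "query: ",  # symmetric tasks use the query prefix on both inputs
}


def add_e5_prefix(text: str, role: str = "query") -> str:
    """Prepend the task prefix for the given role (hypothetical helper)."""
    if role not in E5_PREFIXES:
        raise ValueError(f"unknown role {role!r}; expected one of {sorted(E5_PREFIXES)}")
    return E5_PREFIXES[role] + text.strip()


question = add_e5_prefix("Кто написал «Войну и мир»?", role="query")
document = add_e5_prefix("War and Peace is a novel by Leo Tolstoy.", role="passage")
```

The prefixed strings are what would be passed to the tokenizer and model. Omitting the prefix typically degrades retrieval quality for E5-style models.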
Model Capabilities
Text vectorization
Semantic similarity calculation
Information retrieval
Cross-language text matching
Use Cases
Information Retrieval
Open-Domain Question Answering
Used to retrieve the most relevant document passages for questions.
Achieves recall@5 of 82.8% on the SberQuAD test set.
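For reference, recall@k in this setting is usually the share of questions whose relevant passage appears among the top k retrieved results. The sketch below assumes exactly one gold passage per question. The evaluation protocol behind the SberQuAD figure is not described on this page, so the function name recall_at_k and the toy rankings are illustrative only.

```python
from typing import Sequence


def recall_at_k(
    ranked_ids: Sequence[Sequence[int]],
    gold_ids: Sequence[int],
    k: int = 5,
) -> float:
    """Fraction of questions whose gold passage id is in the top-k results.

    ranked_ids[i] holds passage ids for question i, best match first.
    gold_ids[i] is the single relevant passage id for question i (assumption).
    """
    if len(ranked_ids) != len(gold_ids):
        raise ValueError("ranked_ids and gold_ids must have the same length")
    if not gold_ids:
        raise ValueError("at least one question is required")
    hits = sum(1 for ranked, gold in zip(ranked_ids, gold_ids) if gold in ranked[:k])
    return hits / len(gold_ids)


# Toy rankings (made up) for three questions.
toy_rankings = [[3, 1, 2], [0, 4, 5], [9, 8, 7]]
toy_gold = [1, 5, 6]
score = recall_at_k(toy_rankings, toy_gold, k=2)
```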
Semantic Analysis
Document Similarity Calculation
Compares the semantic similarity of different documents.
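Similarity between two documents is commonly measured as the cosine of the angle between their embedding vectors. The sketch below shows that computation in plain Python. The short vectors are made-up stand-ins for model output, not real embeddings, and the function name cosine_similarity is chosen for illustration.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)


# Made-up stand-ins for the embeddings of three documents.
doc_a = [0.2, 0.7, 0.1]
doc_b = [0.4, 1.4, 0.2]   # same direction as doc_a
doc_c = [0.7, -0.2, 0.0]  # orthogonal to doc_a

sim_ab = cosine_similarity(doc_a, doc_b)
sim_ac = cosine_similarity(doc_a, doc_c)
```

Values close to 1 indicate near-identical direction, and values near 0 indicate unrelated content. If the embeddings are already L2-normalized, the dot product alone gives the same result.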