m-ST5
m-ST5 is a multilingual sentence embedding model based on the mT5 encoder, specifically optimized for cross-lingual semantic textual similarity and sentence retrieval tasks.
Downloads 30
Release Time: 6/26/2023
Model Overview
This model is a multilingual extension of Sentence-T5, designed to generate high-quality sentence embeddings that support cross-lingual semantic textual similarity comparison and sentence retrieval.
Model Features
Multilingual support
Based on the mT5 architecture, supports sentence embedding generation for multiple languages.
Efficient fine-tuning
Uses LoRA (low-rank adaptation) for parameter-efficient fine-tuning.
High performance
Outperforms baseline models such as LaBSE on cross-lingual semantic textual similarity and sentence retrieval tasks.
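To illustrate why LoRA-style adaptation is parameter-efficient, here is a minimal sketch of the general technique: a frozen weight matrix W0 is adjusted by a low-rank product, W = W0 + (alpha / rank) * B @ A, and only A and B are trained. The matrix sizes, rank, and alpha below are illustrative assumptions, not the settings used to train m-ST5.

```python
def lora_param_counts(d_out, d_in, rank):
    """Return (full, lora) trainable-parameter counts for one weight matrix."""
    full = d_out * d_in            # updating W directly
    lora = rank * (d_out + d_in)   # updating B (d_out x rank) and A (rank x d_in)
    return full, lora


def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_effective_weight(w0, b, a, alpha, rank):
    """Compute W = W0 + (alpha / rank) * B @ A without modifying W0."""
    scale = alpha / rank
    delta = matmul(b, a)
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(w0, delta)]


# Hypothetical layer size and rank, chosen only for illustration.
full, lora = lora_param_counts(1024, 1024, 8)
print(f"full fine-tuning: {full} params, LoRA: {lora} params")
```

With these example values the low-rank update trains a small fraction of the parameters that full fine-tuning of the same matrix would.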
Model Capabilities
Cross-lingual sentence embedding generation
Semantic textual similarity calculation
Cross-lingual sentence retrieval
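Semantic textual similarity between two sentence embeddings is commonly scored with cosine similarity. The sketch below assumes the embeddings have already been produced by the model. The short vectors are made-up placeholders standing in for real embeddings, and the page does not state which similarity measure m-ST5 itself uses.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention: a zero vector is similar to nothing
    return dot / (norm_u * norm_v)


# Placeholder embeddings (hypothetical values, not model output).
emb_english = [0.20, 0.70, 0.10]
emb_other_language = [0.25, 0.65, 0.05]
print(cosine_similarity(emb_english, emb_other_language))
```

Scores near 1 indicate that the two sentences point in nearly the same direction in embedding space, regardless of the language each was written in.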
Use Cases
Cross-lingual information retrieval
Multilingual document retrieval
Search for semantically similar sentences across document collections in different languages.
Achieved 97.6 accuracy on the BUCC task
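Cross-lingual retrieval of this kind can be sketched as a nearest-neighbor search: embed the query and every candidate sentence, then rank candidates by cosine similarity. The function name `retrieve` and the toy vectors are hypothetical, and a real system would use model-generated embeddings and typically an approximate nearest-neighbor index rather than a full scan.

```python
import math


def _normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm else list(v)


def retrieve(query, corpus, k=1):
    """Return the k best (index, score) pairs, ranked by cosine similarity."""
    q = _normalize(query)
    scored = []
    for idx, vec in enumerate(corpus):
        c = _normalize(vec)
        # Dot product of unit vectors equals cosine similarity.
        scored.append((sum(a * b for a, b in zip(q, c)), idx))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(idx, score) for score, idx in scored[:k]]


# Placeholder embeddings for three candidate sentences (hypothetical values).
corpus = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]
for idx, score in retrieve([1.0, 0.0], corpus, k=2):
    print(idx, round(score, 4))
```

Normalizing once and using dot products keeps the ranking step simple. For large collections, the corpus vectors would normally be normalized and stored ahead of time.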
Semantic similarity analysis
Cross-lingual text similarity evaluation
Compare semantic similarity between texts in different languages.
Outperformed the LaBSE model on the XSTS task