LaBSE Sentence Embeddings
LaBSE (Language-agnostic BERT Sentence Embedding) is a multilingual sentence embedding model based on BERT. It supports 109 languages and is suited to sentence similarity calculation and bilingual text retrieval.
Downloads 152
Release Time: 4/30/2023
Model Overview
The model is pre-trained with a combination of masked language modeling (MLM) and translation language modeling (TLM). It produces multilingual sentence embeddings in which sentences with the same meaning in different languages lie close together, which makes it especially suitable for cross-language text matching tasks.
Model Features
Multilingual support
Supports sentence embeddings for 109 languages, including many low-resource languages
Cross-language retrieval
Specifically optimized for cross-language text matching and retrieval tasks
High-quality embeddings
Learns sentence representations through joint training on masked language modeling and translation language modeling
Model Capabilities
Multilingual sentence embeddings
Cross-language text similarity calculation
Bilingual text retrieval
Sentence-level feature extraction
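Cross-language similarity is usually computed as the cosine similarity between two sentence embeddings. The following is a minimal sketch of that calculation in plain Python. The three-dimensional vectors are hypothetical stand-ins for real model output, and the variable names are illustrative only. In practice the vectors would come from the model, for example through the sentence-transformers library, assuming the checkpoint is available there.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy vectors standing in for sentence embeddings (hypothetical values,
# not actual model output).
emb_en = [0.9, 0.1, 0.3]        # e.g. an English sentence
emb_de = [0.8, 0.2, 0.3]        # e.g. its German translation
emb_other = [-0.2, 0.9, -0.4]   # e.g. an unrelated sentence

sim_translation = cosine_similarity(emb_en, emb_de)
sim_unrelated = cosine_similarity(emb_en, emb_other)
```

With these toy values, the translation pair scores higher than the unrelated pair, which is the behavior expected from embeddings of sentences that share a meaning.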
Use Cases
Information retrieval
Cross-language document retrieval
Finding semantically similar documents in document collections of different languages
Can effectively match documents expressing the same concept in different languages
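Cross-language document retrieval can be sketched as a nearest-neighbor search: embed the query and every document, then rank documents by similarity to the query regardless of language. The example below assumes the embeddings have already been computed. The document IDs, the vectors, and the `rank_documents` helper are all hypothetical and only illustrate the ranking step.

```python
import math


def normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def rank_documents(query_emb, doc_embs, top_k=3):
    """Return up to top_k (doc_id, score) pairs, highest similarity first."""
    q = normalize(query_emb)
    scored = []
    for doc_id, emb in doc_embs.items():
        d = normalize(emb)
        scored.append((doc_id, sum(x * y for x, y in zip(q, d))))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy embeddings for documents in different languages (hypothetical values).
doc_embs = {
    "doc_fr": [0.7, 0.1, 0.6],
    "doc_ja": [0.1, 0.9, 0.2],
    "doc_es": [-0.5, 0.3, -0.6],
}
query_emb = [0.8, 0.0, 0.5]  # toy embedding of an English query

results = rank_documents(query_emb, doc_embs, top_k=2)
```

A linear scan like this is adequate for small collections. Larger collections would normally use an approximate nearest-neighbor index instead.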
Machine translation
Translation quality assessment
Evaluating translation quality by comparing embedding similarity between source and target language sentences
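One way to apply this to quality assessment is to score each source/target pair and flag pairs whose similarity falls below a cutoff. The sketch below assumes precomputed embeddings. The `flag_suspect_translations` helper, the pair IDs, the vectors, and the 0.7 threshold are hypothetical, and a real threshold would need to be tuned on labeled data.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def flag_suspect_translations(pairs, threshold=0.7):
    """Return the IDs of pairs whose source/target similarity is below threshold.

    pairs: iterable of (pair_id, source_embedding, target_embedding).
    threshold: hypothetical cutoff, chosen here only for illustration.
    """
    suspects = []
    for pair_id, src_emb, tgt_emb in pairs:
        if cosine_similarity(src_emb, tgt_emb) < threshold:
            suspects.append(pair_id)
    return suspects


# Toy embeddings (hypothetical values, not actual model output).
pairs = [
    ("pair_ok", [0.6, 0.8, 0.0], [0.5, 0.85, 0.1]),    # close in meaning
    ("pair_bad", [0.6, 0.8, 0.0], [0.9, -0.3, 0.3]),   # meaning has drifted
]

suspects = flag_suspect_translations(pairs)
```

Embedding similarity captures whether meaning is preserved, not fluency or grammar, so a score like this works best as a filter alongside other checks.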
© 2025 AIbase