LaBSE

Developed by setu4993
LaBSE is a BERT-based multilingual sentence embedding model that supports 109 languages and is suited to sentence similarity calculation and bilingual text retrieval.
Downloads 18.74k
Release Time: 3/2/2022

Model Overview

The model is pre-trained with a combination of masked language modeling and translation language modeling. It generates high-quality multilingual sentence embeddings and is especially suited to cross-lingual text matching tasks.
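
As a rough illustration of how sentence embeddings might be obtained, the sketch below loads the model through the Hugging Face transformers library. The repository id "setu4993/LaBSE" and the helper name embed_sentences are assumptions made for this example and do not appear on this page; using the pooled output as the sentence vector is also an assumption about typical usage, not a statement of the only supported method.

```python
def embed_sentences(sentences, model_name="setu4993/LaBSE"):
    """Return one L2-normalized embedding (list of floats) per sentence.

    Hypothetical helper: the default model_name is an assumed Hugging Face
    repository id and is not taken from this page.
    """
    # Imported lazily so that defining the helper does not require the
    # heavy dependencies to be installed.
    import torch
    from transformers import BertModel, BertTokenizerFast

    tokenizer = BertTokenizerFast.from_pretrained(model_name)
    model = BertModel.from_pretrained(model_name).eval()

    # Pad to the longest sentence so the batch forms a rectangular tensor.
    inputs = tokenizer(sentences, return_tensors="pt", padding=True)
    with torch.no_grad():
        outputs = model(**inputs)

    # Assumption: the pooled output serves as the sentence embedding.
    embeddings = torch.nn.functional.normalize(outputs.pooler_output, p=2, dim=1)
    return embeddings.tolist()
```

Calling embed_sentences(["Hello, world.", "Ciao, mondo."]) would download the weights on first use and return two vectors of equal length, one per sentence.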

Model Features

Multilingual support
Supports sentence embeddings for 109 languages, enabling cross-lingual text matching.
High-quality embeddings
Generates high-quality sentence representations through joint training of masked language modeling and translation language modeling.
Cross-lingual retrieval
Particularly suited to cross-lingual applications such as bilingual text retrieval.

Model Capabilities

Multilingual sentence embeddings
Cross-lingual text similarity calculation
Bilingual text retrieval
Multilingual semantic matching
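
Cross-lingual similarity calculation usually reduces to comparing two embedding vectors. The sketch below shows cosine similarity in plain Python. The three-dimensional vectors are made-up stand-ins for real model embeddings, chosen only to keep the arithmetic readable.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here: a zero vector matches nothing
    return dot / (norm_a * norm_b)


# Toy stand-ins for embeddings (assumed values, not model output).
emb_en = [0.9, 0.1, 0.2]     # an English sentence
emb_it = [0.8, 0.2, 0.1]     # its Italian translation
emb_other = [0.1, 0.9, 0.3]  # an unrelated sentence

sim_translation = cosine_similarity(emb_en, emb_it)
sim_unrelated = cosine_similarity(emb_en, emb_other)
print(sim_translation > sim_unrelated)
```

With real embeddings, a translation pair would be expected to score higher than an unrelated pair, which is the property the toy vectors imitate.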

Use Cases

Information retrieval
Cross-lingual document retrieval
Finds semantically similar documents across collections written in different languages.
Effectively matches documents expressing the same concept in different languages.
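
A minimal sketch of the retrieval step, assuming document and query embeddings have already been computed: documents are ranked by cosine similarity to the query. The function name rank_documents, the document ids, and the toy vectors are all hypothetical.

```python
import math


def _normalize(vec):
    """Scale a vector to unit length; zero vectors are returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return list(vec) if norm == 0.0 else [x / norm for x in vec]


def rank_documents(query_emb, doc_embs, top_k=3):
    """Return the top_k (doc_id, score) pairs, highest cosine similarity first."""
    q = _normalize(query_emb)
    scored = []
    for doc_id, emb in doc_embs.items():
        d = _normalize(emb)
        # The dot product of unit vectors equals their cosine similarity.
        scored.append((doc_id, sum(x * y for x, y in zip(q, d))))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy stand-ins for embeddings of documents in different languages.
documents = {
    "doc_de": [0.9, 0.1, 0.0],
    "doc_fr": [0.0, 1.0, 0.0],
    "doc_ja": [0.6, 0.0, 0.8],
}
query = [1.0, 0.0, 0.0]

results = rank_documents(query, documents, top_k=2)
print([doc_id for doc_id, _ in results])
```

For large collections, the linear scan would normally be replaced by an approximate nearest-neighbor index; the exhaustive loop is used here only for clarity.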
Machine translation
Translation quality assessment
Evaluates translation quality by comparing the embedding similarity of source-language and target-language sentences.
Provides automatic evaluation metrics highly correlated with human assessments.
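
One way such an assessment could be sketched: score each source/target pair by the cosine similarity of its embeddings and flag pairs that fall below a cutoff for review. The threshold of 0.7 is an arbitrary value assumed for illustration, not a recommended setting, and the vectors are toy stand-ins for real embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def score_translation_pairs(pairs, threshold=0.7):
    """Score (source_emb, target_emb) pairs and flag low-similarity ones.

    The threshold is a hypothetical cutoff; a usable value would have to be
    calibrated against human judgments for the language pair in question.
    """
    report = []
    for source_emb, target_emb in pairs:
        score = cosine_similarity(source_emb, target_emb)
        report.append({"score": score, "needs_review": score < threshold})
    return report


# Toy stand-ins: the first pair is close, the second is orthogonal.
pairs = [
    ([0.5, 0.5], [0.6, 0.4]),
    ([1.0, 0.0], [0.0, 1.0]),
]
report = score_translation_pairs(pairs)
print([entry["needs_review"] for entry in report])
```

Embedding similarity is only a proxy for adequacy; it does not check fluency or terminology, so flagged and unflagged pairs alike may still need human review.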
© 2025 AIbase