All Datasets V4 Mpnet Base

Developed by flax-sentence-embeddings
A sentence embedding model built on mpnet-base and trained on 1 billion sentence pairs with a self-supervised contrastive objective. It generates semantic vector representations of sentences.
Downloads 131
Release Time: 3/2/2022

Model Overview

This model is a sentence encoder: it converts an input sentence into a vector that captures the sentence's semantic content. The resulting vectors are suited to tasks such as information retrieval, text clustering, and sentence similarity calculation.

Model Features

Large-scale training data
Trained on a diverse dataset of over 1 billion sentence pairs, covering various text types such as Q&A, forum discussions, encyclopedias, and more
Contrastive learning optimization
Uses a self-supervised contrastive objective: given one sentence of a pair, the model learns to pick out its true (positive) partner from among other candidate sentences
High-performance TPU training
Trained on 7 TPU v3-8 units with support from Google's technical team
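
The contrastive objective described above can be illustrated with a small sketch. This is one common formulation (cross-entropy over in-batch cosine similarities), not necessarily the exact loss used to train this model. The function names, the `scale` value, and the 2-D toy vectors are all assumptions made for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def in_batch_contrastive_loss(anchors, positives, scale=20.0):
    """Mean cross-entropy where positives[i] is the true partner of anchors[i].

    Every other positive in the batch serves as a negative for anchors[i].
    `scale` is a hypothetical temperature-like factor, not a documented value.
    """
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [scale * cosine(anchor, p) for p in positives]
        # Numerically stable log-sum-exp over the whole batch.
        m = max(logits)
        log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
        total += log_sum - logits[i]
    return total / len(anchors)

# Toy 2-D "embeddings", invented for illustration only.
anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[0.9, 0.1], [0.1, 0.9]]  # each anchor sits next to its own positive
swapped = [[0.1, 0.9], [0.9, 0.1]]  # positives attached to the wrong anchor

loss_aligned = in_batch_contrastive_loss(anchors, aligned)
loss_swapped = in_batch_contrastive_loss(anchors, swapped)
print(loss_aligned < loss_swapped)
```

The loss is small when each sentence is closest to its true partner and large when it is closer to another sentence in the batch, which is the signal that pulls positive pairs together during training.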

Model Capabilities

Sentence vectorization
Semantic similarity calculation
Information retrieval
Text clustering

Use Cases

Information retrieval
Document search
Convert query sentences and documents into vectors to achieve semantic-based document retrieval
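
A minimal sketch of the retrieval step, assuming the query and documents have already been encoded into vectors. The 3-D vectors, document ids, and the `search` helper are hypothetical; real embeddings from the model would have far more dimensions.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def search(query_vec, doc_vecs, top_k=2):
    """Return the top_k (doc_id, score) pairs by descending cosine similarity."""
    scored = [(doc_id, cosine(query_vec, vec)) for doc_id, vec in doc_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]

# Made-up vectors standing in for document embeddings.
doc_vecs = {
    "doc_tpu": [0.9, 0.1, 0.0],
    "doc_cooking": [0.0, 0.2, 0.9],
    "doc_gpu": [0.7, 0.6, 0.1],
}
query_vec = [1.0, 0.2, 0.0]  # stand-in for an encoded query

results = search(query_vec, doc_vecs)
print([doc_id for doc_id, _ in results])
```

For larger collections, the exhaustive loop would typically be replaced by an approximate nearest-neighbor index, but the ranking principle stays the same.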
Text analysis
Similar question identification
Identify semantically similar questions in Q&A systems
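
One possible way to flag similar questions is to compare every pair of question vectors against a similarity threshold. The sketch below assumes the questions are already encoded; the vectors, the example questions in the comments, and the `threshold` of 0.9 are illustrative choices that would need tuning on real data.

```python
import math
from itertools import combinations

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def similar_pairs(question_vecs, threshold=0.9):
    """Return pairs of question ids whose cosine similarity meets the threshold."""
    pairs = []
    for (id_a, vec_a), (id_b, vec_b) in combinations(question_vecs.items(), 2):
        if cosine(vec_a, vec_b) >= threshold:
            pairs.append((id_a, id_b))
    return pairs

# Made-up vectors standing in for question embeddings.
question_vecs = {
    "q1": [0.8, 0.6, 0.0],  # e.g. "How do I reset my password?"
    "q2": [0.7, 0.7, 0.1],  # e.g. "How can I change my password?"
    "q3": [0.0, 0.1, 1.0],  # e.g. "What are your opening hours?"
}

duplicates = similar_pairs(question_vecs)
print(duplicates)
```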