All Mpnet Base V2

Developed by navteca
This is a sentence embedding model based on the MPNet architecture, capable of mapping text to a 768-dimensional vector space, suitable for semantic search and sentence similarity tasks.
Downloads 14
Release Time: 3/2/2022

Model Overview

The model was trained on over 1 billion sentence pairs with a self-supervised contrastive learning objective. It converts sentences and paragraphs into dense vector representations that support NLP tasks such as clustering and semantic search.
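As a rough illustration of turning sentences into vectors, the sketch below wraps an encoder behind a small helper. It assumes the sentence-transformers package is installed and that the model is published under the Hugging Face id "navteca/all-mpnet-base-v2"; neither is confirmed by this page. The names Encoder, load_encoder, and embed are hypothetical helpers introduced only for this example.

```python
from typing import List, Protocol, Sequence


class Encoder(Protocol):
    """Anything exposing encode(sentences) -> one vector per sentence."""

    def encode(self, sentences: Sequence[str]): ...


def load_encoder(model_id: str = "navteca/all-mpnet-base-v2") -> Encoder:
    # Assumption: the model id above and the sentence-transformers package.
    # The import is local so the rest of the sketch works without the package.
    from sentence_transformers import SentenceTransformer

    return SentenceTransformer(model_id)


def embed(sentences: Sequence[str], encoder: Encoder) -> List[List[float]]:
    """Return one plain-Python float vector per input sentence."""
    vectors = encoder.encode(list(sentences))
    return [[float(x) for x in row] for row in vectors]
```

With the real model, embed(["a sentence"], load_encoder()) would be expected to return a single vector of length 768, matching the dimensionality stated above. Passing the encoder in as an argument keeps the helper easy to exercise with a stand-in object.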

Model Features

Large-scale training data
Trained on over 1 billion sentence pairs, covering diverse text types and domains
Efficient semantic encoding
Converts sentences and paragraphs into 768-dimensional dense vectors, effectively capturing semantic information
Contrastive learning optimization
Fine-tuned using contrastive learning objectives to improve the accuracy of sentence similarity judgments
TPU-optimized training
Trained efficiently on 7 TPU v3-8 devices using the Flax/JAX framework
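The page does not spell out the contrastive objective. One common formulation for sentence-pair training, shown here only as a sketch, treats every other pair in the batch as a negative and applies cross-entropy over scaled cosine similarities. The function name in_batch_contrastive_loss and the scale value of 20.0 are assumptions made for illustration, not details taken from this model.

```python
import math


def cosine(u, v):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def in_batch_contrastive_loss(anchors, positives, scale=20.0):
    """Mean cross-entropy where positives[i] is the true match for anchors[i]
    and every other positives[j] in the batch serves as a negative."""
    n = len(anchors)
    total = 0.0
    for i in range(n):
        logits = [scale * cosine(anchors[i], positives[j]) for j in range(n)]
        # Numerically stable log-sum-exp over the row of logits.
        top = max(logits)
        log_sum = top + math.log(sum(math.exp(x - top) for x in logits))
        total += log_sum - logits[i]
    return total / n
```

The loss is small when each anchor is most similar to its own partner and grows when a different sentence in the batch scores higher, which is the behavior that pushes matching sentences together in the vector space.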

Model Capabilities

Sentence vectorization
Semantic similarity calculation
Information retrieval
Text clustering
Feature extraction

Use Cases

Information retrieval
Document search
Convert queries and documents into vectors to enable semantic-based document retrieval
Captures query intent better than keyword search
Text analysis
Sentence similarity calculation
Calculate the semantic similarity between two sentences
Useful for QA systems, duplicate question detection, etc.
Text clustering
Automatically group texts with similar content
Useful for topic modeling, user feedback analysis, etc.
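The use cases above all reduce to comparing embedding vectors. The sketch below shows similarity scoring, ranked document search, and a simple greedy grouping using only the standard library. It takes the vectors as given, so in practice they would come from the model's 768-dimensional output. The function names, the top_k parameter, and the 0.8 threshold are assumptions chosen for illustration, and the greedy grouping is a stand-in for whichever clustering algorithm an application actually uses.

```python
import math


def cosine(u, v):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def search(query_vec, doc_vecs, top_k=3):
    """Return (doc_index, score) pairs for the top_k most similar documents."""
    scored = [(i, cosine(query_vec, d)) for i, d in enumerate(doc_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


def cluster(vectors, threshold=0.8):
    """Greedy grouping: each vector joins the first cluster whose seed
    (first member) is at least `threshold` similar, else starts a new one."""
    clusters = []
    for i, vec in enumerate(vectors):
        for members in clusters:
            if cosine(vec, vectors[members[0]]) >= threshold:
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters
```

For duplicate question detection, cosine(a, b) on two question embeddings can be compared against a threshold tuned on labeled examples. For document search, search(query_vec, doc_vecs) ranks stored document embeddings against a query embedding.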