All Mpnet Base Questions Clustering En

Developed by aiknowyou
A sentence embedding model built on sentence-transformers and optimized for question clustering. It supports semantic similarity calculation for English text.
Downloads 45
Release Time: 9/13/2022

Model Overview

This model maps sentences and paragraphs into a 768-dimensional dense vector space, making it suitable for tasks such as clustering and semantic search. It was fine-tuned on a combination of three public datasets (Quora, WikiAnswer, and StackExchange) to improve its ability to identify semantically similar questions.
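
As a rough illustration of the mapping described above, the sketch below encodes two questions and compares them with cosine similarity. The Hub identifier in `MODEL_ID` is an assumption inferred from the developer and model name and should be verified before use. The helper names `cosine_similarity`, `embed`, and `demo` are hypothetical and not part of any library.

```python
import math

# Assumed Hugging Face Hub id for this model; verify it before relying on it.
MODEL_ID = "aiknowyou/all-mpnet-base-questions-clustering-en"


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def embed(sentences, model_id=MODEL_ID):
    """Encode sentences into dense vectors (needs the sentence-transformers package)."""
    # Imported lazily so the similarity helper works without the package installed.
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(model_id)
    return model.encode(sentences)  # one row per sentence


def demo():
    """Downloads the model on first call, then compares two example questions."""
    questions = [
        "How do I reset my password?",
        "What is the way to change my password?",
    ]
    vectors = embed(questions)
    print(vectors.shape)  # second dimension should be 768 per the description above
    print(cosine_similarity(vectors[0], vectors[1]))
```

Calling `demo()` requires network access to fetch the model. `cosine_similarity` is independent of the model and works on any pair of vectors.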

Model Features

Question Clustering Optimization
Fine-tuned specifically for question clustering, with a focus on identifying semantically similar questions
Multi-dataset Fusion Training
Trained on a combination of three public datasets: Quora, WikiAnswer, and StackExchange
Efficient Semantic Encoding
Efficiently maps sentences and paragraphs into a 768-dimensional dense vector space

Model Capabilities

Sentence Embedding
Semantic Similarity Calculation
Question Clustering
Feature Extraction
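
One way the question clustering capability could be exercised is a simple greedy threshold clustering over precomputed embeddings, sketched below. This is an illustrative technique, not the method used by the model's authors. The function name `cluster_by_threshold` and the default threshold of 0.75 are assumptions chosen for the example.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def cluster_by_threshold(vectors, threshold=0.75):
    """Greedy clustering of embedding vectors.

    Each vector joins the first cluster whose seed (first member) is at least
    `threshold` similar to it; otherwise it starts a new cluster.
    Returns a list of clusters, each a list of indices into `vectors`.
    """
    clusters = []
    for i, vec in enumerate(vectors):
        for members in clusters:
            seed = vectors[members[0]]
            if cosine_similarity(vec, seed) >= threshold:
                members.append(i)
                break
        else:
            # No existing cluster was similar enough.
            clusters.append([i])
    return clusters
```

In practice `vectors` would be the embeddings produced by the model for a list of questions. The result depends on input order and on the threshold, so both are worth tuning on real data.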

Use Cases

QA Systems
Similar Question Identification
Identifies whether a user's question is semantically similar to existing questions
Reported to reach 99.3% accuracy using cosine similarity on the WikiAnswer test set
Question Clustering
Automatically groups semantically similar questions
Information Retrieval
Semantic Search
Search based on semantic matching rather than keyword matching
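
The use cases above can be sketched over precomputed embeddings as a top-k cosine search plus a thresholded best-match lookup for similar question identification. The names `top_k` and `find_similar_question` and the 0.8 threshold are hypothetical choices for illustration. Vectors are assumed to be non-zero.

```python
import math


def normalize(vec):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def top_k(query_vec, corpus_vecs, k=3):
    """Return up to k (index, score) pairs, highest cosine similarity first."""
    q = normalize(query_vec)
    scored = []
    for idx, vec in enumerate(corpus_vecs):
        # Dot product of unit vectors equals cosine similarity.
        score = sum(a * b for a, b in zip(q, normalize(vec)))
        scored.append((idx, score))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


def find_similar_question(query_vec, corpus_vecs, threshold=0.8):
    """Return the index of the best match if its score meets the threshold, else None."""
    best = top_k(query_vec, corpus_vecs, k=1)
    if best and best[0][1] >= threshold:
        return best[0][0]
    return None
```

Here `corpus_vecs` would hold embeddings of existing questions and `query_vec` the embedding of a new user question. A `None` result means no stored question is considered similar enough at the chosen threshold.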