simcse-model-distil-m-bert
A sentence-transformers model based on m-Distil-BERT and trained with the SimCSE method. It maps text to 768-dimensional dense vectors and is suitable for semantic search and clustering tasks.
Release date: 3/2/2022
Model Overview
This model was fine-tuned on a Thai Wikipedia corpus with the SimCSE contrastive-learning method. It produces high-quality sentence embeddings and is particularly well suited to semantic-similarity computation for Thai text.
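The model can be loaded through the sentence-transformers library. The snippet below is a minimal usage sketch; the model id string is a placeholder and should be replaced with the model's actual repository name on the Hugging Face Hub.

```python
from sentence_transformers import SentenceTransformer

MODEL_ID = "simcse-model-distil-m-bert"  # placeholder; use the model's actual Hub id

model = SentenceTransformer(MODEL_ID)

# Thai example sentences
sentences = [
    "วันนี้อากาศดีมาก",    # "The weather is very nice today"
    "อากาศวันนี้แจ่มใส",    # "Today's weather is clear"
    "ฉันชอบกินข้าวผัด",     # "I like fried rice"
]

# Each sentence is encoded into a 768-dimensional dense vector.
embeddings = model.encode(sentences)
```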
Model Features
SimCSE Training Method
Uses a contrastive-learning framework in which dropout noise creates positive pairs from each sentence, so high-quality sentence representations can be learned without any labeled pairs; other sentences in the batch serve as implicit negatives
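For illustration, here is a minimal sketch of the unsupervised SimCSE objective, assuming a generic `encoder` module (in train mode, so dropout is active) that maps a tokenized batch to sentence embeddings; this is not the exact training script used for this model.

```python
import torch
import torch.nn.functional as F

def simcse_loss(encoder, batch, tau: float = 0.05):
    # Two forward passes over the same batch; different dropout masks
    # turn them into two "views" that form the positive pairs.
    z1 = encoder(batch)  # (batch_size, hidden_dim)
    z2 = encoder(batch)  # (batch_size, hidden_dim)

    # Pairwise cosine similarities between all views, scaled by a temperature.
    sim = F.cosine_similarity(z1.unsqueeze(1), z2.unsqueeze(0), dim=-1) / tau

    # For sentence i the positive is its own second view (the diagonal);
    # the other sentences in the batch act as implicit negatives.
    labels = torch.arange(sim.size(0), device=sim.device)
    return F.cross_entropy(sim, labels)
```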
Multilingual Capability
Built on the m-Distil-BERT architecture, with the potential to process multilingual text
Efficient Representation
Maps sentences to 768-dimensional dense vectors, balancing expressive power and computational efficiency
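As a quick illustration of the dense-vector representation, embeddings can be compared with cosine similarity via the sentence-transformers utilities (the model id is the same placeholder as in the earlier sketch):

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("simcse-model-distil-m-bert")  # placeholder id

emb = model.encode(
    ["วันนี้อากาศดีมาก", "อากาศวันนี้แจ่มใส"],
    convert_to_tensor=True,
)
print(emb.shape)                     # torch.Size([2, 768])
print(util.cos_sim(emb[0], emb[1]))  # cosine similarity of the two sentences
```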
Model Capabilities
Sentence embedding generation
Semantic similarity calculation
Text clustering
Semantic search
Use Cases
Information Retrieval
Similar Question Finding
Finding questions semantically similar to a user query in FAQ systems
Improves matching accuracy in Q&A systems
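A hedged sketch of how this could look in practice, using sentence-transformers' built-in semantic_search utility (the FAQ texts and model id are illustrative placeholders):

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("simcse-model-distil-m-bert")  # placeholder id

faq_questions = [
    "สมัครสมาชิกอย่างไร",         # "How do I register an account?"
    "รีเซ็ตรหัสผ่านได้ที่ไหน",      # "Where can I reset my password?"
    "ติดต่อฝ่ายสนับสนุนอย่างไร",    # "How do I contact support?"
]
faq_embeddings = model.encode(faq_questions, convert_to_tensor=True)

query = "ลืมรหัสผ่าน ทำอย่างไรดี"   # "I forgot my password, what should I do?"
query_embedding = model.encode(query, convert_to_tensor=True)

# Return the top-2 FAQ entries most similar to the query.
hits = util.semantic_search(query_embedding, faq_embeddings, top_k=2)[0]
for hit in hits:
    print(faq_questions[hit["corpus_id"]], hit["score"])
```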
Content Analysis
Document Clustering
Automatic topic grouping for large document collections
Enables unsupervised document organization
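A minimal clustering sketch under the same assumptions (placeholder model id, illustrative documents), applying scikit-learn's KMeans to the sentence embeddings:

```python
from sentence_transformers import SentenceTransformer
from sklearn.cluster import KMeans

model = SentenceTransformer("simcse-model-distil-m-bert")  # placeholder id

documents = [
    "ราคาน้ำมันปรับขึ้นอีกครั้งในสัปดาห์นี้",     # economy: "Fuel prices rose again this week"
    "ตลาดหุ้นปิดบวกหลังตัวเลขเงินเฟ้อออกมาดี",    # economy: "Stocks closed higher after good inflation figures"
    "ทีมชาติไทยชนะการแข่งขันเมื่อคืนนี้",         # sports: "The Thai national team won last night's match"
    "นักวิ่งทำลายสถิติในรายการมาราธอน",           # sports: "A runner broke the marathon record"
]

embeddings = model.encode(documents)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(embeddings)
print(labels)  # cluster id per document, e.g. [0 0 1 1]
```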