Simcse Model M Bert Thai Cased
Developed by kornwtp
A SimCSE model based on mBERT and trained specifically for Thai. It generates 768-dimensional vector representations of sentences and paragraphs.
Downloads 25
Release Time: 12/22/2023
Model Overview
This model applies the SimCSE method with mBERT as the base model and is trained on Thai Wikipedia data. It is suited to tasks such as sentence similarity calculation, clustering, and semantic search.
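As a minimal sketch of the sentence similarity workflow described above: the helper below computes cosine similarity between two embedding vectors, and `embed` shows one way the vectors might be obtained. The model ID `kornwtp/simcse-model-m-bert-thai-cased` and the use of the sentence-transformers library are assumptions; this page confirms neither.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # similarity is undefined for a zero vector; report 0
    return dot / (norm_a * norm_b)


def embed(sentences, model_id="kornwtp/simcse-model-m-bert-thai-cased"):
    """Encode sentences into a list of vectors.

    ASSUMPTION: model_id is a guess at the published model ID, and the
    checkpoint is assumed to load with sentence-transformers. The import
    is lazy so cosine_similarity works without the library installed.
    """
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(model_id)
    return model.encode(sentences).tolist()


# Example (needs sentence-transformers and network access):
#   vectors = embed(["ฉันชอบกินข้าว", "ฉันชอบทานอาหาร"])
#   score = cosine_similarity(vectors[0], vectors[1])
```

If the assumptions hold, each vector returned by `embed` should have the 768 dimensions stated in the description.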
Model Features
Thai language optimization
Trained specifically on Thai text for Thai sentence-embedding tasks
SimCSE training method
Trained with SimCSE, a contrastive learning framework, to make sentence representations more discriminative
Multilingual foundation
Built on the mBERT architecture, so it retains the ability to process multilingual text
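To illustrate the contrastive objective behind the SimCSE training method listed above, here is a rough sketch of an in-batch contrastive (InfoNCE-style) loss. Two embeddings of the same sentence are pulled together while the other sentences in the batch act as negatives. This page does not say which SimCSE variant or hyperparameters were used for this model, so the function name and the temperature of 0.05 are illustrative assumptions.

```python
import math


def _cosine(a, b):
    """Cosine similarity; assumes both vectors are non-zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def contrastive_loss(views_a, views_b, temperature=0.05):
    """Mean in-batch contrastive loss over paired embeddings.

    views_a[i] and views_b[i] are two embeddings of the same sentence.
    Every views_b[j] with j != i serves as a negative for views_a[i].
    ASSUMPTION: temperature=0.05 is illustrative, not taken from this model.
    """
    total = 0.0
    for i, anchor in enumerate(views_a):
        logits = [_cosine(anchor, other) / temperature for other in views_b]
        # Numerically stable log-sum-exp over the whole batch.
        peak = max(logits)
        log_denominator = peak + math.log(
            sum(math.exp(value - peak) for value in logits)
        )
        # Negative log-probability of picking the true partner.
        total += log_denominator - logits[i]
    return total / len(views_a)
```

The loss approaches zero when each sentence is most similar to its own second view and grows when a different sentence in the batch scores higher.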
Model Capabilities
Sentence vectorization
Semantic similarity calculation
Text clustering
Semantic search
Use Cases
Information retrieval
Thai document similarity search
Finding semantically similar documents in a Thai document repository
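A document similarity search of this kind can be sketched as ranking precomputed document embeddings by cosine similarity to a query embedding. The function names `normalize` and `top_k` are hypothetical helpers, and the vectors are assumed to come from the model's encoder.

```python
import math


def normalize(vector):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm else list(vector)


def top_k(query_vector, document_vectors, k=3):
    """Return (index, score) pairs for the k most similar documents."""
    query = normalize(query_vector)
    scored = []
    for index, doc in enumerate(document_vectors):
        # Dot product of unit vectors equals cosine similarity.
        score = sum(q * d for q, d in zip(query, normalize(doc)))
        scored.append((index, score))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```

This brute-force scan is linear in the number of documents. For a large repository, an approximate nearest-neighbor index would be the usual replacement.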
Text analysis
Thai text clustering
Automatically grouping large volumes of Thai text by semantic similarity
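One possible way to group texts is to run k-means over their sentence embeddings. The sketch below is a plain, dependency-free k-means, not a method prescribed by the model. The function name `kmeans`, the fixed iteration count, and the first-k initialization are illustrative choices.

```python
def kmeans(vectors, k, iterations=20):
    """Plain k-means; returns one cluster label per input vector.

    Centroids start at the first k vectors, so results are deterministic
    but depend on input order.
    """
    centroids = [list(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: nearest centroid by squared Euclidean distance.
        for i, vector in enumerate(vectors):
            distances = [
                sum((x - c) ** 2 for x, c in zip(vector, centroid))
                for centroid in centroids
            ]
            labels[i] = distances.index(min(distances))
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [v for v, label in zip(vectors, labels) if label == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels
```

Scaling the embeddings to unit length beforehand makes Euclidean distance rank pairs the same way cosine similarity does, which usually suits sentence embeddings better.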