
Simcse Model M Bert Thai Cased

Developed by kornwtp
A SimCSE model based on mBERT and trained specifically for Thai. It generates 768-dimensional vector representations of sentences and paragraphs.
Downloads: 25
Release Time: 12/22/2023

Model Overview

This model applies the SimCSE method to an mBERT base model and is trained on Thai Wikipedia data. It is suited to tasks such as sentence similarity calculation, clustering, and semantic search.

Model Features

Thai language optimization
Trained specifically on Thai text and intended for Thai text processing tasks.
SimCSE training method
Trained with the SimCSE contrastive learning framework, which aims to make sentence representations more discriminative.
Multilingual foundation
Built on the mBERT architecture, which retains the ability to process multilingual text.
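
To make the contrastive objective concrete, here is a minimal sketch of the in-batch loss commonly used by SimCSE. It assumes the unsupervised variant, where each sentence is encoded twice with different dropout masks to form a positive pair and the other sentences in the batch act as negatives. The function names, the toy vectors, and the temperature of 0.05 are illustrative assumptions, not details confirmed for this model's training run.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def contrastive_loss(views_a, views_b, temperature=0.05):
    """In-batch contrastive (InfoNCE-style) loss.

    views_a[i] and views_b[i] are two encodings of the same sentence;
    every views_b[j] with j != i serves as a negative for views_a[i].
    The temperature value is an assumed default, not taken from the model card.
    """
    losses = []
    for i, a in enumerate(views_a):
        logits = [cosine(a, b) / temperature for b in views_b]
        # Numerically stable log-sum-exp over all candidates in the batch.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(x - peak) for x in logits))
        # Negative log-probability of picking the true positive (index i).
        losses.append(log_denominator - logits[i])
    return sum(losses) / len(losses)
```

The loss is small when each vector is closest to its own second view and grows when a different sentence in the batch is closer, which is what pushes representations of different sentences apart.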

Model Capabilities

Sentence vectorization
Semantic similarity calculation
Text clustering
Semantic search
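
Sentence vectorization and similarity calculation can be sketched as follows. This assumes the model is published on the Hugging Face Hub under the identifier kornwtp/simcse-model-m-bert-thai-cased and can be loaded with the sentence-transformers library; both are assumptions inferred from the model name and developer, so check the original model page before relying on them. The helper names encode and cosine_similarity are hypothetical.

```python
import math

# Assumed Hub identifier, inferred from the developer and model name above.
MODEL_ID = "kornwtp/simcse-model-m-bert-thai-cased"


def encode(sentences, model_id=MODEL_ID):
    """Return one embedding per sentence (768 dimensions for this model)."""
    # Imported lazily so cosine_similarity works without the library installed.
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(model_id)
    return model.encode(sentences)


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(float(a) * float(b) for a, b in zip(u, v))
    norm_u = math.sqrt(sum(float(a) ** 2 for a in u))
    norm_v = math.sqrt(sum(float(b) ** 2 for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)
```

For example, calling encode on two Thai sentences and passing the two resulting rows to cosine_similarity gives a score between -1 and 1, where higher values indicate closer meaning.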

Use Cases

Information retrieval
Thai document similarity search
Finding semantically similar documents in a Thai document repository
Text analysis
Thai text clustering
Automatic grouping of large amounts of Thai text by semantic similarity
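
Both use cases reduce to comparing embedding vectors. The sketch below assumes the documents have already been converted to vectors by the model and shows a brute-force similarity search and a simple greedy grouping. The function names, the top_k default, and the 0.8 threshold are illustrative assumptions; a production system would more likely use a vector index and an established clustering algorithm.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def search(query_vec, doc_vecs, top_k=3):
    """Return (document index, score) pairs for the top_k most similar documents."""
    scored = [(i, cosine_similarity(query_vec, d)) for i, d in enumerate(doc_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


def cluster(doc_vecs, threshold=0.8):
    """Greedy single-pass grouping.

    Each vector joins the first cluster whose seed vector is at least
    `threshold` similar; otherwise it becomes the seed of a new cluster.
    Returns one cluster label per input vector.
    """
    seeds = []
    labels = []
    for vec in doc_vecs:
        for cluster_id, seed in enumerate(seeds):
            if cosine_similarity(vec, seed) >= threshold:
                labels.append(cluster_id)
                break
        else:
            # No existing cluster was close enough, so start a new one.
            seeds.append(vec)
            labels.append(len(seeds) - 1)
    return labels
```

The greedy grouping depends on input order and on the chosen threshold, so it is best treated as a quick way to inspect the embedding space rather than a final clustering method.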