KoSimCSE-RoBERTa
A Korean sentence-embedding model based on the RoBERTa architecture, optimized for sentence representation through contrastive learning and suitable for tasks such as semantic-similarity calculation.
Downloads: 10.35k
Release date: 4/5/2022
Model Overview
This model is pre-trained on the RoBERTa architecture and optimizes its sentence-embedding representations with the SimCSE contrastive-learning method. It generates high-quality Korean sentence vectors for natural-language-processing tasks such as semantic search and text-similarity calculation.
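Comparing two sentence vectors from such a model typically comes down to cosine similarity. A minimal sketch, using toy vectors as stand-ins for the encoder's output (in practice the vectors would come from the model checkpoint, e.g. loaded via Hugging Face `transformers`):

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two sentence-embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy stand-ins for encoder output; real vectors would be produced by
# running tokenized Korean sentences through the model.
emb_a = np.array([0.2, 0.7, 0.1])
emb_b = np.array([0.2, 0.7, 0.1])
emb_c = np.array([0.9, -0.3, 0.4])

print(cosine_similarity(emb_a, emb_b))  # identical vectors -> 1.0
print(cosine_similarity(emb_a, emb_c))  # dissimilar vectors -> lower score
```

A score close to 1.0 indicates near-identical meaning; lower scores indicate divergence.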
Model Features
Efficient sentence embedding
Optimizes sentence representation through contrastive learning to generate high-quality sentence vectors
Multi-task learning
The multi-task version further enhances performance by combining multiple training objectives
High performance
Achieves state-of-the-art results on Korean semantic-similarity tasks
Model Capabilities
Sentence embedding generation
Semantic similarity calculation
Text retrieval
Sentence clustering
Use Cases
Information retrieval
Semantic search
Uses sentence vectors to retrieve semantically similar documents
Returns more relevant results than traditional keyword search
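Semantic search over a corpus reduces to ranking document vectors by cosine similarity to the query vector. A minimal sketch with toy 2-D vectors standing in for real embeddings:

```python
import numpy as np

def search(query_vec: np.ndarray, corpus_vecs: np.ndarray, top_k: int = 2) -> list:
    """Return indices of the top_k corpus vectors most similar to the query."""
    q = query_vec / np.linalg.norm(query_vec)
    c = corpus_vecs / np.linalg.norm(corpus_vecs, axis=1, keepdims=True)
    scores = c @ q                       # cosine similarity of each document to the query
    return np.argsort(-scores)[:top_k].tolist()

# Toy document embeddings; real ones would be model outputs for Korean documents.
corpus = np.array([[1.0, 0.0],
                   [0.8, 0.6],
                   [0.0, 1.0]])
query = np.array([1.0, 0.1])
print(search(query, corpus))  # -> [0, 1]
```

For large corpora, the same ranking is usually served from an approximate-nearest-neighbor index rather than a full matrix product.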
Text analysis
Text similarity calculation
Calculating semantic similarity between two Korean sentences
Achieves an average score of 85.77 on the test set