
KoSimCSE-RoBERTa

Developed by: BM-K
A Korean sentence-embedding model based on the RoBERTa architecture, optimized for sentence representation through contrastive learning and suitable for tasks such as semantic similarity calculation.
Downloads: 10.35k
Release date: 4/5/2022

Model Overview

This model is pre-trained on the RoBERTa architecture and refines its sentence-embedding representations with the SimCSE contrastive-learning method. It generates high-quality Korean sentence vectors for natural language processing tasks such as semantic search and text-similarity calculation.
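As a rough illustration of how such a model is typically used, the sketch below encodes sentences with the Hugging Face `transformers` library and compares them by cosine similarity. The model id `BM-K/KoSimCSE-roberta` and the `[CLS]`-token pooling strategy are assumptions based on common SimCSE practice, not guaranteed details of this release; `demo()` is a hypothetical helper and downloads the model when called.

```python
import numpy as np


def cos_sim(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two 1-D vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


def encode(sentences, model, tokenizer):
    """Encode sentences to [CLS]-token embeddings.

    The [CLS] pooling strategy is an assumption; check the model card
    for the pooling the authors actually intend.
    """
    import torch

    inputs = tokenizer(sentences, padding=True, truncation=True,
                       return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)
    # Take the last hidden state of the first ([CLS]) token per sentence.
    return outputs.last_hidden_state[:, 0].numpy()


def demo():
    """Hypothetical usage; requires `pip install transformers torch`
    and a network connection to download the model."""
    from transformers import AutoModel, AutoTokenizer

    name = "BM-K/KoSimCSE-roberta"  # assumed Hugging Face model id
    tokenizer = AutoTokenizer.from_pretrained(name)
    model = AutoModel.from_pretrained(name)
    embs = encode(["치타가 들판을 가로질러 달린다.",
                   "치타 한 마리가 먹이 뒤에서 달리고 있다."], model, tokenizer)
    return cos_sim(embs[0], embs[1])
```

The cosine score returned here is what "semantic similarity calculation" means in practice: vectors of related sentences point in similar directions, so their cosine approaches 1.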

Model Features

Efficient sentence embedding: optimizes sentence representations through contrastive learning to generate high-quality sentence vectors.
Multi-task learning: the multi-task variant further improves performance by combining multiple training objectives.
High performance: achieves state-of-the-art results on Korean semantic-similarity tasks.
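To make the contrastive-learning objective concrete, the following numpy sketch computes the InfoNCE loss that SimCSE is built on: for each sentence in a batch, a second view of the same sentence is the positive and all other sentences are negatives. This is a minimal illustration of the technique, not the authors' training code; the temperature value 0.05 is the common SimCSE default and is an assumption here.

```python
import numpy as np


def simcse_info_nce(z1: np.ndarray, z2: np.ndarray,
                    temperature: float = 0.05) -> float:
    """SimCSE-style InfoNCE loss over two views of the same batch.

    z1, z2: (batch, dim) embeddings; row i of z2 is the positive for
    row i of z1, and every other row is a negative.
    """
    # L2-normalize so dot products become cosine similarities.
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    sim = z1 @ z2.T / temperature  # (batch, batch) similarity matrix
    # Cross-entropy with the diagonal (matching pairs) as the target class.
    log_softmax = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_softmax)))
```

Minimizing this loss pulls the two views of each sentence together while pushing apart unrelated sentences, which is what produces embeddings useful for similarity tasks.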

Model Capabilities

Sentence embedding generation
Semantic similarity calculation
Text retrieval
Sentence clustering

Use Cases

Information retrieval
Semantic search: use sentence vectors to retrieve similar documents, yielding more relevant results than traditional keyword search.
Text analysis
Text similarity calculation: compute the semantic similarity between two Korean sentences; the model achieves an average score of 85.77 on the test set.
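The semantic-search use case above reduces to nearest-neighbor ranking over sentence vectors. The toy sketch below ranks a corpus by cosine similarity to a query; the random-looking vectors stand in for real model embeddings, and the `search` helper is illustrative, not part of any published API.

```python
import numpy as np


def search(query_vec: np.ndarray, corpus_vecs: np.ndarray, top_k: int = 3):
    """Return (corpus_index, cosine_score) pairs, best match first."""
    q = query_vec / np.linalg.norm(query_vec)
    c = corpus_vecs / np.linalg.norm(corpus_vecs, axis=1, keepdims=True)
    scores = c @ q                      # cosine similarity per corpus row
    order = np.argsort(scores)[::-1][:top_k]
    return [(int(i), float(scores[i])) for i in order]
```

In a real pipeline the corpus embeddings would be precomputed once with the model, so each query costs only one encoding pass plus a matrix-vector product.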