
Gte Base Ko

Developed by juyoungml
A Korean sentence embedding model for semantic similarity calculation, fine-tuned from Alibaba-NLP/gte-multilingual-base on a Korean triplet dataset
Downloads 18
Release Time: 11/17/2024

Model Overview

This is a sentence transformer model fine-tuned from Alibaba-NLP/gte-multilingual-base on the Korean triplet dataset nlpai-lab/ko-triplet-v1.0. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for tasks such as semantic textual similarity, semantic search, and text classification.
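
The following is a minimal sketch of how a model like this is typically loaded and used to compare two sentences with the sentence-transformers library. The repository ID `juyoungml/gte-base-ko` is an assumption inferred from the developer and model name on this page, and `trust_remote_code=True` is assumed to be required because the base model ships custom modeling code; check both against the model's own card before relying on them. The helper names `load_model` and `compare` are illustrative only.

```python
import math


def load_model(model_id="juyoungml/gte-base-ko"):
    """Load the embedding model. model_id is an ASSUMED Hub repository ID."""
    # Imported lazily so the helpers below work without the library installed.
    from sentence_transformers import SentenceTransformer

    # ASSUMPTION: the base model needs custom code, hence trust_remote_code=True.
    return SentenceTransformer(model_id, trust_remote_code=True)


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def compare(model, sentence_a, sentence_b):
    """Encode two sentences with `model` and return their cosine similarity."""
    # encode() returns one embedding per input sentence (768 values each here).
    embedding_a, embedding_b = model.encode([sentence_a, sentence_b])
    return cosine_similarity(embedding_a, embedding_b)
```

With the library installed and the model available, a call such as `compare(load_model(), "오늘 날씨가 정말 좋네요.", "오늘은 하늘이 맑고 화창합니다.")` would return a score between -1 and 1, with higher values indicating closer meaning.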

Model Features

Korean optimization
Optimized specifically for Korean text and fine-tuned on the Korean triplet dataset
Long text support
Supports a sequence length of up to 8192 tokens, suitable for processing long texts
High accuracy
Achieved a cosine accuracy of 98.55% on the evaluation dataset
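
For readers unfamiliar with the metric: cosine accuracy on a triplet dataset is commonly computed as the fraction of (anchor, positive, negative) triplets in which the anchor is more similar to the positive than to the negative under cosine similarity. The sketch below assumes that definition; the page does not state which evaluator produced the 98.55% figure, and the toy vectors are made up for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def triplet_cosine_accuracy(triplets):
    """Fraction of (anchor, positive, negative) triplets ranked correctly."""
    triplets = list(triplets)
    correct = sum(
        1
        for anchor, positive, negative in triplets
        if cosine_similarity(anchor, positive) > cosine_similarity(anchor, negative)
    )
    return correct / len(triplets)


# Hypothetical 2-dimensional embeddings; real ones would have 768 dimensions.
toy_triplets = [
    ([1.0, 0.0], [0.9, 0.1], [0.0, 1.0]),   # positive is closer: correct
    ([0.0, 1.0], [0.1, 0.9], [1.0, 0.0]),   # positive is closer: correct
    ([1.0, 1.0], [1.0, -1.0], [1.0, 0.9]),  # negative is closer: incorrect
]
accuracy = triplet_cosine_accuracy(toy_triplets)
print(f"cosine accuracy: {accuracy:.4f}")
```

Here two of the three toy triplets are ranked correctly, so the function returns 2/3.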

Model Capabilities

Semantic text similarity calculation
Semantic search
Text classification
Cluster analysis
Feature extraction

Use Cases

Information retrieval
Similar document retrieval
Find semantically similar documents based on the query text
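
As a rough sketch of this use case, retrieval reduces to ranking document embeddings by cosine similarity to the query embedding. The function name `search` and the 3-dimensional vectors are hypothetical; in practice the vectors would be the 768-dimensional embeddings produced by the model.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def search(query_embedding, document_embeddings, top_k=3):
    """Return the top_k (document index, score) pairs, best match first."""
    scored = [
        (index, cosine_similarity(query_embedding, embedding))
        for index, embedding in enumerate(document_embeddings)
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical embeddings standing in for encoded documents and a query.
document_embeddings = [
    [0.1, 0.9, 0.0],
    [0.8, 0.2, 0.1],
    [0.0, 0.1, 0.9],
]
query_embedding = [0.9, 0.1, 0.0]

results = search(query_embedding, document_embeddings, top_k=2)
for index, score in results:
    print(f"document {index}: {score:.3f}")
```

For large collections, an exhaustive scan like this is usually replaced by an approximate nearest-neighbor index, but the ranking criterion stays the same.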
Text analysis
Text clustering
Automatically group semantically similar texts
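
One simple way to group texts by meaning is greedy threshold grouping: each embedding joins the first existing group whose representative is similar enough, otherwise it starts a new group. This is a deliberately simple stand-in chosen for illustration, not a method described on this page; k-means or agglomerative clustering are common alternatives. The function name, the 0.8 threshold, and the toy vectors are all assumptions.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def group_by_similarity(embeddings, threshold=0.8):
    """Greedily group embeddings; returns a list of index lists."""
    clusters = []  # the first index in each cluster acts as its representative
    for index, embedding in enumerate(embeddings):
        for cluster in clusters:
            representative = embeddings[cluster[0]]
            if cosine_similarity(representative, embedding) >= threshold:
                cluster.append(index)
                break
        else:
            # No existing cluster was similar enough, so start a new one.
            clusters.append([index])
    return clusters


# Hypothetical 2-dimensional embeddings standing in for encoded texts.
embeddings = [[1.0, 0.0], [0.95, 0.05], [0.0, 1.0], [0.1, 0.9]]
clusters = group_by_similarity(embeddings)
print(clusters)
```

The result depends on input order and on the threshold, so this approach is best treated as a quick first pass over the embeddings.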