
Bge M3 Korean

Developed by upskyy
A Korean-optimized sentence embedding model based on BAAI/bge-m3. It produces 1024-dimensional vector representations and is suited to tasks such as semantic similarity calculation.
Downloads 7,823
Release Time: 8/9/2024

Model Overview

This model is a Korean sentence embedding model, obtained by fine-tuning BAAI/bge-m3 on the korsts and kornli datasets. It maps text to a 1024-dimensional vector space for tasks such as semantic textual similarity, semantic search, and text classification.

Model Features

Optimized Korean understanding
Fine-tuned on Korean datasets (korsts and kornli) to perform well on Korean semantic understanding tasks
Long text support
Supports sequences up to 8192 tokens, suitable for processing long documents and paragraphs
High-quality embeddings
Generates 1024-dimensional dense vector representations, performing well across various similarity metrics
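To make the embedding and similarity features concrete, here is a minimal sketch of how two sentence vectors might be compared with cosine similarity. The `embed` helper is hypothetical: it assumes the sentence-transformers package is installed and that the model is published under the hub ID `upskyy/bge-m3-korean`, neither of which is stated on this page. The toy three-dimensional vectors stand in for real 1024-dimensional embeddings so that the similarity function can be run without downloading anything.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def embed(sentences: List[str], model_id: str = "upskyy/bge-m3-korean") -> List[List[float]]:
    """Hypothetical helper: encode sentences with sentence-transformers.

    Assumption: the package is installed and model_id is the correct hub ID.
    The import is deferred so the rest of this sketch runs without it.
    """
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(model_id)
    return model.encode(sentences).tolist()


# Toy vectors standing in for real 1024-dimensional embeddings.
u = [1.0, 2.0, 2.0]
v = [2.0, 4.0, 4.0]   # same direction as u
w = [-2.0, 1.0, 0.0]  # orthogonal to u

sim_uv = cosine_similarity(u, v)
sim_uw = cosine_similarity(u, w)
print(sim_uv, sim_uw)
```

With real embeddings, one would call `embed([...])` on two Korean sentences and pass the resulting vectors to `cosine_similarity`; a score near 1 suggests similar meaning, a score near 0 suggests unrelated content.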

Model Capabilities

Semantic text similarity calculation
Semantic search
Text classification
Clustering analysis
Paraphrase mining
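Several of the capabilities listed above (semantic search, paraphrase mining) reduce to ranking stored vectors by similarity to a query vector. The following is a rough sketch of that ranking step only, under the assumption that embeddings have already been computed elsewhere. The function names and the toy three-dimensional vectors are illustrative and do not come from the model's own tooling.

```python
import math
from typing import List, Sequence, Tuple


def normalize(vec: Sequence[float]) -> List[float]:
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def top_k(query: Sequence[float],
          corpus: Sequence[Sequence[float]],
          k: int = 3) -> List[Tuple[int, float]]:
    """Return (corpus index, cosine score) pairs for the k best matches."""
    q = normalize(query)
    scored = []
    for idx, doc in enumerate(corpus):
        d = normalize(doc)
        scored.append((idx, sum(x * y for x, y in zip(q, d))))
    # Highest similarity first.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy vectors standing in for precomputed document embeddings.
corpus_vectors = [
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [0.7, 0.7, 0.0],
]
query_vector = [1.0, 0.0, 0.0]

hits = top_k(query_vector, corpus_vectors, k=2)
for index, score in hits:
    print(index, round(score, 3))
```

For larger corpora, a linear scan like this is usually replaced by a vector index; the ranking logic stays the same.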

Use Cases

Information retrieval
Similar document retrieval
Finding semantically similar documents in a document repository
Pearson correlation of cosine similarity scores reaches 0.874
Q&A systems
Question matching
Matching user questions with similar questions in a knowledge base
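The Pearson figure quoted under information retrieval measures how well the model's cosine similarity scores track human similarity judgments over a set of sentence pairs. As a sketch of how such a number is typically computed, the snippet below correlates predicted cosine scores with gold labels. All values are made up for illustration (the gold scores assume a 0 to 5 rating scale) and are not taken from the model's evaluation.

```python
import math
from typing import Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length with at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0.0 or var_y == 0.0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical data: one cosine score per sentence pair, and the
# corresponding human rating for the same pair.
predicted_cosine = [0.92, 0.15, 0.60, 0.35]
gold_scores = [4.8, 0.5, 3.0, 1.5]

score = pearson(predicted_cosine, gold_scores)
print(round(score, 3))
```

Because Pearson correlation is invariant to linear rescaling, the cosine scores and the gold ratings do not need to share the same range.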