GreenNode Embedding Large VN V1

Developed by GreenNode
This is a sentence embedding model optimized for Vietnamese, capable of converting text into 1024-dimensional vectors, suitable for semantic similarity and retrieval tasks.
Downloads 785
Release Time: 4/11/2025

Model Overview

A sentence embedding model based on the XLM-RoBERTa architecture, specifically optimized for Vietnamese text, supporting tasks such as semantic similarity calculation, text retrieval, and clustering.
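As a minimal sketch of how a model like this is typically called, the snippet below wraps an encoder and checks that each output vector has the 1024 dimensions stated above. It assumes the model is published in a format that the third-party `sentence-transformers` package can load. The repository ID `ASSUMED_MODEL_ID` and the helper names `load_encoder` and `embed` are illustrative assumptions, not something this page confirms.

```python
from typing import Callable, List, Sequence

EXPECTED_DIM = 1024  # embedding size stated in the model description

# ASSUMPTION: hypothetical Hugging Face repository ID; verify before use.
ASSUMED_MODEL_ID = "GreenNode/GreenNode-Embedding-Large-VN-V1"

Encoder = Callable[[Sequence[str]], List[List[float]]]


def load_encoder(model_id: str = ASSUMED_MODEL_ID) -> Encoder:
    """Return a function that maps sentences to embedding vectors.

    Needs the `sentence-transformers` package. The import is lazy so the
    rest of this sketch can be exercised without installing it.
    """
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(model_id)

    def encode(sentences: Sequence[str]) -> List[List[float]]:
        # encode() returns a NumPy array by default; convert to plain lists.
        return model.encode(list(sentences)).tolist()

    return encode


def embed(sentences: Sequence[str], encoder: Encoder,
          expected_dim: int = EXPECTED_DIM) -> List[List[float]]:
    """Encode sentences and verify every vector has the expected size."""
    vectors = encoder(sentences)
    for vector in vectors:
        if len(vector) != expected_dim:
            raise ValueError(
                f"expected {expected_dim} dimensions, got {len(vector)}"
            )
    return vectors
```

With the package installed and the repository ID confirmed, usage would look like `embed(["Hà Nội là thủ đô của Việt Nam."], load_encoder())`.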

Model Features

Vietnamese optimization
Specially trained for Vietnamese text, outperforming general multilingual models in Vietnamese retrieval tasks.
Long text support
Supports sequences up to 8192 tokens, suitable for processing longer documents.
High-performance retrieval
Excels in multiple Vietnamese retrieval benchmarks, particularly in tabular retrieval tasks.

Model Capabilities

Semantic text similarity calculation
Semantic search
Text clustering
Text classification
Paraphrase mining
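Several of these capabilities reduce to comparing embedding vectors. The sketch below shows cosine similarity and a simple all-pairs paraphrase mining pass over vectors that are assumed to be already computed. The function names and the 0.9 threshold are illustrative assumptions, not values documented for this model; real vectors would have 1024 dimensions rather than the small toy vectors used here.

```python
import math
from itertools import combinations
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def mine_paraphrases(vectors: Sequence[Sequence[float]],
                     threshold: float = 0.9) -> List[Tuple[int, int, float]]:
    """Return (i, j, score) for every pair scoring at or above the threshold.

    Compares all pairs, so the cost grows quadratically with the number of
    sentences. The threshold is an illustrative choice; tune it on real data.
    """
    pairs = []
    for i, j in combinations(range(len(vectors)), 2):
        score = cosine_similarity(vectors[i], vectors[j])
        if score >= threshold:
            pairs.append((i, j, score))
    # Highest-scoring pairs first.
    return sorted(pairs, key=lambda pair: pair[2], reverse=True)


# Toy 2-dimensional vectors standing in for sentence embeddings.
toy_vectors = [[1.0, 0.0], [0.99, 0.1], [0.0, 1.0]]
candidate_pairs = mine_paraphrases(toy_vectors)
```

In this toy example only the first two vectors point in nearly the same direction, so they are the only pair returned.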

Use Cases

Information retrieval
Legal document retrieval
Quickly find relevant documents in legal text databases
Achieved an average performance of 74.95% on the Zac legal text retrieval dataset
Tabular data retrieval
Retrieve relevant information from structured tabular data
Achieved an average performance of 46.23% on the GreenNode tabular retrieval dataset
Question answering systems
Vietnamese question answering
Build retrieval components for Vietnamese question answering systems
Achieved an average performance of 56.86% on the VieQuAD dataset
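The retrieval component in these use cases can be sketched as a top-k search: embed the query, score it against pre-embedded passages, and return the best matches. The example below assumes the vectors already exist and uses brute-force scoring. The `retrieve` and `normalize` helpers are hypothetical names, and this is not the evaluation procedure behind the scores reported above.

```python
import heapq
import math
from typing import List, Sequence, Tuple


def normalize(vector: Sequence[float]) -> List[float]:
    """Scale a vector to unit length; zero vectors are returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm else list(vector)


def retrieve(query_vec: Sequence[float],
             passage_vecs: Sequence[Sequence[float]],
             k: int = 3) -> List[Tuple[int, float]]:
    """Return up to k (passage_index, score) pairs, best match first.

    Scores are dot products of unit-length vectors, which equals cosine
    similarity. Every passage is scored, so this suits small corpora only.
    """
    query = normalize(query_vec)
    scored = (
        (sum(q * p for q, p in zip(query, normalize(passage))), index)
        for index, passage in enumerate(passage_vecs)
    )
    return [(index, score) for score, index in heapq.nlargest(k, scored)]


# Toy 2-dimensional vectors standing in for passage and question embeddings.
passages = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
question = [1.0, 0.1]
top_hits = retrieve(question, passages, k=2)
```

For a larger corpus, the brute-force loop would normally be replaced by a vector index; the ranking logic stays the same.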