Vietnamese Bi Encoder

Developed by bkai-foundation-models
This is a sentence transformer model based on PhoBERT-base-v2, specifically designed for Vietnamese text semantic similarity tasks.
Downloads: 30.46k
Release date: 9/9/2023

Model Overview

The model maps Vietnamese sentences and paragraphs into a 768-dimensional dense vector space, suitable for natural language processing tasks such as clustering and semantic search.
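As a minimal sketch of how such embeddings are typically used for semantic search, the example below ranks documents by cosine similarity to a query vector. The toy 3-dimensional vectors stand in for the 768-dimensional embeddings. The `embed` helper is hypothetical: it assumes the `sentence-transformers` package is installed and that the model is published under the Hugging Face ID `bkai-foundation-models/vietnamese-bi-encoder`, neither of which is stated on this page.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def rank_by_similarity(
    query_vec: Sequence[float], doc_vecs: Sequence[Sequence[float]]
) -> List[Tuple[int, float]]:
    """Return (document index, score) pairs, most similar first."""
    scored = [(i, cosine_similarity(query_vec, d)) for i, d in enumerate(doc_vecs)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def embed(
    sentences: List[str],
    model_name: str = "bkai-foundation-models/vietnamese-bi-encoder",  # ASSUMPTION: model ID
) -> List[List[float]]:
    """Hypothetical helper: encode sentences with sentence-transformers.

    ASSUMPTION: PhoBERT-based models usually expect word-segmented Vietnamese
    input, so sentences may need to be segmented before calling this helper.
    """
    from sentence_transformers import SentenceTransformer  # imported lazily

    model = SentenceTransformer(model_name)
    return model.encode(sentences).tolist()


# Toy vectors in place of real embeddings, so the sketch runs without the model.
toy_docs = [[1.0, 0.0, 0.0], [0.6, 0.8, 0.0], [0.0, 0.0, 1.0]]
toy_query = [0.8, 0.6, 0.0]
ranking = rank_by_similarity(toy_query, toy_docs)
print(ranking)
```

With real data, `toy_docs` and `toy_query` would be replaced by the output of `embed(...)` for the document collection and the query.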

Model Features

Optimized Vietnamese processing
Based on PhoBERT-base-v2 and specifically optimized for Vietnamese text
Multi-dataset training
Trained on the MS MARCO, SQuAD v2, and Zalo Legal Text Retrieval Challenge datasets
High-performance semantic encoding
Reaches 73.28% Acc@1 on the Zalo Legal Text Retrieval task
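The page does not say how Acc@1 was computed. Under the common definition, it is the fraction of queries for which the top-ranked document is relevant. The sketch below implements that definition for a general cutoff k; the function name `accuracy_at_k` and the toy document IDs are illustrative assumptions, not part of the model's evaluation code.

```python
from typing import Hashable, Sequence, Set


def accuracy_at_k(
    ranked_ids: Sequence[Sequence[Hashable]],
    relevant_ids: Sequence[Set[Hashable]],
    k: int = 1,
) -> float:
    """Fraction of queries with at least one relevant document in the top k.

    ASSUMPTION: this is the common definition of Acc@k; the source does not
    describe the exact evaluation protocol.
    """
    if len(ranked_ids) != len(relevant_ids):
        raise ValueError("need one ranking and one relevant set per query")
    if not ranked_ids:
        return 0.0
    hits = sum(
        1
        for ranked, relevant in zip(ranked_ids, relevant_ids)
        if any(doc in relevant for doc in ranked[:k])
    )
    return hits / len(ranked_ids)


# Four toy queries: each has a ranked result list and a set of relevant IDs.
toy_rankings = [["d3", "d1"], ["d2", "d5"], ["d4", "d9"], ["d7", "d8"]]
toy_relevant = [{"d3"}, {"d5"}, {"d4"}, {"d1"}]

acc1 = accuracy_at_k(toy_rankings, toy_relevant, k=1)  # queries 1 and 3 hit
acc2 = accuracy_at_k(toy_rankings, toy_relevant, k=2)  # query 2 also hits
print(acc1, acc2)
```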

Model Capabilities

Sentence embedding
Semantic similarity calculation
Text clustering
Information retrieval
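Text clustering on top of sentence embeddings can be sketched as follows. This is a deliberately simple greedy scheme chosen for illustration, not a method described by the model's authors: each vector joins the first cluster whose representative it resembles closely enough, otherwise it starts a new cluster. The threshold value and the toy 2-dimensional vectors are assumptions.

```python
import math
from typing import List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def greedy_cluster(
    vectors: Sequence[Sequence[float]], threshold: float = 0.8
) -> List[List[int]]:
    """Group vector indices; the first index in each cluster is its representative."""
    clusters: List[List[int]] = []
    for i, vec in enumerate(vectors):
        for cluster in clusters:
            representative = vectors[cluster[0]]
            if cosine(vec, representative) >= threshold:
                cluster.append(i)
                break
        else:
            # No existing cluster was similar enough, so start a new one.
            clusters.append([i])
    return clusters


# Toy vectors in place of real sentence embeddings.
toy_vectors = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
clusters = greedy_cluster(toy_vectors, threshold=0.8)
print(clusters)
```

The result depends on input order and on the threshold; for larger collections an established algorithm such as k-means or agglomerative clustering is usually a better fit.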

Use Cases

Legal text retrieval
Legal document similarity search
Finding semantically similar documents in legal document libraries
Achieved an Acc@1 of 73.28% on the Zalo Legal Text Retrieval Challenge
Educational applications
Educational content retrieval
Searching for relevant learning materials in educational resource libraries