Sup SimCSE VietNamese Phobert Base

Developed by VoVanPhuc
SimeCSE_Vietnamese is a Vietnamese sentence embedding model trained with the SimCSE method on top of the PhoBERT pretrained language model. The method works with both unlabeled and labeled data.
Downloads 25.51k
Release Time: 3/2/2022

Model Overview

SimeCSE_Vietnamese is a sentence embedding model for Vietnamese. Its training is optimized with contrastive learning so that it produces high-quality sentence vector representations.

Model Features

Contrastive Learning Based on SimCSE
Adopts SimCSE's contrastive learning method to optimize the pretraining process and improve the quality of sentence embeddings.
Supports Unlabeled and Labeled Data
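The SimCSE method can be trained on unlabeled data (unsupervised) or on labeled data (supervised), and the resulting embeddings are intended to generalize well. The "Sup" in this model's name indicates the supervised variant.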
Pretraining Based on PhoBERT
Uses PhoBERT as the pretrained language model, fully leveraging the linguistic characteristics of Vietnamese.
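
The contrastive learning feature listed above can be illustrated with a small sketch. This is not the author's training code. It is a pure-Python illustration of an in-batch contrastive (InfoNCE-style) loss of the kind SimCSE is built on: each anchor sentence vector should be closer to its own positive than to the positives of the other sentences in the batch. The function names and the temperature value of 0.05 are assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def info_nce_loss(anchors, positives, temperature=0.05):
    """Mean in-batch contrastive loss.

    anchors[i] and positives[i] form a positive pair; every positives[j]
    with j != i acts as a negative for anchors[i].
    The temperature default is an illustrative assumption.
    """
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [cosine(anchor, p) / temperature for p in positives]
        # Numerically stable log-sum-exp over all candidates in the batch.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(x - peak) for x in logits))
        # Cross-entropy with the matching positive as the correct class.
        losses.append(log_denominator - logits[i])
    return sum(losses) / len(losses)
```

The loss is close to zero when every anchor is most similar to its own positive, and it grows when an anchor is more similar to another sentence's positive.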

Model Capabilities

Generate Vietnamese sentence embeddings
Sentence similarity calculation
Text retrieval
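
A minimal sketch of the first two capabilities follows, under several assumptions: that the model is published on the Hugging Face Hub under the ID shown, that it can be loaded with the sentence-transformers package, and that the input is word-segmented first (PhoBERT expects segmented Vietnamese; pyvi is used here as one possible segmenter). The helper names `embed` and `cosine_similarity` are hypothetical.

```python
import math

# Assumed Hub ID, inferred from the model title and developer name.
MODEL_ID = "VoVanPhuc/sup-SimCSE-VietNamese-phobert-base"


def embed(sentences, model_id=MODEL_ID):
    """Return one embedding per sentence.

    Third-party packages are imported lazily so that cosine_similarity
    below can be used without them. The first call downloads the model.
    """
    from pyvi.ViTokenizer import tokenize  # word segmentation (assumed preprocessing)
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(model_id)
    segmented = [tokenize(s) for s in sentences]
    return model.encode(segmented)


def cosine_similarity(u, v):
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(float(a) * float(b) for a, b in zip(u, v))
    norm_u = math.sqrt(sum(float(a) ** 2 for a in u))
    norm_v = math.sqrt(sum(float(b) ** 2 for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)
```

For example, `a, b = embed(["Hà Nội là thủ đô của Việt Nam.", "Thủ đô của Việt Nam là Hà Nội."])` followed by `cosine_similarity(a, b)` would give a score in the range -1 to 1, with higher values indicating closer meaning.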

Use Cases

Text Similarity
Sentence Similarity Calculation
Calculate the similarity between two Vietnamese sentences.
Information Retrieval
Vietnamese Text Retrieval
Used to retrieve Vietnamese documents most relevant to the query sentence.
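
The retrieval use case can be sketched as a ranking of document vectors by cosine similarity to a query vector. The sketch below assumes the query and document embeddings have already been computed with the model; the function names and the toy two-dimensional vectors in the usage note are hypothetical and stand in for real sentence embeddings.

```python
import math


def cosine(u, v):
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def retrieve(query_vec, doc_vecs, k=3):
    """Return up to k (document_index, score) pairs, most similar first."""
    scored = [(idx, cosine(query_vec, doc)) for idx, doc in enumerate(doc_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```

With toy vectors, `retrieve([1.0, 0.1], [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], k=2)` ranks document 0 first and document 2 second. A brute-force scan like this is adequate for small collections; a larger corpus would normally use an approximate nearest-neighbor index instead.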