Vi Word Segmentation

Developed by NlpHUST
A Vietnamese word segmentation model based on the ELECTRA architecture and fine-tuned on the VLSP 2013 dataset, providing high-precision word segmentation for Vietnamese text.
Downloads 1,756
Release Time: 10/30/2022

Model Overview

This model is designed for Vietnamese word segmentation. It identifies word boundaries in Vietnamese text, where a single word often spans several space-separated syllables, and is suitable as a preprocessing step in natural language processing pipelines.
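As a rough illustration of how such a model might be used for preprocessing, the sketch below merges token-level tags into underscore-joined words. The tag scheme ("B" starts a word, "I" continues it), the "##" sub-word prefix, and the model ID `NlpHUST/vi-word-segmentation` are assumptions not stated on this page; `merge_tagged_tokens` and `segment` are hypothetical helper names, and the sample tags are hand-written rather than real model output.

```python
from typing import List


def merge_tagged_tokens(tagged: List[dict]) -> str:
    """Join tagged pieces into text where multi-syllable words use underscores.

    Assumed tag scheme: "B" starts a word, "I" continues the previous word.
    Pieces starting with "##" are treated as sub-word continuations.
    """
    out = ""
    for item in tagged:
        piece, tag = item["word"], item["entity"]
        if piece.startswith("##"):
            out += piece[2:]                      # glue sub-word piece to the previous piece
        elif tag == "I" and out:
            out += "_" + piece                    # next syllable of the same word
        else:
            out += (" " if out else "") + piece   # first syllable of a new word
    return out


def segment(text: str, model_id: str = "NlpHUST/vi-word-segmentation") -> str:
    """Tag text with a token-classification pipeline and merge the result.

    Needs the `transformers` package and a model download; model_id is an assumption.
    """
    from transformers import pipeline  # lazy import so the helper above works alone

    tagger = pipeline("token-classification", model=model_id)
    return merge_tagged_tokens(tagger(text))


# Hand-written tags for illustration only (not real model output).
sample = [
    {"word": "Quốc", "entity": "B"},
    {"word": "hội", "entity": "I"},
    {"word": "thảo", "entity": "B"},
    {"word": "luận", "entity": "I"},
    {"word": "về", "entity": "B"},
    {"word": "kinh", "entity": "B"},
    {"word": "tế", "entity": "I"},
]
print(merge_tagged_tokens(sample))  # Quốc_hội thảo_luận về kinh_tế
```

Joining syllables with underscores keeps each multi-syllable word as a single whitespace-delimited token, which is convenient for downstream tools that split on spaces.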

Model Features

High-precision segmentation: Achieves a 98.35% F1 score on the VLSP 2013 evaluation set.
Based on the ELECTRA architecture: Uses an efficient pre-trained ELECTRA model as its base, giving better contextual understanding.
Domain adaptation: Performs strongly on government documents and socio-economic texts.
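To make the F1 figure concrete, here is a minimal sketch of one common way to score word segmentation: a predicted word counts as correct only if its syllable span exactly matches a gold word. The page does not say how the reported score was computed, so this is an assumed definition; `word_spans` and `segmentation_f1` are hypothetical names, and the two sentences are made-up examples.

```python
from typing import Set, Tuple


def word_spans(segmented: str) -> Set[Tuple[int, int]]:
    """Map underscore-joined words to (start, end) syllable index spans."""
    spans, pos = set(), 0
    for word in segmented.split():
        n = len(word.split("_"))      # number of syllables in this word
        spans.add((pos, pos + n))
        pos += n
    return spans


def segmentation_f1(gold: str, predicted: str) -> float:
    """Span-level F1: harmonic mean of word precision and word recall."""
    g, p = word_spans(gold), word_spans(predicted)
    correct = len(g & p)              # words with identical boundaries
    if correct == 0:
        return 0.0
    precision = correct / len(p)
    recall = correct / len(g)
    return 2 * precision * recall / (precision + recall)


gold = "Quốc_hội thảo_luận về kinh_tế xã_hội"
pred = "Quốc_hội thảo_luận về kinh_tế xã hội"
print(round(segmentation_f1(gold, pred), 3))  # 0.727
```

Here the prediction splits one gold word in two, so 4 of 6 predicted words and 4 of 5 gold words match, giving F1 = 8/11.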

Model Capabilities

Vietnamese text segmentation
Terminology recognition
Compound word identification

Use Cases

Government document processing
Parliamentary document analysis: Automatically segments Vietnamese parliamentary discussion documents, accurately handling the professional terms and compound words found in government documents.
Socio-economic research
Socio-economic report processing: Automatically processes Vietnamese socio-economic situation reports, correctly identifying specialized vocabulary from the economic domain.