Bert Base Han Chinese Ws
This model provides word segmentation functionality for classical Chinese, with training datasets covering four historical periods of Chinese language development.
Downloads 14
Release Time: 7/1/2022
Model Overview
A BERT-based word segmentation model for classical Chinese texts, with training data spanning Ancient through Modern Chinese.
Model Features
Historical Chinese Support
Training data covers four developmental periods of Chinese: Ancient, Middle, Early Modern, and Modern.
Academic-grade Corpus
Trained on annotated corpora from the Institute of Linguistics, Academia Sinica
BERT Architecture
Uses the BERT-base architecture, whose context-dependent character representations inform each segmentation decision
Model Capabilities
Chinese Word Segmentation
Historical Chinese Processing
Sequence Labeling
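Word segmentation framed as sequence labeling usually means assigning one tag to each character and then grouping characters into words. The sketch below illustrates only that grouping step. It assumes a two-tag scheme in which "B" marks the first character of a word and "I" marks a continuation; the tag names, the helper `decode_bi_tags`, and the hand-written tags in the usage lines are illustrative assumptions, not output or API of this model.

```python
from typing import List


def decode_bi_tags(chars: List[str], tags: List[str]) -> List[str]:
    """Group characters into words from per-character B/I tags.

    Assumption: "B" starts a new word, any other tag continues the
    current word. A continuation tag at position 0 is treated as "B".
    """
    if len(chars) != len(tags):
        raise ValueError("chars and tags must have the same length")

    words: List[str] = []
    for ch, tag in zip(chars, tags):
        if tag == "B" or not words:
            # Start a new word.
            words.append(ch)
        else:
            # Extend the word currently being built.
            words[-1] += ch
    return words


# Hand-written tags for illustration only (not model predictions).
example_chars = list("天下大亂")
example_tags = ["B", "I", "B", "B"]
segmented = decode_bi_tags(example_chars, example_tags)
print(" ".join(segmented))
```

In a real pipeline the tag list would come from the model's per-character predictions; the grouping logic stays the same as long as the label scheme matches the assumption above.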
Use Cases
Academic Research
Ancient Text Analysis
Automatic word segmentation for ancient Chinese texts
Identifies word boundaries in classical Chinese
Language Evolution Studies
Compares word segmentation characteristics across historical periods
Assists linguists in studying the historical evolution of Chinese
Digital Humanities
Ancient Text Digitization
Provides preprocessing support for digitizing classical texts
Enhances searchability and analyzability of ancient texts
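One way segmentation can help searchability is by letting a digitized collection be indexed by word rather than by raw character string. The following is a minimal sketch of a word-level inverted index, assuming the texts have already been segmented into word lists. The function `build_inverted_index`, the document IDs, and the sample segmentations are hypothetical and hand-written for illustration; they are not produced by this model.

```python
from collections import defaultdict
from typing import Dict, List, Set


def build_inverted_index(segmented_docs: Dict[str, List[str]]) -> Dict[str, Set[str]]:
    """Map each word to the set of document IDs that contain it."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for doc_id, words in segmented_docs.items():
        for word in words:
            index[word].add(doc_id)
    return dict(index)


# Hypothetical, hand-segmented documents (illustration only).
docs = {
    "doc_a": ["天下", "大", "亂"],
    "doc_b": ["天下", "太平"],
}
index = build_inverted_index(docs)

# Look up which documents contain a given word.
print(sorted(index.get("天下", set())))
```

Indexing by word means a query for a multi-character word matches only documents where the segmenter treated those characters as one unit, so retrieval quality depends on segmentation quality.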