Zh Wiki Punctuation Restore
A tool for restoring six common punctuation marks in Chinese Wikipedia text.
Sequence Labeling
Tags: Transformers, Supports Multiple Languages, Punctuation Restoration, Chinese Text Processing, Wikipedia Optimization

Downloads 102.99k
Release Time: 1/31/2023
Model Overview
This model restores punctuation in Chinese Wikipedia text: given unpunctuated Chinese text, it automatically inserts commas, enumeration commas, periods, question marks, exclamation marks, and semicolons.
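The page lists the model under sequence labeling, which suggests that each character receives a label saying which mark, if any, follows it. The sketch below shows how such labels could be turned back into punctuated text. The label names (`O`, `COMMA`, and so on) and the helper `apply_labels` are hypothetical and chosen for illustration; the model's real label set may differ.

```python
# Hypothetical label scheme (an assumption, not the model's documented labels):
# "O" means no punctuation follows the character; any other label names
# the mark to insert right after it.
LABEL_TO_MARK = {
    "O": "",
    "COMMA": "，",
    "ENUM_COMMA": "、",
    "PERIOD": "。",
    "QUESTION": "？",
    "EXCLAMATION": "！",
    "SEMICOLON": "；",
}


def apply_labels(chars, labels):
    """Rebuild punctuated text from characters and one label per character."""
    if len(chars) != len(labels):
        raise ValueError("chars and labels must have the same length")
    return "".join(ch + LABEL_TO_MARK[label] for ch, label in zip(chars, labels))


text = "你好嗎我很好"
labels = ["O", "O", "QUESTION", "O", "O", "PERIOD"]
restored = apply_labels(text, labels)
print(restored)  # 你好嗎？我很好。
```

Keeping the mapping in a single dictionary makes it easy to swap in the label names a real checkpoint actually emits.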
Model Features
Multi-Punctuation Support
Restores six common Chinese punctuation marks: comma (，), enumeration comma (、), period (。), question mark (？), exclamation mark (！), and semicolon (；).
Wikipedia Optimization
Optimized for Chinese Wikipedia text, so restoration is expected to be most accurate on text written in that style.
Sliding Window Processing
Processes long texts with a sliding window, so documents longer than a single model input can still be punctuated.
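A sliding window over a long text can be sketched as follows. This is a minimal illustration, not the model's actual implementation: the function names, the default `window_size` and `step` values, and the rule that the first prediction wins in overlapping regions are all assumptions made for the example.

```python
def sliding_windows(text, window_size=256, step=200):
    """Yield (start, chunk) pairs that cover the whole text.

    window_size and step are illustrative values, not taken from the model.
    step <= window_size guarantees that no character is skipped.
    """
    if window_size <= 0 or step <= 0 or step > window_size:
        raise ValueError("require 0 < step <= window_size")
    start = 0
    while True:
        yield start, text[start:start + window_size]
        if start + window_size >= len(text):
            break
        start += step


def merge_labels(total_len, windowed_labels):
    """Merge per-window label lists into one list for the full text.

    Assumption: where windows overlap, the earliest prediction is kept.
    """
    merged = [None] * total_len
    for start, labels in windowed_labels:
        for offset, label in enumerate(labels):
            if merged[start + offset] is None:
                merged[start + offset] = label
    return merged


text = "abcdefghij"
windows = list(sliding_windows(text, window_size=4, step=3))
# Stand-in for the model: label every character "O" (no punctuation).
predictions = [(start, ["O"] * len(chunk)) for start, chunk in windows]
merged = merge_labels(len(text), predictions)
print(windows)  # [(0, 'abcd'), (3, 'defg'), (6, 'ghij')]
```

Other merge rules are possible, for example preferring the later window because characters near the end of a window have less right-hand context.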
Model Capabilities
Chinese punctuation restoration
Text normalization
Long text processing
Use Cases
Text Processing
Wikipedia Text Normalization
Adding punctuation to unpunctuated Wikipedia texts
Makes the text more readable and brings it in line with publishing standards.
OCR Post-Processing
Processing Chinese texts that lost their punctuation during OCR
Improves the readability of OCR texts.
Data Preprocessing
NLP Task Preprocessing
Preparing normalized texts for downstream NLP tasks
Improves the results of downstream NLP tasks.
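As one example of such preprocessing, restored punctuation makes it possible to split text into sentences before passing it to a downstream task. The sketch below assumes that only the period, question mark, and exclamation mark end a sentence; `split_sentences` is a hypothetical helper, and it relies on Python 3.7 or later, where `re.split` accepts patterns that match an empty string.

```python
import re


def split_sentences(text):
    """Split after each sentence-ending mark, keeping the mark with its sentence."""
    # The lookbehind matches the empty position right after 。, ？ or ！.
    parts = re.split(r"(?<=[。？！])", text)
    return [part for part in parts if part]


sentences = split_sentences("你好嗎？我很好。")
print(sentences)  # ['你好嗎？', '我很好。']
```

Text without any sentence-ending mark comes back as a single item, so the helper is safe to call on input that has not been punctuated yet.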