Macbert4csc V2

Developed by Macropodus
macbert4csc_v2 is a model for Chinese spelling correction (CSC). It adds an error detection branch to BertForMaskedLM, uses a dedicated training strategy, and performs well on multiple evaluation datasets. It is suitable for text correction tasks across a range of domains.
Downloads 112
Release Time: 1/16/2025

Model Overview

This model is mainly used for Chinese spelling correction. It supports correction of texts from multiple domains, including classical Chinese, and handles common high-frequency errors such as confusion among the 'de' particles (地, 得, 的).

Model Features

Specific architecture design
An error detection branch (a classification task) is added after BertForMaskedLM, and training and inference use different strategies.
Efficient training strategy
Training uses MFT, which dynamically masks 0.2 (20%) of the non-error tokens, and the detection loss (det_loss) has a weight of 0.3.
Multi-domain applicability
The model is trained on data from multiple domains, so it is suitable as a pre-trained model and can be further fine-tuned on data from a specific domain.
Classical Chinese support
The training data includes classical Chinese data, supporting classical Chinese correction.
High-frequency error handling
It has a high recognition rate and correction rate for high-frequency errors such as misuse of the 'de' particles (地, 得, 的).
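
The training strategy described above can be sketched in plain Python. This is only an illustration under stated assumptions, not the author's training code: the function names (`mft_mask`, `detection_labels`, `combined_loss`) are hypothetical, tokens are treated as single characters, and the total loss is assumed to be the correction loss plus 0.3 times det_loss (the model card gives only the det_loss weight, not the full formula).

```python
import random

MASK = "[MASK]"

def mft_mask(source, target, mask_rate=0.2, rng=None):
    """Mask non-error positions of `source` with probability `mask_rate`.

    A position is "non-error" when the source character already equals
    the target character. Error positions are never masked.
    """
    if len(source) != len(target):
        raise ValueError("source and target must have the same length")
    if rng is None:
        rng = random.Random()
    masked = list(source)
    for i, (s, t) in enumerate(zip(source, target)):
        if s == t and rng.random() < mask_rate:
            masked[i] = MASK
    return masked

def detection_labels(source, target):
    """Per-character labels for the detection branch: 1 = error, 0 = correct."""
    return [int(s != t) for s, t in zip(source, target)]

def combined_loss(corr_loss, det_loss, det_weight=0.3):
    """Assumed form of the total loss: correction loss + det_weight * det_loss."""
    return corr_loss + det_weight * det_loss

if __name__ == "__main__":
    src = "少先队员因该为老人让坐"
    tgt = "少先队员应该为老人让坐"
    # Masking is re-sampled on every call, which is what makes it "dynamic".
    print(mft_mask(src, tgt))
    print(detection_labels(src, tgt))
```

Because the mask is drawn again each time a sentence is seen, the model cannot rely on simply copying the unchanged context characters.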

Model Capabilities

Chinese text spelling correction
Multi-domain text correction
Classical Chinese correction
High-frequency error recognition

Use Cases

General text correction
Daily text correction
Correct spelling errors in daily texts.
Example: '少先队员因该为老人让坐' → '少先队员应该为老人让坐'
Professional field correction
Correct spelling errors in texts in professional fields.
Example: '机七学习是人工智能领遇最能体现智能的一个分知' → '机器学习是人工智能领域最能体现智能的一个分支'
Specific error type handling
Correction of 'de' (地, 得, 的)
Handles the common misuse of the 'de' particles (地, 得, 的) in Chinese.
Example: '希望你们好好的跳无' → '希望你们好好地跳舞'
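
In each example above, the corrected sentence has the same length as the input and differs only by character substitutions. Assuming that holds for a given input/output pair, a small helper can list exactly which characters were changed. The function name `list_corrections` is hypothetical and is not part of the model's API; the sketch only post-processes two strings and does not call the model.

```python
def list_corrections(original, corrected):
    """Return (index, wrong_char, right_char) for every substituted character.

    Assumes the correction is substitution-only, so both strings must
    have the same length.
    """
    if len(original) != len(corrected):
        raise ValueError("original and corrected must have the same length")
    return [
        (i, o, c)
        for i, (o, c) in enumerate(zip(original, corrected))
        if o != c
    ]

if __name__ == "__main__":
    edits = list_corrections("希望你们好好的跳无", "希望你们好好地跳舞")
    for index, wrong, right in edits:
        print(f"position {index}: {wrong} -> {right}")
```

A list of this form is convenient for highlighting corrections in a user interface or for counting character-level edits during evaluation.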