
RoBERTa Medium Word Chinese CLUECorpusSmall

Developed by uer
A Chinese word-level RoBERTa medium model pretrained on CLUECorpusSmall, outperforming character-level models on multiple tasks.
Downloads 33
Release Time : 3/2/2022

Model Overview

This model is a Chinese word-level RoBERTa pretrained model with a medium-scale architecture (8 layers, 512 hidden dimensions), trained on the CLUECorpusSmall corpus and suitable for a wide range of Chinese natural language processing tasks.

Model Features

Word-level Tokenization Advantage
Compared to character-level models, word-level processing yields shorter sequences, faster inference, and better performance on multiple tasks
Multiple Size Options
Offers five pretrained model sizes, from Tiny to Base
Open Training Process
Uses a publicly available corpus and tokenization tools, with complete training details provided for reproducibility
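The word-level advantage above can be illustrated with a toy comparison. The mini word vocabulary and greedy longest-match segmenter below are hypothetical stand-ins (the real model uses a learned sentencepiece vocabulary), but they show why word tokens produce shorter sequences than character tokens:

```python
def char_tokenize(text):
    """Character-level: one token per Chinese character."""
    return list(text)

def word_tokenize(text, vocab):
    """Greedy longest-match word segmentation over a fixed vocabulary.

    Falls back to a single character when no vocabulary word matches,
    mirroring how word-level tokenizers handle out-of-vocabulary spans.
    """
    tokens, i = [], 0
    while i < len(text):
        for j in range(len(text), i, -1):      # try the longest span first
            piece = text[i:j]
            if piece in vocab or j == i + 1:   # single-char fallback
                tokens.append(piece)
                i = j
                break
    return tokens

# Hypothetical mini-vocabulary for illustration only.
vocab = {"中文", "自然语言", "自然", "语言", "处理", "模型"}
sentence = "中文自然语言处理模型"

chars = char_tokenize(sentence)   # 10 tokens
words = word_tokenize(sentence, vocab)  # 4 tokens
print(len(chars), chars)
print(len(words), words)
```

Since self-attention cost grows quadratically with sequence length, cutting a 10-token character sequence to 4 word tokens directly translates into the speed advantage the model card claims.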

Model Capabilities

Chinese Text Understanding
Masked Word Prediction
Text Feature Extraction
Downstream Task Fine-tuning
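Masked word prediction works by scoring every vocabulary word at a masked position and returning the most probable candidates. A minimal sketch of that selection step, with made-up logits standing in for the model's output (real models emit one logit per vocabulary entry):

```python
import math

# Hypothetical logits a masked-LM head might emit for one masked position,
# keyed by candidate word. Values are invented for illustration.
logits = {"北京": 6.2, "上海": 5.1, "天气": 1.3, "模型": 0.2}

def softmax(scores):
    """Convert raw logits into a probability distribution."""
    m = max(scores.values())                      # subtract max for stability
    exps = {w: math.exp(s - m) for w, s in scores.items()}
    total = sum(exps.values())
    return {w: e / total for w, e in exps.items()}

def top_k(scores, k=2):
    """Return the k most probable fill-in candidates."""
    probs = softmax(scores)
    return sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:k]

for word, prob in top_k(logits, k=2):
    print(f"{word}: {prob:.3f}")
```

In practice this ranking is what a fill-mask inference call returns; the word-level vocabulary means each candidate is a whole word rather than a single character.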

Use Cases

Text Classification
Sentiment Analysis
Determining sentiment polarity in product reviews or social media texts
Achieved 95.1% accuracy in Chinese sentiment analysis tasks
News Classification
Automatically categorizing news articles by topic
Achieved 67.8% accuracy in CLUE news classification tasks
Text Matching
QA Systems
Determining relevance between questions and candidate answers
Achieved 88.0% accuracy in text matching tasks
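For the QA-style text matching described above, a common pattern is to extract sentence vectors with the pretrained encoder and score each candidate answer by cosine similarity to the question. The scoring step can be sketched with stand-in vectors (the embeddings here are invented; in practice they would come from the model's feature extraction output):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Stand-in sentence embeddings for one question and two candidate answers.
question = [0.8, 0.1, 0.5]
answers = {
    "answer_a": [0.7, 0.2, 0.6],
    "answer_b": [-0.3, 0.9, 0.1],
}

# Rank candidates by similarity to the question vector.
best = max(answers, key=lambda name: cosine(question, answers[name]))
print(best)  # the candidate whose embedding points the same way
```

The same ranking logic underlies retrieval-style matching regardless of embedding size; only the vectors come from the model.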