RoBERTa Small Japanese Aozora
A small Japanese RoBERTa model pre-trained on Aozora Bunko texts, suitable for various downstream NLP tasks
Downloads: 19
Release Time: 3/2/2022
Model Overview
This is a RoBERTa model pre-trained on Japanese Aozora Bunko texts with the Japanese-LUW tokenizer; it can be used for masked language modeling and fine-tuned for downstream tasks
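A minimal masked-word prediction sketch using the transformers fill-mask pipeline, assuming the checkpoint is published on the Hugging Face Hub as KoichiYasuoka/roberta-small-japanese-aozora (the ID is inferred from the title above, not stated in this card):

```python
from transformers import pipeline

# Model ID assumed from the card title; adjust if the checkpoint lives elsewhere.
fill_mask = pipeline("fill-mask", model="KoichiYasuoka/roberta-small-japanese-aozora")

# Build the prompt with the tokenizer's own mask token so the example works
# regardless of whether the model expects [MASK] or <mask>.
sentence = f"日本に行ったら{fill_mask.tokenizer.mask_token}を見なさい。"

for candidate in fill_mask(sentence):
    print(candidate["token_str"], round(candidate["score"], 3))
```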
Model Features
Aozora Bunko pre-training
Pre-trained on texts from Aozora Bunko, Japan's digital library of public-domain literature, making it well suited to literary Japanese
Japanese-LUW tokenizer
Uses the LUW (Long Unit Word) tokenizer optimized for Japanese, improving segmentation of Japanese text; see the tokenization sketch after this list
Small model
A small variant suitable for deployment in resource-constrained environments
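To see how the tokenizer segments Japanese input, a small sketch under the same assumed model ID; the exact output depends on the trained vocabulary:

```python
from transformers import AutoTokenizer

# Model ID assumed from the card title.
tokenizer = AutoTokenizer.from_pretrained("KoichiYasuoka/roberta-small-japanese-aozora")

# Long-unit-word segmentation tends to keep compounds such as 国立国語研究所
# together rather than splitting them into short units; actual tokens depend
# on the vocabulary learned during pre-training.
print(tokenizer.tokenize("国立国語研究所"))
```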
Model Capabilities
Masked word prediction
Japanese text understanding
Downstream task fine-tuning (a minimal sketch follows this list)
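A minimal fine-tuning sketch, reusing the pre-trained encoder with a freshly initialized token-classification head; the model ID, label set, and dataset wiring are assumptions for illustration, not part of this card:

```python
from transformers import (
    AutoModelForTokenClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

model_id = "KoichiYasuoka/roberta-small-japanese-aozora"  # assumed Hub ID
labels = ["NOUN", "VERB", "ADJ", "OTHER"]                 # hypothetical tag set

tokenizer = AutoTokenizer.from_pretrained(model_id)
# Loading a masked-LM checkpoint into a token-classification class keeps the
# pre-trained encoder weights and initializes a new classification head.
model = AutoModelForTokenClassification.from_pretrained(
    model_id,
    num_labels=len(labels),
    id2label=dict(enumerate(labels)),
    label2id={label: i for i, label in enumerate(labels)},
)

args = TrainingArguments(output_dir="finetuned-aozora", num_train_epochs=3)
# trainer = Trainer(model=model, args=args,
#                   train_dataset=...,  # supply a tokenized, label-aligned dataset
#                   eval_dataset=...)
# trainer.train()
```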
Use Cases
Natural Language Processing
Part-of-speech tagging
Can be used for Japanese POS tagging tasks
Refer to the POS tagging model provided by the author; see the sketch at the end of this section
Dependency parsing
Can be used for Japanese dependency parsing tasks
Text completion
Predicts masked words in text
As in the fill-mask sketch above, it can predict a masked word such as a recommended sightseeing spot in Japan
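For the POS tagging use case, a sketch using the token-classification pipeline; the companion checkpoint name KoichiYasuoka/roberta-small-japanese-luw-upos is an assumption based on the author's naming scheme, not stated in this card:

```python
from transformers import pipeline

# The checkpoint name below is an assumption; substitute the author's
# actual POS tagging model built on this encoder.
tagger = pipeline(
    "token-classification",
    model="KoichiYasuoka/roberta-small-japanese-luw-upos",
    aggregation_strategy="simple",
)

for token in tagger("国境の長いトンネルを抜けると雪国であった。"):
    print(token["word"], token["entity_group"])
```

The dependency parsing use case is served by similar companion models from the same author; consult the author's model listings for the exact checkpoints.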