
RoBERTa Small Japanese Aozora

Developed by KoichiYasuoka
A small Japanese RoBERTa model pre-trained on Aozora Bunko texts, suitable for various downstream NLP tasks
Downloads 19
Release Time: 3/2/2022

Model Overview

This is a RoBERTa model pre-trained on Japanese Aozora Bunko texts with the Japanese-LUW tokenizer. It can be used for masked language modeling and fine-tuned for downstream tasks such as POS tagging and dependency parsing.
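A minimal loading sketch with Hugging Face transformers, assuming the checkpoint is published under the repo id KoichiYasuoka/roberta-small-japanese-aozora:

```python
from transformers import AutoTokenizer, AutoModelForMaskedLM

# Assumed Hugging Face repo id for this checkpoint.
tokenizer = AutoTokenizer.from_pretrained("KoichiYasuoka/roberta-small-japanese-aozora")
model = AutoModelForMaskedLM.from_pretrained("KoichiYasuoka/roberta-small-japanese-aozora")
```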

Model Features

Aozora Bunko pre-training
Pre-trained on texts from Aozora Bunko, a digital library of Japanese literature, making it well suited to literary Japanese
Japanese-LUW tokenizer
Utilizes the LUW (Long Unit Word) tokenizer optimized for Japanese, improving segmentation of Japanese text (see the tokenization sketch after this list)
Small model
A compact variant suitable for deployment in resource-constrained environments
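To see how the LUW tokenizer segments a sentence, a short sketch (repo id as above; the example sentence is illustrative):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("KoichiYasuoka/roberta-small-japanese-aozora")
# LUW segmentation targets Long Unit Words rather than plain subword pieces.
print(tokenizer.tokenize("国境の長いトンネルを抜けると雪国であった。"))
```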

Model Capabilities

Masked word prediction
Japanese text understanding
Downstream task fine-tuning

Use Cases

Natural Language Processing
Part-of-speech tagging
Can be fine-tuned for Japanese POS tagging
The author provides a ready-made POS tagging model (see the sketch below)
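A sketch using the transformers token-classification pipeline; the repo id of the author's companion UPOS model is an assumption:

```python
from transformers import pipeline

# Assumed id of the author's POS model fine-tuned from this checkpoint.
tagger = pipeline("token-classification",
                  model="KoichiYasuoka/roberta-small-japanese-luw-upos")
for token in tagger("国境の長いトンネルを抜けると雪国であった。"):
    print(token["word"], token["entity"])
```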
Dependency parsing
Can be fine-tuned for Japanese dependency parsing (see the sketch below)
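The same author maintains the esupar library for tokenizing, tagging, and dependency parsing with models of this family; a sketch under the assumption that its default Japanese loader is used:

```python
import esupar  # pip install esupar; parser library by the same author

# "ja" loads the library's default Japanese model (the model choice is an assumption).
nlp = esupar.load("ja")
doc = nlp("国境の長いトンネルを抜けると雪国であった。")
print(doc)  # CoNLL-U style dependency tree
```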
Text completion
Predicts masked words in text
For example, given a masked sentence about travelling in Japan, the model can suggest tourist destinations to visit (see the sketch below)
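A masked-word prediction sketch with the fill-mask pipeline; the prompt is illustrative, and the mask token is taken from the tokenizer rather than hard-coded:

```python
from transformers import pipeline

fill = pipeline("fill-mask", model="KoichiYasuoka/roberta-small-japanese-aozora")
# Build the prompt with the model's own mask token.
text = "日本に着いたら" + fill.tokenizer.mask_token + "を訪問してください。"
for pred in fill(text):
    print(pred["token_str"], round(pred["score"], 3))
```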