
RoBERTa Base Japanese Aozora Char

Developed by KoichiYasuoka
A RoBERTa model pretrained on Aozora Bunko texts using a character tokenizer, suitable for Japanese text processing tasks.
Downloads: 50
Release date: 3/2/2022

Model Overview

This is a RoBERTa model pretrained on Aozora Bunko texts using a character tokenizer. It can be fine-tuned for downstream tasks such as part-of-speech tagging and dependency parsing.
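Since the model is pretrained with a masked-language-modeling objective, it can be queried directly with the standard Hugging Face fill-mask pipeline. A minimal sketch, assuming the `transformers` library is installed; the model ID is taken from this card, and the example sentence is an illustrative assumption:

```python
# Sketch: masked-character prediction with the standard fill-mask pipeline.
# Requires network access to download the checkpoint on first run.
from transformers import AutoTokenizer, AutoModelForMaskedLM, pipeline

model_id = "KoichiYasuoka/roberta-base-japanese-aozora-char"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForMaskedLM.from_pretrained(model_id)

fill = pipeline("fill-mask", model=model, tokenizer=tokenizer)
# Because the tokenizer is character-level, [MASK] stands for one character.
for candidate in fill("日本に着いたら[MASK]を訪ねなさい。")[:3]:
    print(candidate["token_str"], round(candidate["score"], 3))
```

Each candidate is a dictionary with the predicted token string and its score, ranked by probability.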

Model Features

Based on Aozora Bunko
Pretrained on texts from Japan's Aozora Bunko, making it suitable for processing Japanese literary works.
Character-level Tokenization
Uses a character-level tokenizer, which avoids the need for word segmentation in Japanese, a language written without spaces between words.
Adaptable to Downstream Tasks
Can be fine-tuned for various natural language processing tasks such as part-of-speech tagging and dependency parsing.
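To make the character-level feature concrete, the sketch below shows what character tokenization means in principle: every Unicode character becomes one token, so no segmentation dictionary is needed. This is an illustration only, not the model's actual tokenizer implementation:

```python
# Illustrative sketch (not the model's real tokenizer): character-level
# tokenization treats each character as one token.
def char_tokenize(text: str) -> list[str]:
    """Split text into single-character tokens, skipping whitespace."""
    return [ch for ch in text if not ch.isspace()]

tokens = char_tokenize("吾輩は猫である")
print(tokens)  # ['吾', '輩', 'は', '猫', 'で', 'あ', 'る']
```

A word-level tokenizer would first have to decide where words begin and end (e.g. 吾輩 / は / 猫 / で / ある), which requires a dictionary or a learned segmenter; the character-level approach sidesteps that step entirely.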

Model Capabilities

Japanese Text Understanding
Masked Language Modeling
Part-of-Speech Tagging
Dependency Parsing

Use Cases

Natural Language Processing
Part-of-Speech Tagging
Assigns part-of-speech tags to Japanese text after fine-tuning
Dependency Parsing
Analyzes the syntactic structure of Japanese sentences by identifying head-dependent relations
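For tasks like part-of-speech tagging, the pretrained checkpoint is typically loaded with a token-classification head and then fine-tuned on labeled data. A hedged sketch using the standard `transformers` API; the label set below is an illustrative assumption, not one documented by this card:

```python
# Sketch: attaching a token-classification head for fine-tuning
# (e.g. POS tagging). Requires network access on first run.
from transformers import AutoTokenizer, AutoModelForTokenClassification

model_id = "KoichiYasuoka/roberta-base-japanese-aozora-char"
labels = ["NOUN", "VERB", "ADJ", "ADP", "PUNCT"]  # assumed label set

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForTokenClassification.from_pretrained(
    model_id,
    num_labels=len(labels),
    id2label=dict(enumerate(labels)),
    label2id={label: i for i, label in enumerate(labels)},
)
# The classification head is randomly initialized on top of the pretrained
# RoBERTa encoder; train it with transformers.Trainer or a custom loop.
```

Because tokenization is character-level, each character receives its own label during fine-tuning, so word-level tag schemes must be mapped onto characters (for example with BIO-style prefixes).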