RoBERTa Base Japanese Aozora Char
A RoBERTa model pretrained on Aozora Bunko texts using a character tokenizer, suitable for Japanese text processing tasks.
Release Time: 3/2/2022
Model Overview
This is a RoBERTa model pretrained on Aozora Bunko texts using a character tokenizer. It can be fine-tuned for downstream tasks such as part-of-speech tagging and dependency parsing.
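As a sketch of how a masked-language-modeling query against this checkpoint might look with the Hugging Face transformers library (the repo id below is an assumption; verify the exact id on the hub, and note that with a character tokenizer each masked position corresponds to exactly one character):

```python
# Sketch: masked language modeling with this model via transformers.
# MODEL_ID is an assumed repo id -- verify it on the model hub before use.
from transformers import pipeline

MODEL_ID = "KoichiYasuoka/roberta-base-japanese-aozora-char"  # assumed

def mask_char(text: str, index: int, mask_token: str = "[MASK]") -> str:
    """Replace the character at `index` with the mask token.
    With a character tokenizer, one character is one token, so this
    masks exactly one token. The mask token string may differ per
    checkpoint -- check tokenizer.mask_token on the loaded tokenizer."""
    return text[:index] + mask_token + text[index + 1:]

if __name__ == "__main__":
    fill_mask = pipeline("fill-mask", model=MODEL_ID)  # downloads weights
    masked = mask_char("日本に着きました", 2)  # "日本[MASK]着きました"
    for candidate in fill_mask(masked)[:3]:
        print(candidate["token_str"], candidate["score"])
```

The helper is only for illustration; in practice you would take the mask token from the loaded tokenizer rather than hard-coding it.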
Model Features
Based on Aozora Bunko
Pretrained on texts from Aozora Bunko, Japan's digital library of public-domain literature, making it well suited to processing Japanese literary text.
Character-level Tokenization
Uses a character-level tokenizer, so every kanji, kana, and punctuation mark becomes its own token; this sidesteps word-segmentation and out-of-vocabulary issues in Japanese's mixed writing system.
Adaptable to Downstream Tasks
Can be fine-tuned for various natural language processing tasks such as part-of-speech tagging and dependency parsing.
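To make the character-level idea concrete, here is a minimal toy tokenizer, purely illustrative and not the tokenizer shipped with this checkpoint, showing how each character maps to its own vocabulary id:

```python
# Toy character-level tokenizer sketch. This is NOT the model's real
# tokenizer (use the one bundled with the checkpoint); it only shows why
# character tokenization suits Japanese: every character -- kanji, kana,
# punctuation -- is one token, so there are no out-of-vocabulary pieces
# for characters seen during vocabulary construction.
class CharTokenizer:
    def __init__(self, texts, unk_token="[UNK]"):
        self.unk_token = unk_token
        chars = sorted({ch for text in texts for ch in text})
        self.vocab = {unk_token: 0}
        for ch in chars:
            self.vocab[ch] = len(self.vocab)
        self.ids_to_chars = {i: ch for ch, i in self.vocab.items()}

    def tokenize(self, text):
        # Character tokenization is simply splitting into characters.
        return list(text)

    def encode(self, text):
        unk = self.vocab[self.unk_token]
        return [self.vocab.get(ch, unk) for ch in text]

    def decode(self, ids):
        return "".join(self.ids_to_chars.get(i, self.unk_token) for i in ids)

# Build a vocabulary from a sample sentence and round-trip a substring.
tok = CharTokenizer(["国境の長いトンネルを抜けると雪国であった。"])
ids = tok.encode("雪国")
```

A real character tokenizer additionally handles special tokens ([CLS], [SEP], [MASK]) and normalization, but the core mapping is this simple.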
Model Capabilities
Japanese Text Understanding
Masked Language Modeling
Part-of-Speech Tagging
Dependency Parsing
Use Cases
Natural Language Processing
Part-of-Speech Tagging
Performs part-of-speech tagging on Japanese text after fine-tuning
Reported to achieve good tagging accuracy
Dependency Parsing
Analyzes the syntactic structure of Japanese sentences after fine-tuning
Reported to parse dependency relations in Japanese sentences effectively
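When fine-tuning a character-tokenized model for token classification such as POS tagging, word-level labels have to be spread over the characters of each word. A common convention (an assumption here, not necessarily what this model's fine-tuned derivatives use) is a BIO-style scheme with one label per character:

```python
# Sketch: expanding word-level POS labels to one label per character,
# as needed when fine-tuning a character-tokenized model for token
# classification. The "B-"/"I-" prefixes are a common convention,
# assumed for illustration.
def char_level_labels(words, pos_tags):
    """Expand (word, tag) pairs into parallel per-character lists:
    the first character of a word gets "B-<tag>", the rest "I-<tag>"."""
    chars, labels = [], []
    for word, tag in zip(words, pos_tags):
        for i, ch in enumerate(word):
            chars.append(ch)
            labels.append(("B-" if i == 0 else "I-") + tag)
    return chars, labels

# "国境" (border, NOUN) + "の" (ADP) becomes three character-level labels.
chars, labels = char_level_labels(["国境", "の"], ["NOUN", "ADP"])
```

The per-character labels then line up one-to-one with the model's input tokens, which is exactly what makes character tokenizers convenient for this kind of fine-tuning.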