RoBERTa Small Japanese Aozora Char
A RoBERTa model pretrained on Aozora Bunko texts using a character tokenizer, suitable for Japanese text processing tasks.
Release time: 3/2/2022
Model Overview
This is a RoBERTa model pretrained on Aozora Bunko texts using a character tokenizer. It can be fine-tuned for downstream tasks such as part-of-speech tagging and dependency parsing.
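Fine-tuning for part-of-speech tagging can be framed as token classification over characters. The sketch below is a minimal, hedged setup; the hub ID and the label set are assumptions for illustration, not details taken from this card.

```python
# Minimal sketch: prepare the model for POS tagging as token classification.
# MODEL_ID and POS_TAGS are illustrative assumptions, not confirmed values.
from transformers import AutoTokenizer, AutoModelForTokenClassification

MODEL_ID = "KoichiYasuoka/roberta-small-japanese-aozora-char"  # assumed hub ID
POS_TAGS = ["NOUN", "VERB", "ADP", "ADJ", "PUNCT"]  # illustrative subset

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForTokenClassification.from_pretrained(
    MODEL_ID,
    num_labels=len(POS_TAGS),
    id2label=dict(enumerate(POS_TAGS)),
    label2id={tag: i for i, tag in enumerate(POS_TAGS)},
)
# From here, train with Trainer on character-aligned POS labels.
```

Because the tokenizer is character-level, gold POS labels must be aligned to individual characters rather than to words before training.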
Model Features
Character-level tokenization
Tokenizes input at the character level, which works well for Japanese, where words are not separated by spaces
Aozora Bunko pretraining
Pretrained on Aozora Bunko texts, which cover both classical and modern Japanese
Downstream task adaptation
Can be fine-tuned for various downstream NLP tasks such as part-of-speech tagging and dependency parsing
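To make the character-level feature concrete, the following self-contained sketch shows what such a tokenizer does: every character gets its own vocabulary ID, and the input is wrapped in special tokens. The special tokens and ID assignments here are illustrative assumptions, not the model's actual vocabulary.

```python
# Sketch of character-level tokenization; special tokens and IDs are
# illustrative assumptions, not the model's real vocabulary.
SPECIAL_TOKENS = ["[CLS]", "[SEP]", "[MASK]", "[PAD]", "[UNK]"]

def build_vocab(corpus):
    """Map each special token, then each unique character, to an integer ID."""
    vocab = {tok: i for i, tok in enumerate(SPECIAL_TOKENS)}
    for text in corpus:
        for ch in text:
            if ch not in vocab:
                vocab[ch] = len(vocab)
    return vocab

def encode(text, vocab):
    """Split text into characters and wrap it in [CLS] ... [SEP]."""
    ids = [vocab["[CLS]"]]
    ids += [vocab.get(ch, vocab["[UNK]"]) for ch in text]
    ids.append(vocab["[SEP]"])
    return ids

corpus = ["吾輩は猫である"]
vocab = build_vocab(corpus)
print(encode("吾輩は猫である", vocab))  # one ID per character, plus [CLS]/[SEP]
```

Because every character is its own token, there are no out-of-vocabulary words: any Japanese sentence decomposes into characters the vocabulary already contains (or `[UNK]` for unseen characters).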
Model Capabilities
Masked language modeling
Japanese text understanding
Text feature extraction
Use Cases
Natural Language Processing
Part-of-speech tagging
Can be used for part-of-speech tagging tasks in Japanese text
Dependency parsing
Can be used to analyze the syntactic structure of Japanese text
Text completion
Can be used for masked prediction and completion of Japanese text
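The masked-prediction use case can be exercised with the Hugging Face `fill-mask` pipeline. The hub ID below is an assumption about where the checkpoint is hosted, and the example sentence is purely illustrative.

```python
# Hedged sketch: predict a masked character with the fill-mask pipeline.
# The hub ID is an assumption; substitute the actual checkpoint path.
from transformers import pipeline

fill_mask = pipeline(
    "fill-mask",
    model="KoichiYasuoka/roberta-small-japanese-aozora-char",
)

# Mask one character and let the model rank candidate completions.
results = fill_mask("日本に着いたら[MASK]を訪ねなさい。")
for r in results[:5]:
    print(r["token_str"], round(r["score"], 3))
```

Each result is a dict with the predicted token, its score, and the completed sequence; with a character tokenizer, each prediction fills in exactly one character.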