
RoBERTa Small Japanese Aozora

Developed by KoichiYasuoka
A small Japanese RoBERTa model pre-trained on Aozora Bunko texts, suitable for various downstream NLP tasks
Downloads 19
Release Time: 3/2/2022

Model Overview

This is a RoBERTa model pre-trained on Japanese Aozora Bunko texts with the Japanese-LUW tokenizer. It can be used for masked language modeling and fine-tuned for downstream tasks such as POS tagging and dependency parsing.
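A minimal loading sketch with Hugging Face transformers, assuming the checkpoint is published under the repo id KoichiYasuoka/roberta-small-japanese-aozora:

```python
from transformers import AutoTokenizer, AutoModelForMaskedLM

# Assumed Hugging Face repo id for this checkpoint.
tokenizer = AutoTokenizer.from_pretrained("KoichiYasuoka/roberta-small-japanese-aozora")
model = AutoModelForMaskedLM.from_pretrained("KoichiYasuoka/roberta-small-japanese-aozora")
```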

Model Features

Aozora Bunko pre-training
Pre-trained on texts from Aozora Bunko, a digital library of Japanese literature, making it well suited to literary Japanese
Japanese-LUW tokenizer
Utilizes the LUW (Long Unit Word) tokenizer optimized for Japanese, improving segmentation of Japanese text (see the tokenization sketch after this list)
Small model
A compact variant suitable for deployment in resource-constrained environments
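To see how the LUW tokenizer segments a sentence, a short sketch (repo id as above; the example sentence is illustrative):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("KoichiYasuoka/roberta-small-japanese-aozora")
# LUW segmentation targets Long Unit Words rather than plain subword pieces.
print(tokenizer.tokenize("国境の長いトンネルを抜けると雪国であった。"))
```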

Model Capabilities

Masked word prediction
Japanese text understanding
Downstream task fine-tuning

Use Cases

Natural Language Processing
Part-of-speech tagging
Can be fine-tuned for Japanese POS tagging
The author provides a ready-made POS tagging model (see the sketch below)
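A sketch using the transformers token-classification pipeline; the repo id of the author's companion UPOS model is an assumption:

```python
from transformers import pipeline

# Assumed id of the author's POS model fine-tuned from this checkpoint.
tagger = pipeline("token-classification",
                  model="KoichiYasuoka/roberta-small-japanese-luw-upos")
for token in tagger("国境の長いトンネルを抜けると雪国であった。"):
    print(token["word"], token["entity"])
```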
Dependency parsing
Can be fine-tuned for Japanese dependency parsing (see the sketch below)
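The same author maintains the esupar library for tokenizing, tagging, and dependency parsing with models of this family; a sketch under the assumption that its default Japanese loader is used:

```python
import esupar  # pip install esupar; parser library by the same author

# "ja" loads the library's default Japanese model (the model choice is an assumption).
nlp = esupar.load("ja")
doc = nlp("国境の長いトンネルを抜けると雪国であった。")
print(doc)  # CoNLL-U style dependency tree
```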
Text completion
Predicts masked words in text
For example, given a masked sentence about travelling in Japan, the model can suggest tourist destinations to visit (see the sketch below)
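A masked-word prediction sketch with the fill-mask pipeline; the prompt is illustrative, and the mask token is taken from the tokenizer rather than hard-coded:

```python
from transformers import pipeline

fill = pipeline("fill-mask", model="KoichiYasuoka/roberta-small-japanese-aozora")
# Build the prompt with the model's own mask token.
text = "日本に着いたら" + fill.tokenizer.mask_token + "を訪問してください。"
for pred in fill(text):
    print(pred["token_str"], round(pred["score"], 3))
```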