
LayoutLM Wikipedia Ja

Developed by jri-advtechlab
This is a LayoutLM model pre-trained on Japanese text, primarily used for token classification tasks in Japanese documents.
Downloads: 22
Release date: January 31, 2024

Model Overview

This model is a LayoutLM pre-trained on Japanese Wikipedia. It is intended mainly to be fine-tuned for token classification tasks, and it can also be used for masked language modeling.

Model Features

Japanese Text Processing
Pre-trained specifically for Japanese text, suitable for Japanese document processing tasks.
Layout-aware
Models both text content and layout information (e.g., bounding boxes), suitable for document understanding tasks.
BERT-based Architecture
Initialized from the cl-tohoku/bert-base-japanese-v2 model, inheriting BERT's strong language-understanding capabilities.
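Layout awareness means each token is paired with its bounding box on the page. LayoutLM expects these boxes normalized to a 0-1000 coordinate grid regardless of the page's pixel size. A minimal sketch of that convention (the helper name is our own, not part of this model's code):

```python
def normalize_bbox(bbox, page_width, page_height):
    """Scale an (x0, y0, x1, y1) pixel box to LayoutLM's 0-1000 grid."""
    x0, y0, x1, y1 = bbox
    return (
        int(1000 * x0 / page_width),
        int(1000 * y0 / page_height),
        int(1000 * x1 / page_width),
        int(1000 * y1 / page_height),
    )

# e.g. a text box on a page rendered at 595x842 pixels
print(normalize_bbox((119, 84, 297, 105), 595, 842))  # (200, 99, 499, 124)
```

One such normalized box is passed per token alongside the token IDs, which is how the model combines textual and spatial signals.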

Model Capabilities

Token Classification
Masked Language Modeling
Document Layout Understanding

Use Cases

Document Information Extraction
Wikipedia Information Extraction
Extract structured information from Japanese Wikipedia pages
Achieved a macro F1 score of 55.1451 in the SHINRA 2022 shared task
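Macro F1, the metric reported above, is the unweighted mean of per-class F1 scores, so rare entity classes count as much as frequent ones. A small self-contained sketch of the metric on toy token-classification labels (not SHINRA data):

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over the classes present in y_true."""
    f1s = []
    for c in sorted(set(y_true)):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        f1s.append(2 * tp / denom if denom else 0.0)
    return sum(f1s) / len(f1s)

# toy labels: two entity classes plus the "O" (outside) tag
y_true = ["PER", "PER", "LOC", "O", "O", "O"]
y_pred = ["PER", "LOC", "LOC", "O", "O", "PER"]
print(round(macro_f1(y_true, y_pred) * 100, 2))
```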