# mt5_summarize_japanese
This model is a fine-tuned version of [google/mt5-small](https://huggingface.co/google/mt5-small) for Japanese text summarization, offering an effective way to quickly obtain summaries of Japanese content.
## Quick Start
The model was fine-tuned for Japanese summarization on BBC news articles (the XL-Sum Japanese dataset), where the first sentence (the headline sentence) of each article is used as the summary and the rest as the article body.
Therefore, please provide a news story (including, for example, the event, background, results, and comments) as the source text in the inference widget. Other kinds of text, such as conversations, business documents, academic papers, or short stories, do not appear in the training set.
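As a rough, hypothetical sketch of the headline-as-summary preprocessing described above, the snippet below loads the XL-Sum Japanese data and splits each article into a first-sentence summary and the remaining body. The dataset id (`csebuetnlp/xlsum`) and the sentence-splitting rule are assumptions, not taken from this card; the fine-tuning notebook linked under Training procedure below is the authoritative source.

```python
from datasets import load_dataset

# Assumed dataset id and config; the actual preprocessing lives in the
# fine-tuning notebook linked under "Training procedure" below.
ds = load_dataset("csebuetnlp/xlsum", "japanese", split="train")

def to_pairs(example):
    # Treat the first sentence as the headline-style summary and the rest as the article.
    sentences = [s for s in example["text"].split("。") if s]
    return {
        "summary": sentences[0] + "。",
        "article": "。".join(sentences[1:]) + "。",
    }

pairs = ds.map(to_pairs)
print(pairs[0]["summary"])
```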
It achieves the following results on the evaluation set:
| Property | Details |
|---|---|
| Loss | 1.8952 |
| Rouge1 | 0.4625 |
| Rouge2 | 0.2866 |
| Rougel | 0.3656 |
| Rougelsum | 0.3868 |
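The card does not state exactly how these ROUGE scores were computed. As a hedged sketch, scores of this kind can be obtained with the `evaluate` library; a character-level tokenizer is assumed here because ROUGE's default tokenization discards non-Latin text, and the strings below are illustrative, not taken from the evaluation set.

```python
import evaluate

rouge = evaluate.load("rouge")

# Illustrative prediction/reference pair (not from the actual evaluation set).
predictions = ["日本がドイツに2対1で逆転勝ちしました。"]
references = ["日本はワールドカップ初戦でドイツに2対1で逆転勝ちしました。"]

scores = rouge.compute(
    predictions=predictions,
    references=references,
    tokenizer=lambda text: list(text),  # character-level tokens so Japanese is scored
)
print(scores)  # {'rouge1': ..., 'rouge2': ..., 'rougeL': ..., 'rougeLsum': ...}
```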
## Features
- Language Focus: Specialized for Japanese text summarization.
- Fine-Tuned: Based on the [google/mt5-small](https://huggingface.co/google/mt5-small) model, fine-tuned on a Japanese news dataset (XL-Sum Japanese).
## Usage Examples
### Basic Usage
```python
from transformers import pipeline

# Load the summarization pipeline with this model.
seq2seq = pipeline("summarization", model="tsmatz/mt5_summarize_japanese")

# A news-style article: Japan's 2-1 comeback win over Germany at the Qatar World Cup.
sample_text = "サッカーのワールドカップカタール大会、世界ランキング24位でグループEに属する日本は、23日の1次リーグ初戦において、世界11位で過去4回の優勝を誇るドイツと対戦しました。試合は前半、ドイツの一方的なペースではじまりましたが、後半、日本の森保監督は攻撃的な選手を積極的に動員して流れを変えました。結局、日本は前半に1点を奪われましたが、途中出場の堂安律選手と浅野拓磨選手が後半にゴールを決め、2対1で逆転勝ちしました。ゲームの流れをつかんだ森保采配が功を奏しました。"

result = seq2seq(sample_text)
print(result)
```
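For finer control over generation (for example, summary length or beam search), the tokenizer and model can also be used directly, as sketched below. This reuses `sample_text` from the snippet above; the `max_length` and `num_beams` values are illustrative, not settings recommended by the author.

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_name = "tsmatz/mt5_summarize_japanese"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

# Encode the same news article used in the pipeline example above.
inputs = tokenizer(sample_text, return_tensors="pt", truncation=True, max_length=512)
summary_ids = model.generate(
    **inputs,
    max_length=60,  # illustrative cap on summary length
    num_beams=4,    # illustrative beam-search width
)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```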
## Documentation
### Training procedure
You can download the source code for fine-tuning from [here](https://github.com/tsmatz/huggingface-finetune-japanese/blob/master/02-summarize.ipynb).
### Training hyperparameters
The following hyperparameters were used during training:

| Property | Details |
|---|---|
| learning_rate | 0.0005 |
| train_batch_size | 2 |
| eval_batch_size | 1 |
| seed | 42 |
| gradient_accumulation_steps | 16 |
| total_train_batch_size | 32 |
| optimizer | Adam with betas=(0.9,0.999) and epsilon=1e-08 |
| lr_scheduler_type | linear |
| lr_scheduler_warmup_steps | 90 |
| num_epochs | 10 |
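As a rough sketch of how the values above map onto `Seq2SeqTrainingArguments`, see below. The output path and the `predict_with_generate` flag are placeholders; the actual configuration is in the fine-tuning notebook linked above.

```python
from transformers import Seq2SeqTrainingArguments

# Sketch only: mirrors the hyperparameter table above; anything not listed in
# the table is a placeholder, not taken from the original notebook.
training_args = Seq2SeqTrainingArguments(
    output_dir="./mt5_summarize_japanese",  # placeholder output directory
    learning_rate=5e-4,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=1,
    gradient_accumulation_steps=16,  # 2 x 16 = effective train batch size of 32
    warmup_steps=90,
    lr_scheduler_type="linear",
    num_train_epochs=10,
    seed=42,
    predict_with_generate=True,  # placeholder, typical for summarization evaluation
)
```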
### Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum |
|---|---|---|---|---|---|---|---|
| 4.2501 | 0.36 | 100 | 3.3685 | 0.3114 | 0.1654 | 0.2627 | 0.2694 |
| 3.6436 | 0.72 | 200 | 3.0095 | 0.3023 | 0.1634 | 0.2684 | 0.2764 |
| 3.3044 | 1.08 | 300 | 2.8025 | 0.3414 | 0.1789 | 0.2912 | 0.2984 |
| 3.2693 | 1.44 | 400 | 2.6284 | 0.3616 | 0.1935 | 0.2979 | 0.3132 |
| 3.2025 | 1.8 | 500 | 2.5271 | 0.3790 | 0.2042 | 0.3046 | 0.3192 |
| 2.9772 | 2.17 | 600 | 2.4203 | 0.4083 | 0.2374 | 0.3422 | 0.3542 |
| 2.9133 | 2.53 | 700 | 2.3863 | 0.3847 | 0.2096 | 0.3316 | 0.3406 |
| 2.9383 | 2.89 | 800 | 2.3573 | 0.4016 | 0.2297 | 0.3361 | 0.3500 |
| 2.7608 | 3.25 | 900 | 2.3223 | 0.3999 | 0.2249 | 0.3461 | 0.3566 |
| 2.7864 | 3.61 | 1000 | 2.2293 | 0.3932 | 0.2219 | 0.3297 | 0.3445 |
| 2.7846 | 3.97 | 1100 | 2.2097 | 0.4386 | 0.2617 | 0.3766 | 0.3826 |
| 2.7495 | 4.33 | 1200 | 2.1879 | 0.4100 | 0.2449 | 0.3481 | 0.3551 |
| 2.6092 | 4.69 | 1300 | 2.1515 | 0.4398 | 0.2714 | 0.3787 | 0.3842 |
| 2.5598 | 5.05 | 1400 | 2.1195 | 0.4366 | 0.2545 | 0.3621 | 0.3736 |
| 2.5283 | 5.41 | 1500 | 2.0637 | 0.4274 | 0.2551 | 0.3649 | 0.3753 |
| 2.5947 | 5.77 | 1600 | 2.0588 | 0.4454 | 0.2800 | 0.3828 | 0.3921 |
| 2.5354 | 6.14 | 1700 | 2.0357 | 0.4253 | 0.2582 | 0.3546 | 0.3687 |
| 2.5203 | 6.5 | 1800 | 2.0263 | 0.4444 | 0.2686 | 0.3648 | 0.3764 |
| 2.5303 | 6.86 | 1900 | 1.9926 | 0.4455 | 0.2771 | 0.3795 | 0.3948 |
| 2.4953 | 7.22 | 2000 | 1.9576 | 0.4523 | 0.2873 | 0.3869 | 0.4053 |
| 2.4271 | 7.58 | 2100 | 1.9384 | 0.4455 | 0.2811 | 0.3713 | 0.3862 |
| 2.4462 | 7.94 | 2200 | 1.9230 | 0.4530 | 0.2846 | 0.3754 | 0.3947 |
| 2.3303 | 8.3 | 2300 | 1.9311 | 0.4519 | 0.2814 | 0.3755 | 0.3887 |
| 2.3916 | 8.66 | 2400 | 1.9213 | 0.4598 | 0.2897 | 0.3688 | 0.3889 |
| 2.5995 | 9.03 | 2500 | 1.9060 | 0.4526 | 0.2820 | 0.3733 | 0.3946 |
| 2.3348 | 9.39 | 2600 | 1.9021 | 0.4595 | 0.2856 | 0.3762 | 0.3988 |
| 2.4035 | 9.74 | 2700 | 1.8952 | 0.4625 | 0.2866 | 0.3656 | 0.3868 |
### Framework versions

| Property | Details |
|---|---|
| Transformers | 4.23.1 |
| Pytorch | 1.12.1+cu102 |
| Datasets | 2.6.1 |
| Tokenizers | 0.13.1 |
## License
This model is licensed under the Apache-2.0 license.