T5 Base Korean Summarization
This is a Korean text summarization model based on the T5 architecture. It was created by fine-tuning the paust/pko-t5-base model on multiple Korean summarization datasets.
Downloads 148.32k
Release Time: 1/14/2023
Model Overview
This model is mainly used for the automatic summarization of Korean texts, capable of extracting key information from longer Korean texts to generate concise summaries.
Model Features
Multi-dataset fine-tuning
The model is fine-tuned on three Korean summarization datasets: paper summaries, book summaries, and summary statement and report generation.
Dedicated Korean support
Fine-tuned from pko-t5-base, a T5 model pretrained specifically for Korean.
Flexible summary length control
The minimum and maximum lengths of the generated summaries can be controlled through the min_length and max_length generation parameters, as shown in the sketch below.
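A minimal sketch of that length control, assuming the published checkpoint eenzeenee/t5-base-korean-summarization and a placeholder input text; the generation settings are a simplified variant of the Quick Start example further below:

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "eenzeenee/t5-base-korean-summarization"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Placeholder document; replace with the Korean text to summarize.
text = "summarize: " + "요약하고 싶은 한국어 본문을 여기에 넣습니다."
inputs = tokenizer(text, max_length=512, truncation=True, return_tensors="pt")

# min_length / max_length bound the generated summary length (in tokens).
summary_ids = model.generate(**inputs, num_beams=3, min_length=10, max_length=64)
print(tokenizer.batch_decode(summary_ids, skip_special_tokens=True)[0])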
Model Capabilities
Korean text understanding
Text summarization generation
Key information extraction
Use Cases
Academic research
Paper summary generation
Automatically generate summaries of Korean academic papers.
ROUGE-2-F score: 0.172
Publishing industry
Book content summary
Generate concise content summaries for Korean books.
ROUGE-2-F score: 0.265
Business reports
Report summary generation
Extract key information from long business reports to generate summaries.
ROUGE-2-F score: 0.177
t5-base-korean-summarization
This is a T5 model designed for Korean text summarization. It addresses the need for efficient summarization of Korean texts, offering a practical solution for users dealing with large volumes of Korean language data.
Features
- This model is fine-tuned from the 'paust/pko-t5-base' model.
- It is fine-tuned on 3 datasets:
  - Korean Paper Summarization Dataset (논문자료 요약)
  - Korean Book Summarization Dataset (도서자료 요약)
  - Korean Summary statement and Report Generation Dataset (요약문 및 레포트 생성 데이터)
Quick Start
Prerequisites
Make sure you have nltk and transformers installed. You can install them using the following commands:
pip install nltk
pip install transformers
Example Code
import nltk
nltk.download('punkt')  # sentence tokenizer used to post-process the decoded output
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Load the fine-tuned summarization model and its tokenizer.
model = AutoModelForSeq2SeqLM.from_pretrained('eenzeenee/t5-base-korean-summarization')
tokenizer = AutoTokenizer.from_pretrained('eenzeenee/t5-base-korean-summarization')
prefix = "summarize: "
sample = """
안녕하세요? 우리 (2학년)/(이 학년) 친구들 우리 친구들 학교에 가서 진짜 (2학년)/(이 학년) 이 되고 싶었는데 학교에 못 가고 있어서 답답하죠?
그래도 우리 친구들의 안전과 건강이 최우선이니까요 오늘부터 선생님이랑 매일 매일 국어 여행을 떠나보도록 해요.
어/ 시간이 벌써 이렇게 됐나요? 늦었어요. 늦었어요. 빨리 국어 여행을 떠나야 돼요.
그런데 어/ 국어여행을 떠나기 전에 우리가 준비물을 챙겨야 되겠죠? 국어 여행을 떠날 준비물, 교안을 어떻게 받을 수 있는지 선생님이 설명을 해줄게요.
(EBS)/(이비에스) 초등을 검색해서 들어가면요 첫화면이 이렇게 나와요.
자/ 그러면요 여기 (X)/(엑스) 눌러주(고요)/(구요). 저기 (동그라미)/(똥그라미) (EBS)/(이비에스) (2주)/(이 주) 라이브특강이라고 되어있죠?
거기를 바로 가기를 누릅니다. 자/ (누르면요)/(눌르면요). 어떻게 되냐? b/ 밑으로 내려와요 내려와요 내려와요 쭉 내려와요.
우리 몇 학년이죠? 아/ (2학년)/(이 학년) 이죠 (2학년)/(이 학년)은 무슨 과목? 국어.
이번주는 (1주)/(일 주) 차니까요 여기 교안. 다음주는 여기서 다운을 받으면 돼요.
이 교안을 클릭을 하면, 짜잔/. 이렇게 교재가 나옵니다. 이 교안을 (다운)/(따운)받아서 우리 국어여행을 떠날 수가 있어요.
그럼 우리 진짜로 국어 여행을 한번 떠나보도록 해요? 국어여행 출발. 자/ (1단원)/(일 단원) 제목이 뭔가요? 한번 찾아봐요.
시를 즐겨요 에요. 그냥 시를 읽어요 가 아니에요. 시를 즐겨야 돼요 즐겨야 돼. 어떻게 즐길까? 일단은 내내 시를 즐기는 방법에 대해서 공부를 할 건데요.
그럼 오늘은요 어떻게 즐길까요? 오늘 공부할 내용은요 시를 여러 가지 방법으로 읽기를 공부할 겁니다.
어떻게 여러 가지 방법으로 읽을까 우리 공부해 보도록 해요. 오늘의 시 나와라 짜잔/! 시가 나왔습니다 시의 제목이 뭔가요? 다툰 날이에요 다툰 날.
누구랑 다퉜나 동생이랑 다퉜나 언니랑 친구랑? 누구랑 다퉜는지 선생님이 시를 읽어 줄 테니까 한번 생각을 해보도록 해요."""
# Prepend the task prefix and tokenize (inputs longer than 512 tokens are truncated).
inputs = [prefix + sample]
inputs = tokenizer(inputs, max_length=512, truncation=True, return_tensors="pt")

# Generate a summary of 10-64 tokens with beam search and sampling.
output = model.generate(**inputs, num_beams=3, do_sample=True, min_length=10, max_length=64)
decoded_output = tokenizer.batch_decode(output, skip_special_tokens=True)[0]

# Keep only the first sentence of the decoded output as the final summary.
result = nltk.sent_tokenize(decoded_output.strip())[0]
print('RESULT >>', result)
Documentation
Evaluation Results
- Korean Paper Summarization Dataset (논문자료 요약)
  ROUGE-2-R: 0.09868624890432466, ROUGE-2-P: 0.9666714545849712, ROUGE-2-F: 0.17250881441169427
- Korean Book Summarization Dataset (도서자료 요약)
  ROUGE-2-R: 0.1575686156943213, ROUGE-2-P: 0.9718318136896944, ROUGE-2-F: 0.26548116834852586
- Korean Summary statement and Report Generation Dataset (요약문 및 레포트 생성 데이터)
  ROUGE-2-R: 0.0987891733555808, ROUGE-2-P: 0.9276946867981899, ROUGE-2-F: 0.17726493110448185
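The card does not include the evaluation script, so the exact tokenization behind these numbers is unknown. Below is a minimal sketch of computing ROUGE-2 recall/precision/F1 for a single prediction-reference pair with the third-party rouge package; this is an assumption, not necessarily the implementation the author used, and scores depend heavily on how Korean text is tokenized.

from rouge import Rouge  # pip install rouge

# Hypothetical prediction/reference pair; tokens are whitespace-separated words.
prediction = "선생님이 학생들과 국어 여행을 떠나는 방법을 설명한다."
reference = "선생님이 국어 수업에서 교안 받는 방법과 시 읽기를 설명한다."

scores = Rouge().get_scores(prediction, reference)[0]["rouge-2"]
print("ROUGE-2-R", scores["r"], "ROUGE-2-P", scores["p"], "ROUGE-2-F", scores["f"])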
Training
The model was trained with the following parameters:
Seq2SeqTrainingArguments(
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    auto_find_batch_size=False,
    weight_decay=0.01,
    learning_rate=4e-05,
    lr_scheduler_type="linear",
    num_train_epochs=3,
    fp16=True)
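The card lists only these Seq2SeqTrainingArguments; the surrounding training script is not provided. A minimal sketch of how such arguments might be wired into a Seq2SeqTrainer follows, with a toy in-memory dataset standing in for the Korean summarization corpora listed above. The output_dir, the "summarize: " prefix handling, and the preprocessing lengths are assumptions.

from datasets import Dataset
from transformers import (AutoModelForSeq2SeqLM, AutoTokenizer,
                          DataCollatorForSeq2Seq, Seq2SeqTrainer,
                          Seq2SeqTrainingArguments)

base_model = "paust/pko-t5-base"
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForSeq2SeqLM.from_pretrained(base_model)

# Toy stand-in for the summarization corpora (document/summary pairs).
raw = Dataset.from_dict({
    "document": ["요약할 긴 한국어 문서의 본문입니다."],
    "summary": ["짧은 요약문입니다."],
})

def preprocess(batch):
    enc = tokenizer(["summarize: " + d for d in batch["document"]],
                    max_length=512, truncation=True)
    enc["labels"] = tokenizer(text_target=batch["summary"],
                              max_length=64, truncation=True)["input_ids"]
    return enc

tokenized = raw.map(preprocess, batched=True, remove_columns=raw.column_names)

args = Seq2SeqTrainingArguments(
    output_dir="t5-base-korean-summarization",  # assumed; not stated in the card
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    auto_find_batch_size=False,
    weight_decay=0.01,
    learning_rate=4e-05,
    lr_scheduler_type="linear",
    num_train_epochs=3,
    fp16=True,  # requires a CUDA device
)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=tokenized,
    eval_dataset=tokenized,
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()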
Model Architecture
T5ForConditionalGeneration(
(shared): Embedding(50358, 768)
(encoder): T5Stack(
(embed_tokens): Embedding(50358, 768)
(block): ModuleList(
(0): T5Block(
(layer): ModuleList(
(0): T5LayerSelfAttention(
(SelfAttention): T5Attention(
(q): Linear(in_features=768, out_features=768, bias=False)
(k): Linear(in_features=768, out_features=768, bias=False)
(v): Linear(in_features=768, out_features=768, bias=False)
(o): Linear(in_features=768, out_features=768, bias=False)
(relative_attention_bias): Embedding(32, 12)
)
(layer_norm): T5LayerNorm()
(dropout): Dropout(p=0.1, inplace=False)
)
(1): T5LayerFF(
(DenseReluDense): T5DenseGatedActDense(
(wi_0): Linear(in_features=768, out_features=2048, bias=False)
(wi_1): Linear(in_features=768, out_features=2048, bias=False)
(wo): Linear(in_features=2048, out_features=768, bias=False)
(dropout): Dropout(p=0.1, inplace=False)
(act): NewGELUActivation()
)
(layer_norm): T5LayerNorm()
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
(1~11): T5Block(
(layer): ModuleList(
(0): T5LayerSelfAttention(
(SelfAttention): T5Attention(
(q): Linear(in_features=768, out_features=768, bias=False)
(k): Linear(in_features=768, out_features=768, bias=False)
(v): Linear(in_features=768, out_features=768, bias=False)
(o): Linear(in_features=768, out_features=768, bias=False)
)
(layer_norm): T5LayerNorm()
(dropout): Dropout(p=0.1, inplace=False)
)
(1): T5LayerFF(
(DenseReluDense): T5DenseGatedActDense(
(wi_0): Linear(in_features=768, out_features=2048, bias=False)
(wi_1): Linear(in_features=768, out_features=2048, bias=False)
(wo): Linear(in_features=2048, out_features=768, bias=False)
(dropout): Dropout(p=0.1, inplace=False)
(act): NewGELUActivation()
)
(layer_norm): T5LayerNorm()
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
)
(final_layer_norm): T5LayerNorm()
(dropout): Dropout(p=0.1, inplace=False)
)
(decoder): T5Stack(
(embed_tokens): Embedding(50358, 768)
(block): ModuleList(
(0): T5Block(
(layer): ModuleList(
(0): T5LayerSelfAttention(
(SelfAttention): T5Attention(
(q): Linear(in_features=768, out_features=768, bias=False)
(k): Linear(in_features=768, out_features=768, bias=False)
(v): Linear(in_features=768, out_features=768, bias=False)
(o): Linear(in_features=768, out_features=768, bias=False)
(relative_attention_bias): Embedding(32, 12)
)
(layer_norm): T5LayerNorm()
(dropout): Dropout(p=0.1, inplace=False)
)
(1): T5LayerCrossAttention(
(EncDecAttention): T5Attention(
(q): Linear(in_features=768, out_features=768, bias=False)
(k): Linear(in_features=768, out_features=768, bias=False)
(v): Linear(in_features=768, out_features=768, bias=False)
(o): Linear(in_features=768, out_features=768, bias=False)
)
(layer_norm): T5LayerNorm()
(dropout): Dropout(p=0.1, inplace=False)
)
(2): T5LayerFF(
(DenseReluDense): T5DenseGatedActDense(
(wi_0): Linear(in_features=768, out_features=2048, bias=False)
(wi_1): Linear(in_features=768, out_features=2048, bias=False)
(wo): Linear(in_features=2048, out_features=768, bias=False)
(dropout): Dropout(p=0.1, inplace=False)
(act): NewGELUActivation()
)
(layer_norm): T5LayerNorm()
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
(1~11): T5Block(
(layer): ModuleList(
(0): T5LayerSelfAttention(
(SelfAttention): T5Attention(
(q): Linear(in_features=768, out_features=768, bias=False)
(k): Linear(in_features=768, out_features=768, bias=False)
(v): Linear(in_features=768, out_features=768, bias=False)
(o): Linear(in_features=768, out_features=768, bias=False)
)
(layer_norm): T5LayerNorm()
(dropout): Dropout(p=0.1, inplace=False)
)
(1): T5LayerCrossAttention(
(EncDecAttention): T5Attention(
(q): Linear(in_features=768, out_features=768, bias=False)
(k): Linear(in_features=768, out_features=768, bias=False)
(v): Linear(in_features=768, out_features=768, bias=False)
(o): Linear(in_features=768, out_features=768, bias=False)
)
(layer_norm): T5LayerNorm()
(dropout): Dropout(p=0.1, inplace=False)
)
(2): T5LayerFF(
(DenseReluDense): T5DenseGatedActDense(
(wi_0): Linear(in_features=768, out_features=2048, bias=False)
(wi_1): Linear(in_features=768, out_features=2048, bias=False)
(wo): Linear(in_features=2048, out_features=768, bias=False)
(dropout): Dropout(p=0.1, inplace=False)
(act): NewGELUActivation()
)
(layer_norm): T5LayerNorm()
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
(final_layer_norm): T5LayerNorm()
(dropout): Dropout(p=0.1, inplace=False)
)
(lm_head): Linear(in_features=768, out_features=50358, bias=False)
)
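The module tree above is the standard PyTorch repr of the loaded model and can be reproduced with:

from transformers import AutoModelForSeq2SeqLM

# Printing the model shows the same encoder/decoder module tree as above.
model = AutoModelForSeq2SeqLM.from_pretrained("eenzeenee/t5-base-korean-summarization")
print(model)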
License
No license information provided in the original document.
Citation
- Raffel, Colin, et al. "Exploring the limits of transfer learning with a unified text-to-text transformer." J. Mach. Learn. Res. 21.140 (2020): 1-67.