kobart-korean-summarizer-v2 Open-source Korean Text Summarization Model - Accurately Summarize Content, Highly Efficient and Practical

Kobart Korean Summarizer V2

Developed by gangyeolkim

Korean text summarization model fine-tuned from gogamza/kobart-base-v2 on about 680,000 summarization examples from AI Hub

Text Generation

Transformers

#Korean abstract generation #News text compression #Non-profit organization analysis

Downloads 164

Release Time : 11/23/2023

Model Overview

This is a specialized model for Korean text summarization that can compress long texts into concise summaries

Model Features

Korean Language Optimization

Summarization model specifically optimized for Korean text

Large-scale Data Training

Trained with over 680,000 Korean summary data points from AI Hub

Efficient Training

Completed 3 epochs of training in just 17 hours on an A100 GPU

Model Capabilities

Korean text summarization

Long text compression

Key information extraction

Use Cases

News Summarization

News Article Summarization

Compress lengthy news reports into concise summaries

As the usage example shows, the model can compress a long news article into a one- or two-sentence summary

Document Processing

Report Summarization

Extract key information from long documents to generate summaries

🚀 Korean Text Summarization Model

This project presents a Korean text summarization model. It is built on the gogamza/kobart-base-v2 base model and trained on a large-scale summarization dataset from AI Hub.

✨ Features

Based on the gogamza/kobart-base-v2 model.
Trained with a large-scale Korean summarization dataset from AI Hub.
Capable of summarizing various types of Korean texts.

📦 Installation

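The model card gives no installation steps. The usage example relies on the Hugging Face transformers library, which in turn needs a deep learning backend such as PyTorch.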
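Because no installation steps are given, one quick way to see whether an environment is ready is to check that the required imports resolve. The sketch below is only an illustration: `missing_packages` is a hypothetical helper, and the requirement list (`transformers` plus `torch` as the backend) is an assumption, not something the model card states.

```python
import importlib.util
from typing import Iterable, List


def missing_packages(names: Iterable[str]) -> List[str]:
    """Return the top-level package names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Assumed requirements: transformers for the pipeline API, torch as its backend.
missing = missing_packages(["transformers", "torch"])
if missing:
    print("Missing packages; try: pip install " + " ".join(missing))
else:
    print("All assumed requirements are importable.")
```

The check only looks packages up and does not import them, so it runs quickly and has no side effects.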

💻 Usage Examples

Basic Usage

from transformers import pipeline
# With a GPU: pass device=0 to run on the first CUDA device
# pipe = pipeline("summarization", model="gangyeolkim/kobart-korean-summarizer-v2", device=0)

# Without a GPU: the pipeline runs on the CPU by default
pipe = pipeline("summarization", model="gangyeolkim/kobart-korean-summarizer-v2")
original_text = """
(서울=연합뉴스) 특별취재팀 = 연합뉴스TV에 대한 적대적 인수·합병(M&A)을 시도하는 을지재단이 사실상 박준영 회장 일가의 '족벌경영' 체제 속에 사익을 실현하는 수단으로 활용된다는 지적이 나온다.
을지재단은 산하에 병원, 대학 등 여러 법인을 두고 있지만, 박준영 회장과 아내인 홍성희 을지대 총장이 요직을 주고받으면서 사실상 함께 경영하는 체제다.
비영리법인으로 각종 세제 혜택을 받는 을지재단의 '족벌경영' 폐해는 여러 사례를 통해 여실히 드러나고 있다.
부부가 비상근이사이면서도 재단에서 매달 1천만원씩 '셀프급여'를 받은 것, 박 회장이 '재단 소속 병원'에서 마약성 진통제를 3천회 이상 처방받은 것, 개인 소유의 관계회사를 만들어 병원과 거래에서 생기는 수익을 챙긴 것 등등.
을지재단은 연합뉴스TV의 최대주주 지위를 노리면서 그 운영 방침으로 '소유와 경영의 분리', '공정성 및 공익성 실현'을 내세웠다.
하지만 박 회장 부부의 이익을 위해 철저하게 재단을 '사유화'한 행태가 여러 사례를 통해 드러난 상황에서, 이들의 공영방송 지배를 우려하는 목소리는 갈수록 커지고 있다.
"""

summarized = pipe(original_text)
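print(summarized[0]["summary_text"])
# Example output: 을지재단이 박 회장 일가의 '족벌경영' 체제 속에 사익을 실현하는 수단으로 활용된다는 지적이 나오고 있다.
# (English gloss: critics say the Eulji Foundation is being used to serve private interests under the Park family's "clan management" system.)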
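The example above passes one short article to the pipeline. For longer documents, a common workaround is to split the text into pieces, summarize each piece, and join the results. The sketch below illustrates that idea and is not the author's method: `split_into_chunks`, `summarize_long`, and the 1,000-character budget are hypothetical choices, and `summarize` stands for any callable that returns results in the same shape as the `pipe` object above.

```python
from typing import Callable, Dict, List


def split_into_chunks(text: str, max_chars: int = 1000) -> List[str]:
    """Group non-empty lines into chunks of at most max_chars characters.

    A single line longer than max_chars is kept whole as its own chunk.
    The character budget is an illustrative assumption, not a model limit.
    """
    chunks: List[str] = []
    current = ""
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        candidate = f"{current}\n{line}" if current else line
        if current and len(candidate) > max_chars:
            # Adding this line would exceed the budget: close the chunk.
            chunks.append(current)
            current = line
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


def summarize_long(
    text: str,
    summarize: Callable[[str], List[Dict[str, str]]],
    max_chars: int = 1000,
) -> str:
    """Summarize each chunk separately and join the partial summaries."""
    parts = [
        summarize(chunk)[0]["summary_text"]
        for chunk in split_into_chunks(text, max_chars)
    ]
    return " ".join(parts)


# Usage with the pipeline from the example above (downloads the model):
# from transformers import pipeline
# pipe = pipeline("summarization", model="gangyeolkim/kobart-korean-summarizer-v2")
# print(summarize_long(original_text, pipe))
```

Splitting on line breaks keeps each paragraph intact, which suits news text like the sample article. A token-based budget would follow the model's real input limit more closely, but it needs the tokenizer.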

🔧 Technical Details

Base Model

This model is fine-tuned from gogamza/kobart-base-v2 on summarization data from AI Hub.

Datasets Used (683,335 cases)

Training

Training used a single NVIDIA A100 GPU and took 17 hours for 3 epochs.
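These figures give a rough sense of training throughput. The calculation below is a back-of-the-envelope sketch that assumes all 683,335 cases were seen once per epoch, with no held-out split. The document does not confirm that, so treat the result as an upper-bound estimate. `training_throughput` is a hypothetical helper.

```python
# Figures stated in this document.
NUM_CASES = 683_335  # dataset size
EPOCHS = 3
HOURS = 17


def training_throughput(num_cases: int, epochs: int, hours: float):
    """Return (total examples processed, examples per second)."""
    total = num_cases * epochs
    return total, total / (hours * 3600)


total_examples, per_second = training_throughput(NUM_CASES, EPOCHS, HOURS)
print(f"{total_examples:,} examples, about {per_second:.1f} per second")
# prints "2,050,005 examples, about 33.5 per second"
```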

📄 License

This project is licensed under the CC BY-NC 4.0 license.
