Pegasus Newsroom

Developed by Google
PEGASUS is a pre-trained abstractive summarization model from Google Research. It is pre-trained with gap-sentence generation, in which important sentences are removed from a document and the model learns to generate them.
Downloads 52
Release Time: 3/2/2022

Model Overview

PEGASUS is a pre-trained model designed for abstractive summarization. During pre-training, important sentences are removed from each document and the model learns to generate them from the remaining text (gap-sentence generation). The ROUGE scores listed under Use Cases cover six summarization datasets.

Model Features

Mixed and random training
Trained on both the C4 and HugeNews datasets, with training examples sampled at random from the mixture.
Dynamic gap sentence ratio
Samples the gap sentence ratio uniformly between 15% and 45% during training to improve generalization.
Important sentence sampling
Adds 20% uniform noise to sentence importance scores when selecting important sentences, which makes the model more robust.
Newline character encoding support
The SentencePiece tokenizer was updated to encode newline characters, which preserves paragraph segmentation information.
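
The dynamic gap sentence ratio and noisy importance sampling described above can be illustrated with a minimal sketch. This is not the original training code. It assumes sentence importance is the unigram-overlap F1 between a sentence and the rest of the document, and that the 20% noise is applied as a multiplicative factor drawn uniformly from [0.8, 1.2]; the page does not state how the noise is applied. The function names and the `<mask_1>` placeholder are illustrative.

```python
import random
from collections import Counter


def rouge1_f1(candidate, reference):
    """Unigram-overlap F1 between two token lists."""
    if not candidate or not reference:
        return 0.0
    overlap = sum((Counter(candidate) & Counter(reference)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(candidate)
    recall = overlap / len(reference)
    return 2 * precision * recall / (precision + recall)


def select_gap_sentences(sentences, rng, min_ratio=0.15, max_ratio=0.45, noise=0.20):
    """Return sorted indices of the sentences chosen as gap sentences."""
    tokens = [s.lower().split() for s in sentences]
    # Dynamic gap sentence ratio: drawn uniformly for each document.
    ratio = rng.uniform(min_ratio, max_ratio)
    n_gaps = max(1, round(ratio * len(sentences)))
    scores = []
    for i, sent in enumerate(tokens):
        rest = [tok for j, other in enumerate(tokens) if j != i for tok in other]
        base = rouge1_f1(sent, rest)
        # Assumption: noise is multiplicative and uniform in [1 - noise, 1 + noise].
        scores.append(base * (1.0 + rng.uniform(-noise, noise)))
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:n_gaps])


def make_example(sentences, rng, mask_token="<mask_1>"):
    """Build a (source, target) pair: masked document and the removed sentences."""
    gaps = set(select_gap_sentences(sentences, rng))
    source = " ".join(mask_token if i in gaps else s for i, s in enumerate(sentences))
    target = " ".join(s for i, s in enumerate(sentences) if i in gaps)
    return source, target
```

Calling `make_example(sentences, random.Random(0))` on a list of sentence strings returns the masked document and the concatenated gap sentences that the model would be trained to generate.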

Model Capabilities

Text summarization generation
Multi-dataset adaptation
Abstractive summarization
Pre-trained model fine-tuning
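
As a rough illustration of the summarization capability, the sketch below loads a PEGASUS checkpoint through the Hugging Face `transformers` library. The checkpoint name `google/pegasus-newsroom`, the `summarize` helper, and the 512-token input limit are assumptions for illustration and are not stated on this page. Calling the function downloads the model weights and requires `transformers`, `sentencepiece`, and PyTorch.

```python
def summarize(text, model_name="google/pegasus-newsroom", max_input_tokens=512):
    """Generate an abstractive summary for a single document.

    model_name is an assumed checkpoint identifier; swap in the one you use.
    """
    # Imported lazily so the heavy dependencies load only when the function is called.
    from transformers import PegasusForConditionalGeneration, PegasusTokenizer

    tokenizer = PegasusTokenizer.from_pretrained(model_name)
    model = PegasusForConditionalGeneration.from_pretrained(model_name)

    # Truncate long inputs to the assumed maximum length and return PyTorch tensors.
    batch = tokenizer(
        [text],
        truncation=True,
        max_length=max_input_tokens,
        return_tensors="pt",
    )
    summary_ids = model.generate(**batch)
    return tokenizer.batch_decode(summary_ids, skip_special_tokens=True)[0]
```

For repeated calls it would be more efficient to load the tokenizer and model once and reuse them; they are loaded inside the function here only to keep the sketch self-contained.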

Use Cases

News summarization
CNN/DailyMail summarization
Generates concise summaries for CNN/DailyMail news articles.
ROUGE-1/2/L: 44.16/21.56/41.30
XSum summarization
Produces high-quality results for extreme summarization (single-sentence summarization) tasks.
ROUGE-1/2/L: 47.60/24.83/39.64
Academic paper summarization
arXiv paper summarization
Generates concise summaries for academic papers.
ROUGE-1/2/L: 44.21/16.95/25.67
PubMed summarization
Generates summaries for biomedical literature.
ROUGE-1/2/L: 45.97/20.15/28.25
Technical document summarization
BigPatent summarization
Generates summaries for patent documents.
ROUGE-1/2/L: 52.29/33.08/41.66
WikiHow summarization
Generates summaries for WikiHow tutorial articles.
ROUGE-1/2/L: 46.39/22.12/38.41
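
The ROUGE-1/2/L figures above are F1-style overlap scores between a generated summary and a reference summary, reported on a 0 to 100 scale. The sketch below shows the idea behind the three metrics using whitespace tokenization. It is a simplified illustration, not the official scorer, which also handles stemming and other tokenization details, so it will not reproduce the reported numbers exactly. All function names are illustrative.

```python
from collections import Counter


def _f1(overlap, cand_total, ref_total):
    """Harmonic mean of precision and recall from raw overlap counts."""
    if overlap == 0 or cand_total == 0 or ref_total == 0:
        return 0.0
    precision = overlap / cand_total
    recall = overlap / ref_total
    return 2 * precision * recall / (precision + recall)


def ngrams(tokens, n):
    """Multiset of n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(candidate, reference, n):
    """ROUGE-N F1: clipped n-gram overlap between candidate and reference."""
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    overlap = sum((cand & ref).values())
    return _f1(overlap, sum(cand.values()), sum(ref.values()))


def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, 1):
            cur.append(prev[j - 1] + 1 if x == y else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def rouge_l(candidate, reference):
    """ROUGE-L F1 based on the longest common subsequence."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    return _f1(lcs_length(cand, ref), len(cand), len(ref))
```

Multiplying the returned values by 100 gives scores on the same scale as the figures listed above.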