Pegasus Newsroom
PEGASUS is an abstractive summarization model from Google Research, pre-trained with a gap-sentence generation objective to produce high-quality text summaries.
Release date: 3/2/2022
Model Overview
PEGASUS is a model designed specifically for abstractive summarization. Pre-trained with a gap-sentence generation objective, it achieves strong performance across multiple summarization datasets.
Model Features
Mixed and stochastic training
Trained on the C4 and HugeNews datasets simultaneously, with examples sampled and mixed stochastically to improve performance.
Dynamic gap sentence ratio
Uniformly samples the gap-sentence ratio between 15% and 45% during training to improve generalization.
Important sentence sampling
Adds 20% uniform noise to sentence importance scores when sampling gap sentences, improving robustness.
Newline character encoding support
The SentencePiece tokenizer was updated to encode the newline character, preserving paragraph-break information.
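The features above can be sketched as a toy version of gap-sentence selection: score each sentence by word overlap with the rest of the document, perturb the scores with 20% uniform noise, and mask the top-scoring sentences at a ratio drawn uniformly from 15%-45%. This is an illustrative sketch, not the PEGASUS codebase; all names (and the multiplicative form of the noise) are assumptions.

```python
import random

MASK_TOKEN = "<mask_1>"  # illustrative; PEGASUS uses sentence-level mask tokens

def sentence_importance(sentences):
    """Score each sentence by unigram overlap with the rest of the
    document (a crude ROUGE-1-like importance proxy)."""
    scores = []
    for i, sent in enumerate(sentences):
        rest = {w for j, s in enumerate(sentences) if j != i for w in s.split()}
        words = set(sent.split())
        scores.append(len(words & rest) / max(len(words), 1))
    return scores

def select_gap_sentences(sentences, rng):
    # Dynamic gap-sentence ratio: sampled uniformly between 15% and 45%.
    ratio = rng.uniform(0.15, 0.45)
    n_mask = max(1, round(ratio * len(sentences)))
    scores = sentence_importance(sentences)
    # Important-sentence sampling with 20% uniform noise on the scores
    # (multiplicative noise here is an assumption).
    noisy = [s * rng.uniform(0.8, 1.2) for s in scores]
    ranked = sorted(range(len(sentences)), key=lambda i: noisy[i], reverse=True)
    return set(ranked[:n_mask])

def build_pretraining_example(sentences, rng):
    """Mask the selected gap sentences in the source; the concatenated
    masked sentences become the generation target."""
    masked_idx = select_gap_sentences(sentences, rng)
    source = " ".join(MASK_TOKEN if i in masked_idx else s
                      for i, s in enumerate(sentences))
    target = " ".join(sentences[i] for i in sorted(masked_idx))
    return source, target
```

The pre-training task is then sequence-to-sequence: generate `target` from `source`, which pushes the model toward summary-like generation.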
Model Capabilities
Text summarization generation
Multi-dataset adaptation
Abstractive summarization
Pre-trained model fine-tuning
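A minimal inference sketch for these capabilities, assuming the `google/pegasus-newsroom` checkpoint on the Hugging Face Hub and the `transformers` library (the checkpoint name and the example article are assumptions, not stated on this page):

```python
# Sketch only: requires `pip install transformers torch` and downloads
# the (assumed) google/pegasus-newsroom checkpoint on first run.
from transformers import PegasusForConditionalGeneration, PegasusTokenizer

def summarize(text: str, model_name: str = "google/pegasus-newsroom") -> str:
    tokenizer = PegasusTokenizer.from_pretrained(model_name)
    model = PegasusForConditionalGeneration.from_pretrained(model_name)
    batch = tokenizer(text, truncation=True, padding="longest",
                      return_tensors="pt")
    summary_ids = model.generate(**batch)
    return tokenizer.batch_decode(summary_ids, skip_special_tokens=True)[0]

if __name__ == "__main__":
    article = "Your news article text goes here."  # placeholder input
    print(summarize(article))
```

For the fine-tuning capability, the same model and tokenizer classes are used with a standard sequence-to-sequence training loop on (article, summary) pairs.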
Use Cases
News summarization
CNN/DailyMail summarization
Generates concise summaries for CNN/DailyMail news articles.
ROUGE-1/2/L: 44.16/21.56/41.30
XSum summarization
Produces high-quality results for extreme summarization (single-sentence summarization) tasks.
ROUGE-1/2/L: 47.60/24.83/39.64
Academic paper summarization
arXiv paper summarization
Generates concise summaries for academic papers.
ROUGE-1/2/L: 44.21/16.95/25.67
PubMed summarization
Generates summaries for biomedical literature.
ROUGE-1/2/L: 45.97/20.15/28.25
Technical document summarization
BigPatent summarization
Generates summaries for patent documents.
ROUGE-1/2/L: 52.29/33.08/41.66
WikiHow summarization
Generates summaries for WikiHow tutorial articles.
ROUGE-1/2/L: 46.39/22.12/38.41
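The ROUGE-1/2/L figures above measure unigram, bigram, and longest-common-subsequence overlap between generated and reference summaries. A minimal ROUGE-1 F1 computation makes the metric concrete (the published scores come from the standard ROUGE toolkit, not this sketch):

```python
from collections import Counter

def rouge_1_f1(candidate: str, reference: str) -> float:
    """ROUGE-1 F1: unigram overlap between candidate and reference,
    with counts clipped as in the standard metric."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())  # clipped unigram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)
```

ROUGE-2 replaces unigrams with bigrams, and ROUGE-L scores the longest common subsequence; all are reported here as percentages.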
© 2025 AIbase