PEGASUS arXiv
PEGASUS is a pre-trained abstractive summarization model built on gap-sentence generation; this checkpoint is fine-tuned on the arXiv dataset using the mixed-corpus and stochastic ("Mixed & Stochastic") training recipe.
Release Time: 3/2/2022
Model Overview
A Transformer-based encoder-decoder model designed for abstractive text summarization, pre-trained with a gap-sentence generation (GSG) objective: whole sentences are masked and must be regenerated from the remaining document.
Model Features
Mixed and Stochastic Training
Pre-trained on a mixture of the C4 and HugeNews corpora (weighted by dataset size), with the gap-sentence ratio sampled randomly and uniform noise added to sentence importance scores
Dynamic Gap Sentence Sampling
Samples the gap-sentence ratio uniformly between 15% and 45% during training to improve generalization across downstream tasks
Improved Tokenizer
The SentencePiece tokenizer is updated to encode the newline character, preserving paragraph-structure information in the input
Extended Training Duration
Pre-training is extended from 500k to 1.5 million steps to allow fuller convergence
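The gap-sentence selection described above (importance scoring, a dynamically sampled masking ratio, and noise perturbation) can be sketched in plain Python. This is a simplified illustration, not the paper's exact implementation: `rouge1_f1` and `select_gap_sentences` are names chosen here, and sentence importance is approximated as ROUGE-1 F1 against the rest of the document.

```python
import random
from collections import Counter

def rouge1_f1(candidate, reference):
    """Unigram-overlap F1 between two token lists."""
    if not candidate or not reference:
        return 0.0
    c, r = Counter(candidate), Counter(reference)
    overlap = sum((c & r).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(candidate)
    recall = overlap / len(reference)
    return 2 * precision * recall / (precision + recall)

def select_gap_sentences(sentences, rng, noise=0.2):
    """Pick which sentences to mask for the GSG pre-training objective.

    Sketch of the importance heuristic: each sentence is scored by
    ROUGE-1 F1 against the rest of the document. The gap-sentence
    ratio is sampled uniformly from [0.15, 0.45] (dynamic sampling),
    and scores are perturbed with +/-20% uniform noise, mirroring the
    mixed-and-stochastic training recipe.
    """
    ratio = rng.uniform(0.15, 0.45)
    m = max(1, round(ratio * len(sentences)))
    tokenized = [s.lower().split() for s in sentences]
    scored = []
    for i, sent in enumerate(tokenized):
        rest = [tok for j, other in enumerate(tokenized) if j != i
                for tok in other]
        score = rouge1_f1(sent, rest) * (1.0 + rng.uniform(-noise, noise))
        scored.append((score, i))
    top = sorted(scored, reverse=True)[:m]
    return sorted(i for _, i in top)

# Example: select gap sentences from a toy four-sentence document.
rng = random.Random(0)
doc = [
    "the cat sat on the mat",
    "dogs bark loudly",
    "the cat likes the mat",
    "birds fly south",
]
gap_indices = select_gap_sentences(doc, rng)
```

The selected sentences would become the pre-training targets, with their positions in the document replaced by a mask token.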
Model Capabilities
Text Summary Generation
Cross-domain Summarization Adaptation
Abstractive Summarization
Use Cases
News Summarization
CNN/DailyMail News Summarization
Generates concise summaries for news articles
ROUGE-1/2/L: 44.16/21.56/41.30
Academic Paper Summarization
arXiv Paper Summarization
Generates technical summaries for academic papers
ROUGE-1/2/L: 44.21/16.95/25.67
Legal Document Processing
Bill Summarization
Generates executive summaries for legal bills
ROUGE-1/2/L: 59.67/41.58/47.59