Pegasus Multi News
PEGASUS is an abstractive summarization model pre-trained on gap-sentence generation, using a mixture of the C4 and HugeNews datasets, and supports a variety of text summarization tasks.
Downloads 577
Release Time: 3/2/2022
Model Overview
PEGASUS is a pre-trained model designed specifically for abstractive text summarization. During pre-training, sentences judged important to a document are removed from the input and the model learns to generate them from the remaining text, an objective that closely resembles summarization itself.
Model Features
Hybrid and Randomized Training
Trained on both C4 and HugeNews, with the gap-sentence ratio sampled at random and the important sentences chosen by stochastic sampling, to improve generalization.
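The sampling described above can be sketched in a few lines of Python. This is a minimal illustration, not the original training code: sentences are scored by ROUGE-1 F1 against the rest of the document, the 15% to 45% ratio range and the 20% score noise are assumed values that this page does not state, and the names select_gap_sentences, make_pretraining_example and MASK_TOKEN are hypothetical.

```python
import random
import re
from collections import Counter

MASK_TOKEN = "<mask_1>"  # assumed sentinel marking a removed sentence


def tokenize(text):
    """Lowercase word tokens; a stand-in for the real subword tokenizer."""
    return re.findall(r"\w+", text.lower())


def rouge1_f1(candidate, reference):
    """Unigram-overlap F1 between two token lists."""
    overlap = sum((Counter(candidate) & Counter(reference)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(candidate)
    recall = overlap / len(reference)
    return 2 * precision * recall / (precision + recall)


def select_gap_sentences(sentences, rng, min_ratio=0.15, max_ratio=0.45, noise=0.20):
    """Return sorted indices of the sentences to mask.

    The ratio range and noise level are assumptions for illustration.
    """
    ratio = rng.uniform(min_ratio, max_ratio)          # randomized gap-sentence ratio
    n_gaps = max(1, round(ratio * len(sentences)))
    tokens = [tokenize(s) for s in sentences]
    scores = []
    for i, sent in enumerate(tokens):
        rest = [t for j, other in enumerate(tokens) if j != i for t in other]
        score = rouge1_f1(sent, rest)                  # importance of sentence i
        score *= 1.0 + rng.uniform(-noise, noise)      # stochastic perturbation
        scores.append(score)
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:n_gaps])


def make_pretraining_example(sentences, rng):
    """Build a (masked source, target) pair from one document."""
    gaps = set(select_gap_sentences(sentences, rng))
    source = " ".join(MASK_TOKEN if i in gaps else s for i, s in enumerate(sentences))
    target = " ".join(sentences[i] for i in sorted(gaps))
    return source, target
```

Calling make_pretraining_example(sentences, random.Random(0)) on a list of sentences returns the masked document and the text the model would be asked to generate. Passing the random generator explicitly keeps the sampling reproducible.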
Multi-dataset Support
Performs strongly on multiple text summarization datasets (e.g., xsum, cnn_dailymail, newsroom), covering summarization needs across different domains.
Improved Tokenizer
The SentencePiece tokenizer was updated so that it can encode newline characters, which preserves paragraph segmentation information in the input.
Model Capabilities
Text Summarization Generation
Multi-domain Summarization Adaptation
Abstractive Summarization
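As a rough sketch of how these capabilities might be exercised, the snippet below loads a PEGASUS checkpoint through the Hugging Face transformers library and generates a summary. The Hub identifier google/pegasus-multi_news, the 1024-token input limit and the beam width are assumptions that this page does not confirm, and summarize is a hypothetical helper name.

```python
MODEL_NAME = "google/pegasus-multi_news"  # assumed Hub identifier for this checkpoint


def summarize(text, model_name=MODEL_NAME, max_input_tokens=1024, num_beams=4):
    """Generate an abstractive summary of `text` (weights are downloaded on first use)."""
    # Imported here so the module can be loaded without the heavy dependencies.
    import torch
    from transformers import PegasusForConditionalGeneration, PegasusTokenizer

    tokenizer = PegasusTokenizer.from_pretrained(model_name)
    model = PegasusForConditionalGeneration.from_pretrained(model_name)
    model.eval()

    # Truncate long inputs to the assumed maximum input length.
    batch = tokenizer(
        [text],
        truncation=True,
        max_length=max_input_tokens,
        return_tensors="pt",
    )
    with torch.no_grad():
        summary_ids = model.generate(**batch, num_beams=num_beams)
    return tokenizer.batch_decode(summary_ids, skip_special_tokens=True)[0]
```

For repeated calls it would be better to load the tokenizer and model once and reuse them; they are loaded inside the function here only to keep the sketch short.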
Use Cases
News Summarization
News Article Summarization
Generate concise summaries of news articles while retaining key information.
Achieves a ROUGE-1 score of 44.16 on the cnn_dailymail dataset.
Academic Paper Summarization
Academic Paper Summarization
Generate summaries of academic papers, highlighting research focus.
Achieves a ROUGE-1 score of 44.21 on the arxiv dataset.
Technical Document Summarization
Patent Document Summarization
Generate summaries of technical patent documents, simplifying complex content.
Achieves a ROUGE-1 score of 52.29 on the big_patent dataset.
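For readers unfamiliar with the metric quoted in these use cases, ROUGE-1 measures unigram overlap between a generated summary and a reference summary. The sketch below is a simplified version under stated assumptions: it lowercases and splits on word characters, applies no stemming, and reports values on a 0 to 100 scale, so it will not exactly reproduce the scores of the official evaluation tooling. The function name rouge1 is illustrative.

```python
import re
from collections import Counter


def rouge1(candidate, reference):
    """Return ROUGE-1 precision, recall and F1 (0 to 100) for two strings."""
    cand_tokens = re.findall(r"\w+", candidate.lower())
    ref_tokens = re.findall(r"\w+", reference.lower())
    # Clipped unigram matches: each reference token can be matched at most once.
    overlap = sum((Counter(cand_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return {"precision": 0.0, "recall": 0.0, "f1": 0.0}
    precision = overlap / len(cand_tokens)
    recall = overlap / len(ref_tokens)
    f1 = 2 * precision * recall / (precision + recall)
    return {"precision": 100 * precision, "recall": 100 * recall, "f1": 100 * f1}


scores = rouge1("The cat lay on the mat.", "The cat sat on the mat.")
```

In this example five of the six unigrams match, so precision, recall and F1 all come out at roughly 83.3.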