summary_loop10 Open-Source English Summary Generation Model - Automatically Generate Multiple Candidate Summaries and Score Them

Summary Loop10

Developed by philippelaban

An English summarization model based on GPT2 architecture, generating multiple candidate summaries via beam search with automatic scoring

Text Generation

Transformers

EnglishOpen Source License:Apache-2.0 #GPT2 Summary Generation #Multi-candidate Summaries #News Summarization

Downloads 13

Release Time : 3/2/2022

Model Overview

This model is fine-tuned on the GPT2 architecture, specifically designed for English text summarization. It generates 4 candidate summaries through beam search and automatically scores them to output the optimal result.

Model Features

Multi-candidate Summary Generation

Generates 4 candidate summaries at once through beam search, providing diverse options

Automatic Scoring System

Automatically scores each generated summary for easy selection of the best result

Short-text Optimization

Specially optimized for summarization of short texts

Model Capabilities

English text summarization

Multi-candidate output

Automatic summary quality scoring

Use Cases

News Summarization

News Content Condensation

Condenses lengthy news reports into brief summaries

Generates multiple candidate headline-style summaries for selection

Scientific Literature

Paper Abstract Generation

Generates summary descriptions for research papers

Extracts key findings to form concise statements

🚀 Summary Loop Model

A text summarization model based on GPT2 architecture, capable of generating summaries for given documents.

🚀 Quick Start

In the right panel, you can try out the model (although it only handles a short sequence length). Just enter the document you want to summarize in the panel on the right.

📦 Installation

The model (based on a GPT2 base architecture) can be loaded in the following way:

from transformers import GPT2LMHeadModel, GPT2TokenizerFast

model = GPT2LMHeadModel.from_pretrained("philippelaban/summary_loop10")
tokenizer = GPT2TokenizerFast.from_pretrained("philippelaban/summary_loop10")

💻 Usage Examples

Basic Usage

document = "Bouncing Boulders Point to Quakes on Mars. A preponderance of boulder tracks on the red planet may be evidence of recent seismic activity. If a rock falls on Mars, and no one is there to see it, does it leave a trace? Yes, and it's a beautiful herringbone-like pattern, new research reveals. Scientists have now spotted thousands of tracks on the red planet created by tumbling boulders. Delicate chevron-shaped piles of Martian dust and sand frame the tracks, the team showed, and most fade over the course of a few years. Rockfalls have been spotted elsewhere in the solar system, including on the moon and even a comet. But a big open question is the timing of these processes on other worlds — are they ongoing or did they predominantly occur in the past?"

tokenized_document = tokenizer([document], max_length=300, truncation=True, return_tensors="pt")["input_ids"].cuda()
input_shape = tokenized_document.shape
outputs = model.generate(tokenized_document, do_sample=False, max_length=500, num_beams=4, num_return_sequences=4, no_repeat_ngram_size=6, return_dict_in_generate=True, output_scores=True)
candidate_sequences = outputs.sequences[:, input_shape[1]:] # Remove the encoded text, keep only the summary
candidate_scores = outputs.sequences_scores.tolist()

for candidate_tokens, score in zip(candidate_sequences, candidate_scores):
    summary = tokenizer.decode(candidate_tokens)
    print("[Score: %.3f] %s" % (score, summary[:summary.index("END")]))

Example Output

[Score: -0.084]  Here's what you need to know about rockfalls
[Score: -0.087]  Here's what you need to know about these tracks
[Score: -0.091]  Here's what we know so far about these tracks
[Score: -0.101]  Here's what you need to know about rockfall

📚 Documentation

You can access more information, access to the scoring function, the training script, or an example training log on the Github repo: https://github.com/CannyLab/summary_loop

📄 License

This project is licensed under the Apache-2.0 license.

Featured Recommended AI Models

Empowering the Future, Your AI Solution Knowledge Base

English 简体中文繁體中文にほんご