Summary_loop46: Open-Source English Summarization Model That Generates Multiple Scored Candidate Summaries

Summary Loop46

Developed by philippelaban

An English summarization model based on the GPT2 architecture and trained on the CNN/DailyMail dataset. It can generate multiple candidate summaries, each with a score.

Text Generation

Transformers

English

Open Source License: Apache-2.0

#News Summary Generation #Multi-Candidate Summaries #No Repeated n-grams

Downloads 16

Release Time: 3/2/2022

Model Overview

This model is a text summarization model based on the GPT2 architecture, specifically designed to extract key information from English documents and generate concise summaries. It supports generating multiple candidate summaries with automatic scoring.

Model Features

Multi-Candidate Summary Generation

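Generates several candidate summaries in one call and returns a score for each, so the candidates can be ranked. In the usage example, the score is the `sequences_scores` value returned by `generate`.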

Trained on CNN/DailyMail

Trained on a high-quality news summarization dataset, suitable for news text summarization

Beam Search Optimization

Uses beam search to improve summary quality; the usage example runs it with four beams.
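
As a rough illustration of what beam search does, the sketch below runs it over a made-up next-token table. `TOY_MODEL` and `beam_search` are hypothetical names invented for this example, not part of the model or of Transformers, and the toy omits the length normalization that real implementations typically apply.

```python
import math

# Hypothetical next-token probabilities, keyed by the previous token only.
# A real language model conditions on the whole prefix; this table is illustrative.
TOY_MODEL = {
    "<s>": {"rocks": 0.6, "boulders": 0.4},
    "rocks": {"fall": 0.5, "roll": 0.5},
    "boulders": {"roll": 0.9, "fall": 0.1},
    "fall": {"END": 1.0},
    "roll": {"END": 1.0},
}


def beam_search(num_beams, max_steps=5):
    """Return finished sequences as (log_prob, tokens) pairs, best first."""
    beams = [(0.0, ["<s>"])]
    finished = []
    for _ in range(max_steps):
        # Extend every live beam by every possible next token.
        expanded = []
        for log_prob, tokens in beams:
            for token, prob in TOY_MODEL[tokens[-1]].items():
                expanded.append((log_prob + math.log(prob), tokens + [token]))
        # Keep only the num_beams most likely extensions.
        expanded.sort(key=lambda item: item[0], reverse=True)
        beams = []
        for candidate in expanded[:num_beams]:
            if candidate[1][-1] == "END":
                finished.append(candidate)
            else:
                beams.append(candidate)
        if not beams:
            break
    return sorted(finished, key=lambda item: item[0], reverse=True)


for log_prob, tokens in beam_search(num_beams=2):
    print("%.3f %s" % (log_prob, " ".join(tokens)))
```

In this toy table, a single beam commits to "rocks" because it is the likelier first token, while two beams also keep "boulders" alive and end up with a more probable full sequence.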

Model Capabilities

English text summarization

Multi-candidate summary generation

Summary quality scoring

Use Cases

News Summarization

Automatic News Article Summarization

Generates concise summaries for long news articles

Can generate multiple candidate summaries for selection

Content Extraction

Document Key Information Extraction

Extracts core content from technical documents or reports

🚀 Summarization Model with GPT2

A text summarization model based on GPT2 architecture, capable of summarizing documents and available for testing via the Hosted inference API.

🚀 Quick Start

You can try the model in the panel on the right by entering the document you want to summarize, although the hosted version only handles short sequences.

📦 Installation

The model, which uses the GPT2 base architecture, can be loaded as follows:

from transformers import GPT2LMHeadModel, GPT2TokenizerFast

model = GPT2LMHeadModel.from_pretrained("philippelaban/summary_loop46")
tokenizer = GPT2TokenizerFast.from_pretrained("philippelaban/summary_loop46")

💻 Usage Examples

Basic Usage

document = "Bouncing Boulders Point to Quakes on Mars. A preponderance of boulder tracks on the red planet may be evidence of recent seismic activity. If a rock falls on Mars, and no one is there to see it, does it leave a trace? Yes, and it's a beautiful herringbone-like pattern, new research reveals. Scientists have now spotted thousands of tracks on the red planet created by tumbling boulders. Delicate chevron-shaped piles of Martian dust and sand frame the tracks, the team showed, and most fade over the course of a few years. Rockfalls have been spotted elsewhere in the solar system, including on the moon and even a comet. But a big open question is the timing of these processes on other worlds — are they ongoing or did they predominantly occur in the past?"

# Tokenize the document (truncated to 300 tokens) and place it on the model's device
tokenized_document = tokenizer([document], max_length=300, truncation=True, return_tensors="pt")["input_ids"].to(model.device)
input_shape = tokenized_document.shape
# Beam search with 4 beams, returning all 4 candidates together with their scores
outputs = model.generate(tokenized_document, do_sample=False, max_length=500, num_beams=4, num_return_sequences=4, no_repeat_ngram_size=6, return_dict_in_generate=True, output_scores=True)
candidate_sequences = outputs.sequences[:, input_shape[1]:] # Remove the encoded text, keep only the summary
candidate_scores = outputs.sequences_scores.tolist()

for candidate_tokens, score in zip(candidate_sequences, candidate_scores):
    summary = tokenizer.decode(candidate_tokens)
    # Keep the text before the "END" marker; if the marker is missing, keep the whole text
    print("[Score: %.3f] %s" % (score, summary.split("END")[0]))
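
The slice `outputs.sequences[:, input_shape[1]:]` works because a decoder-only model such as GPT2 returns the prompt tokens followed by the generated ones, and `max_length` in `generate` counts both. The following sketch illustrates the same bookkeeping on plain Python lists; `strip_prompt` and `summary_token_budget` are hypothetical helper names and the token IDs are made up.

```python
def strip_prompt(sequence, prompt_length):
    """Drop the echoed prompt tokens, keeping only the generated continuation."""
    return sequence[prompt_length:]


def summary_token_budget(prompt_length, max_length):
    """Tokens left for the summary when max_length covers prompt plus generation."""
    return max(max_length - prompt_length, 0)


# Made-up token IDs: three prompt tokens followed by two generated tokens.
full_sequence = [101, 102, 103, 7, 8]
print(strip_prompt(full_sequence, 3))   # [7, 8]

# Settings from the usage example: prompt truncated to 300, max_length=500.
print(summary_token_budget(300, 500))   # 200
```

With the settings in the usage example, a document that fills all 300 prompt tokens leaves room for at most 200 summary tokens; shorter documents leave more.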

Example Output

[Score: -0.153]  These tracks have been spotted elsewhere on Mars. If a rockfalls on Mars has been spotted elsewhere on the red planet. Scientists have spotted thousands of tracks on Mars. A rockfalls on Mars have been spotted elsewhere on the Red Planet.
[Score: -0.154]  These tracks have been spotted elsewhere on Mars. If a rockfalls on Mars has been spotted elsewhere on the red planet. Scientists have spotted thousands of tracks on Mars. A rockfalls on Mars have been spotted elsewhere on the planet.
[Score: -0.154]  These tracks have been spotted elsewhere on Mars. If a rockfalls on Mars has been spotted elsewhere on the red planet. Scientists have spotted thousands of tracks on Mars. A rockfalls have been spotted elsewhere on the Red Planet.
[Score: -0.195]  These tracks have been spotted elsewhere on Mars. If a rockfalls on Mars has been spotted elsewhere on the red planet. Scientists have spotted thousands of tracks on Mars. A rockfalls on Mars have been spotted elsewhere on the Red Planet. A rockfalls have been spotted everywhere on the red planet.
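
The usage example passes `no_repeat_ngram_size=6` to `generate`, which prevents any run of six consecutive tokens from occurring twice; shorter repeats remain possible. The helper below, `repeated_ngrams` (a hypothetical name, not a library function), is a minimal sketch that checks a token list for repeated n-grams. It splits on whitespace for readability, whereas the model operates on GPT2 subword tokens, so results on real summaries are only approximate.

```python
def repeated_ngrams(tokens, n):
    """Return the set of n-grams (as tuples) that occur more than once in tokens."""
    seen = set()
    repeated = set()
    for start in range(len(tokens) - n + 1):
        ngram = tuple(tokens[start:start + n])
        if ngram in seen:
            repeated.add(ngram)
        seen.add(ngram)
    return repeated


# Made-up sentence: a 5-word phrase repeats, but no 6-word phrase does.
words = "a rock fell on Mars and a rock fell on Mars again".split()
print(repeated_ngrams(words, 5))
print(repeated_ngrams(words, 6))
```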

📚 Documentation

More information, including the scoring function, the training script, and an example training log, is available in the GitHub repo: https://github.com/CannyLab/summary_loop

📄 License

This project is licensed under the Apache-2.0 License.

📦 Model Information

Property	Details
Model Type	Summarization Model
Training Data	cnn_dailymail
Metrics	rouge
Tags	summarization
