📚 long-t5-tglobal-xl + BookSum
This model can summarize long text and generate a SparkNotes-like summary for any topic! It generalizes well to academic and narrative text. The XL checkpoint produces even better summaries from a human evaluation perspective.
📄 License
🏷️ Tags
- summarization
- summary
- booksum
- long-document
- long-form
- tglobal-xl
- XL
📊 Datasets
📈 Metrics
⚠️ Inference
Inference is disabled.
📋 Model Index

| Property | Details |
| --- | --- |
| Model Name | pszemraj/long-t5-tglobal-xl-16384-book-summary |
| Task | summarization |
| Dataset | kmfoda/booksum |
| Metrics | ROUGE (see Eval results below) |
⚠️ Important Note
As of this discussion, there are known issues with long-t5 models when using transformers >= 4.23.0. Please run pip install transformers==4.22.0 to ensure good performance with this model.
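To confirm the pinned version is active, a quick sanity check (standard transformers attribute):

import transformers
print(transformers.__version__)  # expect 4.22.0 after the pin above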
A simple example/use case with the base model on ASR is here.
✨ Features
- Generalizes well to academic & narrative text.
- The XL checkpoint produces better summaries.
📦 Installation
pip install -U transformers
💻 Usage Examples
Basic Usage
import torch
from transformers import pipeline

# load the summarization pipeline; use the GPU if one is available, else CPU
summarizer = pipeline(
    "summarization",
    "pszemraj/long-t5-tglobal-xl-16384-book-summary",
    device=0 if torch.cuda.is_available() else -1,
)

long_text = "Here is a lot of text I don't want to read. Replace me"

result = summarizer(long_text)
print(result[0]["summary_text"])
Advanced Usage
Adjusting parameters
Pass other beam-search text-generation parameters when calling summarizer to get even higher-quality results, as in the sketch below.
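For illustration, here is a minimal sketch of passing generation parameters through the pipeline; the specific values below are assumptions for demonstration, not tuned defaults:

result = summarizer(
    long_text,
    num_beams=4,              # wider beam search than the greedy default
    no_repeat_ngram_size=3,   # discourage repeated phrases
    early_stopping=True,      # stop when all beams have finished
)
print(result[0]["summary_text"])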
LLM.int8 Quantization
First, make sure you have the latest versions of the relevant packages:
pip install -U transformers bitsandbytes accelerate
Load in 8-bit:
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained(
    "pszemraj/long-t5-tglobal-xl-16384-book-summary"
)

# load_in_8bit quantizes the weights via bitsandbytes (LLM.int8);
# device_map="auto" places the model across the available devices
model = AutoModelForSeq2SeqLM.from_pretrained(
    "pszemraj/long-t5-tglobal-xl-16384-book-summary",
    load_in_8bit=True,
    device_map="auto",
)
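With the tokenizer and 8-bit model loaded as above, inference can be run with model.generate. This is a minimal sketch: the input string and the generation settings (max_new_tokens, num_beams) are illustrative assumptions, not recommended values.

long_text = "Replace me with the document to summarize."
inputs = tokenizer(long_text, return_tensors="pt").to(model.device)
# generate the summary token ids, then decode them back to text
summary_ids = model.generate(**inputs, max_new_tokens=512, num_beams=4)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))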
📖 Documentation
Description
A fine-tuned version of google/long-t5-tglobal-xl on the kmfoda/booksum dataset. Read the paper by Guo et al., LongT5: Efficient Text-To-Text Transformer for Long Sequences, for details on the architecture.
Intended uses & limitations
While this model seems to improve factual consistency, do not take its summaries as foolproof; check anything that looks odd. Be especially careful with negation statements: you can usually verify a statement by comparing it with what the surrounding sentences imply.
Training and evaluation data
The kmfoda/booksum dataset on Hugging Face - read the original paper here.
- For initial fine-tuning, only examples with at most 12288 input tokens and 1024 output tokens were used. A quick analysis showed that summaries in the 12288-16384 token range are a small minority in this dataset.
- The final stages of fine-tuning used the standard 16384-token input / 1024-token output lengths. A hedged sketch of this kind of length filter follows the list.
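For illustration only, this is a sketch of the kind of token-length filter described above, not the author's actual preprocessing code; the column names "chapter" and "summary_text" are assumptions about the kmfoda/booksum schema:

from datasets import load_dataset
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("google/long-t5-tglobal-xl")
ds = load_dataset("kmfoda/booksum", split="train")

def within_limits(example):
    # count tokens for the source chapter and the reference summary
    n_in = len(tokenizer(example["chapter"])["input_ids"])    # assumed column name
    n_out = len(tokenizer(example["summary_text"])["input_ids"])  # assumed column name
    return n_in <= 12288 and n_out <= 1024

ds = ds.filter(within_limits)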
Eval results
Official results with the model evaluator will be computed and posted here. The model achieves the following results on the evaluation set:
- eval_loss: 1.2756
- eval_rouge1: 41.8013
- eval_rouge2: 12.0895
- eval_rougeL: 21.6007
- eval_rougeLsum: 39.5382
- eval_gen_len: 387.2945
- eval_runtime: 13908.4995 (seconds)
- eval_samples_per_second: 0.107
- eval_steps_per_second: 0.027
FAQ
How can I run inference with this on CPU?
lol (technically you can, e.g. by passing device=-1 to the pipeline, but inference with an XL model on CPU is impractically slow).
How to run inference over a very long (30k+ tokens) document in batches?
See summarize.py in the code for my Hugging Face Space, Document Summarization. You can use the same approach to split a document into batches of 4096 tokens or so and iterate over them with the model, which is useful when CUDA memory is limited; a hedged sketch of this chunking approach follows.
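For illustration only, a sketch of the chunk-and-iterate approach; this is not the Space's actual code, the helper summarize_long is hypothetical, and it reuses the summarizer pipeline from the earlier examples:

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("pszemraj/long-t5-tglobal-xl-16384-book-summary")

def summarize_long(text, summarizer, tokenizer, chunk_tokens=4096):
    # tokenize the full document, split the ids into fixed-size chunks,
    # then summarize each chunk independently and join the partial summaries
    ids = tokenizer(text)["input_ids"]
    chunks = [ids[i : i + chunk_tokens] for i in range(0, len(ids), chunk_tokens)]
    partial = []
    for chunk in chunks:
        chunk_text = tokenizer.decode(chunk, skip_special_tokens=True)
        partial.append(summarizer(chunk_text)[0]["summary_text"])
    return "\n".join(partial)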
How to fine-tune further?
See train with a script and the summarization scripts.
Are there simpler ways to run this?
Yes. For exactly this reason, I created a Python package utility called textsum. You can use it to load models and summarize documents in a few lines of code.
pip install textsum
Use textsum in Python with this model:

from textsum.summarize import Summarizer

summarizer = Summarizer(
    model_name_or_path="pszemraj/long-t5-tglobal-xl-16384-book-summary"
)

# summarize_string runs the model over a raw text string
long_string = "This is a long string of text that will be summarized."
out_str = summarizer.summarize_string(long_string)
print(f"summary: {out_str}")
🔧 Technical Details
Training procedure
Updates
TBD
Training hyperparameters
TBD
Framework versions
TBD