T5 Summarizer Model
A fine-tuned T5 model for text summarization, generating concise and informative summaries from long-form texts.
Quick Start
This model is designed to summarize long-form texts into concise and informative abstracts. It is particularly useful for professionals and researchers who need to quickly grasp the essence of detailed reports, research papers, or articles without reading the full text. See the Installation and Usage Examples sections below to get started.
Important Note
For the model to work as intended, you need to prepend the 'summarize: ' prefix to the input text.
⨠Features
This variant of the [t5-small](https://huggingface.co/google-t5/t5-small) model is fine-tuned specifically for text summarization. It leverages T5's text-to-text approach to generate concise, coherent, and informative summaries from extensive text documents.
Installation
You can install the required library for using this model via pip:
pip install transformers
Usage Examples
Basic Usage
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM, pipeline

model_name = "KipperDev/t5_summarizer_model"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

# The model expects the "summarize: " prefix in front of every input
prefix = "summarize: "
input_text = "Your input text here."

# Tokenize, generate, and decode the summary
input_ids = tokenizer.encode(prefix + input_text, return_tensors="pt")
summary_ids = model.generate(input_ids)
summary = tokenizer.decode(summary_ids[0], skip_special_tokens=True)
print(summary)

# Alternatively, use the high-level summarization pipeline
summarizer = pipeline("summarization", model=model, tokenizer=tokenizer)
print(summarizer(prefix + input_text)[0]["summary_text"])
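If you need tighter control over summary length or quality, generation parameters can be passed to model.generate(); the average generation length reported in the results below is about 152 tokens. The values in this sketch are illustrative, not settings recommended by the model author.

# Illustrative generation settings; tune max_length/min_length/num_beams for your use case.
summary_ids = model.generate(
    input_ids,
    max_length=150,
    min_length=40,
    num_beams=4,
    early_stopping=True,
)
summary = tokenizer.decode(summary_ids[0], skip_special_tokens=True)
print(summary)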
Documentation
Training Details
Training Data
The model was trained using the Big Patent dataset, which consists of 1.3 million US patent documents and their corresponding human-written summaries. This dataset was selected for its rich language and complex structure, which are representative of the challenges of document summarization. Training involved multiple subsets of the dataset to ensure broad coverage and robust model performance across various document types.
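For illustration, the snippet below sketches how training examples in this style could be prepared with the datasets library. The dataset identifier (big_patent), the subset name, and the field names (description, abstract) are assumptions based on the public BIG-PATENT release on the Hugging Face Hub, not a record of the exact preprocessing used for this model.

from datasets import load_dataset
from transformers import AutoTokenizer

# Assumed dataset id, subset, and field names; the exact training setup may differ.
dataset = load_dataset("big_patent", "a", split="train[:1%]")
tokenizer = AutoTokenizer.from_pretrained("KipperDev/t5_summarizer_model")

def preprocess(batch):
    # Prepend the task prefix the model expects, then tokenize inputs and targets.
    inputs = ["summarize: " + doc for doc in batch["description"]]
    model_inputs = tokenizer(inputs, max_length=512, truncation=True)
    labels = tokenizer(text_target=batch["abstract"], max_length=256, truncation=True)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

tokenized = dataset.map(preprocess, batched=True, remove_columns=dataset.column_names)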
Training Procedure
Training was carried out over three rounds. The initial settings were a learning rate of 0.00002, a batch size of 8, and 4 epochs. In subsequent rounds these parameters were adjusted to 0.0003, 8, and 12 respectively to further refine model performance. A linear decay learning rate schedule was also applied to improve the model's learning efficiency over time.
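As a rough sketch, the reported final-round hyperparameters would translate into Seq2SeqTrainingArguments roughly as follows; the output directory and any option not listed above are placeholders, not the actual training configuration.

from transformers import Seq2SeqTrainingArguments

# Illustrative only: mirrors the reported final-round hyperparameters;
# output_dir and unlisted options are placeholders.
training_args = Seq2SeqTrainingArguments(
    output_dir="t5_summarizer_model",
    learning_rate=3e-4,
    per_device_train_batch_size=8,
    num_train_epochs=12,
    lr_scheduler_type="linear",  # linear decay schedule
    predict_with_generate=True,
)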
Training Results
Model performance was evaluated using the ROUGE metric, which measures how closely the generated summaries match the human-written reference abstracts.
| Property | Details |
| --- | --- |
| Evaluation Loss (Eval Loss) | 1.9984 |
| ROUGE-1 | 0.503 |
| ROUGE-2 | 0.286 |
| ROUGE-L | 0.3813 |
| ROUGE-Lsum | 0.3813 |
| Average Generation Length (Gen Len) | 151.918 |
| Runtime (seconds) | 714.4344 |
| Samples per Second | 2.679 |
| Steps per Second | 0.336 |
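For reference, ROUGE scores of this kind can be computed with the Hugging Face evaluate library as sketched below (requires the evaluate and rouge_score packages); this is an illustration, not the exact evaluation script used to produce the numbers above.

import evaluate

# Illustrative ROUGE computation; replace the lists with real model outputs and references.
rouge = evaluate.load("rouge")
predictions = ["the generated summary text"]
references = ["the human-written abstract"]
scores = rouge.compute(predictions=predictions, references=references)
print(scores)  # {'rouge1': ..., 'rouge2': ..., 'rougeL': ..., 'rougeLsum': ...}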
License
This project is licensed under the MIT license.
Citation
BibTeX:
@article{kipper_t5_summarizer,
// SOON
}
Authors
This model card was written by Fernanda Kipper.