KipperDev/bart_summarizer_model
A fine-tuned BART-base model for text summarization, capable of generating concise and informative summaries from long-form texts.
Quick Start
This model is a fine-tuned variant of [facebook/bart-base](https://huggingface.co/facebook/bart-base) designed for text summarization. It pairs BART's bidirectional (BERT-like) encoder with an autoregressive (GPT-like) decoder to generate concise, coherent, and informative summaries from long text documents.
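As a minimal sketch (assuming `transformers` is installed, see Installation below), the model can be loaded directly through the `summarization` pipeline; note the `summarize: ` prefix, which is required (see the note under Usage Examples):

```python
from transformers import pipeline

# Quick-start sketch: load the model via the summarization pipeline.
summarizer = pipeline("summarization", model="KipperDev/bart_summarizer_model")

# The "summarize: " prefix is required for the model to behave as intended.
print(summarizer("summarize: " + "Your long input text here.")[0]["summary_text"])
```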
⨠Features
- Specialized for text summarization, helping users quickly understand the essence of long-form texts.
- Trained on a large-scale patent dataset, ensuring broad coverage and robust performance.
- Evaluated using the ROUGE metric, showing good alignment with human-written abstracts.
📦 Installation
Install the `transformers` library with `pip`:
pip install transformers
💻 Usage Examples
Basic Usage
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM, pipeline

model_name = "KipperDev/bart_summarizer_model"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

# The model expects the "summarize: " prefix before the input text.
prefix = "summarize: "
input_text = "Your input text here."

# Option 1: use the high-level summarization pipeline.
summarizer = pipeline("summarization", model=model, tokenizer=tokenizer)
print(summarizer(prefix + input_text)[0]["summary_text"])

# Option 2: call the tokenizer and model directly.
input_ids = tokenizer.encode(prefix + input_text, return_tensors="pt")
summary_ids = model.generate(input_ids)
summary = tokenizer.decode(summary_ids[0], skip_special_tokens=True)
print(summary)
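If needed, summary length and decoding behaviour can be tuned through standard `generate()` arguments; the values below are purely illustrative and are not the settings used to produce the reported results:

```python
# Illustrative decoding settings (assumed values, not the model card's defaults).
summary_ids = model.generate(
    input_ids,
    max_length=256,      # upper bound on summary length in tokens
    min_length=64,       # avoid overly short summaries
    num_beams=4,         # beam search for more coherent output
    early_stopping=True,
)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```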
⚠️ Important Note
FOR THE MODEL TO WORK AS INTENDED, YOU NEED TO PREPEND THE 'summarize: ' PREFIX TO THE INPUT TEXT
Documentation
Training Details
Training Data
The model was trained using the Big Patent Dataset, which consists of 1.3 million US patent documents and their corresponding human-written summaries. This dataset was selected due to its rich language and complex structure, which is representative of the challenging nature of document summarization tasks. Multiple subsets of the dataset were used during training to ensure broad coverage and robust model performance across varied document types.
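For illustration only (assuming the `datasets` library is available), the Big Patent data can be loaded from the Hugging Face Hub; the exact subsets and preprocessing used for this model are not detailed here:

```python
from datasets import load_dataset

# Illustrative only: load one Big Patent subset ("a" = Human Necessities).
# The subsets and preprocessing actually used for training are not specified in this card.
big_patent = load_dataset("big_patent", "a", split="train")

sample = big_patent[0]
print(sample["description"][:500])  # full patent text (model input)
print(sample["abstract"])           # human-written summary (target)
```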
Training Procedure
Training was carried out over three rounds. The initial settings included a learning rate of 0.00002, a batch size of 8, and 4 epochs. In subsequent rounds, these parameters were adjusted to 0.0003, 8, and 12 respectively to further refine model performance. A linear decay learning rate schedule was also applied to enhance model learning efficiency over time.
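As a rough sketch, these hyperparameters could be expressed with `Seq2SeqTrainingArguments` from `transformers`; the output directory and any argument not mentioned above are assumptions, and the actual training script is not part of this card:

```python
from transformers import Seq2SeqTrainingArguments

# Sketch of the reported first-round hyperparameters; later rounds used
# learning_rate=3e-4 and num_train_epochs=12. output_dir is hypothetical.
training_args = Seq2SeqTrainingArguments(
    output_dir="bart_summarizer_model",
    learning_rate=2e-5,                 # 0.00002
    per_device_train_batch_size=8,
    num_train_epochs=4,
    lr_scheduler_type="linear",         # linear learning-rate decay
    predict_with_generate=True,
)
```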
Training Results
Model performance was evaluated using the ROUGE metric, demonstrating its ability to generate summaries that closely match human-written abstracts.
| Property | Details |
|----------|---------|
| Evaluation Loss (Eval Loss) | 1.9244 |
| ROUGE-1 | 0.5007 |
| ROUGE-2 | 0.2704 |
| ROUGE-L | 0.3627 |
| ROUGE-Lsum | 0.3636 |
| Average Generation Length (Gen Len) | 122.1489 |
| Runtime (seconds) | 1459.3826 |
| Samples per Second | 1.312 |
| Steps per Second | 0.164 |
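For reference, ROUGE scores of this kind can be computed with the `evaluate` library; the snippet below is a generic sketch with placeholder data, not the evaluation script behind the numbers above:

```python
import evaluate

# Generic ROUGE sketch; the texts below are placeholders, not the evaluation data.
rouge = evaluate.load("rouge")
scores = rouge.compute(
    predictions=["generated summary text goes here"],
    references=["the corresponding human-written abstract"],
)
print(scores)  # rouge1, rouge2, rougeL, rougeLsum
```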
License
This project is licensed under the MIT license.
Citation
BibTeX:
@article{kipper_t5_summarizer,
// SOON
}
👥 Authors
This model card was written by Fernanda Kipper