# fulltrain-xsum-bart
This is a BART model fine-tuned on the XSum dataset for abstractive summarization. It generates concise, single-sentence summaries from long documents such as news articles.
## Features
- Architecture: BART (Bidirectional and Auto-Regressive Transformers)
- Task: Abstractive Summarization
- Dataset: XSum (Extreme Summarization)
- Training Hardware: 2x NVIDIA T4 GPUs (using Kaggle)
- Training Time: ~9 hours
## Installation

The model is loaded through the `transformers` library, which you can install with:

```bash
pip install transformers
```

The usage examples below also require a backend such as PyTorch (`pip install torch`).
## Usage Examples

### Basic Usage

```python
from transformers import pipeline

summarizer = pipeline("summarization", model="bhargavis/fulltrain-xsum-bart")

input_text = """
Authorities have issued a warning after multiple sightings of a large brown bear in the woods. The bear is known to become aggressive if disturbed, and residents are urged to exercise caution. Last week, a group of hikers reported a close encounter with the animal. While no injuries were sustained, the bear displayed defensive behavior when approached. Wildlife officials advise keeping a safe distance and avoiding the area if possible. Those encountering the bear should remain calm, back away slowly, and refrain from making sudden movements. Officials continue to monitor the situation.
"""

summary = summarizer(input_text, max_length=64, min_length=30, do_sample=False)
print(summary[0]["summary_text"])
```
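If you need more control over generation, the same checkpoint can be loaded directly with a tokenizer and model. This is a minimal sketch; the beam-search settings (`num_beams=4`, `early_stopping=True`) are illustrative assumptions, not the model's published defaults.

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_name = "bhargavis/fulltrain-xsum-bart"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

# Truncate long articles to BART's 1024-token input limit
inputs = tokenizer(input_text, max_length=1024, truncation=True, return_tensors="pt")

# Beam-search settings below are illustrative assumptions
summary_ids = model.generate(
    **inputs,
    max_length=64,
    min_length=30,
    num_beams=4,
    early_stopping=True,
)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```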
## Documentation

### Dataset Details

| Split | Samples |
| --- | --- |
| Train | 204,045 |
| Validation | 11,332 |
| Test | 11,334 |
The XSum dataset consists of BBC articles and their corresponding single-sentence summaries. The model was trained to generate summaries that are concise and capture the essence of the input document.
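For reference, the splits above can be inspected with the `datasets` library. A minimal sketch, assuming the Hub identifier `EdinburghNLP/xsum` (the legacy `xsum` name may also work depending on your `datasets` version):

```python
from datasets import load_dataset

# XSum stores articles under "document" and single-sentence summaries under "summary"
dataset = load_dataset("EdinburghNLP/xsum")

print(dataset)                                # train / validation / test splits
print(dataset["train"][0]["document"][:200])  # start of a BBC article
print(dataset["train"][0]["summary"])         # its one-sentence reference summary
```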
### Training Details

| Training Parameter | Value |
| --- | --- |
| Training Epochs | 1 |
| Batch Size | 8 (per device) |
| Learning Rate | 5e-5 |
| Weight Decay | 0.01 |
| Warmup Steps | 500 |
| FP16 Training | Enabled |
| Evaluation Strategy | Per Epoch |
| Best Model Selection | Based on validation loss (`eval_loss`) |
### Evaluation Metrics

| Metric | Score |
| --- | --- |
| Training Loss | 0.3771 |
| Validation Loss | 0.3504 |
| ROUGE-1 | 0.4013 |
| ROUGE-2 | 0.1881 |
| ROUGE-L | 0.3346 |
ROUGE scores were computed with the `rouge_scorer` module from the `rouge-score` package.
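To score your own outputs the same way, here is a minimal sketch using `rouge_scorer` (installed with `pip install rouge-score`); the reference and prediction sentences are hypothetical examples, not taken from XSum:

```python
from rouge_score import rouge_scorer

scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"], use_stemmer=True)

# Hypothetical reference/prediction pair for illustration only
reference = "Officials have warned residents to stay away after sightings of an aggressive brown bear."
prediction = "Wildlife officials have warned people to avoid a brown bear seen in local woods."

scores = scorer.score(reference, prediction)
for name, score in scores.items():
    print(f"{name}: precision={score.precision:.3f}, recall={score.recall:.3f}, f1={score.fmeasure:.3f}")
```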
### Training Arguments

| Argument | Value |
| --- | --- |
| Save Strategy | Per Epoch |
| Logging Steps | 1000 |
| Dataloader Workers | 4 |
| Predict with Generate | True |
| Load Best Model at End | True |
| Metric for Best Model | `eval_loss` |
| Greater is Better | False (lower validation loss is better) |
| Report To | Weights & Biases (WandB) |
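The settings in the two tables above map onto `Seq2SeqTrainingArguments` in `transformers`. The sketch below reconstructs a plausible training setup under stated assumptions: the base checkpoint (`facebook/bart-large`), the tokenization lengths, and the evaluation batch size are guesses rather than the exact published training script.

```python
from datasets import load_dataset
from transformers import (
    AutoModelForSeq2SeqLM,
    AutoTokenizer,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

base_checkpoint = "facebook/bart-large"  # assumed base model; not confirmed by this card
tokenizer = AutoTokenizer.from_pretrained(base_checkpoint)
model = AutoModelForSeq2SeqLM.from_pretrained(base_checkpoint)

def preprocess(batch):
    # Input/target lengths are assumptions for illustration
    model_inputs = tokenizer(batch["document"], max_length=1024, truncation=True)
    labels = tokenizer(text_target=batch["summary"], max_length=64, truncation=True)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

raw = load_dataset("EdinburghNLP/xsum")
tokenized = raw.map(preprocess, batched=True, remove_columns=raw["train"].column_names)

training_args = Seq2SeqTrainingArguments(
    output_dir="fulltrain-xsum-bart",
    num_train_epochs=1,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,   # assumed equal to the train batch size
    learning_rate=5e-5,
    weight_decay=0.01,
    warmup_steps=500,
    fp16=True,
    eval_strategy="epoch",          # `evaluation_strategy` on older transformers releases
    save_strategy="epoch",
    logging_steps=1000,
    dataloader_num_workers=4,
    predict_with_generate=True,
    load_best_model_at_end=True,
    metric_for_best_model="eval_loss",
    greater_is_better=False,
    report_to="wandb",
)

trainer = Seq2SeqTrainer(
    model=model,
    args=training_args,
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["validation"],
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
    tokenizer=tokenizer,
)
trainer.train()
```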
## Technical Details
The model fine-tunes BART, a pre-trained sequence-to-sequence transformer, on the XSum dataset for abstractive summarization. Training adjusts the model's weights to minimize the loss on the training data, and the checkpoint with the lowest validation loss is kept; ROUGE scores on the validation set are reported as a measure of summary quality.
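As a concrete illustration of the training objective, the sequence-to-sequence cross-entropy loss for a single document/summary pair can be read directly from the model outputs. This is a sketch for intuition only; the target summary below is hypothetical, not an XSum reference.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_name = "bhargavis/fulltrain-xsum-bart"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

document = "Authorities have issued a warning after multiple sightings of a large brown bear in the woods."
target_summary = "Residents have been warned to avoid a brown bear spotted in nearby woods."  # hypothetical target

inputs = tokenizer(document, max_length=1024, truncation=True, return_tensors="pt")
labels = tokenizer(text_target=target_summary, max_length=64, truncation=True, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs, labels=labels["input_ids"])

# During fine-tuning this loss is minimized over the whole training set
print(f"Cross-entropy loss for this pair: {outputs.loss.item():.4f}")
```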
## License
This project is licensed under the MIT License.