Few-Shot XSUM BART Model
This model, fewshot-xsum-bart, is a few-shot learning variant of the facebook/bart-large model. It is designed for abstractive summarization and aims to demonstrate how effective few-shot learning can be when only limited labeled data is available.
Quick Start
To use this model for abstractive summarization, follow the example below:
```python
from transformers import pipeline

# Load the few-shot summarization pipeline from the Hugging Face Hub
summarizer = pipeline("summarization", model="bhargavis/fewshot-xsum-bart")

input_text = """
Authorities have issued a warning after multiple sightings of a large brown bear in the woods. The bear is known to become aggressive if disturbed, and residents are urged to exercise caution. Last week, a group of hikers reported a close encounter with the animal. While no injuries were sustained, the bear displayed defensive behavior when approached. Wildlife officials advise keeping a safe distance and avoiding the area if possible. Those encountering the bear should remain calm, back away slowly, and refrain from making sudden movements. Officials continue to monitor the situation.
"""

# Generate an abstractive summary (greedy decoding, 30-64 tokens)
summary = summarizer(input_text, max_length=64, min_length=30, do_sample=False)
print(summary[0]["summary_text"])
```
⨠Features
- Few-Shot Learning: Trained on a very small subset of the XSUM dataset (100 training samples and 50 validation samples), demonstrating the potential of few-shot learning in summarization tasks.
- Abstractive Summarization: Capable of generating abstractive summaries for input text.
Installation
There are no model-specific installation steps; installing the Hugging Face transformers library with a PyTorch backend (for example, pip install transformers torch) is sufficient to run the pipeline shown above.
Documentation
Model Description
- Model Name: fewshot-xsum-bart
- Base Model: facebook/bart-large
- Task: Summarization (Few-Shot Learning)
- Dataset: XSUM (Extreme Summarization Dataset)
- Few-Shot Setup: Trained on 100 samples from the XSUM training set and validated on 50 samples from the XSUM validation set.
- This model is a few-shot learning variant of the BART-large model, fine-tuned on a very small subset of the XSUM dataset.
- The purpose of this model is to demonstrate the effectiveness of few-shot learning in summarization tasks where only a limited amount of labeled data is available.
Purpose
The goal of this model is to explore how well a large pre-trained language model like BART can perform on abstractive summarization when fine-tuned with very limited data (few-shot learning). By training on only 100 samples and validating on 50 samples, this model serves as a proof of concept for few-shot summarization tasks.
- Training Set: 100 samples (randomly selected from the XSUM training set).
- Validation Set: 50 samples (randomly selected from the XSUM validation set).
The small dataset size is intentional, as the focus is on few-shot learning rather than large-scale training.
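As a rough illustration of how such a subset could be drawn, the sketch below samples 100 training and 50 validation examples with the datasets library. The dataset identifier (EdinburghNLP/xsum) and the fixed seed are assumptions for illustration, not details taken from the original training run.

```python
from datasets import load_dataset

# Load XSUM: BBC articles ("document") paired with one-sentence summaries ("summary").
# The dataset identifier and seed below are illustrative assumptions.
xsum = load_dataset("EdinburghNLP/xsum")

train_subset = xsum["train"].shuffle(seed=42).select(range(100))    # 100 training samples
val_subset = xsum["validation"].shuffle(seed=42).select(range(50))  # 50 validation samples

print(train_subset)
print(val_subset)
```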
Fine-Tuning Details
- Base Model: facebook/bart-large (pre-trained on large corpora).
- Fine-Tuning Parameters:
- Epochs: 3
- Batch Size: 8
- Learning Rate: 5e-5
- Max Input Length: 512 tokens
- Max Output Length: 64 tokens
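The exact training script is not included in this card, but a minimal sketch of an equivalent fine-tuning run with the transformers Seq2SeqTrainer, using the hyperparameters listed above, might look as follows. The dataset identifier, sampling seed, and output directory are assumptions.

```python
from datasets import load_dataset
from transformers import (
    AutoTokenizer,
    AutoModelForSeq2SeqLM,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainingArguments,
    Seq2SeqTrainer,
)

# Few-shot subsets (dataset identifier and seed are illustrative assumptions)
xsum = load_dataset("EdinburghNLP/xsum")
train_subset = xsum["train"].shuffle(seed=42).select(range(100))
val_subset = xsum["validation"].shuffle(seed=42).select(range(50))

tokenizer = AutoTokenizer.from_pretrained("facebook/bart-large")
model = AutoModelForSeq2SeqLM.from_pretrained("facebook/bart-large")

def preprocess(batch):
    # Truncate articles to 512 input tokens and reference summaries to 64 tokens
    model_inputs = tokenizer(batch["document"], max_length=512, truncation=True)
    labels = tokenizer(text_target=batch["summary"], max_length=64, truncation=True)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

train_data = train_subset.map(preprocess, batched=True, remove_columns=train_subset.column_names)
val_data = val_subset.map(preprocess, batched=True, remove_columns=val_subset.column_names)

args = Seq2SeqTrainingArguments(
    output_dir="fewshot-xsum-bart",      # assumed output directory
    num_train_epochs=3,
    per_device_train_batch_size=8,
    learning_rate=5e-5,
)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=train_data,
    eval_dataset=val_data,
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
```

With only 100 examples and a batch size of 8, an epoch is roughly 13 optimizer steps, so the entire 3-epoch run is on the order of 40 parameter updates.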
Full-Shot Learning Model
For a more general-purpose summarization model, check out the full model trained on the entire XSUM dataset: fulltrain-xsum-bart.
Performance
Due to the few-shot nature of this model, its performance is not directly comparable to models trained on the full XSUM dataset. However, it demonstrates the potential of few-shot learning for summarization tasks. Key metrics on the validation set (50 samples) include:
Few-Shot Learning Model (this model)
- ROUGE Scores:
  - ROUGE-1: 0.3498
  - ROUGE-2: 0.1308
  - ROUGE-L: 0.2745
- BLEU Score: 6.1770
Zero-Shot/Baseline Model
- ROUGE Scores:
  - ROUGE-1: 0.1560
  - ROUGE-2: 0.0174
  - ROUGE-L: 0.1204
- BLEU Score: 0.6167
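The card does not state which metric implementations produced these numbers. One way to compute comparable scores is with the rouge and sacrebleu metrics from the evaluate library, as sketched below; the dataset identifier, sampling seed, and generation settings are assumptions, so the resulting values may differ slightly from those reported above.

```python
import evaluate
from datasets import load_dataset
from transformers import pipeline

# Re-create a 50-sample validation split (dataset identifier and seed are assumptions)
val_subset = load_dataset("EdinburghNLP/xsum")["validation"].shuffle(seed=42).select(range(50))

summarizer = pipeline("summarization", model="bhargavis/fewshot-xsum-bart")

# Generate summaries for each article; long articles are truncated to fit the model
outputs = summarizer(val_subset["document"], max_length=64, min_length=30,
                     do_sample=False, truncation=True)
predictions = [o["summary_text"] for o in outputs]
references = val_subset["summary"]

rouge = evaluate.load("rouge")      # requires the rouge_score package
bleu = evaluate.load("sacrebleu")

print(rouge.compute(predictions=predictions, references=references))
print(bleu.compute(predictions=predictions, references=[[r] for r in references]))
```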
Limitations
- The model is trained on a very small dataset, so its performance may not generalize well to all types of text.
- The purpose of building this model is to compare its performance with the zero-shot and full-shot learning models above.
- It is best suited for tasks where only limited labeled data is available.
- The model is fine-tuned on BBC articles from the XSUM dataset. Its performance may vary on text from other domains.
- The model may overfit to the training data due to the small dataset size.
Citation
If you use this model in your research, please cite it as follows:
```bibtex
@misc{fewshot-xsum-bart,
  author       = {Bhargavi Sriram},
  title        = {Few-Shot Abstractive Summarization with BART-Large},
  year         = {2025},
  publisher    = {Hugging Face},
  howpublished = {\url{https://huggingface.co/bhargavis/fewshot-xsum-bart}},
}
```
License
This model is released under the MIT license.