# fulltrain-xsum-bart
This is a BART model fine-tuned on the XSum dataset for abstractive summarization. It generates concise, single-sentence summaries from long documents such as news articles.
## Features
- Architecture: BART (Bidirectional and Auto-Regressive Transformers)
- Task: Abstractive Summarization
- Dataset: XSum (Extreme Summarization)
- Training Hardware: 2x NVIDIA T4 GPUs (using Kaggle)
- Training Time: ~9 hours
## Installation

The model is loaded through the `transformers` library, which you can install with:

```bash
pip install transformers
```

The usage examples below also require a backend such as PyTorch (`pip install torch`).
## Usage Examples

### Basic Usage

```python
from transformers import pipeline

summarizer = pipeline("summarization", model="bhargavis/fulltrain-xsum-bart")

input_text = """
Authorities have issued a warning after multiple sightings of a large brown bear in the woods. The bear is known to become aggressive if disturbed, and residents are urged to exercise caution. Last week, a group of hikers reported a close encounter with the animal. While no injuries were sustained, the bear displayed defensive behavior when approached. Wildlife officials advise keeping a safe distance and avoiding the area if possible. Those encountering the bear should remain calm, back away slowly, and refrain from making sudden movements. Officials continue to monitor the situation.
"""

summary = summarizer(input_text, max_length=64, min_length=30, do_sample=False)
print(summary[0]["summary_text"])
```
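If you need more control over generation, the same checkpoint can be loaded directly with a tokenizer and model. This is a minimal sketch; the beam-search settings (`num_beams=4`, `early_stopping=True`) are illustrative assumptions, not the model's published defaults.

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_name = "bhargavis/fulltrain-xsum-bart"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

# Truncate long articles to BART's 1024-token input limit
inputs = tokenizer(input_text, max_length=1024, truncation=True, return_tensors="pt")

# Beam-search settings below are illustrative assumptions
summary_ids = model.generate(
    **inputs,
    max_length=64,
    min_length=30,
    num_beams=4,
    early_stopping=True,
)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```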
## Documentation

### Dataset Details

| Split | Samples |
| --- | --- |
| Train | 204,045 |
| Validation | 11,332 |
| Test | 11,334 |
The XSum dataset consists of BBC articles and their corresponding single-sentence summaries. The model was trained to generate summaries that are concise and capture the essence of the input document.
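For reference, the splits above can be inspected with the `datasets` library. A minimal sketch, assuming the Hub identifier `EdinburghNLP/xsum` (the legacy `xsum` name may also work depending on your `datasets` version):

```python
from datasets import load_dataset

# XSum stores articles under "document" and single-sentence summaries under "summary"
dataset = load_dataset("EdinburghNLP/xsum")

print(dataset)                                # train / validation / test splits
print(dataset["train"][0]["document"][:200])  # start of a BBC article
print(dataset["train"][0]["summary"])         # its one-sentence reference summary
```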
### Training Details

| Training Parameter | Value |
| --- | --- |
| Training Epochs | 1 |
| Batch Size | 8 (per device) |
| Learning Rate | 5e-5 |
| Weight Decay | 0.01 |
| Warmup Steps | 500 |
| FP16 Training | Enabled |
| Evaluation Strategy | Per Epoch |
| Best Model Selection | Based on validation loss (`eval_loss`) |
### Evaluation Metrics

| Metric | Score |
| --- | --- |
| Training Loss | 0.3771 |
| Validation Loss | 0.3504 |
| ROUGE-1 | 0.4013 |
| ROUGE-2 | 0.1881 |
| ROUGE-L | 0.3346 |
ROUGE scores were computed with the `rouge_scorer` module from the `rouge-score` package.
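To score your own outputs the same way, here is a minimal sketch using `rouge_scorer` (installed with `pip install rouge-score`); the reference and prediction sentences are hypothetical examples, not taken from XSum:

```python
from rouge_score import rouge_scorer

scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"], use_stemmer=True)

# Hypothetical reference/prediction pair for illustration only
reference = "Officials have warned residents to stay away after sightings of an aggressive brown bear."
prediction = "Wildlife officials have warned people to avoid a brown bear seen in local woods."

scores = scorer.score(reference, prediction)
for name, score in scores.items():
    print(f"{name}: precision={score.precision:.3f}, recall={score.recall:.3f}, f1={score.fmeasure:.3f}")
```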
### Training Arguments

| Argument | Value |
| --- | --- |
| Save Strategy | Per Epoch |
| Logging Steps | 1000 |
| Dataloader Workers | 4 |
| Predict with Generate | True |
| Load Best Model at End | True |
| Metric for Best Model | `eval_loss` |
| Greater is Better | False (lower validation loss is better) |
| Report To | Weights & Biases (WandB) |
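The settings in the two tables above map onto `Seq2SeqTrainingArguments` in `transformers`. The sketch below reconstructs a plausible training setup under stated assumptions: the base checkpoint (`facebook/bart-large`), the tokenization lengths, and the evaluation batch size are guesses rather than the exact published training script.

```python
from datasets import load_dataset
from transformers import (
    AutoModelForSeq2SeqLM,
    AutoTokenizer,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

base_checkpoint = "facebook/bart-large"  # assumed base model; not confirmed by this card
tokenizer = AutoTokenizer.from_pretrained(base_checkpoint)
model = AutoModelForSeq2SeqLM.from_pretrained(base_checkpoint)

def preprocess(batch):
    # Input/target lengths are assumptions for illustration
    model_inputs = tokenizer(batch["document"], max_length=1024, truncation=True)
    labels = tokenizer(text_target=batch["summary"], max_length=64, truncation=True)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

raw = load_dataset("EdinburghNLP/xsum")
tokenized = raw.map(preprocess, batched=True, remove_columns=raw["train"].column_names)

training_args = Seq2SeqTrainingArguments(
    output_dir="fulltrain-xsum-bart",
    num_train_epochs=1,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,   # assumed equal to the train batch size
    learning_rate=5e-5,
    weight_decay=0.01,
    warmup_steps=500,
    fp16=True,
    eval_strategy="epoch",          # `evaluation_strategy` on older transformers releases
    save_strategy="epoch",
    logging_steps=1000,
    dataloader_num_workers=4,
    predict_with_generate=True,
    load_best_model_at_end=True,
    metric_for_best_model="eval_loss",
    greater_is_better=False,
    report_to="wandb",
)

trainer = Seq2SeqTrainer(
    model=model,
    args=training_args,
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["validation"],
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
    tokenizer=tokenizer,
)
trainer.train()
```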
## Technical Details
The model fine-tunes BART, a pre-trained sequence-to-sequence transformer, on the XSum dataset for abstractive summarization. Training adjusts the model's weights to minimize the loss on the training data, and the checkpoint with the lowest validation loss is kept; ROUGE scores on the validation set are reported as a measure of summary quality.
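As a concrete illustration of the training objective, the sequence-to-sequence cross-entropy loss for a single document/summary pair can be read directly from the model outputs. This is a sketch for intuition only; the target summary below is hypothetical, not an XSum reference.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_name = "bhargavis/fulltrain-xsum-bart"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

document = "Authorities have issued a warning after multiple sightings of a large brown bear in the woods."
target_summary = "Residents have been warned to avoid a brown bear spotted in nearby woods."  # hypothetical target

inputs = tokenizer(document, max_length=1024, truncation=True, return_tensors="pt")
labels = tokenizer(text_target=target_summary, max_length=64, truncation=True, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs, labels=labels["input_ids"])

# During fine-tuning this loss is minimized over the whole training set
print(f"Cross-entropy loss for this pair: {outputs.loss.item():.4f}")
```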
## License
This project is licensed under the MIT License.