# LED-FINAL-GENAI15: Financial Document Summarization Model
This model, based on the LED architecture, is fine-tuned specifically for summarizing long financial documents. It accepts inputs of up to 8000 tokens while preserving essential content and coherence in its summaries.
## Quick Start
This model is designed for summarizing long financial documents. You can start using it with either a simple pipeline or a custom global attention mask setup.
## Usage Examples

### Basic Usage
```python
import torch
from transformers import pipeline

hf_name = 'fahil2631/led-financial_summarization-genai15'

summarizer = pipeline(
    "summarization",
    model=hf_name,
    tokenizer=hf_name,
    device=0 if torch.cuda.is_available() else -1,
)

wall_of_text = """Your long financial text goes here."""

result = summarizer(
    wall_of_text,
    min_length=16,
    max_length=256,
    no_repeat_ngram_size=3,
    encoder_no_repeat_ngram_size=3,
    repetition_penalty=2.5,
    num_beams=4,
    early_stopping=True,
)
print(result[0]["summary_text"])
```
### Advanced Usage
```python
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

hf_name = 'fahil2631/led-financial_summarization-genai15'
device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained(hf_name)
model_1 = AutoModelForSeq2SeqLM.from_pretrained(hf_name).to(device)

wall_of_text = """Your long financial text goes here."""

inputs = tokenizer(
    wall_of_text,
    return_tensors="pt",
    truncation=True,
    max_length=8000,
)

# LED uses windowed local attention by default; tokens flagged with 1 in the
# global attention mask attend to (and are attended by) the whole sequence.
global_attention_mask = torch.zeros(inputs["input_ids"].shape, dtype=torch.long)
global_attention_mask[:, 0] = 1   # global attention on the first token
global_attention_mask[:, -1] = 1  # and on the last token

summary_ids_1 = model_1.generate(
    inputs["input_ids"].to(device),
    attention_mask=inputs["attention_mask"].to(device),
    global_attention_mask=global_attention_mask.to(device),
    max_length=256,
    min_length=16,
    num_beams=4,
    repetition_penalty=2.5,
    no_repeat_ngram_size=3,
    early_stopping=True,
)

result_globalmask_pretrained = tokenizer.decode(summary_ids_1[0], skip_special_tokens=True)
print(result_globalmask_pretrained)
```
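The global attention mask is what lets LED scale to long inputs: most tokens attend only within a local window, while positions marked with 1 act as global tokens that see, and are seen by, the entire sequence. Marking the first token as global is the standard LED setup; this example also marks the last token.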
## Features
- Accurate Summarization: Summarizes long financial documents of up to 8000 tokens while maintaining essential content and coherence (a token-count sketch follows this list).
- Based on Proven Architecture: Fine-tuned from the `pszemraj/led-large-book-summary` model, leveraging the LED architecture designed for long-document handling.
- Consistent Summaries: Trained on summaries primarily generated by ChatGPT (70% of the dataset), ensuring consistency in style and format.
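Since inputs longer than the 8000-token window are truncated, it can be useful to check a document's token count before summarizing. Below is a minimal sketch (not part of the original card) using the model's own tokenizer:

```python
from transformers import AutoTokenizer

hf_name = 'fahil2631/led-financial_summarization-genai15'
tokenizer = AutoTokenizer.from_pretrained(hf_name)

document = """Your long financial text goes here."""
n_tokens = len(tokenizer(document)["input_ids"])

# Anything beyond 8000 tokens is cut off when truncation=True is used.
if n_tokens > 8000:
    print(f"{n_tokens} tokens: the document will be truncated to 8000.")
else:
    print(f"{n_tokens} tokens: the document fits in a single pass.")
```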
## Documentation

### Model Details

#### Model Description
`fahil2631/led-financial_summarization-genai15`, also known as LED-FINAL-GENAI15, is a fine-tuned version of the `pszemraj/led-large-book-summary` model for financial summarization. It was developed by GenAI Group 15 (2024/2025) at Warwick Business School.

The model was trained on the `kritsadaK/EDGAR-CORPUS-Financial-Summarization` dataset, which contains long-form financial texts such as 10-K filings from EDGAR (1993–2020). Summaries were mainly generated by ChatGPT (70%).
- Developed by: GenAI Group 15 2024/2025, Warwick Business School
- Fine-tuned from: `pszemraj/led-large-book-summary`
- Task: Abstractive summarization (financial domain)
- Language(s): English
#### Model Sources

- [Pretrained base model](https://huggingface.co/pszemraj/led-large-book-summary)
- [Dataset source](https://huggingface.co/datasets/kritsadaK/EDGAR-CORPUS-Financial-Summarization)
### Intended Uses

This model is designed for summarizing long financial documents, such as quarterly and annual reports, and for generating executive summaries of financial filings.
### Training Details

#### Training Data
The model was trained on a filtered subset of the [kritsadaK/EDGAR-CORPUS-Financial-Summarization](https://huggingface.co/datasets/kritsadaK/EDGAR-CORPUS-Financial-Summarization) dataset, which contains financial reports (mostly 10-K filings) from 1993 to 2020. Only ChatGPT-generated summaries (about 70% of the dataset) were used for training.

- Total samples used: 6,664 (ChatGPT only)
- Train: 5,331
- Validation: 666
- Test: 667
- Input fields: `input` (original financial document), `summary` (target text), `model` (summary generator)
- Filtering criterion: `model == "ChatGPT"`
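The preprocessing script itself is not published in the card; the sketch below shows one way to reproduce the ChatGPT-only filter with the `datasets` library. The split proportions and seed are assumptions chosen to approximate the sample counts above:

```python
from datasets import load_dataset

# Assumes the dataset exposes a single "train" split with the fields listed above.
ds = load_dataset("kritsadaK/EDGAR-CORPUS-Financial-Summarization", split="train")

# Keep only rows whose reference summary was generated by ChatGPT (~70%).
chatgpt_only = ds.filter(lambda row: row["model"] == "ChatGPT")

# Approximate 80/10/10 split (the card reports 5,331 / 666 / 667 samples).
split_1 = chatgpt_only.train_test_split(test_size=0.2, seed=42)
split_2 = split_1["test"].train_test_split(test_size=0.5, seed=42)
train_set, val_set, test_set = split_1["train"], split_2["train"], split_2["test"]
print(len(train_set), len(val_set), len(test_set))
```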
#### Training Procedure
- Fine-Tuning Dataset: EDGAR-CORPUS-Financial-Summarization
- Training Batch Size: 1 (with gradient accumulation)
- Training Epochs: 3
- Optimizer: AdamW with 8-bit precision
- Learning Rate: 3e-5
- Evaluation: every 500 steps
- Checkpoints Saved: every 1000 steps
- GPU: NVIDIA L4
#### Training Hyperparameters
- Training regime: FP16 mixed precision
- Batch size: 1 (gradient accumulation steps = 2, effective batch size = 2)
- Learning rate: 3e-5
- Epochs: 3
- Optimizer: AdamW (8-bit via `bitsandbytes`)
- Evaluation steps: every 500 steps
- Checkpointing: every 1000 steps
- Max input length: 8000 tokens
- Max target length: 256 tokens
- Beam search: 4 beams
- Repetition penalty: 2.5
- No-repeat n-gram size: 3
- Global attention mask: enabled on the first token
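The original training script is not included in the card, but the hyperparameters above map naturally onto the `transformers` `Seq2SeqTrainingArguments` API. A hedged reconstruction (argument names may differ slightly across library versions):

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="led-financial_summarization-genai15",
    per_device_train_batch_size=1,   # effective batch size 2 via accumulation
    gradient_accumulation_steps=2,
    learning_rate=3e-5,
    num_train_epochs=3,
    fp16=True,                       # FP16 mixed precision
    optim="adamw_bnb_8bit",          # 8-bit AdamW via bitsandbytes
    evaluation_strategy="steps",
    eval_steps=500,
    save_steps=1000,
    predict_with_generate=True,
    generation_max_length=256,       # max target length
    generation_num_beams=4,
)
```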
#### Speeds, Sizes, Times
- GPU used: NVIDIA L4
- Training runtime: ~2.5 hours per 1000 steps (7995 steps total)
- Training throughput: ~1.68 samples/sec
- Checkpoint size: ~1.84 GB (`.safetensors`)
- Saved model size: ~1.84 GB
### Evaluation

#### Metrics
The model was evaluated using standard ROUGE metrics:
- ROUGE-1: Measures overlap of unigrams (individual words) between the system and reference summaries.
- ROUGE-2: Measures overlap of bigrams (two consecutive words).
- ROUGE-L: Measures the longest common subsequence between the system and reference summaries.
- ROUGE-Lsum: A variant of ROUGE-L for multi-sentence summaries.
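The exact scoring code is not given in the card; a minimal sketch with the `evaluate` library (the prediction and reference strings below are placeholders):

```python
import evaluate

rouge = evaluate.load("rouge")

predictions = ["The company reported higher revenue and stable margins in fiscal 2020."]
references = ["Revenue grew in fiscal 2020 while margins held steady."]

# Returns rouge1, rouge2, rougeL, and rougeLsum F-scores.
scores = rouge.compute(predictions=predictions, references=references)
print(scores)
```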
#### Evaluation Results
The following results were obtained on a set of 20 randomly selected samples from the test set:
| Model | ROUGE-1 | ROUGE-2 | ROUGE-L | ROUGE-Lsum |
|-------|---------|---------|---------|------------|
| led-financial_summarization-genai15 | 0.5121 | 0.2089 | 0.2987 | 0.4359 |
| BART-financial-summarization | 0.4574 | 0.1976 | 0.2728 | 0.3876 |
| LED-large-book-summary | 0.3066 | 0.0470 | 0.1391 | 0.2128 |
#### Summary

`led-financial_summarization-genai15` outperformed both the BART-based and base LED models across all ROUGE metrics, demonstrating its effectiveness for financial document summarization.