🚀 PEGASUS for Financial Summarization
This model is fine-tuned on a novel financial news dataset and can effectively summarize financial news. It is based on the PEGASUS architecture, offering high-quality summarization for financial topics such as stocks, markets, and cryptocurrencies.
🚀 Quick Start
How to use
We provide a simple snippet showing how to use this model for financial summarization in PyTorch:

```python
from transformers import PegasusTokenizer, PegasusForConditionalGeneration

model_name = "human-centered-summarization/financial-summarization-pegasus"
tokenizer = PegasusTokenizer.from_pretrained(model_name)
model = PegasusForConditionalGeneration.from_pretrained(model_name)

text_to_summarize = "National Commercial Bank (NCB), Saudi Arabia’s largest lender by assets, agreed to buy rival Samba Financial Group for $15 billion in the biggest banking takeover this year. NCB will pay 28.45 riyals ($7.58) for each Samba share, according to a statement on Sunday, valuing it at about 55.7 billion riyals. NCB will offer 0.739 new shares for each Samba share, at the lower end of the 0.736-0.787 ratio the banks set when they signed an initial framework agreement in June. The offer is a 3.5% premium to Samba’s Oct. 8 closing price of 27.50 riyals and about 24% higher than the level the shares traded at before the talks were made public. Bloomberg News first reported the merger discussions. The new bank will have total assets of more than $220 billion, creating the Gulf region’s third-largest lender. The entity’s $46 billion market capitalization nearly matches that of Qatar National Bank QPSC, which is still the Middle East’s biggest lender with about $268 billion of assets."

input_ids = tokenizer(text_to_summarize, return_tensors="pt").input_ids

output = model.generate(
    input_ids,
    max_length=32,
    num_beams=5,
    early_stopping=True,
)

print(tokenizer.decode(output[0], skip_special_tokens=True))
```
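PEGASUS accepts only a bounded number of input tokens (the tokenizer silently truncates longer inputs), so long articles may need to be split before summarization. Below is a minimal sketch of a sentence-aligned, word-budget chunker; the `chunk_text` helper and the 400-word default are illustrative assumptions, not part of the model's API:

```python
import re

def chunk_text(text: str, max_words: int = 400) -> list[str]:
    """Split text into sentence-aligned chunks of roughly max_words words each."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current, count = [], [], 0
    for sentence in sentences:
        n = len(sentence.split())
        # Flush the current chunk if adding this sentence would exceed the budget.
        if current and count + n > max_words:
            chunks.append(" ".join(current))
            current, count = [], 0
        current.append(sentence)
        count += n
    if current:
        chunks.append(" ".join(current))
    return chunks
```

Each chunk can then be tokenized and summarized separately, and the partial summaries joined.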
✨ Features
- Fine-tuned on Financial News: This model was fine-tuned on a novel financial news dataset of 2K articles from Bloomberg, covering topics such as stocks, markets, currencies, rates, and cryptocurrencies.
- Based on PEGASUS: It builds on the PEGASUS model fine-tuned on the Extreme Summarization (XSum) dataset: google/pegasus-xsum.
- Advanced Version Available: This model serves as a base version. For a more advanced model with significantly enhanced performance, check out our advanced version on Rapid API. The advanced model offers more than a 16% increase in ROUGE scores (similarity to a human-generated summary) compared to our base model.
📚 Documentation
Model Information
| Property | Details |
|----------|---------|
| Model Type | human-centered-summarization/financial-summarization-pegasus |
| Training Data | A novel financial news dataset of 2K articles from Bloomberg on topics such as stocks, markets, currencies, rates, and cryptocurrencies |
Evaluation Results
The results before and after fine-tuning on our dataset are shown below:

| Fine-tuning | R-1 | R-2 | R-L | R-S |
|-------------|-----|-----|-----|-----|
| Yes | 23.55 | 6.99 | 18.14 | 21.36 |
| No | 13.8 | 2.4 | 10.63 | 12.03 |
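For intuition about these metrics: R-1 counts overlapping unigrams between a generated summary and a human reference. A minimal pure-Python sketch of the ROUGE-1 F1 computation (illustrative only; the reported scores were not produced by this snippet):

```python
from collections import Counter

def rouge1_f1(candidate: str, reference: str) -> float:
    """ROUGE-1 F1: harmonic mean of unigram precision and recall."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())  # clipped unigram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)
```

An identical candidate and reference score 1.0; summaries with no shared words score 0.0.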
📄 License
No license information provided in the original document.
📚 Citation
You can find more details about this work in the following workshop paper. If you use our model in your research, please consider citing our paper:
T. Passali, A. Gidiotis, E. Chatzikyriakidis and G. Tsoumakas. 2021.
Towards Human-Centered Summarization: A Case Study on Financial News.
In Proceedings of the First Workshop on Bridging Human-Computer Interaction and Natural Language Processing (pp. 21–27). Association for Computational Linguistics.
BibTeX entry:

```bibtex
@inproceedings{passali-etal-2021-towards,
    title = "Towards Human-Centered Summarization: A Case Study on Financial News",
    author = "Passali, Tatiana and Gidiotis, Alexios and Chatzikyriakidis, Efstathios and Tsoumakas, Grigorios",
    booktitle = "Proceedings of the First Workshop on Bridging Human{--}Computer Interaction and Natural Language Processing",
    month = apr,
    year = "2021",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://www.aclweb.org/anthology/2021.hcinlp-1.4",
    pages = "21--27",
}
```
💡 Usage Tip
If you are interested in a more sophisticated version of the model, trained on more articles and adapted to your needs, contact us at info@medoid.ai!
More information about Medoid AI: