# FinTwitBERT
FinTwitBERT is a language model pre-trained on a large dataset of financial tweets. It captures the unique jargon and communication style of financial Twitter, making it well suited for sentiment analysis, trend prediction, and other financial NLP tasks.
## Features
### Sentiment Analysis
The [FinTwitBERT-sentiment](https://huggingface.co/StephanAkkerman/FinTwitBERT-sentiment) model builds on FinTwitBERT to classify the sentiment of financial tweets, offering nuanced insights into prevailing market sentiment.
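As a sketch, the sentiment model can be loaded with the standard `transformers` pipeline (the example tweet is illustrative, and the exact output labels depend on the model's configuration):

```python
from transformers import pipeline

# Load the FinTwitBERT-sentiment classifier from the Hugging Face Hub
sentiment = pipeline(
    "text-classification",
    model="StephanAkkerman/FinTwitBERT-sentiment",
)

# Each result is a dict with a predicted label and a confidence score
print(sentiment("Big gains for $BTC today"))
```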
### Dataset
FinTwitBERT is pre-trained on several datasets of financial tweets mentioning stocks and cryptocurrencies:
- [StephanAkkerman/crypto-stock-tweets](https://huggingface.co/datasets/StephanAkkerman/crypto-stock-tweets): 8,024,269 tweets
- [StephanAkkerman/stock-market-tweets-data](https://huggingface.co/datasets/StephanAkkerman/stock-market-tweets-data): 923,673 tweets
- [StephanAkkerman/financial-tweets](https://huggingface.co/datasets/StephanAkkerman/financial-tweets): 263,119 tweets
### Model Details
Based on the [FinBERT](https://huggingface.co/yiyanghkust/finbert-pretrain) model and tokenizer, FinTwitBERT adds two special mask tokens, `@USER` and `[URL]`, to handle user handles and links, which are common in tweets. The model was pre-trained for 10 epochs, with early stopping to prevent overfitting.
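Raw tweets can be normalized to use these masks before inference. A minimal preprocessing sketch (the exact normalization used during training is not specified here, so the regex patterns are illustrative assumptions):

```python
import re

def normalize_tweet(text: str) -> str:
    """Replace user handles and links with the model's special tokens.

    The @USER and [URL] tokens come from the model card; the regex
    patterns themselves are illustrative assumptions.
    """
    text = re.sub(r"https?://\S+", "[URL]", text)  # links -> [URL]
    text = re.sub(r"@\w+", "@USER", text)          # handles -> @USER
    return text

print(normalize_tweet("@elonmusk $TSLA to the moon! https://example.com"))
# -> @USER $TSLA to the moon! [URL]
```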
### More Information
For a comprehensive overview, including the complete training setup, visit the [FinTwitBERT GitHub repository](https://github.com/TimKoornstra/FinTwitBERT).
## Usage Examples
### Basic Usage
```python
from transformers import pipeline

# Load the fill-mask pipeline with the FinTwitBERT model
pipe = pipeline(
    "fill-mask",
    model="StephanAkkerman/FinTwitBERT",
)

# Predict the most likely tokens for the [MASK] position
print(pipe("Bitcoin is a [MASK] coin."))
```
## Documentation
### Citing & Authors
If you use FinTwitBERT or FinTwitBERT - sentiment in your research, please cite us as follows, noting that both authors contributed equally to this work:
```bibtex
@misc{FinTwitBERT,
  author       = {Stephan Akkerman and Tim Koornstra},
  title        = {FinTwitBERT: A Specialized Language Model for Financial Tweets},
  year         = {2023},
  publisher    = {GitHub},
  journal      = {GitHub repository},
  howpublished = {\url{https://github.com/TimKoornstra/FinTwitBERT}}
}
```
Additionally, if you utilize the sentiment classifier, please cite:
```bibtex
@misc{FinTwitBERT-sentiment,
  author       = {Stephan Akkerman and Tim Koornstra},
  title        = {FinTwitBERT-sentiment: A Sentiment Classifier for Financial Tweets},
  year         = {2023},
  publisher    = {Hugging Face},
  howpublished = {\url{https://huggingface.co/StephanAkkerman/FinTwitBERT-sentiment}}
}
```
## License
This project is licensed under the MIT License. See the LICENSE file for details.
## Model Overview

| Property | Details |
|----------|---------|
| Model Type | FinTwitBERT, a specialized BERT model for financial tweets |
| Training Data | [StephanAkkerman/stock-market-tweets-data](https://huggingface.co/datasets/StephanAkkerman/stock-market-tweets-data), [StephanAkkerman/financial-tweets](https://huggingface.co/datasets/StephanAkkerman/financial-tweets), [StephanAkkerman/crypto-stock-tweets](https://huggingface.co/datasets/StephanAkkerman/crypto-stock-tweets) |
| Metrics | Perplexity |
| Base Model | [yiyanghkust/finbert-pretrain](https://huggingface.co/yiyanghkust/finbert-pretrain) |