🚀 Swedish BERT Models for Sentiment Analysis
Recorded Future, in collaboration with AI Sweden, presents two Swedish language models tailored for sentiment analysis. These models are built on the KB/bert-base-swedish-cased model and fine-tuned for multi-label sentiment analysis tasks.
These models have been optimized for detecting sentiments related to fear and violence. Each model outputs three floating-point numbers corresponding to the labels "Negative", "Weak sentiment", and "Strong Sentiment", at indices 0, 1, and 2 respectively.
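As a minimal sketch of how the three outputs line up with the labels, the helper below pairs each value with its label name by index. `LABELS` and `name_scores` are illustrative names, not part of the models or of the transformers library; the label order is taken from the description above.

```python
# Label order as described above: index 0, 1, 2.
LABELS = ("Negative", "Weak sentiment", "Strong Sentiment")


def name_scores(scores):
    """Pair each of the three model outputs with its label name.

    `scores` is any sequence of three numbers, for example one row of
    the model output converted to a plain Python list.
    """
    if len(scores) != len(LABELS):
        raise ValueError(f"expected {len(LABELS)} scores, got {len(scores)}")
    return dict(zip(LABELS, (float(s) for s in scores)))


if __name__ == "__main__":
    # Made-up values, only to show the shape of the result.
    print(name_scores([0.7, 0.2, 0.1]))
```

With a real model, the input could be something like `outputs.logits[0].tolist()`, assuming a single input text.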
The models were trained on Swedish conversational data gathered from diverse internet sources and forums. They are trained exclusively on Swedish data and support inference only for Swedish input texts. Inference metrics for non-Swedish inputs are undefined, and such inputs are regarded as out-of-domain data.
The current models are supported with Transformers version >= 4.3.3 and Torch version 1.8.0. Compatibility with older versions has not been verified.
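A quick way to check an environment against these versions might look like the sketch below. It uses only the standard library (`importlib.metadata`, Python 3.8+). The helper names are hypothetical, and treating Torch 1.8.0 as a minimum rather than an exact pin is an assumption, since only that one version is stated.

```python
from importlib.metadata import PackageNotFoundError, version


def parse_version(text):
    """Turn a string such as '1.8.0+cu111' into (1, 8, 0).

    Local build tags and pre-release suffixes are ignored, and the
    result is padded to three components.
    """
    parts = []
    for piece in text.split("+")[0].split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Return True if the installed version is at least the minimum."""
    return parse_version(installed) >= parse_version(minimum)


def check_environment():
    """Report True/False per package, or None if it is not installed."""
    report = {}
    # Assumption: both versions are treated as lower bounds.
    for package, minimum in (("transformers", "4.3.3"), ("torch", "1.8.0")):
        try:
            report[package] = meets_minimum(version(package), minimum)
        except PackageNotFoundError:
            report[package] = None
    return report


if __name__ == "__main__":
    print(check_environment())
```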
✨ Features
- Sentiment Focus: Fine-tuned for fear and violence sentiment analysis.
- Swedish-Specific: Trained solely on Swedish data for accurate Swedish text inference.
- Multi-Label Output: Provides three-label sentiment output.
📦 Installation
The models can be imported from the transformers library.
Swedish-Sentiment-Fear
from transformers import BertForSequenceClassification, BertTokenizerFast
tokenizer = BertTokenizerFast.from_pretrained("RecordedFuture/Swedish-Sentiment-Fear")
classifier_fear = BertForSequenceClassification.from_pretrained("RecordedFuture/Swedish-Sentiment-Fear")
Swedish-Sentiment-Violence
from transformers import BertForSequenceClassification, BertTokenizerFast
tokenizer = BertTokenizerFast.from_pretrained("RecordedFuture/Swedish-Sentiment-Violence")
classifier_violence = BertForSequenceClassification.from_pretrained("RecordedFuture/Swedish-Sentiment-Violence")
💻 Usage Examples
Basic Usage
After initializing the model and tokenizer, you can use the model for inference.
from transformers import BertForSequenceClassification, BertTokenizerFast
tokenizer = BertTokenizerFast.from_pretrained("RecordedFuture/Swedish-Sentiment-Fear")
classifier_fear = BertForSequenceClassification.from_pretrained("RecordedFuture/Swedish-Sentiment-Fear")
input_text = "Your Swedish text here"
inputs = tokenizer(input_text, return_tensors='pt')
outputs = classifier_fear(**inputs)
Advanced Usage
The models are optimized for specific sentiment analysis tasks. For more complex scenarios, you can adapt the input preprocessing or the post-processing of the outputs. The example below applies a softmax to the model's logits to obtain per-label probabilities.
import torch
from transformers import BertForSequenceClassification, BertTokenizerFast
tokenizer = BertTokenizerFast.from_pretrained("RecordedFuture/Swedish-Sentiment-Violence")
classifier_violence = BertForSequenceClassification.from_pretrained("RecordedFuture/Swedish-Sentiment-Violence")
input_text = "Your Swedish text here"
inputs = tokenizer(input_text, return_tensors='pt')
outputs = classifier_violence(**inputs)
# Convert the raw logits to probabilities across the three labels
probabilities = torch.softmax(outputs.logits, dim=1)
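To make the post-processing step concrete, here is a dependency-free sketch of what the softmax does to one row of logits, plus a small helper that picks the highest-scoring index. `softmax` and `top_index` are illustrative names written for this example, not functions from torch or transformers, and the logits are made-up numbers rather than real model output.

```python
import math


def softmax(logits):
    """Convert a list of logits to probabilities that sum to 1.

    Subtracting the maximum first keeps math.exp from overflowing.
    """
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_index(probabilities):
    """Return the index of the largest probability."""
    return max(range(len(probabilities)), key=probabilities.__getitem__)


if __name__ == "__main__":
    example_logits = [2.0, -1.0, 0.5]  # hypothetical values
    probs = softmax(example_logits)
    print(probs, top_index(probs))
```

The returned index can then be read against the label order the models use: 0 for "Negative", 1 for "Weak sentiment", and 2 for "Strong Sentiment".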
📚 Documentation
Sentiment Definitions
Swedish-Sentiment-Fear
Strong Sentiment
Texts that hold an expressive emphasis on fear and/or anxiety.
Weak Sentiment
Texts that express fear and/or anxiety in a neutral way.
Swedish-Sentiment-Violence
Strong Sentiment
Texts that reference highly violent acts or hold an aggressive tone.
Weak Sentiment
Texts that include general violent statements that do not fall under the strong sentiment.
Verification Metrics
Swedish-Sentiment-Fear
| Classification Breakpoint | F-score | Precision | Recall |
| --- | --- | --- | --- |
| 0.45 | 0.8754 | 0.8618 | 0.8895 |
Swedish-Sentiment-Violence
| Classification Breakpoint | F-score | Precision | Recall |
| --- | --- | --- | --- |
| 0.35 | 0.7677 | 0.7456 | 0.791 |
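As a sanity check on the figures above, the sketch below recomputes the F-score from the reported precision and recall. It assumes the F-score is the standard F1 measure (the harmonic mean of precision and recall), which the source does not state explicitly; `f1_score` is a helper written for this example.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (F1)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F-score) copied from the metrics above.
REPORTED = {
    "Swedish-Sentiment-Fear": (0.8618, 0.8895, 0.8754),
    "Swedish-Sentiment-Violence": (0.7456, 0.791, 0.7677),
}

if __name__ == "__main__":
    for name, (precision, recall, reported) in REPORTED.items():
        computed = f1_score(precision, recall)
        print(f"{name}: computed {computed:.4f}, reported {reported}")
```

Under that assumption, the recomputed values agree with the reported F-scores to within the rounding of the four-decimal inputs.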
📄 License
This project is released under the MIT license.