🚀 News Category Classification for IPTC NewsCodes
This model classifies news content in Norwegian, Swedish, and English into 16 categories according to IPTC NewsCodes. It is a fine-tuned version of [KB/bert-base-swedish-cased](https://huggingface.co/KB/bert-base-swedish-cased) trained on a private dataset, with the goal of outperforming Claude Haiku and GPT-3.5 on this specific use case.
✨ Features
- Trained on a limited set of English, Swedish, and Norwegian news titles for news content classification.
- Fine-tuned on a skewed dataset that has been slightly augmented for stability.
- Categorizes news texts into the 16 IPTC NewsCodes categories.
📦 Installation
This model card does not document explicit installation steps. The standard Hugging Face stack listed under framework versions (`transformers`, `torch`) should be sufficient to load the model.
💻 Usage Examples
Basic Usage
To use this model for news category classification, input a news title (or a title plus a short lead) and read off the predicted category, as in the examples below; a code sketch follows them.
Input: Mann siktet for drapsforsøk på Slovakias statsministeren ("Man charged with attempted murder of Slovakia's prime minister")
Output: politics

Input: Tre døde i kioskbrann i Tyskland ("Three dead in kiosk fire in Germany")
Output: disaster, accident, and emergency incident

Input: Kultfilm får Netflix-oppfølger. Kultfilmen «Happy Gilmore» fra 1996 får en oppfølger på Netflix. Det røper strømmetjenesten selv på X, tidligere Twitter. – Happy Gilmore er tilbake! ("Cult film gets a Netflix sequel. The 1996 cult film «Happy Gilmore» is getting a sequel on Netflix. The streaming service revealed this itself on X, formerly Twitter. 'Happy Gilmore is back!'")
Output: arts, culture, entertainment and media
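The snippet below is a minimal sketch of how such a call could look with the Hugging Face `pipeline` API, assuming the model is published on the Hub with a standard sequence-classification head; the repository name is a placeholder, not the actual model ID.

```python
from transformers import pipeline

# Placeholder model ID; replace with the actual repository name of this model.
classifier = pipeline("text-classification", model="your-org/iptc-news-classifier")

result = classifier("Tre døde i kioskbrann i Tyskland")
print(result)
# Expected shape: [{'label': 'disaster, accident, and emergency incident', 'score': ...}]
```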
Advanced Usage
When using the model, it is recommended to assign a category only if the top label's score is at least 0.60 (60%); below that threshold the model should be treated as uncertain about the classification.
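The following is a sketch of how that 60% threshold could be applied in practice, reusing the placeholder model ID from the basic example:

```python
from typing import Optional

from transformers import pipeline

# Placeholder model ID; replace with the actual repository name of this model.
classifier = pipeline("text-classification", model="your-org/iptc-news-classifier")

CONFIDENCE_THRESHOLD = 0.60  # only accept predictions scoring at least 60%


def categorize(title: str) -> Optional[str]:
    """Return the predicted IPTC category, or None if the model is uncertain."""
    prediction = classifier(title)[0]
    if prediction["score"] >= CONFIDENCE_THRESHOLD:
        return prediction["label"]
    return None  # below the threshold: treat the classification as uncertain


print(categorize("Tre døde i kioskbrann i Tyskland"))
```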
📚 Documentation
Model description
The model is a test model for demonstration purposes. It is intended to categorize Norwegian, Swedish, and English news content into the 16 specified categories. Several categories would, however, need more training data before the model performs reliably across all of them.
Intended uses & limitations
Use it to categorize news texts. Assign a category only if the top label's score is at least 0.60; otherwise treat the classification as uncertain.
Performance
It achieves the following results on the evaluation set:
- Loss: 0.8030
- Accuracy: 0.7431
- F1: 0.7474
- Precision: 0.7695
- Recall: 0.7431
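The averaging scheme behind these aggregate scores is not stated. The sketch below shows one common way such metrics are computed via a `Trainer` `compute_metrics` hook, assuming weighted averaging (an assumption that is at least consistent with the reported recall matching accuracy).

```python
import numpy as np
from sklearn.metrics import accuracy_score, precision_recall_fscore_support


def compute_metrics(eval_pred):
    """Aggregate metrics of the kind reported above (weighted averaging is an assumption)."""
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)
    precision, recall, f1, _ = precision_recall_fscore_support(
        labels, predictions, average="weighted", zero_division=0
    )
    return {
        "accuracy": accuracy_score(labels, predictions),
        "f1": f1,
        "precision": precision,
        "recall": recall,
    }
```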
See the performance (accuracy) for each label below:
| Category | Accuracy |
|----------|----------|
| Arts, culture, entertainment and media | 0.6842 |
| Conflict, war and peace | 0.7351 |
| Crime, law and justice | 0.8918 |
| Disaster, accident, and emergency incident | 0.8699 |
| Economy, business, and finance | 0.6893 |
| Environment | 0.4483 |
| Health | 0.7222 |
| Human interest | 0.3182 |
| Labour | 0.5 |
| Lifestyle and leisure | 0.5556 |
| Politics | 0.7909 |
| Science and technology | 0.4583 |
| Society | 0.3538 |
| Sport | 0.9615 |
| Weather | 1.0 |
| Religion | 0.0 |
Training and evaluation data
The training data is a private, skewed dataset of English, Swedish, and Norwegian news titles. The model was trained with the Hugging Face `Trainer`, using a learning rate of 2e-05 and a batch size of 16 for 3 epochs.
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
| Property | Details |
|----------|---------|
| learning_rate | 2e-05 |
| train_batch_size | 16 |
| eval_batch_size | 16 |
| seed | 42 |
| gradient_accumulation_steps | 2 |
| total_train_batch_size | 32 |
| optimizer | Adam with betas=(0.9, 0.999) and epsilon=1e-08 |
| lr_scheduler_type | linear |
| lr_scheduler_warmup_steps | 500 |
| num_epochs | 3 |
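As a sketch, these hyperparameters map directly onto the Hugging Face `TrainingArguments` used by `Trainer`; the output directory is an assumption, and the (private) datasets and model wiring are omitted here.

```python
from transformers import TrainingArguments

# A direct mapping of the reported hyperparameters onto TrainingArguments.
# These arguments would then be passed to Trainer together with the model
# and the private train/eval datasets.
training_args = TrainingArguments(
    output_dir="iptc-news-classifier",  # assumption, not stated in the card
    learning_rate=2e-05,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    gradient_accumulation_steps=2,  # effective training batch size of 32
    num_train_epochs=3,
    lr_scheduler_type="linear",
    warmup_steps=500,
    seed=42,
)
```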
Training results
| Training Loss | Epoch | Step | Validation Loss | Accuracy | F1 | Precision | Recall | Accuracy Label Arts, culture, entertainment and media | Accuracy Label Conflict, war and peace | Accuracy Label Crime, law and justice | Accuracy Label Disaster, accident, and emergency incident | Accuracy Label Economy, business, and finance | Accuracy Label Environment | Accuracy Label Health | Accuracy Label Human interest | Accuracy Label Labour | Accuracy Label Lifestyle and leisure | Accuracy Label Politics | Accuracy Label Religion | Accuracy Label Science and technology | Accuracy Label Society | Accuracy Label Sport | Accuracy Label Weather |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1.9761 | 0.2907 | 200 | 1.4046 | 0.6462 | 0.6164 | 0.6057 | 0.6462 | 0.3158 | 0.8315 | 0.7629 | 0.7055 | 0.5437 | 0.0 | 0.5 | 0.0 | 0.0 | 0.3333 | 0.4843 | 0.0 | 0.0833 | 0.0 | 0.9615 | 0.0 |
| 1.2153 | 0.5814 | 400 | 1.0225 | 0.6894 | 0.6868 | 0.7652 | 0.6894 | 0.7895 | 0.6554 | 0.8196 | 0.8562 | 0.6408 | 0.2414 | 0.8333 | 0.1364 | 0.0 | 0.6667 | 0.8467 | 0.0 | 0.375 | 0.0154 | 0.9615 | 1.0 |
| 0.954 | 0.8721 | 600 | 0.8858 | 0.7231 | 0.7138 | 0.7309 | 0.7231 | 0.7368 | 0.7795 | 0.8918 | 0.8699 | 0.6214 | 0.3448 | 0.8889 | 0.1818 | 1.0 | 0.5556 | 0.6899 | 0.0 | 0.0833 | 0.0 | 0.9615 | 0.0 |
| 0.6662 | 1.1628 | 800 | 0.9381 | 0.6881 | 0.7009 | 0.7618 | 0.6881 | 0.7895 | 0.6126 | 0.8454 | 0.8630 | 0.6505 | 0.4483 | 0.7222 | 0.2273 | 1.0 | 0.4444 | 0.8293 | 0.0 | 0.5417 | 0.2308 | 0.9615 | 1.0 |
| 0.5554 | 1.4535 | 1000 | 0.8791 | 0.7025 | 0.7124 | 0.7628 | 0.7025 | 0.7368 | 0.6478 | 0.9021 | 0.8562 | 0.6602 | 0.3103 | 0.7778 | 0.3636 | 0.5 | 0.5556 | 0.8084 | 0.0 | 0.5 | 0.1846 | 0.9615 | 1.0 |
| 0.4396 | 1.7442 | 1200 | 0.8275 | 0.7175 | 0.7280 | 0.7686 | 0.7175 | 0.7895 | 0.6631 | 0.8196 | 0.8836 | 0.6893 | 0.3793 | 0.8333 | 0.4091 | 0.5 | 0.5556 | 0.8362 | 0.0 | 0.4167 | 0.3692 | 0.9615 | 1.0 |
| 0.383 | 2.0349 | 1400 | 0.7929 | 0.745 | 0.7501 | 0.7653 | 0.745 | 0.6842 | 0.7841 | 0.8866 | 0.8767 | 0.7087 | 0.4483 | 0.7778 | 0.4091 | 0.5 | 0.5556 | 0.6899 | 0.0 | 0.4167 | 0.2923 | 0.9615 | 0.0 |
| 0.3418 | 2.3256 | 1600 | 0.8042 | 0.7438 | 0.7440 | 0.7686 | 0.7438 | 0.7895 | 0.7351 | 0.9072 | 0.8493 | 0.7864 | 0.4483 | 0.7778 | 0.3182 | 0.5 | 0.5556 | 0.7909 | 0.0 | 0.4167 | 0.1846 | 0.9615 | 0.0 |
| 0.248 | 2.6163 | 1800 | 0.8387 | 0.7275 | 0.7325 | 0.7610 | 0.7275 | 0.6842 | 0.6891 | 0.8814 | 0.8699 | 0.7573 | 0.4138 | 0.8333 | 0.4091 | 0.5 | 0.5556 | 0.8014 | 0.0 | 0.4167 | 0.2769 | 0.9615 | 0.0 |
| 0.2525 | 2.9070 | 2000 | 0.8137 | 0.735 | 0.7413 | 0.7697 | 0.735 | 0.6842 | 0.7106 | 0.8763 | 0.8699 | 0.6796 | 0.4483 | 0.7222 | 0.3636 | 0.5 | 0.5556 | 0.8153 | 0.0 | 0.4583 | 0.3385 | 0.9615 | 0.0 |
Framework versions
- Transformers 4.40.2
- Pytorch 2.2.1+cu121
- Datasets 2.19.1
- Tokenizers 0.19.1
🔧 Technical Details
The model is fine-tuned from [KB/bert-base-swedish-cased](https://huggingface.co/KB/bert-base-swedish-cased) on a private dataset. The dataset is skewed but has been augmented to some extent. Training uses the hyperparameters listed above to improve performance on news category classification.
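A minimal sketch of the starting point for such a fine-tune, assuming a standard sequence-classification head with 16 labels on top of the base checkpoint named above:

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer

BASE_CHECKPOINT = "KB/bert-base-swedish-cased"  # base model named in this card
NUM_LABELS = 16  # the 16 IPTC NewsCodes categories

tokenizer = AutoTokenizer.from_pretrained(BASE_CHECKPOINT)
model = AutoModelForSequenceClassification.from_pretrained(
    BASE_CHECKPOINT, num_labels=NUM_LABELS
)
# The model would then be fine-tuned on the private dataset using the
# hyperparameters listed under "Training hyperparameters".
```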
📄 License
No license is specified for this model.