🚀 ScandiNLI - Natural Language Inference model for Scandinavian Languages
ScandiNLI is a family of fine-tuned models for Natural Language Inference (NLI) in Danish, Norwegian Bokmål, and Swedish, released in several sizes.
📋 Model Information
Property | Details |
---|---|
Model Type | Fine-tuned version of [NbAiLab/nb-bert-base](https://huggingface.co/NbAiLab/nb-bert-base) |
Training Data | [DanFEVER](https://aclanthology.org/2021.nodalida-main.pdf#page=439), machine-translated versions of MultiNLI, CommitmentBank, [FEVER](https://aclanthology.org/N18-1074/), and [Adversarial NLI](https://aclanthology.org/2020.acl-main.441/) |
Pipeline Tag | zero-shot-classification |
Base Model | [NbAiLab/nb-bert-base](https://huggingface.co/NbAiLab/nb-bert-base) |
🚀 Quick Start
You can use this model in your scripts as follows:
Basic Usage
>>> from transformers import pipeline
>>> classifier = pipeline(
... "zero-shot-classification",
... model="alexandrainst/scandi-nli-base",
... )
>>> classifier(
... "Mexicansk bokser advarer Messi - 'Du skal bede til gud, om at jeg ikke finder dig'",
... candidate_labels=['sundhed', 'politik', 'sport', 'religion'],
... hypothesis_template="Dette eksempel handler om {}",
... )
{'sequence': "Mexicansk bokser advarer Messi - 'Du skal bede til gud, om at jeg ikke finder dig'",
'labels': ['sport', 'religion', 'sundhed', 'politik'],
'scores': [0.724335789680481,
0.1176532730460167,
0.08848614990711212,
0.06952482461929321]}
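The call above relies on the NLI formulation of zero-shot classification: each candidate label is slotted into `hypothesis_template`, the text is treated as the premise, and the labels are ranked by how strongly the model predicts entailment. The following is a minimal sketch of that bookkeeping only, with no model involved. The helper names and the logits are hypothetical, and the softmax over entailment logits reflects our understanding of the pipeline's single-label mode rather than anything stated in this model card.

```python
import math


def build_hypotheses(candidate_labels, hypothesis_template):
    """Turn each candidate label into one NLI hypothesis string."""
    return [hypothesis_template.format(label) for label in candidate_labels]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


premise = "Mexicansk bokser advarer Messi - 'Du skal bede til gud, om at jeg ikke finder dig'"
labels = ["sundhed", "politik", "sport", "religion"]
hypotheses = build_hypotheses(labels, "Dette eksempel handler om {}")

# Hypothetical entailment logits, one per (premise, hypothesis) pair.
# A real run would obtain these from the NLI model.
entailment_logits = [0.3, 0.1, 2.4, 0.6]

scores = softmax(entailment_logits)
ranked = sorted(zip(labels, scores), key=lambda pair: pair[1], reverse=True)

for hypothesis in hypotheses:
    print(hypothesis)
print(ranked[0][0])
```

Because the scores are normalised across the candidate labels, adding or removing a label changes every score, which is worth keeping in mind when comparing runs with different label sets.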
✨ Features
We have released four models for Scandinavian NLI, in three sizes:
- [alexandrainst/scandi-nli-large-v2](https://huggingface.co/alexandrainst/scandi-nli-large-v2)
- [alexandrainst/scandi-nli-large](https://huggingface.co/alexandrainst/scandi-nli-large)
- alexandrainst/scandi-nli-base (this)
- [alexandrainst/scandi-nli-small](https://huggingface.co/alexandrainst/scandi-nli-small)
A demo of the large-v2 model can be found in [this Hugging Face Space](https://huggingface.co/spaces/alexandrainst/zero-shot-classification). Check it out!
📊 Performance
We evaluate the models in Danish, Swedish, and Norwegian Bokmål separately. In all cases, we report the Matthews correlation coefficient (MCC), the macro-average F1-score, and accuracy.
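For readers who want to see what the three metrics measure, here is a small pure-Python sketch. It is not the evaluation code used for this model card. The label names and the toy predictions are illustrative assumptions, and the MCC uses the standard multiclass generalisation. The functions return fractions, whereas the tables report percentages.

```python
from collections import Counter
from math import sqrt


def accuracy(y_true, y_pred):
    """Fraction of examples whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    f1_scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)


def mcc(y_true, y_pred):
    """Multiclass Matthews correlation coefficient."""
    n = len(y_true)
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    labels = set(true_counts) | set(pred_counts)
    cov = correct * n - sum(true_counts[k] * pred_counts[k] for k in labels)
    var_true = n * n - sum(c * c for c in true_counts.values())
    var_pred = n * n - sum(c * c for c in pred_counts.values())
    denom = sqrt(var_true * var_pred)
    return cov / denom if denom else 0.0


# Toy example with the usual three NLI classes (illustrative data only).
y_true = ["entailment", "neutral", "contradiction", "entailment", "neutral", "contradiction"]
y_pred = ["entailment", "neutral", "contradiction", "neutral", "neutral", "entailment"]

print(f"accuracy={accuracy(y_true, y_pred):.4f}")
print(f"macro_f1={macro_f1(y_true, y_pred):.4f}")
print(f"mcc={mcc(y_true, y_pred):.4f}")
```

MCC takes the full confusion matrix into account, so it is less flattering than accuracy when the classes are imbalanced. That is one plausible reason the MCC and accuracy columns can differ by more than ten points in the tables below.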
Scandinavian Evaluation
The Scandinavian scores are the average of the Danish, Swedish, and Norwegian scores.
Model | MCC | Macro-F1 | Accuracy | Number of Parameters |
---|---|---|---|---|
[alexandrainst/scandi-nli-large-v2](https://huggingface.co/alexandrainst/scandi-nli-large-v2) | 75.42% | 75.41% | 84.95% | 354M |
[alexandrainst/scandi-nli-large](https://huggingface.co/alexandrainst/scandi-nli-large) | 73.70% | 74.44% | 83.91% | 354M |
[MoritzLaurer/mDeBERTa-v3-base-xnli-multilingual-nli-2mil7](https://huggingface.co/MoritzLaurer/mDeBERTa-v3-base-xnli-multilingual-nli-2mil7) | 69.01% | 71.99% | 80.66% | 279M |
alexandrainst/scandi-nli-base (this) | 67.42% | 71.54% | 80.09% | 178M |
[joeddav/xlm-roberta-large-xnli](https://huggingface.co/joeddav/xlm-roberta-large-xnli) | 64.17% | 70.80% | 77.29% | 560M |
[MoritzLaurer/mDeBERTa-v3-base-mnli-xnli](https://huggingface.co/MoritzLaurer/mDeBERTa-v3-base-mnli-xnli) | 63.94% | 70.41% | 77.23% | 279M |
[NbAiLab/nb-bert-base-mnli](https://huggingface.co/NbAiLab/nb-bert-base-mnli) | 61.71% | 68.36% | 76.08% | 178M |
[alexandrainst/scandi-nli-small](https://huggingface.co/alexandrainst/scandi-nli-small) | 56.02% | 65.30% | 73.56% | 22M |
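As a worked example of how the Scandinavian scores are derived, the sketch below averages the per-language scores of this model (alexandrainst/scandi-nli-base), copied from the Danish, Swedish, and Norwegian tables in this README. The function name is our own.

```python
# Per-language scores for alexandrainst/scandi-nli-base, in percent,
# copied from the Danish, Swedish, and Norwegian tables of this README.
scores = {
    "Danish": {"MCC": 62.44, "Macro-F1": 55.00, "Accuracy": 80.42},
    "Swedish": {"MCC": 72.29, "Macro-F1": 81.37, "Accuracy": 81.51},
    "Norwegian": {"MCC": 67.53, "Macro-F1": 78.24, "Accuracy": 78.33},
}


def scandinavian_average(per_language):
    """Unweighted mean of each metric over the languages."""
    metrics = next(iter(per_language.values())).keys()
    return {
        metric: sum(lang[metric] for lang in per_language.values()) / len(per_language)
        for metric in metrics
    }


averages = scandinavian_average(scores)
for metric, value in averages.items():
    print(f"{metric}: {value:.2f}%")
```

Rounded to two decimals, the averages match the row for this model in the Scandinavian table (67.42%, 71.54%, 80.09%).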
Danish Evaluation
We use a test split of the [DanFEVER dataset](https://aclanthology.org/2021.nodalida-main.pdf#page=439) to evaluate the Danish performance of the models. The test split is generated using this gist.
Model | MCC | Macro-F1 | Accuracy | Number of Parameters |
---|---|---|---|---|
[alexandrainst/scandi-nli-large-v2](https://huggingface.co/alexandrainst/scandi-nli-large-v2) | 75.65% | 59.23% | 87.89% | 354M |
[alexandrainst/scandi-nli-large](https://huggingface.co/alexandrainst/scandi-nli-large) | 73.80% | 58.41% | 86.98% | 354M |
[MoritzLaurer/mDeBERTa-v3-base-xnli-multilingual-nli-2mil7](https://huggingface.co/MoritzLaurer/mDeBERTa-v3-base-xnli-multilingual-nli-2mil7) | 68.37% | 57.10% | 83.25% | 279M |
alexandrainst/scandi-nli-base (this) | 62.44% | 55.00% | 80.42% | 178M |
[NbAiLab/nb-bert-base-mnli](https://huggingface.co/NbAiLab/nb-bert-base-mnli) | 56.92% | 53.25% | 76.39% | 178M |
[MoritzLaurer/mDeBERTa-v3-base-mnli-xnli](https://huggingface.co/MoritzLaurer/mDeBERTa-v3-base-mnli-xnli) | 52.79% | 52.00% | 72.35% | 279M |
[joeddav/xlm-roberta-large-xnli](https://huggingface.co/joeddav/xlm-roberta-large-xnli) | 49.18% | 50.31% | 69.73% | 560M |
[alexandrainst/scandi-nli-small](https://huggingface.co/alexandrainst/scandi-nli-small) | 47.28% | 48.88% | 73.46% | 22M |
Swedish Evaluation
We use the test split of the machine-translated version of the MultiNLI dataset to evaluate the Swedish performance of the models. Evaluating on machine-translated data rather than a gold-standard dataset is not ideal, but we are not aware of any NLI datasets in Swedish.
Model | MCC | Macro-F1 | Accuracy | Number of Parameters |
---|---|---|---|---|
[alexandrainst/scandi-nli-large-v2](https://huggingface.co/alexandrainst/scandi-nli-large-v2) | 79.02% | 85.99% | 85.99% | 354M |
[alexandrainst/scandi-nli-large](https://huggingface.co/alexandrainst/scandi-nli-large) | 76.69% | 84.47% | 84.38% | 354M |
[joeddav/xlm-roberta-large-xnli](https://huggingface.co/joeddav/xlm-roberta-large-xnli) | 75.35% | 83.42% | 83.55% | 560M |
[MoritzLaurer/mDeBERTa-v3-base-mnli-xnli](https://huggingface.co/MoritzLaurer/mDeBERTa-v3-base-mnli-xnli) | 73.84% | 82.46% | 82.58% | 279M |
[MoritzLaurer/mDeBERTa-v3-base-xnli-multilingual-nli-2mil7](https://huggingface.co/MoritzLaurer/mDeBERTa-v3-base-xnli-multilingual-nli-2mil7) | 73.32% | 82.15% | 82.08% | 279M |
alexandrainst/scandi-nli-base (this) | 72.29% | 81.37% | 81.51% | 178M |
[NbAiLab/nb-bert-base-mnli](https://huggingface.co/NbAiLab/nb-bert-base-mnli) | 64.69% | 76.40% | 76.47% | 178M |
[alexandrainst/scandi-nli-small](https://huggingface.co/alexandrainst/scandi-nli-small) | 62.35% | 74.79% | 74.93% | 22M |
Norwegian Evaluation
We use the test split of the machine-translated version of the MultiNLI dataset to evaluate the Norwegian performance of the models. Evaluating on machine-translated data rather than a gold-standard dataset is not ideal, but we are not aware of any NLI datasets in Norwegian.
Model | MCC | Macro-F1 | Accuracy | Number of Parameters |
---|---|---|---|---|
[alexandrainst/scandi-nli-large-v2](https://huggingface.co/alexandrainst/scandi-nli-large-v2) | 71.59% | 81.00% | 80.96% | 354M |
[alexandrainst/scandi-nli-large](https://huggingface.co/alexandrainst/scandi-nli-large) | 70.61% | 80.43% | 80.36% | 354M |
[joeddav/xlm-roberta-large-xnli](https://huggingface.co/joeddav/xlm-roberta-large-xnli) | 67.99% | 78.68% | 78.60% | 560M |
alexandrainst/scandi-nli-base (this) | 67.53% | 78.24% | 78.33% | 178M |
[MoritzLaurer/mDeBERTa-v3-base-xnli-multilingual-nli-2mil7](https://huggingface.co/MoritzLaurer/mDeBERTa-v3-base-xnli-multilingual-nli-2mil7) | 65.33% | 76.73% | 76.65% | 279M |
[MoritzLaurer/mDeBERTa-v3-base-mnli-xnli](https://huggingface.co/MoritzLaurer/mDeBERTa-v3-base-mnli-xnli) | 65.18% | 76.76% | 76.77% | 279M |
[NbAiLab/nb-bert-base-mnli](https://huggingface.co/NbAiLab/nb-bert-base-mnli) | 63.51% | 75.42% | 75.39% | 178M |
[alexandrainst/scandi-nli-small](https://huggingface.co/alexandrainst/scandi-nli-small) | 58.42% | 72.22% | 72.30% | 22M |
🔧 Technical Details
Training procedure
The model has been fine-tuned on a dataset composed of [DanFEVER](https://aclanthology.org/2021.nodalida-main.pdf#page=439), machine-translated versions of MultiNLI and CommitmentBank in all three languages, and machine-translated versions of [FEVER](https://aclanthology.org/N18-1074/) and [Adversarial NLI](https://aclanthology.org/2020.acl-main.441/) in Swedish.
The training split of DanFEVER is generated using this gist.
The three languages are sampled equally during training. The models are validated on the validation split of [DanFEVER](https://aclanthology.org/2021.nodalida-main.pdf#page=439) for Danish and on the validation splits of the machine-translated versions of MultiNLI for Swedish and Norwegian Bokmål, again sampled equally.
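One way to picture equal sampling across languages is the sketch below. It is an illustration under our own assumptions, not the training code from the ScandiNLI repository: the function name, the toy datasets, and the choice to draw examples with replacement are all hypothetical. The point it demonstrates is that each language is picked with the same probability regardless of how many examples it has.

```python
import random
from collections import Counter


def sample_languages_equally(datasets, num_samples, seed=4242):
    """Yield (language, example) pairs, picking each language with equal probability."""
    rng = random.Random(seed)
    languages = sorted(datasets)
    for _ in range(num_samples):
        language = rng.choice(languages)          # uniform over languages
        example = rng.choice(datasets[language])  # uniform within that language
        yield language, example


# Toy datasets of very different sizes (placeholder strings, not real data).
toy_datasets = {
    "da": [f"da-{i}" for i in range(5)],
    "sv": [f"sv-{i}" for i in range(500)],
    "nb": [f"nb-{i}" for i in range(50)],
}

counts = Counter(
    language for language, _ in sample_languages_equally(toy_datasets, num_samples=3000)
)
print(counts)
```

With this scheme the smallest dataset is revisited far more often than the largest, which is the usual trade-off of balancing by language rather than by example count.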
The code used to train the ScandiNLI models is available in the GitHub repository, and the full training logs can be found in this Weights and Biases report.
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 4242
- gradient_accumulation_steps: 2
- total_train_batch_size: 32
- optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 500
- max_steps: 50,000
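To make the interaction between `learning_rate`, `lr_scheduler_type`, `lr_scheduler_warmup_steps`, and `max_steps` concrete, the following sketch computes the learning rate at a given step. It assumes the `linear` scheduler has the common warmup-then-linear-decay shape (rising linearly from 0 to the peak during warmup, then falling linearly to 0 at the final step). The function name is our own, and this is not taken from the training code.

```python
def linear_schedule_lr(step, base_lr=2e-5, warmup_steps=500, max_steps=50_000):
    """Learning rate at `step` for linear warmup followed by linear decay.

    Defaults mirror the hyperparameters listed above; the schedule shape
    itself is an assumption about what `lr_scheduler_type: linear` means.
    """
    if step < warmup_steps:
        # Warmup: scale linearly from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Decay: scale linearly from base_lr down to 0 at max_steps.
    remaining = max(0, max_steps - step)
    return base_lr * remaining / max(1, max_steps - warmup_steps)


for step in (0, 250, 500, 25_250, 50_000):
    print(f"step {step:>6}: lr = {linear_schedule_lr(step):.2e}")
```

Under this assumption the peak learning rate of 2e-05 is reached only at step 500, and the rate is back to half the peak around the midpoint of the decay phase.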
📄 License
This model is licensed under the Apache 2.0 license.






