# deberta-v3-xsmall-zeroshot-v1.1-all-33
This model offers a small and highly efficient zero-shot option, especially suitable for edge devices or in-browser use-cases with transformers.js.
## Quick Start
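A minimal usage sketch with the Hugging Face transformers zero-shot classification pipeline; the repo id is assumed from the model name above:

```python
from transformers import pipeline

# Assumption: the model is hosted under this repo id (taken from the title).
classifier = pipeline(
    "zero-shot-classification",
    model="MoritzLaurer/deberta-v3-xsmall-zeroshot-v1.1-all-33",
)

text = "The new update made the app crash constantly."
candidate_labels = ["bug report", "feature request", "praise"]

# Returns a dict with 'labels' and 'scores' sorted by score.
print(classifier(text, candidate_labels))
```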
This model was fine-tuned using the same pipeline as described in the model card for MoritzLaurer/deberta-v3-large-zeroshot-v1.1-all-33 and in this paper.
The foundation model is microsoft/deberta-v3-xsmall. The model has only 22 million backbone parameters, plus the embedding parameters for its 128k-token vocabulary. The backbone parameters are the main parameters active during inference, providing a significant speedup over larger models. The model is only 142 MB in size.
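To verify the backbone vs. embedding split yourself, a quick sketch (this assumes the usual DebertaV2Model attribute layout in transformers, where the token embeddings sit at `model.embeddings.word_embeddings`):

```python
from transformers import AutoModel

# Load the foundation model and count its parameters.
model = AutoModel.from_pretrained("microsoft/deberta-v3-xsmall")

total = sum(p.numel() for p in model.parameters())
# Assumption: token embeddings are exposed as model.embeddings.word_embeddings.
embedding = model.embeddings.word_embeddings.weight.numel()

print(f"embedding params: {embedding / 1e6:.0f}M")
print(f"backbone params:  {(total - embedding) / 1e6:.0f}M")
```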
## Features
- High Efficiency: With only 22 million backbone parameters, it offers a significant speedup during inference compared to larger models.
- Small Size: The model is only 142 MB, making it suitable for edge devices and in-browser use-cases.
## Documentation
For usage instructions and other details, refer to the model card for MoritzLaurer/deberta-v3-large-zeroshot-v1.1-all-33 and this paper.
## Technical Details
### Metrics
To save time and compute, I did not run zero-shot evaluation for this model. The table below reports standard accuracy on all datasets the model was trained on (note that the NLI datasets are binary).

General takeaway: the model is much more efficient than its larger sisters, but it performs somewhat worse.
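"Binary" here means the model frames every task as entailment vs. not-entailment between a text and a label hypothesis. A sketch of that scoring step (the repo id and exact label names are assumptions; check `model.config.id2label` on the actual checkpoint):

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

repo = "MoritzLaurer/deberta-v3-xsmall-zeroshot-v1.1-all-33"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModelForSequenceClassification.from_pretrained(repo)
print(model.config.id2label)  # expected: entailment / not_entailment

premise = "I really enjoyed this film."
hypothesis = "This example is positive."  # the candidate label as a hypothesis

inputs = tokenizer(premise, hypothesis, return_tensors="pt")
with torch.no_grad():
    probs = model(**inputs).logits.softmax(dim=-1)
print(probs)  # the entailment probability is the score for this label
```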
| Property | Details |
|----------|---------|
| Model Type | Zero-shot text classification model |
| Training Data | Multiple datasets including mnli_m, mnli_mm, fevernli, etc. |
| Dataset | Accuracy | Inference (texts/sec, A10G, batch=128) |
|---------|----------|----------------------------------------|
| mnli_m | 0.925 | 1573.0 |
| mnli_mm | 0.923 | 1630.0 |
| fevernli | 0.886 | 683.0 |
| anli_r1 | 0.732 | 1282.0 |
| anli_r2 | 0.633 | 1352.0 |
| anli_r3 | 0.661 | 1072.0 |
| wanli | 0.814 | 2325.0 |
| lingnli | 0.887 | 2008.0 |
| wellformedquery | 0.722 | 4781.0 |
| rottentomatoes | 0.872 | 2743.0 |
| amazonpolarity | 0.944 | 677.0 |
| imdb | 0.925 | 228.0 |
| yelpreviews | 0.967 | 238.0 |
| hatexplain | 0.774 | 2357.0 |
| massive | 0.734 | 5027.0 |
| banking77 | 0.627 | 4323.0 |
| emotiondair | 0.762 | 3247.0 |
| emocontext | 0.745 | 3129.0 |
| empathetic | 0.465 | 941.0 |
| agnews | 0.888 | 1643.0 |
| yahootopics | 0.702 | 335.0 |
| biasframes_sex | 0.94 | 1517.0 |
| biasframes_offensive | 0.853 | 1452.0 |
| biasframes_intent | 0.863 | 1498.0 |
| financialphrasebank | 0.914 | 2367.0 |
| appreviews | 0.926 | 974.0 |
| hateoffensive | 0.921 | 2634.0 |
| trueteacher | 0.635 | 353.0 |
| spam | 0.968 | 2284.0 |
| wikitoxic_toxicaggregated | 0.897 | 260.0 |
| wikitoxic_obscene | 0.918 | 252.0 |
| wikitoxic_identityhate | 0.915 | 256.0 |
| wikitoxic_threat | 0.935 | 254.0 |
| wikitoxic_insult | 0.9 | 259.0 |
| manifesto | 0.505 | 1941.0 |
| capsotu | 0.701 | 2080.0 |
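For context, a rough sketch of how throughput figures like those above can be measured. The exact harness behind the table is not documented here, so the sequence length, warm-up, and batching below are assumptions; absolute numbers depend heavily on hardware:

```python
import time

import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

repo = "MoritzLaurer/deberta-v3-xsmall-zeroshot-v1.1-all-33"  # assumed repo id
device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModelForSequenceClassification.from_pretrained(repo).to(device).eval()

# One batch of 128 text/hypothesis pairs, matching the table's batch=128 setting.
texts = ["The quarterly earnings exceeded analyst expectations."] * 128
hypotheses = ["This example is about finance."] * 128
inputs = tokenizer(
    texts, hypotheses, padding=True, truncation=True, return_tensors="pt"
).to(device)

with torch.no_grad():
    model(**inputs)  # warm-up pass
    if device == "cuda":
        torch.cuda.synchronize()
    start = time.perf_counter()
    for _ in range(10):
        model(**inputs)
    if device == "cuda":
        torch.cuda.synchronize()
    elapsed = time.perf_counter() - start

print(f"{10 * len(texts) / elapsed:.0f} texts/sec")
```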
## License
This project is licensed under the MIT license.