XLM-RoBERTa base Universal Dependencies v2.8 POS tagging: Slovak
This model performs part-of-speech tagging for Slovak. It is an XLM-RoBERTa base model fine-tuned on the Universal Dependencies v2.8 dataset, and it reaches 97.5% test accuracy on Slovak.
Quick Start
This model is part of our paper called:
- Make the Best of Cross-lingual Transfer: Evidence from POS Tagging with over 100 Languages
Check the Space for more details.
Usage Examples
Basic Usage
from transformers import AutoTokenizer, AutoModelForTokenClassification
tokenizer = AutoTokenizer.from_pretrained("wietsedv/xlm-roberta-base-ft-udpos28-sk")
model = AutoModelForTokenClassification.from_pretrained("wietsedv/xlm-roberta-base-ft-udpos28-sk")
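The snippet above only loads the tokenizer and model. As a minimal sketch of how tags could then be obtained, the helpers below run the model on a pre-split sentence and keep one tag per word. The function names `align_to_words` and `tag_words` are hypothetical, and taking the prediction of each word's first subword is an assumption (a common convention), not necessarily the procedure used in the paper. The sketch also assumes a fast tokenizer, which is what provides `word_ids()`, and an installed PyTorch backend.

```python
def align_to_words(word_ids, pred_ids, id2label):
    """Keep the predicted label of the first subword of each word.

    Assumption: first-subword alignment; other strategies exist.
    """
    tags = []
    previous = None
    for word_id, pred_id in zip(word_ids, pred_ids):
        # Special tokens such as <s> and </s> have word_id None.
        if word_id is not None and word_id != previous:
            tags.append(id2label[pred_id])
        previous = word_id
    return tags


def tag_words(words, tokenizer, model):
    """Tag a list of words; returns a list of (word, tag) pairs."""
    enc = tokenizer(words, is_split_into_words=True, return_tensors="pt")
    logits = model(**enc).logits  # shape: (1, sequence_length, num_labels)
    pred_ids = logits.argmax(dim=-1)[0].tolist()
    tags = align_to_words(enc.word_ids(0), pred_ids, model.config.id2label)
    return list(zip(words, tags))


# Hypothetical values for illustration only (not real model output).
demo_id2label = {0: "DET", 1: "AUX", 2: "NOUN"}
demo_word_ids = [None, 0, 1, 2, 2, None]  # third word split into two subwords
demo_pred_ids = [0, 0, 1, 2, 0, 0]
print(align_to_words(demo_word_ids, demo_pred_ids, demo_id2label))
# ['DET', 'AUX', 'NOUN']
```

With the `tokenizer` and `model` loaded as shown earlier, a call such as `tag_words(["Toto", "je", "veta", "."], tokenizer, model)` would return one tag per input word. The label names come from `model.config.id2label`.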
License
The model is released under the Apache-2.0 license.
Model Information
| Property | Details |
|---|---|
| Library Name | transformers |
| Tags | part-of-speech, token-classification |
| Datasets | universal_dependencies |
| Metrics | accuracy |
| Model Name | xlm-roberta-base-ft-udpos28-sk |
Model Results
The following table shows the accuracy metrics of the model on different languages:
| Language | Test Accuracy (%) |
|---|---|
| English | 82.6 |
| Dutch | 84.2 |
| German | 79.4 |
| Italian | 82.0 |
| French | 83.9 |
| Spanish | 87.9 |
| Russian | 90.5 |
| Swedish | 84.6 |
| Norwegian | 77.9 |
| Danish | 82.2 |
| Low Saxon | 53.9 |
| Akkadian | 35.8 |
| Armenian | 83.8 |
| Welsh | 64.8 |
| Old East Slavic | 74.9 |
| Albanian | 77.9 |
| Slovenian | 87.7 |
| Guajajara | 36.6 |
| Kurmanji | 76.5 |
| Turkish | 75.1 |
| Finnish | 79.5 |
| Indonesian | 81.3 |
| Ukrainian | 92.0 |
| Polish | 93.3 |
| Portuguese | 85.1 |
| Kazakh | 79.5 |
| Latin | 77.1 |
| Old French | 58.0 |
| Buryat | 60.6 |
| Kaapor | 22.1 |
| Korean | 57.4 |
| Estonian | 80.7 |
| Croatian | 93.7 |
| Gothic | 28.3 |
| Swiss German | 44.1 |
| Assyrian | 14.8 |
| North Sami | 40.6 |
| Naija | 39.9 |
| Latvian | 84.2 |
| Chinese | 42.5 |
| Tagalog | 70.8 |
| Bambara | 28.8 |
| Lithuanian | 85.8 |
| Galician | 86.1 |
| Vietnamese | 67.4 |
| Greek | 84.6 |
| Catalan | 85.8 |
| Czech | 94.3 |
| Erzya | 49.8 |
| Bhojpuri | 48.1 |
| Thai | 58.1 |
| Marathi | 87.7 |
| Basque | 74.0 |
| Slovak | 97.5 |
| Kiche | 33.9 |
| Yoruba | 26.9 |
| Warlpiri | 42.1 |
| Tamil | 83.0 |
| Maltese | 29.1 |
| Ancient Greek | 59.0 |
| Icelandic | 77.4 |
| Mbya Guarani | 33.1 |
| Urdu | 62.2 |
| Romanian | 81.4 |
| Persian | 77.9 |
| Apurina | 46.7 |
| Japanese | 27.4 |
| Hungarian | 81.9 |
| Hindi | 65.3 |
| Classical Chinese | 30.2 |
| Komi Permyak | 48.7 |
| Faroese | 75.4 |
| Sanskrit | 36.3 |
| Livvi | 64.9 |
| Arabic | 79.6 |
| Wolof | 39.0 |
| Bulgarian | 90.5 |
| Akuntsu | 39.1 |
| Makurap | 24.7 |
| Kangri | 49.9 |
| Breton | 61.8 |
| Telugu | 79.6 |
| Cantonese | 45.6 |
| Old Church Slavonic | 45.9 |
| Karelian | 67.9 |
| Upper Sorbian | 78.6 |
| South Levantine Arabic | 66.7 |
| Komi Zyrian | 44.2 |
| Irish | 67.2 |
| Nayini | 43.6 |
| Munduruku | 27.3 |
| Manx | 36.8 |
| Skolt Sami | 41.3 |
| Afrikaans | 79.2 |
| Old Turkish | 38.0 |
| Tupinamba | 40.3 |
| Belarusian | 89.8 |
| Serbian | 94.6 |
| Moksha | 48.2 |
| Western Armenian | 76.0 |
| Scottish Gaelic | 57.0 |
| Khunsari | 37.8 |
| Hebrew | 81.2 |
| Uyghur | 72.4 |
| Chukchi | 37.0 |
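For readers who want to compute a comparable number on their own data, the sketch below shows token-level tagging accuracy as a percentage. It assumes that accuracy is the share of tokens whose predicted tag matches the gold tag, which is the usual definition for POS tagging; the model card does not spell out the exact evaluation procedure. The function name `tagging_accuracy` and the sample tags are hypothetical.

```python
def tagging_accuracy(gold_sentences, pred_sentences):
    """Percentage of tokens whose predicted tag equals the gold tag."""
    correct = 0
    total = 0
    for gold, pred in zip(gold_sentences, pred_sentences):
        if len(gold) != len(pred):
            raise ValueError("gold and predicted sentences differ in length")
        correct += sum(g == p for g, p in zip(gold, pred))
        total += len(gold)
    if total == 0:
        raise ValueError("no tokens to evaluate")
    return 100.0 * correct / total


# Hypothetical tags for illustration only (not taken from any treebank).
gold = [["DET", "NOUN", "VERB"], ["PRON", "VERB"]]
pred = [["DET", "NOUN", "NOUN"], ["PRON", "VERB"]]
print(tagging_accuracy(gold, pred))  # 80.0
```

Here four of the five tokens are tagged correctly, so the result is 80.0.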