XLM-RoBERTa base Universal Dependencies v2.8 POS tagging: Dutch
This XLM-RoBERTa base model is fine-tuned for part-of-speech tagging on Dutch Universal Dependencies v2.8 data. It reaches 97.0% test accuracy on Dutch and is also evaluated on the other languages listed under Model Results, as part of the cross-lingual transfer study described in the related paper.
Quick Start
This model was released as part of our paper:
- Make the Best of Cross-lingual Transfer: Evidence from POS Tagging with over 100 Languages
Check the Space for more details.
Usage Examples
Basic Usage
from transformers import AutoTokenizer, AutoModelForTokenClassification
tokenizer = AutoTokenizer.from_pretrained("wietsedv/xlm-roberta-base-ft-udpos28-nl")
model = AutoModelForTokenClassification.from_pretrained("wietsedv/xlm-roberta-base-ft-udpos28-nl")
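The snippet above only loads the tokenizer and model. A minimal sketch of how they might be used to tag a pre-tokenized sentence follows. The helper names `first_subword_tags` and `tag_words` are hypothetical and not part of the model or the library, and taking the prediction of each word's first sub-token is an assumed convention, not necessarily the one used in the paper. The labels are presumably Universal Dependencies UPOS tags, read from `model.config.id2label`.

```python
from typing import Dict, List, Optional, Sequence, Tuple

MODEL_ID = "wietsedv/xlm-roberta-base-ft-udpos28-nl"


def first_subword_tags(
    word_ids: Sequence[Optional[int]],
    label_ids: Sequence[int],
    id2label: Dict[int, str],
) -> List[str]:
    """Keep one tag per word: the prediction for its first sub-token (assumed convention)."""
    tags = []
    previous = None
    for word_id, label_id in zip(word_ids, label_ids):
        # A word id of None marks special tokens such as <s> and </s>.
        if word_id is not None and word_id != previous:
            tags.append(id2label[label_id])
        previous = word_id
    return tags


def tag_words(words: List[str]) -> List[Tuple[str, str]]:
    """Hypothetical helper: download the model and tag a list of words."""
    # Imported lazily so first_subword_tags works without the heavy dependencies.
    import torch
    from transformers import AutoModelForTokenClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForTokenClassification.from_pretrained(MODEL_ID)

    # is_split_into_words keeps the mapping from sub-tokens back to input words.
    encoding = tokenizer(words, is_split_into_words=True, return_tensors="pt")
    with torch.no_grad():
        logits = model(**encoding).logits
    label_ids = logits.argmax(dim=-1)[0].tolist()

    tags = first_subword_tags(encoding.word_ids(0), label_ids, model.config.id2label)
    return list(zip(words, tags))
```

Calling `tag_words(["Dit", "is", "een", "voorbeeld", "."])` should return one `(word, tag)` pair per input word. The first call downloads the model weights, so it needs network access.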
License
The model is licensed under the Apache-2.0 license.
Model Information

| Property | Details |
|---|---|
| Model Type | xlm-roberta-base-ft-udpos28-nl |
| Training Data | Universal Dependencies v2.8 |
| Tags | part-of-speech, token-classification |
| Metrics | accuracy |
Model Results
The model xlm-roberta-base-ft-udpos28-nl achieves the following test accuracy (in percent) on each language:

| Language | Test Accuracy (%) |
|---|---|
| English | 88.8 |
| Dutch | 97.0 |
| German | 89.0 |
| Italian | 89.9 |
| French | 88.1 |
| Spanish | 90.5 |
| Russian | 89.2 |
| Swedish | 90.7 |
| Norwegian | 87.6 |
| Danish | 89.0 |
| Low Saxon | 58.3 |
| Akkadian | 22.9 |
| Armenian | 86.7 |
| Welsh | 70.2 |
| Old East Slavic | 73.5 |
| Albanian | 78.9 |
| Slovenian | 76.3 |
| Guajajara | 22.1 |
| Kurmanji | 78.3 |
| Turkish | 78.3 |
| Finnish | 86.2 |
| Indonesian | 85.4 |
| Ukrainian | 85.8 |
| Polish | 86.3 |
| Portuguese | 90.0 |
| Kazakh | 83.0 |
| Latin | 79.0 |
| Old French | 53.1 |
| Buryat | 58.4 |
| Kaapor | 13.8 |
| Korean | 62.2 |
| Estonian | 87.6 |
| Croatian | 87.6 |
| Gothic | 16.5 |
| Swiss German | 48.3 |
| Assyrian | 14.6 |
| North Sami | 36.5 |
| Naija | 36.0 |
| Latvian | 86.6 |
| Chinese | 47.9 |
| Tagalog | 73.9 |
| Bambara | 29.7 |
| Lithuanian | 85.7 |
| Galician | 87.4 |
| Vietnamese | 65.1 |
| Greek | 86.3 |
| Catalan | 89.5 |
| Czech | 87.3 |
| Erzya | 43.0 |
| Bhojpuri | 48.5 |
| Thai | 58.1 |
| Marathi | 87.7 |
| Basque | 78.2 |
| Slovak | 88.2 |
| Kiche | 28.2 |
| Yoruba | 19.5 |
| Warlpiri | 27.9 |
| Tamil | 84.3 |
| Maltese | 19.2 |
| Ancient Greek | 66.3 |
| Icelandic | 84.3 |
| Mbya Guarani | 25.6 |
| Urdu | 68.5 |
| Romanian | 83.8 |
| Persian | 78.3 |
| Apurina | 27.3 |
| Japanese | 34.1 |
| Hungarian | 87.2 |
| Hindi | 73.3 |
| Classical Chinese | 28.3 |
| Komi Permyak | 45.1 |
| Faroese | 78.3 |
| Sanskrit | 30.3 |
| Livvi | 63.1 |
| Arabic | 80.0 |
| Wolof | 27.7 |
| Bulgarian | 89.2 |
| Akuntsu | 28.0 |
| Makurap | 7.5 |
| Kangri | 44.9 |
| Breton | 65.8 |
| Telugu | 85.7 |
| Cantonese | 50.7 |
| Old Church Slavonic | 49.4 |
| Karelian | 73.5 |
| Upper Sorbian | 70.9 |
| South Levantine Arabic | 64.8 |
| Komi Zyrian | 37.1 |
| Irish | 68.9 |
| Nayini | 46.2 |
| Munduruku | 12.3 |
| Manx | 35.7 |
| Skolt Sami | 30.1 |
| Afrikaans | 88.4 |
| Old Turkish | 37.1 |
| Tupinamba | 24.9 |
| Belarusian | 87.2 |
| Serbian | 89.0 |
| Moksha | 41.5 |
| Western Armenian | 79.0 |
| Scottish Gaelic | 59.5 |
| Khunsari | 40.5 |
| Hebrew | 94.8 |
| Uyghur | 77.2 |
| Chukchi | 30.5 |
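
For readers who want to compare their own predictions against these figures, the sketch below shows one common way to compute tagging accuracy: the percentage of tokens whose predicted tag equals the gold tag. The function name `tagging_accuracy` is hypothetical, and the assumption that the reported numbers are token-level accuracy over each language's test split is ours; the paper may differ in details.

```python
from typing import Sequence


def tagging_accuracy(
    gold: Sequence[Sequence[str]],
    predicted: Sequence[Sequence[str]],
) -> float:
    """Percentage of tokens whose predicted tag equals the gold tag."""
    correct = 0
    total = 0
    for gold_sentence, predicted_sentence in zip(gold, predicted):
        if len(gold_sentence) != len(predicted_sentence):
            raise ValueError("gold and predicted sentences must have the same length")
        for gold_tag, predicted_tag in zip(gold_sentence, predicted_sentence):
            correct += gold_tag == predicted_tag
            total += 1
    if total == 0:
        raise ValueError("no tokens to score")
    return 100.0 * correct / total


# Toy example with made-up tags: 4 of 5 tokens match.
gold_tags = [["DET", "NOUN", "VERB"], ["PRON", "VERB"]]
predicted_tags = [["DET", "NOUN", "NOUN"], ["PRON", "VERB"]]
score = tagging_accuracy(gold_tags, predicted_tags)
```

The score is pooled over all tokens rather than averaged per sentence, so long sentences weigh more than short ones.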