XLM-RoBERTa base Universal Dependencies v2.8 POS tagging: Romanian
This XLM-RoBERTa base model is fine-tuned for part-of-speech tagging on the Romanian data of Universal Dependencies v2.8. It belongs to a study of cross-lingual transfer that reports POS tagging results for over 100 languages. Check the Space for more details.
Quick Start
This model is part of the paper "Make the Best of Cross-lingual Transfer: Evidence from POS Tagging with over 100 Languages". More details are available in the Space.
Features
- Multilingual Support: Capable of performing part-of-speech tagging on over 100 languages.
- Cross-lingual Transfer: Demonstrates effective cross-lingual transfer in POS tagging tasks.
Installation
The original README provides no installation command. The usage example imports the `transformers` library, so that package must be available.
Usage Examples
Basic Usage
from transformers import AutoTokenizer, AutoModelForTokenClassification
tokenizer = AutoTokenizer.from_pretrained("wietsedv/xlm-roberta-base-ft-udpos28-ro")
model = AutoModelForTokenClassification.from_pretrained("wietsedv/xlm-roberta-base-ft-udpos28-ro")
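The snippet above only loads the tokenizer and the model. A minimal sketch of tagging a pre-split sentence could look like the following. It assumes PyTorch is installed and that the checkpoint ships a fast tokenizer, so that `word_ids()` is available. The helper names `first_subword_tags` and `tag_words` are hypothetical, and keeping the prediction of each word's first subword is one common alignment choice, not necessarily the one used by the authors.

```python
from typing import Dict, List, Optional, Sequence

MODEL_NAME = "wietsedv/xlm-roberta-base-ft-udpos28-ro"


def first_subword_tags(
    word_ids: Sequence[Optional[int]],
    pred_ids: Sequence[int],
    id2label: Dict[int, str],
) -> List[str]:
    """Return one tag per word, taken from the word's first subword."""
    tags: List[str] = []
    seen = set()
    for word_id, pred_id in zip(word_ids, pred_ids):
        # word_id is None for special tokens such as <s> and </s>.
        if word_id is None or word_id in seen:
            continue
        seen.add(word_id)
        tags.append(id2label[pred_id])
    return tags


def tag_words(words: List[str], model_name: str = MODEL_NAME) -> List[str]:
    """Download the model (network access required) and tag each word."""
    # Imported lazily so the alignment helper works without these packages.
    import torch
    from transformers import AutoModelForTokenClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForTokenClassification.from_pretrained(model_name)
    model.eval()

    # The sentence is passed as a list of words, not as a raw string.
    encoding = tokenizer(words, is_split_into_words=True, return_tensors="pt")
    with torch.no_grad():
        logits = model(**encoding).logits  # (1, sequence_length, num_labels)
    pred_ids = logits.argmax(dim=-1)[0].tolist()
    return first_subword_tags(encoding.word_ids(0), pred_ids, model.config.id2label)
```

Calling `tag_words(["Ana", "are", "mere", "."])` returns one label per input word, with label names read from `model.config.id2label`.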
Advanced Usage
The original README does not provide an advanced usage example.
Documentation
Model Information
| Property | Details |
|---|---|
| Model Type | xlm-roberta-base-ft-udpos28-ro |
| Training Data | Universal Dependencies v2.8 |
Metrics
The following table shows the model's test accuracy for each evaluated language:
| Language | Test Accuracy |
|---|---|
| English | 88.4 |
| Dutch | 86.1 |
| German | 87.3 |
| Italian | 88.2 |
| French | 91.3 |
| Spanish | 91.1 |
| Russian | 90.4 |
| Swedish | 90.7 |
| Norwegian | 85.0 |
| Danish | 91.0 |
| Low Saxon | 56.2 |
| Akkadian | 41.8 |
| Armenian | 88.4 |
| Welsh | 71.7 |
| Old East Slavic | 78.7 |
| Albanian | 90.2 |
| Slovenian | 80.3 |
| Guajajara | 39.3 |
| Kurmanji | 79.5 |
| Turkish | 79.5 |
| Finnish | 86.0 |
| Indonesian | 84.2 |
| Ukrainian | 89.7 |
| Polish | 89.5 |
| Portuguese | 90.3 |
| Kazakh | 85.0 |
| Latin | 81.8 |
| Old French | 65.7 |
| Buryat | 64.9 |
| Kaapor | 27.1 |
| Korean | 64.3 |
| Estonian | 87.5 |
| Croatian | 89.7 |
| Gothic | 35.1 |
| Swiss German | 55.5 |
| Assyrian | 16.8 |
| North Sami | 45.0 |
| Naija | 43.8 |
| Latvian | 89.5 |
| Chinese | 54.9 |
| Tagalog | 74.0 |
| Bambara | 32.9 |
| Lithuanian | 87.7 |
| Galician | 89.9 |
| Vietnamese | 66.2 |
| Greek | 88.9 |
| Catalan | 90.0 |
| Czech | 89.8 |
| Erzya | 51.5 |
| Bhojpuri | 55.0 |
| Thai | 64.9 |
| Marathi | 87.1 |
| Basque | 80.7 |
| Slovak | 89.8 |
| Kiche | 42.4 |
| Yoruba | 30.3 |
| Warlpiri | 46.2 |
| Tamil | 82.5 |
| Maltese | 38.3 |
| Ancient Greek | 67.8 |
| Icelandic | 85.1 |
| Mbya Guarani | 34.4 |
| Urdu | 63.4 |
| Romanian | 96.8 |
| Persian | 79.0 |
| Apurina | 43.1 |
| Japanese | 43.7 |
| Hungarian | 79.9 |
| Hindi | 70.6 |
| Classical Chinese | 40.8 |
| Komi Permyak | 57.2 |
| Faroese | 80.9 |
| Sanskrit | 40.4 |
| Livvi | 66.9 |
| Arabic | 83.5 |
| Wolof | 43.1 |
| Bulgarian | 91.2 |
| Akuntsu | 40.6 |
| Makurap | 20.5 |
| Kangri | 53.7 |
| Breton | 68.7 |
| Telugu | 82.9 |
| Cantonese | 57.0 |
| Old Church Slavonic | 59.1 |
| Karelian | 75.0 |
| Upper Sorbian | 77.8 |
| South Levantine Arabic | 71.2 |
| Komi Zyrian | 47.0 |
| Irish | 69.4 |
| Nayini | 56.4 |
| Munduruku | 29.2 |
| Manx | 38.8 |
| Skolt Sami | 43.7 |
| Afrikaans | 88.2 |
| Old Turkish | 37.1 |
| Tupinamba | 44.5 |
| Belarusian | 90.4 |
| Serbian | 89.5 |
| Moksha | 49.1 |
| Western Armenian | 82.0 |
| Scottish Gaelic | 63.1 |
| Khunsari | 47.3 |
| Hebrew | 88.5 |
| Uyghur | 78.0 |
| Chukchi | 37.5 |
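For readers who want to score their own predictions in a comparable way, here is a minimal sketch of a token-level accuracy computation. It assumes the reported figure is the percentage of tokens whose predicted tag equals the gold tag. Both that definition and the helper name `token_accuracy` are assumptions, not taken from the original README.

```python
from typing import Sequence


def token_accuracy(
    gold: Sequence[Sequence[str]],
    pred: Sequence[Sequence[str]],
) -> float:
    """Percentage of tokens whose predicted tag matches the gold tag."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must contain the same number of sentences")
    correct = 0
    total = 0
    for gold_sent, pred_sent in zip(gold, pred):
        if len(gold_sent) != len(pred_sent):
            raise ValueError("each sentence pair must have the same number of tokens")
        # Count positions where the predicted tag equals the gold tag.
        correct += sum(g == p for g, p in zip(gold_sent, pred_sent))
        total += len(gold_sent)
    if total == 0:
        raise ValueError("no tokens to score")
    return 100.0 * correct / total
```

The score is pooled over all tokens rather than averaged per sentence, so long sentences weigh more than short ones.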
License
This model is released under the Apache 2.0 license.