Italian-Bert (Italian Bert) + POS
This model is a fine-tuned version of Bert Base Italian on the xtreme udpos Italian dataset for the part-of-speech (POS) tagging downstream task. It reaches an F1 score of 97.25 on the evaluation set.
Quick Start
The model is ready to use for POS tagging in Italian. You can follow the usage examples below to start.
Features
- Fine-tuned on the xtreme udpos Italian dataset for the POS downstream task.
- Covers the 17 POS labels listed under Documentation, including ADJ, ADP, and ADV.
- Reaches F1 97.25, precision 97.15, and recall 97.36 on the evaluation set.
Installation
The original model card lists no installation steps. The usage example below imports the `transformers` library, so that package needs to be available.
Usage Examples
Basic Usage
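from transformers import pipeline

# POS tagging is token classification, so the "ner" pipeline is used.
nlp_pos = pipeline(
    "ner",
    model="sachaarbonel/bert-italian-cased-finetuned-pos",
    # Load the tokenizer from the same Italian checkpoint as the model.
    tokenizer=(
        "sachaarbonel/bert-italian-cased-finetuned-pos",
        {"use_fast": False},
    ),
)

text = "Roma è la Capitale d'Italia."
nlp_pos(text)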
'''
Output:
--------
[{'entity': 'PROPN', 'index': 1, 'score': 0.9995346665382385, 'word': 'roma'},
{'entity': 'AUX', 'index': 2, 'score': 0.9966597557067871, 'word': 'e'},
{'entity': 'DET', 'index': 3, 'score': 0.9994786977767944, 'word': 'la'},
{'entity': 'NOUN',
'index': 4,
'score': 0.9995198249816895,
'word': 'capitale'},
{'entity': 'ADP', 'index': 5, 'score': 0.9990894198417664, 'word': 'd'},
{'entity': 'PART', 'index': 6, 'score': 0.57159024477005, 'word': "'"},
{'entity': 'PROPN',
'index': 7,
'score': 0.9994804263114929,
'word': 'italia'},
{'entity': 'PUNCT', 'index': 8, 'score': 0.9772886633872986, 'word': '.'}]
'''
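The pipeline returns one dictionary per token. Below is a minimal sketch of post-processing that output, assuming the list shown above. `sample_output` is copied from it with scores rounded, `to_pairs` and `low_confidence` are hypothetical helper names, and the 0.9 threshold is an arbitrary value chosen for illustration.

```python
# Sample copied from the output above (scores rounded for brevity).
sample_output = [
    {"entity": "PROPN", "index": 1, "score": 0.9995, "word": "roma"},
    {"entity": "AUX", "index": 2, "score": 0.9967, "word": "e"},
    {"entity": "DET", "index": 3, "score": 0.9995, "word": "la"},
    {"entity": "NOUN", "index": 4, "score": 0.9995, "word": "capitale"},
    {"entity": "ADP", "index": 5, "score": 0.9991, "word": "d"},
    {"entity": "PART", "index": 6, "score": 0.5716, "word": "'"},
    {"entity": "PROPN", "index": 7, "score": 0.9995, "word": "italia"},
    {"entity": "PUNCT", "index": 8, "score": 0.9773, "word": "."},
]


def to_pairs(predictions):
    """Return (word, tag) tuples in token order."""
    ordered = sorted(predictions, key=lambda p: p["index"])
    return [(p["word"], p["entity"]) for p in ordered]


def low_confidence(predictions, threshold=0.9):
    """Return the predictions whose score falls below the threshold."""
    return [p for p in predictions if p["score"] < threshold]


pairs = to_pairs(sample_output)
uncertain = low_confidence(sample_output)

print(pairs)
print([(p["word"], p["entity"]) for p in uncertain])
```

With this sample, only the apostrophe tagged PART falls below the threshold, which matches its score of about 0.57 in the output.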
Documentation
Details of the downstream task (POS) - Dataset

| Dataset | # Examples |
|---------|------------|
| Train   | 716 K      |
| Dev     | 85 K       |
Labels covered: ADJ, ADP, ADV, AUX, CCONJ, DET, INTJ, NOUN, NUM, PART, PRON, PROPN, PUNCT, SCONJ, SYM, VERB, X
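These labels are the 17 universal POS tags defined by Universal Dependencies. For readers unfamiliar with the abbreviations, a small lookup table can make predictions easier to read. `UPOS_DESCRIPTIONS` and `describe` are hypothetical names introduced here for illustration; they are not part of the model or of `transformers`.

```python
# Universal Dependencies POS tags and their plain-English names.
UPOS_DESCRIPTIONS = {
    "ADJ": "adjective",
    "ADP": "adposition",
    "ADV": "adverb",
    "AUX": "auxiliary",
    "CCONJ": "coordinating conjunction",
    "DET": "determiner",
    "INTJ": "interjection",
    "NOUN": "noun",
    "NUM": "numeral",
    "PART": "particle",
    "PRON": "pronoun",
    "PROPN": "proper noun",
    "PUNCT": "punctuation",
    "SCONJ": "subordinating conjunction",
    "SYM": "symbol",
    "VERB": "verb",
    "X": "other",
}


def describe(tag):
    """Return a readable name for a tag, or a fallback for unknown tags."""
    return UPOS_DESCRIPTIONS.get(tag, "unknown tag")


print(describe("PROPN"))
```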
Metrics on evaluation set

| Metric    | Score |
|-----------|-------|
| F1        | 97.25 |
| Precision | 97.15 |
| Recall    | 97.36 |
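As a sanity check on these figures, F1 is conventionally the harmonic mean of precision and recall. The sketch below assumes that definition applies to the reported scores; the source does not state how they were computed or averaged. `f1_from` is a hypothetical helper.

```python
def f1_from(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


precision, recall = 97.15, 97.36
f1 = f1_from(precision, recall)

print(round(f1, 2))  # 97.25
```

Under that assumption, the reported precision and recall reproduce the reported F1 to two decimal places.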
Technical Details
The model is a fine-tuned version of [Bert Base Italian](https://huggingface.co/dbmdz/bert-base-italian-cased) on the xtreme udpos Italian dataset for the POS downstream task. Inference uses the `ner` pipeline from the `transformers` library.
Created by Sacha Arbonel/@sachaarbonel | [LinkedIn](https://www.linkedin.com/in/sacha-arbonel)
Made with ♥ in Paris