Tr Core News Md
Medium-sized spaCy pipeline optimized for Turkish, including tokenization, part-of-speech tagging, morphological analysis, dependency parsing, and named entity recognition
Release Time : 11/3/2022
Model Overview
This model is part of the TrSpaCy project, specifically designed for Turkish, providing comprehensive natural language processing capabilities including part-of-speech tagging, morphological analysis, dependency parsing, and named entity recognition.
Model Features
Comprehensive Turkish Language Support
Specifically designed and optimized for Turkish, handling unique morphological and syntactic features of Turkish
Multi-task Processing Capability
Single pipeline simultaneously handles tokenization, part-of-speech tagging, morphological analysis, dependency parsing, and named entity recognition
High Accuracy Tagging
Achieves 90.52% accuracy in part-of-speech tagging (UPOS) and 88.94% F1 score in named entity recognition
Pre-trained Word Vectors
Includes 50,000 unique word vectors (300 dimensions) based on Medium-sized Turkish Floret word vectors
Model Capabilities
Turkish Tokenization
Part-of-speech Tagging
Morphological Analysis
Lemmatization
Dependency Parsing
Named Entity Recognition
Sentence Boundary Detection
Use Cases
Text Processing
Turkish Text Annotation
Automatically annotate Turkish text with part-of-speech, morphological features, and syntactic structures
Can be used to build Turkish language resources or preprocess text
Information Extraction
Extract named entities (person names, locations, organizations, etc.) from Turkish text
NER F1 score reaches 88.94%
Linguistic Research
Turkish Morphological Analysis
Analyze the complex morphological structure of Turkish
Morphological feature accuracy 88.93%
🚀 tr_core_news_md
A medium-sized Turkish pipeline for TrSpaCy, supporting various token classification tasks such as NER, TAG, POS, etc.
🚀 Quick Start
The tr_core_news_md model is the medium-sized Turkish pipeline from the TrSpaCy project. A single pass over the text produces part-of-speech tags, morphological features, lemmas, a dependency parse, named entities, and sentence boundaries.
✨ Features
- Multi-Task Support: Capable of performing NER, TAG, POS, MORPH, LEMMA, UNLABELED_DEPENDENCIES, LABELED_DEPENDENCIES, and SENTS tasks.
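- Reported Scores: NER precision, recall, and F-score are all about 0.889; UPOS accuracy is 0.905 and XPOS accuracy is 0.914. Dependency parsing is weaker, with UAS 0.728 and LAS 0.636 (see Results).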
💻 Usage Examples
The original model card does not include usage examples.
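A minimal sketch of how a pipeline like this is typically used through spaCy's standard API. It assumes the tr_core_news_md package is already installed in an environment with a compatible spaCy version (>=3.4.2,<3.5.0 according to the table below). The helper names token_rows and entity_rows and the sample sentence are illustrative and do not come from the model card.

```python
def token_rows(doc):
    """Collect per-token annotations from a spaCy Doc as plain tuples."""
    return [
        (tok.text, tok.pos_, tok.tag_, str(tok.morph), tok.lemma_, tok.dep_)
        for tok in doc
    ]


def entity_rows(doc):
    """Collect named-entity spans as (text, label) pairs."""
    return [(ent.text, ent.label_) for ent in doc.ents]


def main():
    try:
        # Imported lazily so the helpers above work without spaCy installed.
        import spacy
        # Assumption: the model package is installed under this name.
        nlp = spacy.load("tr_core_news_md")
    except (ImportError, OSError) as err:
        print(f"Pipeline not available: {err}")
        return

    # Illustrative sentence, not taken from the model card.
    doc = nlp("Mustafa Kemal Atatürk 1881 yılında Selanik'te doğdu.")

    # UPOS, XPOS, morphological features, lemma, and dependency label per token.
    for row in token_rows(doc):
        print(row)

    # Named entities found by the ner component.
    print(entity_rows(doc))

    # Sentence boundaries come from the dependency parser.
    for sent in doc.sents:
        print(sent.text)


if __name__ == "__main__":
    main()
```

The helpers read only standard Doc, Token, and Span attributes, so they should work unchanged with any spaCy pipeline that provides the same components.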
📚 Documentation
Model Information
Property | Details |
---|---|
Model Name | tr_core_news_md |
Version | 3.4.2 |
spaCy Compatibility | >=3.4.2,<3.5.0 |
Default Pipeline | tok2vec , tagger , morphologizer , trainable_lemmatizer , parser , ner |
Components | tok2vec , tagger , morphologizer , trainable_lemmatizer , parser , ner |
Vectors | -1 keys, 50000 unique vectors (300 dimensions) |
Sources | UD Turkish BOUN (Türk, Utku; Atmaca, Furkan; Özateş, Şaziye Betül; Berk, Gözde; Bedir, Seyyit Talha; Köksal, Abdullatif; Öztürk Başaran, Balkız; Güngör, Tunga; Özgür, Arzucan); Turkish Wiki NER dataset (Duygu Altinok, Co-one Istanbul); PANX/WikiANN (Xiaoman Pan, Boliang Zhang, Jonathan May, Joel Nothman, Kevin Knight, Heng Ji); Medium-sized Turkish Floret word vectors (MC4 corpus) (Duygu Altinok) |
License | cc-by-sa-4.0 |
Author | Duygu Altinok |
Results
The model has been evaluated on several tasks, and the following are the performance metrics:
Task | Metric | Value |
---|---|---|
NER | Precision | 0.8890235772 |
NER | Recall | 0.8897246148 |
NER | F Score | 0.8893739579 |
TAG | Accuracy (XPOS) | 0.9141711565 |
POS | Accuracy (UPOS) | 0.9052411777 |
MORPH | Accuracy (UFeats) | 0.8892973515 |
LEMMA | Accuracy | 0.8171693155 |
UNLABELED_DEPENDENCIES | Unlabeled Attachment Score (UAS) | 0.7275183906 |
LABELED_DEPENDENCIES | Labeled Attachment Score (LAS) | 0.6355130835 |
SENTS | F-Score | 0.8349007315 |
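The NER rows can be cross-checked against each other. The sketch below assumes the reported F Score is the balanced F1, that is, the harmonic mean of precision and recall; the model card does not state this explicitly. The function name f_score is illustrative.

```python
def f_score(precision: float, recall: float) -> float:
    """Balanced F-score (F1): harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values copied from the NER rows of the results table.
ner_precision = 0.8890235772
ner_recall = 0.8897246148
reported_f = 0.8893739579

computed = f_score(ner_precision, ner_recall)
print(f"computed F-score: {computed:.10f}")
print(f"agrees with reported value: {abs(computed - reported_f) < 1e-6}")
```

Under that assumption the recomputed value agrees with the reported F Score to within 1e-6, so the three NER rows are mutually consistent.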
Label Scheme
The pipeline defines 1572 labels across 4 components. Only the tagger labels and part of the morphologizer labels are reproduced below.
Component | Labels |
---|---|
tagger |
ADP , ADV , ANum , ANum_Adj , ANum_Ness , ANum_Noun , ANum_With , ANum_Zero , Abr , Abr_With , Adj , Adj_Ness , Adj_With , Adj_Without , Adj_Zero , Adv , Adverb , Adverb_Adverb , Adverb_Noun , Adverb_Zero , Conj , Conj_Conj , DET , Demons , Demons_Zero , Det , Det_Zero , Dup , Interj , NAdj , NAdj_Aux , NAdj_Ness , NAdj_Noun , NAdj_Rel , NAdj_Verb , NAdj_With , NAdj_Without , NAdj_Zero , NNum , NNum_Rel , NNum_Zero , NOUN , Neg , Ness , Noun , Noun_Ness , Noun_Noun , Noun_Rel , Noun_Since , Noun_Verb , Noun_With , Noun_With_Ness , Noun_With_Verb , Noun_With_Zero , Noun_Without , Noun_Zero , PCAbl , PCAbl_Rel , PCAcc , PCDat , PCDat_Zero , PCGen , PCIns , PCIns_Zero , PCNom , PCNom_Adj , PCNom_Noun , PCNom_Zero , PRON , PUNCT , Pers , Pers_Ness , Pers_Pers , Pers_Rel , Pers_Zero , Postp , Prop , Prop_Conj , Prop_Rel , Prop_Since , Prop_With , Prop_Zero , Punc , Punc_Noun_Ness , Punc_Noun_Rel , Quant , Quant_Zero , Ques , Ques_Zero , Reflex , Reflex_Zero , Rel , SYM , Since , Since_Since , Verb , Verb_Conj , Verb_Ness , Verb_Noun , Verb_Verb , Verb_With , Verb_Zero , With , Without , Without_Zero , Zero |
morphologizer |
NumType=Card|POS=NUM , Aspect=Perf|Case=Loc|Mood=Ind|Number=Plur,Sing|Number[psor]=Sing|POS=NOUN|Person=1,3|Person[psor]=3|Tense=Pres , POS=PUNCT , POS=ADV , POS=NOUN , Case=Nom|Number=Sing|POS=ADJ|Person=3 , POS=DET , Case=Loc|Number=Sing|POS=VERB|Person=1 , Case=Nom|Number=Sing|POS=PRON|Person=3|PronType=Prs , Case=Dat|Number=Sing|POS=VERB|Person=3 , POS=ADJ , Aspect=Perf|Case=Nom|Number=Sing|Number[psor]=Sing|POS=VERB|Person=3|Person[psor]=3|Polarity=Pos|Tense=Past|VerbForm=Part , Case=Gen|Number=Sing|POS=NOUN|Person=3 , POS=PRON , Case=Nom|Number=Sing|POS=NOUN|Person=3 , Aspect=Perf|Case=Acc|Number=Sing|Number[psor]=Sing|POS=VERB|Person=3|Person[psor]=3|Polarity=Pos|Tense=Past|VerbForm=Part , POS=VERB|Polarity=Pos|Tense=Pres|VerbForm=Part , Case=Acc|Number=Plur|POS=NOUN|Person=3 , Aspect=Perf|Evident=Fh|Number=Sing|POS=VERB|Person=3|Tense=Past , Case=Nom|Number=Sing|POS=PROPN|Person=3 , Case=Dat|Number=Sing|POS=PROPN|Person=3 , POS=VERB|Polarity=Pos , Case=Acc|Number=Sing|POS=VERB|Person=3|Polarity=Pos , Aspect=Perf|Evident=Fh|Number=Sing|POS=VERB|Person=3|Polarity=Pos|Tense=Past , Aspect=Prog|Evident=Fh|Number=Sing|POS=VERB|Person=3|Polarity=Pos|Tense=Past , Case=Abl|Number=Sing|POS=ADJ|Person=3 , Case=Nom|Number=Plur|POS=NOUN|Person=3 , Case=Loc|Number=Plur|Number[psor]=Sing|POS=NOUN|Person=3|Person[psor]=3 , POS=INTJ , Case=Abl|Number=Plur|Number[psor]=Sing|POS=NOUN|Person=3|Person[psor]=3 , Case=Ins|Number=Sing|POS=PROPN|Person=3 , Case=Loc|Number=Sing|POS=PROPN|Person=3 , Case=Acc|Number=Sing|POS=NOUN|Person=3 , Aspect=Imp|POS=VERB|Polarity=Pos|Tense=Fut|VerbForm=Part , Case=Nom|Number=Sing|POS=PRON|Person=3 , POS=CCONJ , Case=Nom|Number=Plur|Number[psor]=Sing|POS=NOUN|Person=3|Person[psor]=3 , Case=Nom|Mood=Imp|Number=Sing|POS=VERB|Person=3|Polarity=Pos|VerbForm=Conv|Voice=Cau , Case=Dat|Number=Sing|Number[psor]=Plur|POS=ADJ|Person=3|Person[psor]=1 , Aspect=Prog|Number=Sing|POS=VERB|Person=3|Polarity=Pos|Tense=Pres , 
Case=Gen|Number=Sing|POS=PROPN|Person=3 , Case=Abl|Number=Sing|Number[psor]=Sing|POS=NOUN|Person=3|Person[psor]=3 , Case=Nom|Number=Sing|POS=ADP|Person=3 , Case=Dat|Number=Plur|POS=NOUN|Person=3 , Aspect=Perf|Evident=Fh|Number=Sing|POS=VERB|Person=3|Polarity=Pos|Tense=Past|Voice=Pass , Case=Nom|POS=VERB|Polarity=Pos , Case=Nom|Number=Sing|POS=VERB|Person=3 , Case=Loc|Number=Sing|Number[psor]=Sing|POS=NOUN|Person=3|Person[psor]=3 , Case=Nom|Number=Sing|POS=VERB|Person=3|Polarity=Pos|Voice=Cau , Case=Dat|Number=Sing|Number[psor]=Sing|POS=NOUN|Person=3|Person[psor]=3 , Case=Acc|Number=Sing|POS=PROPN|Person=3 , Aspect=Imp|Number=Sing|POS=VERB|Person=3|Polarity=Pos|Tense=Fut , POS=ADP , Aspect=Perf|Evident=Fh|Number=Sing|POS=VERB|Person=1|Polarity=Pos|Tense=Past|Voice=Pass , Evident=Nfh|Number=Sing|POS=VERB|Person=3|Polarity=Pos|Tense=Past , Case=Nom|Number=Sing|Number[psor]=Sing|POS=NOUN|Person=3|Person[psor]=1 , Aspect=Perf|Number[psor]=Sing|POS=VERB|Person[psor]=3|Polarity=Pos|Tense=Past|VerbForm=Part , Aspect=Perf|Case=Nom|Number=Sing|Number[psor]=Sing|POS=VERB|Person=3|Person[psor]=3|Polarity=Neg|Tense=Past|VerbForm=Part , Case=Acc|Number=Plur|POS=PRON|Person=3 , Aspect=Perf|Number[psor]=Sing|POS=VERB|Person[psor]=3|Polarity=Pos|Tense=Past|VerbForm=Part|Voice=Cau , Case=Acc|Number=Plur|POS=VERB|Person=3 , Aspect=Perf|Case=Abl|Number=Sing|Number[psor]=Sing|POS=VERB|Person=3|Person[psor]=3|Polarity=Neg|Tense=Past|VerbForm=Part , Mood=Opt|Number=Sing|POS=VERB|Person=1|Polarity=Pos , Case=Dat|Number=Sing|POS=NOUN|Person=3 , Aspect=Prog|Number=Sing|POS=VERB|Person=1|Polarity=Pos|Tense=Pres , Case=Gen|Number=Sing|Number[psor]=Sing|POS=NOUN|Person=3|Person[psor]=3 , Case=Dat|Number=Plur|Number[psor]=Sing|POS=NOUN|Person=3|Person[psor]=3 , Aspect=Prog|Evident=Fh|Number=Plur|POS=VERB|Person=1|Polarity=Pos|Tense=Past , Case=Acc|Number=Sing|POS=PRON|Person=1 , Aspect=Perf|Evident=Fh|Number=Plur|POS=VERB|Person=3|Polarity=Neg|Tense=Past , 
Case=Ins|Number=Sing|Number[psor]=Sing|POS=NOUN|Person=3|Person[psor]=3 , Case=Gen|Number=Sing|Number[psor]=Sing|POS=NOUN|Person=3|Person[psor]=1 , Case=Dat|Number=Sing|Number[psor]=Sing|POS=ADJ|Person=3|Person[psor]=3 , Case=Gen|Number=Sing|POS=PRON|Person=3 , Case=Acc|Number=Plur|Number[psor]=Plur|POS=NOUN|Person=3|Person[psor]=1 , Aspect=Hab|Number=Sing|POS=VERB|Person=3|Polarity=Pos|Tense=Pres , Aspect=Hab|Number=Plur|POS=VERB|Person=1|Polarity=Pos|Tense=Pres , Case=Loc|Number=Sing|POS=NOUN|Person=3 , Aspect=Perf|Case=Acc|Number=Sing|Number[psor]=Sing|POS=VERB|Person=3|Person[psor]=3|Polarity=Neg|Tense=Past|VerbForm=Part , Aspect=Hab|Number=Sing|POS=VERB|Person=1|Polarity=Pos|Tense=Pres , Aspect=Perf|Evident=Fh|Number=Sing|POS=VERB|Person=1|Polarity=Pos|Tense=Past , Case=Gen|Number=Sing|Number[psor]=Plur|POS=NOUN|Person=3|Person[psor]=1 , Aspect=Hab|Mood=Pot|Number=Sing|POS=VERB|Person=3|Polarity=Pos|Tense=Pres , Case=Acc|Number=Plur|POS=PRON|Person=1 , Case=Nom|Number=Sing|POS=NOUN|Person=3|Polarity=Pos , Case=Nom|Number=Sing|Number[psor]=Sing|POS=PRON|Person=3|Person[psor]=3 , Aspect=Hab|Mood=Imp|Number=Sing|POS=VERB|Person=3|Polarity=Pos|Tense=Pres|VerbForm=Conv , Aspect=Hab|Mood=Pot|Number=Sing|POS=VERB|Person=3|Polarity=Pos|Tense=Pres|Voice=Cau , Case=Dat|Number=Plur|Number[psor]=Plur|POS=NOUN|Person=3|Person[psor]=1 , Case=Abl|Number=Sing|POS=NOUN|Person=3 , Mood=Imp|POS=VERB|Polarity=Pos|VerbForm=Conv , Aspect=Perf|Evident=Fh|Number=Plur|POS=VERB|Person=1|Polarity=Pos|Tense=Past , Case=Nom|Number=Plur|POS=PRON|Person=3 , Case=Nom|Number=Sing|Number[psor]=Sing|POS=NUM|Person=3|Person[psor]=3 , Case=Nom|Number=Sing|Number[psor]=Sing|POS=NOUN|Person=3|Person[psor]=3 , Aspect=Perf|Evident=Fh|Number=Sing|POS=VERB|Person=1|Polarity=Neg|Tense=Past|Voice=Cau , Case=Nom|Number=Plur|POS=ADJ|Person=3 , Aspect=Hab|Mood=Cnd|Number=Plur|POS=VERB|Person=2|Polarity=Pos|Tense=Pres , Aspect=Hab|Number=Plur|POS=VERB|Person=3|Polarity=Neg|Tense=Pres , 
Aspect=Hab|Number=Sing|POS=VERB|Person=3|Polarity=Neg|Tense=Pres , Aspect=Hab|Number=Plur|POS=VERB|Person=3|Polarity=Pos|Tense=Pres , Case=Gen|Number=Plur|Number[psor]=Sing|POS=NOUN|Person=3|Person[psor]=3 , Case=Gen|Number=Plur|POS=NOUN|Person=3 , Case=Ins|Number=Sing|Number[psor]=Sing|POS=NOUN|Person=3|Person[psor]=3|Polarity=Pos , Aspect=Imp|Case=Acc|Number=Sing|Number[psor]=Sing|POS=VERB|Person=3|Person[psor]=3|Polarity=Pos|Tense=Fut|VerbForm=Part , Case=Acc|Number=Sing|Number[psor]=Sing|POS=NOUN|Person=3|Person[psor]=3 , Aspect=Imp|Number=Sing|POS=AUX|Person=3|Tense=Pres , Case=Loc|Number=Sing|POS=NUM|Person=3 , Aspect=Perf|Evident=Fh|Number=Plur|POS=VERB|Person=3|Polarity=Pos|Tense=Past , Case=Loc|Number=Sing|Number[psor]=Sing|POS=NOUN|Person=3|Person[psor]=2 , Case=Gen|Number=Plur|POS=PRON|Person=1 , Aspect=Perf|Number[psor]=Plur|POS=VERB|Person[psor]=1|Polarity=Pos|Tense=Past|VerbForm=Part , Aspect=Prog|Number=Sing|POS=VERB|Person=3|Polarity=Neg|Tense=Pres , Case=Nom|Number=Sing|POS=PRON|Person=1 , Case=Nom|Number=Sing|POS=NOUN|Person=1 , Mood=Cnd|Number=Sing|POS=AUX|Person=3|Polarity=Pos , Case=Acc|Number=Sing|POS=PRON|Person=3 , Aspect=Prog|Number=Plur|POS=VERB|Person=1|Polarity=Pos|Tense=Pres , Case=Ins|Number=Sing|POS=NOUN|Person=3 , POS=VERB|Polarity=Pos|Tense=Pres|VerbForm=Part|Voice=Pass , Aspect=Perf|Case=Nom|Number=Sing|Number[psor]=Sing|POS=VERB|Person=3|Person[psor]=1|Polarity=Pos|Tense=Past|VerbForm=Part , Case=Nom|POS=VERB|Polarity=Pos|Voice=Cau , Aspect=Prog|Evident=Fh|Number=Sing|POS=VERB|Person=3|Polarity=Neg|Tense=Past , Case=Nom|Number=Sing|POS=ADJ|Person=3|Polarity=Pos , Case=Acc|Number=Sing|POS=VERB|Person=3 , Aspect=Perf|Case=Nom|Mood=Gen|Number=Sing|POS=NOUN|Person=3|Tense=Pres , Case=Abl|Number=Plur|POS=NOUN|Person=3 , Aspect=Perf|Evident=Fh|Number=Sing|POS=VERB|Person=3|Polarity=Neg|Tense=Past , Aspect=Prog|Evident=Fh|Number=Plur|POS=VERB|Person=3|Polarity=Neg|Tense=Past , Mood=Imp|POS=VERB|Polarity=Pos|VerbForm=Conv|Voice=Cau , 
Aspect=Perf|Evident=Fh|Number=Sing|POS=VERB|Person=1|Polarity=Pos|Tense=Past|Voice=Cau , Case=Nom|Number=Plur|Number[psor]=Plur|POS=NOUN |
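Each morphologizer label packs a coarse POS tag and a set of Universal Dependencies style features into one string, with features separated by | and alternative values separated by commas. A small sketch of how such a label could be unpacked in plain Python follows; the function name parse_morph_label is hypothetical and not part of spaCy or the model.

```python
def parse_morph_label(label: str) -> dict:
    """Split a label such as 'Case=Nom|Number=Sing|POS=NOUN|Person=3'.

    Returns a dict mapping each feature name to a list of values, so that
    multi-valued features like 'Number=Plur,Sing' are kept intact.
    """
    features = {}
    for pair in label.split("|"):
        name, _, value = pair.partition("=")
        features[name.strip()] = value.strip().split(",")
    return features


# Labels copied from the morphologizer list above.
simple = parse_morph_label("Case=Nom|Number=Sing|POS=NOUN|Person=3")
multi = parse_morph_label(
    "Aspect=Perf|Case=Loc|Mood=Ind|Number=Plur,Sing|Number[psor]=Sing"
    "|POS=NOUN|Person=1,3|Person[psor]=3|Tense=Pres"
)

print(simple["POS"], simple["Case"])
print(multi["Number"], multi["Person"])
```

Keeping every value as a list avoids special-casing the few labels that carry more than one value for a feature.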
📄 License
The model is released under the cc-by-sa-4.0 license.