Ru Core News Md
CPU-optimized Russian processing pipeline including token classification, dependency parsing, named entity recognition and other NLP tasks
Release Time: 3/2/2022
Model Overview
A medium-sized Russian pipeline for spaCy that provides POS tagging, morphological analysis, lemmatization, dependency parsing, and named entity recognition for Russian text processing tasks.
Model Features
CPU Optimization
Optimized for CPU processing, so it can run in environments without a GPU.
Comprehensive NLP Features
Provides a complete NLP pipeline, from POS tagging to named entity recognition.
High-quality Vector Representations
Includes 20,000 unique 300-dimensional word vectors, shared across 500,002 keys.
Model Capabilities
POS tagging
Morphological analysis
Lemmatization
Dependency parsing
Named entity recognition
Sentence segmentation
Use Cases
Text Processing
Russian Text Analysis
Performing grammatical analysis on Russian texts
Identifies POS tags (accuracy 0.988), morphological features (accuracy 0.973), and syntactic relations (LAS 0.947)
Information Extraction
Extracting named entities from Russian texts
NER F-score reaches 0.9456
Linguistic Research
Russian Grammar Research
Analyzing Russian inflection and syntactic structures
Provides detailed morphological feature annotations
🚀 ru_core_news_md
A Russian language processing pipeline optimized for CPU, covering several token-classification tasks.
📚 Documentation
Details: https://spacy.io/models/ru#ru_core_news_md
ru_core_news_md is a Russian pipeline optimized for CPU. It consists of the components tok2vec, morphologizer, parser, senter, ner, attribute_ruler, and lemmatizer.
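A minimal usage sketch, assuming spaCy and the `ru_core_news_md` package are installed (for example via `python -m spacy download ru_core_news_md`). The helper `summarize` and the sample sentence are illustrative and not part of the model card; the script only prints a hint if the packages are missing.

```python
import importlib.util

MODEL_NAME = "ru_core_news_md"


def summarize(doc):
    """Collect per-token annotations and entities from a spaCy Doc (hypothetical helper)."""
    tokens = [(t.text, t.pos_, t.lemma_, t.dep_) for t in doc]
    entities = [(ent.text, ent.label_) for ent in doc.ents]
    return tokens, entities


def main():
    # Run the pipeline only when spaCy and the model package are installed.
    missing = [
        name for name in ("spacy", MODEL_NAME)
        if importlib.util.find_spec(name) is None
    ]
    if missing:
        print(f"Missing packages: {missing}. Try: python -m spacy download {MODEL_NAME}")
        return

    import spacy

    nlp = spacy.load(MODEL_NAME)
    print("Active components:", nlp.pipe_names)

    # Illustrative sentence, not taken from the model card.
    doc = nlp("Александр Пушкин родился в Москве.")
    tokens, entities = summarize(doc)
    for text, pos, lemma, dep in tokens:
        print(f"{text}\t{pos}\t{lemma}\t{dep}")
    print("Entities:", entities)


if __name__ == "__main__":
    main()
```

The token attributes used here (`pos_`, `lemma_`, `dep_`) and `doc.ents` are filled in by the morphologizer, lemmatizer, parser, and ner components respectively.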
Property | Details |
---|---|
Model Type | ru_core_news_md |
Version | 3.7.0 |
spaCy | >=3.7.0,<3.8.0 |
Default Pipeline | tok2vec, morphologizer, parser, attribute_ruler, lemmatizer, ner |
Components | tok2vec, morphologizer, parser, senter, attribute_ruler, lemmatizer, ner |
Vectors | 500002 keys, 20000 unique vectors (300 dimensions) |
Sources | Nerus (Alexander Kukushkin), Navec (Alexander Kukushkin) |
License | MIT |
Author | Explosion |
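The Components row in the table above lists one name more than the Default Pipeline row. The following sketch compares the two rows to show which component is shipped but not active by default; the function name `disabled_by_default` is hypothetical. In spaCy, such a component can usually be switched on with `nlp.enable_pipe(name)`, though whether that suits a given workflow is an assumption to verify.

```python
# Component lists copied from the Default Pipeline and Components rows.
COMPONENTS = [
    "tok2vec", "morphologizer", "parser", "senter",
    "attribute_ruler", "lemmatizer", "ner",
]
DEFAULT_PIPELINE = [
    "tok2vec", "morphologizer", "parser",
    "attribute_ruler", "lemmatizer", "ner",
]


def disabled_by_default(components, default_pipeline):
    """Return components that ship with the model but are not in the default pipeline."""
    active = set(default_pipeline)
    return [name for name in components if name not in active]


print(disabled_by_default(COMPONENTS, DEFAULT_PIPELINE))  # ['senter']
```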
Model Index
The ru_core_news_md model has been evaluated on several token-classification tasks, with the following results:
Task | Metric | Value |
---|---|---|
NER | NER Precision | 0.9438296445 |
NER | NER Recall | 0.9474835886 |
NER | NER F Score | 0.9456530869 |
TAG | TAG (XPOS) Accuracy | 0.9882061909 |
POS | POS (UPOS) Accuracy | 0.9882061909 |
MORPH | Morph (UFeats) Accuracy | 0.972948348 |
LEMMA | Lemma Accuracy | 2.15295e-05 |
UNLABELED_DEPENDENCIES | Unlabeled Attachment Score (UAS) | 0.9595456565 |
LABELED_DEPENDENCIES | Labeled Attachment Score (LAS) | 0.9474984155 |
SENTS | Sentences F-Score | 0.9985729236 |
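The NER rows in the table above are related: the F-score is the harmonic mean of precision and recall. The short check below recomputes it from the two reported values; the function name `f_score` is illustrative.

```python
def f_score(precision, recall):
    """Harmonic mean of precision and recall (the F1 score)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values copied from the NER Precision and NER Recall rows.
ner_precision = 0.9438296445
ner_recall = 0.9474835886

print(round(f_score(ner_precision, ner_recall), 4))  # 0.9457
```

The result agrees with the reported NER F Score of 0.9456530869 up to rounding.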
Label Scheme
900 labels for 3 components:
Component | Labels |
---|---|
morphologizer |
Case=Nom|Degree=Pos|Number=Plur|POS=ADJ , Animacy=Anim|Case=Nom|Gender=Masc|Number=Plur|POS=NOUN , Aspect=Perf|Mood=Ind|Number=Plur|POS=VERB|Tense=Past|VerbForm=Fin|Voice=Act , Animacy=Inan|Case=Acc|POS=NUM , Animacy=Inan|Case=Gen|Gender=Fem|Number=Plur|POS=NOUN , Case=Gen|Degree=Pos|Gender=Masc|Number=Sing|POS=ADJ , Animacy=Inan|Case=Gen|Gender=Masc|Number=Sing|POS=NOUN , POS=ADP , Case=Gen|Gender=Fem|Number=Sing|POS=DET , Animacy=Inan|Case=Gen|Gender=Fem|Number=Sing|POS=NOUN , POS=PUNCT , Degree=Pos|POS=ADV , Aspect=Imp|Mood=Ind|Number=Plur|POS=VERB|Tense=Past|VerbForm=Fin|Voice=Mid , Animacy=Inan|Case=Nom|Gender=Masc|Number=Plur|POS=NOUN , Animacy=Anim|Case=Gen|Gender=Masc|Number=Plur|POS=NOUN , Aspect=Perf|Case=Gen|Number=Plur|POS=VERB|Tense=Past|VerbForm=Part|Voice=Pass , Case=Loc|Degree=Pos|Number=Plur|POS=ADJ , Animacy=Inan|Case=Loc|Gender=Neut|Number=Plur|POS=NOUN , Animacy=Inan|Case=Loc|Gender=Neut|Number=Sing|POS=PRON , Aspect=Imp|Mood=Ind|Number=Sing|POS=VERB|Person=Third|Tense=Pres|VerbForm=Fin|Voice=Act , Animacy=Inan|Case=Nom|Gender=Neut|Number=Sing|POS=NOUN , Foreign=Yes|POS=PROPN , Case=Loc|Gender=Fem|Number=Sing|POS=NUM , Aspect=Imp|Gender=Neut|Mood=Ind|Number=Sing|POS=VERB|Tense=Past|VerbForm=Fin|Voice=Act , Animacy=Anim|Case=Gen|Gender=Masc|Number=Sing|POS=NOUN , Animacy=Inan|Case=Loc|Gender=Masc|Number=Sing|POS=NOUN , POS=NUM , Animacy=Inan|Case=Gen|Gender=Masc|Number=Plur|POS=NOUN , Case=Nom|Gender=Masc|Number=Sing|POS=PRON|Person=Third , Aspect=Imp|Gender=Masc|Mood=Ind|Number=Sing|POS=AUX|Tense=Past|VerbForm=Fin|Voice=Act , Animacy=Anim|Case=Ins|Gender=Masc|Number=Sing|POS=NOUN , Animacy=Inan|Case=Dat|Gender=Neut|Number=Sing|POS=NOUN , POS=DET , Animacy=Inan|Case=Nom|Gender=Fem|Number=Sing|POS=NOUN , Aspect=Perf|Gender=Fem|Mood=Ind|Number=Sing|POS=VERB|Tense=Past|VerbForm=Fin|Voice=Act , Case=Dat|Degree=Pos|Number=Plur|POS=ADJ , Animacy=Inan|Case=Dat|Gender=Fem|Number=Plur|POS=NOUN , Animacy=Inan|Case=Nom|Gender=Masc|Number=Sing|POS=NOUN , 
Aspect=Perf|Gender=Masc|Mood=Ind|Number=Sing|POS=VERB|Tense=Past|VerbForm=Fin|Voice=Act , POS=SCONJ , Animacy=Inan|Case=Ins|Gender=Neut|Number=Sing|POS=NOUN , Case=Acc|Gender=Neut|Number=Sing|POS=PRON|Person=Third , Case=Acc|POS=NUM , Case=Ins|Degree=Pos|Number=Plur|POS=ADJ , Animacy=Inan|Case=Ins|Gender=Masc|Number=Plur|POS=NOUN , POS=CCONJ , Case=Nom|POS=NUM , Animacy=Inan|Case=Dat|Gender=Masc|Number=Sing|POS=NOUN , Aspect=Perf|Gender=Masc|Number=Sing|POS=VERB|StyleVariant=Short|Tense=Past|VerbForm=Part|Voice=Pass , Case=Nom|Degree=Pos|Gender=Masc|Number=Sing|POS=ADJ , Case=Ins|Degree=Pos|Gender=Neut|Number=Sing|POS=ADJ , Aspect=Imp|Mood=Ind|Number=Plur|POS=VERB|Person=Third|Tense=Pres|VerbForm=Fin|Voice=Act , Case=Nom|Gender=Masc|Number=Sing|POS=DET , Aspect=Imp|Gender=Masc|Mood=Ind|Number=Sing|POS=VERB|Tense=Past|VerbForm=Fin|Voice=Act , Case=Acc|Degree=Pos|Gender=Fem|Number=Sing|POS=ADJ , Animacy=Inan|Case=Acc|Gender=Fem|Number=Sing|POS=NOUN , Case=Nom|Gender=Fem|Number=Sing|POS=PRON , Aspect=Imp|Mood=Ind|Number=Sing|POS=VERB|Person=Third|Tense=Pres|VerbForm=Fin|Voice=Mid , Case=Ins|Degree=Pos|Gender=Fem|Number=Sing|POS=ADJ , Animacy=Anim|Case=Nom|Gender=Fem|Number=Sing|POS=NOUN , Case=Dat|Degree=Pos|Gender=Fem|Number=Sing|POS=ADJ , Animacy=Inan|Case=Dat|Gender=Fem|Number=Sing|POS=NOUN , Animacy=Inan|Case=Gen|Gender=Neut|Number=Sing|POS=NOUN , Animacy=Inan|Case=Nom|Gender=Neut|Number=Plur|POS=NOUN , Degree=Pos|Number=Plur|POS=ADJ|StyleVariant=Short , Aspect=Imp|Mood=Ind|Number=Plur|POS=AUX|Tense=Past|VerbForm=Fin|Voice=Act , Aspect=Perf|POS=VERB|VerbForm=Inf|Voice=Act , Animacy=Inan|Case=Acc|Gender=Neut|Number=Sing|POS=PRON , Case=Loc|Degree=Pos|Gender=Fem|Number=Sing|POS=ADJ , Animacy=Inan|Case=Loc|Gender=Fem|Number=Sing|POS=NOUN , Animacy=Inan|Case=Loc|Gender=Masc|Number=Plur|POS=NOUN , Case=Gen|Degree=Pos|Gender=Fem|Number=Sing|POS=ADJ , Aspect=Perf|Number=Plur|POS=VERB|StyleVariant=Short|Tense=Past|VerbForm=Part|Voice=Pass , 
Animacy=Anim|Case=Acc|Gender=Masc|POS=NUM , Animacy=Anim|Case=Gen|Gender=Fem|Number=Plur|POS=NOUN , Animacy=Anim|Case=Acc|Gender=Neut|Number=Plur|POS=NOUN , Mood=Cnd|POS=SCONJ , Case=Nom|Number=Plur|POS=PRON|Person=Third , POS=PART|Polarity=Neg , Aspect=Imp|POS=VERB|VerbForm=Inf|Voice=Mid , Animacy=Inan|Aspect=Perf|Case=Acc|Number=Plur|POS=VERB|Tense=Past|VerbForm=Part|Voice=Pass , Animacy=Inan|Case=Acc|Gender=Fem|Number=Plur|POS=NOUN , POS=SPACE , Case=Nom|Number=Plur|POS=DET , Aspect=Imp|Mood=Ind|Number=Plur|POS=VERB|Tense=Past|VerbForm=Fin|Voice=Act , Animacy=Anim|Case=Acc|Gender=Masc|Number=Sing|POS=NOUN , Aspect=Imp|Gender=Neut|Mood=Ind|Number=Sing|POS=VERB|Tense=Past|VerbForm=Fin|Voice=Mid , Animacy=Inan|Case=Acc|Gender=Masc|Number=Sing|POS=NOUN , Animacy=Anim|Case=Acc|Number=Plur|POS=PRON , Animacy=Inan|Case=Acc|Gender=Neut|Number=Sing|POS=NOUN , Case=Gen|Degree=Pos|Gender=Neut|Number=Sing|POS=ADJ , Animacy=Anim|Case=Gen|Gender=Masc|Number=Sing|POS=PROPN , Animacy=Anim|Case=Nom|Gender=Fem|Number=Sing|POS=PROPN , Aspect=Imp|Gender=Fem|Mood=Ind|Number=Sing|POS=VERB|Tense=Past|VerbForm=Fin|Voice=Act , POS=INTJ , Animacy=Inan|Case=Loc|Gender=Fem|Number=Plur|POS=NOUN , Animacy=Inan|Case=Nom|Gender=Neut|Number=Sing|POS=PRON , Aspect=Imp|Gender=Fem|Mood=Ind|Number=Sing|POS=AUX|Tense=Past|VerbForm=Fin|Voice=Act , Case=Nom|Degree=Pos|Gender=Fem|Number=Sing|POS=ADJ , Case=Acc|Gender=Masc|Number=Sing|POS=PRON|Person=Third , Case=Nom|Number=Plur|POS=PRON , Aspect=Imp|Gender=Masc|Mood=Ind|Number=Sing|POS=VERB|Tense=Past|VerbForm=Fin|Voice=Mid , Aspect=Imp|Gender=Masc|Mood=Ind|Number=Sing|POS=VERB|Tense=Past|VerbForm=Fin|Voice=Pass , Degree=Pos|Gender=Fem|Number=Sing|POS=ADJ|StyleVariant=Short , Case=Gen|Gender=Masc|Number=Sing|POS=PRON|Person=Third , Case=Gen|POS=PRON , Animacy=Inan|Case=Dat|Gender=Neut|Number=Plur|POS=NOUN , Animacy=Anim|Case=Nom|Gender=Masc|Number=Sing|POS=PROPN , Aspect=Imp|POS=VERB|VerbForm=Inf|Voice=Act , 
Animacy=Anim|Case=Nom|Gender=Masc|Number=Sing|POS=NOUN , Case=Acc|Gender=Fem|Number=Sing|POS=PRON|Person=Third , Animacy=Inan|Case=Acc|Number=Plur|POS=DET , Case=Nom|POS=PRON , Animacy=Anim|Case=Ins|Gender=Masc|Number=Plur|POS=NOUN , POS=ADJ , Case=Loc|Degree=Pos|Gender=Masc|Number=Sing|POS=ADJ , Animacy=Inan|Case=Gen|Gender=Fem|Number=Sing|POS=PROPN , Aspect=Imp|Mood=Ind|Number=Sing|POS=AUX|Person=Third|Tense=Pres|VerbForm=Fin|Voice=Act , Case=Nom|Gender=Fem|Number=Sing|POS=PRON|Person=Third , Case=Ins|Gender=Masc|Number=Sing|POS=DET , Animacy=Inan|Case=Ins|Gender=Masc|Number=Sing|POS=NOUN , Aspect=Perf|Case=Acc|Gender=Neut|Number=Sing|POS=VERB|Tense=Past|VerbForm=Part|Voice=Pass , Animacy=Inan|Case=Loc|Gender=Neut|Number=Sing|POS=NOUN , Animacy=Inan|Case=Gen|Gender=Masc|Number=Sing|POS=PROPN , Case=Nom|Number=Sing|POS=PRON|Person=First , Aspect=Imp|Mood=Ind|Number=Sing|POS=VERB|Person=First|Tense=Pres|VerbForm=Fin|Voice=Act , Animacy=Inan|Case=Acc|Degree=Pos|Gender=Masc|Number=Sing|POS=ADJ , Mood=Cnd|POS=AUX , Case=Nom|Number=Plur|POS=PRON|Person=First , Case=Gen|Number=Plur|POS=DET , Animacy=Inan|Case=Ins|Gender=Masc|Number=Sing|POS=PROPN , Aspect=Imp|Case=Gen|Gender=Masc|Number=Sing|POS=VERB|Tense=Pres|VerbForm=Part|Voice=Act , Animacy=Inan|Case=Ins|Gender=Neut|Number=Sing|POS=PRON , Aspect=Perf|POS=VERB|VerbForm=Inf|Voice=Mid , Aspect=Perf|Case=Gen|Number=Plur|POS=VERB|Tense=Past|VerbForm=Part|Voice=Act , Animacy=Inan|Case=Acc|Gender=Masc|Number=Sing|POS=PROPN , Animacy=Inan|Case=Acc|Gender=Neut|Number=Sing|POS=DET , POS=PART , Case=Dat|Gender=Masc|Number=Sing|POS=DET , Aspect=Perf|Mood=Ind|Number=Plur|POS=VERB|Person=Third|Tense=Fut|VerbForm=Fin|Voice=Mid , Aspect=Perf|Gender=Masc|Mood=Ind|Number=Sing|POS=VERB|Tense=Past|VerbForm=Fin|Voice=Mid , Case=Nom|Gender=Masc|Number=Sing|POS=NUM , Animacy=Anim|Case=Dat|Gender=Fem|Number=Sing|POS=PROPN , Aspect=Perf|Mood=Ind|Number=Sing|POS=VERB|Person=Third|Tense=Fut|VerbForm=Fin|Voice=Mid , 
Case=Loc|Gender=Masc|Number=Sing|POS=DET , Aspect=Perf|Gender=Neut|Mood=Ind|Number=Sing|POS=VERB|Tense=Past|VerbForm=Fin|Voice=Act , Degree=Pos|Gender=Neut|Number=Sing|POS=ADJ|StyleVariant=Short , Animacy=Inan|Case=Gen|Gender=Neut|Number=Plur|POS=NOUN , Animacy=Anim|Case=Dat|Gender=Masc|Number=Sing|POS=NOUN , Case=Nom|Gender=Neut|Number=Sing|POS=PRON|Person=Third , Aspect=Perf|Gender=Neut|Number=Sing|POS=VERB|StyleVariant=Short|Tense=Past|VerbForm=Part|Voice=Pass , Animacy=Inan|Case=Loc|Gender=Fem|Number=Sing|POS=PROPN , Animacy=Inan|Case=Acc|Gender=Masc|Number=Plur|POS=NOUN , Aspect=Perf|Mood=Ind|Number=Plur|POS=VERB|Person=Third|Tense=Fut|VerbForm=Fin|Voice=Act , Aspect=Perf|Mood=Ind|Number=Plur|POS=VERB|Tense=Past|VerbForm=Fin|Voice=Mid , Animacy=Inan|Case=Gen|Gender=Neut|Number=Sing|POS=PRON , Aspect=Perf|Case=Loc|Gender=Neut|Number=Sing|POS=VERB|Tense=Past|VerbForm=Part|Voice=Pass , Animacy=Inan|Case=Loc|Gender=Neut|Number=Sing|POS=PROPN , Case=Dat|Degree=Pos|Gender=Masc|Number=Sing|POS=ADJ , Animacy=Inan|Case=Dat|Gender=Masc|Number=Plur|POS=PROPN , Animacy=Inan|Case=Acc|Degree=Pos|Number=Plur|POS=ADJ , Animacy=Inan|Case=Acc|Gender=Neut|Number=Plur|POS=NOUN , Foreign=Yes|POS=X , Animacy=Inan|Case=Loc|Gender=Masc|Number=Sing|POS=PROPN , Aspect=Imp|POS=VERB|Tense=Pres|VerbForm=Conv|Voice=Act , Case=Gen|Degree=Pos|Number=Plur|POS=ADJ , Animacy=Inan|Case=Ins|Gender=Fem|Number=Sing|POS=NOUN , Aspect=Imp|Gender=Neut|Mood=Ind|Number=Sing|POS=AUX|Tense=Past|VerbForm=Fin|Voice=Act , Case=Nom|Degree=Pos|Gender=Neut|Number=Sing|POS=ADJ , Aspect=Imp|Case=Nom|Number=Plur|POS=VERB|Tense=Past|VerbForm=Part|Voice=Act , Case=Gen|POS=NUM , Animacy=Inan|Case=Acc|Gender=Masc|POS=NUM , `Aspect=Imp|Case=Gen|Number=Plur|POS=VERB|Tense=Pres| |
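Each morphologizer label in the list above packs the coarse POS tag and the morphological features into one `|`-separated string of `Key=Value` pairs. As a rough sketch, such a label can be split into its parts as follows; `parse_label` is a hypothetical helper, not a spaCy function.

```python
def parse_label(label):
    """Split a morphologizer label into (POS tag, feature dict)."""
    features = {}
    for pair in label.split("|"):
        key, _, value = pair.partition("=")
        features[key.strip()] = value.strip()
    # The POS entry is the coarse tag; everything else is a morphological feature.
    pos = features.pop("POS", None)
    return pos, features


pos, feats = parse_label("Animacy=Inan|Case=Gen|Gender=Fem|Number=Plur|POS=NOUN")
print(pos)    # NOUN
print(feats)  # {'Animacy': 'Inan', 'Case': 'Gen', 'Gender': 'Fem', 'Number': 'Plur'}
```

Labels that carry only a POS tag, such as `POS=ADP`, yield an empty feature dictionary.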
📄 License
This project is licensed under the MIT License.