Fr Core News Sm
A CPU-optimized small French natural language processing model provided by spaCy, featuring tokenization, part-of-speech tagging, dependency parsing, named entity recognition, and more.
Downloads 160
Release Time: 3/2/2022
Model Overview
This is a French processing pipeline model primarily used for basic NLP tasks in French text, including tokenization, part-of-speech tagging, dependency parsing, named entity recognition, etc. The model is optimized for CPU usage, making it suitable for lightweight application scenarios.
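A minimal sketch of how the pipeline might be used, assuming spaCy and the `fr_core_news_sm` package are installed (for instance via `python -m spacy download fr_core_news_sm`). The helper names `load_pipeline`, `token_rows`, and `entity_rows` are hypothetical and not part of spaCy; the attributes they read (`pos_`, `dep_`, `lemma_`, `ents`, `label_`) are standard spaCy token, document, and span attributes.

```python
def load_pipeline(name: str = "fr_core_news_sm"):
    """Load the pipeline; assumes spaCy and the model package are installed."""
    import spacy  # imported lazily so the helpers below work without spaCy

    return spacy.load(name)


def token_rows(doc):
    """Return (text, UPOS tag, dependency label, lemma) for each token."""
    return [(tok.text, tok.pos_, tok.dep_, tok.lemma_) for tok in doc]


def entity_rows(doc):
    """Return (text, label) for each named entity found in the document."""
    return [(ent.text, ent.label_) for ent in doc.ents]
```

For example, `doc = load_pipeline()("Emmanuel Macron a visité Lyon.")` followed by `token_rows(doc)` and `entity_rows(doc)` would list the per-token annotations and the detected entities. The exact predictions depend on the installed model version.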
Model Features
CPU Optimization
Model specifically optimized for CPU usage, suitable for resource-limited environments.
Comprehensive NLP Capabilities
Provides complete NLP processing capabilities from basic tokenization to complex syntactic analysis.
High-Accuracy Part-of-Speech Tagging
Part-of-speech tagging accuracy reaches 96.18% (UPOS).
Named Entity Recognition
The NER F1 score reaches 81.27%; the model recognizes LOC, MISC, ORG, and PER entities in French text.
Model Capabilities
Text Tokenization
Part-of-Speech Tagging
Named Entity Recognition
Dependency Parsing
Lemmatization
Sentence Segmentation
Morphological Analysis
Use Cases
Text Processing
French Text Analysis
Basic NLP processing for French news, articles, etc.
Obtain structured information such as tokenization, part-of-speech tagging, and named entities.
Information Extraction
French Entity Recognition
Extract named entities such as person names, locations, and organizations from French text.
Reported NER F1 score: 81.27%.
🚀 fr_core_news_sm
A French language processing pipeline optimized for CPU, offering various token-classification capabilities.
📚 Documentation
Details: https://spacy.io/models/fr#fr_core_news_sm
This is a French pipeline optimized for CPU. Its components are tok2vec, morphologizer, parser, senter, ner, attribute_ruler, and lemmatizer; senter is available but is not part of the default pipeline.
Property | Details |
---|---|
Name | fr_core_news_sm |
Version | 3.7.0 |
spaCy | >=3.7.0,<3.8.0 |
Default Pipeline | tok2vec , morphologizer , parser , attribute_ruler , lemmatizer , ner |
Components | tok2vec , morphologizer , parser , senter , attribute_ruler , lemmatizer , ner |
Vectors | 0 keys, 0 unique vectors (0 dimensions) |
Sources | UD French Sequoia v2.8 (Candito, Marie; Seddah, Djamé; Perrier, Guy; Guillaume, Bruno); WikiNER (Joel Nothman, Nicky Ringland, Will Radford, Tara Murphy, James R Curran); spaCy lookups data (Explosion) |
License | LGPL-LR |
Author | Explosion |
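The spaCy constraint in the table, `>=3.7.0,<3.8.0`, can be checked against an installed version before loading the model. The sketch below is one way to do that with plain tuple comparison; `parse_version` and `is_supported` are hypothetical helper names, and the code assumes simple `X.Y.Z` version strings without pre-release suffixes.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn a plain 'X.Y.Z' release string into a comparable tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def is_supported(spacy_version: str,
                 lower: str = "3.7.0",
                 upper: str = "3.8.0") -> bool:
    """Return True if lower <= spacy_version < upper (i.e. >=3.7.0,<3.8.0)."""
    return parse_version(lower) <= parse_version(spacy_version) < parse_version(upper)


for candidate in ("3.6.1", "3.7.0", "3.7.5", "3.8.0"):
    print(candidate, is_supported(candidate))
```

Versions with suffixes such as release candidates would need a real version parser; the `packaging` library's `SpecifierSet` handles the general case.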
Label Scheme
View label scheme (237 labels for 3 components)
Component | Labels |
---|---|
morphologizer |
POS=PROPN , Gender=Fem|Number=Sing|POS=DET|PronType=Dem , Gender=Fem|Number=Sing|POS=NOUN , Number=Plur|POS=PRON|Person=1 , Mood=Ind|Number=Sing|POS=VERB|Person=3|Tense=Pres|VerbForm=Fin , POS=SCONJ , POS=ADP , Definite=Def|Gender=Masc|Number=Sing|POS=DET|PronType=Art , NumType=Ord|POS=ADJ , Gender=Masc|Number=Sing|POS=NOUN , POS=PUNCT , Gender=Masc|Number=Sing|POS=PROPN , Number=Plur|POS=ADJ , Gender=Masc|Number=Plur|POS=NOUN , Definite=Ind|Gender=Fem|Number=Sing|POS=DET|PronType=Art , Number=Sing|POS=ADJ , Mood=Ind|Number=Sing|POS=VERB|Person=3|Tense=Imp|VerbForm=Fin , POS=ADV , Mood=Ind|Number=Sing|POS=AUX|Person=3|Tense=Past|VerbForm=Fin , Gender=Fem|Number=Sing|POS=VERB|Tense=Past|VerbForm=Part|Voice=Pass , Definite=Def|Gender=Fem|Number=Sing|POS=DET|PronType=Art , Gender=Fem|Number=Sing|POS=PROPN , Definite=Def|Number=Sing|POS=DET|PronType=Art , NumType=Card|POS=NUM , Definite=Def|Number=Plur|POS=DET|PronType=Art , Gender=Masc|Number=Plur|POS=ADJ , POS=CCONJ , Gender=Fem|Number=Plur|POS=NOUN , Mood=Ind|Number=Plur|POS=VERB|Person=3|Tense=Past|VerbForm=Fin , Gender=Masc|Number=Sing|POS=VERB|Tense=Past|VerbForm=Part , Gender=Fem|Number=Plur|POS=ADJ , POS=ADJ , Mood=Ind|Number=Sing|POS=VERB|Person=3|Tense=Past|VerbForm=Fin , POS=PRON|PronType=Rel , Number=Sing|POS=DET|Poss=Yes , Definite=Def|Gender=Masc|Number=Sing|POS=ADP|PronType=Art , Definite=Def|Number=Plur|POS=ADP|PronType=Art , Definite=Ind|Number=Plur|POS=DET|PronType=Art , Mood=Ind|Number=Plur|POS=AUX|Person=3|Tense=Past|VerbForm=Fin , Gender=Masc|Number=Plur|POS=VERB|Tense=Past|VerbForm=Part|Voice=Pass , Mood=Ind|Number=Sing|POS=AUX|Person=3|Tense=Pres|VerbForm=Fin , POS=VERB|VerbForm=Inf , Gender=Fem|Number=Sing|POS=ADJ , Gender=Masc|Number=Sing|POS=PRON|Person=3 , Number=Plur|POS=DET , Mood=Ind|Number=Plur|POS=AUX|Person=3|Tense=Pres|VerbForm=Fin , Gender=Masc|Number=Sing|POS=ADJ , Gender=Masc|Number=Sing|POS=DET|PronType=Dem , POS=ADV|PronType=Int , POS=VERB|Tense=Pres|VerbForm=Part , 
Gender=Fem|Number=Sing|POS=VERB|Tense=Past|VerbForm=Part , Definite=Ind|Gender=Masc|Number=Sing|POS=DET|PronType=Art , Gender=Masc|POS=ADJ , Mood=Ind|Number=Plur|POS=VERB|Person=3|Tense=Fut|VerbForm=Fin , Number=Plur|POS=DET|Poss=Yes , POS=AUX|VerbForm=Inf , Gender=Masc|Number=Sing|POS=VERB|Tense=Past|VerbForm=Part|Voice=Pass , Gender=Masc|POS=VERB|Tense=Past|VerbForm=Part , POS=ADV|Polarity=Neg , Definite=Ind|Number=Sing|POS=DET|PronType=Art , Gender=Fem|Number=Sing|POS=PRON|Person=3 , POS=PRON|Person=3|Reflex=Yes , Gender=Masc|POS=NOUN , POS=AUX|Tense=Past|VerbForm=Part , POS=PRON|Person=3 , Number=Plur|POS=NOUN , NumType=Ord|Number=Sing|POS=ADJ , POS=VERB|Tense=Past|VerbForm=Part , POS=AUX|Tense=Pres|VerbForm=Part , Gender=Masc|Number=Plur|POS=VERB|Tense=Past|VerbForm=Part , Number=Sing|POS=PRON|Person=3 , Number=Sing|POS=NOUN , Gender=Masc|Number=Plur|POS=PRON|Person=3 , Mood=Ind|Number=Plur|POS=VERB|Person=3|Tense=Imp|VerbForm=Fin , Gender=Fem|NumType=Ord|Number=Sing|POS=ADJ , Number=Plur|POS=PROPN , Number=Sing|POS=PROPN , Mood=Ind|Number=Sing|POS=AUX|Person=3|Tense=Imp|VerbForm=Fin , Mood=Ind|Number=Plur|POS=VERB|Person=3|Tense=Pres|VerbForm=Fin , Gender=Masc|Number=Plur|POS=PRON|PronType=Dem , Gender=Masc|Number=Sing|POS=DET , Gender=Fem|Number=Sing|POS=DET|Poss=Yes , Gender=Masc|POS=PRON , POS=NOUN , Mood=Ind|Number=Sing|POS=VERB|Person=3|Tense=Fut|VerbForm=Fin , Mood=Ind|Number=Sing|POS=AUX|Person=3|Tense=Fut|VerbForm=Fin , Mood=Ind|Number=Plur|POS=VERB|Person=1|Tense=Pres|VerbForm=Fin , Number=Plur|POS=PRON , Gender=Masc|NumType=Ord|Number=Plur|POS=ADJ , Mood=Ind|Number=Plur|POS=AUX|Person=3|Tense=Fut|VerbForm=Fin , Gender=Fem|Number=Plur|POS=VERB|Tense=Past|VerbForm=Part|Voice=Pass , Number=Sing|POS=PRON , Number=Sing|POS=PRON|PronType=Dem , Mood=Ind|POS=VERB|VerbForm=Fin , Number=Plur|POS=DET|PronType=Dem , Gender=Masc|Number=Sing|POS=PRON|Person=3|PronType=Prs , Gender=Masc|Number=Plur|POS=PRON|Person=3|PronType=Prs , Gender=Masc|Number=Sing|POS=PRON 
, Gender=Masc|Number=Sing|POS=PRON|Person=3|PronType=Dem , Number=Sing|POS=PRON|Person=2|PronType=Prs , Gender=Masc|Number=Sing|POS=PRON|PronType=Rel , Mood=Ind|Number=Plur|POS=AUX|Person=3|Tense=Imp|VerbForm=Fin , Mood=Sub|Number=Sing|POS=AUX|Person=3|Tense=Pres|VerbForm=Fin , Gender=Masc|NumType=Ord|Number=Sing|POS=ADJ , POS=PRON , POS=NUM , Gender=Fem|POS=NOUN , POS=SPACE , Gender=Fem|Number=Plur|POS=PRON , Number=Plur|POS=PRON|Person=3 , Number=Sing|POS=VERB|Tense=Past|VerbForm=Part , Number=Sing|POS=PRON|Person=1 , Mood=Ind|Number=Sing|POS=VERB|Person=1|Tense=Pres|VerbForm=Fin , Mood=Sub|Number=Sing|POS=VERB|Person=3|Tense=Past|VerbForm=Fin , Gender=Fem|Number=Sing|POS=PRON , Gender=Fem|Number=Sing|POS=PRON|Person=3|PronType=Prs , Mood=Sub|Number=Sing|POS=VERB|Person=3|Tense=Pres|VerbForm=Fin , POS=INTJ , Number=Plur|POS=PRON|Person=2 , NumType=Card|POS=PRON , Definite=Ind|Gender=Fem|Number=Plur|POS=DET|PronType=Art , Gender=Fem|Number=Plur|POS=VERB|Tense=Past|VerbForm=Part , NumType=Card|POS=NOUN , POS=PRON|PronType=Int , Gender=Fem|Number=Plur|POS=PRON|Person=3 , Gender=Fem|Number=Sing|POS=DET , Mood=Cnd|Number=Sing|POS=AUX|Person=3|Tense=Pres|VerbForm=Fin , Gender=Fem|Number=Plur|POS=DET , Mood=Sub|Number=Plur|POS=VERB|Person=3|Tense=Pres|VerbForm=Fin , Definite=Ind|Gender=Masc|Number=Plur|POS=DET|PronType=Art , Mood=Cnd|Number=Sing|POS=VERB|Person=3|Tense=Pres|VerbForm=Fin , Gender=Masc|Number=Sing|POS=PRON|PronType=Dem , Gender=Masc|Number=Plur|POS=PROPN , Mood=Cnd|Number=Plur|POS=VERB|Person=3|Tense=Pres|VerbForm=Fin , Gender=Fem|Number=Sing|POS=PRON|PronType=Dem , Number=Sing|POS=DET , Gender=Masc|NumType=Card|Number=Plur|POS=NOUN , Gender=Fem|Number=Plur|POS=PRON|PronType=Dem , Mood=Ind|POS=VERB|Person=3|Tense=Pres|VerbForm=Fin , Gender=Fem|POS=PRON , Gender=Masc|POS=VERB|Tense=Past|VerbForm=Part|Voice=Pass , Gender=Fem|Number=Sing|POS=PRON|PronType=Rel , Mood=Ind|Number=Sing|POS=AUX|Person=1|Tense=Imp|VerbForm=Fin , 
Mood=Cnd|Number=Plur|POS=VERB|Person=1|Tense=Pres|VerbForm=Fin , Mood=Ind|Number=Sing|POS=AUX|Person=1|Tense=Pres|VerbForm=Fin , Gender=Masc|Number=Sing|POS=AUX|Tense=Past|VerbForm=Part , POS=X , POS=SYM , Mood=Imp|Number=Plur|POS=VERB|Person=2|Tense=Pres|VerbForm=Fin , Mood=Ind|Number=Plur|POS=VERB|Person=2|Tense=Pres|VerbForm=Fin , Gender=Masc|Number=Sing|POS=DET|PronType=Int , Gender=Fem|Number=Plur|POS=DET|PronType=Int , POS=DET , Gender=Masc|Number=Plur|POS=PRON , Mood=Sub|Number=Plur|POS=AUX|Person=3|Tense=Pres|VerbForm=Fin , Mood=Ind|POS=VERB|Person=3|VerbForm=Fin , Number=Sing|POS=VERB|Tense=Past|VerbForm=Part|Voice=Pass , Mood=Cnd|Number=Plur|POS=VERB|Person=2|Tense=Pres|VerbForm=Fin , Mood=Ind|Number=Plur|POS=AUX|Person=2|Tense=Pres|VerbForm=Fin , Gender=Fem|Number=Sing|POS=DET|PronType=Int , Gender=Masc|Number=Plur|POS=DET , Gender=Fem|Number=Plur|POS=PRON|PronType=Rel , Number=Plur|POS=VERB|Tense=Past|VerbForm=Part|Voice=Pass , Gender=Masc|Number=Plur|POS=PRON|PronType=Rel , POS=VERB|Tense=Past|VerbForm=Part|Voice=Pass , Gender=Fem|NumType=Ord|Number=Plur|POS=ADJ , Mood=Ind|Number=Plur|POS=VERB|Person=2|Tense=Fut|VerbForm=Fin , Mood=Imp|POS=VERB|Tense=Pres|VerbForm=Fin , Number=Plur|POS=PRON|Person=2|Reflex=Yes , Mood=Cnd|Number=Sing|POS=VERB|Person=1|Tense=Pres|VerbForm=Fin , Number=Plur|POS=PRON|Person=1|Reflex=Yes , Gender=Masc|NumType=Card|Number=Sing|POS=NOUN , Mood=Ind|Number=Plur|POS=AUX|Person=1|Tense=Pres|VerbForm=Fin , Mood=Ind|Number=Plur|POS=AUX|Person=1|Tense=Fut|VerbForm=Fin , Mood=Ind|Number=Plur|POS=VERB|Person=1|Tense=Fut|VerbForm=Fin , Number=Sing|POS=PRON|Person=1|Reflex=Yes , Mood=Ind|Number=Plur|POS=VERB|Person=1|Tense=Imp|VerbForm=Fin , Mood=Ind|Number=Plur|POS=AUX|Person=1|Tense=Imp|VerbForm=Fin , Mood=Ind|Number=Sing|POS=VERB|Person=1|Tense=Imp|VerbForm=Fin , Mood=Sub|Number=Sing|POS=VERB|Person=1|Tense=Pres|VerbForm=Fin , Gender=Masc|POS=PROPN , Mood=Cnd|Number=Plur|POS=AUX|Person=3|Tense=Pres|VerbForm=Fin , 
Number=Plur|POS=PRON|Person=1|PronType=Prs , Mood=Sub|Number=Sing|POS=AUX|Person=1|Tense=Pres|VerbForm=Fin , Number=Plur|POS=PRON|Person=2|PronType=Prs , Mood=Ind|Number=Sing|POS=VERB|Person=1|Tense=Fut|VerbForm=Fin , Gender=Fem|Number=Plur|POS=PRON|Person=3|PronType=Prs , Number=Sing|POS=PRON|Person=1|PronType=Prs , Mood=Cnd|Number=Sing|POS=AUX|Person=1|Tense=Pres|VerbForm=Fin , Mood=Sub|Number=Plur|POS=AUX|Person=1|Tense=Pres|VerbForm=Fin , Mood=Imp|Number=Plur|POS=VERB|Person=1|Tense=Pres|VerbForm=Fin , Mood=Sub|Number=Plur|POS=AUX|Person=2|Tense=Pres|VerbForm=Fin , Mood=Ind|Number=Plur|POS=VERB|Person=2|Tense=Imp|VerbForm=Fin , Mood=Ind|Number=Sing|POS=AUX|Person=2|Tense=Imp|VerbForm=Fin , Number=Plur|POS=VERB|Tense=Past|VerbForm=Part , Gender=Fem|Number=Plur|POS=PROPN , Gender=Masc|NumType=Card|POS=NUM |
parser |
ROOT , acl , acl:relcl , advcl , advmod , amod , appos , aux:pass , aux:tense , case , cc , ccomp , conj , cop , dep , det , expl:comp , expl:pass , expl:subj , fixed , flat:foreign , flat:name , iobj , mark , nmod , nsubj , nsubj:pass , nummod , obj , obl:agent , obl:arg , obl:mod , parataxis , punct , vocative , xcomp |
ner |
LOC , MISC , ORG , PER |
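Each morphologizer label in the table packs a part-of-speech tag together with morphological features as `Key=Value` pairs joined by `|`. As an illustration, the following sketch splits such a label into its `POS` value and the remaining features; `parse_morph_label` is a hypothetical helper name, and the sample labels are copied from the table above.

```python
from typing import Dict, Tuple


def parse_morph_label(label: str) -> Tuple[str, Dict[str, str]]:
    """Split a label such as 'Gender=Fem|Number=Sing|POS=NOUN' into
    its POS tag and a dict of the remaining morphological features."""
    features = dict(part.split("=", 1) for part in label.strip().split("|"))
    pos = features.pop("POS")  # every label in the table carries a POS entry
    return pos, features


sample_labels = [
    "POS=PROPN",
    "Gender=Fem|Number=Sing|POS=NOUN",
    "Mood=Ind|Number=Sing|POS=VERB|Person=3|Tense=Pres|VerbForm=Fin",
]
for label in sample_labels:
    pos, features = parse_morph_label(label)
    print(pos, features)
```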
Accuracy
Type | Score |
---|---|
TOKEN_ACC | 99.80 |
TOKEN_P | 98.44 |
TOKEN_R | 98.96 |
TOKEN_F | 98.70 |
POS_ACC | 96.18 |
MORPH_ACC | 95.30 |
MORPH_MICRO_P | 97.96 |
MORPH_MICRO_R | 96.64 |
Model Index
The model fr_core_news_sm has the following performance results on different tasks:
Task | Metric | Value |
---|---|---|
NER | NER Precision | 0.8148438757 |
NER | NER Recall | 0.8106360834 |
NER | NER F Score | 0.8127345333 |
TAG | TAG (XPOS) Accuracy | 0.933216531 |
POS | POS (UPOS) Accuracy | 0.9617644028 |
MORPH | Morph (UFeats) Accuracy | 0.9529502705 |
LEMMA | Lemma Accuracy | 0.9084463625 |
UNLABELED_DEPENDENCIES | Unlabeled Attachment Score (UAS) | 0.8781984485 |
LABELED_DEPENDENCIES | Labeled Attachment Score (LAS) | 0.8347514036 |
SENTS | Sentences F-Score | 0.861278649 |
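The F scores reported on this page are consistent with the usual definition of F1 as the harmonic mean of precision and recall. As a quick check, the sketch below recomputes the NER and token F scores from the precision and recall values listed above; `f_score` is a hypothetical helper name.

```python
def f_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (the F1 score)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall values taken from the tables on this page.
ner_f = f_score(0.8148438757, 0.8106360834)
token_f = f_score(98.44, 98.96)

print(f"NER F1:   {ner_f:.4f}")    # prints 0.8127
print(f"Token F1: {token_f:.2f}")  # prints 98.70
```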
📄 License
The license of this model is LGPL-LR.