🚀 ONNX Conversion of bert-base-NER
This project focuses on the ONNX conversion of the bert-base-NER model. It provides a ready-to-use solution for Named Entity Recognition (NER) with state-of-the-art performance.
🚀 Quick Start
The core of this project is the ONNX conversion of the bert-base-NER model, which makes it usable in a wider range of deployment scenarios.
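The repository does not spell out the exact export procedure, so the snippet below is only a minimal sketch of one common way to perform the conversion, using Hugging Face Optimum with the ONNX Runtime backend; the output directory name is illustrative.

```python
# Minimal sketch (assumes `optimum[onnxruntime]` and `transformers` are installed):
# export dslim/bert-base-NER to ONNX and save it locally.
from optimum.onnxruntime import ORTModelForTokenClassification
from transformers import AutoTokenizer

model_id = "dslim/bert-base-NER"

tokenizer = AutoTokenizer.from_pretrained(model_id)
# export=True converts the PyTorch checkpoint to ONNX on the fly.
onnx_model = ORTModelForTokenClassification.from_pretrained(model_id, export=True)

# Save model.onnx together with the tokenizer files (directory name is illustrative).
onnx_model.save_pretrained("bert-base-NER-onnx")
tokenizer.save_pretrained("bert-base-NER-onnx")
```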
✨ Features
Model description
bert-base-NER is a fine-tuned BERT model designed for Named Entity Recognition and achieves state-of-the-art performance on the NER task. It has been trained to recognize four entity types: location (LOC), organization (ORG), person (PER), and miscellaneous (MISC).
Specifically, it is a bert-base-cased model fine-tuned on the English version of the standard [CoNLL-2003 Named Entity Recognition](https://www.aclweb.org/anthology/W03-0419.pdf) dataset.
If you prefer a larger model, the [bert-large-NER](https://huggingface.co/dslim/bert-large-NER/) version, fine-tuned on the same dataset, is also available.
Intended uses & limitations
How to use
You can use this model with the Transformers `pipeline` for NER:
```python
from transformers import AutoTokenizer, AutoModelForTokenClassification
from transformers import pipeline

# Load the tokenizer and the fine-tuned token-classification model.
tokenizer = AutoTokenizer.from_pretrained("dslim/bert-base-NER")
model = AutoModelForTokenClassification.from_pretrained("dslim/bert-base-NER")

# Build an NER pipeline and run it on a sample sentence.
nlp = pipeline("ner", model=model, tokenizer=tokenizer)
example = "My name is Wolfgang and I live in Berlin"

ner_results = nlp(example)
print(ner_results)
```
Limitations and bias
This model is restricted by its training dataset of entity-annotated news articles from a specific time span. It may not generalize well to all use cases in different domains. Additionally, the model sometimes tags subword tokens as entities, and post-processing of the results may be required.
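For the subword issue, the standard `aggregation_strategy` argument of the Transformers token-classification pipeline can merge subword pieces into whole entity spans; the example sentence below is purely illustrative.

```python
from transformers import AutoTokenizer, AutoModelForTokenClassification, pipeline

tokenizer = AutoTokenizer.from_pretrained("dslim/bert-base-NER")
model = AutoModelForTokenClassification.from_pretrained("dslim/bert-base-NER")

# aggregation_strategy="simple" groups subword tokens into whole entity spans.
nlp = pipeline("ner", model=model, tokenizer=tokenizer, aggregation_strategy="simple")
print(nlp("Angela Merkel visited the headquarters of Siemens in Munich."))
```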
📦 Installation
No specific installation steps are provided in the original document; the usage examples below assume the `transformers` library (with a PyTorch backend) is installed.
💻 Usage Examples
Basic Usage
The basic usage of the model for NER is as follows:
```python
from transformers import AutoTokenizer, AutoModelForTokenClassification
from transformers import pipeline

tokenizer = AutoTokenizer.from_pretrained("dslim/bert-base-NER")
model = AutoModelForTokenClassification.from_pretrained("dslim/bert-base-NER")

nlp = pipeline("ner", model=model, tokenizer=tokenizer)
example = "My name is Wolfgang and I live in Berlin"

ner_results = nlp(example)
print(ner_results)
```
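To run the same example against the converted ONNX model, one option is to load it with Optimum's ONNX Runtime classes and pass it to the pipeline. This is a sketch that assumes the model was exported to a local `bert-base-NER-onnx` directory as in the Quick Start section.

```python
from optimum.onnxruntime import ORTModelForTokenClassification
from transformers import AutoTokenizer, pipeline

# "bert-base-NER-onnx" is the (illustrative) directory produced by the export sketch above.
tokenizer = AutoTokenizer.from_pretrained("bert-base-NER-onnx")
onnx_model = ORTModelForTokenClassification.from_pretrained("bert-base-NER-onnx")

nlp = pipeline("ner", model=onnx_model, tokenizer=tokenizer)
print(nlp("My name is Wolfgang and I live in Berlin"))
```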
📚 Documentation
Training data
This model was fine-tuned on the English version of the standard [CoNLL-2003 Named Entity Recognition](https://www.aclweb.org/anthology/W03-0419.pdf) dataset.
The training dataset differentiates between the beginning and continuation of an entity. Each token in the dataset is classified into one of the following classes:
| Abbreviation | Description |
|--------------|-------------|
| O | Outside of a named entity |
| B-MIS | Beginning of a miscellaneous entity right after another miscellaneous entity |
| I-MIS | Miscellaneous entity |
| B-PER | Beginning of a person's name right after another person's name |
| I-PER | Person's name |
| B-ORG | Beginning of an organization right after another organization |
| I-ORG | Organization |
| B-LOC | Beginning of a location right after another location |
| I-LOC | Location |
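The same tag set is stored in the model configuration, so the exact label-to-ID mapping can be inspected programmatically; the printed mapping is whatever ships with the checkpoint.

```python
from transformers import AutoConfig

# Prints the ID -> tag mapping stored in the checkpoint's config.
config = AutoConfig.from_pretrained("dslim/bert-base-NER")
print(config.id2label)
```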
CoNLL-2003 English Dataset Statistics
This dataset is derived from the Reuters corpus, which consists of Reuters news stories. You can find more information about its creation in the CoNLL-2003 paper.
Number of training examples per entity type
| Dataset | LOC | MISC | ORG | PER |
|---------|-----|------|-----|-----|
| Train | 7140 | 3438 | 6321 | 6600 |
| Dev | 1837 | 922 | 1341 | 1842 |
| Test | 1668 | 702 | 1661 | 1617 |
Number of articles/sentences/tokens per dataset
| Dataset | Articles | Sentences | Tokens |
|---------|----------|-----------|--------|
| Train | 946 | 14,987 | 203,621 |
| Dev | 216 | 3,466 | 51,362 |
| Test | 231 | 3,684 | 46,435 |
Training procedure
This model was trained on a single NVIDIA V100 GPU using the hyperparameters recommended in the original BERT paper, which trained and evaluated BERT on the CoNLL-2003 NER task.
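The exact hyperparameter values used for this checkpoint are not listed here; as a rough illustration, the fine-tuning ranges recommended in the BERT paper (batch size 16 or 32, learning rate 2e-5 to 5e-5, 2 to 4 epochs) would translate into a Transformers `TrainingArguments` along these lines.

```python
from transformers import TrainingArguments

# Illustrative only: values are drawn from the ranges recommended in the BERT paper,
# not from the actual training run of this checkpoint.
training_args = TrainingArguments(
    output_dir="bert-base-NER-finetuned",  # hypothetical output directory
    learning_rate=3e-5,                    # paper suggests 5e-5, 3e-5, or 2e-5
    per_device_train_batch_size=32,        # paper suggests 16 or 32
    num_train_epochs=3,                    # paper suggests 2-4 epochs
)
```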
Eval results
| metric | dev | test |
|--------|-----|------|
| f1 | 95.1 | 91.3 |
| precision | 95.0 | 90.7 |
| recall | 95.3 | 91.9 |
The test metrics are slightly lower than the official Google BERT results, which encoded document context and experimented with CRF. More information on replicating the original results can be found [here](https://github.com/google-research/bert/issues/223).
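Entity-level precision, recall, and F1 of the kind reported above are conventionally computed with the `seqeval` package on IOB2-tagged sequences; the toy example below only shows the mechanics, not the actual evaluation data.

```python
from seqeval.metrics import classification_report

# Toy gold/predicted tag sequences in the IOB2 scheme used by CoNLL-2003.
y_true = [["B-PER", "I-PER", "O", "O", "B-LOC"]]
y_pred = [["B-PER", "I-PER", "O", "O", "B-LOC"]]
print(classification_report(y_true, y_pred, digits=4))
```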
BibTeX entry and citation info
```bibtex
@article{DBLP:journals/corr/abs-1810-04805,
  author        = {Jacob Devlin and
                   Ming{-}Wei Chang and
                   Kenton Lee and
                   Kristina Toutanova},
  title         = {{BERT:} Pre-training of Deep Bidirectional Transformers for Language Understanding},
  journal       = {CoRR},
  volume        = {abs/1810.04805},
  year          = {2018},
  url           = {http://arxiv.org/abs/1810.04805},
  archivePrefix = {arXiv},
  eprint        = {1810.04805},
  timestamp     = {Tue, 30 Oct 2018 20:39:56 +0100},
  biburl        = {https://dblp.org/rec/journals/corr/abs-1810-04805.bib},
  bibsource     = {dblp computer science bibliography, https://dblp.org}
}

@inproceedings{tjong-kim-sang-de-meulder-2003-introduction,
  title     = "Introduction to the {C}o{NLL}-2003 Shared Task: Language-Independent Named Entity Recognition",
  author    = "Tjong Kim Sang, Erik F. and
               De Meulder, Fien",
  booktitle = "Proceedings of the Seventh Conference on Natural Language Learning at {HLT}-{NAACL} 2003",
  year      = "2003",
  url       = "https://www.aclweb.org/anthology/W03-0419",
  pages     = "142--147",
}
```
📄 License
This project is licensed under the MIT license.