Gliner-model-merge-large-v1.0 Open-Source Named Entity Recognition Model - Fusion Optimization for More Precise Recognition

Gliner Model Merge Large V1.0

Developed by xomad

Named entity recognition model optimized with model merging; average zero-shot F1 score improved by 3.25 points, from 0.6276 to 0.6601

Sequence Labeling

PyTorch

English

Open Source License: Apache-2.0

#Multi-task NER #Model Fusion Optimization #Zero-shot Learning

Downloads 129

Release Time: 9/24/2024

Model Overview

This model is a named entity recognition model based on the GLiNER architecture. Model merging raises its average F1 score on zero-shot benchmarks from 0.6276 to 0.6601. It supports zero-shot NER tasks and can identify multiple entity types in text.

Model Features

Model Fusion Technology

Applies model merging methods (Model soups and WiSE-FT), improving the average F1 score by 3.25 points

Business-friendly License

Trained exclusively on datasets with business-friendly licenses, ensuring broad applicability

Multi-dataset Training

Incorporates knowledge from 5 high-quality datasets to enhance model generalization

Zero-shot Capability

Supports zero-shot named entity recognition without requiring domain-specific training data

Model Capabilities

Named Entity Recognition

Zero-shot Learning

Multi-category Entity Detection

Text Analysis

Use Cases

News Analysis

News Figure and Organization Identification

Automatically identifies figures, organizations, locations, and other entities from news texts

Achieves 78.51% F1 on the CrossNER politics benchmark

Business Intelligence

Enterprise Information Extraction

Extracts company, founder, product, and other information from business documents

The usage example identifies Microsoft as a company and Bill Gates and Paul Allen as its founders

Academic Research

Scientific Literature Analysis

Identifies specialized terms and concepts in research papers

Achieves 72.41% F1 on the CrossNER science benchmark

🚀 xomad/gliner-model-merge-large-v1.0

The xomad/gliner-model-merge-large-v1.0 model is developed from the pretrained model knowledgator/gliner-multitask-large-v0.5. It explores the capabilities of model merging techniques, resulting in a significant performance boost of 3.25 points, elevating the model's capability from 0.6276 to 0.6601 F1-score.

🚀 Quick Start

The xomad/gliner-model-merge-large-v1.0 model is designed for token-classification tasks, specifically NER. It is trained on datasets with commercial-friendly licenses, ensuring broad applicability under the Apache-2.0 license.

✨ Features

Performance Boost: Achieved a 3.25-point increase in F1-score through model merging techniques.
Commercial-Friendly: Trained on datasets with commercial-friendly licenses, suitable for a wide range of applications.
Multi-Dataset Training: Utilized multiple datasets during the training process, enhancing the model's generalization ability.

📦 Installation

To use this model, you must install the GLiNER Python library:

pip install gliner

Once you've installed the GLiNER library, you can import the GLiNER class and load this model using GLiNER.from_pretrained.

💻 Usage Examples

Basic Usage

from gliner import GLiNER

model = GLiNER.from_pretrained("xomad/gliner-model-merge-large-v1.0")

text = """
Microsoft was founded by Bill Gates and Paul Allen on April 4, 1975 to develop and sell BASIC interpreters for the Altair 8800. During his career at Microsoft, Gates held the positions of chairman, chief executive officer, president and chief software architect, while also being the largest individual shareholder until May 2014.
"""

labels = ["founder", "computer", "software", "position", "date", "company"]

entities = model.predict_entities(text, labels)

for entity in entities:
    print(entity["text"], "=>", entity["label"])

Output:

Microsoft => company
Bill Gates => founder
Paul Allen => founder
April 4, 1975 => date
BASIC => software
Altair 8800 => computer
Microsoft => company
chairman => position
chief executive officer => position
president => position
chief software architect => position
May 2014 => date
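The flat list printed above is often easier to work with once it is grouped by label. The following is a minimal sketch, not part of the GLiNER library: the helper name group_by_label and the min_score parameter are hypothetical, and it assumes each entity dictionary carries "text" and "label" keys (as in the example above) and optionally a "score" key. The scores in the sample data are made up for illustration so that the snippet runs without downloading the model.

```python
from collections import defaultdict


def group_by_label(entities, min_score=0.0):
    """Group entity dicts by label, keeping each surface form once per label."""
    grouped = defaultdict(list)
    for entity in entities:
        # "score" is assumed to be present; entities without it are always kept.
        if entity.get("score", 1.0) < min_score:
            continue
        texts = grouped[entity["label"]]
        if entity["text"] not in texts:
            texts.append(entity["text"])
    return dict(grouped)


# Hand-written stand-in for model output; the scores are illustrative only.
sample_entities = [
    {"text": "Microsoft", "label": "company", "score": 0.99},
    {"text": "Bill Gates", "label": "founder", "score": 0.98},
    {"text": "Paul Allen", "label": "founder", "score": 0.97},
    {"text": "Microsoft", "label": "company", "score": 0.95},
    {"text": "chairman", "label": "position", "score": 0.40},
]

print(group_by_label(sample_entities, min_score=0.5))
```

With real predictions, pass the list returned by model.predict_entities(text, labels) in place of sample_entities.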

📚 Documentation

⚙️ Finetuning process

The process begins with the base model knowledgator/gliner-multitask-large-v0.5. The base model is fine-tuned separately on each of the following datasets:

knowledgator/GLINER-multi-task-synthetic-data
EmergentMethods/AskNews-NER-v0
urchade/pile-mistral-v0.1
MultiCoNER/multiconer_v2
DFKI-SLT/few-nerd

We save multiple checkpoints during fine-tuning. We pool all of these checkpoints and then apply the Model soups technique to produce different merged models:

uniform_merged
greedy_on_random
greedy_on_sorted
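The three merged models listed above can be illustrated with a toy sketch of the Model soups idea. This is not the authors' code. It assumes checkpoints are plain dictionaries mapping parameter names to lists of floats (real checkpoints would be PyTorch state dicts), that evaluate is a user-supplied function returning a held-out score where higher is better, and that greedy_on_random and greedy_on_sorted differ only in the order in which checkpoints are visited. All function names are hypothetical.

```python
import random


def average_weights(checkpoints):
    """Uniform soup: element-wise mean of the parameters of all checkpoints."""
    n = len(checkpoints)
    return {
        name: [sum(ckpt[name][i] for ckpt in checkpoints) / n
               for i in range(len(checkpoints[0][name]))]
        for name in checkpoints[0]
    }


def greedy_soup(checkpoints, evaluate, order="sorted", seed=0):
    """Greedy soup: keep a checkpoint only if the averaged model does not get worse."""
    candidates = list(checkpoints)
    if order == "sorted":
        # Visit checkpoints from best to worst individual score.
        candidates.sort(key=evaluate, reverse=True)
    else:
        # Visit checkpoints in a random (seeded) order.
        random.Random(seed).shuffle(candidates)
    soup = [candidates[0]]
    best = evaluate(average_weights(soup))
    for ckpt in candidates[1:]:
        score = evaluate(average_weights(soup + [ckpt]))
        if score >= best:
            soup.append(ckpt)
            best = score
    return average_weights(soup)


# Toy setup: the "score" is the negative squared distance to a target vector.
target = [0.5, 0.5]


def toy_evaluate(weights):
    return -sum((w - t) ** 2 for w, t in zip(weights["w"], target))


pool = [{"w": [1.0, 0.0]}, {"w": [0.0, 1.0]}, {"w": [4.0, 4.0]}]

uniform_merged = average_weights(pool)
greedy_on_sorted = greedy_soup(pool, toy_evaluate, order="sorted")
greedy_on_random = greedy_soup(pool, toy_evaluate, order="random", seed=0)
print(uniform_merged, greedy_on_sorted, greedy_on_random)
```

In this toy pool the uniform soup is pulled away from the target by the outlier checkpoint, while the sorted greedy soup rejects it, which is the motivation for trying both variants.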

Following this, we apply the WiSE-FT merging technique to pairs of models selected from a group of the above 3 models and the original model to produce the wise_ft_merged model. This concludes the 1st finetuning phase.
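WiSE-FT merges two models by linearly interpolating their weights. The sketch below shows that interpolation on toy parameter dictionaries (lists of floats rather than PyTorch tensors). The function name wise_ft and the mixing coefficient alpha are assumptions for illustration; the source does not state which coefficient or which pairs of models were used.

```python
def wise_ft(theta_a, theta_b, alpha=0.5):
    """Interpolate two weight dicts: (1 - alpha) * theta_a + alpha * theta_b."""
    return {
        name: [(1 - alpha) * a + alpha * b
               for a, b in zip(theta_a[name], theta_b[name])]
        for name in theta_a
    }


# Toy weights standing in for the original model and one merged model.
original = {"w": [0.0, 2.0]}
merged = {"w": [1.0, 4.0]}

wise_ft_merged = wise_ft(original, merged, alpha=0.25)
print(wise_ft_merged)
```

Setting alpha to 0 returns the first model unchanged and setting it to 1 returns the second, so in practice the coefficient would be chosen by evaluating several values on held-out data.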

The process is then repeated in the 2nd finetuning phase, using the wise_ft_merged as the new starting point, to produce the final model. The whole finetuning flow is illustrated in the following figure:

[Figure: Finetuning flow]

The performance of the pool of fine-tuned models and of the merged models is evaluated on the CrossNER and TwitterNER benchmarks and plotted in the following 2 figures (as crossner_f1 and other_f1 respectively).

The 1st finetuning phase plot:

The 2nd finetuning phase plot:

📊 Benchmarks

Model Performance

Performance on different zero-shot NER benchmarks (CrossNER, mit-movie and mit-restaurant), numbers reported from https://huggingface.co/knowledgator/gliner-multitask-large-v0.5:

Property	Details
Model Type	Token-Classification (NER)
Training Data	knowledgator/GLINER-multi-task-synthetic-data, EmergentMethods/AskNews-NER-v0, urchade/pile-mistral-v0.1, MultiCoNER/multiconer_v2, DFKI-SLT/few-nerd

Model	F1 Score
xomad/gliner-model-merge-large-v1.0	0.6601
knowledgator/gliner-multitask-large-v0.5	0.6276
numind/NuNER_Zero-span	0.6196
gliner-community/gliner_large-v2.5	0.6154
EmergentMethods/gliner_large_news-v2.1	0.5876
urchade/gliner_large-v2.1	0.5754

Detailed performance on different datasets:

Model	Dataset	Precision	Recall	F1 Score	F1 Score (Decimal)
xomad/gliner-model-merge-large-v1.0	CrossNER_AI	62.66%	57.48%	59.96%	0.5996
	CrossNER_literature	73.28%	66.42%	69.68%	0.6968
	CrossNER_music	74.89%	70.67%	72.72%	0.7272
	CrossNER_politics	79.46%	77.57%	78.51%	0.7851
	CrossNER_science	74.72%	70.24%	72.41%	0.7241
	mit-movie	67.33%	57.89%	62.25%	0.6225
	mit-restaurant	54.94%	40.41%	46.57%	0.4657
	Average				0.6601
numind/NuNER_Zero-span	CrossNER_AI	63.82%	56.82%	60.12%	0.6012
	CrossNER_literature	73.53%	58.06%	64.89%	0.6489
	CrossNER_music	72.69%	67.40%	69.95%	0.6995
	CrossNER_politics	77.28%	68.69%	72.73%	0.7273
	CrossNER_science	70.08%	63.12%	66.42%	0.6642
	mit-movie	63.00%	48.88%	55.05%	0.5505
	mit-restaurant	54.81%	37.62%	44.62%	0.4462
	Average				0.6196
knowledgator/gliner-multitask-large-v0.5	CrossNER_AI	51.00%	51.11%	51.05%	0.5105
	CrossNER_literature	72.65%	65.62%	68.96%	0.6896
	CrossNER_music	74.91%	73.70%	74.30%	0.7430
	CrossNER_politics	78.84%	77.71%	78.27%	0.7827
	CrossNER_science	69.20%	65.48%	67.29%	0.6729
	mit-movie	61.29%	52.59%	56.60%	0.5660
	mit-restaurant	50.65%	38.13%	43.51%	0.4351
	Average				0.6276
gliner-community/gliner_large-v2.5	CrossNER_AI	50.85%	63.03%	56.29%	0.5629
	CrossNER_literature	64.92%	67.21%	66.04%	0.6604
	CrossNER_music	70.88%	73.10%	71.97%	0.7197
	CrossNER_politics	72.67%	72.93%	72.80%	0.7280
	CrossNER_science	61.71%	68.85%	65.08%	0.6508
	mit-movie	54.63%	52.83%	53.71%	0.5371
	mit-restaurant	47.99%	42.13%	44.87%	0.4487
	Average				0.6154
urchade/gliner_large-v2.1	CrossNER_AI	54.98%	52.00%	53.45%	0.5345
	CrossNER_literature	59.33%	56.47%	57.87%	0.5787
	CrossNER_music	67.39%	66.77%	67.08%	0.6708
	CrossNER_politics	66.07%	63.76%	64.90%	0.6490
	CrossNER_science	61.45%	62.56%	62.00%	0.6200
	mit-movie	55.94%	47.36%	51.29%	0.5129
	mit-restaurant	53.34%	40.83%	46.25%	0.4625
	Average				0.5754
EmergentMethods/gliner_large_news-v2.1	CrossNER_AI	59.60%	54.55%	56.96%	0.5696
	CrossNER_literature	65.41%	56.16%	60.44%	0.6044
	CrossNER_music	67.47%	63.08%	65.20%	0.6520
	CrossNER_politics	66.05%	60.07%	62.92%	0.6292
	CrossNER_science	68.44%	63.57%	65.92%	0.6592
	mit-movie	65.85%	49.59%	56.57%	0.5657
	mit-restaurant	54.71%	35.94%	43.38%	0.4338
	Average				0.5876
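The table's columns are related by simple arithmetic: each F1 score is the harmonic mean of precision and recall, and the Average row is the unweighted mean of the seven per-dataset F1 scores. The short check below reproduces both for xomad/gliner-model-merge-large-v1.0, using only numbers copied from the table; the function and variable names are illustrative.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (both on the same scale)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Per-dataset F1 scores of xomad/gliner-model-merge-large-v1.0 from the table.
merged_f1 = {
    "CrossNER_AI": 0.5996,
    "CrossNER_literature": 0.6968,
    "CrossNER_music": 0.7272,
    "CrossNER_politics": 0.7851,
    "CrossNER_science": 0.7241,
    "mit-movie": 0.6225,
    "mit-restaurant": 0.4657,
}

# CrossNER_AI row: precision 62.66%, recall 57.48%.
crossner_ai_f1 = round(f1_score(62.66, 57.48), 2)

# Unweighted mean over the seven benchmarks.
average_f1 = round(sum(merged_f1.values()) / len(merged_f1), 4)

print(crossner_ai_f1)  # 59.96
print(average_f1)      # 0.6601
```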

🔧 Technical Details

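The performance gain comes from two model merging techniques. Model soups averages the weights of multiple fine-tuned checkpoints, and WiSE-FT linearly interpolates between the weights of two models. Neither technique increases inference cost, because the result is a single model of the same size. Applied to the pool of checkpoints over two finetuning phases, they raise the average F1 score from 0.6276 to 0.6601.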

📄 License

This project is licensed under the Apache-2.0 license.

Authors

Hoan Nguyen, at xomad.com

Citations

@misc{wortsman2022modelsoupsaveragingweights,
      title={Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time}, 
      author={Mitchell Wortsman and Gabriel Ilharco and Samir Yitzhak Gadre and Rebecca Roelofs and Raphael Gontijo-Lopes and Ari S. Morcos and Hongseok Namkoong and Ali Farhadi and Yair Carmon and Simon Kornblith and Ludwig Schmidt},
      year={2022},
      eprint={2203.05482},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2203.05482}, 
}

@InProceedings{Wortsman_2022_CVPR,
    author    = {Wortsman, Mitchell and Ilharco, Gabriel and Kim, Jong Wook and Li, Mike and Kornblith, Simon and Roelofs, Rebecca and Lopes, Raphael Gontijo and Hajishirzi, Hannaneh and Farhadi, Ali and Namkoong, Hongseok and Schmidt, Ludwig},
    title     = {Robust Fine-Tuning of Zero-Shot Models},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2022},
    pages     = {7959-7971}
}

@misc{stepanov2024gliner,
      title={GLiNER multi-task: Generalist Lightweight Model for Various Information Extraction Tasks}, 
      author={Ihor Stepanov and Mykhailo Shtopko},
      year={2024},
      eprint={2406.12925},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}

@misc{zaratiana2023gliner,
      title={GLiNER:
