gliner-model-merge-large-v1.0: Open-Source Named Entity Recognition Model

Gliner Model Merge Large V1.0

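Developed by xomad

A named entity recognition model optimized with model merging; its F1 score improved by 3.25 points to 0.6601.

Sequence labeling

PyTorch

Language: English. License: Apache-2.0. Tags: #multi-task NER #model merging #zero-shot learning

Downloads: 129

Released: 9/24/2024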

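Model Overview

This is a named entity recognition model built on the GLiNER architecture, with performance improved through model merging. It supports zero-shot NER and can recognize many entity types in text.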

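Model Features

Model merging

Applies model merging methods such as WiSE-FT, raising F1 by 3.25 points

Commercially friendly licensing

Trained only on datasets with commercially friendly licenses, so the model can be used broadly

Multi-dataset training

Merges knowledge from 5 datasets to improve generalization

Zero-shot capability

Supports zero-shot named entity recognition without domain-specific training data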

Model Capabilities

Named entity recognition

Zero-shot learning

Multi-class entity detection

Text analysis

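Use Cases

News analysis

Person and organization recognition in news

Automatically identifies people, organizations, locations, and other entities in news text

F1 of 78.51% in the politics domain (CrossNER_politics)

Business intelligence

Company information extraction

Extracts companies, founders, products, and similar information from business documents

The usage example correctly identifies Microsoft and its founders

Academic research

Scientific literature analysis

Identifies technical terms and concepts in research papers

F1 of 72.41% in the science domain (CrossNER_science)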

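🚀 xomad/gliner-model-merge-large-v1.0 Model

The xomad/gliner-model-merge-large-v1.0 model is built on the pretrained model knowledgator/gliner-multitask-large-v0.5 and explores what model merging can achieve. Merging raised the F1 score by 3.25 points, from 0.6276 to 0.6601.

The model was trained only on datasets with commercially friendly licenses so that it can be used broadly under the Apache-2.0 license. The training datasets include knowledgator/GLINER-multi-task-synthetic-data and EmergentMethods/AskNews-NER-v0.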

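🚀 Quick Start

This model is built on the pretrained model knowledgator/gliner-multitask-large-v0.5, with performance improved through model merging. The sections below describe how to install and use it.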

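✨ Key Features

Performance gain: model merging raises the F1 score from 0.6276 to 0.6601.
Commercially friendly: trained only on datasets with commercially friendly licenses and released under the Apache-2.0 license.
Multi-dataset training: trained on multiple public datasets, including knowledgator/GLINER-multi-task-synthetic-data and EmergentMethods/AskNews-NER-v0.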

📦 Installation

To use this model, you must install the GLiNER Python library:

pip install gliner

Once the GLiNER library is installed, import the GLiNER class and load this model with GLiNER.from_pretrained.

💻 Usage Examples

Basic usage

from gliner import GLiNER

model = GLiNER.from_pretrained("xomad/gliner-model-merge-large-v1.0")

text = """
Microsoft was founded by Bill Gates and Paul Allen on April 4, 1975 to develop and sell BASIC interpreters for the Altair 8800. During his career at Microsoft, Gates held the positions of chairman, chief executive officer, president and chief software architect, while also being the largest individual shareholder until May 2014.
"""

labels = ["founder", "computer", "software", "position", "date", "company"]

entities = model.predict_entities(text, labels)

for entity in entities:
    print(entity["text"], "=>", entity["label"])

Output:

Microsoft => company
Bill Gates => founder
Paul Allen => founder
April 4, 1975 => date
BASIC => software
Altair 8800 => computer
Microsoft => company
chairman => position
chief executive officer => position
president => position
chief software architect => position
May 2014 => date
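The output above lists "Microsoft" twice because every mention is reported separately. A minimal sketch of grouping the results by label follows. It assumes each entity is a dict with the "text" and "label" keys used in the example, plus a "score" key holding the model's confidence (an assumption here, handled defensively). The helper name group_entities and the sample scores are hypothetical and only serve as illustration.

```python
from collections import defaultdict


def group_entities(entities, min_score=0.0):
    """Group entity dicts by label, keeping each distinct text once per label."""
    grouped = defaultdict(list)
    for entity in entities:
        # "score" is assumed to exist; entries without it are always kept.
        if entity.get("score", 1.0) < min_score:
            continue
        if entity["text"] not in grouped[entity["label"]]:
            grouped[entity["label"]].append(entity["text"])
    return dict(grouped)


# Hand-written sample in the assumed output format (scores are made up).
sample = [
    {"text": "Microsoft", "label": "company", "score": 0.98},
    {"text": "Bill Gates", "label": "founder", "score": 0.95},
    {"text": "Microsoft", "label": "company", "score": 0.97},
    {"text": "Altair 8800", "label": "computer", "score": 0.42},
]

print(group_entities(sample, min_score=0.5))
# {'company': ['Microsoft'], 'founder': ['Bill Gates']}
```

In real use, the list returned by model.predict_entities(text, labels) would replace the hand-written sample.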

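📚 Documentation

⚙️ Fine-tuning Process

The process starts from the base model knowledgator/gliner-multitask-large-v0.5. To build xomad/gliner-model-merge-large-v1.0, the model is fine-tuned separately on each of the training datasets, and multiple checkpoints are saved during each run. All of these checkpoints go into a single pool, to which the Model soups technique is applied to produce different merged models: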

uniform_merged
greedy_on_random
greedy_on_sorted
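A minimal sketch of the two soup recipes from the Model soups paper, using plain Python lists as toy stand-ins for weight tensors. How the three names above map onto this code is an assumption: the greedy procedure is taken to run over checkpoints sorted by individual score (greedy_on_sorted) or in shuffled order (greedy_on_random). The toy_score function, TARGET, and the pool values are hypothetical; a real run would evaluate each candidate on held-out NER data.

```python
import math
import random


def uniform_soup(checkpoints):
    """Average each named parameter element-wise across all checkpoints."""
    count = len(checkpoints)
    return {
        name: [sum(vals) / count for vals in zip(*(ckpt[name] for ckpt in checkpoints))]
        for name in checkpoints[0]
    }


def greedy_soup(checkpoints, evaluate):
    """Visit checkpoints in the given order; keep one only if the averaged
    soup scores at least as well as the current soup."""
    ingredients = [checkpoints[0]]
    best = evaluate(uniform_soup(ingredients))
    for candidate in checkpoints[1:]:
        score = evaluate(uniform_soup(ingredients + [candidate]))
        if score >= best:
            ingredients.append(candidate)
            best = score
    return uniform_soup(ingredients)


# Toy stand-ins: each "checkpoint" holds one 2-element parameter, and the
# hypothetical score is the negative distance to a made-up optimum.
TARGET = [1.0, 2.0]


def toy_score(params):
    return -math.dist(params["w"], TARGET)


pool = [{"w": [0.8, 2.1]}, {"w": [1.2, 1.9]}, {"w": [3.0, 0.0]}]

uniform_merged = uniform_soup(pool)
greedy_on_sorted = greedy_soup(sorted(pool, key=toy_score, reverse=True), toy_score)
greedy_on_random = greedy_soup(random.Random(0).sample(pool, k=len(pool)), toy_score)

print(uniform_merged, greedy_on_sorted)
```

In this toy pool the uniform soup is pulled away from the optimum by the third checkpoint, while the sorted greedy soup rejects that checkpoint and averages only the first two.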

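Next, WiSE-FT merging is applied to pairs of models chosen from these 3 merged models and the original model, producing the wise_ft_merged model. This concludes the first fine-tuning phase.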
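WiSE-FT interpolates linearly between the weights of two models. The sketch below shows that interpolation on toy list-valued parameters. The mixing coefficient alpha, the grid of candidate values, and the best_alpha helper are assumptions for illustration, since the coefficient actually used for wise_ft_merged is not stated here.

```python
def wise_ft(base, finetuned, alpha):
    """Interpolate two checkpoints: (1 - alpha) * base + alpha * finetuned."""
    return {
        name: [(1 - alpha) * b + alpha * f for b, f in zip(base[name], finetuned[name])]
        for name in base
    }


def best_alpha(base, finetuned, evaluate, alphas=(0.0, 0.25, 0.5, 0.75, 1.0)):
    """Pick the mixing coefficient whose interpolated weights score highest."""
    return max(alphas, key=lambda a: evaluate(wise_ft(base, finetuned, a)))


original = {"w": [0.0, 1.0]}  # toy stand-in for the original model
soup = {"w": [1.0, 3.0]}      # toy stand-in for one merged soup

wise_ft_merged = wise_ft(original, soup, alpha=0.25)
print(wise_ft_merged)
# {'w': [0.25, 1.5]}
```

With alpha=0 the result equals the original model and with alpha=1 it equals the merged soup, so sweeping alpha traces a straight path between the two in weight space.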

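In the second fine-tuning phase, the process is repeated with wise_ft_merged as the new starting point to produce the final model. The figure below shows the whole fine-tuning flow:

Figure: Finetuning flow

The checkpoint pool and the merged models are evaluated on the CrossNER and TwitterNER benchmarks, and the results are plotted in the two figures below (as crossner_f1 and other_f1, respectively).

First fine-tuning phase (figure): 1st finetuning phase

Second fine-tuning phase (figure): 2nd finetuning phase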

📊 Benchmarks

Figure: Model Performance

Performance on zero-shot NER benchmarks (CrossNER, mit-movie, and mit-restaurant), with data sourced from https://huggingface.co/knowledgator/gliner-multitask-large-v0.5:

Model	F1 Score
xomad/gliner-model-merge-large-v1.0	0.6601
knowledgator/gliner-multitask-v0.5	0.6276
numind/NuNER_Zero-span	0.6196
gliner-community/gliner_large-v2.5	0.6154
EmergentMethods/gliner_large_news-v2.1	0.5876
urchade/gliner_large-v2.1	0.5754

Detailed performance on each dataset:

Model	Dataset	Precision	Recall	F1 Score	F1 Score (Decimal)
xomad/gliner-model-merge-large-v1.0	CrossNER_AI	62.66%	57.48%	59.96%	0.5996
	CrossNER_literature	73.28%	66.42%	69.68%	0.6968
	CrossNER_music	74.89%	70.67%	72.72%	0.7272
	CrossNER_politics	79.46%	77.57%	78.51%	0.7851
	CrossNER_science	74.72%	70.24%	72.41%	0.7241
	mit-movie	67.33%	57.89%	62.25%	0.6225
	mit-restaurant	54.94%	40.41%	46.57%	0.4657
	Average				0.6601
numind/NuNER_Zero-span	CrossNER_AI	63.82%	56.82%	60.12%	0.6012
	CrossNER_literature	73.53%	58.06%	64.89%	0.6489
	CrossNER_music	72.69%	67.40%	69.95%	0.6995
	CrossNER_politics	77.28%	68.69%	72.73%	0.7273
	CrossNER_science	70.08%	63.12%	66.42%	0.6642
	mit-movie	63.00%	48.88%	55.05%	0.5505
	mit-restaurant	54.81%	37.62%	44.62%	0.4462
	Average				0.6196
knowledgator/gliner-multitask-v0.5	CrossNER_AI	51.00%	51.11%	51.05%	0.5105
	CrossNER_literature	72.65%	65.62%	68.96%	0.6896
	CrossNER_music	74.91%	73.70%	74.30%	0.7430
	CrossNER_politics	78.84%	77.71%	78.27%	0.7827
	CrossNER_science	69.20%	65.48%	67.29%	0.6729
	mit-movie	61.29%	52.59%	56.60%	0.5660
	mit-restaurant	50.65%	38.13%	43.51%	0.4351
	Average				0.6276
gliner-community/gliner_large-v2.5	CrossNER_AI	50.85%	63.03%	56.29%	0.5629
	CrossNER_literature	64.92%	67.21%	66.04%	0.6604
	CrossNER_music	70.88%	73.10%	71.97%	0.7197
	CrossNER_politics	72.67%	72.93%	72.80%	0.7280
	CrossNER_science	61.71%	68.85%	65.08%	0.6508
	mit-movie	54.63%	52.83%	53.71%	0.5371
	mit-restaurant	47.99%	42.13%	44.87%	0.4487
	Average				0.6154
urchade/gliner_large-v2.1	CrossNER_AI	54.98%	52.00%	53.45%	0.5345
	CrossNER_literature	59.33%	56.47%	57.87%	0.5787
	CrossNER_music	67.39%	66.77%	67.08%	0.6708
	CrossNER_politics	66.07%	63.76%	64.90%	0.6490
	CrossNER_science	61.45%	62.56%	62.00%	0.6200
	mit-movie	55.94%	47.36%	51.29%	0.5129
	mit-restaurant	53.34%	40.83%	46.25%	0.4625
	Average				0.5754
EmergentMethods/gliner_large_news-v2.1	CrossNER_AI	59.60%	54.55%	56.96%	0.5696
	CrossNER_literature	65.41%	56.16%	60.44%	0.6044
	CrossNER_music	67.47%	63.08%	65.20%	0.6520
	CrossNER_politics	66.05%	60.07%	62.92%	0.6292
	CrossNER_science	68.44%	63.57%	65.92%	0.6592
	mit-movie	65.85%	49.59%	56.57%	0.5657
	mit-restaurant	54.71%	35.94%	43.38%	0.4338
	Average				0.5876
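The F1 column is the harmonic mean of precision and recall, and the row labeled as the average is consistent with an unweighted mean of the seven per-dataset F1 values. A short sketch that recomputes both for xomad/gliner-model-merge-large-v1.0 from the precision and recall figures above follows. Because the inputs are already rounded, a recomputed value can differ from the table in the last digit (CrossNER_politics is one such case).

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (same unit as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision %, recall %) for xomad/gliner-model-merge-large-v1.0,
# copied from the detailed table.
rows = {
    "CrossNER_AI": (62.66, 57.48),
    "CrossNER_literature": (73.28, 66.42),
    "CrossNER_music": (74.89, 70.67),
    "CrossNER_politics": (79.46, 77.57),
    "CrossNER_science": (74.72, 70.24),
    "mit-movie": (67.33, 57.89),
    "mit-restaurant": (54.94, 40.41),
}

per_dataset = {name: f1_score(p, r) for name, (p, r) in rows.items()}
macro_f1 = sum(per_dataset.values()) / len(per_dataset) / 100  # as a fraction

# Gain over the base model's reported average, in F1 points.
gain_points = (0.6601 - 0.6276) * 100

for name, value in per_dataset.items():
    print(f"{name}: {value:.2f}%")
print(f"macro F1 ~ {macro_f1:.4f}, gain ~ {gain_points:.2f} points")
```

The unweighted mean gives every benchmark equal influence regardless of its size, which is why the comparatively low mit-restaurant score pulls the average well below the CrossNER results.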

Author

Hoan Nguyen, from xomad.com

Citation

@misc{wortsman2022modelsoupsaveragingweights,
      title={Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time}, 
      author={Mitchell Wortsman and Gabriel Ilharco and Samir Yitzhak Gadre and Rebecca Roelofs and Raphael Gontijo-Lopes and Ari S. Morcos and Hongseok Namkoong and Ali Farhadi and Yair Carmon and Simon Kornblith and Ludwig Schmidt},
      year={2022},
      eprint={2203.05482},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2203.05482}, 
}

@InProceedings{Wortsman_2022_CVPR,
    author    = {Wortsman, Mitchell and Ilharco, Gabriel and Kim, Jong Wook and Li, Mike and Kornblith, Simon and Roelofs, Rebecca and Lopes, Raphael Gontijo and Hajishirzi, Hannaneh and Farhadi, Ali and Namkoong, Hongseok and Schmidt, Ludwig},
    title     = {Robust Fine-Tuning of Zero-Shot Models},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2022},
    pages     = {7959-7971}
}

@misc{stepanov2024gliner,
      title={GLiNER multi-task: Generalist Lightweight Model for Various Information Extraction Tasks}, 
      author={Ihor Stepanov and Mykhailo Shtopko},
      year={2024},
      eprint={2406.12925},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}

@misc{zaratiana2023gliner,
      title={GLiNER: Generalist Model for Named Entity Recognition using Bidirectional Transformer}, 
      author={Urchade Zaratiana and Nadi Tomeh and Pierre Holat and Thierry Charnois},
      year={2023},
      eprint={2311.08526},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}