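🚀 xomad/gliner-model-merge-large-v1.0 Model
xomad/gliner-model-merge-large-v1.0 is built on the pretrained model knowledgator/gliner-multitask-large-v0.5 and was developed to explore what model merging techniques can achieve. Merging improved the average F1 score by 3.25 percentage points, from 0.6276 to 0.6601.
The model was trained only on datasets with commercially friendly licenses, so that it can be used broadly under the Apache-2.0 license. The following datasets were used for training: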
- knowledgator/GLINER-multi-task-synthetic-data
- EmergentMethods/AskNews-NER-v0
- urchade/pile-mistral-v0.1
- MultiCoNER/multiconer_v2
- DFKI-SLT/few-nerd
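🚀 Quick Start
This model is built on the pretrained model knowledgator/gliner-multitask-large-v0.5, with performance improved through model merging. The sections below describe how to install and use it.
✨ Key Features
- Improved performance: model merging raises the average F1 score from 0.6276 to 0.6601.
- Commercially friendly: trained only on datasets with commercially friendly licenses, and released under the Apache-2.0 license.
- Multi-dataset training: trained on several public datasets, including knowledgator/GLINER-multi-task-synthetic-data and EmergentMethods/AskNews-NER-v0, among others.
📦 Installation
To use this model, you must install the GLiNER Python library: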
pip install gliner
After installing the GLiNER library, you can import the GLiNER class and load this model with GLiNER.from_pretrained.
💻 Usage Examples
Basic usage
from gliner import GLiNER
model = GLiNER.from_pretrained("xomad/gliner-model-merge-large-v1.0")
text = """
Microsoft was founded by Bill Gates and Paul Allen on April 4, 1975 to develop and sell BASIC interpreters for the Altair 8800. During his career at Microsoft, Gates held the positions of chairman, chief executive officer, president and chief software architect, while also being the largest individual shareholder until May 2014.
"""
labels = ["founder", "computer", "software", "position", "date", "company"]
entities = model.predict_entities(text, labels)
for entity in entities:
    print(entity["text"], "=>", entity["label"])
Output:
Microsoft => company
Bill Gates => founder
Paul Allen => founder
April 4, 1975 => date
BASIC => software
Altair 8800 => computer
Microsoft => company
chairman => position
chief executive officer => position
president => position
chief software architect => position
May 2014 => date
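The flat list of predictions above is often easier to consume when grouped by label. The following is a minimal sketch of that post-processing step; `group_by_label`, `min_score`, and `run_demo` are hypothetical names introduced here for illustration, and the sketch assumes each prediction dict carries a `score` field alongside `text` and `label`.

```python
from collections import defaultdict


def group_by_label(entities, min_score=0.0):
    """Group predicted entity texts by label, dropping low-confidence spans."""
    grouped = defaultdict(list)
    for entity in entities:
        # Assumption: "score" is present in each prediction dict;
        # entries without it are kept.
        if entity.get("score", 1.0) >= min_score:
            grouped[entity["label"]].append(entity["text"])
    return dict(grouped)


def run_demo():
    """Download the model and print grouped predictions (requires `pip install gliner`)."""
    from gliner import GLiNER

    model = GLiNER.from_pretrained("xomad/gliner-model-merge-large-v1.0")
    text = "Microsoft was founded by Bill Gates and Paul Allen on April 4, 1975."
    labels = ["founder", "date", "company"]
    # threshold is the minimum confidence GLiNER uses to emit a span.
    entities = model.predict_entities(text, labels, threshold=0.5)
    print(group_by_label(entities, min_score=0.6))
```

Calling `run_demo()` downloads the model weights on first use, so the helper is kept separate and can be applied to any list of prediction dicts.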
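📚 Detailed Documentation
⚙️ Fine-Tuning Process
The process starts from the base model knowledgator/gliner-multitask-large-v0.5. To build xomad/gliner-model-merge-large-v1.0, the model is fine-tuned separately on each of the datasets listed above, and multiple checkpoints are saved during each run. All of these checkpoints are collected into a pool, and the Model soups technique is then applied to produce several merged models: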
uniform_merged
greedy_on_random
greedy_on_sorted
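As a rough illustration of the Model soups idea (averaging the weights of several fine-tuned checkpoints), here is a minimal sketch in plain Python. It is not the authors' training code: checkpoints are represented as dicts mapping a parameter name to a list of floats, and `evaluate` is a hypothetical callback that returns a held-out score for a set of weights. The names `greedy_on_random` and `greedy_on_sorted` suggest greedy soups built over a shuffled and a score-sorted checkpoint order respectively, but that reading is an assumption.

```python
def uniform_soup(checkpoints):
    """Average every parameter across all checkpoints (uniform soup)."""
    count = len(checkpoints)
    return {
        name: [sum(values) / count
               for values in zip(*(ckpt[name] for ckpt in checkpoints))]
        for name in checkpoints[0]
    }


def greedy_soup(checkpoints, evaluate):
    """Add checkpoints one at a time, keeping each only if the averaged
    weights score at least as well as the current soup (greedy soup).

    Checkpoints are visited in the order given, so a sorted list and a
    shuffled list can yield different soups.
    """
    ingredients = [checkpoints[0]]
    best_score = evaluate(uniform_soup(ingredients))
    for candidate in checkpoints[1:]:
        score = evaluate(uniform_soup(ingredients + [candidate]))
        if score >= best_score:
            ingredients.append(candidate)
            best_score = score
    return uniform_soup(ingredients)


# Toy example: one scalar parameter, score is higher the closer it is to 2.0.
toy_checkpoints = [{"w": [1.0]}, {"w": [3.0]}, {"w": [10.0]}]
toy_evaluate = lambda weights: -abs(weights["w"][0] - 2.0)
print(uniform_soup(toy_checkpoints))
print(greedy_soup(toy_checkpoints, toy_evaluate))
```

In the toy example the greedy soup rejects the third checkpoint because adding it lowers the score, while the uniform soup averages all three regardless.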
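Next, WiSE-FT merging is applied to pairs of models selected from these three merged models and the original model, producing the wise_ft_merged model. This concludes the first stage of fine-tuning.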
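WiSE-FT combines two models by linearly interpolating their weights, $\theta = (1 - \alpha)\,\theta_a + \alpha\,\theta_b$. The sketch below shows that interpolation for one pair of models, using the same simplified representation of weights as dicts of float lists. Sweeping `alpha` over a grid and keeping the best-scoring mix is an assumption made here for illustration; the source does not say how the mixing coefficient or the model pairs were chosen, and `evaluate` is a hypothetical scoring callback.

```python
def wise_ft(theta_a, theta_b, alpha):
    """Interpolate two sets of weights: (1 - alpha) * theta_a + alpha * theta_b."""
    return {
        name: [(1 - alpha) * a + alpha * b
               for a, b in zip(theta_a[name], theta_b[name])]
        for name in theta_a
    }


def best_interpolation(theta_a, theta_b, evaluate, alphas):
    """Return (alpha, merged_weights) for the best-scoring mixing coefficient."""
    candidates = [(alpha, wise_ft(theta_a, theta_b, alpha)) for alpha in alphas]
    return max(candidates, key=lambda pair: evaluate(pair[1]))


# Toy example: the score peaks when the single parameter equals 1.0.
original = {"w": [0.0]}
merged = {"w": [4.0]}
toy_evaluate = lambda weights: -abs(weights["w"][0] - 1.0)
alpha, weights = best_interpolation(
    original, merged, toy_evaluate, alphas=[0.0, 0.25, 0.5, 0.75, 1.0]
)
print(alpha, weights)
```

With `alpha = 0` the result is the first model unchanged and with `alpha = 1` it is the second, so the sweep always includes both endpoints as candidates.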
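In the second stage of fine-tuning, the process is repeated with wise_ft_merged as the new starting point to produce the final model.
[Figure: overall fine-tuning pipeline]
The pool of fine-tuned models and the merged models are evaluated on the CrossNER and TwitterNER benchmarks, and the results are plotted in two figures per stage (crossner_f1 and other_f1, respectively).
[Figure: stage 1 fine-tuning results]
[Figure: stage 2 fine-tuning results]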
📊 Benchmarks
Performance on zero-shot NER benchmarks (CrossNER, mit-movie, and mit-restaurant), with data taken from https://huggingface.co/knowledgator/gliner-multitask-large-v0.5:
Model | F1 Score |
---|---|
xomad/gliner-model-merge-large-v1.0 | 0.6601 |
knowledgator/gliner-multitask-v0.5 | 0.6276 |
numind/NuNER_Zero-span | 0.6196 |
gliner-community/gliner_large-v2.5 | 0.6154 |
EmergentMethods/gliner_large_news-v2.1 | 0.5876 |
urchade/gliner_large-v2.1 | 0.5754 |
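Detailed performance on each dataset:
Model | Dataset | Precision | Recall | F1 Score | F1 Score (Decimal) |
---|---|---|---|---|---|
xomad/gliner-model-merge-large-v1.0 | CrossNER_AI | 62.66% | 57.48% | 59.96% | 0.5996 |
xomad/gliner-model-merge-large-v1.0 | CrossNER_literature | 73.28% | 66.42% | 69.68% | 0.6968 |
xomad/gliner-model-merge-large-v1.0 | CrossNER_music | 74.89% | 70.67% | 72.72% | 0.7272 |
xomad/gliner-model-merge-large-v1.0 | CrossNER_politics | 79.46% | 77.57% | 78.51% | 0.7851 |
xomad/gliner-model-merge-large-v1.0 | CrossNER_science | 74.72% | 70.24% | 72.41% | 0.7241 |
xomad/gliner-model-merge-large-v1.0 | mit-movie | 67.33% | 57.89% | 62.25% | 0.6225 |
xomad/gliner-model-merge-large-v1.0 | mit-restaurant | 54.94% | 40.41% | 46.57% | 0.4657 |
xomad/gliner-model-merge-large-v1.0 | Average | | | | 0.6601 |
numind/NuNER_Zero-span | CrossNER_AI | 63.82% | 56.82% | 60.12% | 0.6012 |
numind/NuNER_Zero-span | CrossNER_literature | 73.53% | 58.06% | 64.89% | 0.6489 |
numind/NuNER_Zero-span | CrossNER_music | 72.69% | 67.40% | 69.95% | 0.6995 |
numind/NuNER_Zero-span | CrossNER_politics | 77.28% | 68.69% | 72.73% | 0.7273 |
numind/NuNER_Zero-span | CrossNER_science | 70.08% | 63.12% | 66.42% | 0.6642 |
numind/NuNER_Zero-span | mit-movie | 63.00% | 48.88% | 55.05% | 0.5505 |
numind/NuNER_Zero-span | mit-restaurant | 54.81% | 37.62% | 44.62% | 0.4462 |
numind/NuNER_Zero-span | Average | | | | 0.6196 |
knowledgator/gliner-multitask-v0.5 | CrossNER_AI | 51.00% | 51.11% | 51.05% | 0.5105 |
knowledgator/gliner-multitask-v0.5 | CrossNER_literature | 72.65% | 65.62% | 68.96% | 0.6896 |
knowledgator/gliner-multitask-v0.5 | CrossNER_music | 74.91% | 73.70% | 74.30% | 0.7430 |
knowledgator/gliner-multitask-v0.5 | CrossNER_politics | 78.84% | 77.71% | 78.27% | 0.7827 |
knowledgator/gliner-multitask-v0.5 | CrossNER_science | 69.20% | 65.48% | 67.29% | 0.6729 |
knowledgator/gliner-multitask-v0.5 | mit-movie | 61.29% | 52.59% | 56.60% | 0.5660 |
knowledgator/gliner-multitask-v0.5 | mit-restaurant | 50.65% | 38.13% | 43.51% | 0.4351 |
knowledgator/gliner-multitask-v0.5 | Average | | | | 0.6276 |
gliner-community/gliner_large-v2.5 | CrossNER_AI | 50.85% | 63.03% | 56.29% | 0.5629 |
gliner-community/gliner_large-v2.5 | CrossNER_literature | 64.92% | 67.21% | 66.04% | 0.6604 |
gliner-community/gliner_large-v2.5 | CrossNER_music | 70.88% | 73.10% | 71.97% | 0.7197 |
gliner-community/gliner_large-v2.5 | CrossNER_politics | 72.67% | 72.93% | 72.80% | 0.7280 |
gliner-community/gliner_large-v2.5 | CrossNER_science | 61.71% | 68.85% | 65.08% | 0.6508 |
gliner-community/gliner_large-v2.5 | mit-movie | 54.63% | 52.83% | 53.71% | 0.5371 |
gliner-community/gliner_large-v2.5 | mit-restaurant | 47.99% | 42.13% | 44.87% | 0.4487 |
gliner-community/gliner_large-v2.5 | Average | | | | 0.6154 |
urchade/gliner_large-v2.1 | CrossNER_AI | 54.98% | 52.00% | 53.45% | 0.5345 |
urchade/gliner_large-v2.1 | CrossNER_literature | 59.33% | 56.47% | 57.87% | 0.5787 |
urchade/gliner_large-v2.1 | CrossNER_music | 67.39% | 66.77% | 67.08% | 0.6708 |
urchade/gliner_large-v2.1 | CrossNER_politics | 66.07% | 63.76% | 64.90% | 0.6490 |
urchade/gliner_large-v2.1 | CrossNER_science | 61.45% | 62.56% | 62.00% | 0.6200 |
urchade/gliner_large-v2.1 | mit-movie | 55.94% | 47.36% | 51.29% | 0.5129 |
urchade/gliner_large-v2.1 | mit-restaurant | 53.34% | 40.83% | 46.25% | 0.4625 |
urchade/gliner_large-v2.1 | Average | | | | 0.5754 |
EmergentMethods/gliner_large_news-v2.1 | CrossNER_AI | 59.60% | 54.55% | 56.96% | 0.5696 |
EmergentMethods/gliner_large_news-v2.1 | CrossNER_literature | 65.41% | 56.16% | 60.44% | 0.6044 |
EmergentMethods/gliner_large_news-v2.1 | CrossNER_music | 67.47% | 63.08% | 65.20% | 0.6520 |
EmergentMethods/gliner_large_news-v2.1 | CrossNER_politics | 66.05% | 60.07% | 62.92% | 0.6292 |
EmergentMethods/gliner_large_news-v2.1 | CrossNER_science | 68.44% | 63.57% | 65.92% | 0.6592 |
EmergentMethods/gliner_large_news-v2.1 | mit-movie | 65.85% | 49.59% | 56.57% | 0.5657 |
EmergentMethods/gliner_large_news-v2.1 | mit-restaurant | 54.71% | 35.94% | 43.38% | 0.4338 |
EmergentMethods/gliner_large_news-v2.1 | Average | | | | 0.5876 |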
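The headline figure for this model appears to be an unweighted mean of its seven per-dataset F1 scores. The short sketch below reproduces that arithmetic from the table values for xomad/gliner-model-merge-large-v1.0 and derives the gain over the base model from the two reported averages; the variable and function names are illustrative only.

```python
# Per-dataset F1 for xomad/gliner-model-merge-large-v1.0, copied from the table.
merged_f1 = {
    "CrossNER_AI": 0.5996,
    "CrossNER_literature": 0.6968,
    "CrossNER_music": 0.7272,
    "CrossNER_politics": 0.7851,
    "CrossNER_science": 0.7241,
    "mit-movie": 0.6225,
    "mit-restaurant": 0.4657,
}

# Reported averages for the merged model and the base model.
REPORTED_MERGED = 0.6601
REPORTED_BASE = 0.6276


def macro_average(scores):
    """Unweighted mean over datasets (each dataset counts equally)."""
    return sum(scores.values()) / len(scores)


average = macro_average(merged_f1)
gain_points = (REPORTED_MERGED - REPORTED_BASE) * 100           # percentage points
relative_gain = (REPORTED_MERGED - REPORTED_BASE) / REPORTED_BASE * 100  # percent

print(f"macro-average F1: {average:.4f}")
print(f"absolute gain: {gain_points:.2f} percentage points")
print(f"relative gain: {relative_gain:.2f}%")
```

The absolute gain is stated in percentage points of F1, which is distinct from the relative improvement over the base score.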
Author
Hoan Nguyen, from xomad.com
Citation
@misc{wortsman2022modelsoupsaveragingweights,
title={Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time},
author={Mitchell Wortsman and Gabriel Ilharco and Samir Yitzhak Gadre and Rebecca Roelofs and Raphael Gontijo-Lopes and Ari S. Morcos and Hongseok Namkoong and Ali Farhadi and Yair Carmon and Simon Kornblith and Ludwig Schmidt},
year={2022},
eprint={2203.05482},
archivePrefix={arXiv},
primaryClass={cs.LG},
url={https://arxiv.org/abs/2203.05482},
}
@InProceedings{Wortsman_2022_CVPR,
author = {Wortsman, Mitchell and Ilharco, Gabriel and Kim, Jong Wook and Li, Mike and Kornblith, Simon and Roelofs, Rebecca and Lopes, Raphael Gontijo and Hajishirzi, Hannaneh and Farhadi, Ali and Namkoong, Hongseok and Schmidt, Ludwig},
title = {Robust Fine-Tuning of Zero-Shot Models},
booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2022},
pages = {7959-7971}
}
@misc{stepanov2024gliner,
title={GLiNER multi-task: Generalist Lightweight Model for Various Information Extraction Tasks},
author={Ihor Stepanov and Mykhailo Shtopko},
year={2024},
eprint={2406.12925},
archivePrefix={arXiv},
primaryClass={cs.LG}
}
@misc{zaratiana2023gliner,
title={GLiNER: Generalist Model for Named Entity Recognition using Bidirectional Transformer},
author={Urchade Zaratiana and Nadi Tomeh and Pierre Holat and Thierry Charnois},
year={2023},
eprint={2311.08526},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
📄 License
This model is released under the Apache-2.0 license.








