span-marker-xlm-roberta-base-fewnerd-fine-super open-source model

Span Marker Xlm Roberta Base Fewnerd Fine Super

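Developed by tomaarsen

This is a SpanMarker model trained on the FewNERD dataset for multilingual named entity recognition, built on the xlm-roberta-base encoder.

Sequence labeling #multilingual entity recognition #fine-grained NER #cross-domain entity extraction

Downloads: 148

Released: 6/15/2023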

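Model Overview

The model uses the SpanMarker architecture for named entity recognition (NER) and supports English and multilingual text.

Model Features

Multilingual support

Built on the xlm-roberta-base encoder; supports English and multilingual text.

Fine-grained entity recognition

Recognizes 66 entity types across categories such as art, building, event, location, and organization.

SpanMarker architecture

Uses the SpanMarker architecture, which is designed specifically for named entity recognition.

Model Capabilities

Named entity recognition

Multilingual text processing

Fine-grained entity classification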

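Use Cases

Information extraction

Entity recognition in news articles

Extract person, place, and organization names from news articles.

F1 score: 0.6885 (overall, across all labels)

Academic literature analysis

Identify specialized entities in research papers, such as chemical substances and biological terms.

F1 scores: 0.5832 for chemical entities (other-chemicalthing) and 0.6497 for biological entities (other-biologything)

Business intelligence

Company name recognition

Extract company names and organization information from business documents.

F1 score: 0.6917 (organization-company)

Product recognition

Identify product names and types mentioned in text.

F1 scores: 0.7234 for cars (product-car) and 0.6464 for airplanes (product-airplane)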

🚀 SpanMarker with xlm-roberta-base on FewNERD

This is a SpanMarker model trained on the [FewNERD](https://huggingface.co/datasets/DFKI-SLT/few-nerd) dataset that can be used for named entity recognition. It uses [xlm-roberta-base](https://huggingface.co/xlm-roberta-base) as the underlying encoder.

✨ Key Features

Suited to named entity recognition tasks.
Supports English and multilingual text.
Built on the xlm-roberta-base encoder.
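
As a rough illustration of how a model like this is typically called, the sketch below loads it through the `span_marker` library and then filters the predictions. The model ID is an assumption pieced together from the author and model name shown on this page. `predict_entities`, `filter_entities`, `min_score`, and `label_prefix` are hypothetical names introduced here, and the dictionary keys (`span`, `label`, `score`) are assumed to match what `SpanMarkerModel.predict` returns, so check them against the installed version.

```python
from typing import Dict, List

# Assumed model ID, built from the author and model name shown on this page.
MODEL_ID = "tomaarsen/span-marker-xlm-roberta-base-fewnerd-fine-super"


def predict_entities(text: str) -> List[Dict]:
    """Run the model on one sentence (needs `pip install span_marker`)."""
    # Imported lazily so the helper below works without the library installed.
    from span_marker import SpanMarkerModel

    model = SpanMarkerModel.from_pretrained(MODEL_ID)
    return model.predict(text)


def filter_entities(entities: List[Dict], min_score: float = 0.5,
                    label_prefix: str = "") -> List[Dict]:
    """Hypothetical helper: keep confident predictions under a label prefix."""
    return [
        e for e in entities
        if e["score"] >= min_score and e["label"].startswith(label_prefix)
    ]


# Hand-written sample in the assumed output format (not real model output).
sample = [
    {"span": "Amelia Earhart", "label": "person-other", "score": 0.91},
    {"span": "Lockheed Vega 5B", "label": "product-airplane", "score": 0.88},
    {"span": "Paris", "label": "location-GPE", "score": 0.42},
]
confident_products = filter_entities(sample, min_score=0.5, label_prefix="product-")
```

With the library installed, `filter_entities(predict_entities(text), label_prefix="person-")` would keep only the person entities found in `text`.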

📚 Detailed Documentation

Model Details

Model Description

Property	Details
Model type	SpanMarker
Encoder	[xlm-roberta-base](https://huggingface.co/xlm-roberta-base)
Maximum sequence length	256 tokens
Maximum entity length	8 words
Training data	[FewNERD](https://huggingface.co/datasets/DFKI-SLT/few-nerd)
Languages	English, multilingual
License	cc-by-sa-4.0
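
To make the 8-word entity limit concrete, the sketch below enumerates every candidate span of at most eight words in a sentence. It is a simplified illustration of what such a cap implies, not SpanMarker's actual implementation: `candidate_spans` is a hypothetical helper, whitespace splitting stands in for real tokenization, and the 256-token sequence limit is not modeled.

```python
from typing import List, Tuple

MAX_ENTITY_WORDS = 8  # "Maximum entity length" from the table above


def candidate_spans(words: List[str],
                    max_len: int = MAX_ENTITY_WORDS) -> List[Tuple[int, int]]:
    """Return (start, end) word-index pairs, end exclusive, for every span
    of at most `max_len` words."""
    spans = []
    for start in range(len(words)):
        # A span may not run past the sentence or exceed the length cap.
        last_end = min(start + max_len, len(words))
        for end in range(start + 1, last_end + 1):
            spans.append((start, end))
    return spans


words = "The Memorial Sloan Kettering Cancer Center is located in New York".split()
spans = candidate_spans(words)
```

Under this reading, any mention longer than eight words never appears among the candidates, so it could not be returned as a single entity.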

Model Sources

Repository: SpanMarker on GitHub
Paper: SpanMarker for Named Entity Recognition

Model Labels

Label	Examples
art-broadcastprogram	"The Gale Storm Show : Oh , Susanna", "Corazones", "Street Cents"
art-film	"L'Atlantide", "Shawshank Redemption", "Bosch"
art-music	"Hollywood Studio Symphony", "Atkinson , Danko and Ford ( with Brockie and Hilton )", "Champion Lover"
art-other	"Venus de Milo", "Aphrodite of Milos", "The Today Show"
art-painting	"Cofiwch Dryweryn", "Production/Reproduction", "Touit"
art-writtenart	"The Seven Year Itch", "Time", "Imelda de ' Lambertazzi"
building-airport	"Newark Liberty International Airport", "Luton Airport", "Sheremetyevo International Airport"
building-hospital	"Hokkaido University Hospital", "Yeungnam University Hospital", "Memorial Sloan - Kettering Cancer Center"
building-hotel	"Radisson Blu Sea Plaza Hotel", "The Standard Hotel", "Flamingo Hotel"
building-library	"British Library", "Berlin State Library", "Bayerische Staatsbibliothek"
building-other	"Communiplex", "Henry Ford Museum", "Alpha Recording Studios"
building-restaurant	"Fatburger", "Carnegie Deli", "Trumbull"
building-sportsfacility	"Boston Garden", "Glenn Warner Soccer Facility", "Sports Center"
building-theater	"Pittsburgh Civic Light Opera", "National Paris Opera", "Sanders Theatre"
event-attack/battle/war/militaryconflict	"Jurist", "Easter Offensive", "Vietnam War"
event-disaster	"1693 Sicily earthquake", "1990s North Korean famine", "the 1912 North Mount Lyell Disaster"
event-election	"March 1898 elections", "Elections to the European Parliament", "1982 Mitcham and Morden by - election"
event-other	"Eastwood Scoring Stage", "Union for a Popular Movement", "Masaryk Democratic Movement"
event-protest	"Russian Revolution", "French Revolution", "Iranian Constitutional Revolution"
event-sportsevent	"World Cup", "Stanley Cup", "National Champions"
location-GPE	"Mediterranean Basin", "Croatian", "the Republic of Croatia"
location-bodiesofwater	"Norfolk coast", "Atatürk Dam Lake", "Arthur Kill"
location-island	"Laccadives", "Staten Island", "new Samsat district"
location-mountain	"Ruweisat Ridge", "Miteirya Ridge", "Salamander Glacier"
location-other	"Victoria line", "Northern City Line", "Cartuther"
location-park	"Painted Desert Community Complex Historic District", "Shenandoah National Park", "Gramercy Park"
location-road/railway/highway/transit	"Newark - Elizabeth Rail Link", "NJT", "Friern Barnet Road"
organization-company	"Church 's Chicken", "Texas Chicken", "Dixy Chicken"
organization-education	"MIT", "Belfast Royal Academy and the Ulster College of Physical Education", "Barnard College"
organization-government/governmentagency	"Congregazione dei Nobili", "Diet", "Supreme Court"
organization-media/newspaper	"TimeOut Melbourne", "Al Jazeera", "Clash"
organization-other	"IAEA", "4th Army", "Defence Sector C"
organization-politicalparty	"Al Wafa ' Islamic", "Shimpotō", "Kenseitō"
organization-religion	"UPCUSA", "Jewish", "Christian"
organization-showorganization	"Bochumer Symphoniker", "Mr. Mister", "Lizzy"
organization-sportsleague	"First Division", "NHL", "China League One"
organization-sportsteam	"Tottenham", "Arsenal", "Luc Alphand Aventures"
other-astronomything	"Algol", "Zodiac", "`` Caput Larvae ''"
other-award	"Grand Commander of the Order of the Niger", "Order of the Republic of Guinea and Nigeria", "GCON"
other-biologything	"Amphiphysin", "BAR", "N - terminal lipid"
other-chemicalthing	"carbon dioxide", "sulfur", "uranium"
other-currency	"$", "lac crore", "Travancore Rupee"
other-disease	"hypothyroidism", "bladder cancer", "French Dysentery Epidemic of 1779"
other-educationaldegree	"Master", "Bachelor", "BSc ( Hons ) in physics"
other-god	"El", "Fujin", "Raijin"
other-language	"Breton - speaking", "Latin", "English"
other-law	"United States Freedom Support Act", "Thirty Years ' Peace", "Leahy–Smith America Invents Act ( AIA"
other-livingthing	"insects", "patchouli", "monkeys"
other-medical	"amitriptyline", "pediatrician", "Pediatrics"
person-actor	"Tchéky Karyo", "Edmund Payne", "Ellaline Terriss"
person-artist/author	"George Axelrod", "Hicks", "Gaetano Donizett"
person-athlete	"Jaguar", "Neville", "Tozawa"
person-director	"Richard Quine", "Frank Darabont", "Bob Swaim"
person-other	"Campbell", "Richard Benson", "Holden"
person-politician	"Rivière", "Emeric", "William"
person-scholar	"Stedman", "Wurdack", "Stalmine"
person-soldier	"Joachim Ziegler", "Krukenberg", "Helmuth Weidling"
product-airplane	"EC135T2 CPDS", "Spey - equipped FGR.2s", "Luton"
product-car	"Phantom", "Corvettes - GT1 C6R", "100EX"
product-food	"V. labrusca", "red grape", "yakiniku"
product-game	"Hardcore RPG", "Airforce Delta", "Splinter Cell"
product-other	"PDP - 1", "Fairbottom Bobs", "X11"
product-ship	"Essex", "Congress", "HMS `` Chinkara ''"
product-software	"Wikipedia", "Apdf", "AmiPDF"
product-train	"55022", "Royal Scots Grey", "High Speed Trains"
product-weapon	"AR - 15 's", "ZU - 23 - 2MR Wróbel II", "ZU - 23 - 2M Wróbel"
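
Each label in the table pairs a coarse category (such as `art` or `location`) with a fine-grained type. The sketch below shows one way to split a label at its first hyphen and group fine types under their coarse category. `split_label` and `group_by_coarse` are hypothetical helpers introduced for illustration, not part of any library.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def split_label(label: str) -> Tuple[str, str]:
    """Split a label such as "location-GPE" into (coarse, fine).

    Only the first hyphen separates the two parts; whitespace around it
    is tolerated.
    """
    coarse, _, fine = label.partition("-")
    return coarse.strip(), fine.strip()


def group_by_coarse(labels: List[str]) -> Dict[str, List[str]]:
    """Group fine-grained types under their coarse category."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for label in labels:
        coarse, fine = split_label(label)
        groups[coarse].append(fine)
    return dict(groups)


labels = ["art-film", "art-music", "location-GPE",
          "event-attack/battle/war/militaryconflict"]
groups = group_by_coarse(labels)
```

Splitting only at the first hyphen keeps fine types that contain slashes, such as `attack/battle/war/militaryconflict`, in one piece.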

Evaluation

Metrics
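
The table that follows reports precision, recall, and F1 per label. Assuming the F1 column is the usual harmonic mean of precision and recall, the sketch below recomputes it for two rows as a sanity check. `f1_score` is a small helper written for this example.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall copied from two rows of the table.
overall_f1 = f1_score(0.6890, 0.6879)   # reported F1: 0.6885
airport_f1 = f1_score(0.8219, 0.8242)   # reported F1: 0.8230
```

Recomputed values can differ from the reported ones in the last digit because the precision and recall in the table are already rounded.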

Label	Precision	Recall	F1
all	0.6890	0.6879	0.6885
art-broadcastprogram	0.6	0.5771	0.5883
art-film	0.7384	0.7453	0.7419
art-music	0.7930	0.7221	0.7558
art-other	0.4245	0.2900	0.3446
art-painting	0.5476	0.4035	0.4646
art-writtenart	0.6400	0.6539	0.6469
building-airport	0.8219	0.8242	0.8230
building-hospital	0.7024	0.8104	0.7526
building-hotel	0.7175	0.7283	0.7228
building-library	0.74	0.7296	0.7348
building-other	0.5828	0.5910	0.5869
building-restaurant	0.5525	0.5216	0.5366
building-sportsfacility	0.6187	0.7881	0.6932
building-theater	0.7067	0.7626	0.7336
event-attack/battle/war/militaryconflict	0.7544	0.7468	0.7506
event-disaster	0.5882	0.5314	0.5584
event-election	0.4167	0.2198	0.2878
event-other	0.4902	0.4042	0.4430
event-protest	0.3643	0.2831	0.3186
event-sportsevent	0.6125	0.6239	0.6182
location-GPE	0.8102	0.8553	0.8321
location-bodiesofwater	0.6888	0.7725	0.7282
location-island	0.7285	0.6440	0.6836
location-mountain	0.7129	0.7327	0.7227
location-other	0.4376	0.2560	0.3231
location-park	0.6991	0.6900	0.6945
location-road/railway/highway/transit	0.6936	0.7259	0.7094
organization-company	0.6921	0.6912	0.6917
organization-education	0.7838	0.7963	0.7900
organization-government/governmentagency	0.5363	0.4394	0.4831
organization-media/newspaper	0.6215	0.6705	0.6451
organization-other	0.5766	0.5157	0.5444
organization-politicalparty	0.6449	0.7324	0.6859
organization-religion	0.5139	0.6057	0.5560
organization-showorganization	0.5620	0.5657	0.5638
organization-sportsleague	0.6348	0.6542	0.6443
organization-sportsteam	0.7138	0.7566	0.7346
other-astronomything	0.7418	0.7625	0.752
other-award	0.7291	0.6736	0.7002
other-biologything	0.6735	0.6275	0.6497
other-chemicalthing	0.6025	0.5651	0.5832
other-currency	0.6843	0.8411	0.7546
other-disease	0.6284	0.7089	0.6662
other-educationaldegree	0.5856	0.6033	0.5943
other-god	0.6089	0.6913	0.6475
other-language	0.6608	0.7968	0.7225
other-law	0.6693	0.7246	0.6958
other-livingthing	0.6070	0.6014	0.6042
other-medical	0.5062	0.5113	0.5088
person-actor	0.8274	0.7673	0.7962
person-artist/author	0.6761	0.7294	0.7018
person-athlete	0.8132	0.8347	0.8238
person-director	0.675	0.6823	0.6786
person-other	0.6472	0.6388	0.6429
person-politician	0.6621	0.6593	0.6607
person-scholar	0.5181	0.5007	0.5092
person-soldier	0.4750	0.5131	0.4933
product-airplane	0.6230	0.6717	0.6464
product-car	0.7293	0.7176	0.7234