🚀 Model Card for gliner_large_news-v2.1
This model is a fine-tuned version of GLiNER that aims to improve accuracy in long-context news entity extraction across a variety of topics.
🚀 Quick Start
To start using the gliner_large_news-v2.1 model, follow the code example below:
from gliner import GLiNER
model = GLiNER.from_pretrained("EmergentMethods/gliner_large_news-v2.1")
text = """
The Chihuahua State Public Security Secretariat (SSPE) arrested 35-year-old Salomón C. T. in Ciudad Juárez, found in possession of a stolen vehicle, a white GMC Yukon, which was reported stolen in the city's streets. The arrest was made by intelligence and police analysis personnel during an investigation in the border city. The arrest is related to a previous detention on February 6, which involved armed men in a private vehicle. The detainee and the vehicle were turned over to the Chihuahua State Attorney General's Office for further investigation into the case.
"""
labels = ["person", "location", "date", "event", "facility", "vehicle", "number", "organization"]
entities = model.predict_entities(text, labels)
for entity in entities:
    print(entity["text"], "=>", entity["label"])
Output:
Chihuahua State Public Security Secretariat => organization
SSPE => organization
35-year-old => number
Salomón C. T. => person
Ciudad Juárez => location
GMC Yukon => vehicle
February 6 => date
Chihuahua State Attorney General's Office => organization
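The flat list printed above is often easier to consume downstream when grouped by label. The following is a minimal sketch, not part of the model's API: `group_by_label` is a hypothetical helper, and `sample_entities` is a hand-written stand-in that mirrors part of the output above so the snippet runs without loading the model. It assumes only that each entity is a dict with `"text"` and `"label"` keys, as used in the Quick Start loop.

```python
from collections import defaultdict


def group_by_label(entities):
    """Group entity texts by label (hypothetical helper, not a GLiNER function).

    Keeps first-seen order within each label and skips repeated mentions.
    """
    grouped = defaultdict(list)
    for entity in entities:
        texts = grouped[entity["label"]]
        if entity["text"] not in texts:
            texts.append(entity["text"])
    return dict(grouped)


# Hand-written sample mirroring part of the Quick Start output (assumption:
# in real use this list would come from model.predict_entities(text, labels)).
sample_entities = [
    {"text": "Chihuahua State Public Security Secretariat", "label": "organization"},
    {"text": "SSPE", "label": "organization"},
    {"text": "Salomón C. T.", "label": "person"},
    {"text": "Ciudad Juárez", "label": "location"},
    {"text": "February 6", "label": "date"},
]

grouped = group_by_label(sample_entities)
for label, texts in grouped.items():
    print(label, "=>", texts)
```

In practice, the result of `model.predict_entities(text, labels)` from the example above could be passed to the helper in place of `sample_entities`.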
✨ Features
- Enhanced Accuracy: This fine-tuned model improves on the base GLiNER model's zero-shot accuracy by up to 7.5% across 18 benchmark datasets, with particular gains in long-context news entity extraction.
- Diverse Dataset: The underlying dataset, AskNews-NER-v0, enforces diversity across country, language, topic, and time to represent a broad range of global perspectives.
- Compact and High-throughput: The model is compact and suitable for high-throughput production use cases.
📚 Documentation
Model Details
Model Description
The synthetic data for this news fine-tune is sourced from the AskNews API. We ensured diversity across country, language, topic, and time.

- Developed by: Emergent Methods
- Funded by: Emergent Methods
- Shared by: Emergent Methods
- Model type: microsoft/deberta
- Language(s) (NLP): English (en), covering English texts and translations from Spanish (es), Portuguese (pt), German (de), Russian (ru), French (fr), Arabic (ar), Italian (it), Ukrainian (uk), Norwegian (no), Swedish (sv), and Danish (da)
- License: Apache 2.0
- Finetuned from model: GLiNER
Model Sources
- Repository: To be added
- Paper: To be added
- Demo: To be added
Uses
Direct Use
This model is designed for generalist entity extraction. Despite being fine-tuned on news data, it shows improved accuracy across 18 benchmark datasets. The broad and diversified dataset enables it to recognize and extract more entity types. It is currently used by AskNews for entity extraction in their system.
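For generalist use with many candidate labels, it can help to keep only higher-confidence predictions. The sketch below is an illustration under stated assumptions, not the authors' pipeline: `filter_by_score` is a hypothetical helper, it assumes each predicted entity dict carries a numeric `"score"` key, and the scores in `sample_entities` are invented for the example rather than produced by the model.

```python
def filter_by_score(entities, min_score=0.5):
    """Keep entities whose confidence is at least min_score.

    Hypothetical helper; assumes each entity dict has a numeric "score" key.
    Entities without a score are treated as score 0.0 and dropped.
    """
    return [e for e in entities if e.get("score", 0.0) >= min_score]


# Invented scores for illustration only (not real model output).
sample_entities = [
    {"text": "Ciudad Juárez", "label": "location", "score": 0.95},
    {"text": "GMC Yukon", "label": "vehicle", "score": 0.80},
    {"text": "35-year-old", "label": "number", "score": 0.40},
]

confident = filter_by_score(sample_entities, min_score=0.6)
for entity in confident:
    print(entity["text"], "=>", entity["label"])
```

The cutoff of 0.6 is arbitrary here; a suitable value depends on the label set and on how precision and recall should be traded off for the application.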
Bias, Risks, and Limitations
Although the dataset aims to reduce bias and improve diversity, it is still biased towards Western languages and countries. This bias stems from the translation and summary generation abilities of Llama2 and Llama3: any bias in their training data will also be present in this dataset.

Training Details
The training dataset is AskNews-NER-v0. Other training details can be found in the companion paper.
Environmental Impact
Citation
BibTeX: To be added
APA: To be added
Model Authors
Elin Törnquist, Emergent Methods elin at emergentmethods.ai
Robert Caulk, Emergent Methods rob at emergentmethods.ai
Model Contact
Elin Törnquist, Emergent Methods elin at emergentmethods.ai
Robert Caulk, Emergent Methods rob at emergentmethods.ai
📄 License
This model is licensed under the Apache 2.0 license.