Rethinking Negative Instances for Generative Named Entity Recognition
This project introduces GNER, a Generative Named Entity Recognition framework with enhanced zero-shot capabilities across unseen entity domains. By integrating negative instances into training, the resulting GNER-LLaMA and GNER-T5 models outperform state-of-the-art approaches by 8 and 11 points in $F_1$ score, respectively. The code and models are publicly available.
🚀 Quick Start
We introduce GNER, a Generative Named Entity Recognition framework, which demonstrates enhanced zero-shot capabilities across unseen entity domains. Experiments on two representative generative models, i.e., LLaMA and Flan-T5, show that the integration of negative instances into the training process yields substantial performance enhancements. The resulting models, GNER-LLaMA and GNER-T5, outperform state-of-the-art (SoTA) approaches by a large margin, achieving improvements of 8 and 11 points in $F_1$ score, respectively.

✨ Features
- Enhanced Zero-shot Capabilities: GNER shows improved performance in zero-shot scenarios across unseen entity domains.
- Performance Boost: Integrating negative instances into training leads to significant $F_1$ improvements over state-of-the-art approaches.
- Multiple Model Variants: Five GNER models are released, based on LLaMA (7B) and Flan-T5 (base, large, xl, and xxl).
📦 Installation
Install the dependencies:
pip install torch datasets deepspeed accelerate transformers protobuf
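As an optional sanity check (this snippet is our addition, not part of the original instructions), you can confirm that the core packages import and that a GPU is visible:

```python
import torch
import transformers

# Verify versions and GPU availability before running inference
print("torch:", torch.__version__)
print("transformers:", transformers.__version__)
print("CUDA available:", torch.cuda.is_available())
```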
💻 Usage Examples
Basic Usage
Please check out the Example Jupyter Notebooks for guidance on using GNER models.
Advanced Usage
A simple inference example using GNER-LLaMA is as follows:
>>> import torch
>>> from transformers import AutoTokenizer, AutoModelForCausalLM
>>> tokenizer = AutoTokenizer.from_pretrained("dyyyyyyyy/GNER-LLaMA-7B")
>>> model = AutoModelForCausalLM.from_pretrained("dyyyyyyyy/GNER-LLaMA-7B", torch_dtype=torch.bfloat16).cuda()
>>> model = model.eval()
>>> instruction_template = "Please analyze the sentence provided, identifying the type of entity for each word on a token-by-token basis.\nOutput format is: word_1(label_1), word_2(label_2), ...\nWe'll use the BIO-format to label the entities, where:\n1. B- (Begin) indicates the start of a named entity.\n2. I- (Inside) is used for words within a named entity but are not the first word.\n3. O (Outside) denotes words that are not part of a named entity.\n"
>>> sentence = "did george clooney make a musical in the 1980s"
>>> entity_labels = ["genre", "rating", "review", "plot", "song", "average ratings", "director", "character", "trailer", "year", "actor", "title"]
>>> instruction = f"{instruction_template}\nUse the specific entity tags: {', '.join(entity_labels)} and O.\nSentence: {sentence}"
>>> instruction = f"[INST] {instruction} [/INST]"
>>> inputs = tokenizer(instruction, return_tensors="pt").to("cuda")
>>> outputs = model.generate(**inputs, max_new_tokens=640)
>>> response = tokenizer.decode(outputs[0], skip_special_tokens=True)
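>>> # Keep only the model's generation after the [/INST] marker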
>>> response = response[response.find("[/INST]") + len("[/INST]"):].strip()
>>> print(response)
"did(O) george(B - actor) clooney(I - actor) make(O) a(O) musical(B - genre) in(O) the(O) 1980s(B - year)"
📖 Documentation
Pretrained Models
We release five GNER models based on LLaMA (7B) and Flan-T5 (base, large, xl, and xxl).
| Model | # Params | Zero-shot Average $F_1$ | Supervised Average $F_1$ | 🤗 HuggingFace Download Link |
|---|---|---|---|---|
| GNER-LLaMA | 7B | 66.1 | 86.09 | link |
| GNER-T5-base | 248M | 59.5 | 83.21 | link |
| GNER-T5-large | 783M | 63.5 | 85.45 | link |
| GNER-T5-xl | 3B | 66.1 | 85.94 | link |
| GNER-T5-xxl | 11B | 69.1 | 86.15 | link |
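The GNER-T5 variants are sequence-to-sequence models, so they load through `AutoModelForSeq2SeqLM` and take the instruction directly rather than wrapped in `[INST]` markers. The sketch below adapts the LLaMA example under those assumptions (checkpoint name taken from the table above); consult the model cards for the exact prompt format:

```python
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Sketch only: assumes the GNER-T5 checkpoints accept the same instruction
# text as the LLaMA example, minus the [INST]/[/INST] chat markers.
tokenizer = AutoTokenizer.from_pretrained("dyyyyyyyy/GNER-T5-base")
model = AutoModelForSeq2SeqLM.from_pretrained("dyyyyyyyy/GNER-T5-base", torch_dtype=torch.bfloat16).cuda()
model = model.eval()

instruction = "..."  # build the instruction exactly as in the LLaMA example above
inputs = tokenizer(instruction, return_tensors="pt").to("cuda")
outputs = model.generate(**inputs, max_new_tokens=640)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```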
📄 License
This project is licensed under the Apache-2.0 license.
📝 Citation
@misc{ding2024rethinking,
      title={Rethinking Negative Instances for Generative Named Entity Recognition},
      author={Yuyang Ding and Juntao Li and Pinzheng Wang and Zecheng Tang and Bowen Yan and Min Zhang},
      year={2024},
      eprint={2402.16602},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}