# 🚀 GoLLIE 7B
GoLLIE is a Large Language Model trained to follow annotation guidelines, excelling in zero-shot Information Extraction and enabling on-the-fly inference with custom annotation schemas.
## 🚀 Quick Start
To get started with GoLLIE, please read our 🚀 Example Jupyter Notebooks. The best way to load the model is using our custom `load_model` function. However, you can also load it using the `AutoModelForCausalLM` class.
### ⚠️ Important Note
Our flash attention implementation has small numerical differences compared to the attention implementation in Hugging Face Transformers. You must load the model with the flag `trust_remote_code=True` or you will get inferior results. Flash attention requires an available CUDA GPU; running GoLLIE pre-trained models on a CPU is not supported. We plan to address this in future releases. First, install Flash Attention 2:
```bash
pip install flash-attn --no-build-isolation
pip install git+https://github.com/HazyResearch/flash-attention.git#subdirectory=csrc/rotary
```
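Since flash attention requires a CUDA GPU, it is worth verifying that one is visible before loading the model. A minimal sanity check using standard PyTorch calls:

```python
import torch

# GoLLIE's flash attention path does not support CPU inference; fail early
# with a clear message if no CUDA device is available.
if not torch.cuda.is_available():
    raise RuntimeError(
        "No CUDA GPU detected: GoLLIE pre-trained models cannot run on CPU."
    )
print(f"Using GPU: {torch.cuda.get_device_name(0)}")
```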
Then you can load the model:

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("HiTZ/GoLLIE-7B")
model = AutoModelForCausalLM.from_pretrained(
    "HiTZ/GoLLIE-7B",
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
)
model.to("cuda")
```
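Once loaded, inference follows the standard Hugging Face generation API. A minimal sketch using a placeholder prompt (a real GoLLIE prompt contains the guideline classes and the input text, and ends with `result = [` so the model continues with the extracted annotations; see the usage example below):

```python
# Placeholder prompt for illustration; see the Usage Examples section for
# how a real GoLLIE prompt is structured.
prompt = "...\nresult = ["

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    output = model.generate(
        **inputs,
        max_new_tokens=128,
        do_sample=False,  # greedy decoding for deterministic extraction
    )
print(tokenizer.decode(output[0], skip_special_tokens=True))
```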
Read our 🚀 Example Jupyter Notebooks to learn how to easily define guidelines, generate model inputs and parse the output!
## ✨ Features
- **Schema-based Inference**: Labels are represented as Python classes, and guidelines are introduced as docstrings, enabling flexible and precise information extraction.
- **Zero-shot Information Extraction**: Outperforms previous approaches in zero-shot scenarios, allowing users to perform inference with annotation schemas defined on the fly.
- **Following Detailed Definitions**: Unlike previous approaches, GoLLIE can follow detailed definitions and does not rely solely on the knowledge already encoded in the LLM.
## 📦 Installation
First, install Flash Attention 2:

```bash
pip install flash-attn --no-build-isolation
pip install git+https://github.com/HazyResearch/flash-attention.git#subdirectory=csrc/rotary
```
## 💻 Usage Examples

### Basic Usage
The labels are represented as Python classes, and the guidelines or instructions are introduced as docstrings. The model starts generating after the `result = [` line.
```python
from dataclasses import dataclass
from typing import List

# Template is provided by the GoLLIE codebase; see the 🚀 Example Jupyter
# Notebooks for the exact import.
from src.tasks.utils_typing import Template


@dataclass
class Launcher(Template):
    """Refers to a vehicle designed primarily to transport payloads from the Earth's
    surface to space. Launchers can carry various payloads, including satellites,
    crewed spacecraft, and cargo, into various orbits or even beyond Earth's orbit.
    They are usually multi-stage vehicles that use rocket engines for propulsion."""

    mention: str
    """
    The name of the launcher vehicle.
    Such as: "Saturn V", "Atlas V", "Soyuz", "Ariane 5"
    """
    space_company: str  # The company that operates the launcher, such as "SpaceX".
    crew: List[str]  # Names of the crew members boarding the launcher.


@dataclass
class Mission(Template):
    """Any planned or accomplished journey beyond Earth's atmosphere with specific objectives,
    either crewed or uncrewed. It includes missions to satellites, the International
    Space Station (ISS), other celestial bodies, and deep space."""

    mention: str
    """
    The name of the mission.
    Such as: "Apollo 11", "Artemis", "Mercury"
    """
    date: str  # The date of the mission.
    departure: str  # The place from which the vehicle is launched.
    destination: str  # The place or celestial body the mission is directed to.


text = (
    "The Ares 3 mission to Mars is scheduled for 2032. The Starship rocket built "
    "by SpaceX will take off from Boca Chica, carrying the astronauts Max "
    "Rutherford, Elena Soto, and Jake Martinez."
)

result = [
    Mission(mention='Ares 3', date='2032', departure='Boca Chica', destination='Mars'),
    Launcher(mention='Starship', space_company='SpaceX', crew=['Max Rutherford', 'Elena Soto', 'Jake Martinez']),
]
```
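To run this example end to end, the class definitions, the input text, and the `result = [` trigger are serialized into a single prompt, and the generated continuation is parsed back into dataclass instances. Below is a minimal sketch of that flow, reusing the model and tokenizer from the Quick Start; the 🚀 Example Jupyter Notebooks provide more robust prompt-building and parsing utilities:

```python
import inspect

# Serialize the guideline classes, the input text, and the trigger line into
# one prompt; the model's continuation is the content of the result list.
prompt = (
    inspect.getsource(Launcher)
    + "\n"
    + inspect.getsource(Mission)
    + f'\ntext = "{text}"\n'
    + "result = ["
)

model_input = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**model_input, max_new_tokens=256, do_sample=False)

# Decode only the newly generated tokens (everything after the prompt).
generated = tokenizer.decode(
    output[0][model_input["input_ids"].shape[1]:], skip_special_tokens=True
)

# Re-attach the opening bracket and evaluate up to the last closing bracket.
# NOTE: eval on model output is for illustration only; use the parsing
# utilities from the notebooks in practice.
result = eval("[" + generated[: generated.rfind("]") + 1])
print(result)
```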
## 📚 Documentation

### Model Card for GoLLIE 7B
We present GoLLIE, a Large Language Model trained to follow annotation guidelines. GoLLIE outperforms previous approaches on zero-shot Information Extraction and allows the user to perform inference with annotation schemas defined on the fly. Unlike previous approaches, GoLLIE is able to follow detailed definitions and does not rely solely on the knowledge already encoded in the LLM.
### Training Data

The full list of tasks used for training and evaluating GoLLIE can be found in our 📖 Paper. However, as demonstrated in the 🚀 Create Custom Task notebook, GoLLIE can perform a wide range of unseen tasks.
### Evaluation

| Model | Supervised average F1 | Zero-shot average F1 | 🤗 HuggingFace Hub |
|---|---|---|---|
| GoLLIE-7B | 73.0 | 55.3 | [HiTZ/GoLLIE-7B](https://huggingface.co/HiTZ/GoLLIE-7B) |
| GoLLIE-13B | 73.9 | 56.0 | [HiTZ/GoLLIE-13B](https://huggingface.co/HiTZ/GoLLIE-13B) |
| GoLLIE-34B | 75.0 | 57.2 | [HiTZ/GoLLIE-34B](https://huggingface.co/HiTZ/GoLLIE-34B) |
### Environmental Impact

| Model | Hardware | FLOPs | Time (h) | CO2eq (kg) |
|---|---|---|---|---|
| GoLLIE 7B | 1xA100 | 11.9e18 | 44.5 | 1.57 |
| GoLLIE 13B | 1xA100 | 22.7e18 | 79.5 | 2.80 |
| GoLLIE 34B | 2xA100 | 55.8e18 | 94.6 | 6.67 |
## 🔧 Technical Details

The labels are represented as Python classes, and the guidelines or instructions are introduced as docstrings. The model starts generating after the `result = [` line, enabling schema-based information extraction.
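Concretely, the prompt the model conditions on is ordinary Python source: the class definitions, the input sentence, and the trigger line. Everything the model emits after `result = [` is the structured output. An abridged illustration of the prompt layout, held in a string:

```python
# Abridged view of a serialized GoLLIE prompt; the model's task is to
# continue the text after the final "result = [" line.
prompt = '''\
@dataclass
class Launcher(Template):
    """Refers to a vehicle designed primarily to transport payloads ..."""
    mention: str
    space_company: str
    crew: List[str]

text = "The Ares 3 mission to Mars is scheduled for 2032. ..."

result = ['''
```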
## 📄 License

LLaMA2 License for the base and merged model. Apache 2.0 for pre-trained LoRA adapters.
## Citation

```bibtex
@misc{sainz2023gollie,
    title={GoLLIE: Annotation Guidelines improve Zero-Shot Information-Extraction},
    author={Oscar Sainz and Iker García-Ferrero and Rodrigo Agerri and Oier Lopez de Lacalle and German Rigau and Eneko Agirre},
    year={2023},
    eprint={2310.03668},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}
```