# ULTRA: A Foundation Model for Knowledge Graph Reasoning
ULTRA is a foundation model for knowledge graph (KG) reasoning that offers a unified solution for link prediction on multi-relational graphs. A single pre-trained ULTRA model can handle any multi-relational graph with any entity/relation vocabulary, and on average across 50+ KGs it outperforms many state-of-the-art (SOTA) models in zero-shot inference mode. You can run a pre-trained ULTRA checkpoint on a new graph immediately in zero-shot mode, or fine-tune it for further gains.

ULTRA provides unified, learnable, and transferable representations for any KG. It combines graph neural networks with a modified NBFNet to obtain relative relation representations from relation interactions, so it does not learn graph-specific entity or relation embeddings.
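To give a flavor of what "relation interactions" means, here is a minimal, illustrative sketch (not ULTRA's actual implementation): build a graph over relations whose edges record how relations co-occur through shared head/tail entities; a GNN over such a relation graph can then produce relative relation representations without any pre-assigned relation vocabulary. The function name, edge types, and data format below are assumptions for illustration only.

```python
# Illustrative sketch only: one simple way to extract relation interactions
# from raw triples. ULTRA's actual implementation lives in the GitHub repo.
from collections import defaultdict

def relation_interaction_edges(triples):
    """triples: iterable of (head, relation, tail) tuples.

    Returns edges between relations that share entities, labeled by how
    they interact (e.g., the tail of one relation is the head of another).
    """
    heads, tails = defaultdict(set), defaultdict(set)
    for h, r, t in triples:
        heads[r].add(h)
        tails[r].add(t)

    edges = set()
    for r1 in heads:
        for r2 in heads:
            if r1 == r2:
                continue
            if heads[r1] & heads[r2]:
                edges.add((r1, "head-to-head", r2))
            if heads[r1] & tails[r2]:
                edges.add((r1, "head-to-tail", r2))
            if tails[r1] & heads[r2]:
                edges.add((r1, "tail-to-head", r2))
            if tails[r1] & tails[r2]:
                edges.add((r1, "tail-to-tail", r2))
    return edges

# Example: "capital_of" and "located_in" interact because they share entities.
triples = [("paris", "capital_of", "france"), ("paris", "located_in", "france")]
print(relation_interaction_edges(triples))
```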
## Features
- Unified representation: offers unified, learnable, and transferable representations for any knowledge graph.
- Zero-shot inference: a single pre-trained model can perform link prediction on any multi-relational graph in zero-shot mode.
- High performance: on average across 50+ KGs, outperforms many SOTA models in zero-shot inference mode.
## Installation
- Install the dependencies listed in the installation instructions on the GitHub repo.
- Clone this model repo to get the `UltraForKnowledgeGraphReasoning` class in `modeling.py` and load the checkpoint (all the necessary model code is included in this model repo as well).
## Usage Examples
### Basic Usage
```python
from modeling import UltraForKnowledgeGraphReasoning
from ultra.datasets import CoDExSmall
from ultra.eval import test

# Load the pre-trained ULTRA checkpoint from the Hugging Face Hub
model = UltraForKnowledgeGraphReasoning.from_pretrained("mgalkin/ultra_50g")

# Download and preprocess the CoDEx-Small dataset
dataset = CoDExSmall(root="./datasets/")

# Zero-shot evaluation on the test split (gpus=None runs on CPU)
test(model, mode="test", dataset=dataset, gpus=None)
```
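The `test` helper evaluates the model on the dataset's test split and reports ranking metrics; judging from the results below, these include MRR and Hits@10.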
### Advanced Usage
```python
from transformers import AutoModel
from ultra.datasets import CoDExSmall
from ultra.eval import test

# Load ULTRA through the Transformers AutoModel API
model = AutoModel.from_pretrained("mgalkin/ultra_50g", trust_remote_code=True)

dataset = CoDExSmall(root="./datasets/")
test(model, mode="test", dataset=dataset, gpus=None)
```
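`trust_remote_code=True` is needed because the ULTRA model class is defined in this repo's `modeling.py` rather than in the `transformers` library itself; apart from that, both loading paths should give you the same model.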
## Documentation
### Checkpoints
On Hugging Face, we provide 3 pre-trained ULTRA checkpoints (each ~169k parameters) with different amounts of pre-training data.
- `ultra_3g` and `ultra_4g` are the PyG models reported in the GitHub repo.
- `ultra_50g` is a new ULTRA checkpoint pre-trained on 50 different KGs (transductive and inductive) for 1M steps to maximize performance on any unseen downstream KG.
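All three checkpoints can be loaded the same way as in the usage examples above; the snippet below assumes the `ultra_3g` and `ultra_4g` checkpoints live under the same `mgalkin/` namespace as `ultra_50g`.

```python
from modeling import UltraForKnowledgeGraphReasoning

# Repo IDs for ultra_3g / ultra_4g are assumed to follow the ultra_50g naming pattern
checkpoints = ["mgalkin/ultra_3g", "mgalkin/ultra_4g", "mgalkin/ultra_50g"]
models = {name: UltraForKnowledgeGraphReasoning.from_pretrained(name) for name in checkpoints}
```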
### Performance
Averaged zero-shot performance of ultra-3g and ultra-4g:
| Model | Inductive (e) (18 graphs): Avg MRR | Inductive (e) (18 graphs): Avg Hits@10 | Inductive (e,r) (23 graphs): Avg MRR | Inductive (e,r) (23 graphs): Avg Hits@10 | Transductive (16 graphs): Avg MRR | Transductive (16 graphs): Avg Hits@10 |
|---|---|---|---|---|---|---|
| ULTRA (3g) PyG | 0.420 | 0.562 | 0.344 | 0.511 | 0.329 | 0.479 |
| ULTRA (4g) PyG | 0.444 | 0.588 | 0.344 | 0.513 | WIP | WIP |
| ULTRA (50g) PyG (pre-trained on 50 KGs) | 0.444 | 0.580 | 0.395 | 0.554 | 0.389 | 0.549 |
Fine-tuning ULTRA on specific graphs brings, on average, a further 10% relative performance boost in both MRR and Hits@10. See the paper for more comparisons.
### ULTRA 50g Performance
ULTRA 50g was pre-trained on 50 graphs, so the zero-shot evaluation protocol does not really apply to those graphs. We can, however, compare it with supervised SOTA models trained from scratch on each dataset:
| Model | Avg MRR, Transductive graphs (16) | Avg Hits@10, Transductive graphs (16) |
|---|---|---|
| Supervised SOTA models | 0.371 | 0.511 |
| ULTRA 50g (single model) | 0.389 | 0.549 |
In other words, instead of training a large KG embedding model on your graph from scratch, consider running ULTRA (any of the checkpoints): its performance might already be higher.
## License
This project is licensed under the MIT license.