# LinkBERT-large
LinkBERT-large is a pre-trained language model that leverages English Wikipedia articles together with their hyperlink information. It offers strong performance across NLP tasks, especially knowledge-intensive and cross-document ones.
## Quick Start
LinkBERT-large is pre-trained on English Wikipedia articles with hyperlink information. It was introduced in the paper LinkBERT: Pretraining Language Models with Document Links (ACL 2022). The code and data can be found in this repository.
## Features
- Document Link Awareness: LinkBERT is an improved transformer encoder (similar to BERT) that captures document links such as hyperlinks and citation links, incorporating knowledge across multiple documents.
- Versatile Application: It can serve as a drop-in replacement for BERT, performing well on general language understanding tasks and excelling at knowledge-intensive and cross-document tasks.
## Installation
LinkBERT-large is loaded through the Hugging Face Transformers library; installing `transformers` (e.g., `pip install transformers`) together with PyTorch is sufficient to run the examples below.
## Usage Examples

### Basic Usage
To use the model to get the features of a given text in PyTorch:
```python
from transformers import AutoTokenizer, AutoModel

# Load the LinkBERT-large tokenizer and encoder from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained('michiyasunaga/LinkBERT-large')
model = AutoModel.from_pretrained('michiyasunaga/LinkBERT-large')

# Tokenize an example sentence and run it through the encoder
inputs = tokenizer("Hello, my dog is cute", return_tensors="pt")
outputs = model(**inputs)

# One contextual embedding per input token
last_hidden_states = outputs.last_hidden_state
```
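`last_hidden_state` contains one contextual embedding per input token; since LinkBERT-large follows the BERT-large architecture, each embedding is 1024-dimensional.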
### Advanced Usage
For fine-tuning, you can use the LinkBERT repository or any standard BERT fine-tuning codebase; a minimal sketch is shown below.
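As a concrete illustration, here is a minimal fine-tuning sketch using the Hugging Face `Trainer` on a GLUE-style sentence classification task. The dataset choice, hyperparameters, and output directory are placeholders rather than settings from the LinkBERT paper or repository.

```python
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          Trainer, TrainingArguments)
from datasets import load_dataset

# Placeholder dataset and hyperparameters -- adjust for your own task.
dataset = load_dataset("glue", "sst2")
tokenizer = AutoTokenizer.from_pretrained("michiyasunaga/LinkBERT-large")

def tokenize(batch):
    return tokenizer(batch["sentence"], truncation=True,
                     padding="max_length", max_length=128)

encoded = dataset.map(tokenize, batched=True)

# LinkBERT uses the BERT architecture, so the standard classification head applies.
model = AutoModelForSequenceClassification.from_pretrained(
    "michiyasunaga/LinkBERT-large", num_labels=2
)

args = TrainingArguments(
    output_dir="linkbert-finetuned",   # placeholder output directory
    per_device_train_batch_size=16,
    learning_rate=2e-5,
    num_train_epochs=3,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=encoded["train"],
    eval_dataset=encoded["validation"],
)
trainer.train()
```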
## Documentation

### Model description
LinkBERT is a transformer encoder (BERT-like) model pre-trained on a large corpus of documents. It enhances BERT by capturing document links such as hyperlinks and citation links, integrating knowledge that spans multiple documents. Specifically, it was pre-trained by placing linked documents in the same language model context, in addition to single documents.
LinkBERT can be used as a drop-in replacement for BERT. It performs better on general language understanding tasks (e.g., text classification), and is particularly effective for knowledge-intensive tasks (e.g., question answering) and cross-document tasks (e.g., reading comprehension, document retrieval).
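As a rough illustration of the pretraining idea (not the actual pretraining code), the snippet below packs an anchor passage and a hypothetical hyperlinked passage into a single input context as a sequence pair; the example texts are made up.

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("michiyasunaga/LinkBERT-large")

# Illustrative (made-up) texts: an anchor passage and a linked passage.
anchor = "Tidal forces are gravitational effects that stretch a body toward another body."
linked = "Gravity is the attraction between objects with mass."

# Pack both passages into one context as a sequence pair; the actual pretraining
# pipeline builds such multi-document contexts at scale from hyperlink data.
inputs = tokenizer(anchor, linked, return_tensors="pt")
print(tokenizer.decode(inputs["input_ids"][0]))  # [CLS] anchor ... [SEP] linked ... [SEP]
```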
### Intended uses & limitations
The model can be fine-tuned for downstream tasks such as question answering, sequence classification, and token classification. You can also use the raw model for feature extraction (i.e., obtaining embeddings for input text).
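For instance, task-specific heads can be attached by loading the checkpoint into the corresponding `AutoModelFor...` classes. The sketch below is illustrative: the newly initialized heads must be fine-tuned before they produce meaningful predictions, and the `num_labels` value is a placeholder.

```python
from transformers import AutoModelForQuestionAnswering, AutoModelForTokenClassification

# Attach task-specific heads to the pre-trained encoder.
# The heads are randomly initialized, so fine-tuning is required before inference.
qa_model = AutoModelForQuestionAnswering.from_pretrained("michiyasunaga/LinkBERT-large")
ner_model = AutoModelForTokenClassification.from_pretrained(
    "michiyasunaga/LinkBERT-large", num_labels=9  # placeholder label count
)
```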
## Technical Details

### Evaluation results
| Property | Details |
|----------|---------|
| Model Type | LinkBERT-large |
| Training Data | English Wikipedia articles with hyperlink information |

When fine-tuned on downstream tasks, LinkBERT achieves the following results.

General benchmarks (MRQA and GLUE):
| Model | HotpotQA (F1) | TriviaQA (F1) | SearchQA (F1) | NaturalQ (F1) | NewsQA (F1) | SQuAD (F1) | GLUE (avg score) |
|-------|---------------|---------------|---------------|---------------|-------------|------------|------------------|
| BERT-base | 76.0 | 70.3 | 74.2 | 76.5 | 65.7 | 88.7 | 79.2 |
| LinkBERT-base | 78.2 | 73.9 | 76.8 | 78.3 | 69.3 | 90.1 | 79.6 |
| BERT-large | 78.1 | 73.7 | 78.3 | 79.0 | 70.9 | 91.1 | 80.7 |
| LinkBERT-large | **80.8** | **78.2** | **80.5** | **81.0** | **72.6** | **92.7** | **81.1** |
## License
The model is released under the Apache 2.0 license.
## Citation
If you find LinkBERT useful in your project, please cite the following:
```bibtex
@InProceedings{yasunaga2022linkbert,
  author    = {Michihiro Yasunaga and Jure Leskovec and Percy Liang},
  title     = {LinkBERT: Pretraining Language Models with Document Links},
  year      = {2022},
  booktitle = {Association for Computational Linguistics (ACL)},
}
```