Nominalization Detector
This model is designed to identify "predicative nominalizations", which are nominalizations carrying an eventive (or "verbal") meaning in context. It offers a practical solution for nominalization detection tasks.
Quick Start
This model is bert-base-cased fine-tuned for token classification on the "nominalization detection" task defined and annotated by the QANom project (Klein et al., COLING 2020).
Features
- Binary Classification: The model serves as a binary classifier for classifying candidate nominalizations.
- Candidate Extraction: It uses a POS tagger and lexical resources (such as WordNet and CatVar) to extract candidates.
- Full Pipeline Encapsulation: The qanom.nominalization_detector.NominalizationDetector class encapsulates the full nominalization detection pipeline.
Installation
The qanom package, which includes the candidate extraction algorithm, is available via pip install qanom.
Usage Examples
Basic Usage
The candidate extraction algorithm is implemented inside the qanom package. For full documentation, refer to the README in the QANom GitHub repo.
The full nominalization detection pipeline is encapsulated in the qanom.nominalization_detector.NominalizationDetector class:
from qanom.nominalization_detector import NominalizationDetector
detector = NominalizationDetector()
raw_sentences = ["The construction of the officer 's building finished right after the beginning of the destruction of the previous construction ."]
print(detector(raw_sentences, return_all_candidates=True))
print(detector(raw_sentences, threshold=0.75, return_probability=False))
Outputs:
[[{'predicate_idx': 1,
'predicate': 'construction',
'predicate_detector_prediction': True,
'predicate_detector_probability': 0.7626778483390808,
'verb_form': 'construct'},
{'predicate_idx': 4,
'predicate': 'officer',
'predicate_detector_prediction': False,
'predicate_detector_probability': 0.19832570850849152,
'verb_form': 'officer'},
{'predicate_idx': 6,
'predicate': 'building',
'predicate_detector_prediction': True,
'predicate_detector_probability': 0.5794129371643066,
'verb_form': 'build'},
{'predicate_idx': 11,
'predicate': 'beginning',
'predicate_detector_prediction': True,
'predicate_detector_probability': 0.8937646150588989,
'verb_form': 'begin'},
{'predicate_idx': 14,
'predicate': 'destruction',
'predicate_detector_prediction': True,
'predicate_detector_probability': 0.8501205444335938,
'verb_form': 'destruct'},
{'predicate_idx': 18,
'predicate': 'construction',
'predicate_detector_prediction': True,
'predicate_detector_probability': 0.7022264003753662,
'verb_form': 'construct'}]]
[[{'predicate_idx': 1, 'predicate': 'construction', 'verb_form': 'construct'},
{'predicate_idx': 11, 'predicate': 'beginning', 'verb_form': 'begin'},
{'predicate_idx': 14, 'predicate': 'destruction', 'verb_form': 'destruct'}]]
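The second output keeps only the candidates whose probability clears the requested threshold. The sketch below reproduces that selection from the first output. It assumes the threshold is applied as a plain >= comparison on predicate_detector_probability and that the default cutoff is 0.5. Both are inferences from the outputs shown here, not confirmed details of the qanom package, and filter_by_threshold is a hypothetical helper, not part of the library.

```python
# Candidate records copied from the first output above (probabilities rounded).
candidates = [
    {"predicate_idx": 1, "predicate": "construction",
     "predicate_detector_probability": 0.7627, "verb_form": "construct"},
    {"predicate_idx": 4, "predicate": "officer",
     "predicate_detector_probability": 0.1983, "verb_form": "officer"},
    {"predicate_idx": 6, "predicate": "building",
     "predicate_detector_probability": 0.5794, "verb_form": "build"},
    {"predicate_idx": 11, "predicate": "beginning",
     "predicate_detector_probability": 0.8938, "verb_form": "begin"},
    {"predicate_idx": 14, "predicate": "destruction",
     "predicate_detector_probability": 0.8501, "verb_form": "destruct"},
    {"predicate_idx": 18, "predicate": "construction",
     "predicate_detector_probability": 0.7022, "verb_form": "construct"},
]


def filter_by_threshold(candidates, threshold=0.5):
    """Hypothetical helper: keep candidates at or above the threshold.

    The 0.5 default and the >= comparison are assumptions for illustration.
    """
    kept = []
    for cand in candidates:
        if cand["predicate_detector_probability"] >= threshold:
            # Mirror the compact record shape of the second output above.
            kept.append({
                "predicate_idx": cand["predicate_idx"],
                "predicate": cand["predicate"],
                "verb_form": cand["verb_form"],
            })
    return kept


positives = filter_by_threshold(candidates, threshold=0.75)
print([c["predicate"] for c in positives])
# prints ['construction', 'beginning', 'destruction']
```

Raising the threshold trades recall for precision: at 0.75 the detector drops "building" and the second "construction", both of which it accepts at the assumed 0.5 default.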
Documentation
Task Description
The model is trained as a binary classifier over candidate nominalizations. Candidates are extracted using a POS tagger (keeping common nouns) and lexical resources (e.g., WordNet and CatVar), which keep only nouns that have at least one derivationally related verb. In the QANom annotation project, annotators decide whether each candidate carries a "verbal" meaning in the sentence context, and the current model reproduces this binary classification.
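The two-stage candidate filter described above can be sketched in a few lines. This is a toy illustration, not the qanom implementation: the POS tags are assigned by hand rather than by a tagger, TOY_LEXICON is a hypothetical stand-in for WordNet and CatVar whose entries mirror the verb forms in the example output, and extract_candidates is a hypothetical name.

```python
# Hand-tagged tokens of the example sentence (Penn Treebank style tags, assumed).
tagged_tokens = [
    ("The", "DT"), ("construction", "NN"), ("of", "IN"), ("the", "DT"),
    ("officer", "NN"), ("'s", "POS"), ("building", "NN"), ("finished", "VBD"),
    ("right", "RB"), ("after", "IN"), ("the", "DT"), ("beginning", "NN"),
    ("of", "IN"), ("the", "DT"), ("destruction", "NN"), ("of", "IN"),
    ("the", "DT"), ("previous", "JJ"), ("construction", "NN"), (".", "."),
]

# Toy stand-in for WordNet/CatVar: noun -> one derivationally related verb.
TOY_LEXICON = {
    "construction": "construct",
    "officer": "officer",
    "building": "build",
    "beginning": "begin",
    "destruction": "destruct",
}

COMMON_NOUN_TAGS = {"NN", "NNS"}


def extract_candidates(tagged_tokens):
    """Keep common nouns that have a related verb in the toy lexicon."""
    candidates = []
    for idx, (token, tag) in enumerate(tagged_tokens):
        if tag not in COMMON_NOUN_TAGS:
            continue  # stage 1: POS filter, common nouns only
        verb = TOY_LEXICON.get(token.lower())
        if verb is None:
            continue  # stage 2: lexical filter, needs a related verb
        candidates.append(
            {"predicate_idx": idx, "predicate": token, "verb_form": verb}
        )
    return candidates


for cand in extract_candidates(tagged_tokens):
    print(cand["predicate_idx"], cand["predicate"], cand["verb_form"])
```

Note that "officer" passes both filters because a related verb exists for it. Rejecting such candidates in context is the job of the classifier, not of the extraction step.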
Demo
Check out our cool [demo](https://huggingface.co/spaces/kleinay/nominalization-detection-demo)!
Cite
@inproceedings{klein2020qanom,
title={QANom: Question-Answer driven SRL for Nominalizations},
author={Klein, Ayal and Mamou, Jonathan and Pyatkin, Valentina and Stepanov, Daniela and He, Hangfeng and Roth, Dan and Zettlemoyer, Luke and Dagan, Ido},
booktitle={Proceedings of the 28th International Conference on Computational Linguistics},
pages={3069--3083},
year={2020}
}