Non-Factoid Question Category Classification in English
This project classifies English questions into non-factoid question categories, using a text-classification model trained on the NFQA dataset.
Quick Start
The project provides a model for non-factoid question category classification. The model is trained on the NFQA dataset. Its base model is [roberta-base-squad2](https://huggingface.co/deepset/roberta-base-squad2), a RoBERTa-based question-answering model fine-tuned on the SQuAD2.0 dataset.
Repository: [https://github.com/Lurunchik/NF-CATS](https://github.com/Lurunchik/NF-CATS)
Features
- The model uses labels such as `NOT-A-QUESTION`, `FACTOID`, `DEBATE`, `EVIDENCE-BASED`, `INSTRUCTION`, `REASON`, `EXPERIENCE`, and `COMPARISON`.
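As a rough illustration of how the labels listed above might be consumed downstream, the sketch below checks whether a predicted label denotes a non-factoid question. The names `NFQA_LABELS`, `NOT_NON_FACTOID`, and `is_non_factoid` are hypothetical helpers, not part of the project. Treating `NOT-A-QUESTION` and `FACTOID` as the only labels that are not non-factoid categories is an assumption based on the label names alone.

```python
# Labels as listed in this README. The tuple order is NOT the model's
# index order; use nfqa_model.config.id2label for the real mapping.
NFQA_LABELS = (
    "NOT-A-QUESTION",
    "FACTOID",
    "DEBATE",
    "EVIDENCE-BASED",
    "INSTRUCTION",
    "REASON",
    "EXPERIENCE",
    "COMPARISON",
)

# Assumption: these two labels do not denote a non-factoid question.
NOT_NON_FACTOID = frozenset({"NOT-A-QUESTION", "FACTOID"})


def is_non_factoid(label: str) -> bool:
    """Return True if the label names a non-factoid question category."""
    if label not in NFQA_LABELS:
        raise ValueError(f"Unknown NFQA label: {label!r}")
    return label not in NOT_NON_FACTOID
```

A helper like this could be used, for example, to route only non-factoid questions to a long-form answering component.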
Installation
The installation mainly involves loading the model and its tokenizer.
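from transformers import AutoTokenizer
# RobertaNFQAClassification is defined in the project's nfqa_model module
from nfqa_model import RobertaNFQAClassification

# Load the fine-tuned classifier and the tokenizer of its base model
nfqa_model = RobertaNFQAClassification.from_pretrained("Lurunchik/nf-cats")
nfqa_tokenizer = AutoTokenizer.from_pretrained("deepset/roberta-base-squad2")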
Usage Examples
Basic Usage
from transformers import AutoTokenizer
from nfqa_model import RobertaNFQAClassification

nfqa_model = RobertaNFQAClassification.from_pretrained("Lurunchik/nf-cats")
nfqa_tokenizer = AutoTokenizer.from_pretrained("deepset/roberta-base-squad2")
Advanced Usage
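def get_nfqa_category_prediction(text):
    # Tokenize the question and run it through the classifier
    output = nfqa_model(**nfqa_tokenizer(text, return_tensors="pt"))
    # Take the highest-scoring class and map its index to a label name
    index = output.logits.argmax()
    return nfqa_model.config.id2label[int(index)]

get_nfqa_category_prediction('how to assign category?')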
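The function above returns only the top label. If a confidence score per category is useful, the post-processing step can be sketched in plain Python as a softmax over the logits followed by a sort. `rank_categories` is a hypothetical helper, and the `example_id2label` mapping and `example_logits` values below are made up for illustration; they do not reflect the model's real label order or outputs.

```python
import math
from typing import Dict, List, Sequence, Tuple


def rank_categories(
    logits: Sequence[float], id2label: Dict[int, str]
) -> List[Tuple[str, float]]:
    """Turn raw logits into (label, probability) pairs, most likely first."""
    # Subtract the maximum logit before exponentiating for numerical stability
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Pair each probability with its label and sort in descending order
    pairs = [(id2label[i], p) for i, p in enumerate(probs)]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


# Illustrative inputs only: not the model's actual mapping or scores
example_id2label = {0: "FACTOID", 1: "INSTRUCTION", 2: "REASON"}
example_logits = [0.2, 2.5, -1.0]
ranked = rank_categories(example_logits, example_id2label)
```

With the real model, and assuming `output.logits` has shape `(1, num_labels)` for a single input, the call would look like `rank_categories(output.logits[0].tolist(), nfqa_model.config.id2label)`.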
Documentation
You can try the model in its [Hugging Face Space](https://huggingface.co/spaces/Lurunchik/nf-cats).
License
This project is licensed under the MIT license.
Citation
If you use NFQA-cats in your work, please cite this paper:
@misc{bolotova2022nfcats,
author = {Bolotova, Valeriia and Blinov, Vladislav and Scholer, Falk and Croft, W. Bruce and Sanderson, Mark},
title = {A Non-Factoid Question-Answering Taxonomy},
year = {2022},
isbn = {9781450387323},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3477495.3531926},
doi = {10.1145/3477495.3531926},
booktitle = {Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval},
pages = {1196–1207},
numpages = {12},
keywords = {question taxonomy, non-factoid question-answering, editorial study, dataset analysis},
location = {Madrid, Spain},
series = {SIGIR '22}
}
Enjoy!