🚀 How to Use
This guide provides step-by-step instructions on using the Transformer model for question-answering tasks. It covers installation requirements, using pipelines, and a manual approach with both PyTorch and TensorFlow 2.X.
📦 Installation
Transformers require transformers
and sentencepiece
, both of which can be installed using pip
.
pip install transformers sentencepiece
💻 Usage Examples
🚀 Basic Usage with Pipelines
In case you are not familiar with Transformers, you can use pipelines instead. Note that, pipelines can't have no answer for the questions.
from transformers import pipeline
model_name = "SajjadAyoubi/lm-roberta-large-fa-qa"
qa_pipeline = pipeline("question-answering", model=model_name, tokenizer=model_name)
text = "سلام من سجاد ایوبی هستم ۲۰ سالمه و به پردازش زبان طبیعی علاقه دارم"
questions = ["اسمم چیه؟", "چند سالمه؟", "به چی علاقه دارم؟"]
for question in questions:
print(qa_pipeline({"context": text, "question": question}))
>>> {'score': 0.4839823544025421, 'start': 8, 'end': 18, 'answer': 'سجاد ایوبی'}
>>> {'score': 0.3747948706150055, 'start': 24, 'end': 32, 'answer': '۲۰ سالمه'}
>>> {'score': 0.5945395827293396, 'start': 38, 'end': 55, 'answer': 'پردازش زبان طبیعی'}
🔥 Advanced Usage with Manual Approach
Using the Manual approach, it is possible to have no answer with even better performance.
PyTorch
from transformers import AutoTokenizer, AutoModelForQuestionAnswering
from src.utils import AnswerPredictor
model_name = "SajjadAyoubi/lm-roberta-large-fa-qa"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForQuestionAnswering.from_pretrained(model_name)
text = "سلام من سجاد ایوبی هستم ۲۰ سالمه و به پردازش زبان طبیعی علاقه دارم"
questions = ["اسمم چیه؟", "چند سالمه؟", "به چی علاقه دارم؟"]
predictor = AnswerPredictor(model, tokenizer, device="cpu", n_best=10)
preds = predictor(questions, [text] * 3, batch_size=3)
for k, v in preds.items():
print(v)
Produces an output such below:
100%|██████████| 1/1 [00:00<00:00, 3.56it/s]
{'score': 8.040637016296387, 'text': 'سجاد ایوبی'}
{'score': 9.901972770690918, 'text': '۲۰'}
{'score': 12.117212295532227, 'text': 'پردازش زبان طبیعی'}
TensorFlow 2.X
from transformers import AutoTokenizer, TFAutoModelForQuestionAnswering
from src.utils import TFAnswerPredictor
model_name = "SajjadAyoubi/lm-roberta-large-fa-qa"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = TFAutoModelForQuestionAnswering.from_pretrained(model_name)
text = "سلام من سجاد ایوبی هستم ۲۰ سالمه و به پردازش زبان طبیعی علاقه دارم"
questions = ["اسمم چیه؟", "چند سالمه؟", "به چی علاقه دارم؟"]
predictor = TFAnswerPredictor(model, tokenizer, n_best=10)
preds = predictor(questions, [text] * 3, batch_size=3)
for k, v in preds.items():
print(v)
Produces an output such below:
100%|██████████| 1/1 [00:00<00:00, 3.56it/s]
{'score': 8.040637016296387, 'text': 'سجاد ایوبی'}
{'score': 9.901972770690918, 'text': '۲۰'}
{'score': 12.117212295532227, 'text': 'پردازش زبان طبیعی'}
🔗 Additional Resources
Or you can access the whole demonstration using HowToUse iPython Notebook on Google Colab