xlm-roberta-large-fa-qa Open-source Persian Question Answering Model - Free Deployment and Optimization for QA Tasks

Xlm Roberta Large Fa Qa

Developed by SajjadAyoubi

A Persian Q&A model based on the RoBERTa architecture, optimized for Persian Q&A tasks

Question Answering System

Transformers

#Persian Q&A #No-answer detection #RoBERTa architecture

Downloads 141

Release Time : 3/2/2022

Model Overview

This is a large Persian question-answering model built on XLM-RoBERTa, the multilingual variant of the RoBERTa architecture. It extracts the answer to a question from a given text and can be called either manually or through a quick pipeline.

Model Features

Persian optimization

Specifically optimized for Persian, enabling better understanding and processing of Persian texts.

No-answer judgment support

In manual invocation mode, supports determining whether a question has no answer in the text.

High performance

The model excels in Persian Q&A tasks, accurately extracting answers from texts.

Model Capabilities

Persian text understanding

Q&A extraction

No-answer judgment

Use Cases

Education

Persian learning aid

Helps students quickly find answers to questions in Persian texts.

Improves learning efficiency and enables quick knowledge acquisition.

Customer service

Automated Q&A system

Used for automated Q&A in Persian customer service to quickly respond to customer inquiries.

Reduces the burden on human customer service and improves response speed.

🚀 How to Use

This guide provides step-by-step instructions on using the Transformer model for question-answering tasks. It covers installation requirements, using pipelines, and a manual approach with both PyTorch and TensorFlow 2.X.

📦 Installation

The model requires the transformers and sentencepiece packages, both of which can be installed with pip.

pip install transformers sentencepiece
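
As a quick sanity check after installation, the following minimal sketch reports whether the two packages are present and at which version. The helper name `installed_versions` is hypothetical; it relies only on the Python standard library.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The distribution is not installed in this environment.
            versions[name] = None
    return versions


if __name__ == "__main__":
    required = ["transformers", "sentencepiece"]
    for name, version in installed_versions(required).items():
        print(f"{name}: {version or 'NOT INSTALLED'}")
```

If either line reports NOT INSTALLED, rerun the pip command above in the same environment.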

💻 Usage Examples

🚀 Basic Usage with Pipelines

If you are not familiar with Transformers, you can use pipelines instead. Note that pipelines cannot return "no answer" for a question.

from transformers import pipeline

model_name = "SajjadAyoubi/xlm-roberta-large-fa-qa"
qa_pipeline = pipeline("question-answering", model=model_name, tokenizer=model_name)

text = "سلام من سجاد ایوبی هستم ۲۰ سالمه و به پردازش زبان طبیعی علاقه دارم"
questions = ["اسمم چیه؟", "چند سالمه؟", "به چی علاقه دارم؟"]

for question in questions:
    # the question-answering pipeline also accepts keyword arguments
    print(qa_pipeline(question=question, context=text))

>>> {'score': 0.4839823544025421, 'start': 8, 'end': 18, 'answer': 'سجاد ایوبی'}
>>> {'score': 0.3747948706150055, 'start': 24, 'end': 32, 'answer': '۲۰ سالمه'}
>>> {'score': 0.5945395827293396, 'start': 38, 'end': 55, 'answer': 'پردازش زبان طبیعی'}
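
In the pipeline output, `start` and `end` are character offsets into the context, and `score` is a probability-like confidence. The sketch below reuses the three results printed above (no model download needed) to show how the offsets map back to the text. It also shows a simple score cutoff as a rough stand-in for "no answer". The helper names and the cutoff value are assumptions for illustration, not part of the model or of Transformers.

```python
# Context and pipeline results copied from the example above.
text = "سلام من سجاد ایوبی هستم ۲۰ سالمه و به پردازش زبان طبیعی علاقه دارم"
results = [
    {"score": 0.4839823544025421, "start": 8, "end": 18, "answer": "سجاد ایوبی"},
    {"score": 0.3747948706150055, "start": 24, "end": 32, "answer": "۲۰ سالمه"},
    {"score": 0.5945395827293396, "start": 38, "end": 55, "answer": "پردازش زبان طبیعی"},
]


def span_from_offsets(context, result):
    """Recover the answer by slicing the context with the character offsets."""
    return context[result["start"]:result["end"]]


def answer_or_none(result, min_score):
    """Hypothetical helper: treat low-confidence results as 'no answer'."""
    return result["answer"] if result["score"] >= min_score else None


for result in results:
    # Each slice reproduces the 'answer' field of the same result.
    print(span_from_offsets(text, result), "|", answer_or_none(result, min_score=0.4))
```

A cutoff of 0.4 discards the second answer even though it is correct, so any such threshold would need tuning on real data. The manual approach described next is the route this guide offers for actual no-answer detection.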

🔥 Advanced Usage with Manual Approach

With the manual approach, the model can report that a question has no answer in the text, and it performs even better than the pipeline.

PyTorch

from transformers import AutoTokenizer, AutoModelForQuestionAnswering
from src.utils import AnswerPredictor

model_name = "SajjadAyoubi/xlm-roberta-large-fa-qa"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForQuestionAnswering.from_pretrained(model_name)

text = "سلام من سجاد ایوبی هستم ۲۰ سالمه و به پردازش زبان طبیعی علاقه دارم"
questions = ["اسمم چیه؟", "چند سالمه؟", "به چی علاقه دارم؟"]

# AnswerPredictor is defined in src/utils.py (not part of transformers); see that file for details
predictor = AnswerPredictor(model, tokenizer, device="cpu", n_best=10)
# pair every question with the same context
preds = predictor(questions, [text] * len(questions), batch_size=3)

for k, v in preds.items():
    print(v)

This produces output such as the following:

100%|██████████| 1/1 [00:00<00:00,  3.56it/s]
{'score': 8.040637016296387, 'text': 'سجاد ایوبی'}
{'score': 9.901972770690918, 'text': '۲۰'}
{'score': 12.117212295532227, 'text': 'پردازش زبان طبیعی'}
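
The scores above are larger than 1, which suggests they are unnormalized logit sums rather than probabilities. The sketch below illustrates the decision rule commonly used for SQuAD 2.0 style no-answer detection: compare the best span score with the score of the first (special) token. This is an assumption about how such detection typically works, not a reproduction of `AnswerPredictor`. The function name `best_span` and its parameters are hypothetical.

```python
def best_span(start_logits, end_logits, max_answer_len=30, null_threshold=0.0):
    """Return ((start, end), score) for the best span, or (None, null_score).

    Assumption: index 0 is the special start token, and the sum of its
    start and end logits is the score of the 'no answer' option.
    """
    null_score = start_logits[0] + end_logits[0]
    best_score, best = float("-inf"), None
    for i in range(1, len(start_logits)):
        # Only consider spans that end at or after their start token.
        for j in range(i, min(i + max_answer_len, len(end_logits))):
            score = start_logits[i] + end_logits[j]
            if score > best_score:
                best_score, best = score, (i, j)
    # Predict 'no answer' when the null score beats the best span by the margin.
    if best is None or null_score - best_score > null_threshold:
        return None, null_score
    return best, best_score


# Toy logits: tokens 2..3 form a clear answer span.
print(best_span([0.0, 1.0, 5.0, 1.0], [0.0, 1.0, 1.0, 6.0]))
# Toy logits: the special token dominates, so no answer is predicted.
print(best_span([6.0, 1.0, 1.0], [5.0, 1.0, 1.0]))
```

Raising `null_threshold` makes the rule more willing to return a span, while lowering it makes "no answer" more likely.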

TensorFlow 2.X

from transformers import AutoTokenizer, TFAutoModelForQuestionAnswering
from src.utils import TFAnswerPredictor

model_name = "SajjadAyoubi/xlm-roberta-large-fa-qa"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = TFAutoModelForQuestionAnswering.from_pretrained(model_name)

text = "سلام من سجاد ایوبی هستم ۲۰ سالمه و به پردازش زبان طبیعی علاقه دارم"
questions = ["اسمم چیه؟", "چند سالمه؟", "به چی علاقه دارم؟"]

# TFAnswerPredictor is defined in src/utils.py (not part of transformers); see that file for details
predictor = TFAnswerPredictor(model, tokenizer, n_best=10)
# pair every question with the same context
preds = predictor(questions, [text] * len(questions), batch_size=3)

for k, v in preds.items():
    print(v)

This produces output such as the following:

100%|██████████| 1/1 [00:00<00:00,  3.56it/s]
{'score': 8.040637016296387, 'text': 'سجاد ایوبی'}
{'score': 9.901972770690918, 'text': '۲۰'}
{'score': 12.117212295532227, 'text': 'پردازش زبان طبیعی'}

🔗 Additional Resources

The complete demonstration is also available as the HowToUse IPython notebook on Google Colab.
