Chinese-Question-Answering: Open-source Chinese Question Answering Model Designed for Question-Answer Pairs

Chinese Question Answering

Developed by NchuNLP

Chinese question-answering model fine-tuned from bert-base-chinese, specifically designed for question-answer pairs

Question Answering System

Transformers

Chinese #Chinese Q&A #BERT Fine-tuning #Reading Comprehension

Downloads 81

Release Time: 10/12/2022

Model Overview

This is a Chinese question-answering model based on the BERT architecture and fine-tuned on the DRCD (Delta Reading Comprehension Dataset) dataset. Given a question and a context passage, it extracts the answer span from the context.

Model Features

Chinese Optimization

Built on the bert-base-chinese model and fine-tuned for Chinese Q&A tasks

Precise Answer Localization

Predicts the start and end positions of the answer within the context

Strong Domain Adaptability

Intended to handle text from a range of domains, including academic and medical content
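
To illustrate what locating the answer's start and end positions involves, here is a minimal sketch in plain Python. It is not the model's own code: it assumes one start score and one end score per token are already available, and the scores, the `max_len` limit, and the helper name `best_span` are made up for illustration.

```python
from typing import List, Tuple


def best_span(start_scores: List[float], end_scores: List[float],
              max_len: int = 30) -> Tuple[int, int, float]:
    """Hypothetical helper: pick the (start, end) token pair with the
    highest start_score * end_score, subject to start <= end < start + max_len."""
    best = (0, 0, float("-inf"))
    for s, s_score in enumerate(start_scores):
        for e in range(s, min(s + max_len, len(end_scores))):
            score = s_score * end_scores[e]
            if score > best[2]:
                best = (s, e, score)
    return best


# Toy per-token scores (invented, not model output)
tokens = ["興", "大", "位", "於", "臺", "中"]
start_scores = [0.05, 0.05, 0.05, 0.05, 0.70, 0.10]
end_scores = [0.05, 0.05, 0.05, 0.05, 0.10, 0.70]

start, end, score = best_span(start_scores, end_scores)
answer = "".join(tokens[start:end + 1])
print(answer, round(score, 2))
```

Restricting the search to pairs with start <= end rules out spans that would end before they begin; the length cap is an optional extra that keeps the search cheap.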

Model Capabilities

Chinese Reading Comprehension

Answer Extraction

Context Understanding

Use Cases

Education

Academic Literature Q&A

Extract answers to specific questions from academic articles

Can accurately extract academic concepts, institutional locations, etc.

Medical

Medical Text Analysis

Extract answers to specific medical questions from literature

Can identify medical terminology and institutional information

🚀 bert-base-chinese for QA

This model is bert-base-chinese fine-tuned on the DRCD dataset. It was trained on question-answer pairs for the question-answering task and extracts the answer to a question from a supplied context.

🚀 Quick Start

The model loads through the Transformers library. It can be used either with the question-answering pipeline or by calling the model directly; both are shown under Usage Examples.

💻 Usage Examples

Basic Usage

from transformers import BertTokenizerFast, BertForQuestionAnswering, pipeline
model_name = "NchuNLP/Chinese-Question-Answering"
tokenizer = BertTokenizerFast.from_pretrained(model_name)
model = BertForQuestionAnswering.from_pretrained(model_name)

# a) Get predictions 
nlp = pipeline('question-answering', model=model, tokenizer=tokenizer)
QA_input = {
    'question': '中興大學在哪裡？',  # "Where is Chung Hsing University?"
    'context': '國立中興大學（簡稱興大、NCHU），是位於臺中的一所高等教育機構。中興大學以農業科學、農業經濟學、獸醫、生命科學、轉譯醫學、生醫工程、生物科技、綠色科技等研究領域見長 。近年中興大學與臺中榮民總醫院、彰化師範大學、中國醫藥大學等機構合作，聚焦於癌症醫學、免疫醫學及醫學工程三項領域，將實驗室成果逐步應用到臨床上，未來「衛生福利部南投醫院中興院區」將改為「國立中興大學醫學院附設醫院」。興大也與臺中市政府合作，簽訂合作意向書，共同推動數位文化、智慧城市等面相帶動區域發展。'
}
res = nlp(QA_input)
print(res)

# Output:
# {'score': 1.0, 'start': 21, 'end': 23, 'answer': '臺中'}
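
The `start` and `end` values in the result are character offsets into `context`, with `end` exclusive, so slicing the context recovers the answer. The check below needs no model: it takes only the first sentence of the context above and hard-codes the result shown above.

```python
# First sentence of the example context, copied from the basic usage example
context = "國立中興大學（簡稱興大、NCHU），是位於臺中的一所高等教育機構。"

# Result as shown in the basic usage example (hard-coded here, not recomputed)
res = {"score": 1.0, "start": 21, "end": 23, "answer": "臺中"}

# Slice the context with the returned character offsets
span = context[res["start"]:res["end"]]
print(span)
```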

Advanced Usage

# b) Inside the question-answering pipeline
# Reuses `tokenizer` and `model` from the basic usage example.
import torch

# Example inputs: the question from the basic example with a shortened context
query = '中興大學在哪裡？'
text = '國立中興大學（簡稱興大、NCHU），是位於臺中的一所高等教育機構。'

inputs = tokenizer(query, text, return_tensors="pt", padding=True, truncation=True, max_length=512, stride=256)
with torch.no_grad():
    outputs = model(**inputs)
start_logits = outputs.start_logits
end_logits = outputs.end_logits

sequence_ids = inputs.sequence_ids()
# Mask everything apart from the tokens of the context
mask = [i != 1 for i in sequence_ids]
# Unmask the [CLS] token
mask[0] = False
mask = torch.tensor(mask)[None]

start_logits[mask] = -10000
end_logits[mask] = -10000

start_probabilities = torch.nn.functional.softmax(start_logits, dim=-1)[0]
end_probabilities = torch.nn.functional.softmax(end_logits, dim=-1)[0]

# Score every (start, end) pair, keeping only pairs with start <= end
scores = torch.triu(start_probabilities[:, None] * end_probabilities[None, :])

max_index = scores.argmax().item()
start_index = max_index // scores.shape[1]
end_index = max_index % scores.shape[1]

# Map the token indices back to character offsets in the context
inputs_with_offsets = tokenizer(query, text, return_offsets_mapping=True)
offsets = inputs_with_offsets["offset_mapping"]

start_char, _ = offsets[start_index]
_, end_char = offsets[end_index]
answer = text[start_char:end_char]

result = {
    "answer": answer,
    "start": start_char,
    "end": end_char,
    "score": scores[start_index, end_index].item(),
}
print(result)
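
The masking step can be examined on its own. The sketch below uses plain Python with invented logits (not model output) to show why setting non-context logits to -10000 before the softmax removes those positions from consideration. The function name `masked_softmax` and the token layout are assumptions made for illustration.

```python
import math
from typing import List, Optional


def masked_softmax(logits: List[float],
                   sequence_ids: List[Optional[int]]) -> List[float]:
    """Softmax after masking every position except [CLS] (index 0)
    and context tokens (sequence id 1). Illustrative helper only."""
    masked = [
        logit if (i == 0 or sid == 1) else -10000.0
        for i, (logit, sid) in enumerate(zip(logits, sequence_ids))
    ]
    peak = max(masked)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in masked]
    total = sum(exps)
    return [x / total for x in exps]


# Assumed layout: [CLS] q q [SEP] c c c [SEP]
# None marks special tokens, 0 the question, 1 the context.
sequence_ids = [None, 0, 0, None, 1, 1, 1, None]
# Invented logits: the question tokens score highest before masking
logits = [0.5, 9.0, 8.0, 1.0, 2.0, 4.0, 1.0, 0.0]

probs = masked_softmax(logits, sequence_ids)
best = max(range(len(probs)), key=probs.__getitem__)
print(best, [round(p, 3) for p in probs])
```

Although the question tokens have the largest raw logits here, after masking their probability underflows to zero and the highest-probability position falls inside the context.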

📚 Documentation

Authors

Han Cheng Yu: boy19990222@gmail.com
Yao-Chung Fan: yfan@nchu.edu.tw

About us

The NCHU Natural Language Processing Laboratory researches deep learning techniques for text mining and natural language processing. Lab members currently work mainly on machine reading comprehension and natural language generation.

More Information

For more information about the NCHU NLP Lab, see the lab's Online Demo repo and GitHub.
