🚀 bert-base-chinese for QA
This model is bert-base-chinese fine-tuned on the DRCD dataset for extractive Question Answering. Given a question and a context passage, it predicts the span of the passage that answers the question.
🚀 Quick Start
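Load the model from the Hugging Face Hub under the name NchuNLP/Chinese-Question-Answering and pass it a question together with a context passage. The usage examples in the next section cover both the pipeline API and manual decoding of the model outputs.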
💻 Usage Examples
Basic Usage
from transformers import BertTokenizerFast, BertForQuestionAnswering, pipeline
model_name = "NchuNLP/Chinese-Question-Answering"
tokenizer = BertTokenizerFast.from_pretrained(model_name)
model = BertForQuestionAnswering.from_pretrained(model_name)
nlp = pipeline('question-answering', model=model, tokenizer=tokenizer)
QA_input = {
'question': '中興大學在哪裡?',
'context': '國立中興大學(簡稱興大、NCHU),是位於臺中的一所高等教育機構。中興大學以農業科學、農業經濟學、獸醫、生命科學、轉譯醫學、生醫工程、生物科技、綠色科技等研究領域見長 。近年中興大學與臺中榮民總醫院、彰化師範大學、中國醫藥大學等機構合作,聚焦於癌症醫學、免疫醫學及醫學工程三項領域,將實驗室成果逐步應用到臨床上,未來「衛生福利部南投醫院中興院區」將改為「國立中興大學醫學院附設醫院」。興大也與臺中市政府合作,簽訂合作意向書,共同推動數位文化、智慧城市等面相帶動區域發展。'
}
res = nlp(QA_input)
print(res)
# {'score': 1.0, 'start': 21, 'end': 23, 'answer': '臺中'}
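The `start` and `end` values in the pipeline result are character offsets into `context`, with `end` exclusive, so slicing the context should reproduce `answer`. The minimal sketch below checks this against the output shown above. `span_matches` is a hypothetical helper name, not part of transformers, and the context is shortened to its first sentence for brevity.

```python
def span_matches(context: str, result: dict) -> bool:
    """Return True if context[start:end] equals the reported answer."""
    return context[result["start"]:result["end"]] == result["answer"]


# First sentence of the context used in the example above.
context = "國立中興大學(簡稱興大、NCHU),是位於臺中的一所高等教育機構。"
# Result dict copied from the pipeline output shown above.
result = {"score": 1.0, "start": 21, "end": 23, "answer": "臺中"}

print(span_matches(context, result))
```

A check like this is a cheap guard when post-processing answers, for example before highlighting the span in the original passage.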
Advanced Usage
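import torch

# Reuses `tokenizer`, `model`, and `QA_input` from Basic Usage.
query = QA_input['question']
text = QA_input['context']
# Truncate only the context so the question is never cut off.
inputs = tokenizer(query, text, return_tensors="pt", truncation="only_second", max_length=512)
with torch.no_grad():
    outputs = model(**inputs)
start_logits = outputs.start_logits
end_logits = outputs.end_logits
# Mask every token outside the context (sequence id 1), but keep [CLS] at index 0.
sequence_ids = inputs.sequence_ids()
mask = [i != 1 for i in sequence_ids]
mask[0] = False
mask = torch.tensor(mask)[None]
start_logits[mask] = -10000
end_logits[mask] = -10000
start_probabilities = torch.nn.functional.softmax(start_logits, dim=-1)[0]
end_probabilities = torch.nn.functional.softmax(end_logits, dim=-1)[0]
# Score every (start, end) pair; triu zeroes out pairs where end < start.
scores = torch.triu(start_probabilities[:, None] * end_probabilities[None, :])
max_index = scores.argmax().item()
start_index = max_index // scores.shape[1]
end_index = max_index % scores.shape[1]
# Map the token indices back to character offsets in `text`.
inputs_with_offsets = tokenizer(query, text, return_offsets_mapping=True, truncation="only_second", max_length=512)
offsets = inputs_with_offsets["offset_mapping"]
start_char, _ = offsets[start_index]
_, end_char = offsets[end_index]
answer = text[start_char:end_char]
result = {
    "answer": answer,
    "start": start_char,
    "end": end_char,
    "score": scores[start_index, end_index].item(),
}
print(result)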
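Span selection can also be written as an explicit search that only considers spans whose end is at or after their start and that caps the span length. The sketch below illustrates the idea in plain Python on toy logits. It is not the model's or the pipeline's own decoding code, and `best_span`, `max_answer_len`, and its default of 30 are assumptions made for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def best_span(start_logits, end_logits, max_answer_len=30):
    """Return (start, end, score) for the best span with start <= end.

    The score is the product of the start and end probabilities, and a
    span covers at most `max_answer_len` tokens.
    """
    start_probs = softmax(start_logits)
    end_probs = softmax(end_logits)
    best = (0, 0, 0.0)
    for i, p_start in enumerate(start_probs):
        last = min(i + max_answer_len, len(end_probs))
        for j in range(i, last):
            score = p_start * end_probs[j]
            if score > best[2]:
                best = (i, j, score)
    return best


# Toy logits: the strongest end (index 1) comes before the strongest
# start (index 2), so an unconstrained argmax would pick an invalid span.
start_logits = [0.0, 1.0, 5.0, 0.5]
end_logits = [0.0, 6.0, 1.0, 4.0]
start, end, score = best_span(start_logits, end_logits)
print(start, end)
```

The double loop is quadratic in sequence length, which is acceptable for 512 tokens. Capping the span length keeps the search small and avoids implausibly long answers.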
📚 Documentation
Authors
- Han Cheng Yu: boy19990222@gmail.com
- Yao-Chung Fan: yfan@nchu.edu.tw
About us
The NCHU Natural Language Processing Laboratory conducts research on deep learning techniques for text mining and natural language processing. The lab's current research centers on machine reading comprehension and natural language generation.
More Information
For more information about the NCHU NLP Lab, see the lab's Online Demo repo and GitHub page.