roberta-base-chinese-extractive-qa Open-source Model - Free Deployment to Assist in Chinese Text Answer Extraction

Roberta Base Chinese Extractive Qa

Developed by uer

A Chinese extractive QA model based on the RoBERTa architecture, suitable for tasks that extract answers from given texts.

Tags: Question Answering System, Chinese, Chinese extractive QA, Text comprehension, High accuracy

Downloads 2,694

Release Time : 3/2/2022

Model Overview

This model is designed for Chinese extractive QA tasks, capable of finding answers to questions within provided contexts. Fine-tuned based on UER-py and TencentPretrain frameworks, it supports locating and extracting answers from Chinese texts.

Model Features

Chinese Optimization

Specifically optimized for Chinese text, effectively handling Chinese QA tasks.

Multi-dataset Training

Trained on multiple Chinese QA datasets including cmrc2018, webqa, and laisi, providing broad knowledge coverage.

High Accuracy

In the usage example, the model returns the correct answer with a confidence score of about 0.977 (97.7%). This is the score of a single prediction, not a benchmark accuracy.

Model Capabilities

Chinese text comprehension

Answer extraction

Context analysis

Use Cases

Education

Literature Knowledge QA

Identifying authors or related content of literary works

In the usage example, the model correctly identifies Pushkin as the author of 'If Life Deceives You'.

Information Retrieval

Document QA System

Extracting answers to specific questions from long documents
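
One common way to handle documents longer than the model's input is to slide an overlapping window over the text and keep the highest-scoring answer. The sketch below illustrates that idea only; the function names, the window and stride sizes, and the qa_fn callable (anything that maps a question/context dict to a result with score, start, end, and answer, such as the pipeline object from the Quick Start) are assumptions for illustration, not part of the original model card. The transformers question-answering pipeline also splits over-long contexts internally, so this is mainly useful when you want explicit control over chunking.

```python
from typing import Callable, Dict, List, Optional, Tuple


def split_into_windows(text: str, window: int = 400, stride: int = 200) -> List[Tuple[int, str]]:
    """Return (offset, chunk) pairs covering `text` with overlapping character windows."""
    if window <= 0 or stride <= 0:
        raise ValueError("window and stride must be positive")
    chunks = []
    start = 0
    while True:
        chunks.append((start, text[start:start + window]))
        if start + window >= len(text):
            break
        start += stride
    return chunks


def best_answer(question: str, document: str, qa_fn: Callable[[Dict], Dict],
                window: int = 400, stride: int = 200) -> Optional[Dict]:
    """Run `qa_fn` on every window and return the top-scoring result,
    with start/end shifted back to offsets in the full document."""
    best = None
    for offset, chunk in split_into_windows(document, window, stride):
        result = dict(qa_fn({"question": question, "context": chunk}))
        result["start"] += offset
        result["end"] += offset
        if best is None or result["score"] > best["score"]:
            best = result
    return best
```

Overlapping windows (stride smaller than window) reduce the chance that an answer is cut in half at a chunk boundary.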

🚀 Chinese RoBERTa-Base Model for QA

This model performs extractive question answering: given a question and a context passage, it returns the span of the context that answers the question.

🚀 Quick Start

The model can be directly used with a pipeline for extractive question answering. Here is an example of how to use it:

>>> from transformers import AutoModelForQuestionAnswering, AutoTokenizer, pipeline
>>> model = AutoModelForQuestionAnswering.from_pretrained('uer/roberta-base-chinese-extractive-qa')
>>> tokenizer = AutoTokenizer.from_pretrained('uer/roberta-base-chinese-extractive-qa')
>>> QA = pipeline('question-answering', model=model, tokenizer=tokenizer)
>>> QA_input = {'question': "著名诗歌《假如生活欺骗了你》的作者是",'context': "普希金从那里学习人民的语言，吸取了许多有益的养料，这一切对普希金后来的创作产生了很大的影响。这两年里，普希金创作了不少优秀的作品，如《囚徒》、《致大海》、《致凯恩》和《假如生活欺骗了你》等几十首抒情诗，叙事诗《努林伯爵》，历史剧《鲍里斯·戈都诺夫》，以及《叶甫盖尼·奥涅金》前六章。"}
>>> QA(QA_input)
    {'score': 0.9766426682472229, 'start': 0, 'end': 3, 'answer': '普希金'}
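
In the output above, start and end are character offsets into the context, so slicing the context with them recovers the answer string. The check below reuses the printed result and needs no model download; the context is shortened here for illustration.

```python
# Result dict copied from the example output above.
result = {'score': 0.9766426682472229, 'start': 0, 'end': 3, 'answer': '普希金'}

# Shortened context for illustration; only its beginning matters for this check.
context = "普希金从那里学习人民的语言，吸取了许多有益的养料。"

# `start` is inclusive and `end` is exclusive, matching Python slicing.
extracted = context[result['start']:result['end']]
print(extracted == result['answer'])
```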

✨ Features

Fine-tuning Flexibility: The model can be fine-tuned with UER-py (Zhao et al., 2019) or with TencentPretrain (Zhao et al., 2023). TencentPretrain inherits from UER-py, adds support for models with more than one billion parameters, and extends it into a multimodal pre-training framework.
Multiple Download Sources: You can download the model from the UER-py Modelzoo page or from Hugging Face under the model ID uer/roberta-base-chinese-extractive-qa.

📦 Installation

The original README lists no specific installation steps. To use the model, load the pre-trained weights with the transformers library as shown in the usage example above.

📚 Documentation

Model description

The model is used for extractive question answering. It is fine-tuned by UER-py and TencentPretrain as mentioned above.
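
For readers new to extractive QA: models of this kind typically score every token as a possible answer start and as a possible answer end, and the answer is the span with the best combined score. The sketch below shows that standard decoding step on plain lists of scores. It is an illustration under stated assumptions, not the exact logic of UER-py or of the transformers pipeline; the function name and the max_answer_len default are hypothetical.

```python
from typing import List, Tuple


def best_span(start_logits: List[float], end_logits: List[float],
              max_answer_len: int = 15) -> Tuple[int, int, float]:
    """Return (start_index, end_index, score) of the highest-scoring span.

    A span is valid when start <= end and it covers at most
    `max_answer_len` tokens. The score is start_logit + end_logit.
    """
    best = (0, 0, float("-inf"))
    for i, start_score in enumerate(start_logits):
        # Only consider end positions at or after the start, within the length limit.
        for j in range(i, min(i + max_answer_len, len(end_logits))):
            score = start_score + end_logits[j]
            if score > best[2]:
                best = (i, j, score)
    return best
```

The returned token indices are then mapped back to character offsets in the context, which is where the start and end fields of the pipeline output come from.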

Training data

Training data comes from three sources: cmrc2018, webqa, and laisi. Only the train set of these three datasets is used.
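
The original does not describe how the three training sets were combined. As a rough sketch only, assuming each source has already been converted to the SQuAD-style JSON layout used by cmrc2018 (a top-level data list of articles, each with paragraphs holding a context and its qas), the merge could look like the following. The function names and the in-memory inputs are hypothetical.

```python
from typing import Dict, Iterable, List


def merge_squad_style(datasets: Iterable[Dict]) -> Dict:
    """Concatenate the `data` lists of several SQuAD-style datasets.

    Assumption: every input is a dict with a top-level "data" list.
    """
    merged: List[Dict] = []
    for dataset in datasets:
        merged.extend(dataset["data"])
    return {"data": merged}


def count_questions(dataset: Dict) -> int:
    """Count question entries across all articles and paragraphs."""
    return sum(
        len(paragraph["qas"])
        for article in dataset["data"]
        for paragraph in article["paragraphs"]
    )
```

The merged dict can then be written out with json.dump using ensure_ascii=False so that Chinese text stays readable in the file.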

Training procedure

The model is fine-tuned by UER-py on Tencent Cloud. It is fine-tuned for three epochs with a sequence length of 512, starting from the pre-trained model chinese_roberta_L-12_H-768. At the end of each epoch, the model is saved if it achieves the best performance so far on the development set.

python3 finetune/run_cmrc.py --pretrained_model_path models/cluecorpussmall_roberta_base_seq512_model.bin-250000 \
                             --vocab_path models/google_zh_vocab.txt \
                             --train_path datasets/extractive_qa.json \
                             --dev_path datasets/cmrc2018/dev.json \
                             --output_model_path models/extractive_qa_model.bin \
                             --learning_rate 3e-5 --epochs_num 3 --batch_size 32 --seq_length 512
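
The "save only when the development score improves" rule described above can be sketched as follows. This is not the code of run_cmrc.py; the three callables (train_one_epoch, evaluate, save_model) are hypothetical placeholders standing in for the real training, evaluation, and checkpointing steps.

```python
from typing import Callable, Optional, Tuple


def train_with_best_checkpoint(
    epochs_num: int,
    train_one_epoch: Callable[[int], None],
    evaluate: Callable[[], float],
    save_model: Callable[[], None],
) -> Tuple[Optional[int], float]:
    """Train for `epochs_num` epochs and save only on a new best dev score.

    Returns (best_epoch, best_score); epochs are numbered from 1.
    """
    best_score = float("-inf")
    best_epoch = None
    for epoch in range(1, epochs_num + 1):
        train_one_epoch(epoch)
        score = evaluate()  # score on the development set
        if score > best_score:
            best_score = score
            best_epoch = epoch
            save_model()  # overwrite the checkpoint with the better model
    return best_epoch, best_score
```

With this rule, the file at the output path always holds the weights from the best epoch seen so far, even if a later epoch scores worse.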

Finally, convert the fine-tuned model into Hugging Face's format:

python3 scripts/convert_bert_extractive_qa_from_uer_to_huggingface.py --input_model_path models/extractive_qa_model.bin \
                                                                      --output_model_path pytorch_model.bin \
                                                                      --layers_num 12

BibTeX entry and citation info

@article{liu2019roberta,
  title={Roberta: A robustly optimized bert pretraining approach},
  author={Liu, Yinhan and Ott, Myle and Goyal, Naman and Du, Jingfei and Joshi, Mandar and Chen, Danqi and Levy, Omer and Lewis, Mike and Zettlemoyer, Luke and Stoyanov, Veselin},
  journal={arXiv preprint arXiv:1907.11692},
  year={2019}
}

@article{zhao2019uer,
  title={UER: An Open-Source Toolkit for Pre-training Models},
  author={Zhao, Zhe and Chen, Hui and Zhang, Jinbin and Zhao, Xin and Liu, Tao and Lu, Wei and Chen, Xi and Deng, Haotang and Ju, Qi and Du, Xiaoyong},
  journal={EMNLP-IJCNLP 2019},
  pages={241},
  year={2019}
}

@article{zhao2023tencentpretrain,
  title={TencentPretrain: A Scalable and Flexible Toolkit for Pre-training Models of Different Modalities},
  author={Zhao, Zhe and Li, Yudong and Hou, Cheng and Zhao, Jing and others},
  journal={ACL 2023},
  pages={217},
  year={2023}
}

