🚀 XLM-RoBERTa-Large Fine-tuned on SQuADv2
This is an xlm-roberta-large model fine-tuned on the SQuADv2 dataset for question answering. It provides a strong backbone for building QA systems.
🚀 Quick Start
The following example shows how to run question answering with this model:
```python
from transformers import XLMRobertaTokenizer, XLMRobertaForQuestionAnswering
import torch

tokenizer = XLMRobertaTokenizer.from_pretrained('a-ware/xlmroberta-squadv2')
model = XLMRobertaForQuestionAnswering.from_pretrained('a-ware/xlmroberta-squadv2')

question, text = "Who was Jim Henson?", "Jim Henson was a nice puppet"
encoding = tokenizer(question, text, return_tensors='pt')

with torch.no_grad():
    outputs = model(**encoding)

# Pick the most likely start/end positions and decode the answer span
start_index = torch.argmax(outputs.start_logits)
end_index = torch.argmax(outputs.end_logits)
answer_ids = encoding['input_ids'][0][start_index : end_index + 1]
answer = tokenizer.decode(answer_ids, skip_special_tokens=True)
# answer => 'a nice puppet'
```
✨ Key Features
The model is based on the XLM-RoBERTa architecture and fine-tuned on the SQuADv2 dataset for question-answering tasks.
📚 Documentation
Model Details
XLM-RoBERTa was introduced in the paper XLM-R: State-of-the-art cross-lingual understanding through self-supervision.
Training
The model was trained with the simpletransformers wrapper using the following arguments:
```python
train_args = {
    'learning_rate': 1e-5,
    'max_seq_length': 512,
    'doc_stride': 512,
    'overwrite_output_dir': True,
    'reprocess_input_data': False,
    'train_batch_size': 8,
    'num_train_epochs': 2,
    'gradient_accumulation_steps': 2,
    'no_cache': True,
    'use_cached_eval_features': False,
    'save_model_every_epoch': False,
    'output_dir': "bart-squadv2",
    'eval_batch_size': 32,
    'fp16_opt_level': 'O2',
}
```
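A training run with these arguments can be sketched using the simpletransformers `QuestionAnsweringModel` wrapper. This is a minimal sketch, not the card author's original script: the SQuADv2 file path is a placeholder, and the import is done lazily so the sketch can be read without simpletransformers installed.

```python
from typing import Any, Dict

# Training arguments copied from the model card above
train_args: Dict[str, Any] = {
    'learning_rate': 1e-5,
    'max_seq_length': 512,
    'doc_stride': 512,
    'overwrite_output_dir': True,
    'reprocess_input_data': False,
    'train_batch_size': 8,
    'num_train_epochs': 2,
    'gradient_accumulation_steps': 2,
    'no_cache': True,
    'use_cached_eval_features': False,
    'save_model_every_epoch': False,
    'output_dir': "bart-squadv2",
    'eval_batch_size': 32,
    'fp16_opt_level': 'O2',
}

def train(train_file: str = 'train-v2.0.json') -> None:
    """Fine-tune xlm-roberta-large on a SQuAD-format JSON file.

    The file name above is a hypothetical placeholder for the
    SQuADv2 training set; replace it with your local path.
    """
    # Lazy import: only needed when training is actually launched
    from simpletransformers.question_answering import QuestionAnsweringModel

    model = QuestionAnsweringModel('xlmroberta', 'xlm-roberta-large', args=train_args)
    model.train_model(train_file)
```

Note that with `train_batch_size` 8 and `gradient_accumulation_steps` 2, the effective batch size per optimizer step is 16.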
Results
```json
{"correct": 6961, "similar": 4359, "incorrect": 553, "eval_loss": -12.177856394381962}
```
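As a quick sanity check, the evaluation counts above can be turned into rates. Assumption: "similar" is taken to mean a partial (overlapping but not exact) match, which is how simpletransformers reports near-miss answers; the card itself does not define it.

```python
# Evaluation counts from the results above
correct, similar, incorrect = 6961, 4359, 553

total = correct + similar + incorrect             # 11873 evaluated examples
exact_match_rate = correct / total                # ≈ 0.586
non_incorrect_rate = (correct + similar) / total  # ≈ 0.953

print(f"total={total}, exact={exact_match_rate:.3f}, "
      f"non-incorrect={non_incorrect_rate:.3f}")
```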
Crafted with care by A-ware UG