🚀 XLM-RoBERTa-large fine-tuned on SQuADv2
This project is an xlm-roberta-large model fine-tuned on the SQuADv2 dataset for question answering. It handles extractive QA out of the box and provides a strong backbone for building question-answering systems.
🚀 Quick Start
The following example shows how to use the model for question answering:
from transformers import XLMRobertaTokenizer, XLMRobertaForQuestionAnswering
import torch

tokenizer = XLMRobertaTokenizer.from_pretrained('a-ware/xlmroberta-squadv2')
model = XLMRobertaForQuestionAnswering.from_pretrained('a-ware/xlmroberta-squadv2')

question, text = "Who was Jim Henson?", "Jim Henson was a nice puppet"
encoding = tokenizer(question, text, return_tensors='pt')

# Forward pass without gradient tracking; the model returns start and end logits
with torch.no_grad():
    outputs = model(**encoding)
start_scores, end_scores = outputs.start_logits, outputs.end_logits

# The answer span runs from the highest-scoring start token to the highest-scoring end token
start_index = int(torch.argmax(start_scores))
end_index = int(torch.argmax(end_scores))
answer = tokenizer.decode(encoding['input_ids'][0][start_index : end_index + 1])
# answer => 'a nice puppet'
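Alternatively, the same checkpoint can be loaded through the transformers question-answering pipeline, which handles tokenization and span decoding internally; the snippet below is a minimal sketch of that route.

from transformers import pipeline

# Minimal sketch: let the generic question-answering pipeline handle pre- and post-processing
qa = pipeline('question-answering', model='a-ware/xlmroberta-squadv2', tokenizer='a-ware/xlmroberta-squadv2')
result = qa(question="Who was Jim Henson?", context="Jim Henson was a nice puppet")
print(result['answer'], result['score'])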
✨ Key Features
The model is based on the XLM-RoBERTa architecture and is fine-tuned on the SQuADv2 dataset, making it suitable for extractive question answering.
📚 Documentation
Model Details
XLM-RoBERTa was introduced in the paper XLM-R: State-of-the-art cross-lingual understanding through self-supervision.
Model Training
The model was trained with the simpletransformers wrapper using the following training arguments:
train_args = {
'learning_rate': 1e-5,
'max_seq_length': 512,
'doc_stride': 512,
'overwrite_output_dir': True,
'reprocess_input_data': False,
'train_batch_size': 8,
'num_train_epochs': 2,
'gradient_accumulation_steps': 2,
'no_cache': True,
'use_cached_eval_features': False,
'save_model_every_epoch': False,
'output_dir': "bart-squadv2",
'eval_batch_size': 32,
'fp16_opt_level': 'O2',
}
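For context, arguments like these are normally passed to simpletransformers' QuestionAnsweringModel. The snippet below is a hedged sketch rather than the exact training script; train_data is an assumed placeholder for SQuAD-format training examples loaded separately.

from simpletransformers.question_answering import QuestionAnsweringModel

# Sketch only: train_data is assumed to be a list of SQuAD-style dicts
# (each with a 'context' string and a 'qas' list) loaded from the SQuADv2 training file.
model = QuestionAnsweringModel('xlmroberta', 'xlm-roberta-large', args=train_args, use_cuda=True)
model.train_model(train_data)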
Results
{"correct": 6961, "similar": 4359, "incorrect": 553, "eval_loss": -12.177856394381962}
Built with care by A-ware UG.