SRL-EN_XLMR-Base Open-Source Model - Empowering Free Deployment of English Semantic Role Labeling Tasks

srl-en_xlmr-base

Developed by liaad

This model is a fine-tuned version of xlm-roberta-base on English CoNLL-formatted OntoNotes v5.0 semantic role labeling data, specifically designed for English semantic role labeling tasks.

Sequence Labeling

Transformers

Supports Multiple Languages

Open Source License: Apache-2.0

#Multilingual Semantic Role Labeling #English SRL #XLM-R Fine-tuning

Downloads 17

Release Time: 3/2/2022

Model Overview

This is a fine-tuned XLM-RoBERTa model specialized for English Semantic Role Labeling (SRL) tasks, capable of identifying predicates and their associated roles in sentences.

Model Features

Multilingual Base Model Fine-tuning

Fine-tuned for an English-specific task, starting from the multilingual xlm-roberta-base model

Semantic Role Labeling Optimization

Fine-tuned on English CoNLL-formatted OntoNotes v5.0 semantic role labeling data

Preprocessing Adaptation

The English data was preprocessed to match the Portuguese data format, which changed some role attributions and removed some roles

Model Capabilities

English Semantic Role Labeling

Predicate Identification

Semantic Role Classification

Use Cases

Natural Language Processing

Text Semantic Analysis

Analyze semantic roles such as action performers and recipients in sentences

Achieved an F1 score of 66.59 on PropBank.Br cross-validation (in domain) and 65.24 on Buscapé (out of domain)
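To make "semantic roles" concrete, the sketch below groups per-word BIO tags into labeled spans. It assumes PropBank-style labels (`ARG0` for the performer, `ARG1` for the thing affected, `ARG2` for the recipient, `V` for the predicate); the sentence, the tags, and the helper name `bio_to_spans` are illustrative and are not output from this model.

```python
from typing import List, Optional, Tuple


def bio_to_spans(words: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Group per-word BIO tags into (role, text) spans."""
    spans: List[Tuple[str, str]] = []
    current_role: Optional[str] = None
    current_words: List[str] = []

    def close_span() -> None:
        # Store the open span, if any, before starting a new one.
        if current_role is not None:
            spans.append((current_role, " ".join(current_words)))

    for word, tag in zip(words, tags):
        if tag.startswith("B-"):
            close_span()
            current_role, current_words = tag[2:], [word]
        elif tag.startswith("I-") and current_role == tag[2:]:
            current_words.append(word)
        else:
            # "O", or an I- tag that does not continue the open span:
            # close the span and leave this word unlabeled.
            close_span()
            current_role, current_words = None, []
    close_span()
    return spans


# Hand-written example, not model output.
words = ["Mary", "gave", "John", "a", "book", "."]
tags = ["B-ARG0", "B-V", "B-ARG2", "B-ARG1", "I-ARG1", "O"]
print(bio_to_spans(words, tags))
```

Here "Mary" is the performer, "a book" is the thing given, and "John" is the recipient of the predicate "gave".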

Cross-lingual Research

Comparative studies with Portuguese SRL models

Part of cross-lingual research projects

🚀 XLM-R base fine-tuned on English semantic role labeling

This model is XLM-R base fine-tuned on English semantic role labeling data (CoNLL-formatted OntoNotes v5.0).

🚀 Quick Start

To use the transformers portion of this model:

from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("liaad/srl-en_xlmr-base")
model = AutoModel.from_pretrained("liaad/srl-en_xlmr-base")

To use the full SRL model (transformers portion plus a decoding layer), refer to the project's GitHub repository.

✨ Features

Built on a multilingual base model (XLM-RoBERTa); the wider project covers Portuguese and English.
Fine-tuned specifically for semantic role labeling.

📦 Installation

Installation mainly involves the transformers library. The usage examples request PyTorch tensors (return_tensors='pt'), so PyTorch is needed as well. Both can be installed with:

pip install transformers torch
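A quick way to confirm the environment is ready is to check that the packages can be found before loading the model. This is a minimal sketch using only the standard library; the helper name `check_installed` is hypothetical, and `torch` is included because the usage examples in this card request PyTorch tensors.

```python
import importlib.util
from typing import Dict, Iterable


def check_installed(packages: Iterable[str]) -> Dict[str, bool]:
    """Report which top-level packages can be found on the import path."""
    return {name: importlib.util.find_spec(name) is not None for name in packages}


# Prints a mapping such as {'transformers': True, 'torch': False},
# depending on what is installed in the current environment.
print(check_installed(["transformers", "torch"]))
```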

💻 Usage Examples

Basic Usage

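import torch
from transformers import AutoTokenizer, AutoModel

# Load the tokenizer and the fine-tuned encoder (without the SRL decoding layer)
tokenizer = AutoTokenizer.from_pretrained("liaad/srl-en_xlmr-base")
model = AutoModel.from_pretrained("liaad/srl-en_xlmr-base")
model.eval()

input_text = "Your input text here"
inputs = tokenizer(input_text, return_tensors='pt')

# Inference only, so gradient tracking is disabled
with torch.no_grad():
    outputs = model(**inputs)

# outputs.last_hidden_state holds one vector per subword token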
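The encoder returns one vector per subword token, while semantic roles are assigned per word. One common way to bridge the two, shown here as an assumption rather than as the project's confirmed method, is to keep the first subword of each word. The sketch takes the kind of list that a fast tokenizer's `inputs.word_ids()` returns; the example list and the helper name `first_subword_positions` are made up for illustration.

```python
from typing import List, Optional


def first_subword_positions(word_ids: List[Optional[int]]) -> List[int]:
    """Return the token position of the first subword of every word."""
    positions: List[int] = []
    seen = set()
    for position, word_id in enumerate(word_ids):
        if word_id is None:
            # Special tokens (for example <s> and </s>) belong to no word.
            continue
        if word_id not in seen:
            seen.add(word_id)
            positions.append(position)
    return positions


# Hypothetical word ids for a three-word input whose second word
# was split into two subwords: <s> w0 w1a w1b w2 </s>
example_word_ids = [None, 0, 1, 1, 2, None]
print(first_subword_positions(example_word_ids))  # [1, 2, 4]
```

The resulting positions could then index `outputs.last_hidden_state[0]` to obtain one vector per word.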

📚 Documentation

Model description

This model is xlm-roberta-base fine-tuned on the English CoNLL-formatted OntoNotes v5.0 semantic role labeling data. It is part of a project that produced a family of SRL models; those models are listed under Eval results.

For more information, please see the accompanying article (see BibTeX entry and citation info below) and the project's GitHub repository.

Intended uses & limitations

How to use

As shown in the Quick Start section, you can use the transformers portion of the model. For the full SRL model, refer to the project's GitHub repository.

Limitations and bias

This model does not include a TensorFlow version, because the "type_vocab_size" in this model was changed (from 1 to 2) and it therefore cannot be easily converted to TensorFlow.
The models were trained for only 5 epochs.
The English data was preprocessed to match the Portuguese data, so there are some differences in role attributions and some roles were removed from the data.
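The card does not say why "type_vocab_size" was raised to 2. One plausible reading, stated here purely as an assumption, is that the second segment id marks the tokens of the predicate being analyzed, as some BERT-based SRL implementations do by passing a 0/1 indicator as `token_type_ids`. The sketch below builds such an indicator from word ids; the helper name `predicate_indicator` and the example values are hypothetical.

```python
from typing import List, Optional


def predicate_indicator(word_ids: List[Optional[int]], predicate_word: int) -> List[int]:
    """Return 1 for every subword of the predicate word and 0 elsewhere."""
    # Special tokens have word id None, which never equals an int, so they get 0.
    return [1 if word_id == predicate_word else 0 for word_id in word_ids]


# Hypothetical word ids: <s> w0 w1a w1b w2 </s>, with word 1 as the predicate.
example_word_ids = [None, 0, 1, 1, 2, None]
print(predicate_indicator(example_word_ids, predicate_word=1))  # [0, 0, 1, 1, 0, 0]
```

An indicator with two distinct values needs an embedding table with two rows, which would be consistent with a "type_vocab_size" of 2.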

Training procedure

The models were trained on the CoNLL-2012 dataset, preprocessed to match the Portuguese PropBank.Br data. They were tested on the PropBank.Br dataset as well as on a smaller opinion dataset, "Buscapé". For more information, please see the accompanying article (see BibTeX entry and citation info below) and the project's GitHub repository.
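As a rough illustration of what "preprocessed to match" can involve, the sketch below relabels as `O` any BIO tag whose role is not in a target label set. The card does not list which roles were actually removed or remapped, so the kept set `KEPT_ROLES`, the example tags, and the helper name `drop_roles` are all hypothetical.

```python
from typing import List, Set


def drop_roles(tags: List[str], kept_roles: Set[str]) -> List[str]:
    """Replace tags whose role is outside kept_roles with "O"."""
    cleaned: List[str] = []
    for tag in tags:
        # Strip the "B-" / "I-" prefix to get the role name; "O" has no role.
        role = tag[2:] if tag[:2] in ("B-", "I-") else None
        cleaned.append(tag if role is None or role in kept_roles else "O")
    return cleaned


# Hypothetical target label set; not the project's actual role inventory.
KEPT_ROLES = {"V", "ARG0", "ARG1"}
example_tags = ["B-ARG0", "B-V", "B-ARG1", "I-ARG1", "B-ARGM-TMP", "I-ARGM-TMP"]
print(drop_roles(example_tags, KEPT_ROLES))
```

Filtering of this kind changes the gold labels themselves, which is one reason scores on the preprocessed data are not directly comparable with scores on the unmodified CoNLL-2012 annotations.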

Eval results

Model Name	F₁ CV PropBank.Br (in domain)	F₁ Buscapé (out of domain)
`srl-pt_bertimbau-base`	76.30	73.33
`srl-pt_bertimbau-large`	77.42	74.85
`srl-pt_xlmr-base`	75.22	72.82
`srl-pt_xlmr-large`	77.59	73.84
`srl-pt_mbert-base`	72.76	66.89
`srl-en_xlmr-base`	66.59	65.24
`srl-en_xlmr-large`	67.60	64.94
`srl-en_mbert-base`	63.07	58.56
`srl-enpt_xlmr-base`	76.50	73.74
`srl-enpt_xlmr-large`	78.22	74.55
`srl-enpt_mbert-base`	74.88	69.19
`ud_srl-pt_bertimbau-large`	77.53	74.49
`ud_srl-pt_xlmr-large`	77.69	74.91
`ud_srl-enpt_xlmr-large`	77.97	75.05
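The F₁ values above combine precision and recall over predicted role spans. The card does not specify the exact scorer, so the following is only a sketch of a span-level F₁ under the assumption that a prediction counts as correct when its role and both boundaries match a gold span. The type alias `Span` and the function `span_f1` are hypothetical names.

```python
from typing import Set, Tuple

# (role, index of first word, index of last word)
Span = Tuple[str, int, int]


def span_f1(gold: Set[Span], predicted: Set[Span]) -> float:
    """Exact-match span F1, reported on a 0-100 scale."""
    correct = len(gold & predicted)
    if correct == 0:
        return 0.0
    precision = correct / len(predicted)
    recall = correct / len(gold)
    return 100 * 2 * precision * recall / (precision + recall)


# Hand-written example: the ARG1 boundary is wrong, so 2 of 3 spans match.
gold = {("ARG0", 0, 0), ("V", 1, 1), ("ARG1", 2, 4)}
predicted = {("ARG0", 0, 0), ("V", 1, 1), ("ARG1", 3, 4)}
print(round(span_f1(gold, predicted), 2))  # 66.67
```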

BibTeX entry and citation info

@misc{oliveira2021transformers,
      title={Transformers and Transfer Learning for Improving Portuguese Semantic Role Labeling}, 
      author={Sofia Oliveira and Daniel Loureiro and Alípio Jorge},
      year={2021},
      eprint={2101.01213},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}

🔧 Technical Details

The "type_vocab_size" in this model was changed from 1 to 2, which makes it difficult to convert the model to TensorFlow.
The models were trained for 5 epochs on the CoNLL-2012 data, preprocessed to match the Portuguese PropBank.Br data.

📄 License

This project is licensed under the Apache-2.0 license.
