🚀 mBERT fine-tuned for English semantic role labeling
This project fine-tunes mBERT for English semantic role labeling, offering multiple related models and detailed evaluation results.
🚀 Quick Start
Basic Usage
```python
from transformers import AutoTokenizer, AutoModel

# Load the tokenizer and the transformer (encoder) portion of the SRL model
tokenizer = AutoTokenizer.from_pretrained("liaad/srl-en_mbert-base")
model = AutoModel.from_pretrained("liaad/srl-en_mbert-base")
```
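As a quick sanity check, the encoder can be run on an arbitrary sentence to obtain contextual token embeddings (a minimal sketch using the standard transformers API; the example sentence is made up):

```python
import torch

# Tokenize an example sentence and run it through the encoder
inputs = tokenizer("The cat chased the mouse across the yard.", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# One contextual embedding per sub-word token:
# shape (batch_size, sequence_length, hidden_size)
print(outputs.last_hidden_state.shape)
```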
To use the full SRL model (transformers portion + a decoding layer), refer to the project's GitHub repository.
✨ Features
- Multilingual Adaptation: Based on `bert-base-multilingual-cased`, it can handle multiple languages.
- Semantic Role Labeling: Specifically fine-tuned for English semantic role labeling tasks.
- Multiple Model Variants: The project provides related models for different language combinations (Portuguese, English, and English+Portuguese) and task setups; see the evaluation table below.
📦 Installation
Only the Hugging Face `transformers` library (with a PyTorch or TensorFlow backend) is needed to load the model; it can be installed with `pip install transformers`.
💻 Usage Examples
Basic Usage
```python
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("liaad/srl-en_mbert-base")
model = AutoModel.from_pretrained("liaad/srl-en_mbert-base")
```
Advanced Usage
To use the full SRL model (transformers portion + a decoding layer), refer to the project's GitHub repository.
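The decoding layer is distributed through the project's repository rather than bundled with the Hub checkpoint. Purely as an illustration (this is not the project's actual decoder; the `SRLHead` class and `num_labels` value below are hypothetical), a per-token classifier could be stacked on the encoder like this:

```python
import torch
import torch.nn as nn

class SRLHead(nn.Module):
    """Hypothetical per-token classification head on top of the mBERT encoder."""
    def __init__(self, encoder, num_labels):
        super().__init__()
        self.encoder = encoder
        self.classifier = nn.Linear(encoder.config.hidden_size, num_labels)

    def forward(self, **inputs):
        hidden = self.encoder(**inputs).last_hidden_state  # (batch, seq_len, hidden)
        return self.classifier(hidden)                      # (batch, seq_len, num_labels)

# num_labels is a placeholder; the real SRL label inventory comes from the project's data
srl_model = SRLHead(model, num_labels=64)
inputs = tokenizer("The cat chased the mouse.", return_tensors="pt")
with torch.no_grad():
    logits = srl_model(**inputs)
```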
📚 Documentation
Model description
This model is `bert-base-multilingual-cased` fine-tuned on English CoNLL-formatted OntoNotes v5.0 semantic role labeling data. It is part of a project that produced a family of related SRL models, listed in the evaluation table below.
For more information, please see the accompanying article (see the BibTeX entry and citation info below) and the project's GitHub repository.
Intended uses & limitations
How to use
To use the transformers portion of this model:
```python
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("liaad/srl-en_mbert-base")
model = AutoModel.from_pretrained("liaad/srl-en_mbert-base")
```
To use the full SRL model (transformers portion + a decoding layer), refer to the project's GitHub repository.
Limitations and bias
- The models were trained only for 5 epochs.
- The English data was preprocessed to match the Portuguese data, so there are some differences in role attributions and some roles were removed from the data.
Training procedure
The model was trained on the CoNLL-2012 dataset, preprocessed to match the Portuguese PropBank.Br data. The models were tested on the PropBank.Br dataset as well as on a smaller opinion dataset, "Buscapé". For more information, please see the accompanying article (see the BibTeX entry and citation info below) and the project's GitHub repository.
Eval results
| Model Name | F1 CV PropBank.Br (in domain) | F1 Buscapé (out of domain) |
| --- | --- | --- |
| `srl-pt_bertimbau-base` | 76.30 | 73.33 |
| `srl-pt_bertimbau-large` | 77.42 | 74.85 |
| `srl-pt_xlmr-base` | 75.22 | 72.82 |
| `srl-pt_xlmr-large` | 77.59 | 73.84 |
| `srl-pt_mbert-base` | 72.76 | 66.89 |
| `srl-en_xlmr-base` | 66.59 | 65.24 |
| `srl-en_xlmr-large` | 67.60 | 64.94 |
| `srl-en_mbert-base` | 63.07 | 58.56 |
| `srl-enpt_xlmr-base` | 76.50 | 73.74 |
| `srl-enpt_xlmr-large` | 78.22 | 74.55 |
| `srl-enpt_mbert-base` | 74.88 | 69.19 |
| `ud_srl-pt_bertimbau-large` | 77.53 | 74.49 |
| `ud_srl-pt_xlmr-large` | 77.69 | 74.91 |
| `ud_srl-enpt_xlmr-large` | 77.97 | 75.05 |
BibTeX entry and citation info
```bibtex
@misc{oliveira2021transformers,
  title={Transformers and Transfer Learning for Improving Portuguese Semantic Role Labeling},
  author={Sofia Oliveira and Daniel Loureiro and Alípio Jorge},
  year={2021},
  eprint={2101.01213},
  archivePrefix={arXiv},
  primaryClass={cs.CL}
}
```
🔧 Technical Details
The model is based on the `bert-base-multilingual-cased` architecture and fine-tuned on the CoNLL-2012 dataset for English semantic role labeling. The data was preprocessed to match the Portuguese PropBank.Br data, and training ran for 5 epochs.
📄 License
This project is released under the Apache 2.0 license (`apache-2.0`).