Efficient MLM m0.15 801010
Developed by princeton-nlp
A RoBERTa-style model with pre-layer normalization, built to study the impact of the masking ratio in masked language modeling
Downloads: 114
Release Time: 4/22/2022
Model Overview
This model is a pre-trained language model based on the RoBERTa architecture, used primarily to investigate how the proportion of masked tokens affects performance in masked language modeling. It employs a pre-layer normalization architecture that is not natively supported by the HuggingFace Transformers library.
Model Features
Pre-layer Normalization
Adopts a pre-layer normalization architecture that is not natively supported by the official HuggingFace Transformers library, potentially improving training stability
Masking Ratio Research
Specifically investigates whether the 15% masking ratio in masked language modeling is optimal
HuggingFace Compatibility
Despite the non-standard architecture, the model remains usable within the HuggingFace ecosystem through custom model code; see the loading sketch below
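A minimal loading sketch under stated assumptions: the repository id is inferred from the title above, and the snippet presumes the checkpoint ships custom model code that can be pulled in with trust_remote_code=True. Newer versions of transformers also include RobertaPreLayerNorm classes for converted checkpoints of this model family; check the upstream model card for the officially supported loading path.

```python
from transformers import AutoTokenizer, AutoModelForMaskedLM

# Assumption: the repository provides custom code for its pre-layer-norm
# architecture, so trust_remote_code=True is used to load it.
model_name = "princeton-nlp/efficient_mlm_m0.15-801010"  # inferred from the title
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForMaskedLM.from_pretrained(model_name, trust_remote_code=True)
```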
Model Capabilities
Masked language modeling
Text representation learning
Sequence classification
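To illustrate the masked language modeling capability, here is a short fill-mask sketch. It assumes the model works with the standard pipeline API (plus the custom-code assumption noted above); the printed keys follow the usual fill-mask output format.

```python
from transformers import pipeline

# Sketch: predict the token behind <mask> (RoBERTa-style mask token).
unmasker = pipeline(
    "fill-mask",
    model="princeton-nlp/efficient_mlm_m0.15-801010",
    trust_remote_code=True,  # assumption: custom pre-layer-norm code is needed
)
for prediction in unmasker("The capital of France is <mask>."):
    print(prediction["token_str"], round(prediction["score"], 3))
```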
Use Cases
Natural Language Processing Research
Masking Ratio Optimization Research
Used to study the impact of different masking ratios on the performance of pre-trained language models
Text Understanding
Text Classification
Can be fine-tuned for various text classification tasks, as sketched below
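A minimal fine-tuning sketch with the Trainer API, assuming the same loading path as above; the SST-2 dataset, label count, and hyperparameters are placeholders chosen purely for illustration.

```python
from datasets import load_dataset
from transformers import (
    AutoTokenizer,
    AutoModelForSequenceClassification,
    Trainer,
    TrainingArguments,
)

model_name = "princeton-nlp/efficient_mlm_m0.15-801010"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(
    model_name, num_labels=2, trust_remote_code=True  # binary task as a placeholder
)

# Placeholder data: SST-2 from GLUE, used only to illustrate the workflow.
dataset = load_dataset("glue", "sst2")

def tokenize(batch):
    return tokenizer(batch["sentence"], truncation=True, padding="max_length", max_length=128)

dataset = dataset.map(tokenize, batched=True)

args = TrainingArguments(
    output_dir="efficient-mlm-sst2",
    per_device_train_batch_size=32,
    num_train_epochs=3,
    learning_rate=2e-5,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=dataset["train"],
    eval_dataset=dataset["validation"],
)
trainer.train()
```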