Efficient MLM m0.15 801010
Developed by princeton-nlp
A RoBERTa-style model with pre-layer normalization, built to study the impact of the masking ratio in masked language modeling
Downloads: 114
Release Time: 4/22/2022
Model Overview
This model is a pre-trained language model based on the RoBERTa architecture, used primarily to investigate how the proportion of masked tokens affects performance in masked language modeling. It employs a pre-layer normalization architecture that is not natively supported by the HuggingFace Transformers library.
Model Features
Pre-layer Normalization
Adopts a pre-layer normalization architecture that is not natively supported by the official HuggingFace Transformers library, potentially improving training stability
Masking Ratio Research
Specifically investigates whether the 15% masking ratio in masked language modeling is optimal
HuggingFace Compatibility
Despite the non-standard architecture, the model remains usable within the HuggingFace ecosystem through custom model code; see the loading sketch below
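A minimal loading sketch under stated assumptions: the repository id is inferred from the title above, and the snippet presumes the checkpoint ships custom model code that can be pulled in with trust_remote_code=True. Newer versions of transformers also include RobertaPreLayerNorm classes for converted checkpoints of this model family; check the upstream model card for the officially supported loading path.

```python
from transformers import AutoTokenizer, AutoModelForMaskedLM

# Assumption: the repository provides custom code for its pre-layer-norm
# architecture, so trust_remote_code=True is used to load it.
model_name = "princeton-nlp/efficient_mlm_m0.15-801010"  # inferred from the title
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForMaskedLM.from_pretrained(model_name, trust_remote_code=True)
```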
Model Capabilities
Masked language modeling
Text representation learning
Sequence classification
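To illustrate the masked language modeling capability, here is a short fill-mask sketch. It assumes the model works with the standard pipeline API (plus the custom-code assumption noted above); the printed keys follow the usual fill-mask output format.

```python
from transformers import pipeline

# Sketch: predict the token behind <mask> (RoBERTa-style mask token).
unmasker = pipeline(
    "fill-mask",
    model="princeton-nlp/efficient_mlm_m0.15-801010",
    trust_remote_code=True,  # assumption: custom pre-layer-norm code is needed
)
for prediction in unmasker("The capital of France is <mask>."):
    print(prediction["token_str"], round(prediction["score"], 3))
```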
Use Cases
Natural Language Processing Research
Masking Ratio Optimization Research
Used to study the impact of different masking ratios on the performance of pre-trained language models
Text Understanding
Text Classification
Can be fine-tuned for various text classification tasks, as sketched below
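A minimal fine-tuning sketch with the Trainer API, assuming the same loading path as above; the SST-2 dataset, label count, and hyperparameters are placeholders chosen purely for illustration.

```python
from datasets import load_dataset
from transformers import (
    AutoTokenizer,
    AutoModelForSequenceClassification,
    Trainer,
    TrainingArguments,
)

model_name = "princeton-nlp/efficient_mlm_m0.15-801010"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(
    model_name, num_labels=2, trust_remote_code=True  # binary task as a placeholder
)

# Placeholder data: SST-2 from GLUE, used only to illustrate the workflow.
dataset = load_dataset("glue", "sst2")

def tokenize(batch):
    return tokenizer(batch["sentence"], truncation=True, padding="max_length", max_length=128)

dataset = dataset.map(tokenize, batched=True)

args = TrainingArguments(
    output_dir="efficient-mlm-sst2",
    per_device_train_batch_size=32,
    num_train_epochs=3,
    learning_rate=2e-5,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=dataset["train"],
    eval_dataset=dataset["validation"],
)
trainer.train()
```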