ELECTRA Large Generator
ELECTRA is a self-supervised language representation learning method that replaces generative masked-language-model pretraining with a discriminative objective (replaced token detection), markedly reducing the compute needed for pretraining.
Downloads: 473
Release date: 3/2/2022
Model Overview
ELECTRA pretrains Transformer encoders with a discriminator that learns to distinguish original tokens from plausible replacements produced by a small generator network, and models pretrained this way achieve strong results on benchmarks such as GLUE and SQuAD. This checkpoint is the generator component of ELECTRA-Large, a masked language model used to produce those replacements.
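The replaced-token-detection setup can be sketched with the Hugging Face transformers library. This is a minimal illustration, assuming the Hub checkpoints google/electra-large-generator (this model) and google/electra-large-discriminator; the actual pretraining procedure samples from the generator and trains both models jointly.

```python
# Minimal sketch of ELECTRA's replaced-token-detection objective
# (assumes the transformers library and the two Hub checkpoints named below).
import torch
from transformers import AutoTokenizer, ElectraForMaskedLM, ElectraForPreTraining

tokenizer = AutoTokenizer.from_pretrained("google/electra-large-generator")
generator = ElectraForMaskedLM.from_pretrained("google/electra-large-generator")
discriminator = ElectraForPreTraining.from_pretrained("google/electra-large-discriminator")

# 1) Mask a token and let the generator propose a replacement.
inputs = tokenizer("The quick brown [MASK] jumps over the lazy dog.", return_tensors="pt")
mask_pos = (inputs.input_ids == tokenizer.mask_token_id).nonzero(as_tuple=True)
with torch.no_grad():
    gen_logits = generator(**inputs).logits
proposed = gen_logits[mask_pos].argmax(dim=-1)  # greedy pick; pretraining samples instead

# 2) Build the corrupted sequence and let the discriminator flag replaced tokens.
corrupted = inputs.input_ids.clone()
corrupted[mask_pos] = proposed
with torch.no_grad():
    disc_logits = discriminator(input_ids=corrupted, attention_mask=inputs.attention_mask).logits
is_replaced = torch.sigmoid(disc_logits) > 0.5  # per-token "was this token replaced?" decision
print(list(zip(tokenizer.convert_ids_to_tokens(corrupted[0].tolist()), is_replaced[0].tolist())))
```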
Model Features
Efficient Pretraining
Reaches comparable or better quality than conventional masked-language-model pretraining while using roughly a quarter of the compute
Discriminative Learning
Uses a GAN-style generator-discriminator setup (trained with maximum likelihood rather than adversarially) in which the discriminator learns to distinguish original tokens from replaced ones
Multi-scale Adaptation
Available at multiple parameter scales: Small, Base, and Large
Model Capabilities
Text Encoding
Language Understanding
Mask Prediction (see the fill-mask sketch after this list)
Downstream Task Fine-tuning
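As a quick illustration of the Mask Prediction capability, the generator checkpoint can be used with the transformers fill-mask pipeline. A usage sketch, assuming the model id google/electra-large-generator:

```python
# Fill-mask usage sketch for the generator checkpoint.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="google/electra-large-generator")
for candidate in fill_mask("Paris is the capital of [MASK]."):
    print(candidate["token_str"], round(candidate["score"], 3))
```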
Use Cases
Natural Language Understanding
GLUE Benchmark
Delivers outstanding performance on the General Language Understanding Evaluation benchmark
Outperforms BERT models of the same parameter scale
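For GLUE-style tasks, a classification head is attached to the pretrained encoder and fine-tuned. The ELECTRA paper fine-tunes the discriminator rather than the generator, so the sketch below assumes the google/electra-large-discriminator checkpoint; the sentence pair and label are illustrative.

```python
# Sketch of a GLUE-style (sentence-pair classification) fine-tuning step.
import torch
from transformers import AutoTokenizer, ElectraForSequenceClassification

model_id = "google/electra-large-discriminator"  # assumed; the paper fine-tunes the discriminator
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = ElectraForSequenceClassification.from_pretrained(model_id, num_labels=2)

batch = tokenizer("The company said profits rose.",
                  "Profits increased, the company reported.",
                  return_tensors="pt")
loss = model(**batch, labels=torch.tensor([1])).loss  # hypothetical "paraphrase" label
loss.backward()  # a real run would loop over a GLUE dataset with an optimizer
```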
Question Answering
Applied to the SQuAD question answering dataset
Achieved state-of-the-art (SOTA) on SQuAD 2.0 at the time
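SQuAD-style extractive QA adds a span-prediction head on top of the encoder. A minimal sketch, again assuming the discriminator checkpoint; the answer-span positions below are placeholders.

```python
# Sketch of SQuAD-style fine-tuning with a span-prediction (QA) head.
import torch
from transformers import AutoTokenizer, ElectraForQuestionAnswering

model_id = "google/electra-large-discriminator"  # assumed checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = ElectraForQuestionAnswering.from_pretrained(model_id)

batch = tokenizer("Where is the Eiffel Tower?",
                  "The Eiffel Tower is located in Paris.",
                  return_tensors="pt")
outputs = model(**batch,
                start_positions=torch.tensor([12]),  # placeholder answer-span start
                end_positions=torch.tensor([12]))    # placeholder answer-span end
outputs.loss.backward()  # fine-tuning minimizes the start/end span losses
```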
Text Processing
Sequence Labeling
Supports sequence labeling tasks such as text chunking
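For sequence labeling such as chunking, a per-token classification head is used. A minimal sketch with placeholder tags; the checkpoint id and label count are assumptions.

```python
# Sketch of sequence labeling (e.g., chunking) with a token-classification head.
import torch
from transformers import AutoTokenizer, ElectraForTokenClassification

model_id = "google/electra-large-discriminator"  # assumed checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = ElectraForTokenClassification.from_pretrained(model_id, num_labels=5)

batch = tokenizer("ELECTRA also handles chunking", return_tensors="pt")
labels = torch.zeros_like(batch.input_ids)  # placeholder tag ids, one per token
loss = model(**batch, labels=labels).loss
loss.backward()  # per-token classification loss drives the labeling task
```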