RoBERTa Base 100M 3
RoBERTa variants pre-trained on datasets ranging from 1M to 1B tokens, available in BASE and MED-SMALL model sizes, suited to natural language processing tasks in resource-constrained scenarios
Downloads: 18
Release Time: 3/2/2022
Model Overview
RoBERTa models pre-trained on corpora of varying scales (1M/10M/100M/1B tokens), optimized for small-data settings by adjusting model size and training hyperparameters
Model Features
Small-data optimization
Pre-training is tuned for small corpora (1M-1B tokens), making these checkpoints better suited to data-constrained scenarios than the original RoBERTa
Two model sizes
Available in two parameter scales, BASE (125M) and MED-SMALL (45M), balancing performance and efficiency
Rigorous validation
For each data scale, the three pre-training runs with the lowest validation perplexity are released to ensure checkpoint quality
Model Capabilities
Text representation learning
Downstream task fine-tuning
Masked word prediction (see the sketch below)
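A minimal sketch of masked word prediction with this checkpoint, using the Hugging Face transformers fill-mask pipeline. It assumes the model is hosted on the Hub under an ID like nyu-mll/roberta-base-100M-3; substitute the actual ID if it differs.

```python
# Masked word prediction sketch.
# Assumption: the checkpoint is published on the Hugging Face Hub as
# "nyu-mll/roberta-base-100M-3"; replace with the actual model ID if different.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="nyu-mll/roberta-base-100M-3")

# RoBERTa tokenizers use "<mask>" as the mask token.
for prediction in fill_mask("The capital of France is <mask>."):
    print(f"{prediction['token_str']:>12}  score={prediction['score']:.3f}")
```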
Use Cases
Education
Small-scale data fine-tuning
Serves as a pre-training base for educational text classification tasks with limited annotated data (see the fine-tuning sketch after this list)
Research
Pre-training strategy research
Investigates the impact of different data scales on pre-trained model performance
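A minimal fine-tuning sketch for the small-data text classification use case above, again assuming the Hub ID nyu-mll/roberta-base-100M-3. The tiny in-memory dataset is a hypothetical stand-in for a real annotated corpus; the label set, texts, and training settings are illustrative only.

```python
# Fine-tuning sketch for text classification on a small labeled dataset.
# Assumptions: Hub ID "nyu-mll/roberta-base-100M-3"; the toy dataset below
# stands in for real annotated data.
from datasets import Dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

model_id = "nyu-mll/roberta-base-100M-3"  # assumed Hub ID
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id, num_labels=2)

# Hypothetical stand-in for a small annotated educational-text corpus.
data = Dataset.from_dict({
    "text": [
        "Photosynthesis converts light into chemical energy.",
        "The homework is due on Friday.",
    ],
    "label": [1, 0],
})

def tokenize(batch):
    # Pad to a fixed length so the default collator can batch examples directly.
    return tokenizer(batch["text"], truncation=True, padding="max_length", max_length=128)

data = data.map(tokenize, batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="out",
        num_train_epochs=3,
        per_device_train_batch_size=8,
    ),
    train_dataset=data,
)
trainer.train()
```

With only a few labeled examples per class, shorter training runs and the smaller MED-SMALL variant may reduce overfitting; treat the hyperparameters above as a starting point rather than a recommendation.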