
RoBERTa Med Small 1M 1

Developed by nyu-mll
A RoBERTa model pretrained on a small-scale dataset of 1M tokens, using the MED-SMALL architecture, suitable for text understanding tasks.
Downloads: 23
Release date: 3/2/2022

Model Overview

This model is a small-scale pretrained language model based on the RoBERTa architecture, focused on language representation learning from limited data.
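As a minimal sketch, the checkpoint can be loaded with the Hugging Face transformers library like any other RoBERTa masked language model. The Hub identifier "nyu-mll/roberta-med-small-1M-1" used below is an assumption based on the developer and model name above.

```python
# Minimal sketch: load the checkpoint and run a fill-mask sanity check.
# The model identifier is an assumption, not confirmed by this card.
from transformers import AutoTokenizer, AutoModelForMaskedLM, pipeline

model_name = "nyu-mll/roberta-med-small-1M-1"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForMaskedLM.from_pretrained(model_name)

# RoBERTa uses "<mask>" as its mask token.
fill = pipeline("fill-mask", model=model, tokenizer=tokenizer)
for candidate in fill("The capital of France is <mask>."):
    print(candidate["token_str"], round(candidate["score"], 4))
```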

Model Features

Small-scale data pretraining
Specifically designed for effective pretraining on small-scale datasets ranging from 1M to 1B tokens.
Multiple scale options
Provides model versions with different training scales from 1M to 1B tokens.
Optimized architecture
MED-SMALL architecture (6 layers, 512 hidden dimensions) adjusted for small-scale data.
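The MED-SMALL shape can be expressed as a RobertaConfig. In the sketch below, only the layer count and hidden size come from the feature list above; the attention-head count and feed-forward size are illustrative assumptions, not documented values.

```python
# Sketch of the MED-SMALL shape as a RobertaConfig (hedged reconstruction).
from transformers import RobertaConfig, RobertaModel

config = RobertaConfig(
    num_hidden_layers=6,     # "6 layers" from the feature list
    hidden_size=512,         # "512 hidden dimensions" from the feature list
    num_attention_heads=8,   # assumption: 8 heads of 64 dimensions each
    intermediate_size=2048,  # assumption: 4x hidden_size feed-forward
)
model = RobertaModel(config)  # randomly initialized, architecture only
print(sum(p.numel() for p in model.parameters()), "parameters")
```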

Model Capabilities

Text representation learning
Context understanding
Language modeling
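For the representation-learning capability, a short sketch of extracting contextual token embeddings follows, again assuming the nyu-mll/roberta-med-small-1M-1 identifier.

```python
# Sketch: use the encoder's last hidden states as contextual embeddings.
import torch
from transformers import AutoTokenizer, AutoModel

model_name = "nyu-mll/roberta-med-small-1M-1"  # assumed Hub identifier
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name)

inputs = tokenizer("Small models can still learn useful representations.",
                   return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Shape is (batch, sequence_length, 512) for the MED-SMALL hidden size.
print(outputs.last_hidden_state.shape)
```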

Use Cases

Educational research
Small-scale data language model research
Used to study the performance of language models under limited data conditions.
Validation perplexity: 134.18-153.38 (see the evaluation sketch after this list)
Resource-constrained environments
Low-resource NLP applications
Suitable for environments with limited computational resources or training data.
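Since RoBERTa is a masked language model, one common way to probe it with a perplexity-style number is masked-LM pseudo-perplexity: mask each token in turn and score the original token. The sketch below illustrates that idea; it is not the exact evaluation protocol behind the 134.18-153.38 figure above, and the model identifier is again an assumption.

```python
# Hedged sketch: masked-LM pseudo-perplexity for a single sentence.
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

model_name = "nyu-mll/roberta-med-small-1M-1"  # assumed Hub identifier
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForMaskedLM.from_pretrained(model_name).eval()

def pseudo_perplexity(text: str) -> float:
    enc = tokenizer(text, return_tensors="pt")
    input_ids = enc["input_ids"][0]
    nlls = []
    # Mask each non-special token in turn and score the original token.
    for i in range(1, input_ids.size(0) - 1):
        masked = input_ids.clone()
        masked[i] = tokenizer.mask_token_id
        with torch.no_grad():
            logits = model(masked.unsqueeze(0)).logits
        log_probs = torch.log_softmax(logits[0, i], dim=-1)
        nlls.append(-log_probs[input_ids[i]].item())
    # Exponentiated mean negative log-likelihood over masked positions.
    return float(torch.exp(torch.tensor(nlls).mean()))

print(pseudo_perplexity("Language models trained on 1M tokens are small."))
```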