BabyBERTa-1
A lightweight RoBERTa variant trained on 5 million words of American-English child-directed speech, designed specifically for language acquisition research
Downloads: 295
Release Time: 3/2/2022
Model Overview
BabyBERTa is a lightweight RoBERTa variant designed for language acquisition research; it is small enough to train and run on a single desktop GPU
Model Features
Lightweight Design
The model is lightweight and can run on a single GPU-equipped desktop computer without requiring high-performance computing infrastructure
Optimized for Child Language Acquisition
Trained on child-directed speech and tailored to research on child language acquisition
Special Training Strategy
During training, the model never predicts unmasked tokens (unmask_prob is set to 0): unlike standard BERT/RoBERTa, every token it must predict has been masked or corrupted, a choice aimed at grammar learning (see the sketch below)
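For illustration, here is a minimal sketch of RoBERTa-style dynamic masking with an unmask_prob knob. The 15% selection rate and the 90/10 split between [MASK] and random-token replacement are illustrative defaults, not confirmed BabyBERTa hyperparameters; setting unmask_prob=0.0 reproduces the behavior described above, where a selected token is never left in place.

```python
import torch

def mask_for_mlm(input_ids: torch.Tensor,
                 mask_token_id: int,
                 vocab_size: int,
                 mlm_prob: float = 0.15,
                 unmask_prob: float = 0.0):
    """RoBERTa-style dynamic masking. With unmask_prob=0.0 (BabyBERTa's
    setting) a token selected for prediction is never left unchanged,
    so the model only ever predicts masked or corrupted positions."""
    labels = input_ids.clone()
    selected = torch.bernoulli(torch.full(input_ids.shape, mlm_prob)).bool()
    labels[~selected] = -100  # loss is computed only on selected positions

    # Fraction of selected tokens left in place: 10% in BERT/RoBERTa, 0% here.
    keep = selected & torch.bernoulli(torch.full(input_ids.shape, unmask_prob)).bool()
    corrupt = selected & ~keep

    # Of the corrupted positions, most become [MASK], the rest a random token
    # (the 90/10 split here is an illustrative default, not a known setting).
    masked = corrupt & torch.bernoulli(torch.full(input_ids.shape, 0.9)).bool()
    corrupted_ids = input_ids.clone()
    corrupted_ids[masked] = mask_token_id
    rand_pos = corrupt & ~masked
    corrupted_ids[rand_pos] = torch.randint(vocab_size, (int(rand_pos.sum()),))
    return corrupted_ids, labels
```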
Model Capabilities
Language Modeling
Grammar Analysis
Masked Language Modeling
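As a quick illustration of the masked-language-modeling capability, the checkpoint can be queried with the transformers fill-mask pipeline. The hub id phueb/BabyBERTa-1 and the example sentence are assumptions to verify against the actual model card:

```python
from transformers import pipeline

# "phueb/BabyBERTa-1" is the assumed Hugging Face Hub id; verify before use.
fill = pipeline("fill-mask", model="phueb/BabyBERTa-1")

# BabyBERTa was trained on child-directed speech, so simple sentences fit best.
sentence = f"the little boy {fill.tokenizer.mask_token} with the ball ."
for candidate in fill(sentence):
    print(f"{candidate['token_str']!r}  p={candidate['score']:.3f}")
```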
Use Cases
Language Research
Child Language Acquisition Research
Used to study the mechanisms of grammar learning in child language acquisition
Achieves an overall accuracy of 80.3% on the Zorro test suite
Grammar Competence Evaluation
Evaluates the model's grammatical knowledge
Performance on the Zorro test suite is comparable to RoBERTa-base
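Zorro-style evaluation presents the model with minimal pairs and checks whether it assigns higher probability to the grammatical sentence. Below is a hedged sketch of that procedure using a pseudo-log-likelihood score (masking one token at a time); it is not the official Zorro script, and the hub id, the add_prefix_space tokenizer flag, and the agreement pair are illustrative assumptions:

```python
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

MODEL = "phueb/BabyBERTa-1"  # assumed hub id; verify before use
tok = AutoTokenizer.from_pretrained(MODEL, add_prefix_space=True)
model = AutoModelForMaskedLM.from_pretrained(MODEL).eval()

@torch.no_grad()
def pseudo_log_likelihood(sentence: str) -> float:
    """Sum log-probabilities of each token with that position masked out."""
    ids = tok(sentence, return_tensors="pt")["input_ids"][0]
    total = 0.0
    for i in range(1, len(ids) - 1):  # skip BOS/EOS special tokens
        masked = ids.clone()
        masked[i] = tok.mask_token_id
        logits = model(masked.unsqueeze(0)).logits[0, i]
        total += torch.log_softmax(logits, dim=-1)[ids[i]].item()
    return total

# Illustrative subject-verb agreement minimal pair (not from the Zorro suite).
good = "the dogs on the mat are sleeping ."
bad = "the dogs on the mat is sleeping ."
print(pseudo_log_likelihood(good) > pseudo_log_likelihood(bad))  # expect True
```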