Roberta Base Turkish Uncased
Developed by TURKCELL
This is a RoBERTa base model for Turkish. The pre-training data is sourced from Turkish Wikipedia, the Turkish OSCAR corpus, and several news websites.
Downloads: 109
Release date: 12/7/2023
Model Overview
This model is a case-insensitive (uncased) RoBERTa model for Turkish, mainly used for Turkish text understanding and generation tasks.
Model Features
Large-scale pre-training data
Trained on 38 GB of Turkish text containing 329,720,508 sentences.
High-performance hardware training
Trained on Intel Xeon Gold processors and NVIDIA Tesla V100 GPUs.
Turkish optimization
Specifically optimized for Turkish, with pre-training data drawn from Turkish Wikipedia and Turkish news sources.
Model Capabilities
Turkish text understanding
Masked language modeling
Text fill-in tasks
Use Cases
Natural language processing
Text fill-in
Predict the masked words in a sentence
For example, it can predict the masked word in 'iki ülke arasında <mask> başladı'.
Text generation
Generate coherent Turkish text based on the context
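The fill-in use case above can be sketched with the Hugging Face `fill-mask` pipeline. The repository id below is assumed from the model name and may need adjusting to the actual Hub path:

```python
from transformers import pipeline

# Repo id assumed from the model name; verify the actual Hub path before use.
fill = pipeline("fill-mask", model="TURKCELL/roberta-base-turkish-uncased")

# Predict candidates for the masked word in the example sentence.
preds = fill("iki ülke arasında <mask> başladı")
for p in preds:
    # Each prediction carries the filled token and its probability.
    print(p["token_str"], round(p["score"], 3))
```

Each returned prediction is a dict with the candidate token, its score, and the completed sentence, ranked by probability.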