
SlovakBERT

Developed by gerulata
A case-sensitive model for the Slovak language, pretrained with the masked language modeling (MLM) objective.
Downloads 5,009
Release date: 3/2/2022

Model Overview

SlovakBERT is a model pretrained on Slovak-language text with the masked language modeling objective. It can be used directly for mask filling and feature extraction, or fine-tuned for downstream tasks.

Model Features

Case-sensitive
The model distinguishes uppercase from lowercase, so, e.g., 'slovensko' and 'Slovensko' are treated as different words.
Large-scale pretraining data
The model was pretrained on multiple high-quality datasets (e.g., Wikipedia, Open Subtitles, OSCAR), totaling 19.35 GB of text.
Optimized text processing
The training data was cleaned before pretraining: URLs and email addresses were replaced with placeholder tokens, repeated punctuation was reduced, and Markdown syntax was removed.
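A minimal sketch of this kind of cleanup, assuming hypothetical placeholder tokens and regex rules (the exact preprocessing used for SlovakBERT is not specified here):

```python
import re

def clean_text(text: str) -> str:
    """Illustrative text cleanup; tokens <url>/<email> and the rules are assumptions."""
    text = re.sub(r"https?://\S+", "<url>", text)             # replace URLs
    text = re.sub(r"[\w.+-]+@[\w-]+\.[\w.]+", "<email>", text)  # replace email addresses
    text = re.sub(r"([!?.,])\1+", r"\1", text)                # collapse repeated punctuation
    text = re.sub(r"[*_`#]+", "", text)                       # strip basic Markdown markers
    return text

print(clean_text("**Pozri** https://example.com alebo píš na info@example.sk!!!"))
# → Pozri <url> alebo píš na <email>!
```

The order of the steps matters: URLs are replaced before punctuation is collapsed, so that punctuation inside a URL is never altered.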

Model Capabilities

Masked language modeling
Text feature extraction
Downstream task fine-tuning
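The feature-extraction capability above can be sketched with the Hugging Face transformers API. The model id "gerulata/slovakbert" is assumed from the developer name, and mean pooling over the last hidden states is one common (not the only) way to obtain a sentence vector:

```python
import torch
from transformers import AutoTokenizer, AutoModel

# Assumed Hub model id, inferred from the "gerulata" developer name above.
tok = AutoTokenizer.from_pretrained("gerulata/slovakbert")
model = AutoModel.from_pretrained("gerulata/slovakbert")

inputs = tok("Slovensko je krásna krajina.", return_tensors="pt")
with torch.no_grad():
    out = model(**inputs)

# Mean-pool the token embeddings into a single sentence vector.
sentence_vec = out.last_hidden_state.mean(dim=1)
print(sentence_vec.shape)  # e.g. torch.Size([1, 768]) for a base-sized model
```

For fine-tuning, the same checkpoint can be loaded into a task head (e.g. AutoModelForSequenceClassification) instead of the bare AutoModel.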

Use Cases

Natural language processing
Sentence completion
Use masked language modeling to fill in missing parts of a sentence.
For example, given the input 'Deti sa <mask> na ihrisku.', the model may predict words such as 'hrali'.
Historical event prediction
Predict key facts about historical events, such as years.
For example, given the input 'Slovenské národné povstanie sa uskutočnilo v roku <mask>.', the model may predict '1944'.
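Both mask-filling examples above can be reproduced with the transformers fill-mask pipeline; the model id "gerulata/slovakbert" is assumed from the developer name:

```python
from transformers import pipeline

# Assumed Hub model id, inferred from the "gerulata" developer name above.
fill = pipeline("fill-mask", model="gerulata/slovakbert")

# Each prediction is a dict with the candidate token and its probability.
for pred in fill("Slovenské národné povstanie sa uskutočnilo v roku <mask>.", top_k=3):
    print(pred["token_str"], round(pred["score"], 3))
```

Note that the mask token for this tokenizer is written literally as <mask> in the input string, as in the examples above.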