
SRoBERTa-F

Developed by Andrija
A RoBERTa model trained on 43GB of Croatian and Serbian text, supporting masked language modeling tasks.
Release Time: 3/2/2022

Model Overview

This is a RoBERTa model optimized for Croatian and Serbian, primarily used for natural language processing tasks, especially masked language modeling.
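To illustrate getting started, the model can be loaded with the Hugging Face transformers library. This is a minimal sketch; the Hub identifier "Andrija/SRoBERTa-F" is an assumption inferred from the developer and model names listed above.

```python
# Minimal loading sketch; "Andrija/SRoBERTa-F" is an assumed Hub identifier.
from transformers import AutoTokenizer, AutoModelForMaskedLM

model_id = "Andrija/SRoBERTa-F"  # assumed, inferred from the names above
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForMaskedLM.from_pretrained(model_id)
```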

Model Features

Multi-source Training Data
Integrates multiple high-quality datasets including Leipzig, OSCAR, srWac, hrWac, cc100-hr, and cc100-sr, totaling 43GB of text data.
Continuous Training Potential
Training metrics had not plateaued by the end of training, indicating the model could be improved with further training.
Bilingual Support
Specifically optimized for the Croatian and Serbian languages.

Model Capabilities

Text Understanding
Language Modeling
Contextual Prediction

Use Cases

Natural Language Processing
Text Completion
Predicts masked words in a sentence. Example: given 'Ovo je početak <mask>.' ('This is the beginning of <mask>.'), the model predicts the most likely words for the masked position (see the fill-mask sketch after this list).
Language Model Fine-tuning
Used as a base model for fine-tuning on downstream NLP tasks (see the fine-tuning sketch below).
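A minimal sketch of the text-completion use case, reusing the example sentence above and the assumed Hub identifier "Andrija/SRoBERTa-F":

```python
# Fill-mask sketch; "Andrija/SRoBERTa-F" is an assumed Hub identifier.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="Andrija/SRoBERTa-F")

# Predict the masked word in the example sentence from the use case above.
for prediction in fill_mask("Ovo je početak <mask>."):
    print(prediction["token_str"], prediction["score"])
```

Each prediction is a candidate token for the masked position together with its probability score.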
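For the fine-tuning use case, a common pattern is to load the pretrained encoder with a freshly initialized task-specific head. The sketch below assumes a hypothetical binary text-classification task and the same assumed model id:

```python
# Fine-tuning sketch; the model id and the 2-label task are assumptions.
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "Andrija/SRoBERTa-F"  # assumed Hub identifier
tokenizer = AutoTokenizer.from_pretrained(model_id)

# The masked-LM head is replaced by a new classification head,
# which is then trained on the downstream dataset.
model = AutoModelForSequenceClassification.from_pretrained(model_id, num_labels=2)
```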