# 🚀 Artiwise ModernBERT - Base Turkish Uncased
We present Artiwise ModernBERT for Turkish 🎉. It is a BERT model with a modernized architecture and an increased context size: older BERT models have a context size of 512, while ModernBERT supports 8192. This model is a Turkish adaptation of ModernBERT, fine-tuned from answerdotai/ModernBERT-base using only the Turkish part of CulturaX.
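Since the extended context window is the main architectural change, you can confirm it directly from the checkpoint's configuration. This is a small sketch we added for illustration; it assumes the checkpoint exposes the standard `max_position_embeddings` field:

```python
from transformers import AutoConfig

# Load only the config (no weights) to inspect the context window.
config = AutoConfig.from_pretrained("artiwise-ai/modernbert-base-tr-uncased")
print(config.max_position_embeddings)  # expected: 8192
```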

## 📦 Installation
Note: Torch version must be >= 2.6.0 and transformers version >= 4.50.0 for the model to function properly.
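A minimal install matching those version constraints (exact pinning is left to your environment):

```bash
pip install "torch>=2.6.0" "transformers>=4.50.0"
```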
Also, don't use the `do_lower_case=True` flag with the tokenizer. Instead, convert your text to lowercase as follows:

```python
text = text.replace("I", "ı").lower()
```

This is due to a known issue with the tokenizer.
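For context on why the extra `replace` is needed: Python's built-in `str.lower()` maps the capital `I` to `i`, which is wrong for Turkish, where `I` lowercases to the dotless `ı`. A small illustration (the example word is ours, not from the model card):

```python
# Default lowercasing maps "I" -> "i"; Turkish needs "I" -> "ı".
print("ISPARTA".lower())                    # "isparta"  (wrong for Turkish)
print("ISPARTA".replace("I", "ı").lower())  # "ısparta"  (correct)
```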
## 💻 Usage Examples

### Basic Usage
```python
from transformers import AutoTokenizer, AutoModelForMaskedLM
import torch

tokenizer = AutoTokenizer.from_pretrained("artiwise-ai/modernbert-base-tr-uncased")
model = AutoModelForMaskedLM.from_pretrained("artiwise-ai/modernbert-base-tr-uncased")

text = "Türkiye'nin başkenti [MASK]'dır."
# Turkish-aware lowercasing (see the note above). Note that `.lower()` also
# lowercases "[MASK]", so restore the tokenizer's mask token afterwards.
text = text.replace("I", "ı").lower().replace("[mask]", tokenizer.mask_token)

inputs = tokenizer(text, return_tensors="pt")
mask_token_index = torch.where(inputs["input_ids"] == tokenizer.mask_token_id)[1]

with torch.no_grad():
    outputs = model(**inputs)
    logits = outputs.logits

# Scores for the masked position(s)
mask_token_logits = logits[0, mask_token_index, :]
top_5_tokens = torch.topk(mask_token_logits, 5, dim=1).indices[0].tolist()

print(f"Original text: {text}")
print("Top 5 predictions for [MASK]:")
for token in top_5_tokens:
    print(f"- {tokenizer.decode([token])}")
```
## 📚 Documentation

### Stats

| Property | Details |
|----------|---------|
| Model Type | Artiwise ModernBERT - Base Turkish Uncased |
| Training Data | CulturaX 192GB (tr) |
| Base Model | answerdotai/ModernBERT-base |
### Benchmark
The benchmark results below demonstrate that Artiwise ModernBERT consistently outperforms existing Turkish BERT variants across multiple domains and masking levels, highlighting its superior generalization capabilities.
Dataset & Mask Level |
Artiwise Modern Bert |
ytu - ce - cosmos/turkish - base - bert - uncased |
dbmdz/bert - base - turkish - uncased |
QA Dataset (5% mask) |
74.50 |
60.84 |
48.57 |
QA Dataset (10% mask) |
72.18 |
58.75 |
46.29 |
QA Dataset (15% mask) |
69.46 |
56.50 |
44.30 |
Review Dataset (5% mask) |
62.67 |
48.57 |
35.38 |
Review Dataset (10% mask) |
59.60 |
45.77 |
33.04 |
Review Dataset (15% mask) |
56.51 |
43.05 |
31.05 |
Biomedical Dataset (5% mask) |
58.11 |
50.78 |
40.82 |
Biomedical Dataset (10% mask) |
55.55 |
48.37 |
38.51 |
Biomedical Dataset (15% mask) |
52.71 |
45.82 |
36.44 |
Our experiments used three datasets: the [Turkish Biomedical Corpus](https://huggingface.co/datasets/hazal/Turkish-Biomedical-corpus-trM), the Turkish Product Reviews dataset, and the general-domain QA corpus turkish_v2.
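The card does not spell out the evaluation procedure behind these numbers. As a point of reference only, here is a minimal sketch of how masked-token top-1 accuracy at a given mask ratio could be computed; the random-masking strategy and the metric itself are our assumptions, not a documented protocol:

```python
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("artiwise-ai/modernbert-base-tr-uncased")
model = AutoModelForMaskedLM.from_pretrained("artiwise-ai/modernbert-base-tr-uncased")

def masked_accuracy(sentences, mask_ratio=0.05):
    """Randomly mask `mask_ratio` of non-special tokens; report top-1 accuracy (%)."""
    correct, total = 0, 0
    for sentence in sentences:
        enc = tokenizer(sentence, return_tensors="pt")
        ids = enc["input_ids"].clone()
        # Candidate positions: every token except special tokens.
        special = torch.tensor(
            tokenizer.get_special_tokens_mask(
                ids[0].tolist(), already_has_special_tokens=True
            )
        ).bool()
        candidates = (~special).nonzero(as_tuple=True)[0]
        n_mask = max(1, int(len(candidates) * mask_ratio))
        picked = candidates[torch.randperm(len(candidates))[:n_mask]]
        labels = ids[0, picked].clone()
        ids[0, picked] = tokenizer.mask_token_id
        with torch.no_grad():
            logits = model(input_ids=ids, attention_mask=enc["attention_mask"]).logits
        preds = logits[0, picked].argmax(dim=-1)
        correct += (preds == labels).sum().item()
        total += n_mask
    return 100.0 * correct / total

# Example (lowercased per the tokenizer note above):
print(masked_accuracy(["türkiye'nin başkenti ankara'dır."], mask_ratio=0.15))
```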
## 📄 License
This project is under the MIT license.