🚀 TookaBERT Model
TookaBERT models are encoder models designed for Persian language processing. They come in two sizes, base and large, and are pre-trained on over 500GB of Persian data covering diverse topics such as news, blogs, forums, and books. The models are pre-trained with the MLM (WWM) objective using two context lengths. Notably, TookaBERT-Large is the first large encoder model pre-trained on Persian and currently stands as the state-of-the-art model for Persian tasks.
Model Metadata
| Property | Details |
|---|---|
| License | Apache-2.0 |
| Language | Persian (fa) |
| Pipeline Tag | fill-mask |
| Mask Token | `<mask>` |
Widget Examples
- "توانا بود هر که بود ز دانش دل پیر برنا بود"
- "شهر برلین در کشور واقع شده است."
- "بهنام از خوانندگان مشهور کشور ما است."
- "رضا از بازیگران مشهور کشور ما است."
- "سید ابراهیم رییسی در سال رییس جمهور ایران شد."
- "دیگر امکان ادامه وجود ندارد. باید قرارداد را کنیم."
🚀 Quick Start
✨ Features
- Two Sizes: Available in base and large sizes to suit different needs.
- Extensive Training Data: Pre-trained on over 500GB of diverse Persian data.
- State-of-the-Art: TookaBERT-Large is the leading model for Persian tasks.
📦 Installation
No specific installation steps are provided; the usage examples below rely on the Hugging Face `transformers` library.
💻 Usage Examples
Basic Usage
You can use this model directly for masked language modeling with the code below.
```python
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("PartAI/TookaBERT-Large")
model = AutoModelForMaskedLM.from_pretrained("PartAI/TookaBERT-Large")

text = "شهر برلین در کشور <mask> واقع شده است."
encoded_input = tokenizer(text, return_tensors='pt')
output = model(**encoded_input)
```
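The forward pass above returns raw logits rather than decoded words. The following is a minimal sketch of how you might read off the top candidates for the masked position, using standard `torch`/`transformers` calls; it is illustrative and not part of the original card.

```python
import torch

# Find the position of the <mask> token and take the 5 highest-scoring
# candidate tokens for it from the model's output logits.
mask_positions = (encoded_input["input_ids"] == tokenizer.mask_token_id).nonzero(as_tuple=True)[1]
top_ids = torch.topk(output.logits[0, mask_positions[0]], k=5).indices
print([tokenizer.decode([token_id]) for token_id in top_ids.tolist()])
```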
Advanced Usage
You can also run inference through a pipeline, as shown below.
```python
from transformers import pipeline

inference_pipeline = pipeline('fill-mask', model="PartAI/TookaBERT-Large")
inference_pipeline("شهر برلین در کشور <mask> واقع شده است.")
```
You can also fine-tune this model on your own dataset to prepare it for a downstream task, for example the tasks below (a fine-tuning sketch follows this list).
- DeepSentiPers (Sentiment Analysis)

- ParsiNLU-Multiple-choice (Multiple-choice)

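As a minimal fine-tuning sketch for a sentence-level task such as sentiment analysis, one possible setup with the `transformers` `Trainer` is shown below. The CSV file names, column names (`text`, `label`), and hyperparameters are illustrative assumptions, not taken from the original card.

```python
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          DataCollatorWithPadding, Trainer, TrainingArguments)

model_name = "PartAI/TookaBERT-Large"
tokenizer = AutoTokenizer.from_pretrained(model_name)
# num_labels depends on your task, e.g. 3 classes for sentiment analysis.
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=3)

# Hypothetical CSV files with "text" and "label" columns; replace with your own data.
dataset = load_dataset("csv", data_files={"train": "train.csv", "validation": "dev.csv"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

dataset = dataset.map(tokenize, batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="tookabert-finetuned",
                           per_device_train_batch_size=16,
                           num_train_epochs=3),
    train_dataset=dataset["train"],
    eval_dataset=dataset["validation"],
    data_collator=DataCollatorWithPadding(tokenizer),
)
trainer.train()
```

For multiple-choice tasks such as ParsiNLU-Multiple-choice, `AutoModelForMultipleChoice` can be used in place of the sequence-classification head.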
📚 Documentation
For more information, you can read our paper on arXiv.
🔧 Technical Details
TookaBERT models are pre-trained on Persian data with the MLM (WWM) objective using two context lengths. The large version, TookaBERT-Large, is a significant milestone as the first large encoder model pre-trained on Persian.
📄 License
The model is licensed under the Apache-2.0 license.
📊 Evaluation
TookaBERT models are evaluated on a wide range of downstream NLP tasks, such as Sentiment Analysis (SA), Text Classification, Multiple-choice, Question Answering, and Named Entity Recognition (NER).
Here are some key performance results:
| Model name | DeepSentiPers (f1/acc) | MultiCoNER-v2 (f1/acc) | PQuAD (best_exact/best_f1/HasAns_exact/HasAns_f1) | FarsTail (f1/acc) | ParsiNLU-Multiple-choice (f1/acc) | ParsiNLU-Reading-comprehension (exact/f1) | ParsiNLU-QQP (f1/acc) |
|---|---|---|---|---|---|---|---|
| TookaBERT-large | 85.66/85.78 | 69.69/94.07 | 75.56/88.06/70.24/87.83 | 89.71/89.72 | 36.13/35.97 | 33.6/60.5 | 82.72/82.63 |
| TookaBERT-base | 83.93/83.93 | 66.23/93.3 | 73.18/85.71/68.29/85.94 | 83.26/83.41 | 33.6/33.81 | 20.8/42.52 | 81.33/81.29 |
| Shiraz | 81.17/81.08 | 59.1/92.83 | 65.96/81.25/59.63/81.31 | 77.76/77.75 | 34.73/34.53 | 17.6/39.61 | 79.68/79.51 |
| ParsBERT | 80.22/80.23 | 64.91/93.23 | 71.41/84.21/66.29/84.57 | 80.89/80.94 | 35.34/35.25 | 20/39.58 | 80.15/80.07 |
| XLM-V-base | 83.43/83.36 | 58.83/92.23 | 73.26/85.69/68.21/85.56 | 81.1/81.2 | 35.28/35.25 | 8/26.66 | 80.1/79.96 |
| XLM-RoBERTa-base | 83.99/84.07 | 60.38/92.49 | 73.72/86.24/68.16/85.8 | 82.0/81.98 | 32.4/32.37 | 20.0/40.43 | 79.14/78.95 |
| FaBERT | 82.68/82.65 | 63.89/93.01 | 72.57/85.39/67.16/85.31 | 83.69/83.67 | 32.47/32.37 | 27.2/48.42 | 82.34/82.29 |
| mBERT | 78.57/78.66 | 60.31/92.54 | 71.79/84.68/65.89/83.99 | 82.69/82.82 | 33.41/33.09 | 27.2/42.18 | 79.19/79.29 |
| AriaBERT | 80.51/80.51 | 60.98/92.45 | 68.09/81.23/62.12/80.94 | 74.47/74.43 | 30.75/30.94 | 14.4/35.48 | 79.09/78.84 |
*Note: due to randomness in the fine-tuning process, results that differ by less than 1% are treated as equivalent.
📞 Contact us
If you have any questions about this model, you can reach us through the model's Community tab on Hugging Face.