# Masked language modeling
Chinesebert Base
ChineseBERT is a Chinese pre-trained model that integrates glyph and pinyin information, optimized for Chinese text processing.
Large Language Model
Transformers Chinese

iioSnail
118
7
Syllaberta
SyllaBERTa is an experimental Transformer-based masked language model specifically designed for processing Ancient Greek texts, employing syllable-level tokenization.
Large Language Model
Transformers Other

Ericu950
19
1
Moderncamembert Base
MIT
ModernCamemBERT is a French language model pre-trained on a high-quality French text corpus of 1T tokens. It is the French counterpart of ModernBERT, focusing on long contexts and efficient inference.
Large Language Model
Transformers French

almanach
213
4
Rnafm
RNA foundation model pre-trained on non-coding RNA data using the masked language modeling (MLM) objective
Protein Model
Safetensors Other
multimolecule
6,791
1
Medbert Base
Apache-2.0
medBERT-base is a BERT-based model focused on masked language modeling tasks for medical and gastroenterology texts.
Large Language Model
Transformers English

suayptalha
24
5
Nomic Xlm 2048
A fine-tuned version of the XLM-RoBERTa base model that replaces the original positional embeddings with RoPE (rotary position embeddings), supporting sequence lengths up to 2048 (see the sketch after this entry)
Large Language Model
Transformers

nomic-ai
440
6
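The RoPE substitution described above replaces added position embeddings with a position-dependent rotation of the query and key vectors. Below is a minimal sketch of that rotation, assuming illustrative tensor shapes; it is not the nomic-ai implementation.

```python
# Minimal rotary position embedding (RoPE) sketch -- illustrative only,
# not the nomic-ai implementation; shapes and naming are assumptions.
import torch

def rotary_embed(x: torch.Tensor, base: float = 10000.0) -> torch.Tensor:
    """Rotate channel pairs of x by position-dependent angles.

    x: (batch, seq_len, num_heads, head_dim) query or key tensor, head_dim even.
    """
    _, seq_len, _, d = x.shape
    # One frequency per channel pair, as in the RoPE formulation.
    inv_freq = 1.0 / (base ** (torch.arange(0, d, 2, dtype=torch.float32) / d))
    angles = torch.arange(seq_len, dtype=torch.float32)[:, None] * inv_freq  # (seq, d/2)
    cos = angles.cos()[None, :, None, :]
    sin = angles.sin()[None, :, None, :]
    x1, x2 = x[..., 0::2], x[..., 1::2]   # split each head_dim into pairs
    out = torch.empty_like(x)
    out[..., 0::2] = x1 * cos - x2 * sin  # 2-D rotation of every pair
    out[..., 1::2] = x1 * sin + x2 * cos
    return out

# Queries and keys are rotated before attention; relative position then
# falls out of the q.k dot product, with no learned position table.
q = torch.randn(1, 2048, 12, 64)
q_rot = rotary_embed(q)
```

Because relative offsets emerge from the dot product of rotated queries and keys rather than from a learned position table, such a model can be extended to longer contexts than the original embeddings covered.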
Ankh3 Xl
Ankh3 is a protein language model based on the T5 architecture. It is pre-trained by jointly optimizing masked language modeling and sequence completion objectives, and is suited to protein feature extraction and sequence analysis.
Protein Model
Transformers

ElnaggarLab
131
2
Rinalmo
RiNALMo is a non-coding RNA (ncRNA) language model pre-trained with the masked language modeling (MLM) objective, using self-supervised learning over a large collection of non-coding RNA sequences.
Protein Model Other
multimolecule
21.38k
2
Caduceus Ps Seqlen 131k D Model 256 N Layer 16
Apache-2.0
Caduceus-PS is a DNA sequence modeling model with reverse-complement equivariance, designed for processing long sequences.
Molecular Model
Transformers

kuleshov-group
2,618
14
Multilingual Albert Base Cased 128k
Apache-2.0
A multilingual ALBERT model pretrained with masked language modeling (MLM) objective, supporting 60+ languages, featuring a lightweight architecture with parameter sharing
Large Language Model
Transformers Supports Multiple Languages

cservan
277
2
Multilingual Albert Base Cased 32k
Apache-2.0
Multilingual ALBERT model pretrained with masked language modeling objective, supporting 50+ languages, case-sensitive
Large Language Model
Transformers Supports Multiple Languages

cservan
243
2
Albertina 1b5 Portuguese Ptbr Encoder
MIT
Albertina 1.5B PTBR is a foundational large language model for the Brazilian Portuguese variant of the language. It is an encoder of the BERT family, built on the Transformer architecture and developed on top of the DeBERTa model.
Large Language Model
Transformers Other

PORTULAN
83
4
Tahrirchi Bert Base
Apache-2.0
TahrirchiBERT-base is an encoder-only Transformer text model for Uzbek (Latin script) with 110 million parameters, pre-trained using masked language modeling objectives.
Large Language Model
Transformers Other

tahrirchi
88
9
Dictabert
State-of-the-art BERT language model suite for Modern Hebrew
Large Language Model
Transformers Other

dicta-il
50.83k
8
Parlbert German Law
MIT
BERT model trained on German legal data, specialized in legal text processing
Large Language Model
Transformers German

InfAI
62
2
BEREL 3.0
Apache-2.0
BEREL 3.0 is an embedding model based on the BERT architecture, designed specifically for rabbinic-encoded language (Rabbinic Hebrew), supporting research and applications in that domain.
Large Language Model
Transformers Other

dicta-il
802
3
Legalnlp Bert
MIT
BERTikal is a case-sensitive BERT base model for Brazilian legal language, trained on Brazilian legal texts and based on the BERTimbau checkpoint.
Large Language Model
Transformers Other

felipemaiapolo
97
7
Roberta News
MIT
A RoBERTa-based masked language model specifically pretrained for news text
Large Language Model
Transformers English

AndyReas
17
1
Arbertv2
ARBERTv2 is an upgraded BERT model trained on a 243 GB corpus of Modern Standard Arabic (MSA) text containing 27.8 billion tokens.
Large Language Model
Transformers Arabic

UBC-NLP
267
6
Norbert3 Base
Apache-2.0
NorBERT 3 is a next-generation Norwegian language model based on the BERT architecture, supporting both Bokmål and Nynorsk written Norwegian.
Large Language Model
Transformers Other

ltg
345
7
Switch C 2048
Apache-2.0
A Mixture of Experts (MoE) model trained on the masked language modeling task, with 1.6 trillion parameters. It uses a T5-like architecture but replaces the feed-forward layers with sparse MLP (expert) layers (see the routing sketch after this entry).
Large Language Model
Transformers English

google
73
290
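The sparse layer mentioned above routes each token through exactly one expert MLP selected by a learned router (top-1 "switch" routing). A minimal sketch of that routing pattern with made-up sizes follows; it is not the google/switch-c-2048 implementation.

```python
# Minimal top-1 ("switch") expert routing sketch -- illustrative assumption,
# not the google Switch-C implementation.
import torch
import torch.nn as nn

class SwitchFFN(nn.Module):
    def __init__(self, d_model: int, d_ff: int, num_experts: int):
        super().__init__()
        self.router = nn.Linear(d_model, num_experts)  # token -> expert logits
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.ReLU(), nn.Linear(d_ff, d_model))
            for _ in range(num_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (num_tokens, d_model); each token visits exactly one expert.
        probs = self.router(x).softmax(dim=-1)   # (tokens, experts)
        gate, expert_idx = probs.max(dim=-1)     # top-1 routing decision
        out = torch.zeros_like(x)
        for i, expert in enumerate(self.experts):
            mask = expert_idx == i
            if mask.any():
                # Scale by the gate value so the router receives gradient.
                out[mask] = gate[mask, None] * expert(x[mask])
        return out

layer = SwitchFFN(d_model=512, d_ff=2048, num_experts=8)
tokens = torch.randn(16, 512)
print(layer(tokens).shape)  # torch.Size([16, 512])
```

Only one expert's parameters are active per token, which is how such models reach trillion-parameter scale without a proportional increase in per-token compute.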
Bart Base Cantonese
Other
This is a Cantonese model based on the base version of BART, obtained through second-phase pre-training on the LIHKG dataset.
Large Language Model Other
Ayaka
42
9
Esm2 T36 3B UR50D
MIT
ESM-2 is a next-generation protein model trained with masked language modeling objectives, suitable for fine-tuning on various downstream tasks with protein sequences as input.
Protein Model
Transformers

facebook
3.5M
22
Esm2 T30 150M UR50D
MIT
ESM-2 is a state-of-the-art protein model trained on masked language modeling objectives, suitable for fine-tuning on various protein sequence input tasks.
Protein Model
Transformers

facebook
69.91k
7
Esm2 T12 35M UR50D
MIT
ESM-2 is a cutting-edge protein model trained on masked language modeling objectives, suitable for various protein sequence analysis tasks
Protein Model
Transformers

facebook
332.83k
15
Esm2 T6 8M UR50D
MIT
ESM-2 is a next-generation protein model trained with the masked language modeling objective, suitable for fine-tuning on a wide range of protein sequence tasks (a usage sketch follows this entry).
Protein Model
Transformers

facebook
1.5M
21
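Since the ESM-2 checkpoints above are ordinary masked language models over amino-acid tokens, they can be queried through the standard Transformers MLM interface. A minimal sketch using the smallest checkpoint, facebook/esm2_t6_8M_UR50D; the protein fragment is an arbitrary example.

```python
# Predict a masked residue with the smallest ESM-2 checkpoint.
# Sketch only; the protein fragment below is an arbitrary example.
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

name = "facebook/esm2_t6_8M_UR50D"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForMaskedLM.from_pretrained(name)

sequence = "MKTAYIAKQR" + tokenizer.mask_token + "ISFVKSHFSRQLEERLGLIEVQ"
inputs = tokenizer(sequence, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

mask_pos = (inputs["input_ids"] == tokenizer.mask_token_id).nonzero(as_tuple=True)[1]
top = logits[0, mask_pos].softmax(dim=-1).topk(3)
# Top-3 candidate amino acids for the masked position.
print(tokenizer.convert_ids_to_tokens(top.indices[0].tolist()))
```

The larger checkpoints listed above are drop-in replacements: only the hub id changes, at the cost of more memory and slower inference.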
Microbert Coptic Mx
This is a MicroBERT model for the Coptic language, pre-trained through masked language modeling and supervised XPOS tagging.
Large Language Model
Transformers Other

lgessler
141
0
Efficient Mlm M0.40 801010
This model comes from a study of whether masking 15% of tokens is optimal in masked language modeling; it uses pre-layer normalization, which is not natively supported by the HuggingFace Transformers library (a masking sketch follows this entry).
Large Language Model
Transformers

princeton-nlp
119
0
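For context, the standard BERT-style corruption that such masking-rate studies vary works as follows: select positions at rate p, replace 80% of them with the mask token, 10% with a random token, and leave 10% unchanged. The sketch below mirrors the usual Transformers data-collator logic, not the princeton-nlp training code; the 80/10/10 split is what the model name appears to reference.

```python
# Standard BERT-style MLM corruption: select positions at rate p, then apply
# the 80/10/10 split (mask token / random token / keep). Sketch only.
import torch

def mlm_corrupt(input_ids: torch.Tensor, mask_token_id: int, vocab_size: int,
                mask_rate: float = 0.15):
    labels = input_ids.clone()
    selected = torch.rand(input_ids.shape) < mask_rate   # positions to predict
    labels[~selected] = -100                              # loss only on selected positions

    corrupted = input_ids.clone()
    r = torch.rand(input_ids.shape)
    # 80% of selected positions -> mask token
    corrupted[selected & (r < 0.8)] = mask_token_id
    # 10% of selected positions -> random token
    random_ids = torch.randint(vocab_size, input_ids.shape)
    swap = selected & (r >= 0.8) & (r < 0.9)
    corrupted[swap] = random_ids[swap]
    # remaining 10% stay unchanged
    return corrupted, labels

ids = torch.randint(5, 1000, (2, 16))  # toy batch of token ids
corrupted, labels = mlm_corrupt(ids, mask_token_id=103, vocab_size=1000)
```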
Bert Base Bg
MIT
A model pre-trained on Bulgarian language using Masked Language Modeling (MLM) objective, case-sensitive.
Large Language Model
Transformers Other

rmihaylov
561
8
Bert Base Uncased
Apache-2.0
A BERT base model for the English language, pre-trained using the Masked Language Modeling (MLM) objective, case-insensitive (a fill-mask sketch follows this entry).
Large Language Model
Transformers English

OWG
15
0
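Such a checkpoint can be queried with the Transformers fill-mask pipeline. A minimal sketch using the canonical bert-base-uncased hub id (the id of the mirror listed here may differ):

```python
# Query an MLM checkpoint with the fill-mask pipeline.
# Uses the canonical "bert-base-uncased" id; a mirror's hub id may differ.
from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-uncased")
for pred in fill("Paris is the [MASK] of France."):
    # Each prediction carries the candidate token and its probability.
    print(f"{pred['token_str']:>10}  {pred['score']:.3f}")
```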
Roberta TR Medium Morph 44k
A RoBERTa model for Turkish, pre-trained with morphological-level tokenization and the masked language modeling objective, suitable for Turkish NLP tasks.
Large Language Model
Transformers Other

ctoraman
453
0
Roberta TR Medium Bpe 44k
A Turkish RoBERTa model pre-trained with the masked language modeling (MLM) objective, case-insensitive.
Large Language Model
Transformers Other

ctoraman
48
0
Roberta TR Medium Bpe 16k
A RoBERTa model pre-trained on Turkish with masked language modeling (MLM) objective, case-insensitive, medium-sized architecture.
Large Language Model
Transformers Other

ctoraman
26
0
Chinese Roberta L 8 H 512
A Chinese RoBERTa model pre-trained on CLUECorpusSmall, with 8 layers and 512 hidden units, supporting masked language modeling tasks.
Large Language Model Chinese
uer
76
3
Batteryscibert Cased
Apache-2.0
A language model pre-trained on a large corpus of battery research papers, initialized from SciBERT-cased and specialized in understanding battery-domain text
Large Language Model
Transformers English

batterydata
22
0
Chinese Roberta L 6 H 256
A Chinese RoBERTa model pre-trained on CLUECorpusSmall, with 6 layers and 256 hidden units.
Large Language Model Chinese
uer
58
1
Uztext 568Mb Roberta BPE
UzRoBerta is a pre-trained Uzbek (Cyrillic script) model for masked language modeling and next sentence prediction.
Large Language Model
Transformers

rifkat
24
0
Mk Roberta Base
Apache-2.0
A case-sensitive masked language model pre-trained on Macedonian text
Large Language Model Other
macedonizer
18
0
Sportsbert
SportsBERT is a BERT model specialized in the sports domain, trained on a corpus of sports news, suitable for sports-related natural language processing tasks.
Large Language Model
microsoft
3,361
24
Roberta Base Indonesian 522M
MIT
A case-insensitive Indonesian pretrained model based on the RoBERTa-base architecture, trained on Indonesian Wikipedia data.
Large Language Model Other
cahya
454
6