# Multi-source data training
Kanjidnn
Apache-2.0
A convolutional neural network (CNN) built on the Keras framework, specifically designed to recognize individual Japanese characters from 64×64 grayscale images, supporting both handwritten and printed text recognition.
Text Recognition Japanese
gaiseras
38
0
Navaistt V1 Medium
Apache-2.0
An Uzbek speech recognition model fine-tuned from Whisper medium. It supports the Tashkent dialect and was trained on approximately 700 hours of data.
Speech Recognition Other
islomov
3,081
12
Kazrush Kk Ru
Apache-2.0
kazRush-kk-ru is a Kazakh-to-Russian translation model based on the T5 configuration, trained on multiple parallel datasets.
Machine Translation
Transformers Other
deepvk
2,630
8
Skywork Critic Llama 3.1 8B
Other
The Skywork Critic series consists of advanced judge models that excel at pairwise preference evaluation. They compare and evaluate a pair of inputs and provide detailed judgments.
Large Language Model
PyTorch
Skywork
1,376
12
Aitextdetector
Openrail
A text classification model fine-tuned from GPT-2 that distinguishes between AI-generated text, Zhihu user responses, and text from other sources.
Text Classification
Safetensors Chinese
hugfaceguy0001
293
1
Real3d
MIT
Real3D is a 2D-to-3D mapping Transformer model based on the TripoSR architecture, extending its capability to process real-world images through unsupervised self-training and automatic data filtering.
3D Vision
hwjiang
22
19
Turkish Llama 8b V0.1
A Turkish text generation model based on LLaMA-3 8B, fully fine-tuned on a 30 GB Turkish dataset
Large Language Model
Transformers Other
ytu-ce-cosmos
3,317
60
Distill Whisper Th Medium
MIT
A distilled automatic speech recognition model based on the Whisper architecture, optimized for Thai and balancing performance with efficiency
Speech Recognition
Transformers
biodatlab
303
2
Russian Text Normalizer
Apache-2.0
A Russian text normalization model fine-tuned from FRED-T5-large, supporting normalization of numerals and Latin characters
Large Language Model
Transformers Other
saarus72
577
8
Titulm Mpt 1b V1.0
Apache-2.0
TituLM-1B-BN-V1 is a large language model specifically trained for generating and understanding Bengali text, extensively trained on a dataset containing 4.51 billion Bengali tokens.
Large Language Model
Transformers Other
hishab
61
11
Hamsa V0.1 Beta
Apache-2.0
Hamsa is an Arabic speech recognition model built on the Whisper architecture, focusing on the linguistic needs of the Middle East and North Africa region.
Speech Recognition
Transformers Arabic
nadsoft
46
6
Uniner 7B All
The best-performing version in the UniNER series, a named entity recognition model that combines training data from three major sources
Sequence Labeling
Transformers English
Universal-NER
4,430
90
Trocr Base Printed Fr
MIT
A Transformer-based OCR model for printed French text, filling the gap left by the lack of a French version among the TrOCR models
Image-to-Text
Transformers French
agomberto
110
2
Deberta V1 Distill
Apache-2.0
A bidirectional encoder model pre-trained for the Russian language on large-scale text corpora using the standard masked language modeling objective
Large Language Model
Transformers Supports Multiple Languages
deepvk
166
5
Google Safesearch Mini V2
Apache-2.0
Ultra-high-precision multi-class image classifier for accurate sensitive content detection
Image Classification
FredZhang7
3,791
4
Dutch Sarcasm Detector
A Dutch text classification model based on the BERT architecture for detecting sarcasm in news headlines
Text Classification
Transformers Other
helinivan
29
2
Bert Base Swedish Cased
Swedish BERT base model released by the National Library of Sweden/KBLab, trained on multi-source texts
Large Language Model Other
KB
11.16k
21
Bert Base Bg
MIT
A case-sensitive model pre-trained on Bulgarian text using a masked language modeling (MLM) objective.
Large Language Model
Transformers Other
rmihaylov
561
8
Wav2vec2 Large Xlsr 53 Finnish
Apache-2.0
A Finnish speech recognition model fine-tuned from the XLSR-53 large model; it supports 16 kHz audio input
Speech Recognition Other
jonatasgrosman
73.11k
1
Wav2vec2 Large 100k Voxpopuli Catala
Apache-2.0
A Catalan speech recognition model fine-tuned from facebook/wav2vec2-large-100k-voxpopuli
Speech Recognition Other
ccoreilly
56
2
Alephbert Base
Apache-2.0
AlephBERT is a cutting-edge language model for Hebrew, based on Google's BERT architecture, specifically designed for processing Hebrew text.
Large Language Model
Transformers Other
biu-nlp
26
0
Wav2vec2 Large Xlsr Catala
Apache-2.0
A Catalan automatic speech recognition model fine-tuned from facebook/wav2vec2-large-xlsr-53
Speech Recognition Other
ccoreilly
31
1
Alephbert Base
Apache-2.0
The state-of-the-art Hebrew language model based on the BERT architecture
Large Language Model Other
onlplab
25.26k
18
Wav2vec2 Xls R 300m Cv6 Turkish
Apache-2.0
A Turkish automatic speech recognition model fine-tuned from facebook/wav2vec2-xls-r-300m
Speech Recognition
Transformers Other
mpoyraz
38
7
Roberta Small Bulgarian
A streamlined version of the Bulgarian RoBERTa model, containing only 6 hidden layers while maintaining comparable performance.
Large Language Model Other
iarfmoose
21
0
Rubertconv Toxic Clf
Apache-2.0
A Russian text toxicity classifier based on the rubert-base-cased-conversational model
Text Classification
Transformers Other
IlyaGusev
1,381
13
Bert Large Swedish Cased
A Swedish BERT Large model built with the Megatron-LM framework, containing 340 million parameters and pre-trained on 85 GB of Swedish text
Large Language Model
Transformers Other
AI-Nordics
734
11
Bert Base Swedish Cased Ner
Swedish BERT base model released by the National Library of Sweden/KBLab, trained on multi-source text data
Large Language Model Other
KBLab
245
5
Darijabert
The first BERT model specifically designed for the Moroccan Arabic dialect 'Darija', based on the BERT-base architecture, trained on approximately 3 million Darija dialect text sequences.
Large Language Model
Transformers Supports Multiple Languages
SI2M-Lab
554
34
Gpt2 Finnish
Apache-2.0
A Finnish language model pre-trained with the GPT-2 architecture, 117M parameter version
Large Language Model Other
Finnish-NLP
201
2
Distilbert Punctuator En
A fine-tuned DistilBERT model for English punctuation restoration, designed to add punctuation to unpunctuated lowercase English text.
Sequence Labeling
Transformers
Qishuai
55
7
Distilgpt2 Base Pretrained He
MIT
A compact Hebrew text generation model based on the GPT-2 architecture, trained on TPU and GPU
Large Language Model Other
Norod78
1,632
1