Model Selection

Multi-source data training

# Multi-source data training

A convolutional neural network (CNN) built on the Keras framework, specifically designed to recognize individual Japanese characters from 64×64 grayscale images, supporting both handwritten and printed text recognition.

Text Recognition Japanese

Navaistt V1 Medium

Uzbek speech recognition model fine-tuned based on Whisper medium, supports Tashkent dialect, trained on approximately 700 hours of data

Speech Recognition Other

kazRush-kk-ru is a Kazakh-to-Russian translation model based on the T5 configuration, trained on multiple parallel datasets.

Machine Translation

Transformers Other

Skywork Critic Llama 3.1 8B

The Skywork Critic series of models are advanced judgment models that excel in paired preference evaluation. They can compare and evaluate a pair of input contents and provide detailed judgments.

Large Language Model

A text classification model fine-tuned based on GPT-2, used to distinguish AI-generated text, Zhihu user responses, and texts from other sources.

Text Classification

Safetensors Chinese

Real3D is a 2D-to-3D mapping Transformer model based on the TripoSR architecture, extending its capability to process real-world images through unsupervised self-training and automatic data filtering.

Turkish Llama 8b V0.1

A Turkish text generation model fully fine-tuned on 30GB of Turkish dataset based on LLaMA-3 8B

Large Language Model

Transformers Other

Distill Whisper Th Medium

A distilled automatic speech recognition model based on the Whisper architecture, optimized for Thai language with balanced performance and efficiency

Speech Recognition

Russian Text Normalizer

A Russian text normalization model fine-tuned from FRED-T5-large, supporting normalization of numerals and Latin characters

Large Language Model

Transformers Other

Titulm Mpt 1b V1.0

TituLM-1B-BN-V1 is a large language model specifically trained for generating and understanding Bengali text, extensively trained on a dataset containing 4.51 billion Bengali tokens.

Large Language Model

Transformers Other

Hamsa V0.1 Beta

Hamsa is an Arabic speech recognition model built on the Whisper architecture, focusing on the linguistic needs of the Middle East and North Africa region.

Speech Recognition

Transformers Arabic

The optimal version in the UniNER series, integrating named entity recognition models from three major data sources

Sequence Labeling

Transformers English

Trocr Base Printed Fr

Transformer-based French printed text OCR model, filling the gap of French version in TrOCR models

Transformers French

Deberta V1 Distill

A bidirectional encoder model pre-trained for Russian language, trained on large-scale text corpora using standard masked language modeling objectives

Large Language Model

Transformers Supports Multiple Languages

Google Safesearch Mini V2

Ultra-high-precision multi-class image classifier for accurate sensitive content detection

Image Classification

Dutch Sarcasm Detector

A Dutch text classification model based on BERT architecture for detecting sarcasm in news headlines

Text Classification

Transformers Other

Bert Base Swedish Cased

Swedish BERT base model released by the National Library of Sweden/KBLab, trained on multi-source texts

Large Language Model Other

A model pre-trained on Bulgarian language using Masked Language Modeling (MLM) objective, case-sensitive.

Large Language Model

Transformers Other

Wav2vec2 Large Xlsr 53 Finnish

Finnish speech recognition model fine-tuned based on XLSR-53 large model, supports 16kHz audio input

Speech Recognition Other

Wav2vec2 Large 100k Voxpopuli Catala

A Catalan speech recognition model fine-tuned based on facebook/wav2vec2-large-100k-voxpopuli

Speech Recognition Other

AlephBERT is a cutting-edge language model for Hebrew, based on Google's BERT architecture, specifically designed for processing Hebrew text.

Large Language Model

Transformers Other

Wav2vec2 Large Xlsr Catala

Catalan automatic speech recognition model fine-tuned based on facebook/wav2vec2-large-xlsr-53

Speech Recognition Other

The state-of-the-art Hebrew language model based on the BERT architecture

Large Language Model Other

Wav2vec2 Xls R 300m Cv6 Turkish

Turkish automatic speech recognition model fine-tuned based on facebook/wav2vec2-xls-r-300m

Speech Recognition

Transformers Other

Roberta Small Bulgarian

This is a streamlined version of the Bulgarian RoBERTa model, containing only 6 hidden layers while maintaining comparable performance.

Large Language Model Other

Rubertconv Toxic Clf

Russian toxicity text classifier based on rubert-base-cased-conversational model

Text Classification

Transformers Other

Bert Large Swedish Cased

A Swedish Bert Large model implemented based on the Megatron-LM framework, containing 340 million parameters, pre-trained on 85GB of Swedish text

Large Language Model

Transformers Other

Bert Base Swedish Cased Ner

Swedish BERT base model released by the National Library of Sweden/KBLab, trained on multi-source text data

Large Language Model Other

The first BERT model specifically designed for the Moroccan Arabic dialect 'Darija', based on the BERT-base architecture, trained on approximately 3 million Darija dialect text sequences.

Large Language Model

Transformers Supports Multiple Languages

Finnish language model pre-trained on GPT-2 architecture, 117M parameter version

Large Language Model Other

Distilbert Punctuator En

A DistilBERT fine-tuned model for English text punctuation restoration, specifically designed to add punctuation to lowercase English text without punctuation.

Sequence Labeling

Distilgpt2 Base Pretrained He

A compact Hebrew text generation model based on GPT2 architecture, trained on TPU and GPU

Large Language Model Other

Featured Recommended AI Models

AIbase

Empowering the Future, Your AI Solution Knowledge Base

English 简体中文繁體中文にほんご

© 2025AIbase