# Multi-dataset training
- **Ritrieve Zh V1 GGUF** (mradermacher · MIT · 212 downloads · 1 like) — A static quantized version of the richinfoai/ritrieve_zh_v1 model; quantization reduces storage and compute requirements while largely preserving quality. Tags: Large Language Model, Transformers, Chinese.
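Static quantization, as used in GGUF builds like the one above, trades numeric precision for storage: float32 weights are mapped onto a small integer grid and reconstructed with a scale factor. A minimal sketch of the idea (symmetric per-tensor int8 quantization with NumPy; this illustrates the principle only, not llama.cpp's actual GGUF formats):

```python
import numpy as np

def quantize_int8(w: np.ndarray):
    """Symmetric per-tensor int8 quantization: w ~= scale * q."""
    peak = float(np.abs(w).max())
    scale = peak / 127.0 if peak > 0 else 1.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize_int8(q: np.ndarray, scale: float) -> np.ndarray:
    """Reconstruct an approximation of the original float weights."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.normal(size=1024).astype(np.float32)  # stand-in for a weight tensor
q, scale = quantize_int8(w)
w_hat = dequantize_int8(q, scale)

print(q.nbytes, w.nbytes)               # int8 storage is 4x smaller than float32
print(float(np.abs(w - w_hat).max()))   # per-element error bounded by ~scale/2
```

Real quantization schemes use per-block scales and mixed bit widths, but the storage-versus-error trade-off is the same.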
- **Chunkformer Large Vie** (khanhld · 1,765 downloads · 12 likes) — A large-scale Vietnamese automatic speech recognition model based on the ChunkFormer architecture, fine-tuned on roughly 3,000 hours of publicly available Vietnamese speech data, with strong reported performance. Tags: Speech Recognition, PyTorch, Other.
- **Bert Uncased Intent Classification** (yeniguno · Apache-2.0 · 1,942 downloads · 1 like) — A fine-tuned BERT model that classifies user inputs into 82 intents, suited to dialogue systems and natural language understanding tasks. Tags: Text Classification, Transformers, English.
- **Aura 4B GGUF** (bartowski · Apache-2.0 · 290 downloads · 8 likes) — Quantized builds of AuraIndustries/Aura-4B produced with llama.cpp imatrix quantization; multiple quantization types are available for text generation. Tags: Large Language Model, English.
- **Viwhisper Medium** (NhutP · MIT · 139 downloads · 4 likes) — A Whisper-medium model optimized for Vietnamese speech recognition, fine-tuned on 1,308 hours of Vietnamese data. Tags: Speech Recognition, Transformers, Other.
- **Whisper Ja Anime V0.1** (efwkjn · 205 downloads · 15 likes) — A Whisper variant focused on Japanese anime speech recognition, optimized for the acoustic characteristics of anime audio. Tags: Speech Recognition, Japanese.
- **Llama3 Aloe 8B Alpha GGUF** (tensorblock · 224 downloads · 1 like) — Llama3-Aloe-8B-Alpha is an 8B-parameter large language model focused on biology and medicine, offered here as GGUF-format quantizations. Tags: Large Language Model, Transformers, English.
- **Noobai Xl Nai Xl Epsilonpred10version Sdxl** (John6666 · Other license · 87 downloads · 3 likes) — An SDXL-based anime-style text-to-image model, beginner-friendly and capable of generating high-quality anime characters and stylized images. Tags: Image Generation, English.
- **Birefnet Matting** (ZhengPeng7 · 1,578 downloads · 18 likes) — BiRefNet is a high-resolution dichotomous image segmentation model based on bilateral reference, focused on background removal and mask generation. Tags: Image Segmentation.
- **Birefnet Lite 2K** (ZhengPeng7 · 3,400 downloads · 8 likes) — A lightweight 2K variant of the BiRefNet bilateral-reference framework for high-resolution dichotomous image segmentation, focused on background removal and mask generation. Tags: Image Segmentation.
- **Octo Base 1.5** (rail-berkeley · MIT · 87 downloads · 14 likes) — Octo is a multimodal foundation model for robotics that predicts robot actions from visual and language inputs. Tags: Multimodal Fusion, Transformers.
- **Rad Dino** (microsoft · Other license · 411.96k downloads · 48 likes) — A Vision Transformer trained with self-supervised DINOv2, designed specifically for encoding chest X-ray images. Tags: Image Classification, Transformers.
- **Whisper Tiny Vi** (doof-ferb · Apache-2.0 · 44 downloads · 2 likes) — A Vietnamese automatic speech recognition (ASR) model fine-tuned from OpenAI's Whisper-tiny, performing well across several Vietnamese datasets. Tags: Speech Recognition, Transformers, Other.
- **Finance LLM GGUF** (TheBloke · Other license · 641 downloads · 21 likes) — Finance LLM is a finance-domain language model based on the Llama architecture, fine-tuned on datasets such as OpenOrca, Lima, and WizardLM. Tags: Large Language Model, English.
- **Deberta V3 Large Mnli Fever Anli Ling Wanli Binary** (MoritzLaurer · MIT · 30 downloads · 0 likes) — A zero-shot classification model based on DeBERTa-v3-large, trained mainly on five NLI datasets; best suited to tasks that strictly follow the original NLI format. Tags: Text Classification, Transformers, English.
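NLI models like the DeBERTa entry above enable zero-shot classification by turning each candidate label into a hypothesis ("This example is about {label}") and picking the label whose hypothesis the model most strongly entails. A minimal sketch of that mechanism, with the NLI model replaced by an injected scoring callable (the keyword-based `toy_scorer` below is a hypothetical stand-in, not the real model):

```python
import math

def zero_shot_classify(text, labels, entail_logit):
    """Zero-shot classification via NLI: each label becomes a hypothesis,
    and entailment logits are softmaxed over the label set."""
    logits = [entail_logit(text, f"This example is about {label}.")
              for label in labels]
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]       # numerically stable softmax
    total = sum(exps)
    probs = {lab: e / total for lab, e in zip(labels, exps)}
    best = max(probs, key=probs.get)
    return best, probs

# Toy stand-in for an NLI model's entailment logit: keyword overlap.
KEYWORDS = {"sports": {"striker", "scored", "goal"},
            "politics": {"election", "senate"},
            "cooking": {"recipe", "oven"}}

def toy_scorer(premise, hypothesis):
    words = set(premise.lower().replace(".", "").split())
    for label, kws in KEYWORDS.items():
        if label in hypothesis:
            return float(len(words & kws))
    return 0.0

label, probs = zero_shot_classify(
    "The striker scored twice in the final minutes.",
    ["sports", "politics", "cooking"],
    toy_scorer,
)
print(label)  # → sports
```

With a real NLI model, `entail_logit` would run the (premise, hypothesis) pair through the network and return the entailment logit; everything else stays the same.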
- **Silver Retriever Base V1.1** (ipipan · 862 downloads · 9 likes) — Encodes Polish sentences or paragraphs into a 768-dimensional dense vector space, suited to document retrieval and semantic search. Tags: Text Embedding, Transformers, Other.
- **Wav2vec2 Large Robust 24 Ft Age Gender** (audeering · 44.13k downloads · 33 likes) — Takes raw audio as input and outputs an age prediction and gender probabilities (child/female/male), along with the pooled state of the last transformer layer. Tags: Audio Classification, Transformers.
- **Silver Retriever Base V1** (ipipan · 554 downloads · 11 likes) — A neural retrieval model designed specifically for Polish, focused on sentence similarity and passage retrieval. Tags: Text Embedding, Transformers, Other.
- **Vegam Whisper Medium Ml** (smcproject · MIT · 83 downloads · 5 likes) — A version of thennal/whisper-medium-ml converted to the CTranslate2 format for Malayalam speech recognition. Tags: Speech Recognition, Other.
- **Whisper Small Japanese** (Ivydata · Apache-2.0 · 356 downloads · 5 likes) — A Japanese speech recognition model fine-tuned from openai/whisper-small for Japanese speech-to-text. Tags: Speech Recognition, Transformers, Japanese.
- **Reward Model Deberta V3 Large** (OpenAssistant · MIT · 796 downloads · 23 likes) — A reward model trained to predict which generated answer human evaluators would prefer for a given question. Tags: Large Language Model, Transformers, English.
- **T5 Base Korean Summarization** (eenzeenee · 148.32k downloads · 25 likes) — A Korean text summarization model obtained by fine-tuning paust/pko-t5-base on several Korean datasets. Tags: Text Generation, Transformers, Korean.
- **Whisper Large V2 French** (bofenghuang · Apache-2.0 · 103 downloads · 14 likes) — A French speech recognition model fine-tuned from openai/whisper-large-v2 on over 2,200 hours of French audio. Tags: Speech Recognition, Transformers, French.
- **Whisper Large V2 Mn 13** (bayartsogt · Apache-2.0 · 161 downloads · 6 likes) — A Mongolian automatic speech recognition model fine-tuned from OpenAI's whisper-large-v2 on Mongolian datasets. Tags: Speech Recognition, Transformers, Other.
- **T5 Xxl True Nli Mixture** (google · Apache-2.0 · 2,971 downloads · 46 likes) — A natural language inference (NLI) model based on T5-XXL that predicts entailment between text pairs ('1' for entailment, '0' for non-entailment). Tags: Large Language Model, Transformers, English.
- **Bert Large Portuguese Cased Sts** (rufimelo · 633 downloads · 8 likes) — A Portuguese semantic textual similarity model fine-tuned from BERTimbau-large; maps sentences into a 1024-dimensional vector space. Tags: Text Embedding, Transformers, Other.
- **Stt En Conformer Transducer Xlarge** (nvidia · 496 downloads · 54 likes) — An NVIDIA automatic speech recognition (ASR) model based on the Conformer-Transducer architecture, with roughly 600 million parameters, designed for English transcription. Tags: Speech Recognition, English.
- **Wav2vec2 Large Xlsr 53 Th Cv8 Deepcut** (wannaphong · Apache-2.0 · 504 downloads · 5 likes) — A Thai ASR model trained on the Common Voice v8 dataset, combining the DeepCut tokenizer with a language model to improve recognition accuracy. Tags: Speech Recognition, Transformers, Other.
- **Deberta V3 Large Mnli Fever Anli Ling Wanli** (MoritzLaurer · MIT · 312.01k downloads · 95 likes) — An NLI model fine-tuned from DeBERTa-v3-large, achieving state-of-the-art results on several NLI datasets. Tags: Text Classification, Transformers, English.
- **Wav2vec2 Base Vietnamese 160h** (khanhld · 356 downloads · 10 likes) — A Vietnamese speech recognition model based on Wav2vec 2.0, fine-tuned on 160 hours of Vietnamese speech. Tags: Speech Recognition, Transformers, Other.
- **Stt En Conformer Ctc Large** (nvidia · 3,740 downloads · 24 likes) — A large Conformer-based ASR model for English transcription, trained with the CTC loss. Tags: Speech Recognition, English.
- **Wav2vec2 Large Xlsr 53 Coraa Brazilian Portuguese Gain Normalization** (alefiury · Apache-2.0 · 28 downloads · 0 likes) — A Wav2vec 2.0 model fine-tuned for Brazilian Portuguese on several speech datasets, including CORAA, CETUC, and MLS. Tags: Speech Recognition, Transformers, Other.
- **Wav2vec2 Large Xlsr 53 Coraa Brazilian Portuguese Gain Normalization Sna** (alefiury · Apache-2.0 · 23 downloads · 2 likes) — A Wav2vec 2.0 model fine-tuned for Brazilian Portuguese on datasets including CORAA, CETUC, and Multilingual LibriSpeech. Tags: Speech Recognition, Transformers, Other.
- **Iwslt Asr Wav2vec Large 4500h** (nguyenvulebinh · 27 downloads · 2 likes) — A large English ASR model based on Wav2Vec 2.0, fine-tuned on 4,500 hours of multi-source speech data; supports decoding with a language model. Tags: Speech Recognition, Transformers, English.
- **Wav2vec2 Xls R 1b Dutch** (jonatasgrosman · Apache-2.0 · 146 downloads · 2 likes) — A Dutch ASR model fine-tuned from the 1-billion-parameter XLS-R model on datasets including Common Voice 8.0; expects 16 kHz audio input. Tags: Speech Recognition, Transformers, Other.
- **Wav2vec2 Xls R 1b Italian** (jonatasgrosman · Apache-2.0 · 2,703 downloads · 1 like) — An Italian ASR model based on XLS-R 1B, fine-tuned on several Italian datasets. Tags: Speech Recognition, Transformers, Other.
- **Wav2vec2 Large Xlsr 53 Sw** (alokmatta · Apache-2.0 · 158 downloads · 2 likes) — A Swahili ASR model fine-tuned from the XLSR-53 large model; expects 16 kHz audio input. Tags: Speech Recognition, Other.
- **Wav2vec2 Xls R 1b Polish** (jonatasgrosman · Apache-2.0 · 212 downloads · 0 likes) — A Polish ASR model fine-tuned from the 1-billion-parameter XLS-R model on datasets such as Common Voice 8.0; expects 16 kHz audio input. Tags: Speech Recognition, Transformers, Other.
- **All Mpnet Base V2** (navteca · MIT · 14 downloads · 1 like) — A sentence embedding model based on MPNet that maps text into a 768-dimensional vector space, suited to semantic search and sentence similarity. Tags: Text Embedding, English.
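Text-embedding models such as All Mpnet Base V2 and the Silver Retriever entries reduce retrieval to vector geometry: every document and query is mapped to a fixed-size vector (768-dimensional here) and candidates are ranked by cosine similarity. A minimal sketch with random toy vectors standing in for real model outputs:

```python
import numpy as np

def cosine_rank(query_vec: np.ndarray, doc_vecs: np.ndarray) -> np.ndarray:
    """Rank documents by cosine similarity to the query, best match first."""
    q = query_vec / np.linalg.norm(query_vec)
    d = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    sims = d @ q                  # one cosine similarity per document
    return np.argsort(-sims)      # indices sorted by descending similarity

rng = np.random.default_rng(42)
dim = 768                                      # matches the model's embedding size
docs = rng.normal(size=(5, dim))               # toy "document embeddings"
query = docs[3] + 0.1 * rng.normal(size=dim)   # query vector close to document 3

order = cosine_rank(query, docs)
print(order[0])  # → 3 (the nearest document)
```

In practice the document vectors are precomputed once and stored in a vector index; only the query is embedded at search time.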
- **Roberta Base Chinese Extractive Qa** (uer · 2,694 downloads · 98 likes) — A Chinese extractive question-answering model based on RoBERTa that extracts answer spans from a given text. Tags: Question Answering System, Chinese.