# Multi-dataset training
- **Ritrieve Zh V1 GGUF** (mradermacher · MIT · 212 downloads · 1 like) — A static quantized version of the richinfoai/ritrieve_zh_v1 model; quantization reduces storage and compute requirements while largely preserving quality. Tags: Large Language Model, Transformers, Chinese.
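Static quantization, as used in GGUF builds like the one above, trades numeric precision for storage: float32 weights are mapped onto a small integer grid and reconstructed with a scale factor. A minimal sketch of the idea (symmetric per-tensor int8 quantization with NumPy; this illustrates the principle only, not llama.cpp's actual GGUF formats):

```python
import numpy as np

def quantize_int8(w: np.ndarray):
    """Symmetric per-tensor int8 quantization: w ~= scale * q."""
    peak = float(np.abs(w).max())
    scale = peak / 127.0 if peak > 0 else 1.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize_int8(q: np.ndarray, scale: float) -> np.ndarray:
    """Reconstruct an approximation of the original float weights."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.normal(size=1024).astype(np.float32)  # stand-in for a weight tensor
q, scale = quantize_int8(w)
w_hat = dequantize_int8(q, scale)

print(q.nbytes, w.nbytes)               # int8 storage is 4x smaller than float32
print(float(np.abs(w - w_hat).max()))   # per-element error bounded by ~scale/2
```

Real quantization schemes use per-block scales and mixed bit widths, but the storage-versus-error trade-off is the same.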
- **Chunkformer Large Vie** (khanhld · 1,765 downloads · 12 likes) — A large-scale Vietnamese automatic speech recognition model based on the ChunkFormer architecture, fine-tuned on roughly 3,000 hours of publicly available Vietnamese speech data, with strong reported performance. Tags: Speech Recognition, PyTorch, Other.
- **Bert Uncased Intent Classification** (yeniguno · Apache-2.0 · 1,942 downloads · 1 like) — A fine-tuned BERT model that classifies user inputs into 82 intents, suited to dialogue systems and natural language understanding tasks. Tags: Text Classification, Transformers, English.
- **Aura 4B GGUF** (bartowski · Apache-2.0 · 290 downloads · 8 likes) — Quantized builds of AuraIndustries/Aura-4B produced with llama.cpp imatrix quantization; multiple quantization types are available for text generation. Tags: Large Language Model, English.
- **Viwhisper Medium** (NhutP · MIT · 139 downloads · 4 likes) — A Whisper-medium model optimized for Vietnamese speech recognition, fine-tuned on 1,308 hours of Vietnamese data. Tags: Speech Recognition, Transformers, Other.
- **Whisper Ja Anime V0.1** (efwkjn · 205 downloads · 15 likes) — A Whisper variant focused on Japanese anime speech recognition, optimized for the acoustic characteristics of anime audio. Tags: Speech Recognition, Japanese.
- **Llama3 Aloe 8B Alpha GGUF** (tensorblock · 224 downloads · 1 like) — Llama3-Aloe-8B-Alpha is an 8B-parameter large language model focused on biology and medicine, offered here as GGUF-format quantizations. Tags: Large Language Model, Transformers, English.
- **Noobai Xl Nai Xl Epsilonpred10version Sdxl** (John6666 · Other license · 87 downloads · 3 likes) — An SDXL-based anime-style text-to-image model, beginner-friendly and capable of generating high-quality anime characters and stylized images. Tags: Image Generation, English.
- **Birefnet Matting** (ZhengPeng7 · 1,578 downloads · 18 likes) — BiRefNet is a high-resolution dichotomous image segmentation model based on bilateral reference, focused on background removal and mask generation. Tags: Image Segmentation.
- **Birefnet Lite 2K** (ZhengPeng7 · 3,400 downloads · 8 likes) — A lightweight 2K variant of the BiRefNet bilateral-reference framework for high-resolution dichotomous image segmentation, focused on background removal and mask generation. Tags: Image Segmentation.
- **Octo Base 1.5** (rail-berkeley · MIT · 87 downloads · 14 likes) — Octo is a multimodal foundation model for robotics that predicts robot actions from visual and language inputs. Tags: Multimodal Fusion, Transformers.
- **Rad Dino** (microsoft · Other license · 411.96k downloads · 48 likes) — A Vision Transformer trained with self-supervised DINOv2, designed specifically for encoding chest X-ray images. Tags: Image Classification, Transformers.
- **Whisper Tiny Vi** (doof-ferb · Apache-2.0 · 44 downloads · 2 likes) — A Vietnamese automatic speech recognition (ASR) model fine-tuned from OpenAI's Whisper-tiny, performing well across several Vietnamese datasets. Tags: Speech Recognition, Transformers, Other.
- **Finance LLM GGUF** (TheBloke · Other license · 641 downloads · 21 likes) — Finance LLM is a finance-domain language model based on the Llama architecture, fine-tuned on datasets such as OpenOrca, Lima, and WizardLM. Tags: Large Language Model, English.
- **Deberta V3 Large Mnli Fever Anli Ling Wanli Binary** (MoritzLaurer · MIT · 30 downloads · 0 likes) — A zero-shot classification model based on DeBERTa-v3-large, trained mainly on five NLI datasets; best suited to tasks that strictly follow the original NLI format. Tags: Text Classification, Transformers, English.
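NLI models like the DeBERTa entry above enable zero-shot classification by turning each candidate label into a hypothesis ("This example is about {label}") and picking the label whose hypothesis the model most strongly entails. A minimal sketch of that mechanism, with the NLI model replaced by an injected scoring callable (the keyword-based `toy_scorer` below is a hypothetical stand-in, not the real model):

```python
import math

def zero_shot_classify(text, labels, entail_logit):
    """Zero-shot classification via NLI: each label becomes a hypothesis,
    and entailment logits are softmaxed over the label set."""
    logits = [entail_logit(text, f"This example is about {label}.")
              for label in labels]
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]       # numerically stable softmax
    total = sum(exps)
    probs = {lab: e / total for lab, e in zip(labels, exps)}
    best = max(probs, key=probs.get)
    return best, probs

# Toy stand-in for an NLI model's entailment logit: keyword overlap.
KEYWORDS = {"sports": {"striker", "scored", "goal"},
            "politics": {"election", "senate"},
            "cooking": {"recipe", "oven"}}

def toy_scorer(premise, hypothesis):
    words = set(premise.lower().replace(".", "").split())
    for label, kws in KEYWORDS.items():
        if label in hypothesis:
            return float(len(words & kws))
    return 0.0

label, probs = zero_shot_classify(
    "The striker scored twice in the final minutes.",
    ["sports", "politics", "cooking"],
    toy_scorer,
)
print(label)  # → sports
```

With a real NLI model, `entail_logit` would run the (premise, hypothesis) pair through the network and return the entailment logit; everything else stays the same.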
- **Silver Retriever Base V1.1** (ipipan · 862 downloads · 9 likes) — Encodes Polish sentences or paragraphs into a 768-dimensional dense vector space, suited to document retrieval and semantic search. Tags: Text Embedding, Transformers, Other.
- **Wav2vec2 Large Robust 24 Ft Age Gender** (audeering · 44.13k downloads · 33 likes) — Takes raw audio as input and outputs an age prediction and gender probabilities (child/female/male), along with the pooled state of the last transformer layer. Tags: Audio Classification, Transformers.
- **Silver Retriever Base V1** (ipipan · 554 downloads · 11 likes) — A neural retrieval model designed specifically for Polish, focused on sentence similarity and passage retrieval. Tags: Text Embedding, Transformers, Other.
- **Vegam Whisper Medium Ml** (smcproject · MIT · 83 downloads · 5 likes) — A version of thennal/whisper-medium-ml converted to the CTranslate2 format for Malayalam speech recognition. Tags: Speech Recognition, Other.
- **Whisper Small Japanese** (Ivydata · Apache-2.0 · 356 downloads · 5 likes) — A Japanese speech recognition model fine-tuned from openai/whisper-small for Japanese speech-to-text. Tags: Speech Recognition, Transformers, Japanese.
- **Reward Model Deberta V3 Large** (OpenAssistant · MIT · 796 downloads · 23 likes) — A reward model trained to predict which generated answer human evaluators would prefer for a given question. Tags: Large Language Model, Transformers, English.
- **T5 Base Korean Summarization** (eenzeenee · 148.32k downloads · 25 likes) — A Korean text summarization model obtained by fine-tuning paust/pko-t5-base on several Korean datasets. Tags: Text Generation, Transformers, Korean.
- **Whisper Large V2 French** (bofenghuang · Apache-2.0 · 103 downloads · 14 likes) — A French speech recognition model fine-tuned from openai/whisper-large-v2 on over 2,200 hours of French audio. Tags: Speech Recognition, Transformers, French.
- **Whisper Large V2 Mn 13** (bayartsogt · Apache-2.0 · 161 downloads · 6 likes) — A Mongolian automatic speech recognition model fine-tuned from OpenAI's whisper-large-v2 on Mongolian datasets. Tags: Speech Recognition, Transformers, Other.
- **T5 Xxl True Nli Mixture** (google · Apache-2.0 · 2,971 downloads · 46 likes) — A natural language inference (NLI) model based on T5-XXL that predicts entailment between text pairs ('1' for entailment, '0' for non-entailment). Tags: Large Language Model, Transformers, English.
- **Bert Large Portuguese Cased Sts** (rufimelo · 633 downloads · 8 likes) — A Portuguese semantic textual similarity model fine-tuned from BERTimbau-large; maps sentences into a 1024-dimensional vector space. Tags: Text Embedding, Transformers, Other.
- **Stt En Conformer Transducer Xlarge** (nvidia · 496 downloads · 54 likes) — An NVIDIA automatic speech recognition (ASR) model based on the Conformer-Transducer architecture, with roughly 600 million parameters, designed for English transcription. Tags: Speech Recognition, English.
- **Wav2vec2 Large Xlsr 53 Th Cv8 Deepcut** (wannaphong · Apache-2.0 · 504 downloads · 5 likes) — A Thai ASR model trained on the Common Voice v8 dataset, combining the DeepCut tokenizer with a language model to improve recognition accuracy. Tags: Speech Recognition, Transformers, Other.
- **Deberta V3 Large Mnli Fever Anli Ling Wanli** (MoritzLaurer · MIT · 312.01k downloads · 95 likes) — An NLI model fine-tuned from DeBERTa-v3-large, achieving state-of-the-art results on several NLI datasets. Tags: Text Classification, Transformers, English.
- **Wav2vec2 Base Vietnamese 160h** (khanhld · 356 downloads · 10 likes) — A Vietnamese speech recognition model based on Wav2vec 2.0, fine-tuned on 160 hours of Vietnamese speech. Tags: Speech Recognition, Transformers, Other.
- **Stt En Conformer Ctc Large** (nvidia · 3,740 downloads · 24 likes) — A large Conformer-based ASR model for English transcription, trained with the CTC loss. Tags: Speech Recognition, English.
- **Wav2vec2 Large Xlsr 53 Coraa Brazilian Portuguese Gain Normalization** (alefiury · Apache-2.0 · 28 downloads · 0 likes) — A Wav2vec 2.0 model fine-tuned for Brazilian Portuguese on several speech datasets, including CORAA, CETUC, and MLS. Tags: Speech Recognition, Transformers, Other.
- **Wav2vec2 Large Xlsr 53 Coraa Brazilian Portuguese Gain Normalization Sna** (alefiury · Apache-2.0 · 23 downloads · 2 likes) — A Wav2vec 2.0 model fine-tuned for Brazilian Portuguese on datasets including CORAA, CETUC, and Multilingual LibriSpeech. Tags: Speech Recognition, Transformers, Other.
- **Iwslt Asr Wav2vec Large 4500h** (nguyenvulebinh · 27 downloads · 2 likes) — A large English ASR model based on Wav2Vec 2.0, fine-tuned on 4,500 hours of multi-source speech data; supports decoding with a language model. Tags: Speech Recognition, Transformers, English.
- **Wav2vec2 Xls R 1b Dutch** (jonatasgrosman · Apache-2.0 · 146 downloads · 2 likes) — A Dutch ASR model fine-tuned from the 1-billion-parameter XLS-R model on datasets including Common Voice 8.0; expects 16 kHz audio input. Tags: Speech Recognition, Transformers, Other.
- **Wav2vec2 Xls R 1b Italian** (jonatasgrosman · Apache-2.0 · 2,703 downloads · 1 like) — An Italian ASR model based on XLS-R 1B, fine-tuned on several Italian datasets. Tags: Speech Recognition, Transformers, Other.
- **Wav2vec2 Large Xlsr 53 Sw** (alokmatta · Apache-2.0 · 158 downloads · 2 likes) — A Swahili ASR model fine-tuned from the XLSR-53 large model; expects 16 kHz audio input. Tags: Speech Recognition, Other.
- **Wav2vec2 Xls R 1b Polish** (jonatasgrosman · Apache-2.0 · 212 downloads · 0 likes) — A Polish ASR model fine-tuned from the 1-billion-parameter XLS-R model on datasets such as Common Voice 8.0; expects 16 kHz audio input. Tags: Speech Recognition, Transformers, Other.
- **All Mpnet Base V2** (navteca · MIT · 14 downloads · 1 like) — A sentence embedding model based on MPNet that maps text into a 768-dimensional vector space, suited to semantic search and sentence similarity. Tags: Text Embedding, English.
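Text-embedding models such as All Mpnet Base V2 and the Silver Retriever entries reduce retrieval to vector geometry: every document and query is mapped to a fixed-size vector (768-dimensional here) and candidates are ranked by cosine similarity. A minimal sketch with random toy vectors standing in for real model outputs:

```python
import numpy as np

def cosine_rank(query_vec: np.ndarray, doc_vecs: np.ndarray) -> np.ndarray:
    """Rank documents by cosine similarity to the query, best match first."""
    q = query_vec / np.linalg.norm(query_vec)
    d = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    sims = d @ q                  # one cosine similarity per document
    return np.argsort(-sims)      # indices sorted by descending similarity

rng = np.random.default_rng(42)
dim = 768                                      # matches the model's embedding size
docs = rng.normal(size=(5, dim))               # toy "document embeddings"
query = docs[3] + 0.1 * rng.normal(size=dim)   # query vector close to document 3

order = cosine_rank(query, docs)
print(order[0])  # → 3 (the nearest document)
```

In practice the document vectors are precomputed once and stored in a vector index; only the query is embedded at search time.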
- **Roberta Base Chinese Extractive Qa** (uer · 2,694 downloads · 98 likes) — A Chinese extractive question-answering model based on RoBERTa that extracts answer spans from a given text. Tags: Question Answering System, Chinese.