# Lightweight deployment

Qwen2.5 VL 7B Meteorology GGUF
Other
Quantized version of Qwen2.5-VL-7B-Meteorology, suitable for image-text processing tasks related to meteorology.
Image-to-Text Transformers English
mradermacher
159
1
Midm 2.0 Mini Instruct GGUF
MIT
Mi:dm 2.0 is a Korea-centric AI model developed by KT using its proprietary technology. It internalizes the values, cognitive frameworks, and common-sense reasoning of South Korean society, so it can process and generate Korean content while reflecting South Korean social and cultural norms.
Large Language Model Transformers Supports Multiple Languages
mykor
470
3
Tencent.hunyuan A13B Instruct GGUF
A quantized version of the Tencent Hunyuan A13B Instruct model. Quantization reduces memory use and improves inference efficiency while aiming to preserve model quality.
Large Language Model
DevQuasar
402
1
Apollo2 7B GGUF
Apache-2.0
Apollo2-7B-GGUF is a quantized version of FreedomIntelligence/Apollo2-7B, supporting medical large language model applications in multiple languages.
Large Language Model Supports Multiple Languages
QuantFactory
111
3
Qwen3 Embedding 8B 4bit DWQ
Apache-2.0
This is a 4-bit DWQ quantized version converted from Qwen/Qwen3-Embedding-8B, for use as an embedding model with the MLX framework.
Text Embedding
mlx-community
213
1
PP OCRv4 Mobile Det
Apache-2.0
PP-OCRv4_mobile_det is an efficient text detection model optimized for mobile devices developed by the PaddleOCR team, suitable for deployment on edge devices.
Text Recognition Supports Multiple Languages
PaddlePaddle
360
0
Qwen.qwen3 Reranker 0.6B GGUF
The quantized version of Qwen3-Reranker-0.6B, dedicated to making knowledge accessible to everyone.
Large Language Model
DevQuasar
1,481
3
PP OCRv5 Mobile Det
Apache-2.0
PP-OCRv5_mobile_det is the latest generation of lightweight text detection model developed by the PaddleOCR team, supporting efficient text detection in multiple languages and scenarios.
Text Recognition Supports Multiple Languages
PaddlePaddle
556
0
Fpham Sydney Overthinker 13b HF GGUF
This project provides optimized GGUF quantized files intended to improve inference efficiency. The quantized files are supported by Featherless AI, where users can run any model for a small fee.
Large Language Model
featherless-ai-quants
133
1
Kakaocorp.kanana Safeguard 8b GGUF
This project is a quantized version of kakaocorp/kanana-safeguard-8b, dedicated to making knowledge accessible to the public.
Large Language Model
DevQuasar
156
1
Josiefied DeepSeek R1 0528 Qwen3 8B Abliterated V1 8bit
This is an 8-bit quantized version in MLX format converted from the DeepSeek-R1-0528-Qwen3-8B model, suitable for text generation tasks.
Large Language Model
mlx-community
847
1
Deepseek R1 0528 Qwen3 8B 4bit
MIT
This model is a 4-bit quantized version converted from DeepSeek-R1-0528-Qwen3-8B, optimized for the MLX framework and suitable for text generation tasks.
Large Language Model
mlx-community
924
1
Qwen2.5 Omni 7B GGUF
Other
Qwen2.5-Omni-7B-GGUF is the GGUF format version of the Qwen2.5-Omni-7B model, supporting multimodal inputs including text, audio, and images.
Large Language Model English
ggml-org
319
3
Bytedance Seed.academic Ds 9B GGUF
This project provides a quantized version of academic-ds-9B, aiming to make knowledge accessible to everyone.
Large Language Model
DevQuasar
277
1
Devstral Small 2505 8bit
Apache-2.0
Devstral-Small-2505-8bit is an 8-bit quantized model converted from mistralai/Devstral-Small-2505, suitable for the MLX framework and supporting text generation tasks in multiple languages.
Large Language Model Supports Multiple Languages
mlx-community
789
1
Skywork Skywork OR1 7B GGUF
Skywork-OR1-7B is a 7B-parameter large language model offering multiple quantization versions to accommodate different hardware requirements.
Large Language Model
bartowski
634
1
Qwen3 4B 4bit DWQ
Apache-2.0
This model is a 4-bit DWQ quantized version of Qwen3-4B, converted to the MLX format for easy text generation using the mlx library.
Large Language Model
mlx-community
517
2
Openvision Vit Large Patch14 84
Apache-2.0
OpenVision is a fully open, cost-effective family of advanced visual encoders focused on multimodal learning tasks.
Image Classification Transformers
UCSC-VLAA
21
0
Huihui Ai.qwen3 4B Abliterated GGUF
The quantized version of Huihui AI's Qwen3-4B model, aiming to make knowledge more widely accessible to the public.
Large Language Model
DevQuasar
540
1
Phi 4 Mini Reasoning GGUF
MIT
Phi-4-mini-reasoning is a lightweight open model built on synthetic data, focusing on high-quality, reasoning-rich data, and further fine-tuned for more advanced mathematical reasoning capabilities.
Large Language Model Transformers
Mungert
3,592
3
Josiefied Qwen3 8B Abliterated V1 8bit
An optimized 8-bit quantized version of Qwen3-8B, designed for efficient inference on the MLX framework.
Large Language Model
mlx-community
450
1
Josiefied Qwen3 4B Abliterated V1 6bit
This is a 6-bit quantized version of the Qwen3-4B model converted to the MLX format, suitable for text generation tasks.
Large Language Model
mlx-community
15
1
Qwen3 8B 4bit DWQ
Apache-2.0
Qwen3-8B-4bit-DWQ is a 4-bit quantized version of Qwen/Qwen3-8B converted to the MLX format, optimized for efficient operation on Apple devices.
Large Language Model
mlx-community
306
1
Ast Finetuned Audioset 10 10 0.4593 ONNX
This is the ONNX version of the AST (Audio Spectrogram Transformer) model, designed specifically for audio classification tasks and fine-tuned on the AudioSet dataset.
Audio Classification Transformers
onnx-community
684
1
Microsoft Phi 4 Mini Reasoning GGUF
MIT
This is a quantized version of the Microsoft Phi-4-mini-reasoning model, produced with llama.cpp to improve the model's efficiency across different hardware environments.
Large Language Model Supports Multiple Languages
bartowski
1,667
7
Muyan TTS SFT Q8 0 GGUF
This model is a GGUF format text-to-speech model converted from MYZY-AI/Muyan-TTS-SFT, supporting Chinese speech synthesis.
Speech Synthesis
NikolayKozloff
20
1
Fdtn Ai.foundation Sec 8B GGUF
Foundation-Sec-8B is a large language model based on the Transformer architecture, focusing on text generation tasks.
Large Language Model
DevQuasar
1,248
2
Industry Project V2
Apache-2.0
An instruction-fine-tuned model based on the Mistral architecture, suitable for zero-shot classification tasks.
Large Language Model
omsh97
58
0
Qwen3 8B 4bit
Apache-2.0
This is the 4-bit quantized version of the Qwen/Qwen3-8B model, converted to the MLX framework format, suitable for efficient inference on Apple silicon devices.
Large Language Model
mlx-community
2,131
2
Qwen3 4B 4bit
Apache-2.0
Qwen3-4B-4bit is a 4-bit quantized version converted from Qwen/Qwen3-4B to the MLX format, designed for efficient operation on Apple chips.
Large Language Model
mlx-community
7,400
6
Qwen3 4B MNN
Apache-2.0
A 4-bit quantized MNN version of Qwen3-4B, used for efficient text generation tasks.
Large Language Model English
taobao-mnn
10.60k
2
Internvl2 5 1B MNN
Apache-2.0
A 4-bit quantized version based on InternVL2_5-1B, suitable for text generation and chat scenarios.
Large Language Model English
taobao-mnn
2,718
1
Deepcogito Cogito V1 Preview Llama 3B GGUF
A 3B-parameter language model based on the Llama architecture, offering multiple quantization versions to suit different hardware needs.
Large Language Model
tensorblock
162
1
Mistral Small 24B Instruct 2501 GGUF
Apache-2.0
Mistral-Small-24B-Instruct-2501 is a 24B-parameter instruction-finetuned large language model supporting multilingual text generation tasks.
Large Language Model Supports Multiple Languages
bartowski
48.61k
111
Gemma 3 27b It Qat Unsloth Bnb 4bit
Gemma 3 is a lightweight, state-of-the-art multimodal open-source model launched by Google, capable of processing text and image inputs and generating text outputs.
Image-to-Text Transformers
unsloth
2,591
1
Gemma 3 1b It Qat
Gemma 3 is a lightweight multimodal model launched by Google, capable of processing text and image inputs and generating text outputs. This model has a large 128K context window and multilingual support for over 140 languages.
Image-to-Text Transformers
unsloth
2,558
1
Hyperclovax SEED Text Instruct 0.5B
Other
A Korean-optimized text generation model with instruction-following capability, featuring a lightweight design suitable for edge device deployment.
Large Language Model Transformers
naver-hyperclovax
7,531
60
Gemma 3 4b It Qat GGUF
Gemma 3 is a lightweight, advanced open model series from Google, built on the same research and technology used to create Gemini models. This model is multimodal, capable of processing both text and image inputs to generate text outputs.
Image-to-Text English
unsloth
2,629
2
Gigaam V2 ONNX
MIT
GigaAM v2 is an automatic speech recognition (ASR) model that supports Russian speech-to-text tasks, offering both CTC and RNN-T architectures.
Speech Recognition Other
istupakov
170
2
Gemma 3 27b It Qat
Gemma is a lightweight open model series launched by Google, built on Gemini model technology. Gemma 3 is a multimodal model supporting text and image inputs with text outputs, featuring a large 128K context window and multilingual capabilities.
Image-to-Text Transformers
unsloth
168
2