# Local Deployment

## Jan Nano GGUF
**License:** Apache-2.0 · **Tags:** Large Language Model · **Author:** unsloth · **Downloads:** 1,884 · **Likes:** 1

Jan-Nano is a compact 4-billion-parameter language model designed specifically for in-depth research tasks and optimized for MCP server integration.

## Openaudio Gguf
**Tags:** Speech Synthesis · **Author:** calcuis · **Downloads:** 287 · **Likes:** 3

OpenAudio is a text-to-speech synthesis tool based on the FishAudio model; this is its GGUF quantized version, runnable with a few simple commands for convenient local speech synthesis.
## Qwen3 235B A22B Mixed 3 6bit
**License:** Apache-2.0 · **Tags:** Large Language Model · **Author:** mlx-community · **Downloads:** 100 · **Likes:** 2

A mixed 3-6 bit quantized conversion of the Qwen/Qwen3-235B-A22B model, optimized for efficient inference on the Apple MLX framework.
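MLX-format conversions like this one are typically run with the `mlx_lm` Python package on Apple Silicon. The sketch below is a minimal, illustrative example; the repository id is an assumption derived from the author and model name above, and the prompt is a placeholder.

```python
# Minimal sketch: running an MLX-community conversion with mlx_lm on Apple Silicon.
# The repo id below is an assumption based on this listing; substitute the actual id.
from mlx_lm import load, generate

model, tokenizer = load("mlx-community/Qwen3-235B-A22B-mixed-3-6bit")  # assumed repo id
prompt = "Explain the trade-off between 3-bit and 6-bit quantization in one paragraph."
text = generate(model, tokenizer, prompt=prompt, max_tokens=256, verbose=True)
print(text)
```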
## Qwen3 8B Q4 K M GGUF
**License:** Apache-2.0 · **Tags:** Large Language Model, Transformers · **Author:** ufoym · **Downloads:** 342 · **Likes:** 3

The GGUF-format version of the Qwen3-8B model, compatible with the llama.cpp framework and suitable for text generation tasks.
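GGUF builds such as this one are normally loaded with llama.cpp or its Python bindings. Below is a minimal sketch using `llama-cpp-python`; the model path and prompt are placeholders, not part of this listing.

```python
# Minimal sketch: local chat inference over a GGUF file with llama-cpp-python.
# The model path is a placeholder; point it at the downloaded Q4_K_M file.
from llama_cpp import Llama

llm = Llama(
    model_path="./Qwen3-8B-Q4_K_M.gguf",  # placeholder path
    n_ctx=4096,        # context window size
    n_gpu_layers=-1,   # offload all layers to GPU if one is available
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Summarize what GGUF quantization is."}],
    max_tokens=256,
)
print(out["choices"][0]["message"]["content"])
```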
## Qwen Qwen3 8B GGUF
**License:** Apache-2.0 · **Tags:** Large Language Model · **Author:** tensorblock · **Downloads:** 452 · **Likes:** 1

A GGUF-format quantized version of Qwen3-8B, provided by TensorBlock and compatible with llama.cpp.

## Qwen3 8B MLX 8bit
**License:** Apache-2.0 · **Tags:** Large Language Model · **Author:** lmstudio-community · **Downloads:** 63.46k · **Likes:** 2

An 8-bit quantized large language model in MLX format, converted from Qwen/Qwen3-8B and suitable for text generation tasks.

## Medra4b LORA
**License:** Apache-2.0 · **Tags:** Large Language Model, Supports Multiple Languages · **Author:** drwlf · **Downloads:** 53 · **Likes:** 1

Medra is a lightweight medical language model built on Gemma 3, designed for clinical reasoning, education, and dialogue modeling, and suitable for local and mobile environments.
## Deepseek R1 GGUF UD
**License:** MIT · **Tags:** Large Language Model, English · **Author:** unsloth · **Downloads:** 3,149 · **Likes:** 11

A GGUF build of DeepSeek-R1 using Unsloth Dynamic v2.0 quantization, which aims to retain accuracy while reducing model size for local inference.

## 3b Zh Ft Research Release Q8 0 GGUF
**License:** Apache-2.0 · **Tags:** Large Language Model, Chinese · **Author:** cludyw · **Downloads:** 20 · **Likes:** 0

Converted from canopylabs/3b-zh-ft-research_release into GGUF format, suitable for Chinese text generation tasks.

## Barcenas 4b
**Tags:** Image-to-Text, Transformers, English · **Author:** Danielbrdz · **Downloads:** 15 · **Likes:** 2

A multimodal model built on google/gemma-3-4b-it, specializing in high-quality data for mathematics, programming, science, and puzzle-solving tasks.
## Orpheus 3b 0.1 Ft Q8 0 GGUF
**License:** Apache-2.0 · **Tags:** Large Language Model, English · **Author:** dodgeinmedia · **Downloads:** 22 · **Likes:** 0

Converted from canopylabs/orpheus-3b-0.1-ft into GGUF format, suitable for text generation tasks.

## Gemma 3 4b It GGUF
**Tags:** Large Language Model · **Author:** ysn-rfd · **Downloads:** 62 · **Likes:** 1

Converted from google/gemma-3-4b-it to GGUF format via llama.cpp, suitable for local deployment and inference.

## Deepseek V3 0324 GGUF
**License:** MIT · **Tags:** Large Language Model · **Author:** MaziyarPanahi · **Downloads:** 97.25k · **Likes:** 19

GGUF quantized version of DeepSeek-V3-0324, suitable for local text generation tasks.

## Gemma 3 4b It Q4 K M GGUF
**Tags:** Large Language Model · **Author:** DravenBlack · **Downloads:** 186 · **Likes:** 1

Gemma 3 4B IT is an open large language model developed by Google; this release is a 4-bit (Q4_K_M) quantization converted to GGUF format via llama.cpp.
## Qwen2 VL 7B Captioner Relaxed GGUF
**License:** Apache-2.0 · **Tags:** Image-to-Text, English · **Author:** r3b31 · **Downloads:** 321 · **Likes:** 1

A GGUF-format conversion of Qwen2-VL-7B-Captioner-Relaxed, optimized for image-to-text tasks and runnable with tools such as llama.cpp and KoboldCpp.

## Thedrummer Gemmasutra 9B V1.1 GGUF
**License:** Other · **Tags:** Large Language Model · **Author:** bartowski · **Downloads:** 1,198 · **Likes:** 6

A quantized build of the TheDrummer/Gemmasutra-9B-v1.1 model, produced with llama.cpp and suitable for text generation tasks.

## Qwen2.5 Coder 0.5B Q8 0 GGUF
**License:** Apache-2.0 · **Tags:** Large Language Model, Supports Multiple Languages · **Author:** ggml-org · **Downloads:** 943 · **Likes:** 5

A GGUF-format model converted from Qwen2.5-Coder-0.5B, suitable for code generation tasks.
## Mistral Small 24B Instruct 2501 GGUF
**Tags:** Large Language Model · **Author:** MaziyarPanahi · **Downloads:** 474.73k · **Likes:** 2

GGUF quantized version of Mistral-Small-24B-Instruct-2501, suitable for local deployment and text generation tasks.

## Llama 3.1 0x Mini Q8 0 GGUF
**Tags:** Large Language Model · **Author:** NikolayKozloff · **Downloads:** 19 · **Likes:** 1

A GGUF-format model converted from ozone-ai/llama-3.1-0x-mini, suitable for the llama.cpp framework.

## Llama 2 7b Chat Hf Q4 K M GGUF
**Tags:** Large Language Model, English · **Author:** matrixportal · **Downloads:** 220 · **Likes:** 4

GGUF quantized version of Meta's 7B-parameter Llama 2 chat model, suitable for local deployment and inference.

## Qwen2.5 7B Instruct GGUF
**Tags:** Large Language Model · **Author:** MaziyarPanahi · **Downloads:** 194.13k · **Likes:** 10

GGUF-format quantized version of Qwen2.5-7B-Instruct, suitable for text generation tasks.

## Yi Coder 1.5B Chat GGUF
**Tags:** Large Language Model · **Author:** MaziyarPanahi · **Downloads:** 254.78k · **Likes:** 10

GGUF-format model files for 01-ai/Yi-Coder-1.5B-Chat, suitable for text generation tasks.
## Prem 1B SQL
**License:** Apache-2.0 · **Tags:** Large Language Model, Safetensors, English · **Author:** premai-io · **Downloads:** 521 · **Likes:** 35

Prem-1B-SQL is a 1-billion-parameter text-to-SQL model developed by Prem AI, designed for local deployment and able to run on low-end GPU and CPU devices.
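Since the model ships as safetensors weights, one way to try it locally is through the Hugging Face `transformers` API. The sketch below assumes the repo id `premai-io/prem-1B-SQL` and a simple schema-plus-question prompt; the exact prompt template the model expects may differ, so check its model card.

```python
# Minimal sketch: prompting a text-to-SQL model with transformers on CPU.
# Repo id and prompt layout are assumptions, not confirmed by this listing.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "premai-io/prem-1B-SQL"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = (
    "Schema: CREATE TABLE orders(id INT, customer TEXT, total REAL);\n"
    "Question: What is the total revenue per customer?\n"
    "SQL:"
)
inputs = tokenizer(prompt, return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=128, do_sample=False)
# Decode only the newly generated tokens (the SQL continuation).
print(tokenizer.decode(output_ids[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```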
## Phi 3.5 Mini Instruct Uncensored GGUF
**License:** Apache-2.0 · **Tags:** Large Language Model · **Author:** bartowski · **Downloads:** 1,953 · **Likes:** 42

Phi-3.5-mini-instruct_Uncensored is a quantized language model suitable for a wide range of hardware configurations.

## Gguf Sharded LaMini Flan T5 248M
**Tags:** Large Language Model, English · **Author:** Felladrin · **Downloads:** 30 · **Likes:** 1

A GGUF-format model converted from MBZUAI/LaMini-Flan-T5-248M, suitable for text generation tasks.

## Phi 3 Mini 128k Instruct Function GGUF
**Tags:** Large Language Model · **Author:** afrideva · **Downloads:** 40 · **Likes:** 1

Phi-3-mini-128k-instruct_function is a GGUF-quantized text generation model offered at multiple quantization levels.

## Vecteus V1 Gguf
**License:** Apache-2.0 · **Tags:** Large Language Model, Supports Multiple Languages · **Author:** Local-Novel-LLM-project · **Downloads:** 588 · **Likes:** 8

GGUF-format version of Vecteus-v1, supporting English and Japanese text generation.

## Meta Llama 3 70B Instruct GGUF
**Tags:** Large Language Model, Transformers, English · **Author:** PawanKrd · **Downloads:** 468 · **Likes:** 4

The GGUF-format version of Llama 3 70B Instruct, for more efficient local inference.
## Distil Whisper Large V3 German
**License:** Apache-2.0 · **Tags:** Speech Recognition, Transformers, German · **Author:** primeline · **Downloads:** 207 · **Likes:** 15

A German speech recognition model based on distil-whisper, with 756 million parameters, offering faster inference while maintaining high transcription quality.
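Distil-Whisper checkpoints are usually driven through the `transformers` speech-recognition pipeline. A minimal sketch follows; the repo id is an assumption based on the author and model name above, and the audio file is a placeholder.

```python
# Minimal sketch: German speech-to-text with the transformers ASR pipeline.
# Repo id is assumed from this listing; the audio path is a placeholder.
from transformers import pipeline

asr = pipeline(
    "automatic-speech-recognition",
    model="primeline/distil-whisper-large-v3-german",  # assumed repo id
    chunk_length_s=30,  # split long recordings into 30-second chunks
)

result = asr("sample_german_audio.wav")
print(result["text"])
```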
## Longalpaca 13B GGUF
**Tags:** Large Language Model · **Author:** MaziyarPanahi · **Downloads:** 285 · **Likes:** 3

The GGUF quantized version of the Yukang/LongAlpaca-13B model, offering 2- to 8-bit quantization options and suitable for local text generation tasks.

## Gemma 2b It GGUF
**License:** Other · **Tags:** Large Language Model · **Author:** MaziyarPanahi · **Downloads:** 517 · **Likes:** 10

GGUF quantized version of the Gemma 2B model, suitable for local deployment and inference.

## OPEN SOLAR KO 10.7B GGUF
**License:** Apache-2.0 · **Tags:** Large Language Model, Supports Multiple Languages · **Author:** MaziyarPanahi · **Downloads:** 86 · **Likes:** 1

A GGUF-format quantized version of the beomi/OPEN-SOLAR-KO-10.7B model, offering 2- to 8-bit quantization levels and suitable for Korean and English text generation tasks.
## Stable Diffusion V2 1 GGUF
**Tags:** Text-to-Image, Other · **Author:** jiaowobaba02 · **Downloads:** 441 · **Likes:** 15

A GGUF build of the Stable Diffusion v2.1 text-to-image model, offered in multiple quantized versions and suited to uses such as art creation.

## Has 820m
**Tags:** Large Language Model, Transformers, Supports Multiple Languages · **Author:** SecurityXuanwuLab · **Downloads:** 2,730 · **Likes:** 24

A privacy-protection model developed by Tencent Security Xuanwu Lab that safeguards user privacy by hiding sensitive information in the input and restoring it in the returned output.

## Vits Female It
**Tags:** Speech Synthesis, Transformers, Other · **Author:** z-uo · **Downloads:** 218 · **Likes:** 0

A VITS-based Italian female voice synthesis model that converts text into natural, fluent Italian speech.