# Low-Resource Deployment
Neobert GGUF
MIT
This is a static quantized version of the chandar-lab/NeoBERT model, aiming to reduce model storage space and computational resource requirements.
Large Language Model
Transformers English

mradermacher
219
1
Qwen2.5 VL 7B Instruct Gemlite Ao A8w8
Apache-2.0
This is a multimodal large language model quantized with A8W8, based on Qwen2.5-VL-7B-Instruct, supporting vision and language tasks.
Image-to-Text
Transformers

mobiuslabsgmbh
161
1
Qwen2 Audio 7B Instruct GGUF
Apache-2.0
A static quantized version of the Qwen2-Audio-7B-Instruct model, supporting English audio-to-text tasks.
Audio-to-Text
Transformers English

mradermacher
146
0
Sarvamai Sarvam M GGUF
Apache-2.0
This is a quantized version of the Sarvam-m model, supporting text generation tasks in multiple Indian languages and English.
Large Language Model Supports Multiple Languages
bartowski
845
1
Wan2.1 VACE 14B GGUF
Apache-2.0
This is a GGUF quantized conversion of the Wan-AI/Wan2.1-VACE-14B model, primarily designed for text-to-video generation tasks.
Text-to-Video
QuantStack
2,331
23
Qwen3 4B GGUF
Apache-2.0
Qwen3 is the latest generation of large language models in the Tongyi Qianwen series, offering a complete combination of dense models and Mixture of Experts (MoE) models. Based on large-scale training, Qwen3 achieves breakthrough progress in reasoning capabilities, instruction following, agent functions, and multilingual support.
Large Language Model English
prithivMLmods
829
1
Qwen Qwen2.5 VL 7B Instruct GGUF
Apache-2.0
A quantized version of Qwen2.5-VL-7B-Instruct, using llama.cpp for quantization, supporting multimodal tasks such as image-to-text conversion.
Image-to-Text English
bartowski
2,056
2
Nvidia OpenCodeReasoning Nemotron 32B IOI GGUF
Apache-2.0
This is a quantized version of the NVIDIA OpenCodeReasoning-Nemotron-32B-IOI model, quantized with llama.cpp and suitable for code reasoning tasks.
Large Language Model Supports Multiple Languages
bartowski
1,272
2
Nomic Ai Nomic Embed Code GGUF
Apache-2.0
This is the quantized version of the nomic-ai/nomic-embed-code model, using llama.cpp for imatrix quantization, suitable for code embedding and feature extraction tasks.
Text Embedding
bartowski
2,109
3
Microsoft Phi 4 Reasoning GGUF
MIT
This is a quantized version of Microsoft's Phi-4-reasoning model, quantized with llama.cpp and available in multiple quantization options.
Large Language Model
bartowski
5,443
4
Qwen3 4B GGUF
Apache-2.0
Qwen3-4B is a GGUF format model based on Qwen3-4B-Base, suitable for text generation tasks.
Large Language Model
Mungert
1,507
7
Llasa 1B Multilingual Mlx 8Bit
This is a multilingual text-to-speech model supporting 11 languages including Chinese, English, German, etc., converted from HKUSTAudio/Llasa-1B-Multilingual.
Speech Synthesis Supports Multiple Languages
nhe-ai
21
0
Qwen3 1.7B Q8 0 GGUF
Apache-2.0
Qwen3-1.7B-Q8_0-GGUF is a GGUF-format model converted from Qwen/Qwen3-1.7B, supporting text generation tasks with multilingual capabilities and efficient reasoning.
Large Language Model
Triangle104
277
1
Chengsenwang ChatTime 1 7B Base GGUF
Apache-2.0
ChatTime-1-7B-Base is a foundational model specialized in time series forecasting, supporting multimodal time series processing.
Multimodal Fusion
tensorblock
175
0
Chengsenwang ChatTime 1 7B Chat GGUF
Apache-2.0
ChatTime-1-7B-Chat is a multimodal foundation model specialized in time series forecasting, built on a 7B parameter scale.
Multimodal Fusion
tensorblock
153
0
Qwen Qwen3 0.6B GGUF
A llama.cpp imatrix quantized version of the Qwen team's Qwen3-0.6B model. It runs in LM Studio or in any project based on llama.cpp.
Large Language Model
bartowski
10.24k
14
Ltxv0.9.6 Gguf
Other
GGUF quantized versions of the Lightricks/LTX-Video model, including development and distilled editions, designed for text-to-video generation tasks.
Text-to-Video English
calcuis
1,753
5
Llava 1.5 13b Hf I1 GGUF
This project provides weighted/imatrix quantized versions of the llava-1.5-13b-hf model in a range of quantization types to suit different usage scenarios.
Image-to-Text
Transformers English

mradermacher
332
1
Gemma 3 4b It GPTQ 4b 128g
An INT4 quantized version of the gemma-3-4b-it model, significantly reducing storage and computational resource requirements.
Image-to-Text
Transformers

ISTA-DASLab
502
2
Qwen2.5 3B YiLin GGUF Q4 K M
GPL-3.0
A 4-bit quantized model optimized based on Qwen2.5-3B-Instruct, supporting both Chinese and English, with chain-of-thought control and tool invocation capabilities.
Large Language Model Supports Multiple Languages
likewendy
171
7
Beaver 7b V3.0 GGUF
Beaver-7B-v3.0 is a 7B-parameter large language model based on the LLaMA architecture, focusing on safety and reinforcement learning from human feedback (RLHF).
Large Language Model English
mradermacher
405
1
News Summarizer T5 GGUF
Apache-2.0
This is a statically quantized version of a T5-based news summarization model, supporting English text summarization tasks.
Text Generation English
mradermacher
167
0
Orpheus 3b FT Q4 K M.gguf
Apache-2.0
Orpheus is a high-performance text-to-speech model, fine-tuned to achieve natural and emotionally rich speech synthesis. This repository hosts a quantized version of the 3-billion-parameter model (Q4_K_M, per the repository name), improving runtime efficiency while maintaining high-quality output.
Speech Synthesis Supports Multiple Languages
lex-au
736
2
STEVE R1 7B SFT GGUF
Apache-2.0
A static quantized version of STEVE-R1-7B-SFT, available in multiple quantization levels for different hardware requirements.
Text-to-Image English
mradermacher
203
0
Bge Reranker V2 M3 Q4 K M GGUF
Apache-2.0
This model is a GGUF format conversion of BAAI/bge-reranker-v2-m3, designed for text ranking tasks with multilingual support.
Text Embedding Other
sabafallah
49
0
Heron NVILA Lite 2B
Apache-2.0
Heron-NVILA-Lite-2B is a vision-language model based on the NVILA-Lite architecture, specifically trained for Japanese, supporting image-text interaction tasks in both Japanese and English.
Image-to-Text Supports Multiple Languages
turing-motors
1,023
4
Qwen2.5 VL 7B Instruct GGUF
Apache-2.0
Qwen2.5-VL-7B-Instruct is a multimodal vision-language model that supports image-text generation tasks.
Image-to-Text English
samgreen
5,052
9
Trillion 7B Preview AWQ
Apache-2.0
The Trillion-7B Preview is a multilingual large language model supporting English, Korean, Japanese, and Chinese. It outperforms other 7B-scale models in computational efficiency and performance.
Large Language Model Supports Multiple Languages
trillionlabs
22
4
Mlabonne Gemma 3 27b It Abliterated GGUF
A quantized version of the abliterated Gemma 3 27B instruction-tuned model, quantized with llama.cpp, available in multiple quantization levels, and suitable for text generation tasks.
Large Language Model
bartowski
7,217
20
Lightblue Reranker 0.5 Cont Gguf
This is a text ranking model used for reordering and scoring texts.
Text Embedding
RichardErkhov
1,986
0
Jbaron34 Qwen2.5 0.5b Bebop Reranker Gguf
A 0.5B-parameter text reranking model based on the Qwen2.5 architecture, trained efficiently using the Unsloth and TRL libraries.
Large Language Model
RichardErkhov
2,119
0
Thedrummer Gemmasutra Small 4B V1 GGUF
Gemmasutra-Small-4B-v1 is a 4B-parameter text generation model, quantized with llama.cpp and available in a range of quantization versions.
Large Language Model
bartowski
583
2
Terjman Nano V2.0
Terjman-Nano-v2.0 is a Transformer-based English-Moroccan dialect translation model with 77M parameters, optimized for high-quality and precise translation.
Machine Translation
Transformers Supports Multiple Languages

atlasia
95
2
Qwen2.5 VL 7B Instruct Quantized.w4a16
Apache-2.0
Quantized version of Qwen2.5-VL-7B-Instruct, supporting vision-text input and text output, with weights quantized to INT4 and activations to FP16.
Image-to-Text
Transformers English

RedHatAI
605
3
Summllama3.2 3B Q4 0 GGUF
This is a GGUF format model converted from DISLab/SummLlama3.2-3B, primarily used for text summarization tasks.
Large Language Model
fernandoruiz
17
0
Terjman Large V2.0
Terjman Large-v2.0 is a Transformer-based English-Moroccan dialect translation model with significantly improved performance, comparable to commercial models.
Machine Translation
Transformers Supports Multiple Languages

BounharAbdelaziz
20
1
Qwen2 VL 7B Instruct GGUF
Apache-2.0
A quantized version of the multimodal model based on Qwen2-VL-7B-Instruct, supporting image-text-to-text tasks with various quantization levels.
Image-to-Text English
XelotX
201
1
Internlm3 8b Instruct Gguf
Apache-2.0
The GGUF format version of the InternLM3-8B-Instruct model, suitable for the llama.cpp framework and supporting multiple quantization versions.
Large Language Model English
internlm
1,072
26
Vintern 1B V3 5
MIT
Vintern-1B-v3.5 is a multimodal large language model fine-tuned based on InternVL2.5-1B, specializing in Vietnamese text processing, excelling in OCR and understanding Vietnamese-specific documents.
Image-to-Text
Transformers Supports Multiple Languages

5CD-AI
6,875
35
Qwq 32B Preview IdeaWhiz V1 GGUF
Apache-2.0
A 32B-parameter large language model quantized with llama.cpp, specializing in text generation tasks for the chemistry, biology, climate, and medical fields.
Large Language Model English
bartowski
847
3