# Low Memory Optimization

**Gryphe Codex 24B Small 3.2 GGUF** · Apache-2.0 · bartowski · 626 downloads · 3 likes
Large Language Model · English. A quantized version of Gryphe's Codex-24B-Small-3.2, using quantization to run efficiently across different hardware.

**Xlangai Jedi 7B 1080p GGUF** · Apache-2.0 · bartowski · 225 downloads · 1 like
Large Language Model · English. A llama.cpp quantization of Jedi-7B-1080p, offering multiple quantization types that trade file size against model quality.

**E N V Y Legion V2.1 LLaMa 70B Elarablated V0.8 Hf GGUF** · bartowski · 267 downloads · 1 like
Large Language Model. A quantized version of Legion-V2.1-LLaMa-70B-Elarablated-v0.8-hf, produced with llama.cpp and offered in multiple quantization options for different hardware requirements.

**Nvidia Llama 3.1 Nemotron Nano 4B V1.1 GGUF** · Other · bartowski · 2,553 downloads · 8 likes
Large Language Model · English. A quantized version of NVIDIA's Llama-3.1-Nemotron-Nano-4B-v1.1, produced with llama.cpp in several quantization methods and suited to resource-constrained environments.

**Qwen2.5 Omni 7B GPTQ Int4** · Other · Qwen · 389 downloads · 8 likes
Multimodal Fusion · Transformers · English. Qwen2.5-Omni is an end-to-end multimodal model that perceives text, images, audio, and video, and generates both text and natural speech responses in a streaming manner.

**Qwen Qwen3 1.7B GGUF** · bartowski · 7,150 downloads · 10 likes
Large Language Model. A quantized version of Qwen/Qwen3-1.7B produced with llama.cpp, supporting multiple quantization types and suited to text generation tasks.

**Qwen Qwen3 4B GGUF** · bartowski · 10.58k downloads · 9 likes
Large Language Model. A llama.cpp imatrix quantization of the Qwen team's Qwen3-4B, supporting multiple quantization types and suited to text generation tasks.

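The GGUF entries above all follow the same usage pattern: pick a quantization file sized to your RAM or VRAM (for example a Q4_K_M variant on a tight memory budget) and load it with a llama.cpp-based runtime. The sketch below uses the llama-cpp-python bindings under that assumption; the file name, context size, and prompt are illustrative placeholders, not values taken from any of the model cards.

```python
# Minimal sketch: running a GGUF quantization with llama-cpp-python.
# The model path is a placeholder; substitute whichever quantized file
# (e.g. a Q4_K_M variant of Qwen3-4B) you have downloaded locally.
from llama_cpp import Llama

llm = Llama(
    model_path="./Qwen3-4B-Q4_K_M.gguf",  # hypothetical local file
    n_ctx=4096,       # a smaller context window reduces KV-cache memory
    n_gpu_layers=0,   # 0 = CPU only; raise to offload layers to a GPU
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Summarize what GGUF quantization is."}],
    max_tokens=128,
)
print(out["choices"][0]["message"]["content"])
```

Lower-bit files shrink memory further at a growing cost in output quality, which is the trade-off these listings describe as balancing file size against model quality.
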
**Llama 3.2 1B Instruct GGUF** · Mungert · 708 downloads · 3 likes
Large Language Model · Multilingual. Llama-3.2-1B-Instruct is a 1B-parameter instruction-tuned model based on the Llama architecture, offered in multiple quantization formats for different hardware requirements.

**Qwen2.5 72B Instruct GGUF** · Other · Mungert · 1,439 downloads · 4 likes
Large Language Model · English. A GGUF quantization of Qwen2.5-72B-Instruct, available in multiple precision formats for efficient inference across different hardware environments.

**Mxbai Rerank Large V2 GGUF** · Apache-2.0 · Mungert · 2,209 downloads · 2 likes
Text Embedding · Multilingual. mxbai-rerank-large-v2 is a multilingual text reranking model, available in multiple quantization formats for different hardware environments.

**Meta Llama 3 8B GGUF** · Mungert · 1,303 downloads · 2 likes
Large Language Model · English. Meta-Llama-3-8B, an 8B-parameter large language model, packaged in GGUF format with multiple quantized versions for various hardware environments.

**Rwkv7 2.9B World GGUF** · Apache-2.0 · Mungert · 748 downloads · 3 likes
Large Language Model · Multilingual. A 2.9-billion-parameter model built on the RWKV-7 architecture, supporting multilingual text generation tasks.

**Wan 1.3b Gguf** · Apache-2.0 · calcuis · 3,058 downloads · 12 likes
Text-to-Video · English. A GGUF quantization of Wan-AI/Wan2.1-T2V-1.3B for text-to-video generation, compatible with the comfyui-gguf and gguf nodes.

**Mochi Gguf** · Apache-2.0 · calcuis · 284 downloads · 2 likes
Text-to-Video · English. A GGUF quantization of the Mochi text-to-video model, including a GGUF encoder and a GGUF variational autoencoder, suited to fast video content generation.

**Mochi** · Apache-2.0 · calcuis · 140 downloads · 8 likes
Text-to-Video · English. Mochi is a text-to-video generation model, provided here in GGUF-quantized form, that produces video content from text descriptions.

**Mixtral 8x7B V0.1** · Apache-2.0 · mistralai · 42.78k downloads · 1,709 likes
Large Language Model · Transformers · Multilingual. Mixtral-8x7B is a pretrained generative sparse mixture-of-experts model that outperforms Llama 2 70B on most benchmarks.

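Mixtral's full 16-bit weights are far larger than a single consumer GPU can hold, so a common low-memory route is to quantize on load. The sketch below shows one such setup, 4-bit NF4 quantization through bitsandbytes with transformers; it is an illustrative configuration, not an official recipe, and the memory actually required still depends on your hardware and generation settings.

```python
# Sketch: loading Mixtral-8x7B with 4-bit NF4 quantization via bitsandbytes,
# cutting weight memory roughly 4x compared with 16-bit loading.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "mistralai/Mixtral-8x7B-v0.1"

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",  # spread layers across available GPUs / CPU RAM
)

inputs = tokenizer("Mixture-of-experts models work by", return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```
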
**Starcoder2 3b** · OpenRAIL · bigcode · 199.62k downloads · 178 likes
Large Language Model · Transformers · Other. StarCoder2-3B is a 3-billion-parameter code generation model trained on 17 programming languages, with a 16,384-token context window.

**Blip2 Flan T5 Xl Sharded** · MIT · ethzanalytics · 71 downloads · 6 likes
Image-to-Text · Transformers · English. A sharded version of the BLIP-2 model with a Flan-T5-XL backbone for image-to-text tasks such as image captioning and visual question answering; sharding allows the checkpoint to be loaded in low-memory environments.

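Because the checkpoint is split into shards, transformers can pull the weights in piece by piece rather than materializing the full model in RAM at once. The sketch below shows that pattern with half-precision weights; the repository id is inferred from the listing above and should be treated as an assumption, as should the example image URL.

```python
# Sketch: low-memory loading of a sharded BLIP-2 (Flan-T5-XL) checkpoint.
# The repo id below is inferred from the listing and may need adjusting.
import torch
import requests
from PIL import Image
from transformers import Blip2Processor, Blip2ForConditionalGeneration

repo_id = "ethzanalytics/blip2-flan-t5-xl-sharded"  # assumed repository name

processor = Blip2Processor.from_pretrained(repo_id)
model = Blip2ForConditionalGeneration.from_pretrained(
    repo_id,
    torch_dtype=torch.float16,   # halves weight memory versus float32
    low_cpu_mem_usage=True,      # load shard by shard instead of a full copy
    device_map="auto",
)

url = "http://images.cocodataset.org/val2017/000000039769.jpg"  # example image
image = Image.open(requests.get(url, stream=True).raw)

inputs = processor(
    images=image,
    text="Question: what is in the picture? Answer:",
    return_tensors="pt",
).to(model.device, torch.float16)
output = model.generate(**inputs, max_new_tokens=30)
print(processor.decode(output[0], skip_special_tokens=True))
```
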
**Nystromformer 4096** · uw-madison · 74 downloads · 3 likes
Large Language Model · Transformers. A long-sequence Nyströmformer model trained on the WikiText-103 v1 dataset, handling sequences of up to 4,096 tokens.

**Nystromformer 2048** · uw-madison · 38 downloads · 1 like
Large Language Model · Transformers. A Nyströmformer model trained on the WikiText-103 dataset, supporting long sequences of up to 2,048 tokens.

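Both Nyströmformer checkpoints replace exact self-attention with a Nyström approximation, which keeps attention cost roughly linear in sequence length and is what makes 2,048- and 4,096-token inputs practical on modest hardware. The sketch below runs a masked-token prediction over a padded long input; the checkpoint id is inferred from the listing (author uw-madison) and is an assumption to verify before use.

```python
# Sketch: masked-token prediction with a long-sequence Nystromformer.
# The checkpoint id is inferred from the listing and may need adjusting.
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

checkpoint = "uw-madison/nystromformer-4096"  # assumed repository name

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForMaskedLM.from_pretrained(checkpoint)

# Build a simple cloze prompt, then pad it with filler sentences toward the
# model's long-context limit to exercise the approximate attention path.
text = (
    f"Paris is the {tokenizer.mask_token} of France. "
    + "This sentence is filler to lengthen the input. " * 200
)
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=4096)

with torch.no_grad():
    logits = model(**inputs).logits

# Report the top prediction at the masked position.
mask_positions = (inputs["input_ids"] == tokenizer.mask_token_id).nonzero(as_tuple=True)[1]
print(tokenizer.decode(logits[0, mask_positions].argmax(dim=-1)))
```
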