# Local inference

**Qwen3 8B 4bit DWQ** · Apache-2.0 · by mlx-community · 306 downloads · 1 like
Qwen3-8B-4bit-DWQ is a 4-bit quantized version of Qwen/Qwen3-8B converted to the MLX format, optimized for efficient operation on Apple devices.
Tags: Large Language Model
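The "4-bit" in names like this refers to weight quantization: each parameter is stored as one of 16 discrete levels plus a shared scale, instead of a 16- or 32-bit float, cutting memory use severalfold. As a rough illustration of the idea only (not the actual DWQ scheme used by mlx-community, which involves distillation and per-group parameters), a minimal symmetric 4-bit round-trip might look like:

```python
def quantize_4bit(weights):
    """Map floats to 4-bit integer codes (0..15) with one shared scale.

    Toy symmetric scheme for illustration; real formats (DWQ, GGUF)
    use per-group scales and calibration data.
    """
    scale = max(abs(w) for w in weights) / 7  # signed 4-bit range: -8..7
    codes = [max(-8, min(7, round(w / scale))) + 8 for w in weights]
    return codes, scale

def dequantize_4bit(codes, scale):
    """Recover approximate floats from the 4-bit codes."""
    return [(c - 8) * scale for c in codes]

weights = [0.12, -0.53, 0.91, -0.07, 0.33]
codes, scale = quantize_4bit(weights)
restored = dequantize_4bit(codes, scale)
# Every code fits in 4 bits; each restored value is within scale/2
# of the original (quantization error of rounding to the nearest level).
```

The trade-off these quantized releases make is exactly this: a small, bounded per-weight error in exchange for a model that fits in consumer-device memory.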
**Pllum 8x7B Chat GGUF** · Apache-2.0 · by piotrmaciejbednarski · 126 downloads · 2 likes
A GGUF-quantized version of PLLuM-8x7B-chat, optimized for local inference and offered at multiple quantization levels to suit different hardware.
Tags: Large Language Model, Transformers
**Llama 3.2 3B Instruct GGUF** · by MaziyarPanahi · 203.56k downloads · 13 likes
GGUF-format files of the Llama-3.2-3B-Instruct model, packaged for convenient local text generation.
Tags: Large Language Model
**Deepseek V2 Lite IMat GGUF** · by legraphista · 491 downloads · 1 like
A GGUF version of DeepSeek-V2-Lite quantized with llama.cpp's imatrix method, reducing storage and compute requirements to ease deployment.
Tags: Large Language Model
© 2025 AIbase