# Local inference

**Qwen3 8B 4bit DWQ** · Apache-2.0 · by mlx-community · 306 downloads · 1 like
Qwen3-8B-4bit-DWQ is a 4-bit quantized version of Qwen/Qwen3-8B converted to the MLX format, optimized for efficient operation on Apple devices.
Tags: Large Language Model
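The "4-bit" in names like this refers to weight quantization: each parameter is stored as one of 16 discrete levels plus a shared scale, instead of a 16- or 32-bit float, cutting memory use severalfold. As a rough illustration of the idea only (not the actual DWQ scheme used by mlx-community, which involves distillation and per-group parameters), a minimal symmetric 4-bit round-trip might look like:

```python
def quantize_4bit(weights):
    """Map floats to 4-bit integer codes (0..15) with one shared scale.

    Toy symmetric scheme for illustration; real formats (DWQ, GGUF)
    use per-group scales and calibration data.
    """
    scale = max(abs(w) for w in weights) / 7  # signed 4-bit range: -8..7
    codes = [max(-8, min(7, round(w / scale))) + 8 for w in weights]
    return codes, scale

def dequantize_4bit(codes, scale):
    """Recover approximate floats from the 4-bit codes."""
    return [(c - 8) * scale for c in codes]

weights = [0.12, -0.53, 0.91, -0.07, 0.33]
codes, scale = quantize_4bit(weights)
restored = dequantize_4bit(codes, scale)
# Every code fits in 4 bits; each restored value is within scale/2
# of the original (quantization error of rounding to the nearest level).
```

The trade-off these quantized releases make is exactly this: a small, bounded per-weight error in exchange for a model that fits in consumer-device memory.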
**Pllum 8x7B Chat GGUF** · Apache-2.0 · by piotrmaciejbednarski · 126 downloads · 2 likes
A GGUF-quantized version of PLLuM-8x7B-chat, optimized for local inference and offered at multiple quantization levels to suit different hardware.
Tags: Large Language Model, Transformers
**Llama 3.2 3B Instruct GGUF** · by MaziyarPanahi · 203.56k downloads · 13 likes
GGUF-format files of the Llama-3.2-3B-Instruct model, packaged for convenient local text generation.
Tags: Large Language Model
**Deepseek V2 Lite IMat GGUF** · by legraphista · 491 downloads · 1 like
A GGUF version of DeepSeek-V2-Lite quantized with llama.cpp's imatrix method, reducing storage and compute requirements to ease deployment.
Tags: Large Language Model
© 2025 AIbase