# Lightweight deployment

Qwen2.5 VL 7B Meteorology GGUF

Quantized version of Qwen2.5-VL-7B-Meteorology, suitable for image-text processing tasks related to meteorology.

Transformers English

Midm 2.0 Mini Instruct Gguf

Mi:dm 2.0 is a Korea-centric AI model developed by KT using its proprietary technology. It is designed to internalize the values, cognitive frameworks, and common-sense reasoning of South Korean society, so that beyond processing and generating Korean text it also reflects an understanding of South Korean social and cultural norms.

Large Language Model

Transformers Supports Multiple Languages

Tencent.hunyuan A13B Instruct GGUF

A quantized version of the Tencent Hunyuan A13B Instruct model. Quantization reduces memory and compute requirements while aiming to preserve the original model's output quality.

Large Language Model

Apollo2 7B GGUF

Apollo2-7B-GGUF is a quantized version of FreedomIntelligence/Apollo2-7B, supporting medical large language model applications in multiple languages.

Large Language Model Supports Multiple Languages

Qwen3 Embedding 8B 4bit DWQ

This is a 4-bit DWQ quantized version of Qwen/Qwen3-Embedding-8B, converted for use as an embedding model with the MLX framework.

PP OCRv4 Mobile Det

PP-OCRv4_mobile_det is an efficient text detection model optimized for mobile devices developed by the PaddleOCR team, suitable for deployment on edge devices.

Text Recognition Supports Multiple Languages

Qwen.qwen3 Reranker 0.6B GGUF

The quantized version of Qwen3-Reranker-0.6B, dedicated to making knowledge accessible to everyone.

Large Language Model

PP OCRv5 Mobile Det

PP-OCRv5_mobile_det is the latest generation of lightweight text detection model developed by the PaddleOCR team, supporting efficient text detection in multiple languages and scenarios.

Text Recognition Supports Multiple Languages

Fpham Sydney Overthinker 13b HF GGUF

This project provides GGUF quantized files for the model. Quantization mainly reduces memory use and can speed up inference; it does not improve output quality. The files are supported by Featherless AI, a service on which users can run any desired model for a small fee.

Large Language Model

featherless-ai-quants

Kakaocorp.kanana Safeguard 8b GGUF

This project is a quantized version of kakaocorp/kanana-safeguard-8b, dedicated to making knowledge accessible to the public.

Large Language Model

Josiefied DeepSeek R1 0528 Qwen3 8B Abliterated V1 8bit

This is an 8-bit quantized version in MLX format converted from the DeepSeek-R1-0528-Qwen3-8B model, suitable for text generation tasks.

Large Language Model

Deepseek R1 0528 Qwen3 8B 4bit

This model is a 4-bit quantized version converted from DeepSeek-R1-0528-Qwen3-8B, optimized for the MLX framework and suitable for text generation tasks.

Large Language Model

Qwen2.5 Omni 7B GGUF

Qwen2.5-Omni-7B-GGUF is the GGUF format version of the Qwen2.5-Omni-7B model, supporting multimodal inputs including text, audio, and images.

Large Language Model English

Bytedance Seed.academic Ds 9B GGUF

This project provides a quantized version of academic-ds-9B, aiming to make knowledge accessible to everyone.

Large Language Model

Devstral Small 2505 8bit

Devstral-Small-2505-8bit is an 8-bit quantized model converted from mistralai/Devstral-Small-2505, suitable for the MLX framework and supporting text generation tasks in multiple languages.

Large Language Model Supports Multiple Languages

Skywork Skywork OR1 7B GGUF

Skywork-OR1-7B is a 7B-parameter large language model offering multiple quantization versions to accommodate different hardware requirements.

Large Language Model

Qwen3 4B 4bit DWQ

This model is a 4-bit DWQ quantized version of Qwen3-4B, converted to the MLX format for easy text generation using the mlx library.

Large Language Model

Openvision Vit Large Patch14 84

OpenVision is a fully open, cost-effective family of advanced visual encoders focused on multimodal learning tasks.

Image Classification

Huihui Ai.qwen3 4B Abliterated GGUF

The quantized version of Huihui AI's Qwen3-4B model, aiming to make knowledge more widely accessible to the public.

Large Language Model

Phi 4 Mini Reasoning GGUF

Phi-4-mini-reasoning is a lightweight open model built on synthetic data with a focus on high-quality, reasoning-rich content, and further fine-tuned for more advanced mathematical reasoning.

Large Language Model

Josiefied Qwen3 8B Abliterated V1 8bit

An optimized 8-bit quantized version of Qwen3-8B, designed for efficient inference on the MLX framework.

Large Language Model

Josiefied Qwen3 4B Abliterated V1 6bit

This is a 6-bit quantized version of the Qwen3-4B model converted to the MLX format, suitable for text generation tasks.

Large Language Model

Qwen3 8B 4bit DWQ

Qwen3-8B-4bit-DWQ is a 4-bit quantized version of Qwen/Qwen3-8B converted to the MLX format, optimized for efficient operation on Apple devices.

Large Language Model

Ast Finetuned Audioset 10 10 0.4593 ONNX

This is the ONNX version of the AST (Audio Spectrogram Transformer) model, designed specifically for audio classification tasks and fine-tuned on the AudioSet dataset.

Audio Classification

Microsoft Phi 4 Mini Reasoning GGUF

This is a quantized version of Microsoft's Phi-4-mini-reasoning model, produced with llama.cpp to improve efficiency across different hardware environments.

Large Language Model Supports Multiple Languages

Muyan TTS SFT Q8 0 GGUF

This model is a GGUF format text-to-speech model converted from MYZY-AI/Muyan-TTS-SFT, supporting Chinese speech synthesis.

Speech Synthesis

Fdtn Ai.foundation Sec 8B GGUF

Foundation-Sec-8B is a large language model based on the Transformer architecture, focusing on text generation tasks.

Large Language Model

Industry Project V2

An instruction-fine-tuned model based on the Mistral architecture, suitable for zero-shot classification tasks.

Large Language Model

This is the 4-bit quantized version of the Qwen/Qwen3-8B model, converted to the MLX framework format, suitable for efficient inference on Apple silicon devices.

Large Language Model

Qwen3-4B-4bit is a 4-bit quantized version converted from Qwen/Qwen3-4B to the MLX format, designed for efficient operation on Apple chips.

Large Language Model

A 4-bit quantized MNN version of Qwen3-4B, intended for efficient text generation.

Large Language Model English

Internvl2 5 1B MNN

A 4-bit quantized version based on InternVL2_5-1B, suitable for text generation and chat scenarios.

Large Language Model English

Deepcogito Cogito V1 Preview Llama 3B GGUF

A 3B-parameter language model based on the Llama architecture, offering multiple quantization versions to suit different hardware needs.

Large Language Model

Mistral Small 24B Instruct 2501 GGUF

Mistral-Small-24B-Instruct-2501 is a 24B-parameter instruction-finetuned large language model supporting multilingual text generation tasks.

Large Language Model Supports Multiple Languages

Gemma 3 27b It Qat Unsloth Bnb 4bit

Gemma 3 is a lightweight, state-of-the-art multimodal open-source model launched by Google, capable of processing text and image inputs and generating text outputs.

Gemma 3 1b It Qat

Gemma 3 is a family of lightweight models launched by Google with multilingual support for over 140 languages. The larger variants are multimodal, processing text and image inputs and generating text outputs, with a 128K-token context window; the 1B variant is text-only and has a 32K-token context window.

Hyperclovax SEED Text Instruct 0.5B

A Korean-optimized text generation model with instruction-following capability. Its lightweight design makes it suitable for deployment on edge devices.

Large Language Model

naver-hyperclovax

Gemma 3 4b It Qat GGUF

Gemma 3 is a lightweight, advanced open model series from Google, built on the same research and technology used to create Gemini models. This model is multimodal, capable of processing both text and image inputs to generate text outputs.

Image-to-Text English

GigaAM v2 is an automatic speech recognition (ASR) model that supports Russian speech-to-text tasks, offering both CTC and RNN-T architectures.

Speech Recognition Other

Gemma 3 27b It Qat

Gemma is a lightweight open model series launched by Google, built on Gemini model technology. Gemma 3 is a multimodal model that accepts text and image inputs and produces text outputs, with a 128K-token context window and multilingual capabilities.
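
Many entries above are described by their bit-width (4-bit, 6-bit, 8-bit). As a rough guide to what that means for deployment, the sketch below estimates the memory taken by the weights alone as parameters × bits ÷ 8. The function name and the nominal parameter counts (4e9 and 8e9) are assumptions for illustration, not figures from the listings; real files also carry quantization metadata, and inference needs extra memory for the KV cache and activations.

```python
def approx_weight_memory_gib(num_params: float, bits_per_weight: float) -> float:
    """Rough size of the weights alone, in GiB.

    Ignores KV cache, activations, and per-block quantization metadata,
    so actual memory use will be somewhat higher.
    """
    total_bytes = num_params * bits_per_weight / 8
    return total_bytes / (1024 ** 3)


if __name__ == "__main__":
    # Nominal parameter counts; assumed for illustration only.
    models = [("Qwen3-4B", 4e9), ("Qwen3-8B", 8e9)]
    for name, params in models:
        for bits in (16, 8, 6, 4):
            size = approx_weight_memory_gib(params, bits)
            print(f"{name} at {bits}-bit: ~{size:.1f} GiB")
```

Under these assumptions, halving the bit-width halves the weight footprint, which is why the 4-bit variants are the ones typically aimed at laptops and edge devices.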

© 2025 AIbase