# Low-Resource Deployment

This is a statically quantized version of the chandar-lab/NeoBERT model, intended to reduce model storage space and computational resource requirements.

Large Language Model

Transformers English

Qwen2.5 VL 7B Instruct Gemlite Ao A8w8

This is a multimodal large language model quantized with A8W8, based on Qwen2.5-VL-7B-Instruct, supporting vision and language tasks.

Qwen2 Audio 7B Instruct GGUF

A statically quantized version of the Qwen2-Audio-7B-Instruct model, supporting English audio-to-text conversion tasks.

Transformers English

Sarvamai Sarvam M GGUF

This is a quantized version of the Sarvam-m model, supporting text generation tasks in multiple Indian languages and English.

Large Language Model Supports Multiple Languages

Wan2.1 VACE 14B GGUF

This is a GGUF-quantized conversion of the Wan-AI/Wan2.1-VACE-14B model, primarily designed for text-to-video generation tasks.

Qwen3 is the latest generation of large language models in the Tongyi Qianwen series, offering a complete combination of dense models and Mixture of Experts (MoE) models. Based on large-scale training, Qwen3 achieves breakthrough progress in reasoning capabilities, instruction following, agent functions, and multilingual support.

Large Language Model English

Qwen Qwen2.5 VL 7B Instruct GGUF

A quantized version of Qwen2.5-VL-7B-Instruct, using llama.cpp for quantization, supporting multimodal tasks such as image-to-text conversion.

Image-to-Text English

Nvidia OpenCodeReasoning Nemotron 32B IOI GGUF

This is a quantized version of the NVIDIA OpenCodeReasoning-Nemotron-32B-IOI model, quantized with llama.cpp and suitable for code reasoning tasks.

Large Language Model Supports Multiple Languages

Nomic Ai Nomic Embed Code GGUF

This is the quantized version of the nomic-ai/nomic-embed-code model, using llama.cpp for imatrix quantization, suitable for code embedding and feature extraction tasks.

Microsoft Phi 4 Reasoning GGUF

This is a quantized version of Microsoft's Phi-4-reasoning model, optimized using llama.cpp for inference tasks and supporting multiple quantization options.

Large Language Model

Qwen3-4B is a GGUF format model based on Qwen3-4B-Base, suitable for text generation tasks.

Large Language Model

Llasa 1B Multilingual Mlx 8Bit

This is a multilingual text-to-speech model supporting 11 languages including Chinese, English, German, etc., converted from HKUSTAudio/Llasa-1B-Multilingual.

Speech Synthesis Supports Multiple Languages

Qwen3 1.7B Q8 0 GGUF

Qwen3-1.7B-Q8_0-GGUF is a GGUF-format model converted from Qwen/Qwen3-1.7B, supporting text generation tasks with multilingual capabilities and efficient reasoning.

Large Language Model

Chengsenwang ChatTime 1 7B Base GGUF

ChatTime-1-7B-Base is a foundational model specialized in time series forecasting, supporting multimodal time series processing.

Multimodal Fusion

Chengsenwang ChatTime 1 7B Chat GGUF

ChatTime-1-7B-Chat is a multimodal foundation model specialized in time series forecasting, built on a 7B parameter scale.

Multimodal Fusion

Qwen Qwen3 0.6B GGUF

An imatrix-quantized version of the Qwen team's Qwen3-0.6B, produced with llama.cpp. It runs in LM Studio or in projects based on llama.cpp.

Large Language Model

GGUF quantized versions of the Lightricks/LTX-Video model, including development and distilled editions, designed for text-to-video generation tasks.

Text-to-Video English

Llava 1.5 13b Hf I1 GGUF

This project provides weighted/imatrix quantized versions of the llava-1.5-13b-hf model, in a range of quantization types to suit different usage scenarios.

Transformers English

Gemma 3 4b It GPTQ 4b 128g

An INT4-quantized version of the gemma-3-4b-it model, significantly reducing storage and computational resource requirements.
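As a rough illustration of the storage saving that 4-bit quantization brings, the sketch below estimates weight storage at different precisions. It rests on several assumptions that the listing does not confirm: that "128g" in the model name denotes a group size of 128, that each group shares one 16-bit scale, that zero-points and other metadata are ignored, and that the model has a nominal 4 billion parameters. The function name `weight_bytes` is hypothetical.

```python
from typing import Optional


def weight_bytes(n_params: int, bits_per_weight: float,
                 group_size: Optional[int] = None, scale_bits: int = 16) -> float:
    """Approximate bytes needed to store the weights alone."""
    effective_bits = bits_per_weight
    if group_size is not None:
        # Assumption: one scale of `scale_bits` bits is shared by each group.
        effective_bits += scale_bits / group_size
    return n_params * effective_bits / 8


N_PARAMS = 4_000_000_000  # hypothetical nominal count for a "4b" model

fp16_bytes = weight_bytes(N_PARAMS, 16)
int4_bytes = weight_bytes(N_PARAMS, 4, group_size=128)

print(f"FP16 weights: {fp16_bytes / 1e9:.2f} GB")
print(f"INT4 weights (group size 128): {int4_bytes / 1e9:.2f} GB")
print(f"Reduction: {fp16_bytes / int4_bytes:.2f}x")
```

Under these assumptions the weights shrink from about 8 GB to about 2.06 GB. Real checkpoints will differ, since some layers are often left unquantized and file formats add their own overhead.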

Qwen2.5 3B YiLin GGUF Q4 K M

A 4-bit quantized model optimized based on Qwen2.5-3B-Instruct, supporting both Chinese and English, with chain-of-thought control and tool invocation capabilities.

Large Language Model Supports Multiple Languages

Beaver 7b V3.0 GGUF

Beaver-7B-v3.0 is a 7B-parameter large language model based on the LLaMA architecture, focusing on safety and human feedback reinforcement learning (RLHF).

Large Language Model English

News Summarizer T5 GGUF

This is a statically quantized version of a T5-based news summarization model, supporting English text summarization tasks.

Text Generation English

Orpheus 3b FT Q4 K M.gguf

Orpheus is a high-performance text-to-speech model, fine-tuned to achieve natural and emotionally rich speech synthesis. This repository hosts a quantized version of the 3-billion-parameter model (Q4_K_M, going by the file name), improving runtime efficiency while maintaining high-quality output.

Speech Synthesis Supports Multiple Languages

STEVE R1 7B SFT GGUF

A statically quantized version of STEVE-R1-7B-SFT, available at multiple quantization levels to suit different hardware requirements.

Text-to-Image English

Bge Reranker V2 M3 Q4 K M GGUF

This model is a GGUF format conversion of BAAI/bge-reranker-v2-m3, designed for text ranking tasks with multilingual support.

Text Embedding Other

Heron NVILA Lite 2B

Heron-NVILA-Lite-2B is a vision-language model based on the NVILA-Lite architecture, specifically trained for Japanese, supporting image-text interaction tasks in both Japanese and English.

Image-to-Text Supports Multiple Languages

Qwen2.5 VL 7B Instruct GGUF

Qwen2.5-VL-7B-Instruct is a multimodal vision-language model that supports image-text generation tasks.

Image-to-Text English

Trillion 7B Preview AWQ

The Trillion-7B Preview is a multilingual large language model supporting English, Korean, Japanese, and Chinese. It outperforms other 7B-scale models in computational efficiency and performance.

Large Language Model Supports Multiple Languages

Mlabonne Gemma 3 27b It Abliterated GGUF

A quantized version of the abliterated, instruction-tuned Google Gemma 3 27B model, quantized with llama.cpp and available at multiple quantization levels, suitable for text generation tasks.

Large Language Model

Lightblue Reranker 0.5 Cont Gguf

This is a text ranking model used for reordering and scoring texts.

Jbaron34 Qwen2.5 0.5b Bebop Reranker Gguf

A 0.5B-parameter text reranking model based on the Qwen2.5 architecture, trained efficiently with the Unsloth and TRL libraries.

Large Language Model

Thedrummer Gemmasutra Small 4B V1 GGUF

Gemmasutra-Small-4B-v1 is a 4B-parameter text generation model, quantized with llama.cpp and available in several quantization variants.

Large Language Model

Terjman Nano V2.0

Terjman-Nano-v2.0 is a Transformer-based English-Moroccan dialect translation model with 77M parameters, optimized for high-quality and precise translation.

Machine Translation

Transformers Supports Multiple Languages

Qwen2.5 VL 7B Instruct Quantized.w4a16

A quantized version of Qwen2.5-VL-7B-Instruct, supporting vision-text input and text output, with weights quantized to INT4 and activations kept in 16-bit floating point.

Transformers English
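The "w4a16" scheme named in the entry above (4-bit weights, 16-bit activations) can be sketched in a few lines. This is a minimal illustration of symmetric INT4 weight quantization, not the method used to produce the listed checkpoint. The helper names `quantize_int4` and `dequantize` are hypothetical, a single scale is used for the whole weight vector for simplicity, and ordinary Python floats stand in for the FP16 activations.

```python
def quantize_int4(weights):
    """Symmetric quantization of floats to integers in the INT4 range [-8, 7]."""
    max_abs = max(abs(w) for w in weights)
    # Map the largest magnitude to 7; fall back to 1.0 for an all-zero input.
    scale = max_abs / 7 if max_abs > 0 else 1.0
    q = [max(-8, min(7, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Recover approximate float weights from INT4 codes and their scale."""
    return [v * scale for v in q]


weights = [0.70, -0.33, 0.12, 0.00]
activations = [1.0, 2.0, -1.0, 0.5]  # left unquantized (the "a16" part)

q, scale = quantize_int4(weights)
restored = dequantize(q, scale)

# Weights are dequantized on the fly and multiplied with the activations.
exact = sum(w * a for w, a in zip(weights, activations))
approx = sum(w * a for w, a in zip(restored, activations))

print("INT4 codes:", q)
print("dot product (original weights):", exact)
print("dot product (quantized weights):", approx)
```

Each restored weight differs from the original by at most half a quantization step, which is why the two dot products stay close. Production schemes typically use one scale per channel or per group rather than per tensor.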

Summllama3.2 3B Q4 0 GGUF

This is a GGUF format model converted from DISLab/SummLlama3.2-3B, primarily used for text summarization tasks.

Large Language Model

Terjman Large V2.0

Terjman Large-v2.0 is a Transformer-based English-Moroccan dialect translation model with significantly improved performance, comparable to commercial models.

Machine Translation

Transformers Supports Multiple Languages

BounharAbdelaziz

Qwen2 VL 7B Instruct GGUF

A quantized version of the multimodal model based on Qwen2-VL-7B-Instruct, supporting image-text-to-text tasks with various quantization levels.

Image-to-Text English

Internlm3 8b Instruct Gguf

The GGUF format version of the InternLM3-8B-Instruct model, suitable for the llama.cpp framework and supporting multiple quantization versions.

Large Language Model English

Vintern 1B V3 5

Vintern-1B-v3.5 is a multimodal large language model fine-tuned based on InternVL2.5-1B, specializing in Vietnamese text processing, excelling in OCR and understanding Vietnamese-specific documents.

Transformers Supports Multiple Languages

Qwq 32B Preview IdeaWhiz V1 GGUF

A 32B-parameter large language model packaged in GGUF format for llama.cpp, specializing in text generation tasks for the chemistry, biology, climate, and medical fields.

Large Language Model English

© 2025 AIbase