
Qwen3 235B A22B GGUF

Developed by ubergarm
Qwen3-235B-A22B is a 235-billion-parameter large language model quantized with advanced non-linear quantization via the ik_llama.cpp fork of llama.cpp, targeting high-performance computing environments.
Downloads: 889
Released: April 30, 2025

Model Overview

This is a mixed-quantization large language model designed for high-performance computing environments, supporting conversational text-generation tasks.

Model Features

Advanced Non-linear Quantization
Uses the ik_llama.cpp fork of llama.cpp for state-of-the-art (SotA) non-linear quantization, delivering the best quality available for a given memory footprint.
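To build intuition for why non-uniform (non-linear) quantization can beat a uniform grid, here is a minimal codebook sketch in plain Python. This is purely illustrative and not the actual IQ-style algorithm used by ik_llama.cpp, which operates on weight blocks with carefully optimized codebooks and scales:

```python
# Illustrative non-linear (codebook) quantization via simple 1-D k-means.
# NOT the ik_llama.cpp algorithm -- just the underlying idea: place the
# representable levels where the weights actually cluster.

def make_codebook(values, levels):
    """Pick `levels` representative points by 1-D k-means."""
    lo, hi = min(values), max(values)
    centers = [lo + (hi - lo) * i / (levels - 1) for i in range(levels)]
    for _ in range(20):
        buckets = [[] for _ in centers]
        for v in values:
            j = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            buckets[j].append(v)
        # Move each center to the mean of its bucket (keep it if empty).
        centers = [sum(b) / len(b) if b else c
                   for b, c in zip(buckets, centers)]
    return centers

def quantize(values, codebook):
    """Store each value as the index of its nearest codebook entry."""
    return [min(range(len(codebook)), key=lambda i: abs(v - codebook[i]))
            for v in values]

def dequantize(indices, codebook):
    return [codebook[i] for i in indices]

weights = [-1.2, -0.4, -0.35, 0.0, 0.1, 0.5, 0.55, 1.3]
cb = make_codebook(weights, 4)          # 4 levels ~= 2-bit storage
idx = quantize(weights, cb)
restored = dequantize(idx, cb)
err = max(abs(w - r) for w, r in zip(weights, restored))
```

Because the codebook entries are free to sit at the data's cluster centers rather than on an evenly spaced grid, the worst-case reconstruction error for clustered weights is smaller at the same bit width.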
Mixture of Experts Architecture
Adopts a Mixture of Experts (MoE) architecture with 94 repeating layers/blocks, optimizing computational resource allocation.
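The routing idea behind MoE can be illustrated with a toy top-k gate. This is a generic sketch, not Qwen3's actual gating code; the 235B/22B split in the model name reflects total versus active parameters, since only a small subset of experts fires per token:

```python
import math

def top_k_routing(logits, k=2):
    """Generic MoE gate sketch: send a token to the k highest-scoring
    experts, with mixing weights softmax-normalized over the chosen k."""
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    exps = [math.exp(logits[i]) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]

# Toy example: 8 experts, route to the top 2.
gate_logits = [0.1, 2.0, -1.0, 0.5, 1.5, 0.0, -0.5, 0.3]
routes = top_k_routing(gate_logits, k=2)
# `routes` pairs each selected expert index with its mixing weight.
```

Only the selected experts' feed-forward weights are needed for the token, which is why per-token compute tracks the active-parameter count rather than the full 235B.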
High-performance Inference
Designed to run on high-end hardware, supporting mixed GPU+CPU inference for high throughput.
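For mixed GPU+CPU inference, the usual pattern is to offload everything to the GPU by default and then override the large MoE expert tensors back to system RAM. A hypothetical launch sketch, assuming an ik_llama.cpp build with GPU support and a local GGUF file (the model path, thread count, and exact flag set are placeholders; verify against `llama-server --help` for your build):

```shell
#!/usr/bin/env bash
# Sketch only, not a verified command line.
# -ngl 99 offloads all layers to the GPU; -ot exps=CPU then overrides
# tensors whose names match "exps" (the MoE experts) to system RAM;
# -fmoe enables ik_llama.cpp's fused-MoE path; -ctk/-ctv quantize the
# KV cache to q8_0 to save VRAM at the full 32k context.
./build/bin/llama-server \
    --model ./Qwen3-235B-A22B-GGUF/model.gguf \
    -c 32768 -fa -ctk q8_0 -ctv q8_0 \
    -ngl 99 -ot exps=CPU -fmoe \
    --threads 16
```

Keeping attention and shared tensors on the GPU while the expert weights stream from RAM is what makes a 235B-parameter MoE feasible on a single high-end workstation.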

Model Capabilities

Text Generation
Conversational Interaction
Long-Context Processing (supports 32k-token context)

Use Cases

High-performance Computing
High-quality LLM on Gaming Rigs
Running high-quality language models on gaming-class desktops equipped with a high-end GPU and ample RAM
Achieved up to 140 tok/sec prefill and around 10 tok/sec text generation in tests
Research & Development
Quantization Technology Research
Used for researching advanced model quantization techniques and methods
© 2025 AIbase