Qwen3 8b 192k Context 6X Josiefied Uncensored MLX AWQ 4bit

Developed by Goraint
A 4-bit AWQ-quantized version of Qwen3-8B optimized for the MLX framework. It supports a 192k-token context window and is suited to edge-device deployment.
Downloads: 204
Released: May 15, 2025

Model Overview

A 4-bit quantized model based on Qwen3-8B that runs efficient inference on Apple silicon via the MLX framework, retaining the core capabilities of the original model while reducing resource consumption.

Model Features

Efficient inference
4-bit quantization cuts memory usage by roughly 75% relative to FP16.
Long context support
Handles up to 192k tokens, 6x the context of the standard version.
Apple silicon optimization
Accelerated on Apple M-series chips (M1-M3) via the MLX framework.
Edge device deployment
Low resource consumption makes it suitable for running locally on-device.
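The ~75% memory figure follows directly from bytes per weight. A quick back-of-the-envelope sketch (the ~8.2B parameter count is an assumption, and real memory use adds KV-cache and quantization-metadata overhead on top of the raw weights):

```python
# Back-of-the-envelope memory estimate for Qwen3-8B weights.
# Illustrative only: actual usage also includes the KV cache,
# activations, and quantization scales/zero-points.

PARAMS = 8.2e9  # assumed parameter count (~8B class model)

fp16_gb = PARAMS * 2.0 / 1e9   # FP16: 2 bytes per weight
int4_gb = PARAMS * 0.5 / 1e9   # 4-bit: 0.5 bytes per weight

reduction = 1 - int4_gb / fp16_gb  # fraction of memory saved

print(f"FP16: {fp16_gb:.1f} GB, 4-bit: {int4_gb:.1f} GB, "
      f"saved: {reduction:.0%}")
# -> FP16: 16.4 GB, 4-bit: 4.1 GB, saved: 75%
```

The 75% saving is exact in this weight-only view (0.5 bytes vs 2 bytes); observed savings are slightly lower once per-group quantization scales are stored.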

Model Capabilities

Long text generation
Conversational interaction
Document analysis
Code generation

Use Cases

Research
Long-context NLP experiments
Supports language modeling research with ultra-long text sequences.
Model compression research
Validation of 4-bit quantization techniques.
Development
Edge device chatbots
Deploy fully local dialogue systems on Apple devices; real-world tests reach 112.8 tokens/sec on an M3 Ultra.
Long document processing
Analysis and summarization of long texts such as books and papers.
Enterprise applications
Code generation
Generates complete code snippets from long-context input.
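As a sketch of local deployment, a model like this can be run with the `mlx-lm` package on an Apple-silicon Mac. The repository ID below is a hypothetical placeholder inferred from the model name; check the actual repository before running:

```shell
# Install the MLX LM runtime (requires an Apple-silicon Mac)
pip install mlx-lm

# Generate a completion from the quantized model.
# NOTE: the --model value is an assumed Hugging Face repo ID,
# not confirmed by this page; substitute the real one.
mlx_lm.generate \
  --model Goraint/Qwen3-8B-192k-Context-6X-Josiefied-Uncensored-MLX-AWQ-4bit \
  --prompt "Summarize the key findings of the following paper: ..." \
  --max-tokens 256
```

The same package also exposes a Python API (`mlx_lm.load` / `mlx_lm.generate`) for embedding the model in an application rather than invoking it from the command line.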