Gemma 3 4b It Abliterated GGUF
Developed by ZeroWw
An innovative quantization solution that achieves smaller model size while maintaining high performance through mixed-precision quantization.
Downloads 247
Release Time: 3/22/2025
Model Overview
This model uses a custom mixed-precision quantization scheme: the output and embedding tensors are kept at f16 precision, while the remaining tensors are quantized to q5_k or q6_k. The result is smaller than a standard q8_0 quantization while delivering performance comparable to the unquantized f16 model.
Model Features
Mixed-precision quantization
Uses f16 precision for the output and embedding layers and q5_k or q6_k precision for the remaining parts, achieving efficient quantization.
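The model card does not publish the exact conversion commands, but a scheme like this can be reproduced with llama.cpp's llama-quantize tool, which accepts per-tensor overrides for the output and token-embedding tensors. A sketch, assuming a local f16 GGUF file named gemma-3-4b-it-abliterated-f16.gguf (hypothetical filename):

```shell
# Keep output and embedding tensors at f16, quantize the rest to Q6_K.
# --output-tensor-type / --token-embedding-type are llama-quantize flags;
# file names here are illustrative, not from the model card.
llama-quantize \
  --output-tensor-type f16 \
  --token-embedding-type f16 \
  gemma-3-4b-it-abliterated-f16.gguf \
  gemma-3-4b-it-abliterated-f16.q6.gguf \
  Q6_K

# Same idea for the q5_k variant:
llama-quantize \
  --output-tensor-type f16 \
  --token-embedding-type f16 \
  gemma-3-4b-it-abliterated-f16.gguf \
  gemma-3-4b-it-abliterated-f16.q5.gguf \
  Q5_K_M
```

The command requires a working llama.cpp build and the source f16 GGUF on disk, so it is shown as a configuration sketch rather than a runnable snippet.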
Size optimization
Both f16.q6 and f16.q5 quantization schemes result in a smaller size than standard q8_0 quantization.
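The size claim follows from bits-per-weight arithmetic. In GGUF, q8_0 stores about 8.5 bits per weight, q6_k about 6.5625, q5_k about 5.5, and f16 exactly 16. A minimal sketch of the trade-off (the 10% embedding fraction is a hypothetical illustration, not a figure from the model card):

```python
# Approximate bits per weight for common GGUF tensor types.
BPW = {"f16": 16.0, "q8_0": 8.5, "q6_k": 6.5625, "q5_k": 5.5}

def mixed_bpw(emb_frac: float, body_type: str, emb_type: str = "f16") -> float:
    """Average bits per weight when a fraction emb_frac of the parameters
    (embedding + output tensors) stays at emb_type and the rest uses body_type."""
    return emb_frac * BPW[emb_type] + (1.0 - emb_frac) * BPW[body_type]

# With 10% of parameters held at f16 (hypothetical split):
print(round(mixed_bpw(0.10, "q6_k"), 3))  # f16.q6 average
print(round(mixed_bpw(0.10, "q5_k"), 3))  # f16.q5 average
print(BPW["q8_0"])                        # uniform q8_0 baseline
```

As long as the f16 fraction stays modest, both mixed schemes average fewer bits per weight than uniform q8_0, which is why the f16.q6 and f16.q5 files come out smaller.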
Performance retention
Quantized performance remains on par with the unquantized f16 model.
Model Capabilities
Text generation
Use Cases
Natural language processing
Efficient text generation
Reduces model size while maintaining generation quality.
Smaller size than standard q8_0 with performance comparable to the unquantized f16 model.