Gemma-3-4B-pt-Q4_0-GGUF Open-Source Model - Free Support for Various Text Generation Tasks

gemma-3-4b-pt-Q4_0-GGUF

Developed by ngxson

This is a GGUF-format conversion of Google's Gemma 3 4B pretrained model (google/gemma-3-4b-pt), suitable for text generation tasks.

Large Language Model #Lightweight inference #Local deployment #Chinese text generation

Downloads 74

Release Time: 3/14/2025

Model Overview

This model is a GGUF format version converted from google/gemma-3-4b-pt via llama.cpp, primarily used for text generation tasks.

Model Features

GGUF format

Uses GGUF format for easy integration within the llama.cpp ecosystem.

Quantized version

Provides a Q4_0 (4-bit) quantized version to reduce memory and storage requirements.

Hugging Face integration

Accessible via the Hugging Face platform, subject to Google's Gemma usage license.

Model Capabilities

Text generation

Dialogue systems

Content creation

Use Cases

Content generation

Creative writing

Generate stories, poetry, and other creative content

Q&A systems

Answer various user questions

Education

Learning assistance

Help students understand complex concepts

🚀 ngxson/gemma-3-4b-pt-Q4_0-GGUF

This repository provides google/gemma-3-4b-pt converted to GGUF format for use with llama.cpp and compatible tools.

🚀 Quick Start

This model was converted to GGUF format from google/gemma-3-4b-pt using llama.cpp, via ggml.ai's GGUF-my-repo space. Refer to the original model card for more details on the model.

✨ Features

Format Conversion: Converted to GGUF format from the original model.
Compatibility: Can be used with llama.cpp.

📦 Installation

Install llama.cpp with Homebrew (works on macOS and Linux):

brew install llama.cpp

💻 Usage Examples

Basic Usage

Invoke the llama.cpp server or the CLI.

CLI:

llama-cli --hf-repo ngxson/gemma-3-4b-pt-Q4_0-GGUF --hf-file gemma-3-4b-pt-q4_0.gguf -p "The meaning to life and the universe is"

Server:

llama-server --hf-repo ngxson/gemma-3-4b-pt-Q4_0-GGUF --hf-file gemma-3-4b-pt-q4_0.gguf -c 2048
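
Once the server is running, it can be queried over HTTP. The following is a minimal sketch, assuming the server listens on llama-server's default address (127.0.0.1:8080) and using its `/health` and `/completion` endpoints. The `LLAMA_URL` and `PAYLOAD` variable names and the `n_predict` value of 64 are illustrative choices, not part of the original instructions.

```shell
# Assumption: llama-server was started as shown above and listens on its
# default address. Override LLAMA_URL if you passed --host or --port.
LLAMA_URL="${LLAMA_URL:-http://127.0.0.1:8080}"

# JSON request body: the prompt to continue and a cap on generated tokens.
PAYLOAD='{"prompt": "The meaning to life and the universe is", "n_predict": 64}'

# Send the completion request only if the health check succeeds.
if curl -sf --max-time 2 "$LLAMA_URL/health" >/dev/null 2>&1; then
  curl -s "$LLAMA_URL/completion" \
    -H "Content-Type: application/json" \
    -d "$PAYLOAD"
else
  echo "llama-server is not reachable at $LLAMA_URL" >&2
fi
```

Because this is a pretrained (pt) checkpoint rather than an instruction-tuned one, plain completion prompts like this one are likely a better fit than chat-style requests.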

Advanced Usage

You can also use this checkpoint by following the usage steps listed in the llama.cpp repo.

Step 1: Clone llama.cpp from GitHub.

git clone https://github.com/ggerganov/llama.cpp

Step 2: Move into the llama.cpp folder and build it with the LLAMA_CURL=1 flag, along with any hardware-specific flags (for example, LLAMA_CUDA=1 for Nvidia GPUs on Linux).

cd llama.cpp && LLAMA_CURL=1 make

Step 3: Run inference through the CLI or server binary.

./llama-cli --hf-repo ngxson/gemma-3-4b-pt-Q4_0-GGUF --hf-file gemma-3-4b-pt-q4_0.gguf -p "The meaning to life and the universe is"

./llama-server --hf-repo ngxson/gemma-3-4b-pt-Q4_0-GGUF --hf-file gemma-3-4b-pt-q4_0.gguf -c 2048
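
Newer llama.cpp checkouts are built with CMake rather than the Makefile, so the `make` command in Step 2 may no longer work there. As a hedged sketch of an equivalent build, assuming a recent checkout with CMake installed (the `build_llama_cpp` function name is hypothetical, and the `GGML_CUDA` option applies only to Nvidia GPU builds with the CUDA toolkit available):

```shell
# Hypothetical helper: configure and build llama.cpp with CMake.
# Run it from inside the cloned llama.cpp directory.
build_llama_cpp() {
  # Pass "cuda" as the first argument to request a CUDA build
  # (assumption: Nvidia GPU with the CUDA toolkit installed).
  if [ "${1:-}" = "cuda" ]; then
    cmake -B build -DLLAMA_CURL=ON -DGGML_CUDA=ON
  else
    cmake -B build -DLLAMA_CURL=ON
  fi &&
  cmake --build build --config Release
}

# Usage (from the llama.cpp directory):
#   build_llama_cpp        # CPU build
#   build_llama_cpp cuda   # CUDA build
```

With a CMake build, the binaries typically end up under build/bin/ (for example ./build/bin/llama-cli) rather than in the repository root, so the Step 3 paths would need adjusting accordingly.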

📚 Documentation

Model Information

Property	Details
Base Model	google/gemma-3-4b-pt
Library Name	transformers
License	gemma
Pipeline Tag	image-text-to-text
Tags	llama-cpp, gguf-my-repo

Access Gemma on Hugging Face

To access Gemma on Hugging Face, you’re required to review and agree to Google’s usage license. To do this, make sure you’re logged in to Hugging Face and accept the license on the model page. Requests are processed immediately.

📄 License

The model is released under the Gemma license.
