# 🚀 Gama-12b GGUF Quantized Model
This repository provides weighted/imatrix quants of the rodrigomt/gama-12b model, available in a range of GGUF quantization types covering different size/quality trade-offs.
## 📚 Documentation
### Model Information
| Property | Details |
|----------|---------|
| Base Model | rodrigomt/gama-12b |
| Language | en, pt |
| Library Name | transformers |
| License | gemma |
| Quantized By | mradermacher |
| Tags | merge, gemma, text-generation, conversational, allura-org/Gemma-3-Glitter-12B, soob3123/amoral-gemma3-12B-v2-qat, soob3123/Veiled-Calla-12B |
## About
Weighted/imatrix quants of rodrigomt/gama-12b. Static quants are available at mradermacher/gama-12b-GGUF.
## Usage
If you are unsure how to use GGUF files, refer to one of TheBloke's READMEs for more details, including how to concatenate multi-part files. A minimal loading sketch follows below.
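Here is a minimal loading sketch using the llama-cpp-python bindings together with huggingface_hub. The repository id `mradermacher/gama-12b-i1-GGUF` and the filename `gama-12b.i1-Q4_K_M.gguf` are assumptions inferred from this card's naming conventions, not confirmed by it; verify both against the repository's actual file list before running.

```python
# Minimal sketch: download one quant and run a prompt with llama-cpp-python.
# ASSUMPTIONS: the repo id and filename below are inferred from this card's
# naming and are not confirmed by it -- check the repository's file list.
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

model_path = hf_hub_download(
    repo_id="mradermacher/gama-12b-i1-GGUF",  # assumed repo id
    filename="gama-12b.i1-Q4_K_M.gguf",       # assumed filename (the "recommended" quant)
)

llm = Llama(model_path=model_path, n_ctx=4096)
result = llm("Explain GGUF quantization in one sentence.", max_tokens=64)
print(result["choices"][0]["text"])
```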
## Provided Quants
(Sorted by size, not necessarily quality. IQ-quants are often preferable to similar-sized non-IQ quants.)
| Link | Type | Size/GB | Notes |
|------|------|---------|-------|
| GGUF | i1-IQ1_S | 3.0 | for the desperate |
| GGUF | i1-IQ1_M | 3.3 | mostly desperate |
| GGUF | i1-IQ2_XXS | 3.6 | |
| GGUF | i1-IQ2_XS | 3.9 | |
| GGUF | i1-IQ2_S | 4.1 | |
| GGUF | i1-IQ2_M | 4.4 | |
| GGUF | i1-Q2_K_S | 4.5 | very low quality |
| GGUF | i1-Q2_K | 4.9 | IQ3_XXS probably better |
| GGUF | i1-IQ3_XXS | 4.9 | lower quality |
| GGUF | i1-IQ3_XS | 5.3 | |
| GGUF | i1-IQ3_S | 5.6 | beats Q3_K* |
| GGUF | i1-Q3_K_S | 5.6 | IQ3_XS probably better |
| GGUF | i1-IQ3_M | 5.8 | |
| GGUF | i1-Q3_K_M | 6.1 | IQ3_S probably better |
| GGUF | i1-Q3_K_L | 6.6 | IQ3_M probably better |
| GGUF | i1-IQ4_XS | 6.7 | |
| GGUF | i1-IQ4_NL | 7.0 | prefer IQ4_XS |
| GGUF | i1-Q4_0 | 7.0 | fast, low quality |
| GGUF | i1-Q4_K_S | 7.0 | optimal size/speed/quality |
| GGUF | i1-Q4_K_M | 7.4 | fast, recommended |
| GGUF | i1-Q4_1 | 7.7 | |
| GGUF | i1-Q5_K_S | 8.3 | |
| GGUF | i1-Q5_K_M | 8.5 | |
| GGUF | i1-Q6_K | 9.8 | practically like static Q6_K |
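Larger quants are sometimes uploaded split into multiple parts that must be concatenated back into a single `.gguf` file before loading (see the Usage note above). Here is a minimal sketch, assuming a hypothetical part-naming scheme like `gama-12b.i1-Q6_K.gguf.part1of2`; adjust the pattern to the real filenames in the repository.

```python
# Minimal sketch: rejoin a quant that was uploaded as split parts.
# ASSUMPTION: parts follow a hypothetical naming scheme like
# "gama-12b.i1-Q6_K.gguf.part1of2" -- adjust the pattern to the real filenames.
import re
import shutil
from pathlib import Path

def part_index(path: Path) -> int:
    """Extract the numeric part index so part10 sorts after part2."""
    match = re.search(r"part(\d+)of\d+", path.name)
    return int(match.group(1)) if match else 0

parts = sorted(Path(".").glob("gama-12b.i1-Q6_K.gguf.part*"), key=part_index)
assert parts, "no part files found"

with open("gama-12b.i1-Q6_K.gguf", "wb") as out:
    for part in parts:
        with open(part, "rb") as src:
            shutil.copyfileobj(src, out)  # stream bytes; avoids loading GBs into RAM
```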
Here is a handy graph by ikawrakow comparing some lower-quality quant types (lower is better):

And here are [Artefact2's thoughts on the matter](https://gist.github.com/Artefact2/b5f810600771265fc1e39442288e8ec9).
## FAQ / Model Request
See mradermacher/model_requests for some answers to questions you might have and/or if you want some other model quantized.
## Thanks
I thank my company, nethype GmbH, for letting me use its servers and providing upgrades to my workstation to enable this work in my free time. Additional thanks to @nicoboss for giving me access to his private supercomputer, enabling me to provide many more imatrix quants, at much higher quality, than I would otherwise be able to.