Ichigo-llama3.1-s-instruct-v0.4-GGUF Open-source Model - Multi-quantization Versions for Different Hardware Requirements

Ichigo Llama3.1 S Instruct V0.4 GGUF

Developed by mradermacher

A statically quantized model based on Menlo/Ichigo-llama3.1-s-instruct-v0.4, offering multiple quantization versions to suit different hardware requirements.

Large Language Model EnglishOpen Source License:Apache-2.0 #Voice Command Understanding #Low-resource Quantization #Multi-precision Adaptation

Downloads 369

Release Time : 11/8/2024

Model Overview

This is a quantized language model based on the Llama architecture, primarily designed for instruction-following and text generation tasks. The model has undergone static quantization and provides multiple precision versions to adapt to various computing environments.

Model Features

Multiple Quantization Versions

Offers 13 different quantization versions from Q2_K to f16, catering to diverse hardware performance and precision needs

Efficient Inference

Quantized versions significantly reduce model size and improve inference speed, making them suitable for resource-constrained environments

Cross-platform Compatibility

GGUF format supports multiple platforms and devices, including ARM architecture

Model Capabilities

Text Generation

Instruction Following

English Language Processing

Use Cases

Natural Language Processing

Dialogue Systems

Building English chatbots

Text Generation

Generating coherent English text

Property	Details
Base Model	Menlo/Ichigo-llama3.1-s-instruct-v0.4
Datasets	homebrewltd/instruction-speech-whispervq-v2
Language	en
Library Name	transformers
License	apache-2.0
Quantized By	mradermacher
Tags	sound language model, audio-text-to-text, torchtune

Link	Type	Size/GB	Notes
GGUF	Q2_K	3.3
GGUF	Q3_K_S	3.8
GGUF	Q3_K_M	4.1	lower quality
GGUF	Q3_K_L	4.4
GGUF	IQ4_XS	4.6
GGUF	Q4_0_4_4	4.8	fast on arm, low quality
GGUF	Q4_K_S	4.8	fast, recommended
GGUF	Q4_K_M	5.0	fast, recommended
GGUF	Q5_K_S	5.7
GGUF	Q5_K_M	5.8
GGUF	Q6_K	6.7	very good quality
GGUF	Q8_0	8.6	fast, best quality
GGUF	f16	16.2	16 bpw, overkill