Llama 4 Scout 17B 16E Instruct GGUF

Published by second-state (quantized GGUF build; the original model was created by Meta)

Llama-4-Scout-17B-16E-Instruct is a multilingual, instruction-tuned model that can be run with LlamaEdge.

Large Language Model

Transformers

Supports Multiple Languages

Open Source License: Other

#Multilingual instruction model #High-precision quantization #Tool call support

Downloads 2,959

Release Time: 4/8/2025

Model Overview

This is a multilingual, instruction-tuned model based on the Llama-4 architecture, suitable for text generation and tool-calling tasks.

Model Features

Multilingual support

Supports multiple languages including Arabic, German, English, Spanish, French, Hindi, Indonesian, Italian, Portuguese, Thai, Tagalog, and Vietnamese.

Instruction fine-tuning

The model is instruction fine-tuned, which makes it suitable for dialogue and tool-calling tasks.

Quantized versions

Quantized GGUF files are provided at several precisions, from Q2_K up to Q8_0 plus f16, to suit different hardware and quality requirements.

Model Capabilities

Text generation

Tool call

Multilingual support

Use Cases

Dialogue system

Multilingual dialogue

Build a dialogue system that converses in multiple languages.

Tool call

Weather query

Answer weather queries by having the model call a weather tool.

🚀 Llama-4-Scout-17B-16E-Instruct-GGUF

This project provides a quantized GGUF version of the Llama-4-Scout-17B-16E-Instruct model, enabling efficient inference and deployment.

📚 Documentation

Model Information

Property	Details
Model Name	Llama-4-Scout-17B-16E-Instruct
Base Model	unsloth/Llama-4-Scout-17B-16E-Instruct
Model Creator	Meta
Quantized By	Second State Inc.
Library Name	transformers
Languages Supported	Arabic (ar), German (de), English (en), Spanish (es), French (fr), Hindi (hi), Indonesian (id), Italian (it), Portuguese (pt), Thai (th), Tagalog (tl), Vietnamese (vi)
Tags	facebook, meta, llama, llama-4

Original Model

The original model can be found at unsloth/Llama-4-Scout-17B-16E-Instruct.

Important Note

Only the text mode of Llama-4 is supported for now.

Run with LlamaEdge

Prerequisites

LlamaEdge version: v0.16.16 and above

Prompt Template

Prompt type: llama-4-chat
Prompt string:

<|begin_of_text|><|header_start|>system<|header_end|>

{system_prompt}<|eot|><|header_start|>user<|header_end|>

{user_message_1}<|eot|><|header_start|>assistant<|header_end|>

{assistant_message_1}<|eot|><|header_start|>user<|header_end|>

{user_message_2}<|eot|>
<|header_start|>assistant<|header_end|>
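The template above can be assembled programmatically. The following is a minimal sketch, not LlamaEdge's own implementation: the function name `render_llama4_chat` and the message dictionaries are hypothetical, and the exact whitespace between turns is an assumption based on the template as printed.

```python
from typing import Optional


def render_llama4_chat(messages: list[dict], system_prompt: Optional[str] = None) -> str:
    """Render chat turns into a llama-4-chat style prompt string.

    `messages` is a list of {"role": ..., "content": ...} dicts
    (a hypothetical input shape chosen for this sketch).
    """
    parts = ["<|begin_of_text|>"]
    if system_prompt is not None:
        parts.append(f"<|header_start|>system<|header_end|>\n\n{system_prompt}<|eot|>")
    for message in messages:
        role, content = message["role"], message["content"]
        parts.append(f"<|header_start|>{role}<|header_end|>\n\n{content}<|eot|>")
    # End with an open assistant header so the model writes the next reply.
    # The trailing blank line mirrors the template; treat it as an assumption.
    parts.append("<|header_start|>assistant<|header_end|>\n\n")
    return "".join(parts)


prompt = render_llama4_chat(
    [{"role": "user", "content": "Hello"}],
    system_prompt="You are a helpful assistant.",
)
```

When the model runs through LlamaEdge with `--prompt-template llama-4-chat`, this rendering is handled for you; the sketch is only useful for inspecting or debugging the raw prompt.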

Example: Tool Use

- Prompt with user question and tool info:
<details>
<summary> Expand to see the example </summary>
```text
<|begin_of_text|><|header_start|>system<|header_end|>

You are a helpful assistant.<|eot|>
<|header_start|>user<|header_end|>

Given the following functions, please respond with a JSON for a function call with its proper arguments that best answers the given prompt.

Respond in the format {"name": function name, "parameters": dictionary of argument name and its value}. Do not use variables.

[ { "type": "function", "function": { "name": "get_current_weather", "description": "Get the current weather in a given location", "parameters": { "type": "object", "properties": { "location": { "type": "string", "description": "The city and state, e.g. San Francisco, CA" }, "unit": { "type": "string", "description": "The temperature unit to use. Infer this from the users location.", "enum": [ "celsius", "fahrenheit" ] } }, "required": [ "location", "unit" ] } } }, { "type": "function", "function": { "name": "predict_weather", "description": "Predict the weather in 24 hours", "parameters": { "type": "object", "properties": { "location": { "type": "string", "description": "The city and state, e.g. San Francisco, CA" }, "unit": { "type": "string", "description": "The temperature unit to use. Infer this from the users location.", "enum": [ "celsius", "fahrenheit" ] } }, "required": [ "location", "unit" ] } } }, { "type": "function", "function": { "name": "sum", "description": "Calculate the sum of two numbers", "parameters": { "type": "object", "properties": { "a": { "type": "integer", "description": "the left hand side number" }, "b": { "type": "integer", "description": "the right hand side number" } }, "required": [ "a", "b" ] } } } ]

Question: How is the weather of Beijing, China in celsius?<|eot|><|header_start|>assistant<|header_end|>
```
</details>

- Prompt with tool results:
<details>
<summary> Expand to see the example </summary>
```text
<|begin_of_text|><|header_start|>system<|header_end|>

You are a helpful assistant.<|eot|>
<|header_start|>user<|header_end|>

How is the weather of Beijing, China in celsius?<|eot|>
<|header_start|>assistant<|header_end|>

{"name":"get_current_weather","arguments":"{\"location\":\"Beijing, China\",\"unit\":\"celsius\"}"}<|eot|>
<|header_start|>ipython<|header_end|>

{"temperature":"30","unit":"celsius"}<|eot|><|header_start|>assistant<|header_end|>
```
</details>
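The round trip in the tool-results example (the assistant emits a JSON tool call, the application runs the tool, and the result is fed back in an `ipython` turn) can be sketched as follows. This is an illustration under stated assumptions: `get_current_weather` is a hypothetical stub that returns the canned value from the example, and `run_tool_call` and `tool_result_turn` are names invented for this sketch.

```python
import json


def get_current_weather(location: str, unit: str) -> dict:
    # Hypothetical stub: returns the canned reading used in the example,
    # not real weather data.
    return {"temperature": "30", "unit": unit}


# Registry mapping tool names to Python callables (assumption for this sketch).
TOOLS = {"get_current_weather": get_current_weather}


def run_tool_call(assistant_text: str) -> str:
    """Parse the assistant's JSON tool call, run the tool, return compact JSON."""
    call = json.loads(assistant_text)
    # The example carries "arguments" as a JSON-encoded string; the instruction
    # text asks for a "parameters" object. Accept either form.
    raw_args = call.get("arguments", call.get("parameters", {}))
    args = json.loads(raw_args) if isinstance(raw_args, str) else raw_args
    result = TOOLS[call["name"]](**args)
    return json.dumps(result, separators=(",", ":"))


def tool_result_turn(result_json: str) -> str:
    """Wrap a tool result in the ipython turn shown in the example."""
    return f"<|header_start|>ipython<|header_end|>\n\n{result_json}<|eot|>"


reply = r'{"name":"get_current_weather","arguments":"{\"location\":\"Beijing, China\",\"unit\":\"celsius\"}"}'
turn = tool_result_turn(run_tool_call(reply))
```

Appending `turn` and an open assistant header to the conversation lets the model phrase the final answer from the tool output.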

Context Size

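Context size: 10M tokens (model maximum). Note that the example commands in this document pass `--ctx-size 1000000`, which is 1M tokens.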

Run as LlamaEdge Service

wasmedge --dir .:. --nn-preload default:GGML:AUTO:Llama-4-Scout-17B-16E-Instruct-Q5_K_M.gguf \
  llama-api-server.wasm \
  --prompt-template llama-4-chat \
  --ctx-size 1000000 \
  --model-name Llama-4-Scout
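Once the service is running, a client can send it chat requests. The sketch below assumes that `llama-api-server.wasm` exposes an OpenAI-compatible `/v1/chat/completions` endpoint and listens on `http://localhost:8080`; neither detail is stated in this document, so adjust both to your deployment. The helper names `build_chat_request` and `send` are hypothetical.

```python
import json
import urllib.request


def build_chat_request(user_message: str,
                       base_url: str = "http://localhost:8080",  # assumed default address
                       model: str = "Llama-4-Scout") -> urllib.request.Request:
    """Build a POST request for an OpenAI-style chat completion (assumed API shape)."""
    payload = {
        "model": model,  # matches --model-name in the command above
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": user_message},
        ],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request) -> str:
    """Send the request and return the assistant's text (needs a running server)."""
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]


# Example (only works while the service is running):
# print(send(build_chat_request("How is the weather of Beijing, China in celsius?")))
```

Building the request separately from sending it keeps the payload easy to inspect before any network traffic happens.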

Run as LlamaEdge Command App

wasmedge --dir .:. --nn-preload default:GGML:AUTO:Llama-4-Scout-17B-16E-Instruct-Q5_K_M.gguf \
  llama-chat.wasm \
  --prompt-template llama-4-chat \
  --ctx-size 1000000
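The service and command-app invocations differ only in the Wasm app and the optional `--model-name` flag. As a rough sketch, the argument list can be assembled in one place and handed to `subprocess.run`; the helper name `wasmedge_command` is hypothetical, and every flag it emits is copied from the commands above.

```python
import shlex
from typing import Optional


def wasmedge_command(model_file: str, wasm_app: str, ctx_size: int,
                     model_name: Optional[str] = None) -> list[str]:
    """Assemble the wasmedge argv used in this document's examples."""
    argv = [
        "wasmedge", "--dir", ".:.",
        # Preload spec as it appears above, with only the file name varying.
        "--nn-preload", f"default:GGML:AUTO:{model_file}",
        wasm_app,
        "--prompt-template", "llama-4-chat",
        "--ctx-size", str(ctx_size),
    ]
    if model_name is not None:
        # Only the API server example passes --model-name.
        argv += ["--model-name", model_name]
    return argv


chat_argv = wasmedge_command(
    "Llama-4-Scout-17B-16E-Instruct-Q5_K_M.gguf", "llama-chat.wasm", 1_000_000
)
# shlex.join renders the list as a copy-pastable shell command line.
chat_cmd = shlex.join(chat_argv)
```

Passing `chat_argv` to `subprocess.run(chat_argv, check=True)` would launch the command app, provided `wasmedge`, the Wasm file, and the model file are present in the working directory.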

Quantized GGUF Models

Name	Quant method	Bits	Size	Use case
Llama-4-Scout-17B-16E-Instruct-Q2_K.gguf	Q2_K	2	39.6 GB	smallest, significant quality loss - not recommended for most purposes
Llama-4-Scout-17B-16E-Instruct-Q3_K_L-00001-of-00002.gguf	Q3_K_L	3	29.8 GB	small, substantial quality loss
Llama-4-Scout-17B-16E-Instruct-Q3_K_L-00002-of-00002.gguf	Q3_K_L	3	26.1 GB	small, substantial quality loss
Llama-4-Scout-17B-16E-Instruct-Q3_K_M-00001-of-00002.gguf	Q3_K_M	3	29.8 GB	very small, high quality loss
Llama-4-Scout-17B-16E-Instruct-Q3_K_M-00002-of-00002.gguf	Q3_K_M	3	21.9 GB	very small, high quality loss
Llama-4-Scout-17B-16E-Instruct-Q3_K_S-00001-of-00002.gguf	Q3_K_S	3	29.7 GB	very small, high quality loss
Llama-4-Scout-17B-16E-Instruct-Q3_K_S-00002-of-00002.gguf	Q3_K_S	3	17.0 GB	very small, high quality loss
Llama-4-Scout-17B-16E-Instruct-Q4_0-00001-of-00003.gguf	Q4_0	4	30.0 GB	legacy; small, very high quality loss - prefer using Q3_K_M
Llama-4-Scout-17B-16E-Instruct-Q4_0-00002-of-00003.gguf	Q4_0	4	29.7 GB	legacy; small, very high quality loss - prefer using Q3_K_M
Llama-4-Scout-17B-16E-Instruct-Q4_0-00003-of-00003.gguf	Q4_0	4	1.20 GB	legacy; small, very high quality loss - prefer using Q3_K_M
Llama-4-Scout-17B-16E-Instruct-Q4_K_M-00001-of-00003.gguf	Q4_K_M	4	29.9 GB	medium, balanced quality - recommended
Llama-4-Scout-17B-16E-Instruct-Q4_K_M-00002-of-00003.gguf	Q4_K_M	4	29.8 GB	medium, balanced quality - recommended
Llama-4-Scout-17B-16E-Instruct-Q4_K_M-00003-of-00003.gguf	Q4_K_M	4	5.66 GB	medium, balanced quality - recommended
Llama-4-Scout-17B-16E-Instruct-Q4_K_S-00001-of-00003.gguf	Q4_K_S	4	29.7 GB	small, greater quality loss
Llama-4-Scout-17B-16E-Instruct-Q4_K_S-00002-of-00003.gguf	Q4_K_S	4	29.7 GB	small, greater quality loss
Llama-4-Scout-17B-16E-Instruct-Q4_K_S-00003-of-00003.gguf	Q4_K_S	4	2.04 GB	small, greater quality loss
Llama-4-Scout-17B-16E-Instruct-Q5_0-00001-of-00003.gguf	Q5_0	5	29.9 GB	legacy; medium, balanced quality - prefer using Q4_K_M
Llama-4-Scout-17B-16E-Instruct-Q5_0-00002-of-00003.gguf	Q5_0	5	29.8 GB	legacy; medium, balanced quality - prefer using Q4_K_M
Llama-4-Scout-17B-16E-Instruct-Q5_0-00003-of-00003.gguf	Q5_0	5	14.6 GB	legacy; medium, balanced quality - prefer using Q4_K_M
Llama-4-Scout-17B-16E-Instruct-Q5_K_M-00001-of-00003.gguf	Q5_K_M	5	29.8 GB	large, very low quality loss - recommended
Llama-4-Scout-17B-16E-Instruct-Q5_K_M-00002-of-00003.gguf	Q5_K_M	5	29.8 GB	large, very low quality loss - recommended
Llama-4-Scout-17B-16E-Instruct-Q5_K_M-00003-of-00003.gguf	Q5_K_M	5	16.9 GB	large, very low quality loss - recommended
Llama-4-Scout-17B-16E-Instruct-Q5_K_S-00001-of-00003.gguf	Q5_K_S	5	29.9 GB	large, low quality loss - recommended
Llama-4-Scout-17B-16E-Instruct-Q5_K_S-00002-of-00003.gguf	Q5_K_S	5	29.8 GB	large, low quality loss - recommended
Llama-4-Scout-17B-16E-Instruct-Q5_K_S-00003-of-00003.gguf	Q5_K_S	5	14.6 GB	large, low quality loss - recommended
Llama-4-Scout-17B-16E-Instruct-Q6_K-00001-of-00003.gguf	Q6_K	6	30.0 GB	very large, extremely low quality loss
Llama-4-Scout-17B-16E-Instruct-Q6_K-00002-of-00003.gguf	Q6_K	6	29.6 GB	very large, extremely low quality loss
Llama-4-Scout-17B-16E-Instruct-Q6_K-00003-of-00003.gguf	Q6_K	6	28.9 GB	very large, extremely low quality loss
Llama-4-Scout-17B-16E-Instruct-Q8_0-00001-of-00004.gguf	Q8_0	8	29.5 GB	very large, extremely low quality loss - not recommended
Llama-4-Scout-17B-16E-Instruct-Q8_0-00002-of-00004.gguf	Q8_0	8	29.7 GB	very large, extremely low quality loss - not recommended
Llama-4-Scout-17B-16E-Instruct-Q8_0-00003-of-00004.gguf	Q8_0	8	29.7 GB	very large, extremely low quality loss - not recommended
Llama-4-Scout-17B-16E-Instruct-Q8_0-00004-of-00004.gguf	Q8_0	8	25.7 GB	very large, extremely low quality loss - not recommended
Llama-4-Scout-17B-16E-Instruct-f16-00001-of-00008.gguf	f16	16	30.0 GB
Llama-4-Scout-17B-16E-Instruct-f16-00002-of-00008.gguf	f16	16	29.5 GB
Llama-4-Scout-17B-16E-Instruct-f16-00003-of-00008.gguf	f16	16	29.1 GB
Llama-4-Scout-17B-16E-Instruct-f16-00004-of-00008.gguf	f16	16	29.5 GB
Llama-4-Scout-17B-16E-Instruct-f16-00005-of-00008.gguf	f16	16	29.5 GB
Llama-4-Scout-17B-16E-Instruct-f16-00006-of-00008.gguf	f16	16	29.1 GB
Llama-4-Scout-17B-16E-Instruct-f16-00007-of-00008.gguf	f16	16	29.5 GB
Llama-4-Scout-17B-16E-Instruct-f16-00008-of-00008.gguf	f16	16	9.41 GB
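Most quantizations in the table are split into several shards, so the disk space needed is the sum of the shard sizes. The sketch below copies a few rows from the table and adds them up; it also regenerates shard file names from the naming pattern visible in the table. The helper names are hypothetical, and only a subset of quantizations is included to keep the example short.

```python
# Shard sizes in GB, copied from the table above (subset only).
SHARD_SIZES_GB = {
    "Q2_K": [39.6],
    "Q4_K_M": [29.9, 29.8, 5.66],
    "Q5_K_M": [29.8, 29.8, 16.9],
    "Q8_0": [29.5, 29.7, 29.7, 25.7],
}

BASE_NAME = "Llama-4-Scout-17B-16E-Instruct"


def total_size_gb(quant: str) -> float:
    """Total download size for one quantization, rounded to 2 decimals."""
    return round(sum(SHARD_SIZES_GB[quant]), 2)


def shard_names(quant: str, count: int) -> list[str]:
    """File names for a quantization, following the pattern used in the table."""
    if count == 1:
        return [f"{BASE_NAME}-{quant}.gguf"]
    return [
        f"{BASE_NAME}-{quant}-{index:05d}-of-{count:05d}.gguf"
        for index in range(1, count + 1)
    ]


for quant, sizes in SHARD_SIZES_GB.items():
    print(quant, len(sizes), "file(s),", total_size_gb(quant), "GB")
```

For the recommended Q4_K_M and Q5_K_M builds this works out to roughly 65 GB and 77 GB respectively, which is worth checking against available disk and memory before downloading.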