openchat-3.6-8b-20240522-IMat-GGUF open-source model: multiple quantization types, downloadable on demand

Openchat 3.6 8b 20240522 IMat GGUF

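Developed by legraphista

These are Llama.cpp imatrix quantizations of the openchat/openchat-3.6-8b-20240522 model. Each quantization type is published as its own file, so users can download only the one they need.

Large language model #EfficientQuantization #MultiTurnDialogue #LightweightDeployment

Downloads: 4,416

Released: 5/27/2024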

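Model Overview

This model is a quantized version of openchat-3.6-8b for text generation tasks. It ships in multiple quantization options that trade output quality against file size and resource use.

Model Features

Multiple quantization options

Quantization types range from Q8_0 down to IQ1_S, covering different hardware and performance requirements.

IMatrix optimization

Some quantization types are produced with an IMatrix dataset, which improves the quality of the lower-bit quantization types.

Lightweight deployment

The quantized models are smaller, which makes them suitable for devices with limited resources.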

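Model Capabilities

Text generation

Dialogue systems

Math problem solving

Use Cases

Everyday conversation

Healthy eating advice

Suggests ways to eat bananas and dragon fruit together.

Produces concrete recipe suggestions, such as a banana and dragon fruit smoothie and a salad.

Educational assistance

Math problem solving

Solves simple linear equations.

Correctly solves equations such as 2x + 3 = 7.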

🚀 openchat-3.6-8b-20240522-IMat-GGUF

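This project provides Llama.cpp imatrix quantizations of the openchat/openchat-3.6-8b-20240522 model. Files for different quantization types are available, so users can download and use the one that fits their needs.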

Property	Details
Base model	openchat/openchat-3.6-8b-20240522
Inference	No
Library name	gguf
License	llama3
Task type	Text generation
Quantized by	legraphista
Tags	quantized, GGUF, imatrix, quantization, imat, imatrix, static

Original model: openchat/openchat-3.6-8b-20240522
Original data type: BF16 (bfloat16)
Quantization tool: llama.cpp b3006
IMatrix dataset: click to view

📦 File Information

IMatrix

Status: ✅ Available
Link: click to download

Common quantization files

File name	Quant type	File size	Status	Uses IMatrix	Split
openchat-3.6-8b-20240522.Q8_0.gguf	Q8_0	8.54GB	✅ Available	❌ Static	❌ No
openchat-3.6-8b-20240522.Q6_K.gguf	Q6_K	6.60GB	✅ Available	❌ Static	❌ No
openchat-3.6-8b-20240522.Q4_K.gguf	Q4_K	4.92GB	✅ Available	✅ IMatrix	❌ No
openchat-3.6-8b-20240522.Q3_K.gguf	Q3_K	4.02GB	✅ Available	✅ IMatrix	❌ No
openchat-3.6-8b-20240522.Q2_K.gguf	Q2_K	3.18GB	✅ Available	✅ IMatrix	❌ No

All quantization files

File name	Quant type	File size	Status	Uses IMatrix	Split
openchat-3.6-8b-20240522.FP16.gguf	F16	16.07GB	✅ Available	❌ Static	❌ No
openchat-3.6-8b-20240522.BF16.gguf	BF16	16.07GB	✅ Available	❌ Static	❌ No
openchat-3.6-8b-20240522.Q5_K.gguf	Q5_K	5.73GB	✅ Available	❌ Static	❌ No
openchat-3.6-8b-20240522.Q5_K_S.gguf	Q5_K_S	5.60GB	✅ Available	❌ Static	❌ No
openchat-3.6-8b-20240522.Q4_K_S.gguf	Q4_K_S	4.69GB	✅ Available	✅ IMatrix	❌ No
openchat-3.6-8b-20240522.Q3_K_L.gguf	Q3_K_L	4.32GB	✅ Available	✅ IMatrix	❌ No
openchat-3.6-8b-20240522.Q3_K_S.gguf	Q3_K_S	3.66GB	✅ Available	✅ IMatrix	❌ No
openchat-3.6-8b-20240522.Q2_K_S.gguf	Q2_K_S	2.99GB	✅ Available	✅ IMatrix	❌ No
openchat-3.6-8b-20240522.IQ4_NL.gguf	IQ4_NL	4.68GB	✅ Available	✅ IMatrix	❌ No
openchat-3.6-8b-20240522.IQ4_XS.gguf	IQ4_XS	4.45GB	✅ Available	✅ IMatrix	❌ No
openchat-3.6-8b-20240522.IQ3_M.gguf	IQ3_M	3.78GB	✅ Available	✅ IMatrix	❌ No
openchat-3.6-8b-20240522.IQ3_S.gguf	IQ3_S	3.68GB	✅ Available	✅ IMatrix	❌ No
openchat-3.6-8b-20240522.IQ3_XS.gguf	IQ3_XS	3.52GB	✅ Available	✅ IMatrix	❌ No
openchat-3.6-8b-20240522.IQ3_XXS.gguf	IQ3_XXS	3.27GB	✅ Available	✅ IMatrix	❌ No
openchat-3.6-8b-20240522.IQ2_M.gguf	IQ2_M	2.95GB	✅ Available	✅ IMatrix	❌ No
openchat-3.6-8b-20240522.IQ2_S.gguf	IQ2_S	2.76GB	✅ Available	✅ IMatrix	❌ No
openchat-3.6-8b-20240522.IQ2_XS.gguf	IQ2_XS	2.61GB	✅ Available	✅ IMatrix	❌ No
openchat-3.6-8b-20240522.IQ2_XXS.gguf	IQ2_XXS	2.40GB	✅ Available	✅ IMatrix	❌ No
openchat-3.6-8b-20240522.IQ1_M.gguf	IQ1_M	2.16GB	✅ Available	✅ IMatrix	❌ No
openchat-3.6-8b-20240522.IQ1_S.gguf	IQ1_S	2.02GB	✅ Available	✅ IMatrix	❌ No
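
Choosing among this many files can be mechanical once you know your size budget. The sketch below is one possible way to pick the largest file that fits under a given limit, using the sizes listed in the tables. The helper name `largest_quant_within` is hypothetical, and treating "largest file that fits" as the best choice is a simplifying assumption: actual memory use at inference time is higher than the file size, and quality does not track size exactly.

```python
# File sizes in GB, copied from the tables above (quant type -> size).
QUANT_SIZES_GB = {
    "BF16": 16.07, "F16": 16.07, "Q8_0": 8.54, "Q6_K": 6.60,
    "Q5_K": 5.73, "Q5_K_S": 5.60, "Q4_K": 4.92, "Q4_K_S": 4.69,
    "IQ4_NL": 4.68, "IQ4_XS": 4.45, "Q3_K_L": 4.32, "Q3_K": 4.02,
    "IQ3_M": 3.78, "IQ3_S": 3.68, "Q3_K_S": 3.66, "IQ3_XS": 3.52,
    "IQ3_XXS": 3.27, "Q2_K": 3.18, "Q2_K_S": 2.99, "IQ2_M": 2.95,
    "IQ2_S": 2.76, "IQ2_XS": 2.61, "IQ2_XXS": 2.40, "IQ1_M": 2.16,
    "IQ1_S": 2.02,
}


def largest_quant_within(budget_gb):
    """Return (quant_type, size_gb) for the largest file <= budget_gb, or None.

    Hypothetical helper: file size is only a rough proxy for memory use.
    """
    candidates = [
        (size, name) for name, size in QUANT_SIZES_GB.items() if size <= budget_gb
    ]
    if not candidates:
        return None
    size, name = max(candidates)
    return name, size


if __name__ == "__main__":
    print(largest_quant_within(5.0))
```

Leave headroom in the budget for the context window and runtime buffers rather than passing in the full amount of free memory.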

🚀 Quick Start

Download with huggingface-cli

If you have not installed huggingface-cli yet, install it with the following command:

pip install -U "huggingface_hub[cli]"

Download the specific file you need:

huggingface-cli download legraphista/openchat-3.6-8b-20240522-IMat-GGUF --include "openchat-3.6-8b-20240522.Q8_0.gguf" --local-dir ./

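If a model file is large, it may have been split into several files. To download all of the parts into a local folder, run the following command: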

huggingface-cli download legraphista/openchat-3.6-8b-20240522-IMat-GGUF --include "openchat-3.6-8b-20240522.Q8_0/*" --local-dir openchat-3.6-8b-20240522.Q8_0
# See the FAQ for how to merge GGUF files
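
The same single-file download can also be scripted from Python. This is a minimal sketch that assumes the `huggingface_hub` package installed by the pip command above and uses its `hf_hub_download` function. The helper names `quant_filename` and `download_quant` are hypothetical, and the file-name pattern is inferred from the tables on this page.

```python
REPO_ID = "legraphista/openchat-3.6-8b-20240522-IMat-GGUF"


def quant_filename(quant):
    """Build the file name for a quant label, following the pattern in the tables.

    Note: the F16 file is listed as "...FP16.gguf", so pass "FP16" for it.
    """
    return f"openchat-3.6-8b-20240522.{quant}.gguf"


def download_quant(quant, local_dir="."):
    """Download one quantized file and return its local path (needs network access)."""
    # Imported lazily so quant_filename() works without huggingface_hub installed.
    from huggingface_hub import hf_hub_download

    return hf_hub_download(
        repo_id=REPO_ID,
        filename=quant_filename(quant),
        local_dir=local_dir,
    )


# Example (not run here because it downloads several GB):
# path = download_quant("Q4_K", local_dir="./models")
```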

💻 Usage Examples

Simple chat template

<|begin_of_text|><|start_header_id|>GPT4 Correct User<|end_header_id|>

Can you provide ways to eat combinations of bananas and dragonfruits?<|eot_id|><|start_header_id|>GPT4 Correct Assistant<|end_header_id|>

Sure! Here are some ways to eat bananas and dragonfruits together:
 1. Banana and dragonfruit smoothie: Blend bananas and dragonfruits together with some milk and honey.
 2. Banana and dragonfruit salad: Mix sliced bananas and dragonfruits together with some lemon juice and honey.<|eot_id|><|start_header_id|>GPT4 Correct User<|end_header_id|>

What about solving an 2x + 3 = 7 equation?<|eot_id|>

Chat template with a system prompt

<|begin_of_text|><|start_header_id|>System<|end_header_id|>

You are a helpful AI.<|eot_id|><|start_header_id|>GPT4 Correct User<|end_header_id|>

Can you provide ways to eat combinations of bananas and dragonfruits?<|eot_id|><|start_header_id|>GPT4 Correct Assistant<|end_header_id|>

Sure! Here are some ways to eat bananas and dragonfruits together:
 1. Banana and dragonfruit smoothie: Blend bananas and dragonfruits together with some milk and honey.
 2. Banana and dragonfruit salad: Mix sliced bananas and dragonfruits together with some lemon juice and honey.<|eot_id|><|start_header_id|>GPT4 Correct User<|end_header_id|>

What about solving an 2x + 3 = 7 equation?<|eot_id|>
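
Both templates follow the same pattern: a `<|begin_of_text|>` marker, then one header-plus-content block per turn, each closed with `<|eot_id|>`. A minimal sketch of assembling such a prompt programmatically is shown below. The function name `build_prompt` is hypothetical, and appending an open assistant header at the end to cue the model's reply is an assumption; the examples above stop after the last user turn.

```python
ASSISTANT_ROLE = "GPT4 Correct Assistant"


def build_prompt(messages, add_generation_header=True):
    """Assemble a prompt string from (role, content) pairs.

    Roles are the header strings used in the templates above, e.g.
    "System", "GPT4 Correct User", "GPT4 Correct Assistant".
    """
    parts = ["<|begin_of_text|>"]
    for role, content in messages:
        parts.append(
            f"<|start_header_id|>{role}<|end_header_id|>\n\n{content}<|eot_id|>"
        )
    if add_generation_header:
        # Assumption: an open assistant header cues the model to answer next.
        parts.append(f"<|start_header_id|>{ASSISTANT_ROLE}<|end_header_id|>\n\n")
    return "".join(parts)


if __name__ == "__main__":
    print(build_prompt([
        ("System", "You are a helpful AI."),
        ("GPT4 Correct User", "What about solving an 2x + 3 = 7 equation?"),
    ]))
```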

Inference with Llama.cpp

llama.cpp/main -m openchat-3.6-8b-20240522.Q8_0.gguf --color -i -p "prompt here (according to the chat template)"
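
Because a templated prompt contains newlines and characters such as `<`, `|`, and `>`, quoting it by hand on the command line is error-prone. The sketch below builds the same invocation as an argument list and prints a shell-safe version with the standard-library `shlex` module. The function name `llama_cpp_command` is hypothetical; the binary path and flags are the ones shown in the command above.

```python
import shlex


def llama_cpp_command(model_path, prompt, binary="llama.cpp/main"):
    """Return the llama.cpp invocation from the example above as an argument list."""
    return [binary, "-m", model_path, "--color", "-i", "-p", prompt]


if __name__ == "__main__":
    prompt = (
        "<|begin_of_text|><|start_header_id|>GPT4 Correct User<|end_header_id|>\n\n"
        "What about solving an 2x + 3 = 7 equation?<|eot_id|>"
    )
    args = llama_cpp_command("openchat-3.6-8b-20240522.Q8_0.gguf", prompt)
    # shlex.join quotes each argument so the line can be pasted into a POSIX shell.
    print(shlex.join(args))
```

Passing the list directly to `subprocess.run(args)` avoids shell quoting altogether.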

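❓ FAQ

Why is the IMatrix not applied everywhere?

According to this investigation, only the lower quantization types appear to benefit from imatrix input (based on hellaswag results).

How do I merge split GGUF files?

Make sure you have the gguf-split tool:
- To get gguf-split, go to https://github.com/ggerganov/llama.cpp/releases.
- From the latest release, download the archive that matches your system.
- After extracting the archive, you should find the gguf-split tool.
Locate the folder that contains your split GGUF files (for example: openchat-3.6-8b-20240522.Q8_0).
Run the following command to merge them: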

gguf-split --merge openchat-3.6-8b-20240522.Q8_0/openchat-3.6-8b-20240522.Q8_0-00001-of-XXXXX.gguf openchat-3.6-8b-20240522.Q8_0.gguf

Make sure to point gguf-split at the first file of the split.
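
Since the `XXXXX` part of the first file's name depends on how many parts there are, it can help to locate that file automatically. The sketch below is one way to do so, assuming the parts follow the `-00001-of-XXXXX.gguf` naming shown in the command above. The function name `merge_command` is hypothetical, and it only builds the command rather than running it.

```python
from pathlib import Path


def merge_command(split_dir, output_path, tool="gguf-split"):
    """Build the gguf-split merge command for the split files in split_dir.

    Assumes the first part is named "<name>-00001-of-<total>.gguf".
    """
    firsts = sorted(Path(split_dir).glob("*-00001-of-*.gguf"))
    if len(firsts) != 1:
        raise FileNotFoundError(
            f"expected exactly one first part in {split_dir}, found {len(firsts)}"
        )
    return [tool, "--merge", str(firsts[0]), str(output_path)]


# Example (requires gguf-split on PATH and the split files on disk):
# import subprocess
# subprocess.run(
#     merge_command("openchat-3.6-8b-20240522.Q8_0",
#                   "openchat-3.6-8b-20240522.Q8_0.gguf"),
#     check=True,
# )
```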

If you have any suggestions, feel free to contact me at @legraphista!
