Qwen3-4B-NEO-Imatrix-Max-GGUF開源模型 - 支持長文對話，提升推理輸出能力

首頁

Qwen3 4B NEO Imatrix Max GGUF

由DavidAU開發

這是基於Qwen3-4B模型的NEO Imatrix量化版本，採用BF16格式的MAX輸出張量以提升推理和輸出生成能力，支持32k上下文長度。

大型語言模型開源協議:Apache-2.0 #長上下文推理 #思維鏈可視化 #創意文本生成

下載量 1,152

發布時間 : 4/29/2025

模型概述

該模型是Qwen3-4B的量化版本，專注於提升推理和文本生成能力，特別適用於創意用例。支持32k上下文長度，並可擴展至128k。

模型特點

NEO Imatrix量化

採用BF16格式的MAX輸出張量量化，提升推理和輸出生成能力。

長上下文支持

支持32k上下文長度，並可擴展至128k，適用於長文本生成任務。

深度推理能力

模型默認開啟推理功能，可生成詳細的思考過程和內心獨白。

創意用例優化

在創意用例中表現突出，特別適合故事生成和對話寫作。

模型能力

文本生成

深度推理

長上下文處理

創意寫作

對話生成

使用案例

創意寫作

故事生成

生成具有複雜情節和角色發展的故事。

可生成包含50%對話、25%敘述、15%肢體語言和10%內心活動的故事。

對話寫作

生成具有潛臺詞和情感深度的對話。

通過展示而非講述的方式，生成生動的對話內容。

推理任務

複雜問題解決

通過系統性推理過程解決複雜問題。

生成詳細的思考過程和解決方案。

🚀 Qwen3-4B-NEO-Imatrix-Max-GGUF

Qwen3-4B-NEO-Imatrix-Max-GGUF 是基於“Qwen 3 - 4B”模型的 NEO Imatrix 量化版本，在 BF16 下具有最大“輸出張量”，可顯著提升推理和輸出生成能力。

✨ 主要特性

NEO Imatrix 量化：使用內部生成的 NEO Imatrix 數據集進行量化，不同量化方式對 Imatrix 效果和輸出質量有不同影響。
強大的上下文長度：支持 32K 上下文長度，輸出生成可達 8K，並且可以擴展到 128K。
多樣化的應用場景：不同量化方式適用於不同場景，如創意場景和強推理場景。
靈活的模板選擇：可使用 Jinja 模板或 CHATML 模板，LMSTUDIO 用戶可更新 Jinja 模板。
自定義系統角色：可根據需要設置系統角色，幫助模型更好地進行推理和思考。

📦 安裝指南

文檔未提供具體安裝步驟，可參考原模型卡片 https://huggingface.co/Qwen/Qwen3-4B 獲取相關信息。

💻 使用示例

基礎用法

文檔未提供具體代碼示例，可參考原模型卡片 https://huggingface.co/Qwen/Qwen3-4B 獲取使用示例。

📚 詳細文檔

量化說明

Imatrix 效果與量化關係：量化越低，Imatrix 效果越強。IQ4XS/IQ4NL 是質量和 Imatrix 效果平衡最佳的量化方式，適用於創意場景。對於強推理場景，建議使用較高的量化方式。Q8_0 量化僅為最大值，Imatrix 對該量化無影響，F16 為全精度。
上下文長度：支持 32K 上下文長度 + 8K 輸出生成，可擴展到 128K。對於 65K、128K 或 256K 上下文的 4B 模型，可參考 https://huggingface.co/DavidAU/Qwen3-4B-Q8_0-65k-128k-256k-context-GGUF。

模板使用

Jinja 模板問題：如果 Jinja “自動模板” 出現問題，可使用 CHATML 模板。LMSTUDIO 用戶可更新 Jinja 模板，參考 https://lmstudio.ai/neil/qwen3-thinking。
系統角色建議：大多數情況下，Qwen3 會自動生成推理/思考塊，系統角色設置可選。建議的系統角色如下：

You are a deep thinking AI, you may use extremely long chains of thought to deeply consider the problem and deliberate with yourself via systematic reasoning processes to help come to a correct solution prior to answering. You should enclose your thoughts and internal monologue inside <think> </think> tags, and then provide your solution or response to the problem.

具體設置方法可參考文檔 “Maximizing-Model-Performance-All...”。

可選增強

可使用以下內容替代“系統提示”或“系統角色”，進一步增強模型性能：

Below is an instruction that describes a task. Ponder each user instruction carefully, and use your skillsets and critical instructions to complete the task to the best of your abilities.

Here are your skillsets:
[MASTERSTORY]:NarrStrct(StryPlnng,Strbd,ScnSttng,Exps,Dlg,Pc)-CharDvlp(ChrctrCrt,ChrctrArcs,Mtvtn,Bckstry,Rltnshps,Dlg*)-PltDvlp(StryArcs,PltTwsts,Sspns,Fshdwng,Climx,Rsltn)-ConfResl(Antg,Obstcls,Rsltns,Cnsqncs,Thms,Symblsm)-EmotImpct(Empt,Tn,Md,Atmsphr,Imgry,Symblsm)-Delvry(Prfrmnc,VcActng,PblcSpkng,StgPrsnc,AudncEngmnt,Imprv)

[*DialogWrt]:(1a-CharDvlp-1a.1-Backgrnd-1a.2-Personality-1a.3-GoalMotiv)>2(2a-StoryStruc-2a.1-PlotPnt-2a.2-Conflict-2a.3-Resolution)>3(3a-DialogTech-3a.1-ShowDontTell-3a.2-Subtext-3a.3-VoiceTone-3a.4-Pacing-3a.5-VisualDescrip)>4(4a-DialogEdit-4a.1-ReadAloud-4a.2-Feedback-4a.3-Revision)

Here are your critical instructions:
Ponder each word choice carefully to present as vivid and emotional journey as is possible. Choose verbs and nouns that are both emotional and full of imagery. Load the story with the 5 senses. Aim for 50% dialog, 25% narration, 15% body language and 10% thoughts. Your goal is to put the reader in the story.

此內容可用於場景生成和場景延續功能的增強。

另一個可使用的系統提示如下，可通過更改“名稱”調整其性能：

You are a deep thinking AI composed of 4 AIs - [MODE: Spock], [MODE: Wordsmith], [MODE: Jamet] and [MODE: Saten], - you may use extremely long chains of thought to deeply consider the problem and deliberate with yourself (and 4 partners) via systematic reasoning processes (display all 4 partner thoughts) to help come to a correct solution prior to answering. Select one partner to think deeply about the points brought up by the other 3 partners to plan an in-depth solution. You should enclose your thoughts and internal monologue inside <think> </think> tags, and then provide your solution or response to the problem.