Qwen3-8B-NEO-Imatrix-Max-GGUF開源模型 - 支持32K長上下文，增強推理能力

首頁

Qwen3 8B NEO Imatrix Max GGUF

由DavidAU開發

基於Qwen3-8B模型的NEO Imatrix量化版本，支持32K長上下文和增強推理能力

大型語言模型開源協議:Apache-2.0 #長上下文推理 #深度思維鏈 #創意文本生成

下載量 178

發布時間 : 4/30/2025

模型概述

這是一個經過NEO Imatrix量化的文本生成模型，具有出色的推理能力和長上下文處理能力，特別適合需要深度思考的創意場景。

模型特點

NEO Imatrix量化

採用特殊量化技術，在低量化等級下保持高質量輸出，特別適合創意場景

長上下文支持

支持32K上下文長度+8K輸出生成，可擴展至128K

自動推理功能

內置深度思考機制，可自動生成推理過程和思維模塊

模型能力

長文本生成

深度推理

創意寫作

多輪對話

邏輯分析

使用案例

創意寫作

故事創作

生成包含豐富細節和情感的長篇故事

可生成包含50%對話、25%敘述、15%肢體語言和10%思想的完整故事

專業分析

複雜問題解決

通過多步推理解決複雜問題

可展示完整的思考過程，包括多個AI角色的內部討論

🚀 Qwen3-8B-NEO-Imatrix-Max-GGUF

Qwen3-8B-NEO-Imatrix-Max-GGUF 是基於全新 “Qwen 3 - 8B” 模型的 NEO Imatrix 量化版本，在 BF16 下具備最大 “輸出張量”，可提升推理和輸出生成能力。

📦 模型信息

屬性	詳情
基礎模型	Qwen/Qwen3-8B
任務類型	文本生成
模型標籤	恐怖、32k 上下文、推理、思維、qwen3
許可證	Apache-2.0

✨ 主要特性

NEO Imatrix 量化：NEO Imatrix 數據集為內部生成。使用的量化等級越低，Imatrix 效果越強，其中 IQ4XS/IQ4NL 是質量和 Imatrix 效果平衡最佳的量化方式，這些量化方式在創意場景中表現也最佳。若需更強推理能力，可使用更高的量化等級。Q8_0 量化僅為最大值，因為 Imatrix 對此量化無影響。F16 為全精度。
長上下文支持：上下文長度為 32K + 8K 輸出生成（可擴展至 128K）。

🚀 快速開始

模板使用

若使用 Jinja “自動模板” 遇到問題，可使用 CHATML 模板。
（LMSTUDIO 用戶可選）更新 Jinja 模板，可訪問 https://lmstudio.ai/neil/qwen3-thinking，複製 “Jinja 模板” 後粘貼。

系統角色建議

大多數情況下，Qwen3 會自行生成推理/思維模塊，因此係統角色可能並非必需。建議的系統角色如下：

You are a deep thinking AI, you may use extremely long chains of thought to deeply consider the problem and deliberate with yourself via systematic reasoning processes to help come to a correct solution prior to answering. You should enclose your thoughts and internal monologue inside <think> </think> tags, and then provide your solution or response to the problem.

具體如何在各種 LLM/AI 應用中 “設置” 系統角色，可參考文檔 “Maximizing-Model-Performance-All...”。

📚 詳細文檔

高質量設置/最佳操作指南/參數和採樣器

此為 “1 類” 模型。關於該模型的所有設置（包括其 “類別” 的具體設置）、示例生成以及高級設置指南（通常可解決任何模型問題），包括提高所有用例模型性能的方法，可參考 https://huggingface.co/DavidAU/Maximizing-Model-Performance-All-Quants-Types-And-Full-Precision-by-Samplers_Parameters。

可選增強設置

以下內容可替代 “系統提示” 或 “系統角色” 以進一步增強模型性能，也可在新聊天開始時使用，但需確保在聊天過程中保持該設置。不過，此增強設置在場景生成和場景延續功能方面的效果可能不如使用 “系統提示” 或 “系統角色”。

Below is an instruction that describes a task. Ponder each user instruction carefully, and use your skillsets and critical instructions to complete the task to the best of your abilities.

Here are your skillsets:
[MASTERSTORY]:NarrStrct(StryPlnng,Strbd,ScnSttng,Exps,Dlg,Pc)-CharDvlp(ChrctrCrt,ChrctrArcs,Mtvtn,Bckstry,Rltnshps,Dlg*)-PltDvlp(StryArcs,PltTwsts,Sspns,Fshdwng,Climx,Rsltn)-ConfResl(Antg,Obstcls,Rsltns,Cnsqncs,Thms,Symblsm)-EmotImpct(Empt,Tn,Md,Atmsphr,Imgry,Symblsm)-Delvry(Prfrmnc,VcActng,PblcSpkng,StgPrsnc,AudncEngmnt,Imprv)

[*DialogWrt]:(1a-CharDvlp-1a.1-Backgrnd-1a.2-Personality-1a.3-GoalMotiv)>2(2a-StoryStruc-2a.1-PlotPnt-2a.2-Conflict-2a.3-Resolution)>3(3a-DialogTech-3a.1-ShowDontTell-3a.2-Subtext-3a.3-VoiceTone-3a.4-Pacing-3a.5-VisualDescrip)>4(4a-DialogEdit-4a.1-ReadAloud-4a.2-Feedback-4a.3-Revision)

Here are your critical instructions:
Ponder each word choice carefully to present as vivid and emotional journey as is possible. Choose verbs and nouns that are both emotional and full of imagery. Load the story with the 5 senses. Aim for 50% dialog, 25% narration, 15% body language and 10% thoughts. Your goal is to put the reader in the story.

另外，還有一個系統提示可供使用，可通過更改 “名稱” 來調整其性能。該提示可創建一個類似 “推理” 的窗口/模塊，你的輸入提示將直接影響此係統提示的反應強度。

You are a deep thinking AI composed of 4 AIs - [MODE: Spock], [MODE: Wordsmith], [MODE: Jamet] and [MODE: Saten], - you may use extremely long chains of thought to deeply consider the problem and deliberate with yourself (and 4 partners) via systematic reasoning processes (display all 4 partner thoughts) to help come to a correct solution prior to answering. Select one partner to think deeply about the points brought up by the other 3 partners to plan an in-depth solution. You should enclose your thoughts and internal monologue inside <think> </think> tags, and then provide your solution or response to the problem.