Qwen3-32B-128k-NEO-Imatrix-Max-GGUF開源模型 - 超長上下文，大幅提升推理生成能力

首頁

Qwen3 32B 128k NEO Imatrix Max GGUF

由DavidAU開發

這是Qwen3-32B模型的NEO Imatrix量化版本，採用BF16格式最大化輸出張量以提升推理/生成能力，支持128k上下文長度。

大型語言模型開源協議:Apache-2.0 #128k超長上下文 #推理增強 #恐怖敘事優化

下載量 1,437

發布時間 : 5/2/2025

模型概述

基於Qwen3-32B的量化版本，優化了推理和文本生成能力，特別適合創意寫作和長文本生成任務。

模型特點

128k超長上下文

支持長達128k的上下文長度，適合處理長文檔和複雜敘事。

NEO Imatrix量化

採用BF16格式最大化輸出張量，提升推理和生成質量。

深度推理能力

內置思考模塊，可生成詳細的推理過程和內心獨白。

創意寫作優化

在恐怖、科幻等創意寫作場景中表現突出。

模型能力

文本生成

長文本處理

創意寫作

推理分析

對話生成

使用案例

創意寫作

恐怖故事生成

生成具有情感張力和氛圍感的恐怖故事。

如示例中的《最後的傳輸》故事，展現了深刻的情感衝擊和敘事技巧。

科幻敘事

創作複雜的科幻場景和角色對話。

能夠構建完整的宇宙飛船場景和角色心理活動。

推理分析

複雜問題推理

通過思維鏈分析複雜問題並給出系統化解答。

模型可生成詳細的思考過程，如示例中的[[[思考開始]]]模塊。

🚀 Qwen3-32B-NEO-Imatrix-Max-GGUF

Qwen3-32B-NEO-Imatrix-Max-GGUF是基於“Qwen 3 - 32B”模型的NEO Imatrix量化版本，在BF16下具有最大“輸出張量”，可顯著提升推理和輸出生成能力。該模型的NEO Imatrix數據集為內部生成，上下文長度調整至128k，以實現更強大的推理和輸出效果。

✨ 主要特性

增強推理能力：通過優化輸出張量，提升模型的推理和輸出生成能力。
128k上下文長度：支持更長的上下文，適用於複雜任務。
多樣化量化選擇：提供多種量化方式，滿足不同場景需求。
自定義模板支持：可使用Jinja模板或CHATML模板，靈活配置系統角色。

📦 安裝指南

文檔未提及安裝步驟，故跳過此章節。

💻 使用示例

基礎用法

以下是一個使用該模型生成科幻故事的示例：

Science Fiction: The Last Transmission - Write a story that takes place entirely within a spaceship's cockpit as the sole surviving crew member attempts to send a final message back to Earth before the ship's power runs out. The story should explore themes of isolation, sacrifice, and the importance of human connection in the face of adversity. If the situation calls for it, have the character(s) curse and swear to further the reader's emotional connection to them. 800 - 1000 words.

QUANT: IQ3_S, Temp.6, rep pen 1.06, top k 100, topp.95 minp.05, rep pen range 64

高級用法

可根據不同需求調整量化參數、溫度、重複懲罰等，以獲得更好的生成效果。例如，使用更高的量化級別可能會帶來更優的結果。

📚 詳細文檔

模板使用說明

Jinja模板：若Jinja“自動模板”使用有問題，可使用CHATML模板。
LMSTUDIO用戶：可訪問https://lmstudio.ai/neil/qwen3-thinking複製並粘貼“Jinja模板”。

系統角色建議

系統角色並非必需，因為Qwen3通常會自動生成推理/思考模塊。建議系統角色如下：

You are a deep thinking AI, you may use extremely long chains of thought to deeply consider the problem and deliberate with yourself via systematic reasoning processes to help come to a correct solution prior to answering. You should enclose your thoughts and internal monologue inside <think> </think> tags, and then provide your solution or response to the problem.

具體設置方法可參考文檔“Maximizing-Model-Performance-All...”。

高質量設置與參數

該模型為“Class 1”模型，所有設置（包括特定設置）、示例生成及高級設置指南可參考https://huggingface.co/DavidAU/Maximizing-Model-Performance-All-Quants-Types-And-Full-Precision-by-Samplers_Parameters。

可選增強設置

可使用以下內容替代“系統提示”或“系統角色”，以進一步增強模型性能：

Below is an instruction that describes a task. Ponder each user instruction carefully, and use your skillsets and critical instructions to complete the task to the best of your abilities.

Here are your skillsets:
[MASTERSTORY]:NarrStrct(StryPlnng,Strbd,ScnSttng,Exps,Dlg,Pc)-CharDvlp(ChrctrCrt,ChrctrArcs,Mtvtn,Bckstry,Rltnshps,Dlg*)-PltDvlp(StryArcs,PltTwsts,Sspns,Fshdwng,Climx,Rsltn)-ConfResl(Antg,Obstcls,Rsltns,Cnsqncs,Thms,Symblsm)-EmotImpct(Empt,Tn,Md,Atmsphr,Imgry,Symblsm)-Delvry(Prfrmnc,VcActng,PblcSpkng,StgPrsnc,AudncEngmnt,Imprv)

[*DialogWrt]:(1a-CharDvlp-1a.1-Backgrnd-1a.2-Personality-1a.3-GoalMotiv)>2(2a-StoryStruc-2a.1-PlotPnt-2a.2-Conflict-2a.3-Resolution)>3(3a-DialogTech-3a.1-ShowDontTell-3a.2-Subtext-3a.3-VoiceTone-3a.4-Pacing-3a.5-VisualDescrip)>4(4a-DialogEdit-4a.1-ReadAloud-4a.2-Feedback-4a.3-Revision)

Here are your critical instructions:
Ponder each word choice carefully to present as vivid and emotional journey as is possible. Choose verbs and nouns that are both emotional and full of imagery. Load the story with the 5 senses. Aim for 50% dialog, 25% narration, 15% body language and 10% thoughts. Your goal is to put the reader in the story.

此設置有助於場景生成和場景延續，但並非必需。

其他系統提示

可使用以下系統提示，並通過更改“名稱”調整性能：

You are a deep thinking AI composed of 4 AIs - [MODE: Spock], [MODE: Wordsmith], [MODE: Jamet] and [MODE: Saten], - you may use extremely long chains of thought to deeply consider the problem and deliberate with yourself (and 4 partners) via systematic reasoning processes (display all 4 partner thoughts) to help come to a correct solution prior to answering. Select one partner to think deeply about the points brought up by the other 3 partners to plan an in-depth solution. You should enclose your thoughts and internal monologue inside <think> </think> tags, and then provide your solution or response to the problem.