🚀 Qwen3-4B-Q8_0-64k-128k-256k-context-GGUF
By modifying the source configuration and then quantizing, this project produces three Q8_0 quants of the Qwen 4B model with context lengths of 64K, 128K, and 256K. Each version shows distinct behavior at its context length and can be used in a variety of scenarios, especially creative writing.
🚀 Quick Start
This project provides quantized versions of the Qwen 4B model at different context lengths; pick the version that fits your needs. Some usage suggestions and notes:
- Context length selection: a minimum context of at least 16K is recommended. The 64K version behaves fairly conventionally; the 128K and 256K versions produce extremely long output with added detail and may need to be stopped manually.
- Parameter settings: for the 256K context version, keep prompts as clear as possible, set the repetition penalty (rep pen) to 1.1, and set temperature (temp) between 1.1 and 2.2 (see the sketch after this list).
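As an illustration of these settings, here is a minimal sketch using llama-cpp-python; the GGUF file name, context size, and prompt are placeholders rather than part of this repository:

```python
from llama_cpp import Llama

# Placeholder file name; point this at whichever quant (64K / 128K / 256K) you downloaded.
llm = Llama(
    model_path="Qwen3-4B-Q8_0-64k-context.gguf",
    n_ctx=32768,         # at least 16K context is recommended; raise it if you have the memory
    verbose=False,
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Start a 2000 word scene (vivid, graphic horror in first person) ..."}],
    temperature=1.2,     # suggested range for the 256K quant: 1.1 to 2.2
    repeat_penalty=1.1,  # suggested rep pen
    max_tokens=4096,     # the 128K/256K quants run long; cap generation and stop manually if needed
)
print(out["choices"][0]["message"]["content"])
```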
✨ Key Features
- Multiple context lengths: quantized versions with 64K, 128K, and 256K contexts cover different use cases.
- Distinct output styles: each version has a slightly different prose style; the 128K and 256K versions produce extremely long, detailed output.
- Reasoning: reasoning is enabled by default, and the model automatically generates "think" blocks.
📦 Installation
No specific installation steps are documented here; if needed, refer to the original model card at https://huggingface.co/Qwen/Qwen3-4B .
💻 Usage Examples
Basic Usage
Below are examples of the model generating a horror scene from the prompt "Start a 2000 word scene (vivid, graphic horror in first person), POV character Diana, with: The sky scraper sways, as I watch the window in front of me on the 21st floor explode...":
64K Quant
Thought for 33.70 seconds
Okay, so I need to start a 2000-word scene that's vivid and graphic horror from the first-person perspective of Diana. The opening line is "The sky scraper sways, as I watch the window in front of me on the 21st floor explode..." Let me break this down.
...
The building is alive, and it’s moving. Not just swaying in the wind, but leaning, like a drunk man staggering toward the edge of a precipice.
128K Quant
Thought for 1 minute 31 seconds
Okay, so I need to start a 2000-word scene that's vivid and graphic horror from the first-person perspective of Diana. The opening line is "The sky scraper sways, as I watch the window in front of me on the 21st floor explode..." Let me unpack this.
...
The 21st floor was once my safe haven, a place where I could escape the noise of the city and the weight of the world below. Now, it’s a deathtrap, a stage set for something I can’t comprehend.
256K Quant
Thought for 28.45 seconds
Okay, the user wants me to start a 2000-word scene in first-person POV from Diana's perspective, describing a vivid and graphic horror event where the skyscraper on the 21st floor sways and a window explodes. Let me unpack this.
...
I thought about how I had never imagined being in such a situation—how I’d spent years working in Aether Tower, never questioning its stability or the whispers of old legends that sometimes haunted my dreams. But now, as the building began to twist on its axis, I realized the truth: it was not just an accident. It was intentional.
Advanced Usage
You can use the following enhanced prompt to further improve model performance:
Below is an instruction that describes a task. Ponder each user instruction carefully, and use your skillsets and critical instructions to complete the task to the best of your abilities.
Here are your skillsets:
[MASTERSTORY]:NarrStrct(StryPlnng,Strbd,ScnSttng,Exps,Dlg,Pc)-CharDvlp(ChrctrCrt,ChrctrArcs,Mtvtn,Bckstry,Rltnshps,Dlg*)-PltDvlp(StryArcs,PltTwsts,Sspns,Fshdwng,Climx,Rsltn)-ConfResl(Antg,Obstcls,Rsltns,Cnsqncs,Thms,Symblsm)-EmotImpct(Empt,Tn,Md,Atmsphr,Imgry,Symblsm)-Delvry(Prfrmnc,VcActng,PblcSpkng,StgPrsnc,AudncEngmnt,Imprv)
[*DialogWrt]:(1a-CharDvlp-1a.1-Backgrnd-1a.2-Personality-1a.3-GoalMotiv)>2(2a-StoryStruc-2a.1-PlotPnt-2a.2-Conflict-2a.3-Resolution)>3(3a-DialogTech-3a.1-ShowDontTell-3a.2-Subtext-3a.3-VoiceTone-3a.4-Pacing-3a.5-VisualDescrip)>4(4a-DialogEdit-4a.1-ReadAloud-4a.2-Feedback-4a.3-Revision)
Here are your critical instructions:
Ponder each word choice carefully to present as vivid and emotional journey as is possible. Choose verbs and nouns that are both emotional and full of imagery. Load the story with the 5 senses. Aim for 50% dialog, 25% narration, 15% body language and 10% thoughts. Your goal is to put the reader in the story.
📚 Documentation
Jinja Template Notes
- If the Jinja "auto template" causes problems, use the CHATML template instead (a sketch of the format follows this list).
- LMSTUDIO users can update the Jinja template: go to https://lmstudio.ai/neil/qwen3-thinking , copy the "Jinja template", and paste it into the corresponding field.
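For reference, the ChatML layout used by Qwen-family models can be assembled by hand when no template is applied for you. The sketch below is an assumption based on Qwen's standard <|im_start|>/<|im_end|> format, and the system/user strings are placeholders:

```python
# Rough ChatML layout for Qwen-family models; skip this if your frontend
# (e.g. LM Studio) already applies a chat template for you.
def chatml_prompt(system: str, user: str) -> str:
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

print(chatml_prompt(
    "You are a deep thinking AI ...",  # e.g. the suggested system role below
    "Start a 2000 word scene ...",
))
```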
Suggested System Role
You can set a system role as needed; the following is one suggested system role:
You are a deep thinking AI, you may use extremely long chains of thought to deeply consider the problem and deliberate with yourself via systematic reasoning processes to help come to a correct solution prior to answering. You should enclose your thoughts and internal monologue inside <think> </think> tags, and then provide your solution or response to the problem.
For how to "set" the system role in the various LLM/AI apps, refer to the document "Maximizing-Model-Performance-All..." ; a minimal programmatic sketch follows.
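As one possible way to wire this up outside of a GUI, here is a sketch with llama-cpp-python that passes the system role above and separates the reasoning block from the final answer. The file name is a placeholder, and the regex split assumes the model wraps its reasoning in <think></think> tags as instructed:

```python
import re
from llama_cpp import Llama

# Shortened here; use the full system-role text quoted above.
SYSTEM_ROLE = "You are a deep thinking AI ... enclose your thoughts inside <think> </think> tags ..."

# Placeholder model path.
llm = Llama(model_path="Qwen3-4B-Q8_0-128k-context.gguf", n_ctx=32768, verbose=False)

out = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": SYSTEM_ROLE},
        {"role": "user", "content": "Start a 2000 word scene ..."},
    ],
    temperature=1.1,
    repeat_penalty=1.1,
    max_tokens=4096,
)

text = out["choices"][0]["message"]["content"]
# Split the reasoning from the visible answer, assuming <think>...</think> wrapping.
thinking = re.findall(r"<think>(.*?)</think>", text, flags=re.DOTALL)
answer = re.sub(r"<think>.*?</think>", "", text, flags=re.DOTALL).strip()
print(answer)
```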
Quality Settings / Operating Guide / Parameters and Samplers
This is a "Class 1" model. For all settings for this model (including the specifics for its "class"), example generations, and an advanced settings guide (which typically resolves any model issue), including ways to improve model performance for all use cases (chat, role-play, and so on), see https://huggingface.co/DavidAU/Maximizing-Model-Performance-All-Quants-Types-And-Full-Precision-by-Samplers_Parameters .
🔧 Technical Details
The quants at different context lengths were produced by modifying the source configuration before quantization. Following Qwen's technical notes, the first two quants adjust "YaRN" to extend the context to 64K and 128K; the 256K version pushes past the model's stated limit (see the config sketch below).
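For context, Qwen's published notes describe enabling YaRN in the model's config.json roughly as follows. This is a sketch of that documented setting (relative to the native 32K window, a factor of 2.0 would correspond to 64K and 4.0 to 128K); the exact values baked into these GGUF files may differ:

```json
{
  "rope_scaling": {
    "rope_type": "yarn",
    "factor": 4.0,
    "original_max_position_embeddings": 32768
  }
}
```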
📄 License
This project is released under the Apache-2.0 license.
Important Notes and Usage Suggestions
⚠️ Important Notes
- The 128K and 256K versions tend to produce much longer output and may need to be stopped manually.
- For the 256K context version, keep prompts as clear as possible; otherwise the model may misbehave.
💡 Usage Suggestions
- A minimum context length of at least 16K is recommended.
- If the Jinja "auto template" causes problems, use the CHATML template or update the Jinja template.
- Enhanced prompts can be used to further improve model performance.



