Qwen3 32B 128k NEO Imatrix Max GGUF

Developed by DavidAU
This is the NEO Imatrix quantized version of the Qwen3-32B model, with the output tensor kept at BF16 precision for improved inference/generation quality, supporting a 128k context length.
Downloads: 1,437
Released: 5/2/2025

Model Overview

A quantized build of Qwen3-32B, optimized for reasoning and text generation, and particularly suited to creative writing and long-form generation tasks.

Model Features

128k ultra-long context
Supports context lengths up to 128k tokens, ideal for handling long documents and complex narratives.
NEO Imatrix quantization
Keeps the output tensor at BF16 precision, enhancing reasoning and generation quality.
Deep reasoning capability
Built-in thinking module capable of generating detailed reasoning processes and inner monologues.
Creative writing optimization
Excels in creative writing scenarios such as horror and science fiction.
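As a rough sketch of how a quant like this is typically run, the example below uses llama.cpp's `llama-cli`. The GGUF filename and prompt are illustrative placeholders, not the actual release filename; `-c 131072` requests the full 128k context window, which also requires substantial RAM/VRAM.

```shell
# Illustrative invocation with llama.cpp; the model filename is a
# placeholder for whichever quant file you downloaded.
# -c sets the context window in tokens (128k = 131072),
# -n caps the number of generated tokens.
./llama-cli \
  -m ./Qwen3-32B-128k-NEO-Imatrix-Max.Q4_K_M.gguf \
  -c 131072 \
  -n 512 \
  -p "Write the opening of a horror story set on a derelict ship."
```

In practice a smaller `-c` value is often used unless a long document genuinely needs the full window, since KV-cache memory grows with context size.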

Model Capabilities

Text generation
Long-text processing
Creative writing
Reasoning analysis
Dialogue generation

Use Cases

Creative writing
Horror story generation
Generates horror stories with emotional tension and atmospheric depth.
The example story 'The Last Transmission' demonstrates strong emotional impact and narrative skill.
Science fiction narrative
Creates complex sci-fi scenarios and character dialogues.
Capable of constructing complete spaceship scenes and character psychological activities.
Reasoning analysis
Complex problem reasoning
Analyzes complex problems through chain-of-thought reasoning and provides systematic solutions.
The model can emit its detailed thought process, delimited by the [[[Thinking Start]]] marker shown in the example output.
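When the reasoning block should be hidden from end users, it can be stripped in post-processing. A minimal sketch: the `[[[Thinking Start]]]` marker appears in the model card's example, while the matching `[[[Thinking End]]]` closing marker and the sample text are assumptions here.

```python
import re

# Hypothetical raw output; the [[[Thinking End]]] closing marker is assumed.
raw = (
    "[[[Thinking Start]]]\n"
    "The user wants a horror opening; set the scene on a derelict ship...\n"
    "[[[Thinking End]]]\n"
    "The last transmission crackled through the static."
)

def strip_thinking(text: str) -> str:
    """Remove [[[Thinking Start]]]...[[[Thinking End]]] blocks, keeping the answer."""
    return re.sub(
        r"\[\[\[Thinking Start\]\]\].*?\[\[\[Thinking End\]\]\]\s*",
        "",
        text,
        flags=re.DOTALL,
    ).strip()

print(strip_thinking(raw))
# → The last transmission crackled through the static.
```

The non-greedy `.*?` with `re.DOTALL` keeps the match confined to one thinking block even if several appear in a single response.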