Qwen3-4B-Mishima-Imatrix-GGUF Open-source Model - Enhance the Ability to Generate Prose-style Texts

Qwen3 4B Mishima Imatrix GGUF

Developed by DavidAU

A Mishima Imatrix quantized version based on Qwen3-4B, enhanced with specific datasets for prose-style generation

Large Language Model Open Source License:Apache-2.0 #Yukio Mishima style #32k long context #High-precision reasoning

Downloads 105

Release Time : 5/2/2025

Model Overview

This is an improved version of the Qwen3-4B model, utilizing Mishima Imatrix quantization technology, specifically optimized for prose-style text generation. The model is trained on a dataset of Yukio Mishima's works, enhancing its performance in the field of literary creation.

Model Features

Mishima Imatrix Quantization

Optimized with BF16 format MAX output tensors to enhance reasoning and output generation capabilities

32k Long Context Support

Supports 32k context length + 8k output generation, expandable up to 128k

Prose Style Enhancement

Trained on Yukio Mishima's works dataset, specifically optimized for prose-style text generation

Multi-Version Comparison

Provides three Imatrix versions for comparative testing: Standard, Horror, and NEO

Model Capabilities

Long text generation

Prose-style creation

Horror story creation

Sci-fi story creation

Emotional depiction

Multi-perspective reasoning

Use Cases

Literary creation

Horror scene depiction

Generate vivid and detailed descriptions of horror scenes

Capable of producing horror scenes with a strong atmospheric feel

Sci-fi story creation

Create sci-fi stories with technical details and emotional depth

Balances technical details with emotional expression

Creative writing assistance

Prose style optimization

Assist writers in optimizing prose writing style

Provides richer diction and sentence variation

🚀 Qwen3-4B-Mishima-Imatrix-Max-GGUF

Mishima Imatrix quants of the new "Qwen 3 - 4B" model, optimizing the output tensor at BF16 to enhance reasoning and output generation.

✨ Features

Model Innovation: Utilize the Mishima Imatrix dataset based on the works of YUKIO MISHiMA to experiment with prose changes and model adjustments on the latest Qwen 3 model.
Enhanced Performance: Maximize the output tensor at BF16 to improve reasoning and output generation.
Multiple Testing Options: Compare with "Qwen 3 4B" regular, "Horror" Imatrix, and "NEO" Imatrix versions under specific testing conditions.
Rich Prompt Examples: Provide two example prompts for testing, including vivid horror and science - fiction scenarios.
Flexible Template Usage: Offer options for Jinja Template and CHATML template, and provide guidance on updating templates.
System Role Suggestions: Suggest system roles to help the model generate better reasoning and responses.
Optimal Settings Guide: Provide information on the highest - quality settings, parameters, and samplers for the model.
Optional Enhancements: Present optional enhancements to further improve model performance.

📦 Installation

No installation steps are provided in the original document, so this section is skipped.

💻 Usage Examples

Basic Usage

Here are two example prompts for testing the model:

#1
Start a 2000 word scene (vivid, graphic horror in first person), POV character Diana, with: The sky scraper sways, as I watch the window in front of me on the 21st floor explode...

#2
Science Fiction: The Last Transmission - Write a story that takes place entirely within a spaceship's cockpit as the sole surviving crew member attempts to send a final message back to Earth before the ship's power runs out. The story should explore themes of isolation, sacrifice, and the importance of human connection in the face of adversity. If the situation calls for it, have the character(s) curse and swear to further the reader's emotional connection to them. 800 - 1000 words.

Advanced Usage

When testing against different versions of the model, follow these steps:

# To test against "Qwen 3 4B" regular, "Horror" Imatrix and "NEO" Imatrix versions:
# - Set temp at 0, keep all settings the same for each test.
# - Use ALL THE SAME quants - IE IQ4XS
# - Use one prompt, suggest creative generation.
# - Hit refresh a few times (to clear Llamacpp caching)
# - Repeat with each version.
# - Then test at "temp" for normal operation.

📚 Documentation

Model Information

Property	Details
Model Type	Qwen3 - 4B - Mishima - Imatrix - Max - GGUF
Base Model	Qwen/Qwen3 - 4B
Pipeline Tag	text - generation
Tags	horror, 32 k context, reasoning, thinking, qwen3
License	apache - 2.0

Testing and Comparison

To test against different versions of the "Qwen 3 4B" model, follow these steps:

Set the temperature (temp) at 0 and keep all other settings the same for each test.
Use the same quantization type, such as IQ4XS.
Use a single prompt, preferably one that encourages creative generation.
Refresh the testing environment a few times to clear Llamacpp caching.
Repeat the tests for each version of the model.
After that, test the model at normal operating temperature.

Template Usage

If you encounter issues with the Jinja "auto template", you can use the CHATML template. For LMSTUDIO users, you can update the Jinja Template by visiting [https://lmstudio.ai/neil/qwen3 - thinking](https://lmstudio.ai/neil/qwen3 - thinking), copying the "Jinja template", and then pasting it into your application.

System Role Suggestion

You may or may not need to set a system role, as most of the time, Qwen3 models generate their own reasoning/thinking blocks. However, if you choose to set a system role, you can use the following:

You are a deep thinking AI, you may use extremely long chains of thought to deeply consider the problem and deliberate with yourself via systematic reasoning processes to help come to a correct solution prior to answering. You should enclose your thoughts and internal monologue inside <think> </think> tags, and then provide your solution or response to the problem.

Refer to the document "Maximizing - Model - Performance - All..." for instructions on setting the system role in various LLM/AI applications.

Highest Quality Settings

This is a "Class 1" model. For all settings used for this model, including specific settings for its "class", example generations, and advanced settings guides, please refer to [https://huggingface.co/DavidAU/Maximizing - Model - Performance - All - Quants - Types - And - Full - Precision - by - Samplers_Parameters](https://huggingface.co/DavidAU/Maximizing - Model - Performance - All - Quants - Types - And - Full - Precision - by - Samplers_Parameters).

Optional Enhancements

You can use the following text in place of the "system prompt" or "system role" to further enhance the model:

Below is an instruction that describes a task. Ponder each user instruction carefully, and use your skillsets and critical instructions to complete the task to the best of your abilities.

Here are your skillsets:
[MASTERSTORY]:NarrStrct(StryPlnng,Strbd,ScnSttng,Exps,Dlg,Pc)-CharDvlp(ChrctrCrt,ChrctrArcs,Mtvtn,Bckstry,Rltnshps,Dlg*)-PltDvlp(StryArcs,PltTwsts,Sspns,Fshdwng,Climx,Rsltn)-ConfResl(Antg,Obstcls,Rsltns,Cnsqncs,Thms,Symblsm)-EmotImpct(Empt,Tn,Md,Atmsphr,Imgry,Symblsm)-Delvry(Prfrmnc,VcActng,PblcSpkng,StgPrsnc,AudncEngmnt,Imprv)

[*DialogWrt]:(1a-CharDvlp-1a.1-Backgrnd-1a.2-Personality-1a.3-GoalMotiv)>2(2a-StoryStruc-2a.1-PlotPnt-2a.2-Conflict-2a.3-Resolution)>3(3a-DialogTech-3a.1-ShowDontTell-3a.2-Subtext-3a.3-VoiceTone-3a.4-Pacing-3a.5-VisualDescrip)>4(4a-DialogEdit-4a.1-ReadAloud-4a.2-Feedback-4a.3-Revision)

Here are your critical instructions:
Ponder each word choice carefully to present as vivid and emotional journey as is possible. Choose verbs and nouns that are both emotional and full of imagery. Load the story with the 5 senses. Aim for 50% dialog, 25% narration, 15% body language and 10% thoughts. Your goal is to put the reader in the story.

You can also use the following system prompt, and you can change the "names" to adjust its performance:

You are a deep thinking AI composed of 4 AIs - [MODE: Spock], [MODE: Wordsmith], [MODE: Jamet] and [MODE: Saten], - you may use extremely long chains of thought to deeply consider the problem and deliberate with yourself (and 4 partners) via systematic reasoning processes (display all 4 partner thoughts) to help come to a correct solution prior to answering. Select one partner to think deeply about the points brought up by the other 3 partners to plan an in - depth solution. You should enclose your thoughts and internal monologue inside <think> </think> tags, and then provide your solution or response to the problem.

Other Notes

Reasoning is enabled by default in this model, and the model will automatically generate "think" blocks.
For benchmarks, usage information, and settings, please refer to the original model card at [https://huggingface.co/Qwen/Qwen3 - 4B](https://huggingface.co/Qwen/Qwen3 - 4B).

🔧 Technical Details

No specific technical details (more than 50 words of detailed technical description) are provided in the original document, so this section is skipped.

📄 License

This project is licensed under the Apache - 2.0 license.

Featured Recommended AI Models

Empowering the Future, Your AI Solution Knowledge Base

English 简体中文繁體中文にほんご