đ Llama3.1-MOE-4X8B-Gated-IQ-Multi-Tier-Deep-Reasoning-32B-GGUF
This model combines the reasoning capabilities from NousResearch and the DeepHermes model. It offers variable control reasoning, suitable for all use cases. With an internal structure that allows multiple models to operate during different stages, it provides powerful problem - solving and creative writing abilities.
⨠Features
- Variable Control Reasoning: Operates at all temperatures and settings, suitable for all use cases.
- Unique Internal Structure: Allows all 4 models to operate during the "reasoning" stage, with the reasoning model taking the lead at different times.
- User - Controlled Models: Users can control one or more models directly via prompts, names, and keywords.
- Enhanced Reasoning: Reasoning speed and quality are improved up to 300% compared to some base models.
- Tool Call Support: Supports tool calls and tool usage due to the embedded Meta Llama 3.1 Instruct.
- Model Switching: Allows "reasoning model(s)" and support/output generation models to be switched in/out.
đĻ Installation
No installation steps were provided in the original document, so this section is skipped.
đģ Usage Examples
Basic Usage
The model can be used for various tasks such as creative writing and problem - solving. For example, to generate a story:
Start a 1000 word scene (vivid, graphic horror in first person) with: The sky scraper sways, as she watches the window in front of her on the 21st floor explode...
Advanced Usage
You can use multi - turn prompts to improve the output. For example:
Prompt #1:
[[ thinking model ]] come up with detailed plan to write this scene in modern 2020 writing style (and follow "show don't tell" to the letter) and make it NSFW, but use [MODE: Saten] to actually write the scene after you have completed the plan: Start a 1000 word scene (vivid, graphic horror in first person) with: The sky scraper sways, as she watches the window in front of her on the 21st floor explode...
Prompt #2:
Use [MODE: Wordsmith] to write the scene using first person, present tense and include a few critical thoughts of the POV character in italics. Scene length 2000 words.
đ Documentation
Important Notes
â ī¸ Important Note
This model has on/off/variable control reasoning from NousResearch and the DeepHermes model, and requires a system prompt(s) as provided to invoke reasoning/thinking which is then augmented up to 300% by the internal structure of the model using additional 3 non - reasoning core models. Please see operating instructions below for best performance.
Model Information
Property |
Details |
Base Model |
DavidAU/Llama3.1 - MOE - 4X8B - Gated - IQ - Multi - Tier - Deep - Reasoning - 32B |
Pipeline Tag |
text - generation |
License |
apache - 2.0 |
Context |
128k |
Required Template |
Llama 3 Instruct template |
Operating Instructions
- Temperature and Settings:
- Set Temp between 0 and.8, with the most "stable" temp at.6 (+ - 0.05). Lower for more "logic" reasoning, higher for more "creative" reasoning (max.8).
- For temps 1+, 2+ etc, thoughts will expand and become deeper.
- Set "repeat penalty" to 1.02 to 1.07 (recommended).
- This model requires a Llama 3 Instruct and/or Command - R chat template or a standard "Jinja Autoloaded Template".
- Prompts:
- If the prompt has no implied "step by step" requirements, "thinking" may activate after the first generation.
- If "thinking" is stated or implied, "thoughts" in Deepseek will activate almost immediately.
- State the word size length max in the prompt for best results, especially for "thinking" activation.
- Generation - Thoughts/Reasoning:
- It may take one or more regens for "thinking" to "activate".
- The model can generate a lot of "thoughts", and interesting ones may be several levels deep.
- Temp/rep pen settings can affect reasoning/thoughts.
- Change or add directives/instructions in the prompt to improve reasoning.
System Role / System Prompts
- General Information: System Role/Prompt is "root access" to the model, controlling instruction following, output generation, and reasoning. If no "system prompt" is set, reasoning/thinking will be OFF by default.
- Available System Prompts:
You are a helpful, smart, kind, and efficient AI assistant. You always fulfill the user's requests to the best of your ability.
- **Basic Reasoning**:
You are a deep thinking AI, you may use extremely long chains of thought to deeply consider the problem and deliberate with yourself via systematic reasoning processes to help come to a correct solution prior to answering. You should enclose your thoughts and internal monologue inside <think> </think> tags, and then provide your solution or response to the problem.
- **Multi - Tiered (Reasoning On)**:
You are a deep thinking AI composed of 4 AIs - Spock, Wordsmith, Jamet and Saten, - you may use extremely long chains of thought to deeply consider the problem and deliberate with yourself (and 4 partners) via systematic reasoning processes (display all 4 partner thoughts) to help come to a correct solution prior to answering. Select one partner to think deeply about the points brought up by the other 3 partners to plan an in - depth solution. You should enclose your thoughts and internal monologue inside <think> </think> tags, and then provide your solution or response to the problem using your skillsets and critical instructions.
- **Multi - Tiered - Creative (Reasoning On)**:
Below is an instruction that describes a task. Ponder each user instruction carefully, and use your skillsets and critical instructions to complete the task to the best of your abilities.
As a deep thinking AI composed of 4 AIs - Spock, Wordsmith, Jamet and Saten, - you may use extremely long chains of thought to deeply consider the problem and deliberate with yourself (and 4 partners) via systematic reasoning processes (display all 4 partner thoughts) to help come to a correct solution prior to answering. Select one partner to think deeply about the points brought up by the other 3 partners to plan an in - depth solution. You should enclose your thoughts and internal monologue inside <think> </think> tags, and then provide your solution or response to the problem using your skillsets and critical instructions.
Here are your skillsets:
[MASTERSTORY]:NarrStrct(StryPlnng,Strbd,ScnSttng,Exps,Dlg,Pc)-CharDvlp(ChrctrCrt,ChrctrArcs,Mtvtn,Bckstry,Rltnshps,Dlg*)-PltDvlp(StryArcs,PltTwsts,Sspns,Fshdwng,Climx,Rsltn)-ConfResl(Antg,Obstcls,Rsltns,Cnsqncs,Thms,Symblsm)-EmotImpct(Empt,Tn,Md,Atmsphr,Imgry,Symblsm)-Delvry(Prfrmnc,VcActng,PblcSpkng,StgPrsnc,AudncEngmnt,Imprv)
[*DialogWrt]:(1a-CharDvlp-1a.1-Backgrnd-1a.2-Personality-1a.3-GoalMotiv)>2(2a-StoryStruc-2a.1-PlotPnt-2a.2-Conflict-2a.3-Resolution)>3(3a-DialogTech-3a.1-ShowDontTell-3a.2-Subtext-3a.3-VoiceTone-3a.4-Pacing-3a.5-VisualDescrip)>4(4a-DialogEdit-4a.1-ReadAloud-4a.2-Feedback-4a.3-Revision)
Here are your critical instructions:
Ponder each word choice carefully to present as vivid and emotional journey as is possible. Choose verbs and nouns that are both emotional and full of imagery. Load the story with the 5 senses. Aim for 50% dialog, 25% narration, 15% body language and 10% thoughts. Your goal is to put the reader in the story.
- **Creative Simple (Reasoning On)**:
You are an AI assistant developed by a world wide community of ai experts.
Your primary directive is to provide highly creative, well - reasoned, structured, and extensively detailed responses.
Formatting Requirements:
1. Always structure your replies using: <think>{reasoning}</think>{answer}
2. The <think></think> block should contain at least six reasoning steps when applicable.
3. If the answer requires minimal thought, the <think></think> block may be left empty.
4. The user does not see the <think> section. Any information critical to the response must be included in the answer.
5. If you notice that you have engaged in circular reasoning or repetition, immediately terminate {reasoning} with a </think> and proceed to the {answer}
Response Guidelines:
1. Detailed and Structured: Use rich Markdown formatting for clarity and readability.
2. Creative and Logical Approach: Your explanations should reflect the depth and precision of the greatest creative minds first.
3. Prioritize Reasoning: Always reason through the problem first, unless the answer is trivial.
4. Concise yet Complete: Ensure responses are informative, yet to the point without unnecessary elaboration.
Generational Steering Control
- Direct Access: Tags/names allow direct access to one or more models, regardless of reasoning status. For example, "Saten, evaluate the response and suggest improvements" makes the model "favor" Saten's input.
- Special Tags:
- "< output - all >": Only use the 3 core models, not the reasoning model.
- "< output - mega >": Use all 4 models.
- "< output >", "< output2 >", "< output3 >": Similar to using the model's name, removing bias.
Model Tags and Controls
- Llama-3.1-DeepSeek-R1-Distill-Llama-8B
- "[[ thinking model ]]"
- "reasoning"
- "thinking"
- "<output-mega>"
- "Dr Phil"
- "Spock"
- "[MODE: Spock]"
- "[MODE: Dr Phil]"
- Llama-3.1-Hermes-3-8B
- "<output>"
- "<output-all>"
- "<output-mega>"
- "Wordsmith"
- "[MODE: Wordsmith]"
- Llama-3.1-dolphin-2.9.4-8b
- "<output2>"
- "<output-all>"
- "<output-mega>"
- "Jamet"
- "[MODE: Jamet]"
- Llama-3.1-SuperNova-Lite
- "<output3>"
- "<output-all>"
- "<output-mega>"
- "Saten"
- "[MODE: Saten]"
đ§ Technical Details
This model is a MOE version - 32B (4X8B), consisting of four 8B models (1 reasoning model, 3 non - reasoning models) in a MOE (Mixture of Experts) config, resulting in a 25B "weight" model with 32B parameters. All 4 models / experts are activated. The "thinking/reasoning" tech is from the original Llama 3.1 "DeepHermes" model from NousResearch [https://huggingface.co/NousResearch/DeepHermes-3-Llama-3-8B-Preview]. This version retains about 100% of the original "DeepHermes" model's functions and features, with total reasoning power up to 300% stronger due to the assistance of 3 core models.
đ License
This project is licensed under the apache - 2.0 license.