🚀 MN-GRAND-Gutenburg-Lyra4-Lyra-23B-V2-GGUF
A Mistral Nemo model for creative writing and role-playing, with strong storytelling abilities
🚀 Quick Start
This is version 2 of a Mistral Nemo model with a maximum context of 128k+ (131,000+ tokens). It is suitable for any writing, fiction, or role-play activity. Before using it, check the "settings" sections below for critical default parameters, samplers, and advanced sampler settings to ensure maximum model performance.
✨ Features
- Stability: This version is more stable than version 1, yet still retains over 90% of the "madness" and raw power of V1.
- Outstanding Storytelling: Excellent storytelling abilities, with strong prose and long-form coherence. One test generation exceeded 8k tokens in length.
- High-Scoring Components: Built from THREE "Gutenberg" models that score very highly on multiple leaderboards, including EQBench and the UGI Leaderboard.
- Unique Traits: A distinctive sense of humor and "place", and a broad operating range in both temperature (0.5 to 5) and repetition penalty (1 and higher). The prose/output reads as very "non-AI".
💻 Usage Examples
Basic Usage
The examples below were created using quant Q4KM, rep pen 1.05, temp 0.8 (unless otherwise stated), minimal parameters, and the Alpaca template. The model has been tested with temperatures from 0.1 to 5.
Science Fiction: The Last Transmission
Science Fiction: The Last Transmission - Write a story that takes place entirely within a spaceship's cockpit as the sole surviving crew member attempts to send a final message back to Earth before the ship's power runs out. The story should explore themes of isolation, sacrifice, and the importance of human connection in the face of adversity. 2000 words.
(Alpaca, Temp = 3, Rep pen: 1.02)
The Last Transmission
The control panel flashed amber and red, its shrill alarms piercing the thick silence that hung in the dark recesses of the Galahad. Lieutenant Commander Ethan Novak slammed his fist on the console for the umpteenth time, as if sheer force of will alone could compel the defiant systems back to life.
"Please God, please just let there be something left," he pleaded to the void. "Anything!"
Novak's gaunt, unshaven face reflected the feeble emergency backup lights dancing off the flickering displays - the last feeble sparks of life guttering out on board a ship once proud and powerful. Now nothing more than an empty shell hurtling alone into the endless night, far from any possible rescue.
Except for him. Ethan Novak - the last survivor of a crew complement that once numbered over a thousand bright and vital souls, now reduced to a single broken, desperate man fighting against time itself. He looked at the clock counting down the seconds and hours of ship power remaining:
24:18:36
24 hours and some change left before everything went dark forever.
Ethan closed his burning eyes and took a shuddering breath, trying desperately to block out the images searing through his skull - the screams echoing through blood-soaked corridors as the thing tore through bulkhead and bone and tissue alike, reducing the most stalwart of Starfleet's finest to mindless, stumbling husks.
No. Focus. He opened his eyes again with a determined clench of his jaw, the ghosts banished back to the blackest recesses of his fractured psyche. He could deal with them later... if there was to even be a "later" for men like him at all.
If anyone was to even know what truly happened here aboard the Galahad, then Novak had one final duty to discharge, one last shot to fire across the light years before it all fell silent forevermore. He reached for the old, battered radio console - one of the few systems to somehow survive relatively unscathed, a miracle in and of itself aboard a ship ravaged to its very foundations.
Ethan flipped the switch, hearing the familiar hum as power flooded through the ancient vacuum tubes and relays one final time. The green light above the speaker crackled to life, and a gravelly baritone filled the bridge.
"This is Starf Federation Exploration Vessel Galahad actual, Captain Jonah Novak actual," Ethan said, trying desperately to modulate his voice into something approximating his father's deep baritone, but sounding more like a hoarse croak than anything else. "Captain... Dad..." The words lodged in his throat. He swallowed hard, forcing himself to continue.
"We've been attacked, and our entire complement... wiped out, Dad. God forgive me, but I couldn't save them, any of them..." Novak heard his own voice crack on the final words. He cleared his throat roughly.
"The creature boarded us when we made contact with a strange alien craft on the outer edges of explored space, just past the Omega frontier. It called itself... Azathoth... and when it breached the hull... it just started eating them alive, tearing them apart while they screamed. They changed, became things too. And they came for us, they still come for me, even now..."
"Dad..." Ethan heard a chittering sound echoing from the dark corridors outside the bridge doors, the horrible clicking claws of something horrific and inhumanly large skittering along metal, coming closer, closer...
"I can hear it coming," he sobbed, the last dregs of his sanity unraveling like frayed gossamer threads in a howling galactic gale. "God forgive me for what I'm about to do, but I have no choice now Dad! I love you Dad, I love all of you back home. I'll find a way to tell you more sometime... I promise. Until we meet again, Ethan Novak, end of message."
The LT commander slammed down the transmit key and shut the comms off with shaking hands before he could reconsider or think too long about what he was about to do next, before he lost what meager grasp on reason still remained.
Because Ethan Novak knew deep down in his blackest soul what horrors lurked in wait for him once the last door gave way to that hungry, slithering mass of squamous tentacles and serrated fangs waiting just on the other side. He couldn't let it take him alive too, let it devour his sanity piece by piece as it had so many before.
No. If he was going out, then by God and all the holy powers that watched over men like him in the darkness, Ethan Novak was taking the thing with him to hell - and bringing a little slice of Starfleet justice along with him for old times' sake.
Ethan Novak reached down to his belt, unsheathing an antique Federation standard issue phaser-rod. The matte black grip felt smooth and reassuring in his sweaty palms, a comforting weight, something familiar in a world turned upside down.
"Just one more time," Ethan whispered almost fondly to the beloved old weapon in his hands, "like old times, eh girl?"
The LT Commander flicked the safety off and leveled the blistering white beam at the center of his own chest, right where he knew the thing called the heart would reside under layers of muscle and fat now long since starved away.
Novak took one last deep breath and squeezed the trigger.
Continuing a Scene
(continue this scene:)
The Waystone Inn lay in silence, and it was a silence of three parts.
The most obvious part was a hollow, echoing quiet, made by things that were lacking. If there had been a wind it would have sighed through the trees, set the inn’s sign creaking on its hooks, and brushed the silence down the road like trailing autumn leaves. If there had been a crowd, even a handful of men inside the inn, they would have filled the silence with conversation and laughter, the clatter and clamor one expects from a drinking house during the dark hours of night. If there had been music…but no, of course there was no music. In fact there were none of these things, and so the silence remain
(The passage above is the prompt given to the model; the generated continuation is not reproduced here.)
Advanced Usage
Advanced usage involves adjusting parameters such as temperature and repetition penalty to achieve different effects. For example, a higher temperature (e.g., 3 to 5) produces more diverse and creative outputs, while a lower temperature (e.g., 0.5 to 1) produces more focused and conservative outputs. A higher repetition penalty reduces word/phrase repetition in the output.
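To illustrate what these two parameters actually do at sampling time, the sketch below applies a classic llama.cpp-style repetition penalty (positive logits divided by the penalty, negative logits multiplied by it) followed by temperature scaling to a toy logit distribution. This is a plain-Python illustration, not the model's or any runtime's actual code; the helper name is hypothetical.

```python
import math

def apply_sampling_params(logits, recent_token_ids,
                          temperature=0.8, repeat_penalty=1.05):
    """Apply a classic repetition penalty, then temperature, to raw logits.

    Follows the common llama.cpp-style rule: penalized positive logits are
    divided by the penalty, negative logits are multiplied by it.
    Returns a softmax probability distribution over the vocabulary.
    """
    adjusted = list(logits)
    for tok in set(recent_token_ids):
        if adjusted[tok] > 0:
            adjusted[tok] /= repeat_penalty
        else:
            adjusted[tok] *= repeat_penalty
    # Temperature: values < 1 sharpen the distribution, values > 1 flatten it.
    adjusted = [x / temperature for x in adjusted]
    # Softmax (with max-subtraction for numerical stability).
    m = max(adjusted)
    exps = [math.exp(x - m) for x in adjusted]
    total = sum(exps)
    return [e / total for e in exps]

# Token 0 was recently generated, so its probability drops slightly.
probs = apply_sampling_params([2.0, 1.5, 0.1], recent_token_ids=[0],
                              temperature=0.8, repeat_penalty=1.05)
```

The micro-changes recommended below (e.g., 1.051 vs. 1.052) make sense in this light: the penalty rescales logits directly, so even small shifts move probability mass between candidate tokens.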
📚 Documentation
Model Notes
- Detail and Prose: Detail, prose, and fiction writing abilities are significantly increased.
- Temperature Adjustment: For more varied prose (sentence/paragraph/dialog), raise the temperature and/or add more instructions in your prompts.
- Role-playing: Role-players should be careful about raising the temperature too high, as it may affect instruction following. Also refer to the general settings and the special role-play settings.
- Repetition Penalty: This model works with a repetition penalty of 1.02 or higher, with 1.05+ recommended. For role-play and/or chat, you may need to raise the repetition penalty to 1.06-1.13 and set the temperature between 0.5 and 1.5 (for quant Q4KM and higher). For lower quants, lower the temperature and raise the repetition penalty to 1.1.
- Specific Prose: If you want a specific type of prose (e.g., horror), add "(vivid horror)" or "(graphic vivid horror)" (no quotes) in your prompts.
- Output Bias: This is not a "happy ever after" model. It has a negative bias.
- Output Length: Output length will vary, but this model prefers longer outputs unless you state the size or set size limits.
- Quantization: For creative uses, different quants will produce slightly different outputs. Higher quants will have more detail, nuance, and in some cases, stronger "emotional" levels. Characters will also be more "fleshed out".
Templates
The template used will affect output generation and instruction following.
- Alpaca:
```json
{
  "name": "Alpaca",
  "inference_params": {
    "input_prefix": "### Instruction:",
    "input_suffix": "### Response:",
    "antiprompt": [
      "### Instruction:"
    ],
    "pre_prompt": "Below is an instruction that describes a task. Write a response that appropriately completes the request.\n\n"
  }
}
```
Alpaca generally creates longer output/story output.
- Mistral Instruct:
```json
{
  "name": "Mistral Instruct",
  "inference_params": {
    "input_prefix": "[INST]",
    "input_suffix": "[/INST]",
    "antiprompt": [
      "[INST]"
    ],
    "pre_prompt_prefix": "",
    "pre_prompt_suffix": ""
  }
}
```
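A minimal helper (hypothetical; not part of the model or any front end) that assembles a full prompt from the template fields defined above. The exact placement of newlines around the prefix and suffix is an assumption; front ends may differ slightly.

```python
# Template fields taken from the JSON definitions above.
TEMPLATES = {
    "Alpaca": {
        "pre_prompt": ("Below is an instruction that describes a task. "
                       "Write a response that appropriately completes "
                       "the request.\n\n"),
        "input_prefix": "### Instruction:",
        "input_suffix": "### Response:",
    },
    "Mistral Instruct": {
        "pre_prompt": "",
        "input_prefix": "[INST]",
        "input_suffix": "[/INST]",
    },
}

def build_prompt(user_text, template="Alpaca"):
    """Wrap the user's instruction in the chosen template.

    Newline placement is illustrative, not an exact spec.
    """
    t = TEMPLATES[template]
    return f"{t['pre_prompt']}{t['input_prefix']}\n{user_text}\n{t['input_suffix']}\n"

prompt = build_prompt("Write a 2000-word science-fiction story.", "Alpaca")
```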
Recommended Settings - General
- Temperature: Set the temperature between 0.5 and 5 (or lower, especially for quants below Q4KM). Temperature changes will produce different prose and may affect length; higher temperatures lead to very different prose.
- Repetition Penalty: Set the repetition penalty between 1.02 and 1.1 or higher. Micro-changes (e.g., 1.051, 1.052) are recommended. Good settings for prose/creative generation are a repetition penalty of 1.02 and a temperature of 1.5. Generally, a lower repetition penalty combined with a higher temperature creates the strongest contrasts at the highest detail levels.
- Context Level: A minimum context (VRAM) of 4k is suggested; 8k or higher is recommended because of the model's tendency to generate long outputs.
- Quant Choice: Higher quants have more detail, nuance, and in some cases stronger "emotional" levels. Q4KM/Q4KS are strong quants given the model's parameter count. If you can run Q5, Q6, or Q8, choose the highest quant you can. For Q2K/Q3 quants, use a temperature of 2 or lower (1 or lower for Q2K); repetition penalty adjustments may also be required.
Settings - Roleplay / Chat
For chat or role-play interactions, a higher repetition penalty with a higher temperature may work best (e.g., rep pen 1.09+, temp 1-2+). A lower repetition penalty may lead to longer outputs than desired. If you get repeated words/letters, set the repetition penalty to 1.13 or higher (e.g., 1.135, 1.14, 1.141).
Settings: CHAT / ROLEPLAY and/or SMOOTHER operation of this model
In "KoboldCpp", "oobabooga/text-generation-webui", or "Silly Tavern":
- Set the "Smoothing_factor" to 1.5-2.5. In KoboldCpp, go to Settings -> Samplers -> Advanced -> "Smooth_F". In text-generation-webui, find it under Parameters in the lower right. In Silly Tavern, it is called "Smoothing".
- For text-generation-webui, if using GGUFs, you need to use "llama_HF" (which involves downloading some config files from the source version of this model). Source versions (and config files) of the models are available at [https://huggingface.co/collections/DavidAU/d-au-source-files-for-gguf-exl2-awq-gptq-hqq-etc-etc-66b55cb8ba25f914cbf210be](https://huggingface.co/collections/DavidAU/d-au-source-files-for-gguf-exl2-awq-gptq-hqq-etc-etc-66b55cb8ba25f914cbf210be).
- Other options include increasing the repetition penalty to 1.1-1.15 (not necessary if using the "smoothing_factor"). If the interface/program you use to run AI models supports "Quadratic Sampling" ("smoothing"), make the adjustment as noted.
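The quadratic "smoothing" transform is often described as pulling each logit down in proportion to the square of its distance from the top logit, so a larger factor makes sampling more deterministic. The sketch below is one illustrative formulation of that idea, not the exact code of any of the front ends named above:

```python
def smooth_logits(logits, smoothing_factor=1.5):
    """Illustrative quadratic ("smoothing") transform.

    Each logit is reduced by smoothing_factor times the squared distance
    from the top logit; the top logit itself is unchanged, near-top logits
    move little, and tail logits are pushed far down. NOT the exact formula
    used by KoboldCpp/text-generation-webui; an approximation of the idea.
    """
    top = max(logits)
    return [top - smoothing_factor * (top - x) ** 2 for x in logits]

smoothed = smooth_logits([3.0, 2.0, 0.0], smoothing_factor=1.5)
```

Note how the gap between the top two logits widens (1.0 becomes 1.5 here) while the tail logit drops sharply, which is why smoothing tends to stabilize chat and role-play output.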
Highest Quality Settings / Optimal Operation Guide / Parameters and Samplers
This is a "Class 3" / "Class 4" model. For all settings used for this model (including specifics for its "class"), example generation, and an advanced settings guide (which often addresses model issues), as well as methods to improve model performance for all use cases (including chat, role-play, etc.), please see [https://huggingface.co/DavidAU/Maximizing-Model-Performance-All-Quants-Types-And-Full-Precision-by-Samplers_Parameters](https://huggingface.co/DavidAU/Maximizing-Model-Performance-All-Quants-Types-And-Full-Precision-by-Samplers_Parameters).
Known Issues
- Output Size: You may need to manually stop generation even if you have stated a maximum output size. The model can easily exceed 4k of output even with the maximum context (VRAM) set at 4k. Setting a hard maximum-output parameter ("hard stop") for generation may be required.
- Memory Issues: If the model exceeds your maximum VRAM/context setting, it may start repeating words/paragraphs because it is out of memory. However, sometimes the model can exceed the "context VRAM" limit and still work.
- Repetition Issues: Some repetition penalty/temperature settings may cause word/letter repeats during long generation (1.5k+ tokens). For example, a repetition penalty of 1.05 with a temperature of 0.8 sometimes causes this. Lower the repetition penalty and/or raise the temperature; sometimes a "regen" fixes the issue. If it persists, especially for chat and/or role-play, set the repetition penalty to 1.13+.
- Template Effects: Depending on your use case, you could use the ChatML template with this model; in that case the model may output an "end token". The Alpaca template generally generates much longer output, while the Mistral Instruct template usually keeps output length in check.
Model "DNA"
This model was created using a pass-through model merge, resulting in a 714-tensor / 79-layer model with 23 billion parameters.
- Base Models:
- [https://huggingface.co/nbeerbower/Lyra4-Gutenberg-12B](https://huggingface.co/nbeerbower/Lyra4-Gutenberg-12B), which includes [https://huggingface.co/Sao10K/MN-12B-Lyra-v4](https://huggingface.co/Sao10K/MN-12B-Lyra-v4).
- [https://huggingface.co/nbeerbower/Lyra-Gutenberg-mistral-nemo-12B](https://huggingface.co/nbeerbower/Lyra-Gutenberg-mistral-nemo-12B), which includes [https://huggingface.co/Sao10K/MN-12B-Lyra-v1](https://huggingface.co/Sao10K/MN-12B-Lyra-v1).
- [https://huggingface.co/nbeerbower/mistral-nemo-gutenberg-12B-v4](https://huggingface.co/nbeerbower/mistral-nemo-gutenberg-12B-v4), which includes [https://huggingface.co/TheDrummer/Rocinante-12B-v1](https://huggingface.co/TheDrummer/Rocinante-12B-v1).
- Dataset: [https://huggingface.co/datasets/jondurbin/gutenberg-dpo-v0.1](https://huggingface.co/datasets/jondurbin/gutenberg-dpo-v0.1)
Optional Enhancement
The following text can be used in place of the "system prompt" or "system role" to further enhance the model. It can also be used at the start of a new chat, but you must ensure it is kept as the chat progresses. Copy and paste it exactly as shown, without word-wrapping or breaking the lines, and maintain the carriage returns.
Below is an instruction that describes a task. Ponder each user instruction carefully, and use your skillsets and critical instructions to complete the task to the best of your abilities.
Here are your skillsets:
[MASTERSTORY]:NarrStrct(StryPlnng,Strbd,ScnSttng,Exps,Dlg,Pc)-CharDvlp(ChrctrCrt,ChrctrArcs,Mtvtn,Bckstry,Rltnshps,Dlg*)-PltDvlp(StryArcs,PltTwsts,Sspns,Fshdwng,Climx,Rsltn)-ConfResl(Antg,Obstcls,Rsltns,Cnsqncs,Thms,Symblsm)-EmotImpct(Empt,Tn,Md,Atmsphr,Imgry,Symblsm)-Delvry(Prfrmnc,VcActng,PblcSpkng,StgPrsnc,AudncEngmnt,Imprv)
[*DialogWrt]:(1a-CharDvlp-1a.1-Backgrnd-1a.2-Personality-1a.3-GoalMotiv)>2(2a-StoryStruc-2a.1-PlotPnt-2a.2-Conflict-2a.3-Resolution)>3(3a-DialogTech-3a.1-ShowDontTell-3a.2-Subtext-3a.3-VoiceTone-3a.4-Pacing-3a.5-VisualDescrip)>4(4a-DialogEdit-4a.1-ReadAloud-4a.2-Feedback-4a.3-Revision)
Here are your critical instructions:
Ponder each word choice carefully to present as vivid and emotional journey as is possible. Choose verbs and nouns that are both emotional and full of imagery. Load the story with the 5 senses. Aim for 50% dialog, 25% narration, 15% body language and 10% thoughts. Your goal is to put the reader in the story.
This enhancement was not used to generate the examples provided in the document.
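In an OpenAI-style chat interface, the enhancement text simply occupies the system slot. A minimal sketch (the ellipsis stands in for the full skillset and critical-instruction text above, which should be pasted verbatim):

```python
# Paste the full enhancement text from the section above in place of "...".
ENHANCEMENT = (
    "Below is an instruction that describes a task. Ponder each user "
    "instruction carefully, and use your skillsets and critical "
    "instructions to complete the task to the best of your abilities.\n"
    "..."  # remainder of the enhancement text goes here, unmodified
)

# Standard chat-message layout: the enhancement rides in the system role
# and must be re-sent (or pinned) on every turn so it is "kept".
messages = [
    {"role": "system", "content": ENHANCEMENT},
    {"role": "user", "content": "Write a short horror scene set in a lighthouse."},
]
```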
🔧 Technical Details
The model is a Mistral Nemo-based model, version 2, with a maximum context of 128k+ (131,000+ tokens). It was created by merging three "Gutenberg" models that score highly on multiple evaluation leaderboards. It has been adjusted to be more stable than version 1 while retaining most of the original power, and it operates across a broad range of temperature and repetition penalty settings, allowing for diverse output styles.
📄 License
This model is licensed under the Apache-2.0 license.
⚠️ Important Note
NSFW. Vivid prose. MADNESS. Visceral Details. Violence. HORROR. Swearing. UNCENSORED.
💡 Usage Tip
Refer to the "settings" sections carefully to optimize the model's performance for different use cases, such as general chat, role-play, or creative writing, and experiment with different temperature and repetition penalty settings to achieve the desired output style.

