🚀 Introducing Mermaid-Llama-6.7B-RAG
Powered by 6.7 billion parameters, this model sets a high standard for AI-driven code comprehension and narrative visualization, with a further reduction of hallucinations. That reduction is inspired by https://huggingface.co/jondurbin, creator of the "Context-Obedient" chat template. We are grateful to Jon Durbin, the original RAG pioneer for LLMs, as we stand on the shoulders of giants. Special thanks also go to Eric Hartford for sharing his insights on prompt templates with me; his wisdom helped me develop my own style for my specialized Mermaid models.
Beyond converting input into flow diagrams, this RAG model is proficient at working with formatted knowledge graphs in Mermaid.js syntax.
See more about Mermaid here: https://www.mermaidchart.com

⚠️ Important Note
Over the past two months, I've learned that my models are being used in production. Based on insights into how they're used effectively in business environments, I've tailored this model to the needs of those who've reached out to me. Please enjoy using it; all feedback, especially negative feedback, is welcome. The current bottleneck is a lack of computing resources, which I'll address once I have a job and funds for training. The 4096-token context length is quite limiting for those who want full system diagrams without resorting to aggregation strategies.
✨ Features
- **Code Understanding**
  - Deep understanding of Python's complexities.
  - Generates accurate Mermaid diagram flow charts.
  - Ideal for developers who want to visualize code logic.
- **Storytelling Capabilities**
  - Converts narratives into captivating Mermaid diagrams.
  - Maps character interactions, plot developments, and narrative arcs.
- **Unmatched Performance**
  - Outperforms GPT-4 in generating well-organized Mermaid diagrams.
- **Enhanced Adherence to Context (New)**
  - Incorporates contextual prompts to improve adherence and reduce hallucinations.
  - Supports the airoboros context-obedient format.
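To illustrate the kind of flow chart described above, here is a hand-written example of the sort of diagram the model aims to produce for a small Python function. This is my own illustrative sketch, not actual model output, and the function it describes is hypothetical:

```mermaid
flowchart TD
    A[load_config reads config.json] --> B{file exists?}
    B -- no --> C[return default settings]
    B -- yes --> D[parse JSON]
    D --> E[merge with defaults]
    E --> F[return settings]
```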
🤝 Collaboration
For collaboration opportunities to enhance Mermaid's capabilities, contact troydoesai@gmail.com.
💼 Use Cases
- Retrieval-Augmented Generation (RAG): Creates condensed knowledge graphs that enhance retrieval from vector databases, combining knowledge graphs with context-aware RAG capabilities for better knowledge condensation.
- Code Documentation: Generates automatic visual flow charts from Python code.
- Storyboarding: Creates visually appealing diagrams for storytelling.
- Project Planning: Generates visual project flow maps for effective team communication.
- Learning Python: Assists students in visualizing Python code structures.
- Game Design: Visualizes game storylines for coherent narrative structure.
📊 Dataset Format (New)
To enhance contextual adherence and reduce hallucinations, the dataset follows the format below:
BEGININPUT
BEGINCONTEXT
[key0: value0]
[key1: value1]
ENDCONTEXT
[insert your text blocks here]
ENDINPUT
BEGININSTRUCTION
[insert your instruction(s)]
ENDINSTRUCTION
This structure, though verbose, helps the model tie specific responses to the sources they came from.
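For programmatic use, the format above can be assembled with a small helper. The function below is an illustrative sketch of my own (not part of any official tooling); the metadata keys in the usage example are arbitrary:

```python
def build_context_obedient_prompt(context_blocks, instruction):
    """Assemble a prompt in the airoboros context-obedient format.

    context_blocks: list of (metadata_dict, text) pairs, one per source.
    instruction: the instruction string for the model.
    """
    parts = []
    for metadata, text in context_blocks:
        parts.append("BEGININPUT")
        parts.append("BEGINCONTEXT")
        for key, value in metadata.items():
            parts.append(f"{key}: {value}")
        parts.append("ENDCONTEXT")
        parts.append(text)
        parts.append("ENDINPUT")
    parts.append("BEGININSTRUCTION")
    parts.append(instruction)
    parts.append("ENDINSTRUCTION")
    return "\n".join(parts)

# Reproduce the blueberry example from this card:
prompt = build_context_obedient_prompt(
    [({"date": "2021-01-01", "url": "https://web.site/123"},
      "Blueberries are now green.")],
    "What color are blueberries? Source?",
)
```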
💻 Usage Examples
Basic Usage
Prompt:
BEGININPUT
BEGINCONTEXT
date: 2021-01-01
url: https://web.site/123
ENDCONTEXT
Blueberries are now green.
ENDINPUT
BEGININSTRUCTION
What color are blueberries? Source?
ENDINSTRUCTION
Expected Response:
Blueberries are now green.
Source:
date: 2021-01-01
url: https://web.site/123
🛠️ Proof of Concept
A VSCode extension is forthcoming; it will provide a live flow map whenever you pause for more than 10 seconds.
📊 Training Specifications
| Property           | Details |
|--------------------|---------|
| LoRA Rank          | 2048    |
| LoRA Alpha         | 4096    |
| Batch Size         | 1       |
| Micro Batch Size   | 1       |
| Cutoff Length      | 4096    |
| Save every n steps | 1000    |
| Epochs             | 3       |
| Learning Rate      | 1e-6    |
| LR Scheduler       | Cosine  |
Target Modules:
- q_proj
- v_proj
- k_proj
- o_proj
- gate_proj
- down_proj
- up_proj
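For convenience, the settings above can be collected into a single config mapping. The sketch below simply mirrors the specifications listed here; the key names are my own and are not tied to any particular training framework:

```python
# Training hyperparameters as listed on this card, gathered into one dict.
lora_config = {
    "lora_rank": 2048,
    "lora_alpha": 4096,
    "batch_size": 1,
    "micro_batch_size": 1,
    "cutoff_length": 4096,
    "save_every_n_steps": 1000,
    "epochs": 3,
    "learning_rate": 1e-6,
    "lr_scheduler": "cosine",
    "target_modules": [
        "q_proj", "v_proj", "k_proj", "o_proj",
        "gate_proj", "down_proj", "up_proj",
    ],
}
```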
🚀 Quick Start
1. Download one of my models.
2. Load the model.
3. Use my prompt template to generate a Mermaid code block, which can be viewed in the Mermaid Live Editor or with the Mermaid CLI tool.
4. Open the VLLM GUI program while Mermaid-Llama-8B is still loaded in VRAM to compare the flow diagram against the actual program, demonstrating the lightweight capabilities of small models on consumer hardware.
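Once the model has responded, the Mermaid code needs to be pulled out of the raw generation before pasting it into the Live Editor or CLI tool. A minimal sketch of my own follows; it assumes the output wraps its diagram in a fenced mermaid block, which is an assumption about output formatting, not documented behavior:

```python
import re

# The literal triple-backtick fence, built indirectly so this snippet
# nests cleanly inside a Markdown document.
FENCE = "`" * 3
MERMAID_BLOCK = re.compile(FENCE + r"mermaid\s*\n(.*?)" + FENCE, re.DOTALL)

def extract_mermaid(text):
    """Return the body of the first fenced mermaid block in `text`, or None."""
    match = MERMAID_BLOCK.search(text)
    return match.group(1).strip() if match else None

# Hypothetical model output for demonstration:
sample_output = (
    "Here is the diagram:\n"
    + FENCE + "mermaid\n"
    "graph TD\n"
    "    A[Start] --> B[End]\n"
    + FENCE + "\n"
)
diagram = extract_mermaid(sample_output)
```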

📖 Documentation
More on my VLLM Class and inference GUI: https://github.com/Troys-Code/VLLM

⚠️ Important Note
This model should be treated as an auto-complete model. Do not try to chat with it; you'll get meaningless results. Some layers have been pruned and replaced, and that's all I'll say about my secret sauce for training on small datasets of fewer than 1000 entries.
💡 Usage Tip
STAY TUNED: There's more to come. Soon, Mermaid models will be able to turn "Mermaid" back into "Code". This new dataset could be a game-changer for refactoring code blocks if it works. I'm interviewing like crazy, so it may take some time; my days have been hectic, like studying for finals every week.
Video on how to use the Colab notebook and infer the model in the simplest example:
https://m.youtube.com/watch?v=fdwoOmiA2d0
Colab notebook:
https://colab.research.google.com/github/Troys-Code/MermaidEngine/blob/main/Mermaid_Llama_RAG_Colab_TextGen_GPU.ipynb
📜 License
This project is licensed under the CC-BY-4.0 license.