Ultra Quality High Remaster of the incredible: Psyonic-Cetacean-20b - Imatrix Plus
This project presents a Floating Point 32 upscale of the Psyonic-Cetacean-20b - Imatrix Plus model. All components and merges have been remastered in floating point 32: every merge was recreated from master files and, where possible, full FP32 models were substituted.
Features
High Precision
The goal is to carry maximum precision forward right up to the point where the model is converted to GGUF. This includes an F32 master file for GGUF conversion, which weighs in at a massive 78 GB. The difference in precision between F32 and BF16 is over 8 decimal places. Each time a merge or model is modified, small rounding "losses" occur, and these losses accumulate and degrade the model's performance. Using FP32 throughout minimizes or eliminates these precision losses.
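To make the precision gap concrete, here is a minimal Python sketch that simulates BF16 by truncating an FP32 bit pattern to its top 16 bits (real BF16 conversion rounds to nearest even; truncation is used here purely for illustration):

```python
import struct

def to_bf16(x: float) -> float:
    """Round-trip a float through a simulated BF16 by keeping only the
    top 16 bits of its FP32 representation (truncation, for illustration)."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    return struct.unpack(">f", struct.pack(">I", bits & 0xFFFF0000))[0]

w = 0.123456789
w32 = struct.unpack(">f", struct.pack(">f", w))[0]  # FP32 round-trip
w16 = to_bf16(w)

print(f"FP32:  {w32:.10f}")
print(f"BF16:  {w16:.10f}")
print(f"error: {abs(w32 - w16):.2e}")
```

A single weight loses several decimal digits going from FP32 (24-bit mantissa) to BF16 (8-bit mantissa); the remaster keeps every intermediate step at FP32 so this loss is only taken once, at final quantization.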
Improved Performance
- Perplexity Reduction: At Q2K, there is an impressive drop of 533 points in perplexity; at Q4KM, a whopping drop of 976 points; and at Q6, an awesome drop of 234 points.
- Enhanced Abilities: According to the original model creator, instruction following has improved dramatically, new abilities have emerged, the instruction sets used have been reduced, prose, nuance, and depth have all improved, and known issues with the original model have disappeared.
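For reference, perplexity is the exponential of the mean negative log-likelihood per token, so even small shifts in the average NLL move the score. A minimal sketch with hypothetical per-token values:

```python
import math

# Perplexity = exp(mean negative log-likelihood over tokens).
token_nlls = [2.1, 1.8, 2.4, 2.0]  # hypothetical per-token NLLs (nats)
ppl = math.exp(sum(token_nlls) / len(token_nlls))
print(f"perplexity: {ppl:.3f}")
```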
Better Settings
For CHAT / ROLEPLAY and smoother operation of this model, in "KoboldCpp", "oobabooga/text-generation-webui", or "Silly Tavern", set the "Smoothing_factor" to 1.5 - 2.5. For "text-generation-webui", if using GGUFs, you need to use "llama_HF" and download some config files from the source version of this model.
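As a sketch of how these settings might be passed programmatically, here is a hypothetical request body for KoboldCpp's `/api/v1/generate` endpoint; the field names follow KoboldCpp's sampler parameters, but verify them against your installed version:

```python
import json

# Hypothetical request body for KoboldCpp's /api/v1/generate endpoint;
# field names are assumptions based on KoboldCpp's sampler set.
payload = {
    "prompt": "Write a short scene aboard a deep-sea research vessel.",
    "max_length": 256,
    "rep_pen": 1.1,            # optional when smoothing is enabled
    "smoothing_factor": 2.0,   # recommended range for this model: 1.5 - 2.5
}

# To send it (requires a running KoboldCpp instance):
# import urllib.request
# req = urllib.request.Request(
#     "http://localhost:5001/api/v1/generate",
#     data=json.dumps(payload).encode(),
#     headers={"Content-Type": "application/json"},
# )
# print(urllib.request.urlopen(req).read().decode())

print(json.dumps(payload, indent=2))
```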
Installation
No specific installation steps are provided; download the desired GGUF quant and load it in a llama.cpp-based program such as KoboldCpp, text-generation-webui, or Silly Tavern.
Usage Examples
No code examples are provided in the original document.
Documentation
Model Settings
- Smoothing Factor: In "KoboldCpp", go to Settings -> Samplers -> Advanced -> "Smooth_F"; in "oobabooga/text-generation-webui", set it in the parameters section at the lower right; in "Silly Tavern", it is called "Smoothing".
- Other Options: You can increase the rep pen to 1.1 - 1.15 (not necessary if using "smoothing_factor"). If the interface or program supports "Quadratic Sampling" ("smoothing"), make the adjustment as noted.
Optimal Operation Guide
For all settings used for this model, including specifics for its "class", example generations, and advanced settings guide, please refer to [https://huggingface.co/DavidAU/Maximizing-Model-Performance-All-Quants-Types-And-Full-Precision-by-Samplers_Parameters].
Technical Details
The methods employed ensure that precision loss is minimized or eliminated at every step up to the final conversion to GGUF. The approach is mathematically and theoretically sound: by using FP32 and carefully recreating the merges, the model retains higher precision and achieves better performance.
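The accumulating-loss argument can be illustrated with a toy merge chain: repeatedly averaging weights while truncating to BF16 at every step (a crude stand-in for low-precision merging) drifts away from the full-precision result. Python floats stand in for FP32 here, and BF16 is simulated by bit truncation; this is an illustration, not the actual remastering pipeline:

```python
import struct

def to_bf16(x: float) -> float:
    """Simulate BF16 by truncating an FP32 bit pattern to its top 16 bits."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    return struct.unpack(">f", struct.pack(">I", bits & 0xFFFF0000))[0]

# Simulate a chain of merges: each step averages the running weight with
# a fixed ingredient weight, either at full precision or truncated to BF16.
ingredient = 0.3337777
w_full = w_bf16 = 0.1015625

for _ in range(50):
    w_full = (w_full + ingredient) / 2
    w_bf16 = to_bf16((w_bf16 + ingredient) / 2)

print(f"full-precision chain: {w_full:.9f}")
print(f"BF16 chain:           {w_bf16:.9f}")
print(f"drift:                {abs(w_full - w_bf16):.2e}")
```

The low-precision chain settles on a value measurably off from the full-precision one; doing every intermediate merge at FP32 and quantizing only once at the end avoids this compounding.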
License
This model is licensed under the Apache-2.0 license.
Results from the Original Model Creator
As per Jeb Carter, the original creator of the model:
- Instruction following has improved dramatically.
- New abilities have emerged.
- He had to reduce the instruction sets used because the model no longer needed such specific instructions.
- Prose, nuance, and depth have all improved.
- Known issues with the original model have disappeared.
Future Remasters
This is the first group of remasters. A "reg quant plus" repo will follow, adding additional components into the GGUF at floating point 32 precision to further increase creativity and AI horsepower. A full float 32 precision Imatrix (including reg quants "imatrixed") will also be released. Test results will be posted when available.
Source Versions and Config Files
Source versions (and config files) of the models are available at [https://huggingface.co/collections/DavidAU/d-au-source-files-for-gguf-exl2-awq-gptq-hqq-etc-etc-66b55cb8ba25f914cbf210be].
Acknowledgment
Thanks again to Jeb Carter, the original creator of "Psyonic-Cetacean 20B" [https://huggingface.co/jebcarter/psyonic-cetacean-20B].