🚀 Ultra High Quality Remaster of the incredible: Psyonic-Cetacean-20b
This project presents an ultra-high-quality remaster of the Psyonic-Cetacean-20b model. It focuses on a floating point 32 (FP32) upscale, enhancing the model's precision and performance.
✨ Features
- Floating Point 32 Upscale: All components and merges in this model have been remastered to floating point 32. This includes recreating all the merges with master files and, where possible, substituting full FP32 models.
- Maximum Precision: The goal is to carry forward maximum precision right up to the point where the model is GGUF'ed. It even includes an F32 master file for GGUF, weighing in at a massive 78 GB (compared with an average of 38 GB for 20B models).
- Performance Improvement:
- Perplexity Drop: At Q2K, there is an impressive 533-point drop in perplexity; at Q4KM, a whopping 976-point drop; and at Q6, an awesome 234-point drop.
- Quality Enhancement: "Q6" now operates above the original full-precision version of "Psyonic-Cetacean-20b", and Q4KM operates at close to Q6-level quality.
- Functionality Upgrade:
- Instruction Following: Instruction following has improved dramatically.
- New Abilities: New abilities have emerged.
- Reduced Prompting Requirements: The need for specific instructions has been reduced.
- Prose and Nuance: Prose, nuance, and depth have all improved.
- Issue Resolution: Known issues with the original model have disappeared.
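Since the gains above are quoted as perplexity drops, it may help to recall what the metric measures: perplexity is the exponential of the mean negative log-likelihood per token, so lower is better. A minimal sketch (generic, not tied to any particular inference stack):

```python
import math

def perplexity(token_logprobs):
    # Perplexity = exp(mean negative log-likelihood per token); lower is better.
    nll = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(nll)

# A model that assigns every token probability 1/4 scores perplexity ~4.
print(perplexity([math.log(0.25)] * 4))
```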
📦 Installation
No specific installation steps are provided in the original document.
💻 Usage Examples
No code examples are provided in the original document.
📚 Documentation
- Settings for CHAT / ROLEPLAY and Smoother Operation:
- In "KoboldCpp", "oobabooga/text - generation - webui", or "Silly Tavern", set the "Smoothing_factor" to 1.5 to 2.5.
- In KoboldCpp: Settings -> Samplers -> Advanced -> "Smooth_F".
- In text - generation - webui: parameters -> lower right.
- In Silly Tavern: It is called "Smoothing".
- Note for text - generation - webui: If using GGUFs, you need to use "llama_HF", which involves downloading some config files from the source version of this model. The source versions (and config files) of the models are available at [https://huggingface.co/collections/DavidAU/d - au - source - files - for - gguf - exl2 - awq - gptq - hqq - etc - etc - 66b55cb8ba25f914cbf210be](https://huggingface.co/collections/DavidAU/d - au - source - files - for - gguf - exl2 - awq - gptq - hqq - etc - etc - 66b55cb8ba25f914cbf210be).
- Other Options:
  - Increase rep pen to 1.1 to 1.15 (not necessary if using "smoothing_factor").
  - If the interface/program supports "Quadratic Sampling" ("smoothing"), make the adjustment as noted.
- Highest Quality Settings / Optimal Operation Guide:
- This is a "Class 2" model. For all settings used for this model (including specifics for its "class"), example generation, advanced settings guide, and methods to improve model performance for all use cases (chat, role - play, etc.), please refer to [https://huggingface.co/DavidAU/Maximizing - Model - Performance - All - Quants - Types - And - Full - Precision - by - Samplers_Parameters](https://huggingface.co/DavidAU/Maximizing - Model - Performance - All - Quants - Types - And - Full - Precision - by - Samplers_Parameters).
🔧 Technical Details
- Precision Difference: F32 carries 23 fraction bits of mantissa versus BF16's 7, a gap of roughly five significant decimal digits. As each merge/model is modified, small rounding "losses" accumulate along the way and are carried forward, compounding into further losses. Those decimal places are critical to model performance.
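The gap is easy to demonstrate: BF16 keeps only the high 16 bits of an FP32 bit pattern (sign, 8 exponent bits, 7 fraction bits). A small pure-Python sketch that emulates BF16 storage by bit masking:

```python
import struct

def to_bf16(x: float) -> float:
    # Emulate BF16 storage: keep the high 16 bits of the FP32 bit pattern
    # (sign + 8 exponent bits + 7 fraction bits), zeroing the rest.
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFF0000))[0]

w = 0.123456789
fp32 = struct.unpack("<f", struct.pack("<f", w))[0]
print(f"fp32: {fp32:.9f}")        # keeps ~7 significant decimal digits
print(f"bf16: {to_bf16(w):.9f}")  # keeps only ~2-3 significant digits
```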
- Future Repositories:
- A "reg quant plus" repo will be released, adding additional components into the GGUF (all levels) at floating point 32 precision to further increase creativity and AI horsepower, shaving an extra 50 - 100 points off perplexity.
- A full float 32 precision Imatrix (including reg quants "imatrixed") will follow.
- An Imatrix Plus repo (with the same floating 32 enhancement as "reg quant plus") will push the limit even more. The Imatrix Depo is at [https://huggingface.co/DavidAU/Psyonic - Cetacean - Ultra - Quality - 20b - GGUF - imatrix](https://huggingface.co/DavidAU/Psyonic - Cetacean - Ultra - Quality - 20b - GGUF - imatrix).
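The "losses carried forward" point can be illustrated with a toy round-trip: quantizing from an FP32 master versus from a BF16-truncated copy of the same weights. The snippet uses naive symmetric rounding as a hypothetical stand-in for the far more sophisticated GGUF k-quant schemes, so the absolute numbers are illustrative only; the point is that error measured against the original weights includes whatever precision the master format already discarded:

```python
import numpy as np

def quant_round_trip(w: np.ndarray, bits: int = 8) -> np.ndarray:
    # Naive symmetric per-tensor quantization (NOT the real GGUF scheme):
    # scale to the integer grid, round, and scale back.
    scale = np.abs(w).max() / (2 ** (bits - 1) - 1)
    return np.round(w / scale) * scale

def truncate_bf16(w: np.ndarray) -> np.ndarray:
    # Emulate BF16 storage by zeroing the low 16 bits of each FP32 value.
    return (w.astype(np.float32).view(np.uint32) & 0xFFFF0000).view(np.float32)

rng = np.random.default_rng(0)
w = rng.standard_normal(4096).astype(np.float32)

# Mean error vs. the original weights when quantizing from each master.
err_fp32_master = np.abs(quant_round_trip(w) - w).mean()
err_bf16_master = np.abs(quant_round_trip(truncate_bf16(w)) - w).mean()
print(err_fp32_master, err_bf16_master)
```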
📄 License
This project is licensed under the Apache-2.0 license.
📋 Information Table

| Property | Details |
|----------|---------|
| Model Type | Ultra High Quality Remaster of Psyonic-Cetacean-20b |
| Training Data | Not provided |
Acknowledgments
Thanks again to Jeb Carter, the original creator of "Psyonic-Cetacean 20B": [https://huggingface.co/jebcarter/psyonic-cetacean-20B](https://huggingface.co/jebcarter/psyonic-cetacean-20B)