Ultra Quality High Remaster of the incredible: Psyonic-Cetacean-20b - Imatrix Plus
This project presents a Floating Point 32 upscale of the Psyonic-Cetacean-20b - Imatrix Plus model. All components and merges have been remastered in floating point 32: every merge was recreated from master files and, where possible, full FP32 models were substituted.
Features
High Precision
The goal is to carry maximum precision forward right up to the point where the model is converted to GGUF. This includes an F32 master file for GGUF conversion, which weighs in at a massive 78 GB. The difference in precision between F32 and BF16 is over 8 decimal places. Each time a merge or model is modified, small rounding "losses" occur, and these losses accumulate and degrade the model's performance. Using FP32 throughout minimizes or eliminates these precision losses.
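To make the precision gap concrete, here is a minimal Python sketch that simulates BF16 by truncating an FP32 bit pattern to its top 16 bits (real BF16 conversion rounds to nearest even; truncation is used here purely for illustration):

```python
import struct

def to_bf16(x: float) -> float:
    """Round-trip a float through a simulated BF16 by keeping only the
    top 16 bits of its FP32 representation (truncation, for illustration)."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    return struct.unpack(">f", struct.pack(">I", bits & 0xFFFF0000))[0]

w = 0.123456789
w32 = struct.unpack(">f", struct.pack(">f", w))[0]  # FP32 round-trip
w16 = to_bf16(w)

print(f"FP32:  {w32:.10f}")
print(f"BF16:  {w16:.10f}")
print(f"error: {abs(w32 - w16):.2e}")
```

A single weight loses several decimal digits going from FP32 (24-bit mantissa) to BF16 (8-bit mantissa); the remaster keeps every intermediate step at FP32 so this loss is only taken once, at final quantization.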
Improved Performance
- Perplexity Reduction: At Q2K, there is an impressive drop of 533 points in perplexity; at Q4KM, a whopping drop of 976 points; and at Q6, an awesome drop of 234 points.
- Enhanced Abilities: According to the original model creator, instruction following has improved dramatically, new abilities have emerged, the instruction sets used have been reduced, prose, nuance, and depth have all improved, and known issues with the original model have disappeared.
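For reference, perplexity is the exponential of the mean negative log-likelihood per token, so even small shifts in the average NLL move the score. A minimal sketch with hypothetical per-token values:

```python
import math

# Perplexity = exp(mean negative log-likelihood over tokens).
token_nlls = [2.1, 1.8, 2.4, 2.0]  # hypothetical per-token NLLs (nats)
ppl = math.exp(sum(token_nlls) / len(token_nlls))
print(f"perplexity: {ppl:.3f}")
```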
Better Settings
For CHAT / ROLEPLAY and smoother operation of this model, in "KoboldCpp", "oobabooga/text-generation-webui", or "Silly Tavern", set the "Smoothing_factor" to 1.5 - 2.5. For "text-generation-webui", if using GGUFs, you need to use "llama_HF" and download some config files from the source version of this model.
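As a sketch of how these settings might be passed programmatically, here is a hypothetical request body for KoboldCpp's `/api/v1/generate` endpoint; the field names follow KoboldCpp's sampler parameters, but verify them against your installed version:

```python
import json

# Hypothetical request body for KoboldCpp's /api/v1/generate endpoint;
# field names are assumptions based on KoboldCpp's sampler set.
payload = {
    "prompt": "Write a short scene aboard a deep-sea research vessel.",
    "max_length": 256,
    "rep_pen": 1.1,            # optional when smoothing is enabled
    "smoothing_factor": 2.0,   # recommended range for this model: 1.5 - 2.5
}

# To send it (requires a running KoboldCpp instance):
# import urllib.request
# req = urllib.request.Request(
#     "http://localhost:5001/api/v1/generate",
#     data=json.dumps(payload).encode(),
#     headers={"Content-Type": "application/json"},
# )
# print(urllib.request.urlopen(req).read().decode())

print(json.dumps(payload, indent=2))
```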
Installation
No specific installation steps are provided; download the desired GGUF quant and load it in a llama.cpp-based program such as KoboldCpp, text-generation-webui, or Silly Tavern.
Usage Examples
No code examples are provided in the original document.
Documentation
Model Settings
- Smoothing Factor: In "KoboldCpp", go to Settings -> Samplers -> Advanced -> "Smooth_F"; in "oobabooga/text-generation-webui", set it in the parameters section at the lower right; in "Silly Tavern", it is called "Smoothing".
- Other Options: You can increase the rep pen to 1.1 - 1.15 (not necessary if using "smoothing_factor"). If the interface or program supports "Quadratic Sampling" ("smoothing"), make the adjustment as noted.
Optimal Operation Guide
For all settings used for this model, including specifics for its "class", example generations, and advanced settings guide, please refer to [https://huggingface.co/DavidAU/Maximizing-Model-Performance-All-Quants-Types-And-Full-Precision-by-Samplers_Parameters].
Technical Details
The methods employed ensure that precision loss is minimized or eliminated at every step up to the final conversion to GGUF. The approach is mathematically and theoretically sound: by using FP32 and carefully recreating the merges, the model retains higher precision and achieves better performance.
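The accumulating-loss argument can be illustrated with a toy merge chain: repeatedly averaging weights while truncating to BF16 at every step (a crude stand-in for low-precision merging) drifts away from the full-precision result. Python floats stand in for FP32 here, and BF16 is simulated by bit truncation; this is an illustration, not the actual remastering pipeline:

```python
import struct

def to_bf16(x: float) -> float:
    """Simulate BF16 by truncating an FP32 bit pattern to its top 16 bits."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    return struct.unpack(">f", struct.pack(">I", bits & 0xFFFF0000))[0]

# Simulate a chain of merges: each step averages the running weight with
# a fixed ingredient weight, either at full precision or truncated to BF16.
ingredient = 0.3337777
w_full = w_bf16 = 0.1015625

for _ in range(50):
    w_full = (w_full + ingredient) / 2
    w_bf16 = to_bf16((w_bf16 + ingredient) / 2)

print(f"full-precision chain: {w_full:.9f}")
print(f"BF16 chain:           {w_bf16:.9f}")
print(f"drift:                {abs(w_full - w_bf16):.2e}")
```

The low-precision chain settles on a value measurably off from the full-precision one; doing every intermediate merge at FP32 and quantizing only once at the end avoids this compounding.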
License
This model is licensed under the Apache-2.0 license.
Results from the Original Model Creator
As per Jeb Carter, the original creator of the model:
- Instruction following has improved dramatically.
- New abilities have emerged.
- He had to reduce the instruction sets used because the model no longer needed such specific instructions.
- Prose, nuance, and depth have all improved.
- Known issues with the original model have disappeared.
Future Remasters
This is the first group of remasters. A "reg quant plus" repo will follow, adding additional components into the GGUF at floating point 32 precision to further increase creativity and AI horsepower. A full float 32 precision Imatrix (including reg quants "imatrixed") will also be released. Test results will be posted when available.
Source Versions and Config Files
Source versions (and config files) of the models are available at [https://huggingface.co/collections/DavidAU/d-au-source-files-for-gguf-exl2-awq-gptq-hqq-etc-etc-66b55cb8ba25f914cbf210be].
Acknowledgment
Thanks again to Jeb Carter, the original creator of "Psyonic-Cetacean 20B" [https://huggingface.co/jebcarter/psyonic-cetacean-20B].