Hexagon Purple V2
Hexagon Purple V2 is a merged pre-trained language model. It keeps the SmarTracks base used in V1 and makes several improvements over it, aiming for better performance and lower censorship.
Quick Start
This card does not include dedicated quick-start steps. The model is a merge produced with mergekit; refer to the mergekit documentation for how such merges are built and deployed (a sketch is given in the Configuration section below), and to the Usage Examples section for inference.
Features
Model Evolution
- The base of Hexagon Purple V2, SmarTracks, remains unchanged. It is a "3-level" stock merge that brings in DeepSeek R1 Distill (3 flavors), Nemotron, and Tulu capabilities.
- Compared to V1, it makes the following improvements:
- Replaced Black-Ink-Guild's Pernicious Prophecy with Steelskull's Electra R1, which performs better.
- Replaced the Hostess stock merge with a Priestess one, bringing in 70Blivision and removing the Lumitron merge on top of Tess R1 and Llama Creative Writer.
- Replaced the standalone models Dobby, Wayfarer, and Drummer's Fallen Llama R1 with a stock merge of the three, DoppelGanger R1.
- Added Nbeerbower's Gutenberg Doppel as a 3.1 instruct (and novel writing) stabilizer, working in tandem with the following model.
- Added migtissera's Tess 3.0 (Llama 3.1 70B) as a perplexity dropper.
Low Censorship
When available, abliterated and lorablated versions (thanks to Huihui-ai, Maxime Labonne, and of course Failspy) are used systematically. Otherwise, components with very low censorship are favored.
Benchmark Performance
- PPL Wikitext Eng 512: 3.43 (good)
- ARC-C: 60.55 (good)
- ARC-E: 81.05 (also good)
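The card does not state which harness produced these figures. As a point of reference only, a 512-token WikiText perplexity in this spirit can be approximated with the standard Hugging Face stack; the snippet below is a minimal sketch, where the dataset split, windowing scheme, and `your_model_path` are assumptions, and the result will not exactly match the number above.

```python
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "your_model_path"  # assumption: local path or HF repo id of the merged model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16, device_map="auto")
model.eval()

# WikiText-2 test split, tokenized as one long stream (the card does not say which split was used)
text = "\n\n".join(load_dataset("wikitext", "wikitext-2-raw-v1", split="test")["text"])
ids = tokenizer(text, return_tensors="pt").input_ids

window = 512  # matches the 512-token context of the reported PPL figure
nlls, n_tokens = [], 0
for start in range(0, ids.size(1) - window + 1, window):
    chunk = ids[:, start:start + window].to(model.device)
    with torch.no_grad():
        # labels == inputs -> transformers returns the mean causal-LM cross-entropy for the window
        loss = model(chunk, labels=chunk).loss
    nlls.append(loss.float() * window)
    n_tokens += window

ppl = torch.exp(torch.stack(nlls).sum() / n_tokens)
print(f"Perplexity over {n_tokens} tokens (512-token windows): {ppl.item():.2f}")
```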
Installation
This card does not include dedicated installation steps. For inference, the standard Hugging Face stack (transformers, accelerate, and torch) is sufficient; to reproduce the merge itself, install mergekit and refer to the mergekit repository for details.
Usage Examples
This card does not include official code examples, but the model can be used like any other pre-trained causal language model, for instance with the transformers library in Python:
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Local path or Hugging Face repo id of the merged model
model_name = "your_model_path"
tokenizer = AutoTokenizer.from_pretrained(model_name)
# bfloat16 matches the merge's out_dtype; device_map="auto" spreads the 70B
# weights across available GPUs (requires the accelerate package)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16, device_map="auto")
input_text = "Your input text here"
input_ids = tokenizer(input_text, return_tensors="pt").input_ids.to(model.device)
output = model.generate(input_ids, max_new_tokens=256)
generated_text = tokenizer.decode(output[0], skip_special_tokens=True)
print(generated_text)
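Because the merge config sets `chat_template: auto` with a union tokenizer, the model should also carry a Llama 3.x style chat template, so conversational prompts can be formatted with `apply_chat_template`. A minimal sketch, assuming `model` and `tokenizer` are loaded as above; the messages and sampling settings are purely illustrative:

```python
# Reuses `model` and `tokenizer` from the snippet above.
messages = [
    {"role": "system", "content": "You are a creative writing assistant."},
    {"role": "user", "content": "Write the opening paragraph of a mystery novel."},
]
# Let the tokenizer's chat template turn the conversation into model-ready ids
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output = model.generate(input_ids, max_new_tokens=300, do_sample=True, temperature=0.8)
# Decode only the newly generated tokens, not the prompt
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```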
Documentation
Merge Details
Merge Method
This model was merged using the Model Stock merge method, with Nexesenex/Llama_3.x_70b_SmarTracks_V1.01 as the base.
Models Merged
The following models were included in the merge:
- [Steelskull/L3.3-Electra-R1-70b](https://huggingface.co/Steelskull/L3.3-Electra-R1-70b)
- NexesMess/Llama_3.3_70b_DoppelGanger_R1
- [nbeerbower/Llama3.1-Gutenberg-Doppel-70B](https://huggingface.co/nbeerbower/Llama3.1-Gutenberg-Doppel-70B)
- NexesMess/Llama_3.1_70b_Priestess_V1
- [migtissera/Tess-3-Llama-3.1-70B](https://huggingface.co/migtissera/Tess-3-Llama-3.1-70B)
Configuration
The following YAML configuration was used to produce this model:
merge_method: model_stock
models:
  - model: migtissera/Tess-3-Llama-3.1-70B
    parameters:
      weight: 1.0
  - model: nbeerbower/Llama3.1-Gutenberg-Doppel-70B
    parameters:
      weight: 1.0
  - model: NexesMess/Llama_3.1_70b_Priestess_V1
    parameters:
      weight: 1.0
  - model: Steelskull/L3.3-Electra-R1-70b
    parameters:
      weight: 1.0
  - model: NexesMess/Llama_3.3_70b_DoppelGanger_R1
    parameters:
      weight: 1.0
base_model: Nexesenex/Llama_3.x_70b_SmarTracks_V1.01
dtype: bfloat16
out_dtype: bfloat16
parameters:
  int8_mask: true
  normalize: true
  rescale: false
chat_template: auto
tokenizer:
  source: union
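To rebuild the merge from this recipe, mergekit can be driven from its CLI (for example `mergekit-yaml config.yaml ./merged-model`) or from Python. The snippet below is a sketch following the Python entry point shown in the mergekit README; the API can change between mergekit versions, and the config path, output path, and option values here are assumptions:

```python
import yaml
import torch
from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

# Assumption: the YAML recipe above has been saved to config.yaml
with open("config.yaml", "r", encoding="utf-8") as f:
    merge_config = MergeConfiguration.model_validate(yaml.safe_load(f))

run_merge(
    merge_config,
    out_path="./Hexagon_Purple_V2",      # assumption: any output directory works
    options=MergeOptions(
        cuda=torch.cuda.is_available(),  # use a GPU for the merge arithmetic if present
        copy_tokenizer=True,             # build the merged (union) tokenizer alongside the weights
        lazy_unpickle=False,
        low_cpu_memory=False,
    ),
)
```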
Technical Details
This model was produced with the Model Stock merge method. By combining several fine-tuned Llama 3.x 70B models around a common base, it aims for better performance across benchmarks and creative tasks. The per-model weights (all set to 1.0 here) are declared in the configuration above; Model Stock then derives the interpolation between the averaged models and the base model from the geometry of their weight differences, as sketched below.
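For intuition only, the core idea behind Model Stock can be sketched per tensor as in the toy example below: average the fine-tuned weights, then interpolate back toward the base model with a ratio derived from the angle between the task vectors. This is a simplified illustration under stated assumptions (mean pairwise cosine as the angle estimate, tiny random tensors); it is not mergekit's actual model_stock implementation and ignores the int8_mask/normalize options used above.

```python
import numpy as np

def model_stock_tensor(base: np.ndarray, finetuned: list[np.ndarray]) -> np.ndarray:
    """Toy per-tensor Model Stock merge; not mergekit's implementation."""
    k = len(finetuned)
    deltas = [w - base for w in finetuned]  # "task vectors" of each fine-tune
    # Estimate cos(theta) as the mean pairwise cosine similarity between task vectors
    cosines = [
        float(np.dot(a.ravel(), b.ravel()) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))
        for i, a in enumerate(deltas) for b in deltas[i + 1:]
    ]
    cos_theta = sum(cosines) / len(cosines) if cosines else 1.0
    # Interpolation ratio from the Model Stock paper: t = k*cos / ((k-1)*cos + 1)
    t = k * cos_theta / ((k - 1) * cos_theta + 1)
    w_avg = np.mean(np.stack(finetuned), axis=0)
    return t * w_avg + (1 - t) * base

# Tiny synthetic demo: five "fine-tuned" perturbations of a random base tensor
rng = np.random.default_rng(0)
base = rng.normal(size=(4, 4))
variants = [base + 0.1 * rng.normal(size=base.shape) for _ in range(5)]
print(model_stock_tensor(base, variants).shape)  # (4, 4)
```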
License
This card does not state a license. Check the licenses of the individual models included in the merge, as well as the license of the mergekit library, for details.