🚀 UNA-ThePitbull 21.4B v2
Introducing the industry's top LLM. Despite being a 21.4B model based on saltlux/luxia-21.4b-alignment-v1.0, it performs nearly as well as a 70B model.

This model has not been tuned to score well on benchmarks at the expense of real-world usefulness. We're releasing it because it combines high EQ and IQ in a powerful, intelligent, and conversational model.
Quantized versions are available at bartowski/UNA-ThePitbull-21.4B-v2-GGUF
✨ Features
Difference V1 vs V2
On V2, we implemented a different UNA strategy and applied it partially to the MLP and attention layers.
We also conducted further SFT and DPO on V1, and some of those results will be released soon.
Changes
- SFT over V1 with Replete-AI/code_bagel_hermes-2.5, learning rate from 1.0e-4 to 5.0e-5, for 1 epoch
- DPO, learning rate from 1.0e-4 to min_lr 5.0e-5, for 1 epoch, with:
  - mlabonne/orpo-dpo-mix-40k
  - jondurbin/py-dpo-v0.1
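Both stages above decay the learning rate from 1.0e-4 to a floor of 5.0e-5 over one epoch. The card does not say which decay shape was used, so the following is only a minimal sketch that assumes a cosine schedule; the function name `cosine_lr` and the step counts are hypothetical and chosen for illustration.

```python
import math


def cosine_lr(step: int, total_steps: int,
              peak_lr: float = 1.0e-4, min_lr: float = 5.0e-5) -> float:
    """Cosine decay from peak_lr at step 0 down to min_lr at the last step.

    Assumption: the card only gives the start and end values; the cosine
    shape is an illustrative choice, not the documented training recipe.
    """
    if total_steps <= 1:
        return peak_lr
    # Clamp the step into [0, total_steps - 1] and map it to [0, 1].
    progress = min(max(step, 0), total_steps - 1) / (total_steps - 1)
    return min_lr + 0.5 * (peak_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


# Sample the schedule at five evenly spaced points of a 1000-step epoch.
total = 1000
for step in (0, 250, 500, 750, 999):
    print(step, f"{cosine_lr(step, total):.2e}")
```

A linear decay between the same two endpoints would be an equally plausible reading of "1.0e-4 to min_lr 5.0e-5".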
📚 Documentation
Evaluations
Detailed results can be found here
| Metric | Value |
|-----------------------------------|------:|
| Avg. | 77.82 |
| AI2 Reasoning Challenge (25-Shot) | 77.73 |
| HellaSwag (10-Shot) | 91.79 |
| MMLU (5-Shot) | 68.25 |
| TruthfulQA (0-shot) | 78.24 |
| Winogrande (5-shot) | 87.37 |
| GSM8k (5-shot) | 63.53 |
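As a quick sanity check, the reported average can be recomputed from the six benchmark scores listed above. This sketch assumes the average is an unweighted mean of those six values, which the numbers are consistent with.

```python
# Scores copied from the evaluation summary above.
scores = {
    "AI2 Reasoning Challenge (25-Shot)": 77.73,
    "HellaSwag (10-Shot)": 91.79,
    "MMLU (5-Shot)": 68.25,
    "TruthfulQA (0-shot)": 78.24,
    "Winogrande (5-shot)": 87.37,
    "GSM8k (5-shot)": 63.53,
}

# Assumption: "Avg." is the plain arithmetic mean of the six benchmarks.
average = sum(scores.values()) / len(scores)
print(f"{average:.2f}")
```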
It can only be compared with its non-UNA base model, the original luxia-21.4b, and with ThePitbull-v1.
UNA v2 (VLLM) Evaluations:
vllm (pretrained=/data/tools/mergekit/una-thepitbull-v5,dtype=bfloat16,gpu_memory_utilization=0.8,max_model_len=2048,data_parallel_size=2,tensor_parallel_size=4), gen_kwargs: (None), limit: None, num_fewshot: None, batch_size: 8
| Tasks | Version | Filter | n-shot | Metric | Value | | Stderr |
|--------------|--------:|----------------|-------:|-----------|------:|---|--------:|
| gsm8k | 3 | strict-match | 5 | exact_match | 0.7695 | ± | 0.0116 | +
| | | flexible-extract | 5 | exact_match | 0.7695 | ± | 0.0116 | +
| hellaswag | 1 | none | 10 | acc | 0.8110 | ± | 0.0039 |
| | | none | 10 | acc_norm | 0.9169 | ± | 0.0028 | +
| winogrande | 1 | none | 5 | acc | 0.8777 | ± | 0.0092 | +
| mmlu | N/A | none | 0 | acc | 0.6427 | ± | 0.0038 | -
| arc_challenge | 1 | none | 25 | acc | 0.7713 | ± | 0.0123 |
| | | none | 25 | acc_norm | 0.7875 | ± | 0.0120 | +
| truthfulqa_mc2 | 2 | none | 0 | acc | 0.7824 | ± | 0.0135 | -
| mathqa | 1 | none | 0 | acc | 0.4037 | ± | 0.009 |
| | | none | 0 | acc_norm | 0.4034 | ± | 0.009 | +
| pubmedqa | 1 | none | 0 | acc | 0.7260 | ± | 0.020 | +
| boolq | 2 | none | 0 | acc | 0.8602 | ± | 0.0061 | +
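The harness header line above the v2 table packs the vLLM settings into a single `key=value` string. The sketch below parses that string so the individual settings can be inspected; `parse_model_args` is a hypothetical helper, not part of any library. The GPU count it derives assumes that each of the `data_parallel_size` replicas is sharded across `tensor_parallel_size` devices, which is the usual reading of those two options.

```python
def parse_model_args(arg_string: str) -> dict[str, str]:
    """Split a 'key=value,key=value' string into a dict of raw string values."""
    pairs = (item.split("=", 1) for item in arg_string.split(",") if item)
    return {key.strip(): value.strip() for key, value in pairs}


# Copied verbatim from the evaluation header line.
MODEL_ARGS = (
    "pretrained=/data/tools/mergekit/una-thepitbull-v5,dtype=bfloat16,"
    "gpu_memory_utilization=0.8,max_model_len=2048,"
    "data_parallel_size=2,tensor_parallel_size=4"
)

args = parse_model_args(MODEL_ARGS)

# Assumption: total devices = replicas x shards per replica.
gpus = int(args["data_parallel_size"]) * int(args["tensor_parallel_size"])
print(args["dtype"], args["max_model_len"], gpus)
```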
UNA v1 (VLLM) Evaluations
| Tasks | Version | Filter | n-shot | Metric | Value | | Stderr |
|--------------|--------:|----------------|-------:|-----------|------:|---|--------:|
| gsm8k | 3 | strict-match | 5 | exact_match | 0.7566 | ± | 0.0118 |
| | | flexible-extract | 5 | exact_match | 0.7582 | ± | 0.0118 |
| hellaswag | 1 | none | 10 | acc | 0.8168 | ± | 0.0039 |
| | | none | 10 | acc_norm | 0.9188 | ± | 0.0027 |
| winogrande | 1 | none | 5 | acc | 0.8635 | ± | 0.0097 |
| mmlu | N/A | none | 0 | acc | 0.6444 | ± | 0.0038 |
| arc_challenge | 1 | none | 25 | acc | 0.7747 | ± | 0.0122 |
| | | none | 25 | acc_norm | 0.7850 | ± | 0.0120 |
| truthfulqa_mc2 | 2 | none | 0 | acc | 0.7902 | ± | 0.0134 |
| mathqa | 1 | none | 0 | acc | 0.4030 | ± | 0.009 |
| | | none | 0 | acc_norm | 0.4034 | ± | 0.009 |
| pubmedqa | 1 | none | 0 | acc | 0.6860 | ± | 0.0208 |
| boolq | 2 | none | 0 | acc | 0.8401 | ± | 0.0064 |
Original (VLLM) Evaluations
| Tasks | Version | Filter | n-shot | Metric | Value | | Stderr |
|--------------|--------:|----------------|-------:|-----------|------:|---|--------:|
| gsm8k | 3 | strict-match | 5 | exact_match | 0.7528 | ± | 0.0119 |
| | | flexible-extract | 5 | exact_match | 0.7521 | ± | 0.0119 |
| hellaswag | 1 | none | 10 | acc | 0.8117 | ± | 0.0039 |
| | | none | 10 | acc_norm | 0.9167 | ± | 0.0028 |
| winogrande | 1 | none | 5 | acc | 0.8682 | ± | 0.0095 |
| mmlu | N/A | none | 0 | acc | 0.6448 | ± | 0.0038 |
| arc_challenge | 1 | none | 25 | acc | 0.7688 | ± | 0.0123 |
| | | none | 25 | acc_norm | 0.7730 | ± | 0.0122 |
| truthfulqa_mc2 | 2 | none | 0 | acc | 0.7895 | ± | 0.0133 |
| mathqa | 1 | none | 0 | acc | 0.4000 | ± | 0.009 |
| | | none | 0 | acc_norm | 0.4003 | ± | 0.009 |
| pubmedqa | 1 | none | 0 | acc | 0.6680 | ± | 0.0211 |
| boolq | 2 | none | 0 | acc | 0.8346 | ± | 0.0065 |
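To make the three result tables above easier to compare, the following sketch computes the difference between UNA v2 and the original model for a selection of rows. The values are copied from the tables; the helper name `delta_vs_original` is hypothetical. For these rows, the sign of the difference agrees with the trailing `+` or `-` mark in the v2 table, which suggests (but does not confirm) that the marks flag changes relative to the original model.

```python
# (task, metric): (original, una_v1, una_v2), copied from the tables above.
RESULTS = {
    ("gsm8k", "strict-match"): (0.7528, 0.7566, 0.7695),
    ("hellaswag", "acc_norm"): (0.9167, 0.9188, 0.9169),
    ("winogrande", "acc"): (0.8682, 0.8635, 0.8777),
    ("mmlu", "acc"): (0.6448, 0.6444, 0.6427),
    ("arc_challenge", "acc_norm"): (0.7730, 0.7850, 0.7875),
    ("truthfulqa_mc2", "acc"): (0.7895, 0.7902, 0.7824),
    ("pubmedqa", "acc"): (0.6680, 0.6860, 0.7260),
    ("boolq", "acc"): (0.8346, 0.8401, 0.8602),
}


def delta_vs_original(results):
    """Return UNA v2 minus original for each (task, metric) pair."""
    return {key: round(v2 - orig, 4) for key, (orig, _v1, v2) in results.items()}


deltas = delta_vs_original(RESULTS)
for (task, metric), delta in deltas.items():
    print(f"{task:15s} {metric:13s} {delta:+.4f}")
```

Several of these differences are of the same order as the reported standard errors, so small gains or losses should be read with that in mind.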
Detailed results can be found here
| Metric | Value |
|---------------------|------:|
| Avg. | 22.60 |
| IFEval (0-Shot) | 37.90 |
| BBH (3-Shot) | 46.79 |
| MATH Lvl 5 (4-Shot) | 9.59 |
| GPQA (0-shot) | 6.94 |
| MuSR (0-shot) | 6.42 |
| MMLU-PRO (5-shot) | 27.95 |
Citations
- mlabonne
- jondurbin & Replete-AI
- bartowski
- saltlux
If you use UNA models, don't forget to cite:
@misc{unathepitbull21b,
title={ThePitbull: Uniform Neural Alignment},
author={Xavier Murias},
year={2024},
publisher = {Juanako.AI},
journal = {HuggingFace repository},
howpublished = {\url{https://huggingface.co/fblgit/UNA-ThePitbull-21.4-v1}},
}
📄 License
This project is licensed under the afl-3.0 license.