# Tito-7B-slerp
Tito-7B-slerp is a merged model that combines the strengths of multiple base models. It targets high-performance text generation by leveraging the capabilities of the models it merges, and it is suitable for a variety of NLP tasks, showing improved accuracy across several evaluation benchmarks.
## 🚀 Quick Start
Tito-7B-slerp can be used for text generation tasks and integrated into your existing NLP pipelines. For more details on usage, refer to the official documentation of the base models and the `mergekit` tool.
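As a minimal sketch, the merged model should load like any other causal language model via the `transformers` library. The repo id below is a hypothetical placeholder, since the README does not state where the weights are hosted:

```python
# Minimal text-generation sketch using Hugging Face transformers.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-namespace/Tito-7B-slerp"  # hypothetical placeholder, replace with the real Hub repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

inputs = tokenizer("Glavni grad Srbije je", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```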
## ✨ Features
- Model Merging: Utilizes `mergekit` to combine gordicaleksa/YugoGPT and mlabonne/AlphaMonarch-7B.
- Slerp Merge Method: Employs the slerp (spherical linear interpolation) merge method for smoother model integration; see the sketch after this list.
- Parameter Tuning: Allows fine-tuning of parameters such as `t` for different components like `self_attn` and `mlp`.
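To make the merge method concrete, here is a minimal sketch of slerp between two weight tensors, using the standard spherical-linear-interpolation formula. This is an illustration only, not mergekit's actual implementation:

```python
import numpy as np

def slerp(t: float, v0: np.ndarray, v1: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Spherical linear interpolation between two weight tensors."""
    v0_n = v0 / (np.linalg.norm(v0) + eps)
    v1_n = v1 / (np.linalg.norm(v1) + eps)
    dot = np.clip(np.sum(v0_n * v1_n), -1.0, 1.0)
    omega = np.arccos(dot)           # angle between the two tensors
    if np.abs(omega) < eps:          # nearly parallel: fall back to linear interpolation
        return (1.0 - t) * v0 + t * v1
    so = np.sin(omega)
    return (np.sin((1.0 - t) * omega) / so) * v0 + (np.sin(t * omega) / so) * v1

# t = 0 returns the first tensor, t = 1 the second.
a, b = np.random.randn(16), np.random.randn(16)
merged = slerp(0.6, a, b)
```

Unlike plain linear averaging, slerp follows the arc between the two weight vectors, which preserves their magnitude characteristics better when the models point in different directions.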
## 📦 Installation
The original README provides no specific installation instructions. To use or reproduce this merged model, follow the installation steps for `mergekit` and the base models.
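If you only need the merged weights rather than the merge tooling, one common approach is to fetch a snapshot from the Hugging Face Hub. The repo id is again a hypothetical placeholder:

```python
# Download the model files locally; huggingface_hub ships as a transformers dependency.
from huggingface_hub import snapshot_download

local_dir = snapshot_download("your-namespace/Tito-7B-slerp")  # hypothetical repo id
print(f"Model files downloaded to {local_dir}")
```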
## 📚 Documentation
### 🧩 Configuration
```yaml
slices:
  - sources:
      - model: gordicaleksa/YugoGPT
        layer_range: [0, 32]
      - model: mlabonne/AlphaMonarch-7B
        layer_range: [0, 32]
merge_method: slerp
base_model: mlabonne/AlphaMonarch-7B
parameters:
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.6
dtype: bfloat16
```
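For intuition on the `t` schedule: each five-element `value` list defines interpolation anchors that are spread across the 32 layers, and the bare `value: 0.6` appears to act as the default for tensors matching neither filter. Assuming mergekit's usual gradient-style handling of such lists (not spelled out in this README), a rough sketch of how a schedule maps to per-layer weights:

```python
import numpy as np

def layer_schedule(anchors, num_layers=32):
    """Linearly interpolate a short anchor list across all layer indices."""
    xs = np.linspace(0, num_layers - 1, num=len(anchors))
    return np.interp(np.arange(num_layers), xs, anchors)

self_attn_t = layer_schedule([0, 0.5, 0.3, 0.7, 1])  # per-layer t for attention tensors
mlp_t = layer_schedule([1, 0.5, 0.7, 0.3, 0])        # per-layer t for MLP tensors
print(self_attn_t.round(2))
```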
### Results
#### Evaluations on the Serbian LLM eval suite

Evaluations on the Serbian LLM eval suite (or rather, performance on and knowledge of Serbian):
| Model | ARC-E | ARC-C | Hellaswag | BoolQ | Winogrande | OpenbookQA | PiQA | NQ Open | TriviaQA | Avg. |
|---|---|---|---|---|---|---|---|---|---|---|
| Zamfir-7B | 51.85 | 32.25 | 46.03 | 75.59 | 62.59 | 26.00 | 66.81 | 16.09 | 36.11 | 45.92 |
| Mustra-7B | 52.95 | 33.70 | 45.89 | 77.55 | 64.17 | 30.60 | 67.25 | 15.40 | 34.84 | 46.93 |
| Tito-7B | 55.43 | 34.73 | 48.19 | 77.37 | 65.27 | 30.00 | 67.30 | 16.70 | 35.38 | 47.82 |
| YugoGPT | 57.79 | 34.73 | 49.89 | 69.45 | 64.56 | 28.20 | 72.03 | 15.82 | 36.14 | 47.62 |
All benchmarks were run 0-shot, with the exception of NQ Open and TriviaQA, which were run 5-shot to be comparable with the Mistral paper.
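The README does not name the harness used for these runs. Purely as an illustration, a 5-shot evaluation of the English counterparts of these tasks with EleutherAI's lm-evaluation-harness Python API could look like the sketch below; the task names and repo id are assumptions, and the Serbian suite may use its own task definitions:

```python
# Hypothetical evaluation sketch with lm-evaluation-harness (pip install lm-eval).
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=your-namespace/Tito-7B-slerp",  # hypothetical repo id
    tasks=["nq_open", "triviaqa"],
    num_fewshot=5,
)
print(results["results"])
```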
#### Replicating OpenLLM Leaderboard results on Serbian datasets
If we try to replicate the OpenLLM Leaderboard results on the available Serbian datasets (running the appropriate number of shots instead of 0), we get:
| Model | ARC | Hellaswag | Winogrande | TruthfulQA | Avg. |
|---|---|---|---|---|---|
| Tito-7B | 47.27 | - | 69.93 | 57.48 | 58.23 |
| Perucac-7B | 49.74 | - | 71.98 | 56.03 | 59.25 |
| YugoGPT | 44.03 | - | 70.64 | 48.06 | 54.24 |
| Llama3-8B | 42.24 | - | 61.25 | 51.08 | 51.52 |
| SambaLingo | 37.88 | - | 61.48 | 47.23 | 48.86 |
Note that YugoGPT, Llama3 and SambaLingo are all base models, unlike Tito and Perucac.
Detailed results can be found here.
| Metric | Tito | YugoGPT |
|---|---|---|
| Avg. | 70.13 | 57.34 |
| AI2 Reasoning Challenge (25-shot) | 68.09 | 58.10 |
| HellaSwag (10-shot) | 86.38 | 81.44 |
| MMLU (5-shot) | 64.01 | 60.68 |
| TruthfulQA (0-shot) | 57.01 | 36.60 |
| Winogrande (5-shot) | 81.69 | 76.56 |
| GSM8k (5-shot) | 63.61 | 30.70 |
## 📄 License
This project is licensed under the Apache-2.0 license.