# Tito-7B-slerp
Tito-7B-slerp is a merged model that combines the strengths of multiple base models. It targets high-performance text generation by leveraging the capabilities of the models it merges, and it is suitable for a variety of NLP tasks, showing improved accuracy across several evaluation benchmarks.
## 🚀 Quick Start
Tito-7B-slerp can be used for text generation tasks and integrated into your existing NLP pipelines. For more details on usage, refer to the official documentation of the base models and the `mergekit` tool.
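As a minimal sketch, the merged model should load like any other causal language model via the `transformers` library. The repo id below is a hypothetical placeholder, since the README does not state where the weights are hosted:

```python
# Minimal text-generation sketch using Hugging Face transformers.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-namespace/Tito-7B-slerp"  # hypothetical placeholder, replace with the real Hub repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

inputs = tokenizer("Glavni grad Srbije je", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```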
## ✨ Features
- Model Merging: Utilizes `mergekit` to combine gordicaleksa/YugoGPT and mlabonne/AlphaMonarch-7B.
- Slerp Merge Method: Employs the slerp (spherical linear interpolation) merge method for smoother model integration; see the sketch after this list.
- Parameter Tuning: Allows fine-tuning of parameters such as `t` for different components like `self_attn` and `mlp`.
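To make the merge method concrete, here is a minimal sketch of slerp between two weight tensors, using the standard spherical-linear-interpolation formula. This is an illustration only, not mergekit's actual implementation:

```python
import numpy as np

def slerp(t: float, v0: np.ndarray, v1: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Spherical linear interpolation between two weight tensors."""
    v0_n = v0 / (np.linalg.norm(v0) + eps)
    v1_n = v1 / (np.linalg.norm(v1) + eps)
    dot = np.clip(np.sum(v0_n * v1_n), -1.0, 1.0)
    omega = np.arccos(dot)           # angle between the two tensors
    if np.abs(omega) < eps:          # nearly parallel: fall back to linear interpolation
        return (1.0 - t) * v0 + t * v1
    so = np.sin(omega)
    return (np.sin((1.0 - t) * omega) / so) * v0 + (np.sin(t * omega) / so) * v1

# t = 0 returns the first tensor, t = 1 the second.
a, b = np.random.randn(16), np.random.randn(16)
merged = slerp(0.6, a, b)
```

Unlike plain linear averaging, slerp follows the arc between the two weight vectors, which preserves their magnitude characteristics better when the models point in different directions.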
## 📦 Installation
The original README provides no specific installation instructions. To use or reproduce this merged model, follow the installation steps for `mergekit` and the base models.
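If you only need the merged weights rather than the merge tooling, one common approach is to fetch a snapshot from the Hugging Face Hub. The repo id is again a hypothetical placeholder:

```python
# Download the model files locally; huggingface_hub ships as a transformers dependency.
from huggingface_hub import snapshot_download

local_dir = snapshot_download("your-namespace/Tito-7B-slerp")  # hypothetical repo id
print(f"Model files downloaded to {local_dir}")
```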
## 📚 Documentation
### 🧩 Configuration
```yaml
slices:
  - sources:
      - model: gordicaleksa/YugoGPT
        layer_range: [0, 32]
      - model: mlabonne/AlphaMonarch-7B
        layer_range: [0, 32]
merge_method: slerp
base_model: mlabonne/AlphaMonarch-7B
parameters:
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.6
dtype: bfloat16
```
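For intuition on the `t` schedule: each five-element `value` list defines interpolation anchors that are spread across the 32 layers, and the bare `value: 0.6` appears to act as the default for tensors matching neither filter. Assuming mergekit's usual gradient-style handling of such lists (not spelled out in this README), a rough sketch of how a schedule maps to per-layer weights:

```python
import numpy as np

def layer_schedule(anchors, num_layers=32):
    """Linearly interpolate a short anchor list across all layer indices."""
    xs = np.linspace(0, num_layers - 1, num=len(anchors))
    return np.interp(np.arange(num_layers), xs, anchors)

self_attn_t = layer_schedule([0, 0.5, 0.3, 0.7, 1])  # per-layer t for attention tensors
mlp_t = layer_schedule([1, 0.5, 0.7, 0.3, 0])        # per-layer t for MLP tensors
print(self_attn_t.round(2))
```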
### Results
#### Evaluations on the Serbian LLM eval suite

Evaluations on the Serbian LLM eval suite (or rather, performance on and knowledge of Serbian):
| Model | ARC-E | ARC-C | Hellaswag | BoolQ | Winogrande | OpenbookQA | PiQA | NQ Open | TriviaQA | Avg. |
|---|---|---|---|---|---|---|---|---|---|---|
| Zamfir-7B | 51.85 | 32.25 | 46.03 | 75.59 | 62.59 | 26.00 | 66.81 | 16.09 | 36.11 | 45.92 |
| Mustra-7B | 52.95 | 33.70 | 45.89 | 77.55 | 64.17 | 30.60 | 67.25 | 15.40 | 34.84 | 46.93 |
| Tito-7B | 55.43 | 34.73 | 48.19 | 77.37 | 65.27 | 30.00 | 67.30 | 16.70 | 35.38 | 47.82 |
| YugoGPT | 57.79 | 34.73 | 49.89 | 69.45 | 64.56 | 28.20 | 72.03 | 15.82 | 36.14 | 47.62 |
All benchmarks were run 0-shot, with the exception of NQ Open and TriviaQA, which were run 5-shot to be comparable with the Mistral paper.
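The README does not name the harness used for these runs. Purely as an illustration, a 5-shot evaluation of the English counterparts of these tasks with EleutherAI's lm-evaluation-harness Python API could look like the sketch below; the task names and repo id are assumptions, and the Serbian suite may use its own task definitions:

```python
# Hypothetical evaluation sketch with lm-evaluation-harness (pip install lm-eval).
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=your-namespace/Tito-7B-slerp",  # hypothetical repo id
    tasks=["nq_open", "triviaqa"],
    num_fewshot=5,
)
print(results["results"])
```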
#### Replicating OpenLLM Leaderboard results on Serbian datasets
If we try to replicate the OpenLLM Leaderboard results on the available Serbian datasets (running the appropriate number of shots instead of 0), we get:
| Model | ARC | Hellaswag | Winogrande | TruthfulQA | Avg. |
|---|---|---|---|---|---|
| Tito-7B | 47.27 | - | 69.93 | 57.48 | 58.23 |
| Perucac-7B | 49.74 | - | 71.98 | 56.03 | 59.25 |
| YugoGPT | 44.03 | - | 70.64 | 48.06 | 54.24 |
| Llama3-8B | 42.24 | - | 61.25 | 51.08 | 51.52 |
| SambaLingo | 37.88 | - | 61.48 | 47.23 | 48.86 |
Note that YugoGPT, Llama3 and SambaLingo are all base models, unlike Tito and Perucac.
Detailed results can be found here.
| Metric | Tito | YugoGPT |
|---|---|---|
| Avg. | 70.13 | 57.34 |
| AI2 Reasoning Challenge (25-shot) | 68.09 | 58.10 |
| HellaSwag (10-shot) | 86.38 | 81.44 |
| MMLU (5-shot) | 64.01 | 60.68 |
| TruthfulQA (0-shot) | 57.01 | 36.60 |
| Winogrande (5-shot) | 81.69 | 76.56 |
| GSM8k (5-shot) | 63.61 | 30.70 |
## 📄 License
This project is licensed under the Apache-2.0 license.