🚀 Bielik-11B-v2.3-Instruct
Bielik-11B-v2.3-Instruct is a generative text model with 11 billion parameters. It addresses the need for high-performance Polish language processing by merging multiple fine-tuned models. The model is the result of a unique collaboration that leverages Polish computing infrastructure and large-scale text corpora, enabling accurate language understanding and task execution in Polish.
🚀 Quick Start
The model uses ChatML as the prompt format. Here is a basic example of how to use it:
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

device = "cuda"
model_name = "speakleash/Bielik-11B-v2.3-Instruct"

# Load the tokenizer and the model in bfloat16, then move the model to the GPU
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16)
model.to(device)

# A Polish conversation using the ChatML roles: system, user, assistant
messages = [
    {"role": "system", "content": "Odpowiadaj krótko, precyzyjnie i wyczerpująco w języku polskim."},
    {"role": "user", "content": "Jakie mamy pory roku w Polsce?"},
    {"role": "assistant", "content": "W Polsce mamy 4 pory roku: wiosna, lato, jesień i zima."},
    {"role": "user", "content": "Która jest najcieplejsza?"}
]

# Render the conversation with the model's chat template and generate a reply
input_ids = tokenizer.apply_chat_template(messages, return_tensors="pt")
model_inputs = input_ids.to(device)
generated_ids = model.generate(model_inputs, max_new_tokens=1000, do_sample=True)

decoded = tokenizer.batch_decode(generated_ids)
print(decoded[0])
✨ Features
- Multi-model Merge: A linear merge of Bielik-11B-v2.0-Instruct, Bielik-11B-v2.1-Instruct, and Bielik-11B-v2.2-Instruct, which are instruct fine-tuned versions of Bielik-11B-v2.
- Polish Language Focus: Developed and trained on Polish text corpora, enabling excellent performance in Polish-language tasks.
- Advanced Training Techniques: Uses weighted token-level loss, adaptive learning rate, and masked prompt tokens to improve performance.
- Multiple Quantized Versions: Available in various quantized versions, including GGUF, GPTQ, and FP8, to suit different resource requirements.
📦 Installation
No setup is required beyond the standard Hugging Face stack. Install the libraries used in the quick-start example:
pip install transformers torch
💻 Usage Examples
Basic Usage
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
device = "cuda"
model_name = "speakleash/Bielik-11B-v2.3-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16)
messages = [
{"role": "user", "content": "Jakie mamy pory roku?"}
]
input_ids = tokenizer.apply_chat_template(messages, return_tensors="pt")
model_inputs = input_ids.to(device)
model.to(device)
generated_ids = model.generate(model_inputs, max_new_tokens=100, do_sample=True)
decoded = tokenizer.batch_decode(generated_ids)
print(decoded[0])
Advanced Usage
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
device = "cuda"
model_name = "speakleash/Bielik-11B-v2.3-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16)
messages = [
{"role": "system", "content": "Odpowiadaj krótko, precyzyjnie i wyczerpująco w języku polskim."},
{"role": "user", "content": "Jakie mamy pory roku w Polsce?"},
{"role": "assistant", "content": "W Polsce mamy 4 pory roku: wiosna, lato, jesień i zima."},
{"role": "user", "content": "Która jest najcieplejsza?"}
]
input_ids = tokenizer.apply_chat_template(messages, return_tensors="pt")
model_inputs = input_ids.to(device)
model.to(device)
generated_ids = model.generate(model_inputs, max_new_tokens=500, temperature=0.7, top_k=50, do_sample=True)
decoded = tokenizer.batch_decode(generated_ids)
print(decoded[0])
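The decoded output still contains the ChatML special tokens (e.g. <|im_start|>, <|im_end|>). To keep only the generated text, decode with skip_special_tokens=True:

# Drop special tokens from the decoded output
decoded = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)
print(decoded[0])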
📚 Documentation
Model
The SpeakLeash team developed custom Polish instructions. Because high-quality Polish instructions are scarce, synthetic instructions were generated with Mixtral 8x22B and used in training. The training dataset contained over 20 million instructions comprising more than 10 billion tokens. To improve performance, several techniques were introduced, including weighted token-level loss, adaptive learning rate, and masked prompt tokens.
The DPO-Positive method was used to align the model with user preferences. The model was merged using mergekit by Remigiusz Kinas.
Quantized models (see the loading sketch below):
- GGUF - Q4_K_M, Q5_K_M, Q6_K, Q8_0
- GPTQ - 4bit
- FP8 (for vLLM and SGLang; optimized for Ada Lovelace and Hopper GPUs)
- GGUF (experimental) - IQ imatrix IQ1_M, IQ2_XXS, IQ3_XXS, IQ4_XS and calibrated Q4_K_M, Q5_K_M, Q6_K, Q8_0
⚠️ Important Note
Quantized models may offer lower quality of generated answers compared to full-sized variants.
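To run one of the GGUF quantizations locally, llama-cpp-python is a lightweight option. The snippet below is a minimal sketch; the repository name (speakleash/Bielik-11B-v2.3-Instruct-GGUF) and the quant filename pattern are assumptions, so check the actual Hugging Face repositories for the exact names.

# Minimal sketch: running a GGUF quantization with llama-cpp-python.
# The repo_id and filename below are assumptions - verify them on Hugging Face.
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="speakleash/Bielik-11B-v2.3-Instruct-GGUF",  # assumed companion GGUF repo
    filename="*Q4_K_M.gguf",                             # assumed quant filename pattern
    n_ctx=4096,
    n_gpu_layers=-1,                                     # offload all layers to GPU if available
)

# llama-cpp-python accepts chat-style messages and applies the model's chat template
response = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Jakie mamy pory roku w Polsce?"}],
    max_tokens=200,
)
print(response["choices"][0]["message"]["content"])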
Chat template
Bielik-11B-v2.3-Instruct uses ChatML as the prompt format. For example:
prompt = "<s><|im_start|> user\nJakie mamy pory roku?<|im_end|> \n<|im_start|> assistant\n"
completion = "W Polsce mamy 4 pory roku: wiosna, lato, jesień i zima.<|im_end|> \n"
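Rather than writing this string by hand, the same prompt can be rendered from the tokenizer's built-in chat template (standard transformers API):

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("speakleash/Bielik-11B-v2.3-Instruct")

messages = [{"role": "user", "content": "Jakie mamy pory roku?"}]

# tokenize=False returns the rendered prompt string instead of token ids;
# add_generation_prompt=True appends the opening tag of the assistant turn
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
print(prompt)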
Evaluation
Bielik-11B-v2.3-Instruct has been evaluated on several benchmarks:
- Open PL LLM Leaderboard
- Open LLM Leaderboard
- Polish MT-Bench
- Polish EQ-Bench (Emotional Intelligence Benchmark)
- MixEval
Open PL LLM Leaderboard
Models were evaluated on the Open PL LLM Leaderboard in a 5-shot setting. The benchmark assesses NLP tasks such as sentiment analysis, categorization, and text classification.
| Model | Parameters (B) | Average |
|-------|----------------|---------|
| Meta-Llama-3.1-405B-Instruct-FP8,API | 405 | 69.44 |
| Mistral-Large-Instruct-2407 | 123 | 69.11 |
| Qwen2-72B-Instruct | 72 | 65.87 |
| Bielik-11B-v2.3-Instruct | 11 | 65.71 |
| Bielik-11B-v2.2-Instruct | 11 | 65.57 |
| Meta-Llama-3.1-70B-Instruct | 70 | 65.49 |
| Bielik-11B-v2.1-Instruct | 11 | 65.45 |
| Mixtral-8x22B-Instruct-v0.1 | 141 | 65.23 |
| Bielik-11B-v2.0-Instruct | 11 | 64.98 |
| Meta-Llama-3-70B-Instruct | 70 | 64.45 |
| Athene-70B | 70 | 63.65 |
| WizardLM-2-8x22B | 141 | 62.35 |
| Qwen1.5-72B-Chat | 72 | 58.67 |
| Qwen2-57B-A14B-Instruct | 57 | 56.89 |
| glm-4-9b-chat | 9 | 56.61 |
| aya-23-35B | 35 | 56.37 |
| Phi-3.5-MoE-instruct | 41.9 | 56.34 |
| openchat-3.5-0106-gemma | 7 | 55.69 |
| Mistral-Nemo-Instruct-2407 | 12 | 55.27 |
| SOLAR-10.7B-Instruct-v1.0 | 10.7 | 55.24 |
| Mixtral-8x7B-Instruct-v0.1 | 46.7 | 55.07 |
| Bielik-7B-Instruct-v0.1 | 7 | 44.70 |
| trurl-2-13b-academic | 13 | 36.28 |
| trurl-2-7b | 7 | 26.93 |
The results show that Bielik-11B-v2.3-Instruct:
- Outperforms all other models with less than 70B parameters.
- Performs on par with models in the 70B parameter range.
- Shows a marked improvement over its predecessor, Bielik-7B-Instruct-v0.1.
- Stands out as a leader among Polish language models.
Open PL LLM Leaderboard - Generative Tasks Performance
| Model | Parameters (B) | Average (generative tasks) |
|-------|----------------|----------------------------|
| Bielik-11B-v2.3-Instruct | 11 | 67.47 |
| Bielik-11B-v2.1-Instruct | 11 | 66.58 |
| Bielik-11B-v2.2-Instruct | 11 | 66.11 |
| Bielik-11B-v2.0-Instruct | 11 | 65.58 |
| gpt-3.5-turbo-instruct | Unavailable | N/A |
🔧 Technical Details
- Training Dataset: Over 20 million instructions comprising more than 10 billion tokens.
- Training Techniques: Weighted token-level loss, adaptive learning rate, masked prompt tokens (illustrated in the sketch below), and DPO-Positive for preference alignment.
- Merge Method: Linear merge of multiple models using mergekit.
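As an illustration of prompt-token masking (a generic sketch, not the SpeakLeash training code), a common implementation sets the label of every prompt token to -100 so that the cross-entropy loss is computed only on the response tokens:

import torch

# Minimal sketch of prompt-token masking for supervised fine-tuning.
# This is a generic illustration, not the actual Bielik training code.
IGNORE_INDEX = -100  # labels with this value are ignored by PyTorch's cross-entropy loss

def build_labels(prompt_ids: list[int], response_ids: list[int]) -> dict:
    """Concatenate prompt and response ids; mask the prompt part in the labels."""
    input_ids = prompt_ids + response_ids
    labels = [IGNORE_INDEX] * len(prompt_ids) + response_ids  # loss only on the response
    return {
        "input_ids": torch.tensor(input_ids),
        "labels": torch.tensor(labels),
    }

# Example with placeholder ids; in practice they come from the tokenizer
example = build_labels(prompt_ids=[1, 523, 781], response_ids=[942, 17, 2])
print(example["labels"])  # tensor([-100, -100, -100, 942, 17, 2])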
📄 License
The model is licensed under Apache 2.0 and Terms of Use.
⚠️ Important Note
If you want to learn more about how you can use the model, please refer to our Terms of Use.