🚀 Bielik-4.5B-v3-Instruct
Bielik-4.5B-v3-Instruct is a generative text model with 4.6 billion parameters, an instruct fine-tuned version of Bielik-4.5B-v3. The model is the result of a collaboration between the open-science/open-source project SpeakLeash and the High Performance Computing (HPC) center ACK Cyfronet AGH. It was developed and trained on Polish text corpora processed by the SpeakLeash team, using the Polish large-scale computing infrastructure of the PLGrid environment at the ACK Cyfronet AGH HPC center, supported by computational grants PLG/2024/017214 and PLG/2025/018338 on the Athena and Helios supercomputers. The model understands and processes Polish well, offering accurate responses and performing a variety of linguistic tasks with precision.
📚 Technical report: https://arxiv.org/abs/2505.02550
✨ Features
- High-Quality Training: Trained on over 19 million instructions with more than 12 billion tokens, including manually verified and synthetic instructions.
- Advanced Alignment Techniques: Aligned with user preferences using the DPO-Positive method, which introduces multi-turn conversations.
- Polish Language Proficiency: Exceptionally capable of understanding and processing the Polish language.
📦 Installation
The usage examples below require only PyTorch and the Hugging Face Transformers library (for example, pip install torch transformers), plus a CUDA-capable GPU for the device="cuda" setting used in the code.
💻 Usage Examples
Basic Usage
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

device = "cuda"
model_name = "speakleash/Bielik-4.5B-v3-Instruct"

# Load the tokenizer and the model weights in bfloat16
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16)

# Multi-turn conversation; system prompt: "Answer briefly, precisely, and only in Polish."
messages = [
    {"role": "system", "content": "Odpowiadaj krótko, precyzyjnie i wyłącznie w języku polskim."},
    {"role": "user", "content": "Jakie mamy pory roku w Polsce?"},  # "What seasons do we have in Poland?"
    {"role": "assistant", "content": "W Polsce mamy 4 pory roku: wiosna, lato, jesień i zima."},
    {"role": "user", "content": "Która jest najcieplejsza?"},  # "Which one is the warmest?"
]

# Render the conversation with the model's chat template and move everything to the GPU
input_ids = tokenizer.apply_chat_template(messages, return_tensors="pt")
model_inputs = input_ids.to(device)
model.to(device)

# Sample a continuation and decode it back to text
generated_ids = model.generate(model_inputs, max_new_tokens=1000, do_sample=True)
decoded = tokenizer.batch_decode(generated_ids)
print(decoded[0])
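
The decoded output above contains the full prompt together with special tokens. If only the newly generated reply is needed, a minimal variation (reusing model_inputs and generated_ids from the example above) is to slice off the prompt and skip special tokens when decoding:

# Keep only the tokens generated after the prompt and drop special tokens
new_tokens = generated_ids[:, model_inputs.shape[1]:]
reply = tokenizer.batch_decode(new_tokens, skip_special_tokens=True)[0]
print(reply)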
Advanced Usage
The chat template already handles multi-turn conversations, as the basic example shows; multi-turn data is also the main novelty introduced with the DPO-Positive alignment described under Technical Details. A conversation can be continued by appending the model's reply and the next user turn and re-applying the template, as in the sketch below.
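A minimal sketch of such a continuation, reusing tokenizer, model, device, messages, and the decoded reply from the examples above (the follow-up question and the add_generation_prompt flag are illustrative choices, not part of the original example):

# Extend the conversation with the model's reply and a new user turn
messages.append({"role": "assistant", "content": reply})
messages.append({"role": "user", "content": "A która jest najzimniejsza?"})  # "And which one is the coldest?"

# Re-render the whole conversation; add_generation_prompt appends the opening assistant tag
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(device)

generated_ids = model.generate(input_ids, max_new_tokens=200, do_sample=True)
print(tokenizer.batch_decode(generated_ids[:, input_ids.shape[1]:], skip_special_tokens=True)[0])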
📚 Documentation
Model
The SpeakLeash team is continuously expanding and refining a set of Polish instructions. A manually verified portion of these instructions was used for training, along with synthetic instructions generated by Bielik 11B v2.3. The training dataset had over 19 million instructions with more than 12 billion tokens.
To align the model with user preferences, multiple techniques were tested, and the DPO-Positive method was chosen. It uses both generated and manually corrected examples scored by a metamodel. A dataset of over 111,000 examples of different lengths was filtered and evaluated by the reward model.
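For orientation, a rough sketch of the DPO-Positive objective as it is usually formulated in the literature (this is not the SpeakLeash training code; the beta and lambda values and the per-sequence log-probability inputs are illustrative assumptions):

import torch
import torch.nn.functional as F

def dpo_positive_loss(policy_chosen_logps, policy_rejected_logps,
                      ref_chosen_logps, ref_rejected_logps,
                      beta=0.1, lambda_=5.0):
    # Standard DPO margin between the chosen and rejected responses
    chosen_logratio = policy_chosen_logps - ref_chosen_logps
    rejected_logratio = policy_rejected_logps - ref_rejected_logps
    # DPO-Positive penalty: punish the policy for falling below the
    # reference model's likelihood on the chosen (positive) response
    penalty = torch.clamp(ref_chosen_logps - policy_chosen_logps, min=0.0)
    logits = beta * (chosen_logratio - rejected_logratio - lambda_ * penalty)
    return -F.logsigmoid(logits).mean()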
Bielik instruct models are trained using the open-source ALLaMo framework implemented by Krzysztof Ociepa.
Model description:
- Developed by: SpeakLeash and ACK Cyfronet AGH
- Language: Polish
- Finetuned from: Bielik-4.5B-v3
- License: Apache 2.0 and Terms of Use
Chat template
Bielik-4.5B-v3-Instruct uses ChatML as the prompt format.
E.g.
prompt = "<s><|im_start|> user\nJakie mamy pory roku?<|im_end|> \n<|im_start|> assistant\n"
completion = "W Polsce mamy 4 pory roku: wiosna, lato, jesień i zima.<|im_end|> \n"
This format is available as a chat template via the apply_chat_template() method.
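To inspect the rendered prompt as text rather than token IDs, the same template can be applied with tokenize=False (a small sketch reusing the messages list from the usage example; add_generation_prompt appends the opening assistant tag so the model answers as the assistant):

# Render the conversation to a ChatML-formatted string instead of token IDs
prompt_text = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
print(prompt_text)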
🔧 Technical Details
The model is developed through a unique collaboration between SpeakLeash and ACK Cyfronet AGH. It is trained on Polish text corpora processed by the SpeakLeash team, using the Polish large-scale computing infrastructure in the PLGrid environment. The training is supported by computational grants PLG/2024/017214 and PLG/2025/018338 on the Athena and Helios supercomputers.
For alignment, the DPO-Positive method is used. A dataset of over 111,000 examples is filtered and evaluated by the reward model to select instructions with the right level of difference between chosen and rejected responses. The novelty in DPO-P is the introduction of multi-turn conversations.
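A hypothetical illustration of that selection step (the field names, score scale, and margin thresholds are assumptions made for the example, not the SpeakLeash pipeline):

# Keep preference pairs whose reward-model score margin is informative:
# drop near-ties as well as trivially easy comparisons
def select_pairs(pairs, min_margin=0.05, max_margin=0.9):
    selected = []
    for pair in pairs:
        margin = pair["chosen_score"] - pair["rejected_score"]
        if min_margin <= margin <= max_margin:
            selected.append(pair)
    return selected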
📄 License
The model is released under the Apache 2.0 license and the accompanying Terms of Use.
Limitations and Biases
Bielik-4.5B-v3-Instruct is a quick demonstration that the base model can be easily fine-tuned. It does not have any moderation mechanisms. Because it was trained on various public datasets, the model may produce factually incorrect, lewd, false, biased, or otherwise offensive outputs.
Citation
Please cite this model using the following format:
@misc{ociepa2025bielikv3smalltechnical,
      title={Bielik v3 Small: Technical Report},
      author={Krzysztof Ociepa and Łukasz Flis and Remigiusz Kinas and Krzysztof Wróbel and Adrian Gwoździej},
      year={2025},
      eprint={2505.02550},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2505.02550},
}
@misc{Bielik45Bv3i,
      title = {Bielik-4.5B-v3-Instruct model card},
      author = {Ociepa, Krzysztof and Flis, Łukasz and Kinas, Remigiusz and Gwoździej, Adrian and Wróbel, Krzysztof and {SpeakLeash Team} and {Cyfronet Team}},
      year = {2025},
      url = {https://huggingface.co/speakleash/Bielik-4.5B-v3-Instruct},
      note = {Accessed: 2025-05-06}, % change this date
      urldate = {2025-05-06} % change this date
}
Responsible for training the model
- Krzysztof Ociepa (SpeakLeash): Team leadership, conceptualizing, data preparation, process optimization, and oversight of training.
- Łukasz Flis (Cyfronet AGH): Coordinating and supervising the training.
- Remigiusz Kinas (SpeakLeash): Conceptualizing, coordinating RL training, data preparation, benchmarking, and quantizations.
- Adrian Gwoździej (SpeakLeash): Data preparation and ensuring data quality.
- Krzysztof Wróbel (SpeakLeash): Benchmarks.
Many individuals from the SpeakLeash team and ACK Cyfronet AGH team also contributed to the model creation.
We thank the Polish high-performance computing infrastructure PLGrid (HPC Center: ACK Cyfronet AGH) for providing computer facilities and support through computational grants PLG/2024/017214 and PLG/2025/018338.
Contact Us
If you have any questions or suggestions, please use the discussion tab. To contact us directly, join our SpeakLeash Discord.