🚀 Bielik-1.5B-v3-Instruct
Bielik-1.5B-v3-Instruct is a generative text model with 1.6 billion parameters. It is an instruct fine-tuned version of Bielik-1.5B-v3. The model is the result of a collaboration between the open-science/open-source project SpeakLeash and the High Performance Computing (HPC) center ACK Cyfronet AGH. It was trained on Polish text corpora processed by the SpeakLeash team, using Poland's large-scale computing infrastructure in the PLGrid environment, specifically at the HPC center ACK Cyfronet AGH. Creation and training were supported by computational grants PLG/2024/017214 and PLG/2025/018338 on the Athena and Helios supercomputers, which provided the scale needed for large training runs. As a result, the model excels at understanding and processing Polish, offering accurate responses and performing linguistic tasks with high precision.
📚 Technical report: https://arxiv.org/abs/2505.02550
🚀 Quick Start
The model is ready to use with the Hugging Face transformers library; a minimal example follows below, and the remaining sections cover usage and features in more detail.
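The snippet below is a minimal quick-start sketch, not an official recipe: it assumes a recent transformers release (one whose text-generation pipeline accepts chat messages directly) and uses device_map="auto", which requires the accelerate package; drop that argument to run on CPU.

# Minimal quick-start sketch; assumes a recent transformers version with
# chat-message support in text-generation pipelines.
import torch
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="speakleash/Bielik-1.5B-v3-Instruct",
    torch_dtype=torch.bfloat16,
    device_map="auto",  # requires accelerate; remove to run on CPU
)

messages = [{"role": "user", "content": "Jakie mamy pory roku w Polsce?"}]
print(generator(messages, max_new_tokens=100)[0]["generated_text"])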
✨ Features
- Polish Language Proficiency: Developed and trained on Polish text corpora, it has excellent understanding and processing capabilities for the Polish language.
- Instruct Fine-tuning: It is an instruct fine-tuned version, enabling it to perform various linguistic tasks with high precision.
- Powered by Advanced Infrastructure: Leveraging Polish large-scale computing infrastructure and computational grants, it uses cutting-edge technology for training.
📚 Documentation
Model
The SpeakLeash team is developing its own set of Polish instructions, which are continuously refined by annotators. A manually verified and corrected portion of these instructions was used for training. Due to the limited availability of high-quality Polish instructions, synthetic instructions generated by Bielik 11B v2.3 were also used. The training dataset consisted of over 19 million instructions with more than 12 billion tokens.
To align the model with user preferences, various techniques were tested, including DPO, PPO, KTO, and SiMPO. The DPO-Positive method was ultimately adopted, using both generated and manually corrected examples scored by a metamodel. A dataset of over 111,000 examples of varying lengths was filtered and evaluated by a reward model to select instructions with a sufficient difference between the chosen and rejected responses. A novel aspect of this DPO-Positive training was the introduction of multi-turn conversations.
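As a rough illustration of the DPO-Positive objective mentioned above, the sketch below shows one common formulation of the loss. It is illustrative only, not the SpeakLeash training code: the beta and lambda values are placeholders, and the per-sequence log-probabilities are assumed to be precomputed for the policy and a frozen reference model.

# Illustrative DPO-Positive-style loss (hypothetical sketch, not the actual
# training implementation). Inputs are per-sequence log-probabilities.
import torch
import torch.nn.functional as F

def dpo_positive_loss(policy_chosen_logps, policy_rejected_logps,
                      ref_chosen_logps, ref_rejected_logps,
                      beta=0.1, lam=5.0):
    # Standard DPO margin between chosen and rejected responses.
    chosen_ratio = policy_chosen_logps - ref_chosen_logps
    rejected_ratio = policy_rejected_logps - ref_rejected_logps
    margin = chosen_ratio - rejected_ratio
    # DPO-Positive penalty: discourage the policy from pushing the
    # likelihood of the chosen response below the reference model's.
    penalty = torch.clamp(ref_chosen_logps - policy_chosen_logps, min=0.0)
    return -F.logsigmoid(beta * (margin - lam * penalty)).mean()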
The Bielik instruct models were trained using an original open-source framework called ALLaMo implemented by Krzysztof Ociepa. This framework allows for fast and efficient training of language models with architectures similar to LLaMA and Mistral.
Model description:
- Developed by: SpeakLeash and ACK Cyfronet AGH
- Language: Polish
- Model type: causal decoder-only transformer
- Finetuned from: Bielik-1.5B-v3
Chat template
Bielik-1.5B-v3-Instruct uses ChatML as the prompt format.
For example:
prompt = "<s><|im_start|> user\nJakie mamy pory roku?<|im_end|> \n<|im_start|> assistant\n"
completion = "W Polsce mamy 4 pory roku: wiosna, lato, jesień i zima.<|im_end|> \n"
This format is available as a chat template via the apply_chat_template() method:
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

device = "cuda"  # use "cpu" if no GPU is available
model_name = "speakleash/Bielik-1.5B-v3-Instruct"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16)
model.to(device)

messages = [
    {"role": "system", "content": "Odpowiadaj krótko, precyzyjnie i wyłącznie w języku polskim."},
    {"role": "user", "content": "Jakie mamy pory roku w Polsce?"},
    {"role": "assistant", "content": "W Polsce mamy 4 pory roku: wiosna, lato, jesień i zima."},
    {"role": "user", "content": "Która jest najcieplejsza?"}
]

# Tokenize the conversation using the model's ChatML chat template.
input_ids = tokenizer.apply_chat_template(messages, return_tensors="pt")
model_inputs = input_ids.to(device)

generated_ids = model.generate(model_inputs, max_new_tokens=1000, do_sample=True)
decoded = tokenizer.batch_decode(generated_ids)
print(decoded[0])
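The decoded output above contains the whole conversation, including the prompt. If you only want the newly generated reply, you can slice off the prompt tokens before decoding, for example:

# Optional: decode only the tokens generated after the prompt.
new_tokens = generated_ids[0][model_inputs.shape[-1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))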
The fully formatted input conversation produced by apply_chat_template for the previous example:
<s><|im_start|> system
Odpowiadaj krótko, precyzyjnie i wyłącznie w języku polskim.<|im_end|>
<|im_start|> user
Jakie mamy pory roku w Polsce?<|im_end|>
<|im_start|> assistant
W Polsce mamy 4 pory roku: wiosna, lato, jesień i zima.<|im_end|>
<|im_start|> user
Która jest najcieplejsza?<|im_end|>
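To inspect or reproduce the rendered prompt string above, apply_chat_template can return plain text instead of token IDs, for example:

# Render the chat template to a plain string instead of token IDs.
prompt_text = tokenizer.apply_chat_template(messages, tokenize=False)
print(prompt_text)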
Limitations and Biases
Bielik-1.5B-v3-Instruct is a quick demonstration that the base model can be easily fine-tuned to achieve compelling and promising performance. It does not have any moderation mechanisms. We look forward to engaging with the community on ways to make the model respect guardrails, allowing for deployment in environments that require moderated outputs.
Bielik-1.5B-v3-Instruct can produce factually incorrect output and should not be relied on to produce factually accurate information. It was trained on various public datasets. While great efforts have been taken to clean the training data, it is possible that the model can generate lewd, false, biased, or otherwise offensive outputs.
Citation
Please cite this model using the following format:
@misc{ociepa2025bielikv3smalltechnical,
title={Bielik v3 Small: Technical Report},
author={Krzysztof Ociepa and Łukasz Flis and Remigiusz Kinas and Krzysztof Wróbel and Adrian Gwoździej},
year={2025},
eprint={2505.02550},
archivePrefix={arXiv},
primaryClass={cs.LG},
url={https://arxiv.org/abs/2505.02550},
}
@misc{Bielik15Bv3i,
title = {Bielik-1.5B-v3-Instruct model card},
author = {Ociepa, Krzysztof and Flis, Łukasz and Kinas, Remigiusz and Gwoździej, Adrian and Wróbel, Krzysztof and {SpeakLeash Team} and {Cyfronet Team}},
year = {2025},
url = {https://huggingface.co/speakleash/Bielik-1.5B-v3-Instruct},
note = {Accessed: 2025-05-06}, % change this date
urldate = {2025-05-06} % change this date
}
@unpublished{Bielik15Bv33a,
author = {Ociepa, Krzysztof and Flis, Łukasz and Kinas, Remigiusz and Gwoździej, Adrian and Wróbel, Krzysztof},
title = {Bielik: A Family of Large Language Models for the Polish Language - Development, Insights, and Evaluation},
year = {2024},
}
Responsible for training the model
- Krzysztof Ociepa (SpeakLeash) - team leadership, conceptualizing, data preparation, process optimization and oversight of training
- Łukasz Flis (Cyfronet AGH) - coordinating and supervising the training
- Remigiusz Kinas (SpeakLeash) - conceptualizing, coordinating RL training, data preparation, benchmarking and quantizations
- Adrian Gwoździej (SpeakLeash) - data preparation and ensuring data quality
- Krzysztof Wróbel (SpeakLeash) - benchmarks
The model could not have been created without the commitment and work of the entire SpeakLeash team, whose contribution is invaluable. Thanks to the hard work of many individuals, it was possible to gather a large amount of content in Polish and establish collaboration between the open-science SpeakLeash project and the HPC center: ACK Cyfronet AGH. Individuals who contributed to the creation of the model:
Sebastian Kondracki,
Igor Ciuciura,
Szymon Baczyński,
Jacek Chwiła,
Dominika Basaj,
Kuba Sołtys,
Karol Jezierski,
Anna Przybył,
Agnieszka Ratajska,
Witold Wydmański,
Izabela Babis,
Nina Babis.
Members of the ACK Cyfronet AGH team providing valuable support and expertise:
Szymon Mazurek,
Marek Magryś,
Mieszko Cholewa.
We gratefully acknowledge the Polish high-performance computing infrastructure PLGrid (HPC Center: ACK Cyfronet AGH) for providing computing facilities and support within computational grants no. PLG/2024/017214 and PLG/2025/018338.
Contact Us
If you have any questions or suggestions, please use the discussion tab. If you want to contact us directly, join our SpeakLeash Discord.