Bielik 11B V2

Developed by SpeakLeash
Bielik-11B-v2 is a generative text model with 11 billion parameters, developed and trained specifically for the Polish language. It is initialized from Mistral-7B-v0.2 and trained on 400 billion tokens.
Downloads: 690
Release date: 8/26/2024

Model Overview

This model is the result of a collaboration between the open-source scientific project SpeakLeash and the high-performance computing center ACK Cyfronet AGH. It demonstrates exceptional Polish language understanding and processing capabilities, accurately responding to and efficiently completing various language tasks.

Model Features

Large-scale training
Initialized from its predecessor Mistral-7B-v0.2 and trained on 400 billion tokens. The training data includes Polish texts collected by the SpeakLeash project and subsets of CommonCrawl.
High-quality data
Polish text quality was evaluated with an XGBoost classifier; only texts labeled HIGH quality with a predicted probability above 90% were selected, ensuring a refined, high-quality training corpus.
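The quality-filtering step described above can be sketched as a simple probability threshold over classifier scores. This is a minimal illustrative sketch: `StubClassifier` and `select_high_quality` are assumed names standing in for SpeakLeash's actual (non-public) XGBoost pipeline.

```python
# Minimal sketch of probability-threshold quality filtering.
# StubClassifier stands in for the real XGBoost model; its heuristic
# (longer text = higher quality score) is purely illustrative.

class StubClassifier:
    classes_ = ("HIGH", "LOW")

    def predict_proba(self, texts):
        # Return [P(HIGH), P(LOW)] per text; placeholder scoring only.
        probs = []
        for text in texts:
            p_high = min(len(text) / 100.0, 1.0)
            probs.append((p_high, 1.0 - p_high))
        return probs


def select_high_quality(texts, classifier, threshold=0.90):
    """Keep texts the classifier labels HIGH with probability above `threshold`."""
    high_idx = classifier.classes_.index("HIGH")
    kept = []
    for text, probs in zip(texts, classifier.predict_proba(texts)):
        if probs[high_idx] > threshold:
            kept.append(text)
    return kept


corpus = ["a" * 95, "short snippet", "b" * 120]
clean = select_high_quality(corpus, StubClassifier())
# "short snippet" falls below the 90% bar and is dropped
```

With a real trained classifier in place of the stub, the same threshold logic yields the HIGH-probability subset used for training.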
High-performance computing
Training was completed on the Helios supercomputer at ACK Cyfronet AGH, using 256 NVIDIA GH200 GPUs and leveraging the large-scale computing infrastructure of the Polish PLGrid environment.

Model Capabilities

Polish text generation
Polish language understanding and processing
Language task response

Use Cases

Language processing
Text generation
Generates Polish texts such as articles and stories, and can accurately handle a wide range of language tasks.
Sentiment analysis
Analyzes the sentiment of Polish texts; the model performs strongly on the Open PL LLM Leaderboard.