Bielik-11B-v2 - AI Language Models Tool
Overview
Bielik-11B-v2 is an 11-billion-parameter causal (decoder-only) generative language model optimized for Polish-language tasks while remaining competitive on English benchmarks. It was developed by the SpeakLeash project in collaboration with ACK Cyfronet AGH and is released under the Apache-2.0 license. According to the Hugging Face model card, the model was initialized from Mistral-7B-v0.2 weights and trained with Megatron-LM using large-scale parallelization on the Helios supercomputer (256 NVIDIA GH200 GPUs), and it is intended as a high-quality base model for downstream fine-tuning. ([huggingface.co](https://huggingface.co/speakleash/Bielik-11B-v2))
Bielik-11B-v2 was pretrained on curated Polish corpora assembled by SpeakLeash, supplemented with subsets of CommonCrawl and related web corpora. The model card gives inconsistent token counts: some passages cite roughly 200B tokens trained for two epochs, while other summary lines state 400B tokens. The card and the related instruct variants report strong results on the Open PL LLM Leaderboard (Polish-focused) and the Open LLM Leaderboard (English benchmarks), and the team provides multiple instruction-tuned releases (the v2.0–v2.6 series) as well as quantized builds for deployment. ([huggingface.co](https://huggingface.co/speakleash/Bielik-11B-v2))
Model Statistics
- Downloads: 820
- Likes: 46
- Pipeline: text-generation
- Parameters: 11.2B
- License: apache-2.0
Model Details
Architecture and initialization: Bielik-11B-v2 is a transformer-based, causal decoder-only model scaled to roughly 11B parameters and explicitly initialized from Mistral-7B-v0.2 weights. The Hugging Face model card and supporting pages report training with Megatron-LM and advanced parallelization and depth-scaling techniques. ([huggingface.co](https://huggingface.co/speakleash/Bielik-11B-v2))
Training data and process: The model was trained on a Polish-centric corpus curated by SpeakLeash, supplemented with web text (CommonCrawl / SlimPajama subsets in some reports). The model card documents an XGBoost-based quality classifier used to select high-quality Polish documents and describes pretraining on hundreds of billions of tokens (the page cites both roughly 200B tokens over two epochs and an alternate 400B-token figure in summary text). Training ran on large HPC resources (the Helios and Athena supercomputers; 256 NVIDIA GH200 GPUs in reported runs). ([huggingface.co](https://huggingface.co/speakleash/Bielik-11B-v2))
Instruction tuning and variants: SpeakLeash publishes several instruction-tuned versions (v2.0, v2.1, v2.2, and the later merged v2.6) that apply supervised and preference-style tuning, including DPO-inspired techniques and a DPO-Positive variant for multi-turn conversations. The instruct variants are reported to use large instruction corpora (over 20M synthetic and manually written instructions) and curated reward-model filtering (a DPO-P dataset of about 66k examples). These variants improve generative and instruction-following performance and are published as separate model artifacts. ([huggingface.co](https://huggingface.co/speakleash/Bielik-11B-v2.2-Instruct))
Runtime and deployment: The model loads with standard Hugging Face Transformers classes (AutoTokenizer / AutoModelForCausalLM) and supports common precision modes such as bfloat16. Community and third-party builds provide quantized artifacts (GGUF/GPTQ/FP8), and NVIDIA documentation lists a native context window of 32,768 tokens for some instruct builds and notes support for vLLM and other inference engines. License: Apache-2.0 with SpeakLeash Terms of Use. ([huggingface.co](https://huggingface.co/speakleash/Bielik-11B-v2))
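The deployment notes above mention vLLM support. A minimal vLLM sketch follows, assuming vLLM is installed and the GPU has enough memory for the bf16 weights; the sampling settings and context cap are illustrative choices, not values from the model card:
from vllm import LLM, SamplingParams
# Load the base model in bfloat16; max_model_len caps the KV cache (raise it if resources allow)
llm = LLM(model="speakleash/Bielik-11B-v2", dtype="bfloat16", max_model_len=4096)
params = SamplingParams(temperature=0.7, top_p=0.9, max_tokens=150)
# Prompt: "Write a short paragraph in Polish about the benefits of reading books."
outputs = llm.generate(["Napisz krótki akapit po polsku o zaletach czytania książek."], params)
print(outputs[0].outputs[0].text)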
Key Features
- Polish-first 11B parameter model optimized for Polish NLP tasks and cross-lingual English capability.
- Initialized from Mistral-7B-v0.2 and scaled with Megatron-LM parallelization techniques.
- Trained on SpeakLeash-curated Polish corpora plus web text; high-quality filtering via XGBoost selection.
- Instruction-tuned variants (v2.0→v2.6) using DPO-like methods and large instruction corpora.
- Deployment-ready quantized builds (GGUF/GPTQ/FP8) and long context (32,768 tokens for some builds); see the quantized-inference sketch after this list.
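A minimal sketch for running a community GGUF quantization with llama-cpp-python, assuming llama-cpp-python is installed and a quantized GGUF file has already been downloaded; the file name below is a placeholder, not an official artifact name:
from llama_cpp import Llama
llm = Llama(
    model_path="bielik-11b-v2-q4_k_m.gguf",  # placeholder path to a downloaded GGUF build
    n_ctx=4096,        # raise toward 32768 only if the chosen build advertises long context
    n_gpu_layers=-1,   # offload all layers to GPU when one is available
)
# Prompt: "Write a short paragraph in Polish about the benefits of reading books."
out = llm("Napisz krótki akapit po polsku o zaletach czytania książek.", max_tokens=150)
print(out["choices"][0]["text"])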
Example Usage
Example (python):
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch
model_name = "speakleash/Bielik-11B-v2"
# Use bfloat16 to reduce memory footprint if supported
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16, device_map="auto")
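# Prompt translation: "Write a short paragraph in Polish about the benefits of reading books."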
prompt = "Napisz krótki akapit po polsku o zaletach czytania książek."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=150, do_sample=True, top_k=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
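For the instruction-tuned releases referenced below (e.g. Bielik-11B-v2.2-Instruct), generation typically goes through the tokenizer's chat template. A minimal sketch, assuming the instruct checkpoint ships a chat template and fits on the available GPU:
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch
instruct_name = "speakleash/Bielik-11B-v2.2-Instruct"
tokenizer = AutoTokenizer.from_pretrained(instruct_name)
model = AutoModelForCausalLM.from_pretrained(instruct_name, torch_dtype=torch.bfloat16, device_map="auto")
# Message: "Explain in two sentences what a language model is."
messages = [{"role": "user", "content": "Wyjaśnij w dwóch zdaniach, czym jest model językowy."}]
# apply_chat_template formats the conversation with the template bundled in the tokenizer
input_ids = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
outputs = model.generate(input_ids, max_new_tokens=150, do_sample=True, top_p=0.9)
# Decode only the newly generated tokens, skipping the prompt
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))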
Benchmarks
- Open PL LLM Leaderboard, average (Bielik-11B-v2): 58.14 (5-shot average across Polish NLP tasks) (Source: https://huggingface.co/speakleash/Bielik-11B-v2)
- Open LLM Leaderboard, average (Bielik-11B-v2): 65.87 (aggregated English-task average; breakdown: HellaSwag 79.84, GSM8K 67.78) (Source: https://huggingface.co/speakleash/Bielik-11B-v2)
- Open LLM Leaderboard, average (Bielik-11B-v2.2-Instruct): 69.86 (instruct-tuned release with an improved English-task average) (Source: https://huggingface.co/speakleash/Bielik-11B-v2.2-Instruct)
- Polish generative tasks (Open PL LLM), Bielik-11B-v2.2-Instruct: ~66.11 (reported generative average for the v2.2-Instruct release) (Source: https://huggingface.co/speakleash/Bielik-11B-v2.2-Instruct)
- Technical report: the Bielik 11B v2 technical report (May 5, 2025) documents the scaling approach and training methodology (Source: https://arxiv.org/abs/2505.02410)
Key Information
- Category: Language Models
- Type: AI Language Models Tool