BLOOM - AI Language Models Tool
Overview
BLOOM (BigScience Large Open-science Open-access Multilingual Language Model) is a 176-billion-parameter autoregressive Transformer released by the BigScience research workshop to provide an open, research-grade multilingual LLM. Trained in public on the Jean Zay supercomputer, BLOOM was built from the ROOTS corpus spanning 46 natural languages and 13 programming languages; the project released full weights, intermediate checkpoints, and optimizer states to support transparent research and replication. ([huggingface.co](https://huggingface.co/bigscience/bloom))
The model uses a decoder-only architecture with 70 layers, 112 attention heads, ALiBi positional encodings, a 2048-token context window, and a byte-level BPE tokenizer with a vocabulary of roughly 250k tokens. BLOOM is distributed under the BigScience Responsible AI License (RAIL), which permits use, including commercial use, subject to use-based restrictions on certain harmful applications. The project also produced instruction-tuned variants (BLOOMZ) and ecosystem integrations (Hugging Face model cards, community tools such as Petals for distributed inference, and published training artifacts including optimizer checkpoints) to make experimentation and evaluation easier for researchers. ([huggingface.co](https://huggingface.co/bigscience/bloom))
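The architectural and tokenizer figures above can be checked directly from the published artifacts without downloading the 176B weights. The following is a minimal sketch using the transformers library; it assumes Hub access and that the BloomConfig field names (n_layer, n_head, hidden_size) match current transformers releases.
Example (python):
from transformers import AutoConfig, AutoTokenizer
model_id = "bigscience/bloom"
# Fetching the config and tokenizer files is cheap (a few MB); no model weights are downloaded.
config = AutoConfig.from_pretrained(model_id)
print("layers:", config.n_layer)            # expected to be 70 per the model card
print("attention heads:", config.n_head)    # expected to be 112 per the model card
print("hidden size:", config.hidden_size)
tokenizer = AutoTokenizer.from_pretrained(model_id)
print("tokenizer vocab size:", len(tokenizer))  # roughly 250k byte-level BPE entries
# The byte-level BPE vocabulary covers many scripts and code without falling back to <unk>:
for text in ["Hello, world!", "Bonjour le monde", "你好，世界", "def add(a, b): return a + b"]:
    print(text, "->", tokenizer.tokenize(text)[:8])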
Key Features
- 176B-parameter decoder-only Transformer for large-scale multilingual generation.
- Trained on ROOTS: ~350B tokens across 46 natural and 13 programming languages.
- Open release with full weights, intermediate checkpoints, and optimizer states for reproducibility.
- Responsible AI License (RAIL) that documents permitted and restricted uses.
- ALiBi positional encoding and an embedding LayerNorm (StableEmbedding-style) for length extrapolation and training stability; a simplified ALiBi sketch follows this list.
- Ecosystem integrations: Hugging Face model card, BLOOMZ instruction-tuned variants, and Petals support.
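ALiBi (Attention with Linear Biases) adds a fixed, per-head linear penalty to attention scores instead of learned or rotary position embeddings, which is what gives the length-extrapolation behavior mentioned above. The sketch below is an illustrative reimplementation of the bias tensor only, assuming a power-of-two head count rather than BLOOM's 112 heads and ignoring padding and causal masks; it is not the exact build used in transformers.
Example (python):
import torch
def alibi_bias(num_heads: int, seq_len: int) -> torch.Tensor:
    """Illustrative ALiBi bias of shape (num_heads, seq_len, seq_len).
    Assumes num_heads is a power of two; non-power-of-two head counts (like
    BLOOM's 112) use an extended slope schedule not shown here.
    """
    # Geometric slope schedule from the ALiBi paper: 2^(-8/n), 2^(-16/n), ...
    start = 2.0 ** (-8.0 / num_heads)
    slopes = torch.tensor([start ** (i + 1) for i in range(num_heads)])
    # Relative distance j - i: 0 on the diagonal, negative for past key positions.
    positions = torch.arange(seq_len)
    distance = positions[None, :] - positions[:, None]
    # Each head adds slope * distance to its pre-softmax attention scores,
    # so more distant past tokens receive a larger negative bias.
    return slopes[:, None, None] * distance[None, :, :]
bias = alibi_bias(num_heads=8, seq_len=6)
print(bias.shape)  # torch.Size([8, 6, 6])
print(bias[0])     # head 0: bias grows more negative for older tokens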
Example Usage
Example (python):
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch
# WARNING: bigscience/bloom (176B) requires very large GPUs or inference sharding.
# This example shows the standard HF loading pattern; adjust device_map/quantization for your setup.
model_id = "bigscience/bloom"
tokenizer = AutoTokenizer.from_pretrained(model_id)
# device_map="auto" places layers across available devices (requires the accelerate package and a recent transformers release)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    torch_dtype=torch.bfloat16,
    low_cpu_mem_usage=True,
)
prompt = "Write a concise summary of BLOOM's design and intended uses."
# With device_map="auto", move the inputs to the device holding the first model shard
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
# Generate (set max_new_tokens small for quick test)
with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=128, do_sample=True, temperature=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
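Running the full 176B checkpoint in bfloat16 needs roughly 350 GB of accelerator memory (176B parameters at 2 bytes each), so for local experimentation it is common to quantize the weights or to switch to a smaller checkpoint from the same family such as bigscience/bloom-560m or bigscience/bloom-7b1. The following is a minimal sketch, assuming a recent transformers with BitsAndBytesConfig support, the bitsandbytes package, and a single CUDA GPU; the smaller bloom-7b1 checkpoint is substituted here purely so the example fits on one device.
Example (python):
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
# Smaller sibling checkpoint keeps the example runnable on a single consumer GPU.
model_id = "bigscience/bloom-7b1"
bnb_config = BitsAndBytesConfig(load_in_8bit=True)  # 8-bit weights via bitsandbytes
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    quantization_config=bnb_config,
)
inputs = tokenizer("Translate to French: The weather is nice today.", return_tensors="pt").to(model.device)
with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))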
Benchmarks
- Parameters: 176,247,271,424 (176B) (Source: https://huggingface.co/bigscience/bloom)
- Training data: ≈350 billion tokens, ≈1.6 TB of preprocessed text (ROOTS corpus) (Source: https://huggingface.co/bigscience/bloom)
- Tokenizer vocabulary: byte-level BPE, vocabulary size 250,680 (Source: https://huggingface.co/bigscience/bloom-optimizer-states)
- Context window: 2048 tokens (Source: https://huggingface.co/bigscience/bloom)
- Validation perplexity: ≈7.045 (self-reported on the model card; see the perplexity sketch after this list) (Source: https://huggingface.co/bigscience/bloom)
- HumanEval (code generation): pass@1 = 0.155 (self-reported on the model card) (Source: https://huggingface.co/bigscience/bloom)
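Perplexity is the exponential of the average per-token cross-entropy loss, so the ≈7.045 figure corresponds to an average per-token probability of roughly 1/7 on the validation set. The sketch below shows that generic computation with the transformers causal-LM interface; it substitutes the small bigscience/bloom-560m checkpoint and an arbitrary sample sentence (assumptions made here so the code runs on a CPU), so the printed value will not reproduce the model-card number.
Example (python):
import math
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
# Smaller checkpoint substituted so the sketch runs on CPU; scores differ from the 176B model.
model_id = "bigscience/bloom-560m"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
model.eval()
text = "BLOOM is an open multilingual language model trained by the BigScience workshop."
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    # Passing labels makes the model return the mean cross-entropy over predicted tokens.
    loss = model(**inputs, labels=inputs["input_ids"]).loss
print("cross-entropy:", loss.item())
print("perplexity:", math.exp(loss.item()))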
Key Information
- Category: Language Models
- Type: AI Language Models Tool