Open-r1 - AI Language Models Tool

Overview

Open-r1 is a fully open reproduction of DeepSeek-R1 that implements the end-to-end pipeline for building reasoning-optimized language models. The project provides scripts and recipes for supervised fine-tuning (SFT), Group Relative Policy Optimization (GRPO) reinforcement learning, and synthetic-data generation via Distilabel, enabling researchers to reproduce DeepSeek-R1's multi-stage training approach and extend it. ([github.com](https://github.com/huggingface/open-r1))

The codebase is built for scale: it supports multi-node training (DDP and DeepSpeed ZeRO-2/3), ships recipes tuned for clusters (examples target 8×H100 nodes), and uses vLLM for high-throughput inference at very long contexts (examples use a maximum sequence length of 32,768 tokens). The repo also publishes reasoning-focused datasets (Mixture-of-Thoughts with 350k verified traces, CodeForces-CoTs, and OpenR1-Math-220k) and provides evaluation tooling (lighteval) along with reproducible benchmark recipes. Open-r1 is released under the Apache-2.0 license and is intended as a community-first, extensible reproduction rather than a closed commercial product. ([github.com](https://github.com/huggingface/open-r1))
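The GRPO stage mentioned above scores several sampled completions per prompt and normalizes each completion's reward against its own group, avoiding a separate value model. A minimal sketch of that group-relative normalization (illustrative only; function and variable names are assumptions, not open-r1's actual implementation):

```python
# Sketch of GRPO's group-relative advantage computation (illustrative,
# not open-r1's code). For each prompt, several completions are sampled;
# each completion's advantage is its reward normalized against the
# group's mean and standard deviation.
from statistics import mean, stdev

def group_relative_advantages(rewards, eps=1e-6):
    """Normalize a group of scalar rewards to zero mean, unit std."""
    mu = mean(rewards)
    sigma = stdev(rewards) if len(rewards) > 1 else 0.0
    return [(r - mu) / (sigma + eps) for r in rewards]

# Four completions for one prompt, scored by a verifiable reward
# (e.g. 1.0 if the final answer is correct, else 0.0).
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
print(advantages)  # correct answers get positive advantage, wrong ones negative
```

Because the baseline is the group mean, prompts where every sample succeeds (or every sample fails) contribute near-zero advantages, focusing the policy update on prompts the model gets right only sometimes.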

GitHub Statistics

  • Stars: 25,798
  • Forks: 2,406
  • Contributors: 42
  • License: Apache-2.0
  • Primary Language: Python
  • Last Updated: 2025-07-17T20:20:00Z

Activity and community signals are strong: the GitHub repository shows ~25.8k stars and ~2.4k forks, with an active issues queue (~290 open issues) and open pull requests (~44). The repo contains training, generation, and evaluation scripts, a Makefile, and recipes for cluster setups; the documentation calls out CUDA 12.4 and PyTorch v2.6.0 as required for the project's vLLM binaries. These indicators show high interest, active contribution, and an emphasis on reproducibility and cluster-scale training. ([github.com](https://github.com/huggingface/open-r1))

Installation

Install with uv (the commands below create a Python 3.11 virtual environment and install the pinned vLLM build):

uv venv openr1 --python 3.11 && source openr1/bin/activate && uv pip install --upgrade pip
uv pip install vllm==0.8.5.post1
uv pip install setuptools && uv pip install flash-attn --no-build-isolation
GIT_LFS_SKIP_SMUDGE=1 uv pip install -e ".[dev]"
huggingface-cli login
wandb login
git-lfs --version  # if missing: sudo apt-get install git-lfs

Key Features

  • End-to-end R1 pipeline: GRPO reinforcement learning, SFT fine-tuning, and Distilabel synthetic-data generation.
  • Multi-node training support with DDP and DeepSpeed (ZeRO-2/3) and cluster-optimised slurm recipes for 8×H100 nodes.
  • vLLM-backed inference and evaluation supporting long contexts (examples use max_seq_length 32,768 tokens).
  • Published reasoning datasets: Mixture-of-Thoughts (350k traces), CodeForces-CoTs, and OpenR1-Math-220k.
  • Reproducible evaluation with lighteval and benchmark recipes (AIME24, MATH-500, LiveCodeBench).

Community

Open-r1 has a large, active community and visible adoption: according to the GitHub repository it has ~25.8k stars and ~2.4k forks, an active issues tracker, and dozens of contributors; the Hugging Face organization hosts models, datasets, and multiple project updates. The project received wider attention as part of the conversation around DeepSeek-R1’s impact on the LLM ecosystem. Contributors publish dataset and model cards on Hugging Face and the repo maintains frequent update posts and reproducible training/eval recipes. ([github.com](https://github.com/huggingface/open-r1))

Last Refreshed: 2026-01-09

Key Information

  • Category: Language Models
  • Type: AI Language Models Tool