Open-R1 - AI Language Models Tool

Overview

Open-R1 is a community-driven, fully open reproduction of DeepSeek's R1 reasoning pipeline, maintained by Hugging Face. The repository implements the end-to-end pieces of the R1 workflow—data distillation, supervised fine-tuning (SFT), and reinforcement-style policy optimization via Group Relative Policy Optimization (GRPO)—with ready-to-run recipes and Slurm examples for cluster-scale runs. According to the project README, Open-R1 follows a three-step plan: replicate distilled R1 models, reproduce the RL pipeline used to obtain R1-Zero, and demonstrate multi-stage training from a base model to an RL-tuned one. ([github.com](https://github.com/huggingface/open-r1))

The project is explicitly engineered for large-scale training and inference: it integrates TRL's vLLM backend to scale GRPO across nodes, supports DDP and DeepSpeed (ZeRO-2/ZeRO-3) training topologies, and provides tools to generate high-quality reasoning traces using Distilabel. The repo also publishes several curated datasets (e.g., Mixture-of-Thoughts, OpenR1-Math-220k, CodeForces-CoTs) and model recipes (including an OpenR1-Distill-7B recipe), enabling reproducible experiments in math, coding, and multi-step reasoning. ([github.com](https://github.com/huggingface/open-r1))
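The core idea behind GRPO can be illustrated in a few lines: instead of a learned value-function baseline, each prompt's sampled completions are scored by a reward function, and each completion's advantage is computed relative to its own group's mean and standard deviation. The sketch below is a minimal illustration of that idea only, not Open-R1's or TRL's implementation:

```python
from statistics import mean, stdev

def group_relative_advantages(rewards, eps=1e-4):
    """Normalize each reward against its own group's statistics.

    GRPO replaces a learned value baseline with the group mean, so each
    sampled completion is scored relative to its siblings for the same
    prompt; eps avoids division by zero when all rewards are equal.
    """
    mu = mean(rewards)
    sigma = stdev(rewards) if len(rewards) > 1 else 0.0
    return [(r - mu) / (sigma + eps) for r in rewards]

# Four sampled completions for one prompt, scored 1.0 (correct) or 0.0:
print(group_relative_advantages([1.0, 0.0, 0.0, 1.0]))
```

Completions that beat their group's average get positive advantages and are reinforced; the advantages within a group always sum to (approximately) zero.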

GitHub Statistics

  • Stars: 25,914
  • Forks: 2,416
  • Contributors: 43
  • License: Apache-2.0
  • Primary Language: Python
  • Last Updated: 2025-07-17T20:20:00Z

Open-R1 is highly active and visible: the GitHub repository lists ~25.9k stars and ~2.4k forks, an Apache-2.0 license, and an active issue/PR backlog (hundreds of issues and multiple open PRs). The README and examples show frequent updates (news entries and dataset/model releases), example Slurm scripts for multi-node workflows, and 198+ commits on the main branch, indicating ongoing development and community contributions. These signals point to a healthy, fast-growing project with active maintainers and broad community interest. ([github.com](https://github.com/huggingface/open-r1))

Installation

Installation uses uv to create a virtual environment and install dependencies (per the repository README):

git clone https://github.com/huggingface/open-r1.git && cd open-r1
uv venv openr1 --python 3.11 && source openr1/bin/activate && uv pip install --upgrade pip
uv pip install vllm==0.8.5.post1
uv pip install setuptools && uv pip install flash-attn --no-build-isolation
GIT_LFS_SKIP_SMUDGE=1 uv pip install -e ".[dev]"
huggingface-cli login
wandb login
git-lfs --version  # if missing: sudo apt-get install git-lfs
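After installation, a quick sanity check can confirm that the core packages resolve in the active environment. This snippet is a hypothetical convenience, not part of the repository:

```python
import importlib.util

# Hypothetical post-install check: report whether each core dependency
# of the Open-R1 stack can be found in the current environment.
for pkg in ("vllm", "trl", "transformers", "flash_attn"):
    status = "found" if importlib.util.find_spec(pkg) else "MISSING"
    print(f"{pkg}: {status}")
```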

Key Features

  • GRPO reinforcement-style training using TRL’s vLLM backend to scale across nodes and GPUs. ([github.com](https://github.com/huggingface/open-r1))
  • Supervised fine-tuning (SFT) recipes and scripts for distilled reasoning models. ([github.com](https://github.com/huggingface/open-r1))
  • Synthetic data generation via Distilabel pipelines to produce reasoning traces. ([github.com](https://github.com/huggingface/open-r1))
  • Published reasoning datasets: Mixture-of-Thoughts (350k traces) and OpenR1-Math-220k. ([github.com](https://github.com/huggingface/open-r1))
  • Code-execution reward functions for competitive programming (IOI, CodeForces) with E2B/Morph sandbox support. ([github.com](https://github.com/huggingface/open-r1))
  • Evaluation tooling with lighteval + vLLM and support for very long contexts (examples use 32,768 tokens). ([github.com](https://github.com/huggingface/open-r1))
  • Multi-node training examples (Slurm recipes) and support for DDP or DeepSpeed (ZeRO-2/ZeRO-3). ([github.com](https://github.com/huggingface/open-r1))
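The rule-based reward functions mentioned above can be sketched simply. Open-R1's actual rewards live in the repository (including sandboxed code execution for IOI/CodeForces problems); the functions below are a simplified, hypothetical example of format and accuracy rewards over R1-style `<think>…</think><answer>…</answer>` completions:

```python
import re

# Hypothetical rule-based rewards in the spirit of Open-R1's reward functions:
# a format reward checking the reasoning/answer structure, and an accuracy
# reward comparing the extracted answer to a gold target.

FORMAT_RE = re.compile(r"^<think>.*?</think>\s*<answer>.*?</answer>$", re.DOTALL)
ANSWER_RE = re.compile(r"<answer>(.*?)</answer>", re.DOTALL)

def format_reward(completion: str) -> float:
    """1.0 if the completion follows <think>...</think><answer>...</answer>."""
    return 1.0 if FORMAT_RE.match(completion.strip()) else 0.0

def accuracy_reward(completion: str, gold: str) -> float:
    """1.0 if the text inside <answer> matches the gold answer exactly."""
    m = ANSWER_RE.search(completion)
    return 1.0 if m and m.group(1).strip() == gold.strip() else 0.0

sample = "<think>2+2 is 4.</think>\n<answer>4</answer>"
print(format_reward(sample), accuracy_reward(sample, "4"))  # prints: 1.0 1.0
```

In GRPO training, several such rewards are typically summed or weighted per completion before computing group-relative advantages.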

Community

Open-R1 has attracted rapid attention from media and developers: the GitHub repo shows ~25.9k stars, ~2.4k forks, and an active issues/PRs backlog, while Hugging Face has published multiple project updates and datasets on the Hub. Coverage in outlets such as TechCrunch and on Hugging Face's own blog highlights both the project's goals and its community momentum. Community engagement includes ongoing contributions, dataset/model releases, and public discussions in issues and PRs. ([github.com](https://github.com/huggingface/open-r1))

Last Refreshed: 2026-03-03

Key Information

  • Category: Language Models
  • Type: AI Language Models Tool