Kimi-Dev - AI Language Models Tool

Overview

Kimi-Dev (Kimi-Dev-72B) is an open-source coding large language model from Moonshot AI focused on automated software engineering tasks, especially automated bug repair and unit-test generation. The model weights and code are released under the MIT license. Kimi-Dev was trained and fine-tuned from a Qwen-2.5-72B base through a two-stage pipeline: extensive mid-training on real GitHub issue/PR data, followed by a large-scale reinforcement-learning phase that applies candidate patches inside Docker and rewards the model only when the full test suite passes. ([huggingface.co](https://huggingface.co/moonshotai/Kimi-Dev-72B?utm_source=openai))

Kimi-Dev uses a two-role design (BugFixer + TestWriter) that first localizes the files to edit and then produces edits or unit tests; at test time it can self-play between those roles and scale by generating many patch/test candidates. The project provides runnable tooling (examples, vLLM serving instructions) for agentless rollouts and encourages community contributions via GitHub and Hugging Face. Reported headline metrics include a 60.4% resolve rate on SWE-bench Verified; community feedback praises its debugging accuracy while noting inference-speed limits, quantization trade-offs, and real-world variability. ([github.com](https://github.com/MoonshotAI/Kimi-Dev?utm_source=openai))
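
The test-time self-play between the two roles can be sketched as a cross-scoring loop: sample many patch candidates and many test candidates, then prefer the patch that passes the most generated tests. The sketch below is a minimal illustration with toy stand-ins (plain Python functions instead of real patches and sandboxed test runs); the function name `cross_score` and all candidates are hypothetical, not the project's actual API.

```python
# Minimal sketch of test-time self-play between BugFixer and TestWriter roles.
# Toy stand-ins: "patches" are candidate functions, "tests" are predicate checks.
# In Kimi-Dev, both sides are sampled from the model and executed inside Docker.

def cross_score(patch_candidates, test_candidates):
    """Score each patch by how many generated tests it passes; return the best."""
    def score(patch):
        passed = 0
        for test in test_candidates:
            try:
                if test(patch):
                    passed += 1
            except Exception:
                pass  # a crashing test counts as a failure
        return passed
    return max(patch_candidates, key=score)

# Toy example: two candidate implementations of abs()
patches = [
    lambda x: x,                   # buggy: wrong for negative inputs
    lambda x: -x if x < 0 else x,  # correct
]
tests = [
    lambda f: f(3) == 3,
    lambda f: f(-3) == 3,
    lambda f: f(0) == 0,
]

best = cross_score(patches, tests)
print(best(-5))  # the selected patch handles negatives correctly
```

Generating more candidates on each side makes the majority signal stronger, which is the scaling axis the project exploits at inference time.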

GitHub Statistics

  • Stars: 1,148
  • Forks: 147
  • Contributors: 3
  • License: NOASSERTION (per GitHub's API; the model card states MIT)
  • Primary Language: Python
  • Last Updated: 2025-09-30T02:15:54Z

Key Features

  • Autonomous repository patching in Docker; rewards only when full test suites pass.
  • Dual BugFixer + TestWriter workflow for paired fixes and unit-test generation.
  • Large-scale RL fine-tuning with outcome-only rewards to prioritize correctness.
  • Very long context support (131,072 tokens) for multi-file or large-repo reasoning.
  • Open-source weights, vLLM-ready serving instructions, and example rollout scripts.

Example Usage

Example (python):

from transformers import AutoModelForCausalLM, AutoTokenizer

# Example quick-start (adapted from the project's model card). See HF model page for details.
model_name = "moonshotai/Kimi-Dev-72B"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto", device_map="auto")

prompt = "# Fix the failing test: add the missing edge-case handling in function foo\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Note: for production or multi-request serving, the repo provides vLLM serving examples and rollout scripts. ([huggingface.co](https://huggingface.co/moonshotai/Kimi-Dev-72B?utm_source=openai))
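
For that serving path, a typical vLLM invocation looks like the following. This is a sketch, not the repo's exact command: the tensor-parallel size of 8 is an assumption that depends on your GPU count and memory, so consult the project's serving instructions before use.

```shell
# Serve Kimi-Dev-72B behind vLLM's OpenAI-compatible API.
# --tensor-parallel-size splits the 72B weights across GPUs (8 is an assumption);
# --max-model-len matches the model's 131,072-token context window.
vllm serve moonshotai/Kimi-Dev-72B \
  --tensor-parallel-size 8 \
  --max-model-len 131072
```

Once the server is up, any OpenAI-compatible client can send requests to it at the default port.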

Benchmarks

SWE-bench Verified resolve rate: 60.4% (Source: https://huggingface.co/moonshotai/Kimi-Dev-72B)

Model family / nominal size: Kimi-Dev-72B (nominally 72B parameters; the Hugging Face page lists ~73B) (Source: https://huggingface.co/moonshotai/Kimi-Dev-72B)

Context window (max tokens): 131,072 tokens (131K) (Source: https://github.com/MoonshotAI/Kimi-Dev)

Mid-training data volume: ~150B training tokens (mid-training on GitHub issues/PRs) (Source: https://moonshotai.github.io/Kimi-Dev/)

Last Refreshed: 2026-02-24

Key Information

  • Category: Language Models
  • Type: AI Language Models Tool