Kimi-Dev - AI Language Models Tool
Overview
Kimi-Dev (Kimi-Dev-72B) is an open-source coding large language model from Moonshot AI focused on automated software engineering tasks, especially automated bug repair and unit-test generation. The model weights and code are released under the MIT license. It was trained from a Qwen2.5-72B base through a two-stage pipeline: extensive mid-training on real GitHub issue/PR data, followed by a large-scale reinforcement-learning phase that applies candidate patches inside Docker and rewards a rollout only when the full test suite passes. ([huggingface.co](https://huggingface.co/moonshotai/Kimi-Dev-72B?utm_source=openai))

Kimi-Dev uses a duo design (BugFixer + TestWriter) that first localizes the files to edit and then produces edits or unit tests; at test time it can self-play between those roles and scale by generating many patch/test candidates. The project provides runnable tooling (examples, vLLM serving instructions) for agentless rollouts and encourages community contributions via GitHub and Hugging Face. Reported headline metrics include a 60.4% resolve rate on SWE-bench Verified; community feedback praises its debugging accuracy while noting inference speed, quantization trade-offs, and real-world variability. ([github.com](https://github.com/MoonshotAI/Kimi-Dev?utm_source=openai))
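The test-time self-play between BugFixer and TestWriter can be pictured as a best-of-N loop: generate several candidate patches, validate each against generated unit tests, and keep only a candidate that passes them all. The generator, validator, and toy candidates below are hypothetical stand-ins for illustration, not the project's actual rollout code:

```python
from typing import Callable, List, Optional

Impl = Callable[[int], int]
Test = Callable[[Impl], bool]

def select_patch(candidates: List[Impl], unit_tests: List[Test]) -> Optional[Impl]:
    """Test-time scaling sketch: return the first candidate that passes
    every generated unit test, or None if no candidate survives."""
    for patch in candidates:
        if all(test(patch) for test in unit_tests):
            return patch
    return None

# Toy BugFixer output: two candidate implementations of "square a number"
# (the first is wrong, the second is right).
candidates = [lambda x: x + x, lambda x: x * x]

# Toy TestWriter output: unit tests that take a candidate and return pass/fail.
unit_tests = [lambda f: f(2) == 4, lambda f: f(3) == 9]

best = select_patch(candidates, unit_tests)
print(best(5))  # 25
```

Note that `f(2) == 4` alone cannot separate the two candidates; cross-validating against multiple generated tests is what lets self-play discard the near-miss patch.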
GitHub Statistics
- Stars: 1,148
- Forks: 147
- Contributors: 3
- License: NOASSERTION (GitHub's automated detection; the project describes an MIT release)
- Primary Language: Python
- Last Updated: 2025-09-30T02:15:54Z
Key Features
- Autonomous repository patching in Docker; rewards only when full test suites pass.
- Dual BugFixer + TestWriter workflow for paired fixes and unit-test generation.
- Large-scale RL fine-tuning with outcome-only rewards to prioritize correctness.
- Very long context support (131,072 tokens) for multi-file or large-repo reasoning.
- Open-source weights, vLLM-ready serving instructions, and example rollout scripts.
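The outcome-only reward in the list above can be sketched as a toy function: a candidate patch earns reward 1 only when every test in the suite passes, with no partial credit. The buggy function, candidate patch, and in-process test suite here are hypothetical stand-ins for the repo's Docker-based harness:

```python
from typing import Callable, List

Impl = Callable[[int], int]
Test = Callable[[Impl], bool]

def outcome_reward(patched_fn: Impl, test_suite: List[Test]) -> int:
    """Return 1 only if the patched function passes *every* test, else 0."""
    return 1 if all(test(patched_fn) for test in test_suite) else 0

# Toy "repository" code: a buggy absolute-value function and a candidate patch.
def buggy_abs(x: int) -> int:
    return x  # bug: negative inputs are not negated

def patched_abs(x: int) -> int:
    return -x if x < 0 else x

# Toy test suite: each test takes a candidate implementation and returns pass/fail.
test_suite = [
    lambda f: f(3) == 3,
    lambda f: f(-3) == 3,  # buggy_abs fails here
    lambda f: f(0) == 0,
]

print(outcome_reward(buggy_abs, test_suite))    # 0
print(outcome_reward(patched_abs, test_suite))  # 1
```

The all-or-nothing signal is deliberate: a patch that fixes the reported bug but breaks an unrelated test scores the same as no patch at all, pushing the RL phase toward whole-suite correctness.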
Example Usage
Example (Python):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Quick start adapted from the project's model card; see the HF model page for details.
model_name = "moonshotai/Kimi-Dev-72B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto", device_map="auto")

prompt = "# Fix the failing test: add the missing edge-case handling in function foo\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
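For serving beyond a single process, the repo points to vLLM; a minimal sketch, assuming a machine with enough GPU memory for a 72B model (the tensor-parallel size, port, and request payload are illustrative, not the repo's exact flags):

```shell
# Launch an OpenAI-compatible server with vLLM (flags illustrative).
vllm serve moonshotai/Kimi-Dev-72B \
    --tensor-parallel-size 8 \
    --max-model-len 131072 \
    --port 8000

# Query the OpenAI-compatible chat completions endpoint.
curl http://localhost:8000/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{"model": "moonshotai/Kimi-Dev-72B",
         "messages": [{"role": "user", "content": "Fix the failing test in function foo."}]}'
```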
Note: for production or multi-request serving, the repo provides vLLM serving examples and rollout scripts. ([huggingface.co](https://huggingface.co/moonshotai/Kimi-Dev-72B?utm_source=openai))

Benchmarks
- SWE-bench Verified resolve rate: 60.4% (Source: https://huggingface.co/moonshotai/Kimi-Dev-72B)
- Model family / nominal size: Kimi-Dev-72B (72B parameters; HF lists ~73B) (Source: https://huggingface.co/moonshotai/Kimi-Dev-72B)
- Context window: 131,072 tokens (131K) (Source: https://github.com/MoonshotAI/Kimi-Dev)
- Mid-training data volume: ~150B tokens on GitHub issue/PR data (Source: https://moonshotai.github.io/Kimi-Dev/)
Key Information
- Category: Language Models
- Type: AI Language Models Tool