DeepSeek-MoE

DeepSeek-MoE 16B is a Mixture-of-Experts (MoE) language model with 16.4B total parameters. It employs fine-grained expert segmentation and shared expert isolation, achieving performance comparable to dense 7B models with only about 40% of their computation. The repository includes both base and chat variants, along with evaluation benchmarks and instructions for using the models through Hugging Face Transformers.
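
Since the models are published for use with Hugging Face Transformers, a minimal loading-and-generation sketch looks like the following. The checkpoint identifier "deepseek-ai/deepseek-moe-16b-base" and the use of trust_remote_code are assumptions based on typical Hub conventions; consult the repository for the exact usage.

```python
# Minimal sketch: load a DeepSeek-MoE checkpoint with Hugging Face Transformers
# and run a short generation. Checkpoint name and options are assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "deepseek-ai/deepseek-moe-16b-base"  # assumed Hub identifier

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,  # half-precision to fit a 16.4B-parameter model in memory
    device_map="auto",           # spread layers across available devices
    trust_remote_code=True,      # assumed: repo may ship custom MoE modeling code
)

prompt = "Mixture-of-Experts models scale efficiently because"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```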

Key Information

  • Category: Language Models
  • Source: Github
  • Tags: Python
  • Last updated: March 03, 2026

Structured Metrics

No structured metrics captured yet.

Links

Canonical source: https://github.com/deepseek-ai/DeepSeek-MoE