DeepSeek-MoE
DeepSeek-MoE 16B is a Mixture-of-Experts (MoE) language model with 16.4B parameters. It employs fine-grained expert segmentation and shared expert isolation to achieve performance comparable to larger models with only about 40% of the typical computation. The repository includes base and chat variants, along with evaluation benchmarks and instructions for integration via Hugging Face Transformers.
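As a quick orientation, below is a minimal sketch of loading and running the model through Hugging Face Transformers, as the repository's integration instructions describe. The Hub model ID shown is an assumption based on the repository naming; check the canonical source for the exact identifiers of the base and chat checkpoints.

```python
# Minimal sketch: load DeepSeek-MoE 16B with Hugging Face Transformers.
# The model ID below is an assumption; consult the repository for exact names.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "deepseek-ai/deepseek-moe-16b-base"  # assumed Hub ID

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # halves memory for the 16.4B-parameter model
    device_map="auto",           # spread layers across available devices
    trust_remote_code=True,      # MoE modeling code ships with the checkpoint
)

inputs = tokenizer("DeepSeek-MoE is", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```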
Key Information
- Category: Language Models
- Source: GitHub
- Tags: Python
- Last updated: March 03, 2026
Structured Metrics
No structured metrics captured yet.
Links
Canonical source: https://github.com/deepseek-ai/DeepSeek-MoE