DeepSeek-V3.2-Speciale - AI Language Models Tool
Overview
DeepSeek-V3.2-Speciale is the high‑compute, reasoning‑focused variant of the open‑weight DeepSeek‑V3.2 family. Built as a research and capability push rather than a tool‑integrated production model, Speciale emphasizes deep chain‑of‑thought reasoning and long‑context efficiency via DeepSeek Sparse Attention (DSA). The release is distributed as safetensors in BF16/FP8/F32 formats under an MIT license, with guidance to run locally using the V3.2‑Exp repository instructions. ([huggingface.co](https://huggingface.co/deepseek-ai/DeepSeek-V3.2-Speciale)) The model card and accompanying technical report describe a scaled reinforcement‑learning post‑training pipeline and a large‑scale agentic task synthesis workflow that together target improved multi‑step reasoning, compliance, and generalization for complex tasks. The authors explicitly note that Speciale is intended exclusively for deep reasoning workloads and does not support in‑model tool calling; instead, the project provides an updated chat encoding (“thinking with tools”) for environments that orchestrate external tool use outside the Speciale weights. For community verification, the release includes selected competition submissions and example assets. ([huggingface.co](https://huggingface.co/deepseek-ai/DeepSeek-V3.2-Speciale))
Model Statistics
- Downloads: 27,281
- Likes: 637
- Pipeline: text-generation
- Parameters: 685.4B
- License: mit
Model Details
Architecture and scale: DeepSeek‑V3.2‑Speciale is a member of the DeepSeek V3.2 transformer family, implemented with the project's DeepSeek Sparse Attention (DSA) mechanism to reduce compute for long contexts; the published model card lists a ~685B‑parameter model with safetensors weights in BF16, FP8 (F8_E4M3), and F32 formats. The card names deepseek‑ai/DeepSeek‑V3.2‑Exp‑Base as the base artifact and directs users to the V3.2‑Exp repository for deployment instructions. ([huggingface.co](https://huggingface.co/deepseek-ai/DeepSeek-V3.2-Speciale))

Training & alignment: The V3.2 announcement highlights a scalable reinforcement‑learning post‑training protocol and an agentic task synthesis pipeline used to generate structured multi‑step reasoning supervision at scale; the project credits both for the Speciale variant's strong reasoning performance. The model card also documents a new chat encoding module (encoding/encoding_dsv32) that implements a "thinking" role encoding used during training and evaluation; note that the Speciale checkpoint itself is marked as not supporting direct tool calls. ([huggingface.co](https://huggingface.co/deepseek-ai/DeepSeek-V3.2-Speciale))

Deployment notes: DSA is intended to lower memory and compute for long‑context inference, and community and press coverage emphasizes early optimizations for non‑CUDA ecosystems (Ascend/CANN and related stacks). The Hugging Face page recommends local sampling parameters (temperature=1.0, top_p=0.95) and points to the experimental repo for hardware/inference recipes. ([tomshardware.com](https://www.tomshardware.com/tech-industry/deepseek-new-model-supports-huawei-cann))
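Given the model's size, most users will interact with Speciale through a served endpoint rather than loading the weights in‑process. The snippet below is a minimal sketch that assumes the model has already been deployed behind an OpenAI‑compatible API using the inference recipes in the V3.2‑Exp repository; the base_url, api_key, and served model name are placeholder assumptions, while the sampling parameters mirror the model card's recommendations.
Example (python):
# ASSUMPTION: an OpenAI-compatible server is already running locally and serving Speciale;
# the endpoint URL and model name below are placeholders, not values from the model card.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")  # placeholder endpoint

response = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-V3.2-Speciale",  # placeholder served-model name
    messages=[
        {"role": "user", "content": "Outline a proof that every prime of the form 4k+1 is a sum of two squares."}
    ],
    temperature=1.0,  # recommended in the model card
    top_p=0.95,       # recommended in the model card
    max_tokens=1024,
)

print(response.choices[0].message.content)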
Key Features
- DeepSeek Sparse Attention (DSA) for reduced compute on long contexts.
- High‑compute reasoning variant tuned with scaled RL post‑training.
- Updated "thinking with tools" chat encoding for preserved multi‑step reasoning.
- Weights in safetensors with BF16, FP8 (F8_E4M3), and F32 options (see the file‑listing sketch after this list).
- Distributed under an MIT license; run‑locally guidance provided in V3.2‑Exp repo.
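Because the full checkpoint spans hundreds of gigabytes across BF16, FP8, and F32 shards, it can help to inspect what the repository actually ships before committing to a download. This is a minimal sketch using huggingface_hub's public listing API; it assumes only network access to the Hub and makes no assumptions about shard naming beyond the .safetensors extension noted on the model card.
Example (python):
# List the files published in the Speciale repo so you can see the weight shards and
# config/metadata files before starting a very large download.
from huggingface_hub import HfApi

api = HfApi()
repo_id = "deepseek-ai/DeepSeek-V3.2-Speciale"

files = api.list_repo_files(repo_id)
weight_shards = [f for f in files if f.endswith(".safetensors")]
config_files = [f for f in files if f.endswith(".json")]

print(f"{len(weight_shards)} safetensors shards, e.g.: {weight_shards[:3]}")
print("config/metadata files:", config_files)
From there, huggingface_hub.snapshot_download with allow_patterns can fetch only the subset of files a given precision requires.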
Example Usage
Example (python):
import transformers
import torch
# NOTE: DeepSeek‑V3.2‑Speciale is a very large model. Follow the project's V3.2‑Exp repo for
# production deployment and quantization instructions. The snippet below demonstrates the
# encoder/usage pattern from the model card for local experimentation only.
# 1) Install requirements: transformers, accelerate, safetensors
# 2) Clone the V3.2‑Exp repo if you need encoding/utility scripts (encoding/encoding_dsv32.py)
# Load tokenizer (uses the model's provided tokenizer; trust_remote_code may be required
# depending on the tokenizer implementation shipped with the repo)
tokenizer = transformers.AutoTokenizer.from_pretrained(
    "deepseek-ai/DeepSeek-V3.2-Speciale", trust_remote_code=True
)
# If you have the encoding helper from the repo (encoding/encoding_dsv32.py):
# from encoding_dsv32 import encode_messages, parse_message_from_completion_text
messages = [
{"role": "user", "content": "Prove that every prime of the form 4k+1 is a sum of two squares."}
]
# Example: use encode_messages if you follow the repo's encoding conventions
# encode_config = dict(thinking_mode="thinking", drop_thinking=True, add_default_bos_token=True)
# prompt = encode_messages(messages, **encode_config)
# Fallback: simple prompt string (the model card provides repository encoders for best results)
prompt = "User: " + messages[0]["content"] + "\nAssistant: "
# Tokenize
inputs = tokenizer(prompt, return_tensors="pt")
# Load model (toy example: device_map='auto' and bfloat16 dtype where supported)
model = transformers.AutoModelForCausalLM.from_pretrained(
    "deepseek-ai/DeepSeek-V3.2-Speciale",
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)
# Generate with the sampling parameters recommended in the model card
# (do_sample=True is needed for temperature/top_p to take effect in transformers)
generation_config = dict(max_new_tokens=512, do_sample=True, temperature=1.0, top_p=0.95)
inputs = inputs.to(model.device)
outputs = model.generate(**inputs, **generation_config)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
# IMPORTANT: This example is for illustration. Refer to the DeepSeek V3.2‑Exp repository for
# precise encoding/parsing helpers, quantization recipes, and hardware‑specific deployment steps.
# See the model card and the V3.2‑Exp repo for details: https://huggingface.co/deepseek-ai/DeepSeek-V3.2-Speciale
Benchmarks
- Parameters: ≈685 billion (Source: https://huggingface.co/deepseek-ai/DeepSeek-V3.2-Speciale)
- Downloads (last month, as shown on Hugging Face): 27,281 (Source: https://huggingface.co/deepseek-ai/DeepSeek-V3.2-Speciale)
- Community likes (Hugging Face): 637 (Source: https://huggingface.co/deepseek-ai/DeepSeek-V3.2-Speciale)
- Competition / reasoning claims (per model card): gold‑medal performance reported on IMO 2025 and IOI 2025; the card claims Speciale surpasses GPT‑5 on reasoning benchmarks (Source: https://huggingface.co/deepseek-ai/DeepSeek-V3.2-Speciale)
- Press coverage (long‑context / DSA): independent reports highlight DSA, long‑context efficiency, and support for non‑CUDA stacks (Source: https://www.reuters.com/technology/deepseek-releases-model-it-calls-intermediate-step-towards-next-generation-2025-09-29/)
Key Information
- Category: Language Models
- Type: AI Language Models Tool