vLLM
A high-throughput, memory-efficient inference and serving library for large language models, with support for tensor and pipeline parallelism.
Key Information
- Category: Developer Tools
- Source: GitHub
- Last updated: January 09, 2026
Structured Metrics
No structured metrics captured yet.
Links
Canonical source: https://github.com/vllm-project/vllm