Resource exhaustion in vLLM - CVE-2025-29770
Published: March 19, 2025 / Updated: May 1, 2026
vLLM
vLLM
Description
The vulnerability allows a remote user to cause a denial of service.
The vulnerability exists due to uncontrolled resource consumption in outlines grammar cache in vllm/model_executor/guided_decoding/outlines_logits_processors.py when handling decoding requests with unique schemas. A remote user can send a stream of very short decoding requests with unique schemas to cause a denial of service.
The issue applies to the V0 engine only. The outlines backend can also be selected on a per-request basis using the guided_decoding_backend key in the extra_body field.