Incorrect Calculation of Buffer Size in vLLM - #VU128747
Published: May 1, 2026
vLLM
Detailed vulnerability description
The vulnerability allows a remote user to cause a denial of service.
The vulnerability exists due to incorrect calculation of buffer size in the extract_hidden_states speculative decoding proposer when processing requests with sampling penalty parameters. A remote user can send a specially crafted request with a penalty parameter to cause a denial of service.
Only deployments configured with extract_hidden_states speculative decoding are vulnerable.
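As an illustration of which requests fall within the description above, the following minimal sketch flags request bodies that carry a sampling penalty. It rests on assumptions the advisory does not confirm. It does not say which penalty parameters are involved, so the field names presence_penalty, frequency_penalty and repetition_penalty, and their neutral values, are taken from vLLM's sampling parameters. The helper active_penalties is hypothetical and not part of vLLM.

```python
from typing import Any, List, Mapping

# Assumption: these are the penalty fields of interest, listed with the
# values at which each penalty has no effect. The advisory does not list them.
NEUTRAL_PENALTIES = {
    "presence_penalty": 0.0,
    "frequency_penalty": 0.0,
    "repetition_penalty": 1.0,
}


def active_penalties(body: Mapping[str, Any]) -> List[str]:
    """Return the penalty fields in a request body that are set to a non-neutral value.

    Hypothetical helper for illustration; it inspects a decoded JSON request
    body and does not call into vLLM.
    """
    found = []
    for name, neutral in NEUTRAL_PENALTIES.items():
        value = body.get(name)
        # A missing or null field means the penalty is not requested.
        if value is not None and value != neutral:
            found.append(name)
    return found


if __name__ == "__main__":
    request_body = {"prompt": "Hello", "presence_penalty": 0.5}
    print(active_penalties(request_body))
```

A check of this kind could sit in a gateway in front of a deployment that uses extract_hidden_states speculative decoding, for example to log requests that set a penalty. Whether a given request actually triggers the flaw depends on details the advisory does not give.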