SB2026070221 - Multiple vulnerabilities in vLLM
Published: July 2, 2026 Updated: July 2, 2026
Description
This security bulletin contains information about 5 vulnerabilities.
1) Reachable assertion (CVE-ID: CVE-2026-55514)
CWE-ID: CWE-617 - Reachable Assertion
CVSSv4: CVSS:4.0/AV:N/AC:L/AT:N/PR:L/UI:N/VC:N/VI:N/VA:H/SC:N/SI:N/SA:N/E:U/U:Green
The vulnerability allows a remote user to perform a denial of service (DoS) attack.
The vulnerability exists due to a reachable assertion. A remote user can send a "/v1/completions" request whose payload consists only of prompt embeds to a model that uses M-RoPE, trigger the assertion, and cause a denial of service condition on the target system.
2) Allocation of Resources Without Limits or Throttling (CVE-ID: CVE-2026-55646)
CWE-ID: CWE-770 - Allocation of Resources Without Limits or Throttling
CVSSv4: CVSS:4.0/AV:N/AC:L/AT:N/PR:L/UI:N/VC:N/VI:N/VA:H/SC:N/SI:N/SA:N/E:U/U:Clear
The vulnerability allows a remote user to cause a denial of service.
The vulnerability exists due to uncontrolled resource consumption in the speech-to-text upload handling routes when processing oversized audio file uploads. A remote user can send a specially crafted oversized multipart upload to cause a denial of service.
Exploitation requires that the deployment exposes the speech-to-text endpoints and has a speech-to-text capable model or task configured.
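A common way to bound this class of issue is to read an upload in chunks and stop as soon as a byte budget is exceeded, rather than buffering the whole body first. The following is a minimal sketch of that idea, not the vendor's fix; the names read_capped and UploadTooLarge and the chunk size are hypothetical and do not come from vLLM.

```python
import io


class UploadTooLarge(Exception):
    """Raised when an upload exceeds the configured byte budget."""


def read_capped(stream, max_bytes, chunk_size=64 * 1024):
    """Read a binary stream in chunks and fail once max_bytes is exceeded.

    Hypothetical helper: the point is that the limit is enforced while
    reading, so an oversized body is never fully buffered in memory.
    """
    buf = bytearray()
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            return bytes(buf)
        if len(buf) + len(chunk) > max_bytes:
            raise UploadTooLarge(f"upload exceeds {max_bytes} bytes")
        buf.extend(chunk)


if __name__ == "__main__":
    small = io.BytesIO(b"a" * 100)
    print(len(read_capped(small, max_bytes=100)))
```

In a real deployment, the same limit would typically also be enforced at a reverse proxy in front of the API server, so that oversized bodies are rejected before they reach the application.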
3) Input validation error (CVE-ID: CVE-2026-54234)
CWE-ID: CWE-20 - Improper Input Validation
CVSSv4: CVSS:4.0/AV:N/AC:L/AT:N/PR:N/UI:N/VC:N/VI:N/VA:H/SC:N/SI:N/SA:N/E:U/U:Green
The vulnerability allows a remote attacker to cause a denial of service.
The vulnerability exists due to improper input validation in the speculative decoding path when handling overlapping public gRPC Generate and Abort requests. A remote attacker can send a specific overlapping request sequence to cause a denial of service.
In shared deployments, exploitation can crash the engine worker, abort concurrent requests, and prevent later requests from completing until the worker is restarted. The issue was reproduced on Qwen3 GPTQ.
4) Inefficient regular expression complexity (CVE-ID: CVE-2026-55574)
CWE-ID: CWE-1333 - Inefficient Regular Expression Complexity
CVSSv4: CVSS:4.0/AV:N/AC:L/AT:N/PR:N/UI:N/VC:N/VI:N/VA:H/SC:N/SI:N/SA:N/E:U/U:Green
The vulnerability allows a remote attacker to cause a denial of service.
The vulnerability exists due to inefficient regular expression complexity in the structured_outputs.regex API in the xgrammar and outlines backends when compiling a user-supplied regex string. A remote attacker can send a specially crafted regex pattern to cause a denial of service.
A single crafted request can hang an inference worker indefinitely.
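The bulletin does not state which pattern shapes trigger the slow compilation, so the following is only a generic pre-filter sketch that an operator might place in front of a regex compiler: it caps the pattern length and the size of bounded repetitions such as {100000}. The function name check_user_regex and both limits are hypothetical, and the check does not detect every expensive pattern (for example, nested quantifiers pass through).

```python
import re

# Hypothetical budgets; tune them for the deployment.
MAX_PATTERN_CHARS = 512
MAX_REPEAT_BOUND = 256

# Matches bounded repetitions such as {5}, {2,} and {2,10}.
_BOUNDED_REPEAT = re.compile(r"\{(\d+)(?:,(\d*))?\}")


def check_user_regex(pattern: str) -> None:
    """Reject user-supplied patterns that exceed simple size budgets.

    Raises ValueError on rejection and returns None otherwise. This is
    a coarse filter, not a guarantee that compilation will be fast.
    """
    if len(pattern) > MAX_PATTERN_CHARS:
        raise ValueError("regex pattern too long")
    for match in _BOUNDED_REPEAT.finditer(pattern):
        bounds = [int(g) for g in match.groups() if g]
        if any(b > MAX_REPEAT_BOUND for b in bounds):
            raise ValueError("regex repetition bound too large")


if __name__ == "__main__":
    check_user_regex(r"[a-z]{1,8}@[a-z]+\.com")  # accepted
    try:
        check_user_regex(r"(ab){100000}")
    except ValueError as exc:
        print("rejected:", exc)
```

Because a filter like this cannot bound compile time on its own, it is usually combined with a hard time limit around the compilation step.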
5) Improper handling of highly compressed data (CVE-ID: CVE-2026-54233)
CWE-ID: CWE-409 - Improper Handling of Highly Compressed Data (Data Amplification)
CVSSv4: CVSS:4.0/AV:N/AC:L/AT:N/PR:L/UI:N/VC:N/VI:N/VA:H/SC:N/SI:N/SA:N/E:U/U:Clear
The vulnerability allows a remote user to cause a denial of service.
The vulnerability exists due to improper handling of highly compressed data in the /v1/audio/transcriptions endpoint when processing compressed audio uploads. A remote user can send a specially crafted audio file to cause a denial of service.
The issue arises because the endpoint limits compressed upload size but not decoded PCM output, allowing excessive memory consumption during audio decoding and concatenation.
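To illustrate why a limit on the compressed upload does not bound memory use, the sketch below computes the decoded PCM size from the audio parameters and enforces a separate budget while decoded chunks are accumulated. It is a minimal illustration under assumed names (pcm_bytes, collect_decoded, DecodedAudioTooLarge), not the code from the vendor's patch.

```python
class DecodedAudioTooLarge(Exception):
    """Raised when decoded audio exceeds the configured byte budget."""


def pcm_bytes(duration_s, sample_rate_hz, channels, bytes_per_sample):
    """Size of raw PCM audio: duration x rate x channels x sample width."""
    return int(duration_s * sample_rate_hz * channels * bytes_per_sample)


def collect_decoded(chunks, max_decoded_bytes):
    """Concatenate decoded chunks, failing once the budget is exceeded.

    `chunks` is any iterable of bytes objects, e.g. the output of a
    streaming decoder (hypothetical here). The check runs per chunk, so
    decoding stops early instead of after the full output is in memory.
    """
    total = 0
    kept = []
    for chunk in chunks:
        total += len(chunk)
        if total > max_decoded_bytes:
            raise DecodedAudioTooLarge(
                f"decoded audio exceeds {max_decoded_bytes} bytes"
            )
        kept.append(chunk)
    return b"".join(kept)


if __name__ == "__main__":
    # One hour of 16 kHz mono 32-bit float audio:
    # 3600 * 16000 * 1 * 4 = 230,400,000 bytes of PCM, regardless of how
    # small the compressed file was.
    print(pcm_bytes(3600, 16_000, 1, 4))
```

The arithmetic shows the amplification: the decoded size depends only on duration, sample rate, channel count, and sample width, so a small, highly compressed file can still expand to hundreds of megabytes.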
Remediation
Install updates from the vendor's website.
References
- https://github.com/vllm-project/vllm/security/advisories/GHSA-33cg-gxv8-3p8g
- https://github.com/vllm-project/vllm/security/advisories/GHSA-v82g-2437-67m2
- https://github.com/vllm-project/vllm/security/advisories/GHSA-6pr9-rp53-2pmc
- https://github.com/vllm-project/vllm/security/advisories/GHSA-8wr5-jm2h-8r4f
- https://github.com/vllm-project/vllm/pull/44744
- https://github.com/vllm-project/vllm/security/advisories/GHSA-rwxx-mrjm-wc2m
- https://github.com/vllm-project/vllm/pull/44970