Vercel AI SDK CVE-2025-48985: Input Validation Bypass Hits LLM App Builders

Vercel’s AI SDK, the TypeScript library that handles prompt construction and model I/O for a broad slice of production LLM applications, shipped an input validation bug that is easy to overlook. CVE-2025-48985 lets an attacker cause bytes downloaded from one URL to be attached to a different URL in the same prompt, injecting content while bypassing the SDK’s URL-based trust checks. The GitHub Advisory rates it low severity. Whether that label applies to your app depends entirely on whether you pass user-controlled URLs as file or image inputs through generateText() or streamText().

The root cause: index misalignment in the prompt pipeline

The vulnerability lives in a single file: convert-to-language-model-prompt.ts, which handles the conversion of structured prompt inputs into the format consumed by language model calls. When a prompt includes file or image attachments specified by URL, the SDK plans a set of downloads, fetches them, and then maps the downloaded bytes back to the original URL keys.

The bug is an array-index mismatch. The download pipeline creates a plannedDownloads array and a corresponding downloadedFiles array, with one entry per planned URL. When some URLs point to unsupported file types, their entries come back as null and the SDK filters them out. But the filtering step compacts the downloaded array without preserving each entry’s original position, so the later lookup into plannedDownloads by index no longer lines up. The remaining downloaded data gets associated with the wrong URL keys, according to Vercel’s changelog.

In concrete terms: suppose a prompt contains three URLs, where the first and third point to supported types and the second points to an unsupported type. The SDK downloads what it can, discards the null result for the second URL, and then maps the two surviving entries by position onto the first two planned URLs. The bytes from URL 3 end up under the key that was supposed to correspond to URL 2, and URL 3 gets no entry at all. An attacker who controls the mix of supported and unsupported URLs can inject arbitrary bytes into positions the application trusts to contain content from a specific, validated source.

This is an index-drift bug, a close relative of the classic off-by-one. It is the kind that shows up in a data-processing pipeline when filtering and indexing are handled in separate passes without a reconciliation step.
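The filter-then-index pattern can be reproduced in a few lines. The sketch below is a simplified model, not the SDK’s actual source: the function name mapFilterThenIndex, the Downloaded type, and the example URLs are all hypothetical, chosen only to show how compacting an array before an index lookup shifts results onto the wrong keys.

```typescript
// Simplified, hypothetical model of the download-mapping step (not the SDK's code).
type Downloaded = { data: Uint8Array; mediaType: string } | null;

function mapFilterThenIndex(
  plannedUrls: string[],
  downloaded: Downloaded[],
): Record<string, { data: Uint8Array; mediaType: string }> {
  const entries = downloaded
    // Dropping nulls first compacts the array, so `index` in the next step
    // no longer matches the position in `plannedUrls`.
    .filter((file): file is NonNullable<Downloaded> => file !== null)
    .map((file, index) => [plannedUrls[index], file] as const);
  return Object.fromEntries(entries);
}

// Three planned URLs; the second download yields null.
const planned = [
  "https://trusted.example/a.png",
  "https://trusted.example/b.bin",
  "https://attacker.example/c.png",
];
const results: Downloaded[] = [
  { data: new Uint8Array([0xaa]), mediaType: "image/png" },
  null,
  { data: new Uint8Array([0xcc]), mediaType: "image/png" },
];

// The bytes fetched from attacker.example are now filed under the
// trusted.example/b.bin key, and the attacker URL has no entry of its own.
const misaligned = mapFilterThenIndex(planned, results);
```

In this model the first entry stays correct because nothing was removed ahead of it; every entry after the first null shifts one slot toward the front.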

Affected versions and attack surface

The version ranges matter because the AI SDK is a direct dependency in many application stacks rather than a transitive one. Teams know they use it, which is good; teams may not know which minor version they pinned, which is the problem.

According to the GitHub Advisory, vulnerable versions are:

Range	Status
AI SDK < 5.0.52	Vulnerable
AI SDK >= 5.1.0-beta.0, < 5.1.0-beta.9	Vulnerable
AI SDK >= 5.0.52 (stable)	Patched
AI SDK >= 5.1.0-beta.9	Patched
AI SDK >= 6.0.0-beta	Patched (per NVD)

The attack surface is specific: the generateText() and streamText() functions, which per Vercel’s advisory cover most of the methods that accept images or files as inputs. Applications that pass only text prompts through these functions are not affected. Applications that do pass user-supplied URLs as file or image inputs remain fully exposed until they upgrade to a patched version.
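To check a pinned version against the ranges above, a small comparison helper is enough. The sketch below is an assumption-laden illustration: isVulnerable and compareVersions are hypothetical names, the comparison follows semantic-versioning precedence rules in simplified form, and the ranges are copied from the table above rather than fetched from the advisory. A maintained semver library would be the safer choice in production tooling.

```typescript
// Hypothetical helper: checks a version string against the ranges listed above.
type Parsed = { core: number[]; pre: string[] };

function parseVersion(version: string): Parsed {
  const plus = version.indexOf("+"); // build metadata does not affect precedence
  const v = plus === -1 ? version : version.slice(0, plus);
  const dash = v.indexOf("-");
  const core = (dash === -1 ? v : v.slice(0, dash)).split(".").map(Number);
  const pre = dash === -1 ? [] : v.slice(dash + 1).split(".");
  return { core, pre };
}

// Returns -1, 0, or 1, following semver precedence (simplified).
function compareVersions(a: string, b: string): number {
  const pa = parseVersion(a);
  const pb = parseVersion(b);
  for (let i = 0; i < 3; i++) {
    const d = (pa.core[i] ?? 0) - (pb.core[i] ?? 0);
    if (d !== 0) return Math.sign(d);
  }
  // A release outranks any prerelease of the same core version.
  if (pa.pre.length === 0 || pb.pre.length === 0) {
    return Math.sign(pb.pre.length - pa.pre.length);
  }
  const n = Math.max(pa.pre.length, pb.pre.length);
  for (let i = 0; i < n; i++) {
    const x = pa.pre[i];
    const y = pb.pre[i];
    if (x === undefined) return -1; // fewer identifiers sorts first
    if (y === undefined) return 1;
    if (x === y) continue;
    const xNum = /^\d+$/.test(x);
    const yNum = /^\d+$/.test(y);
    if (xNum && yNum) return Math.sign(Number(x) - Number(y));
    if (xNum !== yNum) return xNum ? -1 : 1; // numeric sorts before alphanumeric
    return x < y ? -1 : 1;
  }
  return 0;
}

function isVulnerable(version: string): boolean {
  const belowStableFix = compareVersions(version, "5.0.52") < 0;
  const inBetaRange =
    compareVersions(version, "5.1.0-beta.0") >= 0 &&
    compareVersions(version, "5.1.0-beta.9") < 0;
  return belowStableFix || inBetaRange;
}
```

The input would typically be the resolved version from the lockfile rather than the range declared in package.json, since a caret range can resolve to either side of the fix.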

The fix and the workaround

The patch is straightforward. Rather than filtering empty entries out of the downloaded array and then mapping by position, the fix maps files to their original indices before filtering, so the array positions stay aligned. The fix landed in commit 930399bb; the bug was reported through HackerOne by researcher @aphantom.
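The reordering described above can be sketched as follows. This is an illustration of the map-before-filter idea under the same simplified model, not a copy of the commit: mapIndexThenFilter, DownloadResult, and FileEntry are hypothetical names.

```typescript
// Hypothetical sketch of the corrected ordering (not the SDK's actual patch).
type DownloadResult = { data: Uint8Array; mediaType: string } | null;
type FileEntry = readonly [string, { data: Uint8Array; mediaType: string }];

function mapIndexThenFilter(
  plannedUrls: string[],
  downloaded: DownloadResult[],
): Record<string, { data: Uint8Array; mediaType: string }> {
  const entries = downloaded
    // Pair each result with its URL while the two arrays still line up...
    .map((file, index): FileEntry | null =>
      file === null ? null : [plannedUrls[index], file],
    )
    // ...and only then drop the slots that produced no download.
    .filter((entry): entry is FileEntry => entry !== null);
  return Object.fromEntries(entries);
}

const urls = [
  "https://trusted.example/a.png",
  "https://trusted.example/b.bin",
  "https://attacker.example/c.png",
];
const downloads: DownloadResult[] = [
  { data: new Uint8Array([0xaa]), mediaType: "image/png" },
  null,
  { data: new Uint8Array([0xcc]), mediaType: "image/png" },
];

// Each surviving entry keeps the key of the URL it was actually fetched from.
const aligned = mapIndexThenFilter(urls, downloads);
```

Because the URL is bound to its result before anything is removed, the null slot simply disappears and no later entry shifts.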

For teams that cannot upgrade immediately, Vercel’s changelog recommends implementing custom filetype validation logic outside the SDK rather than relying on its built-in whitelist. This is sound advice but worth being blunt about: if you were trusting the SDK’s whitelist before, your validation gap existed from the first deployment, not from the CVE disclosure. The disclosure just made the gap visible.

As of 2026-06-01, the NVD entry for CVE-2025-48985 classifies the vulnerability under CWE-20 (Improper Input Validation) but has not yet published a CVSS score. This is not unusual for recently disclosed CVEs, but the operational consequence is specific: many automated dependency scanners and policy engines key their alerts to NVD CVSS scores. A missing score means the CVE may not surface in standard vulnerability reports, and may not trigger the version-bump policy that teams rely on to keep their supply chain clean.

As of late May 2026, CVE-2025-48985 does not appear in CISA’s Known Exploited Vulnerabilities catalog, meaning CISA has not recorded evidence of exploitation in the wild. That distinction matters for triage, but it is not the same as “low risk.” An unexploited vulnerability in a widely used SDK is a target-rich surface for anyone who bothers to look, and the missing NVD score means the people most likely to notice it first are security researchers and attackers, not the teams running vulnerable code.

The pattern here is familiar to anyone who has tracked supply-chain vulnerabilities in foundational libraries. A bug in a component that most teams treat as inert glue code, an SDK that handles prompt formatting and model calls, shifts the patching burden to every downstream maintainer. The fix is a one-line version bump. The failure mode is the month between disclosure and the bump actually happening, especially when the scanner does not ring the bell.

Frequently Asked Questions

How does CVE-2025-48985 differ from prompt-injection attacks on LLM apps?

Prompt injection exploits model-level interpretation of input text. This CVE operates one layer below, subverting the SDK’s file-handling pipeline before content reaches any model. Prompt-injection mitigations like output filtering, guardrails, and system-prompt hardening do nothing here, because the attack swaps byte content at the download-mapping step. Defense requires fixing the pipeline itself, not hardening the model’s behavior.

What does external filetype validation actually look like as a workaround?

A practical implementation fetches each user-supplied URL in application code, checks the response Content-Type header against an allowlist, optionally inspects the file’s magic bytes to confirm the declared type, and then passes only the validated bytes to the SDK instead of the raw URL. This shifts the trust boundary from the SDK’s internal whitelist to application-controlled logic. The tradeoff is that the application performs the download itself rather than delegating it to the SDK.
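A minimal sketch of that workaround might look like the following. Everything here is an assumption made for illustration: the allowlist contents, the function names sniffMediaType, validateDownload, and fetchValidated, and the FetchLike type are hypothetical, and the magic-byte table covers only three formats. The fetch implementation is passed in as a parameter so the validation logic can be exercised without network access.

```typescript
// Hypothetical allowlist; adjust to the media types the application accepts.
const ALLOWED_MEDIA_TYPES = new Set(["image/png", "image/jpeg", "application/pdf"]);

// Identify a few formats by their leading magic bytes.
function sniffMediaType(bytes: Uint8Array): string | null {
  const startsWith = (sig: number[]) => sig.every((b, i) => bytes[i] === b);
  if (startsWith([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a])) return "image/png";
  if (startsWith([0xff, 0xd8, 0xff])) return "image/jpeg";
  if (startsWith([0x25, 0x50, 0x44, 0x46, 0x2d])) return "application/pdf"; // "%PDF-"
  return null;
}

// Throws unless the declared type is allowlisted and the bytes agree with it.
function validateDownload(contentType: string | null, bytes: Uint8Array): string {
  // Strip parameters such as "; charset=..." from the header value.
  const declared = (contentType ?? "").replace(/;.*$/, "").trim().toLowerCase();
  if (!ALLOWED_MEDIA_TYPES.has(declared)) {
    throw new Error(`media type not allowed: ${declared || "(none)"}`);
  }
  if (sniffMediaType(bytes) !== declared) {
    throw new Error(`content does not match declared type ${declared}`);
  }
  return declared;
}

// Minimal shape of the fetch function this sketch relies on.
type FetchLike = (url: string) => Promise<{
  ok: boolean;
  status: number;
  headers: { get(name: string): string | null };
  arrayBuffer(): Promise<ArrayBuffer>;
}>;

async function fetchValidated(
  url: string,
  fetchImpl: FetchLike,
): Promise<{ data: Uint8Array; mediaType: string }> {
  const response = await fetchImpl(url);
  if (!response.ok) throw new Error(`download failed with status ${response.status}`);
  const data = new Uint8Array(await response.arrayBuffer());
  const mediaType = validateDownload(response.headers.get("content-type"), data);
  return { data, mediaType };
}
```

In a runtime with a global fetch, that function could be passed as fetchImpl. The returned bytes and media type would then be supplied to the SDK in place of the URL, assuming the SDK version in use accepts binary data for image and file inputs.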

Why might the NVD CVSS score remain missing months after disclosure?

NVD analysts prioritize higher-severity entries, and a vendor-rated low-severity CVE with no known exploitation falls below the effective triage threshold. Teams relying exclusively on CVSS-gated dependency policies should cross-reference the GitHub Advisory (GHSA-rwvc-j5jr-mgvh), which rates and describes the vulnerability independently of NVD. The GHSA database is often faster to publish assessments for lower-severity entries.

Could the same index-misalignment bug appear in other AI SDKs?

Any framework that accepts heterogeneous URL inputs, downloads a subset based on filetype support, and maps results back by array position without a stable key is susceptible to the same class of bug. CWE-20 (Improper Input Validation) is among the most frequently recorded weaknesses in NVD, and download-then-map pipelines are a recurring structural source. Tool-calling frameworks that resolve user-supplied URLs to attachments deserve the same audit.

Which patched version is authoritative when GitHub Advisory and NVD list different ranges?

The GitHub Advisory lists 5.0.52 and 5.1.0-beta.9 as fixed. NVD adds 6.0.0-beta to that set. The discrepancy reflects the SDK’s independent release branches: the 5.0.x stable line, the 5.1.0-beta line, and the 6.0.0-beta line each received separate patches. Teams on any beta track should check the NVD entry, which aggregates all three fixed ranges, in addition to the GitHub Advisory rather than relying on the advisory alone.