CVE-2026-39987's 9-Hour Exploitation Window Exposes the Credential Gap at the Heart of AI Dev Infrastructure

A single unauthenticated WebSocket handshake to /terminal/ws granted attackers a full interactive shell on any internet-reachable Marimo instance running versions prior to 0.23.0. The first attacker found the endpoint 9 hours and 41 minutes after the April 8, 2026 advisory dropped — with no public proof-of-concept in circulation — and spent under 3 minutes harvesting .env files, AWS access keys, and LLM provider API keys before disconnecting.

The 9-Hour Clock: What the Sysdig Honeypot Recorded

The CVE-2026-39987 advisory was published on April 8, 2026, in the 21:00 UTC hour. According to Sysdig’s honeypot telemetry, first exploitation followed in the 07:00 UTC hour on April 9, an interval of 9 hours and 41 minutes.¹

No public proof-of-concept was circulating at that point. What was available was the advisory itself, which named the vulnerable endpoint (/terminal/ws) and described the missing authentication call. That combination was sufficient: the advisory text functioned as an exploit specification, not merely a warning.

The initial attacker ran four sessions totaling 90 minutes of access but completed the core credential-theft operation in under 3 minutes.¹ A sustained scanning wave followed. Between April 11 and 14, Sysdig’s honeypot recorded 662 exploit events originating from 11 unique source IPs across 10 countries.¹

Of the 125 IPs that conducted port scanning or HTTP probing during that period, only one progressed to actual WebSocket exploitation.¹ That asymmetry points to a small number of actors holding working exploits while a much larger scanning population remained unable to convert reconnaissance into access.

Root in One Request: How a Missing `validate_auth()` Call Became a Free Shell

CVE-2026-39987 carries a CVSS score of 9.3. The vulnerability is a one-endpoint authentication omission: Marimo’s /terminal/ws WebSocket endpoint allocates a full PTY shell session without invoking validate_auth(), while every other WebSocket endpoint in the server correctly enforces the authentication check.²

The attack requires almost nothing. A single WebSocket upgrade request to /terminal/ws, with no credentials, no session token, and no prior interaction, yields an interactive terminal running as the Marimo process user. In default Docker deployments, that user is root.³

The fix, once identified, was straightforward: add the missing validate_auth() call to the /terminal/ws handler. The structural problem is that the omission survived long enough to reach production installs with internet-facing configurations.
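The failure mode described here, one handler skipping a check that its siblings perform, is the kind of omission a structural test can catch before release. The sketch below is hypothetical and does not reproduce Marimo's code: the `requires_auth` decorator, the route table, and the handler names are all assumptions made for illustration.

```python
from typing import Callable, Dict, List

Handler = Callable[[dict], str]


def requires_auth(handler: Handler) -> Handler:
    """Wrap a handler so it rejects unauthenticated requests (hypothetical)."""
    def wrapped(request: dict) -> str:
        if not request.get("authenticated", False):
            return "403 Forbidden"
        return handler(request)
    wrapped.auth_enforced = True  # marker that the audit below looks for
    return wrapped


@requires_auth
def kernel_ws(request: dict) -> str:
    return "101 kernel session"


def terminal_ws(request: dict) -> str:
    # Decorator deliberately missing: this mirrors the class of bug described above.
    return "101 shell session"


# Hypothetical route table; a real server would derive this from its router.
WEBSOCKET_ROUTES: Dict[str, Handler] = {
    "/ws": kernel_ws,
    "/terminal/ws": terminal_ws,
}


def unauthenticated_routes(routes: Dict[str, Handler]) -> List[str]:
    """Return every route whose handler was not wrapped by requires_auth."""
    return sorted(
        path for path, handler in routes.items()
        if not getattr(handler, "auth_enforced", False)
    )
```

Run as a CI assertion that `unauthenticated_routes(...)` is empty, a check of this shape makes authentication opt-out rather than opt-in, so a new endpoint cannot ship unauthenticated without failing a test.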

The Credential Payload: LLM Keys, Cloud Tokens, and What Attackers Were Actually After

The first attacker’s session log is direct evidence of targeting intent. Within 3 minutes they searched for .env files, attempted to read AWS_ACCESS_KEY_ID, and probed SSH key locations.¹ The credential classes in scope extended further: according to CSA Labs analysis, Marimo environments commonly hold OpenAI, Anthropic, and Google Gemini API keys alongside OAuth tokens and cloud provider credentials.³

This is what makes AI development tooling a disproportionate target relative to its user base. An ML engineer’s notebook environment may hold credentials for a dozen external services simultaneously: the LLM provider, the cloud platform, the experiment tracking service, the vector database, and the OAuth identity for each. The blast radius of a single shell connection scales with that credential density, not with the tool’s deployment footprint.
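To make that credential density concrete, a team can inventory what a shell inside the notebook process would see. The following is a minimal sketch, assuming that secret-bearing variables can be recognized by name; the pattern list is illustrative rather than exhaustive, and the function returns names only so that values never reach logs.

```python
import re
from typing import List, Mapping

# Name fragments that commonly indicate a secret; an assumed, non-exhaustive list.
SECRET_NAME_PATTERN = re.compile(
    r"(KEY|TOKEN|SECRET|PASSWORD|PASSWD|CREDENTIAL|DATABASE_URL)", re.IGNORECASE
)


def credential_like_names(env: Mapping[str, str]) -> List[str]:
    """Return the names (never the values) of variables that look like secrets."""
    return sorted(name for name in env if SECRET_NAME_PATTERN.search(name))
```

Calling `credential_like_names(os.environ)` from inside the notebook process gives a rough count of the services a single shell would expose. Name matching will miss secrets stored under unusual names and will flag some harmless variables, so the result is a starting point for review, not a complete audit.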

Post-exploitation behavior went beyond credential harvesting in at least two documented cases. According to The Hacker News, one attacker extracted database credentials from environment variables and connected to a PostgreSQL server to enumerate schemas and tables; a separate Hong Kong-based actor scanned all 16 Redis databases on a targeted instance and dumped session tokens and application cache entries.⁵ These are medium-confidence claims from secondary reporting; primary telemetry has not been independently published as of April 22, 2026.

Beginning April 12, a separate campaign used the same vulnerability to deploy NKAbuse — a remote access tool that uses the NKN blockchain network for command-and-control — downloaded from a typosquatted Hugging Face Space named vsccode-modetx.⁶ The payload binary was named kagent to mimic a Kubernetes tooling component and established persistence via systemd or cron. These details originate from secondary sources and have not been confirmed by a primary sensor report as of April 22, 2026.

A Pattern, Not an Anomaly: Langflow, Flowise, and the AI Tooling RCE Cycle

CVE-2026-39987 fits a pattern that CSA Labs documented across the AI development toolchain. CVE-2026-33017 (affecting Langflow) saw working exploits within 20 hours of advisory publication, with no public PoC in circulation.³ Flowise CVE-2025-59528 presents a contrasting but complementary pattern: active exploitation persisted months after the fix shipped, indicating the threat is continuous rather than opportunistic.³

The implication is that some threat actors are monitoring AI tooling vulnerability advisories specifically — and synthesizing functional exploits from advisory text alone rather than waiting for released PoC code. Marimo had approximately 20,000 GitHub stars at the time of disclosure versus Langflow’s 145,000+.³ The gap in tool popularity did not translate into a gap in exploitation speed.

The conventional assumption — that niche or less widely adopted tools are lower-priority targets — does not hold when an advisory discloses enough structural detail to reconstruct the exploit. For this category of tooling, the timeline to first exploitation appears to be measured in hours regardless of the tool’s adoption scale.

The Dev-Grade Security / Prod-Grade Secrets Gap

The deeper issue CVE-2026-39987 surfaces is architectural, not specific to Marimo. ML organizations that have hardened LLM API access at the inference layer (rate limiting, key rotation, access logging) have often left the development environment layer, where those same keys are stored in plaintext .env files, entirely outside their threat model.

Notebook infrastructure is routinely treated as internal tooling: lightly firewalled, infrequently patched, and assumed to be low-value because it produces artifacts rather than serving production traffic. But the credential payload a notebook environment holds — cloud provider access keys, LLM provider API keys, OAuth tokens, database credentials — is functionally indistinguishable from production-grade access. The shell that CVE-2026-39987 grants is a shell inside that credential store.

Marimo processes running as root in default Docker images compound this: a successful connection is not limited to the notebook user’s environment variables.³ It is root access inside the container, including any secrets mounted into it or reachable via the instance metadata service on cloud platforms. The perimeter was the credential store all along; the notebook UI was incidental.

What to Do Now: Patch, Rotate, Segment

Three actions, in priority order:

Upgrade to Marimo 0.23.0. The fix is in PR #9098.⁴ The advisory identified ≤0.20.4 as the affected range, but the patch landed in 0.23.0, so that release is the effective safe baseline. Do not assume versions between 0.20.4 and 0.23.0 are clean without verifying the changelog.

Block /terminal/ws at the reverse proxy or firewall layer if an immediate upgrade is operationally impossible.⁴ This eliminates the attack surface without requiring a version change. It also disables the terminal feature for legitimate users — that is the correct tradeoff until the upgrade completes.

Rotate all secrets accessible by the Marimo process, regardless of whether exploitation is confirmed.⁴ This includes AWS access keys, LLM provider API keys (OpenAI, Anthropic, Google Gemini), OAuth tokens, SSH keys, and any database credentials present in environment variables. The first attacker in Sysdig’s observation completed their objective in under 3 minutes and left minimal log trace; absence of evidence of exploitation is not evidence of absence.
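The first action can be verified across a fleet with a few lines of code. This is a minimal sketch, assuming plain `X.Y.Z` version strings (pre-release suffixes are out of scope) and treating 0.23.0 as the safe baseline named above; the function names are illustrative.

```python
from importlib import metadata
from typing import Optional, Tuple

PATCHED_VERSION = (0, 23, 0)  # first fixed release named in the text above


def parse_version(version: str) -> Tuple[int, ...]:
    """Parse a plain 'X.Y.Z' string, padding missing components with zeros."""
    parts = [int(part) for part in version.split(".")[:3]]
    parts += [0] * (3 - len(parts))
    return tuple(parts)


def is_patched(version: str) -> bool:
    """True when the version is at or above the patched baseline."""
    return parse_version(version) >= PATCHED_VERSION


def installed_marimo_version() -> Optional[str]:
    """Return the installed marimo version, or None if it is not installed."""
    try:
        return metadata.version("marimo")
    except metadata.PackageNotFoundError:
        return None
```

A passing check says only that the vulnerable code is gone. It says nothing about whether credentials left the host while the older version was exposed, which is why rotation stays on the list regardless.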

The medium-term question is segmentation: notebook environments that hold production-grade credentials should be treated as production-grade infrastructure, with commensurate network segmentation, access controls, patch cadence, and monitoring. The current default — a dev tool running on the same cloud account, with the same credential scope, as production workloads — is the structural condition that makes these vulnerabilities high-severity incidents rather than low-impact misconfigurations.

FAQ

Why was exploitation so fast with no public proof-of-concept?

The April 8 advisory named both the vulnerable endpoint (/terminal/ws) and the missing authentication function (validate_auth()). That combination is sufficient to reconstruct the exploit independently: connect to the endpoint, observe that no authentication challenge is issued, interact with the resulting shell. No additional reverse engineering was required. The advisory text was effectively a complete exploit specification, not merely a description of a flaw.

Does running Marimo behind an authenticated reverse proxy or VPN eliminate the risk?

For new exploitation, yes — if access to /terminal/ws is blocked or requires authentication at the network layer before reaching the Marimo process, the vulnerability cannot be reached. However, if an instance was previously internet-reachable while running an affected version, credential rotation remains necessary regardless of current network posture. Network controls prevent future exposure; rotation addresses any credentials that may have already left.
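Where a proxy rule cannot be deployed quickly, the same block can be approximated in-process. The sketch below is a generic ASGI middleware, not part of Marimo; it assumes the notebook server is exposed as an ASGI application that can be wrapped, and the class name is hypothetical. A reverse proxy or firewall rule remains the stronger control because it stops the request before it reaches the Python process at all.

```python
from typing import Any, Awaitable, Callable, Dict

Scope = Dict[str, Any]
Receive = Callable[[], Awaitable[Dict[str, Any]]]
Send = Callable[[Dict[str, Any]], Awaitable[None]]
ASGIApp = Callable[[Scope, Receive, Send], Awaitable[None]]

BLOCKED_PATH = "/terminal/ws"


class BlockTerminalEndpoint:
    """ASGI middleware that refuses any request for the terminal WebSocket path."""

    def __init__(self, app: ASGIApp) -> None:
        self.app = app

    async def __call__(self, scope: Scope, receive: Receive, send: Send) -> None:
        path = scope.get("path", "").rstrip("/")
        if scope["type"] == "websocket" and path == BLOCKED_PATH:
            await receive()  # consume the websocket.connect event
            # Closing before accepting rejects the handshake instead of opening a session.
            await send({"type": "websocket.close", "code": 1008})
            return
        if scope["type"] == "http" and path == BLOCKED_PATH:
            await send({
                "type": "http.response.start",
                "status": 403,
                "headers": [(b"content-type", b"text/plain")],
            })
            await send({"type": "http.response.body", "body": b"Forbidden"})
            return
        # Everything else passes through to the wrapped application untouched.
        await self.app(scope, receive, send)
```

The comparison is an exact match on the path after stripping a trailing slash. Deployments mounted under a path prefix, or fronted by a proxy that rewrites paths, would need the constant adjusted accordingly.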

Is this risk specific to Marimo, or does it extend to other notebook and pipeline tools?

The CVE is Marimo-specific, but the underlying credential-exposure risk is structural across notebook-style development environments. Any internet-reachable tool that loads .env files, SDK credentials, or cloud provider configuration into a process environment presents an analogous attack surface if authentication or network controls fail. The Langflow and Flowise precedents cited above suggest that the threat actor pool actively monitors this category of tooling and moves faster than patch deployment cycles.³

If we upgrade to 0.23.0, are we safe from the NKAbuse campaign too?

Upgrading closes the initial-access vector but does not remove NKAbuse if it was already delivered. The malware establishes persistence independently via systemd services or cron entries, and the binary (named ‘kagent’ to mimic Kubernetes tooling) survives a Marimo version change. Any instance that was internet-reachable on an affected version at any time since the April 8 disclosure, and in particular on or after April 12 when the NKAbuse campaign began, should be inspected for unexpected systemd units, cron jobs, or binaries matching that name before the threat is considered fully remediated.
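That inspection can be partly scripted. The following is a minimal sketch, assuming a Linux host and a handful of common persistence locations; the directory list is an assumption rather than a complete indicator set, and any hit needs manual review because a legitimate tool may share the name.

```python
import os
from pathlib import Path
from typing import Iterable, List

SUSPECT_NAME = "kagent"  # binary name reported for the NKAbuse payload

# Common persistence locations on Linux; an assumed, non-exhaustive list.
DEFAULT_SEARCH_ROOTS = (
    "/etc/systemd/system",
    "/etc/cron.d",
    "/var/spool/cron",
    "/tmp",
)


def find_suspect_references(
    roots: Iterable[str], needle: str = SUSPECT_NAME
) -> List[str]:
    """List files whose name or leading content mentions the suspect name."""
    hits: List[str] = []
    for root in roots:
        for dirpath, _dirnames, filenames in os.walk(root):
            for filename in filenames:
                path = Path(dirpath) / filename
                if needle in filename:
                    hits.append(str(path))
                    continue
                if not path.is_file():
                    continue  # skip FIFOs, sockets, and broken symlinks
                try:
                    # Read a bounded prefix so large files do not stall the scan.
                    with open(path, "rb") as handle:
                        if needle.encode() in handle.read(65536):
                            hits.append(str(path))
                except OSError:
                    continue  # unreadable file: skip it
    return sorted(hits)
```

`find_suspect_references(DEFAULT_SEARCH_ROOTS)` returns candidate paths for a human to review. An empty result does not establish that the host is clean, since a renamed binary or a different persistence location would not match.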

What’s the fastest path to credential segmentation without re-architecting the notebook environment?

Replace static .env files with short-lived credentials from a secrets manager (AWS Secrets Manager, HashiCorp Vault) or cloud IAM role-chaining. Per-session tokens with a TTL of hours, rather than permanent keys, mean that credentials exfiltrated through a shell expire soon afterward instead of remaining valid indefinitely. Because the first observed attacker finished harvesting in under 3 minutes, a short TTL limits reuse rather than preventing theft. This is the operational form of the article’s ‘segmentation’ recommendation: the notebook process never holds long-lived secrets, so the blast radius of a single RCE shrinks toward the session window.
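One way this can look on AWS is sketched below, under several assumptions: `boto3` is installed, the notebook host already has an identity permitted to assume the target role, and the role ARN and session name are supplied by the caller. The helper names are hypothetical. STS accepts a minimum duration of 900 seconds, and a role's maximum session duration defaults to one hour unless configured otherwise.

```python
from typing import Dict

MIN_TTL_SECONDS = 900            # lower bound STS accepts for DurationSeconds
DEFAULT_MAX_TTL_SECONDS = 3600   # default maximum session duration for a role


def clamp_ttl(requested: int, maximum: int = DEFAULT_MAX_TTL_SECONDS) -> int:
    """Keep the requested TTL inside the range the role will accept."""
    return max(MIN_TTL_SECONDS, min(requested, maximum))


def fetch_session_credentials(
    role_arn: str, session_name: str, ttl_seconds: int
) -> Dict[str, str]:
    """Exchange the host identity for temporary credentials scoped to one role."""
    import boto3  # third-party; imported lazily so clamp_ttl works without it

    sts = boto3.client("sts")
    response = sts.assume_role(
        RoleArn=role_arn,
        RoleSessionName=session_name,
        DurationSeconds=clamp_ttl(ttl_seconds),
    )
    creds = response["Credentials"]
    return {
        "AWS_ACCESS_KEY_ID": creds["AccessKeyId"],
        "AWS_SECRET_ACCESS_KEY": creds["SecretAccessKey"],
        "AWS_SESSION_TOKEN": creds["SessionToken"],
    }
```

The returned mapping can be injected into the notebook session's environment in place of a static .env entry. The permissions attached to the assumed role matter as much as the TTL: a short-lived token carrying broad permissions still exposes everything in scope for as long as it lives.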

How does the 9-hour-41-minute exploit timeline compare to typical CVE exploitation?

First exploitation of critical CVEs is more commonly reported on a scale of days to weeks after disclosure, so sub-10-hour exploitation is an outlier even among critical vulnerabilities. The Langflow precedent (under 20 hours, also no public PoC) suggests AI dev-tooling advisories attract a faster attacker response than the general CVE population, likely because the credential payload (LLM keys, cloud tokens) makes these targets disproportionately valuable relative to the tool’s install base.

Does the typosquatted Hugging Face Space used to deliver NKAbuse indicate a broader supply-chain risk?

The ‘vsccode-modetx’ Space exploited a trust assumption that extends well beyond Marimo: ML pipelines routinely fetch models, datasets, and Spaces from Hugging Face without verifying provenance or pinning to a specific commit hash. Any toolchain that auto-resolves or allows references to public model-registry resources is susceptible to the same typosquatting technique. Pinning all registry references to verified commit hashes and restricting pipeline fetches to an allow-list of namespaces are the minimum viable mitigations.
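Both mitigations can be enforced with a small gate in front of any registry fetch. This is a minimal sketch: the allow-list contents and the function name are hypothetical, and it assumes references take the `namespace/name` form with a full 40-character commit hash as the revision.

```python
import re
from typing import FrozenSet

COMMIT_HASH = re.compile(r"[0-9a-f]{40}")

# Hypothetical allow-list; replace with the namespaces your organization has vetted.
ALLOWED_NAMESPACES: FrozenSet[str] = frozenset({"example-org"})


def is_pinned_and_allowed(
    repo_id: str,
    revision: str,
    allowed: FrozenSet[str] = ALLOWED_NAMESPACES,
) -> bool:
    """Accept only 'namespace/name' references from vetted namespaces, pinned to a commit."""
    namespace, sep, name = repo_id.partition("/")
    if not sep or not name:
        return False  # bare names carry no namespace to check
    # Branch names and tags such as "main" are mutable, so only full hashes pass.
    return namespace in allowed and COMMIT_HASH.fullmatch(revision) is not None
```

A pipeline would call this before handing the same `repo_id` and `revision` to its download step; in the `huggingface_hub` client, for example, download functions accept a `revision` argument. The exact-match namespace check is what defeats typosquatting, since a lookalike name simply fails to appear in the allow-list.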

Sources

¹ Sysdig, “Marimo OSS Python Notebook RCE: From Disclosure to Exploitation in Under 10 Hours,” accessed April 22, 2026.
² Endor Labs, “Root in One Request: Marimo’s Critical Pre-Auth RCE (CVE-2026-39987),” accessed April 22, 2026.
³ CSA Labs, “CSA Research Note: Marimo Pre-Auth RCE and AI Toolchain Security (CVE-2026-39987),” accessed April 22, 2026.
⁴ BleepingComputer, “Critical Marimo pre-auth RCE flaw now under active exploitation,” accessed April 22, 2026.
⁵ The Hacker News, “Marimo RCE Flaw CVE-2026-39987 Exploited Within 10 Hours of Disclosure,” accessed April 22, 2026.
⁶ we-fix-pc.com, “Hackers exploit Marimo flaw to deploy NKAbuse malware from Hugging Face,” accessed April 22, 2026.
