Security

Marimo CVE-2026-39987: Unauthenticated Root Shells Exploited Within Hours of Disclosure

Marimo's /terminal/ws endpoint granted unauthenticated attackers a full PTY shell. CVE-2026-39987 was actively exploited within 9 hours and 41 minutes of disclosure.

Security

Marimo CVE-2026-39987: Pre-Auth RCE via /terminal/ws in Under 10 Hours

Marimo's /terminal/ws skipped validate_auth() in versions ≤0.20.4. Sysdig recorded exploitation 9h 41m after disclosure; .env credential theft completed in under three minutes.
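
A minimal sketch of the vulnerability class, not marimo's actual code: a WebSocket route hands out a PTY without the auth gate that other routes apply. The Starlette wiring and the validate_auth body here are illustrative assumptions.

```python
import asyncio
import os
import pty

from starlette.applications import Starlette
from starlette.routing import WebSocketRoute
from starlette.websockets import WebSocket


def validate_auth(ws: WebSocket) -> bool:
    """Hypothetical stand-in for the session check that the HTTP routes perform."""
    return ws.query_params.get("token") == os.environ.get("SESSION_TOKEN")


async def terminal_ws(ws: WebSocket) -> None:
    await ws.accept()
    # The CVE pattern: no validate_auth(ws) gate here, so any client that can
    # reach /terminal/ws gets a shell running with the server's privileges.
    pid, fd = pty.fork()
    if pid == 0:
        os.execvp("bash", ["bash"])             # child: interactive shell on the PTY
    loop = asyncio.get_running_loop()
    while True:
        data = await ws.receive_text()          # client keystrokes -> PTY
        os.write(fd, data.encode())
        out = await loop.run_in_executor(None, os.read, fd, 4096)
        await ws.send_text(out.decode(errors="replace"))  # PTY output -> client


app = Starlette(routes=[WebSocketRoute("/terminal/ws", terminal_ws)])
```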

Security

MCP STDIO Executes Even When the Server Fails: One Design Decision, 14 CVEs, 30+ RCEs

[OX Security's April 2026 advisory](/articles/vercels-april-2026-database-leak-pivoted-from-lumma-stealer-at-context-ai-via/) traces 14 CVEs and 30+ RCEs across LiteLLM, Flowise, and Cursor to one MCP STDIO behavior: the command field executes before the handshake completes.
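
A self-contained sketch of that ordering, assuming the common mcpServers config layout; it is illustrative, not any specific client's implementation. The configured command runs at spawn time, so execution happens whether or not the "server" ever completes the initialize exchange.

```python
import json
import subprocess

config = {
    "mcpServers": {
        "innocuous-tool": {
            # Attacker-controlled value: runs with the host application's privileges.
            "command": "sh",
            "args": ["-c", "touch /tmp/ran-before-handshake; exec cat"],
        }
    }
}

for name, server in config["mcpServers"].items():
    # Step 1: the configured command is spawned. Execution has already
    # happened at this point, regardless of what follows.
    proc = subprocess.Popen(
        [server["command"], *server.get("args", [])],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
    )
    # Step 2: only now does the client attempt the JSON-RPC initialize
    # handshake. If it fails or times out, step 1 cannot be undone.
    init = {"jsonrpc": "2.0", "id": 1, "method": "initialize", "params": {}}
    proc.stdin.write((json.dumps(init) + "\n").encode())
    proc.stdin.flush()
    proc.kill()  # the "server" never answered, yet the file above already exists
```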

Open Source

Off Grid v0.0.88 Ships Hexagon HTP Acceleration: Auditability Is the Real Edge Over Apple Intelligence

Off Grid v0.0.88 ships Hexagon HTP/NPU text acceleration with a self-reported 3× speed gain. Auditability of its MIT-licensed source is the genuine advantage over Apple Intelligence.

Agents & Frameworks

PROBE-SWE Finds Chain-of-Thought and Self-Debiasing Don't Reduce Prompt-Induced Bias in Coding Agents

PROBE-SWE (arXiv 2604.16756) finds chain-of-thought and self-debiasing fail to reduce prompt-induced cognitive bias in software-engineering agents; axiomatic reasoning cues cut it by 51%.

Models & Research

STaD Exposes What HumanEval Hides: Compositional Skill Gaps in LLMs That Aggregate Benchmarks Miss

IBM Research's STaD shows models with identical benchmark scores can fail on different subskills, making leaderboard rank a poor proxy for compositional code generation.

Models & Research

STaD's Scaffolded Tasks Isolate the Compositional Skill Gaps That Aggregate LLM Benchmarks Hide

IBM Research's STaD framework exposes compositional skill gaps that aggregate benchmarks miss: two models at 32% on ToT Arithmetic needed fundamentally different fixes.
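
An illustrative sketch of the per-subskill profiling idea; the subskill names and scores below are invented, and only the shared 32% aggregate echoes the teaser's example.

```python
from statistics import mean

# Hypothetical subskill breakdowns for two models with the same aggregate score.
subskill_scores = {
    "model_a": {"parse_spec": 0.70, "plan_steps": 0.05, "compose_calls": 0.30, "carry_state": 0.23},
    "model_b": {"parse_spec": 0.10, "plan_steps": 0.60, "compose_calls": 0.35, "carry_state": 0.23},
}

for model, scores in subskill_scores.items():
    aggregate = mean(scores.values())
    weakest = min(scores, key=scores.get)
    print(f"{model}: aggregate={aggregate:.2f}, weakest subskill={weakest}")

# Both aggregates print 0.32, but model_a needs work on planning while
# model_b needs work on spec parsing -- the distinction a single benchmark
# number cannot show.
```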

Infrastructure & Runtime

UCCL-Zip Adds Lossless Compression to NCCL Collectives: 47.5% Faster RL Weight Sync, No API Changes

UCCL-Zip fuses lossless compression into NCCL collectives at the kernel level, cutting cross-node wire bytes without accuracy tradeoffs or application changes. Peak gain: 47.5% faster RL weight sync.

Infrastructure & Runtime

UCCL-Zip: Lossless Compression for NCCL, 47.5% Faster RL Sync, 10% Lower vLLM Latency

UCCL-Zip fuses lossless compression into NCCL and GPU P2P transfers, cutting RL weight sync by 47.5% and vLLM latency by 10% with no API changes and bit-identical outputs.
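
A host-side illustration of the lossless round-trip property the bit-identical claim rests on, using zlib as a stand-in codec; UCCL-Zip itself does this inside the collective and P2P paths on the GPU, which this sketch does not attempt.

```python
import zlib

import numpy as np

# Stand-in payload: a weight shard as raw bytes. Random data is a worst case
# for compression; real checkpoints and weight deltas carry more redundancy.
weights = np.random.default_rng(0).standard_normal(1 << 20).astype(np.float32)
payload = weights.tobytes()

compressed = zlib.compress(payload, level=1)   # fewer bytes cross the wire
restored = zlib.decompress(compressed)         # decompression on the receiver

assert restored == payload                     # bit-identical round trip: numerics untouched
print(f"wire bytes: {len(payload)} -> {len(compressed)}")
```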

Agents & Frameworks

ACL 2026: Multi-Agent LLM Topologies Accelerate Premature Convergence; Adding Agents Makes It Worse

An ACL 2026 Findings paper shows dense communication topologies in [multi-agent LLM systems](/articles/neural-computers-symbolic-stability-failure-contradicts-the-case-for-pure/) accelerate premature convergence, meaning topology matters more than model strength.

Agents & Frameworks

'Beyond the Diff' Quantifies Agentic Entropy: Why AI Coding Agents Drift Across Iterations

A CHI 2026 paper formalizes agentic entropy as structural drift between agent actions and intent, showing why per-step benchmarks miss cumulative misalignment in long agent runs.

Industry & Business

CATL's 10-to-98%-in-Seven-Minute LFP Cell Pushes the EV Fast-Charge Bottleneck From Battery to Charger Grid

CATL's Shenxing LFP claims 10-to-98% in 6:27, implying a ~700–900 kW sustained draw that exceeds CCS1 and Tesla V4 limits and shifts the fast-charging bottleneck from cell to charger and grid.
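
A back-of-envelope check of the implied draw: the 10-to-98% window and the 6:27 time come from the claim, while the pack capacities are assumptions chosen to show which sizes land in the ~700–900 kW range.

```python
# Average charging power = energy added / charge time, ignoring losses.
for pack_kwh in (85, 100, 110):                 # assumed pack sizes
    energy_kwh = (0.98 - 0.10) * pack_kwh       # energy added over the 10->98% window
    hours = (6 * 60 + 27) / 3600                # 6:27 expressed in hours
    avg_kw = energy_kwh / hours
    print(f"{pack_kwh} kWh pack -> average draw ≈ {avg_kw:.0f} kW")

# Prints ≈ 696, 819, and 900 kW respectively, reproducing the ~700–900 kW range.
```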

Infrastructure & Runtime

CoCoDiff Exposes the All-to-All Bottleneck That Caps Distributed Diffusion Transformer Inference Well Below Theoretical GPU Count

Ulysses parallelism's all-to-all exchanges cap distributed DiT inference scaling on heterogeneous interconnects. CoCoDiff delivers 3.6x average speedups on Aurora via topology-aware scheduling.
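
To see why the all-to-all dominates on slow links, a rough sizing sketch helps; the sequence length, hidden size, rank count, and the four-transfers-per-layer pattern (Q, K, V in; attention output out) are assumptions for illustration, not CoCoDiff's or Aurora's actual configuration.

```python
def ulysses_all_to_all_bytes(seq_len, hidden, ranks, dtype_bytes=2, transfers=4):
    local_shard = (seq_len // ranks) * hidden * dtype_bytes   # one rank's activation shard
    off_rank = local_shard * (ranks - 1) / ranks              # fraction that leaves the rank
    return transfers * off_rank

gb = ulysses_all_to_all_bytes(seq_len=65_536, hidden=4_096, ranks=8) / 1e9
print(f"≈ {gb:.2f} GB leaves each rank per attention layer")

# ≈ 0.23 GB per rank per layer, repeated across every layer and every diffusion
# step, which saturates a slow cross-node link long before compute does.
```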

Agents & Frameworks

Diversity Collapse in Multi-Agent LLM Systems: Structural Coupling Breaks Open-Ended Idea Generation Even When Topologies Are Sparse

An ACL 2026 Findings paper finds idea diversity in multi-agent LLM brainstorming collapses because agents share models, prompts, and context, not because topologies are too dense.

Models & Research

DuQuant++ Makes FP4 Quantization Practical for LLM Inference: What Fine-Grained Rotation Means for Blackwell Deployments

DuQuant++ aligns rotation block size with MXFP4 microscaling groups, halving preprocessing cost and pushing W4A4 accuracy close to FP8 as Blackwell FP4 Tensor Cores ship.
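
A simplified sketch of MXFP4-style microscaling quantization (groups of 32 elements sharing a power-of-two scale, each element snapped to the E2M1 grid), not DuQuant++'s code; the scale rule is simplified, and the alignment presumably means a 32-wide rotation block never straddles two scale groups.

```python
import numpy as np

FP4_GRID = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])  # E2M1 magnitudes
GROUP = 32                                                       # MX microscaling group size

def mxfp4_quantize(x: np.ndarray) -> np.ndarray:
    out = np.empty_like(x, dtype=np.float32)
    for start in range(0, x.size, GROUP):
        g = x[start:start + GROUP]
        # Shared power-of-two scale, chosen here so the group's max fits on the
        # grid without clipping (a simplification of the spec's scale rule).
        scale = 2.0 ** np.ceil(np.log2(max(np.abs(g).max(), 1e-12) / FP4_GRID[-1]))
        scaled = g / scale
        idx = np.abs(np.abs(scaled)[:, None] - FP4_GRID).argmin(axis=1)  # snap to grid
        out[start:start + GROUP] = np.sign(scaled) * FP4_GRID[idx] * scale
    return out

x = np.random.default_rng(0).standard_normal(128).astype(np.float32)
print("max abs quantization error:", np.abs(x - mxfp4_quantize(x)).max())
```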