Groundy — independent coverage of developer tools, infrastructure, and platforms

models

Audio LLMs Break When the Codec Changes: A Robustness Vector Voice-AI Teams Haven't Tested

CodecAttack achieves 85.5% attack success on audio LLMs by optimizing in codec latent space, with 100% zero-shot transfer to MP3, proving lossy compression fails as a defense.

models

Do LLMs Know What Not to Say? Causal Evidence for Statistical Preemption

New causal evidence shows LLMs suppress wrong continuations during pretraining via statistical preemption, suggesting output-layer safety fixes may target the wrong layer.

oss

Microsoft Open-Sources the Earliest Known DOS Source Code: What 1980 Tim Paterson 86-DOS Reveals

Microsoft released the 86-DOS 1.00 source on GitHub, the earliest known DOS source code, giving researchers a primary document for tracing the QDOS-to-MS-DOS lineage and comparing it with CP/M.

infra

Railway's GCP Suspension Is a Reseller PaaS Problem, Not a Google One

Railway's eight-hour outage shows why every reseller PaaS with a single upstream account is one billing flag away from total blackout, and what teams should audit now.

agents

Routing LLM Agents: Why TwinRouterBench Splits Static and Live Evaluation

TwinRouterBench pairs 970-prefix static scoring with live SWE-bench runs to expose why per-step router accuracy fails to predict end-to-end agent success.

policy

arXiv 2602.13372 MoralityGym Tests Whether Agents Hold Moral Priorities Across Sequential Decisions

MoralityGym's benchmark shows Safe RL agents degrade on sequential moral tradeoffs, revealing a gap in the single-turn alignment evals that vendors publish as safety proof.



  1. may 25 devtools Anthropic Acquires Stainless: SDK Generator for OpenAI, Google, and Cloudflare
  2. may 25 devtools Anthropic Buys Stainless: OpenAI and Google Now Depend on a Rival for SDK Tooling
  3. may 25 models Audio LLMs Break When the Codec Changes: A Robustness Vector Voice-AI Teams Haven't Tested
  4. may 25 models Do LLMs Know What Not to Say? Causal Evidence for Statistical Preemption
  5. may 25 oss Microsoft Open-Sources the Earliest Known DOS Source Code: What 1980 Tim Paterson 86-DOS Reveals
  6. may 25 infra Railway's GCP Suspension Is a Reseller PaaS Problem, Not a Google One
  7. may 25 agents Routing LLM Agents: Why TwinRouterBench Splits Static and Live Evaluation
  8. may 24 policy arXiv 2602.13372 MoralityGym Tests Whether Agents Hold Moral Priorities Across Sequential Decisions
  9. may 24 infra CISA Admin Leaked AWS GovCloud Keys on GitHub: What Federal Secret Scanning Missed
  10. may 24 security CISA's Internal Data Leak Tests the Disclosure Standards It Sets for Others
  11. may 24 infra Cloudflare's Agent Accounts: When Bots Become Paying Cloud Customers
  12. may 24 oss Colorado SB051 Carves Out Open Source From Age Verification After Maintainer Backlash
  13. may 24 oss Colorado SB26-051 Shields Non-Commercial Open Source by Omission, Not by Design
  14. may 24 models Embedding Compression at Training Time: DIVE's Gradient Trick vs Post-Hoc Quantization for Vector DBs
  15. may 24 security Inside the TanStack npm compromise: OIDC trust as a supply-chain weapon
  16. may 24 oss Nesbitt's Open Source Death Taxonomy Exposes a Health Score Blind Spot
  17. may 24 security Nx s1ngularity Attackers Used Local Claude Code and Gemini CLI to Steal Developer Tokens
  18. may 24 culture OpenAI's Own Economic Analysis Quietly Concedes the Labor Displacement Case
  19. may 24 industry OpenAI's S-1 Triggers a Repricing Cascade for Every Private AI Lab Valuation
  20. may 24 models μP Hyperparameter Transfer Has an Embedding Layer Hole, New arXiv Paper Says
  21. may 24 devtools Rmux Brings a Playwright SDK to tmux Sessions for Agent Automation Workflows
  22. may 24 devtools Shai-Hulud Returns: 314 npm Packages Compromised in a Self-Propagating Supply-Chain Worm
  23. may 24 industry SoftBank's $40B Bridge Loan Means Bank Covenants Will Shape OpenAI's Post-IPO Pricing
  24. may 24 security TanStack npm Attack: When OIDC Trusted Publishing Becomes the Attack Vector
  25. may 24 industry Vercel Acquires Splitbee to Fold First-Party Analytics Into the Hosting Bundle
  26. may 24 infra Vercel CDN Request Collapsing: One Origin Fetch Per ISR Cache Miss
  27. may 24 infra Vercel Fluid Pools Database Connections Across Invocations, Bypassing External Poolers
  28. may 23 policy AI Agent Alignment Tests Are One-Shot. A New Benchmark Catches Multi-Step Failures
  29. may 23 security FBI Director Patel's Based Apparel Site Was Caught Serving ClickFix Malware
  30. may 23 oss Files.md Bets on Plain Markdown Folders as the Obsidian Exit Ramp
  31. may 23 industry Green Card Rule Change Forces Tech Workers to Leave the US to Apply
  32. may 23 culture Microsoft's Own Numbers: AI Agents Cost More Per Task Than the Human Employees They Replace
  33. may 23 policy Microsoft's Own Numbers Now Show AI Agents Cost More Than the Humans They Replaced
  34. may 23 industry OpenAI Hires Slack's Denise Dresser as CRO, Conceding Enterprise Growth Needs a Sales Org
  35. may 23 security OpenAI Ships Lockdown Mode and Elevated Risk Labels for ChatGPT Sessions
  36. may 23 models Project Glasswing One Month In: AI Bug Discovery Has Outpaced the Patch Pipeline
  37. may 23 culture Trump Ends Domestic Green Card Filing: Applicants Must Now Leave the US to Apply
  38. may 23 culture US Researchers Hit With New Federal Limits on Publishing With Foreign Collaborators
  39. may 23 infra Vercel CDN Now Caches External Origin Responses by Default, Ending Years of Uncached Proxies
  40. may 23 infra Vercel Stops Billing for Firewall-Mitigated Traffic: Edge WAF Pricing Catches Up to Cloudflare
  41. may 23 infra What Cloudflare's Q1 2026 Outage Data Says About Designing for State-Level Shutdowns
  42. may 22 models A Theory of Time-Sensitive Language Generation Says Sparse Hallucination Beats Mode Collapse
  43. may 22 models arXiv 2605.16428 Measures AI Search's Drag on Publisher Traffic Using Paired Google and Reddit Data
  44. may 22 agents Beyond Text-to-SQL: New Agentic Architecture Routes Enterprise Analytics Through Governed APIs
  45. may 22 policy CISA's Own Data Leak Has Lawmakers Demanding Answers About the Voluntary Threat-Sharing Pact
  46. may 22 devtools Cursor's In-House Model Changes the Vendor Calculus for AI Coding Teams
  47. may 22 devtools Deno 2.8 Lands as Bun Gets Deprecated by yt-dlp: The JavaScript Runtime Field Is Reshuffling
  48. may 22 culture Employer-Side Law Firms Create a Structural Asymmetry in US Organizing Drives
  49. may 22 devtools Google Sunsets Gemini CLI on June 18: Forced Migration to Antigravity CLI Breaks Existing Automation
  50. may 22 agents GraphFlow Lifts LLM-Agent Workflows Into Schedulable Graphs to Optimize Serving