Groundy — independent coverage of developer tools, infrastructure, and platforms
Frontier AI Has Broken Open CTFs: Why Claude Code Now One-Shots Medium Pwn Challenges
Frontier AI agents solve most medium CTF challenges for under $100 in API costs. BSidesSF 2026 saw 16 full-solve teams, up from one. The open CTF format has lost calibration.
policy Selective Geometry Attacks Bypass LLM Safety Alignment, New arXiv Paper Reports
Two papers show LLM safety alignment can be bypassed by embedding perturbations, a surface neither standard evaluations nor regulatory certifications inspect.
OpenAI's Indeed Customer Story Pushes ChatGPT Into the Job-Description Stack Ahead of LinkedIn
OpenAI's enterprise HR-tech push commoditizes job-description AI ahead of its IPO, shifting the recruiting-tool advantage to data moats held by LinkedIn and Workday.
industry HiBob Runs 2,500 Internal GPTs: OpenAI's New Enterprise Adoption Metric
OpenAI is pushing custom GPT count as its enterprise adoption metric, using HiBob's 2,500 deployments as proof. The number measures configuration volume, not usage or value.
industry OpenAI's Trusted-Access Programs Force a Compliance Tier onto Pharma AI Buyers
OpenAI's trusted-access gating for GPT-Rosalind and GPT-5.5-Cyber forces pharma procurement teams to absorb a vendor-defined compliance layer before any inference can run.
agents SkillOpt Treats Agent Skill Libraries as an Executive Scheduling Problem, Not a Memory Bank
SkillOpt treats agent skills as trainable state with deletion and budgeted edits, sweeping 52 of 52 benchmarks. Append-only registries in agent frameworks are a design error.
oss AudioMass Adds Multitrack to the Browser-Only Open-Source Audio Editor
AudioMass added multitrack editing to its 65 KB browser audio editor with no install step. The update targets locked-down devices, but browser memory limits cap project size.
agents Claude Code Dynamic Workflows: Spawning 100 Parallel Subagents on Opus 4.8
Dynamic workflows let Claude Code run hundreds of parallel subagents in one session. Here is how map-reduce and fan-out patterns work on Opus 4.8.
- agents Claude Code, Cursor, Copilot: How Agentic Coding Assistants Get Weaponized as Attacker Shells
- devtools Anthropic Buys Stainless: OpenAI and Google Now Depend on a Rival for SDK Tooling
- agents A New Trust Schema Exposes Why Agent Skill Registries Fail Enterprise Audit Requirements
- policy FTC's TAKE IT DOWN Act Lands May 19: 48-Hour Deepfake NCII Takedowns and No Safe Harbor
- agents CrewAI vs AutoGen vs LangGraph 2026: The Real Trade-Off After Maintenance Mode
- devtools Claude Code Plugins: Anthropic's Official Plugin Ecosystem Explained
- devtools GitHub Copilot vs Cursor vs Claude Code: The 2026 AI Coding Showdown
- infra MLX vs llama.cpp on Apple Silicon: Which Runtime to Use for Local LLM Inference
- models Chinese AI Models Compared: DeepSeek, Qwen, Kimi, Doubao, and Ernie
- models AI Code Generation Benchmarks 2026: Which Model Actually Writes Better Code?
- devtools GitHub Copilot's Opus 4.7 Multiplier: 7.5x to 15x to 27x in 60 Days
- industry Cursor's Meteoric Rise: Inside the AI Editor Hitting $300M ARR
- infra Prefill-Decode Disaggregation: The Architecture Shift Redefining LLM Serving at Scale
- devtools Claude Code in GitHub Actions: A Complete Guide to Automated PR Fixes
- culture EU's 2027 Replaceable Battery Mandate: What It Means for Phone Buyers and Repairers Right Now
- may 27 oss Frontier AI Has Broken Open CTFs: Why Claude Code Now One-Shots Medium Pwn Challenges
- may 27 policy Selective Geometry Attacks Bypass LLM Safety Alignment, New arXiv Paper Reports
- may 27 industry OpenAI's Indeed Customer Story Pushes ChatGPT Into the Job-Description Stack Ahead of LinkedIn
- may 27 industry HiBob Runs 2,500 Internal GPTs: OpenAI's New Enterprise Adoption Metric
- may 27 industry OpenAI's Trusted-Access Programs Force a Compliance Tier onto Pharma AI Buyers
- may 27 agents SkillOpt Treats Agent Skill Libraries as an Executive Scheduling Problem, Not a Memory Bank
- may 27 oss AudioMass Adds Multitrack to the Browser-Only Open-Source Audio Editor
- may 27 agents Claude Code Dynamic Workflows: Spawning 100 Parallel Subagents on Opus 4.8
- may 27 agents How Opus 4.8 Honesty Prevents Cascade Failures in Agentic Loops
- may 27 models Opus 4.8 Batch API: 1M Context, 300k Output, and Team Cost Controls
- may 27 models Opus 4.8 vs Opus 4.7: What Changed and What Did Not
- may 27 devtools Should Your Coding Team Upgrade to Opus 4.8? The Honest Tradeoff Math
- may 26 agents Penetration Testing Multi-Agent LLM Systems: A Failure Catalog Vendors Don't Document
- may 26 security OpenAI's New Safety Bug Bounty Pays Researchers for Jailbreaks and Policy Bypasses
- may 26 models One Learning Rate Doesn't Fit All: Heavy-Tail Layerwise LR Schedules for LLM Pretraining
- may 26 industry OpenAI Buys Statsig and Makes Vijaye Raji CTO of Applications: Product Analytics Becomes Core Infra
- may 26 security Axios npm Compromise Forces Vercel Into Platform-Level Remediation
- may 26 industry HuggingFace's $100M Series C Bets Open-Source AI Can Outlast Per-Token Pricing Wars
- may 26 security Next.js Dev Server CVE-2025-48068: Any Web Page Could Read Your Source Files
- may 26 industry Vercel's Series F Repackages Frontend Hosting as an AI Cloud Bundle
- may 26 infra Gemma 4 31B on Cloud TPU vs GPU: The Serving Cost Crossover Point
- may 26 agents Claude Code, Cursor, Copilot: How Agentic Coding Assistants Get Weaponized as Attacker Shells
- may 26 agents Claude Code Configs in the Wild: New Study Maps How Developers Actually Use It
- may 26 infra Cloudflare Flagship Is a Feature Flag Service That Deepens Platform Gravity
- may 26 security MCP Tool Description Poisoning: New Benchmark Shows Agents Trust Manuals That Lie
- may 26 security OpenAI Adds a GPT-5 System Card Addendum on Sensitive Conversations
- may 26 industry OpenAI's Biology Risk Post Reads as S-1 Disclosure Prep, Not Safety Theater
- may 26 models Scale Vectors: Tiny Parameter Subsets That Disproportionately Steer LLM Behavior
- may 26 security Vercel Could Block React2Shell at the Edge. Its Next 13 CVEs Had No Shortcut.
- may 26 devtools Vercel Sandbox Gets CLI Access and Env Vars: A Push at the Agent Runtime Slot
- may 26 infra Why LLMs Still Botch Kubernetes Manifests: The Training-Data Gap
- may 25 agents Microsoft Bolts Governance Onto Agent Framework as Stack Sprawl Persists
- may 25 policy arXiv Paper Tracks FTC Affiliate Disclosure Gaps in YouTube's Influencer Economy
- may 25 devtools Bun Rewrites Its Core From Zig to Rust, Putting Downstream Zig Bindings at Risk
- may 25 infra ObjectCache Moves KV Reuse to S3-Class Storage: Why Layerwise Retrieval Beats Full-Prefix Cache Hits
- may 25 policy AI Safety Benchmark Rankings Flip Based on Eval Config, SafetyRepro Paper Reports
- may 25 infra Vercel's CDN Origin Timeout Jumps to 2 Minutes: A Concession to LLM Streaming Workloads
- may 25 agents GovernSpec Contractual Skills Make Agent Governance Auditable Before Runtime
- may 25 devtools Vercel Bets on Bun While Post-Acquisition Priority Drift Makes the Runtime a Vendor Decision
- may 25 industry OpenAI Replaces Indeed's Job-Matching Engine: What It Means for ATS Vendors
- may 25 oss One Coding Agent Per Kanban Card: Kanbots Stress-Tests Parallel AI Workflow
- may 25 infra Fluid Compute vs PgBouncer: Vercel's Undocumented Bet on Connection Reuse
- may 25 devtools PromptArmor Shows Microsoft Copilot Cowork Can Be Tricked Into Exfiltrating Files
- may 25 agents Indirect Prompt Injection Benchmarks Were Too Easy: LivePI Adds Realism
- may 25 security Apple Names Claude in CVE Credit Line, Setting Vendor Attribution Precedent
- may 24 industry Vercel Acquires Splitbee to Fold First-Party Analytics Into the Hosting Bundle
- may 24 models Embedding Compression at Training Time: DIVE's Gradient Trick vs Post-Hoc Quantization for Vector DBs
- may 25 devtools Anthropic Buys Stainless: OpenAI and Google Now Depend on a Rival for SDK Tooling
- may 24 models μP Hyperparameter Transfer Has an Embedding Layer Hole, New arXiv Paper Says
- may 25 models Audio LLMs Break When the Codec Changes: A Robustness Vector Voice-AI Teams Haven't Tested