groundy

Groundy — independent coverage of developer tools, infrastructure, and platforms

  1. jun 22 policy Do LLM Personality Tests Measure Anything? A New Paper Says No
  2. jun 22 security Reported React Server Components Leak Is Unconfirmed: Audit the Payload
  3. jun 22 devtools Generating Vercel Firewall Rules From Natural Language: What to Audit
  4. jun 22 devtools GLM-5.2 Coding Plan vs Claude Opus 4.8: Picking a Model for Coding Agents
  5. jun 22 security Vercel's Secure AI Agent Guidance Pushes Defense Into the Sandbox
  6. jun 22 security Nx Supply-Chain Attack Used Developers' Own AI CLIs to Hunt Secrets
  7. jun 22 industry Vercel Folds Backends, Agent Tooling, and Operations Into Its Deploy Platform
  8. jun 22 infra Cloudflare Now Routes Public Traffic to Private Apps via DNS, No VPN
  9. jun 22 oss OpenAI's Patch the Planet Is Security Capacity for Nine Projects, Not Sustainability Funding
  10. jun 22 oss MiniMax M3 Claims GPT-5.5-Beating Code With 1M Context and Open Weights
  11. jun 22 industry George Hotz Says Only AGI Doom Justifies Today's AI Valuations
  12. jun 22 infra GitHub's AI Capacity Crunch Pushes Microsoft to Rent AWS Compute
  13. jun 22 policy Community LoRA Mining Raises a Consent Gap for Style Generation
  14. jun 21 culture Why Audio Deepfake Detectors Keep Losing the Voice-Cloning Arms Race
  15. jun 20 security Mixed Compliance Data Makes Safety Fine-Tuning a Curation Problem
  16. jun 20 policy When an LLM Narrates a Solver, the Explanation Drifts From the Math
  17. jun 20 infra Cloudflare's Temporary Accounts Give AI Agents Disposable Credentials
  18. jun 20 policy Grading DiffusionGemma: How an Open-Weight Diffusion Model Scores on Transparency
  19. jun 20 policy Who Owns Editorial Authority When LLMs Mediate Knowledge?
  20. jun 20 oss Lithuania's Open-Source Drone-Detection Network Signals an Air-Defense Shift
  21. jun 20 culture Why AI Misreads Nigerian English: A Register Gap in Public Discourse
  22. jun 20 agents Deep-Research Benchmarks Hide How Agents Fail at Open-Web Source Grounding
  23. jun 20 policy Vector Database Access Control Is Missing, and RAG Pipelines Pay for It
  24. jun 20 agents DSPy Ships Autonomous Prompt Optimization, but Judge Drift Is the Failure Mode
  25. jun 20 culture What YouTube's Coding Tutorials Teach About Who Belongs in Software
  26. jun 20 industry Finance Agent Benchmarks Expose Where Lending Automation Breaks
  27. jun 20 oss NLnet's Grant Model Diverges From VC-Backed Open Source
  28. jun 20 oss Adam's Open-Source AI CAD Claim Lacks a Confirmed Repo or Accuracy Benchmark
  29. jun 20 agents Do AI Agents Reach for Over-Privileged Tools When Simpler Ones Suffice?
  30. jun 20 agents When Should Multi-Agent Systems Use an Event Bus Instead of an Orchestrator?
  31. jun 20 oss Epic Open-Sources Lore, a VCS Pitched at Git's Scaling Ceiling
  32. jun 20 infra Running Long-Context Agents on a 4-Bit KV Cache: Where Accuracy Breaks
  33. jun 20 security Defending Agentic AI With Deception: Misdirecting Model-Guided Attacks
  34. jun 20 security The Autonomy Tax: Why RL Rewards the Wrong Behavior in Agents
  35. jun 20 security Anthropic's Procurement Risk Is Policy Refusal, Not Jailbreaks
  36. jun 19 industry Can You Predict a Fine-Tune's Payoff Before Training Finishes?
  37. jun 19 culture When an Algorithm Sequences Gig Hiring, Whose Objective Does It Optimize?
  38. jun 19 infra When LLM-Generated CUDA Kernels Pass Tests but Get the Math Wrong
  39. jun 19 models Can RoboSSM's State-Space Backbone Replace Transformer Imitation Policies?
  40. jun 19 models Pruning Experts to Shrink MoE Models: Does Attribution-Guided Compression Beat Magnitude?
  41. jun 19 agents Can Deontic Policy Rules Govern an AI Agent at Runtime?
  42. jun 19 models GLM-5.2 vs Kimi K2.7 Code: Two Open-Weight Bets on Agentic Coding
  43. jun 19 models How Linear Is a Transformer Feed-Forward Block? A New Test Says It's Learned, Not Built In
  44. jun 19 devtools Cursor Goes to SpaceX, Windsurf to Cognition: What Changes for Dev Teams
  45. jun 18 culture AI Essay Grading: What a Probe of LLM Internals Reveals About Scoring
  46. jun 18 models GLM-5.2 Benchmarks: What 62.1% SWE-bench Pro and 99.2% AIME Actually Mean
  47. jun 18 policy GLM-5.2 MIT Weights vs Llama License: Self-Hosting Compliance for Regulated Industries
  48. jun 18 models GLM-5.2 on Terminal-Bench 2.1: Strengths, Gaps, and How to Route Real Coding Tasks
  49. jun 18 models GLM-5.2 vs Claude Opus 4.8: Open-Weight Coding at Frontier Pricing
  50. jun 18 models GLM-5.2's 753B MoE Costs More to Self-Host Than the MIT License Suggests