Groundy — independent coverage of developer tools, infrastructure, and platforms

  1. jun 21 culture Why Audio Deepfake Detectors Keep Losing the Voice-Cloning Arms Race
  2. jun 20 security Mixed Compliance Data Makes Safety Fine-Tuning a Curation Problem
  3. jun 20 policy When an LLM Narrates a Solver, the Explanation Drifts From the Math
  4. jun 20 infra Cloudflare's Temporary Accounts Give AI Agents Disposable Credentials
  5. jun 20 policy Grading DiffusionGemma: How an Open-Weight Diffusion Model Scores on Transparency
  6. jun 20 policy Who Owns Editorial Authority When LLMs Mediate Knowledge?
  7. jun 20 oss Lithuania's Open-Source Drone-Detection Network Signals an Air-Defense Shift
  8. jun 20 culture Why AI Misreads Nigerian English: A Register Gap in Public Discourse
  9. jun 20 agents Deep-Research Benchmarks Hide How Agents Fail at Open-Web Source Grounding
  10. jun 20 policy Vector Database Access Control Is Missing, and RAG Pipelines Pay for It
  11. jun 20 agents DSPy Ships Autonomous Prompt Optimization, but Judge Drift Is the Failure Mode
  12. jun 20 culture What YouTube's Coding Tutorials Teach About Who Belongs in Software
  13. jun 20 industry Finance Agent Benchmarks Expose Where Lending Automation Breaks
  14. jun 20 oss NLnet's Grant Model Diverges From VC-Backed Open Source
  15. jun 20 oss Adam's Open-Source AI CAD Claim Lacks a Confirmed Repo or Accuracy Benchmark
  16. jun 20 agents Do AI Agents Reach for Over-Privileged Tools When Simpler Ones Suffice?
  17. jun 20 agents When Should Multi-Agent Systems Use an Event Bus Instead of an Orchestrator?
  18. jun 20 oss Epic Open-Sources Lore, a VCS Pitched at Git's Scaling Ceiling
  19. jun 20 infra Running Long-Context Agents on a 4-Bit KV Cache: Where Accuracy Breaks
  20. jun 20 security Defending Agentic AI With Deception: Misdirecting Model-Guided Attacks
  21. jun 20 security The Autonomy Tax: Why RL Rewards the Wrong Behavior in Agents
  22. jun 20 security Anthropic's Procurement Risk Is Policy Refusal, Not Jailbreaks
  23. jun 19 industry Can You Predict a Fine-Tune's Payoff Before Training Finishes?
  24. jun 19 culture When an Algorithm Sequences Gig Hiring, Whose Objective Does It Optimize?
  25. jun 19 infra When LLM-Generated CUDA Kernels Pass Tests but Get the Math Wrong
  26. jun 19 models Can RoboSSM's State-Space Backbone Replace Transformer Imitation Policies?
  27. jun 19 models Pruning Experts to Shrink MoE Models: Does Attribution-Guided Compression Beat Magnitude?
  28. jun 19 agents Can Deontic Policy Rules Govern an AI Agent at Runtime?
  29. jun 19 models GLM-5.2 vs Kimi K2.7 Code: Two Open-Weight Bets on Agentic Coding
  30. jun 19 models How Linear Is a Transformer Feed-Forward Block? A New Test Says It's Learned, Not Built In
  31. jun 19 devtools Cursor Goes to SpaceX, Windsurf to Cognition: What Changes for Dev Teams
  32. jun 18 culture AI Essay Grading: What a Probe of LLM Internals Reveals About Scoring
  33. jun 18 models GLM-5.2 Benchmarks: What 62.1% SWE-bench Pro and 99.2% AIME Actually Mean
  34. jun 18 policy GLM-5.2 MIT Weights vs Llama License: Self-Hosting Compliance for Regulated Industries
  35. jun 18 models GLM-5.2 on Terminal-Bench 2.1: Strengths, Gaps, and How to Route Real Coding Tasks
  36. jun 18 models GLM-5.2 vs Claude Opus 4.8: Open-Weight Coding at Frontier Pricing
  37. jun 18 models GLM-5.2's 753B MoE Costs More to Self-Host Than the MIT License Suggests
  38. jun 18 infra Running GLM-5.2 at Home: SGLang, vLLM, Transformers, and KTransformers Setup Guide
  39. jun 18 devtools Running GLM-5.2 in Cursor, Cline, and Roo Code: Migration Checklist and Gotchas
  40. jun 17 models STAR Replaces Scalar Reward in Text-to-Image RL with Attention-Derived Spatial Maps
  41. jun 15 oss Zhipu Open-Sources GLM-5.2 Under MIT While Anthropic Tightens Model Access
  42. jun 15 models Can Editing One Neuron Fix LLM Repetition Loops?
  43. jun 15 industry Zhipu Ships GLM-5.2 With 1M Context and MIT Weights, but Zero Benchmarks at Launch
  44. jun 15 infra AWS Bedrock Now Requires Data Sharing for Mythos: The Self-Hosting Calculus
  45. jun 15 devtools Vercel's Remend Turns Streaming-Markdown Repair Into a Dependency
  46. jun 15 industry Moonshot's Kimi K2.7 Code Loses 11 of 12 Benchmark Cells, Leads on Efficiency Instead
  47. jun 14 policy Can Reinforcement Learning Be Provably Safe Without Sacrificing Scale?
  48. jun 14 infra vLLM Cold Start Latency: Why Scale-to-Zero LLM Serving Stalls
  49. jun 14 infra The Vercel-AWS Deal Reveals Where AI Inference Runs
  50. jun 14 agents Do Programming Languages Still Matter to Your AI Coding Agent?