Groundy — independent coverage of developer tools, infrastructure, and platforms
Do Multi-Agent RAG Systems Write Better READMEs Than One Agent?
An ICSME 2026 study finds single-agent RAG matches multi-agent README quality using 86% fewer tokens and half the latency, while developer-guided planning beats both.
security · Jailbreaks Hidden in Image Pixels Slip Past Editors' Text Guardrails via an Empty Prompt
VJA embeds jailbreak instructions in image pixels with an empty text prompt, leaving text-only guardrails nothing to scan and forcing moderation into the pixel pipeline.
Doubao 2.1 Pro: What 180 Trillion Daily Tokens Means for Inference Infrastructure
Doubao 2.1 Pro ships at ¥6/¥30 per million tokens. The family's 180 trillion daily tokens reset what Western inference stacks must assume about price and capacity.
devtools · Vercel Firewall in the CLI: What's Still Missing
Vercel Firewall now has a CLI, but the dashboard is still required. We map the controls that stay manual, the vercel.json action subset, and the per-region rate-limit trap.
infra · Every CUDA Kernel Pays a Launch Tax: The Host-to-Device Walkthrough
Every CUDA kernel pays a fixed driver-queue tax before its first FLOP runs. The fusion, graphs, and batching sold as bandwidth wins mostly hide the launch overhead.
models · How LLMs Fuse Conflicting Facts: Single-Source vs Multi-Source Truth
LLMs beat classic truth-discovery on conflicting facts, but the same reasoning treats repeated low-credibility claims as corroboration. More sources do not mean more truth.
security · Linux Foundation Akrites Centralizes Open-Source Vulnerability Disclosure
Akrites pools 19 vendors behind one shared vulnerability disclosure SIRT to absorb a flood of duplicate LLM reports, but risks becoming the new bottleneck itself.
models · Linear Transformers Get a Learnable Kernel: Does Flexformer Change the Efficiency Tradeoff?
Flexformer makes linear attention's kernel learnable by training spectral frequencies, but the abstract offers no perplexity or accuracy numbers to back its gains.
- agents · MCP vs A2A: Two Agent Protocols, One Integration Layer Decision
- models · GLM-5.2 vs Kimi K2.7 Code: Two Open-Weight Bets on Agentic Coding
- devtools · Cursor Goes to SpaceX, Windsurf to Cognition: What Changes for Dev Teams
- policy · US Export Order Forces Anthropic to Disable Fable 5 and Mythos 5 Worldwide
- infra · MiniMax M3 Ships 1M Context and Desktop Control as Open Weights
- devtools · GitHub Copilot vs Cursor vs Claude Code: The 2026 AI Coding Showdown
- industry · Cursor's Meteoric Rise: Inside the AI Editor Hitting $300M ARR
- models · GLM-5.2 Benchmarks: What 62.1% SWE-bench Pro and 99.2% AIME Actually Mean
- models · AI Code Generation Benchmarks 2026: Which Model Actually Writes Better Code?
- models · Chinese AI Models Compared: DeepSeek, Qwen, Kimi, Doubao, and Ernie
- devtools · Claude Code Plugins: Anthropic's Official Plugin Ecosystem Explained
- infra · MLX vs llama.cpp on Apple Silicon: Which Runtime to Use for Local LLM Inference
- infra · Running GLM-5.2 at Home: SGLang, vLLM, Transformers, and KTransformers Setup Guide
- devtools · Running GLM-5.2 in Cursor, Cline, and Roo Code: Migration Checklist and Gotchas
- models · GLM-5.2 on Terminal-Bench 2.1: Strengths, Gaps, and How to Route Real Coding Tasks
- jun 29 · agents · Do Multi-Agent RAG Systems Write Better READMEs Than One Agent?
- jun 29 · security · Jailbreaks Hidden in Image Pixels Slip Past Editors' Text Guardrails via an Empty Prompt
- jun 29 · infra · Doubao 2.1 Pro: What 180 Trillion Daily Tokens Means for Inference Infrastructure
- jun 29 · devtools · Vercel Firewall in the CLI: What's Still Missing
- jun 29 · infra · Every CUDA Kernel Pays a Launch Tax: The Host-to-Device Walkthrough
- jun 29 · models · How LLMs Fuse Conflicting Facts: Single-Source vs Multi-Source Truth
- jun 28 · security · Linux Foundation Akrites Centralizes Open-Source Vulnerability Disclosure
- jun 28 · models · Linear Transformers Get a Learnable Kernel: Does Flexformer Change the Efficiency Tradeoff?
- jun 28 · security · Why LLM Prompt Injection Persists: Instructions and Data Share Embeddings
- jun 28 · culture · Generative AI Moves the Freelance Bottleneck From Tasks to Skill Repricing
- jun 28 · policy · Uncertainty-Aware Reward Discounting Cuts Reward Hacking 93.6% in a Preprint
- jun 28 · security · When Bots and Agents Post CVEs in PRs, Reporters Inherit the Triage Burden
- jun 28 · security · Runtime vs Build-Time SBOMs: Why Your Container Runs Uncatalogued Code
- jun 28 · industry · Elkjøp's Next.js Move Shows Vercel Wants Retail Operations, Not Just Websites
- jun 28 · agents · Agentic AI Turns Location Trails Into a Re-Identification Tool
- jun 28 · models · Huawei Ships CUDA-Free AI Compute On-Device, but Ascend Quantization Accuracy Is Unverified
- jun 28 · infra · Vercel Montreal Region: Audit Residency Before You Migrate
- jun 28 · agents · How a Human-Agent Team Lifts One Video Into 4D Interactions
- jun 28 · oss · Safetensors vs Pickle: Why Hugging Face Chose It After the Security Audit
- jun 28 · models · Do Multimodal RAG Models Ignore Late Evidence? A Primacy Bias Test
- jun 28 · security · OpenAI's Agent Link Safety Isolates the Fetch, Not Prompt Injection
- jun 28 · agents · Can LLM Agents Learn Cooperation Laws From Embodied Play?
- jun 28 · security · No Verified 'React2Shell' Bulletin Exists: What Next.js Teams Should Check
- jun 28 · models · Can Deep Learning Design RF Power Amplifiers Without Full EM Simulation?
- jun 28 · security · Vercel on the Axios npm Compromise: Platform Scanning Has a Blind Spot
- jun 28 · agents · Govern the Repo, Not the Agent: A New Risk Metric for AI-Native Code
- jun 28 · culture · LLM-Generated VeriFast Specs Shift the Trust Bottleneck from Proofs to Review
- jun 28 · infra · GLM-5.2 on vLLM and Ascend: Open Weights Beyond NVIDIA
- jun 28 · oss · Hugging Face Is Absorbing Computer Vision Into Vision-Language Models
- jun 27 · agents · Can an AI Agent Catch Cryptographic Misuse Before It Ships? Chai Tests the Claim
- jun 27 · devtools · Vercel's CLI Is a Deployment Path, Not a Control Plane
- jun 27 · infra · How Vercel Runs Its Own CDN in Front of Discourse: A Self-Dogfooding Case Study
- jun 27 · industry · ByteDance's Doubao Seed 2.1 Pro: Production-Grade Claims, Vendor-Graded Evidence
- jun 27 · devtools · GLM-5.2 Goes Open Weights: What the Long-Horizon Coding Pitch Leaves Out
- jun 27 · policy · Medical AI Liability Needs a Clinical Harness
- jun 27 · models · Synthetic Clinical Notes from LLMs: Believable Prose Is Not Clinical Validity
- jun 27 · models · Doubao vs Qwen 3.7 vs GLM-5.2: Route by Axis, Not Leaderboard
- jun 27 · infra · Vercel Runtime Logs Surface CDN Cache Hits, Not the Eviction Cause
- jun 27 · models · Can Dynamic Experts Fix Catastrophic Forgetting in Robot Manipulation?
- jun 27 · devtools · HuggingFace Personal Copilot: The Bottleneck Is Your Codebase, Not Compute
- jun 27 · devtools · Llama 4 on Vercel's AI Model Gateway: Hosted Inference vs Self-Hosted vLLM
- jun 27 · devtools · Vercel's Pre-Generate SSL Flow Stages Certs Before DNS Cutover
- jun 27 · models · Error-Conditioned Neural Solvers vs Iterative Refinement: When Does Learned Correction Win?
- jun 27 · models · Vision-Language Models Move Past Object Detection: The MLLM Perception Shift
- jun 27 · models · Can Autoregressive Boltzmann Generators Replace MCMC in Simulation?
- jun 27 · infra · Multimodal Knowledge Graph RAG vs Vector RAG: What MKG-RAG-Bench Shows
- jun 27 · devtools · Vercel Sandbox CLI: Reproducible Agent Runs Belong in CI, Not the Dashboard
- jun 27 · infra · Vercel Observability Now Tracks Redirects and Rewrites Beside Function Errors
- jun 27 · oss · Akrites Defends Open Source Code, Not in Court: What It Can and Can't Do
- jun 27 · infra · Cloudflare Workflows Saga Rollbacks: Compensating Actions in Serverless Orchestration