Groundy — independent coverage of developer tools, infrastructure, and platforms

  1. jun 05 security Stronger Safety Alignment Made LLMs Easier to Jailbreak, Not Harder
  2. jun 05 security SAML Signature Bypass Is Back: Inside the SAMLStorm Vulnerability Class
  3. jun 05 policy When LLM Safety Lives at Inference, Not Training: A Certification Gap
  4. jun 05 culture Do LLMs Understand Idioms in Low-Resource Languages?
  5. jun 05 infra Does CUDA Tile Match Hand-Tuned Kernels on Hopper and Blackwell?
  6. jun 05 security SAMLStorm: The SAML Signature Bug That Forges Valid SSO Logins
  7. jun 05 models MiniMax M3 Bets on Sparse Attention for 1M Context. Does the Math Hold?
  8. jun 05 models Can One Model Handle Every CAD Task? UniCAD Tests It
  9. jun 05 models Do Foundation Models Actually Learn Relational Structure In-Context?
  10. jun 05 models Can LLMs Write Better Research Paper Titles Than Authors?
  11. jun 05 models Does Information-Theoretic Example Selection Beat kNN for In-Context Learning?
  12. jun 05 infra Pod-Level Remote Attestation in Kubernetes: Confidential Workloads on dstack
  13. jun 05 models Do Concept Bottleneck Model Benchmarks Measure Interpretability or Dataset Bias?
  14. jun 05 agents Cascading Hallucination in Agentic RAG: When One Bad Retrieval Poisons the Chain
  15. jun 05 security Vercel's Flags SDK Exposed Feature-Flag Definitions via CVE-2025-46332
  16. jun 05 models Continuous Bit-Width Quantization vs Fixed INT4: Does LiftQuant Beat Discrete?
  17. jun 04 models Federated Learning for Industrial IoT Anomaly Detection: The Data-Locality Tradeoff
  18. jun 04 infra Generating GPU Kernels for Moore Threads Silicon: Can LLMs Break CUDA Lock-In?
  19. jun 04 devtools Alibaba's Open Code Review Moves AI Review Into the CLI, Not the PR
  20. jun 04 infra Microsoft's Azure Linux Goes General-Purpose: The Container Base-Image Play
  21. jun 04 models Reading Failed LLM Reasoning Traces Won't Tell You Which Ones RL Can Fix
  22. jun 04 agents Can AI Agents Build Other Agents? The Meta-Agent Challenge Says Mostly Not Yet
  23. jun 04 models Can You Stitch Two Foundation Models Together Without Retraining?
  24. jun 04 infra Cloudflare Acquires VoidZero, the Company Behind Vite's Rust Toolchain
  25. jun 04 security Jailbreak Suffixes Hit Harder at Specific Token Positions, New GCG Variant Shows
  26. jun 04 policy When Should an LLM Forget You? A Benchmark for Deciding What Memory to Drop
  27. jun 04 security OpenAI Adds Lockdown Mode to ChatGPT, Shifting Prompt-Injection Risk to Users
  28. jun 04 policy When RL Training Rewards Capability-Seeking: A New Alignment Risk
  29. jun 04 models Do Reasoning LLMs Waste Tokens? OckBench Tries to Measure It
  30. jun 04 security Activation Steering Was Sold as LLM Control. New Work Makes It an Attack Surface
  31. jun 04 culture Can Teaching Logical Fallacies Inoculate People Against AI Misinformation?
  32. jun 04 devtools Vercel Ships Experimental Native CLI Binaries to Cut the Node Startup Tax
  33. jun 04 security Catching LLM Agents Leaking Credentials From Their Own Activations
  34. jun 04 policy Refusal Steering Targets Individual Experts in MoE LLMs
  35. jun 04 infra Putting a Datacenter V100 in a Gaming PC: The Local LLM Math
  36. jun 04 devtools Vercel Rebuilds Its Marketplace CLI for Agents Instead of Humans
  37. jun 04 security The 2026 npm Attacks Proved AI Coding Assistants Are a Supply-Chain Target
  38. jun 03 security ChatGPT's New Lockdown Mode Borrows Apple's Name for a Prompt-Injection Kill Switch
  39. jun 03 agents When MCP Tool Descriptions Don't Match the Code, Agents Trust the Lie
  40. jun 03 security Students Are Prompt-Injecting AI Graders to Score Full Marks
  41. jun 03 devtools Malicious npm Packages Hit Red Hat's Published JavaScript Clients
  42. jun 03 policy Stacked Org Policies in LLM Chatbots Break Where Rules Collide
  43. jun 03 security Removing an LLM Backdoor Post-Training Without the Poisoned Data
  44. jun 03 models Which Layer Detects LLM Hallucinations Best? The Case Against Fixed-Layer Probes
  45. jun 03 policy Why Fine-Tuning Strips Safety Alignment From Open-Weight LLMs
  46. jun 03 security Stored Prompt Injection Now Persists Across AI Agent Sessions
  47. jun 03 industry MiniMax M3 Bundles 1M Context and Native Multimodal Into One Open-Weight Model
  48. jun 03 security LLM Data Poisoning Survives the Data-Cleaning Defenses Built to Stop It
  49. jun 03 devtools OpenAI Upgrades Codex Right as Teams Weigh Leaving Claude Code
  50. jun 03 policy Game Theory vs RLHF: Modeling LLM Safety Alignment as a Non-Cooperative Game