groundy

all articles

  1. jun 04 security Activation Steering Was Sold as LLM Control. New Work Makes It an Attack Surface
  2. jun 04 culture Can Teaching Logical Fallacies Inoculate People Against AI Misinformation?
  3. jun 04 devtools Vercel Ships Experimental Native CLI Binaries to Cut the Node Startup Tax
  4. jun 04 security Catching LLM Agents Leaking Credentials From Their Own Activations
  5. jun 04 policy Refusal Steering Targets Individual Experts in MoE LLMs
  6. jun 04 infra Putting a Datacenter V100 in a Gaming PC: The Local LLM Math
  7. jun 04 devtools Vercel Rebuilds Its Marketplace CLI for Agents Instead of Humans
  8. jun 04 security The 2026 npm Attacks Proved AI Coding Assistants Are a Supply-Chain Target
  9. jun 03 security ChatGPT's New Lockdown Mode Borrows Apple's Name for a Prompt-Injection Kill Switch
  10. jun 03 agents When MCP Tool Descriptions Don't Match the Code, Agents Trust the Lie
  11. jun 03 security Students Are Prompt-Injecting AI Graders to Score Full Marks
  12. jun 03 devtools Malicious npm Packages Hit Red Hat's Published JavaScript Clients
  13. jun 03 policy Stacked Org Policies in LLM Chatbots Break Where Rules Collide
  14. jun 03 security Removing an LLM Backdoor Post-Training Without the Poisoned Data
  15. jun 03 models Which Layer Detects LLM Hallucinations Best? The Case Against Fixed-Layer Probes
  16. jun 03 policy Why Fine-Tuning Strips Safety Alignment From Open-Weight LLMs
  17. jun 03 security Stored Prompt Injection Now Persists Across AI Agent Sessions
  18. jun 03 industry MiniMax M3 Bundles 1M Context and Native Multimodal Into One Open-Weight Model
  19. jun 03 security LLM Data Poisoning Survives the Data-Cleaning Defenses Built to Stop It
  20. jun 03 devtools OpenAI Upgrades Codex Right as Teams Weigh Leaving Claude Code
  21. jun 03 policy Game Theory vs RLHF: Modeling LLM Safety Alignment as a Non-Cooperative Game
  22. jun 03 infra Cost-Aware RAG Routing: When Deeper Retrieval Stops Paying Off
  23. jun 02 devtools GitHub Copilot Moves to a Platform App, Decoupling From the Editor
  24. jun 02 infra Using Your Nvidia GPU's VRAM as Linux Swap: Where the NBD Hack Breaks Down
  25. jun 02 security Why OpenAI Bets on Instruction Hierarchy to Stop Prompt Injection
  26. jun 02 policy Explainability Mandates Leak Graph Models to Their Attackers
  27. jun 02 security Stopping Multi-Turn LLM Jailbreaks Without Retraining the Model
  28. jun 02 security African Languages Are a Jailbreak Blind Spot for English-Tuned LLM Safety
  29. jun 02 devtools How a VSCode Bug Let One Click Steal Your GitHub Token
  30. jun 02 agents When an AI Agent Causes a Loss, Who Files the Insurance Claim?
  31. jun 02 models Cross-Domain RL Training Degrades Capabilities. CARE-RL Reweights to Fix It
  32. jun 02 agents When Agent Skill Libraries Scale, Dependency-Aware Retrieval Beats Flat Search
  33. jun 02 policy Evolutionary Search Finds LLM Jailbreak Classes That Static Red-Teaming Misses
  34. jun 02 security Poisoning Open-Source LLM Merges: One Bad Checkpoint Hijacks the Result
  35. jun 02 agents Can Instruction-Tuned Retrievers Fix Agentic Search's Retrieval Gap?
  36. jun 02 models LLM Watermarking Without Quality Loss: The Non-Distortionary Approach
  37. jun 02 security An Autonomous Research Agent Now Discovers SOTA LLM Jailbreak Attacks
  38. jun 02 devtools GitHub Copilot and Productivity: What an Observational Dose-Response Study Measures
  39. jun 02 policy Why AI Red-Teaming Rediscovers the Same Jailbreaks and Misses the Rest
  40. jun 02 industry Morningstar's $780B SpaceX Mark Undercuts the IPO Target by Half
  41. jun 02 security Malware Can Prompt-Inject the AI Agent Reverse-Engineering It
  42. jun 02 agents Bandit-Based Prompt Optimization Targets Multi-Agent Systems Like CrewAI and AutoGen
  43. jun 02 security CVE-Factory Turns Published CVEs Into Security Agent Training Data. A 32B Model Beats Claude 4.5 Sonnet.
  44. jun 01 oss Open-Source Workspace Suite tinycld Takes On Google and Nextcloud
  45. jun 01 oss DARPA's AIxCC Postmortem: What Autonomous Cyber Reasoning Systems Got Right and Wrong
  46. jun 01 oss An Open-Source Home Camera That Encrypts End-to-End Instead of Trusting Ring
  47. jun 01 policy LLMs Treat the Assistant Persona as Privileged. That's a Safety Gap
  48. jun 01 industry Vercel's Grep Buy Signals Code Search Is Now AI Agent Infrastructure
  49. jun 01 security LLM Reasoning Traces Leak the Private Data They're Told to Hide
  50. jun 01 models Treating LLM Agent Memory as a Database: The VikingMem Approach
  51. jun 01 oss Your Open-Source License Won't Stop Someone Phishing With Your Code
  52. jun 01 models Can a Language Model Work Without a Neural Network? A New arXiv Paper Says Yes
  53. jun 01 models Can Code-Generating LLMs Do Engineering Math? FEM-Bench Tests Them
  54. jun 01 policy Newer LLMs Aren't Always Safer: Adversarial Attacks Transfer Across Model Generations
  55. jun 01 models Unlearning Isn't Deletion: arXiv 2505.16831 Shows Machine Unlearning in LLMs Is Reversible
  56. jun 01 security Video Jailbreaks Hit Multimodal LLMs by Splitting Payloads Across Clips
  57. jun 01 industry OMB's Power to Cancel Any Grant at Any Time Shifts Risk Onto University AI Labs
  58. jun 01 devtools JetBrains Ships Codex Natively, Making Its IDE the Multi-Vendor AI Surface
  59. jun 01 industry Anthropic's $965B Private Mark Now Faces a Confidential S-1
  60. may 31 models Why LLMs Fail at Spatial Reasoning When Planning Navigation
  61. may 31 culture Ranking LLMs Side by Side Makes Their Dialect Bias Worse
  62. may 31 security Vercel AI SDK CVE-2025-48985: Input Validation Bypass Hits LLM App Builders
  63. may 31 policy Can Synthetic Preference Data Keep RLHF Private Without Wrecking Alignment?
  64. may 31 agents What Breaks When Claude Code Writes Production Code: A New Failure Catalog
  65. may 31 security Hijacking AI Agent Memory: One Conversation Can Plant a Persistent Trojan
  66. may 31 security Why Attack Success Rate Misleads LLM Jailbreak Benchmarks
  67. may 31 agents More Agents, Worse Results: Why Multi-Agent LLM Teams Hold Experts Back
  68. may 31 devtools Transformers.js v4 Moves Transformer Inference Into the Browser
  69. may 31 industry OpenRouter's $113M Series B Bets Routing Beats Picking a Single LLM
  70. may 31 models Does Giving AI Agents More Skills Help? A Controlled SkillsBench Study
  71. may 31 policy FTC's May 11 Take It Down Act Letters Set May 19 Deadline: 48-Hour Removal, $53,088 Per Violation
  72. may 30 culture Replacing Workers With AI Erodes the Skills You'll Need Later
  73. may 30 culture Does AI Have 6.5 Years Before It Breaches a Planetary Boundary?
  74. may 30 policy Can a Mental Health Support Chatbot Be Safe If It Learns From Forums?
  75. may 30 policy Dataset Watermarks Fail to Trace Fine-Tuned AI Image Models, New Benchmark Finds
  76. may 30 culture Can LLM Agents Realistically Fake Reactions to Online News?
  77. may 30 security Job Seekers Are Prompt-Injecting AI Resume Screeners. New Study Measures the Hit Rate
  78. may 30 security Why Audio Jailbreaks Slip Past the Safety Training Built for Text LLMs
  79. may 30 models Can an LLM Peer-Review Your Paper? A New Behavior Benchmark
  80. may 30 security LoRA Adapter Backdoors Generalize Beyond Their Trigger Tokens
  81. may 30 infra Cloudflare Turnstile Now Fingerprints WebGL: The Privacy CAPTCHA Tradeoff
  82. may 30 models Anthropic Scaled Sparse Autoencoders to Claude 3 Sonnet. Interpretability Now Costs Compute
  83. may 29 oss An Open-Source 80386 Rebuilt Around Intel's Original Microcode
  84. may 29 industry Valve's $200 Steam Deck Price Hike Concedes the Handheld PC Margin Squeeze
  85. may 28 policy Can LLM Personas Replace Human Survey Respondents? New arXiv Paper Tests Decision Alignment
  86. may 28 culture Wikipedia's Foundation Is Running Big Tech's Anti-Labor Playbook, an Editor Argues
  87. may 28 security Three Labs Concede Browser Agents Cannot Stop Prompt Injection
  88. may 28 agents Multi-Agent LLM Coordination: Why Attention Steering Beats Full Broadcast
  89. may 28 models Tracing Why LLM Agent Memory Fails: A Method for Attributing Errors
  90. may 28 security Vercel Firewall Now Blocks SAMLStorm. Can an Edge WAF Fix a SAML Signature Flaw?
  91. may 28 models Persona Prompts Change Who an LLM Recommends as an Expert
  92. may 28 policy Distributed Training Breaks the Compute Thresholds Behind AI Regulation
  93. may 28 agents DataClawBench: AI Agents Fail at Exploratory Financial Analysis Across 492 Tasks
  94. may 28 infra The Viral AWS Support Post Is a Warning About Cloud Escalation Paths
  95. may 28 policy A Single RLHF Pass Can't Align an LLM to Every Online Community
  96. may 28 oss Models.dev Turns Scattered AI Model Pricing Into One Open Database
  97. may 28 policy RLHF Can Be Exploited to Optimize the Biases It Was Built to Suppress
  98. may 28 agents Agentic RAG Has a Credit-Assignment Problem That Subgoaling Tries to Fix
  99. may 27 oss Frontier AI Has Broken Open CTFs: Why Claude Code Now One-Shots Medium Pwn Challenges
  100. may 27 policy Selective Geometry Attacks Bypass LLM Safety Alignment, New arXiv Paper Reports