groundy

security

79 articles

Top in security


  1. jun 05 security Stronger Safety Alignment Made LLMs Easier to Jailbreak, Not Harder
  2. jun 05 security SAML Signature Bypass Is Back: Inside the SAMLStorm Vulnerability Class
  3. jun 05 security SAMLStorm: The SAML Signature Bug That Forges Valid SSO Logins
  4. jun 05 security Vercel's Flags SDK Exposed Feature-Flag Definitions via CVE-2025-46332
  5. jun 04 security Jailbreak Suffixes Hit Harder at Specific Token Positions, New GCG Variant Shows
  6. jun 04 security OpenAI Adds Lockdown Mode to ChatGPT, Shifting Prompt-Injection Risk to Users
  7. jun 04 security Activation Steering Was Sold as LLM Control. New Work Makes It an Attack Surface
  8. jun 04 security Catching LLM Agents Leaking Credentials From Their Own Activations
  9. jun 04 security The 2026 npm Attacks Proved AI Coding Assistants Are a Supply-Chain Target
  10. jun 03 security ChatGPT's New Lockdown Mode Borrows Apple's Name for a Prompt-Injection Kill Switch
  11. jun 03 security Students Are Prompt-Injecting AI Graders to Score Full Marks
  12. jun 03 security Removing an LLM Backdoor Post-Training Without the Poisoned Data
  13. jun 03 security Stored Prompt Injection Now Persists Across AI Agent Sessions
  14. jun 03 security LLM Data Poisoning Survives the Data-Cleaning Defenses Built to Stop It
  15. jun 02 security Why OpenAI Bets on Instruction Hierarchy to Stop Prompt Injection
  16. jun 02 security Stopping Multi-Turn LLM Jailbreaks Without Retraining the Model
  17. jun 02 security African Languages Are a Jailbreak Blind Spot for English-Tuned LLM Safety
  18. jun 02 security Poisoning Open-Source LLM Merges: One Bad Checkpoint Hijacks the Result
  19. jun 02 security An Autonomous Research Agent Now Discovers SOTA LLM Jailbreak Attacks
  20. jun 02 security Malware Can Prompt-Inject the AI Agent Reverse-Engineering It
  21. jun 02 security CVE-Factory Turns Published CVEs Into Security Agent Training Data. A 32B Model Beats Claude 4.5 Sonnet.
  22. jun 01 security LLM Reasoning Traces Leak the Private Data They're Told to Hide
  23. jun 01 security Video Jailbreaks Hit Multimodal LLMs by Splitting Payloads Across Clips
  24. may 31 security Vercel AI SDK CVE-2025-48985: Input Validation Bypass Hits LLM App Builders
  25. may 31 security Hijacking AI Agent Memory: One Conversation Can Plant a Persistent Trojan
  26. may 31 security Why Attack Success Rate Misleads LLM Jailbreak Benchmarks
  27. may 30 security Job Seekers Are Prompt-Injecting AI Resume Screeners. New Study Measures the Hit Rate
  28. may 30 security Why Audio Jailbreaks Slip Past the Safety Training Built for Text LLMs
  29. may 30 security LoRA Adapter Backdoors Generalize Beyond Their Trigger Tokens
  30. may 28 security Three Labs Concede Browser Agents Cannot Stop Prompt Injection
  31. may 28 security Vercel Firewall Now Blocks SAMLStorm. Can an Edge WAF Fix a SAML Signature Flaw?
  32. may 26 security Vercel Could Block React2Shell at the Edge. Its Next 13 CVEs Had No Shortcut.
  33. may 26 security OpenAI Adds a GPT-5 System Card Addendum on Sensitive Conversations
  34. may 26 security MCP Tool Description Poisoning: New Benchmark Shows Agents Trust Manuals That Lie
  35. may 26 security OpenAI's New Safety Bug Bounty Pays Researchers for Jailbreaks and Policy Bypasses
  36. may 26 security Axios npm Compromise Forces Vercel Into Platform-Level Remediation
  37. may 26 security Next.js Dev Server CVE-2025-48068: Any Web Page Could Read Your Source Files
  38. may 25 security Apple Names Claude in CVE Credit Line, Setting Vendor Attribution Precedent
  39. may 24 security CISA's Internal Data Leak Tests the Disclosure Standards It Sets for Others
  40. may 24 security TanStack npm Attack: When OIDC Trusted Publishing Becomes the Attack Vector
  41. may 24 security Nx s1ngularity Attackers Used Local Claude Code and Gemini CLI to Steal Developer Tokens
  42. may 23 security OpenAI Ships Lockdown Mode and Elevated Risk Labels for ChatGPT Sessions
  43. may 22 security AI Jailbreaks Are Now a Reasoning Problem, Not a Prompt Problem
  44. may 22 security Jailbreak Defense Now Lives in Model Weights, Not in Prompt Filters
  45. may 22 security Vercel Blocks Deploys With Vulnerable next-mdx-remote by Default: Platform Mitigation Outpaces the CVE Cycle
  46. may 22 security Vercel's Next.js Middleware Bypass Postmortem: What the Fix Reveals About Edge Runtime Auth
  47. may 22 security OpenAI's New Agent Defense Post Concedes Prompt Injection Is Architectural, Not Patchable
  48. may 22 security When Stronger Backdoor Triggers Backfire: An arXiv Theory Paper Inverts a Core Defense Assumption
  49. may 17 security DPrivBench: LLMs Score 99.5% on Textbook DP but Collapse on Advanced Reasoning
  50. may 17 security Catching Graph Neural Net Backdoors by Influence, Not Pattern

Security coverage here starts from a premise other beats elide: the AI stack is not a new attack surface so much as an old one wearing fresh abstractions. Inference servers, agent frameworks, and notebook runtimes ship with the same deserialization, SSRF, and path-traversal classes that web infrastructure spent two decades learning to harden, only now wired directly to credential stores, tool execution, and untrusted model output. The interesting question is rarely whether a given framework is exploitable; it is which inherited assumption finally broke under agentic load.

We track three structural tensions. First, the collapse of the localhost trust model as agent protocols carry developer-grade defaults into multi-tenant deployments. Second, supply-chain compromise that bypasses scanner coverage by hiding in places package auditors do not look, from model repositories to preinstall hooks to registry metadata. Third, the shrinking window between coordinated disclosure and in-the-wild exploitation, now increasingly measured in hours, which exposes how much of the ecosystem still treats patch cadence as a quarterly concern.

The frame is comparative and skeptical rather than alarmist. Vendor lockdown modes, model-level safety training, and detector benchmarks all get evaluated against the same standard: does this address a structural property of the system, or does it relocate the failure mode somewhere harder to audit? Jailbreak research, disclosure-policy enforcement, and institutional credential hygiene belong on the same beat because they fail for related reasons. The work is to name those reasons in a way that still reads true after the specific advisories have rolled off the front page.