#ai-safety
Five articles exploring AI safety. Expert insights and analysis from our editorial team.
Articles
Jailbreak Scaling Laws: Why Reasoning Models Are Now the Cheapest Attack Vector Against Other LLMs
Two converging studies show large reasoning models (LRMs) achieve 97% autonomous jailbreak success, with attack effectiveness scaling exponentially. Here's what that means for production deployments.
Detecting AI Content in 2026: The Arms Race Nobody Is Winning
AI content detectors claim 99% accuracy but consistently fail in real-world conditions, flagging innocent students while missing actual AI use. Here's why the arms race has no winner and what educators and publishers should do instead.
Don't Trust the Salt: How Non-English Prompts Break LLM Guardrails
AI safety guardrails are built primarily in English. Research shows they can be trivially bypassed using other languages, exposing critical vulnerabilities in global AI deployment.
Google's AI Overviews Can Scam You: Here's How to Stay Safe
Google's AI-generated search summaries are being exploited by scammers to surface malicious content directly in search results. Learn how these scams work and what protective measures you can take.
How Much Autonomy Should AI Agents Have? A Framework for Trust
As AI agents gain real-world capabilities (browsing, coding, purchasing), the question of how much autonomy to grant them becomes critical. This article proposes the VERIFIED framework for determining appropriate trust levels.