Topic

#ai-safety

Five articles exploring AI safety. Expert insights and analysis from our editorial team.


Articles

Security

Jailbreak Scaling Laws: Why Reasoning Models Are Now the Cheapest Attack Vector Against Other LLMs

Two converging studies show large reasoning models (LRMs) achieve 97% autonomous jailbreak success with exponentially scaling attack effectiveness. Here's what that means for production deployments.

6 min read
Ethics, Policy & Safety

Detecting AI Content in 2026: The Arms Race Nobody Is Winning

AI content detectors claim 99% accuracy but consistently fail in real-world conditions—flagging innocent students while missing actual AI use. Here's why the arms race has no winner, and what educators and publishers should do instead.

9 min read
Ethics, Policy & Safety

Don't Trust the Salt: How Non-English Prompts Break LLM Guardrails

AI safety guardrails are built primarily in English. Research shows they can be trivially bypassed using other languages, exposing critical vulnerabilities in global AI deployment.

10 min read
Ethics, Policy & Safety

Google's AI Overviews Can Scam You: Here's How to Stay Safe

Google's AI-generated search summaries are being exploited by scammers to surface malicious content directly in search results. Learn how these scams work and the protective measures you can take.

8 min read
Ethics, Policy & Safety

How Much Autonomy Should AI Agents Have? A Framework for Trust

As AI agents gain real-world capabilities—browsing, coding, purchasing—the question of how much autonomy to grant these systems becomes critical. This article proposes the VERIFIED framework for determining appropriate trust levels.

12 min read