Topic

#safety

2 articles exploring safety. Expert insights and analysis from our editorial team.

Articles

Symbolic Guardrails for AI Agents: Hard Safety Guarantees Without Crippling Capability

A new paper shows symbolic guardrails can push agent safety to 100% in regulated domains without capability loss, but only for 74% of real-world policies.

April 20, 2026 · 6 min read

Ethics, Policy & Safety

Constitutional AI: Teaching Models to Self-Correct Before They Act

Anthropic's Constitutional AI trains language models to critique and revise their own outputs using principles rather than human labels, but questions remain about whether this yields genuine safety gains or merely sophisticated filtering.

February 14, 2026 · 9 min read
