Can Deontic Policy Rules Govern an AI Agent at Runtime?

Yes, in principle. A paper submitted to arXiv in June 2026 (arXiv:2606.19464) shows that obligations, permissions, and prohibitions can be encoded as deontic policies and enforced by a logic engine running outside the LLM while an agent acts. The harder question is whether teams shipping agents will pay the cost of moving safety out of the prompt and into an external layer they have to build, run, and audit.

What does “deontic” mean for an agent?

Deontic governance treats an agent’s actions as carrying duties, not just gates. Where an access-control rule says “you may read this file,” a deontic policy adds “and because you did, you owe a follow-up obligation” (arXiv:2606.19464).

The name comes from deontic logic, the formal study of obligation, permission, and prohibition, founded in von Wright’s 1951 system (Wikipedia). Where classical deontic logic formalizes those three operators, the agentic-governance variant adds lifecycle and conflict machinery on top: when an obligation fires, when it is fulfilled or expires, when it can be waived, and which rule wins when two disagree (arXiv:2606.19464; community write-up).

The paper’s worked example makes the distinction concrete: an agent that takes a sensitive action may be obliged to notify the CISO afterward (arXiv:2606.19464). Most policy engines have no slot for a follow-up duty like that.

What can Rego, Cedar, and XACML not express?

As of mid-2026, the policy engines most teams already run (Rego, Cedar, and XACML) answer one question: is this action permitted or prohibited. According to the paper, that permit/prohibit subset leaves out four things enterprises actually need: obligation lifecycle management, meta-policy conflict resolution, dispensations that waive obligations in specific circumstances, and ontological reasoning over domain class hierarchies (for example, recognizing a pediatric-oncology record as a subclass of PHI).

That last gap is quietly important. A rule keyed on “PHI” silently fails to fire when the resource is labeled “pediatric-oncology record,” unless the engine understands the subclass relationship. Rego and Cedar treat data as flat structures; the deontic approach leans on an OWL ontology to reason over types.
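The subclass problem can be illustrated with a minimal sketch. The class names and the `SUBCLASS_OF` map below are hypothetical stand-ins for an OWL hierarchy; a real deployment would delegate this to an ontology reasoner instead of a hand-written dictionary.

```python
# Hypothetical class hierarchy (child -> parent); names are illustrative only.
SUBCLASS_OF = {
    "PediatricOncologyRecord": "OncologyRecord",
    "OncologyRecord": "PHI",
    "PHI": "SensitiveData",
}


def ancestors(cls):
    """Return cls followed by every superclass reachable through SUBCLASS_OF."""
    seen = []
    while cls is not None and cls not in seen:  # the 'seen' check guards against cycles
        seen.append(cls)
        cls = SUBCLASS_OF.get(cls)
    return seen


def rule_applies(rule_class, resource_class):
    """A rule keyed on rule_class fires for that class and all of its subclasses."""
    return rule_class in ancestors(resource_class)


# Flat string matching misses the record; hierarchy-aware matching catches it.
flat_match = "PHI" == "PediatricOncologyRecord"
hierarchical_match = rule_applies("PHI", "PediatricOncologyRecord")
```

Here `flat_match` is false while `hierarchical_match` is true, which is the silent failure the paragraph above describes.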

Capability                         Rego / Cedar / XACML    Deontic (AgenticRei)
Permission, prohibition            ✓                       ✓
Obligation lifecycle               ✗                       ✓
Dispensation                       ✗                       ✓
Meta-policy conflict resolution    ✗                       ✓
Ontological class reasoning        ✗                       ✓

Where should agent policy be enforced?

At the action boundary, not in the model’s context window. The central claim of the paper is that putting rules in a system prompt and hoping the model complies is structurally weak once an agent chains actions across organizational boundaries.

This is not a fringe position. As of mid-2026, several vendor systems converge on the same boundary: Wallarm’s A2AS, Microsoft’s Agent Governance Toolkit, and Cisco’s MCP policy-enforcement gateway in Secure Access all enforce policy at the point of action execution, rather than in the LLM’s context window (arXiv:2606.19464). The paper frames this convergence as validation of the “where” of governance: the right place to enforce policy is at the action boundary.

The structural argument holds regardless of which vendors adopt it. A rule sitting in a system prompt depends on the model choosing to follow it; a rule sitting in an external policy layer does not. Moving evaluation outside the model makes the model’s compliance irrelevant: the action either passes the external check or it does not.

How does AgenticRei enforce policy outside the LLM?

AgenticRei is the paper’s concrete realization of that boundary, authored by Tim Finin and to appear at the 2026 IEEE Symposium on Agentic Services. The policy itself is written in a deontic language built on the Rei framework and expressed as OWL, evaluated by a high-performance logic engine that runs entirely outside the LLM. The same pipeline governs both tool invocations and agent-to-agent messages, not just tool calls.

The verdict that comes back is richer than allow/deny. It carries the obligations the action triggered, dispensations that modify standing duties, and the priority resolution that decided between conflicting policies.
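One way to picture such a verdict is the sketch below. It is not AgenticRei’s data model: the `Rule` and `Verdict` shapes, the numeric priorities, and the tie-break (prohibit beats permit) are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Rule:
    name: str
    effect: str              # "permit" or "prohibit"
    priority: int            # assumed meta-policy: higher priority wins
    obligations: tuple = ()  # duties triggered when this rule permits the action


@dataclass
class Verdict:
    allowed: bool
    decided_by: str                                 # which rule won the conflict
    obligations: list = field(default_factory=list)  # duties now owed
    dispensed: list = field(default_factory=list)    # duties waived, kept for the record


def resolve(matching, dispensations=frozenset()):
    """Pick the winning rule, then split its duties into owed and waived."""
    # Assumed tie-break: at equal priority, a prohibition beats a permission.
    winner = max(matching, key=lambda r: (r.priority, r.effect == "prohibit"))
    if winner.effect == "prohibit":
        return Verdict(allowed=False, decided_by=winner.name)
    owed = [d for d in winner.obligations if d not in dispensations]
    waived = [d for d in winner.obligations if d in dispensations]
    return Verdict(True, winner.name, owed, waived)


rules = [
    Rule("block-prod-installs", "prohibit", priority=1),
    Rule("sre-may-install", "permit", priority=5,
         obligations=("notify_ciso", "log_install")),
]
verdict = resolve(rules, dispensations={"notify_ciso"})
```

In this run the higher-priority permission wins, `log_install` is owed, and `notify_ciso` is recorded as dispensed instead of disappearing from the result.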

What does an auditor actually get?

A deterministic, replayable record of every decision and the obligations it triggered. That is the property that matters for compliance, and it is what a joint advisory from CISA, NSA, and allied national cybersecurity agencies independently asks for: per-invocation policy evaluation rather than a once-at-startup check, plus a durable record of what the agent did and under whose authority.

The advisory identifies non-reproducible decision chains as the technical root of diffuse accountability in production financial-services deployments. An agent that “decided” something inside an opaque LLM, with no external record, is exactly the kind of chain that cannot be reproduced. External evaluation produces a log the auditor can read.
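What “replayable” means can be sketched in a few lines, assuming the evaluator is a pure function of the request. The `evaluate` rules, the `POLICY_VERSION` string, and the log format here are hypothetical; the point is only that an auditor can re-run every logged request and compare the result with the recorded verdict.

```python
import json

POLICY_VERSION = "2026-06-01"  # hypothetical policy identifier


def evaluate(request: dict) -> dict:
    """Pure function of the request: the same input always yields the same verdict."""
    sensitive = request["resource"].startswith("phi/")
    return {
        "policy_version": POLICY_VERSION,
        "allowed": request["action"] != "delete",
        "obligations": ["log_access"] if sensitive else [],
    }


def record(log: list, request: dict) -> dict:
    """Evaluate per invocation and append the request and verdict to the log."""
    verdict = evaluate(request)
    log.append(json.dumps({"request": request, "verdict": verdict}, sort_keys=True))
    return verdict


def replay(log: list) -> bool:
    """Re-evaluate every logged request and check it reproduces the logged verdict."""
    for line in log:
        entry = json.loads(line)
        if evaluate(entry["request"]) != entry["verdict"]:
            return False
    return True


audit_log = []
record(audit_log, {"subject": "agent-7", "action": "read", "resource": "phi/chart-12"})
record(audit_log, {"subject": "agent-7", "action": "delete", "resource": "tmp/cache"})
replay_ok = replay(audit_log)
```

A decision made inside the model offers no equivalent of `replay`, because there is no deterministic function to re-run.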

The commercial pressure is explicit. According to the Kaptein companion paper (arXiv:2603.16586), a 2026 KPMG survey found that 75% of large-enterprise leaders cite security, compliance, and auditability as the most critical requirements for agent deployment, and the EU AI Act’s high-risk AI provisions take effect in August 2026. An enforceable, inspectable policy log is the artifact those requirements point at.

What are the limits?

The paper demonstrates the approach through worked examples, not benchmarks. AgenticRei is a prototype: the community write-up concedes it reads more as a theoretical model than a production implementation, and the paper itself does not quantify what it costs to run ontology reasoning at every action boundary.

That runtime-cost question is the obvious one left open. Logic-engine evaluation per action is fine for a demo with a handful of policies; whether it holds when an agent fires many tool calls per second across a large ontology is not addressed.

Deontic logic also carries genuine paradoxes. The literature records Ross’s paradox, the Good Samaritan paradox, Chisholm’s paradox, and the gentle-murder paradox, and the field is described as one of the most controversial and least agreed-upon areas of logic. Any engineering system built on it has to navigate these without emitting self-contradictory verdicts.

Finally, AgenticRei should not be conflated with the Kaptein framework, arXiv:2603.16586, which formalizes compliance policies as deterministic functions over agent execution paths and argues that prompt instructions and static access control are merely weaker special cases. Kaptein’s work is conceptual: it outlines a reference implementation but reports no empirical experiments and ships no runtime. The two papers agree on the diagnosis; AgenticRei is the one that realizes it as a runnable prototype.

Is it worth building?

The case for moving governance out of the prompt and into an external layer is the stronger one, but not because the prompt is unreliable. It is stronger because an external policy log is the only artifact an auditor can check, and the regulatory clock is running: the EU AI Act’s high-risk provisions land in August 2026, and the joint CISA/NSA advisory already treats reproducible decision chains as table stakes.

The honest caveat is that AgenticRei proves the category is buildable, not that it is built. The runtime-cost question, the deontic paradoxes, and the gap between a prototype and a product all remain open. Teams that adopt the pattern now are betting that a policy engine they can audit is worth more than a prompt they can only hope the model follows. That bet is defensible. It is not yet proven.

Frequently Asked Questions

What does AgenticRei actually intercept when an agent acts?

It runs a three-step extract-evaluate-apply contract on every intercepted agent action: extract the action into a subject-action-resource triple, evaluate that triple against the RDFox policy engine, and apply the verdict that comes back. The extraction step is what makes the rest deterministic, because before any rule fires the action is already reduced to a three-part structure the ontology can reason over.
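The shape of that contract can be sketched as follows. Everything here is an assumption for illustration: the raw-call field names (`agent_id`, `tool`, `target`), the in-memory `evaluate` stand-in, and the `pending` list are hypothetical, and no real RDFox API is called.

```python
from typing import Callable, NamedTuple


class Triple(NamedTuple):
    subject: str
    action: str
    resource: str


def extract(raw_call: dict) -> Triple:
    """Reduce a raw tool call to (subject, action, resource). Field names are assumed."""
    return Triple(raw_call["agent_id"], raw_call["tool"], raw_call["target"])


def evaluate(triple: Triple) -> dict:
    """Stand-in for the external engine; a real deployment would query it here."""
    prohibited = {Triple("agent-7", "install_software", "prod-host-1")}
    if triple in prohibited:
        return {"allowed": False, "obligations": []}
    duties = ["notify_ciso"] if triple.action == "install_software" else []
    return {"allowed": True, "obligations": duties}


def apply(verdict: dict, run_tool: Callable[[], str], pending: list) -> str:
    """Run the tool only if allowed, and queue any duties the action triggered."""
    if not verdict["allowed"]:
        return "blocked"
    pending.extend(verdict["obligations"])
    return run_tool()


def govern(raw_call: dict, run_tool: Callable[[], str], pending: list) -> str:
    """Extract, evaluate, apply: the model's output never reaches the tool unchecked."""
    return apply(evaluate(extract(raw_call)), run_tool, pending)
```

A call such as `govern({"agent_id": "agent-9", "tool": "install_software", "target": "staging-host"}, lambda: "installed", pending)` runs the tool and leaves `notify_ciso` in `pending`; the same call from `agent-7` against `prod-host-1` returns `"blocked"` without invoking the tool.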

How does AgenticRei differ from Wallarm’s A2AS?

Both enforce at the action boundary, but A2AS’s own authors concede a ‘security reasoning drift’ limit: in-context, model-interpreted policies still produce partial compliance. AgenticRei sidesteps that by evaluating each action in an external logic engine rather than asking the model to interpret a codified rule. The trade is that A2AS ships as product today while AgenticRei is a demonstrated prototype.

Can a deontic obligation carry a hard time limit?

Yes. The paper’s worked obligations include a compliance officer who, on reading a regulated dataset, must log that access within 60 seconds, and a host-software install that must be followed by a notification to the CISO. The lifecycle machinery, tracking each duty from trigger through activation to fulfillment or timeout, is what lets the engine detect whether that 60-second window closed, not merely whether access was granted.
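A timed obligation of this kind can be modeled as a small state machine. The sketch below is a guess at the general shape, not the paper’s lifecycle model: the three states, the field names, and the rule that late fulfillment counts as expired are all assumptions.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class State(Enum):
    ACTIVE = "active"
    FULFILLED = "fulfilled"
    EXPIRED = "expired"


@dataclass
class Obligation:
    duty: str
    triggered_at: float                   # seconds, on any monotonic clock
    deadline_s: float                     # e.g. 60.0 for the access-logging duty
    fulfilled_at: Optional[float] = None  # set when the duty is discharged

    def state(self, now: float) -> State:
        due = self.triggered_at + self.deadline_s
        # Fulfillment only counts if it happened inside the window (assumed rule).
        if self.fulfilled_at is not None and self.fulfilled_at <= due:
            return State.FULFILLED
        if now > due:
            return State.EXPIRED
        return State.ACTIVE


on_time = Obligation("log_access", triggered_at=0.0, deadline_s=60.0, fulfilled_at=45.0)
too_late = Obligation("log_access", triggered_at=0.0, deadline_s=60.0, fulfilled_at=75.0)
open_duty = Obligation("log_access", triggered_at=0.0, deadline_s=60.0)
```

With these values, `on_time` is fulfilled, `too_late` is expired even though the log entry was eventually written, and `open_duty` is active at 30 seconds and expired at 61.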

What’s a concrete deontic-logic failure an engine must avoid?

Ross’s paradox: classical deontic logic lets ‘you ought to mail the letter’ entail ‘you ought to mail the letter or burn it,’ because obligation is closed under logical consequence and ‘mail the letter’ logically entails ‘mail the letter or burn it.’ A policy engine built naively on those axioms could emit duties that contradict the original intent. The meta-policy priority resolution layer is the mechanism meant to stop conflicting duties from producing such verdicts.
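The derivation is short enough to write out. In the sketch below, O is the obligation operator of standard deontic logic, p stands for “the letter is mailed,” and q for “the letter is burned”; the symbols are chosen here for illustration.

```latex
\begin{align*}
1.\quad & \vdash p \rightarrow (p \lor q)
        && \text{propositional tautology} \\
2.\quad & \vdash O p \rightarrow O(p \lor q)
        && \text{from 1, by the rule: if } \vdash A \rightarrow B \text{ then } \vdash O A \rightarrow O B \\
3.\quad & O p
        && \text{premise: you ought to mail the letter} \\
4.\quad & O(p \lor q)
        && \text{from 2 and 3, modus ponens}
\end{align*}
```

Line 4 is formally valid, yet burning the letter would satisfy the disjunction while violating the original duty, which is why the result reads as a paradox.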

What does a dispensation do that deleting a rule does not?

It waives a standing obligation under specific conditions while preserving the audit trail that the obligation existed and was lifted. If a production-host install is normally obliged to notify the CISO, a dispensation could suspend that duty during a declared incident-response window without erasing the record. Rego and Cedar have no native construct for this; rewriting or deleting the rule would lose the trail of why it stopped applying.
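The difference from deletion can be sketched with an append-only trail. This is an illustrative model only: the `StandingObligation` class, its field names, and the incident label are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass, field


@dataclass
class StandingObligation:
    duty: str
    trail: list = field(default_factory=list)  # append-only audit trail

    def dispense(self, reason: str, start: float, end: float) -> None:
        """Waive the duty for [start, end) and record why; nothing is deleted."""
        self.trail.append(
            {"event": "dispensed", "reason": reason, "start": start, "end": end}
        )

    def is_owed(self, at: float) -> bool:
        """The duty is owed unless some recorded dispensation covers this moment."""
        return not any(
            e["start"] <= at < e["end"]
            for e in self.trail
            if e["event"] == "dispensed"
        )


notify = StandingObligation("notify_ciso_on_prod_install")
# Hypothetical incident window, expressed as timestamps in seconds.
notify.dispense("declared incident INC-EXAMPLE", start=100.0, end=200.0)
```

Inside the window `notify.is_owed(150.0)` is false, outside it the duty applies again, and `notify.trail` still shows that the obligation existed and why it was lifted.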