Data Safety Policies for AI Agents: Controlling What an Agent Can Leak

Most agent-safety work asks whether the model will refuse a harmful request. A paper submitted to arXiv on June 4, 2026 asks a different question: even if the model generates a perfectly valid SQL query, who checks whether that query is allowed to combine the data it touches? Charlie Summers’ “Data Flow Control: Data Safety Policies for AI Agents” introduces a framework for enforcing data-safety policies at the query layer, outside the model entirely.

Correctness is not safety

The paper’s central observation is concise enough to quote directly: a query may be semantically valid yet violate regulatory, privacy, or business constraints governing how data may be combined and released. An agent-generated JOIN that merges patient records with billing data and returns the result to a chat endpoint is “correct” by the database’s own definition. It is also a compliance incident.

The authors formalize this gap as the difference between query correctness (the DBMS optimizer’s concern) and data safety (a policy concern that the DBMS currently ignores). Current agent-safety stacks, which rely on prompt-level guardrails, output filtering, or model alignment, treat data leakage as a probabilistic problem: the model probably won’t exfiltrate sensitive data, and if it does, you hope the output filter catches it. The paper (arXiv:2606.05679) argues this is the wrong layer of abstraction.

How Data Flow Control works

DFC introduces two mechanisms. First, a declarative policy language that specifies where data is allowed to flow: which tables can be joined, which columns can appear in which query outputs, and which aggregate predicates must hold over the result set. Second, a formal enforcement layer built on provenance monomials, algebraic expressions that track how individual data tuples contribute to a query result.
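To make the idea of a declarative flow policy concrete, here is a minimal Python sketch. It is not the paper's policy language, whose syntax this article does not reproduce. The `FlowPolicy` class, the `violated_policies` helper, and the table and column names are all hypothetical, and the sketch covers only one kind of rule: columns that must not appear together in a single query output.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List


@dataclass(frozen=True)
class FlowPolicy:
    """Hypothetical policy record: these qualified columns must not co-occur."""
    name: str
    forbidden_together: FrozenSet[str]


def violated_policies(output_columns: Iterable[str],
                      policies: List[FlowPolicy]) -> List[str]:
    """Return the names of policies whose forbidden columns all appear in the output."""
    present = set(output_columns)
    return [p.name for p in policies if p.forbidden_together <= present]


# Assumed example schema: patients(diagnosis, name), billing(address).
policies = [
    FlowPolicy(
        name="no_diagnosis_with_billing_address",
        forbidden_together=frozenset({"patients.diagnosis", "billing.address"}),
    )
]

ok = violated_policies(["patients.name", "billing.address"], policies)
bad = violated_policies(
    ["patients.name", "patients.diagnosis", "billing.address"], policies
)
print(ok)   # no policy triggered
print(bad)  # the co-occurrence policy is triggered
```

The point of the declarative form is that the rule is data that lives outside the model: it can be reviewed, versioned, and checked mechanically, independent of how the query was generated.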

The provenance-monomial approach is what distinguishes DFC from simpler access-control schemes. Row-level security and column masking can restrict what a user sees, but they cannot express constraints over combinations of data. A policy like “patient diagnosis fields must never appear in the same result set as billing addresses” requires reasoning about the query’s data flow, not just its access pattern. DFC formalizes these as aggregate predicates over provenance monomials, which the authors argue makes policy violations machine-checkable rather than dependent on model behavior.
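A toy illustration of provenance monomials may help. In the standard provenance-semiring view, each base tuple carries its own variable, and a join multiplies the variables of the tuples it combines, so every output row is annotated with a monomial. The Python sketch below follows that textbook construction; the `Row`, `join`, and `satisfies_min_contributors` names, the sample data, and the "at least k contributing patients" rule are assumptions for illustration, not the paper's formal definitions.

```python
from collections import Counter
from typing import Dict, List, NamedTuple, Set


class Row(NamedTuple):
    values: Dict[str, object]
    monomial: Counter  # maps (table, row_id) -> exponent


def base(table: str, row_id: int, values: Dict[str, object]) -> Row:
    """Annotate a base tuple with its own provenance variable."""
    return Row(values, Counter({(table, row_id): 1}))


def join(left: List[Row], right: List[Row], key: str) -> List[Row]:
    """Equi-join; the output monomial is the product of the input monomials."""
    return [
        Row({**l.values, **r.values}, l.monomial + r.monomial)
        for l in left
        for r in right
        if l.values[key] == r.values[key]
    ]


def contributing_ids(rows: List[Row], table: str) -> Set[int]:
    """Base tuples of `table` that contribute to any row of the result."""
    return {row_id for row in rows for (t, row_id) in row.monomial if t == table}


def satisfies_min_contributors(rows: List[Row], table: str, k: int) -> bool:
    """Aggregate predicate: the result must draw on at least k tuples of `table`."""
    return len(contributing_ids(rows, table)) >= k


patients = [base("patients", 1, {"pid": 1}), base("patients", 2, {"pid": 2})]
billing = [
    base("billing", 10, {"pid": 1, "amount": 40}),
    base("billing", 11, {"pid": 1, "amount": 55}),
    base("billing", 12, {"pid": 2, "amount": 70}),
]

result = join(patients, billing, "pid")
print(len(result))                                        # three joined rows
print(satisfies_min_contributors(result, "patients", 2))  # two patients contribute
print(satisfies_min_contributors(result, "patients", 3))  # fewer than three do
```

Because the predicate is evaluated over the monomials rather than over column names, it captures which underlying tuples actually flowed into the result, which is the kind of constraint row-level security and column masking cannot state.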

The paper describes this enforcement as “optimizer invariant”: the policy check operates on the query’s logical plan, so it remains correct regardless of which execution plan the DBMS chooses. This is an important property for portability across database engines.

Passant: near-zero-overhead query rewriting

The paper’s accompanying implementation is Passant, an open-source portable query-rewriting layer that enforces DFC policies without materializing provenance. That distinction matters. Naive provenance tracking requires computing and storing the full lineage of every tuple, which is expensive enough to be impractical for production workloads. Passant rewrites the incoming query to embed the policy constraints directly into the SQL, so the database executes a policy-compliant query without any intermediate provenance step.
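The general idea of folding a policy into the query itself, instead of computing lineage separately, can be sketched in a few lines of Python. This is not Passant's rewriting algorithm, which this article does not detail. The `rewrite_with_min_group_size` function is a hypothetical, string-based stand-in that handles only simple `GROUP BY` queries, and the `visits` table and the minimum-group-size rule are assumed for the example. It uses SQLite from the standard library only so the sketch runs anywhere.

```python
import sqlite3


def rewrite_with_min_group_size(query: str, id_column: str, k: int) -> str:
    """Hypothetical rewrite: only release groups built from >= k distinct ids.

    A real rewriter would work on a parsed plan, not on raw strings.
    """
    q = query.strip().rstrip(";")
    lowered = q.lower()
    if "group by" not in lowered or "having" in lowered:
        raise ValueError("sketch only handles GROUP BY queries without HAVING")
    return f"{q} HAVING COUNT(DISTINCT {id_column}) >= {int(k)}"


conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE visits (patient_id INTEGER, diagnosis TEXT);
INSERT INTO visits VALUES
  (1, 'flu'), (2, 'flu'), (3, 'flu'), (4, 'rare_condition');
""")

agent_query = "SELECT diagnosis, COUNT(*) AS n FROM visits GROUP BY diagnosis"
safe_query = rewrite_with_min_group_size(agent_query, "patient_id", 3)

print(safe_query)
rows = conn.execute(safe_query).fetchall()
print(rows)  # the single-patient group is filtered out by the database itself
```

The database evaluates the constraint as part of ordinary query execution, so no per-tuple lineage is stored. In this toy case the group containing a single patient never leaves the engine.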

The authors evaluated Passant across five DBMS engines: DuckDB, Umbra, PostgreSQL, DataFusion, and SQL Server. According to the abstract, the overhead is approximately 0% across all five, and Passant outperforms alternatives by “orders of magnitude.” The paper runs 15 pages with 12 figures, suggesting substantial benchmark detail, though the brief available for this article does not include the specific workloads or baselines used in those comparisons.

Why the enforcement layer matters

The practical significance for teams deploying tool-using agents is straightforward. If an agent has access to a database and can generate arbitrary SQL, the current safety model is: trust the model to not ask for data it shouldn’t see, and trust the output filter to catch it if it does. Both of those are probabilistic guarantees. DFC replaces them with a deterministic one: the rewritten query is provably policy-compliant before it reaches the database.

The authors frame this as “the first step towards moving data safety from prompts and post-hoc checks into the data infrastructure,” as stated in arXiv:2606.05679. The second-order consequence for practitioners is that data-leak prevention stops being an alignment problem and becomes a policy-enforcement problem. This raises the upfront cost of agent deployment, because someone has to write and maintain the data-safety policies. But it makes the failure mode auditable: you can inspect the policy, inspect the rewritten query, and determine whether a violation occurred, rather than reconstructing what the model was “thinking” when it decided to exfiltrate data.

This is a real trade-off. Policy authoring requires domain expertise about which data combinations are restricted, and policies must be maintained as schemas and regulations evolve. The paper does not appear to address the policy-authoring UX problem; its contribution is the enforcement mechanism, not the specification interface.

Scope and limitations

DFC’s current scope is DBMS queries. It does not govern what an agent does with the data after the query returns results: free-form text generation, API calls to non-SQL services, or any output channel that bypasses the database layer. An agent could execute a policy-compliant query and then describe the contents of the result set in a chat response. DFC prevents the query from violating policy; it does not prevent the agent from leaking data through other channels.

The paper also does not address integration with existing agent frameworks. Passant operates at the SQL layer, which means it needs to sit between the agent’s query generator and the database. How that insertion point works with frameworks like LangChain, CrewAI, or custom tool-use pipelines is an engineering question the paper leaves open.
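One plausible shape for that insertion point is a thin wrapper that becomes the only SQL tool the agent can call. The Python sketch below is an assumption about how such glue code might look, not an integration the paper describes. `GuardedSqlTool`, `PolicyViolation`, and `placeholder_rewrite` are hypothetical names, and the placeholder uses naive substring matching where a real deployment would call the actual rewriting layer.

```python
import sqlite3
from typing import Callable, List, Optional, Tuple


class PolicyViolation(Exception):
    """Raised when a query cannot be made policy-compliant."""


class GuardedSqlTool:
    """Hypothetical SQL tool: every agent query passes through `rewrite` first."""

    def __init__(self, conn: sqlite3.Connection, rewrite: Callable[[str], str]):
        self._conn = conn
        self._rewrite = rewrite
        # (original query, rewritten query or None, status) for later audit
        self.audit_log: List[Tuple[str, Optional[str], str]] = []

    def run(self, agent_sql: str) -> list:
        try:
            rewritten = self._rewrite(agent_sql)
        except PolicyViolation as exc:
            self.audit_log.append((agent_sql, None, str(exc)))
            raise
        self.audit_log.append((agent_sql, rewritten, "ok"))
        return self._conn.execute(rewritten).fetchall()


def placeholder_rewrite(sql: str) -> str:
    """Stand-in for a real rewriter; naive substring check, illustration only."""
    lowered = sql.lower()
    if "patients" in lowered and "billing" in lowered:
        raise PolicyViolation("patients and billing may not be combined")
    return sql


conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE patients (pid INTEGER, name TEXT, diagnosis TEXT);
CREATE TABLE billing (pid INTEGER, address TEXT);
INSERT INTO patients VALUES (1, 'Ada', 'flu');
INSERT INTO billing VALUES (1, '1 Main St');
""")

tool = GuardedSqlTool(conn, placeholder_rewrite)
allowed = tool.run("SELECT name FROM patients")
try:
    tool.run("SELECT diagnosis, address FROM patients JOIN billing USING (pid)")
    blocked = False
except PolicyViolation:
    blocked = True

print(allowed, blocked, len(tool.audit_log))
```

Keeping the original and rewritten queries side by side in the log is one way to get the auditability the article describes: a reviewer can see what the agent asked for and what actually ran, whatever framework sits on top.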

The policy language itself raises questions about expressiveness. The brief describes it as using “aggregate predicates over provenance monomials,” which is formally rigorous but may not cover every real-world data-governance rule. Regulations like GDPR or HIPAA impose constraints on data retention, purpose limitation, and cross-border transfer that go beyond what a query-level provenance model can capture. The authors do not claim otherwise, but the gap between “data flow control” and “regulatory compliance” is worth stating plainly.

AIPapers.ai categorized the paper under “AI Safety” with tags including Data Flow Control, DFC, AI agents, data safety, and policy enforcement. That categorization is accurate but incomplete. DFC is a database-systems contribution with safety implications, not a safety framework that happens to use databases. The distinction matters for practitioners evaluating where it fits in their architecture: this is infrastructure you add to your data layer, not a model you fine-tune or a prompt you append.

The paper is new enough that deep independent analysis is scarce. The formalization appears sound within its stated scope. The open questions, specifically policy authoring, non-SQL output channels, and integration with agent frameworks, are the ones that will determine whether DFC becomes a standard layer in agent deployments or remains an interesting research result with narrow adoption. The code is open source, which lowers the barrier to finding out.

Frequently Asked Questions

How does DFC differ from row-level security or column masking?

Row-level security filters which rows a given role can see, and column masking hides specific fields, but neither can prevent two permitted columns from appearing in the same result set. DFC enforces combination-level constraints on the query’s logical plan, such as requiring that diagnosis codes and billing addresses never co-occur in output. Standard RLS implementations in PostgreSQL and SQL Server have no mechanism for this kind of cross-column flow rule.

Can Passant prevent an agent from describing sensitive query results in a chat response?

No. Passant rewrites SQL before it reaches the database, so the returned result set satisfies data-flow policies. Once results are in the agent’s memory, nothing stops the model from summarizing or relaying them through chat, API calls, or file writes. The post-query output channel is outside DFC’s scope and requires a separate redaction or content-filtering layer.

What does a team need to change in an existing agent pipeline to add DFC?

Insert Passant between the agent’s SQL-generation step and the database connection, then author declarative policies specifying which data combinations are restricted for the target schema. Because the rewriting targets the logical plan, the same policy definition can apply across DuckDB, Umbra, PostgreSQL, DataFusion, and SQL Server without engine-specific rewrites. The ongoing cost is policy maintenance as schemas and regulations evolve.

What regulatory requirements does query-level data flow control not cover?

GDPR imposes obligations on retention periods, purpose limitation, and cross-border transfer that are lifecycle and context constraints, not query-time flow rules. A provenance-monomial model can block a specific join from combining restricted tables, but demonstrating full regulatory compliance still requires audit trails, data-retention automation, and governance processes that operate outside the query layer.