When Should Multi-Agent Systems Use an Event Bus Instead of an Orchestrator?

A multi-agent system should move from a request-response orchestrator to an event bus when coordination overhead and duplicated work start to scale faster than the agents themselves. Three June 2026 arXiv preprints suggest the tradeoff is sharper than framework marketing implies: the cost center shifts from central control logic to event ordering, state convergence, and containment of bad messages.

When does request-response orchestration become the bottleneck?

The orchestrator pattern keeps every agent on a leash. One node decides who speaks, who waits, and who sees what context. That works cleanly for small, well-defined chains, but it serializes decision-making around a single point. Add more agents, longer tool chains, or concurrent workstreams and the orchestrator stops being a convenience and starts being a throttle.

The deeper cost is not just latency. A central controller must maintain a coherent picture of partial results, retry failures, and resolve conflicts between agents that are working in parallel. The more state it holds, the more it resembles a distributed database that happens to also run your business logic. Eventually the question is no longer “how do I route this call?” but “how do I keep every agent’s view of the world from diverging?” That is the inflection point where an event substrate starts to look less like architectural decoration and more like a load-bearing wall.

What does an append-only event log actually buy you?

The grite substrate answers this with an unusual design choice: it stores coordination events as an append-only, signed log inside git itself. The preprint “Before the Pull Request: Mining Multi-Agent Coordination” reports that, among coding agents using grite, the share of work that merely redid a teammate’s task fell from 78% to 0%, while useful throughput more than tripled. Those numbers come from an unrefereed preprint and should be treated as such, but the mechanism behind them is what matters for builders.

Because the log is append-only and signed, every agent’s copy converges to the same state without a central server quietly dropping concurrent writes. The authors contrast this with a file-based tracker that lost concurrent writes in their tests. The signed log also becomes a mineable artifact: conflicting edits, lock starvation, redundant rediscovery, and race-to-close conditions are recoverable with provenance. Several of those failure modes are invisible in ordinary pull-request history, which only records the final merge, not the sequence of agent actions that produced it.
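The convergence property described above can be illustrated with a minimal sketch. It is not grite's implementation: the `Event` shape, the per-agent sequence numbers, and the shared-key HMAC signatures are all assumptions chosen to keep the example self-contained (a real signed log would more likely use asymmetric signatures). The point it shows is narrow: if replicas only ever add verified events and merge by set union, every replica ends up with the same log regardless of merge order.

```python
import hashlib
import hmac
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    agent: str      # hypothetical agent identifier
    seq: int        # hypothetical per-agent sequence number
    payload: str    # e.g. "claim task-7"
    signature: str  # hex HMAC over (agent, seq, payload)


def sign(key: bytes, agent: str, seq: int, payload: str) -> str:
    # HMAC is a stand-in for a real signature scheme (assumption).
    message = json.dumps([agent, seq, payload]).encode()
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def make_event(key: bytes, agent: str, seq: int, payload: str) -> Event:
    return Event(agent, seq, payload, sign(key, agent, seq, payload))


def verify(keys: dict[str, bytes], event: Event) -> bool:
    key = keys.get(event.agent)
    if key is None:
        return False
    expected = sign(key, event.agent, event.seq, event.payload)
    return hmac.compare_digest(expected, event.signature)


def merge(keys: dict[str, bytes], *replicas: list[Event]) -> list[Event]:
    """Union of all verified events, returned in a deterministic order."""
    seen = {e for replica in replicas for e in replica if verify(keys, e)}
    return sorted(seen, key=lambda e: (e.agent, e.seq, e.payload, e.signature))


KEYS = {"agent-a": b"key-a", "agent-b": b"key-b"}
a1 = make_event(KEYS["agent-a"], "agent-a", 1, "claim task-7")
b1 = make_event(KEYS["agent-b"], "agent-b", 1, "claim task-9")
forged = Event("agent-a", 2, "close task-9", "00")  # invalid signature

replica_x = [a1, b1]
replica_y = [b1, forged, a1]
merged_xy = merge(KEYS, replica_x, replica_y)
merged_yx = merge(KEYS, replica_y, replica_x)
```

Because nothing is ever overwritten, two concurrent appends both survive the merge, which is exactly what a last-save-wins file tracker cannot promise. The forged event is dropped at merge time rather than at a central gate.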

Can shared memory replace explicit coordination?

Multi-Agent Transactive Memory, or MATM, takes a different angle on the same problem. Instead of asking how agents should talk to one another, it asks whether they need to talk at all. Producer agents contribute trajectories to a shared repository; consumer agents retrieve them when they encounter a similar task. The MATM preprint reports improved downstream task performance and fewer interaction steps on ALFWorld and WebArena without requiring joint training or explicit coordination protocols.

The no-coordination result is the interesting part. It suggests that for some problem classes, shared state can substitute for hand-designed orchestration. An agent does not need to ask another agent what to do if it can read what a previous agent already figured out. The repository becomes a population-level memory rather than a message broker. That is a different kind of decentralization from grite’s event log: grite preserves the sequence of decisions, while MATM preserves the outcomes. Both, however, move the system away from a central coordinator that knows everything in real time.
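The producer/consumer split can be sketched in a few lines. This is an illustration under loud assumptions, not MATM's method: the `TrajectoryRepository` class, its method names, and the token-overlap (Jaccard) similarity are all hypothetical stand-ins, and the preprint's actual retrieval mechanism is not described in this article.

```python
from dataclasses import dataclass, field


def tokens(text: str) -> set[str]:
    return set(text.lower().split())


def jaccard(a: set[str], b: set[str]) -> float:
    union = a | b
    return len(a & b) / len(union) if union else 0.0


@dataclass
class TrajectoryRepository:
    # Each entry pairs a task description with the steps that solved it.
    entries: list[tuple[str, list[str]]] = field(default_factory=list)

    def contribute(self, task: str, steps: list[str]) -> None:
        """Producer side: store a finished trajectory."""
        self.entries.append((task, list(steps)))

    def retrieve(self, task: str, min_similarity: float = 0.3):
        """Consumer side: return the most similar stored steps, or None."""
        query = tokens(task)
        scored = [(jaccard(query, tokens(t)), steps) for t, steps in self.entries]
        scored = [pair for pair in scored if pair[0] >= min_similarity]
        if not scored:
            return None
        return max(scored, key=lambda pair: pair[0])[1]


repo = TrajectoryRepository()
repo.contribute("put a clean mug on the shelf",
                ["find mug", "wash mug", "go to shelf", "place mug"])

hit = repo.retrieve("put a clean plate on the shelf")
miss = repo.retrieve("book a flight to Oslo")
```

Note that the producer and consumer never exchange a message. The only shared surface is the repository, which is why no coordination protocol is needed while contributions stay read-only.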

How do you stop bad messages from spreading?

Decentralized coordination has an obvious failure mode: once you remove the orchestrator that validates every message, junk can propagate before anyone notices. The Argent Signaling Protocol, or ASP, attacks this with a sidecar pattern. Each AI-generated response carries a compact machine-readable header encoding certainty (@C), grounding (@G), stochasticity (@S), and an assumption index. In multi-agent mode, an ASP sidecar sits between a retrieval agent and a downstream decision agent.

According to the ASP preprint, this sidecar blocked every ungrounded upstream output in the tested configuration: it blocked 24 of 27 outputs, with zero ungrounded propagations. The numbers are small and unrefereed, but the architectural point is more durable. Treating message quality as routing metadata lets downstream agents ignore inputs that do not meet a threshold, rather than trying to sanitize everything at a central gate. It turns containment into a local decision, which is the only kind that scales cleanly with agent count.
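A sidecar of this kind reduces to a small local check. The sketch below is an assumption-heavy illustration: the article does not give ASP's wire format or scales, so the `SignalHeader` fields, the 0.0 to 1.0 ranges, the thresholds, and the `sidecar_gate` function are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SignalHeader:
    certainty: float      # @C, assumed to range from 0.0 to 1.0
    grounding: float      # @G, assumed to range from 0.0 to 1.0
    stochasticity: float  # @S, assumed to range from 0.0 to 1.0
    assumptions: tuple[str, ...] = ()  # stand-in for the assumption index


@dataclass(frozen=True)
class Message:
    body: str
    header: Optional[SignalHeader]


def sidecar_gate(message: Message,
                 min_grounding: float = 0.5,
                 min_certainty: float = 0.5) -> tuple[bool, str]:
    """Decide locally whether a message may reach the downstream agent."""
    header = message.header
    if header is None:
        return False, "missing header"
    if header.grounding < min_grounding:
        return False, "ungrounded"
    if header.certainty < min_certainty:
        return False, "low certainty"
    return True, "ok"


grounded = Message("Invoice 114 is overdue.", SignalHeader(0.9, 0.8, 0.1))
ungrounded = Message("The vendor probably agreed.", SignalHeader(0.9, 0.1, 0.6))
unlabeled = Message("Trust me.", None)
```

Rejecting messages with no header at all is a deliberate choice in this sketch: a gate that forwards unlabeled input by default gives a non-cooperating producer an easy way around it.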

Where does decentralized coordination break down?

Moving coordination onto an event bus or shared log does not remove the hard problems. It relocates them. The first is ordering. In a request-response orchestrator, the order of operations is explicit by construction. In an event-driven system, two agents can observe events in different orders unless the substrate guarantees a total or causal order. grite relies on git’s own ordering semantics, which is a clever reuse but not a general solution for arbitrary event streams.
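One common way to get an order that every agent agrees on is a Lamport logical clock. The sketch below shows that technique, not anything grite does, and the class and field names are hypothetical. Each event carries a counter, and sorting by (counter, agent) gives all observers the same total order while keeping every cause before its effect.

```python
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class Stamped:
    time: int     # Lamport timestamp
    agent: str    # tiebreaker so the order is total
    payload: str


class LamportClock:
    def __init__(self, agent: str) -> None:
        self.agent = agent
        self.time = 0

    def emit(self, payload: str) -> Stamped:
        """Stamp a locally generated event."""
        self.time += 1
        return Stamped(self.time, self.agent, payload)

    def observe(self, event: Stamped) -> None:
        """Advance past any event received from another agent."""
        self.time = max(self.time, event.time) + 1


a = LamportClock("agent-a")
b = LamportClock("agent-b")

e1 = a.emit("claim task-7")
b.observe(e1)                 # agent-b has now seen the claim
e2 = b.emit("review task-7")  # causally after e1
e3 = a.emit("release task-7") # concurrent with e2

order_seen_by_a = sorted([e1, e3, e2])
order_seen_by_b = sorted([e2, e1, e3])
```

The clock guarantees that `e1` sorts before `e2`, because `agent-b` observed the claim before reviewing. It says nothing meaningful about `e2` versus `e3`, which are concurrent: the tiebreak makes the order consistent, not correct, and deciding what concurrent events mean is still an application-level policy.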

The second is convergence. Append-only logs make divergence visible, but visibility is not the same as resolution. When two agents append conflicting events, the system still needs a policy for which event wins, how to merge them, or whether to fork. MATM’s trajectory repository sidesteps some of this by treating contributions as read-only experience, but the moment consumers start writing back or updating shared state, the same consistency questions reappear.
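To make "a policy for which event wins" concrete, here is a minimal last-writer-wins sketch. It is one possible policy among many and is not drawn from any of the preprints. The `Write` record and `resolve` function are hypothetical. The design goal is that every agent applying the same policy to the same set of writes reaches the same winner, and that losing writes are kept for audit rather than silently dropped.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Write:
    key: str    # the piece of shared state, e.g. "task-7/owner"
    value: str
    time: int   # logical timestamp (assumed to be supplied by the substrate)
    agent: str  # tiebreaker when timestamps are equal


def resolve(writes: list[Write]) -> tuple[dict[str, Write], list[Write]]:
    """Return the winning write per key plus the writes it overrode."""
    winners: dict[str, Write] = {}
    overridden: list[Write] = []
    # Sorting first makes the outcome independent of arrival order.
    for write in sorted(writes, key=lambda w: (w.time, w.agent, w.value)):
        previous = winners.get(write.key)
        if previous is not None:
            overridden.append(previous)
        winners[write.key] = write
    return winners, overridden


w1 = Write("task-7/owner", "agent-a", 4, "agent-a")
w2 = Write("task-7/owner", "agent-b", 4, "agent-b")

state_1, lost_1 = resolve([w1, w2])
state_2, lost_2 = resolve([w2, w1])
```

Last-writer-wins is deterministic but lossy: `agent-a` loses its claim purely because of a name-based tiebreak. Workloads where that is unacceptable need a richer merge or an explicit fork, which is the cost the paragraph above is pointing at.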

The third is containment. ASP shows that local signal headers can block ungrounded messages between two agents, but that assumes the sidecar is deployed, the thresholds are calibrated, and every producer cooperates. A single agent that emits unsigned or mislabeled events becomes a confusion multiplier. Decentralized systems distribute trust as well as load, and distributing trust only works when every participant is accountable.

What should LangGraph and CrewAI adopters do with this?

This section is editorial extrapolation, not sourced fact. The preprints do not evaluate LangGraph or CrewAI directly, and any claim that they “lack” event-bus support should be treated as an argument for verification rather than a settled finding.

That said, the architectural pattern is worth taking seriously for anyone already running multi-agent pipelines in these frameworks. If your workflow is a linear chain with a small, fixed set of agents, an orchestrator is probably still the right default. The complexity of an event log or shared repository is not free, and the failure modes it introduces only pay for themselves once concurrency, redundancy, or agent count crosses a threshold. In grite’s coding-agent workload, the substrate paid off against a baseline in which 78% of effort was duplicated work. Your threshold will differ.

If you are building toward larger populations of agents, the practical move is not to rip out the orchestrator overnight. It is to start treating coordination as a first-class design surface. Log agent decisions somewhere durable and ordered. Separate message content from metadata about its reliability. Make it possible for an agent to benefit from another agent’s work without requiring a direct conversation. Those changes prepare a system for an event-driven substrate even if the orchestrator remains in the loop for now.

The June 2026 papers do not settle the debate. They do make one thing concrete: the next bottleneck for multi-agent systems is less likely to be the quality of any single agent and more likely to be the substrate that lets many agents work without undoing each other’s progress.

Frequently Asked Questions

Which multi-agent workloads are poor fits for an event-bus design?

Small linear chains with a fixed set of agents usually pay more in ordering and convergence complexity than they gain, especially if the team lacks visibility into retry storms or duplicate tool calls. The crossover point is workload-specific: watch for retry queues that outgrow task throughput, merge-conflict rates that spike with concurrency, or duplicate calls dominating observability. In the grite preprint, the event log paid off against a baseline where 78% of effort was duplicated work, but most teams will hit their own threshold at a different redundancy or concurrency level.

How does grite’s event log differ from MATM’s trajectory repository?

grite is an open-source, server-less substrate that uses git’s own append-only commit graph to keep every agent’s copy convergent without a central server. MATM behaves more like a population-level experience cache: producer agents upload trajectories and consumers retrieve them for ALFWorld and WebArena tasks, improving performance without coordination protocols or joint training.

What is the minimum viable change before adopting decentralized coordination?

Start by creating a durable ordered log of agent actions and tagging messages with reliability metadata. Then baseline three indicators: the share of work that merely redoes a teammate’s task, the number of retry storms hitting the central controller, and the rate at which concurrent writes disappear from a file-based tracker. Those numbers tell you whether convergence and containment are worth the migration cost.
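The first of those indicators is cheap to compute once a durable log exists. The sketch below assumes a hypothetical log format in which each completed work item is recorded by task identifier. How the grite preprint measured its redundancy figure is not described here, so this definition is an assumption rather than a reproduction of it.

```python
def redundant_share(completed_task_ids: list[str]) -> float:
    """Fraction of completed work items that repeat an already completed task."""
    total = len(completed_task_ids)
    if total == 0:
        return 0.0
    distinct = len(set(completed_task_ids))
    return (total - distinct) / total


# Hypothetical log: task t1 was completed three times by different agents.
log = ["t1", "t2", "t1", "t3", "t1"]
share = redundant_share(log)
```

Tracking this number over time, before any migration, gives a baseline to compare against after coordination changes land.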

What can break an ASP-style signaling sidecar in production?

ASP headers encode certainty (@C), grounding (@G), stochasticity (@S), and an assumption index. In the tested configuration the sidecar blocked 24 of 27 outputs and stopped every ungrounded propagation, but production also requires versioning the sidecar with each agent, revisiting thresholds when tasks change, and logging headers for audit. One bypassed or mislabeled agent can still become a confusion multiplier.

How should teams treat the headline metrics from the June 2026 papers?

Read them as unrefereed directional signals, not settled engineering guidance. arXiv has relied on an endorsement system rather than formal peer review since 2004, and the paper originally sought for this angle (arXiv 2606.20058) could not be retrieved. The argument is therefore anchored to the adjacent grite, ASP, and MATM preprints, with the LangGraph/CrewAI guidance explicitly marked as editorial extrapolation.