groundy

Claude Code Dynamic Workflows: Spawning 100 Parallel Subagents on Opus 4.8

Dynamic workflows lets Claude Code run hundreds of parallel subagents in one session. Here is how map-reduce and fan-out patterns work on Opus 4.8.


Anthropic shipped dynamic workflows alongside Opus 4.8 on May 28, 2026.[1] The feature is a Claude Code research preview that lets a single session spawn hundreds of parallel subagents. In practice, tasks that previously required sequential agent chains can now be split into independent work units and executed concurrently, with an orchestrating agent collecting the results.

This piece covers what the architecture looks like, which patterns fit the model well, and what running parallel subagents on Opus 4.8 costs in practice.

What dynamic workflows actually are

Dynamic workflows is a Claude Code research preview, not a general API primitive available to all SDK callers.[1] It is exposed within Claude Code sessions and lets the model spawn many subagents from a single top-level session. Anthropic describes it as enabling “hundreds of parallel subagents.”[1]

The term “dynamic” distinguishes this from static multi-step pipelines where the number and sequencing of agents is fixed at design time. In a dynamic workflow, the orchestrating agent decides at runtime how many subagents to spawn and what work to assign each one, based on the structure of the problem it is solving. The orchestrator can fan out to many workers, wait for completion, then synthesize the results.
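To make the runtime-sizing idea concrete, here is a minimal Python sketch of the planning step only. It assumes, purely for illustration, that work is partitioned by top-level directory. The function `plan_work_units` and the file paths are hypothetical and are not part of Claude Code.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def plan_work_units(paths: list[str]) -> dict[str, list[str]]:
    """Group files by top-level directory (hypothetical partitioning rule).

    Each group would be handed to one subagent, so the number of
    subagents depends on the input rather than on a fixed pipeline.
    """
    units: dict[str, list[str]] = defaultdict(list)
    for path in paths:
        parts = PurePosixPath(path).parts
        # Files with no parent directory share a single "(root)" unit.
        key = parts[0] if len(parts) > 1 else "(root)"
        units[key].append(path)
    return dict(units)


files = ["api/users.py", "api/orders.py", "billing/invoice.py", "README.md"]
units = plan_work_units(files)
print(len(units), "work units:", sorted(units))
```

A static pipeline would hard-code the number of workers. Here the count falls out of the input, which is the distinction the term "dynamic" draws.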

Opus 4.8 is the practical tier for this feature.[1] The model’s 1-million-token context window[2] means the orchestrator can hold substantial state about all in-flight subagent tasks without losing track. Anthropic also notes that Opus 4.8 “works independently for longer” and shows “sharper judgment.”[1] Both properties should reduce how often a long-running orchestration loop requires human intervention or issues contradictory subagent instructions.

How map-reduce fits the parallel subagent model

Map-reduce is the most natural pattern for dynamic workflows. The orchestrator receives a problem that decomposes into N independent units of work, assigns one subagent per unit (the map phase), waits for all subagents to complete, then synthesizes the collected outputs into a final result (the reduce phase).

Concrete examples where this decomposition is clean:

  • Repository-wide code audits. The orchestrator identifies N modules or packages, assigns one subagent per module to audit it against a checklist, then aggregates findings into a ranked issue list. Because modules are largely independent, the map phase can run fully in parallel.
  • Multi-source research synthesis. The orchestrator receives a list of sources, assigns one subagent per source to extract claims and evidence, then reduces to a reconciled summary. The reduce phase is where Opus 4.8’s lower rate of unsupported claims matters:[1] the synthesizing agent is less likely to introduce invented discrepancies when merging subagent outputs.
  • Test suite generation. Given a set of functions or API endpoints, the orchestrator fans out to one subagent per target, each writing tests independently, then merges the test files and resolves any import or fixture conflicts.

Map-reduce works best when the work units in the map phase are genuinely independent: no shared mutable state, no ordering constraints, no subagent needing the output of another subagent before it can proceed.
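The map and reduce phases can be sketched with Python's `asyncio`. This is an illustration of the pattern, not Claude Code's interface: `run_subagent` is a hypothetical stand-in that returns a canned finding, and the module names are made up.

```python
import asyncio


async def run_subagent(module: str) -> dict:
    """Hypothetical stand-in for one subagent auditing one module."""
    await asyncio.sleep(0)  # placeholder for the real, long-running work
    return {"module": module, "issues": [f"{module}: example finding"]}


async def map_reduce(modules: list[str]) -> list[str]:
    # Map: one independent task per module, all started together.
    results = await asyncio.gather(*(run_subagent(m) for m in modules))
    # Reduce: merge the per-module findings into a single sorted list.
    return sorted(issue for r in results for issue in r["issues"])


report = asyncio.run(map_reduce(["auth", "billing", "search"]))
print(report)
```

No task reads another task's output, so the map phase has no ordering constraints. That is the independence property described above.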

Fan-out-fan-in for dependent subtasks

Fan-out-fan-in extends map-reduce to handle cases where subagents have partial dependencies. In a pure fan-out, all subagents start simultaneously. In fan-out-fan-in, the orchestrator may start a first wave of subagents, wait for a subset to complete, then start a second wave that consumes the first wave’s outputs before the full reduce phase.

A practical example: a refactoring task that requires (1) understanding the existing API contract, (2) proposing a new interface in parallel across multiple modules, then (3) updating call sites once the new interface is agreed. Phases 1 and 3 are sequential relative to phase 2, but within phase 2 the module proposals are parallel. The orchestrator fans out for phase 2, collects the interface proposals, reconciles them, then fans out again for phase 3.
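The three phases of that refactoring example can be sketched as two fan-out waves with a reconciliation step between them. Every function name and return string below is a hypothetical placeholder, and the reconciliation logic is deliberately trivial.

```python
import asyncio


async def propose_interface(module: str, contract: str) -> str:
    """Hypothetical phase-2 subagent: propose an interface for one module."""
    await asyncio.sleep(0)
    return f"{module} proposal based on {contract}"


async def update_call_sites(module: str, agreed: str) -> str:
    """Hypothetical phase-3 subagent: update one module's call sites."""
    await asyncio.sleep(0)
    return f"{module}: call sites updated to {agreed}"


def reconcile(proposals: list[str]) -> str:
    """Placeholder fan-in step; real logic would merge the proposals."""
    return f"interface-v2 (from {len(proposals)} proposals)"


async def refactor(modules: list[str]) -> list[str]:
    contract = "contract-v1"  # phase 1: understood once, sequentially
    # Wave 1 (phase 2): proposals for all modules run in parallel.
    proposals = await asyncio.gather(
        *(propose_interface(m, contract) for m in modules)
    )
    agreed = reconcile(list(proposals))  # fan-in before the next wave
    # Wave 2 (phase 3): call-site updates run in parallel once agreed.
    return list(
        await asyncio.gather(*(update_call_sites(m, agreed) for m in modules))
    )


updates = asyncio.run(refactor(["orders", "payments"]))
print(updates)
```

The second wave cannot start until `reconcile` returns. That barrier is what separates fan-out-fan-in from a single flat map phase.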

Opus 4.8’s stronger benchmark performance on agentic tasks is relevant here. On SWE-Bench Pro, it reaches 69.2 percent, compared to 64.3 percent for Opus 4.7.[1] On Terminal-Bench 2.1, it scores 74.6 percent, though GPT-5.5 leads that benchmark at 78.2 percent.[1] The SWE-Bench Pro gains reflect improved performance on multi-file, cross-module tasks, which are exactly the coordination problems that fan-out-fan-in workflows surface.

When parallel beats sequential

Parallel subagents are faster for work that is decomposable and where the bottleneck is latency per unit of work, not dependencies between units. They are not better for every task.

Parallel wins when:

  • Work units are independent (no data hazards between subagents).
  • The task has enough volume that fan-out setup overhead is small relative to total work. Spawning a hundred subagents for a task that would complete in two sequential steps is not efficient.
  • The reduce phase is straightforward. If synthesizing outputs requires complex reconciliation logic that takes as long as the map phase, the parallelism gain shrinks.

Sequential wins when:

  • Each step depends on the prior step’s exact output.
  • The task requires accumulated context that would be expensive or lossy to reconstruct from independent subagent outputs.
  • The total work is small enough that coordination overhead dominates.

The threshold at which parallelism pays off depends on how long each subagent task takes relative to the overhead the orchestrator adds. For tasks where each work unit takes several seconds or more, fan-out is almost always faster in wall-clock terms. For trivial tasks measured in milliseconds, the overhead of spawning subagents erases the gain.
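A back-of-the-envelope model makes the threshold easier to reason about. The sketch below assumes the orchestrator pays a fixed spawn overhead per subagent serially and that subagents run in equal-length waves. Both are simplifying assumptions, and every number is illustrative rather than measured.

```python
import math


def sequential_seconds(n_units: int, unit_seconds: float) -> float:
    """Wall-clock time when work units run one after another."""
    return n_units * unit_seconds


def parallel_seconds(
    n_units: int,
    unit_seconds: float,
    max_concurrency: int,
    spawn_overhead_s: float,
    reduce_s: float,
) -> float:
    """Wall-clock estimate for a fan-out under the stated assumptions."""
    waves = math.ceil(n_units / max_concurrency)
    # Serial dispatch cost + one unit duration per wave + reduce phase.
    return n_units * spawn_overhead_s + waves * unit_seconds + reduce_s


# 100 units of 30 s each: fan-out wins by a wide margin.
slow_seq = sequential_seconds(100, 30)
slow_par = parallel_seconds(100, 30, 100, 0.5, 60)

# 100 units of 5 ms each: spawn overhead alone exceeds the sequential time.
fast_seq = sequential_seconds(100, 0.005)
fast_par = parallel_seconds(100, 0.005, 100, 0.5, 0)

print(slow_seq, slow_par, fast_seq, fast_par)
```

With these inputs the 30-second units take 3,000 seconds sequentially against 140 seconds in parallel. The 5-millisecond units finish in half a second sequentially but need about 50 seconds once the assumed spawn overhead is counted.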

Coordination overhead and the orchestrator’s role

Every parallel workflow has coordination overhead: the orchestrator must dispatch tasks, track which subagents have completed, handle failures or stalls, and aggregate results. In dynamic workflows on Claude Code, this coordination runs within the same session.[1]
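Those coordination duties (dispatch, completion tracking, failure and stall handling) can be illustrated with a generic `asyncio` sketch. It does not reflect how Claude Code implements them. `run_subagent` is a hypothetical stand-in that is rigged so one task crashes and one stalls.

```python
import asyncio


async def run_subagent(task_id: int) -> str:
    """Hypothetical subagent: task 2 crashes and task 3 stalls."""
    if task_id == 2:
        raise RuntimeError("subagent crashed")
    if task_id == 3:
        await asyncio.sleep(10)  # simulated stall, cut off by the timeout
    await asyncio.sleep(0)
    return f"result-{task_id}"


async def dispatch(task_ids: list[int], timeout_s: float):
    async def guarded(task_id: int) -> str:
        # A per-task timeout turns a stall into an ordinary failure.
        return await asyncio.wait_for(run_subagent(task_id), timeout=timeout_s)

    # return_exceptions=True keeps one failure from aborting the batch.
    outcomes = await asyncio.gather(
        *(guarded(t) for t in task_ids), return_exceptions=True
    )
    done: dict[int, str] = {}
    failed: dict[int, str] = {}
    for task_id, outcome in zip(task_ids, outcomes):
        if isinstance(outcome, Exception):
            failed[task_id] = type(outcome).__name__
        else:
            done[task_id] = outcome
    return done, failed


done, failed = asyncio.run(dispatch([1, 2, 3, 4], timeout_s=0.05))
print("done:", done)
print("failed:", failed)
```

Separating `done` from `failed` lets the reduce step decide explicitly whether to retry, skip, or stop, so a missing result is not merged silently.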

Opus 4.8’s behavioral properties matter for orchestration quality. Anthropic describes the model as more likely to flag uncertainties and less likely to make unsupported claims.[1] In an orchestration context, this suggests the orchestrator is more likely to detect when a subagent has returned an incomplete or ambiguous result rather than silently incorporating a wrong output into the final synthesis. That fail-fast behavior reduces the cost of errors that propagate through a long reduce chain.

Opus 4.8 also “works independently for longer.”[1] For a session managing dozens or hundreds of subagents, reduced interruption frequency is a practical efficiency gain: the orchestrator can proceed further into the fan-in phase before needing to surface a decision to the user.

Cost per parallel task on Opus 4.8

Opus 4.8 is priced at $5 per million input tokens and $25 per million output tokens, the same as Opus 4.7.[1] In a parallel workflow, each subagent consumes tokens independently. The total cost scales with the number of subagents and the token volume per subagent, not with the wall-clock time saved.

A rough example: if each subagent task consumes 2,000 input tokens and 500 output tokens, a 100-subagent fan-out costs 200,000 input tokens ($1.00) and 50,000 output tokens ($1.25), for $2.25 total at standard pricing.[1] Add the orchestrator’s own token usage, which spans the full session and includes the synthesis phase, and a 100-subagent map-reduce on a moderate task might run $5 to $15 depending on orchestrator context size and reduce complexity.
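The subagent portion of that estimate is simple enough to check in a few lines of Python. The rates are the standard prices quoted above. The function name is hypothetical, and orchestrator tokens are deliberately left out because they depend on the session.

```python
INPUT_USD_PER_MTOK = 5.00    # standard input price quoted above
OUTPUT_USD_PER_MTOK = 25.00  # standard output price quoted above


def fanout_cost_usd(
    n_subagents: int,
    input_tokens_each: int,
    output_tokens_each: int,
    input_rate: float = INPUT_USD_PER_MTOK,
    output_rate: float = OUTPUT_USD_PER_MTOK,
) -> float:
    """Token cost of the subagents only (orchestrator usage excluded)."""
    input_cost = n_subagents * input_tokens_each * input_rate / 1_000_000
    output_cost = n_subagents * output_tokens_each * output_rate / 1_000_000
    return input_cost + output_cost


cost = fanout_cost_usd(100, 2_000, 500)
print(f"${cost:.2f}")
```

Doubling either the subagent count or the tokens per subagent doubles the result, and running the subagents concurrently changes nothing.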

For throughput-sensitive workflows, fast mode on Opus 4.8 runs at approximately 2.5 times the speed at $10 per million input tokens and $50 per million output tokens.[1] Anthropic notes this is three times cheaper than the previous fast mode pricing.[1] For workflows where wall-clock latency is the binding constraint (competitive deadlines, interactive use cases), a 2.5x speedup cuts model generation time to roughly 40 percent of standard mode at double the per-token cost, a trade-off that favors fast mode when the latency saving is worth more than the cost delta.
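One way to frame the fast-mode decision is as a break-even value of time. The sketch below assumes the quoted 2.5x speedup applies uniformly to model generation time and to nothing else (tool calls and I/O are ignored). The function name and the example inputs are hypothetical.

```python
FAST_SPEEDUP = 2.5          # approximate speedup quoted above
FAST_COST_MULTIPLIER = 2.0  # $10/$50 versus $5/$25 per million tokens


def fast_mode_tradeoff(standard_cost_usd: float, standard_generation_s: float) -> dict:
    """Compare fast and standard mode under the stated assumptions."""
    extra_cost = standard_cost_usd * (FAST_COST_MULTIPLIER - 1.0)
    seconds_saved = standard_generation_s - standard_generation_s / FAST_SPEEDUP
    return {
        "extra_cost_usd": extra_cost,
        "seconds_saved": seconds_saved,
        # Fast mode pays off if an hour of waiting is worth more than this.
        "breakeven_usd_per_hour": extra_cost * 3600 / seconds_saved,
    }


# Illustrative inputs: a $2.25 fan-out with 10 minutes of generation time.
tradeoff = fast_mode_tradeoff(2.25, 600)
print(tradeoff)
```

With these inputs fast mode costs $2.25 more and saves 360 seconds, so it breaks even when an hour of waiting is valued at about $22.50.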

The extended output ceiling of 300,000 tokens via the Batch API beta header[2] is relevant for reduce phases that generate large synthesis documents, consolidated test files, or full refactored codebases.

What the research preview status means in practice

Dynamic workflows is shipping as a research preview.[1] That means the interface and behavior are subject to change, and Anthropic is collecting data on real-world usage patterns before stabilizing the feature. Users who build workflows around this capability should plan for the possibility that the API surface or session model changes between the preview and a general release.

The research preview framing also means the current implementation may have edge cases (subagent failure handling, session state management under high fan-out counts, or orchestrator context limits) that Anthropic has not fully documented or resolved. Teams building production workflows on top of dynamic workflows should validate behavior at their target fan-out scale before committing to it as a dependency.

The capability is real and demonstrated, but treating a research preview as a stable production primitive carries the same risks it always has.

Sources

  1. Claude Opus 4.8 Announcement (primary; accessed 2026-05-28)
  2. Claude Model Overview, API Docs (primary; accessed 2026-05-28)
  3. Claude Opus 4.7 Announcement (primary; accessed 2026-05-28)
  4. Claude Opus Product Page (vendor; accessed 2026-05-28)