You run fourteen agents sequentially. Agent 1 (GP Geopolitical) publishes its assessment of a geopolitical scenario. Agent 2 (CT Cyber Threat) receives the scenario brief along with GP's output. It reads GP's framing, absorbs its assumptions, and produces an assessment that is subtly anchored to Agent 1's conclusions. By Agent 14, every assessment is a variation on Agent 1's analytical frame. You have paid for fourteen agents. You have received approximately 1.2 independent perspectives.
This failure mode is well-documented. The multi-agent debate literature shows that agents exposed to majority opinion suppress independent correction, with weak agents rarely overturning initial majorities regardless of argument quality. But the problem runs deeper than majority pressure. LLMs are specifically trained to be agreeable and contextually consistent. RLHF (reinforcement learning from human feedback) optimises for responses that humans rate as helpful and satisfying, and humans tend to rate agreement more favourably than dissent. Research measuring sycophantic behaviour across frontier models found compliance rates as high as 100% on prompts designed to elicit agreement with incorrect statements. Recent work submitted to ICLR 2026 decomposed sycophancy into distinct, independently steerable behaviours: sycophantic agreement, sycophantic praise, and genuine agreement, each encoded along separate linear directions in the model's latent space.
An LLM that sees a prior assessment will incorporate it. This happens not because the agent was instructed to defer, but because contextual consistency is what its training incentivises. In a sequential multi-agent system, this means the first agent to produce output effectively sets the analytical frame for every subsequent agent.
What isolation means, precisely
Isolation is a design principle with a specific definition: agents share the same input (scenario description, entity profile, analytical question) but have zero access to each other's outputs, intermediate reasoning, or metadata. They see the same problem. They never see each other's answers.
Three levels of isolation exist, and each addresses a different information leakage pathway:
Output isolation. Agents do not see each other's final assessments. This is the minimum viable boundary. It prevents the most obvious form of anchoring: reading another agent's conclusion and adjusting toward it. Necessary, but insufficient on its own.
Process isolation. Agents do not share intermediate reasoning, scratchpads, or chain-of-thought traces. In systems where agents log their reasoning to a shared store (for debugging, observability, or audit), process isolation prevents information leakage through shared memory. An agent that reads another agent's chain-of-thought has effectively been exposed to that agent's analytical framework, even if it never sees the final score.
Infrastructure isolation. Agents run in separate execution contexts with no shared state. This addresses a category of leakage that most system designers overlook: accidental coupling through caching, session state, or model inference batching. If two agents share an inference server with KV-cache optimisation, the second agent's generation may be influenced by cached key-value pairs from the first agent's context. Infrastructure isolation eliminates this pathway entirely by ensuring each agent operates in its own process, with its own context window, and its own system prompt.
Roach implements all three levels. Each of the fourteen specialised agents runs as an independent process. No shared memory, no shared logs during assessment, no shared inference infrastructure. The only shared element is the input: the scenario brief and entity profile that each agent receives identically.
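The three levels can be illustrated with a minimal sketch: each agent runs in its own OS process with its own memory, and the only shared element is the identical input brief. The agent IDs come from the article; `run_agent`'s body is a placeholder for a real LLM call, not Roach's actual code.

```python
# Infrastructure isolation sketch: one OS process per agent, no shared state.
from multiprocessing import Process, Queue

AGENT_IDS = ["GP", "ME", "SC", "CT", "RC", "FC", "AI",
             "LS", "OR", "EM", "CL", "SR", "RT", "ST"]

def run_agent(agent_id: str, brief: str, results: Queue) -> None:
    # In a real system this would call the LLM with this agent's own
    # system prompt, in its own inference context. Placeholder output:
    results.put({"agent": agent_id, "brief_seen": brief})

def assess_in_isolation(brief: str) -> list[dict]:
    results: Queue = Queue()
    procs = [Process(target=run_agent, args=(aid, brief, results))
             for aid in AGENT_IDS]
    for p in procs:
        p.start()
    outputs = [results.get() for _ in AGENT_IDS]  # blocks until all 14 arrive
    for p in procs:
        p.join()
    return outputs
```

Because each process has its own address space, there is no pathway for one agent's output, scratchpad, or cache state to reach another during assessment.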
Isolation without diversity is expensive noise
Fourteen isolated agents running the same base model with the same system prompt will produce fourteen independent samples from the same distribution. The variance you observe between them is sampling noise. It carries no structural information about the problem. You have spent 14x the compute to learn nothing that a single agent with temperature sampling could not tell you.
Isolation preserves analytical independence. But analytical independence only produces valuable disagreement when the agents are analytically diverse. Each agent needs a distinct analytical framework: a defined domain of expertise, a specific set of causal models it applies, and a named set of blind spots it acknowledges.
In Roach, fourteen agents cover fourteen distinct analytical domains:
- GP Geopolitical and Conflict: strategic scenario planning (Wack), coercive bargaining (Schelling)
- ME Macroeconomic and Monetary Policy: central bank reaction functions, yield curve dynamics
- SC Supply Chain and Outsourcing: network theory, geographic concentration risk
- CT Cyber Threat and InfoWar: kill chain (Lockheed), MITRE ATT&CK
- RC Regulatory and Compliance: regulatory cycle theory, supervisory expectations
- FC Financial Contagion: network contagion, liquidity spirals (Brunnermeier)
- AI Developments Risk: technology concentration risk, adversarial ML
- LS Liquidity Stress: deposit dynamics (Diamond-Dybvig), collateral cascades
- OR Operational Resilience: business continuity (ISO 22301), third-party risk (BCBS 271)
- EM Emerging Markets and FX: carry trade mechanics, EM contagion pathways (Rey 2013)
- CL Climate and ESG Risk: physical risk taxonomy (TCFD), transition pathways (Net Zero 2050)
- SR Systemic Risk: interconnectedness (Allen-Gale), procyclical leverage (Adrian-Shin)
- RT Adversarial Red Team: ACH (Heuer), devil's advocacy, pre-mortem
- ST Strategic Scenario Planner: scenario planning (Schwartz), long-horizon foresight
The diversity lives in the system prompt. The explicit "do NOT assess" instructions serve a dual purpose. They prevent agents from wandering into domains where their framework has nothing useful to contribute. And they force each agent to declare its blind spots, which becomes critical data for the Peripheral Scanner later in the pipeline.
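A sketch of what such a system prompt might look like, using the framework names from the list above. The wording and template structure are illustrative, not Roach's actual prompts.

```python
# Hypothetical system-prompt template encoding an agent's domain, frameworks,
# and the explicit "do NOT assess" boundary.
AGENT_PROMPT_TEMPLATE = """You are the {code} agent: {domain}.
Apply only these analytical frameworks: {frameworks}.
Do NOT assess dimensions outside this domain: mark them cannot_assess
and name the blind spot instead of guessing.
Return your assessment in the shared structured schema."""

gp_prompt = AGENT_PROMPT_TEMPLATE.format(
    code="GP",
    domain="Geopolitical and Conflict",
    frameworks="strategic scenario planning (Wack), "
               "coercive bargaining (Schelling)",
)
```

The boundary instruction is what turns domain focus into machine-readable coverage data: a refusal to assess becomes a `cannot_assess` flag downstream rather than a silent omission.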
The output schema: making disagreement measurable
Diversity without a common output schema produces incomparable outputs. If GP returns a narrative essay and FC returns a numerical scorecard, you cannot compute the spread between their assessments. The disagreement exists but is invisible to the system.
All fourteen agents produce outputs in the same structured schema. Every dimension carries a score with explicit uncertainty bounds, a named mechanism explaining why the score is what it is, evidence citations supporting the assessment, and a confidence classification.
The cannot_assess field is the most important structural element. When an agent marks a dimension as outside its framework, it is doing two things: preventing itself from contributing noise to a domain it has no expertise in, and flagging a coverage gap that the system can aggregate across all agents. If ten out of fourteen agents mark "climate transition risk" as outside their framework, the system knows that dimension is sparsely covered and may require additional analytical capability.
Post-isolation synthesis
After all fourteen agents complete their assessments in isolation, the outputs are collected and passed to a synthesis stage. The Strategic Scenario Planner (ST) receives all fourteen structured assessments simultaneously. Its role is to organise disagreement, not resolve it.
ST computes three categories per dimension:
Consensus (spread below 10 points). Most agents that assessed this dimension reached similar conclusions. ST reports the consensus range and notes the number of contributing agents. A consensus across six agents carries more weight than a consensus between two.
Contested (spread above 25 points). Agents disagree significantly. ST extracts the competing reasoning chains: which mechanism does the high-scoring agent cite? Which precedent does the low-scoring agent rely on? The output surfaces both positions with their evidence, preserving the structure of the disagreement.
Sparse (fewer than three agents assessed). Most agents marked this dimension as outside their framework. ST flags these as coverage gaps with an explicit warning: the system's assessment of this dimension is based on limited analytical perspectives and should be treated with corresponding caution.
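The triage logic, using the thresholds stated above, reduces to a few lines. One assumption on my part: the text does not say how a spread between 10 and 25 points is handled, so the sketch labels it "intermediate".

```python
# ST's per-dimension triage: consensus < 10 points spread,
# contested > 25 points, sparse < 3 contributing agents.
def classify_dimension(scores: list[float]) -> str:
    if len(scores) < 3:
        return "sparse"
    spread = max(scores) - min(scores)
    if spread < 10:
        return "consensus"
    if spread > 25:
        return "contested"
    return "intermediate"  # between thresholds: reported, not flagged (assumed)
```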
After synthesis produces its structured report, two additional analysis passes operate on the output:
The Adversarial Red Team (RT), which has already produced its own isolated assessment, now enters a second mode: challenging the consensus positions. Where agents agree, RT probes for shared assumptions that went unexamined. It asks: are these agents agreeing because they independently reached the same conclusion, or because they share a common training bias or a common analytical blind spot?
The Peripheral Scanner reviews all fourteen outputs simultaneously and searches for four types of intelligence that individual agents structurally cannot detect: uncited evidence that no agent referenced despite clear relevance, cross-domain convergence where individually low-probability risks compound, framework blind spots where no agent's toolkit covers a plausible scenario, and temporal clusters where independent risks converge on the same window.
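The fourth scan, temporal clustering, is the most mechanical of the four and can be sketched directly. The 90-day window and the single-pass grouping are illustrative assumptions, not Roach's actual algorithm.

```python
# Sketch: group risks whose expected windows fall close together in time.
from datetime import date, timedelta

def temporal_clusters(risks: list[tuple[str, date]],
                      window: timedelta = timedelta(days=90)) -> list[list[str]]:
    """Return groups of risk names whose dates chain within `window`."""
    if not risks:
        return []
    ordered = sorted(risks, key=lambda r: r[1])
    clusters, current = [], [ordered[0]]
    for name, when in ordered[1:]:
        if when - current[-1][1] <= window:
            current.append((name, when))
        else:
            clusters.append(current)
            current = [(name, when)]
    clusters.append(current)
    return [[n for n, _ in c] for c in clusters if len(c) > 1]
```

No individual agent can run this scan, because each agent sees only its own risks; the convergence is visible only to a component that reads all fourteen outputs at once.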
Practical constraints and tradeoffs
Isolation carries real costs.

Compute. Fourteen independent LLM calls at full context length cost 14x a single call. For Roach, this tradeoff is acceptable: the decisions being informed are institutional resilience assessments where the cost of analytical failure far exceeds the cost of compute. For a customer service chatbot, isolation would be absurd overkill.
Latency. Parallel dispatch transforms the cost equation. If all fourteen agents run concurrently, wall-clock time equals the slowest agent rather than the sum of all fourteen. Roach dispatches all agents simultaneously, with a timeout that triggers assessment with available outputs if any agent exceeds the deadline.
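A deadline-bounded parallel dispatch might look like the following sketch, where the agent call is a stand-in coroutine rather than a real LLM client, and late agents are simply dropped from the round.

```python
# Parallel dispatch with a deadline: wall-clock time is the slowest agent
# or the timeout, whichever comes first; synthesis uses whatever completed.
import asyncio

async def call_agent(agent_id: str, brief: str, delay: float) -> dict:
    await asyncio.sleep(delay)  # stands in for LLM latency
    return {"agent": agent_id, "score": 50}

async def dispatch_all(brief: str, delays: dict[str, float],
                       deadline: float) -> list[dict]:
    tasks = [asyncio.create_task(call_agent(a, brief, d))
             for a, d in delays.items()]
    done, pending = await asyncio.wait(tasks, timeout=deadline)
    for t in pending:           # agents past the deadline are dropped
        t.cancel()
    return [t.result() for t in done]
```

The tradeoff is explicit: a hard deadline guarantees latency at the cost of occasionally assessing with thirteen perspectives instead of fourteen, which the sparse-coverage reporting then surfaces honestly.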
Token usage. Each agent receives the full scenario context independently. For long entity profiles (a Dutch bank's outsourcing relationships, technology stack, regulatory obligations), this means substantial token consumption per agent. One mitigation: structure the input so each agent receives a common core (scenario description, entity overview) plus a domain-specific supplement (the sections of the entity profile most relevant to its analytical framework). GP receives the geopolitical exposure appendix. FC receives the balance sheet and funding profile. This reduces per-agent context length without sacrificing analytical relevance.
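The common-core-plus-supplement structure amounts to a routing table from agent to profile sections. The section names and routing entries below are illustrative, not the actual Roach configuration.

```python
# Hypothetical routing: which entity-profile sections each agent receives
# on top of the shared scenario and entity overview.
PROFILE_ROUTING = {
    "GP": ["geopolitical_exposure"],
    "FC": ["balance_sheet", "funding_profile"],
    "SC": ["outsourcing_relationships"],
}

def build_context(agent_id: str, scenario: str, profile: dict[str, str]) -> str:
    core = f"SCENARIO\n{scenario}\n\nENTITY OVERVIEW\n{profile['overview']}"
    sections = PROFILE_ROUTING.get(agent_id, [])
    supplement = "\n\n".join(f"{s.upper()}\n{profile[s]}" for s in sections)
    return core + ("\n\n" + supplement if supplement else "")
```

Each agent still receives its context independently, so isolation is preserved; only the irrelevant bulk is trimmed.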
When to use isolation. The decision criterion is straightforward: if the cost of correlated analytical failure exceeds the cost of running independent assessments, use isolation. Financial resilience planning, regulatory stress testing, national security assessment, and critical infrastructure risk analysis all clear this bar. Routine classification tasks, content moderation, and single-domain question answering typically do not.
The map, not the territory
The output of an isolated multi-agent system is a map of uncertainty. It shows where the analytical ensemble agrees, where it disagrees, and where it lacks coverage. That map does not tell the decision-maker what to do. It tells them what they know, what they are uncertain about, and what they cannot see.
Without isolation, you get consensus. Consensus is comfortable, dashboardable, and wrong in exactly the ways that matter most. With isolation, you get structured disagreement. Structured disagreement is harder to consume and harder to report upward. It is also the only honest representation of analytical uncertainty that a multi-agent system can produce.
The architecture that produces this map (isolated agents, diverse frameworks, common schema, structured synthesis) is more expensive and more complex than a single-agent pipeline. The question is whether the decision being informed justifies the cost. For institutional resilience at the scale Roach operates, the answer is unambiguous.
References
- Wu, H. et al. (2025). Can LLM Agents Really Debate? A Controlled Study of Multi-Agent Debate in Logical Reasoning. arxiv.org/abs/2511.07784. Demonstrates that majority pressure in multi-agent debate suppresses independent correction, with minority agents rarely overturning incorrect majorities.
- Chen, S. et al. (2025). When Helpfulness Backfires: LLMs and the Risk of False Medical Information Due to Sycophantic Behavior. npj Digital Medicine, 8, 605. nature.com. Measured compliance rates as high as 100% across frontier LLMs on prompts designed to elicit sycophantic agreement with incorrect statements.
- Submitted to ICLR 2026. Sycophancy Is Not One Thing: Causal Separation of Sycophantic Behaviors in LLMs. openreview.net. Decomposes sycophancy into sycophantic agreement, sycophantic praise, and genuine agreement, showing each is encoded along distinct, independently steerable directions in latent space.