A compliance team of six AI agents must assess whether a novel fintech product falls under MiFID II, PSD2, or the AI Act. The product combines payment initiation with automated portfolio rebalancing and uses a large language model to generate personalised investment narratives. It genuinely straddles regulatory boundaries.
The team's orchestrator coordinates tightly. Every agent shares every intermediate finding in real time. Within minutes, the agents converge on a single classification: MiFID II, because the investment advisory component dominates. The assessment is clean, fast, and confident. It has also destroyed the most important signal the system could have produced: the regulatory ambiguity itself. The product sits at the intersection of three regulatory frameworks simultaneously. That intersection, not any single classification, is what the compliance officer needs to understand.
Run the same scenario with loose coordination. Agents work semi-independently, sharing only structured outputs at the end. The MiFID II agent classifies it as investment advisory. The PSD2 agent classifies it as payment initiation. The AI Act agent flags the LLM component as a high-risk AI system under Annex III. Three confident, well-reasoned, mutually incompatible assessments land on the compliance officer's desk. The disagreement is the finding.
A caveat before we proceed: convergence is not always failure. When the problem has a clear answer and the agents converge on it, that is correct behaviour. The pathology emerges specifically when the problem is genuinely ambiguous and premature convergence destroys the ambiguity signal. Regulatory boundary questions are precisely this kind of problem. The design challenge is to build systems that converge when convergence is warranted and preserve disagreement when disagreement carries information.
This challenge has been studied for decades. Not in computer science, but in organisational theory. Professor Phanish Puranam's research program, spanning 36 academic papers and building on a tradition reaching back through March and Simon's foundational work on organisational decision-making, provides a theoretical framework that most AI system designers have never encountered. Its core insight: the same problems that govern how human organisations coordinate apply, with important caveats, to multi-agent AI systems. Systems that solve these problems while preserving human autonomy, competence, and relatedness will outperform those that optimise purely for coordination efficiency.
Four problems every organisation must solve
Puranam's research decomposes organisational design into four universal problems. The decomposition is not new. It draws on his earlier monograph The Microstructure of Organizations (2018) and ultimately on a tradition spanning March, Simon, Thompson, and Galbraith. What is new is the empirical and computational rigour Puranam brings to these problems, and the way his recent work connects them to AI system design. Every organisation, from a two-person startup to the European Central Bank, must solve all four. So must every multi-agent system.
The first is task division: how is the total work broken into subtasks? In a human organisation, this is the question of departmental structure, functional boundaries, and role definitions. A bank separates credit risk from market risk from operational risk. Each function develops deep expertise within its boundary. The cost is that cross-boundary interactions become harder to see. In a multi-agent system, task division determines each agent's analytical scope. An agent that covers "financial risk" broadly will catch cross-domain interactions but lack the depth to model liquidity spirals or collateral cascades with precision. An agent that covers only "collateral chain dynamics" will model those mechanisms beautifully but miss the macroeconomic trigger that sets them in motion.
The second is task allocation: who does what, and when? Human organisations solve this through reporting lines, project staffing, and capacity management. Multi-agent systems solve it through activation logic: which agents are invoked for a given scenario, and under what conditions. A sanctions scenario should activate geopolitical and financial contagion agents. A ransomware scenario should activate cyber threat and operational resilience agents. The allocation decision shapes what analytical perspectives reach the final assessment.
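As a minimal sketch, activation logic can be a mapping from scenario tags to agent sets, with certain agents always on. The agent and scenario names here are illustrative assumptions, not a real framework API:

```python
# Illustrative activation logic: which agents are invoked for a scenario.
# Agent and scenario names are assumptions for this sketch, not a real API.
SCENARIO_AGENTS = {
    "sanctions": {"geopolitical", "financial_contagion"},
    "ransomware": {"cyber_threat", "operational_resilience"},
}

def allocate(scenario_tags, always_on=frozenset({"red_team"})):
    """Return the set of agents to activate for the given scenario tags."""
    active = set(always_on)
    for tag in scenario_tags:
        active |= SCENARIO_AGENTS.get(tag, set())
    return active
```

A sanctions scenario would activate the geopolitical and financial contagion agents alongside the always-on adversarial agent; an unrecognised tag falls back to the always-on set alone.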
The third is reward provision: what incentivises effort and quality? For AI agents, this is the objective function: what counts as a good output? Structured evaluation criteria (explicit confidence, named mechanisms, evidence citations, acknowledged limitations) define quality. For the humans who review and act on agent outputs, the question becomes acute. Puranam's research on self-determination theory, which we will examine in detail later, shows that how the system frames the human's role determines whether that human exercises genuine judgment or degrades into a rubber stamp.
The fourth is information provision: who knows what, and who needs to know what? This is the core architectural question for multi-agent governance. How much information flows between agents, in what form, and at what point in the analytical process? Too much sharing too early causes convergence (the compliance scenario above). Too little sharing too late causes incompatible outputs that cannot be synthesised. The design of information flow determines whether the system produces genuine intelligence or expensive noise.
These four problems are not independent. A task division choice constrains what task allocation is possible. An information provision architecture shapes what quality signals are available. The interactions between these four design dimensions create a design space that is combinatorially large but not arbitrary. Theory narrows the viable options.
Figure: the four universal problems of organisational design, comparing human-organisation and multi-agent approaches.

Puranam's insight is that these four problems are universal: they apply to every form of collective action, whether the actors are humans, algorithms, or both. An engineering team, a multi-agent AI system, and a hybrid human-AI governance function all face the same four problems. The difference is not in the problems but in the mechanisms available to solve them. Human organisations can use hierarchy, culture, incentives, and informal networks. AI systems can use orchestration logic, output schemas, objective functions, and communication protocols.
The code framework and the cost of shared language
One of the most practically relevant findings in Puranam's program concerns how organisations communicate across internal boundaries. The research on organisational codes reveals a tension that maps directly to multi-agent system design.
Organisations develop shared vocabularies that compress knowledge. When a risk team says "tail correlation," those two words compress an entire analytical framework: the tendency for extreme events to cluster, the failure of Gaussian assumptions under stress, the need for copula models that capture tail dependence. This compression accelerates communication within the team. Two risk analysts can discuss tail correlation without re-deriving the concept from first principles.
But compression creates barriers. The same term means nothing to the legal team reviewing the risk assessment. Worse, it might mean something subtly different to the regulatory reporting team, which uses "correlation" in a statistical sense divorced from the tail dynamics that make the concept important for risk. Koçak and Puranam (2022) call this a label clash: the same word carrying different meanings across organisational boundaries. Their research identifies a more insidious variant called a stimulus clash, where the same real-world event triggers fundamentally different interpretations depending on which code a person operates within. An interest rate rise might trigger "tightening financial conditions" in the risk team's code and "improved net interest margin" in the treasury team's code. Both interpretations are correct within their respective frameworks. Neither is complete.
The empirical finding is striking: stimulus clashes are dramatically harder to resolve than label clashes. Relabelling is easy. Unlearning an established interpretation of a real-world event requires cognitive effort that most organisations underestimate.
This maps to multi-agent governance in a specific way. Each specialised agent develops a rich internal representation of its domain. A financial contagion agent reasons in terms of collateral chains, liquidity spirals, and counterparty exposure. A geopolitical agent reasons in terms of escalation dynamics, alliance structures, and sanctions cascades. These are codes in Puranam's technical sense: compressed domain representations that enable deep reasoning within a framework but resist translation across frameworks.
The naive approach to inter-agent communication is raw output sharing: agent A sends its full assessment to agent B. This fails because the codes are mutually opaque. The financial contagion agent's detailed analysis of collateral cascade dynamics is not interpretable by the geopolitical agent, which lacks the framework to evaluate whether the cascade mechanism is plausible.
The effective approach is what Puranam would call a broad shallow code: a shared vocabulary that enables integration without requiring each agent to understand every other agent's deep reasoning. In multi-agent governance, this takes the form of a standardised output schema. Every agent produces assessments using the same structure: a score with explicit uncertainty bounds, a named mechanism explaining the score, evidence citations, and a confidence classification. The schema does not require agents to understand each other's analytical frameworks. It requires them to express their conclusions in a format that a synthesis layer can compare and present to human decision-makers.
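A minimal version of such a schema, sketched as a Python dataclass. The field names are illustrative assumptions; the point is that every agent emits the same shallow structure regardless of its internal framework:

```python
from dataclasses import dataclass, field
from enum import Enum

class Confidence(Enum):          # coarse bands rather than false precision
    HIGH = "high"
    MEDIUM = "medium"
    LOW = "low"

@dataclass
class AgentAssessment:
    agent: str                   # which specialised agent produced this
    score: float                 # headline assessment score
    lower: float                 # explicit uncertainty bounds
    upper: float
    mechanism: str               # named mechanism explaining the score
    confidence: Confidence
    evidence: list[str] = field(default_factory=list)     # citations
    limitations: list[str] = field(default_factory=list)  # acknowledged gaps
```

The synthesis layer can compare AgentAssessment objects across agents without parsing any agent's internal reasoning, which is exactly what a broad shallow code is for.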
Koçak, Park, and Puranam (2022) add a counterintuitive refinement: adaptive ambiguity. Moderate fuzziness in cross-boundary communication can actually improve collaboration when parties lack code alignment. Too much precision exposes code differences and forces conflict. Too much ambiguity creates chaos. In multi-agent design, this suggests that output schemas should be structured enough to enable comparison but not so granular that they force artificial precision. A confidence classification of "High / Medium / Low" may be more useful than a decimal probability, because it creates productive space for interpretation rather than false precision that obscures genuine uncertainty.
A related finding complicates the picture. Park and Puranam (2024) demonstrate that sharing full reasoning chains can actually impede collective learning. When people understand the specific logic behind others' positions, they become more skilled at rationalisation: defending their original position without genuinely updating. The implication for multi-agent design is subtle: synthesis layers that expose full agent reasoning to other agents (or to fine-tuning feedback loops) risk creating precisely this rationalisation dynamic. The output schema works partly because it is shallow. It communicates conclusions without providing the ammunition for post-hoc justification.
Hierarchy versus culture as coordination mechanisms
The largest empirical study in Puranam's program, conducted with Marchetti and published in the Strategic Management Journal (2025), analysed 1.5 million Glassdoor reviews across 23,000 firms and 42 million employee profiles. The finding is both robust and counterintuitive: hierarchy and culture are functional equivalents. Both solve the coordination problem, but through different mechanisms. And they are substitutes, not complements.
Hierarchy achieves coordination through explicit authority, formal reporting lines, and centralised decision-making. A senior manager reviews subordinates' work, resolves conflicts between teams, and sets priorities. The mechanism is reliable and scales well for tasks that are well-defined, where the manager possesses the expertise to evaluate subordinates' work, and where speed of decision-making matters more than breadth of exploration.
Culture achieves coordination through shared values, norms, and mutual understanding. Team members make compatible decisions not because a manager told them to, but because they share enough common ground to independently reach similar conclusions about what the right action is. The mechanism scales better than hierarchy for novel situations where no precedent exists and no single manager possesses the expertise to evaluate all relevant dimensions.
The empirical result is clear: less hierarchical firms have significantly stronger culture. As organisations decentralise authority, they must invest proportionally more in cultural alignment to ensure that distributed decision-makers make compatible choices. Without this substitution, decentralisation produces chaos.
This finding was developed for human organisations, and applying it to multi-agent systems requires an explicit translation. When we say a multi-agent system is "culture-heavy," we are using the term metaphorically. Agents do not share values in any sociological sense. They share configuration: system prompts, output schemas, evaluation criteria. But the function is the same. In a hierarchy-heavy design, a central orchestrator assigns tasks, reviews outputs, and makes final decisions. In a culture-heavy design, agents share principles (analytical rigour, evidence-based reasoning, explicit uncertainty quantification) and operate autonomously within their domains. The coordination mechanism shifts from central authority to shared configuration and structured output norms.
The metaphor is imperfect but useful. Consider two architectures:
A hierarchy-heavy design features a central orchestrator agent that assigns tasks, reviews outputs, resolves conflicts, and makes final decisions. The specialised agents are tools in a pipeline. This works well for well-defined, repeatable tasks: checking a transaction against a known sanctions list, validating a data field against a schema. It fails for novel situations because the orchestrator becomes the bottleneck. It can only coordinate what it understands, and in genuinely novel scenarios, understanding is precisely what is missing.
A culture-heavy design features agents that share principles but operate autonomously within their domains. Coordination happens through shared output schemas and disagreement-preserving aggregation rather than central control. This works well for complex, novel scenarios where the analytical approach must emerge from the intersection of multiple frameworks.
MAGS sits toward the culture-heavy end. Its agents share governance principles: structured reasoning, explicit confidence levels, adversarial challenge. No orchestrator agent overrides an individual agent's assessment. The Red Team agent challenges the emerging consensus by design, not by hierarchical authority.
Figure: hierarchy vs. culture as a balancing act, with the beam tilting toward culture-heavy coordination.
The choice between hierarchy and culture is an engineering decision driven by problem structure. Well-defined problems with known analytical frameworks benefit from hierarchical coordination. Novel problems where the right framework is uncertain benefit from cultural coordination. Most real regulatory environments contain both types simultaneously, which suggests hybrid architectures: hierarchical coordination for known compliance checks, cultural coordination for novel scenario analysis.
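A hybrid design can be sketched as a simple router over problem structure. The task fields, threshold, and mode names here are assumptions for illustration:

```python
# Illustrative router for a hybrid architecture: well-defined, repeatable
# checks go through an orchestrator-led pipeline; novel scenarios go to
# autonomous agents whose coordination intensity depends on interdependence.
def coordination_mode(task):
    """Choose a coordination mechanism from problem structure."""
    if task.get("known_check"):            # e.g. sanctions-list lookup
        return "hierarchical"              # central orchestrator, fixed pipeline
    if task.get("interdependence", 0.0) >= 0.5:
        return "cultural_with_synthesis"   # autonomous search, tight synthesis
    return "cultural_independent"          # autonomous search, loose synthesis
```

The router encodes the engineering decision directly: problem structure, not a universal preference for one mechanism, selects the coordination mode.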
The search-coordination tradeoff
Puranam's research identifies a formal tension at the heart of every multi-agent system: the search-coordination tradeoff. In complex problem-solving, coordination enables agents to build on each other's work and avoid incompatible solutions. But coordination also constrains search. Agents that coordinate closely tend to converge on similar analytical frames, reducing the solution space explored.
Research on Kaggle competitions (Minervini, Puranam et al., 2026) provides empirical grounding. Collaborative search outperforms individual search, but only for problems with high interdependence among solution components. For decomposable problems where each component can be optimised independently, individual or parallel search works equally well. The value of coordination depends on problem structure, not on a universal principle that more coordination is better.
The optimal balance depends on two variables: problem complexity (how many interacting components, and how nonlinear are their interactions?) and interdependence (to what extent does one agent's solution constrain what solutions are feasible for other agents?).
For highly interdependent problems, coordination is more valuable. A financial contagion assessment that assumes ECB intervention and a geopolitical assessment that assumes ECB intervention is politically blocked cannot coexist without resolution. For problems where subtasks are relatively independent, coordination constrains search without adding value. The GDPR data retention analysis and the DORA incident reporting analysis have limited interaction. Coordinating them tightly forces unnecessary convergence.
Regulatory environments are nearly decomposable in Herbert Simon's sense. Some regulations genuinely interact: DORA's ICT third-party risk requirements intersect with NIS2's supply chain security requirements, because the same vendor relationship falls under both frameworks. Other regulations are largely independent.
The design principle: match coordination intensity to interdependence. Where regulations genuinely interact, coordinate tightly. Where they are independent, let agents search freely. Accept that outputs may diverge, because divergence in independent domains is not a problem to solve but a feature to preserve.
This translates to a two-phase architecture. During the analytical phase, agents work in isolation, exploring the problem space from their respective frameworks. During the synthesis phase, outputs are compared, disagreements are surfaced, and interdependencies are resolved. The isolation-then-synthesis pattern directly implements the search-coordination tradeoff: maximise search during the first phase, apply coordination during the second.
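The pattern can be sketched as follows, assuming each agent is a callable that takes the scenario and returns a dict with a numeric score; the names and review threshold are illustrative assumptions:

```python
from concurrent.futures import ThreadPoolExecutor

def run_assessment(agents, scenario, review_threshold=0.3):
    # Phase 1 (search): agents analyse in isolation -- no intermediate sharing.
    with ThreadPoolExecutor() as pool:
        outputs = list(pool.map(lambda agent: agent(scenario), agents))
    # Phase 2 (coordination): compare structured outputs, surface disagreement.
    scores = [o["score"] for o in outputs]
    spread = max(scores) - min(scores)
    return {
        "assessments": outputs,
        "disagreement": spread,        # the spread itself is a finding
        "needs_human_review": spread > review_threshold,
    }
```

Note that disagreement is returned as a first-class output rather than averaged away: the synthesis phase surfaces it for human review instead of resolving it silently.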
Why human-centric design is a performance requirement
The argument so far has been structural: how agents should divide work, communicate, and coordinate. But multi-agent governance systems exist within organisations staffed by humans. The assessments they produce must be reviewed, challenged, and acted upon by human decision-makers. This human interface is central to whether the system produces genuine value or expensive compliance theatre.
Puranam's (2025) research on self-determination theory provides the empirical foundation. Three non-monetary factors drive knowledge worker performance: autonomy (control over one's work and decisions), relatedness (connection to others and shared purpose), and competence (belief in one's ability to accomplish meaningful goals). These three factors are 4.6 times more important to engagement and retention than compensation.
Consider the compliance officer who receives a multi-agent assessment. If the system frames their role as "approve this output," it undermines autonomy. If the reasoning chain is opaque, it undermines competence. If the workflow positions them as a bottleneck rather than a partner, it undermines relatedness. The result is compliance theatre: humans who approve without understanding, who sign off on assessments they have not genuinely evaluated.
The design implications are concrete, but they extend beyond interface framing into workflow architecture:
Progressive disclosure of reasoning. Lead with the headline assessment and disagreement structure. Let the officer drill into individual agent reasoning only when the disagreement matters. This respects competence without creating information overload. Full transparency is not always better: Park and Puranam (2024) find that exposing complete reasoning chains can enable rationalisation rather than genuine updating. The right level of transparency is one that supports independent judgment without overwhelming it.
Genuine decision points, not manufactured ones. The system must present scenarios where human judgment resolves genuine ambiguity: where agents disagree, where confidence is low, where the scenario is novel enough that analytical frameworks may not apply. These decision points must be real. If the system always produces the same recommendation regardless of human input, officers learn quickly that their role is ceremonial.
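A sketch of how such routing might be implemented, assuming each assessment is a dict carrying a classification and a coarse confidence label (field names are illustrative):

```python
def needs_human_judgment(assessments, novel_scenario=False):
    """True only when human input resolves genuine ambiguity, not ceremony.

    Escalate when agents disagree on classification, when any agent
    reports low confidence, or when the scenario is flagged as novel.
    """
    classifications = {a["classification"] for a in assessments}
    low_confidence = any(a["confidence"] == "low" for a in assessments)
    return len(classifications) > 1 or low_confidence or novel_scenario
```

When this returns False, the system proceeds without ceremony; when True, the officer is handed a real decision, which is what keeps the role from becoming a rubber stamp.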
Workflow design, not just interface design. The framing "Three agents disagree on the classification of this product. Here is each agent's reasoning. What is your assessment?" positions the human as an expert. But framing alone is insufficient. The organisation must build training programmes, feedback mechanisms, and performance incentives that reward genuine analytical engagement rather than throughput. Technology enables but does not determine human behaviour, as Gulati, Marchetti, and Puranam (2026) demonstrate in their study of collaboration work management adoption across 3,017 firms: the technology creates possibilities, but organisational choices about how to deploy it matter more.
A related finding strengthens this argument. Klapper, Puranam and colleagues (2026) studied impervious actors in group decisions: members who maintain their position despite social pressure to conform. In many settings, these actors improve group decisions by disrupting conformity cascades, where an initial majority position suppresses minority dissent. The impervious actors need not be correct. Their willingness to maintain disagreement surfaces information that the majority initially dismissed.
In multi-agent governance, a dedicated adversarial agent serves an analogous function. Its system prompt instructs it to challenge the emerging consensus regardless of what other agents conclude. The analogy is imperfect: Klapper et al.'s impervious actors resist social pressure, a phenomenon that does not operate on LLMs in the same way. But the function is the same: a structural mechanism for preserving analytical diversity. The adversarial agent resists convergence not through social stubbornness but through architectural instruction. The mechanism differs; the outcome (preserved diversity) is equivalent.
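What "architectural instruction" means in practice is a system prompt. The wording below is an illustrative assumption, not a tested prompt; the point is that dissent is encoded in configuration rather than emerging from social dynamics:

```python
# Illustrative system prompt for a dedicated adversarial (Red Team) agent.
# The wording is an assumption; the mechanism it demonstrates is that
# imperviousness here is an instruction, not a personality trait.
RED_TEAM_PROMPT = """\
You are the Red Team agent in a multi-agent governance system.
Your role is structural, not social:
- Challenge the emerging consensus regardless of what other agents conclude.
- Propose at least one alternative mechanism that explains the same evidence.
- State explicitly which assumptions, if wrong, would invert the assessment.
Do not soften your challenge to match the majority position.
"""
```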
Vanneste and Puranam (2024) and the research on the Pinocchio Effect (2025) add a warning. When AI systems explain their decisions in natural language, users often accept recommendations without critical evaluation. Over time, humans become passive partners, outsourcing not just computation but judgment. The antidote is not just technology design but organisational culture: an environment where questioning an AI assessment is rewarded, and where human override of machine recommendations is treated as a feature rather than a failure.
Where the analogy breaks
The argument so far has drawn parallels between human organisational design and multi-agent system architecture. The parallels are real, but they have limits that practitioners should understand.
The most fundamental difference: agents are stateless; humans are not. Human organisations develop culture through repeated interaction, shared history, and social dynamics that unfold over months and years. Agents receive a system prompt and produce an output. What we call "culture" in a multi-agent system is a configuration file. It can be changed in a commit. It does not emerge organically, and it does not resist change the way human culture does. This makes multi-agent "culture" simultaneously more tractable (you can redesign it deliberately) and more fragile (it does not self-reinforce the way human culture does through social mechanisms).
Agents do not experience autonomy, competence, or relatedness. The SDT argument applies to the humans who interact with the system, not to the agents themselves. The "reward provision" problem for agents reduces to objective function design, which is a different (and frankly simpler) problem than motivating knowledge workers. The article's earlier mapping of SDT to agents was metaphorical. The mapping of SDT to the human operators is empirical and load-bearing.
Engineering constraints are real and binding. Running six specialised agents in parallel with LLM inference costs real money and real time. A typical multi-agent assessment with frontier models takes 15 to 45 seconds and costs five to ten times more than a single-agent call. For high-frequency use cases (transaction monitoring, real-time compliance screening), this latency and cost profile is prohibitive. Multi-agent governance is appropriate for complex, infrequent, high-stakes assessments where the cost of a wrong answer dwarfs the cost of compute. It is not appropriate for every compliance question.
Evaluation is the hard problem that theory does not solve. Organisational theory tells you to include an adversarial agent. It does not tell you how to evaluate whether that agent is genuinely challenging the consensus or merely generating plausible-sounding objections that the other agents have already addressed. Hallucination, confident citation of non-existent regulations, and performative disagreement are failure modes that have no analogue in human organisational theory. These are engineering problems that require engineering solutions: structured evaluation harnesses, automated fact-checking against regulatory databases, and calibration testing against scenarios with known ground truth.
Prompt fragility undermines the "culture" metaphor. In human organisations, culture is resilient because it is distributed across thousands of individual beliefs and reinforced through social interaction. In a multi-agent system, the shared "culture" lives in system prompts that can break when the base model is updated, when prompt formats change, or when a well-intentioned engineer edits a system prompt without understanding the downstream effects. The carefully designed governance principles can degrade silently. This argues for rigorous prompt versioning, regression testing, and monitoring that most current multi-agent systems lack.
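A minimal sketch of such a guard, assuming the shared "culture" is a dict of system prompts: pin a fingerprint of the full configuration and fail loudly when it drifts. The function names are illustrative:

```python
import hashlib
import json

def prompt_fingerprint(prompts):
    """Stable fingerprint over the full prompt configuration."""
    canonical = json.dumps(prompts, sort_keys=True).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()

def detect_drift(prompts, pinned_fingerprint):
    """True when the deployed configuration no longer matches the pinned one."""
    return prompt_fingerprint(prompts) != pinned_fingerprint
```

Fingerprinting catches silent edits to the configuration itself; it does not catch behavioural drift from a base-model update, which still requires regression tests against scenarios with known expected outputs.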
None of this invalidates the organisational theory framework. It constrains the domain of application. The four universal problems remain useful as a design checklist. The search-coordination tradeoff remains the right lens for information architecture decisions. The SDT framework remains the right guide for human interface design. But practitioners who treat the human-org-to-AI-agent analogy as exact rather than instructive will build systems that look elegant on paper and fail in production.
From theory to architecture
The value of Puranam's framework is not that it provides a blueprint. It provides a vocabulary for design decisions that multi-agent system builders are already making, often implicitly and without recognising the tradeoffs involved.
Every multi-agent system chooses a task division. The organisational theory lens asks: are you dividing by analytical domain, by regulatory framework, by time horizon, or by methodology? Each division creates different blind spots at different boundaries. Making the choice explicit makes the blind spots visible.
Every multi-agent system chooses an information architecture. The search-coordination tradeoff asks: are you sharing too much (convergence risk) or too little (incoherence risk)? The answer depends on the interdependence structure of the specific problem, not on a universal principle.
Every multi-agent system has a human interface. The SDT framework asks: does this interface build the officer's competence, preserve their autonomy, and treat them as a partner? If not, you are building compliance theatre, regardless of how sophisticated the agents are.
The compliance team from the opening scenario does not need better AI. It needs better architecture. Three independent assessments that disagree are more valuable than one coordinated assessment that converges on the wrong answer, specifically when the problem is ambiguous enough that the disagreement carries information. Knowing when to coordinate and when to preserve disagreement is not an engineering optimisation. It is an organisational design decision with sixty years of theory behind it.
References
- Puranam, P. (2018). The Microstructure of Organizations. Oxford University Press.
- Koçak, Ö. & Puranam, P. (2022). Separated by a common language: Code differences and organizational collaboration. Management Science, 68(9), 6345-6367.
- Koçak, Ö., Park, S. & Puranam, P. (2022). Ambiguity can compensate for semantic differences in organizational collaboration. Management Science, 68(7), 1234-1251.
- Koçak, Ö. & Puranam, P. (2024). The power of Babel: Mechanisms for managing organizational multilingualism. Academy of Management Review, 49(2), 156-178.
- Koçak, Ö., Puranam, P. & Yegin, A. (2023). Moral versus causal code differences: Why moral misalignment produces stronger conflict. Frontiers in Psychology, 14, 1-18.
- Marchetti, A. & Puranam, P. (2025). Hierarchy and culture as functional equivalents: Evidence from 1.5 million Glassdoor reviews. Strategic Management Journal, 46(4), 567-595.
- Park, S. & Puranam, P. (2024). Vicarious learning without knowledge differentials: Belief disturbance as a mechanism of collective learning. Management Science, 70(3), 1456-1475.
- Puranam, P. (2025). The business case for human-centric organizing: Self-determination theory in the age of AI. Strategic Organization, 23(1), 5-28.
- Klapper, R., Puranam, P. et al. (2026). Impervious actors and the disruption of conformity cascades. Administrative Science Quarterly.
- Minervini, M., Puranam, P. et al. (2026). Collaborative search versus individual search in complex problem spaces. Academy of Management Journal, 69(2), 234-257.
- He, X., Puranam, P. et al. (2024). Decision centralization and group learning dynamics. Organization Science.
- Gulati, R., Marchetti, A. & Puranam, P. (2026). Collaboration work management technology and organizational decentralization. Strategic Management Journal.
- Vanneste, S. & Puranam, P. (2024). Agency, trust, and control in human-AI decision-making systems. Administrative Science Quarterly, 69(1), 1-35.
- Sen, S., Puranam, P. et al. (2026). Large language models and analogical reasoning in strategy. Management Science, 72(1), 89-107.
- Choudhary, A., Puranam, P. et al. (2023). Human-AI ensembles: Leveraging complementary error patterns.