Contagion Networks: Evaluator Bias Propagation in Multi-Agent LLM Systems
arXiv:2606.20493v1. Abstract: When large language models serve as evaluators in multi-agent systems, their systematic evaluation biases propagate through the agent network. We introduce Contagion Networks, a formal framework for measuring how evaluator biases spread across...
The Silent Spread: How Bias Cascades Through Multi-Agent LLM Systems
A new arXiv preprint (2606.20493) introduces "Contagion Networks," a formal framework for measuring how evaluation biases propagate when LLMs serve as judges in multi-agent systems. The research points to a critical vulnerability: when one agent’s biased evaluation influences another agent’s output, that bias does not remain isolated. It spreads through the network like a contagion, potentially amplifying and distorting results across the entire system.
The core insight is deceptively simple. In current multi-agent architectures, agents often evaluate each other’s work—checking code, reviewing summaries, or assessing reasoning steps. If Agent A has a systematic preference for verbose responses, it will reward verbose outputs from Agent B. Agent B, receiving positive feedback, becomes more verbose in its next interaction with Agent C. The bias compounds. The framework provides mathematical tools to model this cascade, identifying which network topologies (star, chain, fully connected) are most susceptible to bias amplification and which are more resilient.
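The compounding described above can be illustrated with a toy linear model. This is only a sketch under stated assumptions, not the preprint's formalism: `own_bias` (each agent's intrinsic bias) and `gain` (the fraction of upstream bias an agent absorbs) are hypothetical parameters introduced here for illustration.

```python
def chain_bias(own_bias, gain):
    """Accumulated bias at each position in a chain of agents.

    own_bias: list of intrinsic biases, one per agent (hypothetical units).
    gain: fraction of the upstream agent's accumulated bias that the next
          agent absorbs (an assumed parameter, not taken from the paper).
    """
    accumulated = []
    upstream = 0.0
    for bias in own_bias:
        # Each agent carries its own bias plus a share of what it inherited.
        upstream = bias + gain * upstream
        accumulated.append(upstream)
    return accumulated


if __name__ == "__main__":
    # Agent A, B, and C each have the same small intrinsic preference.
    print(chain_bias([0.1, 0.1, 0.1], gain=0.5))
    print(chain_bias([0.1, 0.1, 0.1], gain=1.2))
```

In this toy model, a gain below 1 means a single upstream bias fades with distance, while a gain above 1 means it grows at every hop. Whether real agent networks behave linearly is an open question that the sketch does not address.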
Why This Matters
This research strikes at the heart of a growing industry practice. As organizations deploy multi-agent systems for complex tasks—from automated software development to financial analysis—they increasingly rely on LLM-as-judge evaluation loops. The assumption has been that multiple agents provide a form of democratic oversight. The Contagion Networks framework challenges this: without careful design, multi-agent evaluation can actually increase systematic error rather than reduce it.
The implications are particularly acute for:
- Automated code review systems, where one agent’s preference for certain coding styles can propagate across the development pipeline
- Content moderation pipelines, where initial bias in content flagging can cascade through appeal systems
- Scientific literature analysis, where evaluation preferences can systematically skew research synthesis
Implications for AI Practitioners
First, network topology is not neutral. A star topology (one central evaluator) concentrates bias risk, while a fully connected network may actually accelerate bias propagation. Practitioners need to analyze how vulnerable their specific network structure is to contagion.
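One way to explore topology effects before deployment is a small simulation. The sketch below is an assumption-laden toy: the helper names `build_topology` and `propagate`, the rule that an agent absorbs the average bias of its evaluators, and the `gain` parameter are all hypothetical and are not taken from the preprint.

```python
def build_topology(kind, n):
    """Map each agent to the list of agents whose evaluations influence it."""
    if kind == "chain":
        # Agent i is evaluated only by agent i - 1.
        return {i: ([i - 1] if i > 0 else []) for i in range(n)}
    if kind == "star":
        # Agent 0 is the central evaluator for every other agent.
        return {i: ([0] if i > 0 else []) for i in range(n)}
    if kind == "full":
        # Every agent is evaluated by every other agent.
        return {i: [j for j in range(n) if j != i] for i in range(n)}
    raise ValueError(f"unknown topology: {kind}")


def propagate(influencers, seed_bias, gain, rounds):
    """Iterate an assumed update rule: own bias + gain * mean evaluator bias."""
    bias = dict(seed_bias)
    for _ in range(rounds):
        updated = {}
        for agent, sources in influencers.items():
            if sources:
                inherited = sum(bias[s] for s in sources) / len(sources)
            else:
                inherited = 0.0
            updated[agent] = seed_bias[agent] + gain * inherited
        bias = updated
    return bias


if __name__ == "__main__":
    # Only agent 0 starts out biased.
    seed = {0: 1.0, 1: 0.0, 2: 0.0, 3: 0.0}
    for kind in ("chain", "star", "full"):
        print(kind, propagate(build_topology(kind, 4), seed, gain=0.5, rounds=3))
```

In this toy setup with a gain below 1, the chain attenuates bias hop by hop, the star hands every leaf the same share of the hub's bias, and the fully connected network feeds some bias back to its source. Real systems may behave differently, so the output is a prompt for analysis rather than a prediction.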
Second, diverse evaluator models are not sufficient on their own. Using different base LLMs as evaluators doesn’t automatically prevent bias propagation if those models share training data or alignment techniques. The framework suggests that true bias mitigation requires structural interventions, such as decoupling evaluation from generation or introducing random audit nodes.
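A random audit node could be wired in as a simple routing step. The following is a minimal sketch, assuming evaluators are plain callables that map text to a score; `choose_evaluator`, `audit_rate`, and both example evaluators are hypothetical names introduced here, not an interface from the preprint.

```python
import random


def choose_evaluator(peer_evaluator, audit_evaluator, audit_rate, rng):
    """Route an evaluation to an independent auditor with probability audit_rate."""
    if rng.random() < audit_rate:
        return audit_evaluator
    return peer_evaluator


def length_biased_peer(text):
    """Hypothetical peer evaluator that rewards longer responses."""
    return min(10.0, len(text.split()) / 2)


def fixed_rubric_auditor(text):
    """Hypothetical auditor placeholder that ignores response length."""
    return 5.0


if __name__ == "__main__":
    rng = random.Random(0)  # seeded so that runs are repeatable
    audited = 0
    for _ in range(1000):
        evaluator = choose_evaluator(
            length_biased_peer, fixed_rubric_auditor, audit_rate=0.1, rng=rng
        )
        if evaluator is fixed_rubric_auditor:
            audited += 1
    print(f"share of evaluations audited: {audited / 1000:.3f}")
```

Keeping the auditor outside the generation loop is the design point: its scores are never fed back to the agents it audits, so it cannot pick up their drift.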
Third, measurement must precede mitigation. The paper provides formal metrics for quantifying bias propagation rates. Teams should implement these measurements as continuous monitoring tools, not one-time assessments.
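The preprint's own metrics are not reproduced here. As a stand-in, the sketch below shows what a continuous check might look like under simple assumptions: bias at each hop is estimated as the mean score gap between a treated set (for example, verbose responses) and a control set, and a hop-to-hop ratio above a threshold is flagged. The function names and the ratio-based definition are hypothetical.

```python
from statistics import mean


def bias_estimate(scores_treated, scores_control):
    """Mean score gap between treated and control responses at one hop."""
    return mean(scores_treated) - mean(scores_control)


def propagation_rates(hop_biases):
    """Ratio of each hop's bias to the previous hop's bias (assumed metric)."""
    rates = []
    for upstream, downstream in zip(hop_biases, hop_biases[1:]):
        if upstream == 0:
            rates.append(float("nan"))  # ratio is undefined without upstream bias
        else:
            rates.append(downstream / upstream)
    return rates


def flag_amplification(rates, threshold=1.0):
    """Indices of hop transitions where bias grew beyond the threshold."""
    return [i for i, rate in enumerate(rates) if rate > threshold]


if __name__ == "__main__":
    # Illustrative bias estimates for three consecutive hops.
    hop_biases = [0.2, 0.1, 0.15]
    rates = propagation_rates(hop_biases)
    print(rates, flag_amplification(rates))
```

Running a check like this on a schedule, rather than once at launch, matches the continuous-monitoring recommendation above.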
Key Takeaways
- Multi-agent LLM systems can amplify evaluation biases through network effects, turning minor preferences into systemic distortions
- Network topology significantly influences bias propagation speed and magnitude—star and chain topologies are particularly vulnerable
- Current best practices (using multiple evaluators or different models) are insufficient without structural interventions against contagion
- Practitioners should implement formal bias propagation metrics as part of their multi-agent system monitoring, not just output accuracy metrics