Diverse Evidence, Better Forecasts: Multi-Agent Deliberation Under Information Asymmetry
arXiv:2607.01661v1. Abstract: Multi-agent systems are increasingly used for forecasting future events, as deliberation among multiple LLMs is believed to improve reasoning and calibration. Yet existing approaches overlook a critical design choice: what information each agent...
What Happened
A new preprint on arXiv (2607.01661v1) examines a largely overlooked variable in multi-agent LLM forecasting: information asymmetry. Prior work has focused on how multiple LLMs can deliberate to improve reasoning and calibration; this research instead asks what happens when each agent in the ensemble receives different evidence before deliberating. The core finding is that deliberately distributing diverse, asymmetric information across agents leads to more accurate forecasts than giving all agents identical information, even when that shared information is high-quality.
The paper formalizes a multi-agent deliberation framework where each LLM agent sees a unique subset of available evidence, then engages in structured discussion before producing a final forecast. This mirrors how human expert panels often work: each member brings specialized knowledge, and the collective judgment emerges from pooling these distinct perspectives rather than from redundant consensus.
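The flow described above (private evidence, structured discussion, final forecast) can be sketched as a toy simulation. This is an illustration under stated assumptions, not the paper's implementation: the `Agent` class, the `deliberate` function, the one-item-per-round sharing rule, and the reduction of evidence to numeric probability signals are all hypothetical stand-ins for real LLM calls.

```python
from dataclasses import dataclass, field
from statistics import mean
from typing import Dict, List, Optional, Tuple


@dataclass
class Agent:
    """Hypothetical stand-in for an LLM agent holding private evidence."""
    name: str
    private: Dict[str, float]  # evidence id -> probability signal in [0, 1]
    known: Dict[str, float] = field(init=False)

    def __post_init__(self) -> None:
        # At the start, an agent knows only its own private evidence.
        self.known = dict(self.private)

    def forecast(self) -> float:
        # Toy belief rule (assumption): average every signal seen so far.
        return mean(self.known.values())

    def cite(self, already_shared: set) -> Optional[Tuple[str, float]]:
        # Reveal one private item per turn that peers have not yet seen.
        for eid in sorted(self.private):
            if eid not in already_shared:
                return eid, self.private[eid]
        return None


def deliberate(agents: List[Agent], rounds: int = 2) -> float:
    """Run sharing rounds, then return the mean of the agents' forecasts."""
    shared: Dict[str, float] = {}
    for _ in range(rounds):
        for agent in agents:
            item = agent.cite(set(shared))
            if item is not None:
                shared[item[0]] = item[1]
        # Every agent updates on everything revealed so far.
        for agent in agents:
            agent.known.update(shared)
    return mean(agent.forecast() for agent in agents)


panel = [
    Agent("a", {"e1": 0.8, "e2": 0.6}),
    Agent("b", {"e3": 0.2, "e4": 0.4}),
]
initial = [agent.forecast() for agent in panel]
final = deliberate(panel, rounds=2)
print(initial, final)
```

In this toy setup the two agents start with quite different forecasts because their evidence differs, and they move toward each other as items are revealed. A real system would replace `forecast` and `cite` with model calls and a richer discussion format.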
Why It Matters
This research addresses a fundamental weakness in current multi-agent LLM systems: the tendency toward echo chambers and confirmation bias. When all agents share the same prompt and data, their deliberations often converge on a narrow range of answers, amplifying shared errors rather than correcting them. By introducing controlled information asymmetry, the system forces agents to challenge each other with facts their peers haven't seen, producing more robust and calibrated forecasts.
The implications are significant for any domain where forecasting accuracy matters, from financial markets and supply chain planning to epidemiological modeling and geopolitical risk assessment. A common practice today is to run the same LLM multiple times with identical inputs and average the outputs. This work suggests that a more deliberate approach to evidence distribution could yield better results without requiring more powerful models or additional compute.
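For reference, the averaging baseline mentioned above, together with one standard way to score probabilistic forecasts, might look like the following. The Brier score is a widely used metric for binary-event forecasts; whether the paper reports it is not stated in this summary, so treat its use here as an assumption. The function names are illustrative.

```python
from statistics import mean
from typing import List


def average_forecast(probabilities: List[float]) -> float:
    """Baseline aggregation: plain mean of several runs' probabilities."""
    return mean(probabilities)


def brier_score(forecasts: List[float], outcomes: List[int]) -> float:
    """Mean squared error between forecast probabilities and 0/1 outcomes.

    Lower is better; 0.0 is a perfect score.
    """
    if len(forecasts) != len(outcomes):
        raise ValueError("forecasts and outcomes must have the same length")
    return mean((p - o) ** 2 for p, o in zip(forecasts, outcomes))


# Three runs of the same model on one question, averaged into one forecast.
combined = average_forecast([0.6, 0.7, 0.8])

# Score a set of forecasts against what actually happened (1 = event occurred).
score = brier_score([0.5, 0.5], [1, 0])
print(combined, score)
```

Comparing this score for an identical-input ensemble against an asymmetric-evidence panel on the same questions is one simple way to check whether the approach helps in a given domain.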
For AI practitioners, the study also highlights a practical insight: the design of the deliberation protocol matters as much as the quality of the individual models. How agents share information, when they share it, and what they are allowed to reveal about their own evidence sources all affect the final outcome. The paper provides a framework for optimizing these parameters.
Implications for AI Practitioners
First, practitioners building multi-agent forecasting systems should move beyond simple majority voting or averaging. Instead, they should implement structured deliberation protocols where each agent receives a carefully curated subset of available evidence. This requires upfront work in partitioning data sources, but the payoff in accuracy appears substantial.
Second, the research suggests that transparency about evidence provenance is critical. Agents should be able to cite which specific data points support their claims during deliberation, enabling other agents to assess credibility and relevance. This is analogous to how human experts cite sources in collaborative forecasting.
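One way to make provenance checkable is to attach evidence identifiers to each claim and verify them against what the agent was actually given. The sketch below is a hypothetical data structure, not something taken from the paper: `Claim`, `unknown_citations`, and `is_supported` are illustrative names.

```python
from dataclasses import dataclass
from typing import Dict, List, Set, Tuple


@dataclass(frozen=True)
class Claim:
    """A statement made during deliberation, with the evidence it cites."""
    agent: str
    statement: str
    cited_ids: Tuple[str, ...]


def unknown_citations(claim: Claim, evidence_by_agent: Dict[str, Set[str]]) -> List[str]:
    """Return cited ids that were never assigned to the claiming agent."""
    held = evidence_by_agent.get(claim.agent, set())
    return [eid for eid in claim.cited_ids if eid not in held]


def is_supported(claim: Claim, evidence_by_agent: Dict[str, Set[str]]) -> bool:
    """A claim counts as supported if it cites something, all of it verifiable."""
    return bool(claim.cited_ids) and not unknown_citations(claim, evidence_by_agent)


assignments = {"a": {"e1", "e2"}, "b": {"e3"}}
good = Claim("a", "Demand is rising.", ("e1",))
bad = Claim("b", "Demand is falling.", ("e3", "e9"))
print(is_supported(good, assignments), unknown_citations(bad, assignments))
```

Peers (or an orchestrating process) can then discount claims that cite nothing or cite evidence the agent never held.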
Third, the work implies that the optimal number of agents and the degree of information overlap are tunable hyperparameters. Too much overlap leads to redundancy; too little leads to fragmentation. Practitioners will need to experiment with these settings for their specific use cases.
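The overlap setting can be made an explicit parameter of the partitioning step. The following is a minimal sketch assuming a simple round-robin split in which each agent also receives a few items from its neighbour's share; the function name and the neighbour-based scheme are illustrative choices, not the paper's method.

```python
from typing import List


def partition_with_overlap(
    evidence_ids: List[str], n_agents: int, overlap: int = 0
) -> List[List[str]]:
    """Split evidence round-robin, then copy `overlap` items from the next agent.

    overlap=0 gives fully disjoint subsets; larger values increase redundancy.
    """
    if n_agents < 1:
        raise ValueError("n_agents must be at least 1")
    if overlap < 0:
        raise ValueError("overlap must be non-negative")

    # Disjoint base shares: agent i gets items i, i + n_agents, ...
    base = [evidence_ids[i::n_agents] for i in range(n_agents)]

    subsets = []
    for i, own in enumerate(base):
        neighbour = base[(i + 1) % n_agents]  # wraps around to the first agent
        extra = [eid for eid in neighbour[:overlap] if eid not in own]
        subsets.append(own + extra)
    return subsets


ids = ["e1", "e2", "e3", "e4", "e5", "e6"]
disjoint = partition_with_overlap(ids, n_agents=3, overlap=0)
shared_one = partition_with_overlap(ids, n_agents=3, overlap=1)
print(disjoint)
print(shared_one)
```

Sweeping `n_agents` and `overlap` over a small grid and scoring each configuration on held-out questions is one straightforward way to tune these settings.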
Finally, this approach naturally supports adversarial or devil's advocate configurations, where one agent is deliberately given contradictory evidence to test the robustness of emerging consensus.
Key Takeaways
- Information asymmetry improves forecasting: Giving different LLM agents different evidence subsets before deliberation produces more accurate and better-calibrated forecasts than giving every agent identical information.
- Deliberation protocol design is a critical optimization lever: How agents share and challenge evidence matters as much as model quality or ensemble size.
- Practitioners should partition evidence deliberately: Pre-processing data into distinct subsets for each agent, with the degree of overlap treated as a tunable setting, can yield better results than feeding all data to all agents.
- Transparency and evidence citation enhance deliberation quality: Allowing agents to reference specific data points during discussion improves the robustness of the final forecast.