Research 2026-06-19

LLM Doesn't Know What It Doesn't Know: Detecting Epistemic Blind Spots via Cross-Model Attribution Divergence on Clinical Tabular Data

arXiv:2606.19509v1. Abstract: Large language models (LLMs) are increasingly applied to structured clinical data, yet whether they can recognize the limits of their own knowledge on such tasks remains unexplored. We study this question through the lens of cross-model attribution...

What Happened

Researchers have identified a critical blind spot in large language models applied to clinical tabular data: LLMs cannot reliably detect what they do not know. The paper introduces a method called Cross-Model Attribution Divergence (CMAD) to expose these epistemic gaps. By comparing attribution patterns across different models on the same clinical dataset, the authors demonstrate that LLMs often produce confident but incorrect predictions on structured medical data—without any internal signal of uncertainty.

The study focuses on tabular clinical data (e.g., patient records, lab results), a domain where errors directly affect patient care. When an LLM misclassifies a patient risk score or misinterprets a lab value, it rarely flags its own error. CMAD measures how much different models disagree in the importance they attribute to each input feature. High divergence correlates with a higher likelihood of error, so the divergence score can serve as a proxy for the model's epistemic uncertainty.
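
As a rough illustration of the divergence idea, the sketch below compares two models' feature attributions for a single patient record. The source does not state which attribution method or divergence measure CMAD uses, so the Jensen-Shannon divergence over normalized absolute attributions, the function names, and the feature values are all assumptions made for illustration, not the authors' method.

```python
import math


def normalize(attributions):
    """Convert raw attribution scores into a distribution over features.

    Uses absolute values so the result is non-negative and sums to 1.
    """
    magnitudes = [abs(a) for a in attributions]
    total = sum(magnitudes)
    if total == 0:
        # No feature received any importance: fall back to a uniform profile.
        return [1.0 / len(magnitudes)] * len(magnitudes)
    return [m / total for m in magnitudes]


def kl_divergence(p, q):
    """KL(p || q) in bits; terms where p_i == 0 contribute nothing."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def attribution_divergence(attr_a, attr_b):
    """Jensen-Shannon divergence between two attribution profiles.

    Returns 0 for identical profiles and 1 for profiles with no overlap.
    """
    if len(attr_a) != len(attr_b):
        raise ValueError("both models must attribute over the same features")
    p, q = normalize(attr_a), normalize(attr_b)
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


# Hypothetical attributions for one record (features: lactate, heart_rate, age).
model_a = [0.70, 0.20, 0.10]
model_b = [0.05, 0.15, 0.80]
print(f"attribution divergence: {attribution_divergence(model_a, model_b):.3f}")
```

Taking absolute values discards the sign of each attribution, so two models that agree on which features matter but disagree on direction would score as identical. A signed comparison would be a reasonable alternative.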

Why It Matters

This research addresses a fundamental limitation of LLMs that is often glossed over in clinical AI discussions. Most existing work on LLM calibration focuses on natural language tasks—question answering, summarization, or dialogue—where the cost of error is lower. Clinical tabular data is different: a false negative on a sepsis predictor or a misclassified comorbidity can have life-or-death consequences.

The key insight is that LLMs do not possess an internal "confidence meter" that correlates with actual correctness on structured data. They can be confidently wrong, and worse, they cannot tell you when they are operating outside their knowledge boundaries. The CMAD approach offers a practical, model-agnostic way to detect these blind spots without requiring access to training data or model internals.

For AI safety in healthcare, this is a warning. Regulatory frameworks such as the EU AI Act and FDA guidance increasingly expect high-risk AI systems to document their accuracy and limitations. Current LLM deployment strategies in clinical settings often assume that a model's output probabilities reflect true uncertainty. According to this paper, that assumption is dangerously flawed for tabular data.
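
Whether output probabilities track correctness can be checked on a labeled validation set. One common check, which is not taken from the paper, is expected calibration error (ECE): bucket predictions by stated confidence and compare each bucket's average confidence with its observed accuracy. The sketch below is a minimal version; the function name, bin count, and toy inputs are assumptions.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Weighted mean gap between average confidence and accuracy per bin.

    confidences: predicted probabilities in [0, 1] for the chosen class.
    correct:     booleans, True where the prediction matched the label.
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    if not confidences:
        raise ValueError("at least one prediction is required")

    # Assign each prediction to an equal-width confidence bin.
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)  # conf == 1.0 goes in the last bin
        bins[idx].append((conf, ok))

    total = len(confidences)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += (len(members) / total) * abs(avg_conf - accuracy)
    return ece


# Toy example: a model that states 95% confidence but is right only half the time.
stated = [0.95, 0.95, 0.95, 0.95]
was_right = [True, False, True, False]
print(f"ECE: {expected_calibration_error(stated, was_right):.2f}")
```

A low ECE only says that confidence matches accuracy on average within each bin. It does not identify which individual predictions are wrong, which is the gap a per-prediction signal such as attribution divergence is meant to fill.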

Implications for AI Practitioners

For clinical AI developers: Do not rely on LLM confidence scores or logit-based uncertainty for structured medical data. Implement CMAD or similar cross-model divergence checks as a mandatory pre-deployment filter. Any prediction where attribution patterns diverge significantly across models should be flagged for human review.

For MLOps teams: This work highlights the need for multi-model ensembles not just for accuracy, but for uncertainty detection. Deploying a single LLM on clinical tabular data is risky; running two or more models in parallel and measuring attribution divergence can serve as a cheap, effective guardrail.

For researchers: The CMAD methodology is promising but needs validation on real-world clinical datasets with ground truth labels. The current study appears to be proof-of-concept; practitioners should wait for replication studies before relying on it in production. Additionally, the computational overhead of computing attributions across multiple models may be non-trivial for real-time clinical decision support.
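
The human-review flagging suggested for developers and MLOps teams could look roughly like the sketch below. It assumes a per-prediction divergence score has already been computed by some cross-model comparison. The function names, the tuple layout, and the choice of a validation-set quantile as the cut-off are hypothetical, not something the source specifies.

```python
def threshold_from_validation(divergences, review_fraction=0.1):
    """Pick a cut-off that sends roughly the top `review_fraction` of
    validation-set divergence scores to human review."""
    if not divergences:
        raise ValueError("need at least one validation score")
    if not 0 < review_fraction <= 1:
        raise ValueError("review_fraction must be in (0, 1]")
    ranked = sorted(divergences, reverse=True)
    n_review = max(1, round(review_fraction * len(ranked)))
    return ranked[n_review - 1]


def route(predictions, threshold):
    """Split predictions into auto-accepted and human-review queues.

    predictions: iterable of (record_id, prediction, divergence) tuples.
    Scores at or above the threshold go to review.
    """
    accepted, review = [], []
    for record_id, prediction, divergence in predictions:
        target = review if divergence >= threshold else accepted
        target.append((record_id, prediction))
    return accepted, review


# Hypothetical validation scores and incoming predictions.
validation_scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
cutoff = threshold_from_validation(validation_scores, review_fraction=0.2)

incoming = [
    ("rec-001", "high risk", 0.95),
    ("rec-002", "low risk", 0.20),
]
accepted, review = route(incoming, cutoff)
print("accepted:", accepted)
print("needs review:", review)
```

Tying the cut-off to a review fraction keeps the reviewer workload predictable, but it does not guarantee that the flagged cases are the erroneous ones. That relationship would need to be checked against labeled outcomes.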

Key Takeaways

LLMs applied to clinical tabular data cannot reliably detect their own knowledge gaps, producing confident errors without internal uncertainty signals.
Cross-Model Attribution Divergence (CMAD) offers a practical method to expose epistemic blind spots by comparing feature attribution patterns across different models.
Clinical AI practitioners should not trust LLM confidence scores on structured data; implement cross-model divergence checks as a safety filter.
This research underscores the urgent need for better uncertainty quantification methods before deploying LLMs in high-stakes medical settings.

Read the original article on arXiv (cs.AI)
