BeClaude
Research · 2026-04-22

Right for the Wrong Reasons: Epistemic Regret Minimization for LLM Causal Reasoning

Source: Arxiv CS.AI

arXiv:2602.11675v3 Announce Type: replace Abstract: Large language models may answer causal questions correctly for the wrong reasons, substituting associational shortcuts P(Y|X) for the interventional query P(Y|do(X)). Current RL methods reward what the model answers but not why, reinforcing these...
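The gap between the associational query P(Y|X) and the interventional query P(Y|do(X)) that the abstract describes can be illustrated with a toy simulation (a hypothetical sketch, not the paper's setup): a hidden confounder Z drives both X and Y, so conditioning on observed X picks up association that forcing X via do(X) does not.

```python
import random

random.seed(0)
N = 100_000

def sample(do_x=None):
    """One draw from a confounded model: Z -> X, Z -> Y, no X -> Y edge."""
    z = random.random() < 0.5                          # hidden confounder
    if do_x is None:
        x = random.random() < (0.8 if z else 0.2)      # Z influences X
    else:
        x = do_x                                       # intervention cuts the Z -> X edge
    y = random.random() < (0.9 if z else 0.1)          # Y depends only on Z
    return x, y

# Observational P(Y=1 | X=1): condition on X=1 in passively collected data
obs = [y for x, y in (sample() for _ in range(N)) if x]
p_obs = sum(obs) / len(obs)

# Interventional P(Y=1 | do(X=1)): force X=1 regardless of Z
intv = [y for _, y in (sample(do_x=True) for _ in range(N))]
p_do = sum(intv) / len(intv)

print(f"P(Y=1 | X=1)     = {p_obs:.2f}")   # confounded association, ~0.74
print(f"P(Y=1 | do(X=1)) = {p_do:.2f}")    # true causal effect, ~0.50
```

Here X has no causal effect on Y at all, yet the observational conditional is far from 0.5; a model rewarded only for matching observed answers could latch onto exactly this kind of shortcut.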

Tags: arxiv · papers · reasoning