BeClaude
Research · 2026-05-12

Expert Evaluation and the Limits of Human Feedback in Mental Health AI Safety Testing

Source: Arxiv CS.AI

arXiv:2601.18061v3 Announce Type: replace Abstract: Learning from human feedback (LHF) assumes that expert judgments, appropriately aggregated, yield valid ground truth for training and evaluating AI systems. We tested this assumption in mental health, where high safety stakes make expert consensus...

Tags: arxiv · papers · safety