BeClaude
Research2026-05-14

RTLC -- Research, Teach-to-Learn, Critique: A three-stage prompting paradigm inspired by the Feynman Learning Technique that lifts LLM-as-judge accuracy on JudgeBench with no fine-tuning

Source: Arxiv CS.AI

arXiv:2605.13695v1 Announce Type: cross Abstract: LLM-as-a-judge is now the default measurement instrument for open-ended generation, but on the public JudgeBench benchmark even strong instruction-tuned judges barely scrape past random on objective-correctness pairwise items. We introduce RTLC, a...

arxivpapersfine-tuningprompting