Research | 2026-05-14

Multi-Dimensional Behavioral Evaluation of Agentic Stock Prediction Systems Using Large Language Model Judges with Closed-Loop Reinforcement Learning Feedback

arXiv:2605.05739v2. Abstract: Forecast evaluation in finance has relied on aggregate accuracy metrics and predictive-accuracy tests built on point-forecast errors. These instruments evaluate forecast outputs but cannot evaluate the process of forecast generation, which...

Source: arXiv cs.AI

Tags: arxiv, papers, agents, rl