Research | 2026-05-06
How Reasoning Evolves from Post-Training Data: An Empirical Study Using Chess
Source: arXiv cs.AI
arXiv:2604.05134v2. Abstract: We study how reasoning evolves in a language model -- from supervised fine-tuning (SFT) to reinforcement learning (RL) -- by analyzing how a set of theoretically-inspired datasets influences language model performance in chess. We find that...
Tags: arxiv, papers, reasoning