Research | 2026-05-14

Data Difficulty and the Generalization–Extrapolation Tradeoff in LLM Fine-Tuning

arXiv:2605.12906v1 (announce type: cross)

Abstract: Data selection during supervised fine-tuning (SFT) can critically change the behavior of large language models (LLMs). Although existing work has studied the effect of selecting data based on heuristics such as perplexity, difficulty, or length, the...

Read the original article on arXiv cs.AI

Tags: arxiv, papers, fine-tuning