BeClaude
Research · 2026-04-27

CRAFT: Clustered Regression for Adaptive Filtering of Training data

Source: arXiv cs.AI

arXiv:2604.22693v1 (cross-listed)

Abstract: Selecting a small, high-quality subset from a large corpus for fine-tuning is increasingly important as corpora grow to tens of millions of datapoints, making full fine-tuning expensive and often unnecessary. We propose CRAFT (Clustered Regression...

arxiv, papers