BeClaude Research
2026-05-11

When Losses Align: Gradient-Based Composite Loss Weighting for Efficient Pretraining

Source: Arxiv CS.AI

arXiv:2605.07756v1 (Announce Type: cross)

Abstract: Modern deep models are often pretrained on large-scale data with missing labels using composite objectives, where the relative weights of the individual loss terms act as hyperparameters. Tuning these weights with random search or Bayesian optimization is...
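The abstract cuts off before describing the method, so the paper's actual algorithm is not shown here. As a rough illustration of what "gradient-based" loss weighting can mean in general, the sketch below weights each loss term by how well its parameter gradient aligns (cosine similarity) with the mean gradient direction; the function name, the clipping, and the normalization are all hypothetical choices for this example, not the paper's procedure.

```python
import numpy as np

def alignment_weights(grads):
    """Hypothetical sketch of gradient-alignment-based loss weighting.

    Given the per-loss gradients of a composite objective, assign each
    loss a weight proportional to the cosine similarity between its
    gradient and the mean gradient direction, clip negative alignments
    to zero, and normalize the weights to sum to one.
    """
    G = np.stack(grads)                       # shape (k, d): k losses, d params
    mean_dir = G.mean(axis=0)
    mean_dir = mean_dir / (np.linalg.norm(mean_dir) + 1e-12)
    norms = np.linalg.norm(G, axis=1) + 1e-12
    cos = (G @ mean_dir) / norms              # cosine with the mean direction
    w = np.clip(cos, 0.0, None)               # drop losses pulling against it
    return w / (w.sum() + 1e-12)

# Toy example: two roughly aligned gradients and one opposing gradient.
g1 = np.array([1.0, 0.0])
g2 = np.array([0.9, 0.1])
g3 = np.array([-1.0, 0.0])
w = alignment_weights([g1, g2, g3])
```

In this toy setup the opposing gradient `g3` receives zero weight, while the two aligned losses share the budget; a real method would recompute such weights periodically during pretraining.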

Tags: arxivpapers