2026-04-23
Hybrid Policy Distillation for LLMs
Source: arXiv cs.AI
arXiv:2604.20244v1 (Announce Type: cross)
Abstract: Knowledge distillation (KD) is a powerful paradigm for compressing large language models (LLMs), whose effectiveness depends on intertwined choices of divergence direction, optimization strategy, and data regime. We break down the design of existing...
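The abstract names divergence direction as one of the intertwined design choices in KD. As a minimal illustration (not from the paper), the sketch below contrasts forward KL(teacher ∥ student) with reverse KL(student ∥ teacher) over a pair of hypothetical next-token distributions; the two directions generally disagree, which is why the choice matters:

```python
import math

def kl(p, q):
    # KL(p || q) for discrete distributions given as probability lists.
    # Terms with p_i == 0 contribute nothing by convention.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical teacher and student next-token distributions (illustrative only)
teacher = [0.7, 0.2, 0.1]
student = [0.5, 0.3, 0.2]

forward_kl = kl(teacher, student)  # mode-covering: penalizes student for missing teacher mass
reverse_kl = kl(student, teacher)  # mode-seeking: penalizes student for placing mass where teacher has little
```

Because KL divergence is asymmetric, `forward_kl` and `reverse_kl` differ for the same pair of distributions, so which one is minimized changes what the student learns.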