Research 2026-04-28
K-Score: Kalman Filter as a Principled Alternative to Reward Normalization in Reinforcement Learning
Source: arXiv cs.AI
arXiv:2604.23056v1
Announce Type: cross
Abstract: We propose a simple yet effective alternative to reward normalization in policy gradient reinforcement learning by integrating a 1D Kalman filter for online reward estimation. Instead of relying on fixed heuristics, our method recursively estimates...
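To make the idea of a 1D Kalman filter for online reward estimation concrete, here is a minimal sketch of a textbook scalar Kalman filter applied to a reward stream. The abstract is truncated here, so this is not necessarily the paper's formulation: the random-walk state model, the class name `ScalarKalman`, the noise variances `q` and `r` with their defaults, and the choice to return the standardized innovation as a normalized signal are all assumptions made for illustration.

```python
import math
import random


class ScalarKalman:
    """Textbook 1D Kalman filter tracking a slowly drifting mean reward.

    Assumption (not from the paper): the latent mean follows a random walk
    with process-noise variance q, and each reward is the latent mean plus
    observation noise of variance r.
    """

    def __init__(self, q=1e-3, r=1.0, mean0=0.0, var0=1.0):
        self.q = q        # process-noise variance (hypothetical default)
        self.r = r        # observation-noise variance (hypothetical default)
        self.mean = mean0  # current estimate of the mean reward
        self.var = var0    # uncertainty (variance) of that estimate

    def update(self, reward):
        """Fold in one reward and return its standardized innovation."""
        # Predict: the estimate is unchanged, its uncertainty grows by q.
        prior_var = self.var + self.q
        # Innovation: how far the reward is from the current estimate.
        innovation = reward - self.mean
        innovation_var = prior_var + self.r
        # Correct: move the estimate toward the reward by the Kalman gain.
        gain = prior_var / innovation_var
        self.mean += gain * innovation
        self.var = (1.0 - gain) * prior_var
        # Assumed normalized signal: innovation scaled by its predicted std.
        return innovation / math.sqrt(innovation_var)


# Usage: feed synthetic rewards (mean 5, std 1) and collect normalized values.
rng = random.Random(0)
kf = ScalarKalman()
scores = [kf.update(rng.gauss(5.0, 1.0)) for _ in range(2000)]
print(f"estimated mean reward: {kf.mean:.3f}, estimate variance: {kf.var:.4f}")
```

In a policy gradient loop, a value such as the returned score could stand in for a batch-normalized reward. The gain adapts automatically: it is large while the estimate is uncertain and settles to a small steady-state value governed by the ratio of `q` to `r`.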
arxiv papers rl