TERC: A Transfer Entropy Redundancy Criterion for State Variable Selection in Reinforcement Learning
arXiv:2401.11512v2. Abstract: Identifying the most suitable variables to represent the state is a fundamental challenge in Reinforcement Learning (RL). These variables must efficiently capture the information necessary for making optimal decisions. In order to address...
What Happened
Researchers have introduced TERC (Transfer Entropy Redundancy Criterion), an information-theoretic method for selecting state variables in reinforcement learning. The approach uses transfer entropy, a measure of directed information flow between time series, to identify which variables carry information that is useful for decision-making. Rather than relying on domain expertise or exhaustive search, TERC evaluates candidate state variables by quantifying how much predictive information each contributes beyond what the others already provide, so a variable that only duplicates existing information counts as redundant.
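For readers unfamiliar with the quantity, the standard textbook definition of transfer entropy with a history length of one step can be written as a conditional mutual information. The symbols below (a source series X and a target series Y) are generic notation chosen for this summary; the paper's exact definition, history lengths, and conditioning sets may differ.

```latex
T_{X \to Y}
  = I\left(Y_{t+1};\, X_t \mid Y_t\right)
  = H\left(Y_{t+1} \mid Y_t\right) - H\left(Y_{t+1} \mid Y_t,\, X_t\right)
```

Read this as the reduction in uncertainty about the next value of Y gained from knowing the current value of X, once the current value of Y is already known. In an RL reading, one would presumably take X to be one or more candidate state variables and Y to be the quantity they are supposed to inform, but that mapping is an assumption made here for illustration.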
The work, posted on arXiv, addresses a core bottleneck in RL: the state representation problem. In complex environments, the number of potential state variables can be enormous, and including irrelevant or redundant ones degrades learning efficiency and policy quality. TERC offers a principled, data-driven way to prune this space.
Why It Matters
State variable selection has long been treated as an art in RL. Practitioners often rely on intuition, manual feature engineering, or trial and error to decide what goes into the state vector. This approach is brittle and does not scale to high-dimensional or partially observable domains.
TERC’s contribution is significant for three reasons:
- Theoretical grounding: Transfer entropy is rooted in information theory and captures directed, time-lagged statistical dependencies, including nonlinear ones that correlation-based methods can miss. It is often read as a signal of causal influence, although it does not establish causation on its own. This makes TERC suitable for environments where state variables interact in complex, non-obvious ways.
- Redundancy awareness: Many feature selection methods rank variables independently, but in RL, two variables may be individually informative yet redundant together. TERC explicitly penalizes this, leading to more compact and efficient state representations.
- Potential for automation: If validated across diverse environments, TERC could reduce the human labor in RL pipeline design, enabling more autonomous discovery of minimal sufficient state spaces.
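The redundancy idea in the list above can be made concrete with a small sketch. This is not the paper's algorithm: it is a generic backward-elimination loop built on a plug-in (histogram) transfer entropy estimate for discrete series, and every name in it (`transfer_entropy`, `prune_redundant`, the `tol` threshold, the toy variables `x1` to `x4`) is hypothetical. It also assumes that the action logged at step t+1 responds to the state at step t.

```python
import math
import random
from collections import Counter


def conditional_mutual_information(xs, ys, zs):
    """Plug-in estimate of I(X; Y | Z) in bits from aligned discrete samples."""
    n = len(xs)
    c_xyz = Counter(zip(xs, ys, zs))
    c_xz = Counter(zip(xs, zs))
    c_yz = Counter(zip(ys, zs))
    c_z = Counter(zs)
    total = 0.0
    for (x, y, z), n_xyz in c_xyz.items():
        ratio = (n_xyz * c_z[z]) / (c_xz[(x, z)] * c_yz[(y, z)])
        total += (n_xyz / n) * math.log2(ratio)
    return total


def transfer_entropy(sources, target):
    """History-length-1 transfer entropy from a group of series to target.

    Computes I(target[t+1]; sources[t] | target[t]), treating the source
    series jointly as one variable.
    """
    if not sources:
        return 0.0
    joint_source = list(zip(*(s[:-1] for s in sources)))
    return conditional_mutual_information(joint_source, target[1:], target[:-1])


def prune_redundant(variables, target, tol=0.01):
    """Drop each variable whose removal costs at most `tol` bits of joint TE."""
    kept = dict(variables)
    for name in list(kept):
        full = transfer_entropy(list(kept.values()), target)
        others = [series for key, series in kept.items() if key != name]
        if full - transfer_entropy(others, target) <= tol:
            del kept[name]
    return list(kept)


# Toy data (entirely synthetic): x2 duplicates x1, x3 is irrelevant noise,
# and the next action is x1 AND x4.
random.seed(0)
T = 5000
x1 = [random.randint(0, 1) for _ in range(T)]
x2 = list(x1)
x3 = [random.randint(0, 1) for _ in range(T)]
x4 = [random.randint(0, 1) for _ in range(T)]
actions = [0] + [x1[t] & x4[t] for t in range(T - 1)]

kept = prune_redundant({"x1": x1, "x2": x2, "x3": x3, "x4": x4}, actions)
print(kept)
```

On this toy data, `x1` and `x2` are each informative on their own, yet only one of them survives pruning, and the noise variable `x3` is dropped. Which of the two duplicates is kept depends on the iteration order, which is one reason a real criterion needs more care than this sketch.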
Implications for AI Practitioners
For RL engineers and researchers, TERC offers a concrete tool to diagnose and improve state representations. Practitioners working on robotics, game AI, or process control—where sensor streams are plentiful but relevance is unclear—could use TERC to identify which signals actually matter for the task.
However, adoption will depend on computational cost. Transfer entropy estimation requires time-series data, and both its compute cost and the amount of data it needs grow quickly as more variables are considered jointly. The authors will need to demonstrate scalability to real-world problems with thousands of candidate variables.
Additionally, TERC assumes access to sufficient data for reliable entropy estimation—a nontrivial requirement in sparse-reward or online settings. Practitioners should view it as a preprocessing or offline analysis tool rather than a runtime component.
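The data requirement is easy to see in a small experiment. The sketch below uses a plug-in (histogram) mutual information estimator on two independent discrete series, where the true value is zero, and compares the average estimate at a small and a large sample size. The function names and sample sizes are illustrative choices, and the estimator is a generic one rather than whatever the paper uses. Because transfer entropy is a conditional mutual information, the same upward bias applies to it, and it typically worsens as more variables are conditioned on.

```python
import math
import random
from collections import Counter


def plugin_mutual_information(xs, ys):
    """Plug-in (histogram) estimate of I(X; Y) in bits."""
    n = len(xs)
    c_xy = Counter(zip(xs, ys))
    c_x = Counter(xs)
    c_y = Counter(ys)
    return sum(
        (n_xy / n) * math.log2(n_xy * n / (c_x[x] * c_y[y]))
        for (x, y), n_xy in c_xy.items()
    )


def mean_estimate_on_independent_data(n_samples, n_symbols=4, trials=20, seed=0):
    """Average the estimate over several draws of two independent series."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(trials):
        xs = [rng.randrange(n_symbols) for _ in range(n_samples)]
        ys = [rng.randrange(n_symbols) for _ in range(n_samples)]
        estimates.append(plugin_mutual_information(xs, ys))
    return sum(estimates) / trials


# The true mutual information is 0 bits in both cases.
small = mean_estimate_on_independent_data(40)
large = mean_estimate_on_independent_data(20000)
print(f"n=40: {small:.4f} bits, n=20000: {large:.4f} bits")
```

With few samples the estimator reports a clearly positive dependence between variables that are in fact unrelated, so a selection rule with a fixed threshold could keep irrelevant variables unless it corrects for this bias or collects more data.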
If integrated into RL frameworks like Stable-Baselines3 or RLlib, TERC could become a standard diagnostic, much like feature importance in supervised learning. For now, it represents a promising step toward more principled state representation learning.
Key Takeaways
- TERC uses transfer entropy to select state variables that are both informative for decision-making and non-redundant with each other.
- It addresses a longstanding practical challenge in RL: how to systematically choose which variables to include in the state representation.
- The method could reduce manual feature engineering, but it is best treated as an offline analysis step and may be computationally intensive for large state spaces.
- Practitioners should watch for scalability benchmarks and potential integration into mainstream RL libraries.