BeClaude
Research | 2026-06-26

Zero-Shot Size Transfer for Neural ODEs on Sparse Random Graphs: Graphon Limits and Adjoint Convergence

Source: arXiv cs.AI

arXiv:2606.26662v1 (announce type: cross)

Abstract: Graph Neural Differential Equations (GNDEs) model continuous-time graph dynamics by parameterizing Neural ODE velocity fields with Graph Neural Networks. Their local, size-independent filters suggest a zero-shot size-transfer principle: train on a...

A Mathematical Foundation for Scaling Graph Neural ODEs

The preprint introduces a rigorous theoretical framework for what the authors term "zero-shot size transfer" in Graph Neural Differential Equations (GNDEs). At its core, the work proves that GNDEs trained on small random graphs can generalize to much larger graphs without retraining, provided the underlying graph structure converges to a well-defined limit object called a graphon. The paper establishes convergence guarantees for both the forward dynamics and the adjoint sensitivity method used for training, bridging a critical gap between empirical practice and mathematical theory.
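To make the forward/adjoint pairing concrete, here is a minimal NumPy sketch of a discrete adjoint pass for a toy graph ODE. Everything in it is an illustrative assumption rather than the preprint's construction: the scalar velocity field tanh(a·x + b·(A x)/n), the explicit Euler integrator, the mean-square loss, and the names `euler_forward` and `adjoint_gradients`. The adjoint gradients are checked against central finite differences.

```python
import numpy as np

def euler_forward(x0, adj, a, b, dt, steps):
    """Integrate x' = tanh(a*x + b*(A x)/n) with explicit Euler, keeping all states."""
    n = adj.shape[0]
    states = [x0]
    for _ in range(steps):
        x = states[-1]
        z = a * x + b * (adj @ x) / n
        states.append(x + dt * np.tanh(z))
    return states

def loss(x_final):
    """Size-normalized loss: half the mean square of the final state."""
    return 0.5 * np.mean(x_final ** 2)

def adjoint_gradients(states, adj, a, b, dt):
    """Backward pass: propagate the adjoint lam and accumulate dL/da, dL/db."""
    n = adj.shape[0]
    lam = states[-1] / n              # dL/dx at the final step
    grad_a, grad_b = 0.0, 0.0
    for x in reversed(states[:-1]):
        agg = (adj @ x) / n
        s = 1.0 - np.tanh(a * x + b * agg) ** 2   # derivative of tanh
        lam_s = lam * s
        grad_a += dt * np.dot(lam_s, x)
        grad_b += dt * np.dot(lam_s, agg)
        # Transposed Jacobian of one Euler step applied to the adjoint.
        lam = lam + dt * (a * lam_s + b * (adj.T @ lam_s) / n)
    return grad_a, grad_b

# Hypothetical test problem: an Erdos-Renyi graph with edge probability 0.3.
rng = np.random.default_rng(0)
n = 200
upper = np.triu(rng.uniform(size=(n, n)) < 0.3, k=1)
adj = (upper | upper.T).astype(float)
x0 = rng.normal(size=n)
a, b, dt, steps = -0.5, 1.0, 0.02, 50

states = euler_forward(x0, adj, a, b, dt, steps)
grad_a, grad_b = adjoint_gradients(states, adj, a, b, dt)

# Central finite differences as an independent check.
eps = 1e-6
fd_a = (loss(euler_forward(x0, adj, a + eps, b, dt, steps)[-1])
        - loss(euler_forward(x0, adj, a - eps, b, dt, steps)[-1])) / (2 * eps)
fd_b = (loss(euler_forward(x0, adj, a, b + eps, dt, steps)[-1])
        - loss(euler_forward(x0, adj, a, b - eps, dt, steps)[-1])) / (2 * eps)
print(f"dL/da adjoint={grad_a:.8f} finite-diff={fd_a:.8f}")
print(f"dL/db adjoint={grad_b:.8f} finite-diff={fd_b:.8f}")
```

Because the aggregation is divided by n and the loss is a mean over nodes, the two parameter gradients are size-normalized quantities. That is the kind of object for which a limit as n grows can be meaningfully stated.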

Why This Matters

Graph Neural ODEs represent a powerful paradigm for modeling continuous-time processes on networks—from epidemic spread to opinion dynamics to molecular systems. However, a persistent practical bottleneck has been the inability to train on small graphs and deploy on large ones without performance degradation. This paper provides the first formal proof that such transfer is not merely possible but mathematically guaranteed under specific conditions.

The key insight lies in the "graphon limit" framework. Just as a sequence of increasingly fine discretizations can converge to a continuous function, sequences of random graphs drawn from the same probabilistic model converge to a graphon: a bounded, symmetric, measurable function on [0,1]² that plays the role of a continuum adjacency matrix. The authors show that GNDEs respect this convergence: as graph size grows, the learned dynamics approach a well-defined continuum limit, and the gradient computations (adjoints) remain stable.
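The convergence idea can be illustrated numerically. The sketch below is not taken from the preprint. It assumes a hypothetical graphon W(u, v) = (u + v)/2, samples dense random graphs from it, and compares the normalized neighbor aggregation (A x)/n with the corresponding graphon integral, which has the closed form u/4 + 1/6 for the node signal f(u) = u. The sparse regime named in the preprint's title would need rescaled edge probabilities and is not covered here.

```python
import numpy as np

def sample_graph(n, rng):
    """Draw latent positions u_i ~ U[0,1] and edges with probability W(u_i, u_j)."""
    u = rng.uniform(size=n)
    probs = (u[:, None] + u[None, :]) / 2.0      # hypothetical graphon W(u, v)
    upper = np.triu(rng.uniform(size=(n, n)) < probs, k=1)
    adj = (upper | upper.T).astype(float)        # symmetric, no self-loops
    return u, adj

def aggregation_error(n, rng):
    """RMS gap between (A x)/n and the graphon integral for the signal f(u) = u."""
    u, adj = sample_graph(n, rng)
    discrete = adj @ u / n
    # Integral over v in [0,1] of (u + v)/2 * v equals u/4 + 1/6.
    limit = u / 4.0 + 1.0 / 6.0
    return float(np.sqrt(np.mean((discrete - limit) ** 2)))

rng = np.random.default_rng(0)
errors = {n: aggregation_error(n, rng) for n in (100, 400, 1600)}
for n, err in errors.items():
    print(f"n={n:5d}  RMS error={err:.4f}")
```

The error should shrink as n grows, which is the one-layer, one-step version of the statement that graph dynamics approach a continuum limit.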

Implications for AI Practitioners

1. Training efficiency gains. In principle, practitioners can train GNDEs on graphs with hundreds of nodes and deploy them on graphs with millions, provided both are drawn from the same graphon model. This can sharply reduce computational cost and is particularly valuable for applications like large-scale social network analysis or molecular dynamics, where full-size training is prohibitive.
2. Model validation shifts. The work implies that validation strategies should focus on whether the training graph is a representative sample from the target graphon distribution, rather than on matching exact graph size. Practitioners need to think about graphon estimation and sampling quality.
3. Architectural constraints. The zero-shot transfer guarantee depends on the GNDE using local, size-independent filters, specifically message-passing architectures. Global attention mechanisms or graph transformers may not enjoy the same guarantees, which limits architectural choices for transferable models.
4. New evaluation benchmarks. The paper suggests evaluating GNDEs on graphon-based metrics (e.g., how well the learned dynamics approximate the continuum limit) rather than just per-node accuracy on fixed-size graphs.
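A rough sketch of what "size-independent filters" buys in practice follows. The assumptions are loud ones: the weights `W_SELF`, `W_NBR`, and `BIAS` are hand-picked stand-ins for trained parameters, the graphon W(u, v) = (u + v)/2 is hypothetical, the graphs are dense, and the integrator is plain explicit Euler. None of this is the preprint's architecture. The same parameters are applied unchanged to graphs of 100, 500, and 2000 nodes, and the final states are compared as functions of the latent position u.

```python
import numpy as np

def sample_graph(n, rng):
    """Sample sorted latent positions and a dense graph from W(u, v) = (u + v)/2."""
    u = np.sort(rng.uniform(size=n))
    probs = (u[:, None] + u[None, :]) / 2.0
    upper = np.triu(rng.uniform(size=(n, n)) < probs, k=1)
    adj = (upper | upper.T).astype(float)
    return u, adj

# Hypothetical fixed weights; their shapes do not depend on the graph size n.
W_SELF = np.array([[-0.5, 0.2], [0.1, -0.3]])
W_NBR = np.array([[1.0, -0.5], [0.5, 1.0]])
BIAS = np.array([0.1, -0.1])

def velocity(x, adj):
    """Message-passing velocity field with mean-style (1/n) neighbor aggregation."""
    agg = adj @ x / adj.shape[0]
    return np.tanh(x @ W_SELF + agg @ W_NBR + BIAS)

def integrate(u, adj, t_end=1.0, steps=50):
    """Explicit Euler from an initial state defined as a function of u."""
    x = np.stack([np.sin(2.0 * np.pi * u), u], axis=1)
    dt = t_end / steps
    for _ in range(steps):
        x = x + dt * velocity(x, adj)
    return x

def gap(u_a, x_a, u_b, x_b):
    """RMS difference after interpolating run a onto the latent positions of run b."""
    cols = [np.interp(u_b, u_a, x_a[:, k]) for k in range(x_a.shape[1])]
    return float(np.sqrt(np.mean((np.stack(cols, axis=1) - x_b) ** 2)))

rng = np.random.default_rng(0)
runs = {}
for n in (100, 500, 2000):
    u, adj = sample_graph(n, rng)
    runs[n] = (u, integrate(u, adj))

u_ref, x_ref = runs[2000]
gap_100 = gap(*runs[100], u_ref, x_ref)
gap_500 = gap(*runs[500], u_ref, x_ref)
print(f"gap to n=2000 run: n=100 -> {gap_100:.4f}, n=500 -> {gap_500:.4f}")
```

The 1/n normalization in `velocity` is the design choice that matters here: without it, the aggregated signal would grow with graph size and the same weights would produce different dynamics on larger graphs.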

Key Takeaways

  • Graph Neural ODEs trained on small random graphs can provably generalize to arbitrarily large graphs from the same generative model, eliminating the need for size-matched training data.
  • The theoretical framework relies on graphon convergence and adjoint stability, providing rigorous guarantees that were previously missing from the GNDE literature.
  • Practitioners should prioritize local message-passing architectures and ensure training graphs are representative samples from the target graphon distribution.
  • The work opens the door to training GNDEs on small-scale simulations and deploying on real-world networks, with potential applications in epidemiology, social science, and computational chemistry.