January 2024
IEEE/ACM Transactions on Networking
We study the optimal scheduling problem in which n source nodes transmit updates over L shared wireless on/off fading channels to optimize their age performance under energy and age-violation tolerance constraints. Specifically, we provide a generic formulation of age optimization as a constrained Markov decision process (CMDP) and obtain the optimal scheduler as the solution of an associated linear programming problem. We investigate the characteristics of the optimal single-user multi-channel scheduler under different age-related objectives for which the usual threshold-based policy does not apply. We then investigate the stability region of the optimal scheduler for the multi-user case under age-violation tolerance constraints. Furthermore, we develop two online schedulers that do not require statistics and are amenable to scalable operation: a drift-plus-penalty-based design, and a novel variation of the well-known Q-learning-based reinforcement learning method that, to the best of our knowledge, is the first to successfully combine Q-learning with drift-minimization methods. Our numerical studies compare the online schedulers with the optimal scheduler and reveal that both algorithms capture the essential behavior of the optimal design under different scenarios with good scalability. The Q-learning-based design performs even closer to the optimal one by utilizing the history of the drift in a novel way.
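For readers unfamiliar with how a CMDP reduces to a linear program, the standard occupation-measure form for a finite, unichain, average-cost CMDP is sketched below. This is the generic textbook construction, not necessarily the exact program used in the paper. All symbols are assumptions introduced here for illustration: states s (for example, truncated ages and channel states), actions a (scheduling decisions), transition kernel P, per-step age cost c, constraint costs d_k (for example, energy use or an age-violation indicator) and their budgets D_k.

```latex
\begin{aligned}
\min_{x \ge 0} \quad & \sum_{s,a} x(s,a)\, c(s,a) \\
\text{s.t.} \quad
& \sum_{a} x(s',a) = \sum_{s,a} P(s' \mid s,a)\, x(s,a) \quad \forall s', \\
& \sum_{s,a} x(s,a) = 1, \\
& \sum_{s,a} x(s,a)\, d_k(s,a) \le D_k \quad \forall k .
\end{aligned}
\qquad
\pi^\star(a \mid s) = \frac{x^\star(s,a)}{\sum_{a'} x^\star(s,a')}
```

Here x(s,a) is the long-run fraction of time spent in state s taking action a. A stationary randomized policy is recovered from an optimal solution by normalizing x over the actions in each state with positive occupation. Where a state has zero occupation, the action can be chosen arbitrarily.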