Michael J. Neely's research while affiliated with University of Southern California and other places

Publications (171)

Preprint
In this paper, we provide a sub-gradient based algorithm to solve general constrained convex optimization without taking projections onto the domain set. The well studied Frank-Wolfe type algorithms also avoid projections. However, they are only designed to handle smooth objective functions. The proposed algorithm treats both smooth and non-smooth...
Article
This paper proves an impossibility result for stochastic network utility maximization for multi-user wireless systems, including multiple access and broadcast systems. Every time slot an access point observes the current channel states for each user and opportunistically selects a vector of transmission rates. Channel state vectors are assumed to b...
Preprint
This paper proves a representation theorem regarding sequences of random elements that take values in a Borel space and are measurable with respect to the sigma algebra generated by an arbitrary union of sigma algebras. This, together with a related representation theorem of Kallenberg, is used to characterize the set of multidimensional decision v...
Article
This paper considers online convex optimization (OCO) problems where decisions are constrained by available energy resources. A key scenario is optimal power control for an energy harvesting device with a finite capacity battery. The goal is to minimize a time-average loss function while keeping the used energy less than what is available. In this...
Article
Future generation wireless technologies are expected to serve an increasingly dense and dynamic population of users that generate short bundles of information to be transferred over the shared spectrum. This calls for new distributed and low-overhead Multiple-Access-Control (MAC) strategies to serve such dynamic demands with spectral efficiency cha...
Preprint
This paper revisits a classical problem of slotted multiple access with success, idle, and collision events on each slot. First, results of a 2-user multiple access game are reported. The game was conducted at the University of Southern California over multiple semesters and involved competitions between student-designed algorithms. An algorithm ca...
Preprint
This paper considers online optimization of a renewal-reward system. A controller performs a sequence of tasks back-to-back. Each task has a random vector of parameters, called the task type vector, that affects the task processing options and also affects the resulting reward and time duration of the task. The probability distribution for the task...
Article
We consider online convex optimization with stochastic constraints where the objective functions are arbitrarily time-varying and the constraint functions are independent and identically distributed (i.i.d.) over time. Both the objective and constraint functions are revealed after the decision is made at each time slot. The best known expected regr...
Preprint
This paper proves an impossibility result for stochastic network utility maximization for multi-user wireless systems, including multiple access and broadcast systems. Every time slot an access point observes the current channel states for each user and opportunistically selects a vector of transmission rates. Channel state vectors are assumed to b...
Preprint
We consider online convex optimization with stochastic constraints where the objective functions are arbitrarily time-varying and the constraint functions are independent and identically distributed (i.i.d.) over time. Both the objective and constraint functions are revealed after the decision is made at each time slot. The best known expected regr...
Article
This paper considers utility optimal power control for energy-harvesting wireless devices with a finite capacity battery. The distribution information of the underlying wireless environment and harvestable energy is unknown, and only outdated system state information is known at the device controller. This scenario shares similarity with Lyapunov o...
Article
We consider multiple parallel Markov decision processes (MDPs) coupled by global constraints, where the time varying objective and constraint functions can only be observed after the decision is made. Special attention is given to how well the decision maker can perform in T slots, starting from any state, compared to the best feasible randomized s...
Article
This paper considers optimization over multiple renewal systems coupled by time-average constraints. These systems act asynchronously over variable length frames. When a particular system starts a new renewal frame, it chooses an action from a set of options for that frame. The action determines the duration of the frame, the penalty incurred durin...
Preprint
We propose a new primal-dual homotopy smoothing algorithm for a linearly constrained convex program, where neither the primal nor the dual function has to be smooth or strongly convex. The best known iteration complexity for solving such a non-smooth problem is $\mathcal{O}(\varepsilon^{-1})$. In this paper, we show that by leveraging a local error bou...
Conference Paper
We consider multiple parallel Markov decision processes (MDPs) coupled by global constraints, where the time varying objective and constraint functions can only be observed after the decision is made. Special attention is given to how well the decision maker can perform in T slots, starting from any state, compared to the best feasible randomized s...
Preprint
We study constrained stochastic programs where the decision vector at each time slot cannot be chosen freely but is tied to the realization of an underlying random state vector. The goal is to minimize a general objective function subject to linear constraints. A typical scenario where such programs appear is opportunistic scheduling over a network...
Article
This paper considers utility optimal power control for energy harvesting wireless devices with a finite capacity battery. The distribution information of the underlying wireless environment and harvestable energy is unknown and only outdated system state information is known at the device controller. This scenario shares similarity with Lyapunov op...
Article
This paper considers the fundamental convergence time for opportunistic scheduling over time-varying channels. The channel state probabilities are unknown and algorithms must perform some type of estimation and learning while they make decisions to optimize network utility. Existing schemes can achieve a utility within $\epsilon$ of optimality, for...
Article
We consider multiple parallel Markov decision processes (MDPs) coupled by global constraints, where the time varying objective and constraint functions can only be observed after the decision is made. Special attention is given to how well the decision maker can perform in $T$ slots, starting from any state, compared to the best feasible randomized...
Article
This paper considers online convex optimization (OCO) with stochastic constraints, which generalizes Zinkevich's OCO over a known simple fixed set by introducing multiple stochastic functional constraints that are i.i.d. generated at each round and are disclosed to the decision maker only after the decision is made. This formulation arises naturall...
Article
This paper studies the convergence time of dual gradient methods for general (possibly nondifferentiable) strongly convex programs. For general convex programs, the convergence time of dual subgradient/gradient methods with simple running averages (running averages started from iteration 0) is known to be O(1/ε...
Article
This paper considers large scale constrained convex (possibly composite and non-separable) programs, which are usually difficult to solve by interior point methods or other Newton-type methods due to the non-smoothness or the prohibitive computation and storage complexity for Hessians and matrix inversions. Instead, they are often solved by first o...
Article
This paper considers dynamic transmit covariance design in point-to-point MIMO fading systems with unknown channel state distributions and inaccurate channel state information subject to both long term and short term power constraints. First, the case of instantaneous but possibly inaccurate channel state information at the transmitter (CSIT) is tr...
Article
This paper considers online convex optimization with time-varying constraint functions. Specifically, we have a sequence of convex objective functions $\{f_t(x)\}_{t=0}^{\infty}$ and convex constraint functions $\{g_{t,i}(x)\}_{t=0}^{\infty}$ for $i \in \{1, ..., k\}$. The functions are gradually revealed over time. For a given $\epsilon>0$, the go...
Article
The backpressure algorithm has been widely used as a distributed solution to the problem of joint rate control and routing in multi-hop data networks. By controlling a parameter $V$ in the algorithm, the backpressure algorithm can achieve an arbitrarily small utility optimality gap. However, this in turn brings in a large queue length at each node...
Article
Traffic load-balancing in datacenters alleviates hot spots and improves network utilization. In this paper, a stable in-network load-balancing algorithm is developed in the setting of software-defined networking. A control plane configures a data plane over successive intervals of time. While the MaxWeight algorithm can be applied in this setting a...
Article
This paper considers time-average optimization, where a decision vector is chosen every time step within a (possibly non-convex) set, and the goal is to minimize a convex function of the time averages subject to convex constraints on these averages. Such problems have applications in networking, multi-agent systems, and operations research, where d...
Article
This paper considers optimization over multiple renewal systems coupled by time average constraints. These systems act asynchronously over variable length frames. For each system, at the beginning of each renewal frame, it chooses an action which affects the duration of its own frame, the penalty, and the resource expenditure throughout the frame....
Article
Stochastic non-smooth convex optimization constitutes a class of problems in machine learning and operations research. This paper considers minimization of a non-smooth function based on stochastic subgradients. When the function has a locally polyhedral structure, a staggered time average algorithm is proven to have O(1/T) convergence rate. A more...
Article
This paper considers constrained optimization over a renewal system. A controller observes a random event at the beginning of each renewal frame and then chooses an action that affects the duration of the frame, the amount of resources used, and a penalty metric. The goal is to make frame-wise decisions so as to minimize the time average penalty su...
Article
This paper considers online convex optimization over a complicated constraint set, which typically consists of multiple functional constraints and a set constraint. Zinkevich's conventional projection-based online algorithm (Zinkevich 2003) can be difficult to implement due to the potentially high computation complexity of the projection operat...
Article
This paper considers large scale constrained convex programs, which are usually not solvable by interior point methods or other Newton-type methods due to the prohibitive computation and storage complexity for Hessians and matrix inversions. Instead, large scale constrained convex programs are often solved by gradient based methods or decomposition...
Article
This paper considers convex programs with a general (possibly non-differentiable) convex objective function and Lipschitz continuous convex inequality constraint functions and proposes a simple parallel algorithm with $O(1/t)$ convergence rate. Similar to the classical dual subgradient algorithm or the ADMM algorithm, the new algorithm has a distribut...
Article
This paper considers dynamic power allocation in MIMO fading systems with unknown channel state distributions. First, the ideal case of perfect instantaneous channel state information at the transmitter (CSIT) is treated. Using the drift-plus-penalty method, a dynamic power allocation policy is developed and shown to approach optimality, regardless...
Article
This paper considers a cost minimization problem for data centers with $N$ servers and randomly arriving service requests. A central router decides which server to use for each new request. Each server has three types of states (active, idle, setup) with different costs and time durations. The servers operate asynchronously over their own states an...
Article
This paper considers the problem of minimizing the time average of a controlled stochastic process subject to multiple time average constraints on other related processes. The probability distribution of the random events in the system is unknown to the controller. A typical application is time average power minimization subject to network throughp...
Article
This paper considers optimization of power and delay in a time-varying wireless link using rateless codes. The link serves a sequence of variable-length packets. Each packet is coded and transmitted over multiple slots. Channel conditions can change from slot to slot and are unknown to the transmitter. The amount of mutual information accumulated o...
Article
We consider the problem of simultaneous on-demand streaming of stored video to multiple users in a multi-cell wireless network where multiple unicast streaming sessions are run in parallel and share the same frequency band. Each streaming session is formed by the sequential transmission of video "chunks," such that each chunk arrives into the corre...
Article
This paper treats power-aware throughput maximization in a multi-user file downloading system. Each user can receive a new file only after its previous file is finished. The file state processes for each user act as coupled Markov chains that form a generalized restless bandit system. First, an optimal algorithm is derived for the case of one user...
Article
This paper studies the convergence time of the drift-plus-penalty algorithm for strongly convex programs. The drift-plus-penalty algorithm was originally developed to solve more general stochastic optimization and is closely related to the dual subgradient algorithm when applied to deterministic convex programs. For general convex programs, the con...
Article
One practical open problem is the development of a distributed algorithm that achieves near-optimal utility using only a finite (and small) buffer size for queues in a stochastic network. This paper studies utility maximization (or cost minimization) in a finite-buffer regime and considers the corresponding delay and reliability (or rate of packet...
Article
We consider the design of a scheduling policy for video streaming in a wireless network formed by several users and helpers (e.g., base stations). In such networks, any user is typically in the range of multiple helpers. Hence, an efficient policy should allow the users to dynamically select the helper nodes to download from and determine adaptivel...
Article
This paper considers information sharing in a multi-player repeated game. Every round, each player observes a subset of components of a random vector and then takes a control action. The utility earned by each player depends on the full random vector and on the actions of others. An example is a game where different rewards are placed over multiple...
Article
This paper considers time-average stochastic optimization, where a time average decision vector, an average of decision vectors chosen in every time step from a time-varying (possibly nonconvex) set, minimizes a convex objective function and satisfies convex constraints. A class of this formulation with a random, discrete decision set has applicati...
Article
This paper considers the problem of minimizing the time average of a stochastic process subject to time average constraints on other processes. A canonical example is minimizing average power in a data network subject to multi-user throughput constraints. Another example is a (static) convex program. Under a Slater condition, the drift-plus-penalty...
Article
This paper considers a wireless link with randomly arriving data that is queued and served over a time-varying channel. It is known that any algorithm that comes within $\epsilon$ of the minimum average power required for queue stability must incur average queue size at least $\Omega(\log(1/\epsilon))$. However, the optimal convergence time is unkn...
Article
This paper considers peer-to-peer scheduling for a network with multiple wireless devices. A subset of the devices are mobile users that desire specific files. Each user may already have certain popular files in its cache. The remaining devices are access points that typically have a larger set of files. Users can download packets of their requeste...
Article
We consider extensions and improvements on our previous work on dynamic adaptive video streaming in a multi-cell multiuser "small cell" wireless network. Previously, we treated the case of single-antenna base stations and, starting from a network utility maximization (NUM) formulation, we devised a "push" scheduling policy, where users place re...
Article
This paper treats power-aware throughput maximization in a multi-user file downloading system. Each user can receive a new file only after its previous file is finished. The file state processes for each user act as coupled Markov chains that form a generalized restless bandit system. First, an optimal algorithm is derived for the case of one user....
Article
This paper considers a stochastic optimization approach for job scheduling and server management in large-scale, geographically distributed data centers. Randomly arriving jobs are routed to a choice of servers. The number of active servers depends on server activation decisions that are updated at a slow time scale, and the service rates of the se...
Article
We consider a wireless broadcast station that transmits packets to multiple users. The packet requests for each user may overlap, and some users may already have certain packets. This presents a problem of broadcasting in the presence of side information, and is a generalization of the well-known (and unsolved) index coding problem of information t...
Article
This paper considers a time-varying game with N players. Every time slot, players observe their own random events and then take a control action. The events and control actions affect the individual utilities earned by each player. The goal is to maximize a concave function of time average utilities subject to equilibrium constraints. Specifically,...
Article
This paper investigates Quality of Information (QoI) aware adaptive sampling in a system where two sensor devices report information to an end user. The system carries out a sequence of tasks, where each task relates to a random event that must be observed. The accumulated information obtained from the sensor devices is reported once per task to a...
Conference Paper
This demo abstract describes an initial design of a new adaptive video streaming protocol for device-to-device WiFi-based mobile platforms and its software implementation. For the demonstration, two mobile servers and two mobile users will be deployed to verify that our device-to-device adaptive video streaming implementation works with desirable u...
Article
We consider a one-hop wireless system with a small number of delay constrained users and a larger number of users without delay constraints. We develop a scheduling algorithm that reacts to time varying channels and maximizes throughput utility (to within a desired proximity), stabilizes all queues, and satisfies the delay constraints. The problem...
Article
This paper considers a base station that delivers packets to multiple receivers through a sequence of coded transmissions. All receivers overhear the same transmissions. Each receiver may already have some of the packets as side information, and requests another subset of the packets. This problem is known as the index coding problem and can be rep...
Article
We consider the jointly optimal design of a transmission scheduling and admission control policy for adaptive video streaming over small cell networks. We formulate the problem as a dynamic network utility maximization and observe that it naturally decomposes into two subproblems: admission control and transmission scheduling. The resulting algorit...
Article
We consider the optimal design of a scheduling policy for adaptive video streaming in wireless 'Small-Cells' networks. We formulate the problem as a network utility maximization, and we observe that it naturally decomposes into two subproblems: admission control and transmission scheduling. The resulting algorithms are simple and suitable for distr...
Article
This paper considers a problem where multiple users make repeated decisions based on their own observed events. The events and decisions at each time step determine the values of a utility function and a collection of penalty functions. The goal is to make distributed decisions over time to maximize time average utility subject to time average cons...
Article
It is well known that max-weight policies based on a queue backlog index can be used to stabilize stochastic networks, and that similar stability results hold if a delay index is used. Using Lyapunov optimization, we extend this analysis to design a utility maximizing algorithm that uses explicit delay information from the head-of-line packet at ea...
Article
This paper considers optimization of time averages in systems with variable length renewal frames. Applications include power-aware and profit-aware scheduling in wireless networks, peer-to-peer networks, and transportation systems. Every frame, a new policy is implemented that affects the frame size and that creates a vector of attributes. The pol...
Article
An information collection problem in a wireless network with random events is considered. Wireless devices report on each event using one of multiple reporting formats. Each format has a different quality and uses different data lengths. Delivering all data in the highest quality format can overload system resources. The goal is to make intelligent...
Conference Paper
We consider the jointly optimal design of a transmission scheduling and admission control policy for adaptive streaming over wireless device-to-device networks. We formulate the problem as a dynamic network utility maximization and observe that it naturally decomposes into two subproblems: admission control and transmission scheduling. The resultin...
Article
We investigate optimal routing and scheduling strategies for multi-hop wireless networks with rateless codes. Rateless codes allow each node of the network to accumulate mutual information with every packet transmission. This enables a significant performance gain over conventional shortest path routing. Further, it also outperforms cooperative com...
Conference Paper
This paper considers optimal control for a collection of separate Markov decision systems that operate asynchronously over their own state spaces. Decisions at each system affect: (i) the time spent in the current state, (ii) a vector of penalties incurred, and (iii) the next-state transition probabilities. An example is a network of smart devices...
Conference Paper
We analyze a generalized index coding problem that allows multiple users to request the same packet. For this problem we introduce a novel coding scheme called partition multicast. Our scheme can be seen as a natural generalization of clique cover for directed index coding problems. Further, partition multicast corresponds to an achievable scheme f...
Article
Lyapunov drift is a powerful tool for optimizing stochastic queueing networks subject to stability. However, the most convenient drift conditions often provide results in terms of a time average expectation, rather than a pure time average. This paper provides an extended drift-plus-penalty result that ensures stability with desired time averages w...
Conference Paper
An information collection problem in a wireless network with random events is considered. Wireless nodes report on each event using one of multiple reporting formats. Each format has a different quality and uses a different number of bits. Delivering all data in the highest quality format can overload system resources. The goal is to make intellige...
Article
We consider a discrete time queueing system where a controller makes a 2-stage decision every slot. The decision at the first stage reveals a hidden source of randomness with a control-dependent (but unknown) probability distribution. The decision at the second stage generates an attribute vector that depends on this revealed randomness. The goal i...
Article
We investigate opportunistic cooperation between secondary (femtocell) users and primary (macrocell) users in cognitive femtocell networks. We consider two models for such cooperation. In the first model, called the Cooperative Relay Model, a secondary user cannot transmit its own data concurrently with a primary user. However, it can employ cooper...
Conference Paper
We consider energy-aware scheduling in a multi-server system with N classes of jobs. Jobs arrive randomly and are queued according to their class. Servers operate asynchronously over their own timelines. Each server can be in either the active state or the idle state. At the beginning of each active period, a server chooses a processing mode from a...
Article
In this work we focus on a stochastic optimization based approach to make distributed routing and server management decisions in the context of large-scale, geographically distributed data centers, which offers significant potential for exploring power cost reductions. Our approach considers such decisions at different time scales and offers provab...
Conference Paper
This paper considers peer-to-peer scheduling for a network with multiple wireless devices. A subset of the devices are mobile users that desire specific files. Each user may already have certain popular files in its cache. The remaining devices are access points that typically have access to a larger set of files. Users can download packets of thei...
Article
This paper considers energy-aware control for a computing system with two states: "active" and "idle." In the active state, the controller chooses to perform a single task using one of multiple task processing modes. The controller then saves energy by choosing an amount of time for the system to be idle. These decisions affect processing time, ene...
Conference Paper
The multiple-access framework of ZigZag decoding (1) is a useful technique for combating interference via multiple repeated transmissions, and is known to be compatible with distributed random access protocols. However, in the presence of noise this type of decoding can magnify errors, particularly when packet sizes are large. We present a simple s...
Article
The multiple-access framework of ZigZag decoding (Gollakota and Katabi 2008) is a useful technique for combating interference via multiple repeated transmissions, and is known to be compatible with distributed random access protocols. However, in the presence of noise this type of decoding can magnify errors, particularly when packet sizes are larg...
Article
The freedom and flexibility of wireless Mobile Ad-hoc Networks (MANETs) that make them extremely desirable for many military, emergency, and sensor network applications also present challenges for multiple layers in the network stack. Max-weight scheduling, also known as backpressure routing, is a cross-layer control algorithm that is well-known to...
Conference Paper
We consider a system with K states which operates over frames with different lengths. Every frame, the controller observes a new random event and then chooses a control action based on this observation. The current state, random event, and control action together affect: (i) the frame size, (ii) a vector of penalties incurred over the frame, and (i...
Article
We consider a wireless broadcast station that transmits packets to multiple users. The packet requests for each user may overlap, and some users may already have certain packets. This presents a problem of broadcasting in the presence of side information, and is a generalization of the well known (and unsolved) index coding problem of information t...
Article
We study the fundamental network capacity of a multiuser wireless downlink under two assumptions: (1) Channels are not explicitly measured and thus instantaneous states are unknown; (2) Channels are modeled as Markov chains. This is an important network model to explore because channel probing may be costly or infeasible in some contexts. In this c...
Conference Paper
This paper considers maximizing throughput utility in a multi-user network with partially observable Markov ON/OFF channels. Instantaneous channel states are never known, and all control decisions are based on information provided by ACK/NACK feedback from past transmissions. This system can be viewed as a restless multi-armed bandit problem with a...
Conference Paper
There has been considerable recent work developing a new stochastic network utility maximization framework using Backpressure algorithms, also known as MaxWeight. A key open problem has been the development of utility-optimal algorithms that are also delay efficient. In this paper, we show that the Backpressure algorithm, when combined with the LIF...

Citations

... To measure the freshness of data, the concept of Age of Information (AoI) has been introduced over the last decade (see, for example, [2]-[4]), which is defined concisely as the elapsed time since the generation time of the last received status update. Since the introduction of the AoI metric, numerous related studies emerged in various networking scenarios, including wireless random access networks (e.g., [5], [6]), content distribution networks (e.g., [7], [8]), scheduling (e.g., [9]-[13]), queuing networks (e.g., [14], [15]), and vehicular networks (e.g., [16]). ...
... Let E p [|A[k] p − p|] denote the expected mean absolute error given the true parameter is p. The following Bernoulli estimation lemma for mean absolute error is from [28] and is a modified version of a lemma for mean squared error developed in [25]: ...
... The n channels are related, and each channel is Markovian. Therefore, the entire system can be described as a Markov model [21], [22]. Each channel has two states (idle (1) or busy (0)), a model known as the Gilbert-Elliott channel [23], as shown in Figure 1. ...
... This is surprising because strong convexity/concavity provides convergence improvements in other contexts, including online convex optimization problems [23], [24], deterministic minimization via gradient descent [25], and deterministic minimization via stochastic gradients [17]-[19]. This emphasizes the unique properties of opportunistic scheduling problems. ...
... Lemma A.1 (Pushback property of Bregman divergences [25, Lemma 14]). Let $B : \Delta \times \Delta^o \to \mathbb{R}$ be a Bregman divergence function, where $\Delta$ is the probability simplex in $\mathbb{R}^d$ and $\Delta^o$ is the interior of $\Delta$. ...
... Yu et al. [46] provide a primal-dual proximal gradient algorithm achieving $O(\sqrt{T})$ cumulative regret and constraint violation by assuming Slater's condition. Moreover, Wei et al. [43] provide bounds of the same order by assuming a less stringent version of Slater's condition. As a performance metric, the latter work uses static regret. ...
... They proposed an algorithm that simultaneously achieves $O(\sqrt{T})$ regret and (expected) constraint violation. A recent work (Wei, Yu, and Neely 2020) has improved this result by removing some assumptions while maintaining the regret guarantees. ...
... However, the battery's energy may be exhausted if the average harvested energy is below the maximum of the feasible set. OCO with stochastic constraints [13] is then analyzed in depth, and a new algorithm that guarantees node continuity is proposed, although it requires a costly large-capacity battery [14]. ...
... However, the MFA approach is only applicable to a homogeneous system. Finally, [17] and [27] studied weakly-coupled MDPs, where individual MDPs are independent and coupled through constraints, instead of coupled reward as in ours. ...
... In the seminal work of [36] and [37], the max-weight algorithm for assigning service to queues was shown to maximize throughput under complex scheduling constraints and probabilistic dynamics. This framework has been extended and applied to network switching [38], satellite communications [39], ad-hoc networking [40], [41], packet multicasting and broadcasting [42], packet-delivery-time reduction [43], multi-user MIMO [44], energy harvesting systems [45], and age-of-information minimization [46], [47]. In the works of [48] and [49], learning algorithms were used for achieving network stability under unknown arrival and channel statistics. ...