Michael J. Neely’s research while affiliated with University of Southern California and other places


Publications (177)


Fig. 1. Top-Left: Scenario 1. Top-Right: Scenario 2. Bottom-Left: Scenario 3. Bottom-Right: Scenario 4.
Automatic Link Selection in Multi-Channel Multiple Access with Link Failures
  • Preprint
  • File available

January 2025 · 4 Reads · Michael J. Neely

This paper focuses on the problem of automatic link selection in multi-channel multiple access control using bandit feedback. In particular, a controller assigns multiple users to multiple channels in a time-slotted system, where in each time slot at most one user can be assigned to a given channel and at most one channel can be assigned to a given user. Given that user i is assigned to channel j, the transmission fails with a fixed probability $f_{i,j}$. The failure probabilities are not known to the controller. The assignments are made dynamically using success/failure feedback. The goal is to maximize the time-average utility, where we consider an arbitrary (possibly nonsmooth) concave and entrywise nondecreasing utility function. The problem of merely maximizing the total throughput has a solution of always assigning the same user-channel pairs and can be unfair to certain users, particularly when the number of channels is less than the number of users. Instead, our scheme allows various types of fairness, such as proportional fairness, maximizing the minimum, or combinations of these, by defining the appropriate utility function. We propose two algorithms for this task. The first algorithm is adaptive and gets within $\mathcal{O}(\log(T)/T^{1/3})$ of optimality over any interval of T consecutive slots over which the success probabilities do not change. The second algorithm has faster $\mathcal{O}(\sqrt{\log(T)/T})$ performance over the first T slots, but does not adapt well if probabilities change.
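As a rough illustration of the setup only (not the paper's fairness-aware algorithms, which optimize a general concave utility), the sketch below pairs UCB-style estimates of the unknown success probabilities with a maximum-weight user/channel matching in every slot; all names and parameters are hypothetical.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

# Illustrative baseline only, not the paper's fairness-aware algorithm:
# estimate each success probability 1 - f_{i,j} from success/failure feedback
# and, in every slot, choose the user/channel matching that maximizes the sum
# of optimistic (UCB-style) estimates.  All names and parameters are made up.

rng = np.random.default_rng(0)
n_users, n_channels, T = 4, 3, 5000
fail_prob = rng.uniform(0.1, 0.9, size=(n_users, n_channels))  # unknown f_{i,j}

attempts = np.zeros((n_users, n_channels))
successes = np.zeros((n_users, n_channels))

for t in range(1, T + 1):
    mean = successes / np.maximum(attempts, 1)
    bonus = np.sqrt(2.0 * np.log(t) / np.maximum(attempts, 1))
    ucb = np.minimum(mean + bonus, 1.0)
    # matching: at most one user per channel, at most one channel per user
    users, channels = linear_sum_assignment(ucb, maximize=True)
    for i, j in zip(users, channels):
        success = rng.random() > fail_prob[i, j]   # bandit (success/failure) feedback
        attempts[i, j] += 1
        successes[i, j] += success

print("estimated success probabilities:")
print(successes / np.maximum(attempts, 1))
```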


Nonsmooth projection-free optimization with functional constraints

September 2024 · 24 Reads · Computational Optimization and Applications

This paper presents a subgradient-based algorithm for constrained nonsmooth convex optimization that does not require projections onto the feasible set. While the well-established Frank–Wolfe algorithm and its variants already avoid projections, they are primarily designed for smooth objective functions. In contrast, our proposed algorithm can handle nonsmooth problems with general convex functional inequality constraints. It achieves an $\epsilon$-suboptimal solution in $\mathcal{O}(\epsilon^{-2})$ iterations, with each iteration requiring only a single (potentially inexact) Linear Minimization Oracle call and a (possibly inexact) subgradient computation. This performance is consistent with existing lower bounds. Similar performance is observed when deterministic subgradients are replaced with stochastic subgradients. In the special case where there are no functional inequality constraints, our algorithm competes favorably with a recent nonsmooth projection-free method designed for constraint-free problems. Our approach utilizes a simple separation scheme in conjunction with a new Lagrange multiplier update rule.
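For intuition, the following is a minimal sketch of a projection-free primal-dual iteration of the kind the abstract describes: each step takes a subgradient of a Lagrangian, calls a linear minimization oracle (LMO) over the feasible set instead of projecting, and updates a Lagrange multiplier for the functional constraint. The problem instance, step sizes, and update rule are hypothetical simplifications and do not reproduce the paper's algorithm or its guarantees.

```python
import numpy as np

# Hypothetical instance: minimize the nonsmooth f(x) = ||Ax - b||_1 subject to
# g(x) = ||x||_1 - r <= 0 over the box D = [-1, 1]^n, using only a linear
# minimization oracle (LMO) over D.  A simplified sketch, not the paper's method.

rng = np.random.default_rng(0)
n, m, r = 20, 30, 3.0
A, b = rng.standard_normal((m, n)), rng.standard_normal(m)

def f_subgrad(x):                 # subgradient of ||Ax - b||_1
    return A.T @ np.sign(A @ x - b)

def g(x):                         # functional inequality constraint g(x) <= 0
    return np.sum(np.abs(x)) - r

def g_subgrad(x):
    return np.sign(x)

def lmo(c):                       # argmin over the box [-1,1]^n of <c, v>
    return -np.sign(c)

x, lam = np.zeros(n), 0.0         # primal iterate and Lagrange multiplier
for t in range(2000):
    d = f_subgrad(x) + lam * g_subgrad(x)    # subgradient of the Lagrangian
    v = lmo(d)                               # single LMO call, no projection
    gamma = 2.0 / (t + 2)
    x = (1 - gamma) * x + gamma * v          # convex combination stays inside D
    lam = max(0.0, lam + 0.05 * g(x))        # multiplier (dual ascent) update

print("objective:", np.sum(np.abs(A @ x - b)), "constraint value:", g(x))
```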



Fig. 9: Comparing the proposed algorithm with the J-based heuristic for the same scenario as Fig. 5b (parameter u ∈ [4, 10]). Proposed virtual is solid blue; proposed actual is dashed red; J-based is dashed yellow.
Opportunistic Learning for Markov Decision Systems with Application to Smart Robots

August 2024 · 17 Reads

This paper presents an online method that learns optimal decisions for a discrete time Markov decision problem with an opportunistic structure. The state at time t is a pair (S(t),W(t)) where S(t) takes values in a finite set $\mathcal{S}$ of basic states, and $\{W(t)\}_{t=0}^{\infty}$ is an i.i.d. sequence of random vectors that affect the system and that have an unknown distribution. Every slot t the controller observes (S(t),W(t)) and chooses a control action A(t). The triplet (S(t),W(t),A(t)) determines a vector of costs and the transition probabilities for the next state S(t+1). The goal is to minimize the time average of an objective function subject to additional time average cost constraints. We develop an algorithm that acts on a corresponding virtual system where S(t) is replaced by a decision variable. An equivalence between virtual and actual systems is established by enforcing a collection of time averaged global balance equations. For any desired $\epsilon>0$, we prove the algorithm achieves an $\epsilon$-optimal solution on the virtual system with a convergence time of $O(1/\epsilon^2)$. The actual system runs at the same time, its actions are informed by the virtual system, and its conditional transition probabilities and costs are proven to be the same as the virtual system at every instant of time. Also, its unconditional probabilities and costs are shown in simulation to closely match the virtual system. Our simulations consider online control of a robot that explores a region of interest. Objects with varying rewards appear and disappear and the robot learns what areas to explore and what objects to collect and deliver to a home base.
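For readers unfamiliar with the virtual-queue machinery referenced here, the following is a generic drift-plus-penalty skeleton for a single time-average constraint; it only illustrates the queue-based control idea and is not the paper's virtual-system algorithm. The two actions, the observation w, and the cost model are invented for the example.

```python
import numpy as np

# Generic drift-plus-penalty skeleton (illustration only, not the paper's
# opportunistic virtual-system algorithm).  Each slot: observe a random state,
# pick the action minimizing V*objective + Q*constrained_cost, then update the
# virtual queue Q that enforces the time-average constraint cost <= budget.

rng = np.random.default_rng(0)
T, V, budget = 20000, 50.0, 0.4
Q = 0.0                                      # virtual queue for the constraint
obj_sum = cost_sum = 0.0

def costs(action, w):
    # hypothetical per-slot (objective cost, constrained cost) pairs
    if action == 0:
        return 1.0, 0.0                      # "idle": high objective cost, no resource use
    return 0.2 + 0.1 * w, 1.0                # "act": low objective cost, consumes budget

for t in range(T):
    w = rng.random()                         # observed random state W(t)
    action = min((0, 1), key=lambda a: V * costs(a, w)[0] + Q * costs(a, w)[1])
    obj, c = costs(action, w)
    Q = max(0.0, Q + c - budget)             # queue grows when over budget
    obj_sum += obj
    cost_sum += c

print("time-average objective:", obj_sum / T, "time-average constrained cost:", cost_sum / T)
```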


A Two-Player Resource-Sharing Game with Asymmetric Information

September 2023 · 35 Reads

This paper considers a two-player game where each player chooses a resource from a finite collection of options. Each resource brings a random reward. Both players have statistical information regarding the rewards of each resource. Additionally, there exists an information asymmetry where each player has knowledge of the reward realizations of different subsets of the resources. If both players choose the same resource, the reward is divided equally between them, whereas if they choose different resources, each player gains the full reward of the resource. We first implement the iterative best response algorithm to find an ϵ-approximate Nash equilibrium for this game. This method of finding a Nash equilibrium may not be desirable when players do not trust each other and place no assumptions on the incentives of the opponent. To handle this case, we solve the problem of maximizing the worst-case expected utility of the first player. The solution leads to counter-intuitive insights in certain special cases. To solve the general version of the problem, we develop an efficient algorithmic solution that combines online convex optimization and the drift-plus-penalty technique.
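To make the first step concrete, here is a toy sketch of iterative best response for a symmetric-information version of the sharing rule described above (both players see the same expected rewards, and a contested resource's reward is halved). The reward values are hypothetical, and the information asymmetry that the paper actually studies is ignored.

```python
import numpy as np

# Toy iterative best response for a symmetric-information version of the game
# (illustration only; the paper's setting has asymmetric reward observations).
mu = np.array([1.0, 0.8, 0.5])            # hypothetical expected resource rewards

def best_response(opponent_choice):
    payoff = mu.copy()
    payoff[opponent_choice] /= 2.0        # sharing a resource halves its reward
    return int(np.argmax(payoff))

a, b = 0, 0                               # initial pure strategies
for _ in range(20):
    a_next = best_response(b)
    b_next = best_response(a_next)
    if (a_next, b_next) == (a, b):        # neither player wants to deviate
        break
    a, b = a_next, b_next

print("strategies after best-response iteration:", a, b)
```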


Fig. 1. Case a, b, c, d = 0, 0, 0, 3
A Two-Player Resource-Sharing Game with Asymmetric Information

June 2023 · 54 Reads

This paper considers a two-player game where each player chooses a resource from a finite collection of options without knowing the opponent's choice and in the absence of any form of feedback. Each resource brings a random reward. Both players have statistical information regarding the rewards of each resource. Additionally, there exists an information asymmetry where each player has knowledge of the reward realizations of different subsets of the resources. If both players choose the same resource, the reward is divided equally between them, whereas if they choose different resources, each player gains the full reward of the resource. We first implement the iterative best response algorithm to find an $\epsilon$-approximate Nash equilibrium for this game. This method of finding a Nash equilibrium is impractical when players do not trust each other and place no assumptions on the incentives of the opponent. To handle this case, we solve the problem of maximizing the worst-case expected utility of the first player. The solution leads to counter-intuitive insights in certain special cases. To solve the general version of the problem, we develop an efficient algorithmic solution that combines online convex optimization and the drift-plus-penalty technique.



Projection-Free Non-Smooth Convex Programming

August 2022 · 24 Reads

In this paper, we provide a sub-gradient-based algorithm to solve general constrained convex optimization without taking projections onto the domain set. The well-studied Frank-Wolfe type algorithms also avoid projections. However, they are only designed to handle smooth objective functions. The proposed algorithm treats both smooth and non-smooth problems and achieves an $O(1/\sqrt{T})$ convergence rate (which matches existing lower bounds). The algorithm yields similar performance in expectation when the deterministic sub-gradients are replaced by stochastic sub-gradients. Thus, the proposed algorithm is a projection-free alternative to the Projected sub-Gradient Descent (PGD) and Stochastic projected sub-Gradient Descent (SGD) algorithms.


Fig. 1. An illustration of the decision set D(1) compared with an alternative logarithmic decision set. The solid curve shows the (X_1[t], X_2[t]) points in D(1) when S[t] = 1; the dashed curve shows the points (C_1[t], C_2[t]) defined by (8)-(9).
Fig. 2. The set C of all one-shot expectations E[(X_1[0], X_2[0])] achievable in the 2-user system for the case q = 0.5, where S[t] is i.i.d. Bernoulli with P[S[t] = 1] = q; the set includes all points on or between the lower boundary (16) and the upper boundary (17).
A Converse Result on Convergence Time for Opportunistic Wireless Scheduling

August 2022 · 19 Reads · 2 Citations · IEEE/ACM Transactions on Networking

This paper proves an impossibility result for stochastic network utility maximization for multi-user wireless systems, including multiple access and broadcast systems. Every time slot an access point observes the current channel states for each user and opportunistically selects a vector of transmission rates. Channel state vectors are assumed to be independent and identically distributed with an unknown probability distribution. The goal is to learn to make decisions over time that maximize a concave utility function of the running time average transmission rate of each user. Recently it was shown that a stochastic Frank-Wolfe algorithm converges to utility-optimality with an error of $O(\log(T)/T)$, where T is the time the algorithm has been running. An existing $\Omega(1/T)$ converse is known. The current paper improves the converse to $\Omega(\log(T)/T)$, which matches the known achievability result. It does this by constructing a particular (simple) system for which no algorithm can achieve a better performance. The proof uses a novel reduction of the opportunistic scheduling problem to a problem of estimating a Bernoulli probability p from independent and identically distributed samples. Along the way we refine a regret bound for Bernoulli estimation to show that, for any sequence of estimators, the set of values $p \in [0,1]$ under which the estimators perform poorly has measure at least 1/6.


Bregman-style Online Convex Optimization with Energy Harvesting Constraints

June 2022 · 2 Reads · ACM SIGMETRICS Performance Evaluation Review

This paper considers online convex optimization (OCO) problems where decisions are constrained by available energy resources. A key scenario is optimal power control for an energy harvesting device with a finite capacity battery. The goal is to minimize a time-average loss function while keeping the used energy less than what is available. In this setup, the distribution of the randomly arriving harvestable energy (which is assumed to be i.i.d.) is unknown, the current loss function is unknown, and the controller is only informed by the history of past observations. A prior algorithm is known to achieve O(√T) regret by using a battery with an O(√T) capacity. This paper develops a new algorithm that maintains this asymptotic trade-off with the number of time steps T while improving dependency on the dimension of the decision vector from O(√n) to O(√log(n)). The proposed algorithm introduces a separation of the decision vector into amplitude and direction components. It uses two distinct types of Bregman divergence, together with energy queue information, to make decisions for each component.
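As a rough picture of the amplitude/direction decomposition mentioned above (and of why an entropy-type Bregman divergence gives the log(n) dimension dependence), the sketch below updates a simplex-constrained direction with an exponentiated-gradient step and a scalar amplitude with a Euclidean step penalized by a virtual energy queue. The loss model, energy model, and step sizes are all hypothetical; this is not the paper's algorithm.

```python
import numpy as np

# Toy sketch of the amplitude/direction split (not the paper's algorithm):
# the decision is x_t = a_t * d_t, with d_t on the probability simplex updated
# by exponentiated gradient (entropy Bregman divergence) and the scalar a_t
# updated by a Euclidean step weighted by a virtual energy queue Q.

rng = np.random.default_rng(1)
n, T = 50, 5000
a_max, eta_d, eta_a, V = 1.0, 0.1, 0.01, 10.0
d = np.full(n, 1.0 / n)                  # direction on the simplex
a, Q = 0.5, 0.0                          # amplitude and energy-deficit queue

for t in range(T):
    g_t = rng.standard_normal(n)         # (hypothetical) loss gradient at x_t
    e_t = rng.uniform(0.0, 0.6)          # harvested energy this slot
    x = a * d
    Q = max(0.0, Q + np.sum(x) - e_t)    # energy used (assumed = sum of x) vs. harvested
    d = d * np.exp(-eta_d * a * g_t)     # exponentiated-gradient direction update
    d /= d.sum()
    grad_a = float(g_t @ d) + Q / V      # loss sensitivity plus energy-queue penalty
    a = min(a_max, max(0.0, a - eta_a * grad_a))

print("final amplitude:", a, "final queue backlog:", Q)
```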


Citations (74)


... This can happen in settings where the population of the agents is not fixed and agents are unaware of the total number of agents currently present or their own index in the population. An example of such a situation for a multi-access communication problem is described in [11]. When an agent doesn't know its own index ("Am I agent 1 or agent 2?"), it makes sense to use symmetric (i.e. ...

Reference:

Optimal Symmetric Strategies in Multi-Agent Systems with Decentralized Information
Repeated Games, Optimal Channel Capture, and Open Problems for Slotted Multiple Access
  • Citing Conference Paper
  • September 2022

... To measure the freshness of data, the concept of Age of Information (AoI) has been introduced over the last decade (see, for example, [2]- [4]), which is defined concisely as the elapsed time since the generation time of the last received status update. Since the introduction of the AoI metric, numerous related studies emerged in various networking scenarios, including wireless random access networks (e.g., [5], [6]), content distribution networks (e.g., [7], [8]), scheduling (e.g., [9]- [13]), queuing networks (e.g., [14], [15]), and vehicular networks (e.g., [16]). ...

Efficient Distributed MAC for Dynamic Demands: Congestion and Age Based Designs
  • Citing Article
  • January 2022

IEEE/ACM Transactions on Networking

... Let E_p[|A_p[k] − p|] denote the expected mean absolute error given the true parameter is p. The following Bernoulli estimation lemma for mean absolute error is from [28] and is a modified version of a lemma for mean squared error developed in [25]: ...

A Converse Result on Convergence Time for Opportunistic Wireless Scheduling

IEEE/ACM Transactions on Networking

... The key lemma establishes the bound of "one-step regret + Lyapunov drift" in (6) and bridges the analysis to bound both regret and the virtual queue (i.e., constraint violation). The upper bound includes the key cross-term, the proximal bias D(x, x_t, x_{t+1}), and the trade-off factor ξ. We choose the parameters {V, η, ξ} to minimize the upper bound. ...

Bregman-style Online Convex Optimization with Energy Harvesting Constraints
  • Citing Article
  • November 2020

Proceedings of the ACM on Measurement and Analysis of Computing Systems

... Some previous works focus on the case in which constraints are generated i.i.d. according to some unknown stochastic model [6,5], or generated by an oblivious adversary, under strong assumptions on the structure of the problem or with a weaker regret metric [8,10,7,9]. Castiglioni et al. [4] unified the stochastic and obliviously adversarial constraint-generation settings, and extended the online convex optimization framework by allowing general non-convex functions f_t and c_{it} and arbitrary feasibility sets X. ...

Online Primal-Dual Mirror Descent under Stochastic Constraints
  • Citing Article
  • June 2020

Proceedings of the ACM on Measurement and Analysis of Computing Systems

... Comparison of regret and constraint-violation bounds across Steinhardt and Liang (2014), Yang et al. (2014), Mahdavi, Jin, and Yang (2012), Jenatton, Huang, and Archambeau (2016) (rates T^{max{β,1−β}} and T^{1−β/2}), Yu, Neely, and Wei (2017) (√T and √T), Yuan and Lamperski (2018), and Wei, Yu, and Neely (2020). The parameter β satisfies β ∈ (0, 1). ...

Online Primal-Dual Mirror Descent under Stochastic Constraints
  • Citing Conference Paper
  • June 2019

... To solve this problem, we design the online target coverage optimization technique using Lyapunov optimization theory. Following [33,34], we use the stability of a virtual queue to replace constraint (13). As a result, the virtual queue Q_i(t+1) is defined as follows. ...

Learning-Aided Optimization for Energy-Harvesting Devices With Outdated State Information
  • Citing Article
  • July 2019

IEEE/ACM Transactions on Networking

... An O(√T) regret algorithm is developed using online convex programming and a quasi-stationary assumption on the Markov chain. The model is extended in [31] to allow time-varying constraint costs and coupled multi-chains, again with O(√T) regret; see also a recent treatment in [32]. MDPs where transition probabilities are allowed to vary slowly over time are considered in [33]. ...

Online Learning in Weakly Coupled Markov Decision Processes: A Convergence Time Study
  • Citing Article
  • January 2019

ACM SIGMETRICS Performance Evaluation Review

... This research provides insights into the use of piezoelectric energy harvesters as a potential energy source in multi-source energy harvesting systems. Another study explored outdated state information in optimizing energy harvesting devices [4]. A learning-aided algorithm was proposed that achieves utility within a specified range of the optimal solution. ...

Learning Aided Optimization for Energy Harvesting Devices with Outdated State Information
  • Citing Conference Paper
  • April 2018