About
18 Publications
3,942 Reads
510 Citations
Publications (18)
This paper studies the extended mean-field control problems with state-control joint law dependence and Poissonian common noise. We develop the stochastic maximum principle (SMP) and establish the connection to the Hamilton-Jacobi-Bellman (HJB) equation on the Wasserstein space. The presence of the conditional joint law in the McKean-Vlasov dyna...
This paper studies the continuous-time q-learning in mean-field jump-diffusion models from the representative agent's perspective. To overcome the challenge that the population distribution may not be directly observable, we introduce the integrated q-function in decoupled form (decoupled Iq-function) and establish its martingale characterizati...
One of the challenges for multiagent reinforcement learning (MARL) is designing efficient learning algorithms for a large system in which each agent has only limited or partial information of the entire system. Whereas exciting progress has been made to analyze decentralized MARL with the network of agents for social networks and team video games,...
This paper studies the q-learning, recently coined as the continuous-time counterpart of Q-learning by Jia and Zhou (2022c), for continuous time McKean-Vlasov control problems in the setting of entropy-regularized reinforcement learning. In contrast to the single agent's control problem in Jia and Zhou (2022c), the mean-field interaction of agents...
We establish Itô's formula along flows of probability measures associated with general semimartingales; this generalizes existing results for flows of measures on Itô processes. Our approach is to first establish Itô's formula for cylindrical functions and then extend it to the general case via function approximation and localization techniques.
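As a gloss on the approach described in this abstract (notation illustrative, not taken verbatim from the paper): a cylindrical function on the space of probability measures has the form

```latex
F(\mu) = f\!\left( \int \varphi_1 \,\mathrm{d}\mu, \; \ldots, \; \int \varphi_n \,\mathrm{d}\mu \right),
```

where $f$ is smooth and the $\varphi_i$ are test functions. For such $F$, the process $t \mapsto F(\mu_t)$ depends on the flow only through the real-valued semimartingales $Y_t^i = \int \varphi_i \,\mathrm{d}\mu_t$, so Itô's formula on measure space reduces to the classical finite-dimensional Itô formula applied to $(Y^1, \ldots, Y^n)$; the general case then follows by the approximation and localization step the abstract mentions.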
Multiagent systems—such as recommendation systems, ride-sharing platforms, food-delivery systems, and data-routing centers—are areas of rapid technology development that require constant improvements to address the lack of efficiency and curse of dimensionality. In the paper “Dynamic Programming Principles for Mean-Field Controls with Learning,” we...
One of the challenges for multi-agent reinforcement learning (MARL) is designing efficient learning algorithms for a large system in which each agent has only limited or partial information of the entire system. In this system, it is desirable to learn policies of a decentralized type. A recent and promising paradigm to analyze such decentralized M...
This paper focuses on a dynamic multi‐asset mean‐variance portfolio selection problem under model uncertainty. We develop a continuous time framework for taking into account ambiguity aversion about both expected return rates and correlation matrix of the assets, and for studying the joint effects on portfolio diversification. The dynamic setting al...
We establish Itô's formula along flows of probability measures associated with general semimartingales; this generalizes existing results for flows of measures on Itô processes. Our approach is to first establish Itô's formula for cylindrical functions and then extend it to the general case via function approximation and localization techniqu...
Multi-agent reinforcement learning (MARL), despite its popularity and empirical success, suffers from the curse of dimensionality. This paper builds the mathematical framework to approximate cooperative MARL by a mean-field control (MFC) framework, and shows that the approximation error is of O(1/√N). By establishing appropriate form...
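The O(1/√N) rate is the familiar fluctuation scale of empirical measures around their mean-field limit. A minimal numerical sketch (ours, not the paper's algorithm) of how the gap between an N-agent average and its limiting expectation shrinks at that rate:

```python
import random
import statistics

def empirical_mean_error(n_agents, n_trials=200, seed=0):
    """Average |empirical mean - true mean| over repeated draws.

    Each agent's state is an i.i.d. standard Gaussian; the mean-field
    limit replaces the empirical average by the true expectation (0),
    so the absolute gap is the approximation error for this toy system.
    """
    rng = random.Random(seed)
    errors = []
    for _ in range(n_trials):
        states = [rng.gauss(0.0, 1.0) for _ in range(n_agents)]
        errors.append(abs(sum(states) / n_agents))
    return statistics.mean(errors)

# Quadrupling the number of agents should roughly halve the error,
# consistent with an O(1/sqrt(N)) rate.
for n in (100, 400, 1600):
    print(n, empirical_mean_error(n))
```

This only illustrates the statistical rate for an uncontrolled i.i.d. population; the paper's contribution is establishing the analogous rate for the controlled, interacting system.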
Dynamic programming principle (DPP), or the time consistency property, is fundamental for Markov decision problems (MDPs), for reinforcement learning (RL), and more recently for mean-field controls (MFCs). However, in the learning framework of MFCs, DPP has not been rigorously established, despite its potential for algorithm design. In this paper...
This thesis deals with the study of optimal control of McKean-Vlasov dynamics and its applications in mathematical finance. This thesis contains two parts. In the first part, we develop the dynamic programming (DP) method for solving the McKean-Vlasov control problem. Using suitable admissible controls, we propose to reformulate the value function of the p...
This paper is concerned with a multi-asset mean-variance portfolio selection problem under model uncertainty. We develop a continuous time framework for taking into account ambiguity aversion about both expected return rates and correlation matrix of the assets, and for studying the effects on portfolio diversification. We prove a separation princi...
We consider the stochastic optimal control problem of nonlinear mean-field systems in discrete time. We reformulate the problem into a deterministic control problem with the marginal distribution as the controlled state variable, and prove that the dynamic programming principle holds in its general form. We apply our method to solve explicitly the mean-vari...
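In symbols (our paraphrase of the reformulation, with illustrative notation): writing $\mu_t$ for the marginal law of the state and $\Phi$ for the deterministic push-forward induced by the dynamics under a feedback control $a$,

```latex
\mu_{t+1} = \Phi(\mu_t, a_t), \qquad
V_t(\mu) = \sup_{a} \Big[ \hat r(\mu, a) + V_{t+1}\big( \Phi(\mu, a) \big) \Big],
```

so the dynamic programming principle takes its classical deterministic form, now on the space of probability measures, with the running reward $\hat r$ averaged over $\mu$.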
We study the optimal control of a general stochastic McKean-Vlasov equation. Such a problem is motivated originally from the asymptotic formulation of cooperative equilibrium for a large population of particles (players) in mean-field interaction under common noise. Our first main result is to state a dynamic programming principle for the value functio...
We consider the stochastic optimal control problem of a McKean-Vlasov stochastic differential equation. By using feedback controls, we reformulate the problem into a deterministic control problem with only the marginal distribution as the controlled state variable, and prove that the dynamic programming principle holds in its general form. Then, by relying o...