
Fabrizio Lillo- Scuola Normale Superiore
348 publications · 64,921 reads · 10,303 citations
Publications (348)
Devising models of the limit order book that realistically reproduce the market response to exogenous trades is extremely challenging and fundamental in order to test trading strategies. We propose a novel explainable model for small tick assets, the Non-Markovian Zero Intelligence, which is a variant of the well-known Zero Intelligence model. The...
Blockchain technology has revolutionized financial markets by enabling decentralized exchanges (DEXs) that operate without intermediaries. Uniswap V2, a leading DEX, facilitates the rapid creation and trading of new tokens, offering high return potential but exposing investors to significant risks. In this work, we analyze the financial impact of n...
Estimating market impact and transaction costs of large trades (metaorders) is a very important topic in finance. However, models of price and trade based on public market data provide average price trajectories that are qualitatively different from what is observed during real metaorder executions: the price increases linearly, rather than...
We propose a theory of unimodal maps perturbed by a heteroscedastic Markov chain noise and experiencing another heteroscedastic noise due to uncertain observation. We address and treat the filtering problem, showing that by collecting more and more observations, one would predict the same distribution for the state of the underlying Markov chain no...
Motivated by the increasing abundance of data describing real-world networks that exhibit dynamical features, we propose an extension of the exponential random graph models (ERGMs) that accommodates the time variation of its parameters. Inspired by the fast-growing literature on dynamic conditional score models, each parameter evolves according to...
While Carbon Dioxide Removal (CDR) solutions are considered essential to meet Paris Agreement objectives and curb climate change, their maturity and current ability to operate at scale are highly debated. The rapid development, deployment, and diffusion of such methods will likely require the coordination of science, technology, policy, and societa...
Change points in real-world systems mark significant regime shifts in system dynamics, possibly triggered by exogenous or endogenous factors. These points define regimes for the time evolution of the system and are crucial for understanding transitions in financial, economic, social, environmental, and technological contexts. Building upon the Baye...
In financial risk management, Value at Risk (VaR) is widely used to estimate potential portfolio losses. VaR's limitation is its inability to account for the magnitude of losses beyond a certain threshold. Expected Shortfall (ES) addresses this by providing the conditional expectation of such exceedances, offering a more comprehensive measure of ta...
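As an illustration of the two risk measures named in this abstract, the following is a minimal sketch of their textbook empirical (historical-simulation) versions. It is not the estimation method of the paper, whose details are truncated above. The function name `historical_var_es`, the quantile convention, and the sample data are assumptions made for the example.

```python
import math


def historical_var_es(losses, alpha=0.95):
    """Empirical VaR and ES of a loss sample at confidence level alpha.

    Assumed convention: VaR is the smallest observed loss whose empirical
    CDF is at least alpha; ES is the mean of the losses at or beyond VaR.
    Positive numbers denote losses.
    """
    if not losses:
        raise ValueError("losses must be non-empty")
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    ordered = sorted(losses)
    # Zero-based index of the empirical alpha-quantile.
    k = max(math.ceil(alpha * len(ordered)) - 1, 0)
    var = ordered[k]
    tail = ordered[k:]  # losses at or beyond the VaR threshold
    es = sum(tail) / len(tail)
    return var, es


# Hypothetical loss sample, for illustration only.
sample_losses = [1.0, 2.0, 3.0, 4.0]
var_50, es_50 = historical_var_es(sample_losses, alpha=0.5)
print(var_50, es_50)
```

By construction the ES returned here is never smaller than the VaR, which reflects the point made in the abstract: ES summarizes the size of the losses beyond the threshold, while VaR only locates the threshold.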
We investigate and prove the mathematical properties of a general class of one-dimensional unimodal smooth maps perturbed with a heteroscedastic noise. Specifically, we investigate the stability of the associated Markov chain, show the weak convergence of the unique stationary measure to the invariant measure of the map, and show that the average L...
Financial order flow exhibits a remarkable level of persistence, wherein buy (sell) trades are often followed by subsequent buy (sell) trades over extended periods. This persistence can be attributed to the division and gradual execution of large orders. Consequently, distinct order flow regimes might emerge, which can be identified through suitabl...
Using a perturbation approach, we derive a new approximate filtering and smoothing methodology for a general class of state-space models including univariate and multivariate location, scale, and count data models. The main properties of the methodology can be summarized as follows: (i) it generalizes several existing approaches to robust filtering...
A large body of empirical literature has shown that market impact of financial prices is transient. However, from a theoretical standpoint, the origin of this temporary nature is still unclear. We show that an implied transient impact arises from the Nash equilibrium between a directional trader and one arbitrageur in a market impact game with fixe...
Estimators of integrated and spot variance are tested using the queue-reactive model of the limit order book
The estimation of market impact is crucial for measuring the information content of trades and for transaction cost analysis. Hasbrouck's (1991) seminal paper proposed a Structural-VAR (S-VAR) to jointly model mid-quote changes and trade signs. Recent literature has highlighted some pitfalls of this approach: S-VAR models can be misspecified when t...
A crucial aspect of every experiment is the formulation of hypotheses prior to data collection. In this paper, we use a simulation-based approach to generate synthetic data and formulate the hypotheses for our market experiment and calibrate its laboratory design. In this experiment, we extend well-established laboratory market models to the two-as...
Identifying market abuse activity from data on investors' trading activity is very challenging because of both the data volume and the low signal-to-noise ratio. Here we propose two complementary unsupervised machine learning methods to support market surveillance aimed at identifying potential insider trading activities. The first one uses clustering...
We consider the general problem of a set of agents trading a portfolio of assets in the presence of transient price impact and additional quadratic transaction costs and we study, with analytical and numerical methods, the resulting Nash equilibria. Extending significantly the framework of Schied and Zhang (2019) and Luo and Schied (2020), who cons...
A common issue when analyzing real-world complex systems is that the interactions between their elements often change over time. Here we propose a new modeling approach for time-varying interactions generalising the well-known Kinetic Ising Model, a minimalistic pairwise constant interactions model which has found applications in several scientific...
The digitalization of news and social media provides an unprecedented source to investigate the role of information on market dynamics. However, the observed sentiment time-series represent a noisy proxy of the true investor sentiment. Moreover, modeling the joint dynamics of different sentiment series can be beneficial for the assessment of their...
This paper presents results from the SESAR ER3 Domino project. Three mechanisms are assessed at the ECAC-wide level: 4D trajectory adjustments (a combination of actively waiting for connecting passengers and dynamic cost indexing), flight prioritisation (enabling ATFM slot swapping at arrival regulations), and flight arrival coordination (where fli...
While the vast majority of the literature on models for temporal networks focuses on binary graphs, often one can associate a weight to each link. In such cases the data are better described by a weighted, or valued, network. An important well known fact is that real world weighted networks are typically sparse. We propose a novel time varying para...
This paper investigates how Covid mobility restrictions impacted the population of investors of the Italian stock market. The analysis tracks the trading activity of individual investors in Italian stocks in the period January 2019-September 2021, investigating how their composition and the trading activity changed around the Covid-19 lockdown peri...
We study information dynamics between the largest Bitcoin exchange markets during the bubble in 2017–2018. By analyzing high-frequency market microstructure observables with different information-theoretic measures for dynamical systems, we find temporal changes in information sharing across markets. In particular, we study time-varying components...
Many of the biological, social and man-made networks around us are inherently dynamic, with their links switching on and off over time. The evolution of these networks is often observed to be non-Markovian, and the dynamics of their links are often correlated. Hence, to accurately model these networks, predict their evolution, and understand how in...
The estimation of the volatility with high-frequency data is plagued by the presence of microstructure noise, which leads to biased measures. Alternative estimators have been developed and tested either on specific structures of the noise or by the speed of convergence to their asymptotic distributions. Gatheral and Oomen (2010) proposed to use the...
We study the information dynamics between the largest Bitcoin exchange markets during the bubble in 2017-2018. By analysing high-frequency market-microstructure observables with different information theoretic measures for dynamical systems, we find temporal changes in information sharing across markets. In particular, we study the time-varying com...
We study the problem of the intraday short-term volume forecasting in cryptocurrency multi-markets. The predictions are built by using transaction and order book data from different markets where the exchange takes place. Methodologically, we propose a temporal mixture ensemble, capable of adaptively exploiting, for the forecasting, different sourc...
Market liquidity is a latent and dynamic variable. We propose a dynamical linear price impact model at high frequency in which the price impact coefficient is a product of a daily, a diurnal, and an auto-regressive stochastic intraday component. We estimate the model using a Kalman filter on order book data for stocks traded on the NASDAQ in 2016....
I present an overview of some recent advancements on the empirical analysis and theoretical modeling of the process of price formation in financial markets as the result of the arrival of orders in a limit order book exchange. After discussing critically the possible modeling approaches and the observed stylized facts of order flow, I consider in d...
We consider a model of a simple financial system consisting of a leveraged investor that invests in a risky asset and manages risk by using Value-at-Risk (VaR). The VaR is estimated by using past data via an adaptive expectation scheme. We show that the leverage dynamics can be described by a dynamical system of slow-fast type associated with a uni...
By exploiting a bipartite network representation of the relationships between mutual funds and portfolio holdings, we propose an indicator that we derive from the analysis of the network, labelled the Average Commonality Coefficient (ACC), which measures how frequently the assets in the fund portfolio are present in the portfolios of the other fund...
A recent trend in algorithm design consists of augmenting classic data structures with machine learning models, which are better suited to reveal and exploit patterns and trends in the input data so as to achieve outstanding practical improvements in space occupancy and time efficiency. This is especially well known in the context of indexing data structur...
Betweenness centrality quantifies the importance of a vertex for the information flow in a network. The standard betweenness centrality applies to static single-layer networks, but many real world networks are both dynamic and made of several layers. We propose a definition of betweenness centrality for temporal multiplexes. This definition account...
Binary random variables are the building blocks used to describe a large variety of systems, from magnetic spins to financial time series and neuron activity. In statistical physics the kinetic Ising model has been introduced to describe the dynamics of the magnetic moments of a spin lattice, while in time series analysis discrete autoregressive pr...
We study the problem of estimating the total number of searches (volume) of queries in a specific domain, which were submitted to a search engine in a given time period. Our statistical model assumes that the distribution of searches follows Zipf's law, and that the observed sample volumes are biased according to three possible scenarios. These...
Identifying risk spillovers in financial markets is of great importance for assessing systemic risk and portfolio management. Granger causality in tail (or in risk) tests whether past extreme events of a time series help predict future extreme events of another time series. The topology and connectedness of networks built with Granger causality...
Scientific discovery is shaped by scientists’ choices and thus by their career patterns. The increasing knowledge required to work at the frontier of science makes it harder for an individual to embark on unexplored paths. Yet collaborations can reduce learning costs—albeit at the expense of increased coordination costs. In this article, we use dat...
Binary random variables are the building blocks used to describe a large variety of systems, from magnetic spins to financial time series and neuron activity. In Statistical Physics the Kinetic Ising Model has been introduced to describe the dynamics of the magnetic moments of a spin lattice, while in time series analysis discrete autoregressive pr...
We propose a method to infer lead-lag networks of traders from the observation of their trade record as well as to reconstruct their state of supply and demand when they do not trade. The method relies on the Kinetic Ising model to describe how information propagates among traders, assigning a positive or negative ‘opinion’ to all agents about whet...
We introduce a generalization of the Kinetic Ising Model using the score-driven approach, which allows the efficient estimation and filtering of time-varying parameters from time series data. We show that this approach overcomes systematic errors in the parameter estimation, and is useful to study complex systems of interacting variables w...
In ATM systems, the massive number of interacting entities makes it difficult to identify critical elements and paths of disturbance propagation, as well as to predict the system-wide effects that innovations might have. To this end, suitable metrics are required to assess the role of the interconnections between the elements and complex network sc...
We study the problem of the intraday short-term volume forecasting in cryptocurrency exchange markets. The predictions are built by using transaction and order book data from different markets where the exchange takes place. Methodologically, we propose a temporal mixture ensemble model, capable of adaptively exploiting, for the forecasting, differ...
We revisit the trading invariance hypothesis recently proposed by Kyle, A.S. and Obizhaeva, A.A. [‘Market microstructure invariance: Empirical hypotheses.’ Econometrica, 2016, 84(4), 1345–1404] by empirically investigating a large dataset of metaorders provided by ANcerno. The hypothesis predicts that the quantity $I := W/N^{3/2}$, where W is the daily ex...
We consider the general problem of a set of agents trading a portfolio of assets in the presence of transient price impact and additional quadratic transaction costs and we study, with analytical and numerical methods, the resulting Nash equilibria. Extending significantly the framework of Schied and Zhang (2018), who considered two agents and one...
The analysis of the intraday dynamics of covariances among high-frequency returns is challenging due to asynchronous trading and market microstructure noise. Both effects lead to significant data reduction and may severely affect the estimation of the covariances if traditional methods for low-frequency data are employed. We propose to model intrad...
Scientific discovery is shaped by scientists' choices and thus by their career patterns. The increasing knowledge required to work at the frontier of science makes it harder for an individual to embark on unexplored paths. Yet collaborations can reduce learning costs -- albeit at the expense of increased coordination costs. In this article, we use...
Betweenness centrality quantifies the importance of a vertex for the information flow in a network. We propose a flexible definition of betweenness for temporal multiplexes, where geodesics are determined accounting for the topological and temporal structure and the duration of paths. We propose an algorithm to compute the new metric via a mapping...
Aggregate and systemic risk in complex systems are emergent phenomena depending on two properties: the idiosyncratic risk of the elements and the topology of the network of interactions among them. While significant attention has been given to aggregate risk assessment and risk propagation once the above two properties are given, less is known ab...
In ATM systems, the massive number of interacting entities makes it difficult to predict the system-wide effects that innovations might have. Here, we present the approach proposed by the project Domino to assess and identify the impact that innovations might bring for the different stakeholders, based on agent-based modelling and complex network s...
The complex networks approach has been gaining popularity in analysing investor behaviour and stock markets, but within this approach, initial public offerings (IPOs) have barely been explored. We fill this gap in the literature by analysing investor clusters in the first two years after the IPO filing in the Helsinki Stock Exchange by using a stat...
We propose a novel approach to sentiment data filtering for a portfolio of assets. In our framework, a dynamic factor model drives the evolution of the observed sentiment and makes it possible to identify two distinct components: a long-term component, modeled as a random walk, and a short-term component driven by a stationary VAR(1) process. Our model encomp...
This paper is devoted to the important yet unexplored subject of crowding effects on market impact, that we call ‘co-impact’. Our analysis is based on a large database of metaorders by institutional investors in the U.S. equity market. We find that the market chiefly reacts to the net order flow of ongoing metaorders, without individually distingui...
We propose a method to infer lead-lag networks of traders from the observation of their trade record as well as to reconstruct their state of supply and demand when they do not trade. The method relies on the Kinetic Ising model to describe how information propagates among traders, assigning a positive or negative "opinion" to all agents about whe...
Many real-world biological, social and man-made networks are inherently dynamic, with their links switching on and off over time. In particular, the evolution of these networks is often observed to be non-Markovian, and the dynamics of their links are often correlated. Hence, to accurately model these networks, the inclusion of both memory and dyna...
In complex networks, centrality metrics quantify the connectivity of nodes and identify the most important ones in the transmission of signals. In many real world networks, especially in transportation systems, links are dynamic, i.e. their presence depends on time, and travelling between two nodes requires a non-vanishing time. Additionally, many...
We propose a dynamic network model where two mechanisms control the probability of a link between two nodes: (i) the existence or absence of this link in the past, and (ii) node-specific latent variables (dynamic fitnesses) describing the propensity of each node to create links. Assuming a Markov dynamics for both mechanisms, we propose an Expectat...
We present an empirical study of price reversion after the executed metaorders. We use a dataset with more than 8 million metaorders executed by institutional investors in the US equity market. We show that relaxation takes place as soon as the metaorder ends: while at the end of the same day, it is on average approximately 2/3 of the peak impact...
We consider the problem of inferring a causality structure from multiple binary time series by using the kinetic Ising model in datasets where a fraction of observations is missing. Inspired by recent work on mean field methods for the inference of the model with hidden spins, we develop a pseudo-expectation-maximization algorithm that is able to w...
The complex networks approach has been gaining popularity in analysing investor behaviour and stock markets, but within this approach, initial public offerings (IPO) have barely been explored. We fill this gap in the literature by analysing investor clusters in the first two years after the IPO filing in the Helsinki Stock Exchange by using a stati...
Motivated by the evidence that real-world networks evolve in time and may exhibit non-stationary features, we propose an extension of the Exponential Random Graph Models (ERGMs) accommodating the time variation of network parameters. Within the ERGM framework, a network realization is sampled from a static probability distribution defined parametri...
We study the problem of estimating the total volume of queries of a specific domain, which were submitted to the Google search engine in a given time period. Our statistical model assumes a Zipf's law distribution of the population in the reference domain, and a non-uniform or noisy sampling of queries. Parameters of the distribution are estimated...
We study the problem of identifying macroscopic structures in networks, characterizing the impact of introducing link directions on the detectability phase transition. To this end, building on the stochastic block model, we construct a class of nontrivially detectable directed networks. We find closed-form solutions by using the belief propagation...
We consider how to optimally allocate investments in a portfolio of competing technologies using the standard mean-variance framework of portfolio theory. We assume that technologies follow the empirically observed relationship known as Wright's law, also called a “learning curve” or “experience curve”, which postulates that costs drop as cumulativ...
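The relationship named in this abstract can be written as a power law. The sketch below uses the standard textbook form of Wright's law, in which unit cost falls as a power of cumulative production; it does not reproduce the paper's portfolio model. The function names, the parameter names, and the numbers are assumptions chosen for illustration.

```python
def wright_cost(cumulative_production, initial_cost, exponent):
    """Unit cost under Wright's law: c(x) = c(1) * x ** (-exponent).

    `initial_cost` is the cost of the first unit (x = 1) and `exponent`
    is the non-negative experience exponent. Both are illustrative inputs.
    """
    if cumulative_production <= 0:
        raise ValueError("cumulative_production must be positive")
    return initial_cost * cumulative_production ** (-exponent)


def learning_rate(exponent):
    """Fractional cost reduction for each doubling of cumulative production."""
    return 1.0 - 2.0 ** (-exponent)


# Hypothetical technology: first unit costs 100, experience exponent 0.5.
for units in (1, 4, 16):
    print(units, wright_cost(units, 100.0, 0.5))
print("cost reduction per doubling:", learning_rate(0.5))
```

A constant fractional reduction per doubling of cumulative production is what makes the curve a straight line on log-log axes.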
Many biological, social and man-made systems are better described in terms of temporal networks, i.e. networks whose links are only present at certain points in time, rather than by static ones. In particular, it has been found that non-Markovianity is a necessary ingredient to capture the non-trivial temporal patterns of real-world networks. Howev...
Using a large database of 8 million institutional trades executed in the U.S. equity market, we establish a clear crossover between a linear market impact regime and a square-root regime as a function of the volume of the order. Our empirical results are remarkably well explained by a recently proposed dynamical theory of liquidity that makes speci...
This note is commenting on Hasbrouck (2018). The paper investigates the problem of price discovery on markets with trades recorded at sub-millisecond frequencies. The application of the popular information share measure of Hasbrouck (1995) to such data faces several difficulties, as the underlying vector error correction models would need a huge nu...
We present an analytical model to study the role of expectation feedbacks and overlapping portfolios on systemic stability of financial systems. Building on Corsi et al. (2016), we model a set of financial institutions having Value-at-Risk capital requirements and investing in a portfolio of risky assets, whose prices evolve stochastically in time...
This note is commenting on Hasbrouck (2018). The paper investigates the problem of price discovery on markets with trades recorded at sub-millisecond frequencies. The application of the popular information share measure of Hasbrouck (1995) to such data faces several difficulties, as the underlying VECM would need a huge number of lags to capture dy...
We revisit the trading invariance hypothesis recently proposed by Kyle and Obizhaeva by empirically investigating a large dataset of bets, or metaorders, provided by ANcerno. The hypothesis predicts that the quantity $I := \mathcal{R}/N^{3/2}$, where $\mathcal{R}$ is the exchanged risk (volatility $\times$ volume $\times$ price) and $N$ is the number of bets, is inv...
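The quantity defined in this abstract can be computed directly from its stated ingredients. The following is a minimal sketch, assuming the exchanged risk is the plain product volatility × volume × price as written above; the function name `trading_invariant`, the units, and the input values are hypothetical.

```python
def trading_invariant(volatility, volume, price, n_bets):
    """Return I = R / N**(3/2), with R = volatility * volume * price.

    All four inputs are illustrative; the abstract does not fix their
    units or the time window over which they are measured.
    """
    if n_bets <= 0:
        raise ValueError("n_bets must be positive")
    exchanged_risk = volatility * volume * price
    return exchanged_risk / n_bets ** 1.5


# Hypothetical values: R = 0.5 * 8 * 2 = 8 and N = 4, so N**1.5 = 8.
print(trading_invariant(0.5, 8.0, 2.0, 4))

# If the number of bets grows by a factor 4 and the exchanged risk by a
# factor 4**1.5 = 8, the quantity I is unchanged.
print(trading_invariant(0.5, 64.0, 2.0, 16))
```

The second call shows the scaling implied by the definition: I stays fixed only when the exchanged risk grows as the 3/2 power of the number of bets.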
We present an empirical study of price reversion after the executed metaorders. We use a data set with more than 8 million metaorders executed by institutional investors in the US equity market. We show that relaxation takes place as soon as the metaorder ends: while at the end of the same day it is on average $\approx 2/3$ of the peak impact, the...
We solve the problem of optimal liquidation with volume weighted average price (VWAP) benchmark when the market impact is linear and transient. Our setting is indeed more general as it considers the case when the trading interval is not necessarily coincident with the benchmark interval: Implementation Shortfall and Target Close execution are shown...