Theory of Financial Risk and Derivative Pricing

Abstract
Risk control and derivative pricing have become of major concern to financial institutions, and there is a real need for adequate statistical tools to measure and anticipate the amplitude of the potential moves of the financial markets. Summarising theoretical developments in the field, this 2003 second edition has been substantially expanded. Additional chapters now cover stochastic processes, Monte-Carlo methods, Black-Scholes theory, the theory of the yield curve, and Minority Game. There are discussions on aspects of data analysis, financial products, non-linear correlations, and herding, feedback and agent based models. This book has become a classic reference for graduate students and researchers working in econophysics and mathematical finance, and for quantitative analysts working on risk management, derivative pricing and quantitative trading strategies.
Theory of Financial Risk and
Derivative Pricing
From Statistical Physics to Risk Management
second edition
Jean-Philippe Bouchaud and Marc Potters
published by the press syndicate of the university of cambridge
The Pitt Building, Trumpington Street, Cambridge, United Kingdom
cambridge university press
The Edinburgh Building, Cambridge CB2 2RU, UK
40 West 20th Street, New York, NY 10011–4211, USA
477 Williamstown Road, Port Melbourne, VIC 3207, Australia
Ruiz de Alarcón 13, 28014 Madrid, Spain
Dock House, The Waterfront, Cape Town 8001, South Africa
http://www.cambridge.org
© Jean-Philippe Bouchaud and Marc Potters 2000, 2003
This book is in copyright. Subject to statutory exception
and to the provisions of relevant collective licensing agreements,
no reproduction of any part may take place without
the written permission of Cambridge University Press.
First published 2000
This edition published 2003
Printed in the United Kingdom at the University Press, Cambridge
Typefaces Times 10/13 pt. and Helvetica   System LaTeX 2ε [tb]
A catalogue record for this book is available from the British Library
Library of Congress Cataloguing in Publication data
Bouchaud, Jean-Philippe, 1962–
Theory of financial risk and derivative pricing : from statistical physics to risk
management / Jean-Philippe Bouchaud and Marc Potters.–2nd edn
p. cm.
Rev. edn of: Theory of financial risks. 2000.
Includes bibliographical references and index.
ISBN 0 521 81916 4 (hardback)
1. Finance. 2. Financial engineering. 3. Risk assessment. 4. Risk management.
I. Potters, Marc, 1969– II. Bouchaud, Jean-Philippe, 1962– Theory of financial risks.
III. Title.
HG101.B68 2003
658.155 – dc21 2003044037
ISBN 0 521 81916 4 hardback
Contents
Foreword page xiii
Preface xv
1 Probability theory: basic notions 1
1.1 Introduction 1
1.2 Probability distributions 3
1.3 Typical values and deviations 4
1.4 Moments and characteristic function 6
1.5 Divergence of moments – asymptotic behaviour 7
1.6 Gaussian distribution 7
1.7 Log-normal distribution 8
1.8 Lévy distributions and Paretian tails 10
1.9 Other distributions (∗) 14
1.10 Summary 16
2 Maximum and addition of random variables 17
2.1 Maximum of random variables 17
2.2 Sums of random variables 21
2.2.1 Convolutions 21
2.2.2 Additivity of cumulants and of tail amplitudes 22
2.2.3 Stable distributions and self-similarity 23
2.3 Central limit theorem 24
2.3.1 Convergence to a Gaussian 25
2.3.2 Convergence to a Lévy distribution 27
2.3.3 Large deviations 28
2.3.4 Steepest descent method and Cramèr function (∗) 30
2.3.5 The CLT at work on simple cases 32
2.3.6 Truncated Lévy distributions 35
2.3.7 Conclusion: survival and vanishing of tails 36
2.4 From sum to max: progressive dominance of extremes (∗) 37
2.5 Linear correlations and fractional Brownian motion 38
2.6 Summary 40
3 Continuous time limit, Ito calculus and path integrals 43
3.1 Divisibility and the continuous time limit 43
3.1.1 Divisibility 43
3.1.2 Infinite divisibility 44
3.1.3 Poisson jump processes 45
3.2 Functions of the Brownian motion and Ito calculus 47
3.2.1 Ito’s lemma 47
3.2.2 Novikov’s formula 49
3.2.3 Stratonovich’s prescription 50
3.3 Other techniques 51
3.3.1 Path integrals 51
3.3.2 Girsanov's formula and the Martin–Siggia–Rose trick (∗) 53
3.4 Summary 54
4 Analysis of empirical data 55
4.1 Estimating probability distributions 55
4.1.1 Cumulative distribution and densities – rank histogram 55
4.1.2 Kolmogorov–Smirnov test 56
4.1.3 Maximum likelihood 57
4.1.4 Relative likelihood 59
4.1.5 A general caveat 60
4.2 Empirical moments: estimation and error 60
4.2.1 Empirical mean 60
4.2.2 Empirical variance and MAD 61
4.2.3 Empirical kurtosis 61
4.2.4 Error on the volatility 61
4.3 Correlograms and variograms 62
4.3.1 Variogram 62
4.3.2 Correlogram 63
4.3.3 Hurst exponent 64
4.3.4 Correlations across different time zones 64
4.4 Data with heterogeneous volatilities 66
4.5 Summary 67
5 Financial products and financial markets 69
5.1 Introduction 69
5.2 Financial products 69
5.2.1 Cash (Interbank market) 69
5.2.2 Stocks 71
5.2.3 Stock indices 72
5.2.4 Bonds 75
5.2.5 Commodities 77
5.2.6 Derivatives 77
5.3 Financial markets 79
5.3.1 Market participants 79
5.3.2 Market mechanisms 80
5.3.3 Discreteness 81
5.3.4 The order book 81
5.3.5 The bid-ask spread 83
5.3.6 Transaction costs 84
5.3.7 Time zones, overnight, seasonalities 85
5.4 Summary 85
6 Statistics of real prices: basic results 87
6.1 Aim of the chapter 87
6.2 Second-order statistics 90
6.2.1 Price increments vs. returns 90
6.2.2 Autocorrelation and power spectrum 91
6.3 Distribution of returns over different time scales 94
6.3.1 Presentation of the data 95
6.3.2 The distribution of returns 96
6.3.3 Convolutions 101
6.4 Tails, what tails? 102
6.5 Extreme markets 103
6.6 Discussion 104
6.7 Summary 105
7 Non-linear correlations and volatility fluctuations 107
7.1 Non-linear correlations and dependence 107
7.1.1 Non identical variables 107
7.1.2 A stochastic volatility model 109
7.1.3 GARCH(1,1) 110
7.1.4 Anomalous kurtosis 111
7.1.5 The case of infinite kurtosis 113
7.2 Non-linear correlations in financial markets: empirical results 114
7.2.1 Anomalous decay of the cumulants 114
7.2.2 Volatility correlations and variogram 117
7.3 Models and mechanisms 123
7.3.1 Multifractality and multifractal models (∗) 123
7.3.2 The microstructure of volatility 125
7.4 Summary 127
8 Skewness and price-volatility correlations 130
8.1 Theoretical considerations 130
8.1.1 Anomalous skewness of sums of random variables 130
8.1.2 Absolute vs. relative price changes 132
8.1.3 The additive-multiplicative crossover and the q-transformation 134
8.2 A retarded model 135
8.2.1 Definition and basic properties 135
8.2.2 Skewness in the retarded model 136
8.3 Price-volatility correlations: empirical evidence 137
8.3.1 Leverage effect for stocks and the retarded model 139
8.3.2 Leverage effect for indices 140
8.3.3 Return-volume correlations 141
8.4 The Heston model: a model with volatility fluctuations and skew 141
8.5 Summary 144
9 Cross-correlations 145
9.1 Correlation matrices and principal component analysis 145
9.1.1 Introduction 145
9.1.2 Gaussian correlated variables 147
9.1.3 Empirical correlation matrices 147
9.2 Non-Gaussian correlated variables 149
9.2.1 Sums of non Gaussian variables 149
9.2.2 Non-linear transformation of correlated Gaussian variables 150
9.2.3 Copulas 150
9.2.4 Comparison of the two models 151
9.2.5 Multivariate Student distributions 153
9.2.6 Multivariate Lévy variables (∗) 154
9.2.7 Weakly non Gaussian correlated variables (∗) 155
9.3 Factors and clusters 156
9.3.1 One factor models 156
9.3.2 Multi-factor models 157
9.3.3 Partition around medoids 158
9.3.4 Eigenvector clustering 159
9.3.5 Maximum spanning tree 159
9.4 Summary 160
9.5 Appendix A: central limit theorem for random matrices 161
9.6 Appendix B: density of eigenvalues for random correlation matrices 164
10 Risk measures 168
10.1 Risk measurement and diversification 168
10.2 Risk and volatility 168
10.3 Risk of loss, ‘value at risk’ (VaR) and expected shortfall 171
10.3.1 Introduction 171
10.3.2 Value-at-risk 172
10.3.3 Expected shortfall 175
10.4 Temporal aspects: drawdown and cumulated loss 176
10.5 Diversification and utility – satisfaction thresholds 181
10.6 Summary 184
11 Extreme correlations and variety 186
11.1 Extreme event correlations 187
11.1.1 Correlations conditioned on large market moves 187
11.1.2 Real data and surrogate data 188
11.1.3 Conditioning on large individual stock returns:
exceedance correlations 189
11.1.4 Tail dependence 191
11.1.5 Tail covariance (∗) 194
11.2 Variety and conditional statistics of the residuals 195
11.2.1 The variety 195
11.2.2 The variety in the one-factor model 196
11.2.3 Conditional variety of the residuals 197
11.2.4 Conditional skewness of the residuals 198
11.3 Summary 199
11.4 Appendix C: some useful results on power-law variables 200
12 Optimal portfolios 202
12.1 Portfolios of uncorrelated assets 202
12.1.1 Uncorrelated Gaussian assets 203
12.1.2 Uncorrelated ‘power-law’ assets 206
12.1.3 ‘Exponential’ assets 208
12.1.4 General case: optimal portfolio and VaR (∗) 210
12.2 Portfolios of correlated assets 211
12.2.1 Correlated Gaussian fluctuations 211
12.2.2 Optimal portfolios with non-linear constraints (∗) 215
12.2.3 'Power-law' fluctuations – linear model (∗) 216
12.2.4 'Power-law' fluctuations – Student model (∗) 218
12.3 Optimized trading 218
12.4 Value-at-risk – general non-linear portfolios (∗) 220
12.4.1 Outline of the method: identifying worst cases 220
12.4.2 Numerical test of the method 223
12.5 Summary 224
13 Futures and options: fundamental concepts 226
13.1 Introduction 226
13.1.1 Aim of the chapter 226
13.1.2 Strategies in uncertain conditions 226
13.1.3 Trading strategies and efficient markets 228
13.2 Futures and forwards 231
13.2.1 Setting the stage 231
13.2.2 Global financial balance 232
13.2.3 Riskless hedge 233
13.2.4 Conclusion: global balance and arbitrage 235
13.3 Options: definition and valuation 236
13.3.1 Setting the stage 236
13.3.2 Orders of magnitude 238
13.3.3 Quantitative analysis – option price 239
13.3.4 Real option prices, volatility smile and ‘implied’
kurtosis 242
13.3.5 The case of an infinite kurtosis 249
13.4 Summary 251
14 Options: hedging and residual risk 254
14.1 Introduction 254
14.2 Optimal hedging strategies 256
14.2.1 A simple case: static hedging 256
14.2.2 The general case and 'Δ' hedging 257
14.2.3 Global hedging vs. instantaneous hedging 262
14.3 Residual risk 263
14.3.1 The Black–Scholes miracle 263
14.3.2 The ‘stop-loss’ strategy does not work 265
14.3.3 Instantaneous residual risk and kurtosis risk 266
14.3.4 Stochastic volatility models 267
14.4 Hedging errors. A variational point of view 268
14.5 Other measures of risk – hedging and VaR (∗) 268
14.6 Conclusion of the chapter 271
14.7 Summary 272
14.8 Appendix D 273
15 Options: the role of drift and correlations 276
15.1 Influence of drift on optimally hedged option 276
15.1.1 A perturbative expansion 276
15.1.2 ‘Risk neutral’ probability and martingales 278
15.2 Drift risk and delta-hedged options 279
15.2.1 Hedging the drift risk 279
15.2.2 The price of delta-hedged options 280
15.2.3 A general option pricing formula 282
15.3 Pricing and hedging in the presence of temporal correlations (∗) 283
15.3.1 A general model of correlations 283
15.3.2 Derivative pricing with small correlations 284
15.3.3 The case of delta-hedging 285
15.4 Conclusion 285
15.4.1 Is the price of an option unique? 285
15.4.2 Should one always optimally hedge? 286
15.5 Summary 287
15.6 Appendix E 287
16 Options: the Black and Scholes model 290
16.1 Ito calculus and the Black-Scholes equation 290
16.1.1 The Gaussian Bachelier model 290
16.1.2 Solution and Martingale 291
16.1.3 Time value and the cost of hedging 293
16.1.4 The Log-normal Black–Scholes model 293
16.1.5 General pricing and hedging in a Brownian world 294
16.1.6 The Greeks 295
16.2 Drift and hedge in the Gaussian model (∗) 295
16.2.1 Constant drift 295
16.2.2 Price dependent drift and the Ornstein–Uhlenbeck paradox 296
16.3 The binomial model 297
16.4 Summary 298
17 Options: some more specific problems 300
17.1 Other elements of the balance sheet 300
17.1.1 Interest rate and continuous dividends 300
17.1.2 Interest rate corrections to the hedging strategy 303
17.1.3 Discrete dividends 303
17.1.4 Transaction costs 304
17.2 Other types of options 305
17.2.1 ‘Put-call’ parity 305
17.2.2 ‘Digital’ options 305
17.2.3 'Asian' options 306
17.2.4 'American' options 308
17.2.5 'Barrier' options (∗) 310
17.2.6 Other types of options 312
17.3 The ‘Greeks’ and risk control 312
17.4 Risk diversification (∗) 313
17.5 Summary 316
18 Options: minimum variance Monte–Carlo 317
18.1 Plain Monte-Carlo 317
18.1.1 Motivation and basic principle 317
18.1.2 Pricing the forward exactly 319
18.1.3 Calculating the Greeks 320
18.1.4 Drawbacks of the method 322
18.2 An ‘hedged’ Monte-Carlo method 323
18.2.1 Basic principle of the method 323
18.2.2 A linear parameterization of the price and hedge 324
18.2.3 The Black-Scholes limit 325
18.3 Non Gaussian models and purely historical option pricing 327
18.4 Discussion and extensions. Calibration 329
18.5 Summary 331
18.6 Appendix F: generating some random variables 331
19 The yield curve 334
19.1 Introduction 334
19.2 The bond market 335
19.3 Hedging bonds with other bonds 335
19.3.1 The general problem 335
19.3.2 The continuous time Gaussian limit 336
19.4 The equation for bond pricing 337
19.4.1 A general solution 339
19.4.2 The Vasicek model 340
19.4.3 Forward rates 341
19.4.4 More general models 341
19.5 Empirical study of the forward rate curve 343
19.5.1 Data and notations 343
19.5.2 Quantities of interest and data analysis 343
19.6 Theoretical considerations (∗) 346
19.6.1 Comparison with the Vasicek model 346
19.6.2 Market price of risk 348
19.6.3 Risk-premium and the √θ law 349
19.7 Summary 351
19.8 Appendix G: optimal portfolio of bonds 352
20 Simple mechanisms for anomalous price statistics 355
20.1 Introduction 355
20.2 Simple models for herding and mimicry 356
20.2.1 Herding and percolation 356
20.2.2 Avalanches of opinion changes 357
20.3 Models of feedback effects on price fluctuations 359
20.3.1 Risk-aversion induced crashes 359
20.3.2 A simple model with volatility correlations and tails 363
20.3.3 Mechanisms for long ranged volatility correlations 364
20.4 The Minority Game 366
20.5 Summary 368
Index of most important symbols 372
Index 377
1
Probability theory: basic notions
All epistemological value of the theory of probability is based on this: that large scale
random phenomena in their collective action create strict, non random regularity.
(Gnedenko and Kolmogorov, Limit Distributions for Sums of Independent
Random Variables.)
1.1 Introduction
Randomness stems from our incomplete knowledge of reality, from the lack of information
which forbids a perfect prediction of the future. Randomness arises from complexity, from
the fact that causes are diverse, that tiny perturbations may result in large effects. For over a
century now, Science has abandoned Laplace’s deterministic vision, and has fully accepted
the task of deciphering randomness and inventing adequate tools for its description. The
surprise is that, after all, randomness has many facets and that there are many levels to
uncertainty, but, above all, that a new form of predictability appears, which is no longer
deterministic but statistical.
Financial markets offer an ideal testing ground for these statistical ideas. The fact that
a large number of participants, with divergent anticipations and conflicting interests, are
simultaneously present in these markets, leads to unpredictable behaviour. Moreover, financial
markets are (sometimes strongly) affected by external news, which is, both in date
and in nature, to a large degree unexpected. The statistical approach consists in drawing
from past observations some information on the frequency of possible price changes. If one
then assumes that these frequencies reflect some intimate mechanism of the markets them-
selves, then one may hope that these frequencies will remain stable in the course of time.
For example, the mechanism underlying the roulette or the game of dice is obviously always
the same, and one expects that the frequency of all possible outcomes will be invariant in
time – although of course each individual outcome is random.
This ‘bet’ that probabilities are stable (or better, stationary) is very reasonable in the
case of roulette or dice; it is nevertheless much less justified in the case of financial
markets, despite the large number of participants which confer to the system a certain
regularity, at least in the sense of Gnedenko and Kolmogorov.
[Footnote: The idea that science ultimately amounts to making the best possible guess of reality is due to R. P. Feynman (Seeking New Laws, in The Character of Physical Laws, MIT Press, Cambridge, MA, 1965).]
It is clear, for example, that
financial markets do not behave now as they did 30 years ago: many factors contribute to
the evolution of the way markets behave (development of derivative markets, world-wide
and computer-aided trading, etc.). As will be mentioned below, ‘young’ markets (such as
emergent countries markets) and more mature markets (exchange rate markets, interest rate
markets, etc.) behave quite differently. The statistical approach to financial markets is based
on the idea that whatever evolution takes place, this happens sufficiently slowly (on the scale
of several years) so that the observation of the recent past is useful to describe a not too
distant future. However, even this ‘weak stability’ hypothesis is sometimes badly in error,
in particular in the case of a crisis, which marks a sudden change of market behaviour. The
recent example of some Asian currencies indexed to the dollar (such as the Korean won or
the Thai baht) is interesting, since the observation of past fluctuations is clearly of no help
to predict the amplitude of the sudden turmoil of 1997, see Figure 1.1.
[Figure 1.1 here: three panels showing x(t) for the KRW/USD exchange rate (1997), the 3-month Libor Dec '92 contract (1992), and the S&P 500 (1987).]
Fig. 1.1. Three examples of statistically unforeseen crashes: the Korean won against the dollar in
1997 (top), the British 3-month short-term interest rates futures in 1992 (middle), and the S&P 500
in 1987 (bottom). In the example of the Korean won, it is particularly clear that the distribution of
price changes before the crisis was extremely narrow, and could not be extrapolated to anticipate
what happened in the crisis period.
Hence, the statistical description of financial fluctuations is certainly imperfect. It is
nevertheless extremely helpful: in practice, the ‘weak stability’ hypothesis is in most cases
reasonable, at least to describe risks.
In other words, the amplitude of the possible price changes (but not their sign!) is, to a
certain extent, predictable. It is thus rather important to devise adequate tools, in order to
control (if at all possible) financial risks. The goal of this first chapter is to present a certain
number of basic notions in probability theory which we shall find useful in the following.
Our presentation does not aim at mathematical rigour, but rather tries to present the key
concepts in an intuitive way, in order to ease their empirical use in practical applications.
1.2 Probability distributions
Contrary to the throw of a die, which can only return an integer between 1 and 6, the
variation of price of a financial asset can be arbitrary (we disregard the fact that price
changes cannot actually be smaller than a certain quantity – a 'tick'). In order to describe
a random process X for which the result is a real number, one uses a probability density
P(x), such that the probability that X is within a small interval of width dx around X = x
is equal to P(x) dx. In the following, we shall denote as P(·) the probability density for
the variable appearing as the argument of the function. This is a potentially ambiguous, but
very useful notation.
The probability that X is between a and b is given by the integral of P(x) between a
and b,
\[ P(a < X < b) = \int_a^b P(x)\, dx. \tag{1.1} \]
In the following, the notation P(·) means the probability of a given event, defined by the
content of the parentheses (·).
The function P(x) is a density; in this sense it depends on the units used to measure X.
For example, if X is a length measured in centimetres, P(x) is a probability density per unit
length, i.e. per centimetre. The numerical value of P(x) changes if X is measured in inches,
but the probability that X lies between two specific values l1 and l2 is of course independent
of the chosen unit. P(x) dx is thus invariant upon a change of unit, i.e. under the change
of variable x → γx. More generally, P(x) dx is invariant upon any (monotonic) change of
variable x → y(x): in this case, one has P(x) dx = P(y) dy.
In order to be a probability density in the usual sense, P(x) must be non-negative
(P(x) ≥ 0 for all x) and must be normalized, that is, the integral of P(x) over the
whole range of possible values for X must be equal to one:
\[ \int_{x_m}^{x_M} P(x)\, dx = 1, \tag{1.2} \]
where x_m (resp. x_M) is the smallest (resp. largest) value which X can take. In the case where
the possible values of X are not bounded from below, one takes x_m = −∞, and similarly
for x_M. One can actually always assume the bounds to be ±∞ by setting P(x) to zero in
the intervals ]−∞, x_m] and [x_M, +∞[. Later in the text, we shall often use the symbol ∫ as
a shorthand for ∫_{−∞}^{+∞}.
[Footnote: The prediction of future returns on the basis of past returns is, however, much less justified.]
[Footnote: 'Asset' is the generic name for a financial instrument which can be bought or sold, like stocks, currencies, gold, bonds, etc.]
An equivalent way of describing the distribution of X is to consider its cumulative
distribution P_<(x), defined as:
\[ P_<(x) \equiv P(X < x) = \int_{-\infty}^{x} P(x')\, dx'. \tag{1.3} \]
P_<(x) takes values between zero and one, and is monotonically increasing with x.
Obviously, P_<(−∞) = 0 and P_<(+∞) = 1. Similarly, one defines P_>(x) = 1 − P_<(x).
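As a small illustration (ours, not the book's), the cumulative distribution of a Gaussian variable has the closed form P_<(x) = ½[1 + erf((x − m)/σ√2)]; the following Python sketch checks that P_<(m) = 1/2 and that P_>(x) = 1 − P_<(x):

```python
import math

def gaussian_cdf(x, m=0.0, sigma=1.0):
    """P_<(x) for a Gaussian of mean m and RMS sigma."""
    return 0.5 * (1.0 + math.erf((x - m) / (sigma * math.sqrt(2.0))))

# P_<(x) increases monotonically from 0 to 1; the median sits at x = m.
p_below = gaussian_cdf(0.0)          # P_<(m) for a centred Gaussian
p_above = 1.0 - gaussian_cdf(1.0)    # P_>(1) = 1 - P_<(1)
print(p_below)   # 0.5
print(p_above)
```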
1.3 Typical values and deviations
It is quite natural to speak about 'typical' values of X. There are at least three mathematical
definitions of this intuitive notion: the most probable value, the median and the mean.
The most probable value x∗ corresponds to the maximum of the function P(x); x∗ need
not be unique if P(x) has several equivalent maxima. The median x_med is such that the
probabilities that X be greater or less than this particular value are equal. In other words,
P_<(x_med) = P_>(x_med) = 1/2. The mean, or expected value of X, which we shall denote
m or ⟨x⟩ in the following, is the average of all possible values of X, weighted by their
corresponding probability:
\[ m \equiv \langle x \rangle = \int x P(x)\, dx. \tag{1.4} \]
For a unimodal distribution (unique maximum), symmetrical around this maximum, these
three definitions coincide. However, they are in general different, although often rather
close to one another. Figure 1.2 shows an example of a non-symmetric distribution, and the
relative position of the most probable value, the median and the mean.
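To make the distinction concrete, consider (an illustration of ours, not an example from the text) the exponential density P(x) = e^{−x} for x ≥ 0, whose most probable value (0), median (log 2) and mean (1) are all different:

```python
import math

# Exponential density P(x) = exp(-x), x >= 0: a simple asymmetric example.
most_probable = 0.0               # P(x) is maximal at x = 0
median = math.log(2.0)            # solves P_<(x) = 1 - exp(-x) = 1/2
mean = 1.0                        # integral of x * exp(-x) over [0, infinity)

# The three 'typical values' are distinct and ordered here:
print(most_probable, median, mean)   # 0.0 0.693... 1.0
assert most_probable < median < mean
```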
One can then describe the fluctuations of the random variable X: if the random process is
repeated several times, one expects the results to be scattered in a cloud of a certain 'width'
in the region of typical values of X. This width can be described by the mean absolute
deviation (MAD) E_abs, by the root mean square (RMS) σ (or standard deviation), or
by the 'full width at half maximum' w_{1/2}.
The mean absolute deviation from a given reference value is the average of the distance
between the possible values of X and this reference value,
\[ E_{abs} \equiv \int |x - x_{med}|\, P(x)\, dx. \tag{1.5} \]
[Footnote: One chooses as a reference value the median for the MAD and the mean for the RMS, because for a fixed distribution P(x), these two quantities minimize, respectively, the MAD and the RMS.]
[Figure 1.2 here: a skewed density P(x) on 0 ≤ x ≤ 8, with x∗, x_med and ⟨x⟩ marked.]
Fig. 1.2. The 'typical value' of a random variable X drawn according to a distribution density P(x)
can be defined in at least three different ways: through its mean value ⟨x⟩, its most probable value x∗
or its median x_med. In the general case these three values are distinct.
Similarly, the variance (σ²) is the mean distance squared to the reference value m,
\[ \sigma^2 \equiv \langle (x-m)^2 \rangle = \int (x-m)^2 P(x)\, dx. \tag{1.6} \]
Since the variance has the dimension of x squared, its square root (the RMS, σ) gives the
order of magnitude of the fluctuations around m.
Finally, the full width at half maximum w_{1/2} is defined (for a distribution which is
symmetrical around its unique maximum x∗) such that P(x∗ ± w_{1/2}/2) = P(x∗)/2, which
corresponds to the points where the probability density has dropped by a factor of two
compared to its maximum value. One could actually define this width slightly differently,
for example such that the total probability to find an event outside the interval [x∗ − w/2,
x∗ + w/2] is equal to, say, 0.1. The corresponding value of w is called a quantile. This
definition is important when the distribution has very fat tails, such that the variance or the
mean absolute deviation are infinite.
The pair mean–variance is actually much more popular than the pair median–MAD. This
comes from the fact that the absolute value is not an analytic function of its argument, and
thus does not possess the nice properties of the variance, such as additivity under convolution,
which we shall discuss in the next chapter. However, for the empirical study of fluctuations,
it is sometimes preferable to use the MAD; it is more robust than the variance, that is, less
sensitive to rare extreme events, which may be the source of large statistical errors.
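The robustness argument can be illustrated numerically (our sketch, not the book's): adding a single extreme event to a sample inflates the RMS by a much larger factor than the MAD:

```python
import math

def rms(xs):
    """Root mean square deviation from the empirical mean."""
    m = sum(xs) / len(xs)
    return math.sqrt(sum((x - m) ** 2 for x in xs) / len(xs))

def mad(xs):
    """Mean absolute deviation from the empirical median."""
    s = sorted(xs)
    n = len(s)
    med = 0.5 * (s[(n - 1) // 2] + s[n // 2])
    return sum(abs(x - med) for x in xs) / n

sample = [0.1 * k for k in range(-10, 11)]   # values spread evenly over [-1, 1]
polluted = sample + [50.0]                   # one rare extreme event

# The single outlier inflates the RMS far more than the MAD.
print(rms(sample), rms(polluted))
print(mad(sample), mad(polluted))
assert rms(polluted) / rms(sample) > mad(polluted) / mad(sample)
```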
1.4 Moments and characteristic function
More generally, one can define higher-order moments of the distribution P(x) as the average
of powers of X:
\[ m_n \equiv \langle x^n \rangle = \int x^n P(x)\, dx. \tag{1.7} \]
Accordingly, the mean m is the first moment (n = 1), and the variance is related to the
second moment (σ² = m₂ − m²). The above definition, Eq. (1.7), is only meaningful if the
integral converges, which requires that P(x) decreases sufficiently rapidly for large |x| (see
below).
From a theoretical point of view, the moments are interesting: if they exist, their knowledge
is often equivalent to the knowledge of the distribution P(x) itself. In practice, however,
the high-order moments are very hard to determine satisfactorily: as n grows, longer
and longer time series are needed to keep a certain level of precision on m_n; these high
moments are thus in general not adapted to describe empirical data.
For many computational purposes, it is convenient to introduce the characteristic function
of P(x), defined as its Fourier transform:
\[ \hat{P}(z) \equiv \int e^{izx} P(x)\, dx. \tag{1.8} \]
The function P(x) is itself related to its characteristic function through an inverse Fourier
transform:
\[ P(x) = \frac{1}{2\pi} \int e^{-izx} \hat{P}(z)\, dz. \tag{1.9} \]
Since P(x) is normalized, one always has \(\hat{P}(0) = 1\). The moments of P(x) can be obtained
through successive derivatives of the characteristic function at z = 0,
\[ m_n = (-i)^n \left. \frac{d^n}{dz^n} \hat{P}(z) \right|_{z=0}. \tag{1.10} \]
One finally defines the cumulants c_n of a distribution as the successive derivatives of the
logarithm of its characteristic function:
\[ c_n = (-i)^n \left. \frac{d^n}{dz^n} \log \hat{P}(z) \right|_{z=0}. \tag{1.11} \]
The cumulant c_n is a polynomial combination of the moments m_p with p ≤ n. For example
c₂ = m₂ − m² = σ². It is often useful to normalize the cumulants by an appropriate power
of the variance, such that the resulting quantities are dimensionless. One thus defines the
normalized cumulants λ_n,
\[ \lambda_n \equiv \frac{c_n}{\sigma^n}. \tag{1.12} \]
[Footnote: This is not rigorously correct, since one can exhibit examples of different distribution densities which possess exactly the same moments, see Section 1.7 below.]
One often uses the third and fourth normalized cumulants, called the skewness (ς) and
kurtosis (κ),
\[ \varsigma \equiv \lambda_3 = \frac{\langle (x-m)^3 \rangle}{\sigma^3}, \qquad \kappa \equiv \lambda_4 = \frac{\langle (x-m)^4 \rangle}{\sigma^4} - 3. \tag{1.13} \]
The above definition of cumulants may look arbitrary, but these quantities have remark-
able properties. For example, as we shall show in Section 2.2, the cumulants simply add
when one sums independent random variables. Moreover a Gaussian distribution (or the
normal law of Laplace and Gauss) is characterized by the fact that all cumulants of order
larger than two are identically zero. Hence the cumulants, in particular κ, can be interpreted
as a measure of the distance between a given distribution P(x) and a Gaussian.
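As a numerical illustration of Eq. (1.13) (ours, not the book's), the skewness and excess kurtosis can be estimated from empirical moments; on values spread evenly over an interval, ς vanishes by symmetry and κ is close to −6/5 (the value for a uniform distribution, see Section 1.6):

```python
def skewness_kurtosis(xs):
    """Normalized cumulants lambda_3 and lambda_4 estimated from sample moments."""
    n = len(xs)
    m = sum(xs) / n
    central = lambda p: sum((x - m) ** p for x in xs) / n
    var = central(2)
    skew = central(3) / var ** 1.5          # Eq. (1.13), first formula
    kurt = central(4) / var ** 2 - 3.0      # Eq. (1.13), second formula
    return skew, kurt

# Values spread evenly over [-1, 1]: symmetric and flat-topped ('platykurtic').
grid = [-1.0 + 2.0 * k / 1000 for k in range(1001)]
s, k = skewness_kurtosis(grid)
print(s, k)   # skewness ~ 0, excess kurtosis close to -6/5
assert abs(s) < 1e-9 and abs(k + 1.2) < 1e-2
```

A two-point variable taking only the values ±1 reaches the lower bound κ = −2 mentioned in Section 1.6.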
1.5 Divergence of moments – asymptotic behaviour
The moments (or cumulants) of a given distribution do not always exist. A necessary
condition for the nth moment (m_n) to exist is that the distribution density P(x) should
decay faster than 1/|x|^{n+1} for |x| going towards infinity, or else the integral, Eq. (1.7),
would diverge for |x| large. If one only considers distribution densities that behave
asymptotically as a power-law, with an exponent 1 + μ,
\[ P(x) \sim \frac{\mu A_{\pm}^{\mu}}{|x|^{1+\mu}} \quad \text{for } x \to \pm\infty, \tag{1.14} \]
then all the moments such that n ≥ μ are infinite. For example, such a distribution has
no finite variance whenever μ ≤ 2. [Note that, for P(x) to be a normalizable probability
distribution, the integral, Eq. (1.2), must converge, which requires μ > 0.]
The characteristic function of a distribution having an asymptotic power-law behaviour
given by Eq. (1.14) is non-analytic around z = 0. The small-z expansion contains regular
terms of the form z^n for n < μ, followed by a non-analytic term |z|^μ (possibly with
logarithmic corrections, such as |z|^μ log z for integer μ). The derivatives of order larger
than or equal to μ of the characteristic function thus do not exist at the origin (z = 0).
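The divergence can be made concrete (our sketch, not the book's): for a one-sided power-law density P(x) = μ x^{−(1+μ)} on x ≥ 1 (taking A₊^μ = 1), the nth moment truncated at a cutoff X grows as X^{n−μ} whenever n > μ, while it converges for n < μ:

```python
def truncated_moment(n, mu, X):
    """m_n computed up to cutoff X for P(x) = mu / x**(1 + mu), x >= 1.

    Closed form of the integral of mu * x**(n - 1 - mu) from 1 to X (n != mu).
    """
    return mu * (X ** (n - mu) - 1.0) / (n - mu)

mu = 1.5   # tail exponent: the mean exists (mu > 1), the variance does not (mu < 2)
for X in (1e2, 1e4, 1e6):
    m1 = truncated_moment(1, mu, X)   # converges (to 3) as the cutoff grows
    m2 = truncated_moment(2, mu, X)   # grows like X**0.5: infinite variance
    print(X, m1, m2)
```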
1.6 Gaussian distribution
The most commonly encountered distributions are the ‘normal’ laws of Laplace and Gauss,
which we shall simply call Gaussian in the following. Gaussians are ubiquitous: for
example, the number of heads in a sequence of a thousand coin tosses, the exact number
of oxygen molecules in the room, the height (in inches) of a randomly selected individual,
[Footnote: Note that it is sometimes κ + 3, rather than κ itself, which is called the kurtosis.]
are all approximately described by a Gaussian distribution. The ubiquity of the Gaussian
can be in part traced to the central limit theorem (CLT) discussed at length in Chapter 2,
which states that a phenomenon resulting from a large number of small independent causes
is Gaussian. There exists however a large number of cases where the distribution describing
a complex phenomenon is not Gaussian: for example, the amplitude of earthquakes, the
velocity differences in a turbulent fluid, the stresses in granular materials, etc., and, as we
shall discuss in Chapter 6, the price fluctuations of most financial assets.
A Gaussian of mean m and root mean square σ is defined as:
\[ P_G(x) \equiv \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left( -\frac{(x-m)^2}{2\sigma^2} \right). \tag{1.15} \]
The median and most probable value are in this case equal to m, whereas the MAD (or any
other definition of the width) is proportional to the RMS (for example, E_abs = σ√(2/π)).
For m = 0, all the odd moments are zero and the even moments are given by
m_{2n} = (2n − 1)(2n − 3) ··· σ^{2n} = (2n − 1)!! σ^{2n}.
All the cumulants of order greater than two are zero for a Gaussian. This can be realized
by examining its characteristic function:
\[ \hat{P}_G(z) = \exp\left( -\frac{\sigma^2 z^2}{2} + imz \right). \tag{1.16} \]
Its logarithm is a second-order polynomial, for which all derivatives of order larger than
two are zero. In particular, the kurtosis of a Gaussian variable is zero. As mentioned above,
the kurtosis is often taken as a measure of the distance from a Gaussian distribution. When
κ > 0 (leptokurtic distributions), the corresponding distribution density has a marked peak
around the mean, and rather 'thick' tails. Conversely, when κ < 0, the distribution density
has a flat top and very thin tails. For example, the uniform distribution over a certain interval
(for which tails are absent) has a kurtosis κ = −6/5. Note that the kurtosis is bounded from
below by the value −2, which corresponds to the case where the random variable can only
take two values, −a and +a, with equal probability.
A Gaussian variable is peculiar because 'large deviations' are extremely rare. The quantity
exp(−x^2/2σ^2) decays so fast for large x that deviations of a few times σ are nearly
impossible. For example, a Gaussian variable departs from its most probable value by more
than 2σ only 5% of the time, and by more than 3σ in 0.2% of the cases, whereas a fluctuation
of 10σ has a probability of less than 2 × 10^{−23}; in other words, it never happens.
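These figures follow from the standard identity P(|X − m| > kσ) = erfc(k/√2); a quick numerical check, using only Python's standard library:

```python
import math

def gauss_tail(k):
    """Probability that a Gaussian variable departs from its mean by more than k sigma."""
    return math.erfc(k / math.sqrt(2.0))

# k = 2 -> about 4.6%, k = 3 -> about 0.27%, k = 10 -> about 1.5e-23
for k in (2, 3, 10):
    print(k, gauss_tail(k))
```

The exact values (4.55% and 0.27%) are slightly rounded in the text above.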
1.7 Log-normal distribution
Another very popular distribution in mathematical finance is the so-called log-normal law.
That X is a log-normal random variable simply means that log X is normal, or Gaussian. Its
use in finance comes from the assumption that the rates of return, rather than the absolute
changes of price, are independent random variables. The increments of the logarithm of the
price thus asymptotically sum to a Gaussian, according to the CLT detailed in Chapter 2.
(Although, in the above three examples, the random variable cannot be negative. As we shall discuss later, the
Gaussian description is generally only valid in a certain neighbourhood of the maximum of the distribution.)
The log-normal distribution density is thus defined as:
P_{LN}(x) ≡ \frac{1}{x\sqrt{2πσ^2}}\, \exp\left(−\frac{\log^2(x/x_0)}{2σ^2}\right),  (1.17)
the moments of which are: m_n = x_0^n\, e^{n^2σ^2/2}.
From these moments, one deduces the skewness, given by ς_3 = (e^{3σ^2} − 3e^{σ^2} + 2)/(e^{σ^2} − 1)^{3/2}
(≃ 3σ for σ ≪ 1), and the kurtosis κ = (e^{6σ^2} − 4e^{3σ^2} + 6e^{σ^2} − 3)/(e^{σ^2} − 1)^2 − 3
(≃ 16σ^2 for σ ≪ 1).
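The moment formula m_n = x_0^n e^{n^2σ^2/2} can be cross-checked by integrating over u = log(x/x_0), where the density becomes a plain Gaussian; a minimal sketch using Simpson quadrature:

```python
import math

def lognormal_moment_numeric(n, x0, sigma, umin=-12.0, umax=16.0, steps=20000):
    """E[x^n] with x = x0*exp(u), u ~ N(0, sigma^2), by Simpson quadrature in u."""
    h = (umax - umin) / steps
    total = 0.0
    for i in range(steps + 1):
        u = umin + i * h
        w = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += w * (x0 * math.exp(u)) ** n * math.exp(-u * u / (2 * sigma**2))
    return total * h / 3 / math.sqrt(2 * math.pi * sigma**2)

def lognormal_moment_exact(n, x0, sigma):
    """The closed form m_n = x0^n * exp(n^2 sigma^2 / 2)."""
    return x0**n * math.exp(n * n * sigma**2 / 2)

for n in (1, 2, 3):
    print(n, lognormal_moment_numeric(n, 2.0, 0.5), lognormal_moment_exact(n, 2.0, 0.5))
```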
In the context of mathematical finance, one often prefers log-normal to Gaussian distributions
for several reasons. As mentioned above, the existence of a random rate of return,
or random interest rate, naturally leads to log-normal statistics. Furthermore, log-normals
account for the following symmetry in the problem of exchange rates: if x is the rate of
currency A in terms of currency B, then obviously, 1/x is the rate of currency B in terms
of A. Under this transformation, log x becomes −log x and the description in terms of a
log-normal distribution (or in terms of any other even function of log x) is independent of
the reference currency. One often hears the following argument in favour of log-normals:
since the price of an asset cannot be negative, its statistics cannot be Gaussian since the
latter admits in principle negative values, whereas a log-normal excludes them by construction.
This is however a red-herring argument, since the description of the fluctuations of
the price of a financial asset in terms of Gaussian or log-normal statistics is in any case an
approximation which is only valid in a certain range. As we shall discuss at length later,
these approximations are totally unadapted to describe extreme risks. Furthermore, even if
a price drop of more than 100% is in principle possible for a Gaussian process,§ the error
caused by neglecting such an event is much smaller than that induced by the use of either
of these two distributions (Gaussian or log-normal). In order to illustrate this point more
clearly, consider the probability of observing n times 'heads' in a series of N coin tosses,
which is exactly equal to 2^{−N}\binom{N}{n}. It is also well known that in the neighbourhood of N/2,
2^{−N}\binom{N}{n} is very accurately approximated by a Gaussian of variance N/4; this is however
not contradictory with the fact that n ≥ 0 by construction!
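The coin-toss comparison above is easy to reproduce; a small sketch comparing the exact binomial weight 2^{−N}\binom{N}{n} with the Gaussian of mean N/2 and variance N/4:

```python
import math

def binom_prob(N, n):
    """Exact probability of n heads in N fair tosses: 2^-N * C(N, n)."""
    return math.comb(N, n) / 2**N

def gauss_approx(N, n):
    """Gaussian density of mean N/2 and variance N/4, evaluated at n."""
    var = N / 4
    return math.exp(-((n - N / 2) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)

N = 10000
for n in (5000, 5050, 5100):  # centre, 1 sigma and 2 sigma away (sigma = 50)
    print(n, binom_prob(N, n), gauss_approx(N, n))
```

The two columns agree to a relative accuracy of order 1/N near the centre, even though the binomial variable can never be negative.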
Finally, let us note that for moderate volatilities (up to say 20%), the two distributions
(Gaussian and log-normal) look rather alike, especially in the ‘body’ of the distribution
(Fig. 1.3). As for the tails, we shall see later that Gaussians substantially underestimate
their weight, whereas the log-normal predicts that large positive jumps are more frequent
A log-normal distribution has the remarkable property that the knowledge of all its moments is not sufficient
to characterize the corresponding distribution. One can indeed show that the following distribution:
\frac{1}{\sqrt{2π}}\, x^{−1} \exp\left[−\frac{1}{2}(\log x)^2\right]\left[1 + a\sin(2π\log x)\right], for |a| ≤ 1, has moments which are independent of the value of
a, and thus coincide with those of a log-normal distribution, which corresponds to a = 0.
This symmetry is however not always obvious. The dollar, for example, plays a special role. This symmetry can
only be expected between currencies of similar strength.
§In the rather extreme case of a 20% annual volatility and a zero annual return, the probability for the price to
become negative after a year in a Gaussian description is less than one out of 3 million.
Fig. 1.3. Comparison between a Gaussian (thick line) and a log-normal (dashed line), with
m = x_0 = 100 and σ equal to 15 and 15% respectively. The difference between the two curves
shows up in the tails.
than large negative jumps. This is at variance with empirical observation: the distributions of
absolute stock price changes are rather symmetrical; if anything, large negative draw-downs
are more frequent than large positive draw-ups.
1.8 Lévy distributions and Paretian tails
Lévy distributions (denoted L_µ(x) below) appear naturally in the context of the CLT (see
Chapter 2), because of their stability property under addition (a property shared by
Gaussians). The tails of Lévy distributions are however much 'fatter' than those of Gaussians,
and are thus useful to describe multiscale phenomena (i.e. when both very large
and very small values of a quantity can commonly be observed, such as personal income,
size of pension funds, amplitude of earthquakes or other natural catastrophes, etc.). These
distributions were introduced in the 1950s and 1960s by Mandelbrot (following Pareto)
to describe personal income and the price changes of some financial assets, in particular
the price of cotton. An important constitutive property of these Lévy distributions is their
power-law behaviour for large arguments, often called Pareto tails:
L_µ(x) ≃ \frac{µ\, A_±^µ}{|x|^{1+µ}} \quad \text{for } x → ±∞,  (1.18)
where 0 < µ < 2 is a certain exponent (often called α), and A_±^µ two constants which we
call tail amplitudes, or scale parameters: A_± indeed gives the order of magnitude of the
large (positive or negative) fluctuations of x. For instance, the probability to draw a number
larger than x decreases as P_>(x) = (A_+/x)^µ for large positive x.
One can of course in principle observe Pareto tails with µ ≥ 2; but those tails do not
correspond to the asymptotic behaviour of a Lévy distribution.
In full generality, Lévy distributions are characterized by an asymmetry parameter
defined as β ≡ (A_+^µ − A_−^µ)/(A_+^µ + A_−^µ), which measures the relative weight of the positive
and negative tails. We shall mostly focus in the following on the symmetric case β = 0. The
fully asymmetric case (β = 1) is also useful to describe strictly positive random variables,
such as, for example, the time during which the price of an asset remains below a certain
value, etc.
An important consequence of Eq. (1.14) with µ < 2 is that the variance of a Lévy
distribution is formally infinite: the probability density does not decay fast enough for the
integral, Eq. (1.6), to converge. In the case µ ≤ 1, the distribution density decays so slowly
that even the mean, or the MAD, fail to exist. The scale of the fluctuations, defined by the
width of the distribution, is always set by A = A_+ = A_−.
There is unfortunately no simple analytical expression for symmetric Lévy distributions
L_µ(x), except for µ = 1, which corresponds to a Cauchy distribution (or Lorentzian):
L_1(x) = \frac{A}{x^2 + π^2 A^2}.  (1.19)
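For this Cauchy case the tail amplitude can be checked directly: integrating Eq. (1.19) gives the closed form P_>(x) = 1/2 − arctan(x/πA)/π, which indeed behaves as A/x for large x, consistent with P_>(x) = (A_+/x)^µ at µ = 1. A quick sketch:

```python
import math

def cauchy_tail(x, A):
    """P(X > x) for L1(x) = A / (x^2 + pi^2 A^2), from the closed-form antiderivative."""
    return 0.5 - math.atan(x / (math.pi * A)) / math.pi

A = 1.5
for x in (10, 100, 1000):
    print(x, cauchy_tail(x, A), A / x)  # the two columns converge for large x
```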
However, the characteristic function of a symmetric Lévy distribution is rather
simple, and reads:
\hat{L}_µ(z) = \exp(−a_µ|z|^µ),  (1.20)
where a_µ is a constant proportional to the tail parameter A^µ:
A^µ = (µ−1)\, Γ(µ−1)\, \frac{\sin(πµ/2)}{π}\, a_µ \qquad 1 < µ < 2,  (1.21)
and
A^µ = Γ(1+µ)\, \frac{\sin(πµ/2)}{πµ}\, a_µ \qquad µ < 1.  (1.22)
It is clear, from Eq. (1.20), that in the limit µ = 2 one recovers the definition of a Gaussian.
When µ decreases from 2, the distribution becomes more and more sharply peaked around
the origin and fatter in its tails, while 'intermediate' events lose weight (Fig. 1.4). These
distributions thus describe 'intermittent' phenomena, very often small, sometimes gigantic.
The moments of the symmetric Lévy distribution can be computed, when they exist. One
finds:
⟨|x|^ν⟩ = (a_µ)^{ν/µ}\, \frac{Γ(−ν/µ)}{µ\, Γ(−ν)\cos(πν/2)}, \qquad ν < µ.  (1.23)
The median and the most probable value, however, still exist. For a symmetric Lévy distribution, the most probable
value defines the so-called localization parameter m.
Fig. 1.4. Shape of the symmetric Lévy distributions with µ = 0.8, 1.2, 1.6 and 2 (this last value
actually corresponds to a Gaussian). The smaller µ, the sharper the 'body' of the distribution, and
the fatter the tails, as illustrated in the inset.
Note finally that Eq. (1.20) does not define a probability distribution when µ > 2, because
its inverse Fourier transform is not everywhere positive.
In the case β ≠ 0, one would have:
\hat{L}_µ^β(z) = \exp\left[−a_µ|z|^µ\left(1 + iβ\tan(µπ/2)\,\frac{z}{|z|}\right)\right] \qquad (µ ≠ 1).  (1.24)
It is important to notice that while the leading asymptotic term for large x is given
by Eq. (1.18), there are subleading terms which can be important for finite x. The full
asymptotic series actually reads:
L_µ(x) = \sum_{n=1}^{∞} \frac{(−)^{n+1}}{π\, n!}\, \frac{a_µ^n}{x^{1+nµ}}\, Γ(1+nµ)\, \sin(πµn/2).  (1.25)
The presence of the subleading terms may lead to a bad empirical estimate of the exponent
µ based on a fit of the tail of the distribution. In particular, the 'apparent' exponent which
describes the function L_µ for finite x is larger than µ, and decreases towards µ for x → ∞,
but more and more slowly as µ gets nearer to the Gaussian value µ = 2, for which the
power-law tails no longer exist. Note however that one also often observes empirically
the opposite behaviour, i.e. an apparent Pareto exponent which grows with x. This arises
when the Pareto distribution, Eq. (1.18), is only valid in an intermediate regime x ≪ 1/α,
beyond which the distribution decays exponentially, say as exp(−αx). The Pareto tail is
then 'truncated' for large values of x, and this leads to an effective µ which grows with x.
An interesting generalization of the symmetric Lévy distributions which accounts for this
exponential cut-off is given by the truncated Lévy distributions (TLD), which will be of
much use in the following. A simple way to alter the characteristic function Eq. (1.20) to
account for an exponential cut-off for large arguments is to set:
\hat{L}_µ^{(t)}(z) = \exp\left(−a_µ\, \frac{(α^2+z^2)^{µ/2}\, \cos(µ\arctan(|z|/α)) − α^µ}{\cos(πµ/2)}\right),  (1.26)
for 1 ≤ µ ≤ 2. The above form reduces to Eq. (1.20) for α = 0. Note that the argument in
the exponential can also be written as:
−\frac{a_µ}{2\cos(πµ/2)}\left[(α+iz)^µ + (α−iz)^µ − 2α^µ\right].  (1.27)
The first cumulants of the distribution defined by Eq. (1.26) read, for 1 < µ < 2:
c_2 = µ(µ−1)\, \frac{a_µ}{|\cos(πµ/2)|}\, α^{µ−2}, \qquad c_3 = 0.  (1.28)
The kurtosis κ = λ_4 = c_4/c_2^2 is given by:
λ_4 = \frac{(3−µ)(2−µ)\, |\cos(πµ/2)|}{µ(µ−1)\, a_µ\, α^µ}.  (1.29)
Note that the case µ = 2 corresponds to the Gaussian, for which λ_4 = 0 as expected.
On the other hand, when α → 0, one recovers a pure Lévy distribution, for which c_2
and c_4 are formally infinite. Finally, if α → ∞ with a_µ α^{µ−2} fixed, one also recovers the
Gaussian.
As explained below in Section 3.1.3, the truncated Lévy distribution has the interesting
property of being infinitely divisible for all values of α and µ (this includes the Gaussian
distribution and the pure Lévy distributions).
Exponential tail: a limiting case
Very often in the following, we shall notice that in the formal limit µ → ∞, the power-law
tail becomes an exponential tail, if the tail parameter is simultaneously scaled as
A^µ = (µ/α)^µ. Qualitatively, this can be understood as follows: consider a probability
distribution restricted to positive x, which decays as a power-law for large x, defined
as:
P_>(x) = \frac{A^µ}{(A+x)^µ}.  (1.30)
This shape is obviously compatible with Eq. (1.18), and is such that P_>(x = 0) = 1. If
A = µ/α, one then finds:
P_>(x) = \frac{1}{[1+(αx/µ)]^µ} \xrightarrow{µ → ∞} \exp(−αx).  (1.31)
1.9 Other distributions (∗)
There are obviously a very large number of other statistical distributions useful to describe
random phenomena. Let us cite a few, which often appear in a financial context:
• The discrete Poisson distribution: consider a set of points randomly scattered on the
real axis, with a certain density ω (e.g. the times when the price of an asset changes).
The number of points n in an arbitrary interval of length ℓ is distributed according to the
Poisson distribution:
P(n) ≡ \frac{(ωℓ)^n}{n!}\, \exp(−ωℓ).  (1.32)
• The hyperbolic distribution, which interpolates between a Gaussian 'body' and exponential
tails:
P_H(x) ≡ \frac{1}{2x_0\, K_1(αx_0)}\, \exp\left(−α\sqrt{x_0^2+x^2}\right),  (1.33)
where the normalization K_1(αx_0) is a modified Bessel function of the second kind. For
x small compared to x_0, P_H(x) behaves as a Gaussian although its asymptotic behaviour
for x ≫ x_0 is fatter and reads exp(−α|x|).
From the characteristic function
\hat{P}_H(z) = \frac{α\, K_1\!\left(x_0\sqrt{α^2+z^2}\right)}{\sqrt{α^2+z^2}\; K_1(αx_0)},  (1.34)
we can compute the variance
σ^2 = \frac{x_0\, K_2(αx_0)}{α\, K_1(αx_0)},  (1.35)
and kurtosis
κ = 3\left(\frac{K_2(αx_0)}{K_1(αx_0)}\right)^{−2} + \frac{12}{αx_0}\left(\frac{K_2(αx_0)}{K_1(αx_0)}\right)^{−1} − 3.  (1.36)
Note that the kurtosis of the hyperbolic distribution is always between zero and three.
(The skewness is zero since the distribution is even.)
In the case x_0 = 0, one finds the symmetric exponential distribution:
P_E(x) = \frac{α}{2}\, \exp(−α|x|),  (1.37)
with even moments m_{2n} = (2n)!\, α^{−2n}, which gives σ^2 = 2α^{−2} and κ = 3. Its characteristic
function reads: \hat{P}_E(z) = α^2/(α^2+z^2).
• The Student distribution, which also has power-law tails:
P_S(x) ≡ \frac{1}{\sqrt{π}}\, \frac{Γ((1+µ)/2)}{Γ(µ/2)}\, \frac{a^µ}{(a^2+x^2)^{(1+µ)/2}},  (1.38)
which coincides with the Cauchy distribution for µ = 1, and tends towards a Gaussian in
the limit µ → ∞, provided that a^2 is scaled as µ. This distribution is usually known as
Fig. 1.5. Probability density for the truncated Lévy (µ = 3/2), Student and hyperbolic distributions.
All three have two free parameters which were fixed to have unit variance and kurtosis. The inset
shows a blow-up of the tails where one can see that the Student distribution has tails similar to (but
slightly thicker than) those of the truncated Lévy.
Student’s t-distribution with µdegrees of freedom, but we shall call it simply the Student
distribution.
The even moments of the Student distribution read: m2n=(2n1)!!(µ/2n)/
(µ/2)(a2/2)n, provided 2n; and are infinite otherwise. One can check that
in the limit µ→∞, the above expression gives back the moments of a Gaussian:
m2n=(2n1)!!σ2n. The kurtosis of the Student distribution is given by κ=6/(µ4).
Figure 1.5 shows a plot of the Student distribution with κ=1, corresponding to
µ=10.
Note that the characteristic function of Student distributions can also be explicitly
computed, and reads:
\hat{P}_S(z) = \frac{2^{1−µ/2}}{Γ(µ/2)}\, (a|z|)^{µ/2}\, K_{µ/2}(a|z|),  (1.39)
where K_{µ/2} is the modified Bessel function. The cumulative distribution in the useful
cases µ = 3 and µ = 4, with a chosen such that the variance is unity, reads:
P_{S,>}(x) = \frac{1}{2} − \frac{1}{π}\left[\arctan x + \frac{x}{1+x^2}\right] \qquad (µ = 3,\; a = 1),  (1.40)
and
P_{S,>}(x) = \frac{1}{2} − \frac{3}{4}u + \frac{1}{4}u^3 \qquad (µ = 4,\; a = \sqrt{2}),  (1.41)
where u = x/\sqrt{2+x^2}.
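Equation (1.40) can be checked against a direct quadrature of the µ = 3, a = 1 density (2/π)(1+x^2)^{−2}; substituting x = tan θ makes the integrand simply (2/π)cos^2 θ. A sketch:

```python
import math

def student3_tail_closed(x):
    """Eq. (1.40): P_{S,>}(x) for mu = 3, a = 1 (unit variance)."""
    return 0.5 - (math.atan(x) + x / (1 + x * x)) / math.pi

def student3_tail_numeric(x, steps=20000):
    """Simpson quadrature of (2/pi)(1+u^2)^-2 from x to infinity, via u = tan(theta)."""
    a, b = math.atan(x), math.pi / 2
    h = (b - a) / steps
    total = 0.0
    for i in range(steps + 1):
        w = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += w * (2 / math.pi) * math.cos(a + i * h) ** 2
    return total * h / 3

for x in (0.5, 2.0, 10.0):
    print(x, student3_tail_closed(x), student3_tail_numeric(x))
```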
• The inverse gamma distribution, for positive quantities (such as, for example, volatilities,
or waiting times), also has power-law tails. It is defined as:
P(x) = \frac{x_0^µ}{Γ(µ)\, x^{1+µ}}\, \exp\left(−\frac{x_0}{x}\right).  (1.42)
Its moments of order n are easily computed to give: m_n = x_0^n\, Γ(µ−n)/Γ(µ). This
distribution falls off very fast when x → 0. As we shall see in Chapter 7, an inverse
gamma distribution and a log-normal distribution can sometimes be hard to distinguish
empirically. Finally, if the volatility σ^2 of a Gaussian is itself distributed as an inverse
gamma distribution, the distribution of x becomes a Student distribution, see Section 9.2.5
for more details.
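The moments m_n = x_0^n Γ(µ−n)/Γ(µ) follow from the substitution t = x_0/x, which turns the moment integral into x_0^n/Γ(µ) ∫ t^{µ−n−1} e^{−t} dt; a numerical cross-check by Simpson quadrature:

```python
import math

def invgamma_moment_numeric(n, mu, x0, tmax=60.0, steps=100000):
    """m_n of the inverse gamma density, via t = x0/x and Simpson quadrature."""
    h = tmax / steps
    total = 0.0
    for i in range(steps + 1):
        t = i * h
        w = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += w * (t ** (mu - n - 1) * math.exp(-t) if t > 0 else 0.0)
    return x0**n * total * h / (3 * math.gamma(mu))

mu, x0 = 5.0, 2.0  # illustrative parameters with mu > n, so the moments exist
for n in (1, 2, 3):
    print(n, invgamma_moment_numeric(n, mu, x0), x0**n * math.gamma(mu - n) / math.gamma(mu))
```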
1.10 Summary
• The most probable value and the mean value are both estimates of the typical values
of a random variable. Fluctuations around this value are measured by the root mean
square deviation or the mean absolute deviation.
• For some distributions with very fat tails, the mean square deviation (or even the
mean value) is infinite, and the typical values must be described using quantiles.
• The Gaussian, the log-normal and the Student distributions are some of the important
probability distributions for financial applications.
• The way to generate numerically random variables with a given distribution
(Gaussian, Lévy stable, Student, etc.) is discussed in Chapter 18, Appendix F.
Further reading
W. Feller, An Introduction to Probability Theory and its Applications, Wiley, New York, 1971.
P. Lévy, Théorie de l'addition des variables aléatoires, Gauthier-Villars, Paris, 1937–1954.
B. V. Gnedenko, A. N. Kolmogorov, Limit Distributions for Sums of Independent Random Variables,
Addison Wesley, Cambridge, MA, 1954.
G. Samorodnitsky, M. S. Taqqu, Stable Non-Gaussian Random Processes, Chapman & Hall, New
York, 1994.