From model to market risks: The Implicit Function Theorem (IFT) demystified

Antoine Savine, 2016-2018
(Building on unpublished results by Jesper Andreasen)
Introduction
One persisting conundrum in the theory and practice of quantitative risk management models is the
relationship of model risks (the risk sensitivities of a transaction or set of transactions to the parameters
of the model, for example, in a Dupire (1992) model, the local volatility surface) and market risks (the
sensitivities to the market variables, for example, the implied volatility surface). Model parameters are
typically calibrated to market variables, sometimes analytically (see Dupire’s formula expressing a local
volatility as a function of the implied volatilities) but mostly numerically, where the model parameters are
iteratively set to minimize the (generally, squared) error to market instruments. In machine learning lingo,
the model learns its parameters by calibration to market instruments (underlying assets and European
options) and applies them to off-market instruments (exotics). The value of a transaction is an explicit
(although, most of the time, numerical) function of the model parameters, so the model risks are easily
computed by finite difference or AAD (automatic adjoint differentiation, see for instance Savine’s
textbook, Wiley, 2018). It is however, in general, not advisable, for speed, memory footprint or accuracy,
to proceed in the same manner for market risks. To differentiate through a numerical calibration is an
inefficient process that may lead to unstable or wrong results. The solution, presented by a few authors
(Giles, Capriotti, Naumann, Huge-Flyger-Savine), is the application of a multi-dimensional version of the
implicit function theorem (IFT) to deduce market risks from model risks. This document explains, and
hopefully demystifies, the application of the IFT in this context, and outlines the steps of an efficient and
accurate algorithm for the determination of the market risks, in particular in the context of AAD.
Formalism
Consider a market with a set of n parameters a. This is what we calibrate to. For instance, we could
calibrate to an option market where the parameters are the spot price and a surface of n-1 implied
volatilities. Now consider a valuation model with m parameters b, calibrated to the market parameters
a through a set of i calibration instruments I, and a set of p additional parameters c that are not calibrated
to the market but directly derived from it. For instance, consider a dynamic model whose parameters are
some dynamic (local, model) volatilities b, calibrated to the instruments I given the market a, and a spot
c that is directly copied from the market: c=a[0]. We call these parameters “model parameters”, some
calibrated (b) and some derived (c).¹

¹ A model may also have parameters that are direct trader inputs, neither derived from nor calibrated
to a market, for instance the volatility of volatility in an SLV model. In the context of this note, we
consider these parameters to be part of the market (the market contains a parameter that says “if you
calibrate an SLV to me, use that SV”) and part of the derived parameters of the model (directly copied
from the relevant market entries).

Consider a transaction/trading book/xVA valued with the model. We denote its value V and we want to
compute its market risk $\partial V / \partial a$. From the model we get the model risks
$\partial V / \partial b$ and $\partial V / \partial c$ (either with finite differences or AAD over the
valuation process). The Implicit Function Theorem (IFT) provides a matrix formula for the computation of

$$\frac{\partial V}{\partial a} = f_{a,I}\left(\frac{\partial V}{\partial b}, \frac{\partial V}{\partial c}\right).$$

We call this formula the IFT, and what it implements the “backward propagation of risk”, because it
propagates risks from the model parameters onto the market parameters to which the model parameters
are calibrated.
Thanks to this multivariate calculus theorem, we don’t have to go through a costly and potentially
unstable differentiation of the calibration procedure, either by “bumping and recalibrating” (which would
be unstable) or with an AAD instrumentation of the calibration process (which would consume insane
amounts of memory to produce potentially wrong results).
In addition, as a major side benefit, the IFT correctly takes care of imperfect (best fit) calibrations with
more instruments than model parameters, including with uneven weights.
Not only does the IFT correctly propagate the sensitivities to the calibrated model parameters b, it also
correctly adjusts the risks to the derived model parameters c. For instance, the IFT formula would correctly
propagate model vegas into market vegas, and at the same time properly adjust delta to account for the
calibration.
The IFT
Conceptually, the market risks represent the sensitivity of V to small variations (bumps) of the market
parameters a. When we bump a, we re-derive the model parameters c and recalibrate the model
parameters b before we re-compute the value of the transaction V in the model. Hence, market
parameters only impact the value through the re-derived and re-calibrated model parameters. Therefore:

$$\underbrace{\frac{\partial V}{\partial a}}_{1 \times n} = \underbrace{\frac{\partial V}{\partial b}}_{1 \times m} \underbrace{\frac{\partial b}{\partial a}}_{m \times n} + \underbrace{\frac{\partial V}{\partial c}}_{1 \times p} \underbrace{\frac{\partial c}{\partial a}}_{p \times n}$$

$\partial V / \partial a$ is the result we need. $\partial V / \partial b$ and $\partial V / \partial c$
result from the differentiation of the valuation procedure, after and excluding calibration (for example,
with AAD instrumentation). $\partial c / \partial a$ is obtained by differentiation of $c = f(a)$. For
instance, if a consists of a spot and some implied volatilities, and c is the spot, then trivially,
$c = a_0$ and $\partial c / \partial a = (1, 0, \dots, 0)$.
The only unknown then is the sensitivity matrix of calibrated to market parameters
$\partial b / \partial a$. This is where we invoke the IFT to avoid having to differentiate the calibration.
Crucially, our equation holds for all transactions, in particular all the calibration instruments I. So:
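
$$\underbrace{\frac{\partial I}{\partial a}}_{i \times n} = \underbrace{\frac{\partial I}{\partial b}}_{i \times m} \underbrace{\frac{\partial b}{\partial a}}_{m \times n} + \underbrace{\frac{\partial I}{\partial c}}_{i \times p} \underbrace{\frac{\partial c}{\partial a}}_{p \times n}$$
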
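The unknown Jacobian $\partial b / \partial a$ is the same in both equations. We compute it by inversion
of the second equation:

$$\underbrace{\frac{\partial b}{\partial a}}_{m \times n} = \underbrace{\left(\frac{\partial I}{\partial b}\right)^{-1}}_{m \times i} \underbrace{\left[\underbrace{\frac{\partial I}{\partial a}}_{i \times n} - \underbrace{\frac{\partial I}{\partial c}}_{i \times p} \underbrace{\frac{\partial c}{\partial a}}_{p \times n}\right]}_{i \times n}$$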

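The inversion of the i×m matrix $\partial I / \partial b$, which produces an m×i matrix, is covered in the
section on pseudo-inversion. For now, we re-inject the result into the first equation to get the final IFT
formula:

$$\underbrace{\frac{\partial V}{\partial a}}_{1 \times n} = \underbrace{\frac{\partial V}{\partial b}}_{1 \times m} \underbrace{\left(\frac{\partial I}{\partial b}\right)^{-1}}_{m \times i} \underbrace{\left[\underbrace{\frac{\partial I}{\partial a}}_{i \times n} - \underbrace{\frac{\partial I}{\partial c}}_{i \times p} \underbrace{\frac{\partial c}{\partial a}}_{p \times n}\right]}_{i \times n} + \underbrace{\frac{\partial V}{\partial c}}_{1 \times p} \underbrace{\frac{\partial c}{\partial a}}_{p \times n}$$
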
The sensitivities $\partial I / \partial a$, $\partial I / \partial b$ and $\partial I / \partial c$ of the
calibration instruments to the market and model parameters are computed by differentiation of the
valuation formulas or procedures $I = f(a)$ and $I = g(b, c)$, which are also necessary for calibration and
therefore implemented for all models. For instance, a could be a spot and a set of known market implied
volatilities, and $f(a)$ could be the market prices of the corresponding European calls, in which case
$f_k(a) = BS(a_0, a_k, T_k, K_k)$ where BS is Black and Scholes’s formula, $a_0$ is the spot price (for this
example, we assume no rates or dividends), and $a_k, T_k, K_k$ are respectively the implied volatility,
maturity and strike of the kth European call. In this (simplistic) example, the volatility columns of
$\partial I / \partial a$ form the diagonal matrix of European call vegas (the remaining column holds the
deltas with respect to $a_0$). In a more general context, $f_k$ gives the price of a European call of some
strike $K_k$ and maturity $T_k$, with an implied volatility $\hat\sigma_k$ interpolated from a volatility
surface, $\hat\sigma_k = \hat\sigma_a(K_k, T_k)$, parameterized by (part of) a. In this case,
$\partial I / \partial a$ is no longer diagonal but still analytic.
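
As a sketch of the simplistic example above (no rates, no dividends), the market-side map f and its Jacobian can be written in closed form. The function names, strikes, maturities and market values below are hypothetical and chosen for illustration only. Column 0 of the Jacobian holds the call deltas with respect to the spot a[0], and the remaining columns form the diagonal matrix of vegas; a central finite difference is used as a sanity check.

```python
import math
import numpy as np

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def norm_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def bs_call(spot, vol, T, K):
    """Black-Scholes call price with zero rates and dividends."""
    std = vol * math.sqrt(T)
    d1 = math.log(spot / K) / std + 0.5 * std
    return spot * norm_cdf(d1) - K * norm_cdf(d1 - std)

def f(a, Ts, Ks):
    """Market prices I = f(a), with a = (spot, implied vol 1, ..., implied vol i)."""
    return np.array([bs_call(a[0], a[k + 1], Ts[k], Ks[k])
                     for k in range(len(Ts))])

def dI_da(a, Ts, Ks):
    """Analytic Jacobian of f: column 0 = deltas, then a diagonal block of vegas."""
    i = len(Ts)
    jac = np.zeros((i, i + 1))
    for k in range(i):
        std = a[k + 1] * math.sqrt(Ts[k])
        d1 = math.log(a[0] / Ks[k]) / std + 0.5 * std
        jac[k, 0] = norm_cdf(d1)                                  # delta
        jac[k, k + 1] = a[0] * norm_pdf(d1) * math.sqrt(Ts[k])    # vega
    return jac

# Hypothetical market: spot 100 and three implied volatilities.
a = np.array([100.0, 0.20, 0.22, 0.25])
Ts = [0.5, 1.0, 2.0]
Ks = [95.0, 100.0, 110.0]
jac = dI_da(a, Ts, Ks)

# Central finite differences, one column per market parameter.
eps = 1e-6
jac_fd = np.column_stack([
    (f(a + eps * e, Ts, Ks) - f(a - eps * e, Ts, Ks)) / (2 * eps)
    for e in np.eye(len(a))
])
```

When the implied volatilities are instead interpolated from a parameterized surface, the same structure applies, with the vega block multiplied by the Jacobian of the interpolated volatilities with respect to the surface parameters.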
On the model side, $g_k(b, c)$ gives the price of the kth European call as a function of the model
parameters, which is generally obtained analytically (Hull & White interest rate model), or by expansion
(Andreasen’s implementation of a multifactor version of Cheyette’s interest rate model, see Back to the
Future, 2005) or by a fast numerical scheme (Dupire, Heston).
Note that the formula propagates the risks to both the calibrated and the derived model parameters onto
the market parameters, as if we were bumping through calibration, but without actually recalibrating,
hence in a faster, more robust manner, and one that correctly deals with the case of imperfect (best fit)
calibration, even weighted, as explained below.
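
The final IFT formula above can be sketched in a few lines of NumPy. This is a minimal illustration, not the author's implementation: the function name `ift_market_risks` and its argument names are hypothetical, the Jacobians are assumed to be available as dense arrays, and `numpy.linalg.pinv` (an SVD-based pseudo-inverse with even weights) stands in for the inversion step. The toy check uses a linear "model" and a linear "market", both invented for the illustration, so that bump-and-recalibrate can be carried out exactly with a least-squares solve.

```python
import numpy as np

def ift_market_risks(dV_db, dV_dc, dI_da, dI_db, dI_dc, dc_da):
    """Market risks dV/da (length n) from model risks, by the IFT formula.

    Shapes: dV_db (m,), dV_dc (p,), dI_da (i, n), dI_db (i, m),
    dI_dc (i, p), dc_da (p, n).
    """
    # db/da = (dI/db)^-1 [dI/da - dI/dc dc/da], with a pseudo-inverse
    db_da = np.linalg.pinv(dI_db) @ (dI_da - dI_dc @ dc_da)
    # dV/da = dV/db db/da + dV/dc dc/da
    return dV_db @ db_da + dV_dc @ dc_da

# Toy setup (all numbers hypothetical): n = 3 market parameters,
# m = 2 calibrated parameters, p = 1 derived parameter, i = 3 instruments,
# so the calibration is a best fit (i > m).
rng = np.random.default_rng(0)
F = rng.normal(size=(3, 3))        # market side: I = f(a) = F a
X = rng.normal(size=(3, 2))        # model side:  I = g(b, c) = X b + D c
D = rng.normal(size=(3, 1))
C = np.array([[1.0, 0.0, 0.0]])    # derived parameter: c = a[0]
vb = rng.normal(size=2)            # transaction value: V = vb.b + vc.c
vc = rng.normal(size=1)

def calibrate(a):
    """Least-squares fit of b to the market a (even weights)."""
    c = C @ a
    b, *_ = np.linalg.lstsq(X, F @ a - D @ c, rcond=None)
    return b, c

def value(a):
    b, c = calibrate(a)
    return vb @ b + vc @ c

a0 = np.array([100.0, 0.2, 0.25])
risks_ift = ift_market_risks(vb, vc, F, X, D, C)

# Reference: bump each market parameter, recalibrate and reprice.
eps = 1e-4
risks_bump = np.array([
    (value(a0 + eps * e) - value(a0 - eps * e)) / (2 * eps)
    for e in np.eye(3)
])
```

In this linear setting the two sets of risks agree up to rounding, while the IFT route needs no recalibration at all. With a real model, the bumped calibration is iterative and noisy, which is where the two approaches part ways in cost and stability.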
Pseudo-inversion
The IFT formula involves the inverse of the Jacobian of calibration instruments to calibrated parameters
$\partial I / \partial b$ (for instance, the model vegas of the calibration instruments). This matrix may
not be invertible. It doesn’t even have to be square. The calibration may be a best fit to more instruments
than we have parameters: i>m. In this case, we may even want uneven calibration weights to ensure better
fit to the “most important” instruments like ATM options.
Hence, instead of an actual inverse, this is a pseudo-inverse defined by the well known formula (here,
extended to account for calibration weights W):

$$\underbrace{X^{-1}}_{m \times i} = \Big[\big(\underbrace{W}_{i \times i} \underbrace{X}_{i \times m}\big)' \big(\underbrace{W}_{i \times i} \underbrace{X}_{i \times m}\big)\Big]^{-1} \big(\underbrace{W}_{i \times i} \underbrace{X}_{i \times m}\big)' \underbrace{W}_{i \times i}$$

where W is a diagonal matrix whose diagonal elements are the calibration weights of the instruments.
With even weights, the formula simplifies into the classical pseudo-inverse:

$$X^{-1} = (X'X)^{-1} X'$$

Note that if X has dimensions n×m, its pseudo-inverse is of shape m×n.
When X is square and of full rank, the pseudo-inverse trivially coincides with the naive inverse. Hence,
with a perfect, well defined, one-to-one calibration, the IFT formula could be applied without further
development.
But this is not advisable. When X is square but “almost” singular, a naive inversion may raise numerical
instabilities. It is therefore best practice to implement pseudo-inversion in all cases, and, in addition,
implement it with SVD to avoid instabilities in the matrix inversion involved. SVD is known to be robust
and stable in the face of singularity or near co-linearity and is found in Numerical Recipes, with source
code.
When X is not square (best fit), the naive inverse is undefined, while SVD inversion produces a result that
corresponds to a weighted least squares projection of model risks onto market risks, which is what we
want in this case.
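
The link between the least squares projection and the implicit function theorem can be made explicit with a short derivation. It is a sketch under stated assumptions, not taken from the original note: the calibration is assumed to minimize the weighted squared error written below, with W the diagonal matrix of weights applied to the calibration errors, $X = \partial g / \partial b$, and the terms where second derivatives of g multiply the calibration residual are neglected (they vanish when the fit is exact).

```latex
\begin{aligned}
b^*(a) &= \arg\min_b \tfrac{1}{2} \left\| W \left[ g\big(b, c(a)\big) - f(a) \right] \right\|^2
  && \text{(calibration)} \\
G(a, b) &\equiv X' W^2 \left[ g\big(b, c(a)\big) - f(a) \right] = 0
  && \text{(first-order condition at } b = b^*(a)\text{)} \\
\frac{\partial b}{\partial a} &= - \left( \frac{\partial G}{\partial b} \right)^{-1} \frac{\partial G}{\partial a}
  && \text{(implicit function theorem)} \\
\frac{\partial G}{\partial b} &\approx X' W^2 X, \qquad
\frac{\partial G}{\partial a} \approx X' W^2 \left[ \frac{\partial g}{\partial c} \frac{\partial c}{\partial a} - \frac{\partial f}{\partial a} \right] \\
\frac{\partial b}{\partial a} &\approx \left( X' W^2 X \right)^{-1} X' W^2
  \left[ \frac{\partial f}{\partial a} - \frac{\partial g}{\partial c} \frac{\partial c}{\partial a} \right]
  = \left[ (WX)'(WX) \right]^{-1} (WX)' \, W
  \left[ \frac{\partial f}{\partial a} - \frac{\partial g}{\partial c} \frac{\partial c}{\partial a} \right]
\end{aligned}
```

Under these assumptions, the weighted pseudo-inverse of X is exactly the operator that the implicit function theorem applies to the bracket of market-side minus model-side sensitivities of the calibration instruments.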
Therefore, SVD inversion is always either identical or superior to naive inversion and the additional cost
is, in practice, negligible compared to the necessary calculation of $\partial I / \partial a$,
$\partial I / \partial b$ and $\partial I / \partial c$. With SVD inversion, we overcome potential numerical
difficulties, correctly handle best fits and, as we see in the example below, correctly adjust risks to
parameters that are not calibrated.
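
A weighted pseudo-inverse by SVD can be sketched as follows. The function name `weighted_pinv`, the relative cutoff `rcond` and the convention that the weights multiply the calibration errors before squaring are assumptions of this sketch, not specifications from the note. Singular values below the cutoff are discarded rather than inverted, which is what keeps the result stable when X is close to singular.

```python
import numpy as np

def weighted_pinv(X, weights=None, rcond=1e-12):
    """Pseudo-inverse (shape m x i) of X (shape i x m) with diagonal weights.

    Computes [(WX)'(WX)]^-1 (WX)' W through the SVD of WX.
    """
    X = np.asarray(X, dtype=float)
    if weights is None:
        w = np.ones(X.shape[0])              # even weights
    else:
        w = np.asarray(weights, dtype=float)
    WX = w[:, None] * X                      # W X, without forming W
    U, s, Vt = np.linalg.svd(WX, full_matrices=False)
    keep = s > rcond * s.max()               # drop (near-)null directions
    s_inv = np.zeros_like(s)
    s_inv[keep] = 1.0 / s[keep]
    pinv_WX = (Vt.T * s_inv) @ U.T           # V diag(1/s) U'
    return pinv_WX * w[None, :]              # right-multiply by W

# Best fit: i = 3 instruments, m = 2 parameters, first instrument overweighted.
X = np.array([[1.0, 0.5],
              [0.2, 1.0],
              [0.7, 0.3]])
w = np.array([2.0, 1.0, 1.0])
P = weighted_pinv(X, w)

# Same quantity from the normal equations, for comparison only.
W = np.diag(w)
P_normal = np.linalg.inv(X.T @ W @ W @ X) @ X.T @ W @ W

# Square, full-rank case: the pseudo-inverse is the ordinary inverse.
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
P_square = weighted_pinv(A)
```

The normal-equation form is shown only as a cross-check: it squares the condition number of WX, which is the kind of instability the SVD route is meant to avoid.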
Illustration
A simple yet representative example is when the market consists of a spot and a number of implied
volatilities: $a = (S, \hat\sigma_1, \dots, \hat\sigma_{n-1})$. The model also involves the spot and a set of
dynamic parameters, typically model volatilities: $(S, \sigma_1, \dots, \sigma_{n-1})$, where, in the
notation above, the spot is the derived parameter $c = S$ and the model volatilities are the calibrated
parameters $b = (\sigma_1, \dots, \sigma_{n-1})$. In this example, we have n-1 model volatilities perfectly
fitting n-1 implied volatilities so the weights are irrelevant: the weight matrix is the identity,
$W = I_{n-1}$. The calibration instruments are the n-1 options corresponding to the implied volatilities.
In this simple case, we focus on the meaning of the IFT formula. The comments remain valid when we best
fit m-1<n-1 model volatilities to n-1 market volatilities, with or without non-uniform weights.
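Applying the IFT formula we get (note that in our example, the pseudo-inverse is just the inverse):

$$
\Big( \underbrace{\Delta_{mkt}}_{\text{delta, scalar}},\ \underbrace{\nu_{mkt}}_{\text{vegas, } 1 \times (n-1)} \Big)
= \underbrace{\nu_{mdl}}_{\text{model vegas, } 1 \times (n-1)}
\underbrace{\big( \nu^{I}_{mdl} \big)^{-1}}_{\substack{\text{naive inverse of model vegas of} \\ \text{calibration instruments, } (n-1) \times (n-1)}}
\Big[ \Big( \underbrace{\Delta^{I}_{mkt}}_{\text{deltas of Is, } (n-1) \times 1},\ \underbrace{\nu^{I}_{mkt}}_{\text{vegas of Is, } (n-1) \times (n-1)} \Big)
- \underbrace{\Delta^{I}_{mdl}}_{\text{model deltas of Is, } (n-1) \times 1} (1, 0, \dots, 0) \Big]
+ \underbrace{\Delta_{mdl}}_{\text{model delta, scalar}} (1, 0, \dots, 0)
$$

where $(1, 0, \dots, 0)$ is the 1×n row vector $\partial c / \partial a$, so that the product
$\Delta^{I}_{mdl} (1, 0, \dots, 0)$ is the (n-1)×n matrix with the model deltas of the calibration
instruments in its first column and zeros elsewhere.
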
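And we immediately get:

$$\underbrace{\Delta_{mkt}}_{\text{final delta}} = \underbrace{\Delta_{mdl}}_{\text{model delta}} + \underbrace{\nu_{mdl} \big( \nu^{I}_{mdl} \big)^{-1} \big( \Delta^{I}_{mkt} - \Delta^{I}_{mdl} \big)}_{\text{delta adjustment}}$$

and:

$$\underbrace{\nu_{mkt}}_{\text{final vegas}} = \underbrace{\nu_{mdl}}_{\text{model vegas}} \underbrace{\big( \nu^{I}_{mdl} \big)^{-1}}_{\text{inverse of model vegas of Is}} \underbrace{\nu^{I}_{mkt}}_{\text{market vegas of Is}}$$
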
We can see that these formulas are sensible and reproduce the risks we would get if we bumped,
recalibrated and repriced, but without the associated cost and instability.
The final delta corrects the model delta for calibration, adding a vega-weighted “delta of model
volatilities”. The final vegas are simply obtained by multiplication of the model vegas by the Jacobian of
model to market volatilities, in this case the inverse Jacobian of market to model volatilities.
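
The delta and vega formulas of this illustration can be checked against the general IFT formula with a small numerical sketch. All inputs below are random placeholders, not outputs of an actual model, and the variable names are hypothetical: `vega_mdl` and `delta_mdl` stand for the model risks of the transaction, and the `_I_` arrays for the market and model risks of the calibration instruments.

```python
import numpy as np

rng = np.random.default_rng(2)
k = 3                                          # number of volatilities, n - 1
delta_mdl = 0.6                                # model delta of the transaction
vega_mdl = rng.normal(size=k)                  # model vegas of the transaction
delta_I_mkt = rng.uniform(0.3, 0.7, size=k)    # market deltas of the Is
delta_I_mdl = rng.uniform(0.3, 0.7, size=k)    # model deltas of the Is
vega_I_mkt = np.diag(rng.uniform(10, 40, size=k))   # market vegas of the Is
vega_I_mdl = np.diag(rng.uniform(10, 40, size=k)) + rng.normal(size=(k, k))

# Row vector vega_mdl @ inv(vega_I_mdl), computed with a linear solve.
proj = np.linalg.solve(vega_I_mdl.T, vega_mdl)

# Formulas of the illustration.
final_delta = delta_mdl + proj @ (delta_I_mkt - delta_I_mdl)
final_vegas = proj @ vega_I_mkt

# General IFT formula, with a = (spot, vols) and c = a[0].
dI_da = np.column_stack([delta_I_mkt, vega_I_mkt])    # shape (k, k + 1)
dc_da = np.eye(1, k + 1)[0]                           # (1, 0, ..., 0)
general = proj @ (dI_da - np.outer(delta_I_mdl, dc_da)) + delta_mdl * dc_da
```

The first entry of `general` is the final delta and the remaining entries are the final vegas, which is the split written out in the two formulas above.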
Finally, we can see that for a transaction that happens to be a calibration instrument, the formulas
produce the exact same risks as those directly obtained from the market. In the same way that a calibrated
model is sometimes seen as an extrapolation tool from standard to less standard instruments, the IFT
provides an extrapolation of the risks. The closer a product is to a calibration instrument, the closer its
risks will be to the standard market risks. For other transactions, the model in combination with the IFT
produces a projection of non-standard risks to the standard ones.
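
The statement about calibration instruments can be verified numerically. In the sketch below, the Jacobians are random placeholders rather than outputs of an actual model, and the calibration is assumed exact (square, invertible model vegas of the calibration instruments). Taking the transaction to be the j-th calibration instrument means that its model risks are the j-th rows of the instruments' model Jacobians; the IFT formula then returns the j-th row of the market Jacobian, that is, the market delta and vegas of that instrument.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4                                    # spot + 3 implied volatilities
dI_da = rng.normal(size=(n - 1, n))      # market deltas (col 0) and vegas of the Is
dI_db = rng.normal(size=(n - 1, n - 1)) + 3.0 * np.eye(n - 1)   # model vegas of the Is
dI_dc = rng.normal(size=(n - 1, 1))      # model deltas of the Is
dc_da = np.zeros((1, n))
dc_da[0, 0] = 1.0                        # c = a[0]

# The transaction is the j-th calibration instrument.
j = 1
dV_db = dI_db[j]
dV_dc = dI_dc[j]

# IFT: db/da by a linear solve (exact calibration), then the chain rule.
db_da = np.linalg.solve(dI_db, dI_da - dI_dc @ dc_da)
dV_da = dV_db @ db_da + dV_dc @ dc_da
```

Here `dV_da` coincides with `dI_da[j]`: the model-side terms cancel out and only the market risks of the instrument remain.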