All content in this area was uploaded by Caio Almeida on Oct 01, 2018


The Role of No-Arbitrage on Forecasting:

Lessons from a Parametric Term Structure

Model ∗

Caio Almeida

Graduate School of Economics

Getulio Vargas Foundation

Praia de Botafogo 190, 11th Floor,

Botafogo, Rio de Janeiro, Brazil

Phone: 5521-2559-5828, Fax: 5521-2553-8821,

calmeida@fgv.br

José Vicente

Research Department

Central Bank of Brazil

Av. Presidente Vargas 730, 7th Floor,

Centro, Rio de Janeiro, Brazil

Phone: 5521-21895762, Fax: 5521-21895092,

jose.valentim@bcb.gov.br

June 10, 2008

∗ We thank Antonio Diez de los Rios, Darrell Duffie, Marcelo Fernandes,
Jean-Sebastien Fontaine, René Garcia, and Lotfi Karoui for important
comments. We also thank seminar participants at the 26th Brazilian
Colloquium of Mathematics, the Sofie 2008 Conference, Getulio Vargas
Foundation, HEC Montreal, and Catholic University in Rio de Janeiro for
comments and suggestions. The views expressed are those of the authors and
do not necessarily reflect those of the Central Bank of Brazil. The first
author gratefully acknowledges financial support from CNPq-Brazil.


Abstract

Parametric term structure models have been successfully applied to
numerous problems in fixed income markets, including pricing, hedging,
managing risk, as well as studying monetary policy implications. In turn,
dynamic term structure models, equipped with stronger economic structure,
have been mainly adopted to price derivatives and explain empirical
stylized facts. In this paper, we combine flavors of those two classes of
models to test if no-arbitrage affects forecasting. We construct cross
section (allowing arbitrages) and arbitrage-free versions of a parametric
polynomial model to analyze how well they predict out-of-sample interest
rates. Based on U.S. Treasury yield data, we find that no-arbitrage
restrictions significantly improve forecasts. Arbitrage-free versions
achieve overall smaller biases and Root Mean Square Errors for most
maturities and forecasting horizons. Furthermore, a decomposition of
forecasts into forward-rates and holding return premia indicates that the
superior performance of no-arbitrage versions is due to a better
identification of bond risk premium.

Keywords: Dynamic term structure models, parametric functions, factor
loadings, time series analysis, time-varying bond risk premia

EFM codes:


1 Introduction

Fixed income portfolio managers, central bankers, and market participants
are in a continuous search for econometric models that better capture the
evolution of interest rates. As the term structure of interest rates
carries important information about monetary policy and market risk
factors, those models might be seen as useful decision-orienting tools. In
fact, in a quest to better understand the behavior of interest rates, a
large literature on excess returns predictability and interest rates
forecasting has emerged.¹ In particular, some models are not consistent
inter-temporally while others impose no-arbitrage restrictions, and so far
the importance of such restrictions in the forecasting context has not
been established.

Testing the importance of no-arbitrage on interest rate forecasts should
be relevant for at least two reasons. First, since imposing no-arbitrage
implies stronger economic structure, testing how it affects a model's
ability to capture risk premium dynamics should be of direct concern to
researchers. In principle, although we could expect that a more
theoretically sound model would better capture risk premiums, only careful
empirical analysis can answer such a question. Second, from a
practitioner's viewpoint, testing how no-arbitrage affects forecasting
will objectively inform managers whether it is worth implementing more
complex interest rate models. Since latent factor models with no economic
restrictions are usually simpler to implement, if no-arbitrage
restrictions do not bring practical gains, they do not necessarily have to
be enforced.

In this paper, we address the above-mentioned points by testing how
no-arbitrage restrictions affect the forecasting ability and risk premium
structure of a parametric term structure model.² We argue that parametric
models are particularly appropriate to test the effects of no-arbitrage on
forecasting, since they keep a fixed factor-loading structure that is
independent of the underlying factors' dynamics. This invariant loading
structure implies that across different versions of the model, bond risk
premia relate to a common set of underlying factors, i.e. term structure
movements. Based on this fixed set of factors, it should be possible to
perform a careful analysis of how each model version and no-arbitrage
restrictions affect risk premium.

¹ Fama (1984), Fama and Bliss (1987), Campbell and Shiller (1991), Dai and
Singleton (2002), Duffee (2002), and Cochrane and Piazzesi (2005) analyze
the failure of the expectation hypothesis and the importance of
time-varying risk premia. Kargin and Onatski (2007), Bali et al. (2006),
Diebold and Li (2006), and Bowsher and Meeks (2006) study different model
specifications in a search for adequate forecasting candidates. Ang and
Piazzesi (2003), Hordahl et al. (2006), Huse (2007), Favero et al. (2007),
and Mönch (2007) relate interest rates and macroeconomic variables through
term structure models.

² In parametric term structure models, the term structure is a linear
combination of predetermined parametric functions, such as polynomials,
exponentials, or trigonometric functions. For examples, see McCulloch
(1971), Vasicek and Fong (1982), Chambers et al. (1984), Nelson and Siegel
(1987), and Svensson (1994), among others.

We parameterize the term structure of interest rates as a linear
combination of Legendre polynomials. This framework supports flexible
factor dynamics, including versions that allow for arbitrage opportunities
and others that are arbitrage-free. Focusing the analysis on three-factor
models,³ we compare a cross section (CS) version, which allows for the
existence of arbitrages, to two affine arbitrage-free versions, one
Gaussian (AFG) and the other with one factor driving stochastic volatility
(AFSV).

The CS polynomial version is similar to the exponential model adopted by
Diebold and Li (2006) to forecast the U.S. term structure of Treasury
bonds, i.e. they are both parametric models that do not rule out
arbitrages. In turn, the arbitrage-free versions of the Legendre model
share many characteristics with the class of affine models proposed by
Duffie and Kan (1996). No-arbitrage restrictions are imposed through the
inclusion of conditionally deterministic factors of small magnitude that
guarantee the existence of an equivalent martingale probability measure
(Almeida 2005). Each arbitrage-free version is implemented with six latent
factors: three stochastic, and three conditionally deterministic.
Interestingly, by affecting the dynamics of the three basic stochastic
factors (“level”, “slope” and “curvature”), the conditionally
deterministic factors directly affect bond risk premium structure.

More general arbitrage-free versions of the polynomial model exist and
could also be analyzed.⁴ However, aiming for objectivity and transparency,
a more concise analysis was favored, with choices of Gaussian (AFG) and
Stochastic Volatility (AFSV) affine versions motivated by Dai and
Singleton (2002), Duffee (2002), and Tang and Xia (2007). Duffee (2002)
elects the three-factor affine Gaussian model as the best (within affine)
to predict U.S. bond excess returns. Dai and Singleton (2002) identify
that the same Gaussian model correctly reproduces the failures of the
expectation hypothesis documented by Fama and Bliss (1987) for U.S.
Treasury bonds. In contrast, Tang and Xia (2007) show that a three-factor
affine model with one factor driving stochastic volatility generates bond
risk premium patterns compatible with data from five major fixed income
markets (Canada, Japan, UK, US, and Germany). A key ingredient to all
these findings is the flexible essentially affine parameterization of the
market prices of risk (Duffee 2002), which we also adopt in our work.

³ Litterman and Scheinkman (1991) show that most of the variability of the
U.S. term structure of Treasury bonds can be captured by three factors:
level, slope and curvature. Many subsequent works have confirmed their
findings. An exception is Cochrane and Piazzesi (2005), who find that a
fourth latent factor improves forecasting ability.

⁴ For instance, versions with more than one factor driving stochastic
volatility within the affine family, or even models with a non-affine
diffusion structure. For examples, see Almeida (2005).

Based on monthly U.S. zero-coupon Treasury data, we analyze the out-

of-sample behavior of the three proposed versions under diﬀerent forecasting

horizons (1-month, 6-month, and 12-month). Forecasting results indicate

that dynamic arbitrage-free versions of the model achieve overall lower bias

and root mean square errors for most maturities, with stronger results holding

for longer forecasting horizons. Diebold and Mariano (1995) tests conﬁrm

the statistical signiﬁcance of obtained results.

In order to analyze the eﬀects of no-arbitrage in the risk premium struc-

ture, we decompose yield forecasts into forward rates and risk premium com-

ponents. The decomposition allows us to identify that the superior forecast-

ing performance of arbitrage-free versions is primarily due to a better identi-

ﬁcation of bond risk premium dynamics. This result represents an important

eﬀort in the direction of understanding how no-arbitrage aﬀects forecasting.

It also indicates that further analysis with other classes of parametric models

should be seriously considered.

Related works include the papers by Duffee (2002), Ang and Piazzesi
(2003), Favero et al. (2007), and Christensen et al. (2007). Duffee (2002)
tests the ability of affine models to forecast interest rates, concluding
that completely affine models fail to reproduce U.S. term structure
stylized facts, while essentially affine models do a better job due to a
richer risk premium structure. While Duffee (2002) analyzes how different
market prices of risk specifications affect forecasting in arbitrage-free
models, we study how no-arbitrage affects forecasting, which means
including models that allow for arbitrages in our analysis.

Ang and Piazzesi (2003) show that imposing no-arbitrage restrictions to

a VAR with macroeconomic variables improves its forecasting ability. Simi-

larly, Favero et al. (2007) test how macroeconomic variables and no-arbitrage


restrictions aﬀect interest rate forecasting, ﬁnding that no-arbitrage mod-

els, when supplemented with macro data, are more eﬀective in forecasting.

Both papers model factor dynamics with a Gaussian VAR structure, while

we include stochastic volatility in our analysis, ﬁnding it to be relevant to

improve forecasting. In addition, both allow for changes in term structure

loadings when comparing no-arbitrage models to models allowing for arbi-

trages. Those changes in factors and bond risk premiums make it harder

to isolate the pure eﬀects of no-arbitrage on forecasting. In contrast, the

parametric term structure polynomial model adopted in our work avoids this

issue due to its ﬁxed factor-loading structure.

Christensen et al. (2007) obtain a Gaussian arbitrage-free version of the

parametric exponential model proposed by Diebold and Li (2006). They em-

pirically test their arbitrage-free version and identify that it oﬀers predictive

gains for moderate to long maturities and forecasting horizons. Although

in this case they keep a ﬁxed factor loading structure as we do, there are

interesting diﬀerences between the two papers. First, the two papers analyze

distinct parametric families, each oﬀering interesting insights on their own.

Second, the technique used to derive arbitrage-free versions is quite distinct.

While we base our derivations on Filipovic’s (2001) consistency work, which

is not attached to the class of aﬃne models, they make use of Duﬃe and

Kan’s (1996) arguments, which are valid only under aﬃne models. Third,

they present a Gaussian arbitrage-free version while we also include the im-

portant case where volatility is stochastic. Last, in addition to the forecasting

analysis, we propose a careful analysis of the risk premium structure, which

should be particularly interesting for portfolio managers and risk managers,

as a complementing tool.

Our results should be important to managers and practitioners in general.
They suggest it should be worth constructing arbitrage-free versions of
other parametric models to test their performances as practical
forecasting/hedging tools. The techniques adopted to construct
arbitrage-free versions of the polynomial model can be found in Filipovic
(2001), and can be readily applied to other parametric families, such as
variations of Nelson and Siegel (1987) models, Svensson (1994) models,⁵
and splines models with fixed knots, among others.

We provide evidence that no-arbitrage restrictions improve interest rate
forecasting for a class of parametric models, but in what generality can
we say that no-arbitrage restrictions indeed help? Our results, when
coupled with those by Ang and Piazzesi (2003), Favero et al. (2007), and
Christensen et al. (2007), indicate that the validity of no-arbitrage
restrictions as a tool to improve a model's forecasting ability appears to
be reasonably general.⁶ Moreover, as we show that no-arbitrage
restrictions help to better econometrically identify risk premium
parameters, this identification improvement should be even more
significant considering more complex dynamic term structure models. Models
with time-varying conditional variances and non-linear risk premiums are
becoming more common as tools to capture empirical stylized facts of the
term structure of interest rates (see Dai and Singleton (2003) for a
discussion).⁷ Correspondingly, parametric models with more general
dynamics should be tested both with the purpose of fitting and forecasting
interest rates. This suggests that no-arbitrage restrictions as a tool to
improve econometric identification of parameters (especially risk premium
parameters) should be of fundamental importance for the implementation of
such models.

The paper is organized as follows. Section 2 introduces the polynomial
model, presenting its CS and arbitrage-free versions. Section 3 explains
the dataset adopted, and presents empirical results, including a
discussion relating bond risk premium to model forecasting ability.
Section 4 offers concluding remarks and possible extensions. The Appendix
presents details on the arbitrage-free versions of the polynomial model.

⁵ Filipovic (1999) showed that there is no non-trivial arbitrage-free
version of the original Nelson and Siegel (1987) model. Nevertheless, it
is possible to construct arbitrage-free versions of variations of the
Nelson and Siegel and Svensson (1994) models, as shown for instance by
Christensen et al. (2007). On this matter, see Sharef and Filipovic (2005)
for theoretical results, and De Rossi (2004) for an implementation of a
Gaussian exponential arbitrage-free model.

⁶ Nevertheless, we believe that a more detailed analysis of the
contributions of no-arbitrage to interest rate forecasting should be
accomplished with extensive tests of a variety of different dynamic
models, also complemented with robustness tests for the periods of data
adopted. For instance, in a recent paper Duffee (2007) suggests, for a
class of affine Gaussian models, that no-arbitrage restrictions do not add
forecasting ability for his proposed model, although they do not hurt its
performance either.

⁷ For instance, in a recent paper Dai et al. (2006) propose a class of
non-linear discrete-time models whose market prices of risk are non-linear
functions of the state variables. They show, under a three-factor dynamic
model, that the inclusion of a cubic term in the drift of the factor
driving stochastic volatility improves out-of-sample forecasting ability
when compared to a linear drift for the same factor.


2 The Legendre Polynomial Model

Almeida et al. (1998) proposed modeling the term structure of interest
rates R(.) as a linear combination of Legendre polynomials:⁸

$$R(t,\tau) = \sum_{n \geq 1} Y_{t,n}\, P_{n-1}\!\left(\frac{2\tau}{\ell} - 1\right), \qquad (1)$$

where $\tau$ denotes time to maturity, $P_n$ is the Legendre polynomial of
degree $n$, and $\ell$ is the longest maturity in the bond market. In this
model, each Legendre polynomial represents a term structure movement,
providing an intuitive generalization of the principal components analysis
proposed by Litterman and Scheinkman (1991). The constant polynomial is
related to parallel shifts, the linear polynomial is related to changes in
the slope, and the quadratic polynomial is related to changes in the
curvature. Naturally, higher-order polynomials are interpreted as loadings
of different types of curvatures. For illustration purposes, Figure 1
depicts the first four Legendre polynomials.⁹ This model has been applied
to problems involving scenario-based portfolio allocation, risk
management, and hedging with non-parallel movements (see, for instance,
Almeida et al. 2000, 2003).
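As a minimal sketch of Equation 1, the snippet below evaluates a three-factor Legendre curve in plain Python. The factor values and the choice of a 10-year longest maturity are illustrative assumptions, not estimates from this paper, and the function names are hypothetical.

```python
def legendre(n, x):
    """Legendre polynomial P_n(x) computed with Bonnet's recurrence."""
    if n == 0:
        return 1.0
    p_prev, p = 1.0, x
    for k in range(1, n):
        # (k + 1) P_{k+1}(x) = (2k + 1) x P_k(x) - k P_{k-1}(x)
        p_prev, p = p, ((2 * k + 1) * x * p - k * p_prev) / (k + 1)
    return p


def loadings(tau, ell, n_factors):
    """Loadings [P_0(x), ..., P_{N-1}(x)] with x = 2 * tau / ell - 1."""
    x = 2.0 * tau / ell - 1.0
    return [legendre(n, x) for n in range(n_factors)]


def model_yield(factors, tau, ell):
    """R(t, tau): loading-weighted sum of the factors Y_t."""
    g = loadings(tau, ell, len(factors))
    return sum(y * p for y, p in zip(factors, g))


# Assumed level, slope and curvature factors (yields in decimals).
factors = [0.06, 0.01, -0.005]
ell = 10.0  # assumed longest maturity, in years
curve = {tau: model_yield(factors, tau, ell) for tau in (2, 3, 5, 7, 10)}
```

Because the maturity is mapped to [-1, 1] before the polynomials are evaluated, the loadings depend only on the maturity and on the longest maturity, never on the factor dynamics.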

⁸ A parametric term structure model based on the power series, as opposed
to the Legendre polynomial basis, appeared before in Chambers et al.
(1984). The advantage of Legendre polynomials is that they form an
orthogonal basis, being less subject to multicollinearity problems.

⁹ They are respectively $P_0(x) = 1$, $P_1(x) = x$,
$P_2(x) = \frac{1}{2}(3x^2 - 1)$, and $P_3(x) = \frac{1}{2}(5x^3 - 3x)$,
defined within the interval [-1,1]. The Legendre polynomials of degrees
four and five, $P_4(x) = \frac{1}{8}(35x^4 - 30x^2 + 3)$ and
$P_5(x) = \frac{1}{8}(63x^5 - 70x^3 + 15x)$, are also of interest, since
they will be adopted to build arbitrage-free versions of the Legendre
model.

In the estimation process, the number of Legendre polynomials is fixed
according to some statistical criterion.¹⁰ When considering zero-coupon
yields, on each date, the model is estimated by running a linear
regression of the corresponding vector of observed yields on the set of
Legendre polynomials ($P_n(\cdot)$'s) previously selected. The cross
section version (CS) of the model is characterized by repeatedly running
this linear regression at different instants of time, to extract a time
series of term structure movements $\{Y_t\}_{t=1,\dots,T}$. Equipped with
those time series one can choose any arbitrary time-series process to fit
their joint dynamics. It is important to note, however, that the
time-series extraction step imposes no inter-temporal restrictions on term
structure movements, consequently allowing for the existence of arbitrages
within the model.¹¹

¹⁰ Almeida et al. (1998) suggest the use of a stepwise regression, Akaike
or Bayesian information criteria.
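The date-by-date extraction step of the CS version can be sketched as an ordinary least squares regression of each date's yields on the Legendre loadings. The code below is a hedged illustration only: it solves the normal equations with a small hand-written solver, the two-date panel is synthetic, and all names are hypothetical.

```python
def legendre(n, x):
    """Legendre polynomial P_n(x) via Bonnet's recurrence."""
    if n == 0:
        return 1.0
    p_prev, p = 1.0, x
    for k in range(1, n):
        p_prev, p = p, ((2 * k + 1) * x * p - k * p_prev) / (k + 1)
    return p


def solve(a, b):
    """Solve the square system a z = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    z = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (m[i][n] - s) / m[i][i]
    return z


def cs_factors(yields, taus, ell, n_factors):
    """OLS of one date's yields on the Legendre loadings (normal equations)."""
    g = [[legendre(n, 2.0 * tau / ell - 1.0) for n in range(n_factors)]
         for tau in taus]
    gtg = [[sum(row[i] * row[j] for row in g) for j in range(n_factors)]
           for i in range(n_factors)]
    gty = [sum(row[i] * y for row, y in zip(g, yields))
           for i in range(n_factors)]
    return solve(gtg, gty)


taus, ell = [2, 3, 5, 7, 10], 10.0
# Synthetic panel built from assumed factors, so the regression can be checked.
true_factors = [[0.06, 0.01, -0.005], [0.055, 0.012, -0.004]]
panel = [[sum(y * legendre(n, 2.0 * tau / ell - 1.0) for n, y in enumerate(f))
          for tau in taus] for f in true_factors]
# One regression per date; no restriction links consecutive dates.
factor_series = [cs_factors(y, taus, ell, 3) for y in panel]
```

Each date is fitted in isolation, which is exactly why this version places no inter-temporal restriction on the extracted movements.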

From an economic point of view, it would be interesting to add enough
structure to our model so as to enforce absence of arbitrages. To that
end, we begin by assuming the following dynamics for the stochastic
factors driving term structure movements:

$$dY_t = \mu(Y_t)\,dt + \sigma(Y_t)\,dW_t, \qquad (2)$$

where $W$ is an $N$-dimensional independent standard Brownian motion under
the objective probability measure $P$ and $\mu(\cdot)$ and $\sigma(\cdot)$
are progressively measurable processes with values in $\mathbb{R}^N$ and
in $\mathbb{R}^{N \times N}$, respectively, such that the differential
system above is well-defined.

How do we impose no-arbitrage conditions on the polynomial model? From
finance theory, it suffices to guarantee the existence of a martingale
measure equivalent to $P$ (see Duffie 2001). More specifically, in order
to rule out arbitrage opportunities, and to keep the polynomial term
structure form, the following conditions (hereafter denominated AF
conditions) must hold:

1. The time $t$ price of a bond with time to maturity $\tau = T - t$,
$B(t,T)$, should be given by:

$$B(t,T) = e^{-\tau\, G(\tau)' Y_t}, \qquad (3)$$

where $G(\tau)$ is a vector containing the first $N$ Legendre polynomials
evaluated at maturity $\tau$:

$$G(\tau) = \left[\, P_0\!\left(\tfrac{2\tau}{\ell} - 1\right) \;\; P_1\!\left(\tfrac{2\tau}{\ell} - 1\right) \;\; \dots \;\; P_{N-1}\!\left(\tfrac{2\tau}{\ell} - 1\right) \right]'. \qquad (4)$$

2. There should exist a probability measure $Q$ equivalent to $P$ such
that, under $Q$, discounted bond prices are martingales.
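The first AF condition ties bond prices to the same polynomial loadings used for yields. As a small sketch (factor values and the 10-year longest maturity are assumptions for illustration; names are hypothetical), the price in Equation 3 and the yield it implies can be computed as follows:

```python
import math


def legendre(n, x):
    """Legendre polynomial P_n(x) via Bonnet's recurrence."""
    if n == 0:
        return 1.0
    p_prev, p = 1.0, x
    for k in range(1, n):
        p_prev, p = p, ((2 * k + 1) * x * p - k * p_prev) / (k + 1)
    return p


def bond_price(factors, tau, ell):
    """B(t, T) = exp(-tau * G(tau)' Y_t) for time to maturity tau."""
    x = 2.0 * tau / ell - 1.0
    g_dot_y = sum(y * legendre(n, x) for n, y in enumerate(factors))
    return math.exp(-tau * g_dot_y)


# Assumed factors (decimals) and a 5-year bond in a 10-year market.
price = bond_price([0.06, 0.01, -0.005], 5.0, 10.0)
# The continuously compounded yield recovered from the price.
implied_yield = -math.log(price) / 5.0
```

The implied yield coincides with the polynomial term structure form, which is what "keeping the polynomial form" means in the first condition.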

The next theorem establishes restrictions (hereafter denominated AF
restrictions¹²) that will provide arbitrage-free versions of the
polynomial model.

¹¹ This is the same approach chosen by Diebold and Li (2006) to extract
time-series of term structure movements implied by a parametric
exponential model to forecast U.S. Treasury interest rates.

¹² The AF restriction is equivalent to imposing the Heath et al. (1992)
forward rate drift restriction that ensures absence of arbitrages in the
market.

Theorem 1 Assume $Y_t$-dynamics under a probability measure $Q$ equivalent
to $P$ given by:

$$dY_t = \mu^Q(Y_t)\,dt + \sigma(Y_t)\,dW^*_t, \qquad (5)$$

where $W^*$ is a Brownian motion under $Q$.
If $\mu^Q(Y_t)$ satisfies the restriction expressed in Equation 6, $Q$ is
an equivalent martingale measure and the AF conditions hold.¹³

$$
\begin{aligned}
(6.1)\quad & \sum_{j=2}^{N} (j-1)\, L_j Y_{t,j}\, \tau^{j-2} = \sum_{j=1}^{N} L_j \mu^Q_j(Y_t)\, \tau^{j-1} - \sum_{j=1}^{[N/2]} \sum_{k=1}^{[N/2]} \Gamma_{jk}(Y_t)\, \frac{\tau^{j+k-1}}{k} \\
(6.2)\quad & \Gamma_{jk}(Y_t) = 0 \ \text{ for } j > \left[\tfrac{N}{2}\right] \text{ or } k > \left[\tfrac{N}{2}\right]
\end{aligned}
\qquad (6)
$$

with $\Gamma(Y_t) = L\sigma(Y_t)\sigma(Y_t)L'$, $L_j$ standing for the
$j$th line of an upper triangular matrix that depends only on $\ell$, and
$[\cdot]$ representing the integer part of a number.

Proof and technical details are provided in the Appendix.

The AF restriction has a fundamental implication for any AF version of the
Legendre polynomial model: for each stochastic term structure movement
there must exist a corresponding conditionally deterministic movement
whose drift will compensate the diffusion of the former. If we adopt, for
instance, a CS version with $N$ factors driving movements of the term
structure, the corresponding arbitrage-free versions should present $2N$
latent factors in order to become stochastically compatible with CS: $N$
stochastic factors with non-null diffusion coefficients, and $N$
conditionally deterministic factors. Observe that although the AF
restriction is imposed on the drift of the risk neutral dynamics (5), in
principle, we can work with any general drift (for the first $N$ factors)
under the objective dynamics (2) by taking general market prices of risk
processes. However, the restriction that imposes the existence of
conditionally deterministic factors must hold under both the risk neutral
and the objective measures, and this is what enforces no-arbitrage, and
distinguishes AF versions from CS.

¹³ In addition to the drift restriction, $\sigma(Y_t)$ should present
enough regularity to guarantee that discounted bond prices that are local
martingales also become martingales. In practical problems, a bounded or a
square-affine $\sigma(Y_t)$ is enough to enforce the martingale condition.

In this paper, we focus our analysis on AF versions whose dynamics belong
to the class of affine models (Duffie and Kan 1996). This is implemented
by restricting the diffusion coefficient of the state vector $Y$ to be
within the affine class, simplifying the SDEs for $Y$ to:¹⁴

$$dY_t = \kappa^Q(\theta - Y_t)\,dt + \Sigma\sqrt{S_t(Y_t)}\,dW^*_t, \qquad (7)$$

where the matrix $S_t$ is diagonal with elements
$S_t^{ii} = \alpha_i + \beta_i' Y_t$ for some scalar $\alpha_i$ and some
$\mathbb{R}^N$-vector $\beta_i$.
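To make the affine diffusion in Equation 7 concrete, the following sketch advances the state by one Euler-Maruyama step. It is not part of the estimation procedure used in the paper; the discretization, the flooring of the diagonal of S at zero, and all parameter values are assumptions made for illustration.

```python
import math


def euler_step(y, kappa, theta, sigma, alpha, beta, dt, dw):
    """One Euler-Maruyama step of dY = kappa (theta - Y) dt + Sigma sqrt(S) dW.

    dw is the vector of Brownian increments over dt, supplied by the caller.
    """
    n = len(y)
    # Diagonal of S_t: alpha_i + beta_i' Y_t, floored at zero (assumption).
    s = [max(alpha[i] + sum(beta[i][j] * y[j] for j in range(n)), 0.0)
         for i in range(n)]
    drift = [sum(kappa[i][j] * (theta[j] - y[j]) for j in range(n))
             for i in range(n)]
    shock = [sum(sigma[i][j] * math.sqrt(s[j]) * dw[j] for j in range(n))
             for i in range(n)]
    return [y[i] + drift[i] * dt + shock[i] for i in range(n)]


# Hypothetical two-factor example: the first factor drives the volatility
# of the second one (beta[1] = [1, 0]); the first has constant volatility.
y0 = [0.04, 0.01]
kappa = [[0.5, 0.0], [0.0, 1.0]]
theta = [0.06, 0.0]
sigma = [[0.01, 0.0], [0.0, 0.02]]
alpha = [1.0, 0.0]
beta = [[0.0, 0.0], [1.0, 0.0]]
y1 = euler_step(y0, kappa, theta, sigma, alpha, beta, 1.0 / 12.0, [0.1, -0.1])
```

Setting every beta vector to zero gives constant volatility, which corresponds to the Gaussian case.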

In the empirical section, we compare a three-factor CS version with two AF
versions that present three stochastic factors with non-null diffusions.
We have seen before that this implies arbitrage-free versions with six
factors (three stochastic, three conditionally deterministic). The first
AF version is a Gaussian model ($\beta_i = 0, \forall i$) and the second
is a stochastic volatility model with only one factor driving the
volatility. In the Appendix, we show in detail how to translate the AF
restriction to the affine framework, and further specialize the results to
the Gaussian and stochastic volatility AF versions.

¹⁴ Note that although bond prices are exponential affine functions of the
state space vector $Y$ (see (3)), in general the dynamics of $Y$ is not
restricted to be that of an affine model. For instance, if we choose
$\sigma(Y)$ not to be the square root of an affine function of $Y$, the
dynamics of $Y$ will be non-affine.

Following Duffee (2002) we specify the connection between risk neutral
probability measure $Q$ and objective probability measure $P$ through an
essentially affine market price of risk

$$\Lambda_t = \sqrt{S_t}\,\lambda_0 + \sqrt{S_t^-}\,\lambda_Y Y_t, \qquad (8)$$

where $\lambda_0$ is an $N \times 1$ vector, $\lambda_Y$ is an
$N \times N$ matrix, $S_t$ appears in Equation 7, and $S_t^-$ is defined
by:

$$S_t^{ii-} = \begin{cases} \dfrac{1}{S_t^{ii}} & \text{if } \inf(\alpha_i + \beta_i' Y_t) > 0 \\[4pt] 0 & \text{otherwise.} \end{cases} \qquad (9)$$

The market prices of risk turn out to be of fundamental importance since
the dependence of bond expected excess returns $e^i_{t,\tau}$ on term
structure movements $Y$ is what moves the model away from the Expectation
Hypothesis Theory:

$$e^i_{t,\tau} = -\tau\, G(\tau)'\, \Sigma \left( S_t \lambda_0 + I^- \lambda_Y Y_t \right). \qquad (10)$$

Equation 10 indicates that zero coupon bond instantaneous expected excess
return is a linear combination of model factors, with weights depending on
matrices $\lambda_Y$, and $\Sigma$, and on a predetermined vector of
maturity-dependent Legendre polynomial terms.
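As a hedged illustration of Equation 10, the sketch below covers only the Gaussian special case, under the additional assumption (not stated in the text) that the diagonal of S is normalized to one, so that both S and the indicator matrix in Equation 10 reduce to the identity. The parameter values are invented, and the example uses two factors for brevity.

```python
def legendre(n, x):
    """Legendre polynomial P_n(x) via Bonnet's recurrence."""
    if n == 0:
        return 1.0
    p_prev, p = 1.0, x
    for k in range(1, n):
        p_prev, p = p, ((2 * k + 1) * x * p - k * p_prev) / (k + 1)
    return p


def excess_return_gaussian(y, tau, ell, sigma, lam0, lam_y):
    """-tau * G(tau)' Sigma (lam0 + lam_y Y_t), assuming S_t = I."""
    n = len(y)
    x = 2.0 * tau / ell - 1.0
    g = [legendre(k, x) for k in range(n)]
    # Market price of risk: affine in the factors in the Gaussian case.
    price_of_risk = [lam0[i] + sum(lam_y[i][j] * y[j] for j in range(n))
                     for i in range(n)]
    sigma_lam = [sum(sigma[i][j] * price_of_risk[j] for j in range(n))
                 for i in range(n)]
    return -tau * sum(g[i] * sigma_lam[i] for i in range(n))


# Hypothetical parameters for a two-factor example.
y = [0.06, 0.01]
sigma = [[0.01, 0.0], [0.0, 0.02]]
lam0 = [-0.5, 0.1]
lam_y = [[-2.0, 0.0], [0.0, 1.0]]
premium_5y = excess_return_gaussian(y, 5.0, 10.0, sigma, lam0, lam_y)
```

With the factor-dependent term switched off (a zero matrix for lam_y), the premium no longer varies with the state, which is the sense in which this term moves the model away from the Expectation Hypothesis.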

Finally, to estimate the parameters of the two AF versions we use a Quasi-

Maximum Likelihood procedure since, within the class of aﬃne models, both

ﬁrst and second conditional moments of latent factors are known in closed-

form formulas (see Appendix for details).

2.1 Forecasting with the Polynomial Model

Within the sub-class of affine polynomial models with essentially affine
market prices of risk, any arbitrage-free version will correspond to a
continuous time vector autoregressive model of order 1 (possibly with
stochastic volatility). In order to provide fair comparisons, we match the
lagging structure of the time series processes describing arbitrage-free
and CS versions, therefore specializing the CS version to forecast with a
VAR(1) process.

The procedure to forecast under the CS version is divided in two steps:
first, extract the time series $Y^{CS}_t$ of term structure movements by
running cross section regressions; then, fit a VAR(1) process to those
series of term structure movements:

$$Y^{CS}_t = c + \phi\, Y^{CS}_{t-1} + \epsilon_t. \qquad (11)$$

Given a fixed maturity $\tau$ and a fixed forecasting horizon ($h$-step
horizon), forecasts are produced by calculating the conditional
expectation of CS factors under the VAR(1) structure:

$$E_t\!\left(Y^{CS}_{t+h}\right) = \Big(\sum_{j=0}^{h-1} \phi^{j}\Big)\, c + \phi^{h}\, Y^{CS}_t. \qquad (12)$$

The conditional expectation of the $\tau$-maturity yield is obtained by
substituting factor forecasts in (1):

$$E_t\big(R(t+h,\tau)\big) = G(\tau)'\, E_t\!\left(Y^{CS}_{t+h}\right). \qquad (13)$$
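The h-step forecast in Equations 12 and 13 can be sketched by iterating the VAR(1) recursion h times, which gives the same result as the closed form with the sum of powers of the autoregressive matrix applied to the intercept vector. The intercept, autoregressive matrix and current factors below are made-up values for a two-factor illustration; fitting the VAR(1) itself is not shown.

```python
def legendre(n, x):
    """Legendre polynomial P_n(x) via Bonnet's recurrence."""
    if n == 0:
        return 1.0
    p_prev, p = 1.0, x
    for k in range(1, n):
        p_prev, p = p, ((2 * k + 1) * x * p - k * p_prev) / (k + 1)
    return p


def var1_forecast(y, c, phi, h):
    """E_t[Y_{t+h}] for Y_t = c + phi Y_{t-1} + eps_t, by h-fold iteration."""
    n = len(y)
    out = list(y)
    for _ in range(h):
        out = [c[i] + sum(phi[i][j] * out[j] for j in range(n))
               for i in range(n)]
    return out


def yield_forecast(y, c, phi, h, tau, ell):
    """G(tau)' E_t[Y_{t+h}]: forecast of the tau-maturity yield."""
    f = var1_forecast(y, c, phi, h)
    x = 2.0 * tau / ell - 1.0
    return sum(f[n] * legendre(n, x) for n in range(len(f)))


# Hypothetical VAR(1) parameters and current factors.
c = [0.01, 0.0]
phi = [[0.5, 0.0], [0.0, 0.9]]
y_now = [0.04, 0.02]
two_step = var1_forecast(y_now, c, phi, 2)
ten_year = yield_forecast(y_now, c, phi, 2, 10.0, 10.0)
```

Parameters are estimated once and then iterated forward, so a 6- or 12-month forecast reuses the one-month dynamics rather than a horizon-specific regression.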

Similarly, for the arbitrage-free affine versions, interest rate forecasts
can be produced by using the closed form structure of conditional factor
means. As under the affine sub-class the drift of latent factors
$Y^{\mathrm{arb.free}}$ can be written as
$\mu^Q(Y^{\mathrm{arb.free}}_t) = \kappa^Q(\theta - Y^{\mathrm{arb.free}}_t)$,
the time $t$ conditional expectation of $Y^{\mathrm{arb.free}}_{t+h}$ is
given by (Duffee 2002):

$$E_t\!\left(Y^{\mathrm{arb.free}}_{t+h}\right) = \left(I_{2N} - e^{-\kappa^Q h}\right)\theta + e^{-\kappa^Q h}\, Y^{\mathrm{arb.free}}_t \qquad (14)$$

where $I_{2N}$ is the identity matrix of order $2N$. Finally, for any
fixed maturity $\tau$, the term structure formula in (1) should be used to
forecast:

$$E_t\big(R(t+h,\tau)\big) = G(\tau)'\, E_t\!\left(Y^{\mathrm{arb.free}}_{t+h}\right) \qquad (15)$$

Under both CS and arbitrage-free versions, forecasts considering horizons
longer than the sampling frequency are produced under a multi-step
prediction structure, as opposed to re-estimating the models under each
horizon frequency.
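The conditional mean in Equation 14 requires a matrix exponential. The sketch below computes it with a truncated Taylor series combined with scaling and squaring, which is a generic numerical choice assumed here rather than the authors' implementation; the mean-reversion matrix, long-run mean and state are hypothetical two-factor values.

```python
import math


def mat_mul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def expm(a, terms=25):
    """Matrix exponential: scaling and squaring with a truncated Taylor series."""
    n = len(a)
    norm = max(sum(abs(v) for v in row) for row in a)
    squarings = 0
    while norm > 0.5:
        norm /= 2.0
        squarings += 1
    scaled = [[v / 2 ** squarings for v in row] for row in a]
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        # term holds scaled^k / k! after this update.
        term = [[v / k for v in row] for row in mat_mul(term, scaled)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result


def affine_mean(y, kappa, theta, h):
    """(I - exp(-kappa h)) theta + exp(-kappa h) y, as theta + exp(-kappa h)(y - theta)."""
    n = len(y)
    decay = expm([[-kappa[i][j] * h for j in range(n)] for i in range(n)])
    return [theta[i] + sum(decay[i][j] * (y[j] - theta[j]) for j in range(n))
            for i in range(n)]


# Hypothetical mean-reversion matrix, long-run mean and current state.
kappa = [[0.5, 0.0], [0.0, 1.0]]
theta = [0.06, 0.0]
mean_1y = affine_mean([0.04, 0.01], kappa, theta, 1.0)
```

The horizon h must be expressed in the same time unit as the mean-reversion matrix, so a 6-month forecast with annualized parameters would use h = 0.5.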

3 Empirical Results

3.1 Data Description

Data consists of 324 monthly observations of bootstrapped smoothed
Fama-Bliss U.S. Treasury zero-coupon yields (2-, 3-, 5-, 7-, and 10-year
maturities) observed from January 1972 to December 1998.¹⁵ Based on a
sub-sample of 276 observations from January 1972 to December 1994, we
estimate three distinct versions of the Legendre polynomial model: the CS
version that allows for arbitrages, a Gaussian arbitrage-free version
(AFG), and a stochastic volatility arbitrage-free version with one
variable driving volatility (AFSV). The subsequent four years of monthly
data (from 1995 to 1998), not included in the estimation process, are used
to measure models' forecasting ability, and to study their risk premium
structure.

¹⁵ This dataset is an extended version of the same dataset used by Dai and
Singleton (2002).

3.2 Estimation

The two AF versions were estimated using a Quasi-Maximum Likelihood
procedure, explicitly exploiting the fact that the conditional first and
second moments of latent variables are known analytically. Adopting Chen
and Scott's (1993) methodology, a subset of zero-rates (2-, 5- and 10-year
maturities) was priced without errors, while the remaining rates were
priced with i.i.d. zero-mean errors. Parameters that identify the
stochastic discount factor appear in Table 1. $\Sigma$'s and $\beta$'s are
parameters related to volatility, $\lambda$'s are related to factors' risk
premia, and $Y_0$'s define initial conditions for conditionally
deterministic factors. Standard deviations from residual fits of 3- and
7-year zeros indicate that the AFSV version presents a better in-sample
cross section fit than the AFG version (13.6 and 26.0 bps under AFG versus
9.3 and 16.0 bps under AFSV).

Figures 2 and 3 present time-series of factors capturing term structure
movements for the AFG and AFSV versions, respectively. Left-hand side
graphs present “level”, “slope” and “curvature” factors. Right-hand side
graphs depict the three conditionally deterministic factors. As yields
have intrinsic stochastic behavior, it is natural to expect that
conditionally deterministic factors will have their in-sample values
minimized by the QML optimization procedure. Indeed, factors five and six
are practically negligible under both arbitrage-free versions. However,
factor four, relating to the cubic Legendre polynomial (dashed blue line),
gets up to 75 bps under the Gaussian version (in-sample), and up to 20 bps
under the stochastic volatility version (in-sample). It does not vanish
like the other two conditionally deterministic factors because it
represents the “price” that the polynomial model has to pay in order to
become arbitrage-free. The three higher order factors change the
time-series of lower order movements (“level”, “slope” and “curvature”) in
a way that guarantees no-arbitrage under each arbitrage-free version.

The small magnitude of conditionally deterministic factors explains why

the three lower order movements present similar time series across diﬀerent

versions of the model (see Figures 2 and 3). Note that the two arbitrage-free

versions present the same term structure parametric form, a linear combina-

tion of the ﬁrst six Legendre polynomials, implying that any diﬀerences on

the time series of the lower order movements should come from diﬀerences

on the higher order conditionally deterministic factors across versions.

The CS version is a three-factor model estimated by running separate
monthly cross sectional regressions. While arbitrage-free versions were
estimated under QML explicitly considering the dynamics of the six
polynomial factors, the CS version, in contrast, assumes complete
time-independence for factor dynamics, and is based on only the three
lower order factors, “level”, “slope” and “curvature”, since conditionally
deterministic factors are not necessary in this case, given that
no-arbitrage restrictions are not imposed. Figure 4 presents time-series
of the differences between each factor in the CS version (“level”, “slope”
and “curvature”), and the corresponding factor in each dynamic version
(AFG and AFSV). Those distances are small in magnitude and, again, come
predominantly from the conditionally deterministic factor due to the cubic
Legendre polynomial. In fact, for each arbitrage-free version, the shape
of the fourth factor time-series carries over to Figure 4.¹⁶

3.3 Forecast Comparisons

We proceed as in Section 2.1 to produce, for each version, forecasts based

on fixed parameters estimated with the sample ranging from January 1972

to December 1994.¹⁷ We argue that keeping the estimated parameters fixed, as

opposed to recursively re-estimating models out-of-sample (as done in

other studies), is an appropriate choice: with fixed estimated parameters,

better out-of-sample forecasting suggests a higher ability to capture the

underlying dynamics of interest rates. This choice is consistent with our goal of

further analyzing the risk premium structure of the polynomial model.

Table 2 presents yield forecast biases and Root Mean Square Errors

(RMSE) for the out-of-sample period, from January of 1995 to December of

1998. For each maturity and forecasting horizon h, a total of 49 − h forecasts

are produced, with h-month-ahead forecasts beginning in the h-th month of

1995, and ending in December of 1998. Bias and RMSE are measured in ba-

sis points, and bold values indicate the lowest absolute value of bias/RMSE

under a ﬁxed maturity and forecasting horizon. We ﬁrst concentrate our

analysis on the bias results.
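
The two statistics reported in Table 2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, yields are assumed to be quoted in percent, and the sign convention (forecast minus observed) is an assumption rather than something fixed by the text.

```python
import numpy as np

def forecast_bias_rmse(forecasts, observed):
    """Bias and RMSE in basis points for yields quoted in percent.

    Assumed sign convention: error = forecast - observed.
    """
    errors = np.asarray(forecasts, dtype=float) - np.asarray(observed, dtype=float)
    bias = 100.0 * errors.mean()                  # 1 percentage point = 100 bps
    rmse = 100.0 * np.sqrt(np.mean(errors ** 2))
    return bias, rmse

def n_forecasts(h, n_months=48):
    """Number of h-month-ahead forecasts in a 48-month window, i.e. 49 - h."""
    return n_months + 1 - h
```

With a 48-month out-of-sample window, `n_forecasts(1)` gives the 48 one-month-ahead forecasts and `n_forecasts(12)` gives the 37 twelve-month-ahead forecasts.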

¹⁶ Favero et al. (2007) also compare time series of term structure movements coming from

models with and without no-arbitrage restrictions. They compare movements coming from

a Gaussian arbitrage-free model to corresponding movements coming from the Diebold and

Li (2006) model, finding that, across models, level factors are more homogeneous, while

slope and curvature differ more.

¹⁷ In order to further check and validate our results, we performed a number of robustness

tests: i) changed the number of factors in the CS version from three to six, ii) changed

the in-sample estimation period to (1972-1996) and corresponding out-of-sample period

to (1997-2000), iii) changed the estimation method of the CS version to invert from three

bonds, similarly to the arbitrage-free versions. The two arbitrage-free versions continue

to outperform the CS version, with stronger results in i), and with slightly weaker results

but still statistically signiﬁcant in ii) and iii). Those robustness test results are available

upon request.

From a total of 15 entries appearing in the table (three forecasting hori-

zons and ﬁve observed maturities), the CS version presents the lowest abso-

lute bias in 4 of them, AFG version in 4, and AFSV in 7. In other words, in

more than 70% of the entries the arbitrage-free models present signiﬁcantly

lower biases. Interestingly, the CS version is superior only on the shortest

forecasting-horizon (1-month), indicating that no-arbitrage restrictions im-

prove longer-horizon forecasts. A more appropriate comparison is proposed

by separately comparing CS to each arbitrage-free version. In this case, the

AFG version presents absolute bias lower than CS in 9 out of 15 entries,

and the AFSV version presents absolute bias lower than CS in 11 out of

15 entries. In summary, from a bias perspective, no-arbitrage substantially

improves results, especially for longer forecasting horizons.

Bias results are pictured in Figure 5, where out-of-sample averaged ob-

served and averaged model implied term structures appear. For instance,

for a 1-month forecasting horizon, the solid blue line represents an average

of the 48 curves that were observed between January 1995 and December of

1998. Correspondingly, the red dotted, the cyan dash-dotted, and the black

dashed lines, represent the average of the 48 forecasts produced respectively

by CS, AFG, and AFSV versions. The bias is simply the diﬀerence between

averaged observed and model implied curves. Note how, due to the con-

ditionally deterministic factors, arbitrage-free versions present much higher

curvature than CS. This higher curvature produces two antagonistic effects:

it brings arbitrage-free versions much closer to observed yields for most

maturities, but also generates strong bias in a few cases.¹⁸

Now observing RMSE results in Table 2, it is clear that arbitrage-free

versions are again superior. When compared in pairs, CS vs. AFSV and CS vs.

AFG, AFSV is superior to CS in 11 out of 15 entries, and AFG is superior

to CS in 9 out of 15 entries. For short-horizon forecasts, the AFSV version

presents the best performance, under the RMSE criterion, among the three

competitors, and for long-horizon forecasts, AFG takes its place. In

turn, the CS version is only better at the 10-year maturity, where arbitrage-free

versions are biased due to the conditionally deterministic factors (as

mentioned above), and on the short-term forecast of the 7-year yield.

We check the statistical signiﬁcance of our results by means of the Diebold

and Mariano (1995) test. Under a Mean Absolute Error loss function (MAE),

¹⁸ The AFG presents high bias at the 7- and 10-year maturities, and the AFSV at the

10-year maturity.

Table 3 compares forecasting errors produced with the arbitrage-free versions

to corresponding CS forecasting errors.¹⁹ Negative values of the statistics

(S1 or S2) indicate that no-arbitrage improves forecasts. According to S2,

which is robust to small samples, from a total of 15 table entries, AFSV

has forecasting ability superior to CS in 8 of them at a 99% confidence

level (two-tailed test) (in 9 entries at a 95% confidence level). On the other

hand, in only 2 entries would CS be superior to AFSV, at both the 95% and 99%

confidence levels. In comparisons between AFG and CS versions, results are

more balanced but still in favor of no-arbitrage, with 6 entries in favor of

AFG, signiﬁcant at a 95% conﬁdence level (5 entries at 99%), and 5 entries

in favor of CS, at a 99% conﬁdence level. Interestingly, against AFG, CS

is strong on short-horizon forecasts and on forecasts for the 10-year yield.

Against AFSV, CS is strong only on forecasts for the 10-year yield.
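
For readers who want to reproduce this kind of comparison, the sketch below computes an asymptotic Diebold and Mariano (1995) statistic under the MAE loss. It assumes that S1 denotes the asymptotic statistic with a rectangular-window long-run variance using h − 1 autocovariances; the function name and argument layout are hypothetical, and the small-sample statistic S2 is not reproduced here.

```python
import numpy as np

def diebold_mariano_s1(errors_af, errors_cs, h=1):
    """Asymptotic Diebold-Mariano statistic under absolute-error loss.

    errors_af, errors_cs: h-step-ahead forecast errors of the two models.
    A negative value means the first model has the smaller mean absolute error.
    """
    d = np.abs(np.asarray(errors_af, dtype=float)) - np.abs(np.asarray(errors_cs, dtype=float))
    T = d.size
    d_bar = d.mean()
    dc = d - d_bar
    # Long-run variance: gamma(0) + 2 * sum of the first h-1 autocovariances
    lrv = dc @ dc / T
    for k in range(1, h):
        lrv += 2.0 * (dc[k:] @ dc[:-k]) / T
    if lrv <= 0:
        raise ValueError("non-positive long-run variance estimate")
    return d_bar / np.sqrt(lrv / T)
```

Under the null of equal accuracy the statistic is compared with standard normal critical values; the rectangular window can produce a non-positive variance estimate in short samples, which is why the sketch raises an error in that case.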

3.4 Discussion

3.4.1 The Effects of Bond Risk Premium on Bias

In order to better understand the diﬀerences in forecasting ability across the

three distinct versions of the polynomial model analyzed in this paper, we

are interested in decomposing the conditional expectations of yields as the

diﬀerence of a forward rate component and a bond risk premium component.

The bond risk premium component is deﬁned as a holding-return premium,

similarly to Hordahl et al. (2006).²⁰

Suppose we want to analyze model forecasting behavior for a fixed maturity of $\tau$ years and a forecasting horizon of $h$ months, where one month is our basic time slot. The idea is to consider, at time $t$, the return of buying a zero-coupon bond with time to maturity $\tau + \frac{h}{12}$ and selling it $h$ months in the future, leading to the following excess return expression with respect to the time-$t$ short-term yield with maturity $\frac{h}{12}$, $R(t, \frac{h}{12})$:

$$BP(\tau, h) = E_t\left[\log\left(\frac{B\left(t + \frac{h}{12}, \tau\right)}{B\left(t, \tau + \frac{h}{12}\right)}\right) - R\left(t, \frac{h}{12}\right)\right] \tag{16}$$

We deﬁne this holding period return BP to be the bond premium. Now,

defining the $t_1$-maturity forward rate, $t_2$ years in the future, to be $f(t, t_1, t_2)$,

¹⁹ The significance of the results is not affected when we test with a quadratic loss function.

²⁰ See Kim and Orphanides (2007) for a careful explanation of the term premium.

the relation between bond premium, corresponding forward rate, and yield

conditional expectation is given by:

$$E_t\left[R\left(t + \frac{h}{12}, \tau\right)\right] = f\left(t, \tau, \frac{h}{12}\right) - \left(\frac{h/12}{\tau}\right) BP(\tau, h) \tag{17}$$

Equation 17 says that the h-month-ahead forecast for the yield with maturity

τ can be directly decomposed as the forward rate of a τ-maturity yield seen h

months in the future, minus a normalized risk premium (normalized

by forecasting horizon over time-to-maturity).
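
A minimal Python sketch of this decomposition is given below. It assumes continuously compounded zero-coupon yields, so that the forward rate follows from two spot yields, and it takes the bond premium as an input already expressed in the units used in Equation 17. Both function names are hypothetical and serve only to illustrate the arithmetic.

```python
def forward_rate(y_long, y_short, tau, s):
    """tau-maturity forward rate starting s years ahead.

    y_long:  zero yield with maturity tau + s (continuously compounded, assumed)
    y_short: zero yield with maturity s
    """
    return ((tau + s) * y_long - s * y_short) / tau

def yield_forecast(forward, bond_premium, tau, h):
    """Equation 17: forward rate minus the premium scaled by (h/12) / tau."""
    return forward - (h / 12.0) / tau * bond_premium
```

For example, with a 2-year maturity and a 12-month horizon the scaling factor is 0.5, so a positive premium lowers the forecast relative to the forward rate, while a negative premium raises it.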

Thus, adopting Equation 17, conditional yield forecasts are decomposed into a

forward rate component and a holding-return premium component. These decomposed

forecasts might be useful for managers as a tool for extracting risk

premiums, since there is large interest in obtaining bond premiums from term

structure data, and since they are hard to estimate (Kim and Orphanides

2007).

Tables 4, 5, and 6 respectively present out-of-sample averaged yields, av-

eraged forward rates, and averaged bond premium. By looking at the ﬁrst

two tables, with a few exceptions, we note that forward rates are higher than

average yields, directly indicating that models should present positive risk

premium in order to compensate for this difference and to decrease bias. Interestingly,

Table 6 indicates that both arbitrage-free versions indeed generate

positive risk premiums, while in contrast, the CS version generates negative

premiums. In other words, under a vector autoregressive structure of lag

one, the version that allows arbitrages does not capture risk premium correctly.²¹

For instance, the behavior of the 5-year yield under short/medium

term forecasting horizons (1- and 6- month) is of particular interest to our

risk premium analysis. The short-term horizon is a good example because

forward rates under the three versions of the model are close to each other

(see Table 5) implying that diﬀerences in bias across versions come pre-

dominantly from diﬀerences in their implied risk premiums. For a 1-month

forecasting horizon, Table 4 shows an averaged observed out-of-sample yield

for the 5-year maturity equal to 5.648%.²² From Table 5, the 1-month ahead

²¹ It is important to note that the inability of the CS version to reproduce risk premiums cannot

be attributed to instability in the estimated VAR. In fact, the vector autoregressive model

estimated under the CS version is stable, with all roots from the characteristic polynomial

lying within the unit circle.

²² The average of observed yields depends on the forecasting horizon because the

5-year forward rates are respectively 5.709%, 5.723%, and 5.723%, for CS,

AFG, and AFSV versions, with roughly a diﬀerence of 1.5 bps between CS

and arbitrage-free versions. On the other hand, from Table 6, the averaged

risk premiums implied by CS, AFG, and AFSV versions are respectively

-1.6, 7.8, and 6.0 bps, indicating that CS misses bond premium even when

forward rates are all similar across versions, that is, when we control for dif-

ferences in forward rates across versions. Similarly, considering the 6-month

forecasting horizon, the 6-month ahead 5 year forward rates for the CS and

AFSV versions are very similar, respectively, 5.906% and 5.895% (Table 5),

but their implied risk-premiums are very distinct, respectively -18.6 and 25.1

bps (Table 6). It is clear that the forward rates coming from the two versions

are overestimating future 5-year yields, but while the positive risk premium

implied by the AFSV version corrects this overestimation, the negative risk

premium implied by the CS version worsens it.

3.4.2 What is the Contribution of No-arbitrage?

Why does imposing no-arbitrage lead to better forecasts? The mechanics of the

problem can be directly explained by the conditionally deterministic factors.

Once they are included in the term structure parameterization, they change

the original time series of “level”, “slope” and “curvature” factors, conse-

quently aﬀecting the behavior of bond risk premium.

Further appreciation of the no-arbitrage eﬀect on risk premium can be

obtained from Table 7. It presents, for each model version, the ratio of the

bias generated by assuming a zero bond risk premium (no model implied

risk-premium eﬀect), over the true bias generated when model implied bond

risk premium is fully incorporated. Whenever risk premium has a positive

eﬀect on forecasting, we should immediately observe values higher than 1

for this ratio. For values lower than 1, the model is not correctly capturing

the risk premium dynamics. It is particularly interesting to observe that

CS presents values lower than 1 in all table entries, indeed conﬁrming that

it is not correctly capturing risk premium dynamics. In stark contrast,

arbitrage-free versions not only present (for most table entries) values higher

than 1, but in addition, some entries have values much higher than 1,²³

horizon deﬁnes the beginning of the averaging window. See the description of Table 4 for

further explanations.

²³ Under the AFG version, 7 ratio values are higher than 3, and under the AFSV

version, 6 ratio values are higher than 3. A ratio value higher than 3 indicates that once

indicating that no-arbitrage greatly increases the model's ability to correctly

capture risk premium dynamics.

A dynamic picture of the risk premium effect described in the paragraph

above can be readily observed in Figure 6. For a ﬁxed 12-month forecasting

horizon, it presents time-series of observed out-of-sample 2-year yields, with

corresponding forward rates, and model implied bond risk premiums.²⁴ On

each graph, the dotted line represents observed yields, the dashed line repre-

sents the 12-month ahead 2-year forward rate, and the solid line represents

the risk premium corrected forward rate, that is, the yield forecast produced

with Equation 17. Once risk premium is included, it clearly improves forecasts under

the two arbitrage-free versions: the solid line is much closer to the dotted

line than the dashed line is. However, under the CS version, risk premium

degrades its performance. The dashed line (the one with zero-premium) is

much closer to the true observed yield than the solid line (the one including

risk premium).

Figure 7 presents examples of risk premium dynamics along the 27 years,

from 1972 to 1998, for diﬀerent maturities and forecasting horizons. The goal

of this picture is to show similarities and diﬀerences among risk premiums

implied by each model version, both in- and out-of-sample. It presents the

1-month holding period return premium for the 5-year bond, the 6-month

premium for the 10-year bond, and the 12-month premium for the 2-year

bond. Those three maturities give a good idea of the risk premium

behavior across the U.S. Treasury term structure for maturities up to 10

years. For the three forecasting horizons, the least volatile premium comes

from the AFSV arbitrage-free version. Despite presenting a smaller volatility,

it has a very strong eﬀect on improving forecasts as previously observed

in Table 7. Risk premiums coming from the other two versions (CS and

AFG) have more similar in-sample behavior, but clearly diverge out-of-sample,

with the AFG version generating positive premiums, and the CS

version generating negative ones. This out-of-sample separation of premiums

indicates that while CS might be doing a good job when ﬁtting in-sample

data, it is probably overﬁtting data and missing the true dynamics of yields.

The second picture in Figure 7 presents the premium behavior of the

model implied risk premium is considered in forecasting (and not only forward rates), bias

decreases to less than one third of the bias value with zero-premium.

²⁴ The choice of a 12-month forecasting horizon is justified by our interest in making explicit

the role of risk premium, since its importance is an increasing function of forecasting

horizon.

10-year yield under a 6-month forecasting horizon. We have intentionally

included this particular maturity to show that even the best arbitrage-free

version of the polynomial model (analyzed in this paper) cannot capture

all features of data, ending up missing the risk premium for this particular

maturity. Observe that in the out-of-sample period the AFSV premium

converges to approximately the same negative values produced by the CS

version, when both should be producing positive premiums. This is a ﬁrst

indication that the polynomial family, at least under its aﬃne subclass, might

not be the best candidate to simultaneously describe the behavior of the

whole cross section of yields, and to guarantee inter-temporal consistency of

the underlying term structure factors.

The third picture in Figure 7 presents the dynamic premium behavior

of the 2-year yield under a 12-month forecasting horizon. Note how the

out-of-sample behavior of the premium implied under the three versions is

tremendously diﬀerent, with the AFG premium highly positive, AFSV pre-

mium slightly positive, and CS premium highly negative. This distinct dy-

namic behavior translates into rather diﬀerent implications for bias. For

instance, the excellent performance of the AFG version when forecasting the 2-year yield

12 months in the future (-2.8 bps of bias) can be explained by its risk premium

out-of-sample behavior. Picture 2 in Figure 6 indicates that its forward rates

are exaggerated with respect to realized yields. However, its out-of-sample

risk premium is positive and high, thus compensating those exaggerated for-

ward rates, and bringing forecasts to values close to observed yields. On the

other hand, AFSV version presents a positive bias of 35.2 basis points, indi-

cating that it should have produced higher risk premium values to decrease

bias. CS version clearly misses the premium as it should have been positive

(see picture 3 in Figure 6), while it is negative during the whole out-of-sample

period.

Finally, rather than looking for the best forecasting candidate, our speciﬁc

interest was to identify whether no-arbitrage improves or degrades the forecasting

ability of a given parametric term structure model. However, with the in-

tention of putting the polynomial model among credible benchmarks, we

present in Table 8, bias and Root Mean Square Errors coming from the best

polynomial version AFSV, the established Random Walk (RW) benchmark,

and the recently proposed Diebold and Li (2006) model (DL). Forecasting

horizons (1-,6-, and 12- month) and maturities (2-, 3-, 5-, 7-, and 10-year)

are the same as presented in previous tables. The polynomial model achieves

smaller bias and RMSE in 9 out of 15 entries, and, interestingly 7 among

those 9 entries are related to longer forecasting horizons (6- and 12-month).

4 Conclusion

We tested the eﬀect of no-arbitrage restrictions on out-of-sample interest

rate forecasts. This was implemented with the use of a parametric term

structure model that expresses the term structure of interest rates as a linear

combination of polynomials. We test this family by comparing forecasts

of a model version which admits arbitrages, to two diﬀerent arbitrage-free

versions of the same model, concluding that absence of arbitrage decreases

bias and RMSE, especially for longer forecasting horizons.

An important feature of performing this no-arbitrage eﬀect test with a

parametric family that presents closed-form formula for bond prices, is that it

allows us to isolate the eﬀects of no-arbitrage from other eﬀects like changes

in factor loadings under diﬀerent model dynamic speciﬁcations. Fixed factor

loadings not only put the forecasting comparison on a ﬁxed basis, but also

allow for a similar interpretation of bond risk premia across diﬀerent model

versions. By looking at model implied risk premia, we ﬁnd that the diﬀerent

versions generate very distinct bond risk premium behavior, whose eﬀect can

be directly observed in the out-of-sample forecasting biases. The risk pre-

mium implied by arbitrage-free versions improves forward rates forecasting

ability while the corresponding premium implied by the cross section version

degrades forecasting ability.

Note that rather than proposing an isolated test of no-arbitrage effects on

forecasting, the test is conditional on the Legendre polynomial term structure

model. However, if anything can be attributed to the particular polynomial

structure, it is that it is biased against no-arbitrage. This bias can be directly

observed in Figures 2, 3, and 5, which show that for the 7- and 10-year maturities

under the AFG, and the 10-year maturity under the AFSV, out-of-sample

forecasts are biased due to an explosion of the conditionally deterministic

factors out-of-sample.²⁵ With this observation in mind, we could conjecture

²⁵ The explosion of these conditionally deterministic factors is exacerbated by the parametric

polynomial structure of the yield curve. A test where all conditionally deterministic

factors are kept at a constant value (their last in-sample value) during the whole

out-of-sample period, considerably improves forecasts under both versions, at those “bad”

maturities, while keeping the previous good results at other maturities. The results of this

test are available upon request.

that under more ﬂexible parametric families, the no-arbitrage restrictions

might generate even more positive effects on forecasting. Thus, there appears

to be room for further evaluation of important parametric families

such as the classical polynomial-exponential family, to which the models of Nelson

and Siegel (1987), Diebold and Li (2006), and Svensson (1994) belong, and

also for analysis of more complex families like “splines with fixed knots” (see

Bowsher and Meeks 2006).²⁶ Moreover, as the techniques used to generate

arbitrage-free versions of parametric models readily allow for inclusion of

extra variables in factor dynamics, tests including macroeconomic variables

could possibly better identify bond risk premium behavior (see Ludvigson

and Ng 2007). We leave those topics for future research.

²⁶ Equipped with Filipovic’s (2001) theoretical results on consistent term structure models,

our tests can be readily extended to other parametric families, as long as they support

at least one arbitrage-free version for the term structure model.

References

[1] Almeida C.I.R. (2005). Affine Processes, Arbitrage-Free Term Structures of Legendre Polynomials, and Option Pricing. International Journal of Theoretical and Applied Finance, 8, 2, 1-23.

[2] Almeida C.I.R., A.M. Duarte, and C.A.C. Fernandes (1998). Decomposing and Simulating the Movements of Term Structures of Interest Rates in Emerging Eurobonds Markets. Journal of Fixed Income, 8, 1, 21-31.

[3] Almeida C.I.R., A.M. Duarte, and C.A.C. Fernandes (2000). Credit Spread Arbitrages in Emerging Eurobonds Markets. Journal of Fixed Income, 10, 3, 100-111.

[4] Almeida C.I.R., A.M. Duarte, and C.A.C. Fernandes (2003). A Generalization of Principal Component Analysis for Non-observable Term Structures in Emerging Markets. International Journal of Theoretical and Applied Finance, 6, 8, 885-903.

[5] Ang A. and M. Piazzesi (2003). A No-Arbitrage Vector Autoregression of Term Structure Dynamics with Macroeconomic and Latent Variables. Journal of Monetary Economics, 50, 745-787.

[6] Bali T., M. Heidari, and L. Wu (2006). Predictability of Interest Rates and Interest Rate Portfolios. Working Paper, Baruch College.

[7] Bowsher C. and R. Meeks (2006). High Dimensional Yield Curves: Models and Forecasting. Working Paper, Nuffield College, University of Oxford.

[8] Campbell J.Y. and R. Shiller (1991). Yield Spreads and Interest Rate Movements: A Bird's Eye View. Review of Economic Studies, 58, 495-514.

[9] Chambers D.R., W.T. Carleton, and D.W. Waldman (1984). A New Approach to Estimation of the Term Structure of Interest Rates. Journal of Financial and Quantitative Analysis, 19, 3, 233-251.

[10] Cochrane J. and M. Piazzesi (2005). Bond Risk Premia. American Economic Review, 95, 1, 138-160.

[11] Chen R.R. and L. Scott (1993). Maximum Likelihood Estimation for a Multifactor Equilibrium Model of the Term Structure of Interest Rates. Journal of Fixed Income, 3, 14-31.

[12] Christensen J.H.E., F. Diebold, and G.D. Rudebusch (2007). The Affine Arbitrage-free Class of the Nelson-Siegel Term Structure Models. Working Paper, Federal Reserve Bank of San Francisco.

[13] Dai Q., A. Le, and K. Singleton (2006). Discrete-time Dynamic Term Structure Models with Generalized Market Prices of Risk. Working Paper, University of North Carolina at Chapel Hill.

[14] Dai Q. and K. Singleton (2000). Specification Analysis of Affine Term Structure Models. Journal of Finance, LV, 5, 1943-1977.

[15] Dai Q. and K. Singleton (2002). Expectation Puzzles, Time-Varying Risk Premia, and Affine Models of the Term Structure. Journal of Financial Economics, 63, 415-441.

[16] Dai Q. and K. Singleton (2003). Term Structure Modeling in Theory and Reality. Review of Financial Studies, 16, 631-678.

[17] De Rossi G. (2004). Kalman Filtering of Consistent Forward Rate Curves: A Tool to Estimate and Model Dynamically the Term Structure. Journal of Empirical Finance, 11, 277-308.

[18] Diebold F.X. and C. Li (2006). Forecasting the Term Structure of Government Bond Yields. Journal of Econometrics, 130, 337-364.

[19] Diebold F.X. and R.S. Mariano (1995). Comparing Predictive Accuracy. Journal of Business and Economic Statistics, 13, 253-263.

[20] Duffee G.R. (2002). Term Premia and Interest Rates Forecasts in Affine Models. Journal of Finance, 57, 405-443.

[21] Duffee G.R. (2007). Forecasting with the Term Structure: The Role of No-arbitrage. Working Paper, University of California - Berkeley.

[22] Duffie D. (2001). Dynamic Asset Pricing Theory. Princeton University Press.

[23] Duffie D. and R. Kan (1996). A Yield Factor Model of Interest Rates. Mathematical Finance, 6, 4, 379-406.

[24] Fama E.F. (1984). The Information in the Term Structure of Interest Rates. Journal of Financial Economics, 13, 2, 509-528.

[25] Fama E.F. and R.R. Bliss (1987). The Information in Long-Maturity Forward Rates. American Economic Review, 77, 4, 680-692.

[26] Favero A.C., L. Niu, and L. Sala (2007). Term Structure Forecasting: No-Arbitrage Restrictions vs. Large Information Set. Working Paper, Bocconi University.

[27] Filipovic D. (1999). A Note on the Nelson and Siegel Family. Mathematical Finance, 9, 4, 349-359.

[28] Filipovic D. (2001). Consistency Problems for Heath-Jarrow-Morton Interest Rate Models. Lecture Notes in Mathematics, 1760, Springer-Verlag, Berlin.

[29] Heath D., R. Jarrow, and A. Morton (1992). Bond Pricing and the Term Structure of Interest Rates: A New Methodology for Contingent Claims Valuation. Econometrica, 60, 1, 77-105.

[30] Hordahl P., O. Tristani, and D. Vestin (2006). A Joint Econometric Model of Macroeconomic and Term Structure Dynamics. Journal of Econometrics, 131, 405-440.

[31] Huse C. (2007). Term Structure Modelling with Observable State Variables. Working Paper, London School of Economics.

[32] Kargin V. and A. Onatski (2007). Curve Forecasting by Functional Autoregression. Working Paper, Columbia University.

[33] Kim D. and A. Orphanides (2007). The Bond Market Term Premium: What is it, and How can we Measure it? BIS Quarterly Review, June.

[34] Litterman R. and J.A. Scheinkman (1991). Common Factors Affecting Bond Returns. Journal of Fixed Income, 1, 54-61.

[35] Ludvigson S. and S. Ng (2007). Macro Factors in Bond Risk Premia. Working Paper, Department of Economics, New York University.

[36] McCulloch J.H. (1971). Measuring the Term Structure of Interest Rates. Journal of Business, 44, 19-31.

[37] Mönch E. (2007). Forecasting the Yield Curve in a Data-Rich Environment: A No-Arbitrage Factor-Augmented VAR Approach. Working Paper, Humboldt Universität zu Berlin.

[38] Nelson C. and A. Siegel (1987). Parsimonious Modeling of Yield Curves. Journal of Business, 60, 4, 473-489.

[39] Sharef E. and D. Filipovic (2004). Conditions for Consistent Exponential-Polynomial Forward Rate Processes with Multiple Nontrivial Factors. International Journal of Theoretical and Applied Finance, 7, 685-700.

[40] Svensson L. (1994). Monetary Policy with Flexible Exchange Rates and Forward Interest Rates as Indicators. Institute for International Economic Studies, Stockholm University.

[41] Tang H. and Y. Xia (2007). An International Examination of Affine Term Structure Models and the Expectations Hypothesis. Journal of Financial and Quantitative Analysis, 42, 1, 41-80.

[42] Vasicek O. and G. Fong (1982). Term Structure Modelling Using Exponential Splines. Journal of Finance, 37, 2, 339-48.

5 Appendix

5.1 Proof of Theorem 1.

Theorem 1.

Assume $Y_t$-dynamics under a probability measure $Q$ equivalent to $P$ given by:

$$dY_t = \mu^Q(Y_t)\,dt + \sigma(Y_t)\,dW^*_t, \tag{18}$$

where $W^*$ is a Brownian motion under $Q$.

If $\mu^Q(Y_t)$ satisfies the restriction expressed in Equation (19), $Q$ is an equivalent martingale measure and the AF conditions hold.²⁷

$$\begin{aligned}
\sum_{j=2}^{N} (j-1)\, L_j Y_t\, \tau^{j-2} &= \sum_{j=1}^{N} L_j \mu^Q(Y_t)\, \tau^{j-1} - \sum_{j=1}^{[N/2]} \sum_{k=1}^{[N/2]} \Gamma_{jk}(Y_t)\, \frac{\tau^{j+k-1}}{k}, \\
\Gamma_{jk}(Y_t) &= 0 \quad \text{for } j > [N/2] \text{ or } k > [N/2],
\end{aligned} \tag{19}$$

with $\Gamma(Y_t) = L\sigma(Y_t)\sigma(Y_t)'L'$, $L_j$ standing for the $j$th row of an upper triangular matrix that depends only on $\ell$, and $[\cdot]$ representing the integer part of a number.

Proof of Theorem 1.

The term structure of the Legendre polynomial model is given by:

$$R(\tau, Y_t) = G(\tau)'Y_t = \sum_{n=1}^{N} Y_{t,n}\, P_{n-1}\left(\frac{2\tau}{\ell} - 1\right), \tag{20}$$

that is, the loadings of the term structure are Legendre polynomials. There-

fore, the τ-maturity instantaneous forward rate is

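$$f(\tau, Y_t) = \sum_{n=1}^{N} Y_{t,n}\, P_{n-1}\left(\frac{2\tau}{\ell} - 1\right) + \tau \left(\sum_{n=1}^{N} Y_{t,n}\, \frac{\partial P_{n-1}\left(\frac{2\tau}{\ell} - 1\right)}{\partial \tau}\right). \tag{21}$$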

In the equation above, the forward rates are expressed as linear combinations

of Legendre polynomials, which can be readily expressed as linear combina-

²⁷ In addition to the drift restriction, $\sigma(Y_t)$ should present enough regularity to guarantee

that discounted bond prices that are local martingales also become martingales. In

practical problems, a bounded or a square-affine $\sigma(Y_t)$ is enough to enforce the martingale

condition.

tions of powers of τ:

$$f(\tau, Y_t) = \sum_{n=1}^{N} L_n Y_t\, \tau^{n-1}, \tag{22}$$

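where $L_n$ is the $n$th row of the upper triangular matrix $L$. In fact, (22) defines matrix $L$. If $N = 6$ the matrix $L$ is²⁸

$$L = \begin{pmatrix}
1 & -1 & 1 & -1 & 1 & -1 \\
0 & \frac{4}{\ell} & -\frac{12}{\ell} & \frac{24}{\ell} & -\frac{40}{\ell} & \frac{60}{\ell} \\
0 & 0 & \frac{18}{\ell^2} & -\frac{90}{\ell^2} & \frac{270}{\ell^2} & -\frac{630}{\ell^2} \\
0 & 0 & 0 & \frac{80}{\ell^3} & -\frac{560}{\ell^3} & \frac{2240}{\ell^3} \\
0 & 0 & 0 & 0 & \frac{350}{\ell^4} & -\frac{3150}{\ell^4} \\
0 & 0 & 0 & 0 & 0 & \frac{1512}{\ell^5}
\end{pmatrix}. \tag{24}$$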

Proposition 3.2 of Filipovic (1999) presents conditions on f(τ, Yt), which

guarantee that discounted bond prices are martingales under any specific interest

rate model.²⁹ Using these conditions, Almeida (2005) proves that if the

AF restrictions (19) hold, then the Legendre polynomial model is arbitrage-

free.
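
The matrix $L$ can also be obtained numerically by expanding each Legendre loading in powers of τ and using f = ∂(τR)/∂τ, which follows from (21). The sketch below is an illustration only: the function name is hypothetical, and the value passed for ℓ in any call is an arbitrary choice rather than the one used in the estimation.

```python
import numpy as np
from numpy.polynomial import legendre, polynomial

def forward_power_matrix(n_factors, ell):
    """Upper triangular L such that f(tau, Y) = sum_n (L[n-1] @ Y) * tau**(n-1)."""
    L = np.zeros((n_factors, n_factors))
    x_of_tau = np.array([-1.0, 2.0 / ell])       # x = 2*tau/ell - 1 as a polynomial in tau
    for n in range(n_factors):
        # Power-basis coefficients of the Legendre polynomial P_n(x)
        c_x = legendre.leg2poly([0.0] * n + [1.0])
        # Compose P_n(x(tau)) with Horner's scheme to get coefficients in tau
        c_tau = np.array([0.0])
        for c in c_x[::-1]:
            c_tau = polynomial.polyadd(polynomial.polymul(c_tau, x_of_tau), [c])
        coeffs = np.zeros(n + 1)
        coeffs[: c_tau.size] = c_tau
        # f = d(tau * R)/d(tau): the tau**k coefficient of R is scaled by (k + 1)
        L[: n + 1, n] = (np.arange(n + 1) + 1) * coeffs
    return L
```

Calling `forward_power_matrix(6, ell)` returns a 6 × 6 array whose entry in row j and column n is the coefficient of τ^(j−1) multiplying the n-th factor, so each entry can be compared with the closed-form matrix after multiplying row j by ℓ^(j−1).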

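²⁸ Using the first six Legendre polynomials we have

$$\begin{aligned}
f(\tau, Y_t) ={}& Y_{t,1} + Y_{t,2}\,x + \frac{Y_{t,3}}{2}(3x^2 - 1) + \frac{Y_{t,4}}{2}(5x^3 - 3x) \\
&+ \frac{Y_{t,5}}{8}(35x^4 - 30x^2 + 3) + \frac{Y_{t,6}}{8}(63x^5 - 70x^3 + 15x) \\
&+ \frac{2\tau}{\ell}\left[Y_{t,2} + 3Y_{t,3}\,x + \frac{Y_{t,4}}{2}(15x^2 - 3)\right] \\
&+ \frac{2\tau}{\ell}\left[\frac{Y_{t,5}}{8}(140x^3 - 60x) + \frac{Y_{t,6}}{8}(315x^4 - 210x^2 + 15)\right],
\end{aligned} \tag{23}$$

where $x = \frac{2\tau}{\ell} - 1$. Collecting terms that are powers of $\tau$ in the expression above we obtain the upper triangular matrix $L$ for $N = 6$.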

²⁹ Basically, Proposition 3.2 of Filipovic (1999) imposes a specific relationship between

the partial derivatives of f(τ, Yt).

5.2 Technical Details about the Sub-Class of Arbitrage-

Free Legendre Models with Aﬃne Dynamics.

The affine class of dynamic term structure models is composed of processes whose state vector $Y$ is an affine diffusion,³⁰ and whose implied short term rate is affine in $Y$. Dai and Singleton (2000) proposed the following notation to describe the dynamics of canonical affine models under the risk neutral measure $Q$:

$$dY_t = \mu^Q(Y_t)\,dt + \Sigma\sqrt{S_t(Y_t)}\,dW^*_t = \kappa^Q(\theta^Q - Y_t)\,dt + \Sigma\sqrt{S_t(Y_t)}\,dW^*_t, \tag{25}$$

where $\kappa^Q$ and $\Sigma$ are $N \times N$ matrices, $\theta^Q$ is an $\mathbb{R}^N$-vector, and $S_t$ is a diagonal matrix with elements $S_t^{ii} = \alpha_i + \beta_i' Y_t$ for some scalar $\alpha_i$ and some $\mathbb{R}^N$-vector $\beta_i$.

Now suppose we want to equip the aﬃne class of models with a loadings

structure composed of Legendre polynomials.³¹ To this end, we have to

impose the AF restrictions of Theorem 1.

Consider the auxiliary state space vector $\tilde{Y}_t$ defined by

$$\tilde{Y}_t = L Y_t, \tag{26}$$

where Lis the upper triangular matrix of Theorem 1. This auxiliary pro-

cess characterizes term structure movements when the loadings come from a

power series in the maturity variable τ. It appears as an intermediate step

in calculations.

The dynamics of $\tilde{Y}_t$ under the probability measure $Q$ are given by

$$d\tilde{Y}_t = \tilde{\mu}^Q(\tilde{Y}_t)\,dt + \tilde{\Sigma}\sqrt{\tilde{S}_t(\tilde{Y}_t)}\,dW^*_t, \tag{27}$$

where the parameters of this system of stochastic differential equations are defined in a similar way to (25) (i.e., $\tilde{S}_t^{ii} = \tilde{\alpha}_i + \tilde{\beta}_i'\tilde{Y}_t$ for some scalar $\tilde{\alpha}_i$ and some $\mathbb{R}^N$-vector $\tilde{\beta}_i$, and so on) and are related through (26) to the corresponding parameters in (25). It should be clear that $\tilde{Y}_t$ is affine if, and only if, $Y_t$ is affine, because $L$ is invertible.

30 This means that the drift and the squared diffusion terms of $Y$ are affine functions of $Y$.

31 Note that it is not possible to use the separation arguments of Duffie and Kan (1996) that lead to their pair of Riccati equations, since the Legendre polynomials do not satisfy one of the algebraic conditions stated in their main theorem.

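Under this particular sub-class (affine plus polynomial loadings), the first requirement of the AF restrictions becomes

$$
\sum_{j=2}^{N} (j-1)\,\tilde{Y}_{t,j}\,\tau^{j-2} = \sum_{j=1}^{N} \tilde{\mu}^Q(\tilde{Y}_t)_j\,\tau^{j-1} - \sum_{j=1}^{[N/2]} \sum_{k=1}^{[N/2]} \left(\tilde{H}_{0,jk} + \tilde{H}_{1,jk}\tilde{Y}_t\right)\frac{\tau^{j+k-1}}{k}, \tag{28}
$$

where $(\tilde{\Sigma}\tilde{S}_t\tilde{\Sigma}')_{ij} = \tilde{H}_{0,ij} + \tilde{H}_{1,ij}\tilde{Y}_t$, with $\tilde{H}_{0,ij} \in \mathbb{R}$ and $\tilde{H}_{1,ij} \in \mathbb{R}^N$.

In particular, by matching coefficients on the maturity variable $\tau$ in (28), we obtain an explicit expression for the drift of the auxiliary process:

$$
\tilde{\mu}^Q(\tilde{Y}_t)_i = i\,\tilde{Y}_{t,i+1} + \sum_{j=\max\{1,\,i-[N/2]\}}^{\min\{i-1,\,[N/2]\}} \frac{\tilde{H}_{0,j(i-j)} + \tilde{H}_{1,j(i-j)}\tilde{Y}_t}{i-j}. \tag{29}
$$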

This expression can be readily translated to a similar expression for the drift

of the original state vector Ywith the use of (26).
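The coefficient matching behind (29) can be sketched in a few lines of Python. This is an illustrative sketch only: the function name `auxiliary_drift` and the array layout (a full $N \times N$ array for $\tilde{H}_0$ and an $N \times N \times N$ array for $\tilde{H}_1$) are assumptions made for the example, and the term $i\,\tilde{Y}_{t,i+1}$ is taken to be absent for $i = N$, since the state vector has no component $N+1$.

```python
import numpy as np

def auxiliary_drift(y_tilde, H0, H1):
    """Drift of the auxiliary process, following the pattern of (29).

    y_tilde : (N,) auxiliary state vector
    H0      : (N, N) array; H0[j-1, k-1] stands for H~_{0,jk}
    H1      : (N, N, N) array; H1[j-1, k-1] stands for the R^N vector H~_{1,jk}
    """
    y_tilde = np.asarray(y_tilde, dtype=float)
    N = len(y_tilde)
    half = N // 2                       # integer part [N/2]
    mu = np.zeros(N)
    for i in range(1, N + 1):           # 1-based index, as in the text
        if i < N:                       # assumption: no Y~_{t,N+1} term for i = N
            mu[i - 1] = i * y_tilde[i]
        for j in range(max(1, i - half), min(i - 1, half) + 1):
            k = i - j
            mu[i - 1] += (H0[j - 1, k - 1] + H1[j - 1, k - 1] @ y_tilde) / k
    return mu

# Example with arbitrary (hypothetical) inputs for N = 6
rng = np.random.default_rng(0)
y = rng.normal(size=6)
H0 = rng.normal(size=(6, 6))
H1 = rng.normal(size=(6, 6, 6))
mu = auxiliary_drift(y, H0, H1)
```

For $N = 6$ the first component reduces to $\tilde{Y}_{t,2}$ and the last one contains only the $\tilde{H}_{\cdot,33}/3$ terms.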

In the empirical section of our paper, we compare a three-factor CS version with corresponding AF versions that present three stochastic factors with non-null diffusions. By Theorem 1, a natural way to implement this application is to work with AF versions driven by six factors (three stochastic, three conditionally deterministic). In the next lines, we provide the restrictions that should be implemented to generate affine models with polynomial loadings, and show how to translate those restrictions to generate affine models with Legendre polynomial loadings. After that, we explain in detail the two AF versions chosen to be implemented in this work: the Arbitrage-Free Gaussian (AFG) version, in which the volatility of $Y$ is deterministic and time independent, and the Arbitrage-Free Stochastic Volatility (AFSV) version, in which only one stochastic factor determines the volatility of $Y$.

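When $N = 6$, the dynamics of $\tilde{Y}_t$ have the following form:

$$
\begin{aligned}
\tilde{S}^{ii}_t(\tilde{Y}_t) &= \begin{cases} \tilde{\alpha}_i + \tilde{\beta}_i'\tilde{Y}_t & \text{if } i \le 3, \\ 0 & \text{if } i > 3, \end{cases} \\
\tilde{\Sigma}_{i,j} &= 0, \quad i, j > 3, \\
\tilde{\mu}^Q(\tilde{Y}_t)_1 &= \tilde{Y}_{t,2}, \\
\tilde{\mu}^Q(\tilde{Y}_t)_2 &= 2\tilde{Y}_{t,3} + \tilde{H}_{0,11} + \tilde{H}_{1,11}\tilde{Y}_t, \\
\tilde{\mu}^Q(\tilde{Y}_t)_3 &= 3\tilde{Y}_{t,4} + \frac{\tilde{H}_{0,12}}{2} + \frac{\tilde{H}_{1,12}}{2}\tilde{Y}_t + \tilde{H}_{0,21} + \tilde{H}_{1,21}\tilde{Y}_t, \\
\tilde{\mu}^Q(\tilde{Y}_t)_4 &= 4\tilde{Y}_{t,5} + \frac{\tilde{H}_{0,13}}{3} + \frac{\tilde{H}_{1,13}}{3}\tilde{Y}_t + \frac{\tilde{H}_{0,22}}{2} + \frac{\tilde{H}_{1,22}}{2}\tilde{Y}_t + \tilde{H}_{0,31} + \tilde{H}_{1,31}\tilde{Y}_t, \\
\tilde{\mu}^Q(\tilde{Y}_t)_5 &= 5\tilde{Y}_{t,6} + \frac{\tilde{H}_{0,23}}{3} + \frac{\tilde{H}_{1,23}}{3}\tilde{Y}_t + \frac{\tilde{H}_{0,32}}{2} + \frac{\tilde{H}_{1,32}}{2}\tilde{Y}_t, \\
\tilde{\mu}^Q(\tilde{Y}_t)_6 &= \frac{\tilde{H}_{0,33}}{3} + \frac{\tilde{H}_{1,33}}{3}\tilde{Y}_t.
\end{aligned}
\tag{30}
$$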

The dynamics of the term structure movements $Y$ under the original Legendre polynomial parameterization can then be obtained by solving (26). To that end, let us rewrite the drift $\tilde{\mu}^Q$ in matrix notation, as an affine transformation of $\tilde{Y}$:

$$
\tilde{\mu}^Q(\tilde{Y}_t) = M + U\tilde{Y}_t, \tag{31}
$$

where $U = U_1 + U_2$, and $U_1$, $U_2$ and $M$ are given by:

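$$
M = \begin{bmatrix}
0 \\
\tilde{H}_{0,11} \\
\frac{\tilde{H}_{0,12}}{2} + \tilde{H}_{0,21} \\
\frac{\tilde{H}_{0,13}}{3} + \frac{\tilde{H}_{0,22}}{2} + \tilde{H}_{0,31} \\
\frac{\tilde{H}_{0,23}}{3} + \frac{\tilde{H}_{0,32}}{2} \\
\frac{\tilde{H}_{0,33}}{3}
\end{bmatrix}, \tag{32}
$$

$$
U_1 = \begin{bmatrix}
0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 2 & 0 & 0 & 0 \\
0 & 0 & 0 & 3 & 0 & 0 \\
0 & 0 & 0 & 0 & 4 & 0 \\
0 & 0 & 0 & 0 & 0 & 5 \\
0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}, \tag{33}
$$

$$
U_2 = \begin{bmatrix}
0_{1 \times 6} \\
\tilde{H}_{1,11} \\
\frac{\tilde{H}_{1,12}}{2} + \tilde{H}_{1,21} \\
\frac{\tilde{H}_{1,13}}{3} + \frac{\tilde{H}_{1,22}}{2} + \tilde{H}_{1,31} \\
\frac{\tilde{H}_{1,23}}{3} + \frac{\tilde{H}_{1,32}}{2} \\
\frac{\tilde{H}_{1,33}}{3}
\end{bmatrix}. \tag{34}
$$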

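Finally, the drift and diffusion of process $Y$ are given by:

$$
\mu^Q(Y_t) = L^{-1}\tilde{\mu}^Q(\tilde{Y}_t) = L^{-1}\tilde{\mu}^Q(LY_t) = L^{-1}M + L^{-1}ULY_t \tag{35}
$$

and

$$
\sigma(Y_t) = L^{-1}\tilde{\Sigma}\sqrt{\tilde{S}_t(LY_t)}. \tag{36}
$$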

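In our empirical application, the maximum maturity is equal to $\ell = 10$ years. Then, matrix $L$ is given by:

$$
L = \begin{bmatrix}
1 & -1 & 1 & -1 & 1 & -1 \\
0 & 0.4 & -1.2 & 2.4 & -4 & 6 \\
0 & 0 & 0.180 & -0.9 & 2.70 & -6.3 \\
0 & 0 & 0 & 0.08 & -0.56 & 2.24 \\
0 & 0 & 0 & 0 & 0.035 & -0.3158 \\
0 & 0 & 0 & 0 & 0 & 0.0152
\end{bmatrix}. \tag{37}
$$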
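The entries of $L$ in (37) can be reproduced numerically. The sketch below assumes, as the structure of (23) suggests, that the forward rate equals the Legendre yield expansion $R(\tau) = \sum_n Y_{t,n+1} P_n(2\tau/\ell - 1)$ plus $\tau$ times its derivative, so that the coefficient of $\tau^k$ in $R$ is multiplied by $(k+1)$. The function name `legendre_forward_matrix` is hypothetical.

```python
import numpy as np
from numpy.polynomial import legendre, polynomial

def legendre_forward_matrix(N=6, ell=10.0):
    """Upper triangular L such that f(tau, Y_t) = sum_k (L Y_t)_k * tau**k."""
    L = np.zeros((N, N))
    for n in range(N):
        # Coefficients of P_n in powers of x, lowest degree first
        in_x = legendre.leg2poly([0.0] * n + [1.0])
        # Horner scheme to substitute x = 2*tau/ell - 1 and expand in tau
        in_tau = np.array([0.0])
        for c in in_x[::-1]:
            in_tau = polynomial.polyadd(
                polynomial.polymul(in_tau, [-1.0, 2.0 / ell]), [c]
            )
        # f = R + tau * dR/dtau scales the tau**k coefficient by (k + 1)
        L[: n + 1, n] = (np.arange(n + 1) + 1) * in_tau
    return L

L = legendre_forward_matrix(N=6, ell=10.0)
print(np.round(L, 4))
```

Under these assumptions the computed matrix agrees with the printed entries of (37) to roughly three decimal places.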

Now we are ready to specialize the drift restriction (30) to each particular

AF version implemented in this paper (AFG and AFSV), and also to obtain

the corresponding restrictions for the process of interest Y, the one that

drives term structure movements within the Legendre polynomial model.

5.2.1 The AFG Version

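Noting that in this version the matrix controlling the diffusion structure of vector $\tilde{Y}$, i.e. $\tilde{S}(\cdot)$, is the identity matrix, we directly obtain $\tilde{\Sigma}\tilde{\Sigma}' = \tilde{H}_0$, and from (36) we obtain the relation between $\tilde{H}_0$ and $\Sigma$:

$$
\tilde{H}_0 = L\Sigma^2\left((L^{-1})'\right)^{-1} = L\Sigma^2L'. \tag{38}
$$

If we adopt a diagonal matrix representation for $\Sigma$,$^{32}$ with $\Sigma_{ii}$ as the $i$th diagonal term, then, in order to match the second requirement of the AF restrictions, we must have $\Sigma_{ii} = 0$ for $i \ge 4$. Therefore, using the transformation $L$ between $Y$ and $\tilde{Y}$, $\tilde{H}_0$ can be explicitly related to the non-null diagonal terms in $\Sigma$:

$$
\tilde{H}_0 = \begin{bmatrix}
\Sigma_{11}^2 + \Sigma_{22}^2 + \Sigma_{33}^2 & -0.4\Sigma_{22}^2 - 1.2\Sigma_{33}^2 & 0.18\Sigma_{33}^2 & 0 & 0 & 0 \\
-0.4\Sigma_{22}^2 - 1.2\Sigma_{33}^2 & 0.16\Sigma_{22}^2 + 1.44\Sigma_{33}^2 & -0.216\Sigma_{33}^2 & 0 & 0 & 0 \\
0.18\Sigma_{33}^2 & -0.216\Sigma_{33}^2 & 0.0324\Sigma_{33}^2 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}. \tag{39}
$$

32 This representation for $\Sigma$ provides exactly the same identification structure as Dai and Singleton (2002).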

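Since $U_2$ is null under the Gaussian version, we learn from (35) that the two matrices ($L^{-1}M$ and $L^{-1}U_1L$) necessary to obtain an explicit expression for the drift $\mu^Q(Y_t)$ are given by:

$$
L^{-1}M = \begin{bmatrix}
\frac{5}{2}\Sigma_{11}^2 + \frac{5}{6}\Sigma_{22}^2 + \frac{1}{2}\Sigma_{33}^2 \\
\frac{5}{2}\Sigma_{11}^2 + \frac{3}{2}\Sigma_{22}^2 + \frac{11}{14}\Sigma_{33}^2 \\
\frac{5}{3}\Sigma_{22}^2 + \frac{5}{7}\Sigma_{33}^2 \\
\Sigma_{22}^2 + \Sigma_{33}^2 \\
\frac{9}{7}\Sigma_{33}^2 \\
\frac{5}{7}\Sigma_{33}^2
\end{bmatrix} \tag{40}
$$

and

$$
L^{-1}U_1L = \begin{bmatrix}
0 & 0.4 & -0.3 & 0.56667 & -0.41667 & 0.65667 \\
0 & 0 & 0.9 & -0.5 & 1.25 & -0.77 \\
0 & 0 & 0 & 1.3333 & -0.58333 & 1.7833 \\
0 & 0 & 0 & 0 & 1.75 & -0.63 \\
0 & 0 & 0 & 0 & 0 & 2.16 \\
0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix} \tag{41}
$$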
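The two matrices in (40) and (41) can be checked numerically from (32), (33), (35) and (38). The following sketch is illustrative only: the function name `gaussian_drift_terms` is hypothetical, `L_37` copies the rounded entries printed in (37), and the sample volatilities are the AFG estimates reported in Table 1. Because the entries of (37) are rounded, the output should be expected to match (40) and (41) only approximately.

```python
import numpy as np

# Matrix L as printed in (37), for ell = 10 years (entries are rounded)
L_37 = np.array([
    [1.0, -1.0, 1.0, -1.0, 1.0, -1.0],
    [0.0, 0.4, -1.2, 2.4, -4.0, 6.0],
    [0.0, 0.0, 0.18, -0.9, 2.7, -6.3],
    [0.0, 0.0, 0.0, 0.08, -0.56, 2.24],
    [0.0, 0.0, 0.0, 0.0, 0.035, -0.3158],
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0152],
])

def gaussian_drift_terms(L, sigma_diag):
    """Return (L^{-1} M, L^{-1} U1 L) for the Gaussian case with N = 6."""
    N = 6
    # (38): H~_0 = L Sigma^2 L' for a diagonal Sigma
    H0 = L @ np.diag(np.square(sigma_diag)) @ L.T
    # (32): constant part of the auxiliary drift
    M = np.zeros(N)
    for i in range(1, N + 1):
        for j in range(max(1, i - 3), min(i - 1, 3) + 1):
            M[i - 1] += H0[j - 1, i - j - 1] / (i - j)
    # (33): superdiagonal matrix with entries 1, ..., 5
    U1 = np.diag(np.arange(1.0, N), k=1)
    const = np.linalg.solve(L, M)          # L^{-1} M
    slope = np.linalg.solve(L, U1 @ L)     # L^{-1} U1 L
    return const, slope

# AFG estimates of Sigma_11, Sigma_22, Sigma_33 from Table 1
const, slope = gaussian_drift_terms(L_37, [0.0206, 0.0094, 0.0023, 0.0, 0.0, 0.0])
print(np.round(slope, 4))
```

Solving the triangular systems with `np.linalg.solve` avoids forming $L^{-1}$ explicitly, which is a design choice of this sketch rather than something prescribed by the text.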

Note that $Y_4$, $Y_5$, and $Y_6$ are deterministic factors under the Gaussian case. This is a consequence of two facts: (i) their dynamics do not depend on the Brownian motion vector, and (ii) their drifts do not depend on the first three components of the state vector. With matrices $L^{-1}M$ and $L^{-1}U_1L$ in hand, we obtain the drift of vector $Y$, and in particular, the drifts of the deterministic factors $Y_4$, $Y_5$, and $Y_6$:

$$
\begin{aligned}
\mu^Q(Y_t)_4 &= \Sigma_{22}^2 + \Sigma_{33}^2 + 1.75\,Y_{t,5} - 0.63\,Y_{t,6}, \\
\mu^Q(Y_t)_5 &= \tfrac{9}{7}\Sigma_{33}^2 + 2.16\,Y_{t,6}, \\
\mu^Q(Y_t)_6 &= \tfrac{5}{7}\Sigma_{33}^2.
\end{aligned}
\tag{42}
$$

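By explicitly solving the ordinary differential equations implied for these factors, we have

$$
Y_{t,4} = Y_{0,4} + \left(\Sigma_{22}^2 + \Sigma_{33}^2 + 1.75\,Y_{0,5} - 0.63\,Y_{0,6}\right)t + \left(0.9\,\Sigma_{33}^2 + 1.89\,Y_{0,6}\right)t^2 + 0.45\,\Sigma_{33}^2\,t^3, \tag{43}
$$

$$
Y_{t,5} = Y_{0,5} + \left(\tfrac{9}{7}\Sigma_{33}^2 + 2.16\,Y_{0,6}\right)t + \tfrac{27}{35}\Sigma_{33}^2\,t^2, \tag{44}
$$

$$
Y_{t,6} = Y_{0,6} + \tfrac{5}{7}\Sigma_{33}^2\,t. \tag{45}
$$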
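Because the drifts in (42) form a triangular system, the deterministic factors can be obtained by integrating from $Y_6$ back to $Y_4$. The sketch below does this with NumPy polynomial objects; the function name `afg_deterministic_factors` is hypothetical, and the sample inputs are the AFG estimates of $\Sigma_{22}$, $\Sigma_{33}$, $Y_{0,4}$, $Y_{0,5}$ and $Y_{0,6}$ from Table 1.

```python
import numpy as np
from numpy.polynomial import Polynomial

def afg_deterministic_factors(s22, s33, y04, y05, y06):
    """Integrate the drifts in (42); returns Y_4, Y_5, Y_6 as polynomials in t."""
    c4 = s22 ** 2 + s33 ** 2
    c5 = 9.0 / 7.0 * s33 ** 2
    c6 = 5.0 / 7.0 * s33 ** 2
    # dY6/dt = c6, so Y6 is linear in t
    y6 = Polynomial([c6]).integ() + y06
    # dY5/dt = c5 + 2.16 * Y6, so Y5 is quadratic in t
    y5 = (c5 + 2.16 * y6).integ() + y05
    # dY4/dt = c4 + 1.75 * Y5 - 0.63 * Y6, so Y4 is cubic in t
    y4 = (c4 + 1.75 * y5 - 0.63 * y6).integ() + y04
    return y4, y5, y6

# AFG estimates from Table 1
y4, y5, y6 = afg_deterministic_factors(0.0094, 0.0023, 0.0082, -0.0009, 0.0)
print(y4.coef)   # coefficients of 1, t, t^2, t^3
print(y4(1.0))   # value of Y_4 one time unit ahead
```

The `coef` attribute lists coefficients from the constant term upward, so it can be compared directly with the polynomial expressions for $Y_{t,4}$, $Y_{t,5}$ and $Y_{t,6}$.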

Note that, under this Gaussian version, the dynamics of the state variables $Y_{t,4}$, $Y_{t,5}$ and $Y_{t,6}$, in addition to being deterministic, are completely determined by parameters $\Sigma_{22}$, $\Sigma_{33}$, and the initial conditions $Y_{0,4}$, $Y_{0,5}$ and $Y_{0,6}$.

5.2.2 The AFSV Version

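The AFSV version presents one stochastic factor driving the stochastic volatility of the three stochastic factors ($Y_1$, $Y_2$ and $Y_3$). In order to keep the risk-neutral dynamics of both $Y_t$ and $\tilde{Y}_t$ within the sub-class of affine models with only one factor determining the volatility, we choose factor $Y_3$ to drive the stochastic volatility.$^{33}$ Specifically, we set

$$
\beta_i' = [\,0 \;\; 0 \;\; \beta_{i3} \;\; 0 \;\; 0 \;\; 0\,],
$$

which gives:

$$
H_{1,ij} = [\,0 \;\; 0 \;\; h_{ij} \;\; 0 \;\; 0 \;\; 0\,], \quad 1 \le i, j \le 6;
$$

where $(\Sigma S_t \Sigma')_{ij} = H_{0,ij} + H_{1,ij}Y_t$ with $H_{1,ij} \in \mathbb{R}^N$. These specifications imply that

$$
H_1 \cdot Y_t = Y_{t,3}H,
$$

where on the right-hand side we have a tensor product, with $H$ being a $6 \times 6$ matrix with elements $h_{ij}$ (see Duffie (2001) for the tensorial notation).

From (36) and the relation $\tilde{Y}_t = LY_t$ we obtain

$$
\begin{aligned}
H_0 &= \Sigma\,\mathrm{diag}[\alpha_1, \dots, \alpha_6]\,\Sigma', \\
\tilde{H}_0 &= LH_0L', \\
H &= \Sigma\,\mathrm{diag}[b_1, \dots, b_6]\,\Sigma',
\end{aligned}
$$

where $b_i = \beta_{i3}$. Since $\tilde{H}_1 \cdot \tilde{Y}_t = L(H_1 \cdot Y_t)L' = Y_{t,3}LHL'$, we have

$$
\tilde{H}_{1,ij} = \left[\,0 \;\; 0 \;\; z_{ij} \;\; 11.25\,z_{ij} \;\; \tfrac{720}{7}z_{ij} \;\; \tfrac{6250}{7}z_{ij}\,\right], \tag{46}
$$

with $z_{ij} = (LHL')_{ij}$.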

Hence from (32), (33) and (34) we can express Mand Uas well as the drift

of Ytas functions of αi,βi3(i= 1,...,6) and Σ. Finally, for identiﬁcation

purposes, in the empirical implementation of this version, we ﬁx Σ to be

a diagonal matrix (with Σii = 0 for i≥4 in order to match the second

requirement of AF restrictions) and α= [1 1 1 0 0 0].

5.2.3 Estimation Procedures

How does one estimate CS and AF versions of the Legendre polynomial

model?

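For the CS version, we run independent cross-sectional regressions for each point $t$ in time, within the sample period. In a market of zero-coupon bonds, assuming that we observe yields $R_{obs}$ with measurement error, the model is estimated with the use of the following linear regression:

$$
\hat{Y}_t = (F'F)^{-1}F'R^t_{obs}, \tag{47}
$$

where $R^t_{obs}$ is a vector containing observed yields, at time $t$, for different maturities $(\tau_1, \dots, \tau_k)$, and $F$ is the following matrix:

$$
F = \begin{bmatrix}
P_0\!\left(\frac{2\tau_1}{\ell} - 1\right) & P_1\!\left(\frac{2\tau_1}{\ell} - 1\right) & \cdots & P_{N-1}\!\left(\frac{2\tau_1}{\ell} - 1\right) \\
P_0\!\left(\frac{2\tau_2}{\ell} - 1\right) & P_1\!\left(\frac{2\tau_2}{\ell} - 1\right) & \cdots & P_{N-1}\!\left(\frac{2\tau_2}{\ell} - 1\right) \\
\vdots & \vdots & \ddots & \vdots \\
P_0\!\left(\frac{2\tau_k}{\ell} - 1\right) & P_1\!\left(\frac{2\tau_k}{\ell} - 1\right) & \cdots & P_{N-1}\!\left(\frac{2\tau_k}{\ell} - 1\right)
\end{bmatrix}.
$$

33 Choices of any of the two remaining factors to capture stochastic volatility could have been implemented, but with higher computational costs.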
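The cross-sectional regression (47) can be sketched with NumPy's Legendre Vandermonde matrix. The function name `cs_factors` is hypothetical, the maturities are the five used in the paper's tables (2, 3, 5, 7 and 10 years) with $\ell = 10$, and the yield values are made-up numbers for illustration. The sketch uses a least-squares solver, which gives the same estimate as $(F'F)^{-1}F'R^t_{obs}$ when $F$ has full column rank.

```python
import numpy as np
from numpy.polynomial import legendre

def cs_factors(yields, maturities, n_factors=3, ell=10.0):
    """OLS estimate of the Legendre factors from one cross section of yields."""
    x = 2.0 * np.asarray(maturities, dtype=float) / ell - 1.0
    # Columns are P_0(x), ..., P_{N-1}(x): the matrix F of the regression
    F = legendre.legvander(x, n_factors - 1)
    y_hat, *_ = np.linalg.lstsq(F, np.asarray(yields, dtype=float), rcond=None)
    return y_hat

maturities = [2.0, 3.0, 5.0, 7.0, 10.0]
observed = [0.0534, 0.0550, 0.0565, 0.0572, 0.0582]   # hypothetical yields
print(cs_factors(observed, maturities))               # level, slope, curvature
```

With five maturities and three factors the system is overdetermined, so the fit leaves a residual that plays the role of the measurement error.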

For both AF versions we use the Quasi-Maximum Likelihood (QML) procedure, adopting the methodology proposed by Chen and Scott (1993), with 2-, 5-, and 10-year maturity zero-coupon bonds priced exactly and 3- and 7-year maturity zero-coupon bonds priced with i.i.d. zero-mean errors. The conditional transition densities are obtained with the use of closed-form formulas for the first and second moments of $Y$ within the affine framework (see, for instance, Duffee (2002) and Jacobs and Karoui (2006)). Observe that for the AFG version the QML is, in fact, a pure maximum likelihood procedure, since the transition densities come exactly from a normal distribution.

Parameter    AFG                   AFSV
β13          -                     86.20 (13.96)
β23          -                     63.77 (21.72)
β33          -                     62.78 (7.45)
Σ11          0.0206 (0.0005)       0.0218 (0.0005)
Σ22          0.0094 (0.0004)       0.0099 (0.0004)
Σ33          0.0023 (0.0000)       0.0031 (0.0001)
λ0(1)        1.56 (1.02)           *
λ0(2)        *                     1.30 (0.92)
λ0(3)        *                     −0.31 (0.79)
λY(1,1)      −17.65 (9.79)         −21.65 (68.72)
λY(1,2)      *                     *
λY(1,3)      131.5 (111.48)        *
λY(2,1)      1.70 (2.58)           14.08 (10.68)
λY(2,2)      −164.81 (47.49)       186.89 (80.73)
λY(2,3)      −268.89 (138.63)      480.15 (191.58)
λY(3,1)      *                     −4.42 (8.09)
λY(3,2)      −82.53 (25.82)        149.31 (46.68)
λY(3,3)      −144.66 (59.77)       521.53 (144.56)
Y0,4         0.0082 (0.0006)       −0.0034 (0.0003)
Y0,5         −0.0009 (0.0000)      0.0008 (0.0001)
Y0,6         0.0000 (0.0000)       −0.0001 (0.0000)

Table 1: Estimated Parameters and Standard Errors for the AFG and AFSV Models

Standard errors appear in parentheses. Both models were estimated by QML adopting the methodology proposed by Chen and Scott (1993), with 2-, 5-, and 10-year maturity zero-coupon bonds priced exactly and 3- and 7-year maturity zero-coupon bonds priced with i.i.d. zero-mean errors. Under the AFSV model, for each i and j ≠ 3, βij is fixed to zero (only the third factor drives stochastic volatility). Values with stars were not significant in a first QML estimation pass. Values with dashes do not apply to the specific model. The estimation sample ranges from January 1972 to December 1994. Standard errors were obtained by the BHHH method.

Maturity 2-Year 3-Year 5-Year 7-Year 10-Year

Model 1-Month Forecasting Horizon

CS 13.8/20.5 6.7/20.1 7.8/24.6 11.9/27.7 9.5/27.9

AFG -5.6/17.6 25.8/31.8 -0.2/23.8 -78.1/86.6 16.6/31.0

AFSV -0.9/15.2 6.9/19.8 1.6/23.7 -20.2/34.4 25.8/37.6

Model 6-Month Forecasting Horizon

CS 64.4/73.5 55.2/70.0 54.2/75.0 58.3/81.2 59.6/84.0

AFG -14.1/46.4 21.4/50.2 4.2/55.4 -60.3/88.5 77.8/97.7

AFSV 15.1/43.7 17.4/52.7 9.5/60.1 -2.5/65.8 87.6/114.8

Model 12-Month Forecasting Horizon

CS 109.5/116.5 98.6/109.1 96.7/111.8 100.9/117.9 102.7/121.0

AFG -2.8/52.8 31.3/58.7 12.2/57.9 -49.3/79.8 120.0/133.9

AFSV 35.2/64.1 26.8/67.8 4.1/72.5 -13.9/78.1 91.6/127.5

Table 2: Bias and Root Mean Square Errors for Out-of-Sample Forecasts (in bps)

This table presents bias (first number in each cell) and RMSE (second number in each cell) for 1-month, 6-month and 12-month ahead out-of-sample forecasts, for the three versions of the polynomial model considered: Cross Sectional (CS), Arbitrage-free Gaussian (AFG), and Arbitrage-free with Stochastic Volatility (AFSV). The out-of-sample period ranges from January 1995 to December 1998. The smallest absolute bias and RMSE across models appear in bold.
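The two statistics reported in each cell can be computed as in the short sketch below. It assumes that bias is the average of forecast minus observed yield, a sign convention that appears consistent with the averages reported in Tables 4 to 6; the function name `bias_and_rmse_bps` and the sample numbers are hypothetical.

```python
import numpy as np

def bias_and_rmse_bps(forecasts_pct, observed_pct):
    """Bias and RMSE in basis points, from yields quoted in percent."""
    # 1 percentage point = 100 basis points
    err = (np.asarray(forecasts_pct, dtype=float)
           - np.asarray(observed_pct, dtype=float)) * 100.0
    return err.mean(), np.sqrt(np.mean(err ** 2))

# Hypothetical forecasts and realized yields, in percent
bias, rmse = bias_and_rmse_bps([5.50, 5.30, 5.45], [5.40, 5.40, 5.40])
print(f"bias = {bias:.1f} bps, RMSE = {rmse:.1f} bps")
```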


Maturity           2-Year      3-Year      5-Year     7-Year     10-Year

1-Month Forecasting Horizon
S1 AFSV x CS       −2.2**      −0.13       −0.41      1.59       4.17***
S2 AFSV x CS       −3.17***    0.0         0.29       0.87       3.17***
S1 AFG x CS        −0.80       5.61***     −0.16      8.70***    3.09***
S2 AFG x CS        −1.44       3.75***     0.29       5.48***    2.88***

6-Month Forecasting Horizon
S1 AFSV x CS       −3.76***    −1.64*      −1.15      −1.13      3.20***
S2 AFSV x CS       −4.42***    −3.81***    −2.90***   −2.90***   4.72***
S1 AFG x CS        −1.91**     −2.61***    −1.57      0.43       3.05***
S2 AFG x CS        −2.28**     −4.11***    −2.59***   1.37       4.42***

12-Month Forecasting Horizon
S1 AFSV x CS       −35.04***   −10.40***   −1.78*     −1.15      0.01
S2 AFSV x CS       −6.08***    −5.10***    −2.79***   −2.46**    1.15
S1 AFG x CS        −6.33***    −3.95***    −2.98***   −0.96      3.75***
S2 AFG x CS        −5.42***    −5.75***    −4.77***   −1.48      6.08***

Table 3: Statistical Comparison of Forecasts through the Diebold and Mariano (1995) Test

This table presents the Diebold and Mariano (1995) S1 and S2 statistics for 1-month, 6-month and 12-month ahead out-of-sample forecasts, comparing the AFSV and the AFG versions to the CS version. Comparisons are done as functions of Mean Absolute Errors (MAE). The in-sample period ranges from January 1972 to December 1994. The out-of-sample period ranges from January 1995 to December 1998. Negative values are in favor of the AFSV / AFG versions, and against the CS version. Small p-values indicate strong evidence against the null hypothesis of a zero difference in Mean Absolute Errors. One star indicates significance at the 90% level, two stars at the 95% level, and three stars at the 99% level, in a two-tailed test.

Maturity 2-Year 3-Year 5-Year 7-Year 10-Year

1-Month Forecasting Horizon

Average Yields 5.337 5.502 5.648 5.717 5.822

6-Month Forecasting Horizon

Average Yields 5.245 5.411 5.550 5.614 5.717

12-Month Forecasting Horizon

Average Yields 5.208 5.389 5.536 5.601 5.706

Table 4: Observed Yields Averaged across the Out-of-Sample Period

(in %)

This table presents observed yields averaged across the out-of-sample period, for the

three diﬀerent forecasting horizons. Out-of-sample period ranges from January 1995

to December 1998. For the h-month forecasting horizon, the average is performed

with a window of data ranging from the hth month of 1995 up to December 1998.


Maturity 2-Year 3-Year 5-Year 7-Year 10-Year

Model 1-Month Forecasting Horizon

CS 5.433 5.539 5.709 5.822 5.883

AFG 5.536 5.943 5.723 4.961 6.023

AFSV 5.453 5.680 5.723 5.510 5.943

Model 6-Month Forecasting Horizon

CS 5.627 5.736 5.906 6.010 6.044

AFG 6.198 6.352 5.839 5.100 6.903

AFSV 5.823 5.960 5.895 5.683 6.381

Model 12-Month Forecasting Horizon

CS 5.800 5.905 6.064 6.154 6.158

AFG 6.589 6.505 5.766 5.179 8.062

AFSV 6.082 6.130 5.974 5.806 6.896

Table 5: Model Implied Forward Rates Averaged Across the Out-

of-Sample Period (in %)

This table presents model implied forward rates with maturities τ, and forward term

equal to respectively 1-, 6-, and 12-month, averaged across the out-of-sample period,

for the three versions of the polynomial model considered: Cross Sectional (CS),

Arbitrage-free Gaussian (AFG), Arbitrage-free with Stochastic Volatility (AFSV).

In-sample period ranges from January 1972 to December 1994. Out-of-sample period

ranges from January 1995 to December 1998.


Maturity 2-Year 3-Year 5-Year 7-Year 10-Year

Model δt=1-Month

CS -4.2 -3.1 -1.6 -1.4 -3.5

AFG 25.5 18.2 7.8 2.5 3.4

AFSV 12.5 10.7 6.0 -0.5 -13.8

Model δt=6-Month

CS -26.3 -22.7 -18.6 -18.7 -27.0

AFG 109.3 72.7 24.8 8.9 40.8

AFSV 42.6 37.5 25.1 9.5 -21.2

Model δt=12-Month

CS -50.3 -46.9 -43.8 -45.6 -57.5

AFG 140.9 80.4 10.9 7.1 115.6

AFSV 52.2 47.3 39.8 34.4 27.4

Table 6: Model Implied Bond Risk Premium Averaged Across the

Out-of-Sample Period

This table presents the model-implied bond risk premium for 1-, 6- and 12-month holding periods, averaged across the out-of-sample period, for the three versions of the polynomial model considered: Cross Sectional (CS), Arbitrage-free Gaussian (AFG), and Arbitrage-free with Stochastic Volatility (AFSV). The in-sample period ranges from January 1972 to December 1994. The out-of-sample period ranges from January 1995 to December 1998. The bond risk premium for maturity $\tau$ and holding period $\delta t$ was normalized by a factor $\frac{\delta t}{\tau}$.

Maturity 2-Year 3-Year 5-Year 7-Year 10-Year

Model δt=1-Month

CS 0.69 0.55 0.79 0.88 0.63

AFG 3.55 1.71 36.89 0.97 1.21

AFSV 13.71 2.55 4.80 1.03 0.47

Model δt=6-Month

CS 0.59 0.59 0.66 0.68 0.55

AFG 6.76 4.40 6.93 0.85 1.52

AFSV 3.82 3.15 3.65 2.73 0.76

Model δt=12-Month

CS 0.54 0.52 0.55 0.55 0.44

AFG 49.21 3.57 1.90 0.86 1.96

AFSV 2.48 2.77 10.81 1.48 1.30

Table 7: Eﬀects of Bond Risk Premium on Forecasting Bias

This table presents ratios of the absolute value of the forecasting bias obtained by imposing a zero bond risk premium (using only forward rates) over the absolute value of the actual forecasting bias, for 1-, 6- and 12-month holding period intervals, for the three versions of the polynomial model considered: Cross Sectional (CS), Arbitrage-free Gaussian (AFG), and Arbitrage-free with Stochastic Volatility (AFSV). The in-sample period ranges from January 1972 to December 1994. The out-of-sample period ranges from January 1995 to December 1998. A ratio above one indicates that the model-implied risk premium contributes to decreasing bias, and a ratio below one indicates that the risk premium was not correctly estimated.

Maturity 2 Year 3 Year 5 Year 7 Year 10 Year

Model 1-Month Forecasting Horizon

AFSV -0.9/15.2 6.9/19.8 1.6/23.7 -20.2/34.4 25.8/37.6

RW 4.7/16.0 5.4/20.0 6.0/24.5 6.3/26.4 6.4/27.5

DL 3.7/15.9 2.0/19.2 5.8/24.2 8.9/26.6 6.7/27.4

Model 6-Month Forecasting Horizon

AFSV 15.1/43.7 17.4/52.7 9.5/60.1 -2.5/65.8 87.6/114.8

RW 22.1/46.3 24.0/55.9 27.5/67.0 29.5/72.2 30.2/74.7

DL 39.1/56.9 37.9/62.6 44.2/73.4 49.6/80.0 49.2/82.1

Model 12-Month Forecasting Horizon

AFSV 35.2/64.1 26.8/67.8 4.1/72.5 -13.9/78.1 91.6/127.5

RW 29.4/58.6 30.4/68.5 34.8/81.6 38.2/87.7 39.8/90.4

DL 76.7/90.1 73.8/92.4 80.6/103.7 87.2/111.5 87.9/113.7

Table 8: Bias and Root Mean Square Errors for Out-of-Sample Forecasting (in bps): Comparisons with the Random Walk and Diebold and Li (2006) models

This table presents bias (first number in each cell) and RMSE (second number in each cell) for 1-month, 6-month, and 12-month ahead out-of-sample forecasts for the RW and DL models, and compares them to the AFSV polynomial model. The in-sample period ranges from January 1972 to December 1994. The out-of-sample period ranges from January 1995 to December 1998. For a fixed forecasting horizon (1-month, 6-month, 12-month), the smallest absolute bias and smallest RMSE across models appear in bold.

Figure 1: The First Four Legendre Polynomials

This picture depicts the first four Legendre polynomials, which are respectively $P_0(x) = 1$, $P_1(x) = x$, $P_2(x) = \frac{1}{2}(3x^2 - 1)$, and $P_3(x) = \frac{1}{2}(5x^3 - 3x)$, defined within the interval $[-1, 1]$.

Figure 2: Dynamic Factors in the AFG Polynomial Model

This picture presents the time series of the six factors estimated under the AFG

model version. The left-hand side factors are the three lower order factors with non-

null diﬀusions, respectively capturing “level”, “slope” and “curvature” movements.

The right-hand side factors are the three conditionally deterministic higher order

factors, respectively related to the Legendre polynomials of degree 3, 4 and 5. In-

sample period ranges from January 1972 to December 1994. Out-of-sample period

ranges from January 1995 to December 1998.


Figure 3: Dynamic Factors in the AFSV Polynomial Model

This picture presents the time series of the six factors estimated under the AFSV

model. The left-hand side factors are the three lower order factors with non-null

diﬀusions, respectively capturing “level”, “slope” and “curvature” movements. The

curvature (third) factor drives stochastic volatility of the three lower order factors.

The right-hand side factors are the three conditionally deterministic higher order

factors, respectively related to the Legendre polynomials of degree 3, 4 and 5. In-

sample period ranges from January 1972 to December 1994. Out-of-sample period

ranges from January 1995 to December 1998.


Figure 4: Distance Between Factors from CS and Arbitrage-Free

Versions

This picture presents the distance between the CS “level”, “slope” and “curvature”

factors, and the same factors under each arbitrage-free version of the polynomial

model. Blue full line captures the distance between a CS factor and the corre-

sponding AFSV factor. Red dashed line captures the distance between a CS factor

and the corresponding AFG factor. In-sample period ranges from January 1972 to

December 1994.


Figure 5: Out-of-Sample Averaged Forecasts and Observed Yield

Curves

This picture presents a spline version of the observed yield curve (2-, 3-, 5-, 7-, and

10- year maturities) averaged across the out-of-sample period (from Jan. 95 to Dec.

98), and corresponding averaged yield curves implied by the diﬀerent versions of the

polynomial model. Blue solid line represents the observed yield curve, dotted red

line represents the CS version, cyan dash-dotted line represents the AFG version,

and black dashed line represents the AFSV version. In-sample period ranges from

January 1972 to December 1994.


Figure 6: 12-Month Ahead Out-of-Sample Forecasting of the 2-Year

Yield

This picture presents out-of-sample time series of observed yields, model-implied forward rates, and model-implied bond risk premium, for different forecasting horizons. The dotted line represents observed yields. The solid line represents the model forecast given by Equation (17). The dashed line represents the model-implied forward rate. The in-sample period ranges from January 1972 to December 1994. For the h-month forecasting horizon, the out-of-sample period ranges from the hth month of 1995 to December 1998.

Figure 7: Bond Risk Premium for Diﬀerent Maturities and Fore-

casting Horizons

This picture presents the time series of bond risk premium implied by each model

version, for diﬀerent maturities and forecasting horizons. Cyan solid line represents

AFG, black dashed line represents AFSV, and red dotted line represents CS. In-

sample period ranges from January 1972 to December 1994. Out-of-sample period

ranges from January 1995 to December 1998.
