arXiv:1506.02940v1 [stat.AP] 4 Jun 2015
Modern Stochastics: Theory and Applications 2 (2015) 51–65
DOI: 10.15559/15-VMSTA22
Autoregressive approaches to import–export time
series I: basic techniques
Luca Di Persio
Dept. of Informatics, University of Verona, strada le Grazie 15, 37134, Italy
dipersioluca@gmail.com (L. Di Persio)
Received: 9 February 2015, Revised: 8 April 2015, Accepted: 8 April 2015,
Published online: 20 April 2015
Abstract This work is the first part of a project dealing with an in-depth study of effective techniques used in econometrics to make accurate forecasts in the concrete framework of one of the major economies of the most productive Italian area, namely the province of Verona. In particular, we develop an approach mainly based on vector autoregressions, where lagged values of two or more variables are considered, Granger causality, and the stochastic trend approach useful to work with the cointegration phenomenon. The latter techniques constitute the core of the present paper, whereas in the second part of the project we present how these approaches can be applied to the economic data at our disposal in order to obtain concrete analyses of import–export behavior for the considered productive area of Verona.
Keywords Econometric time series, autoregressive models, Granger causality, cointegration, stochastic nonstationarity, AIC and BIC criteria, trends and breaks
1 Introduction
The analysis of time series data constitutes a key ingredient in econometric studies. Recent years have been characterized by an increasing interest in the study of econometric time series. Although various types of regression analysis and related forecast methods are rather old, the worldwide financial crisis experienced by markets starting from the last months of 2007, and not yet finished, has drawn more attention to the subject. Moreover, analysis and forecast problems have become of great importance even for medium and small enterprises, since their economic sustainability is strictly related to the propensity of a bank to grant credit at reasonable conditions.
© 2015 The Author(s). Published by VTeX. Open access article under the CC BY license.
www.i-journals.org/vmsta
In particular, great efforts have been made to read economic data not as monads, but rather as constituent pieces of a whole. Namely, new techniques have been developed to study interconnections and dependencies between the different factors characterizing the economic history of a certain market, a given firm, a specified industrial area, and so on. From this point of view, methods such as vector autoregression, the cointegration approach, and copula techniques have benefited from new research impulses.
A challenging problem is then to apply such instruments in concrete situations, and the problem becomes even harder if we take into account economies that are hard hit by the aforementioned crisis. A particularly important case study is constituted by a close analysis of import–export time series. In fact, such information, spanning from countries to small firms, has the characteristic of providing highly interesting hints for people, for example, politicians or CEOs, who depict future economic scenarios and related investment plans for the markets in which they are involved.
Exploiting the precious economic data that the Chamber of Commerce of the Verona Province has put at our disposal, we successfully applied some of the relevant approaches already cited to find dependencies between the economic factors characterizing the Province's economy and then to make effective forecasts, very close to the real behavior of the studied markets.
For completeness, we have split our project into two parts, namely the present one, which aims at giving a self-contained introduction to the statistical techniques of interest, and the second one, where the Verona import–export case study is treated in detail.
In what follows, we first recall univariate time series models, paying particular attention to the AR model, which relates a time series to its past values. We explain how to make predictions using these models, how to choose the number of lags, for example, using the Akaike and Bayesian information criteria (AIC and BIC, respectively), and how to behave in the presence of trends or structural breaks. Then we move to the vector autoregression (VAR) model, in which lagged values of two or more variables are used to forecast future values of these variables. Moreover, we present Granger causality, and, in the last part, we return to the topic of stochastic trends, introducing the phenomenon of cointegration.
2 Univariate time-series models
Univariate models have been widely used for short-run forecasts (see, e.g., [6, Examples of Chapter 2]). In what follows, we recall some of these techniques, focusing particularly on the analysis of autoregressive (AR) processes, moving-average (MA) processes, and a combination of both types, the so-called ARMA processes; for further details, see, for example, [3,2,8] and references therein.
The observation on the time-series variable $Y$ made at date $t$ is denoted by $Y_t$, whereas $T \in \mathbb{N}^+$ indicates the total number of observations. Moreover, we denote the $j$th lag of a time series $\{Y_t\}_{t=0,\dots,T}$ by $Y_{t-j}$ (the value of the variable $Y$ $j$ periods ago); similarly, $Y_{t+j}$ denotes the value of $Y$ $j$ periods in the future, where, for any fixed $t \in \{0,\dots,T\}$, $j$ is such that $j \in \mathbb{N}^+$, $t - j \ge 0$, and $t + j \le T$. The $j$th autocovariance of a series $Y_t$ is the covariance between $Y_t$ and its $j$th lag, that is, $\text{autocovariance}_j = \sigma_j := \mathrm{cov}(Y_t, Y_{t-j})$, whereas the $j$th autocorrelation coefficient is the correlation between $Y_t$ and $Y_{t-j}$, that is,
\[ \text{autocorrelation}_j = \rho_j := \mathrm{corr}(Y_t, Y_{t-j}) = \frac{\mathrm{cov}(Y_t, Y_{t-j})}{\sqrt{\mathrm{var}(Y_t)\,\mathrm{var}(Y_{t-j})}}. \]
When the average and variance of a variable are unknown, we can estimate them by taking a random sample of $n$ observations. In a simple random sample, $n$ objects are drawn at random from a population, and each object is equally likely to be drawn. The value of the random variable $Y$ for the $i$th randomly drawn object is denoted $Y_i$. Because each object is equally likely to be drawn and the distribution of $Y_i$ is the same for all $i$, the random variables $Y_1,\dots,Y_n$ are independent and identically distributed (i.i.d.). Given a variable $Y$, we denote by $\bar{Y}$ its sample average with respect to the $n$ observations $Y_1,\dots,Y_n$, that is, $\bar{Y} = \frac{1}{n}(Y_1 + Y_2 + \dots + Y_n) = \frac{1}{n}\sum_{i=1}^{n} Y_i$, whereas we define the related sample variance by $s_Y^2 := \frac{1}{n-1}\sum_{i=1}^{n} (Y_i - \bar{Y})^2$. The $j$th autocovariances, resp. autocorrelations, can be estimated by the $j$th sample autocovariances, resp. autocorrelations, as follows:
\[ \hat{\sigma}_j := \frac{1}{T}\sum_{t=j+1}^{T} \bigl(Y_t - \bar{Y}_{j+1,T}\bigr)\bigl(Y_{t-j} - \bar{Y}_{1,T-j}\bigr), \qquad \hat{\rho}_j := \frac{\hat{\sigma}_j}{s_Y^2}, \]
where $\bar{Y}_{j+1,T}$ denotes the sample average of $Y_t$ computed over the observations $t = j+1,\dots,T$. Concerning forecasts based on regression models that relate a time
series variable to its past values, for completeness, we shall start with the first-order autoregressive process, namely the AR(1) model, which uses $Y_{t-1}$ to forecast $Y_t$. A systematic way to forecast is to estimate an ordinary least squares (OLS) regression. The OLS estimator chooses the regression coefficients so that the estimated regression line is as close as possible to the observed data, where closeness is measured by the sum of the squared mistakes made in predicting $Y_t$ given $Y_{t-1}$. Hence, the AR(1) model for the series $Y_t$ is given by
\[ Y_t = \beta_0 + \beta_1 Y_{t-1} + u_t, \tag{1} \]
where $\beta_0$ and $\beta_1$ are the regression coefficients. In this case, the intercept $\beta_0$ is the value of the regression line when $Y_{t-1} = 0$, the slope $\beta_1$ represents the change in $Y_t$ associated with a unit change in $Y_{t-1}$, and $u_t$ denotes the error term, whose nature will be clarified later. Let us assume that the value $Y_{t_0}$ of the time series $Y_t$ at initial time $t_0$ is given; then $Y_{t_0+1} = \beta_0 + \beta_1 Y_{t_0} + u_{t_0+1}$, so that, iterating relation (1) up to order $\tau > 0$, we get
\[
\begin{aligned}
Y_{t_0+\tau} &= \bigl(1 + \beta_1 + \beta_1^2 + \dots + \beta_1^{\tau-1}\bigr)\beta_0 + \beta_1^{\tau} Y_{t_0} + \beta_1^{\tau-1} u_{t_0+1} + \beta_1^{\tau-2} u_{t_0+2} + \dots + \beta_1 u_{t_0+\tau-1} + u_{t_0+\tau} \\
&= \beta_1^{\tau} Y_{t_0} + \frac{1 - \beta_1^{\tau}}{1 - \beta_1}\,\beta_0 + \sum_{j=0}^{\tau-1} \beta_1^{j} u_{t_0+\tau-j}.
\end{aligned}
\]
Hence, taking $t = t_0 + \tau$ with $t_0 = 0$, we obtain
\[ Y_t = \beta_1^{t} Y_0 + \frac{1 - \beta_1^{t}}{1 - \beta_1}\,\beta_0 + \sum_{j=0}^{t-1} \beta_1^{j} u_{t-j}. \tag{2} \]
A time series $Y_t$ is called stationary if its probability distribution does not change over time, that is, if the joint distribution of $(Y_{s+1}, Y_{s+2}, \dots, Y_{s+T})$ does not depend on $s$; otherwise, $Y_t$ is said to be nonstationary. In (2), the process $Y_t$ consists of both time-dependent deterministic and stochastic parts, and thus it cannot be stationary. Formally, the process with stochastic initial conditions results from (2) if and only if $|\beta_1| < 1$. It follows that if $\lim_{t_0 \to -\infty} Y_{t_0}$ is bounded, then, as $t_0 \to -\infty$, we have
\[ Y_t = \frac{\beta_0}{1 - \beta_1} + \sum_{j=0}^{\infty} \beta_1^{j} u_{t-j}; \tag{3} \]
see, for example, [6, Chap. 2.1.1]. Equation (3) can be rewritten by means of the lag operator, which acts as follows: $L Y_t = Y_{t-1}$, $L^2 Y_t = Y_{t-2}$, ..., $L^k Y_t = Y_{t-k}$, so that Eq. (1) becomes $(1 - \beta_1 L) Y_t = \beta_0 + u_t$. Assuming that $E[u_t] = 0$ for all $t$, we have
\[ E[Y_t] = E\Bigl[\frac{\beta_0}{1 - \beta_1} + \sum_{j=0}^{\infty} \beta_1^{j} u_{t-j}\Bigr] = \frac{\beta_0}{1 - \beta_1} + \sum_{j=0}^{\infty} \beta_1^{j} E[u_{t-j}] = \frac{\beta_0}{1 - \beta_1} = \mu, \]
\[
\begin{aligned}
V[Y_t] &= E\Bigl[\Bigl(Y_t - \frac{\beta_0}{1 - \beta_1}\Bigr)^2\Bigr] = E\Bigl[\Bigl(\sum_{j=0}^{\infty} \beta_1^{j} u_{t-j}\Bigr)^2\Bigr] \\
&= E\bigl[\bigl(u_t + \beta_1 u_{t-1} + \beta_1^2 u_{t-2} + \dots\bigr)^2\bigr] \\
&= E\bigl[u_t^2 + \beta_1^2 u_{t-1}^2 + \beta_1^4 u_{t-2}^2 + \dots + 2\beta_1 u_t u_{t-1} + 2\beta_1^2 u_t u_{t-2} + \dots\bigr] \\
&= \sigma^2\bigl(1 + \beta_1^2 + \beta_1^4 + \dots\bigr) = \frac{\sigma^2}{1 - \beta_1^2},
\end{aligned}
\]
where we have used that $E[u_t u_s] = 0$ for $t \ne s$ and $|\beta_1| < 1$. Hence, both the mean and variance are constant, and thus the covariances are given by
\[
\begin{aligned}
\mathrm{Cov}[Y_t, Y_{t-\tau}] &= E\Bigl[\Bigl(Y_t - \frac{\beta_0}{1 - \beta_1}\Bigr)\Bigl(Y_{t-\tau} - \frac{\beta_0}{1 - \beta_1}\Bigr)\Bigr] \\
&= E\bigl[\bigl(u_t + \beta_1 u_{t-1} + \dots + \beta_1^{\tau} u_{t-\tau} + \dots\bigr)\bigl(u_{t-\tau} + \beta_1 u_{t-\tau-1} + \beta_1^2 u_{t-\tau-2} + \dots\bigr)\bigr] \\
&= E\bigl[\bigl(u_t + \beta_1 u_{t-1} + \dots + \beta_1^{\tau-1} u_{t-\tau+1} + \beta_1^{\tau}\bigl(u_{t-\tau} + \beta_1 u_{t-\tau-1} + \beta_1^2 u_{t-\tau-2} + \dots\bigr)\bigr) \\
&\qquad \times \bigl(u_{t-\tau} + \beta_1 u_{t-\tau-1} + \beta_1^2 u_{t-\tau-2} + \dots\bigr)\bigr] \\
&= \beta_1^{\tau}\, E\bigl[\bigl(u_{t-\tau} + \beta_1 u_{t-\tau-1} + \beta_1^2 u_{t-\tau-2} + \dots\bigr)^2\bigr] = \beta_1^{\tau}\, V[Y_{t-\tau}] \\
&= \beta_1^{\tau}\, \frac{\sigma^2}{1 - \beta_1^2} =: \gamma(\tau).
\end{aligned}
\]
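The stationary moments derived above can be checked numerically. The following is a minimal sketch (all parameter values are illustrative assumptions) that simulates a long AR(1) path and compares the sample mean and variance with the theoretical values $\mu = \beta_0/(1-\beta_1)$ and $\sigma^2/(1-\beta_1^2)$:

```python
import random
import statistics

# Simulate a stationary AR(1): Y_t = beta0 + beta1 * Y_{t-1} + u_t with |beta1| < 1.
# Parameter values are arbitrary illustrative choices.
random.seed(0)
beta0, beta1, sigma = 1.0, 0.5, 1.0
mu = beta0 / (1 - beta1)            # theoretical mean: 2.0
var = sigma**2 / (1 - beta1**2)     # theoretical variance: 4/3

y = mu                              # start at the mean (stochastic initial condition)
sample = []
for _ in range(200_000):
    y = beta0 + beta1 * y + random.gauss(0.0, sigma)
    sample.append(y)

print(statistics.mean(sample))      # should be near mu
print(statistics.pvariance(sample)) # should be near var
```

With a path of this length, the sample moments agree with the closed-form expressions to about two decimal places.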
The previous AR(1) can be generalized by considering an arbitrary but finite order $p > 1$. In particular, an AR(p) process can be described by the equation
\[ Y_t = \beta_0 + \beta_1 Y_{t-1} + \beta_2 Y_{t-2} + \dots + \beta_p Y_{t-p} + u_t, \tag{4} \]
where $\beta_0,\dots,\beta_p$ are constants, whereas $u_t$ is the error term represented by a random variable with zero mean and variance $\sigma^2 > 0$. Using the lag operator, we can rewrite Eq. (4) as $(1 - \beta_1 L - \beta_2 L^2 - \dots - \beta_p L^p) Y_t = \beta_0 + u_t$. In such a framework, it is standard to assume that the following four properties hold (see, e.g., [7, Chap. 14.4]):
- $u_t$ has conditional mean zero given all the regressors, that is, $E(u_t \mid Y_{t-1}, Y_{t-2}, \dots) = 0$, which implies that the best forecast of $Y_t$ is given by the AR(p) regression.
- $Y_i$ has a stationary distribution, and $Y_i$ and $Y_{i-j}$ are assumed to become independent as $j$ gets large. If the time-series variables are nonstationary, then the forecast can be biased and inefficient, or conventional OLS-based statistical inferences can be misleading.
- All the variables have nonzero finite fourth moments.
- There is no perfect multicollinearity, namely no regressor is a perfect linear function of the other regressors.
2.1 Forecasts
In this section, we show how the previously introduced class of models can be used to predict the future behavior of a certain quantity of interest. If $Y_t$ follows the AR(p) model and $\beta_0, \beta_1, \dots, \beta_p$ are unknown, then the forecast of $Y_{T+1}$ is given by $\beta_0 + \beta_1 Y_T + \beta_2 Y_{T-1} + \dots + \beta_p Y_{T-p+1}$. Forecasts must be based on estimates of the coefficients $\beta_i$ obtained with the OLS estimators on historical data. Let $\hat{Y}_{T+1|T}$ denote the forecast of $Y_{T+1}$ based on $Y_T, Y_{T-1}, \dots$:
\[ \hat{Y}_{T+1|T} = \hat{\beta}_0 + \hat{\beta}_1 Y_T + \hat{\beta}_2 Y_{T-1} + \dots + \hat{\beta}_p Y_{T-p+1}. \]
Such a forecast refers to data beyond the data set used to estimate the regression, so that the actual value of the forecasted dependent variable is not in the sample used to estimate the regression: forecasts and forecast errors pertain to "out-of-sample" observations.
The forecast error is the mistake made by the forecast, that is, the difference between the value of $Y_{T+1}$ that actually occurred and its forecasted value:
\[ \text{forecast error} := Y_{T+1} - \hat{Y}_{T+1|T}. \]
The root mean squared forecast error (RMSFE) is a measure of the size of the forecast error,
\[ \text{RMSFE} = \sqrt{E\bigl[\bigl(Y_{T+1} - \hat{Y}_{T+1|T}\bigr)^2\bigr]}, \]
and it is characterized by two sources of error: the error arising because future values of $u_t$ are unknown and the error in estimating the coefficients $\beta_i$. If the first source of error is much larger than the second, the RMSFE is approximately $\sqrt{\mathrm{var}(u_t)}$, the standard deviation of the error $u_t$, which is estimated by the standard error of the regression (SER). A useful application in time-series forecasting is to test whether the lags of one regressor have useful predictive content. The claim that a variable has no predictive content corresponds to the null hypothesis that the coefficients on all lags of that variable are zero. Such a hypothesis can be checked by the so-called Granger causality test (GCT), a type of F-statistic approach used to test joint hypotheses about regression coefficients. In particular, the GCT method tests the hypothesis that the coefficients of all the lagged values of the variable in $Y_t = \beta_0 + \beta_1 Y_{t-1} + \beta_2 Y_{t-2} + \dots + \beta_p Y_{t-p} + u_t$, namely the coefficients of $Y_{t-1}, Y_{t-2}, \dots, Y_{t-p}$, are zero; hence, this null hypothesis implies that such regressors have no predictive content for $Y_t$.
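The estimation-and-forecast step for the AR(1) case can be sketched in a few lines of pure Python (the data-generating parameters below are assumptions for illustration): the OLS slope has the usual closed form, and the one-step-ahead out-of-sample forecast is then $\hat{Y}_{T+1|T} = \hat{\beta}_0 + \hat{\beta}_1 Y_T$.

```python
import random

# Estimate an AR(1) by OLS and form the one-step-ahead forecast
# Y_hat_{T+1|T} = b0 + b1 * Y_T. Illustrative sketch with assumed parameters.
random.seed(1)
beta0, beta1, sigma = 0.5, 0.8, 1.0

# Simulate T observations of Y_t = beta0 + beta1 * Y_{t-1} + u_t.
T = 5000
y = [beta0 / (1 - beta1)]
for _ in range(T):
    y.append(beta0 + beta1 * y[-1] + random.gauss(0.0, sigma))

# OLS regression of Y_t on (1, Y_{t-1}): closed-form slope and intercept.
x, z = y[:-1], y[1:]                 # regressor Y_{t-1} and regressand Y_t
xbar = sum(x) / len(x)
zbar = sum(z) / len(z)
b1 = (sum((xi - xbar) * (zi - zbar) for xi, zi in zip(x, z))
      / sum((xi - xbar) ** 2 for xi in x))
b0 = zbar - b1 * xbar

forecast = b0 + b1 * y[-1]           # out-of-sample forecast of Y_{T+1}
print(b0, b1, forecast)
```

The estimated coefficients should be close to the true $(\beta_0, \beta_1) = (0.5, 0.8)$ used to generate the data.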
2.2 Lag length selection
Let us recall the relevant statistical methods used to optimally choose the number of lags in an autoregression model; in particular, we focus our attention on the Bayes information criterion (BIC) and on the Akaike information criterion (AIC); for more details, see, for example, [7, Chap. 14.5]. The BIC is specified by
\[ \mathrm{BIC}(p) = \ln\Bigl(\frac{\mathrm{SSR}(p)}{T}\Bigr) + (p+1)\,\frac{\ln T}{T}, \tag{5} \]
where $\mathrm{SSR}(p)$ is the sum of squared residuals of the estimated AR(p). The BIC estimator of $p$ is the value that minimizes $\mathrm{BIC}(p)$ among all the possible choices. In the first term of Eq. (5), the sum of squared residuals necessarily decreases when adding a lag. In contrast, the second term is the number of estimated regression coefficients times the factor $(\ln T)/T$, so this term increases when adding a lag: the BIC trades off these two effects. The AIC is defined by
\[ \mathrm{AIC}(p) = \ln\Bigl(\frac{\mathrm{SSR}(p)}{T}\Bigr) + (p+1)\,\frac{2}{T}, \]
so the main difference between the AIC and BIC is that the term $\ln T$ in the BIC is replaced by $2$ in the AIC, making the second term smaller. However, the second term in the AIC is not large enough to ensure choosing the correct lag length, so this estimator of $p$ is not consistent. We recall that an estimator is consistent if, as the size of the sample increases, its probability distribution concentrates at the value of the parameter to be estimated. Thus, the BIC estimator $\hat{p}$ of the lag length in an autoregression is correct in large samples, that is, $\Pr(\hat{p} = p) \to 1$. This is not true for the AIC estimator, which can overestimate $p$ even in large samples; for the proof, see, for example, [7, Appendix 14.5].
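The trade-off in Eq. (5) is easy to see numerically. The sketch below (the SSR values are hypothetical numbers, as if obtained from fitted AR(p) models) computes BIC and AIC over candidate lag lengths and picks the minimizer:

```python
import math

# Lag-length selection via the BIC of Eq. (5) and the AIC.
def bic(ssr: float, p: int, T: int) -> float:
    return math.log(ssr / T) + (p + 1) * math.log(T) / T

def aic(ssr: float, p: int, T: int) -> float:
    return math.log(ssr / T) + (p + 1) * 2 / T

T = 200
# Hypothetical SSR(p) values: they must decrease as lags are added,
# but the gains shrink quickly beyond p = 2.
ssr_by_lag = {1: 310.0, 2: 250.0, 3: 248.0, 4: 247.5}
p_hat = min(ssr_by_lag, key=lambda p: bic(ssr_by_lag[p], p, T))
print(p_hat)  # 2: the BIC penalty outweighs the small SSR gains from p >= 3
```

Because the AIC penalty $2/T$ is smaller than $(\ln T)/T$ for $T > e^2$, the AIC tends to select lag lengths at least as large as the BIC's choice.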
2.3 Trends
A further relevant topic in econometric analysis is constituted by nonstationarities
that are due to trends and breaks. A trend is a persistent long-term movement of
a variable over time. A time-series variable fluctuates around its trend. There are
two types of trends, deterministic and stochastic. A deterministic trend is a non-
random function of time. In contrast, a stochastic trend is characterized by a ran-
dom behavior over time. Our treatment of trends in economic time series focuses on
stochastic trends. One of the simplest models of a time series with a stochastic trend is the one-dimensional random walk, defined by the relation $Y_t = Y_{t-1} + u_t$, where $u_t$ is the error term represented by a normally distributed random variable with zero mean and variance $\sigma^2 > 0$. In this case, the best forecast of tomorrow's value is its value today. An extension of the latter is the random walk with drift, defined by $Y_t = \beta_0 + Y_{t-1} + u_t$, $\beta_0 \in \mathbb{R}$, where the best forecast is the value of the series today plus the drift $\beta_0$. A random walk is nonstationary because its variance increases over time, so the distribution of $Y_t$ changes over time. In fact, since $u_t$ is uncorrelated with $Y_{t-1}$, we have $\mathrm{var}(Y_t) = \mathrm{var}(Y_{t-1}) + \mathrm{var}(u_t)$, with $\mathrm{var}(Y_t) = \mathrm{var}(Y_{t-1})$ if and only if $\mathrm{var}(u_t) = 0$. The random walk is a particular case of an AR(1) model with $\beta_1 = 1$. If $|\beta_1| < 1$ and $u_t$ is stationary, then $Y_t$ is stationary. The condition for the stationarity of an AR(p) model is that the roots of $1 - \beta_1 z - \beta_2 z^2 - \beta_3 z^3 - \dots - \beta_p z^p = 0$ are greater than one in absolute value.
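The growth of a random walk's variance can be illustrated directly: with $Y_0 = 0$ and i.i.d. increments, $\mathrm{var}(Y_t) = t\,\sigma^2$. The following sketch (sample sizes and $\sigma$ are illustrative assumptions) estimates the variance at two dates across many simulated paths:

```python
import random
import statistics

# var(Y_t) = t * sigma^2 for a driftless random walk started at Y_0 = 0:
# estimate the cross-sectional variance at t = 10 and t = 50.
random.seed(6)
n_paths, t_max, sigma = 20_000, 50, 1.0
vals_t10, vals_t50 = [], []
for _ in range(n_paths):
    y = 0.0
    for t in range(1, t_max + 1):
        y += random.gauss(0.0, sigma)
        if t == 10:
            vals_t10.append(y)
    vals_t50.append(y)

print(statistics.pvariance(vals_t10))  # near 10
print(statistics.pvariance(vals_t50))  # near 50
```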
If an AR(p) has a root equal to one, then we say that the series has a unit root and a stochastic trend. Stochastic trends cause many issues; for example, the autoregressive coefficients are biased toward zero. Because $Y_t$ is nonstationary, the assumptions for time-series regression do not hold, and we cannot rely on estimators and test statistics having their usual large-sample normal distributions; see, for example, [7, Chap. 3.2]. In fact, the OLS estimator of the autoregressive coefficient $\hat{\beta}_1$ is consistent, but it has a nonnormal distribution, and the asymptotic distribution of $\hat{\beta}_1$ is shifted toward zero. Another problem caused by a stochastic trend is the nonnormal distribution of the t-statistic, which means that conventional confidence intervals are not valid and hypothesis tests cannot be conducted as usual. The t-statistic is an important example of a test statistic, namely of a statistic used to perform a hypothesis test. A statistical hypothesis test can make two types of mistakes: a type I error, in which the null hypothesis is rejected when, in fact, it is true, and a type II error, in which the null hypothesis is not rejected when, in fact, it is false. The prespecified rejection probability of a statistical hypothesis test when the null hypothesis is true, that is, the prespecified probability of a type I error, is called the significance level of the test. The critical value of the test statistic is the value of the statistic for which the test just rejects the null hypothesis at the given significance level. The p-value is the probability of obtaining a test statistic, by random sampling variation, at least as adverse to the null hypothesis value as the statistic actually observed, assuming that the null hypothesis is correct. Equivalently, the p-value is the smallest significance level at which the null hypothesis can be rejected. The value of the t-statistic is
\[ t = \frac{\text{estimator} - \text{hypothesized value}}{\text{standard error of the estimator}} \]
and is well approximated by the standard normal distribution when $n$ is large because of the central limit theorem (see, e.g., [1, Chap. 4.3]). Moreover, stochastic trends can
lead two time series to appear related when they are not, a problem called spurious
regression (see, e.g., [5, Chap. 2] for examples). For the AR(1) model, the most commonly used test to detect stochastic trends is the Dickey–Fuller test (see, e.g., [5, Chap. 3] for details). For this test, we first subtract $Y_{t-1}$ from both sides of the equation $Y_t = \beta_0 + \beta_1 Y_{t-1} + u_t$. Then we consider the hypothesis test
\[ H_0: \delta = 0 \quad \text{versus} \quad H_1: \delta < 0 \quad \text{in} \quad Y_t - Y_{t-1} = \Delta Y_t = \beta_0 + \delta Y_{t-1} + u_t \]
with $\delta = \beta_1 - 1$. For an AR(p) model, it is standard to use the augmented Dickey–Fuller test (ADF), which tests the null hypothesis $H_0: \delta = 0$ against the one-sided alternative $H_1: \delta < 0$ in the regression
\[ \Delta Y_t = \beta_0 + \delta Y_{t-1} + \gamma_1 \Delta Y_{t-1} + \gamma_2 \Delta Y_{t-2} + \dots + \gamma_p \Delta Y_{t-p} + u_t. \]
Let us note that, under the null hypothesis, $Y_t$ has a stochastic trend, whereas, under the alternative hypothesis, $Y_t$ is stationary. The ADF statistic is the OLS t-statistic testing $\delta = 0$. If, instead, the alternative hypothesis is that $Y_t$ is stationary around a deterministic linear time trend, then this trend $t$ must be added as an additional regressor. In this case, the Dickey–Fuller regression becomes
\[ \Delta Y_t = \beta_0 + \alpha t + \delta Y_{t-1} + \gamma_1 \Delta Y_{t-1} + \gamma_2 \Delta Y_{t-2} + \dots + \gamma_p \Delta Y_{t-p} + u_t, \]
and we test for $\delta = 0$. The ADF statistic does not have a normal distribution, and hence different critical values have to be used.
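The non-augmented Dickey–Fuller regression $\Delta Y_t = \beta_0 + \delta Y_{t-1} + u_t$ can be sketched as follows (the data-generating AR(1) and its coefficients are assumptions for illustration); for a stationary series the t-statistic on $\hat{\delta}$ should fall well below the standard 5% Dickey–Fuller critical value of about $-2.86$:

```python
import math
import random

# t-statistic of delta-hat in the Dickey-Fuller regression
# Delta Y_t = beta0 + delta * Y_{t-1} + u_t (OLS with one regressor plus intercept).
def df_tstat(y):
    x = y[:-1]                                         # Y_{t-1}
    d = [y[t] - y[t - 1] for t in range(1, len(y))]    # Delta Y_t
    n = len(x)
    xbar, dbar = sum(x) / n, sum(d) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    delta = sum((xi - xbar) * (di - dbar) for xi, di in zip(x, d)) / sxx
    b0 = dbar - delta * xbar
    ssr = sum((di - b0 - delta * xi) ** 2 for xi, di in zip(x, d))
    se = math.sqrt(ssr / (n - 2) / sxx)                # standard error of delta-hat
    return delta / se

random.seed(2)
y = [0.0]
for _ in range(1000):            # stationary AR(1) with beta1 = 0.5, i.e. delta = -0.5
    y.append(0.5 * y[-1] + random.gauss(0.0, 1.0))
print(df_tstat(y))               # strongly negative: reject the unit-root null
```

Running the same function on a simulated random walk instead would typically give a t-statistic near zero, so the unit-root null would not be rejected.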
2.4 Breaks
A second type of nonstationarity arises when the regression function changes over the
course of the sample. In economics, this can occur for a variety of reasons, such as
changes in economic policy, changes in the structure of the economy, or an invention
that changes a specific industry. These breaks cannot be neglected by the regression
model. A problem caused by breaks is that the OLS regression estimates over the
full sample will estimate a relationship that holds “on average,” in the sense that the
estimate combines two different periods, and this leads to poor forecasts. There are
two types of testing for breaks: testing for a break at a known date and for a break
at an unknown break date. We consider the first option for an AR(p) model. Let $\tau$ denote the hypothesized break date, and let $D_t(\tau)$ be the binary variable such that $D_t(\tau) = 0$ if $t \le \tau$ and $D_t(\tau) = 1$ if $t > \tau$. Then the regression including the binary break indicator and all interaction terms reads as follows:
\[ Y_t = \beta_0 + \beta_1 Y_{t-1} + \beta_2 Y_{t-2} + \dots + \beta_p Y_{t-p} + \gamma_0 D_t(\tau) + \gamma_1 D_t(\tau) Y_{t-1} + \gamma_2 D_t(\tau) Y_{t-2} + \dots + \gamma_p D_t(\tau) Y_{t-p} + u_t, \]
where, under the null hypothesis of no break, $\gamma_0 = \gamma_1 = \gamma_2 = \dots = \gamma_p = 0$. Under the alternative hypothesis that there is a break, the regression function is different before and after the break date $\tau$, and the null can be tested with the F-statistic, performing the so-called Chow test (see, e.g., [6, Chap. 5.3.3]). If we suspect a break between two dates $\tau_0$ and $\tau_1$, the Chow test can be modified to test for breaks at all possible dates $\tau$ between $\tau_0$ and $\tau_1$, then using the largest of the resulting F-statistics to test for a break at an unknown date. The latter technique is called the Quandt likelihood ratio (QLR) statistic (see, e.g., [7, Chap. 14.7]). Because the QLR statistic is the largest of many F-statistics, its distribution is not the same as that of an individual F-statistic, and the critical values for the QLR statistic must be obtained from a special distribution.
3 MA and ARMA
In the following, we consider finite-order moving-average (MA) processes (see, e.g., [6, Chap. 2.2]). The moving-average process of order q, MA(q), is defined by
\[ Y_t = \alpha_0 + u_t - \alpha_1 u_{t-1} - \alpha_2 u_{t-2} - \dots - \alpha_q u_{t-q}, \]
or, equivalently, by using the lag operator, $Y_t - \alpha_0 = (1 - \alpha_1 L - \alpha_2 L^2 - \dots - \alpha_q L^q)\, u_t$. Every finite MA(q) process is stationary, and we have
\[
\begin{aligned}
E[Y_t] &= \alpha_0, \\
V[Y_t] &= E\bigl[(Y_t - \alpha_0)^2\bigr] = \bigl(1 + \alpha_1^2 + \alpha_2^2 + \dots + \alpha_q^2\bigr)\sigma^2, \\
\mathrm{Cov}[Y_t, Y_{t+\tau}] &= E\bigl[(Y_t - \alpha_0)(Y_{t+\tau} - \alpha_0)\bigr] \\
&= E\bigl[u_t\bigl(u_{t+\tau} - \alpha_1 u_{t+\tau-1} - \dots - \alpha_q u_{t+\tau-q}\bigr) \\
&\qquad - \alpha_1 u_{t-1}\bigl(u_{t+\tau} - \alpha_1 u_{t+\tau-1} - \dots - \alpha_q u_{t+\tau-q}\bigr) \\
&\qquad - \dots - \alpha_q u_{t-q}\bigl(u_{t+\tau} - \alpha_1 u_{t+\tau-1} - \dots - \alpha_q u_{t+\tau-q}\bigr)\bigr].
\end{aligned}
\]
Combining both an autoregressive (AR) term of order p and a moving-average (MA) term of order q, we can define the process denoted ARMA(p, q) and represented by
\[ Y_t = \beta_0 + \beta_1 Y_{t-1} + \dots + \beta_p Y_{t-p} + u_t - \alpha_1 u_{t-1} - \dots - \alpha_q u_{t-q}; \]
again, exploiting the lag operator, we can write
\[ \bigl(1 - \beta_1 L - \beta_2 L^2 - \dots - \beta_p L^p\bigr) Y_t = \beta_0 + \bigl(1 - \alpha_1 L - \alpha_2 L^2 - \dots - \alpha_q L^q\bigr) u_t, \]
that is, $\beta(L)\, Y_t = \beta_0 + \alpha(L)\, u_t$.
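The MA(q) moment formulas above are easy to verify by simulation. A minimal sketch for an MA(1) (parameter values are illustrative assumptions): the mean should approach $\alpha_0$ and the variance $(1 + \alpha_1^2)\sigma^2$.

```python
import random
import statistics

# Simulate an MA(1): Y_t = alpha0 + u_t - alpha1 * u_{t-1}, u_t ~ N(0, sigma^2).
random.seed(3)
alpha0, alpha1, sigma = 2.0, 0.6, 1.0
u = [random.gauss(0.0, sigma) for _ in range(100_001)]
y = [alpha0 + u[t] - alpha1 * u[t - 1] for t in range(1, len(u))]

print(statistics.mean(y))       # near alpha0 = 2.0
print(statistics.pvariance(y))  # near (1 + 0.6**2) * 1.0 = 1.36
```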
4 Vector autoregression
In what follows, we focus our study on the so-called vector autoregression (VAR) econometric model, also making some remarks on the relation between the univariate time series models described in the first part and the systems of simultaneous equations of traditional econometrics characterizing the VAR approach (see, e.g., [4, Chap. 2]).
4.1 Representation of the system
We have so far considered forecasting a single variable. However, it is often necessary to allow for a multidimensional statistical analysis if we want to forecast the dynamics of more than one parameter. This section introduces a model for forecasting multiple variables, namely the vector autoregression (VAR) model, in which lagged values of two or more variables are used to forecast their future values. We start with the
autoregressive representation in a VAR model of order $p$, denoted by VAR(p), where each component depends on its own lagged values up to $p$ periods and on the lagged values of all other variables up to order $p$. The main idea behind the VAR model is to understand how new information, appearing at a certain time point and concerning one of the observed variables, is processed in the system and what impact it has over time, not only on this particular variable but also on the other system variables. Hence, a VAR(p) model is a set of $k$ time-series regressions ($k \in \mathbb{N}^+$) in which the regressors are lagged values of all $k$ series and the number of lags equals $p$ for each equation. In the case of two time-series variables, say $Y_t$ and $X_t$, the VAR(p) consists of two equations of the form
\[
\begin{cases}
Y_t = \beta_{10} + \beta_{11} Y_{t-1} + \dots + \beta_{1p} Y_{t-p} + \gamma_{11} X_{t-1} + \dots + \gamma_{1p} X_{t-p} + u_{1t}, \\
X_t = \beta_{20} + \beta_{21} Y_{t-1} + \dots + \beta_{2p} Y_{t-p} + \gamma_{21} X_{t-1} + \dots + \gamma_{2p} X_{t-p} + u_{2t},
\end{cases} \tag{6}
\]
where the $\beta$s and the $\gamma$s are unknown coefficients, and $u_{1t}$ and $u_{2t}$ are error terms represented by normally distributed random variables with zero mean and variances $\sigma_i^2 > 0$. The VAR assumptions are the same as those for the time-series regressions defining AR models, applied to each equation; moreover, the coefficients of each VAR equation are estimated by means of the OLS approach. The reduced form of a vector autoregression of order $p$ is defined as
\[ Z_t = \delta + A_1 Z_{t-1} + A_2 Z_{t-2} + \dots + A_p Z_{t-p} + U_t, \]
where the $A_i$, $i = 1,\dots,p$, are $k$-dimensional square matrices, $U_t$ represents the $k$-dimensional vector of residuals at time $t$, and $\delta$ is the vector of constant terms. System (6) can be rewritten compactly as $A_p(L)\, Z_t = \delta + U_t$, where $A_p(L) = I_k - A_1 L - A_2 L^2 - \dots - A_p L^p$, $E[U_t] = 0$, $E[U_t U_t'] = \Sigma_{uu}$, and $E[U_t U_s'] = 0$ for $t \ne s$. Such a system is stable if and only if all included variables are stationary, that is, if all roots of the characteristic equation of the lag polynomial are outside the unit circle, namely $\det(I_k - A_1 z - A_2 z^2 - \dots - A_p z^p) \ne 0$ for $|z| \le 1$ (for details, see, e.g., [6, Chap. 4.1]). We use this condition because, as we saw in Section 2.3, the condition for the stationarity of an AR(p) model is that the roots of $1 - \beta_1 z - \beta_2 z^2 - \beta_3 z^3 - \dots - \beta_p z^p = 0$ are greater than one in absolute value; if an AR(p) has a root equal to one, we say that the series has a unit root and a stochastic trend. Moreover, the previous system can be rewritten by exploiting the
MA representation as follows:
\[
\begin{aligned}
Z_t &= A^{-1}(L)\,\delta + A^{-1}(L)\, U_t \\
    &= \mu + U_t - B_1 U_{t-1} - B_2 U_{t-2} - B_3 U_{t-3} - \dots \\
    &= \mu + B(L)\, U_t
\end{aligned}
\]
with
\[ B_0 = I_k, \qquad B(L) := I - \sum_{j=1}^{\infty} B_j L^j \equiv A^{-1}(L), \qquad \mu = A^{-1}(1)\,\delta = B(1)\,\delta. \]
The autocovariance matrices are defined as $\Gamma_Z(\tau) = E[(Z_t - \mu)(Z_{t-\tau} - \mu)']$; without loss of generality, we set $\delta = 0$ and, therefore, $\mu = 0$, whence we obtain
\[ E\bigl[Z_t Z_{t-\tau}'\bigr] = A_1 E\bigl[Z_{t-1} Z_{t-\tau}'\bigr] + A_2 E\bigl[Z_{t-2} Z_{t-\tau}'\bigr] + \dots + A_p E\bigl[Z_{t-p} Z_{t-\tau}'\bigr] + E\bigl[U_t Z_{t-\tau}'\bigr] \]
and, for $\tau \ge 0$,
\[
\begin{aligned}
\Gamma_Z(\tau) &= A_1 \Gamma_Z(\tau-1) + A_2 \Gamma_Z(\tau-2) + \dots + A_p \Gamma_Z(\tau-p), \qquad \tau > 0, \\
\Gamma_Z(0) &= A_1 \Gamma_Z(-1) + A_2 \Gamma_Z(-2) + \dots + A_p \Gamma_Z(-p) + \Sigma_{uu} \\
&= A_1 \Gamma_Z(1)' + A_2 \Gamma_Z(2)' + \dots + A_p \Gamma_Z(p)' + \Sigma_{uu}.
\end{aligned}
\]
Since the autocovariance matrix entries link a variable both with its own lags and with the remaining model variables, if the autocovariance between $X$ and $Y$ is positive, then $X$ tends to move together with $Y$, and vice versa, whereas if $X$ and $Y$ are independent, their autocovariance obviously equals zero.
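For a VAR(1) with $\delta = 0$, the recursion above reduces to $\Gamma_Z(1) = A_1 \Gamma_Z(0)$, which can be checked on simulated data. A minimal sketch (the coefficient matrix and noise distribution are illustrative assumptions):

```python
import random

# Simulate a stable bivariate VAR(1), Z_t = A1 Z_{t-1} + U_t with delta = 0,
# and check the recursion Gamma_Z(1) = A1 * Gamma_Z(0) on sample autocovariances.
random.seed(4)
A1 = [[0.5, 0.1],
      [0.2, 0.3]]          # eigenvalues 0.57 and 0.23, so the system is stable

def step(z):
    u = (random.gauss(0, 1), random.gauss(0, 1))
    return (A1[0][0] * z[0] + A1[0][1] * z[1] + u[0],
            A1[1][0] * z[0] + A1[1][1] * z[1] + u[1])

z, path = (0.0, 0.0), []
for _ in range(200_000):
    z = step(z)
    path.append(z)

def gamma(tau):            # sample autocovariance matrix (mu = 0 here)
    n = len(path) - tau
    return [[sum(path[t + tau][i] * path[t][j] for t in range(n)) / n
             for j in range(2)] for i in range(2)]

g0, g1 = gamma(0), gamma(1)
pred = [[sum(A1[i][k] * g0[k][j] for k in range(2)) for j in range(2)]
        for i in range(2)]  # A1 * Gamma(0), the predicted Gamma(1)
print(g1)
print(pred)
```

The two printed matrices agree up to sampling error, confirming the first step of the Yule–Walker-type recursion.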
4.2 Determining lag lengths in VARs
An appropriate method for the lag length selection in a VAR is fundamental to determine the properties of the VAR and the related estimates. There are two main approaches for selecting or testing the lag length in VAR models. The first consists of rules of thumb based on the periodicity of the data and past experience, and the second is based on formal information criteria. VAR models typically include enough lags to capture the full cycle of the data; for monthly data, this means a minimum of 12 lags, but since some seasonality usually carries over from year to year, lag lengths of 13–15 months are often used (see, e.g., [4, Chap. 2.5]). For quarterly data, it is standard to use six lags, which in most cases captures the cyclical components in the year and any residual seasonal components. Usually, we choose the number of lags so that $kp + 1 < T$, where $k$ is the number of endogenous variables, $p$ is the lag length, and $T$ is the total number of observations. We use this limitation because estimating too many coefficients increases the amount of estimation error, which can result in a deterioration of the accuracy of the forecast itself. The lag length in a VAR can be formally determined using information criteria; let $\hat{\Sigma}_{uu}$ be the estimate of the residual covariance matrix with $(i,j)$ element $\frac{1}{T}\sum_{t=1}^{T} \hat{u}_{it}\hat{u}_{jt}$, where $\hat{u}_{it}$ is the OLS residual from the $i$th equation. The BIC for a $k$-equation VAR model is
\[ \mathrm{BIC}(p) = \ln\bigl(\det(\hat{\Sigma}_{uu})\bigr) + k(kp + 1)\,\frac{\ln T}{T}, \tag{7} \]
whereas the AIC is computed using Eq. (7) modified by replacing the term $\ln T$ by 2. Among a set of candidate values of $p$, the estimated lag length $\hat{p}$ is the value of $p$ that minimizes $\mathrm{BIC}(p)$.
4.3 Multiperiod VAR forecast
Iterated multivariate forecasts are computed using a VAR in much the same way as univariate forecasts are computed using an autoregression. The main new feature of a multivariate forecast is that the forecast of one variable depends on the forecasts of all variables in the VAR. To compute multiperiod VAR forecasts $h$ periods ahead, it is necessary to compute the forecasts of all variables for all intervening periods between $T$ and $T+h$. The following scheme then applies: compute the one-period-ahead forecasts of all the variables in the VAR, then use those forecasts to compute the two-period-ahead forecasts, and repeat the previous steps until the desired forecast horizon is reached. For example, the two-period-ahead forecast of $Y_{T+2}$ based on the two-variable VAR(p) in Eq. (6) is
\[ \hat{Y}_{T+2|T} = \hat{\beta}_{10} + \hat{\beta}_{11}\hat{Y}_{T+1|T} + \hat{\beta}_{12} Y_T + \hat{\beta}_{13} Y_{T-1} + \dots + \hat{\beta}_{1p} Y_{T-p+2} + \hat{\gamma}_{11}\hat{X}_{T+1|T} + \hat{\gamma}_{12} X_T + \hat{\gamma}_{13} X_{T-1} + \dots + \hat{\gamma}_{1p} X_{T-p+2}, \tag{8} \]
where the coefficients in (8) are the OLS estimates of the VAR coefficients.
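The iterated scheme is simplest for a VAR(1), where each horizon just plugs the previous forecast back into the system. A minimal sketch (the coefficient values below are illustrative assumptions, as if already estimated by OLS):

```python
# Iterated multiperiod forecasting with a bivariate VAR(1):
# Z_t = delta + A1 Z_{t-1} + U_t, forecast E[Z_{T+h} | Z_T].
delta = [0.2, 0.1]
A1 = [[0.5, 0.1],
      [0.2, 0.3]]

def one_step(z):
    return [delta[i] + sum(A1[i][j] * z[j] for j in range(2)) for i in range(2)]

z_T = [1.0, -0.5]          # last observed values (Y_T, X_T)
forecasts = []
z = z_T
for h in range(1, 4):      # h = 1, 2, 3 periods ahead
    z = one_step(z)        # reuse the (h-1)-step forecast as the new "data"
    forecasts.append(z)
print(forecasts)
```

With these illustrative numbers, the one-period-ahead forecast is $(0.65, 0.15)$ and the two-period-ahead forecast is $(0.54, 0.275)$.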
4.4 Granger causality
An important question in multiple time series is to assign the value of individual
variables to explain the remaining ones in the considered system of equations. An
example is the value of a variable Ytfor predicting another variable Xtin a dynamic
system of equations or understanding if the variable Ytis informative about future
values of Xt. The answer is based on the determination of the so-called Granger
causality parameter for a time-series model (for details, see, e.g., [4, Chap. 2.5.4]).
To define the concept precisely, consider the bivariate VAR model for two variables
(Yt, Xt)as in Eq. (6). Using this system of equations, Granger causality states that,
for linear models, XtGranger causes Ytif the behavior of past Ytcan better pre-
dict the behavior of Xtthan the past Xtalone. For the model in system (6), if Xt
Granger causes Yt, then the coefficients for the past values of Xtin the Ytequation
62 L. Di Persio
are nonzero, that is, γ1i6= 0 for i= 1,2,...,p. Similarly, if YtGranger causes Xt
in the Xtequation, then the coefficients for the past values of Ytare nonzero, that is,
β2i6= 0 for i= 1,2,...,p. The formal testing for Granger causality is then done by
using an F test for the joint hypothesis that the possible causal variable does not cause
the other variable. We can specify the null hypothesis for the Granger causality test
as follows.
H0:Granger noncausality Xtdoes not predict Ytif
γ11 =γ12 =···=γ1p= 0,
H1:Granger causality Xtdoes predict Ytif
γ11 6= 0, γ12 6= 0,..., or γ1p6= 0,
whereas the F test implementation is based on two models.

Model 1 (unrestricted):
\[ Y_t = \beta_{10} + \beta_{11}Y_{t-1} + \cdots + \beta_{1p}Y_{t-p} + \gamma_{11}X_{t-1} + \cdots + \gamma_{1p}X_{t-p} + u_{1t}. \]

Model 2 (restricted):
\[ Y_t = \beta_{10} + \beta_{11}Y_{t-1} + \cdots + \beta_{1p}Y_{t-p} + u_{1t}. \]

In the first model, the coefficients $\gamma_{11}, \gamma_{12}, \ldots, \gamma_{1p}$ are left unrestricted, so the variable $X_t$ appears in the equation for $Y_t$, namely the past values of $X_t$ are allowed to help predict $Y_t$. Instead, in the second model, $\gamma_{11} = \gamma_{12} = \cdots = \gamma_{1p} = 0$, so $X_t$ does not Granger-cause $Y_t$. The test statistic has an $F$ distribution with $(p, T - 2p - 1)$ degrees of freedom:
\[ F(p, T - 2p - 1) = \frac{(\mathrm{SSR}_{\mathrm{restricted}} - \mathrm{SSR}_{\mathrm{unrestricted}})/p}{\mathrm{SSR}_{\mathrm{unrestricted}}/(T - 2p - 1)}. \]
If this $F$ statistic is greater than the critical value for a chosen level of significance, we reject the null hypothesis that $X_t$ has no effect on $Y_t$ and conclude that $X_t$ Granger-causes $Y_t$.
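The restricted/unrestricted SSR comparison above can be sketched as follows; this is a hand-rolled illustration on simulated data, not an implementation from the paper, and the helper name `granger_f_test` is an assumption.

```python
import numpy as np

def granger_f_test(y, x, p):
    """F statistic for H0: x does not Granger-cause y, computed from the
    restricted and unrestricted sums of squared residuals (p lags each)."""
    T = len(y)
    # Column i holds lag i+1 of the series, aligned with target y[p:].
    ylags = np.column_stack([y[p - j - 1:T - j - 1] for j in range(p)])
    xlags = np.column_stack([x[p - j - 1:T - j - 1] for j in range(p)])
    target, ones = y[p:], np.ones((T - p, 1))
    def ssr(X):
        beta, *_ = np.linalg.lstsq(X, target, rcond=None)
        return np.sum((target - X @ beta) ** 2)
    ssr_u = ssr(np.hstack([ones, ylags, xlags]))  # Model 1 (unrestricted)
    ssr_r = ssr(np.hstack([ones, ylags]))         # Model 2 (restricted)
    n = T - p                                     # regression sample size
    return ((ssr_r - ssr_u) / p) / (ssr_u / (n - 2 * p - 1))

# Simulated example where x clearly drives y through its first lag.
rng = np.random.default_rng(1)
x = rng.normal(size=300)
y = np.zeros(300)
for t in range(1, 300):
    y[t] = 0.3 * y[t - 1] + 0.8 * x[t - 1] + 0.3 * rng.normal()
print(granger_f_test(y, x, p=2))  # far above any conventional critical value
```

Swapping the arguments, `granger_f_test(x, y, p=2)`, tests the reverse direction and should stay near the null distribution here, since `x` is pure noise.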
4.5 Cointegration
In Section 2.3, we introduced the model of a random walk with drift as follows:
\[ Y_t = \beta_0 + Y_{t-1} + u_t. \tag{9} \]
If $Y_t$ follows Eq. (9), then it has an autoregressive root that equals 1. If we consider a random walk for the first difference $\Delta Y_t = Y_t - Y_{t-1}$, then we obtain
\[ \Delta Y_t = \beta_0 + \Delta Y_{t-1} + u_t. \tag{10} \]
Hence, if $Y_t$ follows Eq. (10), then $\Delta Y_t$ follows a random walk, and accordingly $\Delta Y_t - \Delta Y_{t-1}$ is stationary; this is the second difference of $Y_t$ and is denoted $\Delta^2 Y_t$. A series that has a random walk trend is said to be integrated of order one, or I(1);
Table 1. Critical values for the EG-ADF statistic

Number of regressors    10%      5%       1%
1                      −3.12   −3.41   −3.96
2                      −3.52   −3.80   −4.36
3                      −3.84   −4.16   −4.73
4                      −4.20   −4.49   −5.07
a series that has a trend of the form (10) is said to be integrated of order two, or I(2); and a series that has no stochastic trend and is stationary is said to be integrated of order zero, or I(0). The order of integration in the I(1) and I(2) terminology is the number of times that the series needs to be differenced for it to become stationary. If $Y_t$ is I(2), then $\Delta Y_t$ is I(1), so $\Delta Y_t$ has an autoregressive root that equals 1. If, however, $Y_t$ is I(1), then $\Delta Y_t$ is stationary. Thus, the null hypothesis that $Y_t$ is I(2) can be tested against the alternative hypothesis that $Y_t$ is I(1) by testing whether $\Delta Y_t$ has a unit autoregressive root. Sometimes, two or more series have the same stochastic trend in common. In this special case, referred to as cointegration, regression analysis can reveal long-run relationships among time series variables. One could think that a linear combination of two I(1) processes is necessarily an I(1) process; however, this is not always true. Two or more series that have a common stochastic trend are said to be cointegrated. Suppose that $X_t$ and $Y_t$ are integrated of order one. If, for some coefficient $\theta$, $Y_t - \theta X_t$ is integrated of order zero, then $X_t$ and $Y_t$ are said to be cointegrated, and the coefficient $\theta$ is called the cointegrating coefficient. If $X_t$ and $Y_t$ are cointegrated, then they have a common stochastic trend, which is eliminated by computing the difference $Y_t - \theta X_t$. There are three ways to decide whether two variables can plausibly be modeled by the cointegration approach, namely, by expert knowledge and economic theory, by a qualitative (graphical) analysis of the series checking for a common stochastic trend, and by performing statistical tests for cointegration. In particular, there is a cointegration test for the case when $\theta$ is unknown. Initially, the cointegrating coefficient $\theta$ is estimated by OLS estimation of the regression
\[ Y_t = \alpha + \theta X_t + z_t, \tag{11} \]
and then we use the Dickey–Fuller test (see Section 2.3) to test for a unit root in $z_t$; this procedure is called the Engle–Granger augmented Dickey–Fuller test for cointegration (EG-ADF test); for details, see, for example, [6, Chap. 6.2]. The concepts covered so far can be extended to the case of more than two variables; for example, three variables, each of which is I(1), are said to be cointegrated if $Y_t - \theta_1 X_{1t} - \theta_2 X_{2t}$ is stationary. The Dickey–Fuller step then requires different critical values (see Table 1), where the appropriate row depends on the number of regressors used in the first step, that is, in estimating the OLS cointegrating regression.
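The two-step procedure can be sketched as follows; this is a simplified illustration, assuming the plainest Dickey–Fuller regression in the second step (no augmentation lags and no intercept, since the first-step residuals already have mean zero), with function names and simulated data that are not from the paper.

```python
import numpy as np

def eg_adf(y, x, crit=-3.41):
    """Two-step EG-ADF sketch: (1) OLS of y on x gives residuals z_t;
    (2) Dickey-Fuller regression dz_t = rho * z_{t-1} + e_t, whose
    t-statistic on rho is compared with a Table 1 critical value
    (default: the 5% value for one regressor)."""
    R = np.column_stack([np.ones(len(x)), x])
    alpha_theta, *_ = np.linalg.lstsq(R, y, rcond=None)
    z = y - R @ alpha_theta                 # step 1 residuals
    dz, zlag = np.diff(z), z[:-1]
    rho = (zlag @ dz) / (zlag @ zlag)       # step 2: DF regression
    resid = dz - rho * zlag
    se = np.sqrt((resid @ resid) / (len(dz) - 1) / (zlag @ zlag))
    t_stat = rho / se
    return t_stat, bool(t_stat < crit)      # True => reject no cointegration

# Two I(1) series sharing one stochastic trend (true theta = 1.5).
rng = np.random.default_rng(3)
x = np.cumsum(rng.normal(size=500))
y = 1.5 * x + rng.normal(size=500)
print(eg_adf(y, x))
```

For two independent random walks, by contrast, the statistic typically stays above the critical value and the null of no cointegration is not rejected.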
A different estimator of the cointegrating coefficient is the dynamic OLS (DOLS) estimator, which is based on the equation
\[ Y_t = \beta_0 + \theta X_t + \sum_{j=-p}^{p} \delta_j \Delta X_{t-j} + u_t. \tag{12} \]
In particular, from Eq. (12) we notice that DOLS includes past, present, and future values of the changes in $X_t$. The DOLS estimator of $\theta$ is the OLS estimator of $\theta$ in Eq. (12). The DOLS estimator is efficient, and statistical inferences about $\theta$ and the $\delta_j$ in Eq. (12) are valid. If we have cointegration among more than two variables, for example, three variables $Y_t$, $X_{1t}$, $X_{2t}$, each of which is I(1), then they are cointegrated with cointegrating coefficients $\theta_1$ and $\theta_2$ if $Y_t - \theta_1 X_{1t} - \theta_2 X_{2t}$ is stationary. The EG-ADF procedure to test for a single cointegrating relationship among multiple variables is the same as for the case of two variables, except that the regression in Eq. (11) is modified so that both $X_{1t}$ and $X_{2t}$ are regressors. The DOLS estimator of a single cointegrating relationship among multiple X's involves the level of each X along with leads and lags of the first difference of each X.
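The regression in Eq. (12) can be sketched directly; the following is a minimal illustration on simulated cointegrated data, with the function name `dols` and all data-generating choices being assumptions for the example.

```python
import numpy as np

def dols(y, x, p):
    """DOLS sketch for Eq. (12): OLS of y_t on x_t and the leads/lags
    j = -p..p of the first difference of x; returns the theta estimate."""
    T = len(y)
    dx = x[1:] - x[:-1]                 # dx[k] = x_{k+1} - x_k
    t = np.arange(p + 1, T - p)         # dates where every lead/lag exists
    cols = [np.ones(len(t)), x[t]]
    for j in range(-p, p + 1):
        cols.append(dx[t - j - 1])      # the change of x at date t - j
    coef, *_ = np.linalg.lstsq(np.column_stack(cols), y[t], rcond=None)
    return coef[1]                      # theta sits right after the intercept

# Cointegrated pair with true theta = 2 and a stationary error term.
rng = np.random.default_rng(2)
x = np.cumsum(rng.normal(size=500))
y = 3.0 + 2.0 * x + rng.normal(size=500)
print(dols(y, x, p=2))  # close to the true cointegrating coefficient 2.0
```

Note how the sample is trimmed at both ends: the leads $\Delta X_{t+1}, \ldots, \Delta X_{t+p}$ require future observations, so the regression runs only over dates where all $2p+1$ difference terms are available.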
5 Conclusion
In this first part of our ambitious project to use multivariate statistical techniques to study critical econometric data of one of the most influential economies in Italy, namely the Verona import–export time series, we have focused on a self-contained introduction to techniques for estimating OLS-type regressions, to the analysis of the correlations between the different variables, and to various types of information criteria used to check the goodness of fit. Particular relevance has been devoted to tests able to detect various types of nonstationarity in the considered time series, for example, the augmented Dickey–Fuller (ADF) test and the Quandt likelihood ratio (QLR) statistic. Moreover, we have also presented both the Granger causality test and the Engle–Granger augmented Dickey–Fuller (EG-ADF) test for cointegration in order to analyze if and how variables are related to each other and to measure how much information one variable gives on another. Such approaches constitute the core of the second part of our project, namely the aforementioned Verona case study.
Acknowledgements
The author would like to acknowledge the excellent support that Dr. Chiara Segala gave him. Her help has been fundamental to the development of the whole project, particularly for the realization of the applied sections, which constitute the core of the whole work.
References
[1] Baldi, P.: Calcolo delle Probabilità. The McGraw-Hill Companies, Milano (2007)
[2] Bee Dagum, E.: Analisi delle Serie Storiche, Modellistica, Previsione e Scomposizione.
Springer, Milano (2002)
[3] Bernstein, S., Bernstein, R.: Statistica Inferenziale. McGraw-Hill, Milano (2003)
[4] Brandt, P.T., Williams, J.T.: Multiple Time Series Models. Sage Publications, Thousand
Oaks, CA (2007)
[5] Harris, R., Sollis, R.: Applied Time Series Modelling and Forecasting. John Wiley & Sons
Ltd, West Sussex, England (2003)
[6] Kirchgässner, G., Wolters, J.: Introduction to Modern Time Series Analysis. Springer,
Berlin, Heidelberg (2007). MR2451567
[7] Stock, J.-H., Watson, M.W.: Introduzione all’Econometria. Pearson, Milano (2012)
[8] Wei, W.W.S.: Time Series Analysis, Univariate and Multivariate Methods. Pearson, Boston
(2006). MR2517831