All content in this area was uploaded by Elias Tzavalis on Nov 30, 2017

Computational Statistics and Data Analysis 76 (2014) 391–407

Contents lists available at ScienceDirect

Computational Statistics and Data Analysis

journal homepage: www.elsevier.com/locate/csda

Testing for unit roots in short panels allowing for a structural break

Yiannis Karavias, Elias Tzavalis ∗

Department of Economics, Athens University of Economics & Business, Greece

article info

Article history:

Received 16 April 2011

Received in revised form 18 October 2012

Accepted 19 October 2012

Available online 26 October 2012

Keywords:

Panel data models

Unit roots

Structural breaks

Sequential tests

Bootstrap

Trade openness

abstract

Panel data unit root tests which allow for a common structural break in the individual

effects or linear trends of the AR(1) panel data model are suggested. These allow the date

of the break to be unknown. The tests assume that the time-dimension of the panel (T) is

fixed (finite) while the cross-section (N) is large. Under the null hypothesis of unit roots,

they are similar to the initial conditions of the model and its individual effects. Extensions

of the tests to the AR(2) model are provided. These highlight the difficulties in extending

the tests to higher order serial correlation of the error terms. Monte Carlo experiments

indicate that the small sample performance of the tests is very satisfactory. Application of

the tests to the trade openness variable of the non-oil countries indicates that evidence

of persistence of this variable can be attributed to trade liberalization policies adopted by

many developing countries since the early nineties.

©2012 Elsevier B.V. All rights reserved.

1. Introduction

The autoregressive panel data model of lag order one (denoted as AR(1)), which assumes that the time dimension of

the data (denoted as T) is fixed (finite) while its cross-sectional dimension (denoted as N) is large, has been extensively used in the

literature to study the dynamic behavior of many economic time series across different units, i.e. countries or industries

(see Arellano, 2003, Arellano and Honoré, 2002 and Baltagi and Kao, 2000, inter alia). Of particular interest is the use of this

model to examine if economic series contain a unit root in their autoregressive component (see Hlouskova and Wagner,

2006, for a recent survey). Recent economic applications of panel data unit root tests include investigation of the following:

the economic growth convergence hypothesis (see de la Fuente, 1997, for a survey), the random walk hypothesis of stock

prices and dividends (see, e.g., Harris and Tzavalis, 2004 and Lo and MacKinlay, 1995), the long-run validity of purchasing

power parity (see Culver and Papell, 1999, inter alia) and, finally, the permanent effects of liberalization policies on trade

(see, e.g., Wacziarg and Welch, 2004).

This paper extends the fixed-T panel unit root tests of Harris and Tzavalis (1999) to allow for a potential structural

break in the deterministic components of the AR(1) panel data model, namely its individual effects and/or linear trends, at

a known or unknown date. This is a very useful extension given recent evidence suggesting that the presence of a unit root

in the autoregressive component of many economic series can be attributed to the existence of structural breaks in their

deterministic components, which are ignored by standard unit root testing procedures (see Perron, 1989, 1990, for single

time series analysis). The panel data approach offers an interesting and unique perspective to investigate if evidence of unit

roots can be falsely attributed to the existence of structural breaks, which is not shared by single series methods. The cross-

sectional units of panel data can provide important sample information which can help to distinguish permanent stochastic

shifts of economic time series from changes in their deterministic components (see, e.g., Bai, 2010).

In contrast to the vast literature for single time series, there are few studies that consider panel data unit root tests

allowing for structural breaks (see Bai and Carrion-i-Silvestre, 2009, Carrion-i-Silvestre et al., 2005 and Chan and Pauwels,

∗Correspondence to: Department of Economics, Athens University of Economics & Business, Patission 76, Athens 104 34, Greece. Tel.: +30 2108203332.

E-mail addresses: jkaravia@aueb.gr (Y. Karavias), e.tzavalis@aueb.gr (E. Tzavalis).

0167-9473/$ – see front matter ©2012 Elsevier B.V. All rights reserved.

doi:10.1016/j.csda.2012.10.014

392 Y. Karavias, E. Tzavalis / Computational Statistics and Data Analysis 76 (2014) 391–407

2011). However, these tests assume that the time dimension of the panel model, T, is large and, more importantly, that it grows larger than N. Thus, they are more appropriate for panel data sets where T is bigger than N, referred to as large panels. As shown in Harris and Tzavalis (1999), application of large-T panel unit root tests to short panels, which assume fixed T, leads to serious size distortions and power reductions, since their sample distribution is not well approximated in panels with small T. In the literature on fixed-T panel data unit root tests, there are also few studies which suggest unit root tests allowing for structural breaks (see Carrion-i-Silvestre et al., 2002, Tzavalis, 2002 and, more recently, Hadri

et al., forthcoming). These studies are mainly interested in pursuing ideas on how to test for a unit root in the AR(1) panel

data model allowing for a common break in its individual effects. They mainly consider the case of a known date break and

they assume that the error terms of the AR(1) panel data model are normally distributed.

The main goal of this paper is to extend the above fixed-T panel unit root tests, considering a common break in the

deterministic components of the AR(1) panel data model, to allow for an unknown break point. This is done under quite

general distributional assumptions of the error terms. The proposed test statistics are similar (invariant) under the null

hypothesis to the initial conditions and/or the individual effects of the panel data model. This property of the tests is very

useful for the following two reasons. First, it does not require any assumption about the initial conditions of the panel and,

second, it does not involve estimation of its individual effects. As recently shown by Kim and Perron (2009), unit root testing

procedures relying on estimation of these effects in a first step perform poorly in small samples. Our suggested tests can

be applied to the case of a two-way error component panel data model, which allows for cross-correlation across the error

terms. This can be done by taking deviations of the individual series of the panel from their cross-section mean at each point

in time t (see O’Connell, 1998), as a first step.
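The cross-section demeaning step mentioned above can be sketched as follows. This is only an illustration, assuming the panel is stored as an array with one row per unit and one column per time point; the function name `cross_section_demean` and the simulated data are hypothetical and not part of the paper.

```python
import numpy as np

def cross_section_demean(y):
    """Subtract, at each time point t, the mean of y_it across units i.

    y is assumed to have shape (N, T): rows are units, columns are time points.
    """
    y = np.asarray(y, dtype=float)
    return y - y.mean(axis=0, keepdims=True)

# Hypothetical example: a common time effect shared by all units.
rng = np.random.default_rng(0)
common = rng.normal(size=(1, 6))           # one shock per time point
y = common + rng.normal(size=(50, 6))      # N = 50 units, 6 time points
y_dm = cross_section_demean(y)             # common time effect removed
```

After this transformation every column of `y_dm` has zero cross-section mean, so a time effect common to all units no longer enters the series to which the tests are applied.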

To apply the tests in the case of an unknown break point, the paper relies on the sequential testing procedure

recommended in single time series analysis by Zivot and Andrews (1992) (see also Andrews, 1993 and Perron, 1997). This

procedure calculates the minimum value of one-sided standardized test statistics which assume a known date break. These

statistics are sequentially computed for each point of the sample at which the break can occur. The limiting

distribution of these sequential test statistics is that of the minimum value of a fixed number of correlated standard normals.

The paper derives analytically the correlation matrix of these variables and tabulates critical values of the distribution of

their minimum based on Monte Carlo simulations. Finally, the paper shows how to extend the tests to the case of a panel data

autoregressive model of lag order two, which may describe some economic series (see, e.g., Cati et al., 1999). This extension

highlights some of the difficulties of generalizing the tests to allow for serially correlated error terms when T is fixed.

The paper is organized as follows. Section 2 derives the limiting distributions of the test statistics suggested by the paper for the cases where the break point is treated as known and as unknown. This section also proves the consistency of the tests under the alternative hypothesis of stationarity. Section 3 extends the tests to the case of the AR(2) panel data model. Section 4 carries out a Monte Carlo study which evaluates the small sample performance of the tests. Section 5 applies the tests to investigate whether trade liberalization policies introduced at the end of the eighties and/or the early nineties in the non-oil countries have permanently fostered international trade. Section 6 concludes the paper.

2. The test statistics and their limiting distribution

2.1. The date of the break point is known

Consider the following non-linear AR(1) panel data models, denoted as $m = \{M1, M2\}$, allowing for a common structural break in their deterministic components at time point $T_0$:

$$M1:\; y_i = \phi y_{i-1} + (1-\phi)\left(a_i^{(\lambda)} e^{(\lambda)} + a_i^{(1-\lambda)} e^{(1-\lambda)}\right) + u_i, \quad i = 1, 2, \ldots, N, \tag{1}$$

and

$$M2:\; y_i = \phi y_{i-1} + \phi \beta_i e + (1-\phi)\left(a_i^{(\lambda)} e^{(\lambda)} + a_i^{(1-\lambda)} e^{(1-\lambda)}\right) + (1-\phi)\left(\beta_i^{(\lambda)} \tau^{(\lambda)} + \beta_i^{(1-\lambda)} \tau^{(1-\lambda)}\right) + u_i, \tag{2}$$

where $y_i = (y_{i1}, \ldots, y_{iT})'$ is a vector which collects the time series observations of panel data series $y_{it}$, for $t = 1, 2, \ldots, T$, across the cross-sectional units of the panel $i = 1, 2, \ldots, N$, $y_{i-1} = (y_{i0}, \ldots, y_{iT-1})'$ is vector $y_i$ lagged one period back, $u_i = (u_{i1}, \ldots, u_{iT})'$ is the vector of error terms $u_{it}$, for all $t$, and $\beta_i e$ is defined as $\beta_i e = \beta_i^{(\lambda)} e^{(\lambda)} + \beta_i^{(1-\lambda)} e^{(1-\lambda)}$, where $e$ is a $(T \times 1)$-dimension vector of unities, and $e^{(\lambda)}$ and $e^{(1-\lambda)}$ are $(T \times 1)$-dimension vectors defined, respectively, as follows: $e_t^{(\lambda)} = 1$ if $t \le T_0$ and 0 otherwise, and $e_t^{(1-\lambda)} = 1$ if $t > T_0$ and 0 otherwise. These vectors are designed to capture a possible common break in the individual effects of models (1) and (2), $a_i$, before and after the break occurs, denoted respectively as $a_i^{(\lambda)}$ and $a_i^{(1-\lambda)}$, where $\lambda$ denotes the fraction of the sample at which this break occurs. $\lambda$ is defined as $\lambda \in I = \left\{\frac{1}{T}, \frac{2}{T}, \ldots, \frac{T-1}{T}\right\}$ for model M1 and $\lambda \in I = \left\{\frac{2}{T}, \frac{3}{T}, \ldots, \frac{T-2}{T}\right\}$ for model M2, where $[\cdot]$ denotes the integer part. In addition to a common break in individual effects $a_i$, model (2) also allows for a common break in the slope coefficients of the individual linear trends of the panel, $\beta_i$, denoted as $\beta_i^{(\lambda)}$ and $\beta_i^{(1-\lambda)}$, for all $i$. Vectors $\tau^{(\lambda)}$ and $\tau^{(1-\lambda)}$ collect the time points of these trends. More specifically, the elements of these vectors are defined as follows: $\tau_t^{(\lambda)} = t$ if $t \le T_0$ and 0 otherwise, while $\tau_t^{(1-\lambda)} = t$ if $t > T_0$ and 0 otherwise.
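The break dummies and broken trend vectors defined above can be illustrated with a short sketch. The function name `break_regressors` is hypothetical, and time is indexed as t = 1, ..., T as in the text.

```python
import numpy as np

def break_regressors(T, T0):
    """Return e^(lambda), e^(1-lambda), tau^(lambda), tau^(1-lambda) as length-T arrays.

    e^(lambda)_t = 1 if t <= T0 (else 0); e^(1-lambda)_t = 1 if t > T0 (else 0);
    tau^(lambda)_t = t if t <= T0 (else 0); tau^(1-lambda)_t = t if t > T0 (else 0).
    """
    t = np.arange(1, T + 1)
    e_lam = (t <= T0).astype(float)
    e_1lam = (t > T0).astype(float)
    tau_lam = t * e_lam
    tau_1lam = t * e_1lam
    return e_lam, e_1lam, tau_lam, tau_1lam

# Hypothetical example: T = 6 observations with the break after T0 = 2.
e_lam, e_1lam, tau_lam, tau_1lam = break_regressors(T=6, T0=2)
```

The two dummies sum to the vector of ones e, and the two trend pieces sum to the full time trend (1, ..., T)', which is why regressing on them nests the no-break specification.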

The AR(1) panel data models M1 and M2, given by Eqs. (1) and (2), can be employed to obtain panel data unit root test statistics which are similar (invariant) under null hypothesis $H_0: \phi = 1$ to the initial conditions of the panel $y_{i0}$ and/or its individual effects, for all $i$. Similarity is a desirable property of these tests because estimation of the initial conditions or individual effects of the panel, in a first step, can prove very inefficient, even under $H_0: \phi = 1$. As shown recently by Kim and Perron (2009) in time series analysis, imprecise estimation of nuisance parameters $a_i^{(\lambda)}$ and $a_i^{(1-\lambda)}$, and/or break point $T_0$, can seriously affect the performance of unit root tests allowing for structural breaks. In particular, model M1 can be employed to develop panel unit root test statistics which are similar to $y_{i0}$ in the case where panel data series $y_{it}$ constitute pure random walks under $H_0: \phi = 1$, i.e. $y_{it} = y_{it-1} + u_{it}$, for all $i$. On the other hand, model M2 is appropriate to derive panel unit root test statistics which are similar to $y_{i0}$ and the individual effects of the panel in the case where $y_{it}$ are random walks with drifts under $H_0: \phi = 1$. That is, $y_{it} = \beta_i + y_{it-1} + u_{it}$, where drift parameters $\beta_i$ ($\beta_i = \beta_i^{(\lambda)} = \beta_i^{(1-\lambda)}$) constitute the individual effects of the panel data model under $H_0: \phi = 1$ (see, e.g., Andrews, 1993, Vogelsang and Perron, 1998 and Zivot and Andrews, 1992).

Unit root test statistics based on models M1 and M2 will have the power to reject null hypothesis $H_0: \phi = 1$ in favor of its alternative of stationarity around broken individual effects or linear trends, defined as $H_a: \phi < 1$, when $H_a$ is true. As mentioned in the introduction, the main focus of these testing procedures is to diagnose whether evidence of unit roots can be spuriously attributed to ignoring a common structural break in nuisance parameters $a_i$ and $\beta_i$ of models M1 and M2, for all cross-sectional units of the panel. This break can be attributed to a monetary policy regime change announcement, a credit crunch or an exchange rate realignment, all of which have a strong effect on the units of the panel.

The unit root test statistics that we present in this section rely on the following pooled least squares (LS) estimator of the autoregressive coefficient $\phi$ of models (1) and (2):

$$\hat{\phi} = \left(\sum_{i=1}^{N} y_{i-1}' Q_m^{(\lambda)} y_{i-1}\right)^{-1} \left(\sum_{i=1}^{N} y_{i-1}' Q_m^{(\lambda)} y_i\right), \quad m = \{M1, M2\}, \tag{3}$$

where $Q_m^{(\lambda)}$ is the $(T \times T)$ "within" transformation (annihilator) matrix of the time series of the panel $y_{it}$ (see Baltagi, 1995, inter alia). This matrix is defined as $Q_m^{(\lambda)} = I - X_m^{(\lambda)} \left(X_m^{(\lambda)\prime} X_m^{(\lambda)}\right)^{-1} X_m^{(\lambda)\prime}$, where $X_{M1}^{(\lambda)} = \left[e^{(\lambda)}, e^{(1-\lambda)}\right]$ for model M1 and $X_{M2}^{(\lambda)} = \left[e^{(\lambda)}, e^{(1-\lambda)}, \tau^{(\lambda)}, \tau^{(1-\lambda)}\right]$ for model M2. Matrix $Q_m^{(\lambda)}$ has the following useful properties: $Q_m^{(\lambda)} e = Q_m^{(\lambda)} e^{(\lambda)} = Q_m^{(\lambda)} e^{(1-\lambda)} = 0$, for both models M1 and M2. By specifying $Q_m^{(\lambda)}$ appropriately, our tests can be extended to include exogenous (non-entity specific) regressors $X_t$. As can be seen in the Appendix (see system of Eqs. (17)), these properties of $Q_m^{(\lambda)}$ render the unit root test statistics based on LS estimator $\hat{\phi}$ similar to the initial conditions of the panel $y_{i0}$ and/or its individual effects under $H_0: \phi = 1$, given as $\beta_i$. Since $\hat{\phi}$ is an inconsistent estimator of $\phi$ due to the within transformation of the data (see, e.g., Kiviet, 1995 and Nickell, 1981), the unit root test statistics that we propose correct for the inconsistency of $\hat{\phi}$ under $H_0: \phi = 1$ along the lines suggested by Harris and Tzavalis (1999). The limiting distribution of this corrected estimator is derived under the following assumption about error terms $u_{it}$.

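Assumption 1. $\{u_{it}\}$ is a sequence of independently and identically distributed (IID) random variables with $E(u_{it}) = 0$, $\mathrm{Var}(u_{it}) = \sigma_u^2 < \infty$, $E(u_{it}^4) = k + 3\sigma_u^4$, $\forall i \in \{1, 2, \ldots, N\}$ and $\forall t \in \{1, 2, \ldots, T\}$, where $k < \infty$.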

Assumption 1 enables us to derive the limiting distributions of test statistics based on estimator $\hat{\phi}$ using standard asymptotic theory, assuming that the cross-section dimension of the panel $N$ goes to infinity. The time dimension $T$ of the panel will be treated as fixed (or finite) (see, e.g., Arellano, 2003). This can be done under quite general distributional assumptions of error terms $u_{it}$. The condition $k < \infty$ implies that the fourth moment of $u_{it}$ exists. This condition is required for the application of the Khinchine weak law of large numbers (KWLLN) and the Lindeberg–Levy central limit theorem (CLT) in deriving the limiting distribution of $\hat{\phi}$, corrected for its inconsistency. Note that, as mentioned before, in deriving these limiting distributions we do not need to make any assumptions about $y_{i0}$ and $\beta_i$. The restriction that error terms $u_{it}$ are IID is a necessity, which is due to the finite $T$ dimension. With $T$ fixed, we cannot apply results on martingale difference sequences to obtain the limiting distributions of our tests, as in single time-series unit root tests assuming large $T$. When $T$ is fixed, extensions of our tests to higher order serially correlated error terms can rely on procedures like those considered in Section 3, which extends the tests to the case of the AR(2) panel data model.
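To make the pooled LS estimator of Eq. (3) and the annihilator matrix concrete, here is a minimal NumPy sketch. It assumes the panel is stored as an (N, T+1) array whose columns are y_i0, ..., y_iT; the function names `design_matrix`, `annihilator` and `pooled_ls`, and the simulated random walks, are illustrative assumptions rather than the authors' code.

```python
import numpy as np

def design_matrix(T, T0, model="M1"):
    """Columns e^(lambda), e^(1-lambda) for M1; tau^(lambda), tau^(1-lambda) added for M2."""
    t = np.arange(1, T + 1)
    e_lam = (t <= T0).astype(float)
    e_1lam = (t > T0).astype(float)
    cols = [e_lam, e_1lam]
    if model == "M2":
        cols += [t * e_lam, t * e_1lam]
    return np.column_stack(cols)

def annihilator(X):
    """Q = I - X (X'X)^{-1} X'."""
    return np.eye(X.shape[0]) - X @ np.linalg.solve(X.T @ X, X.T)

def pooled_ls(y, Q):
    """Pooled LS estimate of phi as in Eq. (3); y has columns y_i0, ..., y_iT."""
    y_lag, y_cur = y[:, :-1], y[:, 1:]
    num = np.einsum("it,ts,is->", y_lag, Q, y_cur)   # sum_i y_{i,-1}' Q y_i
    den = np.einsum("it,ts,is->", y_lag, Q, y_lag)   # sum_i y_{i,-1}' Q y_{i,-1}
    return num / den

# Hypothetical data: pure random walks (H0 true for model M1).
rng = np.random.default_rng(0)
N, T, T0 = 200, 10, 4
u = rng.normal(size=(N, T))
y0 = rng.normal(size=(N, 1))
y = np.hstack([y0, y0 + np.cumsum(u, axis=1)])

Q = annihilator(design_matrix(T, T0, "M1"))
phi_hat = pooled_ls(y, Q)
# Shifting every unit by its own constant changes y_i0 but not the estimate.
phi_hat_shifted = pooled_ls(y + 100.0 * rng.normal(size=(N, 1)), Q)
```

Because Q annihilates the vector of ones, adding a unit-specific constant to each series leaves the estimate unchanged, which is the similarity property with respect to the initial conditions discussed in the text. The estimate itself lies below one even though the null holds, reflecting the inconsistency that the test statistics correct for.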

The next theorem presents the limiting distribution of $\hat{\phi} - 1$ adjusted for the inconsistency of $\hat{\phi}$ under $H_0: \phi = 1$. This is done for the case where the break point $T_0$ is treated as known. Since this inconsistency is a function of the fraction of the sample at which the break occurs, $\lambda$, it will be henceforth denoted as $B(\lambda)$.

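Theorem 1. Let us assume that the break point $T_0$ is known. Then, under $H_0: \phi = 1$ and Assumption 1, we have

$$Z(\lambda) \equiv \sqrt{N}\left(\hat{\phi} - 1 - B(\lambda)\right) \xrightarrow{L} N\left(0, C(k, \sigma_u^2, \lambda)\right), \tag{4}$$

as $N \to \infty$, where

$$B(\lambda) = \operatorname*{plim}_{N \to \infty} (\hat{\phi} - 1) = \mathrm{tr}\left[\Lambda' Q_m^{(\lambda)}\right] \left\{\mathrm{tr}\left(\Lambda' Q_m^{(\lambda)} \Lambda\right)\right\}^{-1}, \quad \text{for } m = \{M1, M2\},$$

and

$$C(k, \sigma_u^2, \lambda) = \left[k \sum_{j=1}^{T} a_{jj}^{(\lambda)2} + 2\sigma_u^4\, \mathrm{tr}\left(A^{(\lambda)2}\right)\right] \left[\sigma_u^2\, \mathrm{tr}\left(\Lambda' Q_m^{(\lambda)} \Lambda\right)\right]^{-2},$$

where $\Lambda$ is a $(T \times T)$ matrix defined as $\Lambda_{r,c} = 1$ if $r > c$ and 0 otherwise, $A^{(\lambda)} \equiv [a_{ij}^{(\lambda)}]$ is a $(T \times T)$-dimension symmetric matrix, defined as $A^{(\lambda)} = \frac{1}{2}\left(\Lambda' Q_m^{(\lambda)} + Q_m^{(\lambda)} \Lambda\right) - B(\lambda)\left(\Lambda' Q_m^{(\lambda)} \Lambda\right)$, and '$\xrightarrow{L}$' signifies convergence in distribution. The proof is given in the Appendix.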

The test statistics given by Theorem 1 can be easily implemented to test for unit roots using the tables of the standard normal distribution when scaled appropriately by their standard deviations, i.e. $C(k, \sigma_u^2, \lambda)^{-1/2} Z(\lambda)$. This can be done separately for models M1 and M2, after specifying appropriately annihilator matrix $Q_m^{(\lambda)}$. To calculate these statistics, we need unbiased (or $N$-consistent) estimates of the nuisance parameters $k$ and $\sigma_u^2$ of the variance function of the limiting distribution of $Z(\lambda)$, $C(k, \sigma_u^2, \lambda)$. These can be obtained under $H_0: \phi = 1$, based on the first differences of the panel data series $\Delta y_{it}$ (see Harris and Tzavalis, 2004). The results of Theorem 1 can be easily extended to the case where the disturbance terms $u_{it}$ are heterogeneous across $i$, i.e. $u_{it} \sim IID(0, \sigma_{ui}^2)$ with $E(u_{it}^4) = k_i + 3\sigma_{ui}^4$, where $k_i < \infty$ $\forall i \in \{1, 2, \ldots, N\}$. In this case, nuisance parameters $\sigma_u^2$ and $k$ will be given as $\sigma_u^2 = \frac{1}{N}\sum_{i=1}^{N} \sigma_{ui}^2$ and $k = \frac{1}{N}\sum_{i=1}^{N} k_i$, respectively (see White, 2000). These parameters can be easily estimated under $H_0: \phi = 1$, following an analogous procedure to that for the case where $\sigma_{ui}^2$ and $k_i$ are homogeneous, for all $i$. If error terms $u_{it}$ are normally distributed, i.e. $u_{it} \sim NIID(0, \sigma_u^2)$, then variance function $C(k, \sigma_u^2, \lambda)$ becomes invariant to nuisance parameters $k$ and $\sigma_u^2$, since $k = 0$ and $\sigma_u^2$ cancels out from the numerator and denominator of $C(k, \sigma_u^2, \lambda)$. In this case, the limiting distribution of $Z(\lambda)$ is given in the next corollary.

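Corollary 1. If $u_{it} \sim NIID(0, \sigma_u^2)$, then the limiting distribution of $Z(\lambda)$ becomes

$$Z(\lambda) \equiv \sqrt{N}\left(\hat{\phi} - 1 - B(\lambda)\right) \xrightarrow{L} N(0, C(\lambda)), \tag{5}$$

where $C(\lambda) = 2\,\mathrm{tr}\left(A^{(\lambda)2}\right) \left[\mathrm{tr}\left(\Lambda' Q_m^{(\lambda)} \Lambda\right)\right]^{-2}$.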
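The quantities in Theorem 1 and Corollary 1 are simple matrix traces, so they can be computed directly. The sketch below does this for the normal-errors case; the helper names `bias_A_C` and `standardized_stat` are hypothetical, and the example uses model M1 with T = 10 and a break after T0 = 4 purely for illustration.

```python
import numpy as np

def bias_A_C(Q):
    """Return B(lambda), A^(lambda) and C(lambda) of Corollary 1 for an annihilator Q."""
    T = Q.shape[0]
    Lam = np.tril(np.ones((T, T)), k=-1)     # Lambda[r, c] = 1 if r > c, else 0
    LQ = Lam.T @ Q                           # Lambda' Q
    LQL = LQ @ Lam                           # Lambda' Q Lambda
    B = np.trace(LQ) / np.trace(LQL)
    A = 0.5 * (LQ + LQ.T) - B * LQL          # LQ.T equals Q Lambda since Q is symmetric
    C = 2.0 * np.trace(A @ A) / np.trace(LQL) ** 2
    return B, A, C

def standardized_stat(phi_hat, N, B, C):
    """C^{-1/2} Z(lambda) with Z(lambda) = sqrt(N) (phi_hat - 1 - B)."""
    return np.sqrt(N) * (phi_hat - 1.0 - B) / np.sqrt(C)

# Model M1 with a break after T0 = 4 out of T = 10 observations.
T, T0 = 10, 4
t = np.arange(1, T + 1)
X = np.column_stack([t <= T0, t > T0]).astype(float)
Q = np.eye(T) - X @ np.linalg.solve(X.T @ X, X.T)
B, A, C = bias_A_C(Q)

# Sanity check without a break: X is a single column of ones.
Q0 = np.eye(T) - np.ones((T, T)) / T
B0, _, _ = bias_A_C(Q0)
```

By construction tr(A) = 0, which is a convenient check on any implementation. With a single column of ones (no break) the same function returns B = -3/(T+1), which follows from evaluating the two traces in closed form.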

The test statistics given by Theorem 1 (or Corollary 1) are consistent under alternative hypothesis Ha:ϕ < 1, as N→ ∞.

This result can be proved under Assumption 1 and the following weak assumption.

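Assumption 2. (b1) $E(u_{it} y_{i0}) = E(u_{it} \beta_i^{(r)}) = E(u_{it} a_i^{(r)}) = 0$ for $r = \{\lambda, (1-\lambda)\}$ and $\forall i \in \{1, 2, \ldots, N\}$, $t \in \{1, 2, \ldots, T\}$.

(b2) $E(y_{i0}^4) < +\infty$, $E((a_i^{(r)})^4) < +\infty$, $E((\beta_i^{(r)})^4) < +\infty$, for $r = \{\lambda, (1-\lambda)\}$ and $\forall i \in \{1, 2, \ldots, N\}$, $t \in \{1, 2, \ldots, T\}$.

(b3) $E(y_{i0}^2 f_i^{(r)} f_i^{(r)\prime}) < +\infty$, where $f_i^{(r)} = (a_i^{(r)}, \beta_i^{(r)})$ for $r = \{\lambda, (1-\lambda)\}$ and $\forall i \in \{1, 2, \ldots, N\}$, $t \in \{1, 2, \ldots, T\}$.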

The consistency of Z(λ) is established in the next theorem.

Theorem 2. Under Assumptions 1 and 2, as $N \to \infty$, we have

$$\lim_{N \to \infty} P\left(C(k, \sigma_u^2, \lambda)^{-1/2} Z(\lambda) < c_a \mid H_a: \phi < 1\right) = 1,$$

where $c_a$ is the left-tail critical value of the limiting distribution of test statistic $C(k, \sigma_u^2, \lambda)^{-1/2} Z(\lambda)$ under $H_0: \phi = 1$ at a level of significance $a$. The proof is given in the Appendix.

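After being scaled (multiplied) by $T$, test statistics $Z(\lambda)$ can be applied to the case where both the $N$ and $T$ dimensions of the panel become large. Following Hahn and Kuersteiner (2002) and Harris and Tzavalis (1999), this yields

$$Z'(\lambda) \equiv T\sqrt{N}\left(\hat{\phi} - 1 - B(\lambda)\right) \xrightarrow{L} N\left(0, T^2 C(k, \sigma_u^2, \lambda)\right).$$

Under the assumption that $u_{it} \sim NIID(0, \sigma_u^2)$, we can easily see that, as $T \to +\infty$, the limiting distribution of $Z'(\lambda)$ is given as

$$Z'(\lambda) \equiv T\sqrt{N}\left(\hat{\phi} - 1 - B_m(\lambda)\right) \xrightarrow{L} N(0, D_m(\lambda)), \quad \text{for } m = \{M1, M2\}, \tag{6}$$

where $B_m(\lambda)$ and $D_m(\lambda)$ are defined as follows:

$$B_{M1}(\lambda) = -\frac{3}{2\lambda^2 - 2\lambda + 1} \quad \text{and} \quad B_{M2}(\lambda) = -\frac{15}{2(2\lambda^2 - 2\lambda + 1)},$$

$$D_{M1}(\lambda) = \frac{3\left(40\lambda^6 - 120\lambda^5 + 204\lambda^4 - 208\lambda^3 + 162\lambda^2 - 78\lambda + 17\right)}{5\left(2\lambda^2 - 2\lambda + 1\right)^4},$$

and

$$D_{M2}(\lambda) = \frac{3360\lambda^6 - 10080\lambda^5 + 20070\lambda^4 - 23340\lambda^3 + 22410\lambda^2 - 12420\lambda + 2895}{1792\lambda^8 - 7168\lambda^7 + 14336\lambda^6 - 17920\lambda^5 + 15232\lambda^4 - 8960\lambda^3 + 3584\lambda^2 - 896\lambda + 112}.$$

The proof of (6) is given in the Appendix.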
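The large-T functions above are plain polynomial ratios in the break fraction, so they are easy to tabulate. The following sketch transcribes them directly; the Python function names are illustrative only.

```python
def _q(lam):
    """Common factor 2*lam^2 - 2*lam + 1 appearing in B_m and D_M1."""
    return 2.0 * lam**2 - 2.0 * lam + 1.0

def B_M1(lam):
    return -3.0 / _q(lam)

def B_M2(lam):
    return -15.0 / (2.0 * _q(lam))

def D_M1(lam):
    num = 3.0 * (40*lam**6 - 120*lam**5 + 204*lam**4 - 208*lam**3
                 + 162*lam**2 - 78*lam + 17)
    return num / (5.0 * _q(lam) ** 4)

def D_M2(lam):
    num = (3360*lam**6 - 10080*lam**5 + 20070*lam**4 - 23340*lam**3
           + 22410*lam**2 - 12420*lam + 2895)
    den = (1792*lam**8 - 7168*lam**7 + 14336*lam**6 - 17920*lam**5
           + 15232*lam**4 - 8960*lam**3 + 3584*lam**2 - 896*lam + 112)
    return num / den

# Tabulate the four functions on a grid of break fractions.
grid = [0.0, 0.25, 0.5, 0.75, 1.0]
table = [(lam, B_M1(lam), B_M2(lam), D_M1(lam), D_M2(lam)) for lam in grid]
```

Evaluating the functions at the endpoints gives the same values at λ = 0 and λ = 1 (for example, B_M1 = -3 and D_M1 = 51/5), in line with the remark in the text that the no-break case is recovered when λ = 0 or λ = 1.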

As noted by Hahn and Kuersteiner (2002, see Theorem 4, p. 1646), the above result, given by (6), holds independently of the relative growth rate between $N$ and $T$. The conditions of their theorem apply to our case, because test statistic $Z(\lambda)$ is similar (invariant) with respect to the initial conditions $y_{i0}$ and individual effects $\beta_i$ of the panel, under the null hypothesis. The limiting distribution of test statistics $Z'(\lambda)$, given by (6), can be employed to determine the time dimension $T$ of panel data models M1 and M2 for which large-$T$ panel data unit root test statistics allowing for structural breaks can sufficiently approximate their sample (empirical) distribution. This is useful in practice, as it can indicate the minimum size of $T$ required so that large-$T$ panel data unit root tests allowing for structural breaks can be successfully implemented. Finally, note that, when there is no break (i.e. $\lambda = 0$, or $\lambda = 1$), the limiting distributions of statistics $Z'(\lambda)$ reduce to those of Harris and Tzavalis (1999), who consider no breaks in models $m = \{M1, M2\}$.

2.2. The date of the break point is unknown

This section relaxes the assumption that the break point is known and proposes unit root test statistics based on models M1 and M2 which allow for a structural break of unknown date. As with statistics $Z(\lambda)$, this break is considered under alternative hypothesis $H_a: \phi < 1$. Following Perron and Vogelsang (1992) and Zivot and Andrews (1992) for single time series, or De Wachter and Tzavalis (2012) for panel data, the proposed tests will view the selection of the break point as the outcome of a sequential testing procedure minimizing the standardized test statistic given by Theorem 1, i.e. $C(k, \sigma_u^2, \lambda)^{-1/2} Z(\lambda)$, over all possible break points of the sample, after trimming out the initial and final parts of our sample. The minimum value of test statistics $C(k, \sigma_u^2, \lambda)^{-1/2} Z(\lambda)$, for all $\lambda \in I$, defined as $z$, will give the least favorable result for null hypothesis $H_0: \phi = 1$.

Let $\hat{\lambda}_{\min}$ denote the break point at which the minimum value of $C(k, \sigma_u^2, \lambda)^{-1/2} Z(\lambda)$, for all $\lambda$, is obtained. Then, $H_0: \phi = 1$ will be rejected if we have

$$z \equiv \min_{\lambda \in I} C(k, \sigma_u^2, \lambda)^{-1/2} Z(\lambda) < c_{\min,a},$$

where $c_{\min,a}$ denotes the left-tail critical value of the limiting distribution of test statistic $z$ at a level of significance $a$. The following theorem enables us to tabulate critical values of this distribution at any desired value $a$.

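Theorem 3. Let Assumption 1 hold. Then, under $H_0: \phi = 1$ and as $N \to \infty$, we have

$$z \equiv \min_{\lambda \in I} C(k, \sigma_u^2, \lambda)^{-1/2} Z(\lambda) \xrightarrow{L} \min_{\lambda \in I} N(0, R), \tag{7}$$

where $R \equiv [\mathrm{Corr}_{\lambda s}]$ is the covariance (correlation) matrix of standardized statistics $C(k, \sigma_u^2, \lambda)^{-1/2} Z(\lambda)$. The elements of matrix $R$ are the correlation coefficients between $C(k, \sigma_u^2, \lambda)^{-1/2} Z(\lambda)$ and $C(k, \sigma_u^2, s)^{-1/2} Z(s)$, for all different pairs of break fractions $(\lambda, s) \in I$. These elements can be analytically calculated based on the following formula:

$$\mathrm{Corr}_{\lambda s} = \frac{k \sum_{j=1}^{T} a_{jj}^{(\lambda)} a_{jj}^{(s)} + 2\sigma_u^4\, \mathrm{tr}\left(A^{(\lambda)} A^{(s)}\right)}{\left[k \sum_{j=1}^{T} a_{jj}^{(\lambda)2} + 2\sigma_u^4\, \mathrm{tr}\left(A^{(\lambda)2}\right)\right]^{1/2} \left[k \sum_{j=1}^{T} a_{jj}^{(s)2} + 2\sigma_u^4\, \mathrm{tr}\left(A^{(s)2}\right)\right]^{1/2}}, \tag{8}$$

where matrix $A^{(\xi)} \equiv [a_{ij}^{(\xi)}]$, for $\xi = \{\lambda, s\} \in I$, is defined in Theorem 1.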

The result of this theorem follows as an extension of Theorem 1, by applying the continuous mapping theorem to the joint limiting distribution of standardized test statistics $C(k, \sigma_u^2, \lambda)^{-1/2} Z(\lambda)$, for all $\lambda \in I$. The derivation of correlation coefficients $\mathrm{Corr}_{\lambda s}$ formula (9) is given in the Appendix (see proof of Theorem 3). Note that, if $u_{it} \sim NIID(0, \sigma_u^2)$, $\mathrm{Corr}_{\lambda s}$ become independent of nuisance parameters $k$ and $\sigma_u^2$, i.e.

$$\mathrm{Corr}_{\lambda s} = \frac{\mathrm{tr}\left(A^{(\lambda)} A^{(s)}\right)}{\mathrm{tr}\left(A^{(\lambda)2}\right)^{1/2}\, \mathrm{tr}\left(A^{(s)2}\right)^{1/2}}. \tag{9}$$

The test statistics given by Theorem 3, $z$, are consistent under alternative hypothesis $H_a: \phi < 1$, for both models M1 and M2. This is established in the following theorem.

Theorem 4. Under Assumptions 1 and 2, as $N \to \infty$, we have

$$\lim_{N \to \infty} P\left(z \equiv \min_{\lambda \in I} C(k, \sigma_u^2, \lambda)^{-1/2} Z(\lambda) < c_{\min,a} \mid H_a\right) = 1,$$

where $c_{\min,a}$ is the left-tail critical value of the limiting distribution of test statistic $z$ under $H_0: \phi = 1$ at a level of significance $a$. The proof follows immediately from Theorem 2.

Table 1
Critical values of the distribution $\min_{\lambda \in I} N(0, R)$.

          Panel A (for model M1)            Panel B (for model M2)
a (%)   T=10    T=15    T=25    T=50      T=10    T=15    T=25    T=50
  1    −2.91   −2.95   −2.98   −3.05     −2.92   −2.97   −3.04   −3.10
  5    −2.15   −2.33   −2.37   −2.43     −2.31   −2.38   −2.43   −2.49
 10    −1.83   −2.00   −2.04   −2.10     −1.99   −2.07   −2.11   −2.16

Notes: Panel A of Table 1 presents the critical values of distribution $\min_{\lambda \in I} N(0, R)$ for model M1, where $X_{M1}^{(\lambda)} = \left[e^{(\lambda)}, e^{(1-\lambda)}\right]$, and Panel B for model M2, where $X_{M2}^{(\lambda)} = \left[e^{(\lambda)}, e^{(1-\lambda)}, \tau^{(\lambda)}, \tau^{(1-\lambda)}\right]$.

The results of Theorem 3 imply that critical values of the limiting distribution of sequential test statistics $z$ can be easily calculated based on those of the distribution of the minimum value of a fixed number of correlated normal variables whose covariance matrix is defined by $R$, with elements $\mathrm{Corr}_{\lambda s}$. In Table 1, we present critical values of this distribution for the case where $u_{it} \sim NIID(0, \sigma_u^2)$. This is done for different values of significance level $a$ and time dimension of the panel $T$, i.e. $a = \{1\%, 5\%, 10\%\}$ and $T = \{10, 15, 25, 50\}$, respectively, as well as for the two different models M1 and M2. The critical values reported in Table 1 are calculated based on 100 000 Monte Carlo simulations as follows. For each simulation, we generated a vector of observations from a multivariate normal distribution of $T$ variables minus the trimming points of the series implied by models M1 and M2, with zero mean, unit variance and covariance matrix $R$. Then, we sorted this vector in descending order and selected its minimum value.
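The simulation just described can be sketched in a few lines of NumPy for model M1 under normal errors, using formula (9) for the elements of R. This is a hedged illustration: the function names (`a_matrix`, `correlation_matrix`, `min_normal_quantiles`), the seed, and the use of empirical quantiles of the simulated minima are assumptions of this sketch, not details taken from the paper.

```python
import numpy as np

def a_matrix(T, T0):
    """A^(lambda) of Theorem 1 for model M1 with break after T0."""
    t = np.arange(1, T + 1)
    X = np.column_stack([t <= T0, t > T0]).astype(float)
    Q = np.eye(T) - X @ np.linalg.solve(X.T @ X, X.T)
    Lam = np.tril(np.ones((T, T)), k=-1)       # Lambda[r, c] = 1 if r > c
    LQ = Lam.T @ Q
    LQL = LQ @ Lam
    B = np.trace(LQ) / np.trace(LQL)
    return 0.5 * (LQ + LQ.T) - B * LQL

def correlation_matrix(T):
    """R from formula (9), over break points T0 = 1, ..., T-1 (the set I for M1)."""
    A = [a_matrix(T, T0) for T0 in range(1, T)]
    G = np.array([[np.trace(Ai @ Aj) for Aj in A] for Ai in A])
    d = np.sqrt(np.diag(G))
    return G / np.outer(d, d)

def min_normal_quantiles(R, levels=(0.01, 0.05, 0.10), draws=100_000, seed=0):
    """Empirical left-tail quantiles of the minimum of N(0, R) draws."""
    rng = np.random.default_rng(seed)
    x = rng.multivariate_normal(np.zeros(len(R)), R, size=draws)
    return np.quantile(x.min(axis=1), levels)

R = correlation_matrix(10)
crit = min_normal_quantiles(R, draws=20_000)   # fewer draws to keep the example quick
```

The resulting quantiles can be compared with Panel A of Table 1 for T = 10. Exact agreement is not guaranteed, since it depends on the simulation seed, the number of draws and on the reading of the definitions assumed in this sketch.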

The critical values reported in Table 1 are clearly well below the left-tail critical values of the standard normal distribution. This is true for all levels of $a$ and $T$ considered. These values are negative and decrease as $T$ increases. The latter happens because, as the number of random variables increases, the joint distribution skews to the right, thus increasing the probability of having a minimum from a tail event of this distribution. The above results clearly indicate that using critical values of the standard normal distribution in implementing sequential panel unit root test statistics allowing for breaks, such as statistics $z$, will lead to oversized tests, which will tend to reject $H_0: \phi = 1$ too often.

Finally, note that the test statistic $z$ implied by Theorem 3 for model M2 cannot be applied in the case where this model considers broken individual effects under $H_0: \phi = 1$, i.e. $\beta_i e = \beta_i^{(\lambda)} e_t^{(\lambda)} + \beta_i^{(1-\lambda)} e_t^{(1-\lambda)}$. This is because $Q_m^{(\lambda)} e^{(s)} \neq 0$, for $\lambda \neq s$, and thus $z$ will depend on the elements of vector $\beta_i e$ under the null hypothesis. To test for unit roots in this case, we can rely on a consistent estimator of the break point $T_0$ under this hypothesis, which implies that the first difference of panel data series $y_{it}$ can be written as $\Delta y_{it} = \beta_i^{(\lambda)} e_t^{(\lambda)} + \beta_i^{(1-\lambda)} e_t^{(1-\lambda)} + u_{it}$, for all $i$. The last process is stationary and, thus, Bai’s (2010) procedure can be applied to obtain a consistent estimate of break point $T_0$, which converges at an $o_p(\sqrt{N})$ rate. Given this, we can treat $T_0$ as known and, thus, we can apply test statistics $Z(\lambda)$ given by Theorem 1 to test $H_0: \phi = 1$. The performance of test statistic $Z(\lambda)$ in this case was investigated through a Monte Carlo exercise reported in a previous version of the paper, and was found to be satisfactory.

3. The case of serially correlated error terms

3.1. Known break point

In this section, we suggest an extension of the test statistics to allow for higher-order serial correlation in error terms uit .

This is done based on AR(2) panel data models, considered recently by De Blander and Dhaene (2011) who developed panel

data unit root tests allowing for serial correlation in uit . The goal of this extension is to highlight some of the difficulties

encountered in developing panel unit root tests for higher than one order of serial correlation of uit in the presence of

structural breaks, when T is fixed.

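Assume that the vector of error terms $u_i$ is given by the following autoregressive process:

$$u_i = \rho u_{i-1} + \varepsilon_i, \tag{10}$$

where $\varepsilon_i$ is a vector of independently and identically distributed error terms $\varepsilon_{it}$, with $E(\varepsilon_{it}) = 0$, $\mathrm{Var}(\varepsilon_{it}) = \sigma_\varepsilon^2 < \infty$, $E(\varepsilon_{it}^4) = k + 3\sigma_\varepsilon^4$, for all $i$ and $t$, i.e. $\varepsilon_{it} \sim IID(0, \sigma_\varepsilon^2)$. These assumptions about $\varepsilon_{it}$ correspond to those about $u_{it}$, summarized by Assumption 1. Using (10), models M1 and M2 now become

$$M1^*:\; y_i = \phi^* y_{i-1} + \rho^* \Delta y_{i-1} + a_i^* + \varepsilon_i, \quad t = 3, 4, \ldots, T \text{ and } i = 1, 2, \ldots, N, \tag{11}$$

$$M2^*:\; y_i = \phi^* y_{i-1} + \rho^* \Delta y_{i-1} + a_i^* + \beta_i^* + \phi(1-\rho)\beta_i + \varepsilon_i, \tag{12}$$

respectively, where $\phi^* = (\phi + \rho(1-\phi))$, $\rho^* = \rho\phi$, $a_i^* = (1-\phi)(1-\rho)\left(a_i^{(\lambda)} e^{(\lambda)} + a_i^{(1-\lambda)} e^{(1-\lambda)}\right)$, $\beta_i^* = (1-\phi)(1-\rho)\left(\beta_i^{(\lambda)} \tau^{(\lambda)} + \beta_i^{(1-\lambda)} \tau^{(1-\lambda)}\right)$, and $\Delta y_{i-1} = y_{i-1} - y_{i-2}$. Due to the second order lag structure of models $M1^*$ and $M2^*$, given by Eqs. (11) and (12), respectively, the dimension of column vectors $y_i$, $y_{i-1}$ and $\varepsilon_i$ is reduced by two. These are now defined as follows: $y_i = (y_{i3}, \ldots, y_{iT})'$, $y_{i-1} = (y_{i2}, \ldots, y_{iT-1})'$, $y_{i-2} = (y_{i1}, \ldots, y_{iT-2})'$ and $\varepsilon_i = (\varepsilon_{i3}, \ldots, \varepsilon_{iT})'$. This implies that the index sets of break fraction $\lambda$ are now defined as $I^* = \left\{\frac{2}{T}, \ldots, \frac{T-2}{T}\right\}$ for model $M1^*$ and $I^* = \left\{\frac{2}{T}, \ldots, \frac{T-4}{T}\right\}$ for model $M2^*$.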

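Under null hypothesis $H_0: \phi = 1$, models $M1^*$ and $M2^*$ imply the following panel data generating processes: $y_{it} = y_{it-1} + \rho \Delta y_{it-1} + \varepsilon_{it}$ and $y_{it} = (1-\rho)\beta_i + y_{it-1} + \rho \Delta y_{it-1} + \varepsilon_{it}$, respectively, since $\phi^* = 1$ and $\rho^* = \rho$. Under this hypothesis, the pooled least squares estimators of $\phi^*$ and $\rho^*$, denoted as $\hat{\phi}^*$ and $\hat{\rho}^*$, satisfy the following system of equations:

$$\begin{pmatrix} \hat{\phi}^* - 1 \\ \hat{\rho}^* - \rho \end{pmatrix} = \begin{pmatrix} \sum_{i=1}^{N} y_{i-1}' Q_m^{(\lambda)} y_{i-1} & \sum_{i=1}^{N} y_{i-1}' Q_m^{(\lambda)} \Delta y_{i-1} \\ \sum_{i=1}^{N} y_{i-1}' Q_m^{(\lambda)} \Delta y_{i-1} & \sum_{i=1}^{N} \Delta y_{i-1}' Q_m^{(\lambda)} \Delta y_{i-1} \end{pmatrix}^{-1} \begin{pmatrix} \sum_{i=1}^{N} y_{i-1}' Q_m^{(\lambda)} \varepsilon_i \\ \sum_{i=1}^{N} \Delta y_{i-1}' Q_m^{(\lambda)} \varepsilon_i \end{pmatrix},$$

(see, e.g., De Blander and Dhaene, 2011). Based on this system, the next theorem derives the inconsistency functions of $\hat{\phi}^*$ and $\hat{\rho}^*$ under $H_0: \phi = 1$, denoted as $B_\phi(\lambda, \rho^*)$ and $B_\rho(\lambda, \rho^*)$, respectively. This is done for the case where break point $T_0$ is known and under the assumptions on error terms $\varepsilon_{it}$ made before.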

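Theorem 5. Let us assume that break point $T_0$ is known and $\varepsilon_{it} \sim IID(0, \sigma_\varepsilon^2)$, with $E(\varepsilon_{it}^4) = k + 3\sigma_\varepsilon^4$. Then, under $H_0: \phi = 1$ we have:

$$\begin{pmatrix} B_\phi(\lambda, \rho^*) \\ B_\rho(\lambda, \rho^*) \end{pmatrix} \equiv \begin{pmatrix} \mathrm{plim}(\hat{\phi}^* - 1) \\ \mathrm{plim}(\hat{\rho}^* - \rho^*) \end{pmatrix} = \begin{pmatrix} \mathrm{tr}(Q_m^{(\lambda)} K P K') & \mathrm{tr}(Q_m^{(\lambda)} K P) \\ \mathrm{tr}(Q_m^{(\lambda)} K P) & \mathrm{tr}(Q_m^{(\lambda)} P) \end{pmatrix}^{-1} \begin{pmatrix} \mathrm{tr}(Q_m^{(\lambda)} F G' K') \\ \mathrm{tr}(Q_m^{(\lambda)} F G') \end{pmatrix},$$

as $N \to \infty$, where now $Q_m^{(\lambda)}$ has dimension $(T-2) \times (T-2)$, $K$ is a $(T-2) \times (T-2)$ lower triangular matrix of ones (including its main diagonal elements), $P = \dfrac{G + G' - I_{T-2}}{1 - \rho^{*2}}$, where

$$G = \begin{pmatrix} 1 & 0 & 0 & \cdots & 0 \\ \rho^* & 1 & 0 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \rho^{*T-3} & \rho^{*T-4} & \rho^{*T-5} & \cdots & 1 \end{pmatrix}, \quad F = \begin{pmatrix} 0_{(T-3) \times 1} & I_{T-3} \\ 0 & 0_{1 \times (T-3)} \end{pmatrix},$$

and $I_{T-2}$ and $I_{T-3}$ are $(T-2) \times (T-2)$ and $(T-3) \times (T-3)$ identity matrices, respectively.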

The proof of the theorem follows immediately from the results of De Blander and Dhaene (2011), by substituting their annihilator matrix with $Q_m^{(\lambda)}$. The inconsistency function $B_\varphi(\lambda, \rho^*)$ derived by Theorem 5 can be employed to correct estimator $\hat\varphi^*$ for its inconsistency. The resulting inconsistency-corrected estimator can then be used to provide unit root test statistics, following steps analogous to those of the derivation of statistics $Z(\lambda)$. Since $B_\varphi(\lambda, \rho^*)$ depends on $\rho^*$, the correction of $\hat\varphi^*$ for its inconsistency requires a consistent estimator of $\rho^*$. Following De Blander and Dhaene (2011) and Phillips and Sul (2007), this can be obtained based on the inconsistency function of $\hat\rho^*$, given as $B_\rho(\lambda, \rho^*)$ (see Theorem 1). Under $H_0: \varphi = 1$ (implying $\rho^* = \rho$), $B_\rho(\lambda, \rho^*)$ is given by the following relationship: $g(\rho^*, \lambda) \equiv \rho^* + B_\rho(\lambda) = \operatorname{plim} \hat\rho^*$. Based on this, we can derive a consistent estimator of $\rho^*$ (or $\rho$), defined as $\tilde\rho^* = (g(\rho^*, \lambda))^{-1}(\hat\rho^*)$. Note that, for model M2∗, this function may not be monotonic, especially for very large negative or positive values of $\rho$. In this case, one must choose between two estimates, say $\tilde\rho^*_A$ and $\tilde\rho^*_B$, the one that is consistent. We suggest choosing the estimate of $\rho$ which is closest to that obtained by a consistent GMM estimator of model M2∗ under $H_0: \varphi = 1$, given as $\Delta y_{it} = \rho \Delta y_{it-1} + (1 - \rho)\beta_i + \varepsilon_{it}$. That is, the one which minimizes $|\dot\rho^*_{GMM} - \tilde\rho^*_v|$, for $v = \{A, B\}$ (see De Blander and Dhaene, 2011). Substituting this estimator into $B_\varphi(\lambda, \rho^*)$ gives the following inconsistency-corrected estimator of $\varphi^*$: $\tilde\varphi^* = \hat\varphi^* - B_\varphi(\lambda, \tilde\rho^*)$. Under $H_0: \varphi = 1$ and $N \to \infty$, it can be proved that

$$
\sqrt{N} \begin{pmatrix} \tilde\varphi^* - 1 \\ \tilde\rho^* - \rho \end{pmatrix} \xrightarrow{d} N(0, \Omega(k, \sigma_u^2, \lambda)),
$$

where $\Omega(k, \sigma_u^2, \lambda)$ is the $\Omega$ matrix of De Blander and Dhaene (2011), adjusted to allow for a break point based on annihilator matrix $Q_m^{(\lambda)}$. The last asymptotic result implies the following unit root test statistics:

$$
Z^*(\lambda) \equiv \sqrt{N}(\tilde\varphi^* - 1) \xrightarrow{d} N(0, \Omega_{11}(k, \sigma_u^2, \lambda)), \quad \text{for } m = \{M1^*, M2^*\}, \tag{13}
$$

where $\Omega_{11}(k, \sigma_u^2, \lambda)$ is the (1, 1) element (submatrix) of $\Omega(k, \sigma_u^2, \lambda)$.
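The root-selection rule described above (invert the probability-limit map and, when two solutions exist, keep the one closest to the GMM estimate) can be illustrated with a minimal Python sketch. The function name `invert_plim_map`, the search interval, the grid size and the toy map `g_toy` are assumptions made purely for illustration; the toy map is not the $g(\rho^*, \lambda)$ of the paper, which would have to be built from Theorem 5.

```python
import numpy as np

def invert_plim_map(g, rho_hat, rho_gmm, lo=-0.95, hi=0.95, n_grid=2001, tol=1e-12):
    """Solve g(rho) = rho_hat on [lo, hi] by grid search plus bisection.

    If several roots exist (g not monotonic), return the root closest to the
    auxiliary estimate rho_gmm, together with the array of all roots found.
    """
    grid = np.linspace(lo, hi, n_grid)
    vals = np.array([g(r) for r in grid]) - rho_hat
    roots = []
    for a, b, fa, fb in zip(grid[:-1], grid[1:], vals[:-1], vals[1:]):
        if fa == 0.0:                      # grid point is an exact root
            roots.append(a)
            continue
        if fa * fb < 0.0:                  # sign change: bisect inside [a, b]
            while b - a > tol:
                m = 0.5 * (a + b)
                fm = g(m) - rho_hat
                if fa * fm <= 0.0:
                    b = m
                else:
                    a, fa = m, fm
            roots.append(0.5 * (a + b))
    if vals[-1] == 0.0:
        roots.append(grid[-1])
    if not roots:
        raise ValueError("no solution of g(rho) = rho_hat on the search interval")
    roots = np.array(roots)
    return roots[np.argmin(np.abs(roots - rho_gmm))], roots

# Hypothetical non-monotonic map, used only to exercise the selection rule.
g_toy = lambda r: r - r ** 2
rho_tilde, all_roots = invert_plim_map(g_toy, rho_hat=0.21, rho_gmm=0.35)
```

In this toy case the equation has two solutions on the search interval, and the rule keeps the one nearer to the auxiliary estimate.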

3.2. The date of the break point is unknown

When break point $T_0$ is unknown, the sequential version of $Z^*(\lambda)$ is defined as

$$
z^* \equiv \min_{\lambda \in I} \Omega_{11}^{-1}(k, \sigma_u^2, \lambda) Z^*(\lambda),
$$

where $Z^*(\lambda)$ is given by (13). Based on this test statistic, null hypothesis $H_0: \varphi = 1$ will be rejected if $z^* < c^*_{\min,a}$, where $c^*_{\min,a}$ is a left-tail critical value of the limiting distribution of $z^*$ at a significance level $a$. As in Theorem 3, this limiting

distribution can be obtained as the minimum of a fixed number of correlated normally distributed variables. However, the covariance matrix of these variables cannot be derived analytically, as in the case of test statistics $z$. In addition to this, this distribution will depend on the serial correlation nuisance parameter $\rho$, whose effects on the limiting distribution of statistic $z^*$ are unknown. This happens because estimator $\tilde\varphi^*$ is computed numerically and submatrix $\Omega_{11}(k, \sigma_u^2, \lambda)$ has a very complicated form. Thus, to obtain critical values $c^*_{\min,a}$ we will rely on the bootstrap simulation method. For model M1∗, this method iterates the following steps:

1. Estimate the following regression implied by model M1∗ under $H_0: \varphi = 1$:

$$
\Delta y_i = \rho^* \Delta y_{i,-1} + \varepsilon_i, \quad i = 1, \ldots, N, \tag{14}
$$

based on the pooled LS estimator, and obtain the vector of centered residuals $\bar\varepsilon_i = \hat\varepsilon_i - \frac{1}{N}\sum_{j=1}^{N} \hat\varepsilon_j$, for $i = 1, \ldots, N$ (see, e.g., Park, 2003). This is because the limiting distribution of test statistic $z^*$ is derived under this hypothesis.

2. Resample with replacement from vector $\bar\varepsilon_i$, for all $i$, and denote the bootstrap samples as $\varepsilon^*_i$.

3. Construct recursively the values of the vector of error terms $u^*_i$ from vector $\varepsilon^*_i$ based on the following regression model:

$$
u^*_i = \rho^* u^*_{i,-1} + \varepsilon^*_i, \quad i = 1, \ldots, N, \tag{15}
$$

taking the initial values of $u^*_{i2}$ to be $u^*_{i2} = \Delta y_{i2}$ (see Chang and Park, 2003).

4. Construct recursively the values of $y^*_i$ based on the following model:

$$
y^*_i = y^*_{i,-1} + u^*_i, \quad i = 1, \ldots, N, \tag{16}
$$

assuming $y^*_{i1} = y_{i1}$ (see Chang and Park, 2003). The bootstrap samples must be built assuming $\varphi = 1$, otherwise they will not behave as unit root processes (see Basawa et al., 1991).

5. Calculate the minimum of the following statistic:

$$
\sqrt{N}\, \Omega_{11}^{-1/2}(k, \sigma_u^2, \lambda) (\tilde\varphi^{*(b)} - \tilde\varphi^*), \quad \text{for all } \lambda \in I,
$$

where $\tilde\varphi^{*(b)}$ is the estimator of $\varphi^*$ based on the bootstrap sample, while $\tilde\varphi^*$ is its sample estimator, defined before.
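Steps 1 to 4 of this procedure can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions, not the authors' code: the function name `bootstrap_panel_m1star` is hypothetical, the panel is stored as an N × T array with columns $y_{i1}, \ldots, y_{iT}$, and resampling is assumed to draw whole residual vectors across cross-section units. Step 5 is omitted because it needs the numerically computed $\tilde\varphi^*$ and $\Omega_{11}$.

```python
import numpy as np

def bootstrap_panel_m1star(y, rng):
    """Generate one bootstrap panel under H0 (phi = 1) for the AR(2)-type model M1*.

    y   : (N, T) array, columns are y_{i1}, ..., y_{iT}
    rng : numpy random Generator
    Returns the bootstrap panel, the pooled LS estimate of rho* and the
    centred residuals.
    """
    N, T = y.shape
    dy = np.diff(y, axis=1)                      # dy[:, s] is Delta y_{i,s+2}
    x, z = dy[:, :-1], dy[:, 1:]                 # lagged and current first differences
    rho_hat = np.sum(x * z) / np.sum(x * x)      # step 1: pooled LS of regression (14)
    eps_hat = z - rho_hat * x                    # residual vectors, shape (N, T-2)
    eps_bar = eps_hat - eps_hat.mean(axis=0, keepdims=True)  # centre across units

    idx = rng.integers(0, N, size=N)             # step 2: resample units with replacement
    eps_star = eps_bar[idx]                      # (assumption: whole vectors are drawn)

    u_star = np.empty((N, T - 1))                # step 3: recursion (15)
    u_star[:, 0] = dy[:, 0]                      # initial value u*_{i2} = Delta y_{i2}
    for s in range(1, T - 1):
        u_star[:, s] = rho_hat * u_star[:, s - 1] + eps_star[:, s - 1]

    y_star = np.empty((N, T))                    # step 4: recursion (16) with phi = 1
    y_star[:, 0] = y[:, 0]                       # initial value y*_{i1} = y_{i1}
    y_star[:, 1:] = y[:, [0]] + np.cumsum(u_star, axis=1)
    return y_star, rho_hat, eps_bar

rng = np.random.default_rng(0)
y = np.cumsum(rng.standard_normal((50, 15)), axis=1)   # toy random-walk panel
y_star, rho_hat, eps_bar = bootstrap_panel_m1star(y, rng)
```

Repeating the call B times and evaluating the test statistic on each `y_star` would give the empirical distribution from which a critical value is read.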

The above steps are iterated a number of times. This bootstrap procedure provides consistent estimates of the limiting distribution of $z^*$, by the theorem of Mammen (1992) and the continuous mapping theorem (see Cameron and Trivedi, 2005 and Horowitz, 2001). The empirical distributions of the minimum of statistics $\sqrt{N}\, \Omega_{11}^{-1/2}(k, \sigma_u^2, \lambda)(\tilde\varphi^{*(b)} - \tilde\varphi^*)$ obtained through these iterations can be employed to calculate critical value $c^*_{\min,a}$. Although the above bootstrap procedure can be successfully applied to test for unit roots based on model M1∗ (see our simulation results of the next section), its application to panel data model M2∗, which allows for linear trends, is not straightforward when $T$ is fixed. This is due to the presence of individual effects $\beta_i$ under hypothesis $H_0: \varphi = 1$, as M2∗ becomes $y_i = \beta_i(1 - \rho) + y_{i,-1} + \rho^* \Delta y_{i,-1} + \varepsilon_i$. To implement the above bootstrap method in the case of model M2∗, we need consistent estimates of $\rho^*$ and $\hat\varepsilon_i$ under this hypothesis. While an N-consistent estimator of $\rho^*$ can be obtained based on the following transformation of model M2∗: $\Delta^2 y_i = \rho^* \Delta^2 y_{i,-1} + \Delta\varepsilon_i$, which wipes off individual effects $\beta_i$ by taking second differences of model M2∗ under $H_0: \varphi = 1$ (see, e.g., Han and Phillips, 2010), this transformation will not provide consistent estimates of $\hat\varepsilon_i$. It can only give consistent estimates of the first difference of $\hat\varepsilon_i$, $\Delta\hat\varepsilon_i$. Based on a Monte Carlo simulation analysis, we have found that a bootstrap method based on $\Delta\hat\varepsilon_i$ provides inaccurate estimates of the resampled values of vectors $\varepsilon^*_i$ and $y^*_i$. These lead to a test statistic $z^*$ which is seriously undersized and has very small power.

To sum up, the results of this section show that, in order to extend sequential unit root test statistics $z$ to the case of AR(2) panel data models M1∗ and M2∗, we can rely on the bootstrap simulation method, since the limiting distribution of these statistics depends on the serial correlation nuisance parameters. However, implementation of this method is straightforward only for model M1∗. For model M2∗, it requires consistent estimates of the individual effects of the panel, which is not feasible for short (fixed-T) panel data models. Taking second differences of model M2∗, which wipes off individual effects, will not provide accurate estimates of the bootstrap samples of residuals and panel data series.

4. Simulation results

In this section, we conduct a Monte Carlo simulation exercise to evaluate the small sample size and power performance of the test statistics suggested in the previous sections. Our exercise is based on 10 000 Monte Carlo simulations. This exercise considers different combinations of $N$ and $T$, and both known and unknown break point cases. For useful comparisons with other panel data unit root tests, we present size and power values of the tests under the assumption that error terms $u_{it}$ and $\varepsilon_{it}$ are normally distributed. Our analysis starts with test statistics $Z(\lambda)$ and $z$, for models M1 and M2, and then proceeds to test statistics $Z^*(\lambda)$ and $z^*$, for models M1∗ and M2∗.

Tables 2(a)–(b) report the empirical size at the $a = 5\%$ level and the power of standardized test statistics $C(\lambda)^{-1/2} Z(\lambda)$, for the case of known break point. We consider the following cases of break fractions: $\lambda = \{[0.25T], [0.50T], [0.75T]\}$, where $[\cdot]$ denotes the lower integer. To calculate the power of the tests for the above values of $\lambda$, we have generated panel data model series $y_{it}$ under $H_a: \varphi < 1$, for $\varphi = \{0.95, 0.90\}$. The initial observations of the panel data models are generated as $y_{i0} \sim NIID(0, 1)$, for all $i$, while their individual effects and slope coefficients of the individual linear trends are assumed to

Table 2(a)

Rejection probabilities of statistic $C(\lambda)^{-1/2} Z(\lambda)$ for model M1 (known break).

N           25    25    50    50    50    100   100   100   100
T           10    15    10    15    25    10    15    25    50

λ = 0.25
ϕ = 1.00    0.07  0.07  0.06  0.06  0.06  0.06  0.06  0.06  0.06
ϕ = 0.95    0.24  0.34  0.34  0.51  0.73  0.51  0.74  0.94  1.00
ϕ = 0.90    0.43  0.62  0.65  0.86  0.99  0.87  0.99  1.00  1.00

λ = 0.50
ϕ = 1.00    0.06  0.06  0.06  0.06  0.06  0.05  0.05  0.06  0.06
ϕ = 0.95    0.20  0.29  0.29  0.44  0.68  0.43  0.65  0.91  1.00
ϕ = 0.90    0.39  0.55  0.58  0.78  0.97  0.82  0.96  1.00  1.00

λ = 0.75
ϕ = 1.00    0.07  0.07  0.06  0.06  0.06  0.06  0.06  0.06  0.06
ϕ = 0.95    0.24  0.36  0.33  0.54  0.81  0.51  0.78  0.97  1.00
ϕ = 0.90    0.46  0.68  0.68  0.90  0.99  0.91  0.99  1.00  1.00

Notes: The table presents the size at the 5% nominal level (see ϕ = 1) and the power of the test statistic $C(\lambda)^{-1/2} Z(\lambda)$ at the 5% nominal level under the alternative hypotheses ϕ = {0.95, 0.90}, for model M1.

be distributed as follows: $a_i^{(\lambda)} \sim U(-0.5, 0)$, $a_i^{(1-\lambda)} \sim U(0, 0.5)$, $\beta_i^{(\lambda)} \sim U(0.0, 0.025)$ and $\beta_i^{(1-\lambda)} \sim U(0.025, 0.05)$, where $U(\cdot)$ denotes the uniform distribution. The small magnitudes of nuisance parameters $a_i^{(\lambda)}$, $a_i^{(1-\lambda)}$, $\beta_i^{(\lambda)}$ and $\beta_i^{(1-\lambda)}$ implied by the above distributions are consistent with evidence reported in the literature, which indicates small differences of them across $i$ (see, e.g., Hall and Mairesse, 2005). These will make rejections of null hypothesis $H_0: \varphi = 1$ a very difficult task.
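A data-generating process of this kind can be sketched in Python for model M1. The recursion $y_{it} = \varphi y_{i,t-1} + (1 - \varphi) a_i + u_{it}$ with a break in $a_i$ after $T_0 = [\lambda T]$ is assumed here because it is the form implied by the vector $\gamma^{(\lambda)}_{i,M1}$ used in the proof of Theorem 2; the function name `simulate_m1` and the unit error variance are illustrative assumptions.

```python
import numpy as np

def simulate_m1(N, T, phi, lam, rng, sigma_u=1.0):
    """Simulate an (N, T+1) panel y_{i0}, ..., y_{iT} with a common break in the
    individual effects after T0 = floor(lam * T).

    Assumed recursion: y_it = phi * y_{i,t-1} + (1 - phi) * a_i + u_it.
    """
    T0 = int(np.floor(lam * T))
    y = np.empty((N, T + 1))
    y[:, 0] = rng.standard_normal(N)              # y_{i0} ~ NIID(0, 1)
    a_pre = rng.uniform(-0.5, 0.0, size=N)        # a_i^{(lambda)}  ~ U(-0.5, 0)
    a_post = rng.uniform(0.0, 0.5, size=N)        # a_i^{(1-lambda)} ~ U(0, 0.5)
    u = sigma_u * rng.standard_normal((N, T))     # normal errors (variance assumed)
    for t in range(1, T + 1):
        a = a_pre if t <= T0 else a_post          # regime-specific individual effect
        y[:, t] = phi * y[:, t - 1] + (1.0 - phi) * a + u[:, t - 1]
    return y

panel_alt = simulate_m1(N=50, T=15, phi=0.95, lam=0.5, rng=np.random.default_rng(1))
panel_flat = simulate_m1(N=25, T=10, phi=1.0, lam=0.5,
                         rng=np.random.default_rng(2), sigma_u=0.0)
```

With $\varphi = 1$ the individual effects drop out of the recursion, which is why the tests are similar with respect to them under the null.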

The results of Tables 2(a)–(b) clearly indicate that test statistics $Z(\lambda)$ have size which is very close to the nominal 5% level, even though $N$ and $T$ are relatively small. This is true for both models M1 and M2. Their power performance is also very satisfactory, especially for model M1. It is analogous to that of the Harris and Tzavalis (1999, 2004) panel unit root tests, which do not allow for structural breaks (see Tables 2(a)–(b) of Harris and Tzavalis, 1999 and Table 1 of Harris and Tzavalis, 2004). The power of $Z(\lambda)$ is much higher than that of single time series unit root tests with, or without, structural breaks (see, e.g., Perron, 2006, for a survey). To achieve levels of power or size which are analogous to those reported in Tables 2(a)–(b), single time series unit root tests allowing for breaks consider values of $\varphi$ which are far below unity, i.e. $\varphi = 0.80$, or they consider very large values of $T$, i.e. $T = \{100, 200\}$ (see, e.g., Kim and Perron, 2009). Analogous values of $T$ are also required by large-T panel data unit root tests allowing for a common break (see, e.g., Tables 2 and 3 of Bai and Carrion-i-Silvestre, 2009 and Chan and Pauwels, 2011). As in other panel data unit root tests, the power of test statistics $Z(\lambda)$ is found to increase as both $N$ and $T$ increase, but it grows faster with $T$ than with $N$.

The better power performance of $Z(\lambda)$ for model M1 than for M2, which considers broken individual linear trends under $H_a: \varphi < 1$, is consistent with simulation evidence provided in the time series or panel data literature allowing for structural breaks, or not (see, e.g., Bai and Carrion-i-Silvestre, 2009, Harris and Tzavalis, 2004, and Vogelsang and Perron, 1998). However, the power deterioration of $Z(\lambda)$ for model M2 is much smaller than that of single time series unit root tests allowing for deterministic trends. This can be attributed to the fact that panel data unit root tests exploit sample information across two different dimensions of the data: the cross-sectional and the time dimension.

To evaluate the performance of large-T extensions of $Z(\lambda)$ in small samples, Table 3 presents size and power values of our standardized test statistics $Z'(\lambda)$, which assume large $T$ (see Eq. (6)). Note that, for reasons of space, the table reports results only for the case where break fraction $\lambda$ is in the middle of the sample, i.e. $\lambda = 0.50$. Analogous results are obtained for $\lambda = \{0.25, 0.75\}$. Comparing the results of Table 3 to those of Tables 2(a)–(b), it can be concluded that both the size and the power of $Z'(\lambda)$ are much lower than those of $Z(\lambda)$ in small samples. The reported size and power values of statistics $Z'(\lambda)$ are close to zero even for panel data sets with quite large $T$, e.g., $T = 50$. This magnitude of $T$ is larger than that required by large-T panel unit root tests which do not consider breaks to work satisfactorily (see Table 3 of Harris and Tzavalis, 1999). This can be attributed to the fact that, apart from $T$, the inconsistency functions $B_1(\lambda)$ and $B_2(\lambda)$ or variance functions $D_1(\lambda)$ and $D_2(\lambda)$ of test statistic $Z'(\lambda)$, defined by Eq. (6), depend also on break fraction $\lambda$.

The results of our simulation study regarding the size and power performance of sequential test statistics $z \equiv \min_{\lambda \in I} C(\lambda)^{-1/2} Z(\lambda)$, where $Z(\lambda) = \sqrt{N}(\hat\varphi - 1 - B(\lambda))$, which treat break point $T_0$ of panel data models M1 and M2 as unknown, are reported in Tables 4(a)–(b), respectively. The rejection probabilities of these statistics are calculated based on the critical values reported in Table 1, for the 5% significance level. The results of these tables clearly indicate that both the size and power performance of statistics $z$ are very satisfactory, especially for model M1. For both models M1 and M2, the values of size and power of $z$ reported in the tables are very close to those corresponding to the case where $T_0$ is known, i.e. for statistics $Z(\lambda)$ (see Tables 2(a)–(b)). This is true even for very small $T$. In fact, the power of statistics $z$ is slightly higher than that of $Z(\lambda)$, for almost all cases of $N$, $T$ and $\lambda$ examined. Evidence that sequential test statistics like $z$ have higher power than standardized test statistics $Z(\lambda)$, which assume a known break point, is also provided in the literature of single time series analysis (see Fig. 1 of Kim and Perron, 2009). In general, it may be attributed to the fact that sequential testing procedures minimize test statistics like $Z(\lambda)$ assuming that every alternative hypothesis $H_a: \varphi < 1$, indexed by $\lambda$, is based

Table 2(b)

Rejection probabilities of statistic $C(\lambda)^{-1/2} Z(\lambda)$ for model M2 (known break).

N           25    25    50    50    50    100   100   100   100
T           10    15    10    15    25    10    15    25    50

λ = 0.25
ϕ = 1.00    0.06  0.06  0.05  0.06  0.06  0.05  0.05  0.06  0.05
ϕ = 0.95    0.07  0.07  0.06  0.08  0.10  0.07  0.08  0.12  0.39
ϕ = 0.90    0.08  0.11  0.08  0.12  0.27  0.10  0.16  0.39  0.99

λ = 0.50
ϕ = 1.00    0.05  0.05  0.05  0.05  0.05  0.05  0.05  0.06  0.05
ϕ = 0.95    0.06  0.06  0.06  0.06  0.08  0.06  0.06  0.09  0.26
ϕ = 0.90    0.07  0.07  0.07  0.09  0.16  0.07  0.10  0.24  0.91

λ = 0.75
ϕ = 1.00    0.06  0.06  0.05  0.06  0.06  0.05  0.05  0.06  0.06
ϕ = 0.95    0.06  0.07  0.06  0.07  0.10  0.06  0.08  0.12  0.42
ϕ = 0.90    0.07  0.10  0.08  0.12  0.26  0.09  0.15  0.40  0.99

Notes: The table presents the size at the 5% nominal level (see ϕ = 1) and the power of the test statistic $C(\lambda)^{-1/2} Z(\lambda)$ at the 5% nominal level under the alternative hypotheses ϕ = {0.95, 0.90}, for model M2.

Table 3

Rejection probabilities of $D_m(\lambda)^{-1/2} Z'(\lambda)$ assuming large T and known break.

N           25    25    50    50    50    100   100   100   100
T           10    15    10    15    25    10    15    25    50

Panel A: Model M1
ϕ = 1.00    0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.02
ϕ = 0.95    0.00  0.00  0.00  0.00  0.00  0.00  0.04  0.53  0.99
ϕ = 0.90    0.00  0.00  0.00  0.00  0.00  0.01  0.33  0.98  1.00

Panel B: Model M2
ϕ = 1.00    0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00
ϕ = 0.95    0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00
ϕ = 0.90    0.00  0.00  0.00  0.00  0.00  0.00  0.01  0.00  0.22

Notes: The table presents the size at the 5% nominal level (see ϕ = 1) and the power of the test statistic $D_m(\lambda)^{-1/2} Z'(\lambda)$ at the 5% nominal level under the alternative hypotheses ϕ = {0.95, 0.90} and λ = 0.50.

Table 4(a)

Rejection probabilities of statistic $z \equiv \min_{\lambda \in I} C(\lambda)^{-1/2} Z(\lambda)$ for model M1 (unknown break).

N           25    25    50    50    50    100   100   100   100
T           10    15    10    15    25    10    15    25    50

λ = 0.25
ϕ = 1.00    0.09  0.08  0.08  0.07  0.08  0.07  0.06  0.07  0.07
ϕ = 0.95    0.34  0.43  0.46  0.62  0.90  0.66  0.86  1.00  1.00
ϕ = 0.90    0.59  0.77  0.81  0.95  1.00  0.97  1.00  1.00  1.00

λ = 0.50
ϕ = 1.00    0.09  0.08  0.08  0.07  0.08  0.07  0.06  0.06  0.07
ϕ = 0.95    0.33  0.43  0.46  0.62  0.89  0.67  0.86  1.00  1.00
ϕ = 0.90    0.60  0.76  0.82  0.95  1.00  0.97  1.00  1.00  1.00

λ = 0.75
ϕ = 1.00    0.09  0.08  0.08  0.07  0.07  0.07  0.06  0.07  0.07
ϕ = 0.95    0.33  0.44  0.46  0.63  0.90  0.67  0.86  1.00  1.00
ϕ = 0.90    0.60  0.77  0.81  0.95  1.00  0.97  1.00  1.00  1.00

Notes: The table presents the size at the 5% nominal level (see ϕ = 1) and the power of sequential test statistic $z \equiv \min_{\lambda \in I} C(\lambda)^{-1/2} Z(\lambda)$ at the 5% nominal level under the alternative hypotheses ϕ = {0.95, 0.90}, for model M1.

on the correct date. This tends to be in favor of the alternative hypothesis $H_a: \varphi < 1$, as noted by Zivot and Andrews (1992). Monte Carlo analysis has shown that the power gains of test statistics $z$ over $Z(\lambda)$ can be mainly attributed to the mean effects and, in particular, the adjustment for the inconsistency of the LS estimator $\hat\varphi$ of the limiting distributions of these statistics under alternative hypothesis $H_a: \varphi < 1$, given that their variance functions (e.g., $C(\lambda)$ for $Z(\lambda)$) hardly change under this hypothesis, especially in the neighborhood of unity (see, e.g., De Wachter et al., 2007, Harris and Tzavalis, 1999, Madsen, 2010 and Moon and Perron, 2008).
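Because the sequential statistic is the minimum of a fixed number of correlated standard normal variables in the limit, its left-tail critical values can be approximated by simulation once a correlation matrix is available. The Python sketch below illustrates the idea only: the function name `min_normal_critical_value` and the equicorrelated matrix `corr_toy` are hypothetical and do not reproduce the analytic covariance matrix behind Table 1.

```python
import numpy as np

def min_normal_critical_value(corr, alpha=0.05, n_draws=200_000, seed=0):
    """Approximate the alpha-quantile of min_j X_j, where X ~ N(0, corr).

    corr must be a positive definite correlation matrix (Cholesky is used).
    """
    rng = np.random.default_rng(seed)
    corr = np.atleast_2d(np.asarray(corr, dtype=float))
    chol = np.linalg.cholesky(corr)
    # Each row of `draws` is one realisation of the correlated normal vector.
    draws = rng.standard_normal((n_draws, corr.shape[0])) @ chol.T
    return float(np.quantile(draws.min(axis=1), alpha))

# With a single candidate break date the statistic is standard normal.
cv_single = min_normal_critical_value([[1.0]])

# Hypothetical equicorrelated case with five candidate break dates.
corr_toy = 0.5 * np.ones((5, 5)) + 0.5 * np.eye(5)
cv_seq = min_normal_critical_value(corr_toy)
```

Minimizing over several break dates pushes the critical value further into the left tail than the single-date normal quantile, which is why the sequential test cannot reuse standard normal critical values.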

Table 4(b)

Rejection probabilities of statistic $z \equiv \min_{\lambda \in I} C(\lambda)^{-1/2} Z(\lambda)$ for model M2 (unknown break).

N           25    25    50    50    50    100   100   100   100
T           10    15    10    15    25    10    15    25    50

λ = 0.25
ϕ = 1.00    0.06  0.06  0.05  0.06  0.06  0.06  0.06  0.06  0.06
ϕ = 0.95    0.06  0.07  0.06  0.08  0.12  0.06  0.08  0.14  0.53
ϕ = 0.90    0.07  0.11  0.08  0.13  0.33  0.09  0.17  0.51  1.00

λ = 0.50
ϕ = 1.00    0.05  0.06  0.06  0.06  0.07  0.05  0.06  0.06  0.06
ϕ = 0.95    0.06  0.08  0.06  0.08  0.11  0.06  0.08  0.14  0.49
ϕ = 0.90    0.07  0.10  0.07  0.12  0.30  0.09  0.16  0.48  1.00

λ = 0.75
ϕ = 1.00    0.06  0.07  0.05  0.06  0.07  0.06  0.05  0.06  0.06
ϕ = 0.95    0.06  0.08  0.06  0.08  0.11  0.06  0.08  0.13  0.49
ϕ = 0.90    0.07  0.10  0.08  0.12  0.31  0.09  0.16  0.47  1.00

Notes: The table presents the size at the 5% nominal level (see ϕ = 1) and the power of the sequential test statistic $z \equiv \min_{\lambda \in I} C(\lambda)^{-1/2} Z(\lambda)$ at the 5% nominal level under the alternative hypotheses ϕ = {0.95, 0.90}, for model M2.

Table 5(a)

Rejection probabilities of statistic $Z^*(\lambda)$ for model M1∗ (known break, λ = 0.5).

N           25             50                            100
T           15             15             25             15             25
ρ           −0.40  0.40    −0.40  0.40    −0.40  0.40    −0.40  0.40    −0.40  0.40

ϕ = 1.00    0.07   0.06    0.06   0.06    0.07   0.06    0.06   0.05    0.06   0.06
ϕ = 0.95    0.29   0.30    0.39   0.43    0.62   0.72    0.58   0.65    0.86   0.92
ϕ = 0.90    0.48   0.56    0.69   0.78    0.93   0.97    0.91   0.96    1.00   1.00

Notes: The table presents the size at the 5% nominal level (see ϕ = 1) and the power of test statistic $Z^*(\lambda)$ at the 5% nominal level under the alternative hypotheses of ϕ = {0.95, 0.90}, for model M1∗.

Table 5(b)

Rejection probabilities of statistic $Z^*(\lambda)$ for model M2∗ (known break, λ = 0.5).

N           25             50                            100
T           15             15             25             15             25
ρ           −0.40  0.40    −0.40  0.40    −0.40  0.40    −0.40  0.40    −0.40  0.40

ϕ = 1.00    0.07   0.04    0.07   0.04    0.07   0.06    0.06   0.05    0.06   0.06
ϕ = 0.95    0.08   0.05    0.08   0.08    0.09   0.08    0.07   0.13    0.10   0.09
ϕ = 0.90    0.10   0.05    0.09   0.09    0.17   0.14    0.10   0.14    0.22   0.19

Notes: The table presents the size at the 5% nominal level (see ϕ = 1) and the power of the test statistic $Z^*(\lambda)$ at the 5% nominal level under the alternative hypotheses of ϕ = {0.95, 0.90}, for model M2∗.

Finally, our last set of simulation results investigates the size and power performance of test statistics $Z^*(\lambda)$ and $z^*$, for the AR(2) models M1∗ and M2∗. Tables 5(a)–(b) report these for test statistics $Z^*(\lambda)$, assuming a known break point, while Table 6 reports them for sequential statistic $z^*$ based on model M1∗. As mentioned before, this statistic does not perform well for model M2∗, since it is based on consistent estimates of $\Delta\hat\varepsilon_i$ to wipe off individual effects of the model. The tables present results for different values of autoregressive parameter $\rho$, i.e. $\rho \in \{-0.4, 0.4\}$. For reasons of space, we consider only the case where the break point occurs in the middle of the sample, i.e. $\lambda = 0.50$, and $T = \{15, 25\}$. Following De Blander and Dhaene (2011), in generating the data we assume that $u_{i0} \sim N(0, (1 - \rho^2)^{-1})$. The number of bootstrap samples considered is $B = 1000$. The values of nuisance parameters of the models considered, i.e. $a_i^{(\lambda)}$, $a_i^{(1-\lambda)}$, $\beta_i^{(\lambda)}$ and $\beta_i^{(1-\lambda)}$, are the same as those assumed in our simulation exercise for models M1 and M2.

The results of Tables 5(a)–(b) indicate that test statistics $Z^*(\lambda)$ perform analogously to statistics $Z(\lambda)$, for models M1 and M2. This is true for both values of autoregressive coefficient $\rho$ considered. The tests have size which is very close to the nominal level, while their power behaves similarly to that of $Z(\lambda)$. In particular, the power of $Z^*(\lambda)$ is lower for model M2∗ than for model M1∗, and it increases faster with $T$ than with $N$. Regarding sequential test statistic $z^*$ for model M1∗, the results of Table 6 indicate that this statistic works satisfactorily for both values of $\rho$. Its performance is similar to that of test statistic $z$ for model M1.

5. Empirical application

As an empirical application of the test statistics presented in the previous sections, we employ the sequential test statistic $z$ for model M2, allowing for broken individual linear trends under $H_a: \varphi < 1$, to investigate if evidence of unit roots in

Table 6

Rejection probabilities of statistic $z^*$ for model M1∗ (unknown break, λ = 0.50).

N           25             50                            100
T           15             15             25             15             25
ρ           −0.40  0.40    −0.40  0.40    −0.40  0.40    −0.40  0.40    −0.40  0.40

ϕ = 1.00    0.06   0.08    0.04   0.08    0.05   0.05    0.04   0.05    0.05   0.05
ϕ = 0.95    0.30   0.45    0.39   0.64    0.75   0.86    0.66   0.80    0.96   1.00
ϕ = 0.90    0.53   0.73    0.74   0.94    0.98   0.99    0.96   1.00    1.00   1.00

Notes: The table presents the size at the 5% nominal level (see ϕ = 1) and the power of test statistic $z^*$ at the 5% nominal level under the alternative hypotheses of ϕ = {0.95, 0.90}, for model M1∗ with λ = 0.5 and T = 10.

Fig. 1. Estimates of Z(λ) over all possible break points.

the trade openness variable, measured as the sum of imports and exports over GDP, is due to trade liberalization policies,

such as tariff barriers reductions. These policies were introduced by many developed or developing countries since the early

nineties (see, e.g., Faini, 2004).

Fig. 1 graphically presents estimates of the above test statistic over all possible break points of the sample. This is done for the group of ‘‘non-oil countries’’ (see, e.g., Mankiw et al., 1992). Our panel data set is taken from the Penn World Tables. This set consists of $N = 97$ cross-sectional units and $T = 40$ time series observations, which cover the period from 1970 to 2009. To mitigate the effects of possible cross-section correlation of disturbance terms $u_{it}$ on the test, all the individual series of the data were taken in deviations from their cross-section mean at each point in time $t$ (see O’Connell, 1998). This procedure wipes out the effects of cross-sectional correlation of $u_{it}$ on panel unit root tests when $u_{it}$ has the following factor representation: $u_{it} = v_t + \zeta_{it}$, where $v_t$ is an IID random variable which is common across all cross-section units $i$ of the panel, and $\zeta_{it}$ are IID disturbance terms. Our choice to employ model M2, instead of model M2∗, to conduct our sequential unit root test is based on evidence that autoregressive coefficient $\rho$ is very close to zero. This is based on a GMM consistent estimator of $\rho$ (see, e.g., Arellano, 2003), which gives an estimate of 0.025. The results of Fig. 1 clearly indicate that the null hypothesis of a unit root in the trade openness variable is rejected in favor of its stationary alternative. The estimate of statistic $z$ is found to be −7.012, which is smaller than the critical value of this statistic at the 5% level given by Table 1.
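The cross-section demeaning step can be illustrated with a short Python sketch. The function name `cross_section_demean` and the simulated factor structure are assumptions made for illustration; the example only shows that a common time effect $v_t$ is removed exactly by subtracting the cross-section mean at each date.

```python
import numpy as np

def cross_section_demean(y):
    """Subtract from each column (date t) of the (N, T) panel its mean over units i."""
    return y - y.mean(axis=0, keepdims=True)

rng = np.random.default_rng(3)
N, T = 97, 40
v = rng.standard_normal((1, T))        # common time effect v_t (hypothetical draws)
zeta = rng.standard_normal((N, T))     # idiosyncratic IID disturbances
u = v + zeta                           # factor representation u_it = v_t + zeta_it

u_dm = cross_section_demean(u)
zeta_dm = cross_section_demean(zeta)
```

Here `u_dm` and `zeta_dm` coincide, so the demeaned disturbances no longer contain the common component.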

6. Conclusions

This paper proposes panel data unit root tests which allow for a common structural break of known or unknown date in the deterministic components of the canonical AR(1) panel data model, namely its individual effects and/or individual linear trends. The suggested tests assume that the time dimension of the panel $T$ is fixed, while its cross-sectional dimension $N$ is large. Thus, they are suitable for short panels used in many microeconomic studies. They can also be employed in macroeconomic studies which rely on low frequency data, e.g. yearly observations.

When the break point is considered as known, the suggested test statistics have a limiting distribution which is normal.

When the break point is unknown, they rely on a sequential testing procedure and, thus, their distribution is not standard.

This procedure entails computing the values of the relevant statistics considering known break over all possible time points

of the sample. Then, the unit root hypothesis can be tested based on the minimum value of these test statistics. This has a

limiting distribution whose critical values can be tabulated as those of the minimum value of a fixed number of correlated

normal variables. The paper derives the analytic formula of the covariance matrix of these normal variables, which is

necessary to obtain critical values of the limiting distribution of the test statistics.

To highlight some of the difficulties in extending the tests to higher than one order of serial correlation of the error terms

of panel data models assuming fixed T, the paper presents extensions of them for the AR(2) panel data model. Since in this

case the limiting distributions of the test statistics for an unknown date break cannot be easily tabulated due to the serial

correlation nuisance parameters of the error terms, the paper suggests a bootstrap method to calculate critical values of

these distributions. This method is found to work efficiently only for the case of the AR(2) panel data model with individual

effects. For the case of the model that also includes individual linear trends, it does not work well due to the presence of

individual effects under the null hypothesis of unit roots. Taking second differences of the model to wipe off these effects

will not provide accurate estimates of the bootstrap samples of residuals and panel data series.

To evaluate the small sample performance of the suggested tests, the paper conducts a Monte Carlo simulation study. This study indicates that the tests have empirical sizes which are very close to their nominal level and very satisfactory power. The latter holds regardless of whether the break point is assumed to be known or unknown. The power of the tests is found to be better than that of large-T panel unit root tests allowing for structural breaks or not, due to the fixed-T assumption. This simulation study also shows that the power of our tests increases with both dimensions of the panel, $N$ and $T$, but faster with $T$. The above results also hold for the case of the AR(2) panel data model, with the exception of the version of the model including individual linear trends in its deterministic component in the case of an unknown break date. This happens for the reasons mentioned above. This case will be the focus of future research. In an empirical application of the sequential version of the tests, the paper shows that evidence of persistence in the trade openness variable for the group of ‘‘non-oil’’ countries can be attributed to a structural break. This is associated with the trade liberalization policies introduced by most developed and developing countries after the early nineties.

Acknowledgments

The authors would like to thank the editor, an associate editor and three anonymous referees, as well as G. Dhaene, Z.

Psaradakis, participants at CFE-9 conference held in Cyprus 2009 and at the 16th International Conference on Panel Data

held in Amsterdam, 2010, for useful comments.

Appendix

In this appendix we present the proofs of the main theoretical results of the paper.

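Proof of Theorem 1. To derive the limiting distribution of the test statistics given by the theorem, we will proceed as follows. First, we will show that the pooled LS estimator $\hat\varphi$ is inconsistent, as $N \to \infty$, and will derive its inconsistency. Second, we will define a normalized test statistic based on $\hat\varphi$ corrected for its inconsistency and, then, we will derive its limiting distribution under $H_0: \varphi = 1$, as $N \to \infty$.

Solving $y_{i,-1}$ backwards yields:

$$
\begin{aligned}
y_{i,-1} &= e y_{i0} + \Lambda u_i, && \text{for model M1}, \\
y_{i,-1} &= e y_{i0} + \beta_i^{(\lambda)} \Lambda e + \Lambda u_i, && \text{for model M2}.
\end{aligned} \tag{17}
$$

Multiplying both sides of the above equations with annihilator matrix $Q_m^{(\lambda)}$ yields

$$
Q_m^{(\lambda)} y_{i,-1} = Q_m^{(\lambda)} \Lambda u_i, \tag{18}
$$

since $Q_m^{(\lambda)} e = Q_m^{(\lambda)} e^{(\lambda)} = Q_m^{(\lambda)} e^{(1-\lambda)} = 0$, for $m = \{M1, M2\}$. Note that this result also holds in the case where

$$
y_{i,-1} = e y_{i0} + \Lambda e^{(\lambda)} \beta_i^{(\lambda)} + \Lambda e^{(1-\lambda)} \beta_i^{(1-\lambda)} + \Lambda u_i,
$$

which includes broken individual effects $\beta_i^{(\lambda)}$ and $\beta_i^{(1-\lambda)}$ under $H_0: \varphi = 1$.

Substituting (18) into (3) and noticing that $Q_m^{(\lambda)}$ is an idempotent and symmetric matrix yields

$$
\hat\varphi - 1 = \left( \sum_{i=1}^{N} u_i' \Lambda' Q_m^{(\lambda)} u_i \right) \left( \sum_{i=1}^{N} u_i' \Lambda' Q_m^{(\lambda)} \Lambda u_i \right)^{-1}. \tag{19}
$$

Taking probability limits of Eq. (19) gives the inconsistency function of $\hat\varphi$ as follows:

$$
B(\lambda) = \operatorname*{plim}_{N \to \infty} (\hat\varphi - 1) = E\left( u_i' \Lambda' Q_m^{(\lambda)} u_i \right) \left[ E\left( u_i' \Lambda' Q_m^{(\lambda)} \Lambda u_i \right) \right]^{-1} = \operatorname{tr}\left( \Lambda' Q_m^{(\lambda)} \right) \left[ \operatorname{tr}\left( \Lambda' Q_m^{(\lambda)} \Lambda \right) \right]^{-1}, \tag{20}
$$

by KWLLN.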
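The trace formula in (20) is easy to evaluate numerically. The Python sketch below does so for model M1 and compares the two traces with the closed-form expressions for M1 stated in the proof of (6). Two assumptions are made and should be checked against the definitions given earlier in the paper: $Q_{M1}^{(\lambda)}$ is taken to be the annihilator of the two break dummies $e^{(\lambda)}$ and $e^{(1-\lambda)}$, and $\Lambda$ is taken to be the $T \times T$ strictly lower triangular matrix of ones (that is, $\Psi$ evaluated at $\varphi = 1$). The function names are hypothetical.

```python
import numpy as np

def annihilator_m1(T, T0):
    """Annihilator of the break dummies: ones in rows 1..T0 and in rows T0+1..T."""
    X = np.zeros((T, 2))
    X[:T0, 0] = 1.0
    X[T0:, 1] = 1.0
    return np.eye(T) - X @ np.linalg.solve(X.T @ X, X.T)

def inconsistency_m1(T, T0):
    """Return B(lambda) = tr(L'Q) / tr(L'QL) together with both traces."""
    if not 1 <= T0 <= T - 1:
        raise ValueError("T0 must satisfy 1 <= T0 <= T - 1")
    Q = annihilator_m1(T, T0)
    Lam = np.tril(np.ones((T, T)), k=-1)   # strictly lower triangular matrix of ones
    num = np.trace(Lam.T @ Q)
    den = np.trace(Lam.T @ Q @ Lam)
    return num / den, num, den

T, T0 = 10, 5
lam = T0 / T
B, num, den = inconsistency_m1(T, T0)

# Closed forms for model M1 as stated in the proof of (6).
num_closed = -(T - 2) / 2
den_closed = T ** 2 / 6 * (2 * lam ** 2 - 2 * lam + 1) - 2 / 6
```

For $T = 10$ and $T_0 = 5$ both routes give $\operatorname{tr}(\Lambda' Q) = -4$ and $\operatorname{tr}(\Lambda' Q \Lambda) = 8$, so that $B(\lambda) = -0.5$ under these assumptions.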

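Subtracting $B(\lambda)$ from (19) gives the inconsistency-corrected estimator of $\varphi$, or $\varphi - 1$:

$$
\begin{aligned}
\hat\varphi - 1 - B(\lambda)
&= \left( \sum_{i=1}^{N} \left[ u_i' \Lambda' Q_m^{(\lambda)} u_i - B(\lambda) \left( u_i' \Lambda' Q_m^{(\lambda)} \Lambda u_i \right) \right] \right) \left( \sum_{i=1}^{N} u_i' \Lambda' Q_m^{(\lambda)} \Lambda u_i \right)^{-1} \\
&= \left( \sum_{i=1}^{N} \xi_i^{(\lambda)} \right) \left( \sum_{i=1}^{N} u_i' \Lambda' Q_m^{(\lambda)} \Lambda u_i \right)^{-1},
\end{aligned} \tag{21}
$$

where $\xi_i^{(\lambda)} = u_i' \Lambda' Q_m^{(\lambda)} u_i - B(\lambda)(u_i' \Lambda' Q_m^{(\lambda)} \Lambda u_i)$ is a random variable which has zero mean by construction and constant variance, for all $i$. Using standard results on quadratic forms, $\xi_i^{(\lambda)}$ can be written as follows:

$$
\begin{aligned}
\xi_i^{(\lambda)}
&= u_i' \left[ \tfrac{1}{2} \left( \Lambda' Q_m^{(\lambda)} + Q_m^{(\lambda)} \Lambda \right) \right] u_i - B(\lambda) \left( u_i' \Lambda' Q_m^{(\lambda)} \Lambda u_i \right) \\
&= u_i' \left[ \tfrac{1}{2} \left( \Lambda' Q_m^{(\lambda)} + Q_m^{(\lambda)} \Lambda \right) - B(\lambda) \left( \Lambda' Q_m^{(\lambda)} \Lambda \right) \right] u_i \\
&= u_i' A^{(\lambda)} u_i,
\end{aligned} \tag{22}
$$

where $A^{(\lambda)} = \tfrac{1}{2}(\Lambda' Q_m^{(\lambda)} + Q_m^{(\lambda)} \Lambda) - B(\lambda)(\Lambda' Q_m^{(\lambda)} \Lambda)$ is a symmetric matrix, since its component matrices $\tfrac{1}{2}(\Lambda' Q_m^{(\lambda)} + Q_m^{(\lambda)} \Lambda)$ and $(\Lambda' Q_m^{(\lambda)} \Lambda)$ are symmetric. Using results on quadratic forms for symmetric matrices, it can be shown that the variance of $\xi_i^{(\lambda)}$, denoted as $Var(\xi_i^{(\lambda)})$, can be analytically written as

$$
Var(\xi_i^{(\lambda)}) = Var[u_i' A^{(\lambda)} u_i] = k \sum_{j=1}^{T} a_{jj}^{(\lambda)2} + 2 \sigma_u^4 \operatorname{tr}\left( A^{(\lambda)2} \right), \tag{23}
$$

(see Anderson, 1971).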
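The matrix $A^{(\lambda)}$ and the variance formula (23) can be evaluated numerically in the same way. The Python sketch below does this for model M1, again assuming (as an illustration, not as a statement of the paper's definitions) that $Q_{M1}^{(\lambda)}$ annihilates the two break dummies and that $\Lambda$ is the strictly lower triangular matrix of ones. The function names are hypothetical, and `a_jj` in (23) is read as the $j$-th diagonal element of $A^{(\lambda)}$.

```python
import numpy as np

def quadratic_form_matrix_m1(T, T0):
    """Build A = 0.5 * (L'Q + QL) - B * (L'QL) for model M1 with break after T0."""
    X = np.zeros((T, 2))
    X[:T0, 0] = 1.0
    X[T0:, 1] = 1.0
    Q = np.eye(T) - X @ np.linalg.solve(X.T @ X, X.T)   # symmetric, idempotent
    Lam = np.tril(np.ones((T, T)), k=-1)
    LQ = Lam.T @ Q
    LQL = Lam.T @ Q @ Lam
    B = np.trace(LQ) / np.trace(LQL)                    # inconsistency, Eq. (20)
    A = 0.5 * (LQ + LQ.T) - B * LQL                     # Q symmetric => QL = (L'Q)'
    return A, B, LQL

def var_xi(A, sigma_u2=1.0, k=0.0):
    """Variance of u'Au from Eq. (23); k = 0 corresponds to normal errors."""
    return k * np.sum(np.diag(A) ** 2) + 2.0 * sigma_u2 ** 2 * np.trace(A @ A)

A, B, LQL = quadratic_form_matrix_m1(T=10, T0=5)
v = var_xi(A)
# Variance function under normality, C = 2 tr(A^2) / [tr(L'QL)]^2 (see the proof of (6)).
C = 2.0 * np.trace(A @ A) / np.trace(LQL) ** 2
```

Since $E(u_i' A^{(\lambda)} u_i) = \sigma_u^2 \operatorname{tr}(A^{(\lambda)})$, the zero-mean property of $\xi_i^{(\lambda)}$ corresponds to $\operatorname{tr}(A^{(\lambda)}) = 0$, which provides a simple numerical check of the construction.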

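The result of Theorem 1 can be proved by scaling (21) appropriately and using the following two asymptotic results, as $N \to \infty$:

$$
\frac{1}{\sqrt{N}} \sum_{i=1}^{N} \xi_i^{(\lambda)} \xrightarrow{d} N(0, Var(\xi_i)), \tag{24}
$$

by CLT, and

$$
\operatorname{plim} \frac{1}{N} \sum_{i=1}^{N} u_i' \Lambda' Q_m^{(\lambda)} \Lambda u_i = \sigma_u^2 \operatorname{tr}\left( \Lambda' Q_m^{(\lambda)} \Lambda \right), \tag{25}
$$

by KWLLN. These results hold under Assumption 1. Note that the condition $k < \infty$ of Assumption 1 guarantees that $Var(\xi_i^{(\lambda)})$ constitutes a finite quantity.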

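Proof of Theorem 2. For simplicity, we will assume that error terms $u_{it}$ are NIID. Under this assumption, the limiting distribution of standardized test statistic $Z(\lambda)$ is given as $C(\lambda)^{-1/2} Z(\lambda) \xrightarrow{L} N(0, 1)$ (see Corollary 1). To prove Theorem 2, we need to show that, as $N \to \infty$, $C(\lambda)^{-1/2} Z(\lambda)$ converges to minus infinity under $H_a: \varphi < 1$. The extension of the proof to the case of non-normal disturbance terms is straightforward.

Define vector $\gamma_{i,M1}^{(\lambda)} = ((1 - \varphi) a_i^{(\lambda)}, (1 - \varphi) a_i^{(1-\lambda)})'$ for model M1 and $\gamma_{i,M2}^{(\lambda)} = ((1 - \varphi) a_i^{(\lambda)} + \varphi \beta_i^{(\lambda)}, (1 - \varphi) a_i^{(1-\lambda)} + \varphi \beta_i^{(1-\lambda)}, (1 - \varphi) \beta_i^{(\lambda)}, (1 - \varphi) \beta_i^{(1-\lambda)})'$ for model M2. Write vector $y_{i,-1}$ under $H_a: \varphi < 1$ as follows:

$$
y_{i,-1} = w y_{i0} + \Psi X_m^{(\lambda)} \gamma_{i,m}^{(\lambda)} + \Psi u_i, \quad \text{for } m = \{M1, M2\}, \tag{26}
$$

where $w = (1, \varphi, \varphi^2, \ldots, \varphi^{T-1})'$ and $\Psi$ is defined as

$$
\Psi = \begin{pmatrix}
0 & 0 & \cdots & \cdots & 0 \\
1 & 0 & & & \vdots \\
\varphi & 1 & \ddots & & \vdots \\
\vdots & \ddots & \ddots & 0 & \vdots \\
\varphi^{T-2} & \varphi^{T-3} & \cdots & 1 & 0
\end{pmatrix}.
$$

Note that under $H_0: \varphi = 1$, we have $\Psi = \Lambda$.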

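By substituting (26) into (3), $C(\lambda)^{-1/2} Z(\lambda)$ can be written as

$$
\sqrt{N}\, C(\lambda)^{-1/2} (\varphi - 1) + C(\lambda)^{-1/2}\,
\frac{\dfrac{1}{\sqrt{N}} \sum_{i=1}^{N} \left( y_{i,-1}' Q_m^{(\lambda)} u_i - B(\lambda)\, y_{i,-1}' Q_m^{(\lambda)} y_{i,-1} \right)}{\dfrac{1}{N} \sum_{i=1}^{N} y_{i,-1}' Q_m^{(\lambda)} y_{i,-1}},
\quad \text{for } m = \{M1, M2\}. \tag{27}
$$

Since $C(\lambda)^{-1/2}$ is bounded, the first term of the last relationship converges to minus infinity. Thus, to prove consistency of $C(\lambda)^{-1/2} Z(\lambda)$ we need to show that the second summand of (27) is bounded. To this end, we need to prove the following two asymptotic results:

$$
\text{(i)} \ \operatorname{plim} \frac{1}{N} \sum_{i=1}^{N} y_{i,-1}' Q_m^{(\lambda)} y_{i,-1} \neq 0
\quad \text{and} \quad
\text{(ii)} \ \operatorname{plim} \frac{1}{\sqrt{N}} \sum_{i=1}^{N} \left( y_{i,-1}' Q_m^{(\lambda)} u_i - B(\lambda)\, y_{i,-1}' Q_m^{(\lambda)} y_{i,-1} \right) \neq +\infty. \tag{28}
$$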

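The first of the above results (see (i)) can be easily proved by substituting (26) into $y_{i,-1}' Q_m^{(\lambda)} y_{i,-1}$. This yields

$$
\begin{aligned}
\frac{1}{N} \sum_{i=1}^{N} y_{i,-1}' Q_m^{(\lambda)} y_{i,-1}
= \frac{1}{N} \sum_{i=1}^{N} \Big[ \,
& y_{i0}^2\, w' Q_m^{(\lambda)} w + y_{i0}\, w' Q_m^{(\lambda)} \Psi X_m^{(\lambda)} \gamma_{i,m}^{(\lambda)} + y_{i0}\, w' Q_m^{(\lambda)} \Psi u_i \\
& + y_{i0}\, \gamma_{i,m}^{(\lambda)\prime} X_m^{(\lambda)\prime} \Psi' Q_m^{(\lambda)} w + \gamma_{i,m}^{(\lambda)\prime} X_m^{(\lambda)\prime} \Psi' Q_m^{(\lambda)} \Psi X_m^{(\lambda)} \gamma_{i,m}^{(\lambda)} + \gamma_{i,m}^{(\lambda)\prime} X_m^{(\lambda)\prime} \Psi' Q_m^{(\lambda)} \Psi u_i \\
& + y_{i0}\, u_i' \Psi' Q_m^{(\lambda)} w + u_i' \Psi' Q_m^{(\lambda)} \Psi X_m^{(\lambda)} \gamma_{i,m}^{(\lambda)} + u_i' \Psi' Q_m^{(\lambda)} \Psi u_i \, \Big].
\end{aligned}
$$

By Assumptions 1, 2 and KWLLN, it can be shown that all the summands involved in the last relationship converge to finite quantities (see also below).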

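The second of the above asymptotic results (see (ii)) can be proved by writing $\frac{1}{\sqrt{N}} \sum_{i=1}^{N} \bigl( y_{i-1}' Q_m^{(\lambda)} u_i - B(\lambda)\, y_{i-1}' Q_m^{(\lambda)} y_{i-1} \bigr)$ as follows:

\[
\frac{1}{\sqrt{N}} \sum_{i=1}^{N} y_{i-1}' Q_m^{(\lambda)} u_i
- B(\lambda)\, \frac{1}{\sqrt{N}} \sum_{i=1}^{N} y_{i-1}' Q_m^{(\lambda)} y_{i-1}. \tag{29}
\]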

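Using (26), the first summand of the last relationship can be decomposed as

\[
\frac{1}{\sqrt{N}} \sum_{i=1}^{N} y_{i-1}' Q_m^{(\lambda)} u_i
= \frac{1}{\sqrt{N}} \sum_{i=1}^{N} \Bigl[
y_{i0}\, w' Q_m^{(\lambda)} u_i
+ \gamma_{i,m}^{(\lambda)\prime} X_m^{(\lambda)\prime} \Psi' Q_m^{(\lambda)} u_i
+ u_i' \Psi' Q_m^{(\lambda)} u_i
\Bigr].
\]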

It can be proved that the terms of this summand converge to finite quantities, which implies that

\[
\operatorname{plim} \frac{1}{\sqrt{N}} \sum_{i=1}^{N} y_{i-1}' Q_m^{(\lambda)} u_i < +\infty.
\]

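This can be shown using the following results, which hold under Assumption 2:

\[
E\Bigl[ \frac{1}{N} \sum_{i=1}^{N} y_{i0}\, w' Q_m^{(\lambda)} u_i \Bigr] = 0
\quad \text{and} \quad
\operatorname{Var}\Bigl[ \frac{1}{\sqrt{N}} \sum_{i=1}^{N} y_{i0}\, w' Q_m^{(\lambda)} u_i \Bigr]
= \sigma_u^2 \sigma_0^2\, w' Q_m^{(\lambda)} w < +\infty,
\]

where $\sigma_0^2$ denotes the variance of the initial condition $y_{i0}$.

\[
E\Bigl[ \frac{1}{N} \sum_{i=1}^{N} \gamma_{i,m}^{(\lambda)\prime} X_m^{(\lambda)\prime} \Psi' Q_m^{(\lambda)} u_i \Bigr] = 0
\quad \text{and}
\]

\[
\operatorname{Var}\Bigl[ \frac{1}{\sqrt{N}} \sum_{i=1}^{N} \gamma_{i,m}^{(\lambda)\prime} X_m^{(\lambda)\prime} \Psi' Q_m^{(\lambda)} u_i \Bigr]
= \sigma_u^2 \operatorname{tr}\bigl( Q_m^{(\lambda)} \Psi X_m^{(\lambda)} \Sigma_\gamma X_m^{(\lambda)\prime} \Psi' Q_m^{(\lambda)} \bigr) < +\infty,
\]

where $\Sigma_\gamma$ is the variance–covariance matrix of the elements of the vector $\gamma_{i,m}^{(\lambda)}$.

\[
E\Bigl[ \frac{1}{N} \sum_{i=1}^{N} u_i' \Psi' Q_m^{(\lambda)} u_i \Bigr]
= \sigma_u^2 \operatorname{tr}\bigl( \Psi' Q_m^{(\lambda)} \bigr) < +\infty
\quad \text{and}
\]

\[
\operatorname{Var}\Bigl[ \frac{1}{\sqrt{N}} \sum_{i=1}^{N} u_i' \Psi' Q_m^{(\lambda)} u_i \Bigr]
= \operatorname{tr}\bigl[ \Psi' Q_m^{(\lambda)} E\bigl( u_i u_i' Q_m^{(\lambda)} \Psi u_i u_i' \bigr) \bigr]
- \sigma_u^4 \bigl[ \operatorname{tr}\bigl( \Psi' Q_m^{(\lambda)} \bigr) \bigr]^2 < +\infty.
\]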
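The form of the last variance can be traced in a few lines. This is only a sketch under stated assumptions: the shorthand $A = \Psi' Q_m^{(\lambda)}$, $x_i = u_i' A u_i$ and $I_T$ is introduced here for illustration, the $u_i$ are taken to be independent and identically distributed across $i$ with $E(u_i u_i') = \sigma_u^2 I_T$, and $Q_m^{(\lambda)}$ is taken to be symmetric, so that $A' = Q_m^{(\lambda)} \Psi$.

```latex
\begin{align*}
E(x_i) &= E\,\operatorname{tr}(A u_i u_i') = \sigma_u^2 \operatorname{tr}(A), \\
E(x_i^2) &= E\bigl[(u_i' A u_i)(u_i' A' u_i)\bigr]
          = \operatorname{tr}\bigl[A\, E(u_i u_i' A' u_i u_i')\bigr], \\
\operatorname{Var}\Bigl(\tfrac{1}{\sqrt{N}} \sum_{i=1}^{N} x_i\Bigr)
         &= \operatorname{Var}(x_i) = E(x_i^2) - \bigl[E(x_i)\bigr]^2 .
\end{align*}
```

Both terms are finite for fixed $T$ whenever the fourth moments of $u_i$ exist, which is the only property the boundedness argument uses.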


Following analogous arguments to the above, we can also prove that the second summand of (29) converges to a finite quantity, i.e. $\operatorname{plim} \frac{1}{\sqrt{N}} \sum_{i=1}^{N} y_{i-1}' Q_m^{(\lambda)} y_{i-1} < +\infty$.

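Proof of (6). According to Corollary 1, as $N \to \infty$ we have

\[
\sqrt{N}\bigl(\hat{\varphi} - 1 - B(\lambda)\bigr) \xrightarrow{L} N\bigl(0, C(\lambda)\bigr).
\]

Then, by the continuous mapping theorem, we have the following asymptotic result:

\[
T\sqrt{N}\bigl(\hat{\varphi} - 1 - B(\lambda)\bigr) \xrightarrow{L} N\bigl(0, T^2 C(\lambda)\bigr).
\]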

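This result holds for any $T$, which means that there are no restrictions on the growth rate between $T$ and $N$. Also,

\[
C(\lambda) = \frac{2 \operatorname{tr}\bigl(A^{(\lambda)2}\bigr)}{\bigl[\operatorname{tr}\bigl(\Lambda' Q_m^{(\lambda)} \Lambda\bigr)\bigr]^2},
\quad \text{where} \quad
A^{(\lambda)} = \frac{1}{2}\bigl(\Lambda' Q_m^{(\lambda)} + Q_m^{(\lambda)} \Lambda\bigr) - B(\lambda)\bigl(\Lambda' Q_m^{(\lambda)} \Lambda\bigr).
\]

Substituting the following polynomial expressions:

\[
\begin{aligned}
\operatorname{tr}\bigl[\bigl(\Lambda' Q_m^{(\lambda)} + Q_m^{(\lambda)} \Lambda\bigr)\bigl(\Lambda' Q_m^{(\lambda)} \Lambda\bigr)\bigr] &= -\operatorname{tr}\bigl(\Lambda' Q_m^{(\lambda)} \Lambda\bigr), \quad \text{for } m \in \{M1, M2\}, \\
\operatorname{tr}\bigl(\Lambda' Q_{M1}^{(\lambda)}\bigr) &= -\frac{T-2}{2}, \\
\operatorname{tr}\bigl(\Lambda' Q_{M1}^{(\lambda)} \Lambda\bigr) &= \frac{T^2}{6}(2\lambda^2 - 2\lambda + 1) - \frac{2}{6}, \\
\operatorname{tr}\bigl[\bigl(\Lambda' Q_{M1}^{(\lambda)} + Q_{M1}^{(\lambda)} \Lambda\bigr)^2\bigr] &= \frac{T^2}{6}(2\lambda^2 - 2\lambda + 1) + T - \frac{7}{3}, \\
\operatorname{tr}\bigl[\bigl(\Lambda' Q_{M1}^{(\lambda)} \Lambda\bigr)^2\bigr] &= \frac{1}{90}(2\lambda^4 - 4\lambda^3 + 6\lambda^2 - 4\lambda + 1)T^4 + \frac{1}{36}(2\lambda^2 - 2\lambda + 1)T^2 - \frac{7}{90}, \\
\operatorname{tr}\bigl(\Lambda' Q_{M2}^{(\lambda)}\bigr) &= \frac{4-T}{2}, \\
\operatorname{tr}\bigl(\Lambda' Q_{M2}^{(\lambda)} \Lambda\bigr) &= \frac{2T^2}{30}(2\lambda^2 - 2\lambda + 1) - \frac{16}{30}, \\
\operatorname{tr}\bigl[\bigl(\Lambda' Q_{M2}^{(\lambda)} + Q_{M2}^{(\lambda)} \Lambda\bigr)^2\bigr] &= \frac{T^2}{30}(2\lambda^2 - 2\lambda + 1) + T - \frac{128}{30}, \\
\operatorname{tr}\bigl[\bigl(\Lambda' Q_{M2}^{(\lambda)} \Lambda\bigr)^2\bigr] &= \frac{11}{12\,600}(2\lambda^4 - 4\lambda^3 + 6\lambda^2 - 4\lambda + 1)T^4 + \frac{137}{12\,600}(2\lambda^2 - 2\lambda + 1)T^2 - \frac{181}{1575},
\end{aligned}
\]

into the above variance function $C(\lambda)$ and taking the limit as $T \to \infty$ proves (6).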
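The closed-form traces used in this proof can be compared with direct matrix computations. The sketch below rests on assumptions not stated in this appendix: $Q_m^{(\lambda)}$ is taken to be the annihilator matrix of two break dummies (model M1), plus two broken linear trends (model M2), the break falls after an integer number `T1` of observations so that $\lambda = T_1/T$, and $\Lambda$ is the strictly lower-triangular matrix of ones. The function names and dictionary keys are hypothetical.

```python
import numpy as np


def trace_terms(T, T1, trend=False):
    """Numerical traces for a break after the first T1 of T observations."""
    first = (np.arange(T) < T1).astype(float)   # pre-break dummy
    second = 1.0 - first                        # post-break dummy
    cols = [first, second]
    if trend:                                   # M2: add broken linear trends
        t = np.arange(1.0, T + 1.0)
        cols += [first * t, second * t]
    X = np.column_stack(cols)
    Q = np.eye(T) - X @ np.linalg.solve(X.T @ X, X.T)  # annihilator of X
    Lam = np.tril(np.ones((T, T)), k=-1)        # Psi evaluated at phi = 1
    S = Lam.T @ Q + Q @ Lam
    P = Lam.T @ Q @ Lam
    return {"LQ": np.trace(Lam.T @ Q), "P": np.trace(P),
            "S2": np.trace(S @ S), "P2": np.trace(P @ P)}


def closed_forms(T, lam, trend=False):
    """Polynomial expressions for the same traces, as listed in the proof."""
    p2 = 2 * lam**2 - 2 * lam + 1
    p4 = 2 * lam**4 - 4 * lam**3 + 6 * lam**2 - 4 * lam + 1
    if not trend:
        return {"LQ": -(T - 2) / 2,
                "P": T**2 * p2 / 6 - 2 / 6,
                "S2": T**2 * p2 / 6 + T - 7 / 3,
                "P2": p4 * T**4 / 90 + p2 * T**2 / 36 - 7 / 90}
    return {"LQ": (4 - T) / 2,
            "P": 2 * T**2 * p2 / 30 - 16 / 30,
            "S2": T**2 * p2 / 30 + T - 128 / 30,
            "P2": 11 * p4 * T**4 / 12600 + 137 * p2 * T**2 / 12600 - 181 / 1575}


if __name__ == "__main__":
    T, T1 = 7, 3
    for trend in (False, True):
        num = trace_terms(T, T1, trend)
        ref = closed_forms(T, T1 / T, trend)
        print(trend, all(np.isclose(num[key], ref[key]) for key in ref))
```

The comparison is only meaningful when each regime contains at least two observations, since otherwise the trend regressors of M2 are collinear with the dummies.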

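Proof of Theorem 3. As is stated in the main text, the proof of this theorem follows as an extension of Theorem 1, by applying the continuous mapping theorem to the joint limiting distribution of the standardized statistic $C(k, \sigma_u^2, \lambda)^{-1/2} Z(\lambda)$, for all $\lambda \in I$. The elements of the covariance (correlation) matrix $R$ for two different break fractions of the sample, $\lambda$ and $s$, defined as $\mathrm{Corr}_{\lambda s}$, can be derived analytically based on the following result:

\[
\begin{aligned}
& C(k, \sigma_u^2, \lambda)^{-1/2} Z(\lambda)\; C(k, \sigma_u^2, s)^{-1/2} Z(s)
= \frac{\sqrt{N}\bigl(\hat{\varphi} - 1 - B(\lambda)\bigr)\, \sqrt{N}\bigl(\hat{\varphi} - 1 - B(s)\bigr)}{C(k, \sigma_u^2, \lambda)^{1/2}\, C(k, \sigma_u^2, s)^{1/2}} \\
&\quad = \frac{N}{C(k, \sigma_u^2, \lambda)^{1/2}\, C(k, \sigma_u^2, s)^{1/2}}\;
\frac{\sum_{i=1}^{N} \xi_i^{(\lambda)}}{\sum_{i=1}^{N} u_i' \Lambda' Q_m^{(\lambda)} \Lambda u_i}\;
\frac{\sum_{i=1}^{N} \xi_i^{(s)}}{\sum_{i=1}^{N} u_i' \Lambda' Q_m^{(s)} \Lambda u_i} \\
&\quad = \frac{\sigma_u^2 \operatorname{tr}\bigl(\Lambda' Q_m^{(\lambda)} \Lambda\bigr)\, \sigma_u^2 \operatorname{tr}\bigl(\Lambda' Q_m^{(s)} \Lambda\bigr)}
{\sqrt{k \sum_{j=1}^{T} a_{jj}^{(\lambda)2} + 2\sigma_u^4 \operatorname{tr}\bigl(A^{(\lambda)2}\bigr)}\; \sqrt{k \sum_{j=1}^{T} a_{jj}^{(s)2} + 2\sigma_u^4 \operatorname{tr}\bigl(A^{(s)2}\bigr)}}\;
\frac{\dfrac{1}{N} \sum_{i=1}^{N} \xi_i^{(\lambda)} \sum_{i=1}^{N} \xi_i^{(s)}}
{\dfrac{1}{N} \sum_{i=1}^{N} u_i' \Lambda' Q_m^{(\lambda)} \Lambda u_i\; \dfrac{1}{N} \sum_{i=1}^{N} u_i' \Lambda' Q_m^{(s)} \Lambda u_i}.
\end{aligned}
\tag{30}
\]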

Taking probability limits of the last relationship and using the following results: $E\bigl(\xi_i^{(\lambda)} \xi_j^{(s)}\bigr) = 0$, which holds for $i \neq j$ (see (22)),


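\[
\operatorname*{plim}_{N \to \infty} \frac{1}{N} \sum_{i=1}^{N} \xi_i^{(\lambda)} \sum_{i=1}^{N} \xi_i^{(s)}
= E\bigl(\xi_i^{(\lambda)} \xi_i^{(s)}\bigr)
= k \sum_{j=1}^{T} a_{jj}^{(\lambda)} a_{jj}^{(s)} + 2\sigma_u^4 \operatorname{tr}\bigl(A^{(\lambda)} A^{(s)}\bigr),
\]

and (25) yields the analytic formula of $\mathrm{Corr}_{\lambda s}$, given by Theorem 3.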
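To illustrate how the moments above combine into a correlation across break points, the following sketch divides $k \sum_j a_{jj}^{(\lambda)} a_{jj}^{(s)} + 2\sigma_u^4 \operatorname{tr}(A^{(\lambda)} A^{(s)})$ by the square roots of the corresponding own-break quantities. It is not the paper's implementation. It assumes model M1 with $Q_{M1}^{(\lambda)}$ built from two break dummies, a break after an integer number of observations, and the bias term $B(\lambda) = \operatorname{tr}(\Lambda' Q)/\operatorname{tr}(\Lambda' Q \Lambda)$, which is not defined in this appendix. The values passed for `k` and `sigma2` are arbitrary inputs, and all function names are hypothetical.

```python
import numpy as np


def a_matrix(T, T1):
    """A for model M1 with a break after T1 observations (B is an assumption)."""
    first = (np.arange(T) < T1).astype(float)
    X = np.column_stack([first, 1.0 - first])
    Q = np.eye(T) - X @ np.linalg.solve(X.T @ X, X.T)  # annihilator of dummies
    Lam = np.tril(np.ones((T, T)), k=-1)
    P = Lam.T @ Q @ Lam
    B = np.trace(Lam.T @ Q) / np.trace(P)              # assumed form of B(lambda)
    return 0.5 * (Lam.T @ Q + Q @ Lam) - B * P


def xi_cov(A1, A2, k, sigma2):
    """k * sum_j a_jj^(1) a_jj^(2) + 2 sigma^4 tr(A1 A2)."""
    diag_term = k * np.sum(np.diag(A1) * np.diag(A2))
    return diag_term + 2.0 * sigma2**2 * np.trace(A1 @ A2)


def corr_matrix(T, breaks, k=0.0, sigma2=1.0):
    """Correlations of the standardized statistics over a list of break points."""
    mats = [a_matrix(T, T1) for T1 in breaks]
    cov = np.array([[xi_cov(a, b, k, sigma2) for b in mats] for a in mats])
    sd = np.sqrt(np.diag(cov))
    return cov / np.outer(sd, sd)


if __name__ == "__main__":
    print(np.round(corr_matrix(8, [2, 4, 6]), 3))
```

The resulting matrix has a unit diagonal by construction, and the off-diagonal entries show how strongly the statistics at neighbouring break points move together for a given $T$.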

References

Anderson, T.W., 1971. An Introduction to Multivariate Statistical Analysis. Wiley, New York.

Andrews, D.W.K., 1993. Tests for parameter instability and structural change with unknown change point. Econometrica 61 (4), 821–856.

Arellano, M., 2003. Panel Data Econometrics. Oxford University Press.

Arellano, M., Honoré, B., 2002. Panel data models: some recent developments. In: Heckman, J., Leamer, E. (Eds.), Handbook of Econometrics, Vol. 5. North

Holland.

Bai, J., 2010. Common breaks in means and variances for panel data. Journal of Econometrics 157, 78–92.

Bai, J., Carrion-i-Silvestre, J.L., 2009. Structural changes, common stochastic trends and unit roots in panel data. Review of Economic Studies 76, 471–501.

Baltagi, B.H., 1995. Econometric Analysis of Panel Data. Wiley, Chichester.

Baltagi, B.H., Kao, C., 2000. Nonstationary panels, cointegration in panels and dynamic panels: a survey. Center for Policy Research Working Papers 16,

Center for Policy Research, Maxwell School, Syracuse University.

Basawa, I.V., Mallik, A.K., McCormick, W.P., Reeves, J.H., Taylor, R.L., 1991. Bootstrapping unstable first-order autoregressive processes. Annals of Statistics

19, 1098–1101.

Cameron, A.C., Trivedi, P.K., 2005. Microeconometrics: Methods and Applications. Cambridge University Press, New York.

Carrion-i-Silvestre, J.L., Del Barrio-Castro, T., Lopez-Bazo, E., 2002. Level shifts in a panel data based unit root test. An application to the rate of

unemployment. In: Proceeding of the 2002 North American Summer Meetings of the Econometric Society: Economic Theory.

Carrion-i-Silvestre, J.L., Del Barrio-Castro, T., Lopez-Bazo, E., 2005. Breaking the panels. An application to real per capita GDP. Econometrics Journal 8,

159–175.

Cati, R., Garcia, M., Perron, P., 1999. Unit roots in the presence of abrupt governmental interventions with an application to Brazilian data. Journal of Applied

Econometrics 14 (1), 27–56.

Chan, F., Pauwels, L.L., 2011. Model specification in panel data unit root tests with an unknown break. Mathematics and Computers in Simulation 81,

1299–1309.

Chang, Y., Park, J.Y., 2003. A sieve bootstrap for the test of a unit root. Journal of Time Series Analysis 24, 379–400.

Culver, S.E., Papell, D.H., 1999. Long-run purchasing power parity with short-run data: evidence with a null hypothesis of stationarity. Journal of International Money

and Finance 18, 751–768.

De Blander, R., Dhaene, G., 2011. Unit root tests for panel data with AR(1) errors and small T. Econometrics Journal 15 (1), 101–124.

de la Fuente, A., 1997. The empirics of growth and convergence: a selective review. Journal of Economic Dynamics and Control 21 (1), 23–73.

De Wachter, S., Harris, R.D.F., Tzavalis, E., 2007. Panel unit root tests: the role of time dimension and serial correlation. Journal of Statistical Planning and

Inference 137, 230–244.

De Wachter, S., Tzavalis, E., 2012. Detection of structural breaks in linear dynamic panel data models. Computational Statistics & Data Analysis 56 (11),

3020–3034.

Faini, R., 2004. Trade liberalization in a globalizing world. IZA Discussion Paper No. 1406.

Hadri, K., Larsson, R., Rao, Y., 2012. Testing for stationarity with a break in panels where the time dimension is finite. Bulletin of Economic Research,

forthcoming (http://dx.doi.org/10.1111/j.1467-8586.2012.00457.x).

Hahn, J., Kuersteiner, G., 2002. Asymptotically unbiased inference for a dynamic panel model with fixed effects when both n and T are large. Econometrica

70, 1639–1657.

Hall, B.H., Mairesse, J., 2005. Testing for unit roots in panel data: an exploration using real and simulated data. In: Andrews, D., Stock, J. (Eds.), Identification

and Inference in Econometric Models: Essays in Honor of Thomas J. Rothenberg. Cambridge University Press, Cambridge.

Han, C., Phillips, P.C.B., 2010. GMM estimation for dynamic panels with fixed effects and strong instruments at unity. Econometric Theory 26, 119–151.

Harris, R.D.F., Tzavalis, E., 1999. Inference for unit roots in dynamic panels where the time dimension is fixed. Journal of Econometrics 91, 201–226.

Harris, R.D.F., Tzavalis, E., 2004. Inference for unit roots for dynamic panels in the presence of deterministic trends: do stock prices and dividends follow a

random walk? Econometric Reviews 23, 149–166.

Hlouskova, J., Wagner, M., 2006. The performance of panel unit root and stationary tests: results from a large scale simulation study. Econometric Reviews

25, 85–117.

Horowitz, J., 2001. The Bootstrap. In: Heckman, J.J., Leamer, E.E. (Eds.), Handbook of Econometrics, Vol. 5. Elsevier, Amsterdam, pp. 3159–3228.

Kim, D., Perron, P., 2009. Unit root tests allowing for a break in the trend function at an unknown time under both the null and alternative hypotheses.

Journal of Econometrics 148, 1–13.

Kiviet, J.F., 1995. On bias, inconsistency, and efficiency of various estimators in dynamic panel data models. Journal of Econometrics 68, 53–78.

Lo, A.W., MacKinlay, A.C., 1995. A Non-Random Walk Down Wall Street. Princeton University Press.

Madsen, E., 2010. Unit root inference in panel data models where the time-series dimension is fixed: a comparison of different tests. Econometrics Journal

13, 63–94.

Mammen, E., 1992. When Does Bootstrap Work? Asymptotic Results and Simulations. Springer, New York.

Mankiw, N.G., Romer, D., Weil, D.N., 1992. A contribution to the empirics of economic growth. The Quarterly Journal of Economics 107 (2), 407–437.

Moon, H.R., Perron, B., 2008. Asymptotic local power of pooled t-ratio tests for unit roots in panels with fixed effects. Econometrics Journal 11 (1), 80–104.

Nickell, S., 1981. Biases in dynamic models with fixed effects. Econometrica 49, 1417–1426.

O’Connell, P.G.J., 1998. The overvaluation of purchasing power parity. Journal of International Economics 44, 1–19.

Park, J.Y., 2003. Bootstrap unit root tests. Econometrica 71, 1845–1895.

Perron, P., 1989. The great crash, the oil price shock, and the unit root hypothesis. Econometrica 57, 1361–1401.

Perron, P., 1990. Testing for a unit root in a time series with a changing mean. Journal of Business & Economic Statistics 8, 153–162.

Perron, P., 1997. Further evidence on breaking trend functions in macroeconomic variables. Journal of Econometrics 80 (2), 355–385.

Perron, P., 2006. Dealing with structural breaks. In: Mills, T., Patterson, K. (Eds.), Palgrave Handbook of Econometrics, Vol. 1: Econometric Theory. Palgrave

MacMillan, pp. 278–352.

Perron, P., Vogelsang, T., 1992. Nonstationarity and level shifts with an application to purchasing power parity. Journal of Business & Economic Statistics

10, 301–320.

Phillips, P.C.B., Sul, D., 2007. Bias in dynamic panel estimation with fixed effects, incidental trends and cross section dependence. Journal of Econometrics

137, 162–188.

Tzavalis, E., 2002. Structural breaks and unit root tests for short panels. In: ESRC Conference, Cass Business School, City University London.

Vogelsang, T., Perron, P., 1998. Additional tests for a unit root allowing for a break in the trend function at an unknown time. International Economic Review

39 (4), 1073–1100.

Wacziarg, R., Welch, K.H., 2004. Trade liberalization and growth: new evidence. NBER Working Paper 10152.

White, H., 2000. Asymptotic Theory for Econometricians. Academic Press.

Zivot, E., Andrews, D.W.K., 1992. Further evidence on the great crash, the oil price shock, and the unit-root hypothesis. Journal of Business & Economic

Statistics 10, 251–270.