All content in this area was uploaded by Tucker Mcelroy on Nov 19, 2018.

Time series seasonal adjustment using

regularized singular value decomposition

Blinded Version

Abstract

We propose a new seasonal adjustment method based on the regularized singular value

decomposition (RSVD) of the matrix obtained by reshaping the seasonal time series data.

The method is ﬂexible enough to capture two kinds of seasonality: the ﬁxed seasonality that

does not change over time and the time-varying seasonality that varies from one season to

another. RSVD represents the time-varying seasonality by a linear combination of several

seasonal patterns. The right singular vectors capture multiple seasonal patterns, and the

corresponding left singular vectors capture the magnitudes of those seasonal patterns and how

they change over time. By assuming the time-varying seasonal patterns change smoothly over

time, the RSVD uses penalized least squares with a roughness penalty to eﬀectively extract

the left singular vectors. The proposed method applies to seasonal time series data with a

stationary or nonstationary non-seasonal component. The method also has a variant that can handle the case that an abrupt change (i.e., break) occurs in the magnitudes of seasonal patterns. Our proposed method compares favorably with the state-of-the-art X-13ARIMA-SEATS program on both simulated and real data examples.

Key Words: Seasonal adjustment, regularized singular value decomposition, X-13ARIMA-SEATS

JEL Classiﬁcation: C14, C22.

1 Introduction

Seasonal adjustment of economic and business time series data is of great importance in

economic analysis and business decisions. Proper use of seasonal adjustment methodology

removes the calendrical ﬂuctuations from the seasonal time series, while minimizing distortions

to other dynamics in the data, such as trend. Seasonally adjusted time series data can be

used to evaluate and study the present economic situation (e.g., by examining the business

cycle), and therefore helps policy-makers and economic agents make correct and timely

decisions. Moreover, seasonally adjusted time series data can be entered into time series

econometric models that analyze the non-seasonal dynamic relationships among economic

and business variables. Findley (2005) is the most recent review article on the subject;

Bell, Holan, and McElroy (2012) contains a volume of articles on recent developments in

seasonality and seasonal adjustment; the monograph of Dagum and Bianconcini (2016)

provides a comprehensive presentation of various seasonal adjustment methods.

Generally speaking, there are two approaches for seasonal adjustment, the model-based

approach and the empirical-based approach. The model-based approach directly incorporates

seasonality in the econometric model and jointly studies the seasonal and non-seasonal

characteristics in time series data. It can be argued that the seasonality in one economic

variable can be related to other economic variables, or to the non-seasonal components within

the same variable, and therefore seasonality should not be regarded as a single and isolated

factor; see Lovell (1963), Sims (1974), and Bunzel and Hylleberg (1982), among others.

There are many different modeling strategies for the seasonal component, which can generally be categorized into several types. One modeling strategy treats seasonality as deterministic

linear (nonlinear) additive (multiplicative) seasonal components; see, for example, Barsky

and Miron (1989), Franses (1998), and Cai and Chen (2006). Another popular modeling

strategy considers seasonality as stochastic, where seasonality can be deﬁned as the sum of a

stationary stochastic process and a deterministic process (Canova, 1992), a nonstationary process with seasonal unit roots (Hylleberg et al., 1990; Osborn, 1993), a periodic process in which the coefficients vary periodically with seasonal changes (Gersovitz and MacKinnon, 1978; Osborn, 1991; Hansen and Sargent, 1993), or an unobservable component in a structural time series model (Harrison and Stevens, 1976; Harvey, 1990; Burridge and Wallis, 1990; Harvey and Scott, 1994; Proietti, 2004). Because of the direct specification and estimation of

the seasonal component in an econometric model, the model-based approach is statistically

more eﬃcient than the empirical-based approach. The disadvantage of the model-based

approach is that the extracted seasonal component can be sensitive to the dynamic and

distributional speciﬁcations that are imposed on the econometric model.

The empirical-based approach uses ad hoc methods to extract or remove seasonality and

delivers plausible empirical results with real data. One example is the X-11 method proposed

by the U.S. Census Bureau (Shiskin, Young, and Musgrave, 1965), which uses weighted

moving averages to remove seasonality. This simple empirical method can be criticized for

its inﬂexibility, lack of support from statistical theory, and possible distortions of those

non-seasonal components in the time series, which subsequently causes misinterpretations

of dynamic relationships across diﬀerent time series. In order to correct the drawbacks

of X-11, researchers have proposed various improved empirical-based methods, such as X-11-ARIMA (proposed by Dagum (1980)) and X-12-ARIMA (proposed by the U.S. Census Bureau, and described in Findley et al. (1998)). These improved methods pre-treat the time series data with an ARIMA model to eliminate outliers and (ir)regular calendar effects, and

perform forecasting and backcasting techniques to complete the data points at both ends of

the time series before the weighted moving averages of X-11 are applied to remove seasonal

ﬂuctuations.

Given the availability of many seasonal adjustment methods, many national statistical

agencies prefer the empirical-based approach because of its simplicity and nonreliance on

model assumptions. In this paper, we adopt the empirical-based approach: Under the

innocuous assumption that the non-seasonal component is stationary or non-stationary with

a stochastic trend, we propose a ﬂexible and robust seasonal adjustment method based on

Regularized Singular Value Decomposition (RSVD) proposed by Huang, Shen, and Buja (2008) and Huang, Shen, and Buja (2009); hereafter, these two papers are referred to as HSB (2008) and HSB (2009) respectively. We first transform the vector of seasonal time series data into a matrix whose rows represent periods and columns represent seasons. Then we perform RSVD on this matrix; the obtained right singular vectors represent seasonal patterns and the left singular vectors represent the magnitudes of the seasonal patterns for different periods. RSVD applies regularization to ensure that the extracted seasonal patterns change slowly over time. Such regularization improves stability of the extracted seasonal patterns and

their magnitudes. Our new method has merits in the following aspects. First, it is ﬂexible

enough to handle both fixed and time-varying seasonality, with or without abrupt changes

in seasonality. Second, it can accommodate both stationary and nonstationary stochastic

non-seasonal components. Third, because the regularization parameter is fully data-driven

by generalized cross validation, it is robust and applicable to some irregular seasonal data for

which popular seasonal adjustment methods may fail to deliver reasonable results.

There are similarities and diﬀerences between the Seasonal-Trend decomposition procedure

based on Regression (STR) approach proposed by Dokumentov and Hyndman (2015) and

our RSVD method in modeling seasonality. Essentially, both STR and RSVD methods

introduce Tikhonov regularization terms that are motivated by the smoothness feature of

the seasonal component. The RSVD method considers the singular value decomposition of

the seasonal matrix and only imposes roughness penalties on the left singular vectors that

capture the variation of seasonal patterns (i.e., corresponding right singular vectors) across

consecutive seasonal cycles. In contrast, the STR method directly imposes roughness penalties

on seasonal terms (or the coeﬃcients in the linear combination of spline basis functions

that approximate the seasonal terms) across consecutive seasonal cycles and/or within each

seasonal cycle. However, compared to the STR method, the main advantage of our RSVD

method is that, with the merit of dimension reduction due to a low rank approximation, the

parameterization of our method is much more parsimonious than that of the STR method,

which directly estimates each seasonal term. Moreover, the RSVD method decomposes the seasonal component into fixed and time-varying seasonal patterns, which can provide much richer information about the complexity and composition of seasonality.

In this paper, using both simulated and real economic data, we also compare our proposed

seasonal adjustment methods with two state-of-the-art and widely used seasonal adjustment

methods (X-12-ARIMA and SEATS) provided in the latest X-13ARIMA-SEATS program developed by the U.S. Census Bureau. We find that (i) when seasonality is moderate or weak,

traditional X-12-ARIMA and SEATS methods tend to outperform our proposed seasonal

adjustment method, which is especially the case if the seasonality is weak; (ii) however,

in comparison to X-12-ARIMA and SEATS methods, our proposed seasonal adjustment

method is good at capturing strong seasonal variations in the series; (iii) our proposed

method is robust to some irregular seasonal data for which X-12-ARIMA and SEATS may

need additional delicate performance tuning. Moreover, compared to X-12-ARIMA and

SEATS, our proposed method provides a more transparent and meaningful explanation for

seasonality. Our proposed method decomposes the seasonal component into diﬀerent seasonal

patterns, traces the dynamics of seasonality by time-varying pattern coeﬃcients, and identiﬁes

important seasonality break times automatically, which provides rich insights into seasonality.

The remaining part of this paper is organized as follows. Section 2 briefly reviews the RSVD. Section 3 introduces notation for the matrix representation of seasonal time series. Section 4 gives our basic seasonal adjustment method when the non-seasonal component is stationary or difference stationary. Section 5 extends our basic seasonal adjustment method to accommodate stochastic trends and abrupt changes in seasonality. Simulation results under different data generating processes (DGPs) are reported in Section 6, and three real data examples are provided in Section 7. Section 8 concludes. Due to space limitations, some

technical details, additional simulation results and further discussions on some important

issues concerning our proposed RSVD seasonal adjustment method are provided in the online

Supplementary Material.


2 A brief review of regularized SVD

As a well-known matrix factorization technique, the Singular Value Decomposition (SVD) has been widely used in tackling many practical problems. In the context of latent semantic

analysis, Deerwester et al. (1990) propose a new approach to automatic indexing and retrieval

using the SVD method. Sarwar et al. (2000) make use of SVD for recommender systems

that make product recommendations during a live customer interaction. More recently and

publicly known, Bell, Bennett, Koren, and Volinsky (2009), the 1M Grand Prize winner of

the Netﬂix Prize contest, employ SVD in the challenge of predicting user preferences.

Regularized singular value decomposition (RSVD) is a variant of singular value decomposition that takes into account the intrinsic smoothness structure of a data matrix (HSB, 2008, 2009). The basic idea of RSVD is quite intuitive. The data matrix is considered

as discretized values of a bivariate function with certain smoothness structure evaluated

at a grid of design points. To impose smoothness in singular value decomposition, RSVD

imposes roughness penalties on the left and/or right singular vectors when singular value

decomposition is implemented on the data matrix.

Consider an $n \times p$ dimensional data matrix $X = (x_{ij})$ whose column means are zero. The first pair of singular vectors, $u$ and $v$ respectively, solves the following minimization problem,
$$(\hat{u}, \hat{v}) = \arg\min_{u,v} \|X - uv^\top\|_F^2, \qquad (2.1)$$

which does not assume any smoothness structure of the data matrix. In contrast, RSVD explores such smoothness structure by imposing roughness penalties on the singular vectors $u$ and $v$. In the context of seasonal adjustment, the seasonal time series can be represented as a matrix each of whose rows represents one period of all seasons. We later argue that the data matrix should have smooth changes across rows, and thus the changes in the left singular vector $u$ are expected to be smooth. Therefore, the relevant RSVD solves the following minimization problem,
$$(\hat{u}, \hat{v}) = \arg\min_{u,v} \|X - uv^\top\|_F^2 + \alpha\, u^\top \Omega u, \qquad (2.2)$$
where $\Omega$ is an $n \times n$ non-negative definite roughness penalty matrix, $\alpha$ is a smoothing parameter, and $v^\top v = 1$ for identification purposes.

A simple variant of the power algorithms in HSB (2008, 2009) gives the following Algorithm 1 for solving the problem (2.2).

Algorithm 1 (Regularized singular value decomposition of $X$).

Step 1. Initialize $u$ using the standard SVD of $X$.

Step 2. Repeat until convergence:

(a) $v \leftarrow X^\top u \,/\, \|X^\top u\|$.

(b) $u \leftarrow (I_n + \alpha \Omega)^{-1} X v$, with $\alpha$ selected by minimizing the following generalized cross-validation criterion,
$$\mathrm{GCV}(\alpha) = \frac{\frac{1}{n}\left\| \left[ I_n - M(\alpha) \right] X v \right\|^2}{\left( 1 - \frac{1}{n} \mathrm{tr}\{ M(\alpha) \} \right)^2}, \qquad (2.3)$$
where $I_n$ is the $n \times n$ identity matrix, and $M(\alpha) = (I_n + \alpha \Omega)^{-1}$ is the smoothing matrix.

The derivation of the generalized cross-validation criterion used in (2.3) is similar to HSB (2008, 2009) and can be found in the supplementary materials. Algorithm 1 differs from the previous algorithms in that, in HSB (2008), the roughness penalty is imposed only on $v$, and in HSB (2009) on both $u$ and $v$. If there is no penalty, i.e., $\alpha = 0$, the algorithm is essentially the power algorithm for standard SVD and solves the problem (2.1).
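The alternating updates of Algorithm 1 can be sketched in a few lines. The following is a minimal illustration, not the authors' implementation: it assumes numpy, a user-supplied penalty matrix `Omega`, and a finite grid `alphas` over which the GCV criterion (2.3) is scanned at each update (the function name `rsvd_pair` is ours).

```python
import numpy as np

def rsvd_pair(X, Omega, alphas, tol=1e-8, max_iter=200):
    """Extract one regularized singular vector pair (sketch of Algorithm 1).

    Only the left vector u is penalized; alpha is re-selected by GCV
    at every update, as in Step 2(b).
    """
    n = X.shape[0]
    # Step 1: initialize u from the standard SVD of X (scaled by sigma_1).
    U0, s0, _ = np.linalg.svd(X, full_matrices=False)
    u = U0[:, 0] * s0[0]
    for _ in range(max_iter):
        # Step 2(a): v <- X'u / ||X'u||, so that v'v = 1.
        v = X.T @ u
        v /= np.linalg.norm(v)
        # Step 2(b): u <- (I + alpha*Omega)^{-1} X v, alpha chosen by GCV (2.3).
        Xv = X @ v
        best_gcv, u_new = np.inf, u
        for a in alphas:
            M = np.linalg.inv(np.eye(n) + a * Omega)  # smoothing matrix M(alpha)
            fit = M @ Xv
            gcv = (np.sum((Xv - fit) ** 2) / n) / (1 - np.trace(M) / n) ** 2
            if gcv < best_gcv:
                best_gcv, u_new = gcv, fit
        if np.linalg.norm(u_new - u) <= tol * np.linalg.norm(u_new):
            u = u_new
            break
        u = u_new
    return u, v
```

Scanning a fixed grid is one simple way to minimize GCV; a scalar optimizer over $\alpha$ would serve equally well.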

In general, the regularized SVD attempts to find a rank-$r$ decomposition ($r \le p$) such that $X = UV^\top$, where $U$ is an $n \times r$ matrix and $V$ is a $p \times r$ matrix. The $j$-th columns of $U$ and $V$ are called the $j$-th left and right regularized singular vectors of the matrix $X$, respectively. Algorithm 1 finds the first regularized singular vector pair. The subsequent regularized singular vector pairs can be obtained by repeatedly applying Algorithm 1 to the residual matrix $X - \hat{u}\hat{v}^\top$. Below, we propose some variants of Algorithm 1 for different scenarios of seasonal adjustment.
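The deflation scheme just described is straightforward to express in code. A minimal sketch assuming numpy; `extract_pair` stands for any single-pair routine such as Algorithm 1, and for self-containedness an unregularized stand-in (the $\alpha = 0$ case) is supplied here:

```python
import numpy as np

def deflate(X, extract_pair, r):
    """Rank-r decomposition by deflation: apply a single-pair extractor
    to the residual matrix X - u v' repeatedly (sketch)."""
    n, p = X.shape
    U, V = np.zeros((n, r)), np.zeros((p, r))
    R = X.copy()
    for j in range(r):
        u, v = extract_pair(R)
        U[:, j], V[:, j] = u, v
        R = R - np.outer(u, v)  # subtract the extracted pair's effect
    return U, V

def svd_pair(R):
    """Unregularized stand-in for Algorithm 1 (the alpha = 0 case)."""
    Us, s, Vt = np.linalg.svd(R, full_matrices=False)
    return Us[:, 0] * s[0], Vt[0]
```

With the unregularized extractor, `deflate` reproduces the ordinary truncated SVD; swapping in a penalized extractor yields the regularized pairs.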

3 Matrix representation of seasonal time series

For a seasonal time series $\{x_t : t = 1, \dots, T\}$ with $p$ seasons, we can represent it (cf. Buys Ballot (1847)) as a matrix with $p$ columns, each of whose rows represents one period of the seasons, as follows:
$$X = \left[\, x_{1\cdot}^\top \;\; x_{2\cdot}^\top \;\; \dots \;\; x_{i\cdot}^\top \;\; \dots \;\; x_{n\cdot}^\top \,\right]^\top,$$
where the $1 \times p$ row vector $x_{i\cdot}$ denotes the $i$-th row of the matrix $X$. Hence, the $T \times 1$ column vector form of the time series $x_t$ can be written as
$$X_T \equiv \mathrm{Vec}(X^\top) = (x_1, \dots, x_t, \dots, x_T)^\top = (x_{1\cdot}, \dots, x_{i\cdot}, \dots, x_{n\cdot})^\top,$$
where the function $\mathrm{Vec}(\cdot)$ converts a matrix into a column vector by stacking the columns of the matrix. The subscripts of the elements in the matrix representation can be obtained using a mapping of the one-dimensional time subscript $t \in \mathbb{N}$ to the two-dimensional time subscripts $(i, j) \in \mathbb{N}^2$, denoting the $j$-th season in the $i$-th period:
$$\mathcal{I} : \mathbb{N} \mapsto \mathbb{N}^2, \qquad t \to (i(t), j(t)) \equiv \left( \lceil t/p \rceil,\; t - (\lceil t/p \rceil - 1)p \right). \qquad (3.1)$$

Let $n \equiv T/p$ denote the total number of periods included in the time series, so that we have $1 \le i \le n$, $1 \le j \le p$, and $t = (i(t) - 1)p + j(t)$. Here we assume that $T/p$ is an integer for simplicity of exposition.
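The reshaping and the index mapping (3.1) are mechanical. A short sketch, assuming numpy; the function names are ours:

```python
import numpy as np

def to_seasonal_matrix(x, p):
    """Arrange a series {x_t : t = 1..T} into the n x p matrix X whose
    i-th row is the i-th period of p seasons (assumes T = n*p)."""
    x = np.asarray(x)
    n, rem = divmod(len(x), p)
    if rem:
        raise ValueError("T must be a multiple of p")
    return x.reshape(n, p)

def index_map(t, p):
    """The mapping (3.1): 1-based time index t -> (i, j), the j-th
    season of the i-th period, so that t = (i - 1) * p + j."""
    i = -(-t // p)          # ceil(t / p)
    return i, t - (i - 1) * p
```

For monthly data, `p = 12` and each row of the matrix is one calendar year.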

For later use of notation, let $i_p$ and $0_p$ denote the $p \times 1$ column vectors of ones and zeros respectively. Moreover, let $Q_n$ denote the $n$-dimensional column-wise de-meaning matrix, i.e., $Q_n \equiv I_n - i_n i_n^\top / n$, so that $Q_n a = a - \bar{a}\, i_n$ for a vector $a = (a_1, \dots, a_n)^\top$, where $\bar{a} = \sum_{1 \le i \le n} a_i / n$. Let the $(d-1) \times d$ matrix $\Delta_d$ be the first order difference operator, i.e.,
$$\Delta_d \equiv \left[\, 0_{d-1} \;\; I_{d-1} \,\right] - \left[\, I_{d-1} \;\; 0_{d-1} \,\right].$$
Then the second order difference operator is $\Delta_d^2 \equiv \Delta_{d-1} \Delta_d$. Using these difference operators, one widely used choice of the penalty matrix in (2.2) takes the form $\Omega \equiv (\Delta_n^2)^\top \Delta_n^2$.
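These operators are easy to build explicitly. A sketch in numpy under the conventions above (function names are ours):

```python
import numpy as np

def first_diff(d):
    """The (d-1) x d first order difference operator Delta_d =
    [0 I] - [I 0], so (Delta_d a)_i = a_{i+1} - a_i."""
    return np.diff(np.eye(d), axis=0)

def roughness_penalty(n):
    """Omega = (Delta_n^2)' Delta_n^2 with Delta_n^2 = Delta_{n-1} Delta_n,
    the second order difference penalty used in (2.2)."""
    D2 = first_diff(n - 1) @ first_diff(n)   # (n-2) x n second difference
    return D2.T @ D2
```

Note that this $\Omega$ annihilates constant and linear vectors, so straight-line changes in the pattern coefficients are not penalized, only curvature.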

4 Basic seasonal adjustment

This section discusses seasonal adjustment based on regularized SVD. We motivate the use

of regularized SVD on seasonal adjustment in Section 4.1, and then propose the seasonal

adjustment procedures when the non-seasonal component of a time series is stationary or

difference stationary in Sections 4.2 and 4.3, respectively. Section 4.4 introduces how to select

the number of seasonal patterns in the RSVD method.

4.1 Motivation of using regularized SVD for seasonal adjustment

We decompose the seasonal time series $\{x_t\}_{t=1}^T$ into the deterministic seasonal component $s_t$ and the stochastic non-seasonal component $e_t$ in the additive form,
$$x_t = s_t + e_t, \qquad t = 1, \dots, T, \qquad (4.1)$$
where the non-seasonal component $e_t$ is a stationary process. Using the mapping $\mathcal{I}$ defined in (3.1), we rewrite (4.1) as $x_{i,j} = s_{i,j} + e_{i,j}$, where the seasonal component satisfies $\sum_{j=1}^{p} s_{i,j} = 0$ for identification. The decomposition can also be written in matrix form,
$$X = S + E. \qquad (4.2)$$

When the seasonal effects are fixed, that is, the seasonal pattern does not change from period to period, $s_t = f_{j(t)}$, the seasonal component $S$ can be represented as $S = i_n \cdot f^\top$. In this case, a single seasonal pattern $f^\top = (f_1, f_2, \dots, f_p)$ repeats itself in each period. In general, the seasonal effects may change over time; we use a rank-$r$ reduced SVD of $(S - i_n \cdot f^\top)$ to represent the time-varying seasonality:
$$S = i_n \cdot f^\top + UV^\top, \qquad (4.3)$$

S=in·f>+UV>(4.3)

where

U

is a

n×r

matrix, and

V

is a

p×r

matrix with

V>V

=

Ir

and

r≤p

. For

identiﬁcation, we require the columns of

U

to be orthogonal to

in

or,

U>in

=

0

, which is

equivalent to

Q>

nU

=

U

. The second term in the decomposition

(4.3)

provides an intuitive

explanation for the seasonality. The

j

-th column vector

vj

in

V

represents the

j

-th

seasonal

pattern

; and the corresponding

j

-th column vector

uj

in

U

is called

pattern coeﬃcients

,

since its elements delineate how the

j

-th seasonal pattern changes across diﬀerent periods.

Equations

(4.2)

and

(4.3)

comprise our basic seasonal adjustment method. For example, in a

special case of

r

= 1 with

f

=

0

, the seasonal matrix is

S

=

uv>

with

v

consisting of the

seasonal pattern and ugiving its time evolution across diﬀerent periods.

Now we argue that there is intrinsic smoothness in the seasonal signal that warrants using the regularized SVD. For notational simplicity, assume the fixed seasonality term is void. The $i$-th row of $S$, denoted by $s_i$, represents the seasonal behavior of the series $x_t$ during the $i$-th period, which is a linear combination of all the seasonal patterns in $V$ with the $i$-th row of $U$ as the coefficients, i.e.,
$$s_i = u_i V^\top = \sum_{j=1}^{r} u_{i,j}\, v_j^\top.$$
A necessary condition for seasonality is persistence of a seasonal pattern from one year to the next; for a stochastic approach, persistence is assessed through correlation, whereas in a deterministic context the concept of smoothness is used instead. Essentially, seasonality imposes that the $u_i$'s, or the $u_{i,j}$'s for fixed $j$ (i.e., the elements in each column of the matrix $U$), change smoothly with $i$. Based on this smoothness in the decomposition of the seasonal matrix $S$, we deem that the roughness of each column in the observed data matrix $X$ is due to the

“contamination” of the stochastic non-seasonal component $E$ in (4.2). This smoothness also suggests the use of regularized SVD for finding the decomposition (4.2), with a roughness penalty applied to the columns of $U$. On the other hand, it is usually not appropriate to apply a roughness penalty to the columns of $V$, since seasonal behaviors usually have sharp increases and falls within a period.

In sum, any seasonal matrix $S$ can be decomposed uniquely into the SVD form in (4.3) with some $r \le p$ according to matrix theory, which implies that any seasonal component is driven by at most a fixed seasonal pattern $f$ and the $r$ time-varying seasonal patterns in the columns of $V$. Given the smoothness feature of seasonality, the pattern coefficients in the columns of $U$ should be smooth over time. The regularization with a roughness penalty effectively separates the seasonal variations in $S$ from the irregular component in $E$. Based on a selection criterion, we select those significant seasonal patterns from the data matrix $X$ that drive the seasonal behavior and discard those indiscernible seasonal patterns that are submerged in noise. Therefore, our RSVD method should be good at capturing seasonality that has strong variations compared to the irregular component.

There are three reasons that prevent direct application of Algorithm 1 in HSB (2008, 2009) to the data matrix $X$ for seasonal adjustment. First, because of the existence of fixed seasonality, it is unrealistic to restrict the sample mean of each column of $X$ to zero, i.e., to simply subtract the mean from each column. Instead, the fixed seasonality $f$ should be explicitly estimated in the seasonal adjustment procedure. Second, for identification, the sum of seasonal terms within a period should be zero, i.e., $\sum_{j=1}^{p} s_{i,j} = 0$ for each $i = 1, \dots, n$. Otherwise, the seasonal component would incorporate part of the overall level of the series. Third, if the non-seasonal component $\{e_t\}$ is nonstationary and has a stochastic trend, which is more commonly encountered in economic time series data, Algorithm 1, which assumes stationarity in $\{e_t\}$, is invalid. Next, taking all these issues into account, we develop a procedure that is based on a modification of Algorithm 1.

4.2 Seasonal adjustment with stationary et

Our basic seasonal adjustment procedure has three steps: 1. Estimate the seasonal pattern coefficients in $U$ using a modified version of Algorithm 1 that satisfies the zero-sum restriction on seasonal effects; 2. Estimate the fixed seasonal pattern $f$ and the time-varying seasonal patterns in $V$; 3 (optional). Estimate the parameters of the stationary non-seasonal component. The three steps are elaborated below.

Step One: estimating seasonal pattern coeﬃcients in U

To apply Algorithm 1, we first eliminate the fixed seasonal effects in (4.3) by pre-multiplying the data matrix $X$ by the column-wise de-meaning matrix $Q_n$ to obtain $\tilde{X} = Q_n X$. Since $Q_n i_n = 0_n$ and $Q_n U = U$,
$$\tilde{X} = Q_n S + Q_n E = Q_n (i_n \cdot f^\top + UV^\top) + Q_n E = UV^\top + \tilde{E}.$$

The resulting column-centered data matrix $\tilde{X}$ does not have a fixed seasonality. To guarantee the zero-sum seasonal effects requirement $S \cdot i_p = 0_n$, we enforce the sufficient conditions of zero-sum seasonal patterns, $f^\top i_p = 0$ and $V^\top i_p = 0_r$. Combining the above gives the following modified version of Algorithm 1.

Algorithm 2. It is the same as Algorithm 1 except that

(1) the data matrix $X$ is replaced by $\tilde{X} = Q_n X$, and

(2) the updating equation in Step 2(a) now becomes $v \leftarrow Q_p \tilde{X}^\top u \,/\, \|Q_p \tilde{X}^\top u\|$.

In Step 2(a) of this algorithm, pre-multiplication by $Q_p$ ensures $v^\top i_p = 0$ for the zero-sum seasonal pattern requirement, and the normalization ensures $v^\top v = 1$ for identification.
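The two modifications are a one-line change each. A sketch assuming numpy (the function names are ours):

```python
import numpy as np

def demeaner(n):
    """Q_n = I_n - i_n i_n' / n, the column-wise de-meaning matrix."""
    return np.eye(n) - np.ones((n, n)) / n

def algorithm2_v_update(X_tilde, u):
    """Step 2(a) of Algorithm 2: project X~'u with Q_p so that
    v'i_p = 0, then normalize so that v'v = 1."""
    p = X_tilde.shape[1]
    v = demeaner(p) @ (X_tilde.T @ u)
    return v / np.linalg.norm(v)
```

The update enforces both identification constraints at every iteration, so the limit automatically satisfies them.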

Applying Algorithm 2, we obtain the first pair of estimated singular vectors: the right singular vector, denoted $\tilde{v}$, and the left singular vector, denoted $\hat{u}$. The subsequent pairs of singular vectors can be extracted by applying Algorithm 2 to the residual matrix $\tilde{X} - \hat{u}\tilde{v}^\top$, in which the effect of the preceding pair of singular vectors is subtracted from the data matrix $\tilde{X}$. Applying this procedure $r$ times sequentially, we obtain $r$ pairs of regularized singular vectors; concatenating them gives the $n \times r$ matrix $\hat{U} = (\hat{u}_1, \dots, \hat{u}_r)$ and the $p \times r$ matrix $\tilde{V} = (\tilde{v}_1, \dots, \tilde{v}_r)$. We keep $\hat{U}$ for use in the next step.

Step Two: estimating ﬁxed/time-varying seasonal patterns in f and V

Recall that $X_T = \mathrm{Vec}(X^\top)$. Given the estimates of the seasonal pattern coefficients $\hat{U}$ from Step One, and noting that the pattern coefficients of the fixed seasonal pattern $f$ all take value 1, the estimates of the time-varying seasonal patterns in $V$ and the fixed seasonal pattern $f$ can be obtained jointly by solving a constrained least squares problem,
$$(\hat{f}, \hat{V}) = \arg\min_{f, V} \left[ X_T - \mathrm{Vec}(f \cdot i_n^\top + V \hat{U}^\top) \right]^\top \left[ X_T - \mathrm{Vec}(f \cdot i_n^\top + V \hat{U}^\top) \right], \qquad (4.4)$$
such that $f^\top i_p = 0$ and $V^\top i_p = 0_r$. Note that the minimization problem in (4.4) can be rewritten as
$$\hat{\beta} = \arg\min_{\beta} \, (X_T - Z\beta)^\top (X_T - Z\beta) \quad \text{with } R\beta = 0_{r+1}, \qquad (4.5)$$
where $Z \equiv [\, i_n \otimes I_p,\; \hat{u}_1 \otimes I_p,\; \dots,\; \hat{u}_r \otimes I_p \,]$, $\beta \equiv (f^\top, v_1^\top, \dots, v_r^\top)^\top$, and $R \equiv I_{r+1} \otimes i_p^\top$. Then the estimate $\hat{\beta}$ can be written explicitly as
$$\hat{\beta} \equiv (\hat{f}^\top, \hat{v}_1^\top, \dots, \hat{v}_r^\top)^\top = b - (Z^\top Z)^{-1} R^\top \left[ R (Z^\top Z)^{-1} R^\top \right]^{-1} R b,$$
where $b = (Z^\top Z)^{-1} Z^\top X_T$ is the unconstrained least squares estimate for the problem (4.5). Given the estimates of the fixed and time-varying seasonal patterns $\hat{f}$ and $\hat{V}$ obtained from the constrained least squares regression, we obtain the estimated seasonal component as
$$\hat{S} = i_n \hat{f}^\top + \hat{U} \hat{V}^\top.$$
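The closed-form constrained estimator translates directly into code. A sketch in numpy, assuming a given $\hat U$; the helper names are ours:

```python
import numpy as np

def build_Z_R(U_hat, p):
    """Z = [i_n (x) I_p, u_1 (x) I_p, ..., u_r (x) I_p] and
    R = I_{r+1} (x) i_p' as defined in (4.5)."""
    n, r = U_hat.shape
    cols = [np.kron(np.ones((n, 1)), np.eye(p))]
    cols += [np.kron(U_hat[:, [j]], np.eye(p)) for j in range(r)]
    Z = np.hstack(cols)
    R = np.kron(np.eye(r + 1), np.ones((1, p)))
    return Z, R

def constrained_ls(XT, Z, R):
    """beta_hat = b - (Z'Z)^{-1} R' [R (Z'Z)^{-1} R']^{-1} R b,
    where b is the unconstrained least squares estimate."""
    G = np.linalg.inv(Z.T @ Z)
    b = G @ Z.T @ XT
    return b - G @ R.T @ np.linalg.solve(R @ G @ R.T, R @ b)
```

The first $p$ entries of the returned vector are $\hat f$ and the remaining $rp$ entries are the stacked $\hat v_j$'s.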

Step Three (optional): estimating ARMA parameters in the non-seasonal component E

In the second step, we obtain the estimated seasonal matrix $\hat{S}$, which can be re-written in vector form as $\{\hat{s}_t\}_{t=1}^T$. Correspondingly, the estimated non-seasonal component can be extracted by subtracting $\hat{s}_t$ from the original time series $x_t$, i.e., $\hat{e}_t \equiv x_t - \hat{s}_t$. If the stochastic component of $x_t$ is assumed to follow a stationary ARMA($p, q$) process, i.e., $e_t \sim \mathrm{ARMA}(p, q)$ for $t = 1, \dots, T$, the ARMA parameters can then be obtained by fitting the ARMA model to $\hat{e}_t$. (This is a “nuisance” model; other stationary models could be used without affecting the methodology.)

Based on the fitted ARMA model, a feasible GLS estimate can be obtained by weighting the least squares in (4.4) with the inverse of the estimated variance-covariance matrix of the stochastic non-seasonal component $\hat{e}_t$. Although such an iterated procedure could potentially improve estimation accuracy, we find (using a simulation study) that the efficiency gain of the feasible GLS, in terms of reductions in AMSE and AMPE for estimating the seasonal component, is only marginal (around 2%) even when the first order autocorrelation of $e_t$ reaches 0.8. Thus, in general, we recommend using the unweighted ordinary least squares estimation in (4.4) instead of GLS unless the non-seasonal component $e_t$ exhibits very strong persistence. This also has the benefit of avoiding the additional computational burden. Moreover, as the ARMA model is usually not of particular interest for seasonal adjustment, Step Three can be omitted from the procedure.

4.3 Seasonal adjustment with diﬀerence stationary et

The basic seasonal adjustment procedure assumes that the non-seasonal component of a seasonal time series is stationary. This section discusses the situation more commonly encountered in economic data, wherein the non-seasonal component is nonstationary and has a stochastic trend. More specifically, we assume the non-seasonal component $e_t$ in the decomposition (4.1), $x_t = s_t + e_t$, is an integrated process, i.e., the first difference of $e_t$ is stationary. Existence of a stochastic trend in each column of $E$ invalidates the use of regularized SVD in the basic adjustment procedure, the direct use of which may produce an inconsistent estimate

of the seasonal component $S$. As examination of (4.2) indicates, when the non-seasonal component is stationary, there is no clear smooth pattern in each column of $E$, while the seasonal component changes smoothly in each column of $S$; therefore the regularized SVD can separate $S$ from $E$. However, if there is a stochastic trend in $E$, each column of $E$ has a stochastic trend and thus is quite smooth. Intuitively, the trend smoothness in $E$ “contaminates” the seasonality smoothness in $S$. Thus the basic regularized SVD in Section 4 will fail to separate $S$ from $E$, resulting in inconsistent estimates of the smooth pattern coefficients $\hat{u}$'s. Moreover, the nonstationarity of the non-seasonal component also invalidates the use of least squares for estimating the seasonal patterns $\hat{v}$'s in Step Two of the basic adjustment procedure in Section 4.2.

Our new procedure is a modiﬁcation of the basic procedure to address these issues. It also

has three steps, as elaborated below.

Step One: estimating seasonal pattern coeﬃcients in U

We first remove the stochastic trend in the non-seasonal component and then apply the regularized SVD. To this end, we take the first order column difference of the matrix $X$. This differencing removes the stochastic trend in $E$ but does not change the column-wise smoothness of the seasonal component matrix $S$. In matrix form, we post-multiply equation (4.2) by $\Delta_p^\top$ and obtain
$$X^\dagger \equiv X \Delta_p^\top = S \Delta_p^\top + E \Delta_p^\top \equiv S^\dagger + E^\dagger.$$
As in the basic adjustment procedure, we represent the seasonal component matrix using a reduced SVD as in (4.3), that is, $S = i_n f^\top + UV^\top$. Then
$$S^\dagger = S \Delta_p^\top = (i_n f^\top + UV^\top) \Delta_p^\top \equiv i_n f^{\dagger\top} + U V^{\dagger\top}. \qquad (4.6)$$

Equations (4.3) and (4.6) show that the seasonal matrix $S$ and its first order column-differenced matrix $S^\dagger$ share the same left singular matrix $U$, as the first order differencing operates from the right side of the matrix. The first order column difference on $E$ removes the nonstationary trend in ARIMA($p$, 1, $q$), so that $E^\dagger$ is weakly stationary.
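In code the column differencing is a single call. A sketch in numpy; the example below checks the key property that a common level within each row (the within-period value of a trend) cancels exactly:

```python
import numpy as np

def column_difference(X):
    """X^dagger = X Delta_p', i.e. (X Delta_p')_{i,j} = x_{i,j+1} - x_{i,j}:
    difference each row across adjacent seasons, removing a common
    level term within each row."""
    return np.diff(X, axis=1)
```

Because differencing acts on the right, the left singular structure of the seasonal part is untouched, which is exactly why $\hat{U}$ can still be recovered from $X^\dagger$.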

We eliminate the fixed seasonal effects in (4.6) by pre-multiplying the first order column-differenced data matrix $X^\dagger$ by the column-wise de-meaning matrix $Q_n$ to obtain $\tilde{X}^\dagger = Q_n X^\dagger$. We repeatedly apply Algorithm 1 $r$ times to the matrix $\tilde{X}^\dagger$ (or the residual matrices) to sequentially extract the regularized left singular vectors. Here, unlike in the basic procedure of the previous section, there is no need to enforce the zero-sum seasonal effects requirement on the right singular vectors, since we are working on the column-differenced data matrix. Denote the so-extracted $U$ matrix as $\hat{U}$, for use in Step Two.

Step Two: estimating ﬁxed/time-varying seasonal patterns in f and V

Given the estimated left singular vectors in $\hat{U}$, we estimate the fixed and time-varying seasonal patterns in $f$ and $V$ jointly by solving a constrained least squares problem. In contrast to the basic adjustment procedure, we need to work with the differenced series to remove the effect of nonstationarity. Let $\Delta X_T$ denote the first difference of $X_T = \mathrm{Vec}(X^\top)$, where $\Delta$ is the differencing operator. The constrained least squares problem is similar to that in (4.4) and can also be written as
$$\hat{\beta} = \arg\min_{\beta} \, (\Delta X_T - \Delta Z \beta)^\top (\Delta X_T - \Delta Z \beta) \quad \text{with } R\beta = 0_{r+1}, \qquad (4.7)$$
where $Z$, $\beta$, and $R$ are defined in the same manner as in (4.5). After solving the constrained least squares problem above, we obtain the estimated seasonal component as $\hat{S} = i_n \hat{f}^\top + \hat{U} \hat{V}^\top$.

Step Three (optional): estimating parameters in non-seasonal component E

If we assume that the non-seasonal component $e_t$ follows the dynamics of an ARIMA($p$, 1, $q$) process, then the ARIMA parameters can be obtained by fitting an ARIMA model to the residual series $\hat{e}_t = x_t - \hat{s}_t$. Then, a feasible GLS estimate of $f$ and $V$ can be obtained by weighting the least squares in (4.7) with the inverse of the estimated variance-covariance matrix of the differenced non-seasonal component $\Delta \hat{e}_t$. As we discussed in the description of the basic seasonal adjustment procedure, this step is usually not necessary.

4.4 Selecting the number of seasonal patterns

We propose to select the number of seasonal patterns $r$ by the following information criteria. For each seasonal time series, the number of periods within each season $p$ is fixed, and the total number of seasons $n$ increases as the total number of observations $T = np$ increases. We use the standard Bayesian Information Criterion (BIC) of time series applications, in which the penalty for overfitting, $(\log n)/n$, only involves $n$. If the non-seasonal component of the seasonal time series is stationary, the information criterion is
$$\mathrm{BIC}(r) = \ln\left[\frac{1}{T}\sum_{t=1}^{T}(x_t - \widehat{s}_t)^2\right] + r\,\frac{\log n}{n}; \qquad (4.8)$$
if the non-seasonal component of the seasonal time series is nonstationary, the information criterion is
$$\mathrm{BIC}(r) = \ln\left[\frac{1}{T-1}\sum_{t=2}^{T}(\Delta x_t - \Delta\widehat{s}_t)^2\right] + r\,\frac{\log n}{n}; \qquad (4.9)$$
where $n$ is the total number of seasons, $\{x_t\}$ is the original seasonal time series, $\{\widehat{s}_t\}$ is the estimated seasonal component, and $\Delta$ is the first-order difference operator.
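Computing (4.8)–(4.9) is straightforward once the fit $\widehat{s}_t$ is available; a small sketch (variable names ours):

```python
import numpy as np

def bic_r(x, s_hat, r, n, nonstationary=True):
    """BIC criteria (4.8)/(4.9): log of the mean squared (differenced)
    residual plus the r * log(n) / n penalty, n = number of seasons."""
    if nonstationary:
        resid = np.diff(x) - np.diff(s_hat)   # first differences, (4.9)
    else:
        resid = x - s_hat                     # levels, (4.8)
    return np.log(np.mean(resid ** 2)) + r * np.log(n) / n

# choose r by minimizing BIC over candidate fits, e.g.:
# best_r = min(candidates, key=lambda r: bic_r(x, fits[r], r, n))
```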

5 Seasonal adjustment when there is an abrupt change to seasonality

In the previous discussions, it is assumed that the elements in each column of the matrix $U$ (representing the magnitude of a seasonal pattern in one period) change smoothly across periods. This has two implications. First, the magnitude of each seasonal pattern only changes in a smooth fashion. Second, all seasonal patterns appear in all periods. In reality, these assumptions may be violated due to sudden changes in statistical criteria (such as sampling method and scope) or in the socioeconomic environment (such as economic policies and the enforcement of laws affecting behavior). Hence, seasonal patterns do not necessarily prevail all the time in a time series: some seasonal patterns may transiently exist with nonzero magnitudes and abruptly vanish. Moreover, the change in magnitude of a seasonal pattern does not necessarily have the same "smoothness" across all time spans: the magnitudes of seasonality may exhibit mild changes in early periods and sharp changes in others. This section discusses how to perform seasonal adjustment in these complicated scenarios.

To address abrupt changes (also referred to as breaks or change points) in seasonality, our method is a modification of the procedures presented in the previous two sections. We take the procedure from Section 4.3 as an example to show how to modify it. The basic seasonal adjustment procedure in Section 4.2 can be modified in a similar manner.

In Section 4.3, the seasonal adjustment procedure has three steps. To handle an abrupt seasonality change, we only need to modify Step One. It is sufficient to allow for at most one abrupt change for each seasonal pattern, but the timing of the break may differ across seasonal patterns. Step One of the previous procedure is based on $\widetilde{X}^\dagger = U V^{\dagger\top} + \widetilde{E}^\dagger$. By assuming each column of $U$ is smooth, the previous procedure extracts the columns of $U$ using the regularized SVD. Since the columns of $U$ are sequentially extracted, we only need to discuss how to modify the procedure for one column of $U$, denoted $u$, corresponding to a seasonal pattern $v$.

Now, suppose a non-smooth change of seasonality happens after $\ell$ seasonal periods, with $\ell = 0$ if there is no break. The period index $\ell$ separates the entire time span into two portions: one part starts from the beginning and ends at period $\ell$, and the second part contains the rest. If we know $\ell$, the timing of the break, we can apply the following modification of Algorithm 1 to extract $u$. Since the change point naturally separates $u$ into two parts $u_1$ and $u_2$, the modified algorithm updates these two parts separately using different smoothing parameters.

Algorithm 3. It is the same as Algorithm 1 except that

(1) the data matrix $X$ is replaced by $\widetilde{X}^\dagger = Q_n X \Delta_p^\top$, and

(2) the updating equation in Step 2(b) becomes two equations that update $u_1$ and $u_2$ separately, by applying Step 2(b) of the original algorithm to the first $\ell$ rows and the last $n - \ell$ rows of $\widetilde{X}^\dagger$, respectively.
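Assuming the Step 2(b) update takes the ridge-smoothing form typical of penalized rank-one approximations, the split update in Algorithm 3(2) might be sketched as follows (the function name and the exact form of the update are our assumptions):

```python
import numpy as np

def update_u_with_break(X, v, ell, lam1, lam2):
    """Modified Step 2(b): given the current right vector v, refresh the
    left vector u in two segments split at row ell, each segment with its
    own second-difference roughness penalty, mirroring Algorithm 3(2)."""
    def segment_update(Xseg, lam):
        m = Xseg.shape[0]
        D2 = np.diff(np.eye(m), n=2, axis=0)       # second differences
        return np.linalg.solve((v @ v) * np.eye(m) + lam * D2.T @ D2,
                               Xseg @ v)
    u1 = segment_update(X[:ell], lam1)             # first ell rows
    u2 = segment_update(X[ell:], lam2)             # last n - ell rows
    return np.concatenate([u1, u2])
```

Because the two segments use independent penalties `lam1` and `lam2`, the fitted $u$ can be smooth within each regime yet jump at the break.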

In Algorithm 3, using different smoothing parameters for the first $\ell$ elements and the last $n - \ell$ elements of $u$ enhances the flexibility of the procedure to handle abrupt changes in seasonal behavior across time spans. After applying this algorithm $r$ times to sequentially extract the columns of $\widehat{U}$, Step Two of the procedure in Section 4.3 can be used to obtain $\widehat{f}$ and $\widehat{V}$. Making the dependence on $\ell$ explicit in our notation, we obtain the estimated seasonal component matrix $\widehat{S}(\ell) = i_n \widehat{f}(\ell)^\top + \widehat{U}(\ell)\widehat{V}(\ell)^\top$.

In practice, we do not know $\ell$, so we need to specify it using the data. Since the roughness penalty involves second-order differencing, we require $3 \le \ell \le n - 3$. Including the no-break case $\ell = 0$, there are $(n - 5 + 1)$ possible values of $\ell$ for each seasonal pattern. For $r$ seasonal patterns, the set of all configurations of breaks is $L = \{\ell = (\ell_1, \ldots, \ell_r)\}$, and the total number of possible configurations is $\#(L) = (n - 5 + 1)^r$. When $n$ is large, $\#(L)$ can be so large that an exhaustive search for the optimal breaks is impractical due to the computational burden. When applying the RSVDB method to real data, we set a maximal number of seasonal patterns $r_{\max}$ to alleviate this problem.

Next we discuss how to specify the timing of the breaks. We select the optimal specification of the change points $\widehat{\ell}$ by minimizing the following criterion:
$$\widehat{\ell} = \arg\min_{\ell \in L}\, \frac{1}{T-1}\sum_{t=2}^{T}\left[\Delta x_t - \Delta\widehat{s}_t(\ell)\right]^2.$$
Note the criterion equals
$$\frac{1}{T-1}\sum_{t=2}^{T}\left[\Delta s_t - \Delta\widehat{s}_t(\ell)\right]^2 + \frac{1}{T-1}\sum_{t=2}^{T}(\Delta e_t)^2 + \frac{2}{T-1}\sum_{t=2}^{T}\left[\Delta s_t - \Delta\widehat{s}_t(\ell)\right]\Delta e_t. \qquad (5.1)$$
Here, by taking a first-order difference of the time series (i.e., $\Delta x_t$ and $\Delta\widehat{s}_t(\ell)$), we avoid working with a nonstationary series and the associated difficulties. By the ergodic theorem, on the right-hand side of (5.1), the second term converges to a constant and the third term converges to zero. Thus, minimizing this criterion essentially finds the best configuration by matching the extracted seasonal component with the true seasonal component (i.e., focusing on the first term).
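The search itself is a plain grid scan over break configurations. A sketch follows, with `fit_seasonal` standing in for a refit of Steps One and Two at a given configuration (a placeholder of ours, not the authors' interface):

```python
import numpy as np
from itertools import product

def select_breaks(x, fit_seasonal, n, r):
    """Grid-search the break configuration: for each candidate
    ell = (ell_1, ..., ell_r) with each ell_j in {0} U {3, ..., n-3},
    refit the seasonal component and keep the configuration minimizing
    the mean squared differenced residual (the criterion above (5.1))."""
    candidates = [0] + list(range(3, n - 2))   # 0 = no break; 3..n-3
    best, best_loss = None, np.inf
    for ell in product(candidates, repeat=r):
        s_hat = fit_seasonal(ell)              # fitted seasonal series
        loss = np.mean((np.diff(x) - np.diff(s_hat)) ** 2)
        if loss < best_loss:
            best, best_loss = ell, loss
    return best
```

The $(n - 5 + 1)^r$ loop makes the cost of the exhaustive search explicit, which is why a small $r_{\max}$ is imposed in practice.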

The TRAMO-SEATS method has a feature to handle breaks in seasonality, the seasonal outlier, which allows an abrupt increase or decrease in the level of the seasonal pattern that is compensated for in the other months or quarters. To use this feature in the official X-13ARIMA-SEATS program from the U.S. Census Bureau, one needs to specify in advance the types of breaks and the break times. Such seasonal outliers can be automatically detected in the Eurostat software JDemetra+. In contrast, our RSVD method not only automatically detects seasonal breaks but also allows for more diverse abrupt breaks in seasonality.

6 Simulation

In this section, we use simulated monthly time series data to evaluate the finite-sample performance of our proposed seasonal adjustment methods and compare them with a state-of-the-art method used by the U.S. Census Bureau. The benchmark for our comparison is X-13ARIMA-SEATS (U.S. Census Bureau, 2017), a hybrid program that integrates the model-based TRAMO/SEATS software developed at the Bank of Spain, described in Gómez and Maravall (1992, 1997), and the X-12-ARIMA program developed at the U.S. Census Bureau. In this section and the next, we abbreviate our seasonal adjustment methods as RSVD (since RSVD plays a critical role in our procedure), and the TRAMO/SEATS and X-11 style methodologies in the X-13ARIMA-SEATS program as SEATS and X-12-ARIMA, respectively. The data generating processes follow (4.1). Section S.2 (in the Supplementary Material) and Section 6.1 consider artificial seasonality with and without abrupt breaks and with stationary/nonstationary ARIMA error terms, and Section 6.2 considers seasonality from three real economic time series.

6.1 Seasonality with abrupt breaks

Now we consider a deterministic monthly seasonal component with a non-smooth break,
$$s_t^b \equiv s_{i,j}^b = b_i a_j,$$
where $i = 1, \ldots, n$ and $j = 1, \ldots, 12$ indicate year and month, respectively, and the elements of the vectors $b = (b_1, \ldots, b_n)^\top$ and $a = (a_1, \ldots, a_{12})^\top$ take the following values:
$$b_i = \begin{cases} 1 + i/10, & \text{if } 1 \le i \le n/2, \\ 1 + (n + 1 - i)/5, & \text{if } n/2 + 1 \le i \le n, \end{cases}$$
$$a = (-1.25, -2.25, -1.25, 0.75, -1.25, -0.25, 2.75, -0.25, 0.75, -0.25, 0.75, 1.75)^\top.$$

The vector $a$ represents the recurring variation within each seasonal period, which is the same as that in Section S.2. The magnitude of the seasonal component, captured by the multipliers in the vector $b$, increases slowly and linearly in the first $n/2$ years, doubles at year $n/2 + 1$, and then decreases slowly and linearly in the last $n/2$ years. The seasonality can also be expressed in matrix form as $S^o = b a^\top = i_n f^\top + u v^\top$, where the terms are defined in the same way as in (S.3). In Figure S2, we plot the fixed/time-varying seasonal patterns $f$ and $v$ in the upper-left panel, the fixed/time-varying pattern coefficients $i_n$ and $u$ in the upper-right panel, the fixed/time-varying seasonality $i_n f^\top$ and $u v^\top$ in the lower-left panel, and the total seasonality $S^o$ in the lower-right panel.

For the non-seasonal component, we only consider the nonstationary ARIMA(1,1,1) process in DGP3: $e_t \sim \mathrm{ARIMA}(1,1,1)$ with $\phi = 0.8$ and $\psi = 0.1$, with $N(0, \sigma^2)$ innovations and $\sigma^2 = 0.04$. The results for the stationary cases DGP1 and DGP2, which are similar to those for the nonstationary DGP3, are omitted here.

After the seasonal component $s_t^b$ and the non-seasonal component $e_t$ are generated, we use the following formula to obtain the simulated time series data:
$$x_t = s_t + e_t \equiv \kappa\,\frac{\mathrm{SD}(e_t)}{\mathrm{SD}(s_t^b)}\, s_t^b + e_t,$$
so that the sample unconditional standard deviation ratio $\mathrm{SD}(s_t)/\mathrm{SD}(e_t)$ is fixed to be exactly $\kappa$ in each replication of the DGPs. For the nonstationary DGP3, we choose $\kappa = 0.2, 0.4, \ldots, 2$ in our setups. For each combination of DGP3 and $\kappa$ values, we simulate monthly time series data with sample size $T = 240$ (i.e., $n = 20$ and $p = 12$). We repeat the simulation $B = 500$ times for each setup.
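Under the stated parameter values, one replication of DGP3 can be generated as follows. This is our own sketch: the zero initialization of the ARMA recursion (no burn-in) is a simplification.

```python
import numpy as np

def simulate_dgp3(n=20, kappa=1.0, seed=0):
    """Simulate x_t = kappa * SD(e)/SD(s^b) * s^b_t + e_t, with the broken
    seasonal s^b_{i,j} = b_i a_j and ARIMA(1,1,1) noise (phi = 0.8,
    psi = 0.1, sigma^2 = 0.04); T = 12 n observations."""
    rng = np.random.default_rng(seed)
    i = np.arange(1, n + 1)
    b = np.where(i <= n // 2, 1 + i / 10, 1 + (n + 1 - i) / 5)
    a = np.array([-1.25, -2.25, -1.25, 0.75, -1.25, -0.25,
                   2.75, -0.25, 0.75, -0.25, 0.75, 1.75])
    s = np.outer(b, a).ravel()               # s^b_t, t = 1, ..., 12n
    T = 12 * n
    eps = rng.normal(scale=0.2, size=T)      # sigma = sqrt(0.04)
    w = np.zeros(T)                          # stationary ARMA(1,1) part
    for t in range(1, T):
        w[t] = 0.8 * w[t - 1] + eps[t] + 0.1 * eps[t - 1]
    e = np.cumsum(w)                         # integrate: ARIMA(1,1,1)
    s_scaled = kappa * e.std() / s.std() * s # fix SD(s)/SD(e) = kappa
    return s_scaled + e, s_scaled, e
```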

Table 1: Evaluation of estimates of seasonal component with break (DGP3)*

                 AMSE (×10⁻²)                          AMPE (%)                  Avg. r
 κ     X-12-ARIMA   SEATS     RSVD    RSVDB   X-12-ARIMA  SEATS  RSVD  RSVDB   RSVD   RSVDB
0.2       4.3774    5.7199   1.9851   1.7623     33.18    40.89  24.32  23.87  1.030  1.026
0.4      11.2136   11.3763   2.5562   1.6082     19.51    28.93  11.82  11.49  1.012  1.016
0.6      21.4088   16.3613   3.8361   1.5588     15.23    23.38   8.06   7.54  1.006  1.014
0.8      34.7795   20.2008   5.6820   1.5439     13.16    19.62   6.27   5.63  1.008  1.014
1.0      50.3344   23.9295   8.0542   1.5366     11.92    16.90   5.19   4.49  1.004  1.014
1.2      68.6043   27.3233  10.9465   1.5318     11.16    14.49   4.49   3.74  1.004  1.014
1.4      86.4549   31.5762  14.4462   1.5296     10.56    12.82   4.04   3.20  1.012  1.014
1.6     108.0307   36.1908  18.4037   1.5276     10.12    11.46   3.67   2.80  1.012  1.014
1.8     130.6624   41.2645  22.9642   1.5264      9.77    10.37   3.39   2.49  1.004  1.014
2.0     155.3913   46.3343  28.0056   1.5254      9.49     9.49   3.18   2.24  1.008  1.014

* The non-seasonal component $\{e_t\}$ follows a Gaussian ARIMA(1,1,1) with AR(1) coefficient 0.8 and MA(1) coefficient 0.1. The official X-13ARIMA-SEATS program can only manually specify seasonal outliers as breaks in seasonality. To make the evaluation fairer, we use the X-12-ARIMA and SEATS implementations provided in JDemetra+, which can detect seasonal outliers automatically.

Table 1 reports the results of the two benchmark methods, RSVD without a break, and RSVD allowing for a break (RSVDB). Both RSVD methods outperform the benchmarks by delivering smaller absolute and relative losses, and RSVDB has the smallest errors of all. The absolute loss (AMSE) of RSVD and of the benchmarks increases as the ratio $\kappa$ increases, while that of RSVDB decreases and stabilizes. The relative loss (AMPE) of the methods decreases as $\kappa$ increases, and that of RSVDB decreases most quickly. Moreover, similar to the cases in Table S1, the average selected number of seasonal patterns $r$ for both RSVD methods is essentially the same and close to one across different values of $\kappa$, and no additional seasonal patterns are added due to the irregular variation.

6.2 Seasonality from real economic time series

The simulation in Section 6.1 favors our proposed methods, since the artificial seasonality takes exactly the form in (4.3), with which X-12-ARIMA and SEATS may disagree. We also use real seasonalities extracted from three seasonal economic time series to conduct simulations: the Industrial Production Index, Total Nonfarm Payrolls, and the inflation rate calculated from the Consumer Price Index for All Urban Consumers, all available on the Federal Reserve Economic Data website. We adopt two different simulation schemes; the simulation results and a detailed discussion are reported in the Supplementary Material. In general, the main messages conveyed by these simulation exercises are in line with those in the subsections above: when seasonality is strong, our RSVD seasonal adjustment method is superior to the X-12-ARIMA and SEATS methods, delivering much smaller AMSE and AMPE losses, whereas X-12-ARIMA and SEATS tend to outperform our RSVD method when the seasonality is weak.

7 Real data

In this section, we use real time series data with seasonal behavior to compare our proposed RSVDB seasonal adjustment method with the X-12-ARIMA and SEATS methods. The series are (i) monthly retail volume data (henceforth retail), (ii) quarterly berry production data of New Zealand (henceforth berry), and (iii) daily online submission counts (henceforth counts). These three empirical examples are selected to showcase that our proposed RSVDB seasonal adjustment method can produce (i) seasonal components similar to those of the X-12-ARIMA and SEATS methods; (ii) better seasonal components when X-12-ARIMA and SEATS fail; and (iii) seasonal components for frequencies other than quarterly and monthly, such as daily and weekly. Furthermore, the seasonality of the first series (monthly retail volume) is steady and mild, similar to that of the simulated series in Section 6.2, where the seasonality comes from real economic time series. In contrast, the seasonality of the second series (quarterly berry production of New Zealand) is strong and has some possible breaks, similar to that of the simulated series in Section 6.1, where the artificial seasonality has an abrupt break in the middle of the time period.

Since no exact definition of seasonality exists and the true underlying seasonality is always unknown in real data, different seasonal adjustment methods recognize seasonality differently given the same data. It is hard to formally compare the results from different seasonal adjustment methods, especially when they are very close. For empirical applications, these methods can only be compared and evaluated qualitatively, with visual inspection. More importantly, our proposed RSVDB method is able to decompose the seasonal component into different seasonal patterns, trace the dynamics of seasonality through the time-varying pattern coefficients, and identify important seasonality break times automatically. We also use these three applications to illustrate that the RSVDB method can provide a very transparent and meaningful explanation of economic seasonality in real data.

In this section, only the seasonal decompositions of the X-12-ARIMA and SEATS methods in the X-13ARIMA-SEATS program and of the RSVDB method are compared. The first monthly series is already pretreated, and the second quarterly series has very strong seasonal fluctuations that may overwhelm possible calendar effects and outliers. Given the automatic features of the X-13ARIMA-SEATS program in seasonal adjustment, we only shut down the options for calendar effects and automatic outlier detection, and apply the X-12-ARIMA, SEATS, and RSVDB methods to these two series directly, so that the comparison among the three methods considers only their capabilities of seasonal adjustment. In addition, the number of seasonal patterns $r$ in the RSVDB method is selected by the Bayesian Information Criterion (4.9), allowing for a non-smooth break in each of the corresponding left singular vectors with roughness penalties, for all three time series data sets. In order to control the computational burden of the exhaustive search for abrupt seasonality breaks, we limit the maximal number of seasonal patterns to 3, i.e., $1 \le r \le r_{\max} = 3$.

7.1 Retail volume data (retail)

We first examine the monthly series of Motor Vehicle and Parts Dealers published in the U.S. Census Bureau's Advance Monthly Sales for Retail and Food Services, covering the period from January 1992 through December 2012.

[Figure 1] [Figure 4]

Figures 1 and 4 show and compare the seasonal adjustment results for the retail time series using the X-12-ARIMA, SEATS, and RSVDB methods. In Figure 4(a)–(c), we plot the fixed and time-varying seasonal patterns in $f$ and $V = (v_1, v_2, v_3)$ and their corresponding time-varying pattern coefficients in $U = (u_1, u_2, u_3)$. The black solid, red dashed, and green dotted vertical lines in Figure 4(c) represent the abrupt breaks detected in $u_1$, $u_2$, and $u_3$, respectively. Figure 4(d) presents the fixed seasonality $i_n f^\top$ and the time-varying seasonality $\sum_{r=1}^{3} u_r v_r^\top$. In Figure 4(e), we plot the three seasonal components extracted by the X-12-ARIMA, SEATS, and RSVDB methods. Figure 1 shows the original time series and the seasonally adjusted series from the three adjustment methods. Finally, to check whether our RSVDB method adjusts seasonality adequately, we plot the periodogram of the adjusted series in Figure S3. The sample spectrum clearly shows no spike at any of the seasonal frequencies, indicating that no residual seasonality remains.

First, in Figure 4(d) we find that the fixed seasonal component is larger than the time-varying seasonal component. Second, the seasonal components $s_t$ extracted via X-12-ARIMA, SEATS, and RSVDB are very similar for this retail volume time series. The three breaks detected in the three time-varying pattern coefficients $u_1$, $u_2$, and $u_3$ segment the time series into four periods; see Figure 4(e) and Figure 1. In the period around the red dashed and green dotted vertical lines, the RSVDB seasonal component is slightly more volatile than the other two estimated seasonal components, and the RSVDB seasonally adjusted series is slightly smoother than the other two adjusted series. In the other periods, the RSVDB seasonal component and adjusted series are almost the same as their counterparts.

7.2 Berry production data of New Zealand (berry)

We next examine the quarterly series of New Zealand constant-price exports of berries, covering the period from 1988Q1 to 2005Q2.

[Figure 2] [Figure 5]

Figures 2 and 5 show and compare the seasonal adjustment results for the berry production time series using the X-12-ARIMA, SEATS, and RSVDB methods. In Figure 5(a)–(c), we plot the fixed and time-varying seasonal patterns in $f$ and $V = (v_1, v_2)$ and their corresponding time-varying pattern coefficients in $U = (u_1, u_2)$. The fixed seasonal pattern has a much larger scale than the time-varying seasonal patterns. The black solid and red dashed vertical lines in Figure 5(c) represent the abrupt breaks detected in $u_1$ and $u_2$, respectively. Figure 5(d) presents the fixed seasonality $i_n f^\top$ and the time-varying seasonality $\sum_{r=1}^{2} u_r v_r^\top$. In Figure 5(e), we plot the three seasonal components extracted by the X-12-ARIMA, SEATS, and RSVDB methods. Figure 2 shows the original time series and the seasonally adjusted series from the three adjustment methods. Finally, we plot the periodogram of the RSVDB-adjusted series in Figure S4. The sample spectrum shows no spike at any of the seasonal frequencies, indicating that no residual seasonality remains.

Because of the nature of agricultural production, actual berry production in the fall quarter is close to zero. Despite this, both the X-12-ARIMA and SEATS methods automatically apply the logarithmic transformation and do not deliver reasonable results: their seasonal components are excessively negative in certain periods in Figure 5(e), and their adjusted series in Figure 2 are excessively high in those periods. It turns out that one needs to further modify the default options of the two methods manually to produce reasonable outcomes. In contrast, the RSVDB method is robust to this irregularity and produces reasonable results. As with the retail volume data, the fixed seasonal component is much more salient than the time-varying component and dominates the seasonality.

Moreover, RSVDB identifies the year 2000 as a major break time in seasonality, as the first seasonal pattern coefficients drop dramatically after 2000. This phenomenon is also manifested in Figures 2 and 5(e): the magnitude of seasonality generally increases gradually before 2000, decreases suddenly in 2001, and decreases gradually thereafter.

7.3 Online submission count data

Lastly, we study a daily time series of submission counts for the 2015 Census Test, covering March 23 through June 1. The Census Test is described at www.census.gov/2015censustests. Submissions cover both self-responses and responses taken over the telephone at one of the Census Bureau telephone centers. In this case, the seasonal component of the data corresponds to day-of-week dynamics, and it is of interest to know whether certain days have systematically higher activity.

[Figure 3] [Figure 6]

Figures 3 and 6 show the seasonal adjustment results for the submission counts time series using the RSVDB method. In Figure 6(a)–(c), we plot the fixed and time-varying seasonal patterns in $f$ and $v_1$ and the corresponding time-varying pattern coefficients in $u_1$. The black solid vertical line in Figure 6(c) represents the abrupt break detected in $u_1$. Figure 6(d) presents the fixed seasonality $i_n f^\top$ and the time-varying seasonality $u_1 v_1^\top$. In Figure 6(e), we plot the seasonal component extracted by the RSVDB method. Finally, we plot the periodogram of the RSVDB-adjusted series in Figure S5. The sample spectrum shows no spike at any of the seasonal frequencies, indicating that no residual seasonality remains.

Figure 3 plots the seasonal adjustment results for the daily online submission count data in logarithms. Because the data occur at a daily frequency, the X-13ARIMA-SEATS software cannot be applied, although in principle model-based approaches could be used. However, the seasonal pattern (i.e., the weekly pattern) is very dynamic and hence presents a challenge for parametric models. In contrast, our proposed RSVDB method applies readily to daily data with weekly seasonality. In Figure 6(d), the fixed and time-varying seasonal components have similar magnitudes. In Figure 6(e), the RSVDB seasonal component shows that the seasonal behavior is quite different at the beginning, middle, and end of the time series. In Figure 3, the seasonally adjusted series is much smoother than the original series: it increases and reaches its peak in the first week, decreases in the second week, and first increases and then decreases in the third week. Then the adjusted series keeps decreasing and reaches its trough in the sixth week. After that, the adjusted series increases again but with more fluctuations.

More interestingly, RSVDB identifies the fourth week as the major break time in seasonality. In Figure 6(c), the time-varying pattern coefficients are virtually zero up to and including the fourth week, indicating that essentially no time-varying seasonal pattern appears during the first four weeks. After that, the time-varying pattern coefficients move away from zero, first decreasing slightly and then increasing sharply after the seventh week. This means that the time-varying seasonal pattern emerges gradually after the fourth week and finally dominates the seasonal component at the end of this series.

8 Discussion and conclusion

Other important issues concerning our proposed RSVD method include how to deal with potential data problems (such as missing values, outliers, and calendar effects), how to obtain confidence intervals for the seasonally adjusted process, and how to use RSVD to handle multiple types of seasonality in daily or even higher-frequency time series. Due to space limitations, further discussion of these issues is provided in Section S.4 of the Supplementary Material.

The bulk of seasonal adjustment methodology and software is divided between the model-based and empirical-based approaches, each with its own proponents among researchers and practitioners. The empirical-based methods all rely upon linear filters, and therefore struggle to successfully adjust highly nonlinear seasonal structures. The model-based methods are more flexible, yielding a wider array of filters, but the methods (whether based on deterministic or stochastic components) still tend to be linear. When seasonality evinces structural changes (perhaps a response of consumers to a change in legislation), systemic extremes (perhaps due to high sensitivity to local weather conditions), or very rapid change (perhaps due to a dynamic marketplace, where new technologies rapidly alter cultural habits), the conventional paradigms tend to be inadequate. While it is possible to specify ever more complex models, it is arguably more attractive to devise nonparametric (or empirical-based) techniques that automatically adapt to a variety of structures in the data. This approach is especially attractive to a statistical agency involved in adjusting thousands of series each month or quarter, because devising specially crafted models for each problem series requires excessive manpower.

The methodology of this paper is empirical in spirit, utilizing a nonparametric method to separate seasonal structure from other time series dynamics. Like X-12-ARIMA, which combines nonparametric filters with model-based forecast extension, our RSVD method combines stochastic models of nuisance structures with the regularized elicitation of seasonal dynamics. The advantages over purely model-based approaches are the ability to avoid model misspecification fallacies, allow for structural change in seasonality, handle seasonal extremes, and capture rapidly evolving seasonality. Moreover, the RSVD method is computationally fast and almost automatic (the ARIMA specification does require choices by the user), and hence is attractive in a context where individual attention to thousands of series is a logistical impossibility. An admitted downside is that RSVD does not quantify the estimation error in the seasonal component. With market demands for more data (higher frequency, more granularity) coupled with tightening budgets, the necessity of automation in data processing must drive future research efforts; RSVD takes a substantial step in that direction.

Finally, we mention that there are many fruitful directions for extensions to RSVD: use of the $U$ and $V$ singular vectors to detect seasonality; multivariate modeling, where $U$ vectors may be common to multiple time series; and handling multiple frequencies of seasonality (e.g., daily time series with weekly and annual seasonality) through an extension of matrix embedding to an array (tensor) structure. Any of these facets would greatly assist the massive data processing task facing statistical agencies.

References

Barsky, R. B., & Miron, J. A. (1989). The seasonal cycle and the business cycle. Journal of Political Economy, 503–534.

Bell, R., Bennett, J., Koren, Y., & Volinsky, C. (2009). The million dollar programming prize. IEEE Spectrum, 46(5), 28–33.

Bell, W. R., Holan, S. H., & McElroy, T. S. (Eds.). (2012). Economic Time Series: Modeling and Seasonality. Chapman and Hall/CRC.

Bunzel, H., & Hylleberg, S. (1982). Seasonality in dynamic regression models: A comparative study of finite sample properties of various regression estimators including band spectrum regression. Journal of Econometrics, 19(2-3), 345–366.

Burridge, P., & Wallis, K. F. (1990). Seasonal adjustment and Kalman filtering: Extension to periodic variances. Journal of Forecasting, 9(2), 109–118.

Buys Ballot, C. H. D. (1847). Les Changements Périodiques de Température. Utrecht: Kemink et Fils.

Cai, Z., & Chen, R. (2006). Flexible seasonal time series models. Econometric Analysis of Financial and Economic Time Series, 20, 63–87.

Canova, F. (1992). An alternative approach to modeling and forecasting seasonal time series. Journal of Business & Economic Statistics, 10(1), 97–108.

Dagum, E. B. (1980). The X-11-ARIMA Seasonal Adjustment Method. Statistics Canada, Seasonal Adjustment and Time Series Staff.

Dagum, E. B., & Bianconcini, S. (2016). Seasonal Adjustment Methods and Real Time Trend-Cycle Estimation. Springer.

Deerwester, S., Dumais, S. T., Furnas, G. W., Landauer, T. K., & Harshman, R. (1990). Indexing by latent semantic analysis. Journal of the American Society for Information Science, 41(6), 391–407.

Dokumentov, A., & Hyndman, R. J. (2015). STR: A seasonal-trend decomposition procedure based on regression (Working Paper No. 13/15). Monash University Econometrics & Business Statistics. Retrieved from http://robjhyndman.com/working-papers/str/

Findley, D. F. (2005). Some recent developments and directions in seasonal adjustment. Journal of Official Statistics, 21, 343–365.

Findley, D. F., Monsell, B. C., Bell, W. R., Otto, M. C., & Chen, B. C. (1998). New capabilities and methods of the X-12-ARIMA seasonal-adjustment program. Journal of Business & Economic Statistics, 16(2), 127–152.

Franses, P. H. (1998). Time Series Models for Business and Economic Forecasting. Cambridge University Press.

Gersovitz, M., & MacKinnon, J. G. (1978). Seasonality in regression: An application of smoothness priors. Journal of the American Statistical Association, 73(362), 264–273.

Gómez, V., & Maravall, A. (1992). Time series regression with ARIMA noise and missing observations – Program TRAMO. EUI Working Paper ECO, No. 92/81.

Gómez, V., & Maravall, A. (1997). Programs TRAMO and SEATS: Instructions for the user. Working Paper 9628, Servicio de Estudios, Banco de España.

Hansen, L. P., & Sargent, T. J. (1993). Seasonality and approximation errors in rational expectations models. Journal of Econometrics, 55(1), 21–55.

Harrison, P. J., & Stevens, C. F. (1976). Bayesian forecasting. Journal of the Royal Statistical Society, Series B, 38, 205–247.

Harvey, A. C. (1990). Forecasting, Structural Time Series Models and the Kalman Filter. Cambridge University Press.

Harvey, A., & Scott, A. (1994). Seasonality in dynamic regression models. The Economic Journal, 1324–1345.

Huang, J. Z., Shen, H., & Buja, A. (2008). Functional principal components analysis via penalized rank one approximation. Electronic Journal of Statistics, 2, 678–695.

Huang, J. Z., Shen, H., & Buja, A. (2009). The analysis of two-way functional data using two-way regularized singular value decompositions. Journal of the American Statistical Association, 104, 1609–1620.

Hylleberg, S., Engle, R. F., Granger, C. W., & Yoo, B. S. (1990). Seasonal integration and cointegration. Journal of Econometrics, 44(1), 215–238.

Lovell, M. C. (1963). Seasonal adjustment of economic time series and multiple regression analysis. Journal of the American Statistical Association, 58(304), 993–1010.

Osborn, D. R. (1991). The implications of periodically varying coefficients for seasonal time-series processes. Journal of Econometrics, 48(3), 373–384.

Osborn, D. R. (1993). Discussion: Seasonal cointegration. Journal of Econometrics, 55(1), 299–303.

Proietti, T. (2004). Seasonal specific structural time series. Studies in Nonlinear Dynamics & Econometrics, 8(2).

Sarwar, B., Karypis, G., Konstan, J., & Riedl, J. (2000). Application of dimensionality reduction in recommender system – a case study (No. TR-00-043). University of Minnesota, Department of Computer Science.

Shiskin, J., Young, A. H., & Musgrave, J. C. (1965). The X-11 variant of the Census Method II seasonal adjustment program. Technical Paper No. 15, U.S. Department of Commerce, Bureau of the Census.

Sims, C. A. (1974). Seasonality in regression. Journal of the American Statistical Association, 69(347), 618–626.

U.S. Census Bureau (2017). X-13ARIMA-SEATS Reference Manual. http://www.census.gov/ts/x13as/docX13AS.pdf

Figure 1: Logarithm retail volume: Original and seasonally adjusted series with the X-12-ARIMA, SEATS, and RSVDB methods. [Plot omitted; shows the original series alongside the three adjusted series over 1995–2010.]

Figure 2: New Zealand berries exports: Original and seasonally adjusted series with the X-12-ARIMA, SEATS, and RSVDB methods. [Plot omitted; shows the original series alongside the three adjusted series over 1990–2005.]

Figure 3: Logarithm submission counts: Original and seasonally adjusted series with the RSVDB method. [Plot omitted; shows the original series alongside the RSVDB-adjusted series.]


Figure 4: The RSVDB seasonal decomposition of the logarithm retail series. [Five panels omitted: (a) fixed seasonal pattern by month; (b) time-varying seasonal patterns (1st, 2nd, and 3rd) by month; (c) time-varying pattern coefficients over 1995–2010, with breaks marked; (d) fixed and time-varying seasonal components; (e) seasonal components from X-12-ARIMA, SEATS, and RSVDB.]


Figure 5: The RSVDB seasonal decomposition of the berries exports series. [Five panels omitted: (a) fixed seasonal pattern by season (1–4); (b) time-varying seasonal patterns (1st, 2nd, and 3rd) by season; (c) time-varying pattern coefficients over 1990–2005, with breaks marked; (d) fixed and time-varying seasonal components; (e) seasonal components from X-12-ARIMA, SEATS, and RSVDB.]


Figure 6: The RSVDB seasonal decomposition of the daily submission counts series. [Five panels omitted: (a) fixed seasonal pattern by weekday (1–7); (b) time-varying seasonal pattern (1st) by weekday; (c) time-varying pattern coefficient, with break marked; (d) fixed and time-varying seasonal components; (e) RSVDB seasonal component.]
