# The application of forecasting techniques to modeling emergency medical system calls in Calgary, Alberta.

**ABSTRACT** We develop and evaluate time-series models of call volume to the emergency medical service of a major Canadian city. Our objective is to offer simple and effective models that could be used for realistic simulation of the system and for forecasting daily and hourly call volumes. Notable features of the analyzed time series are: a positive trend, daily, weekly, and yearly seasonal cycles, special-day effects, and positive autocorrelation. We estimate models of daily volumes via two approaches: (1) autoregressive models of data obtained after eliminating trend, seasonality, and special-day effects; and (2) doubly-seasonal ARIMA models with special-day effects. We compare the estimated models in terms of goodness-of-fit and forecasting accuracy. We also consider two possibilities for the hourly model: (3) a multinomial distribution for the vector of number of calls in each hour conditional on the total volume of calls during the day and (4) fitting a time series to the data at the hourly level. For our data, (1) and (3) are superior.

Health Care Manage Sci

DOI 10.1007/s10729-006-9006-3

Nabil Channouf · Pierre L’Ecuyer ·

Armann Ingolfsson · Athanassios N. Avramidis

Received: 20 June 2006 / Accepted: 1 October 2006

© Springer Science + Business Media, LLC 2006

Keywords Emergency medical service · Arrivals ·

Time series · Simulation · Forecasting

N. Channouf · P. L’Ecuyer · A. N. Avramidis

DIRO, Université de Montréal, Montréal, Canada

A. Ingolfsson (corresponding author)

School of Business, University of Alberta,

Edmonton, Alberta T6G 2R6, Canada

e-mail: armann.ingolfsson@ualberta.ca

1 Introduction

Most cities in the developed world have organizations

that provide Emergency Medical Service (EMS), con-

sisting of pre-hospital medical care and transport to a

medical facility. Demand for such services is increasing

throughout the developed world, in large part because

of the aging of the population. In the U.S., EMS funding

decreased following conversion of direct federal fund-

ing to block grants to states [11, 37] that have, in many

cases, been used for purposes other than EMS. Tighter

budgets make efficient use of resources increasingly

important. Reliable demand forecasts are crucial input

to resource use planning, and the focus of this paper is

on how to generate such forecasts.

Almost all demand to EMS systems arrives by

phone, through calls to an emergency number (911 in

North America). Calls that arrive to 911 are initially

routed to EMS, fire, or police. Calls routed to EMS

are then evaluated, which involves obtaining an address,

determining the nature and importance of the incident,

and possibly providing instructions to a bystander on

the use of CPR or other first-aid procedures. Dispatch-

ing an ambulance to the call, the next step, is a sepa-

rate function that can occur partly in parallel with call

evaluation. The crew of the dispatched vehicle(s) then

begins traveling toward the scene of the call, where

they assess the situation, provide on-site medical care,

and determine whether transport to a medical facility

is necessary (this is the case roughly 75% of the time).

Once at the medical facility, EMS staff remain with

the patient until they have transferred responsibility for

her or his care to a nurse or physician. The crew may

then need to complete various forms before it becomes

available to take new calls.

The resource requirements per EMS call are on the

order of a few minutes for call evaluation and dispatch,

and on the order of an hour for an ambulance and

its crew. The latter component is growing in many

locations because of increased waiting times in hospital

emergency rooms [13, 14, 33, 34].

The primary performance measure for an EMS sys-

tem is typically the fraction of calls reached within some

time standard, from the instant the call was made. In

North America, a typical target is to reach 90% of

the most urgent calls within 9 min. Although universal

standards are lacking [28], the response time is typically

considered to begin when call evaluation begins and

end when an ambulance reaches the call address. Sec-

ondary performance measures include waiting times on

the phone before reaching a 911 operator (for example,

90% in 10 s [29] or 95% in 5 s [12]), average call eval-

uation times, average dispatch times, and average time

spent at hospital.

The main decisions that require medium-term call

volume forecasts (a few days to a few weeks into the

future) are scheduling decisions for call evaluators, dis-

patchers, and, most importantly, ambulances and their

crews. Longer-term call volume forecasts are needed

for strategic planning of system expansion or reorga-

nization. Shorter-term (intra-day) forecasts could be

used to inform decisions about when to call in extra

resources.

Service level standards for EMS systems are imper-

fect proxies for the real goal of such systems, namely

to save lives and prevent suffering (see [15] for a dis-

cussion of models that attempt to quantify such goals

more explicitly). Meeting these service-level standards

is expensive, so it is a problem of substantial economic

and social interest to manage EMS systems efficiently.

Generally speaking, efficiency involves balancing qual-

ity of service against system costs. An important input

to this operational problem is the call volume. Uncer-

tainty in future call volume complicates the process of

determining levels of EMS staffing and equipment. It

is therefore important to correctly model the stochastic

nature of call volumes, and in particular, to make pre-

dictions of future call volumes, including uncertainty

estimates (via prediction intervals).

Operations researchers have been developing plan-

ning models for EMS systems, as well as police and

fire services, since the 1970s. Green and Kolesar [17]

provide a recent perspective on the impact of this work.

Swersey [35] surveys the academic literature on this

topic and [16] provides an EMS-practitioner-oriented

literature survey. EMS planning models include simula-

tion models ([20] and [22] are recent examples), analyt-

ical queueing models (notably the hypercube queueing

model, see Larson [24, 25]), and optimization models

for location of facilities and units. All of these mod-

els require estimates of demand as input. Typically,

planning models assume that demand follows a Poisson

process—an assumption that is supported by both theo-

retical arguments [e.g., 19] and empirical evidence [e.g.,

18, 39]. However, empirical studies of demand for both

EMS and other services (notably Brown et al. [9]) indi-

cate that the rate of the Poisson arrival process varies

with time and may be random. The work we report

in this paper is aimed at estimating the arrival rate

during day-long or hour-long periods. We elaborate in

Section 3 on how our estimates can be used to support

simulation and analytical studies that assume a Poisson

arrival process.

Goldberg [16] mentions that “the ability to predict

demand is of paramount importance” but that this area

has seen little systematic study. The work that has been

done can be divided in two categories: (1) models of

the spatial distribution of demand, as a function of

demographic variables and (2) models of how demand

evolves over time. In the first category, Kamenetsky

et al. [23] surveyed the literature before 1982 and pre-

sented regression models to predict EMS demand as

a function of population, employment, and two other

demographic variables. Their models successfully

explained most of the variation in demand ($R^2 = 0.92$)

among 200 spatial units in southwestern Pennsylvania.

McConnell and Wilson [27] is a more recent article

from this category, which focuses on the increasingly important

impact of the age distribution in a community

on EMS demand. We refer the reader to Kamenetsky

et al. [23] and McConnell and Wilson [27] for further

relevant references.

This paper falls in the second category, of modeling

and forecasting EMS demand over time. EMS demand

varies strongly by time of day and day of week, for

example see Zhu et al. [39] and Gunes and Szechtman

[18]. Past related work that attempts to forecast daily

EMS demand includes Mabert [26], who analyzed

emergency call arrivals to the Indianapolis Police

Department. He considered several simple methods

based on de-seasonalized data and found that one

of them outperforms a simple ARIMA model [7].

In a similar vein, Baker and Fitzpatrick [4] used

Winters' exponential smoothing models to separately

forecast the daily volume of emergency and “routine”

EMS calls and used goal programming to choose the

exponential smoothing parameters.

Recent work on forecasting arrivals to call centers

from a variety of industries is also relevant. For the pre-

diction of daily call volumes to a retailer’s call center,

Andrews and Cunningham [3] incorporate advertising

effects in an ARIMA model with transfer functions;

their covariates are indicator variables of certain special

days and catalog mailing days. Bianchi et al. [6] use an

ARIMA model for forecasting daily arrivals at a tele-

marketing center, compare against the Holt–Winters

model, and show the benefits of outlier elimination.

Tych et al. [36] forecast hourly arrivals in a retail bank

call center via a relatively complex model with unob-

served components named “dynamic harmonic regres-

sion” and show that it outperforms seasonal ARIMA

models; one unusual feature of their methodology is

that estimation is done in the frequency domain. Brown

et al. [9] develop methods for the prediction of the

arrival rates over short intervals in a day, notably via

linear regression on previous day’s call volume.

In this paper, we study models of daily and

hourly EMS call volumes and we demonstrate their

application using historical observations from Calgary,

Alberta. Although we focus on the Calgary data,

we expect the models could be used to model EMS

demand in other cities as well and we will comment on

likely similarities and differences between cities.

We have 50 months (from 2000 to 2004) of data from

the Calgary EMS system. Preliminary analysis reveals

a positive trend, seasonality at the daily, weekly, and

yearly cycle, special-day effects, and autocorrelation.

In view of this, we consider two main approaches: (1)

autoregressive models of the residual error of a model

with trend, seasonality, and special-day effects; and (2)

doubly-seasonal ARIMA models for the residuals of

a model that captures only special-day effects. Within

approach (1), we explore models whose effects are the

day-of-week and month-of-year. We also consider a

model with cross effects (interaction terms) and a more

parsimonious model, also with cross effects, but where

only the statistically significant effects are retained.

The latter turns out to be the best performer in terms

of both goodness-of-fit and forecasting accuracy. All

the models are estimated with the first 36 months of

data and the forecasting error is measured with the

data from the last 14 months. We used the R and SAS

statistical software for the analysis.

The remainder of the paper is organized as follows.

Section 2 provides descriptive and preliminary data

analysis. In Section 3 we present the different models of

daily arrivals and compare them in terms of quality of fit

(in-sample) and forecast accuracy (out-of-sample). In

Section 4, we address the problem of predicting hourly

call volumes. Section 5 offers conclusions.

2 Preliminary data analysis

We have data from January 1, 2000 to March 16, 2004,

containing the time of occurrence of each ambulance

call, the assessed call priority, and the geographical

zone where the call originated. We work with the

number of calls in each hour instead of their times of

occurrence, to facilitate the application of time series

models. We explain in the next section how such hourly

counts can be related to a stochastic model of the times

of individual arrivals. The average number of arrivals is

about 174/day, or about 7/h.

Fig. 1 The daily call volume for the year 2000 (x-axis: days, from January 1 to December 31, 2000; y-axis: number of calls). Some outliers appear in the plot.

Fig. 2 The monthly call volume over the entire period (x-axis: months; y-axis: number of calls).

Figure 1 provides a first view of the data; it shows the daily volume for year 2000. The figure suggests a positive trend, larger volume in July and December, and shows some unusually large values, e.g., on January 1, July 8, November 11, December 1; and low values, e.g., on January 26, September 9. Figure 2 shows monthly volume over the entire period. This plot reveals a clear

positive trend; the likely explanation is a combination

of population growth and aging in the city. Figure 3

shows average volume by hour over the weekly cycle.

The plot reveals a clear hour-of-day seasonality: over

a 24-h cycle, higher call volumes are usually observed

between 10 a.m. and 8 p.m.; substantially lower volumes

are seen overnight. One also observes day-of-week

effects. Closer inspection reveals, not surprisingly, in-

creased activity during Friday and Saturday evening

and early night. With respect to daily volume, larger

values are observed over Friday and Saturday relative

to the other days of the week. These observations

would have to be taken into account when designing

shift schedules for the ambulance crews.

Figures 4 and 5 give box-plots of the daily volume

for each day of the week and monthly volume for each

month of the year, respectively. Each box plot gives the

median, the first and third quartiles (the bottom and top

of the central box), the interquartile range (the height

of the box), and two bars located at a distance of 1.5

times the interquartile range below the first quartile and

above the third quartile, respectively. The small circles

denote the individual observations that fall above

or below these two bars. We see again that Friday and

Saturday have more volume than the average. July,

December, and November are the busiest months (in

this order) while April is the quietest month.
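The box-plot construction described above can be sketched in a few lines of Python. This is only an illustration: the counts are made up (not the Calgary data), the function name `boxplot_summary` is hypothetical, and the quartile convention (`method="inclusive"`) is an assumption, since the paper does not say which convention its plotting software used.

```python
import statistics

def boxplot_summary(values):
    """Quartiles, IQR, the two 1.5*IQR bars, and the points outside the bars."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_bar = q1 - 1.5 * iqr   # 1.5 IQR below the first quartile
    upper_bar = q3 + 1.5 * iqr   # 1.5 IQR above the third quartile
    outside = [v for v in values if v < lower_bar or v > upper_bar]
    return {"q1": q1, "median": median, "q3": q3, "iqr": iqr,
            "lower_bar": lower_bar, "upper_bar": upper_bar, "outside": outside}

# Hypothetical daily call counts for one day of the week (illustration only).
counts = [150, 160, 165, 170, 172, 175, 178, 180, 185, 190, 260]
summary = boxplot_summary(counts)
print(summary)
```

Points listed under `outside` correspond to the small circles in the plots. Some plotting packages instead draw the bars at the most extreme observation inside these limits, so the bar positions may differ slightly from this sketch.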

Fig. 3 The average hourly call volume over the weekly cycle (x-axis: hours, 24 × 7, Monday to Sunday; y-axis: number of calls).

Fig. 4 Box plots of arrival volumes per day for each day of the week (x-axis: Monday to Sunday; y-axis: number of calls).

3 Models for daily arrivals

We now consider five different time-series models for the arrival volumes over successive days. Although in the end we conclude that one of these models fits

the Calgary data best, we discuss all of them because

different models from the collection that we present

may be appropriate depending on the city being studied

and the purpose of the analysis. These models are

defined and studied in Sections 3.1 to 3.5. In Section 3.6,

we compare these models in terms of both quality of

fit (in-sample) and forecast accuracy (out-of-sample).

Throughout the paper, t denotes the time index in days

and the number of arrivals on day t is denoted Yt,

for t = 1,2,...,n, where n = 1,537. The models are

fitted to the first 1,096 observations (January 1, 2000

through December 31, 2002), and the remaining 441

observations are used for prediction (January 1, 2003

through March 16, 2004).

There are compelling theoretical reasons to assume

that call arrivals follow a nonhomogeneous Poisson

process (NHPP). The Palm–Khintchine theorem [e.g.,

10] states, approximately, that the superposition of

arrival processes from many small and indepen-

dent “sources” (patients, in an EMS context) is well-

approximated by a Poisson process. The rate of this

process will vary with time (because medical emergen-

cies are more likely to occur at certain times) and the

rate may not be known with certainty (because it may

be influenced by factors other than time).

Fig. 5 Box plots of mean daily arrival volumes per month (x-axis: January to December; y-axis: number of calls).

For purposes of illustration, suppose that arrivals during hour $h$ follow a Poisson process with a random rate that remains constant during the hour. Conditional on the number of calls during the hour, call it $Z_h$, the arrival times of individual calls within the hour are independently and uniformly distributed between 0 and 1. This is the "order statistic property" for a Poisson process and it holds regardless of whether the arrival rate is deterministic or random [see 32, Sections 4.5–4.6]. Our models in this and the next section quantify the distribution of the daily arrival counts $Y_t$ and the hourly counts $Z_h$. One can use the following procedure to simulate call arrival times on day $t$:

1. Simulate the daily count $Y_t$. As we will see in this section, this involves simulating the residual from a standard autoregressive process.

2. Given $Y_t$, generate the vector $Z_t$ of hourly counts on day $t$. As we will see in the next section, this involves simulating a multinomial random vector.

3. Use the order statistic property to distribute the simulated number of arrivals in each hour.
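Steps 2 and 3 of the procedure above can be sketched as follows, taking the daily count from step 1 as given. The hourly weights are hypothetical placeholders (the estimated multinomial probabilities belong to the hourly model and are not reproduced here), and the function name `simulate_arrival_times` is ours. Assigning each call independently to an hour with the given probabilities yields a multinomial vector of hourly counts.

```python
import random

def simulate_arrival_times(daily_count, hourly_weights, rng):
    """Return sorted arrival times, in hours since midnight, for one day."""
    # Step 2: assign each of the daily_count calls to an hour; the resulting
    # hourly counts follow a multinomial distribution given the daily total.
    hours = rng.choices(range(24), weights=hourly_weights, k=daily_count)
    # Step 3: order statistic property, i.e. a uniform position within the hour.
    times = [hour + rng.random() for hour in hours]
    return sorted(times)

# Hypothetical hour-of-day profile: busier between 10 a.m. and 8 p.m.
hourly_weights = [1.5 if 10 <= hour < 20 else 0.6 for hour in range(24)]

rng = random.Random(1)                      # fixed seed for reproducibility
arrivals = simulate_arrival_times(174, hourly_weights, rng)
print(len(arrivals), round(arrivals[0], 3), round(arrivals[-1], 3))
```

In a full simulation the argument `174` (the average daily volume reported in Section 2) would be replaced by a simulated value of $Y_t$ from one of the daily models.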

If the arrival rate varies too rapidly to be approx-

imated as constant over hour-long periods, then it is

straightforward to modify our models to use shorter

periods, for example half-hours. Thus, if one limits at-

tention to this general and plausible NHPP model, then

each of our models of arrival counts by period yields

corresponding stochastic models of all the arrival times,

which can support analytical and simulation studies.

3.1 Model 1: fixed-effect model with independent

residuals

One would expect to see month-of-year and day-of-

week effects in EMS demand in most cities. Our preliminary

analysis of the Calgary data indicates a positive

trend and confirms the presence of month-of-year and

day-of-week effects. This suggests the following linear

model as a first approximation:

$$Y_{j,k,l} = a + \tilde{\beta}_j + \tilde{\gamma}_k + \tilde{\alpha}_l + \tilde{\epsilon}_{j,k,l}, \tag{1}$$

where $Y_{j,k,l}$ is the number of calls on a day of type $j$ in month $k$ of year $l$, the parameters $a$, $\tilde{\beta}_j$, $\tilde{\gamma}_k$, and $\tilde{\alpha}_l$ are real-valued constants, and the residuals $\tilde{\epsilon}_{j,k,l}$ are independent and identically distributed (i.i.d.) normal random variables with mean 0. The preliminary analysis suggests that for Calgary, the yearly effect is approximately a linear function of $l$, which allows us to express the model more conveniently as

$$Y_t = a + bt + \sum_{j=1}^{7} \beta_j C_{t,j} + \sum_{k=1}^{12} \gamma_k S_{t,k} + E_t, \tag{2}$$

where $a$, $b$, the $\beta_j$, and the $\gamma_k$ are constants, the indicator $C_{t,j}$ is 1 if observation $t$ is on the $j$th day of the week and 0 otherwise, and the indicator $S_{t,k}$ is 1 if observation $t$ is in the $k$th month of the year and 0 otherwise. In other cities, it might be more appropriate to model the yearly effect as a nonlinear function of $t$. We assume that the residuals $E_t$ are i.i.d. normal with mean 0 and variance $\sigma^2_{E,0}$, i.e., a Gaussian white noise process. Given the presence of the constant parameter $a$, we impose the standard identifiability constraints:

$$\sum_{j=1}^{7} \beta_j = \sum_{k=1}^{12} \gamma_k = 0. \tag{3}$$

Fig. 6 The residuals $E_t$ for the simple linear model of Eq. 2 (x-axis: time in days, 1 to 1,096; y-axis: standardized residuals; circled points mark some suspect outliers).

(Without these constraints, there would be redundant parameters; for example, adding a constant $\kappa$ to all the $\beta_j$'s and subtracting $\kappa$ from $a$ would give the same model.) We estimated the parameters for the regression model (2) using least squares and obtained the residuals displayed in Fig. 6, in which the circled points are at a distance larger than $3\hat{\sigma}_{E,0}$ from zero, where $\hat{\sigma}^2_{E,0}$ is the empirical variance of the residuals. There is a single residual larger than $4\hat{\sigma}_{E,0}$, which corresponds to January 1, 2002, and seven other residuals larger than $3\hat{\sigma}_{E,0}$: December 1, 2000; January 1, 2001; May 27, 2001; August 2, 2001; September 8, 2001; June 27, 2002; July 12, 2002. The single residual smaller than $-3\hat{\sigma}_{E,0}$ is on July 30, 2001. January 1 appears to be

a special day, with a call volume systematically larger

than average. The month of July also has a larger vol-

ume per day than the other months (in the data). One

potential explanation that we decided to consider is the

Calgary Stampede, held every year in July. The Stam-

pede includes one of the largest rodeos in the world

and it is the most important annual festival in Calgary

(http://calgarystampede.com). The dates for this event

are: July 7–16, 2000; July 6–15, 2001; July 5–14, 2002;

and July 4–13, 2003. To account for those two types of

special days, we add two indicator variables $H_{t,1}$ and $H_{t,2}$ to our model, where $H_{t,1}$ is 1 if observation $t$ is on January 1 and 0 otherwise, whereas $H_{t,2}$ is 1 if observation $t$ is on one of the 40 Stampede days enumerated above, and 0 otherwise. This gives the model

$$Y_t = a + bt + \sum_{j=1}^{7} \beta_j C_{t,j} + \sum_{k=1}^{12} \gamma_k S_{t,k} + \omega_1 H_{t,1} + \omega_2 H_{t,2} + E_t, \tag{4}$$

in which we now have two additional real-valued parameters $\omega_1$ and $\omega_2$, and the residuals now have variance $\sigma^2_E$. The timing, nature, and number of such special events will vary between cities but the same general approach can be used if the dates of the special events are known. We estimate all the parameters of this linear regression model by standard least-squares, using the first $n = 1{,}096$ observations. If we denote the parameter estimates by $\hat{a}$, $\hat{b}$, $\hat{\beta}_j$, $\hat{\gamma}_k$, $\hat{\omega}_1$ and $\hat{\omega}_2$, then the estimates of $Y_t$ and $E_t$ are given by

$$\hat{Y}_t = \hat{a} + \hat{b}t + \sum_{j=1}^{7} \hat{\beta}_j C_{t,j} + \sum_{k=1}^{12} \hat{\gamma}_k S_{t,k} + \hat{\omega}_1 H_{t,1} + \hat{\omega}_2 H_{t,2} \tag{5}$$

and

$$\hat{E}_t = Y_t - \hat{Y}_t. \tag{6}$$

A naive estimator of $\sigma^2_E$ would be the empirical variance

$$\hat{\sigma}^2_E = \frac{1}{n - s} \sum_{t=1}^{n} \hat{E}_t^2, \tag{7}$$

where s = 21 is the number of parameters estimated in

the model. However, this variance estimator is biased

if the residuals are correlated [5], and we will see in a

moment that they are.
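As a rough illustration of the least-squares fit and of the variance estimator in Eq. 7, the sketch below fits the trend, day-of-week, and month-of-year effects under the sum-to-zero constraints of Eq. 3 by coding the last level of each factor as minus the sum of the others. Everything here is an assumption made for illustration: the data are synthetic (not the Calgary series), the special-day indicators are omitted (so $s = 19$ rather than 21), and the solver is a plain normal-equations routine rather than whatever the authors ran in R or SAS.

```python
import random
from datetime import date, timedelta

def design_row(t, day):
    """Regressors for a + b*t + day-of-week + month effects (sum-to-zero coding)."""
    row = [1.0, float(t)]
    dow = day.weekday()  # 0 = Monday, ..., 6 = Sunday
    # beta_7 = -(beta_1 + ... + beta_6): Sunday rows carry -1 in all six columns.
    row += [-1.0 if dow == 6 else float(dow == j) for j in range(6)]
    # gamma_12 = -(gamma_1 + ... + gamma_11): December rows carry -1.
    row += [-1.0 if day.month == 12 else float(day.month == k) for k in range(1, 12)]
    return row

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def fit(rows, y):
    """Least squares via the normal equations (X'X) c = X'y."""
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    return solve(xtx, xty)

# Synthetic data only: two years of daily counts with known effects.
true_beta = [-4.0, -5.0, -5.0, -1.0, 8.0, 8.0, -1.0]            # sums to zero
true_gamma = [-6.0, -4.0, -2.0, -4.0, 0.0, 4.0, 13.0, -1.0, 0.0, -4.0, 0.0, 4.0]
rng = random.Random(0)
days = [date(2000, 1, 1) + timedelta(days=i) for i in range(731)]
rows = [design_row(t, d) for t, d in enumerate(days, start=1)]
y = [150.0 + 0.03 * t + true_beta[d.weekday()] + true_gamma[d.month - 1]
     + rng.gauss(0.0, 15.0) for t, d in enumerate(days, start=1)]

coef = fit(rows, y)                  # [a, b, beta_1..beta_6, gamma_1..gamma_11]
fitted = [sum(c * x for c, x in zip(coef, r)) for r in rows]
s = len(coef)                        # 19 free parameters in this sketch
sigma2_hat = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted)) / (len(y) - s)  # Eq. 7
beta_hat = coef[2:8] + [-sum(coef[2:8])]   # recover beta_7 from the constraint
print(round(coef[1], 4), [round(v, 1) for v in beta_hat], round(sigma2_hat, 1))
```

The synthetic residuals here are independent by construction, so the estimator of Eq. 7 behaves well; with autocorrelated residuals, as in the Calgary data, it is biased, as noted above.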

We must test the hypothesis that the residuals are

a white-noise process, i.e., normally distributed and

uncorrelated with zero mean and constant variance.

Stationarity and normality of the residuals are plausible,

based on Fig. 6 and on Q–Q (quantile–quantile) plots

not shown here. To test for autocorrelation, we use the

Ljung-Box test statistic, defined by

$$Q = n(n + 2) \sum_{i=1}^{l} \frac{\hat{r}_i^2}{n - i},$$

where $n$ is the number of residuals, $\hat{r}_i$ is the lag-$i$ sample autocorrelation in the sequence of residuals, and $l$ is the maximum lag up to which we want to test the autocorrelations. Under the null hypothesis that the residuals are uncorrelated and $n \gg s$, $Q$ has approximately a chi-square distribution with $l$ degrees of freedom. Here we

have n = 1,096 and s = 21. We apply the test with l =

30 and obtain Q = 154.8. The corresponding p-value

is smaller than $2.2 \times 10^{-16}$, so the null hypothesis is

clearly rejected. This strong evidence of the presence

of correlation between the residuals motivates our next

model.
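The Ljung-Box statistic defined above is simple to compute directly. The sketch below is a plain-Python illustration on synthetic series (white noise and an AR(1) process), not on the Calgary residuals. The function name `ljung_box` is ours, and the rejection threshold is the standard 95% quantile of the chi-square distribution with 30 degrees of freedom, about 43.77.

```python
import random

def ljung_box(residuals, max_lag):
    """Ljung-Box statistic Q = n(n+2) * sum_{i=1}^{l} r_i^2 / (n - i)."""
    n = len(residuals)
    mean = sum(residuals) / n
    centered = [e - mean for e in residuals]
    denom = sum(c * c for c in centered)
    total = 0.0
    for i in range(1, max_lag + 1):
        # Lag-i sample autocorrelation of the residual sequence.
        r_i = sum(centered[t] * centered[t - i] for t in range(i, n)) / denom
        total += r_i * r_i / (n - i)
    return n * (n + 2) * total

rng = random.Random(42)
white = [rng.gauss(0.0, 1.0) for _ in range(1000)]       # uncorrelated series
ar1 = [0.0]
for _ in range(999):                                     # strongly autocorrelated series
    ar1.append(0.5 * ar1[-1] + rng.gauss(0.0, 1.0))

CHI2_30_95 = 43.77  # 95% quantile of chi-square with 30 degrees of freedom
q_white = ljung_box(white, 30)
q_ar1 = ljung_box(ar1, 30)
print(round(q_white, 1), round(q_ar1, 1), q_ar1 > CHI2_30_95)
```

For comparison, the value $Q = 154.8$ reported for the Model 1 residuals is far beyond this threshold, which is consistent with the very small p-value quoted in the text.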

3.2 Model 2: an autoregressive process for the errors of

Model 1

We improve Model 1 by fitting a time-series model to

the residuals Et. Since the Et process appears to be

normal and stationary, it suffices to capture the auto-

correlation structure. We do this with an autoregressive

process of order p (an AR(p) process), defined by

$$E_t = \phi_1 E_{t-1} + \cdots + \phi_p E_{t-p} + a_t, \tag{8}$$

where the $a_t$ are i.i.d. normal with mean zero and variance $\sigma^2_a$. Based on the residuals defined by Eq. 6, and using standard tools of model identification [7, 38], we find that $p = 3$ is adequate (different values of $p$ will be appropriate for different cities). When estimating the coefficients $\phi_l$ in a model with $p > 3$, we find that the coefficients $\phi_l$ for $l > 3$ are non-significant at the 5% level. For example, the p-value of the t-test for $\phi_4$ is about 0.153.

The model obtained by combining Eqs. 4 and 8 with $p = 3$ can be written alternatively as

$$\phi(B) \left[ Y_t - a - bt - \sum_{j=1}^{7} \beta_j C_{t,j} - \sum_{k=1}^{12} \gamma_k S_{t,k} - \omega_1 H_{t,1} - \omega_2 H_{t,2} \right] = a_t, \tag{9}$$

where $\phi(B) = 1 - \phi_1 B - \phi_2 B^2 - \phi_3 B^3$, $B$ is the backshift operator defined by $B^p E_t = E_{t-p}$, and $\phi_1, \phi_2, \phi_3$ are the autoregressive parameters. We estimate the parameters $(a, b, \beta_1, \ldots, \beta_7, \gamma_1, \ldots, \gamma_{12}, \omega_1, \omega_2, \phi_1, \phi_2, \phi_3)$ by (nonlinear) least squares [1, page 67], based on the observations $Y_t$ for $t = 4, \ldots, n$, where $n = 1{,}096$.
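To make the AR(3) recursion of Eq. 8 concrete, the sketch below iterates it forward in two ways: with future shocks set to their mean of zero (a point forecast of the residual) and with Gaussian shocks (one simulated path). The coefficients and the shock variance are the estimates reported in Table 1; the three starting residuals are hypothetical, and the function names are ours.

```python
import math
import random

PHI = [0.192, 0.108, 0.083]      # phi_1, phi_2, phi_3 as reported in Table 1
SIGMA_A = math.sqrt(250.1)       # square root of the estimated sigma_a^2

def forecast_residuals(history, phi, horizon):
    """Iterate E_t = phi_1 E_{t-1} + ... + phi_p E_{t-p} with future shocks at zero."""
    values = list(history)
    for _ in range(horizon):
        values.append(sum(phi[i] * values[-1 - i] for i in range(len(phi))))
    return values[len(history):]

def simulate_residuals(history, phi, sigma, steps, rng):
    """Same recursion, adding an i.i.d. normal shock a_t at every step."""
    values = list(history)
    for _ in range(steps):
        mean = sum(phi[i] * values[-1 - i] for i in range(len(phi)))
        values.append(mean + rng.gauss(0.0, sigma))
    return values[len(history):]

# Hypothetical last three residuals, oldest first: E_{t-2}, E_{t-1}, E_t.
recent = [5.0, -3.0, 20.0]
point = forecast_residuals(recent, PHI, 7)
path = simulate_residuals(recent, PHI, SIGMA_A, 7, random.Random(7))
print([round(v, 3) for v in point])
print([round(v, 1) for v in path])
```

Because the coefficients are small and positive, the point forecast of the residual decays toward zero within a few days, so most of a multi-day forecast of $Y_t$ comes from the deterministic trend, seasonal, and special-day terms.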

Table 1 Parameter estimates for Model 2

| Parameter | Effect | Estimate | St. error | p-val. of t-test |
|---|---|---|---|---|
| $a$ | Intercept | 149.3 | 3.3 | < 0.001 |
| $b$ | Trend/month | 0.031 | 0.003 | < 0.001 |
| $\omega_1$ | Jan. 1 | 60.5 | 11.0 | < 0.001 |
| $\omega_2$ | Stampede | 2.7 | 4.5 | 0.544 |
| $\beta_1$ | Mon. | −4.2 | 2.5 | 0.095 |
| $\beta_2$ | Tue. | −5.0 | 2.5 | 0.046 |
| $\beta_3$ | Wed. | −5.3 | 2.5 | 0.034 |
| $\beta_4$ | Thu. | −0.9 | 2.5 | 0.719 |
| $\beta_5$ | Fri. | 8.5 | 2.5 | 0.001 |
| $\beta_6$ | Sat. | 7.6 | 2.5 | 0.002 |
| $\beta_7$ | Sun. | −0.8 | 2.5 | 0.747 |
| $\gamma_1$ | Jan. | −5.6 | 2.9 | 0.048 |
| $\gamma_2$ | Feb. | −4.1 | 2.8 | 0.152 |
| $\gamma_3$ | Mar. | −2.5 | 2.8 | 0.358 |
| $\gamma_4$ | Apr. | −4.0 | 2.8 | 0.149 |
| $\gamma_5$ | May | 0.5 | 2.7 | 0.849 |
| $\gamma_6$ | Jun. | 4.1 | 2.8 | 0.138 |
| $\gamma_7$ | Jul. | 13.2 | 2.9 | < 0.001 |
| $\gamma_8$ | Aug. | −1.3 | 2.8 | 0.627 |
| $\gamma_9$ | Sep. | −0.3 | 2.8 | 0.928 |
| $\gamma_{10}$ | Oct. | −4.3 | 2.8 | 0.121 |
| $\gamma_{11}$ | Nov. | 0.3 | 2.8 | 0.916 |
| $\gamma_{12}$ | Dec. | 4.1 | 2.8 | 0.150 |
| $\phi_1$ | | 0.192 | 0.030 | < 0.001 |
| $\phi_2$ | | 0.108 | 0.031 | < 0.001 |
| $\phi_3$ | | 0.083 | 0.030 | 0.006 |
| $\sigma^2_a$ | | 250.1 | – | |

The parameter estimates are given in Table 1, together with their standard errors and the p-value of a t-test of the null hypothesis that the given parameter is zero, for each parameter. We then compute the residuals $\hat{a}_t = \hat{\phi}(B)(Y_t - \hat{Y}_t)$ in a similar manner as for Model 1 and we estimate $\sigma_a^2$ by

$$\hat{\sigma}_a^2 = \frac{1}{n - s} \sum_{t=4}^{n} \hat{a}_t^2, \tag{10}$$

where $n = 1{,}096$ and $s = 24$. This gives $\hat{\sigma}_a^2 = 250.1$. Figure 7 presents visual diagnostics for residual normality: we see the estimated residual density and a normal Q–Q plot, i.e., the empirical quantiles of normalized residuals plotted versus the corresponding quantiles of the standard normal distribution (with mean 0 and variance 1). Figure 8 is a diagnostic for (lack of) residual autocorrelation: it shows the standardized residuals, the sample autocorrelations up to lag 30, and the p-values of the Ljung-Box test for each lag. We conclude that the residuals $a_t$ appear to be white noise. Thus, Model 2 is a much better fit than Model 1.

The most significant parameters in Table 1 are $a$ (the mean), $b$ (the positive trend), $\omega_1$ (the positive January 1 effect), $\phi_1$, $\phi_2$, and $\phi_3$ (the positive AR parameters), $\gamma_7$ (the positive July effect), and $\beta_5$ and $\beta_6$ (the positive Friday and Saturday effects). Other parameters significant at the 10% level are $\beta_1$ to $\beta_3$ (the negative effects of Monday–Wednesday) and $\gamma_1$ (the negative effect of January). (Since April has the lowest average in Fig. 5, one may find it surprising that January has a significant negative coefficient and not April. But the average for January becomes smaller after removing the January 1 effect. Also, this estimation uses only the first 1,096 days of data, whereas Fig. 5 combines all 1,537 days.) This gives a total of 13 significant parameters. We could eliminate the other ones; we will do that in Section 3.4. Observe that $\omega_2$ (the Stampede effect) is not significant; most of the increased volume during the Stampede days is captured by the July effect. In fact, the average volume per day is about 186 during the Stampede days compared with 180 during the other days of July and 174 on average during the year.

We can also use this model to estimate the variance of the residuals $E_t$ in Model 1. Their sample variance (7) underestimates $\sigma_E^2 = \mathrm{Var}[E_t]$ because they are positively correlated. If we multiply both sides of Eq. 8 by $E_t$ and take the expectation, we get

$$\sigma_E^2 = E[E_t^2] = \phi_1\gamma_1 + \phi_2\gamma_2 + \phi_3\gamma_3 + \sigma_a^2 = (\phi_1\rho_1 + \phi_2\rho_2 + \phi_3\rho_3)\,\sigma_E^2 + \sigma_a^2,$$

where $\gamma_i = \mathrm{Cov}(E_t, E_{t-i})$ and $\rho_i = \mathrm{Corr}(E_t, E_{t-i})$ for each $i$. Replacing all quantities in this last expression by their estimates and resolving for $\sigma_E^2$, we obtain $\hat{\sigma}_E^2 = 291.8$ as an estimate of $\sigma_E^2$. By comparing with the estimate $\hat{\sigma}_a^2 = 250.1$, we see that Model 1 has about 17% more variance than Model 2.
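The step of resolving for $\sigma_E^2$ can be written out explicitly. The rearrangement below follows directly from the identity above; the numerical value of the weighted autocorrelation sum is backed out here from the two reported variances (it is not quoted in the text), so it should be read only as a consistency check.

```latex
\sigma_E^2 = \frac{\sigma_a^2}{1 - \phi_1\rho_1 - \phi_2\rho_2 - \phi_3\rho_3},
\qquad
\hat\phi_1\hat\rho_1 + \hat\phi_2\hat\rho_2 + \hat\phi_3\hat\rho_3
  = 1 - \frac{\hat\sigma_a^2}{\hat\sigma_E^2}
  = 1 - \frac{250.1}{291.8} \approx 0.143,
\qquad
\frac{\hat\sigma_E^2}{\hat\sigma_a^2} = \frac{291.8}{250.1} \approx 1.17 .
```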


Fig. 7 Diagnostic for normality of residuals for Model 2 (panels: kernel estimate of the residual density; normal Q–Q plot of sample quantiles against theoretical quantiles)

3.3 Model 3: adding cross effects

We now extend Model 2 by adding second-order terms to capture the interaction between the day-of-week and month-of-year factors. We simply add the term

$$\sum_{j=1}^{7} \sum_{k=1}^{12} \delta_{j,k} M_{t,j,k} \tag{11}$$

to the right side of Eq. 4 and subtract the same term inside the brackets in Eq. 9, where the indicator variable $M_{t,j,k}$ is 1 if observation $t$ is on the $j$th day of the week and $k$th month of the year, and 0 otherwise. This introduces the additional model parameters $\delta_{j,k}$, which must satisfy the identifiability constraints $\sum_{k=1}^{12} \delta_{j,k} = 0$ for each $j$ and $\sum_{j=1}^{7} \delta_{j,k} = 0$ for each $k$.

Fig. 8 Diagnostic for (lack of) residual autocorrelation for Model 2 (panels: standardized residuals over time; ACF of residuals for lags 0 to 30; p-values of the Ljung–Box statistic by lag)

We found that the estimates for the parameters $\beta_j$, $\gamma_k$, and $\omega_i$ in this model were almost the same as in Model 2. Table 2 gives the estimated values of the parameters that differ from those of Model 2, together with the p-value of a t-test that the given parameter is zero. The estimated variance of the residuals has been reduced to $\sigma_a^2 = 241.5$, about 4% less than for Model 2. The diagnostics for the residuals are in Fig. 9. The Ljung-Box test does not detect correlation in the residuals (we have $n = 1{,}096$, get $Q = 5.394$, and the p-value of the test is 0.944) (Fig. 10).

Table 2 Parameter estimates for Model 3. Each cell gives the estimate of $\delta_{j,k}$ for day $j$ (column) and month $k$ (row), followed in parentheses by its standard error and the p-value of the t-test.

| Month ($k$) | Mon. ($j=1$) | Tue. ($j=2$) | Wed. ($j=3$) | Thu. ($j=4$) | Fri. ($j=5$) | Sat. ($j=6$) | Sun. ($j=7$) |
|---|---|---|---|---|---|---|---|
| Jan. (1) | −0.3 (3.9, 0.942) | 5.0 (4.0, 0.210) | 4.6 (4.0, 0.256) | −6.9 (4.2, 0.102) | 4.0 (4.2, 0.339) | −6.3 (4.2, 0.137) | −0.2 (4.2, 0.965) |
| Feb. (2) | −5.4 (4.2, 0.193) | 5.6 (4.1, 0.173) | 3.7 (4.2, 0.382) | −3.6 (4.2, 0.386) | 1.0 (4.2, 0.805) | 2.9 (4.2, 0.486) | −4.2 (4.2, 0.321) |
| Mar. (3) | 8.1 (4.1, 0.052) | 6.2 (4.2, 0.137) | −3.7 (4.1, 0.364) | 6.2 (3.9, 0.113) | 0.5 (3.9, 0.891) | −12.3 (4.2, 0.003) | −4.9 (4.2, 0.240) |
| Apr. (4) | 1.7 (4.2, 0.686) | −3.7 (4.2, 0.369) | 0.4 (4.2, 0.927) | 4.2 (4.2, 0.314) | −3.3 (4.2, 0.434) | 4.1 (3.9, 0.287) | −3.4 (3.9, 0.393) |
| May (5) | 5.6 (3.8, 0.143) | −0.8 (3.9, 0.840) | −0.5 (3.9, 0.908) | 0.1 (4.2, 0.986) | −6.6 (4.2, 0.113) | −1.8 (4.2, 0.663) | 4.0 (4.2, 0.345) |
| Jun. (6) | −2.6 (4.1, 0.525) | −4.5 (4.1, 0.276) | −1.3 (4.2, 0.747) | 11.5 (3.8, 0.003) | −6.7 (3.9, 0.083) | 3.0 (4.2, 0.468) | 0.7 (4.2, 0.875) |
| Jul. (7) | −3.9 (3.9, 0.324) | −4.6 (4.2, 0.269) | 1.5 (4.2, 0.728) | 0.2 (4.2, 0.954) | 6.6 (4.2, 0.112) | 1.8 (3.8, 0.649) | −1.6 (3.9, 0.685) |
| Aug. (8) | 6.1 (4.2, 0.142) | 4.7 (3.9, 0.227) | −1.4 (3.9, 0.711) | −0.8 (3.9, 0.833) | −3.1 (4.2, 0.452) | 0.1 (4.2, 0.986) | −5.4 (4.2, 0.195) |
| Sep. (9) | −1.6 (7.1, 0.819) | −5.3 (6.8, 0.441) | −4.5 (6.7, 0.502) | −2.3 (6.7, 0.732) | 2.9 (5.4, 0.594) | 4.9 (5.8, 0.399) | 5.9 (7.9, 0.457) |
| Oct. (10) | 0.9 (4.0, 0.822) | −1.5 (4.0, 0.702) | −3.7 (4.2, 0.371) | −6.8 (4.2, 0.105) | 3.2 (4.2, 0.436) | 0.8 (4.2, 0.839) | 7.1 (3.9, 0.073) |
| Nov. (11) | −1.2 (4.3, 0.775) | 0.1 (3.9, 0.982) | 7.8 (3.9, 0.044) | −2.2 (3.9, 0.578) | 4.5 (4.1, 0.280) | −5.4 (4.1, 0.187) | −3.5 (4.6, 0.444) |
| Dec. (12) | −7.3 (4.3, 0.092) | −1.1 (4.3, 0.797) | −2.7 (4.2, 0.518) | 0.4 (4.2, 0.924) | −3.1 (3.9, 0.417) | 8.2 (3.9, 0.036) | 5.7 (4.0, 0.154) |

| Parameter | Est. | St. error | p-val. of t-test |
|---|---|---|---|
| $\phi_1$ | 0.213 | 0.030 | < 0.001 |
| $\phi_2$ | 0.126 | 0.031 | < 0.001 |
| $\phi_3$ | 0.085 | 0.031 | 0.006 |
| $\sigma_a^2$ | 241.5 | – | – |

Fig. 9 Diagnostic for the normality of residuals, Model 3 (panels: kernel estimate of the residual density; normal Q–Q plot of sample quantiles against theoretical quantiles)

The slightly better fit of this model compared with Model 2 is obtained at the expense of a much larger number of parameters, and several of these parameters do not appear to be significant. The next step is to remove them.
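The identifiability constraints on the interaction terms can be illustrated with a short sketch. It double-centers an arbitrary 7 × 12 array so that every row and every column sums to zero, and counts the free interaction parameters that remain, $(7-1)(12-1) = 66$; this count is consistent with Table 5, where $s$ rises from 24 for Model 2 to 90 for Model 3. The function names and the random input are illustrative assumptions, and double-centering is only one convenient way to impose the constraints, not necessarily the parameterization used by the authors.

```python
import numpy as np

def center_interactions(raw):
    """Double-center a (days x months) array so rows and columns sum to zero."""
    raw = np.asarray(raw, dtype=float)
    row_means = raw.mean(axis=1, keepdims=True)   # one mean per day of week
    col_means = raw.mean(axis=0, keepdims=True)   # one mean per month
    return raw - row_means - col_means + raw.mean()

def free_interaction_parameters(n_days=7, n_months=12):
    """Number of delta_{j,k} left free after the sum-to-zero constraints."""
    return (n_days - 1) * (n_months - 1)

# Hypothetical unconstrained values, used only to demonstrate the centering.
rng = np.random.default_rng(0)
delta = center_interactions(rng.normal(size=(7, 12)))
```

After centering, `delta.sum(axis=1)` and `delta.sum(axis=0)` are zero up to rounding, which is exactly the pair of constraints stated for $\delta_{j,k}$.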

3.4 Model 4: considering only the significant parameters

This model is a stripped-down version of Model 3, in which we keep only the parameters that are significant at the 10% level (i.e., for which the p-value of the t-test in Table 1 or 2 is less than 0.10). In Table 2, eight parameters $\delta_{j,k}$ and three parameters $\phi_i$ are significant at the 10% level. There are ten other significant parameters in Table 1, for a total of 20. With the identifiability constraints, there remain $s = 15$ independent parameters out of those 20. The same strategy of including only the significant parameters from Model 3 could be used in other cities, but the set of significant parameters will vary between cities, of course.

Fig. 10 Diagnostic for the correlation between residuals, Model 3 (panels: standardized residuals over time; ACF of residuals for lags 0 to 30; p-values of the Ljung–Box statistic by lag)

We reestimate the model with those parameters only (all other parameters are set at zero) and obtain the values given in Table 3. All these parameters are significant. The most significant interaction parameters $\delta_{j,k}$ are for Saturday in March (negative interaction) and Thursday in June (positive interaction). Other mildly significant interactions, at the 10% level, are (by order of significance) Saturday in December, Wednesday in November, Monday in March, Sunday in October, Friday in June, and Monday in December.

Table 3 Parameter estimates for Model 4

| Parameter | | Estimate | St. error | p-val. of t-test |
|---|---|---|---|---|
| $a$ | Intercept | 149.6 | 1.9 | < 0.001 |
| $b$ | Trend/month | 0.032 | 0.003 | < 0.001 |
| $\omega_1$ | Jan. 1 | 57.8 | 10.7 | < 0.001 |
| $\beta_1$ | Mon. | −4.3 | 1.5 | 0.004 |
| $\beta_2$ | Tue. | −5.3 | 1.4 | < 0.001 |
| $\beta_3$ | Wed. | −6.2 | 1.4 | < 0.001 |
| $\beta_5$ | Fri. | 7.8 | 1.4 | < 0.001 |
| $\beta_6$ | Sat. | 7.9 | 1.5 | < 0.001 |
| $\gamma_1$ | Jan. | −5.7 | 2.9 | 0.050 |
| $\gamma_7$ | Jul. | 12.4 | 2.7 | < 0.001 |
| $\delta_{1,3}$ | Mon., Mar. | 7.9 | 4.2 | 0.060 |
| $\delta_{6,3}$ | Sat., Mar. | −11.1 | 4.3 | 0.010 |
| $\delta_{4,6}$ | Thu., Jun. | 12.0 | 3.9 | 0.002 |
| $\delta_{5,6}$ | Fri., Jun. | −7.1 | 3.9 | 0.069 |
| $\delta_{7,10}$ | Sun., Oct. | 8.3 | 3.8 | 0.029 |
| $\delta_{3,11}$ | Wed., Nov. | 8.5 | 4.0 | 0.034 |
| $\delta_{1,12}$ | Mon., Dec. | −7.8 | 4.5 | 0.083 |
| $\delta_{6,12}$ | Sat., Dec. | 8.2 | 4.1 | 0.046 |
| $\phi_1$ | | 0.212 | 0.030 | < 0.001 |
| $\phi_2$ | | 0.132 | 0.031 | < 0.001 |
| $\phi_3$ | | 0.094 | 0.031 | 0.002 |
| $\sigma_a^2$ | | 241.6 | – | – |

Fig. 11 Diagnostic for the normality of residuals, Model 4 (panels: kernel estimate of the residual density; normal Q–Q plot of sample quantiles against theoretical quantiles)

Fig. 12 Diagnostic for the correlation between residuals, Model 4 (panels: standardized residuals over time; ACF of residuals for lags 0 to 30; p-values of the Ljung–Box statistic by lag)

The estimated variance of the residuals is $\sigma_a^2 = 241.9$. The diagnostics for the residuals are in Figs. 11 and 12. The Ljung-Box test does not detect correlation in the residuals (we have $n = 1{,}096$, get $Q = 11.581$, and the p-value of the test is 0.48).
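For readers who want to reproduce this kind of check, the sketch below computes the Ljung-Box statistic in its usual textbook form, $Q = n(n+2)\sum_{k=1}^{h} \hat{\rho}_k^2/(n-k)$, from a residual series. The function name `ljung_box_q` and the toy input are assumptions made for illustration; the number of lags $h$ and the degrees of freedom behind the reported p-values are not stated in this section, so the sketch stops at $Q$ and leaves the chi-square comparison to the reader.

```python
import numpy as np

def ljung_box_q(residuals, max_lag):
    """Ljung-Box Q statistic using sample autocorrelations at lags 1..max_lag."""
    x = np.asarray(residuals, dtype=float)
    n = len(x)
    x = x - x.mean()
    denom = np.sum(x ** 2)
    total = 0.0
    for k in range(1, max_lag + 1):
        rho_k = np.sum(x[k:] * x[:-k]) / denom   # lag-k sample autocorrelation
        total += rho_k ** 2 / (n - k)
    return n * (n + 2) * total

# Toy series with strong negative lag-1 autocorrelation (not the EMS residuals).
q_toy = ljung_box_q([1, -1, 1, -1, 1, -1, 1, -1], max_lag=1)
```

A large $Q$ relative to the reference chi-square distribution signals leftover autocorrelation; the values reported for Models 3 and 4 are small enough that the test does not reject white noise.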

3.5 Model 5: a doubly-seasonal ARIMA process

We now consider a different model: an ARIMA model with two seasonal cycles. We decompose our time series as

$$Y_t = N_t + \omega_1 H_{t,1} + \omega_2 H_{t,2}, \tag{12}$$

where $\{N_t\}$ is modeled as a doubly-seasonal ARIMA process and the other components capture the special days (January 1 and Stampede days). Given the seasonality patterns across the weekly and yearly cycle implied by the analysis in Section 2, we propose an ARIMA model with two seasonal cycles: a weekly cycle, with period $s_1 = 7$, and an approximate annual cycle, with period $s_2 = 365$. This choice of periodicities means that the conditional mean of $N_t$ is regressed on $N_{t-365}$; for example, January 1, 2004 is regressed on January 1, 2003. In other words, after eliminating February 29 (which we did), this regression "aligns" the same dates across years.

The general form of a doubly-seasonal ARIMA model with periods $s_1$ and $s_2$ is [7, 8, 38]:

$$\phi(B)\,\Phi_{s_1}(B^{s_1})\,\Phi_{s_2}(B^{s_2})\,\nabla^{d}\,\nabla_{s_1}^{d_1}\,\nabla_{s_2}^{d_2} N_t = \theta(B)\,\Theta_{s_1}(B^{s_1})\,\Theta_{s_2}(B^{s_2})\,a_t, \tag{13}$$

where $\nabla_s^d = (1 - B^s)^d$, $\phi$, $\Phi_{s_1}$, $\Phi_{s_2}$, $\theta$, $\Theta_{s_1}$, and $\Theta_{s_2}$ are polynomial functions of order $p$, $p_1$, $p_2$, $q$, $q_1$, and $q_2$, respectively, and $\{a_t\}$ is a Gaussian white noise process. This model is referred to as an ARIMA$(p,d,q) \times (p_1,d_1,q_1)_{s_1} \times (p_2,d_2,q_2)_{s_2}$ process.

We follow a standard model-building protocol to identify the model (choice of the polynomial orders and exponents $d$, $d_1$, and $d_2$), estimate the parameters ($\omega_1$, $\omega_2$, and the polynomial coefficients), and perform diagnostic checks [7, 38]. ARIMA models with more than one seasonal cycle are difficult to estimate in general, because the multiple seasonalities complicate Eq. 13 with several operators, due to the multiplicative nature of the expressions involved. A concrete selection criterion must be adopted for model selection. Here, we used Akaike's information criterion (AIC), discussed in Section 3.6. We keep the model with minimum AIC, subject to non-rejection of the null hypothesis that model residuals are a white-noise process [7]. Based on this criterion, we identify the following model for $N_t$:

$$(1 - \phi_7 B^7 - \phi_{14} B^{14} - \phi_{28} B^{28})(1 - \phi_{365} B^{365})(1 - B) N_t = (1 - \theta_1 B)\,a_t. \tag{14}$$


Table 4 Parameter estimates, Model 5

| Parameter | | Estimate | St. error | p-val. of t-test |
|---|---|---|---|---|
| $\omega_1$ | Jan. 1 | 43.8 | 12.0 | < 0.001 |
| $\omega_2$ | Stampede | 16.6 | 3.8 | < 0.001 |
| $\phi_7$ | | 0.064 | 0.03 | 0.038 |
| $\phi_{14}$ | | 0.103 | 0.03 | 0.001 |
| $\phi_{28}$ | | 0.082 | 0.03 | 0.007 |
| $\phi_{365}$ | | 0.128 | 0.04 | 0.001 |
| $\theta_1$ | | 0.905 | 0.01 | < 0.0001 |
| $\sigma_a^2$ | | 251.7 | – | – |

The parameters are estimated jointly via least squares based on Eqs. 12 and 14, i.e., we find the parameter values that minimize the sum of squares of the estimated residuals. The estimates are given in Table 4, together with their p-values. Note that Model 5 has considerably fewer parameters than the other models. It is also interesting to observe that for this model, the parameter $\omega_2$ (Stampede days effect) is highly significant, in contrast with Models 2 to 4. The explanation is that there is no “July effect” term in the model.
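To see which past values of $N_t$ enter Eq. 14 once its factors are multiplied out, the following sketch expands the left-hand side as a single polynomial in the backshift operator $B$, using the point estimates from Table 4. The helper `lag_polynomial` is a hypothetical name introduced here for illustration; the expansion itself is ordinary polynomial multiplication.

```python
import numpy as np
from numpy.polynomial import polynomial as P

def lag_polynomial(coeffs_by_lag):
    """Coefficients (lowest power of B first) of 1 - sum(c * B**lag)."""
    poly = np.zeros(max(coeffs_by_lag) + 1)
    poly[0] = 1.0
    for lag, c in coeffs_by_lag.items():
        poly[lag] = -c
    return poly

# Point estimates copied from Table 4.
weekly = lag_polynomial({7: 0.064, 14: 0.103, 28: 0.082})
annual = lag_polynomial({365: 0.128})
difference = lag_polynomial({1: 1.0})      # the factor (1 - B)

# Left-hand side of Eq. 14 as one polynomial in B.
ar_side = P.polymul(P.polymul(weekly, annual), difference)
active_lags = np.flatnonzero(ar_side)      # lags of N_t with nonzero weight
print(active_lags.tolist())
```

Because of the multiplicative structure, only four estimated AR coefficients generate sixteen nonzero lag weights, reaching back 394 days. The factor $(1 - B)$ also forces the weights to sum to zero, which reflects the first differencing of $N_t$.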

3.6 Model comparison: Goodness of fit and forecast performance

In this section, we compare the five models in terms of their quality of fit and forecasting performance. The results are in Table 5.

With respect to quality of fit, we report the standard error of model residuals, $\hat{\sigma}_a$, the number $s$ of parameters estimated, and Akaike's information criterion (AIC, see Akaike [2] and Wei [38, page 153]). The AIC has the advantage of taking into account both the mean-square error of the residuals and the number of estimated parameters in the model. It is designed to be an approximately unbiased estimator of the Kullback–Leibler distance (or cross-entropy or relative entropy) between the true model and fitted model. It is defined by

$$\mathrm{AIC}(s) = n \ln(\hat{\sigma}_a^2) + 2s, \tag{15}$$

where $n$ is the number of observations, $s$ is the number of estimated parameters in the model, and $\hat{\sigma}_a^2$ is the maximum likelihood estimator of the variance of residuals, which is approximately the same as the sample variance (10) under the assumption that the residuals are i.i.d. normal [30]. Bias-reduced variants known as the AICC are discussed, e.g., in [8, pages 301–304]. A model with minimal AIC is a good compromise between parsimony and small (empirical) variance of the residuals.

The models of Sections 3.1–3.5 were fitted to the first 1,096 days of data. We then used the estimated models to forecast for the remaining 441 days ($t = 1{,}097,\dots,1{,}537$), at forecast lag ranging from 1 day ahead to 21 days ahead. The lag-$\ell$ forecast error at day $t$ is defined as

$$e_t(\ell) = Y_{t+\ell} - \hat{Y}_t(\ell),$$

where $\hat{Y}_t(\ell)$ is the forecast of $Y_{t+\ell}$ based on the information available on day $t$. Forecasts for doubly-seasonal ARIMA processes obey fairly complicated recursive formulas; see, for example, Brockwell and Davis [8, pages 175–182], but forecasting software facilitates their computation.

Table 5 Comparison of models for daily arrivals

| | Model 1 | Model 2 | Model 3 | Model 4 | Model 5 |
|---|---|---|---|---|---|
| $\hat{\sigma}_a^2$ | 291.8 | 250.1 | 241.5 | 241.6 | 251.7 |
| St. error of fit $\hat{\sigma}_a$ | 17.08 | 15.81 | 15.54 | 15.55 | 15.87 |
| $s$ | 21 | 24 | 90 | 15 | 7 |
| Degrees of freedom | 1075 | 1072 | 1006 | 1081 | 1088 |
| AIC($s$) | | 6099 | 6194 | 6045 | 6068 |
| RMSE(1) | 17.82 | 15.38 | 15.31 | 13.91 | 15.68 |
| MRAE(1) (in %) | 7.58 | 6.14 | 6.14 | 5.72 | 6.91 |

Fig. 13 Forecast RMSE($\ell$) for Models 1–5, for forecast lags $\ell = 1,\dots,21$ (RMSE plotted against forecast horizon, one curve per model)

Commonly used forecast-accuracy metrics are the Root Mean Square Error (RMSE) and the Mean Relative Absolute Error (MRAE) at various forecast lags, defined in our case as

$$\mathrm{RMSE}(\ell) = \sqrt{\frac{1}{442 - \ell} \sum_{t=1097}^{1538-\ell} e_t^2(\ell)}$$

and

$$\mathrm{MRAE}(\ell) = \frac{1}{442 - \ell} \sum_{t=1097}^{1538-\ell} \frac{|e_t(\ell)|}{Y_{t+\ell}}$$

for lag $\ell$. The MRAE standardizes each forecasting error term by the corresponding process value $Y_{t+\ell}$, to reflect the idea that larger numbers usually require less absolute accuracy; it must be used with caution because it may be inflated substantially by a few moderate absolute errors that correspond to very small values $|Y_{t+\ell}|$.
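The two accuracy metrics are straightforward to compute once the realized values $Y_{t+\ell}$ and the forecasts $\hat{Y}_t(\ell)$ for a fixed lag have been aligned. The sketch below assumes such aligned arrays are already available; the function name `forecast_metrics` and the toy numbers are illustrative and do not come from the Calgary data.

```python
import numpy as np

def forecast_metrics(actual, forecast):
    """Return (RMSE, MRAE) for aligned arrays of realized values and forecasts.

    actual[i]   : realized value Y_{t+l} for the i-th forecast origin.
    forecast[i] : corresponding lag-l forecast made at that origin.
    """
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    errors = actual - forecast
    rmse = float(np.sqrt(np.mean(errors ** 2)))
    # Relative error divides by the realized value; unstable if it is near zero.
    mrae = float(np.mean(np.abs(errors) / actual))
    return rmse, mrae

# Toy example: two forecast origins with made-up call volumes.
rmse, mrae = forecast_metrics([100.0, 200.0], [90.0, 220.0])
```

In the toy example both forecasts are off by 10% of the realized value, so the MRAE is 0.10 even though the absolute errors differ, which is the standardization described above.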

Table 5 summarizes the model evaluation. The upper part of the table collects information on the fit with the data used for the estimation (the first 36 months). It recalls the estimated variance of the residuals, $\hat{\sigma}_a^2$, then its square root $\hat{\sigma}_a$ called the standard error of fit, the number $s$ of independent estimated parameters, the number $n - s$ of degrees of freedom, and the AIC($s$) criterion (for Model 1, $\hat{\sigma}_a^2$ is replaced by $\hat{\sigma}_E^2$, defined at the end of Section 3.2). According to the AIC criterion, Model 4 is the winner, followed by Model 5 and then Model 2. It must be underlined, however, that Model 5 was selected by minimizing the AIC over a class of ARIMA models, so the AIC measure is biased to its advantage.
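As a rough check of how Eq. 15 trades off fit against parsimony, the sketch below evaluates $n \ln(\hat{\sigma}_a^2) + 2s$ from the rounded variances and parameter counts in Table 5, taking $n = 1{,}096$ for every model (an assumption; the effective sample size used for each model is not restated here). Values recomputed this way need not match the tabulated AIC figures exactly, but the ranking of Models 2 to 5 comes out the same.

```python
import math

def aic(n, sigma2_hat, s):
    """AIC as defined in Eq. 15: n * ln(sigma2_hat) + 2 * s."""
    return n * math.log(sigma2_hat) + 2 * s

n_obs = 1096  # assumed common sample size for all models

# (estimated residual variance, number of parameters s) from Table 5.
models = {
    "Model 2": (250.1, 24),
    "Model 3": (241.5, 90),
    "Model 4": (241.6, 15),
    "Model 5": (251.7, 7),
}
aic_values = {name: aic(n_obs, v, s) for name, (v, s) in models.items()}
ranking = sorted(aic_values, key=aic_values.get)
print(ranking)
```

Model 3 has nearly the same residual variance as Model 4 but pays a penalty of $2 \times 90$ instead of $2 \times 15$, which is why it ranks last despite its slightly better fit.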

The second part of the table gives the RMSE and MRAE for the forecasts of lag 1. The RMSEs for lags 1 to 21 are displayed in Fig. 13. For small lags, Model 4 is clearly the best model in terms of forecast accuracy, followed by Models 2 and 3. For lags $\ell \geq 13$ (approximately), RMSE($\ell$) is about the same for all models except Model 5, whose forecasts are much noisier. Encouragingly, the AIC measure at the estimation stage has successfully identified the best model.

In interpreting the standard error of fit and the RMSE, it is helpful to recall from our preliminary data analysis that the average number of calls per day was about 174. If calls were generated by a stationary Poisson process with a rate of 174/day, then the standard deviation of the number of calls per day would be $\sqrt{174} = 13.2$. The Model 4 RMSE with a lag of 1 comes close to this value. This suggests that, given knowledge of call volumes up to a certain point in time, the Poisson arrival rate for the next 24 h is almost deterministic. The RMSE for longer lags is higher, suggesting that when modeling arrivals more than one day into the future, one should view them as being generated by a Poisson process with a random arrival rate. The discussion at the beginning of this Section outlines how one can quantify the distribution for the arrival rate.
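The comparison with the Poisson benchmark can be made numerically. For a Poisson count $N$ whose rate $\Lambda$ is itself random, the law of total variance gives $\mathrm{Var}[N] = E[\Lambda] + \mathrm{Var}[\Lambda]$. The sketch below uses this identity with the figures quoted above; treating the squared lag-1 RMSE as the total one-day-ahead variance and 174 as the mean rate are simplifying assumptions made here for illustration, not steps taken in the paper.

```python
import math

mean_daily_calls = 174.0   # average daily volume quoted in the text
rmse_lag1_model4 = 13.91   # RMSE(1) for Model 4 from Table 5

# Standard deviation of a Poisson count with a fixed rate of 174 per day.
poisson_sd = math.sqrt(mean_daily_calls)

# Var[N] = E[rate] + Var[rate]; here Var[N] is approximated by RMSE(1)^2,
# which is an assumption of this sketch.
implied_rate_variance = rmse_lag1_model4 ** 2 - mean_daily_calls
implied_rate_sd = math.sqrt(max(implied_rate_variance, 0.0))

print(round(poisson_sd, 2), round(implied_rate_variance, 2), round(implied_rate_sd, 2))
```

Under these assumptions the uncertainty attributable to the rate is small next to the Poisson variance of 174, which is one way to read the statement that the next-day arrival rate is almost deterministic.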

4 Modeling hourly arrivals

Now that we have a good model of day-by-day call volumes, we turn to the modeling of hour-by-hour call volumes. We will denote the number of calls during hour $h$ by $Z_h$, where $h = 1,\dots,24n$ and $n = 1{,}537$. We investigate two modeling and forecasting approaches.