David S. Matteson

Cornell University, Ithaca, New York, United States

Publications (23) · 26.24 Total impact

  • Zhengyi Zhou, David S. Matteson
    ABSTRACT: Predicting ambulance demand accurately at fine resolutions in space and time is critical for ambulance fleet management and dynamic deployment. Typical challenges include data sparsity at high resolutions and the need to respect complex urban spatial domains. To provide spatial density predictions for ambulance demand in Melbourne, Australia as it varies over hourly intervals, we propose a predictive spatio-temporal kernel warping method. To predict for each hour, we build a kernel density estimator on a sparse set of the most similar data from relevant past time periods (labeled data), but warp these kernels to a larger set of past data regardless of time period (point cloud). The point cloud represents the spatial structure and geographical characteristics of Melbourne, including complex boundaries, road networks, and neighborhoods. Borrowing from manifold learning, kernel warping is performed through a graph Laplacian of the point cloud and can be interpreted as a regularization towards, and a prior imposed on, spatial features. Kernel bandwidth and degree of warping are efficiently estimated via cross-validation, and can be made time- and/or location-specific. Our proposed model gives significantly more accurate predictions than a current industry practice, an unwarped kernel density estimator, and a time-varying Gaussian mixture model.
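    A minimal sketch of the kernel warping step (not the authors' code): the point cloud, 10-nearest-neighbor graph, Gaussian base kernel, bandwidth h, and warping strength lam below are all illustrative placeholders, and the deformed kernel follows the standard manifold-regularization construction rather than anything specific to the paper.

    import numpy as np
    from sklearn.neighbors import kneighbors_graph
    from scipy.sparse.csgraph import laplacian

    rng = np.random.default_rng(0)
    cloud = rng.uniform(size=(300, 2))       # illustrative stand-in for historical locations
    labeled = rng.uniform(size=(40, 2))      # "most similar" past events for the target hour
    h, lam = 0.1, 5.0                        # bandwidth and warping strength (illustrative)

    def gauss(A, B, h):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-d2 / (2 * h ** 2))

    K = gauss(cloud, cloud, h)               # Gram matrix over the point cloud
    W = kneighbors_graph(cloud, 10, mode="connectivity", include_self=False)
    L = laplacian(0.5 * (W + W.T)).toarray() # symmetrized graph Laplacian
    M = lam * L
    inv = np.linalg.inv(np.eye(len(cloud)) + M @ K)

    def warped_kernel(x, z):
        """Gaussian kernel deformed towards the geometry encoded by the graph Laplacian."""
        kx = gauss(x[None], cloud, h).ravel()
        kz = gauss(z[None], cloud, h).ravel()
        return gauss(x[None], z[None], h)[0, 0] - kx @ inv @ (M @ kz)

    # unnormalized density estimate at a query point, averaging warped kernels over labeled data
    query = np.array([0.5, 0.5])
    print(np.mean([warped_kernel(query, xi) for xi in labeled]))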
  • Zhengyi Zhou, David S. Matteson
    ABSTRACT: Predicting ambulance demand accurately at fine time and location scales is critical for ambulance fleet management and dynamic deployment. Large-scale datasets in this setting typically exhibit complex spatio-temporal dynamics and sparsity at high resolutions. We propose a predictive method using spatio-temporal kernel density estimation (stKDE) to address these challenges, and provide spatial density predictions for ambulance demand in Toronto, Canada as it varies over hourly intervals. Specifically, we weight the spatial kernel of each historical observation by its informativeness to the current predictive task. We construct spatio-temporal weight functions to incorporate various temporal and spatial patterns in ambulance demand, including location-specific seasonalities and short-term serial dependence. This allows us to draw out the most helpful historical data, and exploit spatio-temporal patterns in the data for accurate and fast predictions. We further provide efficient estimation and customizable prediction procedures. stKDE is easy to use and interpret by non-specialized personnel from the emergency medical service industry. It also has significantly higher statistical accuracy than the current industry practice, with a comparable amount of computational expense.
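    A rough sketch of the weighting idea on synthetic data (the decay rate, seasonal boost, and bandwidth are illustrative placeholders, not the paper's estimated weight functions): each historical point is weighted by recency and by whether it shares the prediction hour's position in the weekly cycle, and the prediction is a weighted kernel density estimate.

    import numpy as np

    def st_kde(query_xy, hist_xy, hist_hour, pred_hour, h=0.05,
               decay=0.001, season_boost=3.0, week=168):
        """Weighted spatial KDE: weights mix short-term recency and hour-of-week seasonality."""
        lag = pred_hour - hist_hour                         # hours back in time
        w = np.exp(-decay * lag)                            # recency weight
        w *= np.where(lag % week == 0, season_boost, 1.0)   # same hour of the week
        d2 = ((hist_xy - query_xy) ** 2).sum(axis=1)
        k = np.exp(-d2 / (2 * h ** 2)) / (2 * np.pi * h ** 2)
        return np.sum(w * k) / np.sum(w)

    rng = np.random.default_rng(1)
    hist_xy = rng.normal(size=(5000, 2)) * 0.1 + 0.5        # synthetic historical locations
    hist_hour = rng.integers(0, 24 * 7 * 8, size=5000)      # 8 weeks of history
    print(st_kde(np.array([0.5, 0.5]), hist_xy, hist_hour, pred_hour=24 * 7 * 8))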
  • Nicholas A. James, David S. Matteson
    ABSTRACT: The concept of homogeneity plays a critical role in statistics, both in its applications and in its theory. Change point analysis is a statistical tool that aims to attain homogeneity within time series data. This is accomplished by partitioning the time series into a number of contiguous homogeneous segments. The applications of such techniques range from identifying chromosome alterations to solar flare detection. In this manuscript we present a general purpose search algorithm called cp3o that can be used to identify change points in multivariate time series. This new search procedure can be applied with a large class of goodness-of-fit measures. Additionally, a reduction in the computational time needed to identify change points is accomplished through probabilistic pruning. Under mild assumptions about the goodness-of-fit measure, this new search algorithm is shown to generate consistent estimates for both the number of change points and their locations, even when the number of change points increases with the time series length. A change point algorithm that incorporates the cp3o search algorithm and E-statistics, e-cp3o, is also presented. The only distributional assumption that the e-cp3o procedure makes is that the absolute $\alpha$th moment exists, for some $\alpha\in(0,2)$. Due to this mild restriction, the e-cp3o procedure can be applied to a majority of change point problems. Furthermore, even with such a mild restriction, the e-cp3o procedure is able to detect any type of distributional change within a time series. Simulation studies are used to compare the e-cp3o procedure to other parametric and nonparametric change point procedures, and we highlight applications of e-cp3o to climate and financial datasets.
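    The E-statistic divergence that e-cp3o builds on can be sketched as a generic energy distance between two segments (the index alpha and the segment sizes below are illustrative; the cp3o search and its probabilistic pruning are omitted).

    import numpy as np
    from scipy.spatial.distance import cdist

    def energy_divergence(X, Y, alpha=1.0):
        """Sample energy distance between segments X and Y (rows are observations)."""
        dxy = cdist(X, Y) ** alpha
        dxx = cdist(X, X) ** alpha
        dyy = cdist(Y, Y) ** alpha
        return 2 * dxy.mean() - dxx.mean() - dyy.mean()

    rng = np.random.default_rng(0)
    X = rng.normal(0, 1, size=(100, 3))
    Y = rng.normal(1, 1, size=(100, 3))      # mean shift -> large divergence
    print(energy_divergence(X, Y), energy_divergence(X, rng.normal(0, 1, (100, 3))))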
  • Source
    William B. Nicholson, Jacob Bien, David S. Matteson
    ABSTRACT: Vector autoregression (VAR) is a fundamental tool for modeling the joint dynamics of multivariate time series. However, as the number of component series increases, the VAR model quickly becomes overparameterized, making reliable estimation difficult and impeding its adoption as a forecasting tool in high dimensional settings. A number of authors have sought to address this issue by incorporating regularized approaches, such as the lasso, that impose sparse or low-rank structures on the estimated coefficient parameters of the VAR. More traditional approaches attempt to address overparameterization by selecting a low lag order, based on the assumption that dynamic dependence among components is short-range. However, these methods typically assume a single, universal lag order that applies across all components, unnecessarily constraining the dynamic relationships between the components and impeding forecast performance. The lasso-based approaches are more flexible but do not incorporate the notion of lag order selection. We propose a new class of regularized VAR models, called hierarchical vector autoregression (HVAR), that embeds the notion of lag selection into a convex regularizer. The key convex modeling tool is a group lasso with nested groups, which ensures that the sparsity pattern of the autoregressive lag coefficients honors the ordered structure inherent to VAR. We provide computationally efficient algorithms for solving HVAR problems that can be parallelized across the components. A simulation study shows improved performance in forecasting and lag order selection over previous approaches, and a macroeconomic application further highlights forecasting improvements as well as the convenient, interpretable output of an HVAR model.
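    The nested-group idea can be written down directly. The sketch below evaluates roughly the componentwise HVAR penalty on an illustrative coefficient array; the paper defines several group structures and fits them with a proximal algorithm, none of which is shown here.

    import numpy as np

    def hvar_componentwise_penalty(Phi):
        """Nested-group penalty for a VAR(p).  Phi has shape (p, k, k) and Phi[l] is the
        lag-(l+1) coefficient matrix.  For each equation i, sum over l of the norm of
        that equation's coefficients at lags l..p, so higher lags shrink to zero before
        lower ones."""
        p, k, _ = Phi.shape
        pen = 0.0
        for i in range(k):
            for l in range(p):
                pen += np.linalg.norm(Phi[l:, i, :])
        return pen

    rng = np.random.default_rng(0)
    Phi = rng.normal(scale=0.1, size=(4, 3, 3))
    Phi[2:] = 0.0                                # lags 3 and 4 inactive
    print(hvar_componentwise_penalty(Phi))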
  • Source
    Nicholas A. James, Arun Kejariwal, David S. Matteson
    ABSTRACT: Low latency and high availability of an app or a web service are key, amongst other factors, to the overall user experience (which in turn directly impacts the bottom line). Exogenous and/or endogenous factors often give rise to breakouts in cloud data, which makes maintaining high availability and delivering high performance very challenging. Although there exists a large body of prior research on breakout detection, existing techniques are not suitable for detecting breakouts in cloud data because they are not robust in the presence of anomalies. To this end, we developed a novel statistical technique to automatically detect breakouts in cloud data. In particular, the technique employs Energy Statistics to detect breakouts in both application and system metrics. Further, the technique uses robust statistical metrics, viz., the median, and estimates the statistical significance of a breakout through a permutation test. To the best of our knowledge, this is the first work that addresses breakout detection in the presence of anomalies. We demonstrate the efficacy of the proposed technique using production data and report Precision, Recall, and F-measure. The proposed technique is 3.5 times faster than a state-of-the-art technique for breakout detection and is currently used on a daily basis at Twitter.
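    A hedged illustration of the core recipe (robust statistic plus permutation significance) on synthetic data; the production E-Divisive-with-Medians procedure is considerably more involved.

    import numpy as np

    def breakout_pvalue(x, split, n_perm=499, rng=None):
        """Permutation test for a level shift at `split`, using the robust
        |median difference| statistic so isolated anomalies have little influence."""
        rng = rng or np.random.default_rng(0)
        stat = abs(np.median(x[:split]) - np.median(x[split:]))
        count = 0
        for _ in range(n_perm):
            xp = rng.permutation(x)
            if abs(np.median(xp[:split]) - np.median(xp[split:])) >= stat:
                count += 1
        return (count + 1) / (n_perm + 1)

    rng = np.random.default_rng(1)
    x = np.concatenate([rng.normal(0, 1, 200), rng.normal(1.5, 1, 200)])
    x[50] = 25.0                                  # an anomaly the median ignores
    print(breakout_pvalue(x, split=200))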
  • Source
    Daniel R. Kowal, David S. Matteson, David Ruppert
    ABSTRACT: We present a Bayesian approach for modeling multivariate, dependent functional data. To account for the three dominant structural features in the data--functional, time-dependent, and multivariate components--we extend hierarchical dynamic linear models for multivariate time series to the functional data setting. We also develop Bayesian spline theory in a more general constrained optimization framework. The proposed methods identify a time-invariant functional basis for the functional observations, which is smooth and interpretable, and can be made common across multivariate observations for additional information sharing. The Bayesian framework permits joint estimation of the model parameters, provides exact inference (up to MCMC error) on specific parameters, and allows generalized dependence structures. Sampling from the posterior distribution is accomplished with an efficient Gibbs sampling algorithm. We illustrate the proposed framework with two applications: (1) multi-economy yield curve data from the recent global recession, and (2) local field potential brain signals in rats, for which we develop a multivariate functional time series approach for multivariate time-frequency analysis.
  • Source
    ABSTRACT: We introduce a parsimonious Bayesian approach for modeling spatio-temporal point processes. Our method is motivated by the problem of estimating the spatial distribution of ambulance demand in Toronto, Canada, as it changes over discrete 2-hour intervals; such estimates are critical for fleet management and dynamic deployment. The large-scale datasets typical in ambulance demand estimation exhibit complex spatial and temporal patterns and dynamics. We propose to model this time series of spatial densities by finite Gaussian mixture models. We fix the mixture component distributions across all time periods while letting the mixture weights evolve over time. This allows efficient estimation of the underlying spatial structure, yet enough flexibility to capture dynamics over time. We capture temporal patterns such as seasonality by introducing constraints on the mixture weights; we represent location-specific temporal dynamics by applying a separate autoregressive prior on each mixture weight. While estimation may be performed using a fixed number of mixture components, we also extend to estimate the number of components using birth-and-death Markov chain Monte Carlo. We quantify statistical and operational merits of our method over the current industry practice.
    Journal of the American Statistical Association 01/2014; 110(509). DOI:10.1080/01621459.2014.941466 · 2.11 Impact Factor
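    A minimal sketch of the model structure with made-up parameters (not fitted values): component means and covariances are shared across time periods, while each period's mixture weights evolve through a simple autoregression on an unconstrained scale.

    import numpy as np
    from scipy.stats import multivariate_normal

    rng = np.random.default_rng(0)
    K, T = 4, 12                                       # components, 2-hour periods
    means = rng.uniform(0, 10, size=(K, 2))            # fixed across time
    covs = np.array([np.eye(2) * s for s in (0.5, 1.0, 0.7, 1.5)])

    # mixture weights evolve over time: AR(1) on unconstrained scores, then softmax
    scores = np.zeros((T, K))
    for t in range(1, T):
        scores[t] = 0.8 * scores[t - 1] + rng.normal(scale=0.3, size=K)
    weights = np.exp(scores) / np.exp(scores).sum(axis=1, keepdims=True)

    def density(x, t):
        """Spatial demand density at location x during period t."""
        return sum(weights[t, k] * multivariate_normal(means[k], covs[k]).pdf(x)
                   for k in range(K))

    print(density(np.array([5.0, 5.0]), t=3), density(np.array([5.0, 5.0]), t=9))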
  •
    ABSTRACT: We examine differences between independent component analyses (ICAs) arising from different assumptions, measures of dependence, and starting points of the algorithms. ICA is a popular method with diverse applications including artifact removal in electrophysiology data, feature extraction in microarray data, and identifying brain networks in functional magnetic resonance imaging (fMRI). ICA can be viewed as a generalization of principal component analysis (PCA) that takes into account higher-order cross-correlations. Whereas the PCA solution is unique, there are many ICA methods, whose solutions may differ. Infomax, FastICA, and JADE are commonly applied to fMRI studies, with FastICA being arguably the most popular. Hastie and Tibshirani (2003) demonstrated that ProDenICA outperformed FastICA in simulations with two components. We introduce the application of ProDenICA to simulations with more components and to fMRI data. ProDenICA was more accurate in simulations, and we identified differences between biologically meaningful ICs from ProDenICA versus other methods in the fMRI analysis. ICA methods require nonconvex optimization, yet current practices do not recognize the importance of, nor adequately address sensitivity to, initial values. We found that local optima led to dramatically different estimates in both simulations and group ICA of fMRI, and we provide evidence that the global optimum from ProDenICA is the best estimate. We applied a modification of the Hungarian (Kuhn-Munkres) algorithm to match ICs from multiple estimates, thereby gaining novel insights into how brain networks vary in their sensitivity to initial values and ICA method.
    Biometrics 12/2013; 70(1). DOI:10.1111/biom.12111 · 1.52 Impact Factor
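    The sensitivity-to-initialization and matching steps can be illustrated with standard tools (synthetic sources, FastICA from two seeds, and Hungarian matching on absolute correlations; ProDenICA itself is not shown).

    import numpy as np
    from sklearn.decomposition import FastICA
    from scipy.optimize import linear_sum_assignment

    rng = np.random.default_rng(0)
    t = np.linspace(0, 8, 2000)
    S = np.c_[np.sin(7 * t), np.sign(np.sin(3 * t)), rng.laplace(size=t.size)]
    X = S @ rng.normal(size=(3, 3)).T                     # mixed signals

    runs = [FastICA(n_components=3, random_state=seed, max_iter=1000).fit_transform(X)
            for seed in (1, 2)]

    # pair components across runs by maximizing absolute correlation (Hungarian algorithm)
    C = np.abs(np.corrcoef(runs[0].T, runs[1].T)[:3, 3:])
    rows, cols = linear_sum_assignment(-C)
    for i, j in zip(rows, cols):
        print(f"run-1 IC{i} matches run-2 IC{j} (|corr| = {C[i, j]:.2f})")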
  • Source
    Nicholas A. James, David S. Matteson
    ABSTRACT: There are many different ways in which change point analysis can be performed, from purely parametric methods to those that are distribution free. The ecp package is designed to perform multiple change point analysis while making as few assumptions as possible. While many other change point methods are applicable only to univariate data, this R package is suitable for both univariate and multivariate observations. Estimation can be based upon either a hierarchical divisive or a hierarchical agglomerative algorithm. Divisive estimation sequentially identifies change points via a bisection algorithm. The agglomerative algorithm estimates change point locations by determining an optimal segmentation. Both approaches are able to detect any type of distributional change within the data. This provides an advantage over many existing change point algorithms, which are only able to detect changes within the marginal distributions.
    Journal of Statistical Software 09/2013; 62(7). · 3.80 Impact Factor
  • Source
    David S. Matteson, Ruey S. Tsay
    ABSTRACT: This paper introduces a novel statistical framework for independent component analysis (ICA) of multivariate data. We propose methodology for estimating and testing the existence of mutually independent components for a given dataset, and a versatile resampling-based procedure for inference. Independent components are estimated by combining a nonparametric probability integral transformation with a generalized nonparametric whitening method that simultaneously minimizes all forms of dependence among the components. U-statistics of certain Euclidean distances between sample elements are combined in succession to construct a statistic for testing the existence of mutually independent components. The proposed measures and tests are based on both necessary and sufficient conditions for mutual independence. When independent components exist, one may apply univariate analysis to study or model each component separately. Univariate models may then be combined to obtain a multivariate model for the original observations. We prove the consistency of our estimator under minimal regularity conditions without assuming the existence of independent components a priori, and all assumptions are placed on the observations directly, not on the latent components. We demonstrate the improvements of the proposed method over competing methods in simulation studies. We apply the proposed ICA approach to two real examples and contrast it with principal component analysis.
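    The dependence measures are built from pairwise Euclidean distances; a plain V-statistic distance covariance can be sketched as below, leaving out the probability integral transform, the rotation search, and the resampling-based inference.

    import numpy as np
    from scipy.spatial.distance import pdist, squareform

    def dcov(X, Y):
        """Sample distance covariance between X and Y (rows are observations)."""
        def centered(D):
            return D - D.mean(0, keepdims=True) - D.mean(1, keepdims=True) + D.mean()
        A = centered(squareform(pdist(np.atleast_2d(X).reshape(len(X), -1))))
        B = centered(squareform(pdist(np.atleast_2d(Y).reshape(len(Y), -1))))
        return np.sqrt(np.mean(A * B))

    rng = np.random.default_rng(0)
    x = rng.normal(size=1000)
    print(dcov(x, x ** 2), dcov(x, rng.normal(size=1000)))   # dependent vs independent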
  • Source
    David S. Matteson, Nicholas A. James
    ABSTRACT: Change point analysis has applications in a wide variety of fields. The general problem concerns the inference of a change in distribution for a set of time-ordered observations. Sequential detection is an online version in which new data are continually arriving and are analyzed adaptively. We are concerned with the related, but distinct, offline version, in which retrospective analysis of an entire sequence is performed. For a set of multivariate observations of arbitrary dimension, we consider nonparametric estimation of both the number of change points and the positions at which they occur. We do not make any assumptions regarding the nature of the change in distribution or any distributional assumptions beyond the existence of the $\alpha$th absolute moment, for some $\alpha \in (0,2)$. Estimation is based on hierarchical clustering and we propose both divisive and agglomerative algorithms. The divisive method is shown to provide consistent estimates of both the number and location of change points under standard regularity assumptions. We compare the proposed approach with competing methods in a simulation study. Methods from cluster analysis are applied to assess performance and to allow simple comparisons of location estimates, even when the estimated number differs. We conclude with applications in genetics, finance and spatio-temporal analysis.
    Journal of the American Statistical Association 06/2013; 109(505). DOI:10.1080/01621459.2013.849605 · 2.11 Impact Factor
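    A compact sketch of a single divisive bisection step on synthetic data (not the ecp implementation): candidate splits are scored with a size-scaled energy divergence, and the split with the largest score is returned. The published algorithm recurses on the resulting segments and uses a permutation test to decide when to stop, which is omitted here.

    import numpy as np
    from scipy.spatial.distance import cdist

    def divergence(X, Y, alpha=1.0):
        """Energy-style divergence between two segments (rows are observations)."""
        return (2 * (cdist(X, Y) ** alpha).mean()
                - (cdist(X, X) ** alpha).mean() - (cdist(Y, Y) ** alpha).mean())

    def best_split(X, min_size=30):
        """Single bisection step: the split maximizing a size-scaled divergence."""
        def score(s):
            m, n = s, len(X) - s
            return (m * n / (m + n)) * divergence(X[:s], X[s:])
        return max(range(min_size, len(X) - min_size), key=score)

    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal(m, 1, size=(150, 2)) for m in (0.0, 1.5)])
    print(best_split(X))   # close to the true change point at 150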
  • Source
    ABSTRACT: We introduce a Bayesian model for estimating the distribution of ambulance travel times on each road segment in a city, using Global Positioning System (GPS) data. Due to sparseness and error in the GPS data, the exact ambulance paths and travel times on each road segment are unknown. We simultaneously estimate the paths, travel times, and parameters of each road segment travel time distribution using Bayesian data augmentation. To draw ambulance path samples, we use a novel reversible jump Metropolis–Hastings step. We also introduce two simpler estimation methods based on GPS speed data. We compare these methods to a recently published travel time estimation method, using simulated data and data from Toronto EMS. In both cases, out-of-sample point and interval estimates of ambulance trip times from the Bayesian method outperform estimates from the alternative methods. We also construct probability-of-coverage maps for ambulances. The Bayesian method gives more realistic maps than the recently published method. Finally, path estimates from the Bayesian method interpolate well between sparsely recorded GPS readings and are robust to GPS location errors.
    The Annals of Applied Statistics 06/2013; 7(2). DOI:10.1214/13-AOAS626 · 1.69 Impact Factor
  •
    ABSTRACT: The assumption of strict stationarity is too strong for observations in many financial time series applications; however, distributional properties may be at least locally stable in time. We define multivariate measures of homogeneity to quantify local stationarity and develop an empirical approach for robustly estimating time-varying windows of stationarity. Finally, we consider bivariate series that are believed to be locally cointegrated, assess our estimates, and discuss applications in financial asset pairs trading.
    Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on; 01/2013
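    As a rough illustration on simulated data (the window length and step are arbitrary, and statsmodels' Engle-Granger cointegration test stands in for the paper's homogeneity measures): slide a window over a bivariate series and track where the cointegration evidence holds up.

    import numpy as np
    from statsmodels.tsa.stattools import coint

    rng = np.random.default_rng(0)
    n, window = 1000, 250
    common = np.cumsum(rng.normal(size=n))                   # shared stochastic trend
    y1 = common + rng.normal(scale=0.5, size=n)
    y2 = 0.8 * common + rng.normal(scale=0.5, size=n)
    y2[n // 2:] += np.cumsum(rng.normal(size=n // 2))        # relationship breaks mid-sample

    for start in range(0, n - window + 1, 125):
        _, pval, _ = coint(y1[start:start + window], y2[start:start + window])
        print(f"window [{start}, {start + window}): cointegration p-value = {pval:.3f}")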
  •
    ABSTRACT: Identifying periods of recession and expansion is a challenging topic of ongoing interest with important economic and monetary policy implications. Given the current state of the global economy, significant attention has recently been devoted to identifying ...
    Applied Stochastic Models in Business and Industry 11/2012; 28(6):504-505. DOI:10.1002/asmb.1955 · 0.53 Impact Factor
  •
    ABSTRACT: Identifying periods of recession and expansion is a challenging topic of ongoing interest with important economic and monetary policy implications. Given the current state of the global economy, significant attention has recently been devoted to identifying and forecasting economic recessions. Consequently, we introduce a novel class of Bayesian hierarchical probit models that take advantage of dimension-reduced time–frequency representations of various market indices. The approach we propose can be viewed as a Bayesian mixed frequency data regression model, as it relates high-frequency daily data observed over several quarters to a binary quarterly response indicating recession or expansion. More specifically, our model directly incorporates time–frequency representations of the entire high-dimensional non-stationary time series of daily log returns, over several quarters, as a regressor in a predictive model, while quantifying various sources of uncertainty. The necessary dimension reduction is achieved by treating the time–frequency representation (spectrogram) as an “image” and finding its empirical orthogonal functions. Subsequently, further dimension reduction is accomplished through the use of stochastic search variable selection. Overall, our dimension reduction approach provides an extremely powerful tool for feature extraction, yielding an interpretable image of features that predict recessions. The effectiveness of our model is demonstrated through out-of-sample identification (nowcasting) and multistep-ahead prediction (forecasting) of economic recessions. In fact, our results provide greater than 85% and 80% out-of-sample forecasting accuracy for recessions and expansions respectively, even three quarters ahead. Finally, we illustrate the utility and added value of including time–frequency information from the NASDAQ index when identifying and predicting recessions. Copyright © 2012 John Wiley & Sons, Ltd.
    Applied Stochastic Models in Business and Industry 11/2012; 28(6):485-499. DOI:10.1002/asmb.1954 · 0.53 Impact Factor
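    A toy sketch of the feature pipeline on simulated returns and labels (a plain probit fit stands in for the Bayesian hierarchical model and stochastic search variable selection; the data, window lengths, and number of retained EOFs are all placeholders).

    import numpy as np
    from scipy.signal import spectrogram
    import statsmodels.api as sm

    rng = np.random.default_rng(0)
    n_quarters, days = 80, 63                                      # ~63 trading days per quarter
    returns = rng.normal(scale=0.01, size=(n_quarters, 4 * days))  # four quarters of daily log returns per row

    # time-frequency "image" per quarter, flattened, then EOFs (principal directions) via SVD
    specs = np.array([spectrogram(r, nperseg=32)[2].ravel() for r in returns])
    specs -= specs.mean(axis=0)
    U, s, Vt = np.linalg.svd(specs, full_matrices=False)
    features = specs @ Vt[:5].T                                    # scores on the first 5 EOFs

    recession = (rng.uniform(size=n_quarters) < 0.2).astype(int)   # placeholder labels
    probit = sm.Probit(recession, sm.add_constant(features)).fit(disp=0)
    print(probit.params.round(2))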
  • Source
    David S. Matteson, David Ruppert
    ABSTRACT: Economic and financial time series typically exhibit time-varying conditional (given the past) standard deviations and correlations. The conditional standard deviation is also called the volatility. Higher volatilities increase the risk of assets and higher conditional correlations cause an increased risk in portfolios. Therefore, models of time-varying volatilities and correlations are essential for risk management.
    IEEE Signal Processing Magazine 10/2011; 28(5):72-82. DOI:10.1109/MSP.2011.941553 · 4.48 Impact Factor
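    The time-varying volatilities and correlations discussed in the article can be illustrated with a simple exponentially weighted (RiskMetrics-style) estimator; the decay value 0.94 is the conventional textbook choice, not a recommendation from the article.

    import numpy as np

    def ewma_cov(returns, lam=0.94):
        """Exponentially weighted covariance path; returns conditional vols and correlation."""
        n, k = returns.shape
        S = np.cov(returns[:20].T)                      # warm-up estimate
        vols, corrs = np.empty((n, k)), np.empty(n)
        for t in range(n):
            r = returns[t][:, None]
            S = lam * S + (1 - lam) * (r @ r.T)
            d = np.sqrt(np.diag(S))
            vols[t], corrs[t] = d, S[0, 1] / (d[0] * d[1])
        return vols, corrs

    rng = np.random.default_rng(0)
    x = rng.normal(size=(500, 2)) * np.linspace(0.5, 2.0, 500)[:, None]   # rising volatility
    vols, corrs = ewma_cov(x)
    print(vols[-1], corrs[-1])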
  • Source
    ABSTRACT: We introduce a new method for forecasting emergency call arrival rates that combines integer-valued time series models with a dynamic latent factor structure. Covariate information is captured via simple constraints on the factor loadings. We directly model the count-valued arrivals per hour, rather than using an artificial assumption of normality. This is crucial for the emergency medical service context, in which the volume of calls may be very low. Smoothing splines are used in estimating the factor levels and loadings to improve long-term forecasts. We impose time series structure at the hourly level, rather than at the daily level, capturing the fine-scale dependence in addition to the long-term structure. Our analysis considers all emergency priority calls received by Toronto EMS between January 2007 and December 2008 for which an ambulance was dispatched. Empirical results demonstrate significantly reduced error in forecasting call arrival volume. To quantify the impact of reduced forecast errors, we design a queueing model simulation that approximates the dynamics of an ambulance system. The results show better performance as the forecasting method improves. This notion of quantifying the operational impact of improved statistical procedures may be of independent interest.
    The Annals of Applied Statistics 07/2011; 5(2011). DOI:10.1214/10-AOAS442 · 1.69 Impact Factor
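    A rough sketch of count-valued modeling with hour-of-day and day-of-week structure (a plain Poisson GLM on simulated counts, far simpler than the paper's dynamic latent factor model).

    import numpy as np
    import pandas as pd
    import statsmodels.api as sm
    import statsmodels.formula.api as smf

    rng = np.random.default_rng(0)
    hours = pd.date_range("2007-01-01", periods=24 * 7 * 8, freq="h")    # 8 weeks of hourly bins
    rate = 3 + 2 * np.sin(2 * np.pi * hours.hour / 24) + 1.0 * (hours.dayofweek >= 5)
    df = pd.DataFrame({"calls": rng.poisson(np.clip(rate, 0.1, None)),
                       "hour": hours.hour, "dow": hours.dayofweek})

    # count model with hour-of-day and day-of-week effects; forecast a Monday 9am bin
    fit = smf.glm("calls ~ C(hour) + C(dow)", data=df, family=sm.families.Poisson()).fit()
    new = pd.DataFrame({"hour": [9], "dow": [0]})
    print(fit.predict(new))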
  • Source
    David S. Matteson, Ruey S. Tsay
    ABSTRACT: We introduce dynamic orthogonal components (DOC) for multivariate time series and propose a procedure for estimating and testing the existence of DOCs for a given time series. We estimate the dynamic orthogonal components via a generalized decorrelation method that minimizes the linear and quadratic dependence across components and across time. Ljung-Box type statistics are then used to test the existence of dynamic orthogonal components. When DOCs exist, one can apply univariate analysis to build a model for each component. Those univariate models are then combined to obtain a multivariate model for the original time series. We demonstrate the usefulness of dynamic orthogonal components with two real examples and compare the proposed modeling method with other dimension reduction methods available in the literature, including principal component and independent component analyses. We also prove consistency and asymptotic normality of the proposed estimator under some regularity conditions. Some technical details are provided in online Supplementary Materials.
    Journal of the American Statistical Association 07/2011; 106(496). DOI:10.1198/jasa.2011.tm10616 · 2.11 Impact Factor
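    The testing step can be illustrated with standard tools; a PCA whitening stands in for the paper's generalized decorrelation estimator, and univariate Ljung-Box tests on each component and its squares stand in for the multivariate Ljung-Box-type statistics.

    import numpy as np
    from sklearn.decomposition import PCA
    from statsmodels.stats.diagnostic import acorr_ljungbox

    rng = np.random.default_rng(0)
    n = 2000
    e = rng.normal(size=(n, 2))
    s1 = np.zeros(n)
    for t in range(1, n):                        # an AR(1) latent component
        s1[t] = 0.7 * s1[t - 1] + e[t, 0]
    S = np.c_[s1, e[:, 1]]                       # second component is white noise
    X = S @ np.array([[1.0, 0.4], [0.3, 1.0]])   # observed mixed series

    comps = PCA(whiten=True).fit_transform(X)    # stand-in decorrelation (not the DOC estimator)
    for j in range(comps.shape[1]):
        lb = acorr_ljungbox(comps[:, j], lags=[10])
        lb_sq = acorr_ljungbox(comps[:, j] ** 2, lags=[10])
        print(f"component {j}: level p = {lb['lb_pvalue'].iloc[0]:.3f}, "
              f"squared p = {lb_sq['lb_pvalue'].iloc[0]:.3f}")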
  • Source
    Dawn B. Woodard, David S. Matteson, Shane G. Henderson
    ABSTRACT: Time series models are often constructed by combining nonstationary effects such as trends with stochastic processes that are believed to be stationary. Although stationarity of the underlying process is typically crucial to ensure desirable properties or even validity of statistical estimators, there are numerous time series models for which this stationarity is not yet proven. A major barrier is that the most commonly used methods assume φ-irreducibility, a condition that can be violated for the important class of discrete-valued observation-driven models. We show (strict) stationarity for the class of Generalized Autoregressive Moving Average (GARMA) models, which provides a flexible analogue of ARMA models for count, binary, or other discrete-valued data. We do this from two perspectives. First, we show conditions under which GARMA models have a unique stationary distribution (so are strictly stationary when initialized in that distribution). This result potentially forms the foundation for broadly showing consistency and asymptotic normality of maximum likelihood estimators for GARMA models. Since these conclusions are not immediate, however, we also take a second approach. We show stationarity and ergodicity of a perturbed version of the GARMA model, which utilizes the fact that the perturbed model is φ-irreducible and immediately implies consistent estimation of the mean, lagged covariances, and other functionals of the perturbed process. We relate the perturbed and original processes by showing that the perturbed model yields parameter estimates that are arbitrarily close to those of the original model.
    Electronic Journal of Statistics 01/2011; 5(2011). DOI:10.1214/11-EJS627 · 1.02 Impact Factor
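    A hedged sketch of a Poisson GARMA(1,1)-type recursion with a log link on simulated data (the truncation constant c and the parameter values are illustrative; the perturbed version studied in the paper adds a small continuous perturbation to the observations, which is omitted here).

    import numpy as np

    def simulate_garma(n=500, beta0=1.0, phi=0.5, theta=0.2, c=0.1, seed=0):
        """Poisson GARMA(1,1) with log link: eta_t depends on past truncated counts."""
        rng = np.random.default_rng(seed)
        y, eta = np.zeros(n), np.zeros(n)
        eta[0] = beta0
        y[0] = rng.poisson(np.exp(eta[0]))
        for t in range(1, n):
            ystar = max(y[t - 1], c)                   # truncate away from zero for the log link
            eta[t] = (beta0 + phi * (np.log(ystar) - beta0)
                      + theta * (np.log(ystar) - eta[t - 1]))
            y[t] = rng.poisson(np.exp(eta[t]))
        return y

    y = simulate_garma()
    print(y[:20], y.mean())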