David S. Matteson

Cornell University, Ithaca, New York, United States


Publications (28) · 29.48 Total Impact

  • ABSTRACT: We propose a regression approach for estimating the distribution of ambulance travel times between any two locations in a road network. Our method uses ambulance location data that can be sparse in both time and network coverage, such as Global Positioning System data. Estimates depend on the path traveled and on explanatory variables such as the time of day and day of week. By modeling at the trip level, we account for dependence between travel times on individual road segments. Our method is parsimonious and computationally tractable for large road networks. We apply our method to estimate ambulance travel time distributions in Toronto, providing improved estimates compared to a recently published method and a commercial software package. We also demonstrate our method’s impact on ambulance fleet management decisions, showing substantial differences between our method and the recently published method in the predicted probability that an ambulance arrives within a time threshold.
    No preview · Article · Jan 2016 · European Journal of Operational Research
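    As a rough, hypothetical illustration of the trip-level idea (not the estimator developed in the paper), the sketch below regresses log travel time on trip distance and a time-of-day indicator and treats the fit as a log-normal predictive distribution; all column names and values are made up.

    ```python
    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf
    from scipy.stats import norm

    # Hypothetical trip-level data; columns are placeholders, not the paper's dataset.
    trips = pd.DataFrame({
        "travel_time_s": [420, 610, 305, 540, 760, 380, 505, 660],
        "distance_km":   [3.2, 5.1, 2.0, 4.4, 6.8, 2.9, 3.9, 5.6],
        "rush_hour":     [1, 1, 0, 1, 1, 0, 0, 1],
    })

    # Trip-level model: log travel time as a function of path length and time of day.
    fit = smf.ols("np.log(travel_time_s) ~ distance_km + rush_hour", data=trips).fit()

    # Treat the prediction for a new trip as log-normal and read off a coverage probability.
    new = pd.DataFrame({"distance_km": [4.0], "rush_hour": [1]})
    mu, sigma = float(fit.predict(new)[0]), float(np.sqrt(fit.scale))
    p_within_8_min = norm.cdf(np.log(480), loc=mu, scale=sigma)  # P(travel time <= 480 s)
    print(f"P(arrive within 8 minutes) ~= {p_within_8_min:.2f}")
    ```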
  • Source
    Benjamin B. Risk · David S. Matteson · David Ruppert
    ABSTRACT: Independent component analysis (ICA) is popular in many applications, including cognitive neuroscience and signal processing. Due to computational constraints, principal component analysis is used for dimension reduction prior to ICA (PCA+ICA), which could remove important information. The problem is that interesting independent components (ICs) could be mixed in several principal components that are discarded and then these ICs cannot be recovered. To address this issue, we propose likelihood component analysis (LCA), a novel methodology in which dimension reduction and latent variable estimation is achieved simultaneously by maximizing a likelihood with Gaussian and non-Gaussian components. We present a parametric LCA model using the logistic density and a semi-parametric LCA model using tilted Gaussians with cubic B-splines. We implement an algorithm scalable to datasets common in applications (e.g., hundreds of thousands of observations across hundreds of variables with dozens of latent components). In simulations, our methods recover latent components that are discarded by PCA+ICA methods. We apply our method to dependent multivariate data and demonstrate that LCA is a useful data visualization and dimension reduction tool that reveals features not apparent from PCA or PCA+ICA. We also apply our method to an experiment from the Human Connectome Project with state-of-the-art temporal and spatial resolution and identify an artifact using LCA that was missed by PCA+ICA. We present theoretical results on identifiability of the LCA model and consistency of our estimator.
    Full-text · Article · Nov 2015
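    For context only, here is a minimal sketch (scikit-learn, synthetic data) of the two-stage PCA+ICA pipeline that the abstract argues can discard informative components; it shows the baseline being critiqued, not LCA itself.

    ```python
    import numpy as np
    from sklearn.decomposition import PCA, FastICA

    rng = np.random.default_rng(0)
    X = rng.standard_normal((10_000, 50))   # stand-in for an (observations x variables) dataset

    # Stage 1: dimension reduction by PCA.
    pcs = PCA(n_components=10).fit_transform(X)

    # Stage 2: ICA on the retained principal components only.
    ics = FastICA(n_components=10, random_state=0).fit_transform(pcs)

    # Any non-Gaussian signal concentrated in the 40 discarded PCs cannot be recovered here;
    # LCA avoids this by estimating the subspace and the non-Gaussian components jointly.
    ```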
  • William Nicholson · David Matteson · Jacob Bien
    ABSTRACT: The vector autoregression (VAR) has long proven to be an effective method for modeling the joint dynamics of macroeconomic time series as well as forecasting. A major shortcoming of the VAR that has hindered its applicability is its heavy parameterization: the parameter space grows quadratically with the number of series included, quickly exhausting the available degrees of freedom. Consequently, forecasting using VARs is intractable for low-frequency, high-dimensional macroeconomic data. However, empirical evidence suggests that VARs that incorporate more component series tend to result in more accurate forecasts. Conventional methods that allow for the estimation of large VARs either tend to require ad hoc subjective specifications or are computationally infeasible. Moreover, as global economies become more intricately intertwined, there has been substantial interest in incorporating the impact of stochastic, unmodeled exogenous variables. Vector autoregression with exogenous variables (VARX) extends the VAR to allow for the inclusion of unmodeled variables, but it similarly faces dimensionality challenges. We introduce the VARX-L framework, a structured family of VARX models, and provide methodology which allows for both efficient estimation and accurate forecasting in high-dimensional analysis. VARX-L adapts several prominent scalar regression regularization techniques to a vector time series context to greatly reduce the parameter space of VAR and VARX models. We formulate convex optimization procedures that are amenable to efficient solutions for the time-ordered, high-dimensional problems we aim to solve. We also highlight a compelling extension that allows for shrinking toward reference models. We demonstrate the efficacy of VARX-L in both low- and high-dimensional macroeconomic applications and simulated data examples.
    No preview · Article · Aug 2015
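    The VARX-L penalty structures themselves are not reproduced here; the sketch below only illustrates the general idea of taming a heavily parameterized VAR with an off-the-shelf lasso, fit equation by equation on synthetic data.

    ```python
    import numpy as np
    from sklearn.linear_model import Lasso

    def lagged_design(Y, p):
        """Stack p lags of the (T x k) series Y into a regression design matrix."""
        T, k = Y.shape
        X = np.hstack([Y[p - l - 1:T - l - 1] for l in range(p)])  # shape (T - p, k * p)
        return X, Y[p:]

    rng = np.random.default_rng(1)
    Y = rng.standard_normal((200, 10))                 # synthetic 10-dimensional series
    X, Y_target = lagged_design(Y, p=4)

    # One lasso regression per component series; most of the lag coefficients shrink to zero.
    B = np.column_stack([Lasso(alpha=0.1).fit(X, Y_target[:, j]).coef_
                         for j in range(Y.shape[1])])
    print("nonzero coefficients per equation:", (np.abs(B) > 1e-8).sum(axis=0))
    ```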
  • Source
    Zhengyi Zhou · David S. Matteson
    ABSTRACT: Predicting ambulance demand accurately at fine time and location scales is critical for ambulance fleet management and dynamic deployment. Large-scale datasets in this setting typically exhibit complex spatio-temporal dynamics and sparsity at high resolutions. We propose a predictive method using spatio-temporal kernel density estimation (stKDE) to address these challenges, and provide spatial density predictions for ambulance demand in Toronto, Canada as it varies over hourly intervals. Specifically, we weight the spatial kernel of each historical observation by its informativeness to the current predictive task. We construct spatio-temporal weight functions to incorporate various temporal and spatial patterns in ambulance demand, including location-specific seasonalities and short-term serial dependence. This allows us to draw out the most helpful historical data, and exploit spatio-temporal patterns in the data for accurate and fast predictions. We further provide efficient estimation and customizable prediction procedures. stKDE is easy to use and interpret by non-specialized personnel from the emergency medical service industry. It also has significantly higher statistical accuracy than the current industry practice, with a comparable amount of computational expense.
    Preview · Article · Jul 2015
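    A toy sketch of one ingredient, weighting historical points by their temporal relevance inside a spatial KDE; scipy.stats.gaussian_kde accepts a weights argument (SciPy >= 1.2). The weight function below (recency decay plus a same-hour-of-week bump) is made up for illustration, not the paper's estimated weights.

    ```python
    import numpy as np
    from scipy.stats import gaussian_kde

    rng = np.random.default_rng(2)
    n = 5000
    locs = rng.normal(loc=[[0.0, 0.0]], scale=1.0, size=(n, 2))   # historical event locations (toy)
    hours = rng.integers(0, 24 * 7 * 8, size=n)                   # event times in hours (8 weeks)

    target_hour = 24 * 7 * 8 + 10                                 # the hour we want to predict

    # Toy weight: recency decay plus a boost for the same hour-of-week (weekly seasonality).
    age = target_hour - hours
    w = np.exp(-age / (24.0 * 14)) + 0.5 * (hours % (24 * 7) == target_hour % (24 * 7))

    kde = gaussian_kde(locs.T, weights=w)                         # weighted spatial KDE
    grid = np.mgrid[-3:3:100j, -3:3:100j]
    density = kde(grid.reshape(2, -1)).reshape(100, 100)          # predicted demand density surface
    ```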
  • Source
    Zhengyi Zhou · David S. Matteson
    ABSTRACT: Predicting ambulance demand accurately in fine resolutions in space and time is critical for ambulance fleet management and dynamic deployment. Typical challenges include data sparsity at high resolutions and the need to respect complex urban spatial domains. To provide spatial density predictions for ambulance demand in Melbourne, Australia as it varies over hourly intervals, we propose a predictive spatio-temporal kernel warping method. To predict for each hour, we build a kernel density estimator on a sparse set of the most similar data from relevant past time periods (labeled data), but warp these kernels to a larger set of past data regardless of time period (point cloud). The point cloud represents the spatial structure and geographical characteristics of Melbourne, including complex boundaries, road networks, and neighborhoods. Borrowing from manifold learning, kernel warping is performed through a graph Laplacian of the point cloud and can be interpreted as imposing a prior on, and regularizing toward, these spatial features. Kernel bandwidth and degree of warping are efficiently estimated via cross-validation, and can be made time- and/or location-specific. Our proposed model gives significantly more accurate predictions compared to a current industry practice, an unwarped kernel density estimator, and a time-varying Gaussian mixture model.
    Preview · Article · Jul 2015
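    The full kernel warping step is not reproduced here; the sketch below only builds the graph Laplacian of a point cloud from a k-nearest-neighbour graph, which is the object through which the spatial structure (roads, boundaries, neighborhoods) would enter such a regularization.

    ```python
    import numpy as np
    from sklearn.neighbors import kneighbors_graph
    from scipy.sparse.csgraph import laplacian

    rng = np.random.default_rng(3)
    point_cloud = rng.uniform(size=(2000, 2))        # stand-in for historical event locations

    # Symmetric kNN adjacency over the point cloud, then its (sparse) graph Laplacian.
    W = kneighbors_graph(point_cloud, n_neighbors=10, mode="connectivity")
    W = 0.5 * (W + W.T)
    L = laplacian(W, normed=True)

    # In kernel warping, L penalizes kernels that vary sharply across the point cloud,
    # pulling density estimates toward the geography encoded by the data.
    ```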
  • Nicholas A. James · David S. Matteson
    ABSTRACT: The concept of homogeneity plays a critical role in statistics, both in its applications and its theory. Change point analysis is a statistical tool that aims to attain homogeneity within time series data. This is accomplished through partitioning the time series into a number of contiguous homogeneous segments. The applications of such techniques range from identifying chromosome alterations to solar flare detection. In this manuscript we present a general purpose search algorithm called cp3o that can be used to identify change points in multivariate time series. This new search procedure can be applied with a large class of goodness of fit measures. Additionally, a reduction in the computational time needed to identify change points is accomplished by means of probabilistic pruning. With mild assumptions about the goodness of fit measure, this new search algorithm is shown to generate consistent estimates for both the number of change points and their locations, even when the number of change points increases with the time series length. A change point algorithm that incorporates the cp3o search algorithm and E-statistics, e-cp3o, is also presented. The only distributional assumption that the e-cp3o procedure makes is that the absolute $\alpha$th moment exists, for some $\alpha\in(0,2)$. Due to this mild restriction, the e-cp3o procedure can be applied to a majority of change point problems. Furthermore, even with such a mild restriction, the e-cp3o procedure has the ability to detect any type of distributional change within a time series. Simulation studies are used to compare the e-cp3o procedure to other parametric and nonparametric change point procedures, and we highlight applications of e-cp3o to climate and financial datasets.
    No preview · Article · May 2015
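    The cp3o search and its probabilistic pruning are not reproduced here; the sketch below only computes the kind of E-statistic (energy) goodness-of-fit measure, with the alpha exponent, that e-cp3o plugs in, and scans it over candidate split points in a synthetic multivariate series.

    ```python
    import numpy as np
    from scipy.spatial.distance import cdist, pdist

    def energy_stat(X, Y, alpha=1.0):
        """Energy goodness-of-fit between segments X (n x d) and Y (m x d), 0 < alpha < 2."""
        between = cdist(X, Y) ** alpha
        return 2 * between.mean() - (pdist(X) ** alpha).mean() - (pdist(Y) ** alpha).mean()

    rng = np.random.default_rng(4)
    series = np.vstack([rng.normal(0, 1, (150, 3)), rng.normal(1.5, 1, (150, 3))])

    # Scan candidate split points; the largest statistic sits near the distributional change.
    candidates = range(30, 270)
    scores = [energy_stat(series[:t], series[t:]) for t in candidates]
    print("estimated change point near index", candidates[int(np.argmax(scores))])
    ```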
  • Source
    William B. Nicholson · Jacob Bien · David S. Matteson
    ABSTRACT: Vector autoregression (VAR) is a fundamental tool for modeling the joint dynamics of multivariate time series. However, as the number of component series is increased, the VAR model quickly becomes overparameterized, making reliable estimation difficult and impeding its adoption as a forecasting tool in high dimensional settings. A number of authors have sought to address this issue by incorporating regularized approaches, such as the lasso, that impose sparse or low-rank structures on the estimated coefficient parameters of the VAR. More traditional approaches attempt to address overparameterization by selecting a low lag order, based on the assumption that dynamic dependence among components is short-range. However, these methods typically assume a single, universal lag order that applies across all components, unnecessarily constraining the dynamic relationship between the components and impeding forecast performance. The lasso-based approaches are more flexible but do not incorporate the notion of lag order selection. We propose a new class of regularized VAR models, called hierarchical vector autoregression (HVAR), that embed the notion of lag selection into a convex regularizer. The key convex modeling tool is a group lasso with nested groups, which ensures that the sparsity pattern of autoregressive lag coefficients honors the ordered structure inherent to VAR. We provide computationally efficient algorithms for solving HVAR problems that can be parallelized across the components. A simulation study shows the improved performance in forecasting and lag order selection over previous approaches, and a macroeconomic application further highlights forecasting improvements as well as the convenient, interpretable output of an HVAR model.
    Preview · Article · Dec 2014
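    As a small illustration of the nested-groups idea (the group layout only; the convex solver and the specific HVAR variants are omitted), the sketch below enumerates, for one VAR equation, groups in which group l covers every coefficient at lags l through p, so that zeroing group l zeroes all deeper lags and an effective lag order emerges.

    ```python
    import numpy as np

    k, p = 4, 6   # toy sizes: number of component series and maximum lag

    # Coefficients of one VAR equation, stacked lag by lag: (lag-1 block, ..., lag-p block).
    coef_index = np.arange(k * p).reshape(p, k)

    # Nested groups: group l contains every coefficient at lags l, l+1, ..., p, so the groups
    # form a decreasing chain; a group lasso over them yields lag-ordered sparsity patterns.
    nested_groups = [coef_index[l - 1:].ravel() for l in range(1, p + 1)]

    for l, g in enumerate(nested_groups, start=1):
        print(f"group {l}: lags {l}..{p}, {g.size} coefficients")
    ```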
  • Source
    Nicholas A. James · Arun Kejariwal · David S. Matteson
    ABSTRACT: Low latency and high availability of an app or a web service are key, amongst other factors, to the overall user experience (which in turn directly impacts the bottom line). Exogenous and/or endogenous factors often give rise to breakouts in cloud data, which makes maintaining high availability and delivering high performance very challenging. Although there exists a large body of prior research in breakout detection, existing techniques are not suitable for detecting breakouts in cloud data because they are not robust in the presence of anomalies. To this end, we developed a novel statistical technique to automatically detect breakouts in cloud data. In particular, the technique employs Energy Statistics to detect breakouts in both application and system metrics. Further, the technique uses robust statistical metrics, viz., the median, and estimates the statistical significance of a breakout through a permutation test. To the best of our knowledge, this is the first work that addresses breakout detection in the presence of anomalies. We demonstrate the efficacy of the proposed technique using production data and report Precision, Recall, and F-measure. The proposed technique is 3.5 times faster than a state-of-the-art technique for breakout detection and is currently used on a daily basis at Twitter.
    Preview · Article · Nov 2014
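    This is not Twitter's released implementation; the sketch below only combines the two ingredients the abstract highlights, a robust (median-based) shift statistic and a permutation test for significance, on a synthetic series with a level shift and a few spiky anomalies.

    ```python
    import numpy as np

    def median_shift_pvalue(x, split, n_perm=999, seed=0):
        """Permutation p-value for a shift in median at a candidate breakout location."""
        rng = np.random.default_rng(seed)
        observed = abs(np.median(x[split:]) - np.median(x[:split]))
        exceed = 0
        for _ in range(n_perm):
            xp = rng.permutation(x)
            if abs(np.median(xp[split:]) - np.median(xp[:split])) >= observed:
                exceed += 1
        return (exceed + 1) / (n_perm + 1)

    rng = np.random.default_rng(5)
    x = np.r_[rng.normal(0, 1, 300), rng.normal(2, 1, 200)]   # level shift at t = 300
    x[[50, 150, 400]] += 15                                   # spiky anomalies

    # The median-based statistic barely moves because of the spikes, unlike a mean-based one.
    print("p-value at the true breakout:", median_shift_pvalue(x, split=300))
    ```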
  • Source
    Daniel R. Kowal · David S. Matteson · David Ruppert
    ABSTRACT: We present a Bayesian approach for modeling multivariate, dependent functional data. To account for the three dominant structural features in the data--functional, time dependent, and multivariate components--we extend hierarchical dynamic linear models for multivariate time series to the functional data setting. We also develop Bayesian spline theory in a more general constrained optimization framework. The proposed methods identify a time-invariant functional basis for the functional observations, which is smooth and interpretable, and can be made common across multivariate observations for additional information sharing. The Bayesian framework permits joint estimation of the model parameters, provides exact inference (up to MCMC error) on specific parameters, and allows generalized dependence structures. Sampling from the posterior distribution is accomplished with an efficient Gibbs sampling algorithm. We illustrate the proposed framework with two applications: (1) multi-economy yield curve data from the recent global recession, and (2) local field potential brain signals in rats, for which we develop a multivariate functional time series approach for multivariate time-frequency analysis.
    Full-text · Article · Nov 2014
  • Source
    ABSTRACT: We introduce a parsimonious Bayesian approach for modeling spatio-temporal point processes. Our method is motivated by the problem of estimating the spatial distribution of ambulance demand in Toronto, Canada, as it changes over discrete 2-hour intervals; such estimates are critical for fleet management and dynamic deployment. The large-scale datasets typical in ambulance demand estimation exhibit complex spatial and temporal patterns and dynamics. We propose to model this time series of spatial densities by finite Gaussian mixture models. We fix the mixture component distributions across all time periods while letting the mixture weights evolve over time. This allows efficient estimation of the underlying spatial structure, yet enough flexibility to capture dynamics over time. We capture temporal patterns such as seasonality by introducing constraints on the mixture weights; we represent location-specific temporal dynamics by applying a separate autoregressive prior on each mixture weight. While estimation may be performed using a fixed number of mixture components, we also extend to estimate the number of components using birth-and-death Markov chain Monte Carlo. We quantify statistical and operational merits of our method over the current industry practice.
    Full-text · Article · Jan 2014 · Journal of the American Statistical Association
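    The Bayesian machinery (autoregressive priors on weights, birth-and-death MCMC) is well beyond a few lines, but the core structural idea can be sketched crudely with scikit-learn on synthetic data: fit one common set of Gaussian components on pooled data, then let each time period have its own mixture weights via average responsibilities.

    ```python
    import numpy as np
    from sklearn.mixture import GaussianMixture

    rng = np.random.default_rng(6)

    # Toy data: event locations for 12 periods, with one hotspot gaining intensity over time.
    periods = [np.vstack([rng.normal([0, 0], 0.5, size=(100 + 10 * t, 2)),
                          rng.normal([3, 3], 0.5, size=(220 - 10 * t, 2))]) for t in range(12)]

    # Common component locations and shapes, estimated from the pooled data.
    gmm = GaussianMixture(n_components=2, random_state=0).fit(np.vstack(periods))

    # Period-specific mixture weights: average responsibilities of the shared components.
    weights = np.array([gmm.predict_proba(Xt).mean(axis=0) for Xt in periods])
    print(np.round(weights, 2))   # each row sums to 1; the weights drift across periods
    ```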
  • Benjamin B Risk · David S Matteson · David Ruppert · Ani Eloyan · Brian S Caffo
    ABSTRACT: We examine differences between independent component analyses (ICAs) arising from different assumptions, measures of dependence, and starting points of the algorithms. ICA is a popular method with diverse applications including artifact removal in electrophysiology data, feature extraction in microarray data, and identifying brain networks in functional magnetic resonance imaging (fMRI). ICA can be viewed as a generalization of principal component analysis (PCA) that takes into account higher-order cross-correlations. Whereas the PCA solution is unique, there are many ICA methods, whose solutions may differ. Infomax, FastICA, and JADE are commonly applied to fMRI studies, with FastICA being arguably the most popular. Hastie and Tibshirani (2003) demonstrated that ProDenICA outperformed FastICA in simulations with two components. We introduce the application of ProDenICA to simulations with more components and to fMRI data. ProDenICA was more accurate in simulations, and we identified differences between biologically meaningful ICs from ProDenICA versus other methods in the fMRI analysis. ICA methods require nonconvex optimization, yet current practices do not recognize the importance of, nor adequately address sensitivity to, initial values. We found that local optima led to dramatically different estimates in both simulations and group ICA of fMRI, and we provide evidence that the global optimum from ProDenICA is the best estimate. We applied a modification of the Hungarian (Kuhn-Munkres) algorithm to match ICs from multiple estimates, thereby gaining novel insights into how brain networks vary in their sensitivity to initial values and ICA method.
    No preview · Article · Dec 2013 · Biometrics
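    A small sketch of the matching step described at the end of the abstract: pairing independent components across two runs with the Hungarian algorithm, here via scipy.optimize.linear_sum_assignment on an absolute-correlation similarity matrix (the authors' exact modification may differ).

    ```python
    import numpy as np
    from scipy.optimize import linear_sum_assignment

    def match_components(S1, S2):
        """Pair the columns of S1 with the columns of S2 by maximizing total |correlation|."""
        k = S1.shape[1]
        C = np.abs(np.corrcoef(S1, S2, rowvar=False)[:k, k:])   # k x k cross-|correlation| matrix
        rows, cols = linear_sum_assignment(-C)                   # Hungarian algorithm, maximizing
        return cols, C[rows, cols]

    rng = np.random.default_rng(7)
    S = rng.laplace(size=(1000, 5))                      # stand-in for estimated sources, run 1
    perm = rng.permutation(5)
    S_run2 = S[:, perm] * rng.choice([-1, 1], size=5)    # run 2: permuted, sign-flipped copies

    matching, similarity = match_components(S, S_run2)
    # matching[i] = j means run-1 component i pairs with run-2 component j.
    print("correct pairing recovered:", bool(np.array_equal(matching, np.argsort(perm))))
    print("matched |correlations|:", np.round(similarity, 3))
    ```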
  • D.S. Matteson · N.A. James · W.B. Nicholson · L.C. Segalini
    ABSTRACT: The assumption of strict stationarity is too strong for observations in many financial time series applications; however, distributional properties may be at least locally stable in time. We define multivariate measures of homogeneity to quantify local stationarity and an empirical approach for robustly estimating time varying windows of stationarity. Finally, we consider bivariate series that are believed to be cointegrated locally, assess our estimates, and discuss applications in financial asset pairs trading.
    No preview · Conference Paper · Oct 2013
  • Source
    Nicholas A. James · David S. Matteson
    ABSTRACT: There are many different ways in which change point analysis can be performed, from purely parametric methods to those that are distribution free. The ecp package is designed to perform multiple change point analysis while making as few assumptions as possible. While many other change point methods are applicable only for univariate data, this R package is suitable for both univariate and multivariate observations. Estimation can be based upon either a hierarchical divisive or agglomerative algorithm. Divisive estimation sequentially identifies change points via a bisection algorithm. The agglomerative algorithm estimates change point locations by determining an optimal segmentation. Both approaches are able to detect any type of distributional change within the data. This provides an advantage over many existing change point algorithms which are only able to detect changes within the marginal distributions.
    Preview · Article · Sep 2013 · Journal of statistical software
  • Source
    David S. Matteson · Ruey S. Tsay
    ABSTRACT: This paper introduces a novel statistical framework for independent component analysis (ICA) of multivariate data. We propose methodology for estimating and testing the existence of mutually independent components for a given dataset, and a versatile resampling-based procedure for inference. Independent components are estimated by combining a nonparametric probability integral transformation with a generalized nonparametric whitening method that simultaneously minimizes all forms of dependence among the components. U-statistics of certain Euclidean distances between sample elements are combined in succession to construct a statistic for testing the existence of mutually independent components. The proposed measures and tests are based on both necessary and sufficient conditions for mutual independence. When independent components exist, one may apply univariate analysis to study or model each component separately. Univariate models may then be combined to obtain a multivariate model for the original observations. We prove the consistency of our estimator under minimal regularity conditions without assuming the existence of independent components a priori, and all assumptions are placed on the observations directly, not on the latent components. We demonstrate the improvements of the proposed method over competing methods in simulation studies. We apply the proposed ICA approach to two real examples and contrast it with principal component analysis.
    Preview · Article · Jun 2013
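    The estimator itself (a probability integral transform followed by minimizing dependence over rotations) is not reproduced here; the sketch below only implements the squared distance covariance that underlies such dependence measures, and shows it flagging a dependent-but-uncorrelated pair.

    ```python
    import numpy as np
    from scipy.spatial.distance import pdist, squareform

    def dcov_sq(x, y):
        """Squared sample distance covariance between two 1-D samples (V-statistic form)."""
        def doubly_centered(z):
            D = squareform(pdist(z.reshape(-1, 1)))
            return D - D.mean(axis=0) - D.mean(axis=1)[:, None] + D.mean()
        A, B = doubly_centered(x), doubly_centered(y)
        return (A * B).mean()

    rng = np.random.default_rng(8)
    u = rng.standard_normal(500)
    v_indep = rng.standard_normal(500)
    v_dep = u ** 2 + 0.1 * rng.standard_normal(500)   # dependent on u, yet uncorrelated with it

    print("independent pair:", round(dcov_sq(u, v_indep), 4))
    print("dependent pair:  ", round(dcov_sq(u, v_dep), 4))   # noticeably larger
    ```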
  • Source
    David S. Matteson · Nicholas A. James
    ABSTRACT: Change point analysis has applications in a wide variety of fields. The general problem concerns the inference of a change in distribution for a set of time-ordered observations. Sequential detection is an online version in which new data is continually arriving and is analyzed adaptively. We are concerned with the related, but distinct, offline version, in which retrospective analysis of an entire sequence is performed. For a set of multivariate observations of arbitrary dimension, we consider nonparametric estimation of both the number of change points and the positions at which they occur. We do not make any assumptions regarding the nature of the change in distribution or any distribution assumptions beyond the existence of the alpha-th absolute moment, for some alpha in (0,2). Estimation is based on hierarchical clustering and we propose both divisive and agglomerative algorithms. The divisive method is shown to provide consistent estimates of both the number and location of change points under standard regularity assumptions. We compare the proposed approach with competing methods in a simulation study. Methods from cluster analysis are applied to assess performance and to allow simple comparisons of location estimates, even when the estimated number differs. We conclude with applications in genetics, finance and spatio-temporal analysis.
    Preview · Article · Jun 2013 · Journal of the American Statistical Association
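    A stripped-down, univariate sketch of the divisive idea: recursively bisect at the split maximizing a sample-size-scaled energy distance (scipy.stats.energy_distance), stopping when the statistic is small. The published e-divisive procedure instead assesses each split with a permutation test and handles multivariate data; the fixed threshold below is only a stand-in.

    ```python
    import numpy as np
    from scipy.stats import energy_distance

    def best_split(x, min_size=30):
        """Best single split of a 1-D segment under a sample-size-scaled energy distance."""
        best_t, best_q = None, -np.inf
        for t in range(min_size, len(x) - min_size):
            n, m = t, len(x) - t
            q = (n * m / (n + m)) * energy_distance(x[:t], x[t:]) ** 2
            if q > best_q:
                best_t, best_q = t, q
        return best_t, best_q

    def divisive(x, threshold=10.0, min_size=30, offset=0):
        """Recursive bisection; a real implementation replaces `threshold` with a permutation test."""
        if len(x) < 2 * min_size:
            return []
        t, q = best_split(x, min_size)
        if q < threshold:
            return []
        return (divisive(x[:t], threshold, min_size, offset) + [offset + t]
                + divisive(x[t:], threshold, min_size, offset + t))

    rng = np.random.default_rng(9)
    x = np.r_[rng.normal(0, 1, 200), rng.normal(5, 1, 150), rng.normal(0, 1, 150)]
    print("estimated change points:", divisive(x))   # expected near [200, 350]
    ```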
  • Source
    ABSTRACT: We introduce a Bayesian model for estimating the distribution of ambulance travel times on each road segment in a city, using Global Positioning System (GPS) data. Due to sparseness and error in the GPS data, the exact ambulance paths and travel times on each road segment are unknown. We simultaneously estimate the paths, travel times, and parameters of each road segment travel time distribution using Bayesian data augmentation. To draw ambulance path samples, we use a novel reversible jump Metropolis–Hastings step. We also introduce two simpler estimation methods based on GPS speed data. We compare these methods to a recently published travel time estimation method, using simulated data and data from Toronto EMS. In both cases, out-of-sample point and interval estimates of ambulance trip times from the Bayesian method outperform estimates from the alternative methods. We also construct probability-of-coverage maps for ambulances. The Bayesian method gives more realistic maps than the recently published method. Finally, path estimates from the Bayesian method interpolate well between sparsely recorded GPS readings and are robust to GPS location errors.
    Full-text · Article · Jun 2013 · The Annals of Applied Statistics
  • Scott H. Holan · Wen-Hsi Yang · David S. Matteson · Christopher K. Wikle
    ABSTRACT: Identifying periods of recession and expansion is a challenging topic of ongoing interest with important economic and monetary policy implications. Given the current state of the global economy, significant attention has recently been devoted to identifying and forecasting economic recessions. Consequently, we introduce a novel class of Bayesian hierarchical probit models that take advantage of dimension-reduced time–frequency representations of various market indices. The approach we propose can be viewed as a Bayesian mixed frequency data regression model, as it relates high-frequency daily data observed over several quarters to a binary quarterly response indicating recession or expansion. More specifically, our model directly incorporates time–frequency representations of the entire high-dimensional non-stationary time series of daily log returns, over several quarters, as a regressor in a predictive model, while quantifying various sources of uncertainty. The necessary dimension reduction is achieved by treating the time–frequency representation (spectrogram) as an “image” and finding its empirical orthogonal functions. Subsequently, further dimension reduction is accomplished through the use of stochastic search variable selection. Overall, our dimension reduction approach provides an extremely powerful tool for feature extraction, yielding an interpretable image of features that predict recessions. The effectiveness of our model is demonstrated through out-of-sample identification (nowcasting) and multistep-ahead prediction (forecasting) of economic recessions. In fact, our results provide greater than 85% and 80% out-of-sample forecasting accuracy for recessions and expansions respectively, even three quarters ahead. Finally, we illustrate the utility and added value of including time–frequency information from the NASDAQ index when identifying and predicting recessions. Copyright © 2012 John Wiley & Sons, Ltd.
    No preview · Article · Nov 2012 · Applied Stochastic Models in Business and Industry
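    A rough sketch of the feature pipeline on synthetic data: a time–frequency representation (spectrogram) of each quarter's daily returns, flattened and reduced to a few principal components (empirical orthogonal functions), then fed to a binary classifier; plain logistic regression stands in for the Bayesian probit with stochastic search variable selection.

    ```python
    import numpy as np
    from scipy.signal import spectrogram
    from sklearn.decomposition import PCA
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(10)
    n_quarters, days = 80, 63                             # roughly 63 trading days per quarter
    recession = (rng.random(n_quarters) < 0.25).astype(int)

    # Toy daily log returns: recession quarters are noisier, shifting spectral power.
    returns = rng.standard_normal((n_quarters, days)) * (1 + 2 * recession[:, None])

    # One time-frequency "image" per quarter, flattened into a feature vector.
    features = np.array([spectrogram(r, nperseg=16, noverlap=8)[2].ravel() for r in returns])

    eofs = PCA(n_components=10).fit_transform(np.log(features + 1e-12))   # EOF-style reduction
    clf = LogisticRegression(max_iter=1000).fit(eofs, recession)
    print("in-sample accuracy:", round(clf.score(eofs, recession), 2))
    ```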
  • Source
    David S. Matteson · David Ruppert
    ABSTRACT: Economic and financial time series typically exhibit time-varying conditional (given the past) standard deviations and correlations. The conditional standard deviation is also called the volatility. Higher volatilities increase the risk of assets and higher conditional correlations cause an increased risk in portfolios. Therefore, models of time-varying volatilities and correlations are essential for risk management.
    Full-text · Article · Oct 2011 · IEEE Signal Processing Magazine
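    The models surveyed in the article (e.g., multivariate GARCH-type specifications) are not reproduced here; as the simplest illustration of conditional (given the past) second moments, the sketch below computes RiskMetrics-style exponentially weighted volatilities and a time-varying correlation for two synthetic return series.

    ```python
    import numpy as np

    def ewma_cov(returns, lam=0.94):
        """Exponentially weighted (RiskMetrics-style) conditional covariance matrix path."""
        T, k = returns.shape
        covs = np.zeros((T, k, k))
        covs[0] = np.cov(returns[:20].T)                  # warm-up estimate from the first 20 days
        for t in range(1, T):
            r = returns[t - 1][:, None]
            covs[t] = lam * covs[t - 1] + (1 - lam) * (r @ r.T)
        return covs

    rng = np.random.default_rng(11)
    ret = rng.standard_normal((500, 2)) * 0.01            # toy two-asset daily returns
    ret[200:300] *= 3                                     # a volatility burst in mid-sample

    covs = ewma_cov(ret)
    vol1 = np.sqrt(covs[:, 0, 0])                                 # conditional volatility, asset 1
    corr = covs[:, 0, 1] / np.sqrt(covs[:, 0, 0] * covs[:, 1, 1])  # conditional correlation
    print("average volatility, calm vs. burst:",
          vol1[:150].mean().round(4), vol1[230:300].mean().round(4))
    ```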

Publication Stats

134 Citations
29.48 Total Impact Points

Institutions

  • 2011-2014
    • Cornell University
      • Department of Statistical Science
      • Operations Research and Information Engineering
      Ithaca, New York, United States
    • University of Minnesota Duluth
      Duluth, Minnesota, United States