Rob J Hyndman · Monash University (Australia) · Department of Econometrics and Business Statistics
BSc (Hons), PhD, AStat
About
353 Publications · 373,962 Reads · 40,964 Citations
Introduction
Copies of papers are available on my personal website at robjhyndman.com.
Additional affiliations
January 1995 - present
Publications (353)
Data-driven organizations around the world routinely use forecasting methods to improve their planning and decision-making capabilities. Although much research exists on the harms resulting from traditional machine learning applications, little has specifically focused on the ethical impact of time series forecasting. Yet forecasting raises unique...
In this paper, we propose a novel approach to improving forecasts of stock market indexes by considering common stock prices as hierarchical time series, combining clustering with forecast reconciliation. We propose grouping the individual stock price series in various ways including via metadata and using unsupervised learning techniques. The prop...
Detecting anomalies in a temporal sequence of graphs can be applied in areas such as the detection of accidents in transport networks and cyber attacks in computer networks. Existing methods for detecting abnormal graphs can suffer from multiple limitations, such as high false positive rates as well as difficulties with handling variable-sized grap...
A novel forecast linear augmented projection (FLAP) method is introduced, which reduces the forecast error variance of any unbiased multivariate forecast without introducing bias. The method first constructs new component series which are linear combinations of the original series. Forecasts are then generated for both the original and component se...
Accurate forecasts of ambulance demand are crucial inputs when planning and deploying staff and fleet. Such demand forecasts are required at national, regional, and sub-regional levels and must take account of the nature of incidents and their priorities. These forecasts are often generated independently by different teams within the organization....
One of the most challenging aspects for managers when building a forecasting system is choosing how to aggregate the data at different levels. This is frequently done without the manager knowing how these choices can compromise the system's accuracy. This article illustrates these compromises by comparing different structures and aggregation criter...
This paper discusses the use of forecast reconciliation with stock price time series and the corresponding stock index. The individual stock price series may be grouped using known meta-data or other clustering methods. We propose a novel forecasting framework that combines forecast reconciliation and clustering, to lead to better forecasts of both...
Time series often reflect variation associated with other related variables. Controlling for the effect of these variables is useful when modeling or analysing the time series. We introduce a novel approach to normalize time series data conditional on a set of covariates. We do this by modeling the conditional mean and the conditional variance of t...
Forecast reconciliation is a post-forecasting process that involves transforming a set of incoherent forecasts into coherent forecasts which satisfy a given set of linear constraints for a multivariate time series. In this paper we extend the current state-of-the-art cross-sectional probabilistic forecast reconciliation approach to encompass a cros...
Features of time series are useful in identifying suitable models for forecasting. We present a general framework, labelled FFORMS (Feature‐based FORecast Model Selection), which selects forecast models based on features calculated from each time series. The FFORMS framework builds a mapping that relates the features of a time series to the “best”...
Forecast combinations have flourished remarkably in the forecasting community and, in recent years, have become part of the mainstream of forecasting research and activities. Combining multiple forecasts produced from single (target) series is now widely used to improve accuracy through the integration of information gleaned from different sources,...
Detecting anomalies from a series of temporal networks has many applications, including road accidents in transport networks and suspicious events in social networks. While there are many methods for network anomaly detection, statistical methods are underutilised in this space even though they have a long history and proven capability in handling...
Global forecasting models (GFMs) that are trained across a set of multiple time series have shown superior results in many forecasting competitions and real-world applications compared with univariate forecasting approaches. One aspect of the popularity of statistical forecasting models such as ETS and ARIMA is their relative simplicity and interpr...
We develop a framework for forecasting multivariate data that follow known linear constraints. This is particularly common in forecasting where some variables are aggregates of others, commonly referred to as hierarchical time series, but also arises in other prediction settings. For point forecasting, an increasingly popular technique is reconcili...
Providing forecasts for ultra-long time series plays a vital role in various activities, such as investment decisions, industrial production arrangements, and farm management. This paper develops a novel distributed forecasting framework to tackle the challenges of forecasting ultra-long time series using the industry-standard MapReduce framework....
Forecast combinations have flourished remarkably in the forecasting community and, in recent years, have become part of the mainstream of forecasting research and activities. Combining multiple forecasts produced from the single (target) series is now widely used to improve accuracy through the integration of information gleaned from different sour...
The COVID-19 pandemic has had a devastating effect on many industries around the world, including tourism, and policy makers are interested in mapping out what the recovery path will look like. We propose a novel statistical methodology for generating scenario-based probabilistic forecasts based on a large survey of 443 tourism experts and stakeholde...
Model selection has been proven an effective strategy for improving accuracy in time series forecasting applications. However, when dealing with hierarchical time series, apart from selecting the most appropriate forecasting model, forecasters have also to select a suitable method for reconciling the base forecasts produced for each series to make...
We propose a new method for decomposing seasonal data: a seasonal-trend decomposition using regression (STR). Unlike other decomposition methods, STR allows for multiple seasonal and cyclic components, covariates, seasonal patterns that may have noninteger periods, and seasonality with complex topology. It can be used for time series with any regul...
In situ sensors that collect high-frequency data are used increasingly to monitor aquatic environments. These sensors are prone to technical errors, resulting in unrecorded observations and/or anomalous values that are subsequently removed and create gaps in time series data. We present a framework based on generalized additive and auto-regressive...
Global Forecasting Models (GFM) that are trained across a set of multiple time series have shown superior results in many forecasting competitions and real-world applications compared with univariate forecasting approaches. One aspect of the popularity of statistical forecasting models such as ETS and ARIMA is their relative simplicity and interpre...
This paper introduces lookout, a new approach to detect outliers using leave-one-out kernel density estimates and extreme value theory. Outlier detection methods that use kernel density estimates generally employ a user defined parameter to determine the bandwidth. Lookout uses persistent homology to construct a bandwidth suitable for outlier detec...
Forecast evaluation plays a key role in how empirical evidence shapes the development of the discipline. Domain experts are interested in error measures relevant for their decision making needs. Such measures may produce unreliable results. Although reliability properties of several metrics have already been discussed, it has hardly been quantified...
The decomposition of time series into components is an important task that helps to understand time series and can enable better forecasting. Nowadays, with high sampling rates leading to high-frequency data (such as daily, hourly, or minutely data), many real-world datasets contain time series data that can exhibit multiple seasonal patterns. Alth...
Organizations such as government departments and financial institutions provide online service facilities accessible via an increasing number of internet connected devices which make their operational environment vulnerable to cyber attacks. Consequently, there is a need to have mechanisms in place to detect cyber security attacks in a timely manne...
Over the last 15 years, studies on hierarchical forecasting have moved away from single-level approaches towards proposing linear combination approaches across multiple levels of the hierarchy. Such combinations offer coherent reconciled forecasts, improved forecasting performance and aligned decision-making. This paper proposes a novel hierarchica...
Forecasting hierarchical or grouped time series using a reconciliation approach involves two steps: computing base forecasts and reconciling the forecasts. Base forecasts can be computed by popular time series forecasting methods such as Exponential Smoothing (ETS) and Autoregressive Integrated Moving Average (ARIMA) models. The reconciliation step...
Deconstructing a time index into time granularities can assist in exploration and automated analysis of large temporal data sets. This paper describes classes of time deconstructions using linear and cyclic time granularities. Linear granularities respect the linear progression of time such as hours, days, weeks and months. Cyclic granularities can...
Real time monitoring using in situ sensors is becoming a common approach for measuring water quality within watersheds. High frequency measurements produce big data sets that present opportunities to conduct new analyses for improved understanding of water quality dynamics and more effective management of rivers and streams. Of primary importance i...
Global methods that fit a single forecasting method to all time series in a set have recently shown surprising accuracy, even when forecasting large groups of heterogeneous time series. We provide the following contributions that help understand the potential and applicability of global methods and how they relate to traditional local methods that...
Functional autoregressive models are popular for functional time series analysis, but the standard formulation fails to address seasonal behaviour in functional time series data. To overcome this shortcoming, we introduce seasonal functional autoregressive time series models. For the model of order one, we derive sufficient stationarity conditions...
We forecast the old‐age dependency ratio for Australia under various pension age proposals, and estimate a pension age scheme that will provide a stable old‐age dependency ratio at a specified level. Our approach involves a stochastic population forecasting method based on coherent functional data models for mortality, fertility and net migration,...
Many businesses and industries nowadays rely on large quantities of time series data making time series forecasting an important research area. Global forecasting models that are trained across sets of time series have shown a huge potential in providing accurate forecasts compared with the traditional univariate forecasting models that work on iso...
Manifold learning algorithms are valuable tools for the analysis of high-dimensional data, many of which include a step where nearest neighbors of all observations are found. This can present a computational bottleneck when the number of observations is large or when the observations lie in more general metric spaces, such as statistical manifolds,...
A geometric interpretation is developed for so-called reconciliation methodologies used to forecast time series that adhere to known linear constraints. In particular, a general framework is established that nests many existing popular reconciliation methods within the class of projections. This interpretation facilitates the derivation of novel th...
This paper proposes a novel forecast reconciliation framework using Bayesian state-space methods. It allows for the joint reconciliation at all forecast horizons and uses predictive distributions rather than past variation of forecast errors. Informative priors are used to assign weights to specific predictions, which makes it possible to reconcile...
We propose two new general methods for decomposing seasonal time series data: STR (a Seasonal-Trend decomposition procedure based on Regression) and Robust STR. In some ways, STR is similar to Ridge Regression, and Robust STR is related to LASSO. These new methods are more general than any other alternative time series decomposition methods; they a...
The sum of forecasts of disaggregated time series is often required to equal the forecast of the aggregate, giving a set of coherent forecasts. The least squares solution for finding coherent forecasts uses a reconciliation approach known as MinT, proposed by Wickramasuriya, Athanasopoulos, and Hyndman (2019). The MinT approach and its variants do...
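The coherence requirement described in the abstract above can be illustrated with a minimal sketch. It assumes a hypothetical two-level hierarchy (Total = A + B) and made-up base forecasts, and it uses an ordinary least squares projection onto the coherent subspace. This is the special case of MinT with an identity weight matrix, not the full MinT estimator or the extension the paper develops; the names `S`, `y_hat` and `reconcile_ols` are illustrative only.

```python
import numpy as np

# Hypothetical summing matrix for the hierarchy Total = A + B.
# Rows: Total, A, B. Columns: bottom-level series A, B.
S = np.array([[1.0, 1.0],
              [1.0, 0.0],
              [0.0, 1.0]])

# Made-up, incoherent base forecasts for (Total, A, B): 40 + 50 != 100.
y_hat = np.array([100.0, 40.0, 50.0])

def reconcile_ols(S, y_hat):
    """Project base forecasts onto the coherent subspace spanned by S.

    Solves min_b ||y_hat - S b||^2 and returns S b, i.e. the OLS
    reconciliation (identity weights), not a covariance-weighted MinT.
    """
    b, *_ = np.linalg.lstsq(S, y_hat, rcond=None)
    return S @ b

y_tilde = reconcile_ols(S, y_hat)
print(y_tilde)  # reconciled forecasts for (Total, A, B)
```

After reconciliation the first element equals the sum of the other two, so the forecasts are coherent by construction.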
This paper introduces DOBIN, a new approach to select a set of basis vectors tailored for outlier detection. DOBIN has a simple mathematical foundation and can be used as a dimension reduction tool for outlier detection tasks. We demonstrate the effectiveness of DOBIN on an extensive data repository, by comparing the performance of outlier detectio...
The HDoutliers algorithm is a powerful unsupervised algorithm for detecting anomalies in high-dimensional data, with a strong theoretical foundation. However, it suffers from some limitations that significantly hinder its performance level, under certain circumstances. In this article, we propose an algorithm that addresses these limitations. We de...
This paper investigates event extraction and early event classification in contiguous spatio-temporal data streams, where events need to be classified using partial information, i.e. while the event is ongoing. The framework incorporates an event extraction algorithm and an early event classification algorithm. We apply this framework to synthetic...
Forecasting groups of time series is of increasing practical importance, e.g. forecasting the demand for multiple products offered by a retailer or server loads within a data center. The local approach to this problem considers each time series separately and fits a function or model to each series. The global approach fits a single function to all...
Providing forecasts for ultra-long time series plays a vital role in various activities, such as investment decisions, industrial production arrangements, and farm management. This paper develops a novel distributed forecasting framework to tackle challenges associated with forecasting ultra-long time series by utilizing the industry-standard MapRe...
We examine the relationships between electoral socio‐demographic characteristics and two‐party preferences in the six Australian federal elections held between 2001 and 2016. Socio‐demographic information is derived from the Australian Census which occurs every 5 years. Since a census is not directly available for each election, an imputation metho...
Hierarchical forecasting methods have been widely used to support aligned decision-making by providing coherent forecasts at different aggregation levels. Traditional hierarchical forecasting approaches, such as the bottom-up and top-down methods, focus on a particular aggregation level to anchor the forecasts. During the past decades, these have b...
The explosion of time series data in recent years has brought a flourish of new time series analysis methods, for forecasting, clustering, classification and other tasks. The evaluation of these new methods requires either collecting or simulating a diverse set of time series benchmarking data to enable reliable comparisons against alternative appr...
This paper demonstrates that the performance of various outlier detection methods is sensitive to both the characteristics of the dataset, and the data normalization scheme employed. To understand these dependencies, we formally prove that normalization affects the nearest neighbor structure, and density of the dataset; hence, affecting which obser...
Decisions regarding the supply of electricity across a power grid must take into consideration the inherent uncertainty in demand. Optimal decision-making requires probabilistic forecasts for demand in a hierarchy with various levels of aggregation, such as substations, cities and regions. The forecasts should be coherent in the sense that the fore...
Calendars are broadly used in society to display temporal information and events. This paper describes a new calendar display for plotting data, that includes a layout algorithm with many options, and faceting functionality. The functions use modular arithmetic on the date variable to restructure the data into a calendar format. The user can apply...
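As a rough illustration of how modular arithmetic on a date can place an observation in a calendar grid, the sketch below maps a date to a (row, column) cell within its month. It is an assumed, simplified layout (weeks as rows, Monday as the first column), not the layout algorithm described in the paper; `calendar_cell` is a hypothetical helper name.

```python
from datetime import date

def calendar_cell(d: date) -> tuple[int, int]:
    """Return the (week_row, weekday_col) of a date within its month grid.

    Assumes weeks start on Monday (column 0) and the first, possibly
    partial, week of the month is row 0.
    """
    first_weekday = date(d.year, d.month, 1).weekday()  # Monday = 0
    offset = first_weekday + d.day - 1                  # cells since grid start
    return offset // 7, offset % 7                      # integer division, remainder

# Example: every day of a month lands in a distinct cell of a 7-column grid.
cells = [calendar_cell(date(2024, 1, day)) for day in range(1, 32)]
```

Data for each day could then be drawn inside its cell, which is one plausible way to turn a long time series into a calendar-shaped display.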
We demonstrate the utility of predicting the whole distribution of an outcome rather than a marginal change. We overcome inconsistent data modelling techniques in a real world problem. A model based on additive quantile regression and boosting was used to predict the whole distribution of length of hospital stay (LOS) following colorectal cancer su...
Accurate forecasts of macroeconomic variables are crucial inputs into the decisions of economic agents and policy makers. Exploiting inherent aggregation structures of such variables, we apply forecast reconciliation methods to generate forecasts that are coherent with the aggregation constraints. We generate both point and probabilistic forecasts...
This paper investigates longevity inequality across U.S. states by modelling and forecasting mortality rates via a forecast reconciliation approach. Understanding the heterogeneity in state-level mortality experience is of fundamental importance, as. A key challenge of multi-population mortality modeling is the curse of dimensionality, and the result...
Hierarchical forecasting (HF) is needed in many situations in the supply chain (SC) because managers often need different levels of forecasts at different levels of SC to make a decision. Top-Down (TD), Bottom-Up (BU) and Optimal Combination (COM) are common HF models. These approaches are static and often ignore the dynamics of the series while di...
Mining temporal data for information is often inhibited by a multitude of formats: regular or irregular time intervals, point events that need aggregating, multiple observational units or repeated measurements on multiple individuals, and heterogeneous data types. This work presents a cohesive and conceptual framework for organizing and manipulatin...
Outliers due to technical errors in water quality data from in situ sensors can reduce data quality and have a direct impact on inference drawn from subsequent data analysis. However, outlier detection through manual monitoring is infeasible given the volume and velocity of data the sensors produce. Here we introduce an automated procedure, named o...
Objective: Length of hospital stay (LOS) is considered a vital component for successful colorectal surgery treatment. Evidence of an association between hospital surgery volume and LOS has been mixed. Data modelling techniques may give inconsistent results that adversely impact conclusions. This study applied techniques to overcome possible modell...
This paper introduces DOBIN, a new approach to select a set of basis vectors tailored for outlier detection. DOBIN has a solid mathematical foundation and can be used as a dimension reduction tool for outlier detection tasks. We demonstrate the effectiveness of DOBIN on an extensive data repository, by comparing the performance of outlier detection...
This paper provides a non-systematic review of the progress of forecasting in social settings. It is aimed at someone outside the field of forecasting who wants to understand and appreciate the results of the M4 Competition, and forms a survey paper regarding the state of the art of this discipline. It discusses the recorded improvements in forecas...
We propose an automated method for obtaining weighted forecast combinations using time series features. The proposed approach involves two phases. First, we use a collection of time series to train a meta-model for assigning weights to various possible forecasting methods with the goal of minimizing the average forecasting loss obtained from a weig...
Water-quality monitoring in rivers often focuses on the concentrations of sediments and nutrients, constituents that can smother biota and cause eutrophication. However, the physical and economic constraints of manual sampling prohibit data collection at the frequency required to adequately capture the variation in concentrations through time. Here...
This paper demonstrates how machine learning is used to measure energy savings from energy conservation measures (ECMs); in particular ECMs with a low expected energy saving. We develop a model that predicts energy consumption in buildings on an hourly level. The model is trained on energy data from the main meter before the ECMs took place. The mo...
Forecasting competitions are now so widespread that it is often forgotten how controversial they were when first held, and how influential they have been over the years. I briefly review the history of forecasting competitions, and discuss what we have learned about their design and implementation, and what they can tell us about forecasting. I als...
This article proposes a framework that provides early detection of anomalous series within a large collection of non-stationary streaming time series data. We define an anomaly as an observation that is very unlikely given the recent distribution of a given system. The proposed framework first calculates a boundary for the system’s typical behavior...
A popular approach to forecasting macroeconomic variables is to utilize a large number of predictors. Several regularization and shrinkage methods can be used to exploit such high-dimensional datasets, and have been shown to improve forecast accuracy for the US economy. To assess whether similar results hold for economies with different characteris...