Conference Paper

Stop&Hop: Early Classification of Irregular Time Series

... If prediction is triggered, the hidden representation given by the RNN is sent to a Discriminator, whose role is to predict a class, given this representation. The model has been adapted to deal with irregularly sampled time series [57]. The authors of [58] extend the ECTS framework to channel filtering, also using Reinforcement Learning. ...
Preprint
Full-text available
Habilitation to Direct Research manuscript (HDR)
... Among the separable approaches, EARLIEST and its variants [13][14][15] train three modules jointly: an encoder, a triggering agent, and a discriminator, the latter making the classification decision. Here, F represents the time series embedding produced by the RNN-based encoder, the triggering agent decides when to halt, and ℎ thus includes both the encoder and discriminator. ...
Preprint
Full-text available
Early Classification of Time Series (ECTS) has been recognized as an important problem in many areas where decisions have to be taken as soon as possible, before all the data is available, while time pressure increases. Numerous ECTS approaches have been proposed, based on different triggering functions, each taking into account various pieces of information related to the incoming time series and/or the output of a classifier. Although their performances have been empirically compared in the literature, no studies have been carried out on the optimality of these triggering functions that involve "man-tailored" decision rules. Based on the same information, could there be better triggering functions? This paper presents one way to investigate this question by showing first how to translate ECTS problems into Reinforcement Learning (RL) ones, where the very same information is used in the state space. A thorough comparison of the performance obtained by "handmade" approaches and their "RL-based" counterparts has been carried out. A second question investigated in this paper is whether a different combination of information, defining the state space in RL systems, can achieve even better performance. Experiments show that the system we describe, called Alert, significantly outperforms its state-of-the-art competitors on a large number of datasets.
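To make the ECTS-to-RL translation concrete, the following is a minimal illustrative sketch (not the paper's Alert formulation): the state combines the current timestep with a classifier's class probabilities on the observed prefix, the actions are wait/stop, and the reward trades accuracy against a per-step delay penalty. The random halting policy, the toy classifier, and the delay_cost value are placeholder assumptions.

    import numpy as np

    def ects_episode(series, label, classifier_proba, delay_cost=0.01, rng=np.random.default_rng(0)):
        """Run one illustrative ECTS episode with a placeholder wait/stop policy.

        state   : (current timestep t, class-probability vector on the prefix series[:t])
        actions : 0 = wait for one more observation, 1 = stop and classify now
        reward  : +1 for a correct final prediction, 0 otherwise, minus delay_cost per step waited
        """
        T = len(series)
        for t in range(1, T + 1):
            probs = classifier_proba(series[:t])           # classifier output on the prefix
            stop = (t == T) or (rng.random() < 0.5)        # placeholder policy: random halting
            if stop:
                prediction = int(np.argmax(probs))
                reward = float(prediction == label) - delay_cost * t
                return prediction, t, reward

    # Toy usage: a "classifier" that thresholds the running mean of the prefix.
    toy_series = np.array([0.1, 0.3, 0.9, 1.2, 1.1])
    toy_proba = lambda prefix: np.array([1.0, 0.0]) if prefix.mean() < 0.5 else np.array([0.0, 1.0])
    print(ects_episode(toy_series, label=1, classifier_proba=toy_proba))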
... If prediction is triggered, the hidden representation given by the RNN is sent to a Discriminator, whose role is to predict a class, given this representation. The model has been extended for irregularly sampled time series (Hartvigsen et al., 2022). Pantiskas et al. (2023) extend the ECTS framework to channel filtering, also using Reinforcement Learning. ...
Preprint
Full-text available
In many situations, the measurements of a studied phenomenon are provided sequentially, and the prediction of its class needs to be made as early as possible so as not to incur too high a time penalty, but not too early and risk paying the cost of misclassification. This problem has been particularly studied in the case of time series, and is known as Early Classification of Time Series (ECTS). Although it has been the subject of a growing body of literature, there is still a lack of a systematic, shared evaluation protocol to compare the relative merits of the various existing methods. This document begins by situating these methods within a principle-based taxonomy. It defines dimensions for organizing their evaluation, and then reports the results of a very extensive set of experiments along these dimensions involving nine state-of-the-art ECTS algorithms. In addition, these and other experiments can be carried out using an open-source library in which most of the existing ECTS algorithms have been implemented (see https://github.com/ML-EDM/ml_edm).
Article
Early classification of longitudinal data remains an active area of research today. The complexity of these datasets and the high rates of missing data caused by irregular sampling present data-level challenges for the Early Longitudinal Data Classification (ELDC) problem. Coupled with the algorithmic challenge of optimising the opposing objectives of early classification (i.e., earliness and accuracy), ELDC becomes a non-trivial task. Inspired by the generative power and utility of the Generative Adversarial Network (GAN), we propose a novel context-conditional, longitudinal early classifier GAN (LEC-GAN). This model utilises informative missingness, static features, and earlier observations to improve the ELDC objective. It achieves this by incorporating ELDC as an auxiliary task within an imputation optimization process. Our experiments on several datasets demonstrate that LEC-GAN outperforms all relevant baselines in terms of F1 scores while increasing the earliness of prediction.
Article
Time series data are ubiquitous in a variety of disciplines. Early classification of time series, which aims to predict the class label of a time series as early and accurately as possible, is a significant but challenging task in many time-sensitive applications. Existing approaches mainly utilize heuristic stopping rules to capture stopping signals from the prediction results of time series classifiers. However, heuristic stopping rules can only capture obvious stopping signals, which makes these approaches give either correct but late predictions or early but incorrect predictions. To tackle the problem, we propose a novel second-order confidence network for early classification of time series, which can automatically learn to capture implicit stopping signals in early time series in a unified framework. The proposed model leverages deep neural models to capture temporal patterns and outputs second-order confidence to reflect the implicit stopping signals. Specifically, our model exploits not only the data at each time step but also the sequence of predicted probabilities to capture stopping signals. By combining stopping signals from the classifier output and the second-order confidence, we design a more robust trigger to decide whether or not to request more observations from future time steps. Experimental results show that our approach can achieve superior performance in early classification compared to state-of-the-art approaches.
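As a rough illustration of combining the first-order classifier output with a signal derived from the probability sequence, here is a hand-written trigger; it is not the paper's learned second-order confidence network, and the two thresholds are arbitrary placeholders.

    import numpy as np

    def margin_stability_trigger(prob_history, margin_thr=0.3, drift_thr=0.05):
        """Illustrative trigger combining two signals:
        - the current margin between the top-2 class probabilities (classifier output), and
        - how much the probability vector has drifted since the previous step
          (a crude stand-in for a learned "second-order" confidence).
        Returns True when the margin is large and the probabilities have stabilized.
        """
        probs = np.asarray(prob_history)          # shape (t, n_classes), one row per timestep seen
        top2 = np.sort(probs[-1])[-2:]
        margin = top2[1] - top2[0]
        drift = np.abs(probs[-1] - probs[-2]).max() if len(probs) > 1 else 1.0
        return margin >= margin_thr and drift <= drift_thr

    history = [[0.6, 0.4], [0.75, 0.25], [0.78, 0.22]]
    print(margin_stability_trigger(history))      # True: confident and stable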
Preprint
Full-text available
Irregularly-sampled time series (ITS) are native to high-impact domains like healthcare, where measurements are collected over time at uneven intervals. However, for many classification problems, only small portions of long time series are often relevant to the class label. In this case, existing ITS models often fail to classify long series since they rely on careful imputation, which easily over- or under-samples the relevant regions. Using this insight, we propose CAT, a model that classifies multivariate ITS by explicitly seeking highly-relevant portions of an input series' timeline. CAT achieves this by integrating three components: (1) A Moment Network learns to seek relevant moments in an ITS's continuous timeline using reinforcement learning. (2) A Receptor Network models the temporal dynamics of both observations and their timing localized around predicted moments. (3) A recurrent Transition Model models the sequence of transitions between these moments, cultivating a representation with which the series is classified. Using synthetic and real data, we find that CAT outperforms ten state-of-the-art methods by finding short signals in long irregular time series.
Article
Full-text available
The attention mechanism is crucial for sequential learning, where a wide range of applications have been successfully developed. This mechanism is trained to spotlight the regions of interest in the hidden states of sequence data. Most attention methods compute the attention score by relating a query to a sequence in which the discrete-time state trajectory is represented. Such discrete-time attention cannot directly attend to the continuous-time trajectory represented via a neural differential equation (NDE) combined with a recurrent neural network. This paper presents a new continuous-time attention method for sequential learning which is tightly integrated with NDE to construct an attentive continuous-time state machine. The continuous-time attention is performed at all times over the hidden states for different kinds of irregular time signals. The missing information in sequence data due to sampling loss, especially in the presence of long sequences, can be seamlessly compensated and attended in learning the representation. Experiments on irregular sequence samples from human activities, dialogue sentences and medical features show the merits of the proposed continuous-time attention for activity recognition, sentiment classification and mortality prediction, respectively.
Article
Full-text available
Screening programs must balance the benefit of early detection with the cost of overscreening. Here, we introduce a novel reinforcement learning-based framework for personalized screening, Tempo, and demonstrate its efficacy in the context of breast cancer. We trained our risk-based screening policies on a large screening mammography dataset from Massachusetts General Hospital (MGH; USA) and validated the resulting policies on held-out patients from MGH and on external datasets from Emory University (Emory; USA), Karolinska Institute (Karolinska; Sweden) and Chang Gung Memorial Hospital (CGMH; Taiwan). Across all test sets, we find that the Tempo policy combined with an image-based artificial intelligence (AI) risk model is significantly more efficient than current regimens used in clinical practice in terms of simulated early detection per screen frequency. Moreover, we show that the same Tempo policy can be easily adapted to a wide range of possible screening preferences, allowing clinicians to select their desired trade-off between early detection and screening costs without training new policies. Finally, we demonstrate that Tempo policies based on AI-based risk models outperform Tempo policies based on less accurate clinical risk models. Altogether, our results show that pairing AI-based risk models with agile AI-designed screening policies has the potential to improve screening programs by advancing early detection while reducing overscreening.
Article
Full-text available
Because diseases and symptoms differ, patients usually visit hospitals irregularly and different physiological variables are examined at each visit, producing large amounts of irregular multivariate time series (IMTS) data with missing values and varying intervals. Existing methods process IMTS into regular data so that standard machine learning models can be employed. However, time intervals are usually determined by the status of patients, while missing values are caused by changes in symptoms. Therefore, we propose a novel end-to-end Dual-Attention Time-Aware Gated Recurrent Unit (DATA-GRU) for IMTS to predict the mortality risk of patients. In particular, DATA-GRU is able to: 1) preserve the informative varying intervals by introducing a time-aware structure to directly adjust the influence of the previous status in coordination with the elapsed time, and 2) tackle missing values by proposing a novel dual-attention structure to jointly consider data-quality and medical-knowledge. A novel unreliability-aware attention mechanism is designed to handle the diversity in the reliability of different data, while a new symptom-aware attention mechanism is proposed to extract medical reasons from original clinical records. Extensive experimental results on two real-world datasets demonstrate that DATA-GRU can significantly outperform state-of-the-art methods and provide meaningful clinical interpretation.
Article
Full-text available
Background: For real-time monitoring of hospital patients, high-quality inference of patients' health status using all information available from clinical covariates and lab test results is essential to enable successful medical interventions and improve patient outcomes. Developing a computational framework that can learn from observational large-scale electronic health records (EHRs) and make accurate real-time predictions is a critical step. In this work, we develop and explore a Bayesian nonparametric model based on multi-output Gaussian process (GP) regression for hospital patient monitoring. Methods: We propose MedGP, a statistical framework that incorporates 24 clinical covariates and supports a rich reference data set from which relationships between observed covariates may be inferred and exploited for high-quality inference of patient state over time. To do this, we develop a highly structured sparse GP kernel to enable tractable computation over tens of thousands of time points while estimating correlations among clinical covariates, patients, and periodicity in patient observations. MedGP has a number of benefits over current methods, including (i) not requiring an alignment of the time series data, (ii) quantifying confidence regions in the predictions, (iii) exploiting a vast and rich database of patients, and (iv) inferring interpretable relationships among clinical covariates. Results: We evaluate and compare results from MedGP on the task of online prediction for three patient subgroups from two medical data sets across 8,043 patients. We find MedGP improves online prediction over baseline and state-of-the-art methods for nearly all covariates across different disease subgroups and hospitals. Conclusions: The MedGP framework is robust and efficient in estimating the temporal dependencies from sparse and irregularly sampled medical time series data for online prediction. The publicly available code is at https://github.com/bee-hive/MedGP .
Article
Full-text available
Many multivariate time series data in practical applications, such as health care, geoscience, and biology, are characterized by a variety of missing values. It has been noted that the missing patterns and values are often correlated with the target labels, a.k.a., missingness is informative, and there is significant interest to explore methods which model them for time series prediction and other related tasks. In this paper, we develop novel deep learning models based on Gated Recurrent Units (GRU), a state-of-the-art recurrent neural network, to handle missing observations. Our model takes two representations of missing patterns, i.e., masking and time duration, and effectively incorporates them into a deep model architecture so that it not only captures the long-term temporal dependencies in time series, but also utilizes the missing patterns to improve the prediction results. Experiments of time series classification tasks on real-world clinical datasets (MIMIC-III, PhysioNet) and synthetic datasets demonstrate that our models achieve state-of-the-art performance on these tasks and provide useful insights for time series with missing values.
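For concreteness, here is a minimal sketch of the two missing-data representations mentioned above (a binary mask and the time elapsed since each variable was last observed); the recursion follows the common GRU-D-style convention and may differ in detail from the paper.

    import numpy as np

    def masking_and_delta(values, times):
        """Build the two representations: mask (1 = observed) and elapsed time
        since each variable was last observed. values has shape (T, D) with NaN
        for missing entries; times has shape (T,) with observation timestamps.
        """
        values = np.asarray(values, dtype=float)
        times = np.asarray(times, dtype=float)
        mask = (~np.isnan(values)).astype(float)
        T, D = values.shape
        delta = np.zeros((T, D))
        for t in range(1, T):
            gap = times[t] - times[t - 1]
            # if variable d was observed at t-1 the gap resets; otherwise it accumulates
            delta[t] = gap + (1.0 - mask[t - 1]) * delta[t - 1]
        return mask, delta

    vals = [[1.0, np.nan], [np.nan, 2.0], [0.5, np.nan]]
    ts = [0.0, 1.5, 4.0]
    m, d = masking_and_delta(vals, ts)
    print(m)
    print(d)   # delta accumulates across steps where a variable was missing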
Article
Full-text available
The goal of early classification of time series is to predict the class value of a sequence early in time, when its full length is not yet available. This problem arises naturally in many contexts where the data is collected over time and the label predictions have to be made as soon as possible. In this work, a method based on probabilistic classifiers is proposed for the problem of early classification of time series. An important feature of this method is that, in its learning stage, it discovers the timestamps in which the prediction accuracy for each class begins to surpass a pre-defined threshold. This threshold is defined as a percentage of the accuracy that would be obtained if the full series were available, and it is defined by the user. The class predictions for new time series will only be made in these timestamps or later. Furthermore, when applying the model to a new time series, a class label will only be provided if the difference between the two largest predicted class probabilities is higher than or equal to a certain threshold, which is calculated in the training step. The proposal is validated on 45 benchmark time series databases and compared with several state-of-the-art methods, and obtains superior results in both earliness and accuracy. In addition, we show the practical applicability of our method for a real-world problem: the detection and identification of bird calls in a biodiversity survey scenario.
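A small illustrative sketch of the two-part decision rule described above (a per-class earliest reliable timestamp, plus a gap test between the two largest predicted probabilities); the timestamp and threshold values here are placeholders, not the ones discovered during training.

    import numpy as np

    def reliable_prediction(probs, min_timestamp, t, diff_threshold):
        """Return a class label only if (i) the current timestamp t has reached the
        earliest timestamp at which predictions are considered reliable, and
        (ii) the gap between the two largest predicted probabilities reaches the
        threshold; otherwise return None and keep waiting.
        """
        probs = np.asarray(probs, dtype=float)
        if t < min_timestamp:
            return None
        order = np.argsort(probs)[::-1]
        if probs[order[0]] - probs[order[1]] >= diff_threshold:
            return int(order[0])
        return None

    print(reliable_prediction([0.2, 0.7, 0.1], min_timestamp=5, t=8, diff_threshold=0.3))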
Article
Full-text available
Learning temporal causal structures between time series is one of the key tools for analyzing time series data. In many real-world applications, we are confronted with irregular time series, whose observations are not sampled at equally-spaced time stamps. The irregularity in sampling intervals violates the basic assumptions behind many models for structure learning. In this paper, we propose a nonparametric generalization of the Granger graphical models called Generalized Lasso Granger (GLG) to uncover the temporal dependencies from irregular time series. Via theoretical analysis and extensive experiments, we verify the effectiveness of our model. Furthermore, we apply GLG to the application dataset of δ18O isotope of Oxygen records in Asia and achieve promising results in discovering the moisture transportation patterns over an 800-year period.
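For orientation, the regular-sampling Lasso-Granger objective that GLG generalizes can be written as below; the kernel-weighted interpolation term is a simplified stand-in for the paper's treatment of irregular observation times, not its exact formulation.

    \min_{w}\; \sum_{t}\Big(x^{i}_{t} - \sum_{j=1}^{D}\sum_{l=1}^{L} w_{j,l}\,\tilde{x}^{j}(t - l\Delta)\Big)^{2} + \lambda \sum_{j,l}\lvert w_{j,l}\rvert,
    \qquad
    \tilde{x}^{j}(\tau) = \frac{\sum_{s} k(\tau, t^{j}_{s})\, x^{j}_{s}}{\sum_{s} k(\tau, t^{j}_{s})},

where k(·,·) is a kernel that decays with the gap between the requested lag position and the timestamps of the available irregular samples.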
Article
Full-text available
With the coming data deluge from synoptic surveys, there is a need for frameworks that can quickly and automatically produce calibrated classification probabilities for newly observed variables based on small numbers of time-series measurements. In this paper, we introduce a methodology for variable-star classification, drawing from modern machine-learning techniques. We describe how to homogenize the information gleaned from light curves by selection and computation of real-numbered metrics (features), detail methods to robustly estimate periodic features, introduce tree-ensemble methods for accurate variable-star classification, and show how to rigorously evaluate a classifier using cross validation. On a 25-class data set of 1542 well-studied variable stars, we achieve a 22.8% error rate using the random forest (RF) classifier; this represents a 24% improvement over the best previous classifier on these data. This methodology is effective for identifying samples of specific science classes: for pulsational variables used in Milky Way tomography we obtain a discovery efficiency of 98.2% and for eclipsing systems we find an efficiency of 99.1%, both at 95% purity. The RF classifier is superior to other methods in terms of accuracy, speed, and relative immunity to irrelevant features; the RF can also be used to estimate the importance of each feature in classification. Additionally, we present the first astronomical use of hierarchical classification methods to incorporate a known class taxonomy in the classifier, which reduces the catastrophic error rate from 8% to 7.8%. Excluding low-amplitude sources, the overall error rate improves to 14%, with a catastrophic error rate of 3.5%.
Conference Paper
Full-text available
Early classification on time series data has been found highly useful in a few important applications, such as medical and health informatics, industry production management, safety and security management. While some classifiers have been proposed to achieve good earliness in classification, the interpretability of early classification remains largely an open problem. Without interpretable features, application domain experts such as medical doctors may be reluctant to adopt early classification. In this paper, we tackle the problem of extracting interpretable features on time series for early classification. Specifically, we advocate local shapelets as features, which are segments of time series remaining in the same space of the input data and thus are highly interpretable. We extract local shapelets distinctly manifesting a target class locally and early so that they are effective for early classification. Our experimental results on seven benchmark real data sets clearly show that the local shapelets extracted by our methods are highly interpretable and can achieve effective early classification.
Conference Paper
Full-text available
In this paper, we formulate the problem of early classification of time series data, which is important in some time-sensitive applications such as health-informatics. We introduce a novel concept of MPL (Minimum Prediction Length) and develop ECTS (Early Classification on Time Series), an effective 1-nearest neighbor classification method. ECTS makes early predictions and at the same time retains the accuracy comparable to that of a 1NN classifier using the full-length time series. Our empirical study using benchmark time series data sets shows that ECTS works well on the real data sets where 1NN classification is effective.
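An illustrative numpy stand-in for the prefix-based 1-NN idea: predictions are made on growing prefixes, and the earliest prefix length from which the prediction stops changing serves as a crude analogue of the MPL (the paper derives the actual MPL from the training set rather than per query).

    import numpy as np

    def one_nn_label(train_X, train_y, query_prefix):
        """1-NN over prefixes of equal length (Euclidean distance)."""
        L = len(query_prefix)
        dists = np.linalg.norm(train_X[:, :L] - query_prefix, axis=1)
        return train_y[int(np.argmin(dists))]

    def stable_prefix_length(train_X, train_y, series):
        """Earliest prefix length from which the 1-NN prediction no longer
        changes as more of the series arrives (illustrative only)."""
        T = len(series)
        preds = [one_nn_label(train_X, train_y, series[:L]) for L in range(1, T + 1)]
        final = preds[-1]
        for L in range(T, 0, -1):
            if preds[L - 1] != final:
                return L + 1
        return 1

    X = np.array([[0., 0., 1., 2.], [0., 1., 2., 3.], [2., 2., 1., 0.]])
    y = np.array([0, 0, 1])
    print(stable_prefix_length(X, y, np.array([0., 0.9, 2.1, 3.0])))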
Article
Full-text available
Reliance on physiological monitors to continuously "watch" patients and to alert the nurse when a serious rhythm problem occurs is standard practice on monitored units. Alarms are intended to alert clinicians to deviations from a predetermined "normal" status. However, alarm fatigue may occur when the sheer number of monitor alarms overwhelms clinicians, possibly leading to alarms being disabled, silenced, or ignored. Excessive numbers of monitor alarms and fear that nurses have become desensitized to these alarms was the impetus for a unit-based quality improvement project. Small tests of change to improve alarm management were conducted on a medical progressive care unit. The types and frequency of monitor alarms in the unit were assessed. Nurses were trained to individualize patients' alarm parameter limits and levels. Monitor software was modified to promote audibility of critical alarms. Critical monitor alarms were reduced 43% from baseline data. The reduction of alarms could be attributed to adjustment of monitor alarm defaults, careful assessment and customization of monitor alarm parameter limits and levels, and implementation of an interdisciplinary monitor policy. Although alarms are important and sometimes life-saving, they can compromise patients' safety if ignored. This unit-based quality improvement initiative was beneficial as a starting point for revamping alarm management throughout the institution.
Article
Early classification of time series has been extensively studied for minimizing class prediction delay in time-sensitive applications such as medical diagnostic and industrial process monitoring. A primary task of an early classification approach is to classify an incomplete time series as soon as possible with some desired level of accuracy. Recent years have witnessed several approaches for early classification of time series. As most approaches have solved the early classification problem using a diverse set of strategies, it becomes very important to make a thorough review of existing solutions. These solutions have demonstrated reasonable performance on a wide range of applications including human activity recognition, gene expression based health diagnostic, and industrial monitoring. In this article, we present a systematic review of the current literature on early classification approaches for both univariate and multivariate time series. We divide various existing approaches into four exclusive categories based on their proposed solution strategies. The four categories include prefix based, shapelet based, model based, and miscellaneous approaches. We discuss the applications of early classification and provide a quick summary of the current literature with future research directions. Impact Statement: Early classification is mainly an extension of classification with an ability to classify a time series using limited data points. It is true that one can achieve better accuracy if one waits for more data points, but opportunities for early interventions could equally be missed. In a pandemic situation such as COVID-19, early detection of an infected person becomes more desirable to curb the spread of the virus and possibly save lives. Early classification of gas (e.g., methyl isocyanate) leakage can help to avoid life-threatening consequences on human beings. Early classification techniques have been successfully applied to solve many time-critical problems related to medical diagnostic and industrial monitoring. This article provides a systematic review of the current literature on these early classification approaches for time series data, along with their potential applications. It also suggests some promising directions for further work in this area.
Article
Irregular time series data is becoming increasingly prevalent with the growth of multi-sensor systems as well as the continued use of unstructured manual data recording mechanisms. Irregular data and the resulting missing values severely limit the data's ability to be analysed and modelled for classification and forecasting tasks. Often, conventional methods used for handling time series data introduce bias and make strong assumptions on the underlying data generation process, which can lead to poor model predictions. Traditional machine learning and deep learning methods, although at the forefront of data modelling, are at best compromised by irregular time series data sets and fail to model the temporal irregularity of incomplete time series. Gated recurrent neural networks (RNN), such as LSTM and GRU, have had outstanding success in sequential modelling, and have been applied in many application fields, including natural language processing. These models have become an obvious choice for time series modelling and a promising tool for handling irregular time series data. RNNs have a unique ability to be adapted to make effective use of missing value patterns, time intervals and complex temporal dependencies in irregular univariate and multivariate time series data. In this paper, we provide a systematic review of recent studies in which gated recurrent neural networks have been successfully applied to irregular time series data for prediction tasks within several fields, including medical, human activity recognition, traffic monitoring and environmental monitoring. The review highlights the two common approaches for handling irregular time series data: missing value imputation at the data pre-processing stage and modification of algorithms to directly handle missing values in the learning process. The models reviewed are confined to those that can address issues with irregular time series data; the review does not cover the broader range of models that deal more generally with sequences and regular time series. This paper aims to present the most effective techniques emerging within this branch of research as well as to identify remaining challenges, so that researchers may build upon this platform of work towards further novel techniques for handling irregular time series data.
Conference Paper
We define the open problem of Early Multi-label Classification and propose its first solution, motivated by both state-of-the-art Early Classification and Multi-label Classification algorithms.
Article
In this article, we address the problem of early classification (EC) of temporal sequences with adaptive prediction times. We frame EC as a sequential decision making problem and we define a partially observable Markov decision process (POMDP) fitting the competitive objectives of classification earliness and accuracy. We solve the POMDP by training an agent for EC with deep reinforcement learning (DRL). The agent learns to make adaptive decisions between classifying incomplete sequences now or delaying its prediction to gather more measurements. We adapt an existing DRL algorithm for batch and online learning of the agent’s action value function with a deep neural network. We propose strategies of prioritized sampling, prioritized storing and random episode initialization to address the fact that the agent’s memory is unbalanced because (1) all but one of its actions terminate the process, and thus (2) classification actions are far less frequent than the delay action. In experiments, we show improvements in accuracy induced by our specific adaptation of the algorithm used for online learning of the agent’s action value function. Moreover, we compare two definitions of the POMDP based on delay reward shaping against reward discounting. Finally, we demonstrate that a static naive deep neural network, i.e. trained to classify at static times, is less efficient in terms of accuracy against speed than the equivalent network trained with adaptive decision making capabilities.
Conference Paper
Forecasting whether a patient should be transferred into an intensive care unit (ICU) is a matter of life and death, since survival rates rise for patients who are treated properly and carefully in time. However, we found that recent research on ICU early prediction does not achieve acceptable results on time series that are asynchronous and multivariate. We propose Multivariate Early Shapelet (MEShapelet), which provides accurate and interpretable early predictions on asynchronous multivariate time series. Our experiments show that MEShapelet achieves a 9% improvement in F1-score over the best of the previous methods on our real ICU data set. In summary, we show that our method can effectively address the early prediction problem for asynchronous multivariate time series.
Conference Paper
Early classification of time series is the prediction of the class label of a time series before it is observed in its entirety. In time-sensitive domains where information is collected over time it is worth sacrificing some classification accuracy in favor of earlier predictions, ideally early enough for actions to be taken. However, since accuracy and earliness are contradictory objectives, a solution must address this challenge to discover task-dependent trade-offs. We design an early classification model, called EARLIEST, which tackles this multi-objective optimization problem, jointly learning (1) to classify time series and (2) at which timestep to halt and generate this prediction. By learning the objectives together, we achieve a user-controlled balance between these contradictory goals while capturing their natural relationship. Our model consists of the novel pairing of a recurrent discriminator network with a stochastic policy network, with the latter learning a halting-policy as a reinforcement learning task. The learned policy interprets representations generated by the recurrent model and controls its dynamics, sequentially deciding whether or not to request observations from future timesteps. For a rich variety of datasets (four synthetic and three real-world), we demonstrate that EARLIEST consistently outperforms state-of-the-art alternatives in accuracy and earliness while discovering signal locations without supervision.
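A toy numpy sketch of the inference loop described above (recurrent encoder, stochastic halting policy, discriminator); all weights are random placeholders and the recurrence is not a real GRU/LSTM, whereas in EARLIEST the three parts are trained jointly and the halting policy is learned by reinforcement learning.

    import numpy as np

    rng = np.random.default_rng(0)

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def run_earliest_style(series, W_h, w_halt, W_cls):
        """At each step: update a hidden state, sample a stop/continue decision
        from the halting policy, and classify from the hidden state once the
        policy halts (or the series ends)."""
        h = np.zeros(W_h.shape[0])
        for t, x_t in enumerate(series, start=1):
            h = np.tanh(W_h @ h + x_t)             # toy recurrence over the hidden state
            p_halt = sigmoid(w_halt @ h)           # halting probability from the policy
            if rng.random() < p_halt or t == len(series):
                logits = W_cls @ h                 # discriminator output
                return int(np.argmax(logits)), t

    d = 8
    print(run_earliest_style(rng.normal(size=20), rng.normal(size=(d, d)) * 0.1,
                             rng.normal(size=d), rng.normal(size=(2, d))))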
Article
The problem of early classification of time series appears naturally in contexts where the data, of temporal nature, are collected over time, and early class predictions are interesting or even required. The objective is to classify the incoming sequence as soon as possible, while maintaining suitable levels of accuracy in the predictions. Thus, we can say that the problem of early classification consists of optimizing two objectives simultaneously: accuracy and earliness. In this context, we present a method for early classification based on combining a set of probabilistic classifiers together with a stopping rule (SR). This SR will act as a trigger and will tell us when to output a prediction or when to wait for more data, and its main novelty lies in the fact that it is built by explicitly optimizing a cost function based on accuracy and earliness. We have selected a large set of benchmark data sets and four other state-of-the-art early classification methods, and we have evaluated and compared our framework obtaining superior results in terms of both earliness and accuracy.
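One simple parametric form such a stopping rule can take, in the spirit of the description above, is a linear combination of the largest class probability, the margin between the two largest probabilities, and the fraction of the series observed so far; the coefficients would be tuned by minimizing an accuracy/earliness cost, and the values below are arbitrary placeholders rather than the paper's.

    import numpy as np

    def stopping_rule(probs, t, T, gamma=(-1.0, 2.0, 1.5)):
        """Return True (stop and predict now) when the weighted combination of
        confidence, margin, and elapsed fraction of the series crosses zero."""
        p = np.sort(np.asarray(probs))[::-1]
        g1, g2, g3 = gamma
        score = g1 * p[0] + g2 * (p[0] - p[1]) + g3 * (t / T)
        return score >= 0.0

    print(stopping_rule([0.55, 0.35, 0.10], t=12, T=40))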
Article
We demonstrate that a person's behavioral and environmental context can be automatically recognized by harnessing the sensors built into smartphones and smartwatches. We propose a generic system that can simultaneously recognize many contextual attributes from diverse behavioral domains. By fusing complementary information from different types of sensors our system successfully recognizes fine details of work and leisure activities, body movement, transportation, and more. Health monitoring, clinical intervention, aging care, personal assistance and many more applications will benefit from automatic, frequent and detailed context recognition.
Article
We present a general framework for classification of sparse and irregularly-sampled time series. The properties of such time series can result in substantial uncertainty about the values of the underlying temporal processes, while making the data difficult to deal with using standard classification methods that assume fixed-dimensional feature spaces. To address these challenges, we propose an uncertainty-aware classification framework based on a special computational layer we refer to as the Gaussian process adapter that can connect irregularly sampled time series data to any black-box classifier learnable using gradient descent. We show how to scale up the required computations based on combining the structured kernel interpolation framework and the Lanczos approximation method, and how to discriminatively train the Gaussian process adapter in combination with a number of classifiers end-to-end using backpropagation.
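A stripped-down, single-output illustration of the adapter idea: an irregularly sampled series is projected onto a fixed grid of reference times through a Gaussian-process posterior mean, yielding a fixed-length input for any downstream classifier. The end-to-end training, uncertainty propagation, and scalable kernel approximations described above are omitted, and the kernel hyperparameters are placeholders.

    import numpy as np

    def gp_posterior_mean(obs_t, obs_y, ref_t, lengthscale=1.0, noise=0.1):
        """GP posterior mean at fixed reference times, given irregular observations
        (RBF kernel with observation noise)."""
        k = lambda a, b: np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / lengthscale ** 2)
        K = k(obs_t, obs_t) + noise * np.eye(len(obs_t))
        return k(ref_t, obs_t) @ np.linalg.solve(K, obs_y)

    t_obs = np.array([0.3, 1.7, 4.2])
    y_obs = np.array([1.0, 0.4, -0.8])
    grid = np.linspace(0.0, 5.0, 11)
    print(gp_posterior_mean(t_obs, y_obs, grid))    # fixed-length feature vector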
Article
Background: Clinical time-series data acquired from electronic health records (EHR) are liable to temporal complexities such as irregular observations, missing values and time constrained attributes that make the knowledge discovery process challenging. Objective: This paper presents a temporal rough set induced neuro-fuzzy (TRiNF) mining framework that handles these complexities and builds an effective clinical decision-making system. TRiNF provides two functionalities namely temporal data acquisition (TDA) and temporal classification. Method: In TDA, a time-series forecasting model is constructed by adopting an improved double exponential smoothing method. The forecasting model is used in missing value imputation and temporal pattern extraction. The relevant attributes are selected using a temporal pattern based rough set approach. In temporal classification, a classification model is built with the selected attributes using a temporal pattern induced neuro-fuzzy classifier. Result: For experimentation, this work uses two clinical time series datasets of hepatitis and thrombosis patients. The experimental result shows that with the proposed TRiNF framework, there is a significant reduction in the error rate, thereby obtaining an average classification accuracy of 92.59% for the hepatitis dataset and 91.69% for the thrombosis dataset. Conclusion: The obtained classification results prove the efficiency of the proposed framework in terms of its improved classification accuracy.
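For reference, the standard (Holt) form of double exponential smoothing on which the TDA forecasting model builds; the paper's "improved" variant is not reproduced here, and the smoothing constants below are placeholders.

    def double_exponential_smoothing(x, alpha=0.5, beta=0.3, horizon=1):
        """Holt's double exponential smoothing: maintain a level and a trend,
        then forecast `horizon` steps beyond the last observation.
        level_t = alpha * x_t + (1 - alpha) * (level_{t-1} + trend_{t-1})
        trend_t = beta * (level_t - level_{t-1}) + (1 - beta) * trend_{t-1}
        """
        level, trend = x[0], x[1] - x[0]
        for value in x[1:]:
            prev_level = level
            level = alpha * value + (1 - alpha) * (level + trend)
            trend = beta * (level - prev_level) + (1 - beta) * trend
        return level + horizon * trend

    print(double_exponential_smoothing([2.0, 2.3, 2.9, 3.4, 3.8], horizon=2))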
Article
Multivariate time series (MTS) classification is an important topic in time series data mining, and has attracted great interest in recent years. However, early classification on MTS data largely remains a challenging problem. To address this problem without sacrificing the classification performance, we focus on discovering hidden knowledge from the data for early classification in an explainable way. At first, we introduce a method MCFEC (Mining Core Feature for Early Classification) to obtain distinctive and early shapelets as core features of each variable independently. Then, two methods are introduced for early classification on MTS based on core features. Experimental results on both synthetic and real-world datasets clearly show that our proposed methods can achieve effective early classification on MTS.
Conference Paper
Leveraging temporal observations to predict a patient's health state at a future period is a very challenging task. Providing such a prediction early and accurately allows for designing a more successful treatment that starts before a disease completely develops. Information for this kind of early diagnosis could be extracted by use of temporal data mining methods for handling complex multivariate time series. However, physicians usually prefer to use interpretable models that can be easily explained, rather than relying on more complex black-box approaches. In this study, a temporal data mining method is proposed for extracting interpretable patterns from multivariate time series data, which can be used to assist in providing interpretable early diagnosis. The problem is formulated as an optimization based binary classification task addressed in three steps. First, the time series data is transformed into a binary matrix representation suitable for application of classification methods. Second, a novel convex-concave optimization problem is defined to extract multivariate patterns from the constructed binary matrix. Then, a mixed integer discrete optimization formulation is provided to reduce the dimensionality and extract interpretable multivariate patterns. Finally, those interpretable multivariate patterns are used for early classification in challenging clinical applications. In the conducted experiments on two human viral infection datasets and a larger myocardial infarction dataset, the proposed method was more accurate and provided classifications earlier than three alternative state-of-the-art methods.
Article
We introduce Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions. The method is straightforward to implement and is based on adaptive estimates of lower-order moments of the gradients. The method is computationally efficient, has low memory requirements and is well suited for problems that are large in terms of data and/or parameters. The method is also appropriate for non-stationary objectives and problems with very noisy and/or sparse gradients. The method exhibits invariance to diagonal rescaling of the gradients by adapting to the geometry of the objective function. The hyper-parameters have intuitive interpretations and typically require little tuning. Some connections to related algorithms, on which Adam was inspired, are discussed. We also analyze the theoretical convergence properties of the algorithm and provide a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework. We demonstrate that Adam works well in practice when experimentally compared to other stochastic optimization methods.
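For reference, the core Adam update on parameters θ with gradient g_t, step size α, decay rates β1, β2 and stabilizer ε is:

    m_t = \beta_1 m_{t-1} + (1 - \beta_1)\, g_t, \qquad
    v_t = \beta_2 v_{t-1} + (1 - \beta_2)\, g_t^2,

    \hat{m}_t = \frac{m_t}{1 - \beta_1^t}, \qquad
    \hat{v}_t = \frac{v_t}{1 - \beta_2^t}, \qquad
    \theta_t = \theta_{t-1} - \alpha\, \frac{\hat{m}_t}{\sqrt{\hat{v}_t} + \epsilon},

where the bias-corrected first and second moment estimates adapt the effective step size per parameter.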
Conference Paper
Early classification of time series is prevalent in many time-sensitive applications such as, but not limited to, early warning of disease outcome and early warning of crisis in stock market. For example, early diagnosis allows physicians to design appropriate therapeutic strategies at early stages of diseases. However, practical adaptation of early classification of time series requires an easy to understand explanation (interpretability) and a measure of confidence of the prediction results (uncertainty estimates). These two aspects were not jointly addressed in previous time series early classification studies, such that a difficult choice of selecting one of these aspects is required. In this study, we propose a simple and yet effective method to provide uncertainty estimates for an interpretable early classification method. The question we address here is "how to provide estimates of uncertainty in regard to interpretable early prediction." In our extensive evaluation on twenty time series datasets we showed that the proposed method has several advantages over the state-of-the-art method that provides reliability estimates in early classification. Namely, the proposed method is more effective than the state-of-the-art method, is simple to implement, and provides interpretable results.
Conference Paper
Classification of time series has been attracting great interest over the past decade. Recent empirical evidence has strongly suggested that the simple nearest neighbor algorithm is very difficult to beat for most time series problems. While this may be considered good news, given the simplicity of implementing the nearest neighbor algorithm, there are some negative consequences of this. First, the nearest neighbor algorithm requires storing and searching the entire dataset, resulting in a time and space complexity that limits its applicability, especially on resource-limited sensors. Second, beyond mere classification accuracy, we often wish to gain some insight into the data. In this work we introduce a new time series primitive, time series shapelets, which addresses these limitations. Informally, shapelets are time series subsequences which are in some sense maximally representative of a class. As we shall show with extensive empirical evaluations in diverse domains, algorithms based on the time series shapelet primitives can be interpretable, more accurate and significantly faster than state-of-the-art classifiers.
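A minimal sketch of the shapelet primitive: the distance from a series to a candidate shapelet is the minimum Euclidean distance over all equal-length subsequences, and shapelet-based classifiers build decision rules by thresholding this distance. Candidate search and quality measures such as information gain are omitted here.

    import numpy as np

    def shapelet_distance(series, shapelet):
        """Minimum Euclidean distance between the shapelet and every subsequence
        of the series with the same length (sliding window)."""
        s, w = np.asarray(series, float), np.asarray(shapelet, float)
        windows = np.lib.stride_tricks.sliding_window_view(s, len(w))
        return float(np.min(np.linalg.norm(windows - w, axis=1)))

    print(shapelet_distance([0, 0, 1, 3, 1, 0, 0], [1, 3, 1]))    # 0.0: exact match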
Article
This article presents a general class of associative reinforcement learning algorithms for connectionist networks containing stochastic units. These algorithms, called REINFORCE algorithms, are shown to make weight adjustments in a direction that lies along the gradient of expected reinforcement in both immediate-reinforcement tasks and certain limited forms of delayed-reinforcement tasks, and they do this without explicitly computing gradient estimates or even storing information from which such estimates could be computed. Specific examples of such algorithms are presented, some of which bear a close relationship to certain existing algorithms while others are novel but potentially interesting in their own right. Also given are results that show how such algorithms can be naturally integrated with backpropagation. We close with a brief discussion of a number of additional issues surrounding the use of such algorithms, including what is known about their limiting behaviors a...
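In modern notation, the REINFORCE weight update for a stochastic policy π_θ adjusts parameters along an unbiased estimate of the gradient of expected reinforcement r, optionally with a baseline b:

    \Delta\theta = \alpha\,(r - b)\,\nabla_{\theta}\log \pi_{\theta}(a \mid s),
    \qquad
    \mathbb{E}\big[(r - b)\,\nabla_{\theta}\log \pi_{\theta}(a \mid s)\big]
    = \nabla_{\theta}\,\mathbb{E}_{a\sim\pi_{\theta}}[\,r\,],

so no explicit gradient estimate of the reinforcement signal itself is required; only the log-probability of the sampled action is differentiated.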
Real-time alerting system for COVID-19 and other stress events using wearable data
  • Arash Alavi
  • Gireesh K Bogu
  • Meng Wang
  • Ekanath Srihari Rangan
  • Andrew W Brooks
  • Qiwen Wang
  • Emily Higgs
  • Alessandra Celli
  • Tejaswini Mishra
  • Ahmed A Metwally
Sequential Density Ratio Estimation for Simultaneous Optimization of Speed and Accuracy
  • Akinori F Ebihara
  • Taiki Miyagawa
  • Kazuyuki Sakurai
  • Hitoshi Imaoka
Multiple instance learning for efficient sequential data classification on resource-constrained devices
  • D Dennis
  • C Pabbaraju
  • H Simhadri
  • P Jain
Neural jump stochastic differential equations
  • Junteng Jia
  • Austin R Benson
Learning from irregularly-sampled time series: a missing data perspective
  • Steven Cheng-Xian Li
  • Benjamin M Marlin
Learning Long-Term Dependencies in Irregularly-Sampled Time Series
  • Mathias Lechner
  • Ramin Hasani
Directly modeling missing data in sequences with rnns: Improved classification of clinical time series
  • Zachary C Lipton
  • David Kale
  • Randall Wetzel
Latent Ordinary Differential Equations for Irregularly-Sampled Time Series
  • Yulia Rubanova
  • Tian Qi Chen
  • David K Duvenaud
Multi-Time Attention Networks for Irregularly Sampled Time Series
  • Satya Narayan Shukla
  • Benjamin M Marlin
A review of deep learning methods for irregularly sampled medical time series data
  • Chenxi Sun
  • Shenda Hong
  • Moxian Song
  • Hongyan Li
A survey on principles, models and methods for learning from irregularly sampled time series
  • Satya Narayan Shukla
  • Benjamin M Marlin