Article

A comparison between discrete and continuous time Bayesian networks in learning from clinical time series data with irregularity


Abstract

Background: Recently, mobile devices, such as smartphones, have been introduced into healthcare research to replace paper diaries as data-collection tools in the home environment. Such devices support collecting patient data at different time points over a long period, resulting in clinical time series with high temporal complexity, such as time irregularities. Analysis of such time series poses new challenges for machine-learning techniques. The clinical context for the research discussed in this paper is home monitoring in chronic obstructive pulmonary disease (COPD).

Objective: The goal of the present research is to find out which properties of temporal Bayesian network models make it possible to best cope with irregularly spaced multivariate clinical time-series data.

Methods: Two mainstream temporal Bayesian network models of multivariate clinical time series are studied: dynamic Bayesian networks, where the system is described as a snapshot at discrete time points, and continuous time Bayesian networks, where transitions between states are modeled in continuous time. Their capability of learning from clinical time series that vary in nature is extensively studied. To compare the two temporal Bayesian network types for regularly and irregularly spaced time-series data, three typical ways of observing time-series data were investigated: (1) regularly spaced in time with a fixed rate; (2) irregularly spaced and missing completely at random at discrete time points; (3) irregularly spaced and missing at random at discrete time points. In addition, similar experiments were carried out using real-world COPD patient data, where observations are unevenly spaced.

Results: For regularly spaced time series, the dynamic Bayesian network models outperform the continuous time Bayesian networks. Similarly, if the data are missing completely at random, discrete-time models outperform continuous-time models in most situations. For more realistic settings where data are not missing completely at random, the situation is more complicated. In simulation experiments, both models perform similarly if strong prior knowledge about the missing-data distribution is available; otherwise, continuous time Bayesian networks perform better. In experiments with unevenly spaced real-world data, we surprisingly found that a dynamic Bayesian network in which time is ignored performs similarly to a continuous time Bayesian network.

Conclusion: The results confirm the conventional wisdom that discrete-time Bayesian networks are appropriate when learning from regularly spaced clinical time series. Similarly, we found that for time series where missingness occurs completely at random, dynamic Bayesian networks are an appropriate choice. However, for the complex clinical time-series data that motivated this research, the continuous-time models are at least competitive with, and sometimes better than, their discrete-time counterparts. Furthermore, continuous-time models offer the additional benefit of providing more fine-grained predictions than discrete-time models, which will be of practical relevance in clinical applications.
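To make the three observation regimes in the Methods concrete, the following minimal Python sketch generates a toy latent health-state series and samples it under each regime; all probabilities and rates here are illustrative assumptions, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Latent binary health-state series on a fine, regular grid (toy example).
T = 200
latent = np.zeros(T, dtype=int)
for t in range(1, T):
    # sticky two-state Markov chain: stay with probability 0.9
    latent[t] = latent[t - 1] if rng.random() < 0.9 else 1 - latent[t - 1]

# (1) Regularly spaced: observe every 5th time point at a fixed rate.
regular_idx = np.arange(0, T, 5)

# (2) MCAR: each time point is observed with a fixed probability,
# independent of any state.
mcar_idx = np.flatnonzero(rng.random(T) < 0.2)

# (3) MAR: the chance of observing time t depends only on already
# observed data, e.g. patients report more often after a bad reading.
mar_idx, last_seen = [], 0
for t in range(T):
    p = 0.35 if last_seen == 1 else 0.05
    if rng.random() < p:
        mar_idx.append(t)
        last_seen = latent[t]

print(len(regular_idx), len(mcar_idx), len(mar_idx))
```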


... Instead of using data imputation methods to fill the gaps between actual observations, the irregularity itself is valuable information that should be considered when learning the evolution of the patient's health status. Following that line of thought, recent studies [12,53,54,50] took advantage of advances in sequence modeling and used recurrent neural networks (RNNs), coupled with a representation of the time gap between two consecutive event points, to conduct downstream medical tasks such as risk prediction, procedure recommendation, and patient phenotyping. ...
... They used sequential deep learning methods like GRU, LSTM, or Transformer to derive the patient's health state vector at a specific point in time, t, by taking into account the evolution of past medical events, such as diseases, procedures, treatments, and numerical indicators. The majority of the proposed works ignored the time irregularity between consecutive visits; however, the recent studies [12,53,74,73,75,84,54,76,72] that integrated it have demonstrated its importance in capturing the contextual relationships between visits, leading to a better representation of the evolution of the patient. With the exception of the TAPER architecture [22], all of these methods focus on one type of data (numerical, categorical, or text) to represent a patient's timeline. ...
... $P_\theta(y_i|x)\log(P_\theta(y_i|x))$ ...
Thesis
The wide adoption of Electronic Health Records in hospitals' information systems has led to the definition of large databases grouping various types of data, such as textual notes, longitudinal medical events, and tabular patient information. However, the records are only filled in during consultations or hospital stays, which depend on the patient's state and local habits. A system that can leverage the different types of data collected at different time scales is critical for reconstructing the patient's health trajectory, analyzing their history, and consequently delivering more adapted care. This thesis work addresses two main challenges of medical data processing: learning to represent sequences of medical observations with irregular elapsed time between consecutive visits, and optimizing the extraction of medical events from clinical notes. Our main goal is to design a multimodal representation of the patient's health trajectory to solve clinical prediction problems.

Our first work built a framework for modeling irregular medical time series to evaluate the importance of considering the time gaps between medical episodes when representing a patient's health trajectory. To that end, we conducted a comparative study of sequential neural networks and irregular time-representation techniques. The clinical objective was to predict retinopathy complications for type 1 diabetes patients in the French database CaRéDIAB (Champagne Ardenne Réseau Diabetes) using their history of HbA1c measurements. The study results showed that the attention-based model combined with the soft one-hot representation of time gaps led to an AUROC score of 88.65% (specificity of 85.56%, sensitivity of 83.33%), an improvement of 4.3% compared to the LSTM-based model. Motivated by these results, we extended our framework to shorter multivariate time series and predicted in-hospital mortality for critical care patients of the MIMIC-III dataset. The proposed architecture, HiTT, improved the AUC score by 5% over the Transformer baseline.

In the second step, we focused on extracting relevant medical information from clinical notes to enrich the patient's health trajectories. In particular, Transformer-based architectures have shown encouraging results in medical information extraction tasks. However, these complex models require a large annotated corpus. This requirement is hard to meet in the medical field, as it necessitates access to private patient data and highly qualified expert annotators. To reduce annotation cost, we explored active learning strategies that have been shown to be effective in tasks such as text classification, information extraction, and speech recognition. In addition to existing methods, we defined a Hybrid Weighted Uncertainty Sampling active learning strategy that takes advantage of the contextual embeddings learned by the Transformer-based approach to measure the representativeness of samples. A simulated study using the i2b2-2010 challenge dataset showed that our proposed metric reduces the annotation cost by 70% to achieve the same score as passive learning. Lastly, we combined multivariate medical time series and medical concepts extracted from clinical notes of the MIMIC-III database to train a multimodal transformer-based architecture. The test results of the in-hospital mortality task showed an improvement of 5.3% when considering additional text data.
This thesis contributes to patient health trajectory representation by alleviating the burden of episodic medical records and the manual annotation of free-text notes.
... The problem is thus to maintain an uncertain observation set with incomplete data collection over time, in order to provide reliable on-demand information regarding the person's state. This requires evaluating whether the value (or state) of a variable has changed since it was observed. ...
... In Continuous Time Bayesian Networks (CTBN) [18], the states of variables evolve continuously over time, and the evolution of each variable depends on the state of its parents in the graph. Although possibly well adapted to our problem, constructing a CTBN requires a level of resources that would be too high given the size of the targeted model (this is discussed in the last section of the paper [20]). However, in the current state of our work, we consider only discrete state spaces. ...
Article
The aim of this study is to maintain up-to-date information about the current state of elderly people that are medically followed for risks of fall. Our proposal consists of an individual information database management system that can provide information on-demand on various variables. Such a system has to deal with several sources of uncertainty: lack of information, evolving information and reliability of the information sources. We consider that the features of the person may evolve with time causing uncertainty due to obsolete information. Our context includes new information received bit by bit, with no possibility to collect all required information at once. This paper establishes a first proposal to manage a set of uncertain observations, in order to reduce erroneous and obsolete information while keeping the benefit of previously collected information. We propose an architecture of the system based on a probabilistic knowledge model about the characteristics of interest, a set of decay functions that help to evaluate the confidence degree in previous observations, and a reasoning module to manage new observations, maintain the compatibility and the quality of the observation set. We detail the algorithms of the reasoning module, and the algorithm to update the confidence degree of the observations.
... Therefore, Bueno et al. used a dynamic Bayesian network (DBN) model to capture changes in the patient status distribution and explore potential pathophysiological changes over time [14]. Liu et al. noted that temporal Bayesian networks for studying disease processes are divided into dynamic Bayesian networks, which assume discrete time, and continuous-time Bayesian networks, which assume continuous time, and compared the advantages of the two networks under different data distributions [15]. Although this approach is advantageous for analyzing complex medical data, its power has yet to be confirmed, and its assumption of discrete states can be inefficient [3,16]. ...
... The related calculations are shown in formulas (14)–(17). ...
Article
Full-text available
Type 2 diabetes mellitus (T2DM) has been identified as one of the most challenging chronic diseases to manage. In recent years, the incidence of T2DM has increased, which has seriously endangered people's health and quality of life. Glycosylated hemoglobin (HbA1c) is the gold-standard clinical indicator of the progression of T2DM. An accurate prediction of HbA1c levels not only helps medical workers improve the accuracy of clinical decision-making but also helps patients to better understand the clinical progression of T2DM and conduct self-management to control that progression. Therefore, we introduced the long short-term memory (LSTM) neural network to predict patients' HbA1c levels using time-sequential data from electronic medical records (EMRs). We added a self-attention mechanism to the traditional LSTM to capture the long-term interdependence of feature elements, ensuring a deeper and more effective memory, and used gradient search to minimize the mean squared error between the network's predicted values and the real values. The LSTM with the self-attention mechanism performed better than the traditional deep learning sequence prediction method. Our research provides a good reference for the application of deep learning in the field of medical health management.
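As a rough illustration of the kind of architecture described, here is a minimal PyTorch sketch of an LSTM followed by self-attention and a regression head; the layer sizes, the use of nn.MultiheadAttention, and reading the prediction off the last time step are assumptions rather than the paper's exact design.

```python
import torch
import torch.nn as nn

class LSTMSelfAttention(nn.Module):
    """Minimal sketch: LSTM encoder, one self-attention layer over the
    hidden states, and a regression head (sizes are illustrative)."""

    def __init__(self, n_features: int, hidden: int = 64, heads: int = 4):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
        self.attn = nn.MultiheadAttention(hidden, heads, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):                 # x: (batch, time, features)
        h, _ = self.lstm(x)               # (batch, time, hidden)
        a, _ = self.attn(h, h, h)         # self-attention across time steps
        return self.head(a[:, -1, :])     # predict next HbA1c from last step

model = LSTMSelfAttention(n_features=8)
y_hat = model(torch.randn(2, 12, 8))      # 2 patients, 12 visits, 8 features
print(y_hat.shape)                        # torch.Size([2, 1])
```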
... Situational threat estimation is the basis of command decision-making and action control, and has become a hot topic in combat decision support [1,2]. Target threat estimation is an important part of situational threat estimation. ...
Article
The accuracy of target threat estimation has a great impact on command decision-making. The Bayesian network, as an effective way to deal with uncertainty, can be used to track changes in the target threat level. Unfortunately, the traditional discrete dynamic Bayesian network (DDBN) suffers from poor parameter learning and poor reasoning accuracy in small-sample environments with partial prior information missing. Considering the finiteness and discreteness of DDBN parameters, a fuzzy k-nearest neighbor (KNN) algorithm based on the correlation of feature quantities (CF-FKNN) is proposed for DDBN parameter learning. First, the correlation between feature quantities is calculated, and then a KNN algorithm with fuzzy weights is introduced to fill in the missing data. On this basis, a reasonable DDBN structure is constructed using expert experience to complete DDBN parameter learning and reasoning. Simulation results show that the CF-FKNN algorithm can accurately fill in the data when samples are severely missing and improves DDBN parameter learning in such cases. With the proposed method, the final target threat assessment results are reasonable and meet the needs of engineering applications.
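The exact CF-FKNN weighting is not spelled out here, but a rough sketch of correlation-weighted, fuzzy (inverse-distance) KNN imputation conveys the idea; every detail below is an illustrative assumption, not the published algorithm.

```python
import numpy as np

def cf_knn_impute(X, k=3):
    """Sketch of correlation-weighted fuzzy KNN imputation: feature
    distances are weighted by their correlation with the missing column,
    and the gap is filled with an inverse-distance-weighted neighbour
    average (all modelling choices here are assumptions)."""
    X = np.asarray(X, dtype=float).copy()
    filled = np.where(np.isnan(X), np.nanmean(X, axis=0), X)  # provisional fill
    corr = np.abs(np.corrcoef(filled, rowvar=False))          # feature-feature correlation
    for i, j in zip(*np.where(np.isnan(X))):
        w = corr[j]                                   # relevance of each feature to column j
        d = np.sqrt((((filled - filled[i]) ** 2) * w).sum(axis=1))
        d[i] = np.inf                                 # exclude the row itself
        nn = np.argsort(d)[:k]
        fuzzy_w = 1.0 / (d[nn] + 1e-9)                # fuzzy (inverse-distance) weights
        X[i, j] = np.average(filled[nn, j], weights=fuzzy_w)
    return X

X = np.array([[1.0, 2.0, 3.0],
              [2.0, np.nan, 4.1],
              [1.1, 2.1, 3.1],
              [2.1, 3.9, 4.0]])
print(cf_knn_impute(X).round(2))
```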
... Multi-sampling-rate-based methods [38,39,110,111]: advantages are no artificial dependency and no data imputation; disadvantages are implementation complexity and assumptions about data-generation patterns. ...
Preprint
Irregularly sampled time series (ISTS) data has irregular temporal intervals between observations and different sampling rates between sequences. ISTS commonly appears in healthcare, economics, and geoscience. Especially in the medical environment, the widely used Electronic Health Records (EHRs) have abundant typical irregularly sampled medical time series (ISMTS) data. Developing deep learning methods on EHRs data is critical for personalized treatment, precise diagnosis and medical management. However, it is challenging to directly use deep learning models for ISMTS data. On the one hand, ISMTS data has the intra-series and inter-series relations. Both the local and global structures should be considered. On the other hand, methods should consider the trade-off between task accuracy and model complexity and remain generality and interpretability. So far, many existing works have tried to solve the above problems and have achieved good results. In this paper, we review these deep learning methods from the perspectives of technology and task. Under the technology-driven perspective, we summarize them into two categories - missing data-based methods and raw data-based methods. Under the task-driven perspective, we also summarize them into two categories - data imputation-oriented and downstream task-oriented. For each of them, we point out their advantages and disadvantages. Moreover, we implement some representative methods and compare them on four medical datasets with two tasks. Finally, we discuss the challenges and opportunities in this area.
... Spatial scale examples include visual object recognition [141] and natural language processing [142], with the latter facing multiscale challenges such as how to interpret current text based on past text. Such approaches have had success in numerous applications, including cybersecurity [143], genetic and clinical analyses [144,145], and robotic exploration with real-time learning during continuous actions [146]. ...
Article
Full-text available
Biological and artificial intelligence (AI) are often defined by their capacity to achieve a hierarchy of short-term and long-term goals that require incorporating information over time and space at both local and global scales. More advanced forms of this capacity involve the adaptive modulation of integration across scales, which resolves computational inefficiency and explore-exploit dilemmas at the same time. Research in neuroscience and in AI has made progress towards understanding architectures that achieve this. Insights into biological computations come from phenomena such as decision inertia, habit formation, information search, risky choices and foraging. Across these domains, the brain is equipped with mechanisms (such as the dorsal anterior cingulate and dorsolateral prefrontal cortex) that can represent and modulate across scales, both with top-down control processes and by local-to-global consolidation as information progresses from sensory to prefrontal areas. Paralleling these biological architectures, progress in AI is marked by innovations in dynamic multiscale modulation, moving from recurrent and convolutional neural networks, with fixed scalings, to attention, transformers, dynamic convolutions, and consciousness priors, which modulate scale to input and increase scale breadth. The use and development of these multiscale innovations in robotic agents, game AI, and natural language processing (NLP) are pushing the boundaries of AI achievements. By juxtaposing biological and artificial intelligence, the present work underscores the critical importance of multiscale processing to general intelligence, as well as highlighting innovations and differences between the future of biological and artificial intelligence.
... Similarity assessment techniques (e.g., [17]). The combination is the focus of this article, with the peculiarity that the human expert is replaced by the expert machine, i.e. the CBR [7,9,14,15]. ...
... The variables are random, and uncertainty can be measured [20]. The applicability of Bayesian models [21], networks [22][23][24], or successful combinations thereof [25], e.g., with Gaussian variables [26], is very broad. Certainly, Bayesian methods are more difficult to implement than traditional methods, especially in epidemiology and infectious diseases [27]. ...
Article
Full-text available
Infectious diseases are the primary cause of mortality worldwide. The dangers of infectious disease are compounded by antimicrobial resistance, which remains the greatest concern for human health. Although novel approaches are under investigation, the World Health Organization predicts that by 2050, septicaemia caused by antimicrobial-resistant bacteria could result in 10 million deaths per year. One of the main challenges in medical microbiology is to develop novel experimental approaches that enable a better understanding of bacterial infections and antimicrobial resistance. After the introduction of whole-genome sequencing, there was a great improvement in bacterial detection and identification, which also enabled the characterization of virulence factors and antimicrobial resistance genes. Today, the use of in silico experiments jointly with computational and machine-learning methods offers an in-depth understanding of systems biology, allowing us to use this knowledge for the prevention, prediction, and control of infectious disease. Herein, the aim of this review is to discuss the latest advances in human health engineering and their applicability to the control of infectious diseases. An in-depth knowledge of host–pathogen–protein interactions, combined with a better understanding of a host's immune response and bacterial fitness, is a key determinant for halting infectious diseases and the dissemination of antimicrobial resistance.
Article
The adoption of electronic health records in hospitals has ensured the availability of large datasets that can be used to predict medical complications. The trajectories of patients in real-world settings are highly variable, making longitudinal data modeling challenging. In recent years, significant progress has been made in the study of deep learning models applied to time series; however, the application of these models to irregular medical time series (IMTS) remains limited. To address this issue, we developed a generic deep-learning-based framework for modeling IMTS that facilitates the comparative studies of sequential neural networks (transformers and long short-term memory) and irregular time representation techniques. A validation study to predict retinopathy complications was conducted on 1,207 patients with type 1 diabetes in a French database using their historical glycosylated hemoglobin measurements, without any data aggregation or imputation. The transformer-based model combined with the soft one-hot representation of time gaps achieved the highest score: an area under the receiver operating characteristic curve of 88.65%, specificity of 85.56%, sensitivity of 83.33% and an improvement of 11.7% over the same architecture without time information. This is the first attempt to predict retinopathy complications in patients with type 1 diabetes using deep learning and longitudinal data collected from patient visits. This study highlighted the significance of modeling time gaps between medical records to improve prediction performance and the utility of a generic framework for conducting extensive comparative studies.
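The soft one-hot representation of time gaps mentioned above can be sketched as a softmax over distances to a set of gap bins, so that similar gaps receive similar codes; the bin edges, log scaling, and temperature below are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def soft_one_hot(gap_days: float, bin_edges: np.ndarray, temperature: float = 1.0):
    """Sketch of a 'soft' one-hot encoding of a time gap: instead of a hard
    bin assignment, spread probability mass over bins by (log-scaled)
    distance to each bin centre."""
    centres = (bin_edges[:-1] + bin_edges[1:]) / 2.0
    scores = -np.abs(np.log1p(gap_days) - np.log1p(centres)) / temperature
    e = np.exp(scores - scores.max())     # numerically stable softmax
    return e / e.sum()

bins = np.array([0, 7, 30, 90, 180, 365, 730], dtype=float)  # illustrative bins (days)
print(soft_one_hot(45.0, bins).round(3))  # mass concentrated around the 30-90 day bin
```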
Purpose – Being an efficient mechanism for value for money, public-private partnership (PPP) is one of the most prominent approaches for infrastructure construction. Hence, many controversies about the performance effectiveness of these delivery systems have been debated. This research aims to develop a novel performance management perspective by revealing the causal effect of key performance indicators (KPIs) on PPP infrastructures.
Design/methodology/approach – A literature review was used to extract the PPP KPIs. Experts' judgment and interviews, as well as questionnaires, were designed to obtain data. A copula Bayesian network (CBN) was selected to achieve the research purpose. CBN is one of the most potent tools in statistics for analyzing the causal relationships of different elements and considering their quantitative impact on each other. Utilizing this technique, with Python as the programming language, this research used the machine learning methods SHAP and XGBoost to optimize the network.
Findings – The sensitivity analysis of the KPIs verified the importance of causation in PPP performance management. This study determined the causal structure of KPIs in PPP projects, assessed each indicator's priority to performance, and found seven of them to form a critical cluster for optimizing the network. These KPIs include innovation for financing, feasibility study, macro-environment impact, appropriate financing option, risk identification, allocation, sharing, and transfer, finance infrastructure, and compliance with the legal and regulatory framework.
Practical implications – Identifying the most sensitive indicators helps the private sector allocate limited resources more rationally and concentrate on the most influential parts of the project. It also provides the critical cluster of KPIs that should be closely controlled and monitored by PPP project managers. Additionally, the public sector can evaluate the performance of the private sector more accurately. Finally, this research provides a comprehensive causal insight into PPP performance management that can be used to develop management systems in future research.
Originality/value – For the first time, this research proposes a model to determine the causal structure of KPIs in PPPs and indicates the importance of this insight. The developed innovative model identifies the KPIs' behavior and takes a non-linear approach based on CBN and machine learning methods while providing valuable information for construction and performance managers to allocate resources more efficiently.
Article
Purpose – The aim of continuous learning is to obtain and fine-tune information gradually without removing already existing information. Many conventional approaches to streaming-data classification assume that all newly arrived data are completely labeled. To regularize neural networks (NNs) by merging side information like user-provided labels or pairwise constraints, incremental semi-supervised learning models need to be introduced. However, they are hard to implement, specifically in non-stationary environments, because of the efficiency of such algorithms and their sensitivity to parameters. The periodic update and maintenance of the decision method whenever new data arrive is the significant challenge in incremental algorithms.
Design/methodology/approach – Hence, this paper develops a meta-learning model for handling continuous or streaming data. Initially, data pertaining to continuous behavior are gathered from diverse benchmark sources. Further, the classification of the data is performed by a recurrent neural network (RNN), in which the testing weight is adjusted or optimized by a new meta-heuristic algorithm. Here, the weight is updated to reduce the error difference between the target and the measured data when new data are given for testing. The optimized weight-updated testing is performed by evaluating concept drift and classification accuracy. The new continuous learning by RNN is accomplished by the improved Opposition-based Novel Updating Spotted Hyena Optimization (ONU-SHO). Finally, experiments with different datasets show that the proposed learning improves over the conventional models.
Findings – From the analysis, the accuracy of the ONU-SHO-based RNN (ONU-SHO-RNN) was 10.1% higher than Decision Tree (DT), 7.6% higher than Naive Bayes (NB), 7.4% higher than k-nearest neighbors (KNN), 2.5% higher than Support Vector Machine (SVM), 9.3% higher than NN, and 10.6% higher than RNN. Hence, it is confirmed that the ONU-SHO algorithm performs well for data stream classification.
Originality/value – This paper introduces a novel meta-learning model using an Opposition-based Novel Updating Spotted Hyena Optimization (ONU-SHO)-based recurrent neural network (RNN) for handling continuous or streaming data; it is the first work to utilize such a model for this purpose.
Article
The discrete dynamic Bayesian network (dDBN) is used in many challenging causal modelling applications, such as human brain connectivity, due to its multivariate, non-deterministic, and nonlinear capability. Since there is no ground truth for brain connectivity, the resulting model cannot be evaluated quantitatively. However, we should at least make sure that the best possible structure is obtained for the chosen modelling approach and the data; this result can then be corroborated by the related anatomical and physiological literature. Nearly all of the previously published studies rest on limited data, which casts doubt on the resulting structures. In theory, an immense number of samples is required, which is impossible to collect in practice. In this study, the number of samples that makes dDBN modelling trustworthy is sought through practical experiments and found to be O(K^(p+1)) for binary- and ternary-valued networks, where K is the cardinality of the random variables and p is the maximum number of parents possibly present in the network. If a modelling approach satisfies this amount of data, we can at least say that the resulting structure is trustworthy.
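Read as a rule of thumb, the bound says the required sample size grows exponentially with the number of parents; a tiny illustration (ignoring the constant factors the O-notation hides):

```python
# Illustrative reading of the O(K^(p+1)) bound: samples needed grow with the
# variable cardinality K and the maximum number of parents p.
for K, p in [(2, 1), (2, 3), (3, 3)]:
    print(f"K={K}, p={p}: on the order of {K ** (p + 1)} samples")
# K=2, p=1 -> ~4;  K=2, p=3 -> ~16;  K=3, p=3 -> ~81
```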
Article
Causal discovery is a major concept in biomedical informatics, contributing to the diagnosis, therapy, and prognosis of diseases. Probabilistic causality approaches in epidemiology and medicine are a common method for finding relationships between pathogen and disease, environment and disease, and adverse events and drugs. The Bayesian network (BN) is one of the common approaches for probabilistic causality and is widely used in health care and biomedical science. Since many biomedical applications deal with temporal datasets, the temporal extension of BNs, called the dynamic Bayesian network (DBN), is used for such applications. DBNs define probabilistic relationships between parameters at consecutive time points in the form of a graph and have been successfully used in many biomedical applications. In this paper, a novel method is introduced for finding probabilistic causal chains from a temporal dataset with the help of entropy and causal tendency measures. In this method, first, a Causal Features Dependency (CFD) matrix is created on the basis of parameter changes in consecutive events of a phenomenon, and then a probabilistic causal graph is constructed from this matrix based on entropy criteria. In the next step, a set of probabilistic causal chains of the corresponding causal graph is constructed by a novel polynomial-time heuristic. Finally, the causal chains are used for predicting the future trend of the phenomenon. The proposed model was applied to the Pooled Resource Open-Access Clinical Trials (PRO-ACT) dataset related to Amyotrophic Lateral Sclerosis (ALS) disease, in order to predict the progression rate of this disease. The results of comparison with Bayesian tree, random forest, support vector regression, linear regression, and multivariate regression show that the proposed algorithm can compete with these methods and in some cases outperforms them. This study revealed that probabilistic causality is an appropriate approach for predicting the future states of chronic diseases with unknown cause.
Article
Full-text available
T helper 17 (TH17) cells represent a pivotal adaptive cell subset involved in multiple immune disorders in mammalian species. Deciphering the molecular interactions regulating TH17 cell differentiation is particularly critical for novel drug target discovery designed to control maladaptive inflammatory conditions. Using continuous time Bayesian networks over a time-course gene expression dataset, we inferred the global regulatory network controlling TH17 differentiation. From the network, we identified the Prdm1 gene encoding the B lymphocyte-induced maturation protein 1 as a crucial negative regulator of human TH17 cell differentiation. The results have been validated by perturbing Prdm1 expression on freshly isolated CD4+ naïve T cells: reduction of Prdm1 expression leads to augmentation of IL-17 release. These data unravel a possible novel target to control TH17 polarization in inflammatory disorders. Furthermore, this study represents the first in vitro validation of continuous time Bayesian networks as gene network reconstruction method and as hypothesis generation tool for wet-lab biological experiments.
Book
Full-text available
Patients are more empowered to shape their own health care today than ever before. Health information technology (HIT) is creating new opportunities for patients and families to participate actively in their care management. Moreover, HIT is transforming the healthcare system by enabling healthcare providers to partner with their patients in a bold effort to improve quality of care and health outcomes. In this book leading figures discuss the existing needs, challenges and opportunities for improving patient engagement and empowerment through HIT, mapping out what has been accomplished and what work remains to engage patients and transform the care we deliver.
Article
Full-text available
Cost-effective mobile healthcare must consider not only technological performance but also the division of responsibilities between the patient and care provider, the context of the patient’s condition, and ways to implement patient decision support and tailored interaction. In this paper we discuss four foundational aspects of m-health for disease self-management - support for shared care, context awareness, embedded intelligence, and personalized interaction - and ways to integrate them in mobile technology.
Article
Full-text available
Background: Dynamic aspects of gene regulatory networks are typically investigated by measuring system variables at multiple time points. Current state-of-the-art computational approaches for reconstructing gene networks directly build on such data, making a strong assumption that the system evolves in a synchronous fashion at fixed points in time. However, nowadays omics data are being generated with increasing time-course granularity. Thus, modellers now have the possibility to represent the system as evolving in continuous time and to improve the models' expressiveness.
Results: Continuous time Bayesian networks are proposed as a new approach for gene network reconstruction from time-course expression data. Their performance was compared to two state-of-the-art methods: dynamic Bayesian networks and Granger causality analysis. On simulated data, the methods were compared for networks of increasing size, for measurements taken at different time granularity densities, and for measurements unevenly spaced over time. Continuous time Bayesian networks outperformed the other methods in terms of the accuracy of regulatory interactions learnt from data for all network sizes. Furthermore, their performance degraded smoothly as the size of the network increased. Continuous time Bayesian networks were significantly better than dynamic Bayesian networks for all time granularities tested and better than Granger causality for dense time series. Both continuous time Bayesian networks and Granger causality performed robustly for unevenly spaced time series, with no significant loss of performance compared to the evenly spaced case, while the same did not hold true for dynamic Bayesian networks. The comparison included the IRMA experimental datasets, which confirmed the effectiveness of the proposed method. Continuous time Bayesian networks were then applied to elucidate the regulatory mechanisms controlling murine T helper 17 (Th17) cell differentiation and were found to be effective in discovering well-known regulatory mechanisms, as well as new plausible biological insights.
Conclusions: Continuous time Bayesian networks were effective on networks of both small and large size and were particularly feasible when the measurements were not evenly distributed over time. Reconstruction of the murine Th17 cell differentiation network using continuous time Bayesian networks revealed several autocrine loops, suggesting that Th17 cells may be auto-regulating their own differentiation process.
Article
Full-text available
Autonomous chronic disease management requires models that are able to interpret time series data from patients. However, construction of such models by means of machine learning requires the availability of costly health-care data, often resulting in small samples. We analysed data from chronic obstructive pulmonary disease (COPD) patients with the goal of constructing a model to predict the occurrence of exacerbation events, i.e., episodes of decreased pulmonary health status. Data from 10 COPD patients, gathered with our home monitoring system, were used for temporal Bayesian network learning, combined with bootstrapping methods for data analysis of small data samples. For comparison, a temporal variant of augmented naive Bayes models and a temporal nodes Bayesian network (TNBN) were constructed. The performances of the methods were first tested with synthetic data. Subsequently, different COPD models were compared to each other using an external validation data set. The model learning methods are capable of finding good predictive models for our COPD data. Model averaging over models based on bootstrap replications is able to find a good balance between true and false positive rates on predicting COPD exacerbation events. Temporal naive Bayes offers an alternative that trades some performance for a reduction in computation time and easier interpretation.
Article
Full-text available
Geoscientific measurements often provide time series with irregular time sampling, requiring either data reconstruction (interpolation) or sophisticated methods to handle irregular sampling. We compare the linear interpolation technique and different approaches for analyzing the correlation functions and persistence of irregularly sampled time series, such as Lomb-Scargle Fourier transformation and kernel-based methods. In a thorough benchmark test we investigate the performance of these techniques. All methods have comparable root mean square errors (RMSEs) for low skewness of the inter-observation time distribution. For high skewness, i.e. very irregular data, interpolation bias and RMSE increase strongly. In the analysis of highly irregular time series, we find a 40% lower RMSE for the lag-1 autocorrelation function (ACF) for the Gaussian kernel method vs. the linear interpolation scheme. For the cross-correlation function (CCF) the RMSE is then lower by 60%. The application of the Lomb-Scargle technique gave results comparable to the kernel methods in the univariate case, but poorer results in the bivariate case. Especially the high-frequency components of the signal, where classical methods show a strong bias in ACF and CCF magnitude, are preserved when using the kernel methods. We illustrate the performance of interpolation vs. the Gaussian kernel method by applying both to paleo-data from four locations, reflecting late Holocene Asian monsoon variability as derived from speleothem δ18O measurements. Cross-correlation results are similar for both methods, which we attribute to the long time scales of the common variability. The persistence time (memory) is strongly overestimated when using the standard, interpolation-based approach. Hence, the Gaussian kernel is a reliable and more robust estimator with significant advantages compared to other techniques and is suitable for large-scale application to paleo-data.
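The Gaussian kernel estimator named above can be sketched directly: every pair of observations contributes to the correlation at a given lag, with a weight that decays with the pair's distance from that lag. A minimal version (the bandwidth and the toy data are illustrative):

```python
import numpy as np

def kernel_acf(t: np.ndarray, x: np.ndarray, lag: float, h: float) -> float:
    """Gaussian-kernel lag estimator for irregularly sampled data: pair
    (i, j) gets weight exp(-(t_j - t_i - lag)^2 / (2 h^2)) instead of
    requiring the spacing to match the lag exactly."""
    x = (x - x.mean()) / x.std()                   # standardize
    dt = t[None, :] - t[:, None]                   # all pairwise time differences
    w = np.exp(-((dt - lag) ** 2) / (2.0 * h ** 2))
    np.fill_diagonal(w, 0.0)                       # drop self-pairs
    return float(np.sum(w * np.outer(x, x)) / np.sum(w))

rng = np.random.default_rng(1)
t = np.sort(rng.uniform(0, 100, 300))              # irregular observation times
x = np.sin(t / 5.0) + 0.3 * rng.normal(size=t.size)
print(kernel_acf(t, x, lag=1.0, h=0.5))
```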
Conference Paper
Full-text available
In this paper we present a language for finite state continuous time Bayesian networks (CTBNs), which describe structured stochastic processes that evolve over continuous time. The state of the system is decomposed into a set of local variables whose values change over time. The dynamics of the system are described by specifying the behavior of each local variable as a function of its parents in a directed (possibly cyclic) graph. The model specifies, at any given point in time, the distribution over two aspects: when a local variable changes its value and the next value it takes. These distributions are determined by the variable's current value and the current values of its parents in the graph. More formally, each variable is modelled as a finite state continuous time Markov process whose transition intensities are functions of its parents. We present a probabilistic semantics for the language in terms of the generative model a CTBN defines over sequences of events. We list the types of queries one might ask of a CTBN, discuss the conceptual and computational difficulties associated with exact inference, and provide an algorithm for approximate inference which takes advantage of the structure within the process.
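The two per-variable distributions (when a transition happens, and to which state) can be made concrete with a small sampler; the conditional intensity matrices below are invented for illustration, loosely themed on the COPD setting of the main article.

```python
import numpy as np

rng = np.random.default_rng(2)

# Conditional intensity matrices (CIMs) for a binary variable X, one per
# parent state u (rates are illustrative assumptions). State x is left
# with rate -Q[x, x]; off-diagonal entries give the rates to other states.
Q = {
    0: np.array([[-0.1, 0.1], [0.5, -0.5]]),   # parent "stable": slow onset
    1: np.array([[-2.0, 2.0], [0.2, -0.2]]),   # parent "exacerbated": fast onset
}

def sample_trajectory(u: int, x0: int, t_end: float):
    """Sample one variable's piecewise-constant trajectory while the
    parent stays in state u (a fixed-parent simplification)."""
    t, x, path = 0.0, x0, [(0.0, x0)]
    while True:
        rate = -Q[u][x, x]
        t += rng.exponential(1.0 / rate)        # exponential waiting time
        if t >= t_end:
            return path
        probs = Q[u][x].clip(min=0.0)           # off-diagonal intensities
        probs /= probs.sum()                    # next-state distribution
        x = rng.choice(len(probs), p=probs)
        path.append((t, x))

print(sample_trajectory(u=1, x0=0, t_end=5.0))
```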
Conference Paper
Full-text available
Continuous time Bayesian networks (CTBNs) describe structured stochastic processes with finitely many states that evolve over continuous time. A CTBN is a directed (possibly cyclic) dependency graph over a set of variables, each of which represents a finite state continuous time Markov process whose transition model is a function of its parents. We address the problem of learning parameters and structure of a CTBN from fully observed data. We define a conjugate prior for CTBNs, and show how it can be used both for Bayesian parameter estimation and as the basis of a Bayesian score for structure learning. Because acyclicity is not a constraint in CTBNs, we can show that the structure learning problem is significantly easier, both in theory and in practice, than structure learning for dynamic Bayesian networks (DBNs). Furthermore, as CTBNs can tailor the parameters and dependency structure to the different time granularities of the evolution of different variables, they can provide a better fit to continuous-time processes than DBNs with a fixed time granularity.
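For fully observed trajectories, the sufficient statistics and maximum-likelihood estimates involved in CTBN parameter learning are simple to state; below is a minimal sketch for one (variable, parent-instantiation) family. A Bayesian variant, in the spirit of the conjugate prior described above, would add Gamma/Dirichlet pseudo-counts to M and T.

```python
import numpy as np

def mle_intensities(transitions, n_states):
    """Sufficient statistics per family:
      M[x, x'] = number of x -> x' transitions,
      T[x]     = total time spent in state x.
    MLE: leaving rate q_x = sum_x' M[x, x'] / T[x];
         next-state distribution theta[x, x'] = M[x, x'] / sum_x' M[x, x']."""
    M = np.zeros((n_states, n_states))
    T = np.zeros(n_states)
    for x, x_next, dwell in transitions:
        M[x, x_next] += 1
        T[x] += dwell
    q = M.sum(axis=1) / T
    theta = M / M.sum(axis=1, keepdims=True)
    return q, theta

# toy fully observed data: (state, next state, dwell time before the jump)
data = [(0, 1, 3.2), (1, 0, 0.7), (0, 1, 2.9), (1, 0, 1.1)]
q, theta = mle_intensities(data, n_states=2)
print(q.round(3), theta)
```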
Article
Full-text available
Intrusion detection systems (IDSs) fall into two high-level categories: network-based systems (NIDS) that monitor network behaviors, and host-based systems (HIDS) that monitor system calls. In this work, we present a general technique for both systems. We use anomaly detection, which identifies patterns not conforming to a historic norm. In both types of systems, the rates of change vary dramatically over time (due to burstiness) and over components (due to service difference). To efficiently model such systems, we use continuous time Bayesian networks (CTBNs) and avoid specifying a fixed update interval common to discrete-time models. We build generative models from the normal training data, and abnormal behaviors are flagged based on their likelihood under this norm. For NIDS, we construct a hierarchical CTBN model for the network packet traces and use Rao-Blackwellized particle filtering to learn the parameters. We illustrate the power of our method through experiments on detecting real worms and identifying hosts on two publicly available network traces, the MAWI dataset and the LBNL dataset. For HIDS, we develop a novel learning method to deal with the finite resolution of system log file time stamps, without losing the benefits of our continuous time model. We demonstrate the method by detecting intrusions in the DARPA 1998 BSM dataset.
Article
Full-text available
Continuous time Bayesian networks are used to diagnose cardiogenic heart failure and to anticipate its likely evolution. The proposed model overcomes the strong modeling and computational limitations of dynamic Bayesian networks. It consists of both unobservable physiological variables and clinically and instrumentally observable events that might support diagnosis, such as myocardial infarction and the future occurrence of shock. Three case studies related to cardiogenic heart failure are presented. The model predicts the occurrence of complicating diseases and the persistence of heart failure according to variations in the evidence gathered from the patient. Predictions are shown to be consistent with current pathophysiological medical understanding of clinical pictures.
Article
Full-text available
Current tools for recording chronic obstructive pulmonary disease (COPD) exacerbations are limited and often lack validity testing. We assessed the validity of an automated telephonic exacerbation assessment system (TEXAS) and compared its outcomes with existing tools. Over 12 months, 86 COPD patients (22.1% females; mean age 66.5 yrs; mean post-bronchodilator forced expiratory volume in 1 s 53.4% predicted) were called once every 2 weeks by TEXAS to record changes in respiratory symptoms, unscheduled healthcare utilisation and use of respiratory medication. The responses to TEXAS were validated against exacerbation-related information collected by observations made by trained research assistants during home visits. No care assistance was provided in any way. Diagnostic test characteristics were estimated using commonly used definitions of exacerbation. Detection rates, compliance and patient preference were assessed, and compared with paper diary cards and medical record review. A total of 1,824 successful calls were recorded, of which 292 were verified by home visits (median four calls per patient, interquartile range three to five calls per patient). Independent of the exacerbation definition used, validity was high, with sensitivities and specificities between 66% and 98%. Detection rates and compliance differed extensively between the different tools, but were highest with TEXAS. Patient preference did not differ. TEXAS is a valid tool to assess COPD exacerbation rates in prospective clinical studies. Using different tools to record exacerbations strongly affects exacerbation occurrence rates.
Article
Full-text available
The ability to objectively differentiate exacerbations of chronic obstructive pulmonary disease (COPD) from day-to-day symptom variations would be an important development in clinical practice and research. We assessed the ability of domiciliary pulse oximetry to achieve this. 40 patients with moderate-severe COPD collected daily data on changes in symptoms, heart rate (HR), oxygen saturation (SpO2) and peak expiratory flow (PEF) over a total of 2705 days. 31 patients had data suitable for baseline analysis, and 13 patients experienced an exacerbation. Data were expressed as multiples of the standard deviation (SD) observed for each patient when stable. In stable COPD, the SDs for HR, SpO2 and PEF were approximately 5 min⁻¹, 1% and 10 L·min⁻¹. There were detectable changes in all three variables just prior to exacerbation onset, greatest 2-3 days following symptom onset. A composite Oximetry Score (mean magnitude of SpO2 fall and HR rise) distinguished exacerbation onset from symptom variation (area under the receiver operating characteristic curve, AUC = 0.832, 95% CI 0.735-0.929, p = 0.003). In the presence of symptoms, a change in Score of ≥1 (an average of ≥1 SD change in both HR and SpO2) was 71% sensitive and 74% specific for exacerbation onset. We have defined the normal variation of pulse oximetry variables in a small sample of patients with COPD. A composite HR and SpO2 score distinguished exacerbation onset from symptom variation, potentially facilitating prompt therapy and providing validation of such events in clinical trials.
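The composite Oximetry Score lends itself to a short sketch: express the SpO2 fall and HR rise as multiples of the patient's stable-period SD and average them, flagging scores of at least 1. The baseline data below are simulated, using SDs of the magnitude reported above.

```python
import numpy as np

def oximetry_score(spo2, hr, spo2_base, hr_base):
    """Composite score as described above: mean of the SpO2 fall and the
    HR rise, each expressed in units of the patient's stable-period SD
    (baselines are assumed to come from that patient's stable data)."""
    spo2_fall = (spo2_base.mean() - spo2) / spo2_base.std()
    hr_rise = (hr - hr_base.mean()) / hr_base.std()
    return (spo2_fall + hr_rise) / 2.0

rng = np.random.default_rng(3)
spo2_stable = 95 + rng.normal(0, 1.0, 60)   # SD ~1%, as reported
hr_stable = 80 + rng.normal(0, 5.0, 60)     # SD ~5 min^-1, as reported
score = oximetry_score(spo2=92.5, hr=91.0,
                       spo2_base=spo2_stable, hr_base=hr_stable)
print(round(score, 2), "flag" if score >= 1.0 else "no flag")
```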
Article
Full-text available
New technologies have allowed remote real-time electronic recording of symptoms and spirometry. The feasibility of utilising this technology in COPD patients has not been investigated. This is a feasibility study. The primary objective is to determine whether the use of an electronic diary with a portable spirometer can be managed by COPD patients with moderate to severe disease. Secondary objectives are to investigate the value of this method in the early detection of acute exacerbations of COPD (AECOPD). In this 6-month study, 18 patients recorded their symptom score and spirometry daily. Data were sent in real time. AECOPD, defined according to pre-set criteria, were noted. Spirometry values and scores for health-related quality of life were compared between the start and the end of the study. The hospitalisation rate due to AECOPD was compared with a parallel period in the previous year. On average, patients were able to record on 77% of their total study days. The system detected 73% of AECOPD. In a further 27% of AECOPD, patients sought treatment although the change in symptoms did not meet the AECOPD definition. The number of COPD-related hospitalisations was significantly reduced compared to the previous year. There was a significant increase in FEV1 and FVC from the start to the end of the study. The remote monitoring device used in this study can be used by COPD patients. AECOPD was detected early in the majority of cases. The hospitalisation rate due to AECOPD was reduced, and FEV1 and FVC values increased during the study.
Article
Full-text available
We present a brief, nonexhaustive overview of research efforts in designing and developing time-oriented systems in medicine. The growing volume of research on time-oriented systems in medicine can be viewed from either an application point of view, focusing on different generic tasks (e.g. diagnosis) and clinical areas (e.g. cardiology), or from a methodological point of view, distinguishing between different theoretical approaches. In this overview, we focus on highlighting methodological and theoretical choices, and conclude with suggestions for new research directions. Two main research directions can be noted: temporal reasoning, which supports various temporal inference tasks (e.g. temporal abstraction, time-oriented decision support, forecasting, data validation), and temporal data maintenance, which deals with storage and retrieval of data that have heterogeneous temporal dimensions. Efforts common to both research areas include the modeling of time, of temporal entities, and of temporal queries. We suggest that tasks such as abstraction of time-oriented data and the handling of different temporal-granularity levels should provide common ground for collaboration between the two research directions and fruitful areas for future research.
Article
Full-text available
Paper diaries are commonly used in health care and clinical research to assess patient experiences. There is concern that patients do not comply with diary protocols, possibly invalidating the benefit of diary data. Compliance was examined with a paper diary and with an electronic diary that incorporated compliance-enhancing features. Participants were chronic pain patients assigned to use either a paper diary instrumented to track diary use or an electronic diary that time-stamped entries. Participants were instructed to make three pain entries per day at predetermined times for 21 consecutive days. The primary outcome measures were reported versus actual compliance with the paper diary (defined by comparing the written times with the electronically recorded times of diary use); actual compliance with the electronic diary was recorded by the device itself. Participants submitted diary cards corresponding to 90% of assigned times (+/-15 min). However, electronic records indicated that actual compliance was only 11%, indicating a high level of faked compliance. On 32% of all study days the paper diary binder was not opened, yet reported compliance for these days exceeded 90%. For the electronic diary, the actual compliance rate was 94%. In summary, participants with chronic pain enrolled in a research study were not compliant with paper diaries but were compliant with an electronic diary with enhanced compliance features. The findings call into question the use of paper diaries and suggest that electronic diaries with compliance-enhancing features are a more effective way of collecting diary information.
Article
Capturing heterogeneous dynamic systems in a probabilistic model is a challenging problem. A single time granularity, such as employed by dynamic Bayesian networks, provides insufficient flexibility to capture the dynamics of many real-world processes. The alternative is to assume that time is continuous, giving rise to continuous time Bayesian networks. Here the problem is that the level of temporal detail is too precise to match available probabilistic knowledge. In this paper, we present a novel class of models, called hybrid time Bayesian networks, which combine discrete-time and continuous-time Bayesian networks. The new formalism allows us to more naturally model dynamic systems with regular and irregularly changing variables. We also present a mechanism to construct discrete-time versions of hybrid models and an EM-based algorithm to learn the parameters of the resulting BNs. Its usefulness is illustrated by means of a real-world medical problem.
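One basic bridge between the two formalisms, relevant to any mechanism for building discrete-time versions of continuous-time models, is that an intensity matrix Q induces a valid transition matrix P(Δ) = exp(QΔ) for any step size Δ. A minimal sketch (the rates are illustrative, and the paper's construction for full hybrid models is more involved):

```python
import numpy as np
from scipy.linalg import expm

# A continuous-time intensity matrix Q (rows sum to 0) induces, for any
# step Delta, a discrete-time transition matrix P(Delta) = expm(Q * Delta)
# whose rows sum to 1, i.e. a CPT for a discrete-time model.
Q = np.array([[-0.3,  0.3],
              [ 0.1, -0.1]])

for delta in (0.5, 1.0, 5.0):
    P = expm(Q * delta)
    print(f"Delta={delta}:")
    print(P.round(3))          # longer steps mix the states more strongly
```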
Article
The continuous time Bayesian network (CTBN) enables reasoning about complex systems by representing the system as a factored, finite-state, continuous-time Markov process. Inference over the model incorporates evidence, given as state observations through time. The time dimension introduces several new types of evidence that are not found with static models. In this work, we present a comprehensive look at the types of evidence in CTBNs. Moreover, we define and extend inference to reason under uncertainty in the presence of uncertain evidence, as well as negative evidence, concepts extended to static models but not yet introduced into the CTBN model.
Article
Many definitions of COPD exacerbations have been reported. The choice of definition determines the number of exacerbations observed. However, the effect of different definitions on the effect sizes of randomized controlled trials is unclear. This article provides an overview of the large variation of definitions of COPD exacerbations from the literature. Furthermore, the effect of using different definitions on effect sizes (relative risk and hazard ratio) was investigated in a randomized controlled discontinuation trial of inhaled corticosteroids. The following definitions were applied: (1) unscheduled medical attention, (2) a course of oral corticosteroids/antibiotics, (3) deterioration in two major or one major and one minor symptom according to Anthonisen (referenced later), (4) a change in one or more symptoms, (5) a change in two or more symptoms, and (6) a combination of numbers 2 and 4. Relative risks for the exacerbation rate ranged from 1.19 to 1.49, and hazard ratios for time to first exacerbation ranged from 1.36 to 1.84 for the various definitions, varying from nonsignificant to significant. Because the definition of a COPD exacerbation has an impact on the effect size of interventions, there is an urgent need for concerted attempts to reach agreement on a definition of an exacerbation. Also, the exact definition to be used in a study should be specified in the protocol.
Article
COPD places an enormous burden on healthcare systems and diminishes health-related quality of life. The highest proportion of human and economic cost is associated with admissions for acute exacerbation of respiratory symptoms (AECOPD). Since prompt detection and treatment of exacerbations may improve outcomes, early detection of AECOPD is a critical issue. This pilot study aimed to determine whether a mobile health system could enable early detection of AECOPD on a day-to-day basis. A novel electronic questionnaire for the early detection of COPD exacerbations was evaluated during a 6-month field trial in a group of 16 patients. Pattern recognition techniques were applied. A k-means clustering algorithm was trained and validated, and its accuracy in detecting AECOPD was assessed. Sensitivity and specificity were 74.6% and 89.7%, respectively, and the area under the receiver operating characteristic curve was 0.84. 31 out of 33 AECOPD were identified early, on average 4.5 ± 2.1 days prior to the onset of the exacerbation, which was considered the day of medical attendance. Based on the findings of this preliminary pilot study, the proposed electronic questionnaire and the applied methodology could help detect COPD exacerbations early on a day-to-day basis and therefore could provide support to patients and physicians.
Article
This paper presents methods for analyzing and manipulating unevenly spaced time series without a transformation to equally spaced data. Processing and analyzing such data in its unaltered form avoids the biases and information loss caused by resampling. Care is taken to develop a framework consistent with a traditional analysis of equally spaced data, as in Brockwell and Davis (1991), Hamilton (1994) and Box, Jenkins, and Reinsel (2004).
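A simple example of such direct processing is an exponential moving average whose decay depends on the actual elapsed time between observations, so no resampling to a regular grid is needed (the time constant and data below are illustrative assumptions):

```python
import numpy as np

def ema_uneven(t: np.ndarray, x: np.ndarray, tau: float) -> np.ndarray:
    """Exponential moving average for unevenly spaced observations: the
    previous average decays by exp(-dt / tau), where dt is the real
    elapsed time since the last observation."""
    out = np.empty_like(x, dtype=float)
    out[0] = x[0]
    for i in range(1, len(x)):
        w = np.exp(-(t[i] - t[i - 1]) / tau)   # older information decays with real time
        out[i] = w * out[i - 1] + (1.0 - w) * x[i]
    return out

t = np.array([0.0, 0.5, 3.0, 3.2, 10.0])       # irregular observation times
x = np.array([1.0, 2.0, 0.0, 1.5, 1.0])
print(ema_uneven(t, x, tau=2.0).round(3))
```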
Article
One of the basic (and often implicit) assumptions of most first-generation diagnostic expert systems is that they operate in a static environment. However, the static domain assumption is very limiting since it requires that all manifestations are observable ...
Article
Longitudinal studies often involve the repeated diagnosis across time of each patient's status with respect to a progressive categorical process. When the occurrence of a change in status is not readily apparent, two factors can make modeling and assessing the incidence rates of progression difficult. First, because diagnoses may be difficult, they may not be performed with the frequency necessary to pinpoint exact times of incidence. Second, uncertainty in the diagnostic process can obscure identification of the time interval in which incidence occurs. When serial diagnoses are fallible, even small error rates can seriously disrupt interpretation and make using the aforementioned methods difficult or impossible. For example, if false diagnoses (both false positives and negatives) occur independently with probability .05 in a longitudinal study involving four serial diagnoses, 19% of the strings of serial diagnoses would be expected to contain at least one error. If the underlying process is progressive, many of these errors would be noticeable: At face value, some patterns of diagnoses would describe regressions. Errors yielding patterns of diagnoses that are progressive would not be detectable. Simply omitting any subjects with inconsistent patterns from the analysis introduces bias. Another possible approach, using the first reported incidence of progression, also introduces bias (Schlesselman 1977). To analyze clinical data on the diagnosis of sexual maturation among subjects with sickle-cell disease, models are developed for jointly parameterizing incidence and error rates. An EM algorithm is presented that allows tractable maximum likelihood estimation even when the times of diagnoses are irregular and vary among subjects. Likelihood ratio tests are used to assess relationships between categorical covariates and both incidence and error rates. Data from the Cooperative Study of Sickle Cell Disease are analyzed to describe the age distribution for the onset of puberty (according to the Tanner stage index) among homozygous (SS) males. Clear delays in maturation are apparent among SS males. Diagnostic error rates for Tanner staging appear to vary with the subject's age. False-positive diagnoses appear to be more common than false-negative diagnoses.
Article
Introduction: Managing chronic disease through automated systems has the potential to both benefit the patient and reduce healthcare costs. We have developed and evaluated a disease management system for patients with chronic obstructive pulmonary disease (COPD). Its aim is to predict and detect exacerbations and, through this, help patients self-manage their disease to prevent hospitalisation. Materials: The system consists of a mobile device that is able to collect case-specific, subjective and objective, physiological data, and to alert the patient based on a patient-specific interpretation of the data by means of probabilistic reasoning. Collected data are also sent to a central server for inspection by healthcare professionals. Methods: We evaluated the probabilistic model using cross-validation and ROC analyses on data from an earlier study and on an independent data set. Furthermore, a pilot with actual COPD patients was conducted to test technical feasibility and to obtain user feedback. Results: Model evaluation results show that exacerbations can be detected reliably. Pilot study results suggest that an intervention based on this system could be successful.
Article
We investigated whether physiological data can be used for predicting chronic obstructive pulmonary disease (COPD) exacerbations. Home measurements from 57 patients were analysed, during which 10 exacerbations occurred in nine patients. A total of 273 different features were evaluated for their ability to discriminate between periods with and without exacerbations. The analysis showed that if a sensitivity level of 70% is considered acceptable, the corresponding specificity was 95% and the AUC was 0.73, i.e. it is possible to discriminate between periods with and without exacerbation. A system capable of predicting risk could provide support to COPD patients in their tele-rehabilitation.
Conference Paper
Continuous-time Bayesian networks (CTBNs) (Nodelman, Shelton, & Koller 2002; 2003) are an elegant modeling language for structured stochastic processes that evolve over continuous time. The CTBN framework is based on homogeneous Markov processes, and defines two distributions with respect to each local variable in the system, given its parents: an exponential distribution over when the variable transitions, and a multinomial over what the next value is. In this paper, we present two extensions to the framework that make it more useful in modeling practical applications. The first extension models arbitrary transition time distributions using Erlang-Coxian approximations, while maintaining tractable learning. We show how the censored data problem arises in learning the distribution, and present a solution based on expectation-maximization initialized by the Kaplan-Meier estimate. The second extension is a general method for reasoning about negative evidence, by introducing updates that assert that no observable events occur over an interval of time. Such updates were not defined in the original CTBN framework, and we show that their inclusion can significantly improve the accuracy of filtering and prediction. We illustrate and evaluate these extensions in two real-world domains: email use and GPS traces of a person traveling about a city.
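To make the two local distributions concrete, here is a minimal sketch that simulates a single CTBN variable for one fixed parent configuration (the rates and jump probabilities are invented for illustration): the exponential distribution decides when the variable leaves its current state, and the multinomial decides where it jumps.

```python
import numpy as np

rng = np.random.default_rng(0)

# Local model for one variable with 3 states, given a fixed parent
# configuration: q[s] is the rate of leaving state s (so the sojourn
# time is Exponential with mean 1/q[s]); P[s] is the distribution over
# the next state (self-transitions have probability zero).
q = [0.5, 2.0, 1.0]
P = [[0.0, 0.7, 0.3],
     [0.4, 0.0, 0.6],
     [0.5, 0.5, 0.0]]

def simulate(state, horizon):
    """Sample (transition time, new state) pairs up to the horizon."""
    t, path = 0.0, []
    while True:
        t += rng.exponential(1.0 / q[state])   # when: exponential sojourn
        if t >= horizon:
            return path
        state = rng.choice(3, p=P[state])      # where: multinomial jump
        path.append((round(t, 2), state))

print(simulate(state=0, horizon=10.0))
```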
Article
We present a continuous-time Bayesian network (CTBN) framework for dynamic systems reliability modeling and analysis. Dynamic systems exhibit complex behaviors and interactions between their components, where not only the combination of failure events matters, but so does the ordering of the failure sequence. Similar to dynamic fault trees, the CTBN framework defines a set of 'basic' BN constructs that capture well-defined component behaviors and interactions. Combining the various 'basic' Bayesian network constructs in a structured way enables the user to build the system model in a modular and hierarchical fashion. Within the CTBN framework, one can perform various analyses, including reliability, sensitivity, and uncertainty analyses, all of which allow the user to obtain closed-form solutions.
Thesis
Modelling sequential data is important in many areas of science and engineering. Hidden Markov models (HMMs) and Kalman filter models (KFMs) are popular for this because they are simple and flexible. For example, HMMs have been used for speech recognition and bio-sequence analysis, and KFMs have been used for problems ranging from tracking planes and missiles to predicting the economy. However, HMMs and KFMs are limited in their “expressive power”. Dynamic Bayesian Networks (DBNs) generalize HMMs by allowing the state space to be represented in factored form, instead of as a single discrete random variable. DBNs generalize KFMs by allowing arbitrary probability distributions, not just (unimodal) linear-Gaussian. In this thesis, I will discuss how to represent many different kinds of models as DBNs, how to perform exact and approximate inference in DBNs, and how to learn DBN models from sequential data. In particular, the main novel technical contributions of this thesis are as follows: a way of representing Hierarchical HMMs as DBNs, which enables inference to be done in O(T) time instead of O(T^3), where T is the length of the sequence; an exact smoothing algorithm that takes O(log T) space instead of O(T); a simple way of using the junction tree algorithm for online inference in DBNs; new complexity bounds on exact online inference in DBNs; a new deterministic approximate inference algorithm called factored frontier; an analysis of the relationship between the BK algorithm and loopy belief propagation; a way of applying Rao-Blackwellised particle filtering to DBNs in general, and the SLAM (simultaneous localization and mapping) problem in particular; a way of extending the structural EM algorithm to DBNs; and a variety of different applications of DBNs. However, perhaps the main value of the thesis is its catholic presentation of the field of sequential data modelling.
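A minimal sketch of the factored-state idea (not code from the thesis): with two binary state variables, a flat HMM needs a full 4×4 joint transition table (12 free parameters), whereas a DBN assembles the same joint transition from small per-variable conditional tables (here 2 + 4 = 6 free parameters).

```python
import numpy as np

# Hypothetical 2-slice DBN over binary variables X1, X2:
# X1' depends on X1 only; X2' depends on (X1, X2).
P_x1 = np.array([[0.9, 0.1],            # P_x1[x1] -> P(X1'=0), P(X1'=1)
                 [0.3, 0.7]])
P_x2 = np.array([[[0.8, 0.2],           # P_x2[x1, x2] -> P(X2'=0), P(X2'=1)
                  [0.4, 0.6]],
                 [[0.5, 0.5],
                  [0.1, 0.9]]])

# Assemble the equivalent flat 4x4 transition matrix over (X1, X2).
T = np.zeros((4, 4))
for x1 in range(2):
    for x2 in range(2):
        for y1 in range(2):
            for y2 in range(2):
                T[2 * x1 + x2, 2 * y1 + y2] = P_x1[x1, y1] * P_x2[x1, x2, y2]

assert np.allclose(T.sum(axis=1), 1.0)  # each row is a proper distribution
print(T)
```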
Article
The effects of broad-spectrum antibiotic and placebo therapy in patients with chronic obstructive pulmonary disease in exacerbation were compared in a randomized, double-blinded, crossover trial. Exacerbations were defined in terms of increased dyspnea, sputum production, and sputum purulence. Exacerbations were followed at 3-day intervals by home visits, and those that resolved in 21 days were designated treatment successes. Treatment failures included exacerbations in which symptoms did not resolve but no intervention was necessary, and those in which the patient's condition deteriorated so that intervention was necessary. Over 3.5 years in 173 patients, 362 exacerbations were treated, 180 with placebo and 182 with antibiotic. The success rate with placebo was 55% and with antibiotic 68%. The rate of failure with deterioration was 19% with placebo and 10% with antibiotic. There was a significant benefit associated with antibiotic. Peak flow recovered more rapidly with antibiotic treatment than with placebo. Side effects were uncommon and did not differ between antibiotic and placebo.
Article
Although exacerbations of chronic obstructive pulmonary disease (COPD) are associated with symptomatic and physiological deterioration, little is known of the time course and duration of these changes. We have studied symptoms and lung function changes associated with COPD exacerbations to determine factors affecting recovery from exacerbation. A cohort of 101 patients with moderate to severe COPD (mean FEV1 41.9% predicted) was studied over a period of 2.5 yr and regularly followed when stable and during 504 exacerbations. Patients recorded daily morning peak expiratory flow rate (PEFR) and changes in respiratory symptoms on diary cards. A subgroup of 34 patients also recorded daily spirometry. Exacerbations were defined by major symptoms (increased dyspnea, increased sputum purulence, increased sputum volume) and minor symptoms. Before the onset of exacerbation there was deterioration in the symptoms of dyspnea, sore throat, cough, and symptoms of a common cold (all p < 0.05), but not in lung function. Larger falls in PEFR were associated with symptoms of increased dyspnea (p = 0.014), colds (p = 0.047), or increased wheeze (p = 0.009) at exacerbation. Median recovery times were 6 (interquartile range [IQR] 1 to 14) days for PEFR and 7 (IQR 4 to 14) days for daily total symptom score. Recovery of PEFR to baseline values was complete in only 75.2% of exacerbations at 35 days, whereas in 7.1% of exacerbations at 91 days PEFR recovery had not occurred. In the 404 exacerbations where recovery of PEFR to baseline values was complete at 91 days, increased dyspnea and colds at onset of exacerbation were associated with prolonged recovery times (p < 0.001 in both cases). Symptom changes during exacerbation do not closely reflect those of lung function, but their increase may predict exacerbation, with dyspnea or colds characterizing the more severe episodes. Recovery is incomplete in a significant proportion of COPD exacerbations.
Article
Exacerbations of COPD are thought to be caused by complex interactions between the host, bacteria, viruses, and environmental pollution. These factors increase the inflammatory burden in the lower airways, overwhelming the protective anti-inflammatory defences leading to tissue damage. Frequent exacerbations are associated with increased morbidity and mortality, a faster decline in lung function, and poorer health status, so prevention or optimal treatment of exacerbations is a global priority. In order to evolve new treatment strategies there has been great interest in the aetiology and pathophysiology of exacerbations, but progress has been hindered by the heterogeneous nature of these episodes, vague definitions of an exacerbation, and poor stratification of known confounding factors when interpreting results. We review how an exacerbation should be defined, its inflammatory basis, and the importance of exacerbations on disease progression. Important aetiologies, with their potential underlying mechanisms, are discussed and the significance of each aetiology is considered.
Article
The main aim of this paper is to propose and discuss promising directions of research in the field of temporal representation and reasoning in medicine, taking into account the recent scientific literature and challenging issues of current interest as viewed from the different research perspectives of the authors of the paper. Temporal representation and reasoning in medicine is a well-known field of research in the medical as well as computer science community. It encompasses several topics, such as summarizing data from temporal clinical databases, reasoning on temporal clinical data for therapeutic assessments, and modeling uncertainty in clinical knowledge and data. It is also related to several medical tasks, such as monitoring intensive care patients, providing treatments for chronic patients, and planning and scheduling clinical routine activities within complex healthcare organizations. The authors jointly identified significant research areas based on their importance for temporal representation and reasoning issues; the subjects were considered to be promising topics for future activity. Every subject was addressed in detail by one or two authors and then discussed with the entire team to achieve a consensus about future fields of research. We identified and focused on four research areas, namely (i) fuzzy logic, time, and medicine, (ii) temporal reasoning and data mining, (iii) health information systems, business processes, and time, and (iv) temporal clinical databases. For every area, we first highlighted a few basic notions that would permit any reader, including those who are unfamiliar with the topic, to understand the main goals. We then discussed interesting and promising directions of research, taking into account the recent literature and underlining the as yet unresolved medical/clinical issues that deserve further scientific investigation. The considered research areas are by no means disjoint, because they share common theoretical and methodological features. Moreover, subjects of imminent interest in medicine are represented in many of the fields considered. We propose and discuss promising subjects of future research that deserve investigation to develop software systems that will properly manage the multifaceted temporal aspects of information and knowledge encountered by physicians during their clinical work. As the subjects of research have resulted from merging the different perspectives of the authors involved in this study, we hope the paper will succeed in stimulating discussion and multidisciplinary work in the described fields of research.
Article
Exacerbations of chronic obstructive pulmonary disease (COPD) are episodes of worsening of symptoms, leading to substantial morbidity and mortality. COPD exacerbations are associated with increased airway and systemic inflammation and physiological changes, especially the development of hyperinflation. They are triggered mainly by respiratory viruses and bacteria, which infect the lower airway and increase airway inflammation. Some patients are particularly susceptible to exacerbations, and show worse health status and faster disease progression than those who have infrequent exacerbations. Several pharmacological interventions are effective for the reduction of exacerbation frequency and severity in COPD such as inhaled steroids, long-acting bronchodilators, and their combinations. Non-pharmacological therapies such as pulmonary rehabilitation, self-management, and home ventilatory support are becoming increasingly important, but still need to be studied in controlled trials. The future of exacerbation prevention is in assessment of optimum combinations of pharmacological and non-pharmacological therapies that will result in improvement of health status, and reduction of hospital admission and mortality associated with COPD.
Article
A frequent problem in longitudinal studies is that subjects may miss scheduled visits or be assessed at self-selected points in time. As a result, observed outcome data may be highly unbalanced and the availability of the data may be directly related to the outcome measure and/or some auxiliary factors that are associated with the outcome. If the follow-up visit and outcome processes are correlated, then marginal regression analyses will produce biased estimates. Building on the work of Robins, Rotnitzky and Zhao, we propose a class of inverse intensity-of-visit process-weighted estimators in marginal regression models for longitudinal responses that may be observed in continuous time. This allows us to handle arbitrary patterns of missing data as embedded in a subject's visit process. We derive the large sample distribution for our inverse visit-intensity-weighted estimators and investigate their finite sample behaviour by simulation. Our approach is illustrated with a data set from a health services research study in which homeless people with mental illness were randomized to three different treatments and measures of homelessness (as percentage days homeless in the past 3 months) and other auxiliary factors were recorded at follow-up times that are not fixed by design.
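The core of the weighting idea can be sketched in a few lines (hypothetical data, and the visit intensity is assumed known here, whereas the paper's estimators model the visit process to obtain it): visits that were a priori likely, because the subject was doing poorly and visiting often, are down-weighted so the marginal regression is not dominated by outcome-driven follow-up.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical population of potential visits: covariate x, outcome y.
x = rng.normal(size=5000)
y = 1.0 + 2.0 * x + rng.normal(scale=0.5, size=5000)

# Outcome-dependent visiting: higher y makes a visit more likely.
p_visit = 1.0 / (1.0 + np.exp(-(y - 1.0)))      # assumed-known intensity
observed = rng.random(5000) < p_visit

# Unweighted least squares on observed visits is biased; weighting each
# visit by the inverse of its intensity approximately removes the bias.
X = np.column_stack([np.ones_like(x), x])[observed]
yo, w = y[observed], 1.0 / p_visit[observed]

ols = np.linalg.solve(X.T @ X, X.T @ yo)
wls = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * yo))
print("unweighted:", ols)   # pulled away from the truth by selection
print("weighted:  ", wls)   # close to the true coefficients [1.0, 2.0]
```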
Maclaurin D, Duvenaud D, Adams R. Gradient-based hyperparameter optimization through reversible learning. International Conference on Machine Learning 2015:2113-22.
Thornton C, Hutter F, Hoos HH, Leyton-Brown K. Auto-WEKA: Combined selection and hyperparameter optimization of classification algorithms. Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining 2013:847-55.
Zhang Y, Deng Z, Jiang H, Jia P. Dynamic Bayesian network (DBN) with structure expectation maximization (SEM) for modeling of gene network from time series gene expression data. BIOCOMP 2006:41-7.