Article

General framework and model building in the class of Hidden Mixture Transition Distribution models

Abstract

Modeling time series that present non-Gaussian features plays a central role in many fields, including finance, seismology, psychology, and life course studies. The Hidden Mixture Transition Distribution model is an answer to the complexity of such series. The observed heterogeneity can be induced by one or several latent factors, and each level of these factors is related to a different component of the observed process. The time series is then treated as a mixture and the relation between the components is governed by a Markovian latent transition process. This framework generalizes several specifications that appear separately in related literature. Both the expectation and the standard deviation of each component are allowed to be functions of the past of the process. The latent process can be of any order, and can be modeled using a discrete Mixture Transition Distribution. The effects of covariates at the visible and hidden levels are also investigated. One of the main difficulties lies in correctly specifying the structure of the model. Therefore, we propose a hierarchical model selection procedure that exploits the multilevel structure of our approach. Finally, we illustrate the model and the model selection procedure through a real application in social science.
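
The two-level structure described in the abstract can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the authors' implementation: the hidden chain is first-order, each component is Gaussian, the mean is taken to be linear in the previous observation, and the variance is taken to depend on the squared previous observation. The abstract only says that both are functions of the past, so these functional forms, the function name simulate_hmtd, and all parameter names are hypothetical.

```python
import math
import random


def simulate_hmtd(n, trans, mean_coefs, sd_coefs, x0=0.0, seed=0):
    """Simulate n observations from a toy hidden-mixture process.

    trans[i][j]   : probability of moving from hidden state i to state j
    mean_coefs[k] : (intercept, slope), giving mu = intercept + slope * x_prev
    sd_coefs[k]   : (base, arch), giving sigma = sqrt(base + arch * x_prev**2)
    All functional forms are illustrative assumptions.
    """
    rng = random.Random(seed)
    states, xs = [], []
    state, prev = 0, x0  # the chain is started from state 0 (arbitrary choice)
    for _ in range(n):
        # Hidden level: draw the next state from the current row of trans.
        state = rng.choices(range(len(trans)), weights=trans[state])[0]
        intercept, slope = mean_coefs[state]
        base, arch = sd_coefs[state]
        # Visible level: mean and standard deviation depend on the past value.
        mu = intercept + slope * prev
        sigma = math.sqrt(base + arch * prev ** 2)
        prev = rng.gauss(mu, sigma)
        states.append(state)
        xs.append(prev)
    return states, xs


states, xs = simulate_hmtd(
    n=50,
    trans=[[0.9, 0.1], [0.2, 0.8]],
    mean_coefs=[(0.0, 0.5), (2.0, -0.3)],
    sd_coefs=[(1.0, 0.1), (0.5, 0.2)],
    seed=1,
)
```

Each hidden state selects which Gaussian component generates the next observation, so the simulated series switches between two regimes with different dynamics.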

... Markovian models are typically used in this approach, because of their flexibility in terms of adjustment. In addition, Markov models have also been proposed at the latent level, where the distributional characteristics that are used for clustering are not those of the dependent variable under scrutiny, but of a latent variable thought to drive the indicator [4,5]. ...
... The third model we will use is the Hidden Mixture Transition Distribution model (HMTD), which has already been applied in health and social sciences [19,23]. The HMTD is a specific class of Markovian Models, which combines a latent and an observed (visible) level (Bolano and Berchtold [4]). The visible level is a Mixture Transition Distribution (MTD) model, which was first introduced by Raftery in 1985 as an approximation of high-order Markov chains [24] and then developed by Berchtold [25] and Berchtold and Raftery [11]. ...
... Nowadays there are multiple clustering methods available to researchers of many disciplines, and interest in these applications is on a steady rise. We have limited ourselves to three methods, never before presented together, which apply to sequences of continuous data, but many other methods exist and deserve attention [4]. ...
Article
Full-text available
In accordance with the theme of this special issue, we present a model that indirectly discovers symmetries and asymmetries between past and present assessments within continuous sequences. More specifically, we present an alternative use of a latent variable version of the Mixture Transition Distribution (MTD) model, which allows for clustering of continuous longitudinal data, called the Hidden MTD (HMTD) model. We compare the HMTD and its clustering performance to the popular Growth Mixture Model (GMM), as well as to the recently introduced GMM based on individual case residuals (ICR-GMM). The GMM and the ICR-GMM contrast with HMTD, because they are based on an explicit change function describing the individual sequences on the dependent variable (here, we implement a non-linear exponential change function). This paper has three objectives. First, it introduces the HMTD. Second, we present the GMM and the ICR-GMM and compare them to the HMTD. Finally, we apply the three models and comment on how the conclusions differ depending on the clustering model, when using a specific dataset in psychology, which is characterized by a small number of sequences (n = 102) that are nevertheless relatively long for the domains of psychology and social sciences (t = 20). We use data from a learning experiment, in which healthy adults (19–80 years old) were asked to perform a perceptual–motor skill over 20 trials.
... With regards to clustering, there are attempts to get rid of explicit dissimilarity measures by resorting to latent class approaches (Vermunt et al. 2008;Barban and Billari 2012), or more or less similarly hidden Markov models (e.g., Helske and Helske 2017;Bolano et al. 2016), to clustering the sequences. Markov-based approaches may also contribute to understanding the dynamics that drive the unfolding of the sequences. ...
... We used a specific class of Markovian Models, the HMTD model, to cluster the longitudinal sequences of continuous data. This model combines a latent and an observed level (Bolano and Berchtold 2016). The visible level is a Mixture Transition Distribution (MTD) model that was first introduced by Raftery in 1985 as an approximation of high-order Markov chains (Raftery 1985) and then developed by Berchtold (2001, 2003) and Berchtold and Raftery (2002). ...
... A comparison of the two specifications of the mean, with and without covariates, illustrates whether the inclusion of covariates in the model helps to improve the clustering process. It must be mentioned that, in addition to these two HMTD models, many other specifications were tried, following a hierarchical approach (Bolano and Berchtold 2016), but none of these alternative specifications seemed to give a more useful clustering of IAT trajectories. ...
Chapter
Full-text available
In the original version of this book, the second author, Emanuela Struffolino, was not listed as the corresponding author, and the affiliation of the author Camilla Borgna was incorrect. Emanuela Struffolino has now been included as the corresponding author, and the affiliation of Camilla Borgna has been corrected to Collegio Carlo Alberto, Turin, Italy.
Chapter
Full-text available
The aim of this study is to investigate how disabilities and the experiences of work and family during early adulthood affected subsequent mortality in nineteenth century Sundsvall, Sweden. To achieve this, sequence analysis and event history analyses are combined, using digitised parish registers from nineteenth-century Sweden. First, occurrence and type of disability, noted at latest on their 15th birthday, is recorded. Second, life trajectories are analysed using sequence analysis between ages 15 and 33 in order to determine homogeneous groups, given their experience of work and family in their early adulthood. Important demographic events that occur in the life of young adults—first occupation, first marriage and first child—are recorded yearly and cause the person’s trajectory to change state. Third, the groups derived are used as explanatory variables in combination with disability and other variables in Cox regressions with mortality as outcome. The individuals are followed from their 33rd birthday as long as the registers permit and it is noted if the period ends with death or if the observation is censored. The main findings are that the groups found for men are significantly associated with mortality and that mentally disabled women seem to have excess mortality. They also show that sequence analysis can be a valuable tool in summarising individuals’ life paths for use in subsequent analysis.
Chapter
Full-text available
This methodological paper presents three case studies of life course analysis in which the clustering of the sequences and probabilistic modelling have been combined or contrasted when analysing a particular research question. Latent variable hierarchical modelling is used in various ways to account for correlation in multidimensional response variables and to model underlying structures in the life course. We conclude that, while sequence analysis allows the use of multidimensional and time-dependent criteria, either for comparing life patterns of groups of individuals or for extracting subgroups for further analysis, the SA outcome may be rather unspecific for causal-like questions.
Chapter
Full-text available
This paper uses relational sequence networks to discern gendered patterns in migration biographies. Starting from an integrated bimodal network model of kinship and mobility relations, sequence networks are constructed by classifying mobility events according to the social (kinship or other) relation between the individuals they link together as migrants and hosts. Itineraries thus are conceived of as walks in a space of relational positions. Drawing on 60 migration itineraries from rural South-east Togo (embedded in a larger network of 509 itineraries), we show that male and female trajectories do not so much differ in their degree of mobility as in the topology of the social spaces they traverse and in the structure of the social sequences they trace. Rather than just confirming the macro-tendencies for male and female mobility patterns stated in the demographic literature, sequence network analysis yields insight into the relational logics that bring these tendencies about. Data have been analyzed with the open source software Puck, which implements the model presented in the paper.
Chapter
In this article, we propose an innovative method that combines Sequence Analysis and Event History Analysis. We call this method Sequence History Analysis (SHA). We start by identifying typical past trajectories of individuals over time by using Sequence Analysis. We then estimate the effect of these typical past trajectories on the event under study using discrete-time models. The aim of this approach is to estimate the effect of past trajectories on the chances of experiencing an event. We apply the proposed methodological approach to an original study of the effect of past childhood co-residence structures on the chances of leaving the parental home in Switzerland. The empirical research was based on the LIVES Cohort study, a panel survey that started in autumn 2013 in Switzerland. Analyses show that it is not only the occurrence of an event that increases the risk of experiencing another event, but also the order in which various states occurred. What is more, it seems that two features have a significant influence on departure from the parental home: the co-residence structures and the arrival or departure of siblings from the parental home.
Chapter
Full-text available
This chapter brings together methodological tools helping to compare sets of multiphase sequences, i.e., sequences structured into successive phases. First, the notions of phase and multiphase sequences are presented. Phases are defined by two properties—internal consistency and processual location—that imply two crucial methodological assumptions: phases are regarded both as sites of narratives and as dissociated incommensurable episodes. Three parameters of the division into phases are distinguished and exemplified: the reference frame of the division, the alphabet(s) of the dissociated phases, and the phase-structure of the sequences. Two ways of rendering multiphase sequences are considered: event-aligned representations and sliced representations. We then introduce multiphase optimal matching, a measure of pairwise distances between multiphase sequences the logic of which can be extended to other dissimilarity measures. Throughout the chapter, an example of two-phase sequences drawn from a study of the careers of participants in professional pâtissier competitions in France is developed.
Chapter
Full-text available
Drawing from the literature on “glass ceilings” and “glass escalators”, we analyze gender differences in career advancement across occupations. We argue that gender-typical occupations provide different opportunities for upward mobility in part due to varying institutional rules and work organizational logics. We further extend previous research by looking at two aspects: accessibility to and likelihood of staying in leadership. Using data from the German National Education Panel Study, we ask: (1) Do men demonstrate an advantage regarding access to and staying in leadership? (2) To what extent does occupational segregation explain gender differences in upward mobility? (3) Do gender effects vary across occupations? Using event history analysis, results confirm that occupational gender segregation largely explains gender differences in upward mobility. We further find that the probability of upward mobility is lower in female and higher in male occupations; however, the male advantage is nevertheless weaker in male occupations.
Chapter
Full-text available
In what ways do dual-earner couples organize their workdays and how do they (de)synchronize their daily activities? Using a multichannel sequence analysis approach, the paper tackles these questions. We consider the couples’ division of work-family activities in holistic terms by setting it within the context of everyday life, that is, the overall temporal pattern of combination of His and Her multiple activities. Our multichannel sequence analysis approach is based on a Lexicographic Index that seeks to overcome some optimal matching limits of the sequence analysis. The case-study concerns how Italian dual-earner couples organize their daily activities (sleep, personal care, work, moving, housework, free time), during a typical Monday to Friday work day, 7.00 am to 10.00 pm. The analysis, carried out using the data from the 2008 Italian Census on Time Use (the last one available), involves 873 couples where both partners filled the given diaries on the very same day. All the analyses confirm the idea that dual-earner couples package their life time mainly in accordance with their jobs and eventual children management. Moreover, the analyses show that this time packaging changes in relation to the level of education, social class and the occupational sector of the couple.
... Another feature of HMTD that is worth stressing is the possibility of using it to perform different kinds of clustering (Bolano and Berchtold 2016). The transition between components is driven by the hidden transition matrix A. In this paper, A was constrained to be the identity matrix, implying that each sequence was assigned to one and only one group, and all sequences assigned to the same group were described by the same visible model. ...
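
When A is the identity matrix, the hidden state never changes within a sequence, so the likelihood of a whole sequence reduces to a finite mixture over components and each sequence can be given a posterior membership probability. The following is a minimal sketch of that assignment step, assuming the component parameters are already known and that each visible component is a Gaussian first-order autoregression. The function names, the parameterization (intercept, slope, sigma), and the priors are hypothetical and do not reproduce the cited estimation procedure.

```python
import math


def gaussian_logpdf(x, mu, sigma):
    """Log density of a normal distribution with mean mu and std sigma."""
    return -0.5 * math.log(2 * math.pi * sigma ** 2) - (x - mu) ** 2 / (2 * sigma ** 2)


def cluster_posteriors(seq, priors, components):
    """Posterior probability of each component for one sequence.

    components[k] = (intercept, slope, sigma) means that, in group k,
    x_t is Gaussian with mean intercept + slope * x_{t-1} and std sigma.
    """
    log_joint = []
    for prior, (intercept, slope, sigma) in zip(priors, components):
        # With an identity hidden matrix the group is fixed over time,
        # so the sequence log-likelihood is a plain sum over time points.
        ll = math.log(prior)
        for prev, cur in zip(seq, seq[1:]):
            ll += gaussian_logpdf(cur, intercept + slope * prev, sigma)
        log_joint.append(ll)
    # Normalize in log space to avoid numerical underflow.
    top = max(log_joint)
    weights = [math.exp(v - top) for v in log_joint]
    total = sum(weights)
    return [w / total for w in weights]


posteriors = cluster_posteriors(
    seq=[1.0, 1.0, 1.0, 1.0],
    priors=[0.5, 0.5],
    components=[(0.0, 1.0, 0.5), (5.0, 0.0, 0.5)],
)
cluster = posteriors.index(max(posteriors))  # hard assignment to one group
```

A constant sequence around 1.0 is far better explained by the first component (which predicts the previous value) than by the second (which predicts 5.0), so it is assigned to the first group.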
Chapter
Full-text available
The main aim of this paper is to describe the use of the Markovian-based Hidden Mixture Transition Distribution (HMTD) model for the clustering of longitudinal sequences of continuous data. We especially discuss the use of covariates to improve the clustering process. The HMTD is compared to the well-known Growth Mixture Model (GMM) that is considered here as a gold standard. Both models are applied to a sample of n = 185 adolescents, who are repeatedly evaluated for Internet overuse using the Internet Addiction Test (IAT). The best solution provided by the HMTD model has four groups and it uses five covariates. This solution is related to the subjects’ level of emotional well-being, body mass index, gender, and education track, but shows no relation with age. Compared to a GMM clustering, the HMTD solution provides highly interpretable results with fairly balanced cluster sizes, while GMM tends to identify very small clusters, allowing for less generalization.
... [38] proposed parsimonious mixture models applied to the study of market ratings. For more references about the applications of mixtures and the MTD models, see for example [39][40][41][42][43][44][45] ...
... In this example, the packet arrivals data retrieved from [42] are fitted to the three models, the EPBMTD, RPBMTD and the ARIMA, to compare their predictive performances, and bigdata technologies are used for the analysis. Although these data are not bigdata, the aim of this example is to show how bigdata technologies could be used for their analysis. ...
Article
Full-text available
The pace in the development and adoption of the new technologies for bigdata analytics has changed dramatically over the last several decades, and the amount of data being digitally ingested and stored is expanding exponentially and rapidly. These data include structured, semi-structured and unstructured data, and come in different sizes and formats. To utilize these vast resources, the knowledge and the skills needed to manage them and to convert them into information are crucial. In this paper, firstly, the commonly used technologies, platforms, computational tools and the techniques currently in use for the ingesting, processing, storing and analyzing bigdata are reviewed. Secondly, those technologies are utilized to predict internet congestion by employing the bivariate mixture transition distribution (BMTD), expectation–maximization (EM) algorithm and the autoregressive integrated moving average (ARIMA) models. BMTD models are very effective in capturing non-Gaussian and nonlinear features, such as bursts of activity and outliers, in a single unified model class. These models assume neither equally spaced observations nor independence, assumptions which are the key weaknesses of some other available time series and marked point process models. Both the Weibull BMTD and the ARIMA models are very effective time series predictive models, but the comparison of their predictive performances has not yet been addressed in the statistics and the machine learning literature.
... The effect of each lag and covariate is considered separately, combined by means of a mixture model. The illustrative example used a hidden Markov model with a time dependence of order one, but the proposed MTD approach for covariates can be applied to more complex Markovian models such as high-order hidden Markov models or double chain Markov models [33,34], where both the visible and the latent process are Markovian. ...
... In this case, including categorical and/or continuous covariates is relatively easy and straightforward. The expectation and the standard deviation of each Gaussian distribution can be rewritten as a function of the past and of the covariates [34]. More complex is the case of a categorical outcome variable X t and a continuous covariate. ...
Article
Full-text available
This paper presents and discusses the use of a Mixture Transition Distribution-like model (MTD) to account for covariates in Markovian models. The MTD was introduced in 1985 by Raftery as an approximation of higher order Markov chains. In the MTD, each lag is estimated separately using an additive model, which introduces a kind of symmetrical relationship between the past and the present. Here, using an MTD-based approach, we consider each covariate separately, and we combine the effects of the lags and of the covariates by means of a mixture model. This approach has three main advantages. First, no modification of the estimation procedure is needed. Second, it is parsimonious in terms of freely estimated parameters. Third, the weight parameters of the mixture can be used as an indication of the relevance of the covariate in explaining the time dependence between states. An illustrative example taken from life course studies using a 3-state hidden Markov model and a covariate with three levels shows how to interpret the results of such models.
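
The additive lag structure mentioned above can be made concrete with the standard MTD formulation, in which the probability of the next state is a weighted sum of one-step transition probabilities, one term per lag: P(X_t = j | X_{t-1} = i_1, ..., X_{t-l} = i_l) = sum over g of lambda_g * q(i_g, j). The sketch below computes this quantity for a two-state chain with two lags. The function name mtd_probability and the numerical values are hypothetical, and the covariate extension discussed in the abstract is not included.

```python
def mtd_probability(past, nxt, lambdas, Q):
    """Probability of moving to state nxt under an MTD model.

    past    : observed states, past[0] is x_{t-1}, past[1] is x_{t-2}, ...
    lambdas : lag weights, non-negative and summing to one
    Q       : single transition matrix shared by all lags, Q[i][j] = q(i, j)
    """
    # Each lag contributes its own one-step transition, weighted by lambda.
    return sum(lam * Q[state][nxt] for lam, state in zip(lambdas, past))


Q = [[0.9, 0.1],
     [0.2, 0.8]]
lambdas = [0.7, 0.3]

# x_{t-1} = 0 and x_{t-2} = 1: 0.7 * 0.9 + 0.3 * 0.2 = 0.69
p_stay = mtd_probability(past=[0, 1], nxt=0, lambdas=lambdas, Q=Q)
```

Because a single matrix Q and one weight per lag are shared across all lag combinations, the number of free parameters grows only linearly with the order, which is the sense in which the approach is parsimonious.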
... To identify unobserved distinctive conditions and to analyze the heterogeneity in functional performance, we used an innovative autoregressive longitudinal mixture model called the Hidden Mixture Transition Distribution (HMTD; Bolano & Berchtold, 2016) model. HMTD is a Markov switching regime model that allows the analysis of individual trajectories in terms of change in an underlying construct to account for nonlinearity in disability trajectories. ...
... This study analyzed changes in ADL using the HMTD model introduced by Bolano and Berchtold (2016). The HMTD model is an autoregressive mixture-based model to analyze longitudinal data switching between alternative unobserved regimes. ...
Article
Objective: This study investigated the variability in activities of daily living (ADL) trajectories among 6,155 nursing home residents using unique and rich observational data. Method: The impairment in ADL performance was considered as a dynamic process in a multi-state framework. Using an innovative mixture model, such states were not defined a priori but inferred from the data. Results: The process of change in functional health differed among residents. We identified four latent regimes: stability or slight deterioration, relevant change, variability, and recovery. Impaired body functions and poor physical performance were the main risk factors associated with degradation in functional health. Discussion: The evolution of disability in later life is not completely gradual or homogeneous. Steep deterioration in functional health can be followed by periods of stability or even recovery. The current condition can be used to successfully predict the evolution of ADL, allowing different care priorities and practices to be set and targeted.
... An MM with a time-varying latent variable is called the hidden or latent Markov model (LMM, see e.g., Zucchini and MacDonald, 2009). With an LMM, we can analyze how the time dependence between observable states is governed by a latent process (e.g., Bolano and Berchtold, 2016). This is particularly useful in life course studies where many non- or hardly observable aspects such as motivations, beliefs, or levels of frailty may influence or explain the observed behavior (Bolano et al., 2019; Piccarreta and Studer, 2019; Han et al., 2020). ...
Article
This article marks the occasion of Social Science Research's 50th anniversary by reflecting on the progress of sequence analysis (SA) since its introduction into the social sciences four decades ago, with focuses on the developments of SA thus far in the social sciences and on its potential future directions. The application of SA in the social sciences, especially in life course research, has mushroomed in the last decade and a half. Using a life course analogy, we examined the birth of SA in the social sciences and its childhood (the first wave), its adolescence and young adulthood (the second wave), and its future mature adulthood in the paper. The paper provides a summary of (1) the important SA research and the historical contexts in which SA was developed by Andrew Abbott, (2) a thorough review of the many methodological developments in visualization, complexity measures, dissimilarity measures, group analysis of dissimilarities, cluster analysis of dissimilarities, multidomain/multichannel SA, dyadic/polyadic SA, Markov chain SA, sequence life course analysis, sequence network analysis, SA in other social science research, and software for SA, and (3) reflections on some future directions of SA including how SA can benefit and inform theory-making in the social sciences, the methods currently being developed, and some remaining challenges facing SA for which we do not yet have any solutions. It is our hope that the reader will take up the challenges and help us improve and grow SA into maturity.
... For non-Gaussian distributions, Ref. [44] considered MTD for high-order Markov chains and non-Gaussian time series. For more recent references about MTD, see, for example, [45][46][47]. ...
Article
The most effective techniques for predicting time series patterns include machine learning and classical time series methods. The aim of this study is to search for the best artificial intelligence and classical forecasting techniques that can predict the spread of acute respiratory infection (ARI) and pneumonia among children under five years old in Somaliland. The techniques used in the study include seasonal autoregressive integrated moving averages (SARIMA), mixture transition distribution (MTD), and long short-term memory (LSTM) deep learning. The data used in the study were monthly observations collected from five regions in Somaliland from 2011–2014. Prediction results from the three best competing models are compared by using root mean square error (RMSE) and absolute mean deviation (MAD) accuracy measures. The results show that the deep learning LSTM and MTD models slightly outperformed the classical SARIMA model in predicting ARI values.
... The hidden model is used to assign each respondent to one specific group as a function of a set of fixed covariates. For more details, the HMTD was completely described in Bolano and Berchtold (2016), and its estimation was discussed in Taushanov and Berchtold (2017). At the visible level, confidence intervals are obtained using a bootstrap procedure. ...
Article
In this study we explored the development of somatic complaints among adolescents and young adults aged 16 to 30 years in Switzerland. Using data from the Transitions from Education to Employment (TREE) study, we applied a hidden Markovian model with covariates to cluster trajectories representing the sum of eight somatic complaints. The resulting groups differed mainly in terms of gender, reading literacy, and substance use. The trajectories of somatic complaints were also related to the number of critical events experienced by the respondents.
... In that case, the probability of observing a particular value of the variable at time t is modeled through a Gaussian distribution whose mean (and optionally variance) is a function of the past of the variable (Berchtold and Raftery, 2002). Non-homogeneous behaviors can also be represented using a model equivalent to the DCMM introduced in the previous section (Bolano and Berchtold, 2016; Taushanov and Berchtold, 2017), and clustering is also possible. ...
Chapter
The behavior of an animal, with or without interaction with its environment, can generally be decomposed into a set of mutually exclusive activity categories such as play, exploration, or rest. Thus, the data consist of one or several series of successive activities which can be used to answer the following questions: Does the probability of observing a particular activity depend on the preceding observed activities? Do some particular patterns of successive activities appear more often than expected by chance only? Which external events influence the behavior of this animal?
... Although we focused here on the case of a single categorical outcome variable (the self-rated health condition), hidden Markov models can be applied to multivariate data (Bartolucci et al., 2012) and to numeric outcome variables. Bolano and Berchtold (2016) for example considered a double chain Markov model for numeric outcomes. ...
Chapter
This is an introduction to discrete-time Hidden Markov models (HMM) for longitudinal data analysis in population and life course studies. In the Markovian perspective, life trajectories are considered as the result of a stochastic process in which the probability of occurrence of a particular state or event depends on the sequence of states observed so far. Markovian models are used to analyze the transition process between successive states. Starting from the traditional formulation of a first-order discrete-time Markov chain where each state is linked to the next one, we present the hidden Markov models where the current response is driven by a latent variable that follows a Markov process. The paper also presents a simple way of handling categorical covariates to capture the effect of external factors on the transition probabilities, and existing software is briefly reviewed. Empirical illustrations using data on self-reported health demonstrate the relevance of the different extensions for life course analysis.
... Although we focused here on the case of a single categorical outcome variable (the self-rated health condition), hidden Markov models can be applied to multivariate data (Bartolucci et al., 2012) and to numeric outcome variables. Bolano and Berchtold (2016) for example considered a double chain Markov model for numeric outcomes. ...
Conference Paper
This is an introduction to discrete-time Hidden Markov models (HMM) for longitudinal data analysis in population and life course studies. In the Markovian perspective, life trajectories are considered as the result of a stochastic process in which the probability of occurrence of a particular state or event depends on the sequence of states observed so far. Markovian models are used to analyze the transition process between successive states. Starting from the traditional formulation of a first-order discrete-time Markov chain where each state is linked to the next one, we present the hidden Markov models where the current response is driven by a latent variable that follows a Markov process. The paper also presents a simple way of handling categorical covariates to capture the effect of external factors on the transition probabilities, and existing software is briefly reviewed. Empirical illustrations using data on self-reported health demonstrate the relevance of the different extensions for life course analysis.
... Lu et al. (2016) develop a Bayesian framework for this using a finite mixture of joint NLME models and apply this to an AIDS clinical trial. An alternative approach to the heterogeneity seen in longitudinal data and time series is the hidden mixture transition distribution model presented in Bolano and Berchtold (2016). Here, a Markov latent transition process is used to switch between components whose parameters can depend on both observed covariates and the history of the process. ...
Article
Mixture transition distribution time series models build high-order dependence through a weighted combination of first-order transition densities for each one of a specified number of lags. We present a framework to construct stationary mixture transition distribution models that extend beyond linear, Gaussian dynamics. We study conditions for first-order strict stationarity which allow for different constructions with either continuous or discrete families for the first-order transition densities given a pre-specified family for the marginal density, and with general forms for the resulting conditional expectations. Inference and prediction are developed under the Bayesian framework with particular emphasis on flexible, structured priors for the mixture weights. Model properties are investigated both analytically and through synthetic data examples. Finally, Poisson and Lomax examples are illustrated through real data applications.
Preprint
Mixture transition distribution time series models build high-order dependence through a weighted combination of first-order transition densities for each one of a specified number of lags. We present a framework to construct stationary transition mixture distribution models that extend beyond linear, Gaussian dynamics. We study conditions for first-order strict stationarity which allow for different constructions with either continuous or discrete families for the first-order transition densities given a pre-specified family for the marginal density, and with general forms for the resulting conditional expectations. Inference and prediction are developed under the Bayesian framework with particular emphasis on flexible, structured priors for the mixture weights. Model properties are investigated both analytically and through synthetic data examples. Finally, Poisson and Lomax examples are illustrated through real data applications.
Article
Longitudinal methods aggregate individual health histories to produce inferences about aging populations, but to what extent do these summaries reflect the experiences of older adults? We describe the assumption of gradual change built into several influential statistical models and draw on widely used, nationally representative survey data to empirically compare the conclusions drawn from mixed-regression methods (growth curve models and latent class growth analysis) designed to capture trajectories with key descriptive statistics and methods (multistate life tables and sequence analysis) that depict discrete states and transitions. We show that individual-level data record stasis irregularly punctuated by relatively sudden change in health status or mortality. Although change is prevalent in the sample, for individuals it occurs rarely, at irregular times and intervals, and in a nonlinear and multidirectional fashion. We conclude by discussing the implications of this punctuated equilibrium pattern for understanding health changes in individuals and the dynamics of inequality in aging populations.
Conference Paper
In this article, a model-based method for clustering life sequences is suggested. In the social sciences, model-free clustering methods are often used in order to find typical life sequences. The suggested method, which is based on hidden Markov models, provides principled probabilistic ranking of candidate clusterings for choosing the best solution. After presenting the principle of the method and algorithm, the method is tested with real life data, where it finds eight descriptive clusters with clear probabilistic structures. Full text is available at http://sp.cs.tut.fi/WITMSE10/Proceedings/WITMSE2010_Papers/WITMSEHelskeEerolaTabus-1.pdf
Article
Mixture model parameters are usually computed with maximum likelihood using an Expectation-Maximization (EM) algorithm. However, it is well known that this method can sometimes converge toward a critical point of the solution space which is not the global maximum. To minimize this problem, different strategies using different combinations of algorithms can be used. In this paper, we compare, by means of numerical simulations, strategies using EM, Classification EM, Stochastic EM, and Genetic algorithms for the optimization of mixture models. Our results indicate that two-stage procedures combining both an exploration phase and an optimization phase provide the best results, especially when these methods are applied on several sets of initial conditions rather than on one single starting point.
Article
We introduce the weighted likelihood bootstrap (WLB) as a way to simulate approximately from a posterior distribution. This method is often easy to implement, requiring only an algorithm for calculating the maximum likelihood estimator, such as iteratively reweighted least squares. In the generic weighting scheme, the WLB is first order correct under quite general conditions. Inaccuracies can be removed by using the WLB as a source of samples in the sampling-importance resampling (SIR) algorithm, which also allows incorporation of particular prior information. The SIR-adjusted WLB can be a competitive alternative to other integration methods in certain models. Asymptotic expansions elucidate the second-order properties of the WLB, which is a generalization of Rubin’s Bayesian bootstrap [D. B. Rubin, Ann. Stat. 9, 130-134 (1981)]. The calculation of approximate Bayes factors for model comparison is also considered. We note that, given a sample simulated from the posterior distribution, the required marginal likelihood may be simulation consistently estimated by the harmonic mean of the associated likelihood values; a modification of this estimator that avoids instability is also noted. These methods provide simple ways of calculating approximate Bayes factors and posterior model probabilities for a very wide class of models.
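The harmonic mean estimator mentioned in this abstract can be illustrated with a minimal sketch. It assumes that log-likelihood values evaluated at posterior draws are already available; the function name `log_harmonic_mean_likelihood` and the toy input are hypothetical and are not taken from the paper. The stabilized modification that the abstract refers to is not shown.

```python
import numpy as np

def log_harmonic_mean_likelihood(log_likelihoods):
    """Log of the harmonic mean of likelihood values, computed on the log scale.

    The harmonic mean of L_1..L_N is N / sum(1 / L_i), so its logarithm is
    log(N) - log(sum(exp(-log L_i))). The sum is evaluated with a max shift
    to avoid overflow.
    """
    ll = np.asarray(log_likelihoods, dtype=float)
    neg = -ll
    shift = neg.max()
    log_sum_inverse = shift + np.log(np.exp(neg - shift).sum())
    return np.log(ll.size) - log_sum_inverse

# Toy input (assumed values, for illustration only): likelihoods 1, 2 and 4.
example = log_harmonic_mean_likelihood(np.log([1.0, 2.0, 4.0]))
print(np.exp(example))  # harmonic mean of 1, 2, 4 is 3 / 1.75 = 12/7
```

Working on the log scale matters in practice because likelihood values of realistic models underflow when exponentiated directly.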
Article
Stochastic modeling of investment returns is an important topic for actuaries who deal with variable annuity and segregated fund investment guarantees. The traditional lognormal stock return model is simple, but it is generally less appealing for longer-term problems. In recent years, the use of regime-switching lognormal (RSLN) processes for modeling maturity guarantees has been gaining popularity. In this paper we introduce the class of mixture Gaussian time series processes for modeling long-term stock market returns. It offers an alternative class of models to actuaries who may be experimenting with the RSLN process. We use monthly data from the Toronto Stock Exchange 300 and the Standard and Poor's 500 indices to illustrate the mixture time series modeling procedures, and we compare the fits of the mixture models to the lognormal and RSLN models. Finally, we give a numerical example comparing risk measures for a simple segregated fund contract under different stochastic return models.
Article
In discrete time the increment of the logarithm of the price of a risky asset is supposed to involve two parameters which may be thought of as the ‘drift’ and ‘volatility’. It is assumed these parameters take finitely many values, and that they change value like a Markov chain on this state space. Filtering and parameter estimation techniques from Hidden Markov Models are then applied to obtain recursive estimates of the ‘drift’ and ‘volatility’. Further, all parameters in the model can be estimated. The method is illustrated by applying the results to two series of prices.
Article
This article describes the many capabilities offered by the TraMineR toolbox for categorical sequence data. It focuses more specifically on the analysis and rendering of state sequences. Addressed features include the description of sets of sequences by means of transversal aggregated views, the computation of longitudinal characteristics of individual sequences and the measure of pairwise dissimilarities. Special emphasis is put on the multiple ways of visualizing sequences. The core element of the package is the state sequence object in which we store the set of sequences together with attributes such as the alphabet, state labels and the color palette. The functions can then easily retrieve this information to ensure presentation homogeneity across all printed and graphical displays. The article also demonstrates how TraMineR’s outcomes give access to advanced analyses such as clustering and statistical modeling of sequence data.
Article
Standard real business cycle models must rely on total factor productivity (TFP) shocks to explain the observed comovement of consumption, investment, and hours worked. This paper shows that a neoclassical model consistent with observed heterogeneity in labor supply and consumption can generate comovement in the absence of TFP shocks. Intertemporal substitution of goods and leisure induces comovement over the business cycle through heterogeneity in the consumption behavior of employed and unemployed workers. This result owes to two model features introduced to capture important characteristics of U.S. labor market data. First, individual consumption is affected by the number of hours worked: Employed agents consume more on average than the unemployed do. Second, changes in the employment rate, a central factor explaining variation in total hours, affect aggregate consumption. Demand shocks--such as shifts in the marginal efficiency of investment, as well as government spending shocks and news shocks--are shown to generate economic fluctuations consistent with observed business cycles.
Article
Despite mounting evidence to the contrary, credit migration matrices, used in many credit risk and pricing applications, are typically assumed to be generated by a simple Markov process. Based on empirical evidence, we propose a parsimonious model that is a mixture of (two) Markov chains, where the mixing is on the speed of movement among credit ratings. We estimate this model using credit rating histories and show that the mixture model statistically dominates the simple Markov model and that the differences between two models can be economically meaningful. The non-Markov property of our model implies that the future distribution of a firm's ratings depends not only on its current rating but also on its past rating history. Indeed we find that two firms with identical current credit ratings can have substantially different transition probability vectors. We also find that conditioning on the state of the business cycle or industry group does not remove the heterogeneity with respect to the rate of movement. We go on to compare the performance of mixture and Markov chain using out-of-sample predictions.
Article
In a 1935 paper and in his book Theory of Probability, Jeffreys developed a methodology for quantifying the evidence in favor of a scientific theory. The centerpiece was a number, now called the Bayes factor, which is the posterior odds of the null hypothesis when the prior probability on the null is one-half. Although there has been much discussion of Bayesian hypothesis testing in the context of criticism of P-values, less attention has been given to the Bayes factor as a practical tool of applied statistics. In this article we review and discuss the uses of Bayes factors in the context of five scientific applications in genetics, sports, ecology, sociology, and psychology. We emphasize the following points:
Article
We propose a mixture autoregressive conditional heteroscedastic (MAR-ARCH) model for modeling nonlinear time series. The models consist of a mixture of K autoregressive components with autoregressive conditional heteroscedasticity; that is, the conditional mean of the process variable follows a mixture AR (MAR) process, whereas the conditional variance of the process variable follows a mixture ARCH process. In addition to the advantage of better description of the conditional distributions from the MAR model, the MAR-ARCH model allows a more flexible squared autocorrelation structure. The stationarity conditions, autocorrelation function, and squared autocorrelation function are derived. Construction of multiple step predictive distributions is discussed. The estimation can be easily done through a simple EM algorithm, and the model selection problem is addressed. The shape-changing feature of the conditional distributions makes these models capable of modeling time series with multimodal conditional distributions and with heteroscedasticity. The models are applied to two real datasets and compared to other competing models. The MAR-ARCH models appear to capture features of the data better than the competing models.
Article
It is argued that P-values and the tests based upon them give unsatisfactory results, especially in large samples. It is shown that, in regression, when there are many candidate independent variables, standard variable selection procedures can give very misleading results. Also, by selecting a single model, they ignore model uncertainty and so underestimate the uncertainty about quantities of interest. The Bayesian approach to hypothesis testing, model selection, and accounting for model uncertainty is presented. Implementing this is straightforward through the use of the simple and accurate BIC approximation, and it can be done using the output from standard software. Specific results are presented for most of the types of model commonly used in sociology. It is shown that this approach overcomes the difficulties with P-values and standard model selection procedures based on them. It also allows easy comparison of nonnested models, and permits the quantification of the evidence for a null hypothesis of interest, such as a convergence theory or a hypothesis about societal norms.
Article
The class of mixture transition distribution (MTD) time series models is extended to general non-Gaussian time series. In these models the conditional distribution of the current observation given the past is a mixture of conditional distributions given each one of the last p observations. They can capture non-Gaussian and nonlinear features such as flat stretches, bursts of activity, outliers, and changepoints in a single unified model class. They can also represent time series defined on arbitrary state spaces, univariate or multivariate, continuous, discrete or mixed, which need not even be Euclidean. They perform well in the usual case of Gaussian time series without obvious nonstandard behaviors. The models are simple, analytically tractable, easy to simulate, and readily estimated. The stationarity and autocorrelation properties of the models are derived. A simple EM algorithm is given and shown to work well for estimation. The models are applied to several real and simulated datasets with satisfactory results. They appear to capture the features of the data better than the best competing autoregressive integrated moving average (ARIMA) models.
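The mixture structure described in this abstract can be written compactly. The notation below (weights \alpha_i, component densities f_i) is assumed for illustration and may differ from the paper's own symbols; the Gaussian component on the last line is one possible choice, shown only as an example.

```latex
% Conditional density of y_t given the last p observations:
% a mixture of p densities, each conditioning on a single lag.
f(y_t \mid y_{t-1}, \dots, y_{t-p})
  = \sum_{i=1}^{p} \alpha_i \, f_i(y_t \mid y_{t-i}),
\qquad \alpha_i \ge 0, \quad \sum_{i=1}^{p} \alpha_i = 1 .

% Illustrative Gaussian component (an assumption, not the general case):
f_i(y_t \mid y_{t-i}) = \mathcal{N}\!\left(y_t ;\ \phi_i \, y_{t-i},\ \sigma_i^2\right).
```

Each lag contributes through its own component, so adding a lag adds one weight and the parameters of one component rather than a full higher-order interaction.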
Article
A class of bivariate continuous-discrete distributions is proposed to fit Poisson dynamic models in a single unified framework via bivariate mixture transition distributions (BMTDs). Potential advantages of this class over the current models include its ability to capture stretches, bursts and nonlinear patterns characterized by Internet network traffic, high-frequency financial data and many others. It models the inter-arrival times and the number of arrivals (marks) in a single unified model which benefits from the dependence structure of the data. The continuous marginal distributions of this class include as special cases the exponential, gamma, Weibull and Rayleigh distributions (for the inter-arrival times), whereas the discrete marginal distributions are geometric and negative binomial. The conditional distributions are Poisson and Erlang. Maximum-likelihood estimation is discussed and parameter estimates are obtained using an expectation–maximization algorithm, while the standard errors are estimated using the missing information principle. It is shown via real data examples that the proposed BMTD models appear to capture data features better than other competing models.
Article
In this paper, a family of Finite Mixed Generalized Linear Models is considered. A straightforward general EM-algorithm for estimating any model from this family by standard GLM-software is given. After discussing the particular problems of statistical inference arising when FMGLMs are used, three estimators of standard errors of the parameter estimates are compared by means of example data and some simulations.
Article
A model for Markov chains of order higher than one is introduced which involves only one additional parameter for each extra lag. Asymptotic properties and the autocorrelation structure are investigated. Three examples are given in which the model appears to model data more successfully than both the usual high-order Markov chain and the alternative models of P. A. Jacobs and P. A. W. Lewis [ibid. 40, 94-105 and 222-228 (1978; Zbl 0374.62087 and Zbl 0388.62086, resp.) and ”Discrete time series generated by mixtures. III: Autoregressive processes (DAR(p))”. Naval Tech. Rep. NPS 55-78-022 (1978)], G. G. S. Pegram [J. Appl. Probab. 17, 350-362 (1980; Zbl 0428.60082)] and J. A. Logan [J. Math. Sociol. 8, 75-89 (1981; Zbl 0472.92018)].
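A minimal sketch of the model described in this abstract, assuming its usual formulation: one shared m x m transition matrix Q and one weight per lag, with P(X_t = j | X_{t-1} = i_1, ..., X_{t-l} = i_l) = sum over g of lambda_g * Q[i_g, j]. The function names and the numerical values are hypothetical.

```python
import numpy as np

def mtd_next_state_probs(Q, lambdas, past):
    """Distribution of X_t under a mixture transition distribution model.

    Q       : (m, m) row-stochastic transition matrix shared by all lags.
    lambdas : lag weights, non-negative and summing to one.
    past    : past[0] is X_{t-1}, past[1] is X_{t-2}, and so on.
    """
    Q = np.asarray(Q, dtype=float)
    lambdas = np.asarray(lambdas, dtype=float)
    if len(past) != len(lambdas):
        raise ValueError("need one past state per lag weight")
    # Weighted sum of the rows of Q selected by the lagged states.
    return sum(lam * Q[state] for lam, state in zip(lambdas, past))

def mtd_free_parameters(m, order):
    """m(m-1) free entries in Q plus (order - 1) free lag weights."""
    return m * (m - 1) + (order - 1)

def full_chain_free_parameters(m, order):
    """Free parameters of an unrestricted Markov chain of the same order."""
    return m ** order * (m - 1)

# Assumed toy values: three states, two lags.
Q = np.array([[0.7, 0.2, 0.1],
              [0.3, 0.4, 0.3],
              [0.1, 0.2, 0.7]])
probs = mtd_next_state_probs(Q, [0.6, 0.4], past=[0, 2])
print(probs)
print(mtd_free_parameters(3, 2), full_chain_free_parameters(3, 2))
```

The parameter counts show the parsimony the abstract refers to: each extra lag costs one weight, whereas the unrestricted chain multiplies its parameter count by m.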
Article
A procedure is derived for extracting the observed information matrix when the EM algorithm is used to find maximum likelihood estimates in incomplete data problems. The technique requires computation of a complete‐data gradient vector or second derivative matrix, but not those associated with the incomplete data likelihood. In addition, a method useful in speeding up the convergence of the EM algorithm is developed. Two examples are presented.
Article
The Double Chain Markov Model is a fully Markovian model for the representation of time-series in random environments. In this article, we show that it can handle transitions of high-order between both a set of observations and a set of hidden states. In order to reduce the number of parameters, each transition matrix can be replaced by a Mixture Transition Distribution model. We provide a complete derivation of the algorithms needed to compute the model. Three applications, the analysis of a sequence of DNA, the song of the wood pewee, and the behavior of young monkeys show that this model is of great interest for the representation of data that can be decomposed into a finite set of patterns.
Article
Variance estimation techniques for nonlinear statistics, such as ratios and regression and correlation coefficients, and functionals, such as quantiles, are reviewed in the context of sampling from stratified populations. In particular, resampling methods such as the bootstrap, the jackknife, and balanced repeated replication are compared with the traditional linearization method for nonlinear statistics and a method based on Woodruff's confidence intervals for the quantiles. Results of empirical studies are presented on the bias and stability of these variance estimators and on confidence-interval coverage probabilities and lengths.
Article
A model of intraday financial time series is developed. The model is a dynamic factor model consisting of two equations. First, a rate of return of a 'stock' in a single day is assumed to be generated by several common factors plus some additive errors ('intraday equation'). Secondly, the joint distribution of those common factors is assumed to depend on the hidden state of the day, which fluctuates according to a Markov chain ('day-by-day equation'). Together the equations compose a hidden Markov model. We investigate properties of the model. Among them is a central limit theorem for cumulative returns, which agrees with the well-known empirical phenomenon in the stock markets that the distributions of longer-horizon returns are closer to the normal. We propose a two-step procedure consisting of the method of principal components and the EM algorithm to estimate the model parameters as well as the unobservable states. In addition, we propose a procedure for predicting intraday returns. Finally, the model is fitted to empirical data, the Standard & Poor's 500 Index 5 min return data, to see if the model is capable of describing intraday movements of the index.
Article
This paper proposes an MTD (mixture transition distribution) model based on the Weibull distribution (the WMTD model) and addresses its parameter estimation. An EM algorithm for estimation is given and shown to work well in simulations, and a bootstrap method is used to obtain confidence regions for the parameters. Finally, the results of a real example, predicting stock prices, show that the proposed WMTD model captures the features of data from a thick-tailed distribution better than the GMTD (Gaussian mixture transition distribution) model.
Conference Paper
Hidden Markov models are generalized by defining a new emission probability which takes the correlation between successive feature vectors into account. Estimation formulas for iterative learning under both the Viterbi and maximum likelihood criteria are presented.
Article
A broadly applicable algorithm for computing maximum likelihood estimates from incomplete data is presented at various levels of generality. Theory showing the monotone behaviour of the likelihood and convergence of the algorithm is derived. Many examples are sketched, including missing value situations, applications to grouped, censored or truncated data, finite mixture models, variance component estimation, hyperparameter estimation, iteratively reweighted least squares and factor analysis.
Chapter
Statistics is a subject of many uses and surprisingly few effective practitioners. The traditional road to statistical knowledge is blocked, for most, by a formidable wall of mathematics. The approach in An Introduction to the Bootstrap avoids that wall. It arms scientists and engineers, as well as statisticians, with the computational techniques they need to analyze and understand complicated data sets.
Article
The analysis of routinely collected surveillance data is an important challenge in public health practice. We present a method based on a hidden Markov model for monitoring such time series. The model characterizes the sequence of measurements by assuming that its probability density function depends on the state of an underlying Markov chain. The parameter vector includes distribution parameters and transition probabilities between the states. Maximum likelihood estimates are obtained with a modified EM algorithm. Extensions are provided to take into account trend and seasonality in the data. The method is demonstrated on two examples: the first seeks to characterize influenza-like illness incidence rates with a mixture of Gaussian distributions, and the second, poliomyelitis counts with a mixture of Poisson distributions. The results justify a wider use of this method for analysing surveillance data.
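The likelihood that such an EM procedure maximizes can be evaluated with the standard scaled forward recursion. The following is a minimal sketch for Gaussian emissions, not the cited paper's modified EM algorithm; the function names and the parameter layout (`init`, `trans`, `means`, `sds`) are assumptions made for illustration.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and sd."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def hmm_log_likelihood(obs, init, trans, means, sds):
    """Log-likelihood of obs under a Gaussian-emission HMM.

    init[j]     : probability that the chain starts in state j
    trans[i][j] : probability of moving from state i to state j
    means, sds  : emission parameters of each state
    """
    k = len(init)
    # Joint weight of the first observation and each starting state.
    alpha = [init[j] * normal_pdf(obs[0], means[j], sds[j]) for j in range(k)]
    loglik = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate one step through the chain, then weight by the
            # emission density of the new observation.
            alpha = [
                sum(alpha[i] * trans[i][j] for i in range(k))
                * normal_pdf(obs[t], means[j], sds[j])
                for j in range(k)
            ]
        # Rescale at every step so long series do not underflow; the
        # logged scale factors add up to the log-likelihood.
        scale = sum(alpha)
        loglik += math.log(scale)
        alpha = [a / scale for a in alpha]
    return loglik
```

With a single state the recursion reduces to a sum of log densities, which gives a quick sanity check. An observation with zero density under every state would make the scale factor zero, so a production version would need to guard that case.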
Article
In a clinical trial of a treatment for alcoholism, a common response variable of interest is the number of alcoholic drinks consumed by each subject each day, or an ordinal version of this response, with levels corresponding to abstinence, light drinking and heavy drinking. In these trials, within-subject drinking patterns are often characterized by alternating periods of heavy drinking and abstinence. For this reason, many statistical models for time series that assume steady behavior over time and white noise errors do not fit alcohol data well. In this paper we propose to describe subjects' drinking behavior using Markov models and hidden Markov models (HMMs), which are better suited to describe processes that make sudden, rather than gradual, changes over time. We incorporate random effects into these models using a hierarchical Bayes structure to account for correlated responses within subjects over time, and we estimate the effects of covariates, including a randomized treatment, on the outcome in a novel way. We illustrate the models by fitting them to a large data set from a clinical trial of the drug Naltrexone. The HMM, in particular, fits these data well and also contains unique features that allow for useful clinical interpretations of alcohol consumption behavior. Comment: Published at http://dx.doi.org/10.1214/09-AOAS282 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org)
Article
We discuss an interpretation of the mixture transition distribution (MTD) for discrete-valued time series which is based on a sequence of independent latent variables which are occasion-specific. We show that, by assuming that this latent process follows a first-order Markov chain, MTD can be generalized in a sensible way. A class of models results which also includes the hidden Markov model (HMM). For these models we outline an EM algorithm for maximum likelihood estimation which exploits recursions developed within the HMM literature. As an illustration, we provide an example based on the analysis of stock market data from different American countries. Copyright 2010 Blackwell Publishing Ltd
Article
A maximum-penalized-likelihood method is proposed for estimating a mixing distribution and it is shown that this method produces a consistent estimator, in the sense of weak convergence. In particular, a new proof of the consistency of maximum-likelihood estimators is given. The estimated number of components is shown to be at least as large as the true number, for large samples. Also, the large-sample limits of estimators which are constrained to have a fixed finite number of components are identified as distributions minimizing Kullback-Leibler divergence from the true mixing distribution. Estimation of a Poisson mixture distribution is illustrated using the distribution of traffic accidents presented by Simar.
Article
We discuss the following problem: given a random sample $X = (X_1, X_2, \ldots, X_n)$ from an unknown probability distribution $F$, estimate the sampling distribution of some prespecified random variable $R(X, F)$, on the basis of the observed data $x$. (Standard jackknife theory gives an approximate mean and variance in the case $R(X, F) = \theta(\hat{F}) - \theta(F)$, $\theta$ some parameter of interest.) A general method, called the “bootstrap”, is introduced, and shown to work satisfactorily on a variety of estimation problems. The jackknife is shown to be a linear approximation method for the bootstrap. The exposition proceeds by a series of examples: variance of the sample median, error rates in a linear discriminant analysis, ratio estimation, estimating regression parameters, etc.
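The first example in that list, the variability of the sample median, can be sketched with a plain nonparametric bootstrap. This is an illustrative sketch only: the helper name `bootstrap_se`, the resample count, the seed, and the data values are all assumptions, not taken from the cited paper.

```python
import random
import statistics


def bootstrap_se(sample, statistic, n_resamples=2000, seed=0):
    """Bootstrap estimate of the standard error of statistic(sample)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(sample)
    replicates = []
    for _ in range(n_resamples):
        # Draw n observations with replacement from the observed data,
        # i.e. sample from the empirical distribution.
        resample = [sample[rng.randrange(n)] for _ in range(n)]
        replicates.append(statistic(resample))
    # The spread of the replicates approximates the sampling variability
    # of the statistic.
    return statistics.stdev(replicates)


data = [2.1, 3.4, 1.9, 5.6, 4.2, 3.3, 2.8, 4.9, 3.7, 2.5]  # made-up values
se_median = bootstrap_se(data, statistics.median)
```

Passing the statistic as a function keeps the resampling loop unchanged when the median is swapped for another estimator, such as a ratio or a regression coefficient.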
Article
The problem of selecting one of a number of models of different dimensions is treated by finding its Bayes solution, and evaluating the leading terms of its asymptotic expansion. These terms are a valid large-sample criterion beyond the Bayesian context, since they do not depend on the a priori distribution.
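The resulting criterion is usually written in the following form, given here as a sketch in commonly used notation. The symbols are assumptions not defined in the abstract: $\hat{L}_j$ is the maximized likelihood of model $j$, $k_j$ its number of free parameters, and $n$ the sample size.

```latex
% Choose the model j with the largest penalized log-likelihood:
\log \hat{L}_j - \frac{k_j}{2} \log n
% Equivalent form, where the smallest value is preferred:
\mathrm{BIC}_j = -2 \log \hat{L}_j + k_j \log n
```

The penalty grows with $\log n$, so for $n \geq 8$ each extra parameter costs more than under a penalty of one log-likelihood unit per parameter.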