
# It's About Time: Using Discrete-Time Survival Analysis to Study Duration and the Timing of Events

## Abstract

Educational researchers frequently ask whether and, if so, when events occur. Until relatively recently, however, sound statistical methods for answering such questions have not been readily available. In this article, by empirical example and mathematical argument, we demonstrate how the methods of discrete-time survival analysis provide educational statisticians with an ideal framework for studying event occurrence. Using longitudinal data on the career paths of 3,941 special educators as a springboard, we derive maximum likelihood estimators for the parameters of a discrete-time hazard model, and we show how the model can be fit using standard logistic regression software. We then distinguish among the several types of main effects and interactions that can be included as predictors in the model, offering data analytic advice for the practitioner. To aid educational statisticians interested in conducting discrete-time survival analysis, we provide illustrative computer code (SAS, 1989) for fitting discrete-time hazard models and for recapturing fitted hazard and survival functions.
Copyright 1993 by the American Educational Research Association and the American Statistical Association; reproduced with permission of the publisher.
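As a rough illustration of what "recapturing fitted hazard and survival functions" involves, the sketch below turns the coefficients of a discrete-time logit hazard model into fitted hazards and then into a survival function. It is a minimal Python sketch, not the article's SAS code. The period intercepts `alphas`, the slope `beta`, and the binary predictor `x` are hypothetical values chosen for illustration, not estimates from the special educator data.

```python
import math

def logistic(z):
    """Inverse logit: maps a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def fitted_hazard(alphas, beta, x):
    """Hazard in each period j under logit h_j = alpha_j + beta * x."""
    return [logistic(a + beta * x) for a in alphas]

def survival_from_hazard(hazards):
    """Survival through period j is the running product of (1 - h_k), k <= j."""
    surv, s = [], 1.0
    for h in hazards:
        s *= 1.0 - h
        surv.append(s)
    return surv

# Hypothetical coefficients (assumed for illustration only).
alphas = [-1.5, -1.8, -2.2, -2.5]  # one intercept per time period
beta = 0.4                         # effect of a binary predictor on the logit hazard

for x in (0, 1):
    h = fitted_hazard(alphas, beta, x)
    s = survival_from_hazard(h)
    print("x =", x, "hazard:", [round(v, 3) for v in h],
          "survival:", [round(v, 3) for v in s])
```

A positive `beta` raises the hazard in every period, so the survival curve for `x = 1` lies below the curve for `x = 0` throughout.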
... A discrete time logit model is an extension of the logit model that addresses the change across time in the underlying probability of an outcome being true (Allison, 1982). It does this by treating time as discrete units or intervals rather than a continuum, with each interval reflecting a different probability of event occurrence derived from the individuals present during that interval (Allison, 1982; Singer & Willett, 1993). Event histories such as the NLSY97 are ideal data sources for discrete time analysis (Singer & Willett, 1993) because the regular recurrence of follow-up interviews not only defines the length of the interval but also allows measurements to be repeated on each person. ...
... Each set of repeated measurements enables the hazard for event occurrence during that interval to be updated. When the event of interest is nonrepeatable (such as finishing a bachelor's degree), individuals experiencing an event become right censored, which means the hazard for each subsequent interval is further affected by the decreasing number of surviving individuals (Allison, 1982; Singer & Willett, 1993). This positions the discrete time model alongside survival analysis methods like Cox regression, but with specific advantages for handling time-varying regressors and right-censoring (Bahr, 2009; Roksa & Velez, 2012; Wao, 2010) and the discrete, repeated nature of longitudinal data (Richardson, 2010). ...
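The interval-by-interval logic described in these excerpts is usually implemented by expanding each person's history into one row per interval in which that person is still at risk. The sketch below is a minimal Python illustration under assumed conventions: each input record is a hypothetical `(person_id, last_period, event)` triple, and a person contributes no rows after the interval in which the event occurs or observation ends. The per-interval hazard is then the share of at-risk rows that end in an event.

```python
def to_person_period(records):
    """Expand (person_id, last_period, event) into one row per period at risk.

    event = 1 means the event occurred in last_period;
    event = 0 means the person was censored at the end of last_period.
    """
    rows = []
    for pid, last, event in records:
        for period in range(1, last + 1):
            y = 1 if (event == 1 and period == last) else 0
            rows.append((pid, period, y))
    return rows

def life_table_hazard(rows):
    """Hazard per period = events in the period / people at risk in the period."""
    at_risk, events = {}, {}
    for _, period, y in rows:
        at_risk[period] = at_risk.get(period, 0) + 1
        events[period] = events.get(period, 0) + y
    return {p: events[p] / at_risk[p] for p in sorted(at_risk)}

# Hypothetical toy histories (assumed data, not from any cited study).
records = [("a", 1, 1), ("b", 2, 1), ("c", 3, 0), ("d", 3, 1), ("e", 2, 0)]
rows = to_person_period(records)
hazard = life_table_hazard(rows)
print(len(rows))  # 11 person-period rows
print(hazard)     # {1: 0.2, 2: 0.25, 3: 0.5}
```

The risk set shrinks from five people to four to two across the three periods, which is why the same single event per period produces a rising hazard.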
Article
Although the positive relationship between social determinants and college attainment is well established, less is known about how social class specifically relates to the linear and timely completion of postsecondary degrees. In this paper, we empirically examine on-time completion of bachelor’s degrees using social class proxies for a national sample of U.S. high school graduates, using the life course perspective and social selection hypothesis to contextualize social effects on the two key transitions—timely full-time enrollment and timely degree completion—that bound the traditional 4-year college pathway. We find strongly positive associations between several social indicators and attainment of both transition events, although effects are larger and more numerous for the initial transition, indicating social selection may be more influential in launching the 4-year college pathway than in completing it. Gradients of social advantage also appear more complexly gendered and racialized at the start of the college pathway than at the end. Finally, we confirm that parenthood is highly incompatible with a 4-year path to a degree regardless of social class and conspicuously more likely to interfere with the timely completion of a bachelor’s degree than other major life transitions.
... Duration or survival models have been applied to the student dropout problem to analyse survival time. In this study, survival time refers to the period between the moment a student enters the Higher Education Institution (HEI), and the moment he/she drops out (Singer & Willett, 1993). ...
Article
A combination of mathematical and statistical modelling techniques may be used to analyse student dropout behaviour. The aim of this study is to combine Survival Analysis and Analytic Hierarchy Process methodologies when identifying students at-risk of dropping out. This combination favours the institutional understanding of dropout as a dynamic phenomenon, susceptible to preventative measures and increased efficiency, leading to the curbing of dropout rates. These techniques quantify and qualify the student risk of dropout from an academic program and estimate the probability of persistence, considering the variables framed in academic, institutional, socioeconomic, and individual factors. These factors are provided by the habitus immersed in the particular institutional educational project. The proposal was tested with real data, evaluating the operability and viability of both Survival Analysis and Analytic Hierarchy Process methods. Type of admission, gender, and age were found to be the most influential variables of survival. This novel combination of methods could offer possibilities for decision-making of use in the strengthening of institutional information culture and student support programs, and in efficient allocation of resources.
... The models were fitted separately for each age-group to account for changes in baseline hazards by age and to more accurately characterize discrepancies in mortality caused by health status in each age group. The discrete time complementary log-log model, [34][35][36] a discrete analog of the Cox proportional hazards model, was employed for the proportional hazards models. All statistical analyses were conducted using SAS 9.3 (SAS Institute, Cary NC, USA) and SAS-callable SUDAAN [37] to account for the complex NHIS survey design and weights using SAS PROC SURVEYLOGISTIC. ...
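To make concrete why the complementary log-log link is described as a discrete analog of the proportional hazards model, the sketch below compares it with the logit link. It is a minimal Python illustration. The baseline linear predictor `eta0` and the coefficient `beta` are hypothetical values, not estimates from the cited study. Under the complementary log-log link, adding `beta` to the linear predictor raises the within-interval survival probability to the power `exp(beta)`, which is the relationship proportional hazards implies for grouped continuous time.

```python
import math

def inv_logit(eta):
    """Hazard under the logit link."""
    return 1.0 / (1.0 + math.exp(-eta))

def inv_cloglog(eta):
    """Hazard under the complementary log-log link."""
    return 1.0 - math.exp(-math.exp(eta))

def cloglog(h):
    """Complementary log-log transform of a hazard probability."""
    return math.log(-math.log(1.0 - h))

# Hypothetical values chosen for illustration only.
eta0 = -2.0   # baseline linear predictor for one interval
beta = 0.7    # coefficient for a binary covariate

h_base = inv_cloglog(eta0)
h_exposed = inv_cloglog(eta0 + beta)

# Within-interval survival for the exposed group equals the baseline
# survival raised to the power exp(beta).
lhs = 1.0 - h_exposed
rhs = (1.0 - h_base) ** math.exp(beta)
print(abs(lhs - rhs) < 1e-12)

# For comparison, the hazard the logit link gives at the same linear predictor.
print(inv_logit(eta0), h_base)
```

With the logit link, `exp(beta)` is instead an odds ratio for event occurrence within the interval, so the two links answer slightly different questions even though they give similar fits when hazards are small.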
Article
Background: Life expectancy is increasingly incorporated in evidence-based screening and treatment guidelines to facilitate patient-centered clinical decision-making. However, life expectancy estimates from standard life tables do not account for health status, an important prognostic factor for premature death. This study aims to address this research gap and develop life tables incorporating the health status of adults in the United States. Methods: Data from the National Health Interview Survey (1986-2004) linked to mortality follow-up through to 2006 (age ≥ 40, n = 729,531) were used to develop life tables. The impact of self-rated health (excellent, very good, good, fair, poor) on survival was quantified in 5-year age groups, incorporating complex survey design and weights. Life expectancies were estimated by extrapolating the modeled survival probabilities. Results: Life expectancies incorporating health status differed substantially from standard US life tables and by health status. Poor self-rated health more significantly affected the survival of younger compared to older individuals, resulting in substantial decreases in life expectancy. At age 40 years, hazards of dying for white men who reported poor vs. excellent health was 8.5 (95% CI: 7.0, 10.3) times greater, resulting in a 23-year difference in life expectancy (poor vs. excellent: 22 vs. 45), while at age 80 years, the hazards ratio was 2.4 (95% CI: 2.1, 2.8) and life expectancy difference was 5 years (5 vs. 10). Relative to the US general population, life expectancies of adults (age < 65) with poor health were approximately 5-15 years shorter. Conclusions: Considerable shortage in life expectancy due to poor self-rated health existed. The life table developed can be helpful by including a patient perspective on their health and be used in conjunction with other predictive models in clinical decision making, particularly for younger adults in poor health, for whom life tables including comorbid conditions are limited.
... Discrete-time logit regression can be applied when time is measured at a discrete (not continuous) time scale; thus, it accommodates multiple persons having the same apparent time the event occurs. 10,17,18 All participants with baseline PQRS data were included. For analyses of the time course of the secondary cognitive outcome measures, we used linear mixed models (LMMs) with a spatial power covariance structure of repeated observations within participant, and person-specific random intercepts. ...
Article
Background: Postoperative delirium and postoperative cognitive dysfunction are the most common complications for older surgical patients. General anesthesia may contribute to the development of these conditions, but there are little data on the association of age with cognitive recovery from anesthesia in the absence of surgery or underlying medical condition. Methods: We performed a single-center cohort study of healthy adult volunteers 40 to 80 years old (N = 71, mean age 58.5 years, and 44% women) with no underlying cognitive dysfunction. Volunteers underwent cognitive testing before and at multiple time points after 2 hours of general anesthesia consisting of propofol induction and sevoflurane maintenance, akin to a general anesthetic for a surgical procedure, although no procedure was performed. The primary outcome was time to recovery to cognitive baseline on the Postoperative Quality of Recovery Scale (PQRS) within 30 days of anesthesia. Secondary cognitive outcomes were time to recovery on in-depth neuropsychological batteries, including the National Institutes of Health Toolbox and well-validated paper-and-pencil tests. The primary hypothesis is that time to recovery of cognitive function after general anesthesia increases across decades from 40 to 80 years of age. We examined this with discrete-time logit regression (for the primary outcome) and linear mixed models for interactions of age decade with time postanesthesia (for secondary outcomes). Results: There was no association between age group and recovery to baseline on the PQRS; 36 of 69 (52%) recovered within 60 minutes postanesthesia and 63 of 69 (91%) by day 1. Hazard ratios (95% confidence interval) for each decade compared to 40- to 49-year olds were: 50 to 59 years, 1.41 (0.50-4.03); 60 to 69 years, 1.03 (0.35-3.00); and 70 to 80 years, 0.69 (0.25-1.88). There were no significant differences between older decades relative to the 40- to 49-year reference decade in recovery to baseline on secondary cognitive measures. Conclusions: Recovery of cognitive function to baseline was rapid and did not differ between age decades of participants, although the number in each decade was small. These results suggest that anesthesia alone may not be associated with cognitive recovery in healthy adults of any age decade.
Article
Objectives: Unrecognized clinical deterioration during illness requiring hospitalization is associated with high risk of mortality and long-term morbidity among children. Our objective was to develop and externally validate machine learning algorithms using electronic health records for identifying ICU transfer within 12 hours indicative of a child's condition. Design: Observational cohort study. Setting: Two urban, tertiary-care, academic hospitals (sites 1 and 2). Patients: Pediatric inpatients (age <18 yr). Interventions: None. Measurement and main results: Our primary outcome was direct ward to ICU transfer. Using age, vital signs, and laboratory results, we derived logistic regression with regularization, restricted cubic spline regression, random forest, and gradient boosted machine learning models. Among 50,830 admissions at site 1 and 88,970 admissions at site 2, 1,993 (3.92%) and 2,317 (2.60%) experienced the primary outcome, respectively. Site 1 data were split longitudinally into derivation (2009-2017) and validation (2018-2019), whereas site 2 constituted the external test cohort. Across both sites, the gradient boosted machine was the most accurate model and outperformed a modified version of the Bedside Pediatric Early Warning Score that only used physiologic variables in terms of discrimination (C-statistic site 1: 0.84 vs 0.71, p < 0.001; site 2: 0.80 vs 0.74, p < 0.001), sensitivity, specificity, and number needed to alert. Conclusions: We developed and externally validated a novel machine learning model that identifies ICU transfers in hospitalized children more accurately than current tools. Our model enables early detection of children at risk for deterioration, thereby creating opportunities for intervention and improvement in outcomes.
Article
Identifying factors that influence how ectothermic animals respond physiologically to changing temperatures is of high importance given current threats of global climate change. Host-associated microbial communities impact animal physiology and have been shown to influence host thermal tolerance in invertebrate systems. However, the role of commensal microbiota in the thermal tolerance of ectothermic vertebrates is unknown. Here we show that experimentally manipulating the tadpole microbiome through environmental water sterilization reduces the host’s acute thermal tolerance to both heat and cold, alters the thermal sensitivity of locomotor performance, and reduces animal survival under prolonged heat stress. We show that these tadpoles have reduced activities of mitochondrial enzymes and altered metabolic rates compared with tadpoles colonized with unmanipulated microbiota, which could underlie differences in thermal phenotypes. These results demonstrate a strong link between the microbiota of an ectothermic vertebrate and the host’s thermal tolerance, performance and fitness. It may therefore be important to consider host-associated microbial communities when predicting species’ responses to climate change.
Article
Social media has become a vital platform for voicing product-related experiences that may not only reveal product defects, but also impose pressure on firms to act more promptly than before. This study scrutinizes the rarely studied relationship between these voices and the speed of product recalls in the context of the pharmaceutical industry in which social media pharmacovigilance is becoming increasingly important for the detection of drug safety signals. Using Food and Drug Administration drug enforcement reports and social media data crawled from online forums and Twitter, we investigate whether social media can accelerate the product recall process in the context of drug recalls. Results based on discrete-time survival analyses suggest that more adverse drug reaction discussions on social media lead to a higher hazard rate of the drug being recalled and, thus, a shorter time to recall. To better understand the underlying mechanism, we propose the information effect, which captures how extracting information from social media helps detect more signals and mine signals faster to accelerate product recalls, and the publicity effect, which captures how firms and government agencies are pressured by public concerns to initiate speedy recalls. Estimation results from two mechanism tests support the existence of these conceptualized channels underlying the acceleration hypothesis of social media. This study offers new insights for firms and policymakers concerning the power of social media and its influence on product recalls.
Article
Knowledge diffusion is a significant driving force behind discipline development and technological innovation. Keyword is a unique knowledge diffusion trajectory, in which the sleeping beauty phenomenon sometimes appears. In this paper, we first put forward the concept of Keyword Sleeping Beauties (KSBs) on the basis of the scientific literature phenomenon of sleeping beauties. Then, we construct a parameter-free identification method to distinguish KSBs based on beauty coefficient criteria. Furthermore, we analyze the intrinsic and extrinsic influencing factors to explore the awakening mechanism of KSBs. The experiment results show that sleeping beauty phenomena also exist in the keyword diffusion trajectory and 284 KSBs are identified. The depth of sleep has a positive correlation with awakening intensity, while the length of sleep has a negative correlation with awakening intensity. In the two years of pre-awakening, KSBs tend to appear in the journals with a higher impact factor. In addition, the adoption frequency and the number of KSBs both increase obviously in the one year of pre-awakening. The findings of this paper enrich the patterns of knowledge diffusion and extend academic thinking on the sleeping beauty in science.
Article
A statistical methodology relatively new to education—survival analysis—is used to describe the career paths of over 6,600 special education teachers newly hired in Michigan and North Carolina between 1972 and 1983, following them for up to 13 years, or until they stopped teaching in the state. Beginning special educators in both states continue to teach for an average of 7 years. They are most likely to leave teaching during the first few years after hire; those who survive this initial “hazardous” period typically teach for many years to come. Young women are particularly likely to leave, as are those special educators who provide support services or teach students with speech, hearing, or vision disabilities. Teachers with high test scores are at greater risk of leaving as are teachers paid comparatively low salaries.
Article
The Education for All Handicapped Children Act is an unusual piece of legislation in that it has continued to enjoy bipartisan support in an era of shrinking federal investment in such programs. Judith Singer and John Butler report the findings of a study on the Act's implementation in five diverse school districts across the country, conducted during the fifth through the eighth years of the Act's existence. The process of equilibration between federal demands and the local capacity to respond provides a central focus for the authors as they ask how, and how well, the schools have functioned as agents of social reform. While they find that both significant transformation of attitude and social reform have occurred, they also point to inequities whose roots in the social fabric make them difficult for the schools alone to overcome.
Article
Richard Murnane, Judith Singer, and John Willett analyze data from a larger study on the factors influencing career paths of teachers, focusing specifically on the career paths of White teachers in North Carolina who were first hired between 1976 and 1978. Using methodology known as "hazards modeling," the authors explore the relationship between the risk of leaving teaching, on the one hand, and teacher salary and opportunity cost, on the other hand. By employing hazards models, they are able to examine simultaneously various predictors of risk of leaving teaching — gender, National Teacher Examination (NTE) score, subject specialty, and the level of teaching (elementary or secondary) — and to determine whether the effects of these predictors remain constant or vary across teachers' careers. The authors conclude by discussing implications for policy and for teacher supply and demand models.
Article
Psychologists studying whether and when events occur face unique design and analytic difficulties. The fundamental problem is how to handle censored observations, the people for whom the target event does not occur before data collection ends. The methods of survival analysis overcome these difficulties and allow researchers to describe patterns of occurrence, compare these patterns among groups, and build statistical models of the risk of occurrence over time. This article presents a unified description of survival analysis that focuses on 2 topics: study design and data analysis. In the process, we show how psychologists have used the methods during the past decade and identify new directions for future application. The presentation is based on our own experience with the methods in modeling employee turnover and examples drawn from research on mental health, addiction, social interaction, and the life course.
Article
Various methods of graphically portraying censored data are discussed. These include smoothing and model checking. Many existing graphical methods may be expressed as functionals of the empirical distribution function. When the data are subject to censoring, the Kaplan–Meier estimator of the distribution may replace the empirical distribution in these methods. The plots and problems that arise from such a substitution will be discussed.
Article
We discuss the use of standard logistic regression techniques to estimate hazard rates and survival curves from censored data. These techniques allow the statistician to use parametric regression modeling on censored data in a flexible way that provides both estimates and standard errors. An example is given that demonstrates the increased structure that can be seen in a parametric analysis, as compared with the nonparametric Kaplan-Meier survival curves. In fact, the logistic regression estimates are closely related to Kaplan-Meier curves, and approach the Kaplan-Meier estimate as the number of parameters grows large.
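The link between logistic regression and Kaplan-Meier curves can be checked numerically in the simplest case, where the logistic model has one free parameter per event time. The sketch below is a minimal Python illustration. The toy `(time, event)` data and the Newton-Raphson helper are assumptions made for this example and are not taken from the cited article. It fits an intercept-only logit separately at each event time, then compares the resulting survival curve with a Kaplan-Meier estimate computed directly from the raw data.

```python
import math

def fit_intercept_logit(events, at_risk, iters=25):
    """Newton-Raphson MLE of alpha in logit(h) = alpha for one period.

    Requires 0 < events < at_risk; otherwise the MLE is not finite.
    """
    alpha = 0.0
    for _ in range(iters):
        h = 1.0 / (1.0 + math.exp(-alpha))
        score = events - at_risk * h        # first derivative of log-likelihood
        info = at_risk * h * (1.0 - h)      # negative second derivative
        alpha += score / info
    return alpha

def counts_at(data, t):
    """Risk set size and event count at discrete time t."""
    n = sum(1 for u, _ in data if u >= t)
    d = sum(1 for u, e in data if u == t and e == 1)
    return n, d

def kaplan_meier(data):
    """Product-limit estimate evaluated at each observed event time."""
    s, out = 1.0, {}
    for t in sorted({u for u, e in data if e == 1}):
        n, d = counts_at(data, t)
        s *= 1.0 - d / n
        out[t] = s
    return out

def logistic_survival(data):
    """Survival implied by a logit hazard model with one parameter per time."""
    s, out = 1.0, {}
    for t in sorted({u for u, e in data if e == 1}):
        n, d = counts_at(data, t)
        h = 1.0 / (1.0 + math.exp(-fit_intercept_logit(d, n)))
        s *= 1.0 - h
        out[t] = s
    return out

# Hypothetical toy data: (time, event) with event = 0 meaning censored.
data = [(1, 1), (2, 1), (3, 0), (3, 1), (2, 0)]
km = kaplan_meier(data)
ls = logistic_survival(data)
print(all(abs(km[t] - ls[t]) < 1e-9 for t in km))
```

Models with fewer parameters than event times smooth the hazard across time, which is the added structure the abstract contrasts with the nonparametric curve.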
Article
Attempts to determine whether enough qualified teachers will be available to staff the nation's schools in the coming years have been hampered by methodological difficulties that are inherent in the study of teacher career patterns. In this article, we have applied an analytic technique rarely used in educational research, proportional hazards modeling, to resolve these problems and to investigate the relationship between teachers' background characteristics and their career durations. We find that teacher demographic characteristics and subject specialty are important predictors of length of stay in teaching. Our results call into question several assumptions about teacher career persistence implicit in the national teacher supply and demand model. We also argue that proportional hazards modeling has wide applicability to many educational research questions.