Imputation of Incident Events in Longitudinal Cohort Studies

Department of Biostatistics, School of Public Health, University of Alabama at Birmingham, Ryals Building, Room 327D, 1665 University Boulevard, Birmingham, AL 35294-0022, USA.
American Journal of Epidemiology (Impact Factor: 5.23). 07/2011; 174(6):718-26. DOI: 10.1093/aje/kwr155
Source: PubMed


Longitudinal cohort studies normally identify and adjudicate incident events detected during follow-up by retrieving medical records. There are several reasons why the adjudication process may not be successfully completed for a suspected event, including the inability to retrieve medical records from hospitals and insufficient time between the suspected event and data analysis. These "incomplete adjudications" are normally assumed not to be events, an approach that may be associated with loss of precision and introduction of bias. In this article, the authors evaluate the use of multiple imputation methods designed to include incomplete adjudications in analysis. Using data from the REasons for Geographic And Racial Differences in Stroke (REGARDS) Study, 2008-2009, they demonstrate that this approach may increase precision and reduce bias in estimates of the relations between risk factors and incident events.
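
The general idea of imputing incomplete adjudications can be illustrated with a minimal Python sketch. It is not the authors' imputation model: the toy cohort, the function names, and the single confirmation probability `p_confirm` are all hypothetical, and a proper multiple imputation would also draw the imputation-model parameters from their posterior rather than fixing them. Each incomplete adjudication is imputed as an event or non-event, a log risk ratio is estimated in every completed dataset, and the results are combined with Rubin's rules.

```python
import math
import random
import statistics


def impute_once(cohort, p_confirm, rng):
    """Fill each incomplete adjudication (status None) with a Bernoulli draw."""
    return [(exposed, status if status is not None else int(rng.random() < p_confirm))
            for exposed, status in cohort]


def log_risk_ratio(records):
    """Log risk ratio (exposed vs unexposed) and its large-sample variance."""
    a = sum(event for exposed, event in records if exposed)
    n1 = sum(1 for exposed, _ in records if exposed)
    c = sum(event for exposed, event in records if not exposed)
    n0 = len(records) - n1
    estimate = math.log((a / n1) / (c / n0))
    variance = 1 / a - 1 / n1 + 1 / c - 1 / n0
    return estimate, variance


def pool_rubin(estimates, variances):
    """Rubin's rules: pooled estimate and total variance W + (1 + 1/m) * B."""
    m = len(estimates)
    q_bar = statistics.fmean(estimates)
    within = statistics.fmean(variances)
    between = statistics.variance(estimates) if m > 1 else 0.0
    return q_bar, within + (1 + 1 / m) * between


def analyse(cohort, p_confirm, m=20, seed=0):
    """Impute m times, analyse each completed dataset, then pool."""
    rng = random.Random(seed)
    results = [log_risk_ratio(impute_once(cohort, p_confirm, rng)) for _ in range(m)]
    return pool_rubin([e for e, _ in results], [v for _, v in results])


def toy_cohort():
    """Hypothetical data: (exposed, status), status 1, 0, or None (incomplete)."""
    cohort = []
    for exposed, confirmed, incomplete, no_event in ((True, 40, 20, 940),
                                                     (False, 20, 10, 970)):
        cohort += [(exposed, 1)] * confirmed
        cohort += [(exposed, None)] * incomplete
        cohort += [(exposed, 0)] * no_event
    return cohort


if __name__ == "__main__":
    estimate, total_var = analyse(toy_cohort(), p_confirm=0.7)
    print(f"pooled log RR = {estimate:.3f}, SE = {math.sqrt(total_var):.3f}")
```

Treating every incomplete adjudication as a non-event corresponds to setting `p_confirm=0`, which is the default approach the abstract cautions against.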



Available from: George Howard, Sep 08, 2014
  • Source
    ABSTRACT: Multiple imputation is increasingly regarded as the standard method to account for partially observed data, but most methods have been based on cross-sectional imputation algorithms. Recently, a new multiple-imputation method, the two-fold fully conditional specification (FCS) method, was developed to impute missing data in longitudinal datasets with nonmonotone missing data. (See Nevalainen J., Kenward M.G., and Virtanen S.M. 2009. Missing values in longitudinal dietary data: A multiple imputation approach based on a fully conditional specification. Statistics in Medicine 28: 3657-3669.) This method imputes missing data at a given time point based on measurements recorded at the previous and next time points. Up to now, the method has only been tested on a relatively small dataset and under very specific conditions. We have implemented the two-fold FCS algorithm in Stata, and in this study we further challenge and evaluate the performance of the algorithm under different scenarios. In simulation studies, we generated 1,000 datasets, which were similar in structure to the longitudinal clinical records (The Health Improvement Network primary care database) to which we will apply the two-fold FCS algorithm. Initially, these generated datasets included complete records. We then introduced different levels and patterns of partially observed data and applied the algorithm to generate multiply imputed datasets. The results of our initial multiple imputations demonstrated that the algorithm provided acceptable results when using a linear substantive model and data were imputed over a limited time period for continuous variables such as weight and blood pressure. Introducing an exponential substantive model introduced some bias, but estimates were still within acceptable ranges. We will present results for simulation studies that include situations where categorical and continuous variables change over a 10-year period (for example, smokers become ex-smokers, weight increases or decreases) and large proportions of data are unobserved. We also explore how the algorithm deals with interactions and whether initiating the algorithm to run forward or backward in time has any impact on the final data distribution.
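
The core idea described above, imputing a value at one time point from the measurements at the previous and next time points, can be sketched in Python. This is a heavily simplified illustration and not the Stata implementation mentioned in the abstract: it handles a single continuous variable, regresses each time point on the mean of its adjacent time points, and adds residual noise without drawing the regression parameters from their posterior. The function names and the toy data are hypothetical, and the sketch assumes every subject has at least one observed value and every time point has at least one observed subject.

```python
import math
import random
import statistics


def fit_line(xs, ys):
    """Least-squares intercept, slope, and residual SD for y ~ x."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx > 0 else 0.0
    intercept = my - slope * mx
    rss = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    sd = math.sqrt(rss / (len(xs) - 2)) if len(xs) > 2 else 0.0
    return intercept, slope, sd


def neighbour_mean(row, t):
    """Mean of the values at the adjacent time points (t - 1 and t + 1)."""
    return statistics.fmean(row[k] for k in (t - 1, t + 1) if 0 <= k < len(row))


def impute_longitudinal(data, sweeps=5, seed=0):
    """Return one completed copy of data (rows = subjects, None = missing)."""
    rng = random.Random(seed)
    missing = [[v is None for v in row] for row in data]
    # Starting values: each subject's own observed mean.
    filled = []
    for row in data:
        start = statistics.fmean(v for v in row if v is not None)
        filled.append([start if v is None else v for v in row])
    for _ in range(sweeps):
        for t in range(len(data[0])):
            # Fit the imputation model on subjects observed at time t.
            observed = [r for r, m in zip(filled, missing) if not m[t]]
            a, b, sd = fit_line([neighbour_mean(r, t) for r in observed],
                                [r[t] for r in observed])
            # Replace only the originally missing values at time t.
            for r, m in zip(filled, missing):
                if m[t]:
                    r[t] = a + b * neighbour_mean(r, t) + rng.gauss(0, sd)
    return filled


if __name__ == "__main__":
    weights = [[70.0, 71.0, None, 73.0],
               [80.0, None, 82.0, 83.0],
               [65.0, 66.0, 67.0, None],
               [90.0, 91.0, 92.0, 93.0],
               [75.0, 76.0, 77.0, 78.0]]
    for row in impute_longitudinal(weights):
        print([round(v, 1) for v in row])
```

Calling `impute_longitudinal` with several different seeds would give multiple completed datasets, each of which could then be analysed and pooled in the usual way.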
  • Source
    ABSTRACT: Previously in the REasons for Geographic And Racial Differences in Stroke (REGARDS) cohort, we found 18% of the stroke/transient ischemic attack-free study population reported ≥1 stroke symptom at baseline. We sought to evaluate the additional impact of these stroke symptoms on risk for subsequent stroke. REGARDS recruited 30,239 US blacks and whites, aged 45+ years in 2003 to 2007 who are being followed every 6 months for events. All stroke events are physician-verified; those with prior diagnosed stroke or transient ischemic attack are excluded from this analysis. At baseline, participants were asked 6 questions regarding stroke symptoms. Measured stroke risk factors were components of the Framingham Stroke Risk Score. After excluding those with prior stroke or missing data, there were 24,412 participants in this analysis with a median follow-up of 4.4 years. Participants were 39% black, 55% female, and had median age of 64 years. There were 381 physician-verified stroke events. The Framingham Stroke Risk Score explained 72.0% of stroke risk; individual components explained between 0.2% (left ventricular hypertrophy) and 5.7% (age+race) of stroke risk. After adjustment for Framingham Stroke Risk Score factors, stroke symptoms were significantly related to stroke risk: the risk of stroke increased by 21% for each stroke symptom reported. Among participants without self-reported stroke or transient ischemic attack, prior stroke symptoms are highly predictive of future stroke events. Compared with Framingham Stroke Risk Score factors, the impact of stroke symptoms on the prediction of future stroke was almost as large as the impact of smoking and hypertension and larger than the impact of diabetes and heart disease.
    Stroke 09/2011; 42(11):3122-6. DOI:10.1161/STROKEAHA.110.612937 · 5.72 Impact Factor
  • Source
    ABSTRACT: Black/white disparities in stroke incidence are well documented, but few studies have assessed the contributions to the disparity. Here we assess the contribution of "traditional" risk factors. A total of 25 714 black and white men and women, aged ≥45 years and stroke-free at baseline, were followed for an average of 4.4 years to detect stroke. Mediation analysis using proportional hazards analysis assessed the contribution of traditional risk factors to racial disparities. At age 45 years, incident stroke risk was 2.90 (95% CI: 1.72-4.89) times higher in blacks than in whites; at age 65 years, it was 1.66 (95% CI: 1.34-2.07) times higher. Adjustment for risk factors attenuated these excesses by 40% and 45%, respectively, resulting in relative risks of 2.14 (95% CI: 1.25-3.67) and 1.35 (95% CI: 1.08-1.71). Approximately one half of this mediation is attributable to systolic blood pressure. Further adjustment for socioeconomic factors resulted in total mediation of 47% and 53%, giving relative risks of 2.01 (95% CI: 1.16-3.47) and 1.30 (95% CI: 1.03-1.65), respectively. Between ages 45 and 65 years, approximately half of the racial disparity in stroke risk is attributable to traditional risk factors (primarily systolic blood pressure) and socioeconomic factors, suggesting a critical need to understand the disparity in the development of these traditional risk factors. Because half of the excess stroke risk in blacks is not attributable to traditional risk factors and socioeconomic factors, differential impact of risk factors, residual confounding, or nontraditional risk factors may also play a role.
    Stroke 09/2011; 42(12):3369-75. DOI:10.1161/STROKEAHA.111.625277 · 5.72 Impact Factor
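
The attenuation percentages quoted in the abstract above can be approximated with a short calculation. The formula below, the share of the excess relative risk removed by adjustment, is one common convention and is an assumption here; the abstract does not state how the percentages were computed. Because the inputs are the rounded relative risks from the abstract, the results can differ from the published figures by a point or two.

```python
def percent_excess_explained(rr_crude, rr_adjusted):
    """Share (in %) of the excess relative risk (RR - 1) removed by adjustment."""
    return 100 * (rr_crude - rr_adjusted) / (rr_crude - 1)


if __name__ == "__main__":
    # Rounded relative risks taken from the abstract: (label, crude, adjusted).
    cases = [("age 45, risk factors", 2.90, 2.14),
             ("age 65, risk factors", 1.66, 1.35),
             ("age 45, plus socioeconomic factors", 2.90, 2.01),
             ("age 65, plus socioeconomic factors", 1.66, 1.30)]
    for label, crude, adjusted in cases:
        print(f"{label}: {percent_excess_explained(crude, adjusted):.0f}%")
```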