Goodness-of-fit tests for logistic regression models when data are collected using a complex sampling design

Department of Biostatistics, Virginia Commonwealth University, 1101 East Marshall St., B1-066, Richmond, VA 23298-0032, USA; Division of Epidemiology and Biostatistics and Department of Statistics, School of Public Health, The Ohio State University, 320 West Tenth Ave., M200 Starling-Loving Hall, Columbus, OH 43210, USA; University of Massachusetts, 128 Worcester Road, Stowe, VT 05672-4320, USA
Computational Statistics & Data Analysis 01/2007; DOI: 10.1016/j.csda.2006.07.006
Source: RePEc

ABSTRACT: Logistic regression models are frequently used in epidemiological studies to estimate the associations of demographic, behavioral, and risk factor variables with a dichotomous outcome, such as disease being present versus absent. After the coefficients in a logistic regression model have been estimated, goodness-of-fit of the resulting model should be examined, particularly if the purpose of the model is to estimate probabilities of event occurrences. While various goodness-of-fit tests have been proposed, the properties of these tests have been studied under the assumption that the selected observations were independent and identically distributed. Increasingly, epidemiologists are fitting logistic regression models to data from large-scale sample surveys, such as the National Health Interview Survey or the National Health and Nutrition Examination Survey. Unfortunately, for such situations no goodness-of-fit testing procedures have been developed or implemented in available software. To address this problem, goodness-of-fit tests for logistic regression models when data are collected using complex sampling designs are proposed. Properties of the proposed tests were examined using extensive simulation studies and results were compared to traditional goodness-of-fit tests. A Stata ado function, svylogitgof, for estimating the F-adjusted mean residual test after a svylogit fit is available at the author's website.

  • Source
    ABSTRACT: Housing costs are a substantial component of U.S. household expenditures. Those who allocate a large proportion of their income to housing often have to make difficult financial decisions with significant short-term and long-term implications for adults and children. This study employs cross-sectional data from the first wave of the Los Angeles Family and Neighborhood Survey (L.A.FANS), collected between 2000 and 2002, to examine the most common U.S. standard of housing affordability: the likelihood of spending thirty percent or more of income on shelter costs. Multivariate analyses of a low-income sample of U.S.-born Latinos, Whites, African Americans, authorized Latino immigrants, and unauthorized Latino immigrants focus on baseline and persistent differences in the likelihood of being cost burdened by race, nativity, and legal status. Nearly half or more of each group of low-income respondents experience housing affordability problems. The results suggest that immigrants' legal status is the primary source of disparities among those examined, with the multivariate analyses revealing large and persistent disparities for unauthorized Latino immigrants relative to most other groups. Moreover, the higher odds of housing cost burden observed for unauthorized immigrants compared with their authorized immigrant counterparts remain substantial after accounting for traditional indicators of immigrant assimilation. These results are consistent with emerging scholarship regarding the role of legal status in shaping immigrant outcomes in the United States.
    Race and Social Problems 09/2013; 5(3):173-190.
  • Source
    ABSTRACT: Postoperative venous thromboembolic events (VTEs), which include pulmonary emboli and deep venous thromboses, are potentially preventable causes of death. The aim of this study was to investigate the patient and procedure-related risk factors for the occurrence of VTEs in patients undergoing spinal fusion. METHODS: We used ICD-9-CM (International Classification of Diseases, 9th Revision, Clinical Modification) procedure codes to identify patients in the Nationwide Inpatient Sample (NIS) database for 2001 through 2010 who were treated with spinal fusion. The occurrence of a symptomatic VTE was identified with use of ICD-9-CM diagnosis codes. Patient demographics, hospital characteristics, and comorbidities in the VTE and non-VTE groups were analyzed, and independent risk factors for VTE were identified. RESULTS: A total of 710,154 spinal fusion procedures were identified in the NIS from 2001 to 2010, and 3525 (0.50%) of these patients were recorded as having 3777 VTEs, consisting of 2038 deep venous thromboses (0.29%) and 1739 pulmonary emboli (0.24%). Patients with a VTE were older on average (57.63 years compared with 52.88 years for patients without a VTE) and more often male (VTE incidence, 0.58% compared with 0.42% for female) and black (VTE incidence, 0.78% compared with 0.47% for white). Postoperative VTE occurrence was associated with a longer hospital stay (18.0 compared with 3.94 days) and higher total hospital charges ($207,253 compared with $66,823). A number of comorbidities and procedure-related factors were identified as independent risk factors for VTE. CONCLUSIONS: We present a VTE Risk Index, based on the independent risk factors identified in this study, for VTE following spinal fusion. In conjunction with current guidelines, this risk index can be used to guide clinical decision-making regarding VTE prophylaxis in patients undergoing spinal fusion. LEVEL OF EVIDENCE: Prognostic Level III.
    The Journal of Bone and Joint Surgery 10/2013; 96(11):936-942. · 3.23 Impact Factor
  • Source
    ABSTRACT: Objective: The objectives of this study are to examine racial and ethnic differences in suicidal behavior, its main risk factors, and the effect of the risk factors on suicidal behavior in young adults in the United States. Design: Using nationally representative data (n = 10,585) from Add Health, we calculate the prevalence of suicidal behavior and associated risk factors for non-Hispanic White, non-Hispanic Black, and Hispanic youth (aged 18-26) using logistic regression models of suicidal ideation stratified by race. Results: Non-Hispanic White and Hispanic young adults have higher rates of suicidal ideation than their non-Hispanic Black counterparts, but racial/ethnic differences in attempts are not statistically significant. Non-Hispanic White and Hispanic young adults are more likely to possess key risk factors for suicide. With the exception of substance use variables (i.e., alcohol and marijuana use), which appear to be more conducive to suicidal ideation in non-Hispanic Black than in non-Hispanic White young adults, the effects of risk factors appear to be similar across race/ethnicity. Conclusion: The higher prevalence of suicidal ideation in non-Hispanic White and Hispanic young adults may be driven by their greater exposure to risk factors, as opposed to differences in the effects of these risk factors. More research is needed to uncover why non-Hispanic White and Hispanic young adults have higher rates of suicidal ideation than their non-Hispanic Black counterparts, while rates of suicide attempts are comparable and non-Hispanic White young adults have the highest rate of completed suicides.
    Ethnicity and Health 10/2013; · 1.20 Impact Factor

