The Utility of Prediction Models to Oversample the Long-Term Uninsured
This study evaluates the performance of prediction models in identifying the long-term uninsured and their utility for oversampling in national health care surveys. Nationally representative data from the Medical Expenditure Panel Survey (MEPS) were used to produce national estimates of nonelderly adults without health insurance coverage for 2 consecutive years and to identify the factors that distinguished them from the short-term uninsured and the continuously insured. The MEPS data were also used to develop prediction models that identify individuals most likely to experience long-term spells without coverage in the future. The models were developed using data from the MEPS panel covering 2004-2005 and evaluated with an independent MEPS panel. The findings show these prediction models to be effective statistical tools for efficiently oversampling individuals likely to remain uninsured for long periods. Using the models to oversample, in support of a 50% increase in sample yield over a self-weighting design, permits the target sample of individuals who are continuously uninsured for 2 consecutive years to be selected in the most cost-efficient manner. This methodology allows an overall sample size specification for nonelderly adults that is at least 25% lower than that required by a design without access to the predictor variables from a screening interview or without oversampling. This examination of how well probabilistic models identify the long-term uninsured and facilitate oversampling them demonstrates the viability of model-based sampling methodologies for adoption in national health care surveys.
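The yield arithmetic behind model-based oversampling can be illustrated with a minimal sketch. It assumes a simplified two-stratum design in which screened individuals are classified by predicted probability into a "high-propensity" and a "low-propensity" stratum. The stratum share and the target rates below are hypothetical values chosen for illustration, not MEPS estimates, and the function names are invented for this example. The sketch also ignores the variance inflation that unequal sampling rates introduce, so its numbers are not comparable to the sample size reductions reported above.

```python
def overall_rate(share_high, rate_high, rate_low):
    """Target-group rate in the population, i.e. the expected yield per
    sampled person under a self-weighting (proportionate) design."""
    return share_high * rate_high + (1.0 - share_high) * rate_low


def allocation_for_yield_gain(share_high, rate_high, rate_low, gain):
    """Fraction of the sample to draw from the high-propensity stratum so
    that expected yield is `gain` times the self-weighting yield."""
    if rate_high <= rate_low:
        raise ValueError("high stratum must have the higher target rate")
    target_rate = gain * overall_rate(share_high, rate_high, rate_low)
    alloc = (target_rate - rate_low) / (rate_high - rate_low)
    if not 0.0 <= alloc <= 1.0:
        raise ValueError("requested gain is not attainable with two strata")
    return alloc


def expected_yield(n, alloc_high, rate_high, rate_low):
    """Expected number of target-group members in a sample of size n."""
    return n * (alloc_high * rate_high + (1.0 - alloc_high) * rate_low)


# Hypothetical inputs (assumptions, not taken from the study):
SHARE_HIGH = 0.20   # share of the population flagged as high propensity
RATE_HIGH = 0.50    # long-term uninsured rate within the flagged stratum
RATE_LOW = 0.05     # long-term uninsured rate in the remaining stratum

alloc = allocation_for_yield_gain(SHARE_HIGH, RATE_HIGH, RATE_LOW, gain=1.5)
baseline = expected_yield(1000, SHARE_HIGH, RATE_HIGH, RATE_LOW)
oversampled = expected_yield(1000, alloc, RATE_HIGH, RATE_LOW)
print(f"allocation to high stratum: {alloc:.3f}")
print(f"yield per 1000 sampled: {baseline:.0f} -> {oversampled:.0f}")
```

With a gain of 1.0 the function returns the stratum's population share, which is the self-weighting allocation. Larger gains shift more of the sample toward the flagged stratum, and how far they can go depends on how well the model separates the two groups.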