Article

Mismeasurement and the Resonance of Strong Confounders: Correlated Errors

Abstract

Confounding in epidemiology, and the limits of standard methods of control for an imperfectly measured confounder, have been understood for some time. However, most treatments of this problem are based on the assumption that errors of measurement in confounding and confounded variables are independent. This paper considers the situation in which a strong risk factor (confounder) and an inconsequential but suspected risk factor (confounded) are each measured with errors that are correlated; the situation appears especially likely to occur in the field of nutritional epidemiology. Error correlation appears to add little to measurement error as a source of bias in estimating the impact of a strong risk factor: it can add to, diminish, or reverse the bias induced by measurement error in estimating the impact of the inconsequential risk factor. Correlation of measurement errors can add to the difficulty involved in evaluating structures in which confounding and measurement error are present. In its presence, observed correlations among risk factors can be greater than, less than, or even opposite to the true correlations. Interpretation of multivariate epidemiologic structures in which confounding is likely requires evaluation of measurement error structures, including correlations among measurement errors. Am J Epidemiol 1999;150:88–96.
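The abstract's central claim can be illustrated with a small large-sample calculation. This is a minimal sketch under assumptions that are not taken from the paper: a linear outcome model, standardized true risk factors, and classical additive errors that are independent of the true values. The function name `limiting_coefficients` and all parameter values are hypothetical. It computes the coefficients that ordinary least squares converges to when a strong factor (true coefficient 1) and an inconsequential factor (true coefficient 0) are both mismeasured.

```python
import numpy as np


def limiting_coefficients(rho_true, rho_err, sd_err_x=1.0, sd_err_z=1.0,
                          beta=(1.0, 0.0)):
    """Large-sample OLS coefficients when both regressors carry classical error.

    Assumed model (illustrative, not the paper's): Y = beta_x*X + beta_z*Z + noise,
    with X and Z standardized and correlated rho_true, and observed values
    W = (X, Z) + U, where the errors U are independent of (X, Z) and the noise
    and have correlation rho_err.
    """
    sigma_true = np.array([[1.0, rho_true], [rho_true, 1.0]])
    cov_err = rho_err * sd_err_x * sd_err_z
    sigma_err = np.array([[sd_err_x ** 2, cov_err], [cov_err, sd_err_z ** 2]])
    sigma_obs = sigma_true + sigma_err
    # OLS on the mismeasured pair converges to Sigma_obs^{-1} Sigma_true beta.
    return np.linalg.solve(sigma_obs, sigma_true @ np.asarray(beta, dtype=float))


if __name__ == "__main__":
    for rho_err in (-0.5, 0.0, 0.5, 0.8):
        b_x, b_z = limiting_coefficients(rho_true=0.5, rho_err=rho_err)
        print(f"error corr {rho_err:+.1f}: strong {b_x:.3f}, inconsequential {b_z:+.3f}")
```

With a true correlation of 0.5 and error variances equal to the true variances, the inconsequential factor's coefficient is about +0.13 when errors are independent, exactly 0 at an error correlation of 0.5, negative at 0.8, and +0.25 at -0.5. The strong factor's coefficient stays between roughly 0.47 and 0.58 throughout, which is in line with the abstract's remark that error correlation adds little to the bias for the strong risk factor.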

... The challenge is how to use external validation data from other similar studies to adjust for the bias in the exposure-outcome association. When exposures are measured with correlated errors, it can be very difficult to predict the direction and strength of the association (Marshall, Hastrup and Ross, 1999). The difficulty is due to contamination effect of the confounder measurement error (Freedman et al., 2011). ...
... can take another form. The assumed distribution, nevertheless, is a reasonable approximation due to its ability to capture common distributional features such as (Marshall et al., 1999). The findings from this study, therefore, cannot be generalized for multiple exposures measured with correlated errors. ...
... The effect of measurement error on the association between an exposure and an outcome of interest has been studied extensively in epidemiology (Carroll et al., 2006; Day et al., 2004; Freedman et al., 2004; Freedman et al., 2008; Kipnis et al., 2003; Marshall et al., 1999), and particularly so in nutritional epidemiology. In nutritional research, the usually weak association between a dietary intake and the risk of a disease can further be distorted by another risk factor that is associated with both the disease and the dietary intake (hereafter, confounder) and by measurement error in the confounder. ...
Thesis
Background Measurement error in exposure variables is an important issue in epidemiological studies that relate exposures to health outcomes. Such studies, however, usually pay limited attention to the quantitative effects of exposure measurement error on estimated exposure-outcome associations. Therefore, the estimators for exposure-outcome associations are prone to bias. Existing methods to adjust for the bias in the associations require a validation study with multiple replicates of a reference measurement. Validation studies with multiple replicates are quite costly and therefore, in some cases only a single-replicate validation study is conducted besides the main study. For a study that does not include an internal validation study, the challenge in dealing with exposure measurement error is even bigger. The challenge is how to use external data from other similar validation studies to adjust for the bias in the exposure-outcome association. In accelerometry research, various accelerometer models have currently been developed. However, some of these new accelerometer models have not been properly validated in field situations. Despite the widely recognized measurement error in the accelerometer, some accelerometers have been used to validate other instruments, such as physical activity questionnaires, in measuring physical activity. Consequently, if an instrument is validated against the accelerometer, and the accelerometer itself has considerable measurement error, the observed validity in the instrument being validated will misrepresent the true validity. Methodology In this thesis, we adapted regression calibration to adjust for exposure measurement error for a single-replicate validation study with zero-inflated reference measurements and assessed the adequacy of the adapted method in a simulation study.
For the case where there is no internal validation study, we showed how to combine external data on validity for self-report instruments with the observed questionnaire data to adjust for the bias in the associations caused by measurement error in correlated exposures. In the last part, we applied a measurement error model to assess the measurement error in physical activity as measured by an accelerometer in free-living individuals in a recently concluded validation study. Results The performance of the proposed two-part model was sensitive to the form of continuous independent variables and was minimally influenced by the correlation between the probability of a non-zero response and the actual non-zero response values. Reducing the number of covariates in the model seemed beneficial, but was not critical in large-sample studies. We showed that if the confounder is strongly linked with the outcome, measurement error in the confounder can be more influential than measurement error in the exposure in causing the bias in the exposure-outcome association, and that the bias can be in any direction. We further showed that when accelerometers are used to monitor the level of physical activity in free-living individuals, the mean level of physical activity would be underestimated, the associations between physical activity and health outcomes would be biased, and there would be loss of statistical power to detect associations. Conclusion The following remarks were made from the work in this thesis. First, when only a single-replicate validation study with zero-inflated reference measurements is available, a correctly specified regression calibration can be used to adjust for the bias in the exposure-outcome associations. The performance of the proposed calibration model is influenced more by the assumption made on the form of the continuous covariates than the form of the response distribution. 
Second, in the absence of an internal validation study, carefully extracted validation data that is transportable to the main study can be used to adjust for the bias in the associations. The proposed method is also useful in conducting sensitivity analyses on the effect of measurement errors. Lastly, when “reference” instruments are themselves marred by substantial bias, the effect of measurement error in an instrument being validated can be seriously underestimated.
... The effect of measurement error on the association between an exposure and an outcome of interest has been studied extensively in epidemiology [1][2][3][4][5][6][7][8][9][10][11][12][13], and particularly so in nutritional epidemiology. In nutritional research, the usually weak association between a dietary intake and the risk of a disease can further be distorted by another risk factor that is associated with both the disease and the dietary intake (hereafter, confounder) and by measurement error in the confounder. ...
... In nutritional research, the usually weak association between a dietary intake and the risk of a disease can further be distorted by another risk factor that is associated with both the disease and the dietary intake (hereafter, confounder) and by measurement error in the confounder. Moreover, the measurement error in the confounder can be more harmful in distorting the diet-disease association than the measurement error in the dietary intake [6]. If measurement error in the confounder is not taken into account, its effects can resonate so that a dietary intake with no effect can appear to have a sizable effect on the risk of a disease [6]. ...
... Moreover, the measurement error in the confounder can be more harmful in distorting the diet-disease association than the measurement error in the dietary intake [6]. If measurement error in the confounder is not taken into account, its effects can resonate so that a dietary intake with no effect can appear to have a sizable effect on the risk of a disease [6]. Resonant confounding due to confounder measurement error can bias the diet-disease association in any direction, even when a researcher adjusts for confounding [6,14]. ...
Article
Background Measurement error in self-reported dietary intakes is known to bias the association between dietary intake and a health outcome of interest such as risk of a disease. The association can be distorted further by mismeasured confounders, leading to invalid results and conclusions. It is, however, difficult to adjust for the bias in the association when there is no internal validation data. Methods We proposed a method to adjust for the bias in the diet-disease association (hereafter, association), due to measurement error in dietary intake and a mismeasured confounder, when there is no internal validation data. The method combines prior information on the validity of the self-report instrument with the observed data to adjust for the bias in the association. We compared the proposed method with the method that ignores the confounder effect, and with the method that ignores measurement errors completely. We assessed the sensitivity of the estimates to various magnitudes of measurement error, error correlations and uncertainty in the literature-reported validation data. We applied the methods to fruits and vegetables (FV) intakes, cigarette smoking (confounder) and all-cause mortality data from the European Prospective Investigation into Cancer and Nutrition study. Results Using the proposed method resulted in an approximately fourfold increase in the strength of association between FV intake and mortality. For weakly correlated errors, measurement error in the confounder minimally affected the hazard ratio estimate for FV intake. The effect was more pronounced for strong error correlations. Conclusions The proposed method permits sensitivity analysis on measurement error structures and accounts for uncertainties in the reported validity coefficients. The method is useful in assessing the direction and quantifying the magnitude of bias in the association due to measurement errors in the confounders.
... Statistical control for confounding was not adequate. The inability to precisely measure important exposures can cause the effects of these exposures to resonate, to bias estimates of the effects of other exposures (10)(11)(12). We also understand that errors in the measurement of these exposures can be highly correlated (12,13) and that evaluating the effects of exposures in the presence of such measurement errors can be exceedingly complex. ...
... The inability to precisely measure important exposures can cause the effects of these exposures to resonate, to bias estimates of the effects of other exposures (10)(11)(12). We also understand that errors in the measurement of these exposures can be highly correlated (12,13) and that evaluating the effects of exposures in the presence of such measurement errors can be exceedingly complex. The standard statistical approaches usually used to control confounding are not sufficient (11)(12)(13). ...
... We also understand that errors in the measurement of these exposures can be highly correlated (12,13) and that evaluating the effects of exposures in the presence of such measurement errors can be exceedingly complex. The standard statistical approaches usually used to control confounding are not sufficient (11)(12)(13). ...
... Taylor and Thun showed substantial effects of possible CPS-II misclassification of ex-smoking years as "smoking" years [32]. Wacholder et al. [36], Boffetta et al. [4], and Marshall et al. [23, 24] each have noted that study risk estimates are very sensitive to misclassification of exposure, as occurred even in Nurses (Table 2) [3]. Recent papers by Jha et al. [15] and Gruer et al. [12] have suggested apparent dominant roles of smoking in socioeconomic mortality disparities. ...
... This study has several implications. First, reducing active, secondhand, and insensible [3] smoking in all and especially the less educated (Fig. 2) and reducing bias in studies assessing or adjusting for smoking effects seem merited [23, 24, 36]. Second, as others have noted, development and use of better smoke load biochemical markers or sentinel health event smoke load bio-indices like lung cancer rates with juried cause of death assessments may be useful [13]. ...
Article
Large, unexplained, but possibly related disparities exist between heart disease risks observed in differing genders, educational levels, times, and studies. Such heart disease disparities might be related to cumulative tobacco smoke damage (smoke load) disparities that are overlooked in standard assessments of point smoking status. So, I reviewed possible relationships between smoke load and heart disease levels across genders, educational strata, years, and leading studies. Smoker heart disease risk assessments in the Nurses Health Study (Nurses), Cancer Prevention Study-II (CPS-II), and British Doctors studies were compared and related to their likely selection and misclassification biases. Relationships between smoke loads and United States (US) education- and gender-related heart disease mortality disparities were qualitatively assessed using lung cancer rates as a smoke load proxy. The high heart disease mortality risks observed in smoking Nurses in 1980-2004 and in less educated US women in 2001 were qualitatively associated with their higher smoke loads and lower selection and exposure misclassification biases than in the CPS-II and Doctors studies. Smoking-attributable heart disease death tolls and disparities extrapolated from mortality ratios from the CPS-II and Doctors studies may be substantial underestimates. Such studies appear to have compared convenience samples of light smokers to lighter smokers instead of comparing representative smokers to the unexposed. Further efforts to minimize smoke exposures and better quantify cumulative smoking-attributable burdens are needed.
... 33 Dependent and correlated errors are particularly problematic as the magnitude and direction of the resulting bias can be extreme and unpredictable. [34][35][36] When data values can change over time, the timing of measurement may be important and may require specific analytic techniques. 37,38 For more information, we refer readers to the Risk Of Bias In Non-randomized Studies of Interventions (ROBINS-I) detailed guidance, 11 which discusses assessment and handling of time-varying confounders in observational research synthesis. ...
Article
Systematic reviews and meta-analyses are essential for drawing conclusions regarding etiologic associations between exposures or interventions and health outcomes. Observational studies comprise a substantive source of the evidence base. One major threat to their validity is residual confounding, which may occur when component studies adjust for different sets of confounders, fail to control for important confounders, or have classification errors resulting in only partial control of measured confounders. We present the confounder matrix, an approach for defining and summarizing adequate confounding control in systematic reviews of observational studies and incorporating this assessment into meta-analyses. First, an expert group reaches consensus regarding the core confounders that should be controlled and the best available method for their measurement. Second, a matrix graphically depicts how each component study accounted for each confounder. Third, the assessment of control adequacy informs quantitative synthesis. We illustrate the approach with studies of the association between short interpregnancy intervals and preterm birth. Our findings suggest that uncontrolled confounding, notably by reproductive history and sociodemographics, resulted in exaggerated estimates. Moreover, no studies adequately controlled for all core confounders, so we suspect residual confounding is present, even among studies with better control. The confounder matrix serves as an extension of previously published methodological guidance for observational research synthesis, enabling transparent reporting of confounding control and directly informing meta-analysis so that conclusions are drawn from the best available evidence. Widespread application could raise awareness about gaps across a body of work and allow for more valid inference with respect to confounder control.
... When the search term 'epidemiology' is entered into PubMed, over 400 000 English-language journal articles of observational studies, clinical trials/studies or reviews/meta-analyses published during this time frame are retrieved, suggesting that relatively few epidemiology studies included a QBA. Despite the development of QBA methods that are relatively easy to implement, we conclude that investigators still too often rely on qualitative judgement based on heuristics, 266 even though exceptions are well documented, 26,35,[267][268][269][270][271][272][273][274][275][276][277][278] rather than quantitatively appraise the expected direction and magnitude of bias when interpreting their research findings. That said, the application of QBA-and to some degree, its quality-did increase steadily during this period. ...
Article
Background Quantitative bias analysis (QBA) measures study errors in terms of direction, magnitude and uncertainty. This systematic review aimed to describe how QBA has been applied in epidemiological research in 2006–19. Methods We searched PubMed for English peer-reviewed studies applying QBA to real-data applications. We also included studies citing selected sources or which were identified in a previous QBA review in pharmacoepidemiology. For each study, we extracted the rationale, methodology, bias-adjusted results and interpretation and assessed factors associated with reproducibility. Results Of the 238 studies, the majority were embedded within papers whose main inferences were drawn from conventional approaches as secondary (sensitivity) analyses to quantify specific biases (52%) or to assess the extent of bias required to shift the point estimate to the null (25%); 10% were standalone papers. The most common approach was probabilistic (57%). Misclassification was modelled in 57%, uncontrolled confounder(s) in 40% and selection bias in 17%. Most did not consider multiple biases or correlations between errors. When specified, bias parameters came from the literature (48%) more often than internal validation studies (29%). The majority (60%) of analyses resulted in >10% change from the conventional point estimate; however, most investigators (63%) did not alter their original interpretation. Degree of reproducibility related to inclusion of code, formulas, sensitivity analyses and supplementary materials, as well as the QBA rationale. Conclusions QBA applications were rare though increased over time. Future investigators should reference good practices and include details to promote transparency and to serve as a reference for other researchers.
... Moreover, measurement error in confounders, including drug abuse and alcohol intake, may have resulted in residual confounding. However, the direction and magnitude of any imposed bias due to measurement error in the exposure and confounders are not predictable without knowledge of the error structure (15,63,64). The possibility of selection bias is another issue threatening the validity of these findings. ...
Article
There are few if any reports regarding the role of lifetime waterpipe smoking in multiple sclerosis (MS) etiology. The authors investigated the association between waterpipe and MS, adjusted for confounders. This was a population-based incident case-control study conducted in Tehran, Iran. Cases (n=547) were 15–50-year-old patients identified from the Iranian Multiple Sclerosis Society between 2013 and 2015. Population-based controls (n=1057) were 15–50-year-olds recruited by random digit telephone dialing. A double robust estimator method known as targeted maximum likelihood estimator (TMLE) was used to estimate the marginal risk ratio and odds ratio between waterpipe and MS. Both the estimated RR and OR were 1.70 (95% CI: 1.34, 2.17). The population attributable fraction was 21.4% (95% CI: 4.0%, 38.8%). Subject to the limitations of case-control studies in interpreting associations causally, this study suggests that waterpipe use, or its strongly related but undetermined factors, increases the risk of MS. Further epidemiological studies including nested case-control studies are needed to confirm these results.
... 4,5 Discussion of the likelihood, direction, and magnitude of such biases related to exposure and outcome misclassification are ubiquitous in published epidemiologic literature. The consequence that misclassifying a confounding factor has on effect estimates, though long understood and described in previous literature, [6][7][8][9][10][11] may be less well appreciated. When a confounding variable is subject to misclassification, the ability to control for its effect is reduced because some level of confounding (i.e., residual confounding) still remains. ...
Article
When electronic health record (EHR) data are used, multiple approaches may be available for measuring the same variable, introducing potentially confounding factors. While additional information may be gleaned and residual confounding reduced through resource-intensive assessment methods such as natural language processing (NLP), whether the added benefits offset the added cost of the additional resources is not straightforward. We evaluated the implications of misclassification of a confounder when using EHRs. Using a combination of simulations and real data surrounding hospital readmission, we considered smoking as a potential confounder. We compared ICD-9 diagnostic code assignment, which is an easily available measure but has the possibility of substantial misclassification of smoking status, with NLP, a method of determining smoking status that is more expensive and time-consuming than ICD-9 code assignment but has less potential for misclassification. Classification of smoking status with NLP consistently produced less residual confounding than the use of ICD-9 codes; however, when minimal confounding was present, differences between the approaches were small. When considerable confounding is present, investing in a superior measurement tool becomes advantageous.
... In the CRC screening project, health officers and research staff assessed dietary intake by interview using a structured questionnaire. However, reported dietary intake depended on individual recall, as with any self-report instrument, which can cause measurement error; even slight measurement error in a major risk factor can affect the association estimate of a suspected risk factor (Marshall et al., 1999; Agogo et al., 2016). ...
Article
Background There is convincing evidence from epidemiological studies that meat consumption increases colorectal cancer (CRC) risk. However, assessment of any association with a positive fecal immunochemical test (FIT) in CRC screening has been limited. If a link could be shown this might be helpful for establishing a risk group for colonoscopy. Objective This study aimed to assess any association between meat consumption and other lifestyle factors and a positive FIT result in a Thai population. Methods A cross-sectional analytical study was conducted with 1,167 participants in a population-based randomized controlled trial. CRC was screened from May 2016 - February 2017. Subjects aged 45-74 years who met the eligibility criteria were randomly allocated to the study arm. A positive FIT was determined with cut-off 100 ng/mL. Multiple logistic regression was used to analyze any relationship between lifestyle factors and a positive FIT. Results The total number of subjects was 1,060 (90.8% return rate of FIT). With FIT100, FIT150, and FIT200, positive tests were found in 92 (8.68%), 74 (6.98%), and 60 (5.66%), respectively. No significant associations were noted with any of the variables, except for being aged 60-74 years (ORadj = 1.62, 95% CI: 1.03-2.54). Borderline significance was observed for high consumption of vegetables (ORadj = 0.62, 95% CI: 0.36-1.07) and being male (ORadj = 1.39, 95% CI: 0.87-2.22). Conclusion Despite the evidence from the literature, no association was found here between a positive FIT result and meat consumption or other well-established lifestyle parameters. Being aged 60-74 years was a risk factor which should be taken into account in CRC screening strategy in countries like Thailand with limited access to endoscopy.
... Assessment of intensity and duration in a cohort study is subject to considerable measurement imprecision, and classification by smoking status can create fuzzy boundaries. This may result in incomplete control for the impact of smoking and bias estimates of an intervening variable like periodontal disease (30,31). Examination of never-smokers in our large group of participants allowed us to attempt to tease that out; we found the risk of total cancer among never-smokers with a history of periodontal disease remained statistically significant and largely unchanged (HR, 1.12; 95% CI, 1.04-1.22). ...
Article
Background: Periodontal pathogens have been isolated from precancerous and cancerous lesions and also shown to promote a procarcinogenic microenvironment. Few studies have examined periodontal disease as a risk factor for total cancer, and none have focused on older women. We examined whether periodontal disease is associated with incident cancer among postmenopausal women in the Women's Health Initiative Observational Study. Methods: Our prospective cohort study comprised 65,869 women, ages 54 to 86 years. Periodontal disease information was obtained via self-report questionnaires administered between 1999 and 2003, whereas ascertainment of cancer outcomes occurred through September 2013, with a maximum follow-up period of 15 years. Physician-adjudicated incident total cancers were the main outcomes and site-specific cancers were secondary outcomes. HRs and 95% confidence intervals (CI) were calculated using Cox proportional hazards regression. All analyses were conducted two-sided. Results: During a mean follow-up of 8.32 years, 7,149 cancers were identified. Periodontal disease history was associated with increased total cancer risk (multivariable-adjusted HR, 1.14; 95% CI, 1.08–1.20); findings were similar in analyses limited to 34,097 never-smokers (HR, 1.12; 95% CI, 1.04–1.22). Associations were observed for breast (HR, 1.13; 95% CI, 1.03–1.23), lung (HR, 1.31; 95% CI, 1.14–1.51), esophagus (HR, 3.28; 95% CI, 1.64–6.53), gallbladder (HR, 1.73; 95% CI, 1.01–2.95), and melanoma skin (HR, 1.23; 95% CI, 1.02–1.48) cancers. Stomach cancer was borderline (HR, 1.58; 95% CI, 0.94–2.67). Conclusions: Periodontal disease increases risk of total cancer among older women, irrespective of smoking, and certain anatomic sites appear to be vulnerable. Impact: Our findings support the need for further understanding of the effect of periodontal disease on cancer outcomes. Cancer Epidemiol Biomarkers Prev; 26(8); 1255–65. ©2017 AACR.
... In Kenfield et al. [11], for example, smokers were more likely than never smokers or those who had quit 10 or more years prior to diagnosis to be diagnosed with advanced disease, with higher Gleason grade, to be treated with hormones or watchful waiting, to have an elevated PSA at diagnosis; they were less likely to be treated by prostatectomy and less likely to have been diagnosed by PSA. Inability to precisely measure such confounders increases the likelihood of residual confounding [18,19]. The homogeneity of this study is an advantage, speaking to the diminished likelihood of confounding due to the stage or grade of disease, hospital treatment patterns, or access. ...
Article
Cigarette smoking has been consistently associated with increased risk of overall mortality, but the importance of smoking for patients with prostate cancer (CaP) who are candidates for curative radical prostatectomy (RP) has received less attention. This retrospectively designed cohort study investigated the association of smoking history at RP with subsequent CaP treatment outcomes and overall mortality. A total of 1981 patients who underwent RP at Roswell Park Cancer Institute (RPCI) between 1993 and 2014 were studied. Smoking history was considered as a risk factor for overall mortality as well as for currently accepted CaP treatment outcomes (biochemical failure, treatment failure, distant metastasis, and disease-specific mortality). The associations of smoking status with these outcomes were tested by Cox proportional hazard analyses. A total of 153 (8%) patients died during follow-up. Current smoking at diagnosis was a statistically significant predictor of overall mortality after RP (current smokers vs. former and never smokers, hazards ratio 2.07, 95% confidence interval [CI]: 1.36–3.14). This association persisted for overall mortality at 3, 5, and 10 years (odds ratios 2.07 [95% CI: 1.36–3.15], 2.05 [95% CI: 1.35–3.12], and 1.8 [95% CI: 1.18–2.74], respectively). Smoking was not associated with biochemical failure, treatment failure, distant metastasis, or CaP-specific mortality, and the association of smoking with overall mortality did not appear to be functionally related to treatment or biochemical failure, or to distant metastasis. Smoking is a non-negligible risk factor for death among CaP patients who undergo RP; patients who smoke are far more likely to die of causes other than CaP.
... Second, we focused on a univariate case where a single exposure variable is measured with error. Though not the focus of this study, in many epidemiologic studies, exposures are often measured with correlated errors (Marshall et al., 1999). The findings from this study, therefore, cannot be generalized for multiple exposures measured with correlated errors. ...
Article
Dietary questionnaires are prone to measurement error, which biases the perceived association between dietary intake and risk of disease. Short-term measurements are required to adjust for the bias in the association. For foods that are not consumed daily, the short-term measurements are often characterized by excess zeroes. Via a simulation study, the performance of a two-part calibration model that was developed for a single-replicate study design was assessed by mimicking leafy vegetable intake reports from the multicenter European Prospective Investigation into Cancer and Nutrition (EPIC) study. In part I of the fitted two-part calibration model, a logistic distribution was assumed; in part II, a gamma distribution was assumed. The model was assessed with respect to the magnitude of the correlation between the consumption probability and the consumed amount (hereafter, cross-part correlation), the number and form of covariates in the calibration model, the percentage of zero response values, and the magnitude of the measurement error in the dietary intake. From the simulation study results, transforming the dietary variable in the regression calibration to an appropriate scale was found to be the most important factor for the model performance. Reducing the number of covariates in the model could be beneficial, but was not critical in large-sample studies. The performance was remarkably robust when fitting a one-part rather than a two-part model. The model performance was minimally affected by the cross-part correlation.
... 21 22 We have previously argued a decade ago that measurement error, as opposed to backdoor confounding induced by conditioning on a mediator that also happens to be a collider, is likely to be at least as important as a source of error in practice. 21 Our argument was based on the mathematical parallel with confounder adjustment whereby non-differential measurement error of confounders rapidly attenuates adjustment, resulting in residual confounding [23][24][25][26] ; we reasoned the same logic would apply to measurement error of the mediator. However, it is only recently that methodological work has more closely looked at this issue. ...
Article
Background: Confounding of mediator–outcome associations resulting in collider biases causes systematic error when estimating direct and indirect effects. However, until recently little attention has been given to the impact of misclassification bias. Objective: To quantify the impact of non-differential and independent misclassification of a dichotomous exposure and a dichotomous mediator on three target parameters: the total effect of exposure on outcome; the direct effect (by conditioning on the mediator); and the indirect effect (identified by the percentage reduction in the excess OR on adjusting for the mediator). Methods: Simulations were conducted for varying strength of associations between exposure, mediator and outcome, varying ratios of exposed to unexposed and mediator present to mediator absent, and varying sensitivity and specificity of exposure and mediator classification. Results: ORs before (total effect) and after adjustment (direct effect) for the mediator are both biased towards the null by non-differential misclassification of the exposure, but the percentage reduction in the excess OR is little affected by misclassification of exposure. Conversely, misclassification of the mediator rapidly biases the percentage reduction of the excess OR (indirect effect) downwards. Conclusions: If the research objective is to quantify the proportion of the total association that is due to mediation (ie, indirect effect), then minimising non-differential misclassification bias of the mediator is more important than that for the exposure. Misclassification bias is an important source of error when estimating direct and indirect effects.
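The indirect-effect measure named in this abstract, the percentage reduction in the excess OR on adjusting for the mediator, can be written out as a short calculation. The sketch below is illustrative only: the function name and the odds ratios in the usage lines are hypothetical and are not taken from the study.

```python
def percent_reduction_in_excess_or(or_total, or_direct):
    """Percentage of the excess odds ratio (OR - 1) removed by adjustment.

    or_total  : OR for exposure and outcome before adjusting for the mediator
    or_direct : OR for exposure and outcome after adjusting for the mediator
    """
    if or_total == 1.0:
        # With no excess OR there is nothing to apportion.
        raise ValueError("total OR of 1 has no excess OR to reduce")
    return 100.0 * (or_total - or_direct) / (or_total - 1.0)


if __name__ == "__main__":
    # Hypothetical values: adjustment moves the OR from 2.0 to 1.5,
    # so half of the excess OR is attributed to the mediator.
    print(percent_reduction_in_excess_or(2.0, 1.5))
    # Hypothetical misclassified mediator: adjustment removes less of the
    # excess OR, so the apparent indirect effect shrinks (about 20%).
    print(percent_reduction_in_excess_or(2.0, 1.8))
```

In this reading, a mediator measured with error adjusts less completely, which is consistent with the downward bias in the indirect effect that the abstract reports.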
... Such corrections assume independent error and underestimate the bias. Correlated errors with weak confounding factors can reverse the direction and positively or negatively bias the true association, whereas correlated errors and strong risk factors do not lead to appreciable bias with independent measurement error (5). We extended these findings to the case of 2 correlated variables associated with the outcome, subject to correlated error, interactions, and LODs, and observed upward and downward bias. ...
Article
Utilizing multiple biomarkers is increasingly common in epidemiology. However, the combined impact of correlated exposure measurement error, unmeasured confounding, interaction, and limits of detection (LODs) on inference for multiple biomarkers is unknown. We conducted data-driven simulations evaluating bias from correlated measurement error with varying reliability coefficients (R), odds ratios (ORs), levels of correlation between exposures and error, LODs, and interactions. Blood cadmium and lead levels in relation to anovulation served as the motivating example, based on findings from the BioCycle Study (2005-2007). For most scenarios, main-effect estimates for cadmium and lead with increasing levels of positively correlated measurement error created increasing downward or upward bias for OR > 1.00 and OR < 1.00, respectively, that was also a function of effect size. Some scenarios showed bias for cadmium away from the null. Results subject to LODs were similar. Bias for main and interaction effects ranged from -130% to 36% and from -144% to 84%, respectively. A closed-form continuous outcome case solution provides a useful tool for estimating the bias in logistic regression. Investigators should consider how measurement error and LODs may bias findings when examining biomarkers measured in the same medium, prepared with the same process, or analyzed using the same method.
... The lack of formal definition of tobacco use (such as current, former or never smoking status) and lack of structured tobacco assessment documentation potentially underestimate the true impact of smoking on cancer survival. 11,12 There are very limited data using structured smoking assessments that evaluate the effect of smoking on long-term survival, but structured assessments are necessary to more accurately identify the true effect of smoking on survival. Documenting an adverse relationship between smoking and survival would justify inclusion of structured smoking assessments in clinical trials design and clinical practice, would strengthen cessation efforts in cancer patients, and would support further analyses of smoking cessation as a cost effective mechanism of improving cancer treatment outcomes. ...
Article
The effect of smoking on survival in cancer patients is limited by the lack of structured prospective assessments of smoking at diagnosis. To assess the effect of smoking at diagnosis on survival, structured smoking assessments were obtained in a cohort of 5,185 cancer patients within 30 days of a cancer diagnosis between 1982 and 1998. Hazard ratios (HRs) or odds ratios were generated to analyze the effects of smoking at diagnosis on overall mortality (OM) and disease-specific mortality (DSM) in a patient cohort from 13 disease sites containing at least 100 patients in each disease site. With a minimum of 12 years of follow-up, current smoking increased OM risk versus recent quit (HR 1.17), former (HR 1.29) and never smokers (HR 1.38) in the overall cohort. Current smoking increased DSM risk versus former (HR 1.23) and never smokers (HR 1.18). In disease sites with proportionately large (>20%) recent quit cohorts (lung and head/neck), current smoking increased OM and DSM risks as compared with recent quit. Current smoking increased mortality risks in lung, head/neck, prostate and leukemia in men and breast, ovary, uterus and melanoma in women. Current smoking was not associated with any survival benefit in any disease site. Data using prospective structured smoking assessments demonstrate that current smoking increased long-term OM and DSM. Standardized smoking assessment at diagnosis is an important variable for evaluating outcomes in cancer patients.
... However, there may be a positive association between the errors in self-reported confounders and physical activity, and this could increase residual confounding. 29 Infrequent measurements of the confounders could also lead to residual confounding (see eAppendix 3, http://links.lww.com/EDE/A564). Another limitation of our study is that we could not fit separate censoring models for different censoring mechanisms (eg, knee surgery vs. death) because the reasons for loss to follow-up were not publicly available. ...
Article
A previous analysis of the Osteoarthritis Initiative study reported a dose-response relationship between physical activity and improved physical function in adults with knee osteoarthritis, using conventional statistical methods. These methods are subject to bias when confounders are affected by prior exposure. We used baseline and 1-, 2-, and 3-year follow-up data from the Osteoarthritis Initiative study of 2545 US adults with knee osteoarthritis recruited between 2004 and 2006 from 4 clinical sites. Physical activity was measured using the Physical Activity Scale for the Elderly, and outcomes were functional performance measured by the timed 20-meter walk test and self-reported knee pain measured by the Western Ontario and McMaster Universities Osteoarthritis Index. We estimated the effect of physical activity on each outcome using inverse probability-weighted (IPW) estimators of marginal structural models. For each outcome, we fitted 2 separate IPW models adjusting for concurrent or lagged confounders. The mean differences in walking speed for the second, third, and fourth quartiles of physical activity relative to the first were 0.48 (95% confidence interval = -0.12 to 1.08), 0.45 (-0.23 to 1.13), and 0.46 (-0.29 to 1.22) meters/min based on the IPW model adjusting for concurrent confounders. When adjusting for lagged confounders, the results were 1.35 (0.64 to 2.07), 1.33 (0.54 to 2.14), and 1.26 (0.40 to 2.12). Both IPW models indicated that physical activity did not affect knee pain. Physical activity has no effect on knee pain and may have either a very small effect or no effect on functional performance in adults with knee osteoarthritis.
... Our study also had some limitations. Because the association between smoking and lung cancer is very strong and because dietary habits differ between smokers and non-smokers (1), it is difficult to ensure that all of the potential confounding by smoking habits has been removed in analyses of dietary factors in relation to lung cancer risk (63)(64)(65). We found in our study that controlling for smoking status, the number of years smoked, and the number of cigarettes smoked per day provided the strongest control of confounding compared with other parameterizations of smoking history. ...
Article
Intervention trials with supplemental beta-carotene have observed either no effect or a harmful effect on lung cancer risk. Because food composition databases for specific carotenoids have only become available recently, epidemiological evidence relating usual dietary levels of these carotenoids with lung cancer risk is limited. We analyzed the association between lung cancer risk and intakes of specific carotenoids using the primary data from seven cohort studies in North America and Europe. Carotenoid intakes were estimated from dietary questionnaires administered at baseline in each study. We calculated study-specific multivariate relative risks (RRs) and combined these using a random-effects model. The multivariate models included smoking history and other potential risk factors. During follow-up of up to 7-16 years across studies, 3,155 incident lung cancer cases were diagnosed among 399,765 participants. beta-Carotene intake was not associated with lung cancer risk (pooled multivariate RR = 0.98; 95% confidence interval, 0.87-1.11; highest versus lowest quintile). The RRs for alpha-carotene, lutein/zeaxanthin, and lycopene were also close to unity. beta-Cryptoxanthin intake was inversely associated with lung cancer risk (RR = 0.76; 95% confidence interval, 0.67-0.86; highest versus lowest quintile). These results did not change after adjustment for intakes of vitamin C (with or without supplements), folate (with or without supplements), and other carotenoids and multivitamin use. The associations generally were similar among never, past, or current smokers and by histological type. Although smoking is the strongest risk factor for lung cancer, greater intake of foods high in beta-cryptoxanthin, such as citrus fruit, may modestly lower the risk.
... 26,27 Less often is confounder misclassification explored, despite the fact that the resulting bias can be either towards or away from the null even if the errors are non-differential. [28][29][30][31] Much less often is the bias from misclassification quantified, and usually not even plausible ranges are given for the bias. 32 When internal-validation or repeat-measurement data are available, one may use special statistical methods to formally incorporate that data into the analysis, such as inverse-variance-weighted estimation, 33 maximum likelihood, 34-36 regression calibration, 35 multiple imputation, 37 and other error-correction and missing-data methods. ...
Article
Misclassification bias is present in most studies, yet uncertainty about its magnitude or direction is rarely quantified. The authors present a method for probabilistic sensitivity analysis to quantify likely effects of misclassification of a dichotomous outcome, exposure or covariate. This method involves reconstructing the data that would have been observed had the misclassified variable been correctly classified, given the sensitivity and specificity of classification. The accompanying SAS macro implements the method and allows users to specify ranges of sensitivity and specificity of misclassification parameters to yield simulation intervals that incorporate both systematic and random error. The authors illustrate the method and the accompanying SAS macro code by applying it to a study of the relation between occupational resin exposure and lung-cancer deaths. The authors compare the results using this method with the conventional result, which accounts for random error only, and with the original sensitivity analysis results. By accounting for plausible degrees of misclassification, investigators can present study results in a way that incorporates uncertainty about the bias due to misclassification, and so avoid misleadingly precise-looking results.
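The reconstruction step described in this abstract can be sketched for exposure misclassification in a case-control table. This is a minimal illustration and not the authors' SAS macro: it assumes non-differential misclassification, draws sensitivity and specificity from uniform ranges, and covers systematic error only (the random-error step the abstract mentions is omitted). All counts, ranges, and function names are hypothetical.

```python
import random
import statistics


def corrected_exposed(observed_exposed, total, se, sp):
    """Expected true number exposed, given observed counts and Se/Sp."""
    return (observed_exposed - (1.0 - sp) * total) / (se + sp - 1.0)


def corrected_odds_ratio(a, b, c, d, se, sp):
    """a, b = exposed/unexposed cases; c, d = exposed/unexposed controls.

    Returns None when the back-calculation gives a non-positive cell,
    which signals that the Se/Sp pair is incompatible with the data.
    """
    n_cases, n_controls = a + b, c + d
    a_true = corrected_exposed(a, n_cases, se, sp)
    c_true = corrected_exposed(c, n_controls, se, sp)
    b_true, d_true = n_cases - a_true, n_controls - c_true
    if min(a_true, b_true, c_true, d_true) <= 0:
        return None
    return (a_true * d_true) / (b_true * c_true)


def simulate(a, b, c, d, se_range, sp_range, iterations=10_000, seed=1):
    """Median and 2.5th/97.5th percentiles of the corrected OR."""
    rng = random.Random(seed)
    results = []
    for _ in range(iterations):
        se = rng.uniform(*se_range)  # uniform prior is an assumption
        sp = rng.uniform(*sp_range)
        or_corrected = corrected_odds_ratio(a, b, c, d, se, sp)
        if or_corrected is not None:
            results.append(or_corrected)
    results.sort()
    lower = results[int(0.025 * len(results))]
    upper = results[int(0.975 * len(results)) - 1]
    return statistics.median(results), lower, upper


if __name__ == "__main__":
    # Hypothetical 2x2 table, not data from the cited study.
    print(simulate(40, 100, 250, 950, se_range=(0.8, 1.0), sp_range=(0.9, 1.0)))
```

The interval returned here reflects only uncertainty about the classification parameters; a fuller analysis would also resample the conventional random error, as the abstract describes.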
... 7,8 In addition, the assumption that errors are well behaved is a strong assumption. Slightly ill behaviour on the part of measurement errors can, as Fox et al. point out, and as others have noted, 9 have massively distorting effects. The only way to avoid this problem is to measure exposures and outcomes without error. ...
Article
Figure 1. Actual output of sensitivity and specificity distributions using uniform (a), triangular (b), and trapezoidal (c and d) distributions based on 30 000 iterations. '*' Sensitivity using a uniform distribution (min = 0.8, max = 1.0); '†' Sensitivity using a triangular distribution (min = 0.8, mode = 0.9, max = 1.0); '‡' Sensitivity using a trapezoidal distribution (min = 0.75, mode1 = 0.85, mode2 = 0.95, max = 1.0); and '§' Specificity using a trapezoidal distribution (min = 0.7, mode1 = 0.8, mode2 = 0.9, max = 0.95), truncated at 0.788 to avoid negative corrected counts in the example
... In short, the net effect of measurement error in our multilevel study (and multivariable models generally) is unclear. [54][55][56] Fourthly, the inclusion of individual level covariates in multilevel analyses may result in overcontrol, which argues for the possibility of a true contextual effect on food purchasing behaviour in Brisbane. Household income, for example, may in part depend on where you live or on cumulative small area effects over the lifecourse. ...
Article
To examine the association between area and individual level socioeconomic status (SES) and food purchasing behaviour. The sample comprised 1000 households and 50 small areas. Data were collected by face to face interview (66.4% response rate). SES was measured using a composite area index of disadvantage (mean 1026.8, SD = 95.2) and household income. Purchasing behaviour was scored as continuous indices ranging from 0 to 100 for three food types: fruits (mean 50.5, SD = 17.8), vegetables (61.8, 15.2), and grocery items (51.4, 17.6), with higher scores indicating purchasing patterns more consistent with dietary guideline recommendations. Brisbane, Australia, 2000. Persons responsible for their household's food purchasing. Controlling for age, gender, and household income, a two standard deviation increase on the area SES measure was associated with a 2.01 unit increase on the fruit purchasing index (95% CI -0.49 to 4.50). The corresponding associations for vegetables and grocery foods were 0.60 (-1.36 to 2.56) and 0.94 (-1.35 to 3.23). Before controlling for household income, significant area level differences were found for each food, suggesting that clustering of household income within areas (a composition effect) accounted for the purchasing variability between them. Living in a socioeconomically advantaged area was associated with a tendency to purchase healthier food; however, the association was small in magnitude and the 95% CI for area SES included the null. Although urban areas in Brisbane are differentiated on the basis of their socioeconomic characteristics, it seems unlikely that where you live shapes your procurement of food over and above your personal characteristics.
Article
Measurement error is common in environmental epidemiologic studies, but methods for correcting measurement error in regression models with multiple environmental exposures as covariates have not been well investigated. We consider a multiple imputation approach, combining external or internal calibration samples that contain information on both true and error-prone exposures with the main study data of multiple exposures measured with error. We propose a constrained chained equations multiple imputation (CEMI) algorithm that places constraints on the imputation model parameters in the chained equations imputation based on the assumptions of strong nondifferential measurement error. We also extend the constrained CEMI method to accommodate nondetects in the error-prone exposures in the main study data. We estimate the variance of the regression coefficients using the bootstrap with two imputations of each bootstrapped sample. The constrained CEMI method is shown by simulations to outperform existing methods, namely the method that ignores measurement error, classical calibration, and regression prediction, yielding estimated regression coefficients with smaller bias and confidence intervals with coverage close to the nominal level. We apply the proposed method to the Neighborhood Asthma and Allergy Study to investigate the associations between the concentrations of multiple indoor allergens and the fractional exhaled nitric oxide level among asthmatic children in New York City. The constrained CEMI method can be implemented by imposing constraints on the imputation matrix using the mice and bootImpute packages in R.
Article
In nutritional epidemiology, measurement error in covariates is a well‐known problem since dietary intakes are usually assessed through self‐reporting. In this article, we consider an additive error model in which error variables are highly correlated, and propose a new method called approximate profile likelihood estimation (APLE) for covariates measured with error in the Cox regression. Asymptotic normality of this estimator is established under regularity conditions, and simulation studies are conducted to examine the finite sample performance of the proposed estimator empirically. Moreover, the popular correction method called regression calibration is shown to be a special case of APLE. We then apply APLE to deal with measurement error in some nutrients of interest in the EPIC‐InterAct Study under a sensitivity analysis framework.
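The effect of correlated errors under an additive error model can be illustrated with the correlation between two error-prone measurements. The sketch below is not the APLE method; it assumes classical additive error (W1 = X + U, W2 = Z + V), true variables standardized to unit variance, and errors independent of the true values. The function name and the numeric inputs are hypothetical.

```python
import math


def observed_correlation(rho_true, rho_error, var_error_x, var_error_z):
    """Correlation between W1 = X + U and W2 = Z + V.

    Assumes Var(X) = Var(Z) = 1, Corr(X, Z) = rho_true,
    Corr(U, V) = rho_error, and errors independent of X and Z.
    """
    # Cov(W1, W2) = Cov(X, Z) + Cov(U, V)
    covariance = rho_true + rho_error * math.sqrt(var_error_x * var_error_z)
    # Var(W1) = 1 + Var(U), Var(W2) = 1 + Var(V)
    return covariance / math.sqrt((1.0 + var_error_x) * (1.0 + var_error_z))


if __name__ == "__main__":
    true_rho = 0.3
    # Independent errors attenuate the correlation (about 0.15).
    print(observed_correlation(true_rho, 0.0, 1.0, 1.0))
    # Positively correlated errors can inflate it past the truth (about 0.6).
    print(observed_correlation(true_rho, 0.9, 1.0, 1.0))
    # Negatively correlated errors can reverse its sign (about -0.3).
    print(observed_correlation(true_rho, -0.9, 1.0, 1.0))
```

Under these assumptions the observed correlation can be smaller than, larger than, or opposite in sign to the true correlation, depending on the error correlation and error variances.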
Article
Background: In nutritional epidemiology, covariates in some studies such as the EPIC are prone to measurement error. Estimation of unknown parameters in most measurement error models for food frequency questionnaire (FFQ) and nutrient biomarkers requires replicated measurements. But, the EPIC-InterAct Study did not collect replicated measurements for FFQ or 24-hour dietary recalls (24HR). The method of correcting measurement error in this case is worth studying. Methods: A moment method is applied to estimate unknown parameters of the proposed error model with correlated errors between biased measurements of FFQ and 24HR. Then, correction factor and reliability ratio of each error-prone nutrient can be obtained correspondingly. Afterwards, regression calibration (RC) under a Cox model is used to correct measurement error of nutrients of interest in the EPIC-InterAct data. Results: Compared to the naive estimation, estimation results for dietary intakes could be very different when we take measurement error into consideration. Using RC as the correction method, hazard ratios (HR) of vegetable plus fruit, fat and energy for males become 1.01 (95% CI 0.75-1.35), 1.30 (95% CI 1.12-1.51) and 1.16 (95% CI 1.04-1.28), respectively, and HR of energy for females becomes 0.99 (95% CI 0.91-1.08). These HRs are greatly different from those by naive estimation. Conclusions: Although there is no repeated measurement for FFQ and 24HR, we can still estimate all unknown parameters in our proposed error model under four assumptions and then correct measurement error in nutrients of interest in EPIC-InterAct Study by RC for avoiding some misleading results from naive estimation.
Article
Misclassification is a pervasive problem in assessing relations between exposures and outcomes. While some attention has been paid to the impact of dependence in measurement error between exposures and outcomes, there is little awareness of the potential impact of dependent error between exposures and covariates, despite the fact that this latter dependency may occur much more frequently, for example, when both are assessed by questionnaire. We explored the impact of non-differential dependent exposure-confounder misclassification bias by simulating a dichotomous exposure (E), disease (D) and covariate (C) with varying degrees of non-differential dependent misclassification between C and E. We demonstrate that under plausible scenarios, adjusted association can be a poorer estimate of the true association than the crude. Correlated errors in the measurement of covariate and exposure distort the covariate-exposure, covariate-outcome and exposure-outcome associations creating observed associations that can be greater than, less than, or in the opposite direction of the true associations. Under these circumstances adjusted associations may not be bounded by the crude association and true effect, as would be expected with non-differential independent confounder misclassification. The degree and direction of distortion depends on the amount of dependent error, prevalence of covariate and exposure, and magnitude of true effect.
Article
Purpose: Measurement error discussions often assume classification errors of key variables are independent. Yet, small amounts of dependent error can create large biases in effect estimates. The purpose of this review was to evaluate frequency of measurement error discussions and potential for dependent error in the observational literature. Methods: Two samples of articles analyzing exposure-outcome contrasts were collected: a random sample (n = 100) from high-impact epidemiology and medical journals (June 2015-July 2016), and a citation-based sample (n = 39) of studies citing one of two prominent dependent misclassification articles (through July 2016). We extracted study details, recorded measurement error mentions, and qualitatively assessed dependent error potential. Results: Measurement error was often discussed. No random sample articles explicitly mentioned dependent error, compared with 59% of the citation-based sample. The random sample was found to be at low risk of exposure-outcome (15% plausible/probable) but increased risk for exposure-confounder (38% plausible/probable) dependency. The citation-based sample was at higher risk for dependent error (exposure-outcome: 46% plausible/probable; exposure-confounder: 61% plausible/probable). Conclusions: Although measurement error was frequently mentioned, potential impact on observed results was rarely discussed in-depth or quantified. Dependent error mentions were rare, even among studies deemed susceptible. Further education and steps to avoid dependent error are needed.
Chapter
The convention in epidemiology and biostatistics is to divide the study of mismeasured variables into the areas of measurement error for continuous variables and misclassification for categorical variables. Although the topics overlap considerably, chapter Measurement Error of this handbook focuses on measurement error, whereas the present chapter is devoted to misclassification. As a motivating example of a misclassified variable in an epidemiological study, say that a binary exposure is ascertained via subject self-report on a questionnaire. Given human memory limitations, we would usually expect a portion of responses to be erroneous. For instance, in the study of Kraus et al. (1989) on possible association between maternal antibiotic use during pregnancy and sudden infant death syndrome (SIDS), antibiotic use is self-reported by subjects via questionnaire. Examination of medical records of some subjects, however, indicates that the questionnaire responses are erroneous for some subjects. Thus, antibiotic use as determined via questionnaire is subject to misclassification. Moreover, this misclassification has implications when the association between antibiotic use and SIDS is inferred.
Chapter
This chapter begins by summarizing current understanding of the mechanisms by which alcohol might affect cancer risk and then evaluates the molecular genetic factors that appear relevant to alcohol metabolism and hence the impact of alcohol on cancer risk. It briefly reviews the means by which alcohol's effects can be studied and their limitations. The role of alcohol in cancer at major cancer sites is then used to gauge the likely importance of alcohol to cancer risk and prevention. These sites are either ones for which there is a substantial literature linking alcohol to risk, or they are associated with significant morbidity and mortality.
Chapter
All epidemiologic studies are (or should be) based on a particular source population followed over a particular risk period. The goal is usually to estimate the effect of one or more exposures on one or more health outcomes. When we are estimating the effect of a specific exposure on a specific health outcome, confounding can be thought of as a mixing of the effects of the exposure being studied with the effect(s) of other factor(s) on the risk of the health outcome of interest. Interaction can be thought of as a modification, by other factors, of the effects of the exposure being studied on the health outcome of interest, and can be subclassified into two major concepts: biological dependence of effects, also known as synergism; and effect-measure modification, also known as heterogeneity of a measure. Both confounding and interaction can be assessed by stratification on these other factors (i.e. the potential confounders or effect modifiers). The present chapter covers the basic concepts of confounding and interaction and provides a brief overview of analytic approaches to these phenomena. Because these concepts and methods involve far more topics than we can cover in detail, we provide many references to further discussion beyond that in the present handbook, especially to relevant chapters in Modern Epidemiology by Rothman and Greenland (1998).
Chapter
Epidemiology is the science that focuses on the occurrence of disease in its broadest sense, with the fundamental aim to understand and to control its causes. This chapter deals with the conceptual building blocks of epidemiology. First we offer a model for causation, from which a variety of insights relevant to epidemiologic understanding emerge. We then discuss the basis by which we attempt to infer that an identified factor is indeed a cause of disease; the guidelines lead us through a rapid review of modern scientific philosophy. The remainder of the chapter deals with epidemiologic fundamentals of measurement, including the measurement of disease and the measurement of causal effects.
Article
Antioxidants such as selenium, vitamin E and C and carotenoids have been hypothesized as chemopreventive agents for several cancers. In the current review, we evaluate the results of epidemiological and interventional studies and summarize current knowledge of the prevention potential of the antioxidants, specific to gastrointestinal cancers. While early studies based on animal models and cell lines showed promise for antioxidants as chemopreventive agents for several gastrointestinal cancers, results from epidemiological studies and randomized trials do not support this promise. One large randomized trial, conducted in a region with widespread nutritional deficiency, showed that antioxidant use may confer protection against gastrointestinal cancers. However, this result has not been replicated in other epidemiological studies or the 10 other randomized trials conducted in developed Western countries. Overall, currently there is no evidence that antioxidants are protective against gastrointestinal cancers in populations whose members are replete in antioxidant intake.
Article
A common adage among those who play games that involve balls (baseball, for example) is that one must keep one's eye on the ball. The baseball player attempting to catch a grounder or fly ball must watch the ball (where it has been, where it is) into his or her glove; the player who diverts his or her
Article
Medication use patterns provide popular surrogate measures of disease, yet selective under-use of drugs by elderly patients with potentially unmeasured comorbidity may lead to artifactual "protective" associations between use of specific drugs and mortality. We examined the relation between use of 20 common classes of drugs and mortality among the 129,111 residents of New Jersey 65-99 years of age who had at least one hospitalization during the years 1991-1994 and filled prescriptions through either Medicaid or that state's Pharmacy Assistance for the Aged and Disabled program. Each study drug class was used by more than 5,000 subjects during the 120 days before hospitalization; 41,930 subjects died in the hospital or during the year after discharge. Users of drugs from each of seven therapeutic classes had reduced age- and sex-adjusted rates of death relative to non-users: lipid-lowering agents, nonsteroidal anti-inflammatory agents, beta blockers, thiazides, glaucoma drugs, calcium channel blockers, and anti-anxiety drugs. Adjustment for comorbidity and polypharmacy had little effect on these results. We found similar results in a separate nonhospitalized cohort of 132,071 elderly persons. Much of this observed association appears to be nonetiologic. These findings raise concerns about using observational studies in high-risk populations to infer associations between drug use and outcomes.
Article
Convincing epidemiologic evidence currently exists for an association between physical activity and the prevention of colon and breast cancer. Physical activity may also reduce the risk of cancer at several other sites. With increasing research on this topic, it is apparent that studies of physical activity and cancer have numerous methodological similarities with studies of nutrition and cancer. Lessons learned from nutritional epidemiology that can be applied to studies of physical activity and cancer prevention and recommendations for future research are discussed in this review.
Article
Recent research suggests that in utero exposure to maternal smoking is a risk factor for conduct disorder and delinquency. We review evidence of causality, a controversial but important public health question. We analyzed studies of maternal prenatal smoking and offspring antisocial behavior within a causal framework. The association is (1) independent of confounders, (2) present across diverse contexts, and (3) consistent with basic science. Methodological limitations of existing studies preclude causal conclusions. Existing evidence provides consistent support for, but not proof of, an etiologic role for prenatal smoking in the onset of antisocial behavior. The possibility of identifying a preventable prenatal risk factor for a serious mental disorder makes further research on this topic important for public health.
Article
In nutritional epidemiology, accurate quantification of nutritional exposure is critical. Even moderate flaws in measurement can lead to sizeable distortions in estimations of the effects of exposure. In many situations, this will lead to inaccurate direct estimation of exposure effects. In others, it will make it difficult to control for the confounding effects of nutritional exposure. Biomarkers offer important opportunities to advance research in nutritional epidemiology; their objectivity and potentially greater accuracy give them the potential to substantially lessen distortions that might result from imperfect measurements. Clearly, the accuracy of biomarkers as indicators of nutritional exposure is critical to their value. It is likely that establishing the accuracy of biomarkers will require some reference to self-reports, even if those reports are not as accurate as the biomarkers they are used to test. The goal of this paper is to describe aspects of accuracy (reproducibility, reliability and validity) as they apply to biomarkers in nutritional epidemiology.
Article
Intravaginal practices, including wiping, douching, or inserting substances into the vagina, have been hypothesized to increase women's risk of HIV infection. However, data on the prevalence of these practices, and associations with HIV and other sexually transmitted diseases (STD), are limited. We interviewed 2,897 women participating in a gynecologic screening study in Cape Town, South Africa, about their intravaginal practices. After clinical examination, cervical and blood samples were collected and tested for HIV and other STD. Of the 831 (29%) women reporting some type of intravaginal practice, 48% reported using only water and cloth to clean inside the vagina, whereas 17% reported using antiseptics or detergents. Most women (53%) reported practices as part of regular hygiene. Intravaginal practices were strongly associated with behavioral risk factors, and recent multiple sexual partners. Intravaginal practices were associated with prevalent HIV infection (adjusted odds ratio, 1.74; 95% confidence interval, 1.37-2.20), but were not associated with other STDs. Prospective studies that include detailed measurements of correlated sexual risk behaviors are required to discern whether this association is causal in nature; if so, these behaviors could represent an important area for future HIV prevention interventions.
Article
Prospective cohort studies have consistently found no important link between fiber intake and risk of colorectal cancer. The recent large, prospective European Prospective Investigation into Cancer and Nutrition has challenged this paradigm by suggesting significant protection by high fiber intake. We prospectively investigated the association of fiber intake with the incidence of colon and rectal cancers in two large cohorts: the Nurses' Health Study (76,947 women) and the Health Professionals Follow-up Study (47,279 men). Diet was assessed repeatedly in 1984, 1986, 1990, and 1994 among women and in 1986, 1990, and 1994 among men. The incidence of cancer of the colon and rectum was ascertained up to the year 2000. Relative risk estimates were calculated using a Cox proportional hazards model simultaneously controlling for potential confounding variables. During follow-up including 1.8 million person-years and 1,596 cases of colorectal cancer, we found little association with fiber intake after controlling for confounding variables. The hazard ratio for a 5-g/d increase in fiber intake was 0.91 (95% confidence interval, 0.87-0.95) after adjusting for covariates used in the European Prospective Investigation into Cancer and Nutrition study and 0.99 (95% confidence interval, 0.95-1.04) after adjusting for additional confounding variables. Our data from two large prospective cohorts with long follow-up and repeated assessment of fiber intake and of a large number of potential confounding variables do not indicate an important association between fiber intake and colorectal cancer but reveal considerable confounding by other dietary and lifestyle factors.
Article
The goal of this study was to assess whether interruption of care for chronic periodontitis during pregnancy increased the risk of low-birthweight infants. A population-based case-control study was designed with 793 cases (infants < 2,500 g) and a random sample of 3,172 controls (infants ≥ 2,500 g). Generalized estimating equation models were used to relate periodontal treatment history to low birthweight risk and to common risk factors. The results indicate that periodontal care utilization was associated with a 2.35-fold increased odds of self-reported smoking during pregnancy (95% confidence interval: 1.48-3.71), a 2.19-fold increased odds for diabetes (95% confidence interval: 1.21-3.98), a 3.90-fold increased odds for black race (95% confidence interval: 2.31-6.61), and higher maternal age. After adjustment for these factors, interruption of periodontal care during pregnancy did not lead to an increased risk for a low-birthweight infant when compared to women with no history of periodontal care (odds ratio, 0.96; 95% confidence interval, 0.60-1.52). In conclusion, women receiving periodontal care had genetic and environmental characteristics, such as smoking, diabetes and race, that were associated with an increased risk for low-birthweight infants. Periodontal care patterns, in and of themselves, were unrelated to low-birthweight risk.
Article
Measurement error in explanatory variables and unmeasured confounders can cause considerable problems in epidemiologic studies. It is well recognized that under certain conditions, nondifferential measurement error in the exposure variable produces bias towards the null. Measurement error in confounders will lead to residual confounding, but this is not a straightforward issue, and it is not clear in which direction the bias will point. Unmeasured confounders further complicate matters. There has been discussion about the amount of bias in exposure effect estimates that can plausibly occur due to residual or unmeasured confounding. In this paper, the authors use simulation studies and logistic regression analyses to investigate the size of the apparent exposure-outcome association that can occur when in truth the exposure has no causal effect on the outcome. The authors consider two cases with a normally distributed exposure and either two or four normally distributed confounders. When the confounders are uncorrelated, bias in the exposure effect estimate increases as the amount of residual and unmeasured confounding increases. Patterns are more complex for correlated confounders. With plausible assumptions, effect sizes of the magnitude frequently reported in observational epidemiologic studies can be generated by residual and/or unmeasured confounding alone.
Article
We conducted a combined analysis of the original data to evaluate the consistency of 12 case-control studies of diet and breast cancer. Our analysis shows a consistent, statistically significant, positive association between breast cancer risk and saturated fat intake in postmenopausal women (relative risk for highest vs. lowest quintile, 1.46; P < .0001). A consistent protective effect for a number of markers of fruit and vegetable intake was demonstrated; vitamin C intake had the most consistent and statistically significant inverse association with breast cancer risk (relative risk for highest vs. lowest quintile, 0.69; P < .0001). If these dietary associations represent causality, the attributable risk (i.e., the percentage of breast cancers that might be prevented by dietary modification) in the North American population is estimated to be 24% for postmenopausal women and 16% for premenopausal women. [J Natl Cancer Inst 82: 561–569, 1990]
Article
Recently, some authors have questioned the validity of methods which correct relative risk estimates for measurement error and misclassification when the "gold standard" used to obtain information about the measurement error process is itself imperfect. When such an "alloyed" gold standard is used to validate the usual exposure measurement, the bias in the "regression calibration" (Rosner et al., Stat Med 1989; 8:1051-69) measurement-error correction factor for relative risks estimated from logistic regression models is derived. This quantity is a function of the correlations of the "alloyed" gold standard (X) and the usual exposure assessment method (Z) with the truth, of the ratio of the variances of X and Z, and of the correlation between the errors in the "alloyed" gold standard and the errors in the usual exposure assessment method. In this paper, it is proven that if the errors between Z and X are uncorrelated, the regression calibration method has no bias even when the gold standard is "alloyed." When a third method of exposure assessment is available and it is reasonable to assume that the errors in this method are uncorrelated with the errors in the other two exposure assessment methods, point and interval estimates of the correlation between the errors in X and Z are derived. These methods are illustrated here with data on the measurement of physical activity, vitamins A and E, and poly- and monounsaturated fat. In addition, when a third exposure assessment method is available, a modification of standard regression calibration is derived which can be used to calculate point and interval estimates of relative risk that are corrected for measurement error in both X and Z. This new method is illustrated here with data from the Health Professionals Follow-up Study, a study investigating the associations between physical activity and colon cancer incidence and between vitamin E intake and coronary heart disease. 
It is shown that in these examples, correlations of the errors in X and Z tended to be small. Even when moderate, estimates of relative risk corrected for error in both X and Z were not very different from the estimates which assumed that X was a true gold standard.
Article
A presentation and critique of the use of multiple measures of theoretical concepts for the assessment of validity (using the multi-trait multi-method matrix) and reliability (using multiple indicators with a path analytic framework).
Article
Chapter
This chapter provides an overview of nutritional epidemiology for those unfamiliar with the field. The field of nutritional epidemiology developed from an interest in the concept that aspects of diet may influence the occurrence of human disease. Although it is relatively new as a formal area of research, investigators have used basic epidemiologic methods for more than 200 years to identify numerous essential nutrients. The most serious challenge to research in nutritional epidemiology has been the development of practical methods to measure diet. Because epidemiologic studies usually involve at least several hundred and sometimes hundreds of thousands of subjects, dietary assessment methods must be not only reasonably accurate but also relatively inexpensive. Epidemiologic approaches to diet and disease and the interpretation of epidemiologic data are discussed.
Article
Although computer programs may estimate values for unidentified parameters, a parameter must be identified in order for there to exist a unique point estimate of its value. We provide a series of rules that can be applied easily to measurement models of complexity one to demonstrate the identifiability of their parameters. These rules can be applied to models that contain one or more latent variables and that contain observed variables with correlated measurement errors. If the model is not identified, the rules pinpoint the parameters that are not identified and, thus, help researchers formulate a testable model.
Book
This book is intended to increase understanding of the complex relationships between diet and the major diseases of western civilization, such as cancer and atherosclerosis. The book starts with an overview of research strategies in nutritional epidemiology-a relatively new discipline which combines the knowledge compiled by nutritionists during this century with the methodology developed by epidemiologists to study the determinants of disease with multiple etiologies and long latent periods. A major part of the book is devoted to methods of dietary assessment using data on food intake, biochemical indicators of diet, and measures of body size and composition. The reproducibility and validity of each approach and the implications of measurement error are considered in detail. The analysis, presentation, and interpretation of data from epidemiologic studies of diet and disease are discussed. Particular attention is paid to the important influence of total energy intake on findings in such studies. As examples of methodologic issues in nutritional epidemiology, three substantive topics are examined in depth: the relations of diet and coronary heart disease, fat intake and breast cancer, and Vitamin A and lung cancer.
Article
2nd ed. Bibliography: pp. 139-141.
Article
In studies examining associations between dietary factors and biomedical risk factors, the relations, if they exist, are frequently attenuated by measurement error. Measurement error may be due to a large intraindividual variation and an inadequate number of measurements or to an inaccurate measuring instrument. This paper evaluates the impact of measurement error on partial correlation and multiple linear regression analyses. Quantitative methods are derived to estimate the potential attenuation of associations. The results indicate that when the controlled variables do not have measurement error, but the correlated variables do, the attenuation of the partial correlation coefficient (or multiple regression coefficient) is greater than that of the simple correlation (or regression) coefficient. When both the correlated variables and the controlled variables have measurement error, the partial correlation (or the regression) coefficients can be either increased or decreased.
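The first result in the abstract above can be illustrated with a small calculation. The sketch below is not taken from the paper; it assumes classical additive error (independent of the true value and of the outcome), a linear model, and a single error-free covariate z. The function names and the numeric inputs are hypothetical.

```python
def simple_attenuation(var_x, var_err):
    """Factor multiplying the true slope when the error-prone x is the only regressor."""
    return var_x / (var_x + var_err)


def partial_attenuation(var_x, var_err, rho):
    """Factor multiplying the true slope of x after adjusting for an
    error-free covariate z whose correlation with the true x is rho.

    Adjusting for z removes the share rho**2 of the true variance of x,
    while the error variance is untouched, so the factor shrinks.
    """
    var_x_given_z = var_x * (1.0 - rho ** 2)
    return var_x_given_z / (var_x_given_z + var_err)


# Hypothetical inputs: unit true variance, error variance 0.5, rho = 0.6.
simple = simple_attenuation(1.0, 0.5)
partial = partial_attenuation(1.0, 0.5, rho=0.6)
print(round(simple, 3), round(partial, 3))
```

Under these assumptions the adjusted coefficient is attenuated more than the unadjusted one, which matches the direction the abstract reports for the case in which only the correlated variable is measured with error.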
Article
To address the hypotheses that dietary fat increases and fiber decreases the risk of breast cancer. Prospective cohort study with dietary assessment at baseline, using a validated, self-administered food frequency questionnaire. 89,494 women in the Nurses' Health Study who were 34 through 59 years of age in 1980 and who were followed up for 8 years (> 95% complete). 1439 incident cases of breast cancer were diagnosed, including 774 among postmenopausal women. After adjustment for age, established risk factors, and total energy intake, we observed no evidence of any positive association between total fat intake and breast cancer incidence (relative risks [RRs] for increasing quintiles of fat intake were 1.0, 0.85, 0.96, 0.91, and 0.90; 95% confidence interval for highest vs lowest quintile, 0.77 to 1.07). Among postmenopausal women alone, corresponding RRs were 1.0, 0.89, 1.00, 0.95, and 0.91. Comparing extreme deciles of total fat intake (≥ 49% vs < 29% of total energy intake), the RR was 0.86 (95% confidence interval, 0.67 to 1.08). A similar absence of any positive association was observed without adjustment for energy intake; for tumors less than 2 cm as well as 2 cm or greater in diameter; for saturated, monounsaturated, and polyunsaturated fat; and after excluding the first 4 years of follow-up. Also, we found no suggestion of any positive association when using a more detailed and precise dietary questionnaire completed in 1984 (666 subsequent cases), even when women consuming less than 25% of energy from fat were used as the comparison group. No suggestion of a protective effect of dietary fiber was observed (RRs for increasing quintiles were 1.0, 0.95, 0.93, 1.02, and 1.02). These data provide evidence against both an adverse influence of fat intake and a protective effect of fiber consumption by middle-aged women on breast cancer incidence over 8 years.
Nevertheless, the positive association between intake of animal fat and risk of colon cancer observed in many studies provides ample reason to limit this source of energy.
Article
A number of authors have presented evidence that high dietary fat increases the risk of breast cancer, and a number have presented evidence to the contrary. In this study, dietary histories were obtained in 1980 from 18,586 postmenopausal women in New York State. These women were followed through 1987 to ascertain their incidence of breast cancer and other cancers and deaths from all causes, as registered in the New York State Tumor Registry and Office of Vital Statistics. Survival analysis revealed that the incidence of breast cancer increased with age, was higher among the nulliparous, was higher for those with a late (> 26 years) age at first pregnancy, and increased with increasing socioeconomic status--all risk factors discovered before for breast cancer. No increase in risk was related to the ingested amount of calories, vitamins A, C, or E, dietary fiber, or fat. Although dietary fat has been found to be associated with higher risk of cancer at a number of other sites, e.g., the lung, colon, and rectum, and although some previous writers have suggested an association with risk of breast cancer, the findings in three cohort studies as well as in eight substantial case-control studies are negative and suggest that a relation is far from established.
Article
We propose a method for estimating odds ratios from case-control data in which covariates are subject to measurement error. The measurement error may contain both a random component and a systematic difference between cases and controls (recall bias). A multivariate normal discriminant analysis model is assumed. If the distribution of measurement error is known, then a simple correction to naive (biased) estimates of odds ratios from logistic regression of disease on fallible measurements of covariates removes bias. The same correction yields confidence intervals and significance tests. We apply the proposed methods to data from a case-control study of colon cancer and diet.
Article
Weak measurement of epidemiologic exposures is an impediment to appreciation of the effects of those exposures. This paper discusses two strategies to assess the true effects of weakly measured exposure. The first is to use external information about the extent of mismeasurement to adjust estimates of the effects of exposure. The second strategy is to use multiple measurement: to repeat the measurement in such a way that measurement errors are not repeated. The major disadvantage of the adjustment strategy is its sensitivity to incorrect specification of mismeasurement structure. The primary disadvantage of the multiple measurement strategy is its inefficiency. Unless epidemiologists are quite confident about the extent and structure of measurement error in their data, they should rely primarily on multiple measurement, and secondarily on adjustment procedures.
Article
The authors interviewed 428 pathologically confirmed cases of colon cancer and controls matched on age, sex, race, and neighborhood in the New York counties containing the cities of Buffalo, Niagara Falls, and Rochester. Risk of colon cancer in both males and females, studied separately, appeared to increase with the amount of total fats and total calories ingested. In addition, we found the risk to increase with increases in the Quetelet index of relative weight (weight (kg)/height (m)²). Dietary fiber was only equivocally associated with risk. Fats and Quetelet index were associated with increased risk in a regression analysis adjusting each factor for the other, as well as for fiber, age, and socioeconomic status. The same was true for calories and Quetelet index. Future efforts to clarify a possible protective role for fiber and to disentangle the effects of fats and calories need to be undertaken. The fact that calories ingested and obesity are each associated with increased risk suggests the importance of studying calorie expenditure.
Article
Associations between intake of specific nutrients and disease cannot be considered primary effects of diet if they are simply the result of differences between cases and noncases in body size, physical activity, and metabolic efficiency. Epidemiologic studies of diet and disease should therefore be directed at the effect of nutrient intakes independent of total caloric intake in most instances. This is not accomplished with nutrient density measures of dietary intake but can be achieved by employing nutrient intakes adjusted for caloric intake by regression analysis. While pitfalls in the manipulation and interpretation of energy intake data in epidemiologic studies have been emphasized, these considerations also highlight the usefulness of obtaining a measurement of total caloric intake. For instance, if a questionnaire obtained information on only cholesterol intake in a study of coronary heart disease, it is possible that no association with disease would be found even if a real positive effect of a high cholesterol diet existed, since the caloric intake of cases is likely to be less than that of noncases. Such a finding could be appropriately interpreted if an estimate of total caloric intake were available. The relationships between dietary factors and disease are complex. Even with carefully collected measures of intake, consideration of the biologic implications of various analytic approaches is needed to avoid misleading conclusions.
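The adjustment of nutrient intake for caloric intake by regression, mentioned in the abstract above, can be sketched as follows. This is an illustration only, not the authors' code: the function name, the simulated intakes, and the choice to re-centre the residuals on the mean nutrient intake are assumptions made for the example.

```python
import numpy as np


def energy_adjust(nutrient, energy):
    """Return nutrient intake adjusted for total energy by simple linear regression.

    The residual from regressing nutrient on energy is kept, and the mean
    nutrient intake is left in place so the adjusted values stay on the
    original scale (an assumed convention, chosen for readability).
    """
    nutrient = np.asarray(nutrient, dtype=float)
    energy = np.asarray(energy, dtype=float)
    slope = np.cov(nutrient, energy)[0, 1] / np.var(energy, ddof=1)
    return nutrient - slope * (energy - energy.mean())


# Hypothetical data: fat intake (g/day) that rises with energy intake (kcal/day).
rng = np.random.default_rng(1)
energy = rng.normal(2000.0, 400.0, size=500)
fat = 0.035 * energy + rng.normal(0.0, 10.0, size=500)

fat_adj = energy_adjust(fat, energy)
print(np.corrcoef(fat, energy)[0, 1], np.corrcoef(fat_adj, energy)[0, 1])
```

By construction the adjusted intake is uncorrelated with total energy in the sample, so an association between the adjusted nutrient and disease cannot be a simple reflection of energy intake.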
Article
The authors examine some recently proposed criteria for determining when to adjust for covariates related to misclassification, and show these criteria to be incorrect. In particular, they show that when misclassification is present, covariate control can sometimes increase net bias, even when the covariate would have been a confounder under perfect classification, and even if the covariate is a determinant of classification. Thus, bias due to misclassification cannot be adequately dealt with by the methods used for control of confounding. The examples presented also show that the "change-in-estimate" criterion for deciding whether to control a covariate can be systematically misleading when misclassification is present. These results demonstrate that it is necessary to consider the degree of misclassification when deciding whether to control a covariate.
Article
When certain key factors of interest in epidemiologic research studies cannot be measured directly, epidemiologists often turn to the use of surrogate variables. The potential bias in making statistical inferences about an adjusted exposure-disease association parameter (e.g., a partial correlation) is described as a function of the degree of unreliability in the surrogate variables used in place of the underlying disease, exposure, and confounding factors of real interest. It is shown that unreliability in the surrogate confounder is much more apt to produce seriously misleading inferences than is unreliability in the surrogate measures for disease and exposure. Practical methods are discussed for dealing with less than perfectly reliable surrogate variables.
Article
The effects of misclassification on analyses involving a discrete covariate are examined. The following points are illustrated: 1) Analogous to the 2 × 2 table case, unbiased misclassification of the study exposure leads to reduction in the observed strength of the association of exposure with disease. 2) Both biased and unbiased misclassification will tend to distort the degree of heterogeneity in the measure of association being considered. 3) Misclassification of a confounder leads to a partial loss of ability to control confounding.
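Point 1 of the abstract above, for the 2 × 2 table case, can be made concrete with expected counts. The sketch below is illustrative and not drawn from the paper; the table, the sensitivity, and the specificity are hypothetical, and the same error rates are applied to cases and controls (unbiased, or nondifferential, misclassification).

```python
def misclassify(exposed, unexposed, sensitivity, specificity):
    """Expected observed (exposed, unexposed) counts after exposure misclassification."""
    obs_exposed = sensitivity * exposed + (1.0 - specificity) * unexposed
    obs_unexposed = (1.0 - sensitivity) * exposed + specificity * unexposed
    return obs_exposed, obs_unexposed


def odds_ratio(case_exp, case_unexp, ctrl_exp, ctrl_unexp):
    """Exposure odds ratio from the four cells of a case-control table."""
    return (case_exp * ctrl_unexp) / (case_unexp * ctrl_exp)


# Hypothetical true table: 200/100 exposed/unexposed cases, 100/200 controls.
true_or = odds_ratio(200, 100, 100, 200)

# The same sensitivity and specificity are applied to cases and controls.
case_exp, case_unexp = misclassify(200, 100, sensitivity=0.8, specificity=0.9)
ctrl_exp, ctrl_unexp = misclassify(100, 200, sensitivity=0.8, specificity=0.9)
observed_or = odds_ratio(case_exp, case_unexp, ctrl_exp, ctrl_unexp)

print(true_or, round(observed_or, 2))
```

With these inputs the observed odds ratio falls between 1 and the true value, that is, the association is weakened rather than reversed.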
Article
Experiments in animals, international correlation comparisons, and case-control studies support an association between dietary fat intake and the incidence of breast cancer. Most cohort studies do not corroborate the association, but they have been criticized for involving small numbers of cases, homogeneous fat intake, and measurement errors in estimates of fat intake. We identified seven prospective studies in four countries that met specific criteria and analyzed the primary data in a standardized manner. Pooled estimates of the relation of fat intake to the risk of breast cancer were calculated, and data from study-specific validation studies were used to adjust the results for measurement error. Information about 4980 cases from studies including 337,819 women was available. When women in the highest quintile of energy-adjusted total fat intake were compared with women in the lowest quintile, the multivariate pooled relative risk of breast cancer was 1.05 (95 percent confidence interval, 0.94 to 1.16). Relative risks for saturated, monounsaturated, and polyunsaturated fat and for cholesterol, considered individually, were also close to unity. There was little overall association between the percentage of energy intake from fat and the risk of breast cancer, even among women whose energy intake from fat was less than 20 percent. Correcting for error in the measurement of nutrient intake did not materially alter these findings. We found no evidence of a positive association between total dietary fat intake and the risk of breast cancer. There was no reduction in risk even among women whose energy intake from fat was less than 20 percent of total energy intake. In the context of the Western lifestyle, lowering the total intake of fat in midlife is unlikely to reduce the risk of breast cancer substantially.
Article
Greenland first documented (Am J Epidemiol 1980;112:564–9) that error in the measurement of a confounder could resonate—that it could bias estimates of other study variables, and that the bias could persist even with statistical adjustment for the confounder as measured. An important question is raised by this finding: can such bias be more than trivial within the bounds of realistic data configurations? The authors examine several situations involving dichotomous and continuous data in which a confounder and a null variable are measured with error, and they assess the extent of resultant bias in estimates of the effect of the null variable. They show that, with continuous variables, measurement error amounting to 40% of observed variance in the confounder could cause the observed impact of the null study variable to appear to alter risk by as much as 30%. Similarly, they show, with dichotomous independent variables, that 15% measurement error in the form of misclassification could lead the null study variable to appear to alter risk by as much as 50%. Such bias would result only from strong confounding. Measurement error would obscure the evidence that strong confounding is a likely problem. These results support the need for every epidemiologic inquiry to include evaluations of measurement error in each variable considered.
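A linear-regression analogue of the resonance described above can be written in closed form. The sketch below is not the authors' calculation, which concerns risk estimates: it assumes standardized variables, an outcome that depends only on the true confounder, a null variable measured without error, and classical error in the confounder that is independent of everything else. The function name and inputs are hypothetical.

```python
def null_coefficient(beta, rho, error_share):
    """Expected regression coefficient of a null variable N when the outcome
    is regressed on N and an error-prone measurement of the confounder C.

    beta        : true effect of C on the outcome (N has no effect)
    rho         : correlation between the true C and N (both unit variance)
    error_share : fraction of the OBSERVED confounder variance that is error
    """
    var_err = error_share / (1.0 - error_share)  # error variance when var(C) = 1
    return beta * rho * var_err / (1.0 + var_err - rho ** 2)


# Hypothetical configuration: strong confounding (rho = 0.7) and error
# amounting to 40% of the observed variance of the confounder.
print(round(null_coefficient(1.0, 0.7, 0.0), 3))
print(round(null_coefficient(1.0, 0.7, 0.4), 3))
```

With no error the null variable receives a coefficient of zero; with error in the confounder it picks up part of the confounder's effect, and the amount grows with the strength of the correlation between the two.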
Article
Least squares provides consistent estimates of the regression coefficients β in the model E[Y | x] = βx when fully accurate measurements of x are available. However, in biomedical studies one must frequently substitute unreliable measurements X in place of x. This induces bias in the least squares coefficient estimates. In the univariate case, the bias manifests itself as a shrinkage toward zero, but this result does not generalize. When x is multivariate, then there are no predictable relationships between the signs or magnitudes of actual and estimated regression coefficients. In this article, we characterize the estimation bias, and review a relatively simple adjustment procedure to correct it. We also show that several natural conjectures about the bias are false. We present three definitions of reliability coefficient matrices that generalize the univariate case, and we illustrate their application to dietary intake data from a cancer prevention study.
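The multivariate behaviour described above can be explored numerically. The sketch below is a minimal illustration, not the article's procedure: it assumes additive errors that are independent of the true covariates and of the outcome residual, under which the large-sample least squares coefficients equal (Σx + Σe)⁻¹ Σx β. The function name and all matrices are hypothetical.

```python
import numpy as np


def observed_coefficients(beta, sigma_true, sigma_err):
    """Large-sample least squares coefficients when X = x + e replaces x.

    sigma_true : covariance matrix of the true covariates x
    sigma_err  : covariance matrix of the measurement errors e
    """
    beta = np.asarray(beta, dtype=float)
    sigma_true = np.asarray(sigma_true, dtype=float)
    sigma_err = np.asarray(sigma_err, dtype=float)
    return np.linalg.solve(sigma_true + sigma_err, sigma_true @ beta)


beta = [1.0, 0.0]                        # second covariate has no true effect
sigma_true = [[1.0, 0.5], [0.5, 1.0]]    # true covariates correlated 0.5

# Independent errors versus positively correlated errors of the same size.
b_indep = observed_coefficients(beta, sigma_true, [[0.5, 0.0], [0.0, 0.5]])
b_corr = observed_coefficients(beta, sigma_true, [[0.5, 0.4], [0.4, 0.5]])

print(np.round(b_indep, 3), np.round(b_corr, 3))
```

In this example the null covariate receives a positive coefficient when the errors are independent and a negative one when they are correlated, which illustrates why neither the sign nor the magnitude of the bias can be predicted without knowledge of the error structure.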
Article
In epidemiologic studies, total energy intake is often related to disease risk because of associations between physical activity or body size and the probability of disease. In theory, differences in disease incidence may also be related to metabolic efficiency and therefore to total energy intake. Because intakes of most specific nutrients, particularly macronutrients, are correlated with total energy intake, they may be noncausally associated with disease as a result of confounding by total energy intake. In addition, extraneous variation in nutrient intake resulting from variation in total energy intake that is unrelated to disease risk may weaken associations. Furthermore, individuals or populations must alter their intake of specific nutrients primarily by altering the composition of their diets rather than by changing their total energy intake, unless physical activity or body weight are changed substantially. Thus, adjustment for total energy intake is usually appropriate in epidemiologic studies to control for confounding, reduce extraneous variation, and predict the effect of dietary interventions. Failure to account for total energy intake can obscure associations between nutrient intakes and disease risk or even reverse the direction of association. Several disease-risk models and formulations of these models are available to account for energy intake in epidemiologic analyses, including adjustment of nutrient intakes for total energy intake by regression analysis and addition of total energy to a model with the nutrient density (nutrient divided by energy).
Howe GR. Re: "Total energy intake: implications for epidemiologic analyses." The first author replies. (Letter). Am J Epidemiol 1989;129:1314-15.
Freedman LS, Kipnis V, Brown CC, et al. Comments on adjustment for total energy intake in epidemiologic studies. Am J Clin Nutr 1997;65(Suppl):1229S-1231S.
Costner HL, Schoenberg R. Diagnosing indicator ills in multiple indicator models. In: Goldberger AS, Duncan OD, eds. Structural equation models in the social sciences. New York: Academic Press, 1976:167-99.