Establishing the internal and external validity of experimental studies.
ABSTRACT: The information needed to determine the internal and external validity of an experimental study is discussed. Internal validity is the degree to which a study establishes the cause-and-effect relationship between the treatment and the observed outcome. Establishing the internal validity of a study is based on a logical process. For a research report, the logical framework is provided by the report's structure. The methods section describes what procedures were followed to minimize threats to internal validity, the results section reports the relevant data, and the discussion section assesses the influence of bias. Eight threats to internal validity have been defined: history, maturation, testing, instrumentation, regression, selection, experimental mortality, and an interaction of threats. A cognitive map may be used to guide investigators when addressing validity in a research report. The map is based on the premise that information in the report evolves from one section to the next to provide a complete logical description of each internal-validity problem. The map addresses experimental mortality, randomization, blinding, placebo effects, and adherence to the study protocol. Threats to internal validity may be a source of extraneous variance when the findings are not significant. External validity is addressed by delineating inclusion and exclusion criteria, describing subjects in terms of relevant variables, and assessing generalizability. By using a cognitive map, investigators reporting an experimental study can systematically address internal and external validity so that the effects of the treatment are accurately portrayed and generalization of the findings is appropriate.
ABSTRACT: Data regarding the effect of exercise programmes on older adults' health-related quality of life (HRQOL) and habitual physical activity are inconsistent. The aim of this study was to determine whether a functional tasks exercise programme (which enhances functional capacity) and a resistance exercise programme (which increases muscle strength) have different effects on the HRQOL and physical activity of community-dwelling older women. Ninety-eight women were randomised to a functional tasks exercise programme (function group), a resistance exercise programme (resistance group), or a normal-activity group (control group). Participants attended exercise classes three times a week for 12 weeks. SF-36 Health Survey questionnaire scores and self-reported physical activity were obtained at baseline, directly after completion of the intervention (3 months), and 6 months later (9 months). At 3 months, no difference in mean change in HRQOL or physical activity scores was seen between the groups, except for an increased SF-36 physical functioning score in the resistance group compared with the control group (p = 0.019) and the function group (p = 0.046). Between 3 and 9 months, the self-reported physical functioning score of the function group decreased to below baseline (p = 0.026), and physical activity decreased in the resistance group compared with the function group (p = 0.040). Exercise has a limited effect on the HRQOL and self-reported physical activity of community-dwelling older women. Our results suggest that in these subjects HRQOL measures may be affected by ceiling effects and response shift. Studies should include performance-based measures in addition to self-report HRQOL measures to obtain a better understanding of the effect of exercise interventions in older adults.
Gerontology 02/2007; 53(1):12-20.
ABSTRACT: The Assessment of Daily Activity Performance (ADAP) test was developed, modelled after the Continuous-scale Physical Functional Performance (CS-PFP) test, to provide a quantitative assessment of older adults' physical functional performance. The aim of this study was to determine the intra-examiner reliability and construct validity of the ADAP in a community-living older population, and to identify the importance of tester experience. Forty-three community-dwelling older women (mean age 75 +/- 4.3 yr) were randomized to the test-retest reliability study (n = 19) or the validation study (n = 24). The intra-examiner reliability of an experienced tester (tester 1) and an inexperienced tester (tester 2) was assessed by comparing the test and retest scores of 19 participants. Construct validity was assessed by comparing the ADAP scores of 24 participants with self-perceived function on the SF-36 Health Survey, muscle function tests, and the Timed Up and Go (TUG) test. Tester 1 had good consistency and reliability scores (mean difference between test and retest scores (DIF), -1.05 +/- 1.99; 95% confidence interval (CI), -2.58 to 0.48; Cronbach's alpha range, 0.83 to 0.98; intraclass correlation (ICC) range, 0.75 to 0.96; Limits of Agreement (LoA), -2.58 to 4.95). Tester 2 had lower reliability scores (DIF, -2.45 +/- 4.36; 95% CI, -5.56 to 0.67; alpha range, 0.53 to 0.94; ICC range, 0.36 to 0.90; LoA, -6.09 to 10.99), with a systematic difference between test and retest scores for the ADAP lower-body strength domain (-3.81; 95% CI, -6.09 to -1.54). The ADAP correlated with the SF-36 Physical Functioning scale (r = 0.67), the TUG test (r = -0.91), and isometric knee extensor strength (r = 0.80). The ADAP test is a reliable and valid instrument. Our results suggest that testers should practise using the test to improve reliability before applying it in clinical settings.
Aging Clinical and Experimental Research 09/2006; 18(4):325-33.
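The test-retest statistics reported in the abstract above (DIF, Limits of Agreement, Cronbach's alpha) follow standard definitions and can be reproduced from paired scores. A minimal sketch using hypothetical scores, not the study's data:

```python
import numpy as np

# Hypothetical ADAP-style test and retest scores for 6 participants
# (illustrative values only; the study's raw data are not published here)
test = np.array([62.0, 55.5, 70.1, 48.9, 66.3, 58.7])
retest = np.array([63.2, 54.8, 71.0, 50.2, 65.9, 60.1])

# DIF: mean difference between test and retest scores
diff = test - retest
dif = diff.mean()

# Bland-Altman Limits of Agreement: mean difference +/- 1.96 SD of differences
sd = diff.std(ddof=1)
loa = (dif - 1.96 * sd, dif + 1.96 * sd)

# Cronbach's alpha treating the two administrations as "items":
# alpha = k/(k-1) * (1 - sum of item variances / variance of totals)
items = np.vstack([test, retest])  # shape (k administrations, n subjects)
k = items.shape[0]
alpha = k / (k - 1) * (1 - items.var(axis=1, ddof=1).sum()
                       / items.sum(axis=0).var(ddof=1))

print(f"DIF = {dif:.2f}, LoA = ({loa[0]:.2f}, {loa[1]:.2f}), alpha = {alpha:.2f}")
```

A DIF whose confidence interval excludes zero signals a systematic test-retest difference, which is how the abstract flags tester 2's lower-body strength domain.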
ABSTRACT: Decision makers want to know which healthcare services matter the most, but there are no well-established, practical methods for providing evidence-based answers to such questions. Led by the National Commission on Prevention Priorities, the authors update the methods for determining the relative health impact and economic value of clinical preventive services. Using new studies, new preventive service recommendations, and improved methods, the authors present a new ranking of clinical preventive services in the companion article. The original ranking and methods were published in this journal in 2001. The current methods report focuses on evidence collection for a priority-setting exercise, guidance for which is effectively lacking in the literature. The authors describe their own standards for searching, tracking, and abstracting literature for priority setting. They also summarize their methods for making valid comparisons across different services. This report should be useful to those who want to understand additional detail about how the ranking was developed or who want to adapt the methods for their own purposes.
American Journal of Preventive Medicine 08/2006; 31(1):90-6.