Advice on total-score reliability issues in psychosomatic measurement.

Department of Methodology and Statistics, Tilburg University, Tilburg, The Netherlands.
Journal of Psychosomatic Research (Impact Factor: 2.91). 06/2011; 70(6):565-72. DOI: 10.1016/j.jpsychores.2010.11.002
Source: PubMed

ABSTRACT: This article addresses three reliability issues that are problematic in the construction of scales intended for use in psychosomatic research, illustrates how these problems may lead to errors, and suggests solutions.
We draw on established psychometric results and present five computational studies. The first, third, and fourth studies generate artificial data from psychometric models combined with distributions for scale scores, as is common in psychometric research, whereas the second and fifth studies are analytical.
The power of Student's t test depends more on sample size than on total-score reliability, but reliability must be high when one estimates correlations involving test scores. Short scales often do not allow total scores to differ significantly from a cutoff score. Coefficient alpha is uninformative about the factorial structure of questionnaires and is one of the weakest estimators of total-score reliability.
The relationship between questionnaire length, reliability, and statistical power is complex. Both in research and in individual diagnostics, we recommend the use of highly reliable scales so as to reduce the chance of faulty decisions. We call for further statistical research producing hands-on rules for researchers to act upon. Factor analysis should be used to assess the internal consistency of questionnaires, and as a reliability estimator alpha should be replaced by better, readily available methods.
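The abstract's claim that coefficient alpha is one of the weakest reliability estimators can be made concrete: Guttman's lambda-2, computable from the same item covariance matrix, is a lower bound to reliability that never falls below alpha. A minimal sketch in Python (function names and the toy setup are ours, not from the article):

```python
import numpy as np

def cronbach_alpha(x):
    """Coefficient alpha for an (n_persons, n_items) score matrix."""
    x = np.asarray(x, dtype=float)
    k = x.shape[1]
    item_var = x.var(axis=0, ddof=1).sum()      # sum of item variances
    total_var = x.sum(axis=1).var(ddof=1)       # variance of the total score
    return k / (k - 1) * (1 - item_var / total_var)

def guttman_lambda2(x):
    """Guttman's lambda-2: a reliability lower bound, always >= alpha."""
    x = np.asarray(x, dtype=float)
    k = x.shape[1]
    cov = np.cov(x, rowvar=False)
    off = cov - np.diag(np.diag(cov))           # off-diagonal covariances only
    return (off.sum() + np.sqrt(k / (k - 1) * (off ** 2).sum())) / cov.sum()
```

On any data set, `guttman_lambda2` is at least as large as `cronbach_alpha`, which illustrates why alpha systematically understates total-score reliability.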

Similar Publications

  • ABSTRACT: Several authors proposed a shortened version of the State scale of the State-Trait Anxiety Inventory (S-STAI) to obtain a more efficient measurement instrument. Psychometric theory shows that test shortening makes a total score more vulnerable to measurement error, and this may result in inaccurate and biased research results and an increased risk of making incorrect decisions about individuals. This study investigated whether the reliability and the measurement precision of shortened versions of the S-STAI are adequate for psychological research and making decisions about individuals in clinical practice. Secondary data analysis was used to compare reliability and measurement precision between twelve shortened S-STAI versions and the full-length 20-item S-STAI version. Data for the 20-item version came from a longitudinal study performed previously in the Netherlands and included 377 patients and 375 of their family members. This was our master data set. A literature study was conducted to identify shortened S-STAI versions that are used in research and clinical practice. Data for each shortened version were obtained from the master data set by selecting the relevant items from the 20-item version. All analyses were done by means of classical test theory statistics. The effect of test shortening on total-score reliability was small, the effect on measurement precision was large, and the effect on individual diagnosis and assessment of individual change was ambiguous. We conclude that shortened versions of the S-STAI seem to be acceptable for research purposes, but may be problematic in clinical practice.
    Journal of Psychosomatic Research 08/2013; 75(2):167-72. Impact Factor: 2.91
  • ABSTRACT: I address two issues that were inspired by my work on the Dutch Committee on Tests and Testing (COTAN). The first issue is the understanding that test constructors and researchers who use tests have of psychometric knowledge. I argue that this understanding is important for a field like psychometrics, for which the dissemination of psychometric knowledge among test constructors and researchers in general is highly important. The second issue concerns the identification of psychometric research topics that are relevant for test constructors and test users but, in my view, do not receive enough attention in psychometrics. I discuss the influence of test length on decision quality in personnel selection, the quality of difference scores in therapy assessment, and theory development in test construction and validity research. I also briefly mention the issue of whether particular attributes are continuous or discrete.
    Psychometrika 01/2011; 77(1):4-20. Impact Factor: 1.96
  • ABSTRACT: Validated questionnaires can improve the identification of psychosocial problems in community pediatric services. Our aim was to assess which of three short questionnaires (the Brief Infant-Toddler Social and Emotional Assessment, BITSEA; the Ages and Stages Questionnaires: Social-Emotional, ASQ:SE; and the Brief Instrument Psychological and Pedagogical Problem Inventory, KIPPPI) was most suitable as a routine screening tool for identifying such problems among toddlers. We included 2106 parents (response rate 81%) of children aged 6, 14, or 24 months at routine well-child visits in 18 services across the Netherlands. Child health care professionals interviewed and examined children and parents. Parents were randomized to complete either the BITSEA or the KIPPPI; all filled out the ASQ:SE and the Child Behavior Checklist. For each questionnaire, we assessed the internal consistency, the validity with the Child Behavior Checklist Total Problems Score (CBCL-TPS) as a criterion, and the added value to identification compared with clinical assessment alone. Cronbach's alphas of the total scales varied from 0.46 to 0.91. At the ages of 6 and 14 months, none of the instruments studied had adequate validity. At the age of 24 months, only the BITSEA discriminated sufficiently between children with and without problems (sensitivity = 0.84 at specificity = 0.90); the other two questionnaires had sensitivities between 0.53 and 0.60 at similar specificity. At this age, the BITSEA also offered slightly higher added value to the identification of psychosocial problems by child health care professionals. For toddlers aged 6 and 14 months, no questionnaire is sufficiently valid to support the identification of psychosocial problems. The BITSEA is the best short tool for the early detection of psychosocial problems in 2-year-old children.
    Academic Pediatrics 11/2013; 13(6):587-92.
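The shortening effect examined in the S-STAI study above follows directly from classical test theory: the Spearman-Brown prophecy formula predicts total-score reliability after a change in test length, and the standard error of measurement (SEM) shows why individual-level precision can suffer even when reliability still looks acceptable. A minimal sketch with illustrative numbers (not the study's data):

```python
def spearman_brown(rho, factor):
    """Predicted reliability when test length is multiplied by `factor`
    (factor = new_length / old_length), assuming parallel items."""
    return factor * rho / (1 + (factor - 1) * rho)

def sem(sd_total, rho):
    """Standard error of measurement of a total score with SD `sd_total`."""
    return sd_total * (1 - rho) ** 0.5

# A hypothetical 20-item scale with reliability .90, shortened to 6 items:
rho_short = spearman_brown(0.90, 6 / 20)   # ≈ 0.73
```

The drop from .90 to roughly .73 may look tolerable for group-level research, but the corresponding SEM widens the confidence interval around any individual's score, which is the ambiguity for clinical decisions that the study reports.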

