Do ultra-short screening instruments accurately detect depression in primary care? A pooled analysis and meta-analysis of 22 studies

Department of Liaison Psychology, Leicester General Hospital, Leicester.
British Journal of General Practice (Impact Factor: 2.29). 02/2007; 57(535):144-51.
Source: PubMed


Background. Guidance from the National Institute for Health and Clinical Excellence recommends one or two questions as a possible screening method for depression. Ultra-short (one-, two-, three- or four-item) tests have appeal due to their simple administration, but their accuracy has not been established.
Aim. To determine whether ultra-short screening instruments accurately detect depression in primary care.
Design of study. Pooled analysis and meta-analysis.
Method. A literature search revealed 75 possible studies; of these, 22 STARD-compliant studies (Standards for Reporting of Diagnostic Accuracy) involving ultra-short tests were entered in the analysis.
Results. Meta-analysis revealed a performance accuracy better than chance (P<0.001). More usefully for clinicians, pooled analysis of single-question tests revealed an overall sensitivity of 32.0% and specificity of 97.0% (positive predictive value [PPV] was 55.6% and negative predictive value [NPV] was 92.3%). For two- and three-item tests, overall sensitivity on pooled analysis was 73.7% and specificity was 74.7%, with a PPV of only 38.3% but a pooled NPV of 93.0%. The Youden index for single-item and multiple-item tests was 0.289 and 0.47 respectively, suggesting superiority of multiple-item tests. Re-analysis examining only 'either or' strategies improved the 'rule in' ability of two- and three-question tests (sensitivity 79.4% and NPV 94.7%) but at the expense of being able to rule out a possible diagnosis if the result was negative.
Conclusion. A one-question test identifies only three out of every 10 patients with depression in primary care, which is unacceptable if it is relied on alone. Ultra-short two- or three-question tests perform better, identifying eight out of 10 cases. This is at the expense of a high false-positive rate (only four out of 10 cases with a positive score are actually depressed). Ultra-short tests appear to be, at best, a method for ruling out a diagnosis and should only be used when there are sufficient resources for second-stage assessment of those who screen positive.
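
The accuracy measures quoted in the abstract can be related to one another with a short sketch. The Python below is an illustration only, not the authors' pooling method: the function names are hypothetical, and the prevalence value is an assumed figure chosen for demonstration because the abstract does not report one. The Youden index is sensitivity + specificity - 1; PPV and NPV follow from sensitivity, specificity and prevalence by Bayes' rule.

```python
def youden_index(sensitivity: float, specificity: float) -> float:
    """Youden's J statistic: sensitivity + specificity - 1."""
    return sensitivity + specificity - 1.0


def predictive_values(sensitivity: float, specificity: float,
                      prevalence: float) -> tuple[float, float]:
    """Return (PPV, NPV) for a test applied at a given prevalence."""
    true_pos = sensitivity * prevalence
    false_neg = (1.0 - sensitivity) * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    true_neg = specificity * (1.0 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Pooled figures from the abstract (single-question tests).
j_single = youden_index(0.32, 0.97)   # about 0.29, in line with the reported 0.289

# Pooled figures from the abstract (two- and three-item tests).
j_multi = youden_index(0.737, 0.747)  # about 0.48 from these rounded inputs

# ASSUMPTION: 15% prevalence is a placeholder, not a value from the study.
assumed_prevalence = 0.15
ppv_multi, npv_multi = predictive_values(0.737, 0.747, assumed_prevalence)

print(f"Youden (single item): {j_single:.3f}")
print(f"Youden (multi item):  {j_multi:.3f}")
print(f"PPV/NPV at assumed prevalence: {ppv_multi:.3f} / {npv_multi:.3f}")
```

The rounded multiple-item figures give an index near 0.48 rather than the reported 0.47. The abstract does not explain the difference; it may reflect rounding or the way the studies were pooled. Because PPV and NPV depend on prevalence, the values printed for the assumed prevalence should not be expected to reproduce the published 38.3% and 93.0% exactly.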

Available from: Alex J Mitchell
    • "However, given that the prevalence of depression in mental health care centres is high, this might be an optimal setting to use the VAMS. In a meta-analysis on screeners in primary care, it was found that single-item screeners have low sensitivity (i.e., pooled SE = 0.32) that increased when additional items were added (Mitchell and Coyne, 2007). Since we exclusively focused on sad mood, adding an additional VAMS for anhedonia, the other core symptom of depression (APA, 2000), might also reduce false positives. "
    ABSTRACT: Mood is a key element of Major Depressive Disorder (MDD), and is perceived as a highly dynamic construct. The aim of the current study was to examine whether a single-item mood scale can be used for mood monitoring. One hundred thirty remitted out-patients were assessed using the Structured Clinical Interview for DSM-IV Axis-I Disorders (SCID-I), Visual Analogue Mood Scale (VAMS), 17-item Hamilton Depression Rating Scale (HAM-D17), and Inventory of Depressive Symptomatology-Self Report (IDS-SR). Of all patients, 13.8% relapsed during follow-up assessments. Areas under the curve (AUCs) for the VAMS, HAM-D17 and IDS-SR were 0.94, 0.91, and 0.86, respectively. The VAMS had the highest positive predictive value (PPV) without any false negatives at score 55 (PPV=0.53; NPV=1.0) and was the best predictor of current relapse status (variance explained for VAMS: 60%; for HAM-D17: 49%; for IDS-SR: 34%). Only the HAM-D17 added significant variance to the model (7%). Assessing sad mood with a single-item mood scale seems to be a straightforward and patient-friendly avenue for life-long mood monitoring. Using a diagnostic interview (e.g., the SCID) in case of a positive screen is warranted. Repeated assessment of the VAMS using Ecological Momentary Assessment (EMA) might reduce false positives.
    No preview · Article · Jul 2014 · Psychiatry Research
    • "We screened for anxiety symptoms in the same manner, asking: have you felt anxious much of the time in the past year? Single-item questions have been used for depressive disorders screening in both general and clinical populations [35] [36] "
    ABSTRACT: Background. Although binge drinking prevalence and correlates among young people have been extensively studied in the USA and Northern Europe, less is known for Southern European countries with relatively healthier drinking cultures. Objective. We aimed at analyzing prevalence and correlates of binge drinking in a representative sample of young adults in Italy. Methods. We conducted a cross-sectional survey among alcohol-consuming young adults. We carried out univariate and multivariate analyses to assess associations between recent binge drinking and candidate variables. Results. We selected 654 subjects, with 590 (mean age: 20.65 ± 1.90) meeting inclusion criteria. Prevalence for recent binge drinking was 38.0%, significantly higher for females than males. Multivariate analysis showed that high alcohol expectancies, large amount of money available during the weekend, interest for parties and discos, female gender, cannabis use, influence by peers, and electronic cigarettes smoking all were significantly associated with recent binge drinking, whereas living with parents appeared a significant protective factor. Conclusions. More than a third of young adults using alcohol are binge drinkers, and, in contrast with findings from Anglo-Saxon countries, females show higher risk as compared with males. These data suggest the increasing importance of primary and secondary prevention programmes for binge drinking.
    Full-text · Article · Jun 2014 · BioMed Research International
    • "To reduce the respondent burden of screeners for common mental health disorders, many efforts have been aimed at creating fixed length short forms of existing self-report questionnaires (e.g., Donker, van Straten, Marks, & Cuijpers, 2011; Cuijpers, Smits, Donker, Ten Have, & Graaf, 2010; Rost, Burnam, & Smith, 1993). However, for fixed-length short forms, the reduction in test length generally comes at the expense of diagnostic accuracy (Smith, McCarthy, & Anderson, 2000; Mitchell & Coyne, 2007). "
    ABSTRACT: Minimizing the respondent burden and maximizing the classification accuracy of tests is essential for efficacious screening for common mental health disorders. In previous studies, curtailment of tests has been shown to reduce average test length considerably, without loss of accuracy. In the current study, we simulate Deterministic (DC) and Stochastic (SC) Curtailment for three self-report questionnaires for common mental health disorders, to study the potential gains in efficiency that can be obtained in screening for these disorders. The curtailment algorithms were applied in an existing dataset of item scores of 502 help-seeking participants. Results indicate that DC reduces test length by up to 37%, and SC reduces test length by up to 46%, with only very slight decreases in diagnostic accuracy. Compared to an item response theory based adaptive test with similar test length, SC provided better diagnostic accuracy. Consequently, curtailment may be useful in improving the efficiency of mental health self-report questionnaires.
    Full-text · Article · Nov 2013