QUADAS-2 Group. QUADAS-2: a revised tool for the quality assessment of diagnostic accuracy studies

University of Bristol, United Kingdom.
Annals of Internal Medicine (Impact Factor: 17.81). 10/2011; 155(8):529-36. DOI: 10.1059/0003-4819-155-8-201110180-00009
Source: PubMed

ABSTRACT In 2003, the QUADAS tool for systematic reviews of diagnostic accuracy studies was developed. Experience, anecdotal reports, and feedback suggested areas for improvement; therefore, QUADAS-2 was developed. This tool comprises 4 domains: patient selection, index test, reference standard, and flow and timing. Each domain is assessed in terms of risk of bias, and the first 3 domains are also assessed in terms of concerns regarding applicability. Signalling questions are included to help judge risk of bias. The QUADAS-2 tool is applied in 4 phases: summarize the review question, tailor the tool and produce review-specific guidance, construct a flow diagram for the primary study, and judge bias and applicability. This tool will allow for more transparent rating of bias and applicability of primary diagnostic accuracy studies.
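
The domain structure described in the abstract can be sketched as a small data model. This is only an illustrative sketch, not part of the QUADAS-2 publication: the names `Judgement`, `DomainAssessment`, and `validate` are hypothetical, and the three rating levels (low, high, unclear) are an assumption, since the abstract does not list them.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Optional


class Judgement(Enum):
    # Assumed rating levels; the abstract does not enumerate them.
    LOW = "low"
    HIGH = "high"
    UNCLEAR = "unclear"


# The four domains named in the abstract, in the order given there.
DOMAINS = ("patient selection", "index test", "reference standard", "flow and timing")
# Only the first three domains are also rated for applicability concerns.
APPLICABILITY_DOMAINS = DOMAINS[:3]


@dataclass
class DomainAssessment:
    risk_of_bias: Judgement
    applicability_concern: Optional[Judgement] = None


def validate(assessment: Dict[str, DomainAssessment]) -> None:
    """Check that one study's assessment matches the structure in the abstract."""
    missing = [d for d in DOMAINS if d not in assessment]
    if missing:
        raise ValueError(f"missing domains: {missing}")
    for domain in DOMAINS:
        has_applicability = assessment[domain].applicability_concern is not None
        expected = domain in APPLICABILITY_DOMAINS
        if has_applicability != expected:
            raise ValueError(f"applicability rating mismatch for {domain!r}")


# Hypothetical assessment of a single primary study.
example = {
    "patient selection": DomainAssessment(Judgement.LOW, Judgement.LOW),
    "index test": DomainAssessment(Judgement.UNCLEAR, Judgement.LOW),
    "reference standard": DomainAssessment(Judgement.LOW, Judgement.HIGH),
    "flow and timing": DomainAssessment(Judgement.HIGH),
}
validate(example)  # raises ValueError if the structure is wrong
```

Keeping the applicability rating optional, and validating it per domain, mirrors the abstract's statement that flow and timing is judged for risk of bias only.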

Available from: Marie Westwood, May 13, 2014
    • "This tool comprises of four domains, namely patient selection, index test, reference standard, and flow and timing. Each domain is assessed in terms of risk of bias and concerns regarding applicability [27]"
    ABSTRACT: Several studies have implicated PAX1 epigenetic regulation in cervical neoplasia. The aim of this meta-analysis was to assess PAX1 gene methylation as a potential biomarker in cervical cancer screening. A systematic search of all major databases was performed to include all relevant publications in English up to December 31st, 2014. Studies with insufficient data, conducted in experimental models, or associated with other comorbidities were excluded from the meta-analysis. Summary receiver operating characteristics (SROC) for Cervical Intraepithelial Neoplasia grade 2 or worse (CIN2+) versus normal, and CIN grade 3 or worse (CIN3+) versus normal, were estimated using the bivariate model. Of the 20 initially included studies, 7 (comprising 1385 subjects with various stages of CIN and normal cervical pathology) ultimately met the inclusion criteria. For CIN2+ versus normal, the sensitivity was estimated to be 0.66 (95% CI, 0.46-0.81) and the specificity 0.92 (95% CI, 0.88-0.95). For CIN3+ versus normal, the sensitivity was 0.77 (95% CI, 0.58-0.89) and the specificity 0.92 (95% CI, 0.88-0.94). The area under the curve (AUC) was 0.923 in the former case and 0.931 in the latter. The results of this meta-analysis support the utility of PAX1 methylation as an auxiliary biomarker in cervical cancer screening. PAX1 could be used effectively to increase the specificity of HPV DNA by detecting women with more advanced cervical abnormalities. Copyright © 2015 Elsevier Ltd. All rights reserved.
    07/2015; 39(5):682-686. DOI:10.1016/j.canep.2015.07.008
    • "There is strong consensus regarding the qualities of studies that offer the best evidence of screening accuracy. A rich literature (e.g., Jaeschke et al., 1994) and two detailed expert consensus statements, the STARD (Bossuyt et al., 2003) and the QUADAS-2 (Whiting et al., 2011), detail the criteria for conducting and reporting diagnostic accuracy studies. This 'gold standard' methodology prescribes comparison of screening results to 'gold standard' tests and structured interviews that are considered to be our best indices of psychopathology, typically yielding what we will hereafter refer to as the standard indices of diagnostic accuracy:"
    ABSTRACT: The accuracy of any screening instrument designed to detect psychopathology among children is ideally assessed through rigorous comparison to 'gold standard' tests and interviews. Such comparisons typically yield estimates of what we refer to as 'standard indices of diagnostic accuracy', including sensitivity, specificity, positive predictive value (PPV), and negative predictive value. However, whereas these statistics were originally designed to detect binary signals (e.g., diagnosis present or absent), screening questionnaires commonly used in psychology, psychiatry, and pediatrics typically result in ordinal scores. Thus, a threshold or 'cut score' must be applied to these ordinal scores before accuracy can be evaluated using such standard indices. To better understand the tradeoffs inherent in choosing a particular threshold, we discuss the concept of 'threshold probability'. In contrast to PPV, which reflects the probability that a child whose score falls at or above the screening threshold has the condition of interest, threshold probability refers specifically to the likelihood that a child whose score is equal to a particular screening threshold has the condition of interest. The diagnostic accuracy and threshold probability of two well-validated behavioral assessment instruments, the Child Behavior Checklist Total Problem Scale and the Strengths and Difficulties Questionnaire total scale were examined in relation to a structured psychiatric interview in three de-identified datasets. Although both screening measures were effective in identifying groups of children at elevated risk for psychopathology in all samples (odds ratios ranged from 5.2 to 9.7), children who scored at or near the clinical thresholds that optimized sensitivity and specificity were unlikely to meet criteria for psychopathology on gold standard interviews. 
    Our results are consistent with the view that screening instruments should be interpreted probabilistically, with attention to where along the continuum of positive scores an individual falls. © 2015 Association for Child and Adolescent Mental Health.
    Journal of Child Psychology and Psychiatry 06/2015; 56(9). DOI:10.1111/jcpp.12442 · 6.46 Impact Factor
    • "The lead authors of studies for which reported data were insufficient to calculate diagnostic accuracy were contacted to ascertain missing data. Study quality was appraised using the Quality Assessment of Diagnostic Accuracy Studies-2 (QUADAS-2) instrument [10] [12]. Additionally, the studies were graded using the quality scale reported by Van den Bruel et al. [13]; studies were rated as grade A if they fulfilled all QUADAS-2 criteria."
    ABSTRACT: Screening for atrial fibrillation (AF) using 12-lead electrocardiograms (ECGs) has been recommended; however, the best method for interpreting ECGs to diagnose AF is not known. We compared the accuracy of methods for diagnosing AF from ECGs. We searched MEDLINE, EMBASE, CINAHL and LILACS until March 24, 2014. Two reviewers identified eligible studies, extracted data and appraised quality using the QUADAS-2 instrument. Meta-analysis, using the bivariate hierarchical random effects method, determined average operating points for sensitivities, specificities, positive and negative likelihood ratios (PLR, NLR) and enabled construction of Summary Receiver Operating Characteristic (SROC) plots. 10 studies investigated 16 methods for interpreting ECGs (n=55,376 participant ECGs). The sensitivity and specificity of automated software (8 studies; 9 methods) were 0.89 (95% C.I. 0.82-0.93) and 0.99 (95% C.I. 0.99-0.99), respectively; PLR 96.6 (95% C.I. 64.2-145.6); NLR 0.11 (95% C.I. 0.07-0.18). Indirect comparisons with software found healthcare professionals (5 studies; 7 methods) had similar sensitivity for diagnosing AF but lower specificity [sensitivity 0.92 (95% C.I. 0.81-0.97), specificity 0.93 (95% C.I. 0.76-0.98), PLR 13.9 (95% C.I. 3.5-55.3), NLR 0.09 (95% C.I. 0.03-0.22)]. Sub-group analyses of primary care professionals found greater specificity for GPs than nurses [GPs: sensitivity 0.91 (95% C.I. 0.68-1.00); specificity 0.96 (95% C.I. 0.89-1.00). Nurses: sensitivity 0.88 (95% C.I. 0.63-1.00); specificity 0.85 (95% C.I. 0.83-0.87)]. Automated ECG-interpreting software most accurately excluded AF, although its ability to diagnose AF was similar to that of all healthcare professionals. Within primary care, the specificity of AF diagnosis from ECG was greater for GPs than nurses. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
    International Journal of Cardiology 02/2015; 184C(1):175-183. DOI:10.1016/j.ijcard.2015.02.014 · 4.04 Impact Factor