ROC (receiver operating characteristic) curve: Principles and application in biology

Article in Annales de biologie clinique 63(2):145-54 · March 2005
Source: PubMed
The diagnostic performance of a laboratory test is generally estimated by its sensitivity, specificity, and positive and negative predictive values. Unfortunately, these indices only imperfectly reflect the capacity of a test to correctly classify subjects into clinically relevant subgroups. The ROC (receiver operating characteristic) curve is a tool of choice for this evaluation. Used in medicine since the 1960s, the ROC curve is a graphical representation of the relationship between the sensitivity and the specificity of a test, calculated for all possible cut-offs. It allows the diagnostic performance of several tests to be determined and compared. It is also used to estimate the optimal cut-off of a test, taking into account epidemiological and medico-economic data on the disease. Used in numerous medical fields, this statistical tool is easily accessible thanks to the development of computer software. This article presents the principles of constructing and using a ROC curve.
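As an illustration of the construction described in the abstract, the following minimal Python sketch computes sensitivity and specificity at every observed cut-off, the area under the curve by the trapezoidal rule, and the cut-off maximizing the Youden index (sensitivity + specificity - 1). The function names and the small data set are hypothetical and do not come from the article; the sketch also assumes that higher test values indicate disease, and it ignores the epidemiological and economic weighting that the article mentions for choosing a cut-off.

```python
def roc_points(scores, labels):
    """Return (cutoff, sensitivity, specificity) for each candidate cut-off.

    Assumption: a subject is classified positive when score >= cutoff.
    labels are 1 (diseased) or 0 (not diseased).
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    # One extra cut-off above the maximum so the curve reaches (0, 0).
    cutoffs = sorted(set(scores)) + [float("inf")]
    points = []
    for c in cutoffs:
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= c)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < c)
        points.append((c, tp / positives, tn / negatives))
    return points


def auc(points):
    """Area under the curve of sensitivity against (1 - specificity)."""
    xy = sorted((1 - spec, sens) for _, sens, spec in points)
    area = 0.0
    for (x0, y0), (x1, y1) in zip(xy, xy[1:]):
        area += (x1 - x0) * (y0 + y1) / 2  # trapezoidal rule
    return area


def youden_cutoff(points):
    """Cut-off with the largest sensitivity + specificity - 1."""
    return max(points, key=lambda p: p[1] + p[2] - 1)[0]


# Hypothetical test values and disease status, for illustration only.
scores = [1, 2, 3, 4, 5, 6, 7, 8]
labels = [0, 0, 1, 0, 1, 1, 1, 1]

points = roc_points(scores, labels)
print("AUC:", round(auc(points), 3))
print("Youden-optimal cut-off:", youden_cutoff(points))
```

Comparing two tests then amounts to computing this area for each test on the same subjects; a formal comparison would additionally require a statistical test on the difference, which this sketch does not attempt.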
    • "In order to obtain information about the comparative effectiveness of these tests between men and women, we compared the area under the curves (AUC) values for each scale (CAGE, RAPS4, RAPS4-QF and AUDIT) between the total sample, men and women using version of the MedCalc ® software (for more details about this statistical analysis and software, see Stephan et al., 2003; Delacour et al., 2005 ). Comparisons of fixed sensitivities and specificities at optimized and classically used thresholds were also provided between men and women by calculating partial index of ROC curves [1: (sensitivity + specificity)/2, Park et al., 2004; Delacour et al., 2005] for each scale. Finally, for each scale, we compared optimized and classically used thresholds for each sample (total men and women) by using the 1 index. "
    ABSTRACT: A number of screening instruments are routinely used in Emergency Department (ED) situations to identify alcohol-use disorders (AUD). We wished to study the psychometric features, particularly concerning optimal thresholds scores (TSs), of four assessment scales frequently used to screen for abuse and/or dependence, the cut-down annoyed guilty eye-opener (CAGE), Rapid Alcohol Problem Screen 4 (RAPS4), RAPS4-quantity-frequency and AUD Identification Test (AUDIT) questionnaires, particularly in the sub-group of people admitted for acute alcohol intoxication (AAI). All included patients [AAI admitted to ED (blood alcohol level ≥0.8 g/l)] were assessed by the four scales, and with a gold standard (alcohol dependence/abuse section of the Mini International Neuropsychiatric Interview), to determine AUD status. To investigate the TSs of the scales, we used Youden's index, efficiency, receiver operating characteristic (ROC) curve techniques and quality ROC curve technique for optimized TS (indices of quality). A total of 164 persons (122 males, 42 females) were included in the study. Nineteen (11.60%) were identified as alcohol abusers alone and 128 (78.1%) as alcohol dependents (DSM-IV). Results suggest a statistically significant difference between men and women (P < 0.05) in performance of the screening tests RAPS4 (≥1) and CAGE (≥2) for detecting abuse. Also, in this population, we show an increase in TSs of RAPS4 (≥2) and CAGE (≥3) for detecting dependence compared with those typically accepted in non-intoxicated individuals. The AUDIT test demonstrates good performance for detecting alcohol abuse and/or alcohol-dependent patients (≥7 for women and ≥12 for men) and for distinguishing alcohol dependence (≥11 for women and ≥14 for men) from other conditions.
Our study underscores for the first time the need to adapt, taking into account gender, the thresholds of tests typically used for detection of abuse and dependence in this population.
    Full-text · Article · Mar 2012
    • "Indeed, many of the validation studies reporting the discriminative capacity of a screening questionnaire use different outcome variables (5/5), a three-month or six-month follow-up (3/5) and a sample size of less than 200 participants (5/5). Additional statistical analyses would be necessary to determine if these differences were caused or not by the questionnaires' characteristics [94]. It is common to report sensitivity and specificity values associated with questionnaires or models (see Table 7). "
    ABSTRACT: Over the last decades, psychosocial factors were identified by many studies as significant predictive variables in the development of disability related to common low back disorders, which thus contributed to the development of biopsychosocial prevention interventions. Biopsychosocial interventions were supposed to be more effective than usual interventions in improving different outcomes. Unfortunately, most of these interventions show inconclusive results. The use of screening questionnaires was proposed as a solution to improve their efficacy. The aim of this study was to validate a new screening questionnaire to identify workers at risk of being absent from work for more than 182 cumulative days and who are more susceptible to benefit from prevention interventions. Injured workers receiving income replacement benefits from the Quebec Compensation Board (n = 535) completed a 67-item questionnaire in the sub-acute stage of pain and provided information about work-related events 6 and 12 months later. Reliability and validity of the 67-item questionnaire were determined respectively by test-retest reliability and internal consistency analysis, as well as by construct validity analyses. The Cox regression model and the maximum likelihood method were used to fix a model allowing calculation of a probability of absence of more than 182 days. Criterion validity and discriminative capacity of this model were calculated. Sub-sections from the 67-item questionnaire were moderately to highly correlated 2 weeks later (r = 0.52-0.80) and showed moderate to good internal consistency (0.70-0.94). Among the 67-item questionnaire, six sub-sections and variables (22 items) were predictive of long-term absence from work: fear-avoidance beliefs related to work, return to work expectations, annual family income before-taxes, last level of education attained, work schedule and work concerns. The area under the ROC curve was 73%.
The significant predictive variables of long-term absence from work were dominated by workplace conditions and individual perceptions about work. In association with individual psychosocial variables, these variables could contribute to identify potentially useful prevention interventions and to reduce the significant costs associated with LBP long-term absenteeism.
    Full-text · Article · Jul 2011
    • "To facilitate the use of the model in clinical practice, the regression coefficients associated with the identified predictors in the final model were transformed by multiplication with a factor 4, rounded off to the nearest integer, into scores to obtain an aggregate score by adding up the scores. The discriminative power of the model was assessed using the area under the receiver operating characteristic (ROC) curve (AUC) [20]. To develop an algorithm for clinical approach to the prevention of falls among elderly living in the community, two cutoffs were discussed: the cutoff maximizing the Youden index and the cutoff maximizing the sum of the positive predictive value (PPV) and negative predictive value (NPV). "
    ABSTRACT: To develop a simple clinical screening tool for community-dwelling older adults. A prospective multicenter cohort study was performed among healthy subjects of 65 years and older, examined in 10 health examination centers for the French health insurance. Falls were ascertained monthly by telephone for 12-month follow-up. Multivariate analyses using Cox regression models were performed. Regression coefficients of the predictors in the final model were added up to obtain the total score. The discriminative power was assessed using the area under the curve (AUC). One thousand seven hundred fifty-nine subjects were included. The mean age was 70.7 years and 51% were women. At least one fall occurred among 563 (32%) participants. Gender, living alone, psychoactive drug use, osteoarthritis, previous falls, and a change in the position of the arms during the one-leg balance (OLB) test were the strongest predictors. These predictors were used to build a risk score. The AUC of the score was 0.70. For a cutoff point of 1.68 in a total of 4.90, the positive predictive value and negative predictive value were 72.0% and 72.7%, respectively. A screening tool with five risk factors and the OLB test could predict falls in healthy community-dwelling older adults.
    Full-text · Article · Apr 2011
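The scoring approach quoted in the excerpt above (coefficients multiplied by 4, rounded to integers, summed into an aggregate score, then a cut-off chosen to maximize PPV + NPV) can be sketched as follows. This is only an illustration under stated assumptions: the coefficient values, predictor names, data, and function names are hypothetical and are not taken from the cited study, and Python's `round` resolves exact .5 ties to the nearest even integer, which may differ from the rounding the authors used.

```python
def integer_weights(coefficients, factor=4):
    """Turn regression coefficients into integer points (coefficient x factor, rounded)."""
    return {name: round(beta * factor) for name, beta in coefficients.items()}


def aggregate_score(weights, subject):
    """Sum the points of the predictors that are present for one subject."""
    return sum(points for name, points in weights.items() if subject.get(name))


def best_cutoff_ppv_npv(scores, outcomes):
    """Cut-off (score >= cutoff is positive) maximizing PPV + NPV.

    Cut-offs where PPV or NPV is undefined (no positive or no negative
    classifications) are skipped.
    """
    best, best_total = None, -1.0
    for c in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, outcomes) if s >= c and y == 1)
        fp = sum(1 for s, y in zip(scores, outcomes) if s >= c and y == 0)
        tn = sum(1 for s, y in zip(scores, outcomes) if s < c and y == 0)
        fn = sum(1 for s, y in zip(scores, outcomes) if s < c and y == 1)
        if tp + fp == 0 or tn + fn == 0:
            continue
        total = tp / (tp + fp) + tn / (tn + fn)
        if total > best_total:
            best, best_total = c, total
    return best


# Hypothetical coefficients, for illustration only.
coefficients = {"previous_fall": 0.62, "lives_alone": 0.30, "psychoactive_drug": 0.41}
weights = integer_weights(coefficients)
subject = {"previous_fall": True, "lives_alone": False, "psychoactive_drug": True}
print(weights, aggregate_score(weights, subject))

# Hypothetical aggregate scores and observed outcomes (1 = fell, 0 = did not fall).
scores = [0, 1, 2, 2, 3, 4, 5]
outcomes = [0, 0, 0, 1, 1, 1, 1]
print("Cut-off maximizing PPV + NPV:", best_cutoff_ppv_npv(scores, outcomes))
```

Because predictive values depend on the prevalence of the outcome in the sample, a cut-off chosen this way may not carry over to a population with a different fall rate.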