Feinstein AR, Cicchetti DV. High agreement but low kappa: I. The problems of two paradoxes. J Clin Epidemiol 43:543-549

Yale University, New Haven, Connecticut, United States
Journal of Clinical Epidemiology (Impact Factor: 3.42). 02/1990; 43(6):543-9. DOI: 10.1016/0895-4356(90)90158-L
Source: PubMed


In a fourfold table showing binary agreement of two observers, the observed proportion of agreement, p0, can be paradoxically altered by the chance-corrected ratio that creates kappa as an index of concordance. In one paradox, a high value of p0 can be drastically lowered by a substantial imbalance in the table's marginal totals either vertically or horizontally. In the second paradox, kappa will be higher with an asymmetrical rather than symmetrical imbalance in marginal totals, and with imperfect rather than perfect symmetry in the imbalance. An adjustment that substitutes kappa max for kappa does not repair either problem, and seems to make the second one worse.
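
The first paradox can be checked with a few lines of arithmetic. The sketch below is a minimal illustration, assuming the standard Cohen's kappa definition, kappa = (p0 - pe) / (1 - pe), with the chance agreement pe computed from the marginal totals. The function name `kappa_2x2` and the cell counts are hypothetical choices made for illustration; they are not taken from this page.

```python
def kappa_2x2(a, b, c, d):
    """Return (p0, pe, kappa) for a fourfold agreement table.

    a: both observers rate "yes"      b: observer 1 "yes", observer 2 "no"
    c: observer 1 "no", observer 2 "yes"      d: both observers rate "no"
    """
    n = a + b + c + d
    p0 = (a + d) / n  # observed proportion of agreement
    # Chance-expected agreement from the row and column marginal totals.
    pe = ((a + b) * (a + c) + (c + d) * (b + d)) / n**2
    # Undefined (division by zero) when pe == 1, i.e. every rating falls in one category.
    kappa = (p0 - pe) / (1 - pe)
    return p0, pe, kappa


# Illustrative counts (assumed): both tables have p0 = 0.85.
balanced = kappa_2x2(40, 9, 6, 45)    # marginal totals close to 50/50
imbalanced = kappa_2x2(80, 10, 5, 5)  # marginal totals heavily skewed to "yes"

for name, (p0, pe, kappa) in [("balanced", balanced), ("imbalanced", imbalanced)]:
    print(f"{name}: p0={p0:.2f} pe={pe:.2f} kappa={kappa:.2f}")
```

With these assumed counts, the observed agreement is identical in both tables, but the skewed marginals raise pe from about 0.50 to 0.78, and kappa drops from about 0.70 to about 0.32.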

351 Reads
    • "The kappa statistic was undertaken to assess concordance between blood smear microscopic examination and PCR method, as well as between PCR and parallel microscopy and PCV assay (Fleiss et al., 2003). Due to paradoxical kappa test results for agreement between PCR and MGG in one hand and PCR and PCV-MGG in the other hand, the prevalence-adjusted bias-adjusted kappa (PABAK) and the Gwet's AC1 statistic (AC1) were computed (Feinstein and Cicchetti, 1989; Byrt et al., 1993; Gwet, 2010). PCR (rRNA 16S) was considered as the gold standard on which to base estimates of sensibility and specificity for microscopy."
    ABSTRACT: The prevalence of infection by Anaplasma spp. (including Anaplasma phagocytophilum) was determined using blood smear microscopy and PCR through screening of small ruminant blood samples collected from seven regions of Morocco. Co-infections of Anaplasma spp., Babesia spp., Theileria spp. and Mycoplasma spp. were investigated and risk factors for Anaplasma spp. infection assessed. A total of 422 small ruminant blood samples were randomly collected from 70 flocks. Individual animal (breed, age, tick burden and previous treatment) and flock data (GPS coordinate of farm, size of flock and livestock production system) were collected. Upon examination of blood smears, 375 blood samples (88.9%) were found to contain Anaplasma-like erythrocytic inclusion bodies. Upon screening with a large spectrum PCR targeting the Anaplasma 16S rRNA region, 303 (71%) samples were found to be positive. All 303 samples screened with the A. phagocytophilum-specific PCR, which targets the msp2 region, were found to be negative. Differences in prevalence were found to be statistically significant with regard to region, altitude, flock size, livestock production system, grazing system, presence of clinical cases and application of tick and tick-borne diseases prophylactic measures. Kappa analysis revealed a poor concordance between microscopy and PCR (k = 0.14). Agreement with PCR is improved by considering microscopy and packed cell volume (PCV) in parallel. The prevalence of double infections was found to be 1.7, 2.5 and 24% for Anaplasma-Babesia, Anaplasma-Mycoplasma and Anaplasma-Theileria, respectively. Co-infection with three or more haemoparasites was found in 1.6% of animals examined. In conclusion, we demonstrate the high burden of anaplasmosis in small ruminants in Morocco and the high prevalence of co-infections of tick-borne diseases. There is an urgent need to improve the control of this neglected group of diseases. © 2015 Blackwell Verlag GmbH.
    Full-text · Article · Apr 2015 · Transboundary and Emerging Diseases
    • "Since P(C k) depends directly upon the annotation value proportions, it may lead to the paradoxical result of 'high agreement but low kappa' in case of unbalanced distributions of those proportions (Feinstein and Cicchetti, 1990). One motivation behind AC1 is to reduce the risk of paradoxical results by computing the chance estimator, P(C AC1), with reference to the concept of intra-observer variation (Kjaersgaard-Andersen et al., 1988), according to formula (3)."
    ABSTRACT: Current semantic theory on indexical expressions claims that demonstratively used indexicals such as this lack a referent-determining meaning but instead rely on an accompanying demonstration act like a pointing gesture. While this view allows to set up a sound logic of demonstratives, the direct-referential role assigned to pointing gestures has never been scrutinized thoroughly in semantics or pragmatics. We investigate the semantics and pragmatics of co-verbal pointing from a foundational perspective combining experiments, statistical investigation, computer simulation and theoretical modeling techniques in a novel manner. We evaluate various referential hypotheses with a corpus of object identification games set up in experiments in which body movement tracking techniques have been extensively used to generate precise pointing measurements. Statistical investigation and computer simulations show that especially distal areas in the pointing domain falsify the semantic direct-referential hypotheses concerning pointing gestures. As an alternative, we propose that reference involving pointing rests on a default inference which we specify using the empirical data. These results raise numerous problems for classical semantics–pragmatics interfaces: we argue for pre-semantic pragmatics in order to account for inferential reference in addition to classical post-semantic Gricean pragmatics.
    Full-text · Article · Feb 2015 · Journal of Pragmatics
    • "Secondly, the ICC is related to a Kappa statistic, since under special settings, a specific form of the ICC is identical to a weighted kappa statistic (Soeken and Prescott, 1986; Lin et al., 2007). Finally, Kappa statistics are not necessarily the best measures for agreement (Feinstein and Cicchetti, 1990) and similar measures like ICCs are then proposed (Kraemer et al., 2002). We calculated 95% CIs on ICCs using the Beta approximation (Demetrashvili et al., 2015), and criteria for evaluation of ICC are shown in Table 2 (Altman, 1991). "
    ABSTRACT: Dupuytren disease (DD) is a fibrosing disease affecting the palmar aponeurosis, and is mostly treated by surgery based on measurement of severity of flexion contracture of the fingers. Literature concerning the measurement reliability is scarce. This study aimed to determine the intra- and inter-observer agreement of four variables for diagnosing DD, determining severity of contracture, and disease extent. One of them is a new measurement on the area of nodules and cords for measuring the disease extent in early disease stages. An agreement study (n = 54) was performed by two trained investigators. Agreement was calculated per finger, based on an intraclass correlation coefficient (ICC) using a latent variable model on subjects for diagnosis and Tubiana stage. For total passive extension deficit (TPED) and the area of nodules and cords, agreement was calculated with an ICC using a one-way random effects model with subject as random effect. Inter-observer agreement was very good for diagnosing DD (ICC: 95.5%-99.9%) and good to very good for classifying Tubiana stage (ICC: 73.5%-94.9%). Agreements for area and TPED were moderate (middle finger) to very good (ICC: 48.4%-98.6% and 45.0%-99.5%, respectively). Intra-observer agreement was slightly higher on average than inter-observer agreement. Overall, the intra- and inter-observer agreement in diagnosing DD, and determining the severity of flexion contracture is high. Also, the newly introduced variable area of nodules and cords has high intra- and inter-observer agreement, indicating that it is suitable to measure disease extent. Copyright © 2015 Elsevier Ltd. All rights reserved.
    No preview · Article · Jan 2015 · Manual Therapy


Available from
Dec 19, 2014