About
18 Publications
68,357 Reads
5,170 Citations
Introduction
K. Gwet is a statistical and quantitative management consultant specializing in inter-rater reliability and statistical surveys. He has developed the R packages irrCAC and irrICC for evaluating agreement coefficients.
He has also developed two cloud-based apps: (1) agreestat360.com, for analyzing inter-rater agreement data, and (2) AgreeTest (agreestat.net/agreetest/), for testing the difference between two agreement coefficients for statistical significance. He is the author of the "Handbook of Inter-Rater Reliability."
Publications (18)
Purpose
To estimate the interobserver agreement of the Carimas software package (SP) on global, regional, and segmental levels for the most widely used myocardial perfusion PET tracer—Rb-82.
Materials and methods
Rest and stress Rb-82 PET scans of 48 patients with suspected or known coronary artery disease (CAD) were analyzed in four centers using...
Purpose
To cross-compare three software packages (SPs)—Carimas, FlowQuant, and PMOD—to quantify myocardial perfusion at global, regional, and segmental levels.
Materials and Methods
Stress N-13 ammonia PET scans of 48 patients with hypertrophic cardiomyopathy (HCM) were analyzed in three centers using Carimas, FlowQuant, and PMOD. Values agreed if they had an ICC > 0.75 and a...
Cohen's kappa coefficient was originally proposed for two raters only; it was later extended to an arbitrarily large number of raters to become what is known as Fleiss' generalized kappa. Fleiss' generalized kappa and its large-sample variance are still widely used by researchers and have been implemented in several software packages, including, among o...
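
As a rough illustration of the generalized kappa described in this abstract, the sketch below computes Fleiss' kappa from a subject-by-category count table. It is written from the standard published formulas, not from the article itself; the function name and the assumption of a fixed number of raters per subject are mine.

    def fleiss_kappa(counts):
        """Fleiss' generalized kappa. counts[i][k] is the number of raters who
        classified subject i into category k (same number of raters per subject)."""
        n = len(counts)                      # number of subjects
        m = sum(counts[0])                   # raters per subject
        q = len(counts[0])                   # number of categories
        p_k = [sum(row[k] for row in counts) / (n * m) for k in range(q)]
        P_i = [sum(c * (c - 1) for c in row) / (m * (m - 1)) for row in counts]
        P_bar = sum(P_i) / n                 # mean observed agreement per subject
        P_e = sum(p * p for p in p_k)        # chance agreement from overall category proportions
        return (P_bar - P_e) / (1 - P_e)

    # hypothetical data: 3 subjects, 4 raters, 2 categories
    print(fleiss_kappa([[4, 0], [3, 1], [0, 4]]))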
Krippendorff's alpha coefficient is a statistical measure of the extent of agreement among coders, and is regularly used by researchers in the field of content analysis. This coefficient is known to involve complex calculations, making the evaluation of its sampling variation possible only through resampling methods such as the bootstrap. In this...
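
For readers unfamiliar with the coefficient, the sketch below computes the standard nominal-data form of Krippendorff's alpha from a coincidence matrix, together with a naive bootstrap over units; this is only a generic illustration of the kind of resampling the abstract refers to, not the algorithm the article develops, and the function names are mine.

    from collections import Counter
    import random

    def kripp_alpha_nominal(units):
        """Krippendorff's alpha for nominal data. `units` is a list of units,
        each a list of the codes assigned to that unit (missing codes omitted)."""
        o = Counter()                                   # coincidence matrix
        for vals in units:
            m = len(vals)
            if m < 2:
                continue                                # unpairable unit
            for i, c in enumerate(vals):
                for j, k in enumerate(vals):
                    if i != j:
                        o[(c, k)] += 1.0 / (m - 1)
        n_c = Counter()                                 # category marginals
        for (c, _), w in o.items():
            n_c[c] += w
        n = sum(n_c.values())
        d_o = sum(w for (c, k), w in o.items() if c != k) / n
        d_e = sum(n_c[c] * n_c[k] for c in n_c for k in n_c if c != k) / (n * (n - 1))
        return 1.0 if d_e == 0 else 1.0 - d_o / d_e

    def bootstrap_alpha(units, B=1000, seed=0):
        """Naive bootstrap over units, resampled with replacement."""
        rng = random.Random(seed)
        return [kripp_alpha_nominal([rng.choice(units) for _ in units]) for _ in range(B)]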
This article addresses the problem of testing the difference between two correlated agreement coefficients for statistical significance. A number of authors have proposed methods for testing the difference between two correlated kappa coefficients, which require either the use of resampling methods or the use of advanced statistical modeling techni...
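
The article's own closed-form approach is not reproduced here, but the sketch below shows the kind of resampling solution the passage alludes to: a paired bootstrap over subjects for the difference between two agreement coefficients computed on the same sample, which preserves their correlation. Percent agreement is used as a stand-in coefficient, and the function names and data layout are hypothetical.

    import random

    def percent_agreement(r1, r2):
        # stand-in agreement coefficient; kappa, AC1, etc. could be plugged in instead
        return sum(a == b for a, b in zip(r1, r2)) / len(r1)

    def paired_bootstrap_diff(r1a, r2a, r1b, r2b, B=5000, seed=1):
        """Bootstrap standard error of the difference between two agreement
        coefficients evaluated on the same subjects."""
        n = len(r1a)
        rng = random.Random(seed)
        observed = percent_agreement(r1a, r2a) - percent_agreement(r1b, r2b)
        diffs = []
        for _ in range(B):
            idx = [rng.randrange(n) for _ in range(n)]   # resample subjects with replacement
            diffs.append(percent_agreement([r1a[i] for i in idx], [r2a[i] for i in idx])
                         - percent_agreement([r1b[i] for i in idx], [r2b[i] for i in idx]))
        mean = sum(diffs) / B
        se = (sum((d - mean) ** 2 for d in diffs) / (B - 1)) ** 0.5
        return observed, se        # observed / se can be referred to a standard normal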
Objectives:
The purpose of this study was to compare myocardial blood flow (MBF) and myocardial flow reserve (MFR) estimates from rubidium-82 positron emission tomography (Rb-82 PET) data using 10 software packages (SPs) based on 8 tracer kinetic models.
Background:
It is unknown how MBF and MFR values from existing SPs agree for Rb-82 PET.
M...
Background
Rater agreement is important in clinical research, and Cohen's kappa is a widely used method for assessing inter-rater reliability; however, there are well-documented statistical problems associated with the measure. In order to assess its utility, we evaluated it against Gwet's AC1 and compared the results.
Methods
This study was carri...
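
To make the comparison in this abstract concrete, here is a minimal sketch of both coefficients for two raters on a nominal scale, written from the standard published formulas rather than taken from the study; the function names and data layout are mine.

    def cohens_kappa(r1, r2):
        """Cohen's kappa for two raters; r1 and r2 are equal-length lists of ratings."""
        cats = set(r1) | set(r2)
        n = len(r1)
        pa = sum(a == b for a, b in zip(r1, r2)) / n
        pe = sum((r1.count(c) / n) * (r2.count(c) / n) for c in cats)
        return (pa - pe) / (1 - pe)

    def gwet_ac1(r1, r2):
        """Gwet's AC1 for two raters on a nominal scale."""
        cats = set(r1) | set(r2)
        q = len(cats)
        n = len(r1)
        pa = sum(a == b for a, b in zip(r1, r2)) / n
        # pi[c]: average of the two raters' marginal proportions for category c
        pi = {c: (r1.count(c) + r2.count(c)) / (2 * n) for c in cats}
        pe = sum(p * (1 - p) for p in pi.values()) / (q - 1)
        return (pa - pe) / (1 - pe)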
The notion of intrarater reliability will be of interest to researchers concerned about the reproducibility of clinical measurements. A rater in this context refers to any data-generating system, including individuals and laboratories; intrarater reliability is a metric for a rater's self-consistency in the scoring of subjects. The importance of...
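
One common way to quantify the self-consistency described above is a one-way intraclass correlation on repeated scores produced by the same rater. The sketch below is a generic ICC(1) computation under that assumption, not code from the irrICC package, and the names are mine.

    def icc1(scores):
        """One-way ICC from a list of subjects, each with k repeated scores
        produced by the same rater on separate occasions."""
        n = len(scores)
        k = len(scores[0])
        grand = sum(sum(s) for s in scores) / (n * k)
        means = [sum(s) / k for s in scores]
        # between-subject and within-subject mean squares from a one-way ANOVA
        msb = k * sum((m - grand) ** 2 for m in means) / (n - 1)
        msw = sum((x - m) ** 2 for s, m in zip(scores, means) for x in s) / (n * (k - 1))
        return (msb - msw) / (msb + (k - 1) * msw)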
Pi (π) and kappa (κ) statistics are widely used in the areas of psychiatry and psychological testing to compute the extent of agreement between raters on nominally scaled data. It is a fact that these coefficients occasionally yield unexpected results in situations known as the paradoxes of kappa. This paper explores the origin of these limita...
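
A small hypothetical example (not from the paper) illustrates the paradox mentioned above. Suppose two raters classify 100 subjects as positive or negative, agreeing on 90 negatives, on 0 positives, and disagreeing on the remaining 10 (5 each way). Observed agreement is 0.90, but each rater's marginal proportion of "negative" is 0.95, so chance agreement under kappa is 0.95*0.95 + 0.05*0.05 = 0.905 and kappa = (0.90 - 0.905) / (1 - 0.905) is about -0.05: a negative coefficient despite 90% raw agreement.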
Most inter-rater reliability studies using nominal scales suggest the existence of two populations of inference: the population of subjects (collection of objects or persons to be rated) and that of raters. Consequently, the sampling variance of the inter-rater reliability coefficient can be seen as a result of the combined effect of the sampling o...
The SAS system V.8 implements the computation of unweighted and weighted kappa statistics as an option in the FREQ procedure. A major limitation of this implementation is that the kappa statistic can only be evaluated when the number of raters is limited to 2. Extensions to the case of multiple raters due to Fleiss (1971) have not been implemented...