Kilem Li Gwet

AgreeStat Analytics

PhD

About

18 Publications
68,357 Reads
5,170 Citations
Introduction
K. Gwet is a statistical and quantitative-management consultant specializing in inter-rater reliability and statistical surveys. He has developed the R packages irrCAC and irrICC for evaluating agreement coefficients, as well as two cloud-based apps: (1) agreestat360.com, for analyzing inter-rater agreement data, and (2) AgreeTest (agreestat.net/agreetest/), for testing the difference between two agreement coefficients for statistical significance. He is the author of the "Handbook of Inter-Rater Reliability."
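As a rough sketch of how the irrCAC package mentioned above is typically used, the R snippet below computes Gwet's AC1 from a small subjects-by-raters table of nominal ratings. The data are invented for illustration, and the function name gwet.ac1.raw and the layout of its returned $est component are recalled from the package documentation rather than taken from this page, so they may differ across package versions.

library(irrCAC)

# Hypothetical data: 8 subjects rated by 3 raters on a 3-category nominal scale.
# Rows are subjects, columns are raters; a missing rating would be coded as NA.
ratings <- data.frame(
  rater1 = c("a", "a", "b", "b", "c", "a", "b", "c"),
  rater2 = c("a", "a", "b", "c", "c", "a", "b", "c"),
  rater3 = c("a", "b", "b", "b", "c", "a", "a", "c")
)

# Gwet's AC1 point estimate with its standard error, confidence interval, and p-value
ac1 <- gwet.ac1.raw(ratings)
print(ac1$est)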

Publications (18)
Article
Full-text available
Purpose To estimate the interobserver agreement of the Carimas software package (SP) on global, regional, and segmental levels for the most widely used myocardial perfusion PET tracer—Rb-82. Materials and methods Rest and stress Rb-82 PET scans of 48 patients with suspected or known coronary artery disease (CAD) were analyzed in four centers using...
Article
Full-text available
Purpose To cross-compare three software packages (SPs)—Carimas, FlowQuant, and PMOD—to quantify myocardial perfusion at global, regional, and segmental levels. Materials and Methods Stress N-13 ammonia PET scans of 48 patients with HCM were analyzed in three centers using Carimas, FlowQuant, and PMOD. Values agreed if they had an ICC > 0.75 and a...
Preprint
Full-text available
Cohen's kappa coefficient was originally proposed for two raters only, and was later extended to an arbitrarily large number of raters to become what is known as Fleiss' generalized kappa. Fleiss' generalized kappa and its large-sample variance are still widely used by researchers and were implemented in several software packages, including, among o...
Article
Full-text available
Krippendorff’s alpha coefficient is a statistical measure of the extent of agreement among coders, and is regularly used by researchers in the field of content analysis. This coefficient is known to involve complex calculations, making the evaluation of its sampling variation possible only through resampling methods such as the bootstrap. In this...
Article
Full-text available
This article addresses the problem of testing the difference between two correlated agreement coefficients for statistical significance. A number of authors have proposed methods for testing the difference between two correlated kappa coefficients, which require either the use of resampling methods or the use of advanced statistical modeling techni...
Article
Objectives: The purpose of this study was to compare myocardial blood flow (MBF) and myocardial flow reserve (MFR) estimates from rubidium-82 positron emission tomography (⁸²Rb PET) data using 10 software packages (SPs) based on 8 tracer kinetic models. Background: It is unknown how MBF and MFR values from existing SPs agree for ⁸²Rb PET. M...
Article
Full-text available
Background Rater agreement is important in clinical research, and Cohen’s Kappa is a widely used method for assessing inter-rater reliability; however, there are well documented statistical problems associated with the measure. In order to assess its utility, we evaluated it against Gwet’s AC1 and compared the results. Methods This study was carri...
Chapter
Full-text available
The notion of intrarater reliability will be of interest to researchers concerned about the reproducibility of clinical measurements. A rater in this context refers to any data-generating system, which includes individuals and laboratories; intrarater reliability is a metric for a rater’s self-consistency in the scoring of subjects. The importance of...
Article
Full-text available
Pi (π) and kappa (κ) statistics are widely used in the areas of psychiatry and psychological testing to compute the extent of agreement between raters on nominally scaled data. It is a fact that these coefficients occasionally yield unexpected results in situations known as the paradoxes of kappa. This paper explores the origin of these limita...
Article
Full-text available
Most inter-rater reliability studies using nominal scales suggest the existence of two populations of inference: the population of subjects (collection of objects or persons to be rated) and that of raters. Consequently, the sampling variance of the inter-rater reliability coefficient can be seen as a result of the combined effect of the sampling o...
Article
The SAS system V.8 implements the computation of unweighted and weighted kappa statistics as an option in the FREQ procedure. A major limitation of this implementation is that the kappa statistic can only be evaluated when the number of raters is limited to 2. Extensions to the case of multiple raters due to Fleiss (1971) have not been implemented...
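The two-rater limitation described in this abstract is one reason multi-rater extensions are provided elsewhere. As a hedged illustration, Fleiss' generalized kappa for more than two raters can be obtained in R from the author's irrCAC package; the function name fleiss.kappa.raw is recalled from the package documentation rather than taken from this page, and the data are invented.

library(irrCAC)

# Hypothetical data: 5 subjects each classified by 4 raters into nominal categories 1-3.
# Rows are subjects, columns are raters.
ratings <- data.frame(
  r1 = c(1, 2, 2, 3, 1),
  r2 = c(1, 2, 3, 3, 1),
  r3 = c(2, 2, 2, 3, 1),
  r4 = c(1, 2, 2, 3, 2)
)

# Fleiss' generalized kappa with its standard error and confidence interval
fk <- fleiss.kappa.raw(ratings)
print(fk$est)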
