ONLINE FIRST
INVITED COMMENTARY
Gone Fishing!
On the “Real-World” Accuracy of Computed Tomographic Coronary Angiography
’Tis with our judgments as our watches, none go just alike, yet each believes his own.
Alexander Pope
In this issue of the Archives, Chow and colleagues describe a multicenter “field evaluation” of computed tomographic coronary angiography (CTCA) in 169 patients undergoing conventional coronary angiography (CA) among 594 candidates with suspected coronary artery disease and report that its sensitivity, specificity, and predictive accuracy varied widely from center to center.
There are numerous reasons for this variability. For example, test likelihoods are well known to vary with the severity of disease (the greater the severity, the higher the sensitivity and the lower the specificity) and with the threshold for categorical interpretation (the greater the threshold, the lower the sensitivity and the higher the specificity). Accordingly, if we wish to interpret the particular response in a particular patient, we need to know the sensitivity and specificity of that particular response rather than of some arbitrary spectrum of responses. Also, conventional diagnostic assessment is often highly subjective, even for the verification procedure itself. With respect to CA as a diagnostic standard, for example, a given patient can be considered severely diseased by one observer and entirely normal by another.1
ARCH INTERN MED/ VOL 171 (NO. 11), JUNE 13, 2011 WWW.ARCHINTERNMED.COM
©2011 American Medical Association. All rights reserved.

Most importantly, the preferential referral of positive test responders toward diagnostic verification and negative test responders away from diagnostic verification—albeit readily justified as the exercise of good clinical judgment—results in substantial distortions of observed sensitivity and specificity.2-5 Consider an extreme example. Suppose you have a diagnostic test with a sensitivity of 80% and a specificity of 80%. Suppose further that you refer every patient with a positive test response for diagnostic verification, but you never refer a patient with a negative test response for verification. Because only positive test responders will undergo verification, every diseased patient will have a positive test result (observed sensitivity, 100%), but so will every nondiseased patient (observed specificity, 0%). This phenomenon likely contributed to the low diagnostic yield of elective CA in a recent report.6
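The arithmetic of this extreme example can be checked with a short simulation. The sketch below is illustrative only: the 80%/80% test characteristics come from the example above, while the 30% disease prevalence and sample size are arbitrary assumptions.

```python
import random

# Simulate the extreme referral policy described above: positive responders
# are always referred for verification; negative responders never are.
random.seed(42)
TRUE_SN, TRUE_SP, PREVALENCE = 0.80, 0.80, 0.30  # prevalence is hypothetical

tp = fp = fn = tn = 0
for _ in range(100_000):
    diseased = random.random() < PREVALENCE
    positive = random.random() < (TRUE_SN if diseased else 1 - TRUE_SP)
    if not positive:
        continue  # negative responders are never verified, so FN and TN stay 0
    if diseased:
        tp += 1
    else:
        fp += 1

observed_sn = tp / (tp + fn)  # every verified diseased patient tested positive
observed_sp = tn / (tn + fp)  # every verified nondiseased patient tested positive
print(observed_sn, observed_sp)  # → 1.0 0.0
```

However the true likelihoods are chosen, the observed values are forced to 100% and 0% because the unverified negative responders simply vanish from the 2 × 2 table.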
However, referral for verification depends not only on the test response but also on a variety of concomitant clinical observations with putative diagnostic value.2 Thus, referral for CA among patients undergoing diagnostic testing because of suspected coronary artery disease is influenced directly by the test response itself but is also influenced indirectly by additional factors such as age, sex, the quality and severity of symptoms, and the results of other tests that might have been performed.
Therefore, just as Bayes’ theorem tells us that the predictive accuracy of any test is conditioned on the overall prevalence of disease in the population tested, it also tells us that test performance is conditioned on the overall prevalence of abnormal responses in the patients undergoing testing: those in whom disease status is verified and those in whom it is not. As a result, the observed sensitivity and specificity are conditioned on the magnitude of bias introduced by the process of verification.2,3 Chow and colleagues’ study did not correct for such verification bias, nor did it even assess its magnitude by reporting the proportion of abnormal test results observed in the 425 candidates who were not referred for CA. As a result, the actual sensitivity and specificity of CTCA might be very different from those observed in their study (Figure, A). Moreover, whatever its sensitivity (Sn) and specificity (Sp), there is always some range of prior probability of disease (P) within which any test performs optimally.7 The lower bound of this range is the point below which false-positive responses exceed true-positive responses [(1 − Sp) × (1 − P) > Sn × P], and the upper bound is the point above which false-negative responses exceed true-negative responses [(1 − Sn) × P > Sp × (1 − P)]. Solving each of these inequalities with respect to P, we get:

(1 − Sp)/[Sn + (1 − Sp)] < P < Sp/[(1 − Sn) + Sp]

By inserting the OMCAS patient-level data for 50% stenosis into this expression, the optimal range of prior probability extends from 0.10 to 0.86. Thus, when prior probability is less than 0.10, a positive response is more likely to be a false positive than a true positive, and when prior
[Figure: two panels, each with axes running from 0 to 1.0. A, Test Likelihood (sensitivity and specificity curves) vs Prevalence of Positive Responders; B, Prior Probability (upper and lower bounds) vs Prevalence of Positive Responders.]

Figure. Potential variability in the performance of computed tomographic coronary angiography. A, Relationship between sensitivity and specificity (test likelihood) vs the magnitude of verification bias (the unobserved prevalence of positive responders among the entire 594-candidate population in OMCAS). The sensitivity and specificity values (adjusted for verification bias) are calculated from the raw “patient-based ≥50% stenosis” data in Table 3 of the OMCAS paper (1) using a previously published computer algorithm based on Bayes’ theorem3:

Adjusted Sensitivity = PPA × p(R)/[PPA × p(R) + NPA × (1 − p(R))],

and

Adjusted Specificity = 1 − (1 − PPA) × p(R)/[(1 − PPA) × p(R) + (1 − NPA) × (1 − p(R))],

where p(R) is the overall prevalence of positive test responders (total positive test results/total patients tested), PPA is the positive predictive accuracy (true-positive test results/total positive test results), and NPA is the negative predictive accuracy (false-negative test results/total negative test results). Sensitivity is directly related and specificity is inversely related to the magnitude of verification bias.4 B, The upper and lower bounds for appropriate test use (based on sensitivity, specificity, and prior probability of disease) as a function of the magnitude of verification bias (the unobserved prevalence of positive responders among the entire 594-candidate population in OMCAS). The lower bound is the point below which false-positive responses exceed true-positive responses, and the upper bound is the point above which false-negative responses exceed true-negative responses. Only within the intermediate range defined by these bounds are all test responses more likely to be true than false.
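The legend’s adjustment can be written out directly. The following is a minimal sketch using the legend’s own definitions of p(R), PPA, and NPA; the input values are purely hypothetical, not OMCAS data.

```python
def adjusted_sensitivity(ppa, npa, p_r):
    """TP/(TP + FN) recomputed over ALL tested patients, with
    TP = PPA * p(R) and FN = NPA * (1 - p(R)), per the figure legend."""
    return ppa * p_r / (ppa * p_r + npa * (1 - p_r))

def adjusted_specificity(ppa, npa, p_r):
    """1 - FP/(FP + TN) recomputed over ALL tested patients, with
    FP = (1 - PPA) * p(R) and TN = (1 - NPA) * (1 - p(R))."""
    fp = (1 - ppa) * p_r
    tn = (1 - npa) * (1 - p_r)
    return 1 - fp / (fp + tn)

# Hypothetical inputs: PPA = 0.8, NPA = 0.2, p(R) = 0.5
print(round(adjusted_sensitivity(0.8, 0.2, 0.5), 6))  # → 0.8
print(round(adjusted_specificity(0.8, 0.2, 0.5), 6))  # → 0.8
```

Note that the adjustment needs only the predictive accuracies from the verified subgroup plus p(R), the prevalence of positive responders among everyone tested, which is precisely the quantity the study did not report.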
probability is greater than 0.86, a negative response is more likely to be a false negative than a true negative. Only within the intermediate range defined by this expression are all test responses more likely to be true than false. This range thereby provides a rational standard for appropriate test use. Depending on the magnitude of verification bias (the prevalence of positive test responders in the candidate population), however, the lower bound could be as high as 0.32 and the upper bound could be as low as 0.65 (Figure, B).
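The bounds derived above reduce to two lines of arithmetic. In the sketch below the sensitivity and specificity are hypothetical stand-ins, not the OMCAS estimates.

```python
def optimal_prior_range(sn, sp):
    """Range of prior probability P within which every test response is
    more likely true than false (the inequalities solved in the text)."""
    lower = (1 - sp) / (sn + (1 - sp))  # below this prior, FP exceed TP
    upper = sp / ((1 - sn) + sp)        # above this prior, FN exceed TN
    return lower, upper

# Hypothetical test with Sn = 0.85 and Sp = 0.90:
lo, hi = optimal_prior_range(0.85, 0.90)
print(round(lo, 3), round(hi, 3))  # → 0.105 0.857
```

Because observed Sn and Sp shift with verification bias, the bounds shift with it, which is why the appropriate-use range in the Figure narrows as the unobserved prevalence of positive responders changes.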
Practicing clinicians will have only limited interest in these technical issues and will more likely want to know precisely if and when to use this test as an alternative to initial medical management, cardiovascular stress testing, coronary calcium screening, or invasive CA. Many pragmatic considerations that go far beyond the mundane measurement of sensitivity and specificity (such as logistical availability, local expertise, and financial self-interest) will likely play a greater role than comparative effectiveness research in making this choice. As a result, we expect the optimal role of CTCA to evolve in unpredictable and contentious ways over the next several years.
In the meantime, we would be well advised to more openly acknowledge the “real-world” limitations of every diagnostic test, whether a simple historical question or a sophisticated technical procedure. British astrophysicist Sir Arthur Eddington8 highlighted these limitations by way of an engaging parable:
Let us suppose that an ichthyologist is exploring the life of the ocean. He casts a net into the water and brings up a fishy assortment. Surveying his catch, he [concludes that no] sea-creature is less than two inches long....

An onlooker may object that the generalization is wrong. “There are plenty of sea-creatures under two inches long, only your net is not adapted to catch them.” The ichthyologist dismisses this objection contemptuously: “Anything uncatchable by my net is ipso facto outside the scope of ichthyological knowledge, and is not part of the kingdom of fishes which has been defined as the theme of ichthyological knowledge. In short, what my net can’t catch isn’t fish”....

Suppose that a more tactful onlooker makes a rather different suggestion: “I realize that you are right in refusing our friend’s hypothesis of uncatchable fish, which cannot be verified by any tests you and I would consider valid. By keeping to your own method of study, you have reached a generalization of the highest importance—to fishmongers, who would not be interested in generalizations about uncatchable fish. Since these generalizations are so important, I would like to help you. You arrived at your generalization in the traditional way by examining the fish. May I point out that you could have arrived more easily at the same generalization by examining the net and the method of using it?”
Eddington’s “more tactful onlooker” personifies an entirely new breed of specialist, the clinical epistemologist (from the Greek ἐπιστήμη, referring to the nature and scope of knowledge), whose role is to observe the observers, police the police, and explain to the rest of us just how we know what we know.9 We wish them well.
Published Online: March 14, 2011. doi:10.1001/archinternmed.2011.75

Author Affiliations: Division of Cardiology, Cedars-Sinai Medical Center, and the Department of Medicine, David Geffen School of Medicine at UCLA, University of California, Los Angeles.

Correspondence: Dr Diamond, Division of Cardiology, Cedars-Sinai Medical Center, 2408 Wild Oak Dr, Los Angeles, CA 90068 (gadiamond@pol.net).

Financial Disclosure: None reported.
1. Zir LM, Miller SW, Dinsmore RE, Gilbert JP, Harthorne JW. Interobserver variability in coronary angiography. Circulation. 1976;53(4):627-632.
2. Begg CB, Greenes RA. Assessment of diagnostic tests when disease verification is subject to selection bias. Biometrics. 1983;39(1):207-215.
3. Diamond GA. Reverend Bayes’ silent majority: an alternative factor affecting sensitivity and specificity of exercise electrocardiography. Am J Cardiol. 1986;57(13):1175-1180.
4. Diamond GA. How accurate is SPECT thallium scintigraphy? J Am Coll Cardiol. 1990;16(4):1017-1021.
5. Diamond GA. Affirmative actions: can the discriminant accuracy of a test be determined in the face of selection bias? Med Decis Making. 1991;11(1):48-56.
6. Patel MR, Peterson ED, Dai D, et al. Low diagnostic yield of elective coronary angiography. N Engl J Med. 2010;362(10):886-895.
7. Diamond GA, Denton TA, Berman DS, Cohen I. Prior restraint: a Bayesian perspective on the optimization of technology utilization for diagnosis of coronary artery disease. Am J Cardiol. 1995;76(1):82-86.
8. Eddington A. The Philosophy of Physical Science. Cambridge, England: Cambridge University Press; 1949:16-19.
9. Diamond GA, Kaul S. What the tortoise said to Achilles. Am J Cardiol. 2010;106(4):593-595.
George A. Diamond, MD
Sanjay Kaul, MD