Medical Decision Making
http://mdm.sagepub.com/

Deceiving Numbers: Survival Rates and Their Impact on Doctors' Risk Communication
Odette Wegwarth, Wolfgang Gaissmaier and Gerd Gigerenzer
Med Decis Making 2011 31: 386, originally published online 29 December 2010
DOI: 10.1177/0272989X10391469

The online version of this article can be found at:
http://mdm.sagepub.com/content/31/3/386

Published by: http://www.sagepublications.com
On behalf of: Society for Medical Decision Making
Downloaded from mdm.sagepub.com at Max Planck Institut on July 13, 2011
Deceiving Numbers: Survival Rates and Their
Impact on Doctors’ Risk Communication
Odette Wegwarth, PhD, Wolfgang Gaissmaier, PhD, Gerd Gigerenzer, PhD
Background. Increased 5-y survival for screened patients
is often inferred to mean that fewer patients die of cancer.
However, due to several biases, the 5-y survival rate is
a misleading metric for evaluating a screening’s effective-
ness. If physicians are not aware of these issues, informed
screening counseling cannot take place. Methods. Two
questionnaire versions (‘‘group’’ and ‘‘time’’) presented 4
conditions: 5-y survival (5Y), 5-y survival and annual
disease-specific mortality (5YM), annual disease-specific
mortality (M), and 5-y survival, annual disease-specific
mortality, and incidence (5YMI). Questionnaire version
‘‘time’’ presented data as a comparison between 2 time
points and version ‘‘group’’ as a comparison between
a screened and an unscreened group. All data were based
on statistics for the same cancer site (prostate). Outcome
variables were the recommendation of screening, reason-
ing behind recommendation, judgment of the screening’s
effectiveness, and, if judged effective, a numerical esti-
mate of how many fewer people out of 1000 would die if
screened regularly. After randomized allocation, 65 Ger-
man physicians in internal medicine and its subspecial-
ities completed either of the 2 questionnaire versions.
Results. Across both versions, 66% of the physicians
recommended screening when presented with 5Y, but
only 8% of the same physicians made the recommenda-
tion when presented with M (5YM: 31%; 5YMI: 55%).
Also, 5Y made considerably more physicians (78%) judge
the screening to be effective than any other condition
(5YM: 31%; M: 5%; 5YMI: 49%) and led to the highest
overestimations of benefit. Conclusion. A large number of
physicians erroneously based their screening recommen-
dation and judgment of screening’s effectiveness on the
5-y survival rate. Results show that reporting disease-
specific mortality rates can offer a simple solution to phy-
sicians’ confusion about the real effect of screening. Key
words: decision rules; risk communication or risk percep-
tion, shared decision making, health literacy, numeracy.
(Med Decis Making 2011;31:386–394)
According to the concept of shared medical deci-
sion making, the technical knowledge of risks
and benefits of medical interventions is held by the
physician, who then shares this knowledge with
patients to enable them to decide according to
their preferences.[1]
If physicians do not have this
knowledge, which involves understanding health
statistics, effective risk communication and shared
decision making cannot take place. Numerous
studies found that the format in which numeric
health data are presented can generate divergent
interpretations amongst physicians—a phenomenon
that has been largely explained by the fact that some
formats yield larger numbers than others and thus
are more persuasive.[2–7]
In the past, researchers
mainly focused on the persuasiveness of relative
risk reduction compared with absolute risk reduction
and number needed to treat.[4,7,8]
Another per-
suasive format, however, might be the 5-y survival
rate.
The 5-y survival rate is probably the most com-
mon statistic used to report the progress in treating
cancer. Improvements in 5-y survival are often con-
sidered an unambiguous sign of success: If patients
who receive screening tests or examinations tend to
live longer than those who do not, society’s enor-
mous investments in early detection and improved
treatments must be paying off. However, although
the 5-y survival rate is a valid measure for compar-
ing cancer therapies in randomized trials, it is not
valid for comparing differently diagnosed groups
(e.g., survival before versus after the introduction of
screening; survival of unscreened versus screened
people). In fact, changes in 5-y survival over time
and groups were found to be completely unrelated
to changes in cancer mortality for the 20 most common
solid tumors (r = .00) in the United States.[3,9]

Received 7 December 2009 from the Max Planck Institute for Human
Development, Harding Center for Risk Literacy, Berlin, Germany.
Revision accepted for publication 9 September 2010.
Address correspondence to Odette Wegwarth, Max Planck Institute
for Human Development, Lentzeallee 94, 14195 Berlin, Germany;
telephone: +49-30-82406-695; fax: +49-30-82406-394; e-mail:
wegwarth@mpib-berlin.mpg.de.
MEDICAL DECISION MAKING/MAY–JUN 2011

To understand why, it is helpful to look at how
the 5-y survival statistic is calculated in the context
of screening:
$$
\text{Disease-specific 5-year survival rate} =
\frac{\text{number of persons diagnosed with a specific cancer still alive 5 years after diagnosis}}
     {\text{number of persons diagnosed with a specific cancer in the study population}}.
$$
In this calculation, the key term to notice is diag-
nosed, which appears in the numerator and denomi-
nator of the survival statistic of a specific cancer.
Cancer can be diagnosed by either symptoms or
screening. By definition, screening detects cancer
before it causes symptoms. Because of this property,
screening can bias 5-y survival rates in 2 ways: 1) by
prolonging the period in which patients are known
to have cancer and 2) by including people with non-
progressive cancer in the statistic. The first, called
lead-time bias, accounts for the fact that screening
may only reduce the time to diagnosis without
increasing the time to death. This prolonged period
of being diagnosed makes patients attending screen-
ing more likely to enter the 5-y survival statistic. Yet
this may have no bearing on real prolonged or saved
lives.[10]
The second phenomenon, called length-time
bias or overdiagnosis bias, concerns the detection of
lesions that meet the pathological definition of can-
cer yet never become clinically significant due to
their prolonged preclinical phase or their lack of
propensity to progress.[10]
The inclusion of nonpro-
gressive and slowly progressive cancer inflates the
5-y survival statistic in 2 ways: First, it inflates the
number of persons diagnosed (incidence) with a spe-
cific cancer in the study population (the denomina-
tor of the survival equation). Second, it inflates the
number of diagnosed persons still alive 5 y after
diagnosis (the numerator of the equation), because
people with slowly progressive and nonprogressive
cancer have a better prognosis than people with
aggressive cancer and are thus more likely to survive
the next 5 years.
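
Lead-time bias can be illustrated with a minimal sketch. The cohort below is hypothetical and not drawn from the study data: every patient dies at age 70, symptoms lead to diagnosis at age 67, and screening is assumed to move diagnosis forward to age 60 without changing the age at death. The function name `five_year_survival` and all ages are illustrative assumptions.

```python
def five_year_survival(ages_at_diagnosis, ages_at_death):
    """Share of diagnosed patients still alive 5 years after diagnosis."""
    alive = sum(
        1
        for diagnosed, died in zip(ages_at_diagnosis, ages_at_death)
        if died - diagnosed > 5
    )
    return alive / len(ages_at_diagnosis)


# Hypothetical cohort: every patient dies at age 70, screened or not.
ages_at_death = [70] * 1000

# Diagnosis by symptoms at 67 versus diagnosis by screening at 60 (assumed).
diagnosed_by_symptoms = [67] * 1000
diagnosed_by_screening = [60] * 1000

survival_unscreened = five_year_survival(diagnosed_by_symptoms, ages_at_death)
survival_screened = five_year_survival(diagnosed_by_screening, ages_at_death)

print(survival_unscreened, survival_screened)
```

In this constructed case the 5-y survival rate rises from 0% to 100% although no patient lives a single day longer, which is the pattern described above as lead-time bias.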
Because 5-y survival rates do not allow reliable
judgments on improved cancer control due to these
biases, physicians are advised to use disease-specific
mortality rates for a specific cancer.[11]
This
statistic does not depend on diagnostic procedures
and therefore is not prone to screening-induced
biases.
$$
\text{Annual disease-specific mortality} =
\frac{\text{number of persons who die from a specific cancer over 1 year}}
     {\text{number of persons in the study population}}.
$$
Nonetheless, changes in survival rates over time[9]
and across differently diagnosed groups[12] are still
reported in the relevant medical journals. What do
doctors conclude from these?
The present study is, to the best of our knowledge,
the first to investigate how different measures used to
report progress against cancer influence physicians’
recommendations of screening and judgments of its
effectiveness. A literature search of PubMed, MEDLINE,
and ISI Web of Knowledge using layers of subject
headings combining 5-year survival as a constant
and the subjects of doctors’ understanding, clinicians’
understanding, physicians’ understanding,
risk communication, bias, and success against cancer
did not reveal any study in this vein. We assessed
how 5-y survival rates, mortality rates, and 2 combinations
of 5-y survival rates, mortality rates, and incidence,
which were varied over time or group,
would alter physicians’ recommendations of screening
as well as their judgments of its effectiveness.
Furthermore, we investigated physicians’ knowledge
about lead-time bias and length-time bias. This study
was exploratory in nature.
METHOD
Materials
We developed 2 versions of a survival survey: ver-
sion ‘‘group’’ and version ‘‘time.’’ Although other-
wise alike, version ‘‘group’’ always presented data
as a comparison between a screened and an
unscreened group and version ‘‘time’’ as a compari-
son between 1975 and 2004. Each version rested on
a repeated-measures design and comprised 4 condi-
tions: Condition 5Y provided data on 5-y survival,
condition 5YM on 5-y survival and annual disease-
specific mortality, condition M on annual disease-
specific mortality, and condition 5YMI on 5-y
survival, annual disease-specific mortality, and inci-
dence (see Web Appendix 1).
To mask the fact that data shown in the 4 condi-
tions and in both versions always referred to the
potential effect of screening for the same cancer site
(prostate), screenings and tumors were labeled with
capital letters. The survey face sheet stated that all
screenings mentioned in the following conditions
were noninvasive, detected tumors for which sev-
eral treatment options exist, and were first used rou-
tinely at the beginning of the 1990s. Each of the 4
conditions was preceded by an introduction of a 55-
y-old healthy patient seeking advice from the doctor
on whether to be screened for tumor A, B, or C and
so forth.
Four outcomes were measured: recommendation
of screening (yes, no, I can’t decide), reasoning
behind recommendation (open answer format),
judgment of screening’s effectiveness (yes, no), and,
if judged effective, a numerical estimate of how
many fewer people out of 1000 would die if they
were regularly screened. After the 4 conditions, phy-
sicians were further asked if they knew what lead-
time bias and length-time bias are and, if so, to
explain each of these.
To investigate with as little bias as possible what
doctors would conclude from the commonly reported
5-y survival statistic, condition 5Y was always first,
followed by the other 3 conditions in the listed order.
Combined conditions aimed at investigating whether
more information fosters a detection of lead-time
(condition 5YM) and length-time bias (condition
5YMI). Both survey versions were piloted by 6 physi-
cians and refined in response.
Rationale for Data Source
Data on 5-y survival rates, disease-specific
mortality rates, and incidence presented in the 4
conditions were drawn from the Surveillance, Epidemiology,
and End Results (SEER) program for prostate cancer
and for the time points 1975 and 1999.[13]
When our study was conducted in late 2008, no
reliable data for a comparison between a screened
and an unscreened group had yet been released
(results from 2 randomized controlled trials on the
effect of screening for prostate cancer were not
released until March 2009[14,15]). Hence, numbers
shown in version ‘‘group’’ were also based on
the SEER database. To make conditions appear
independent from each other, we marginally modi-
fied original SEER data from condition to condition
in the 2 versions. Yet compared with the 20 most
common cancer sites, the temporal increase of 5-y
survival and incidence for prostate cancer ranked
among the highest.[9]
This raised concerns that
results of our study might not be representative of
other cancer sites. We thus decided to halve all
numbers reported in version ‘‘time’’ when used in
version ‘‘group.’’ For instance, when an increase of
31 percentage points was reported for condition 5Y
in version ‘‘time,’’ an increase of 16 percentage
points was reported for condition 5Y in version
‘‘group’’ (for details, see Web Appendix 2).
Sample and Procedure
Sixty-eight German physicians in internal medi-
cine and its subspecialities participated in the
study. Three physicians were excluded from analy-
sis because they did not provide answers throughout
all of the 4 conditions for recommendation of
screening and judgment of screening’s effectiveness,
which was required for the within-subject compari-
sons. Of the final 65 physicians, 34 physicians
completed the version ‘‘group’’ and 31 the version
‘‘time.’’ A breakdown of the physicians’ characteristics
is shown in Table 1. Characteristics of respondents
receiving the 2 versions did not differ with
respect to the position, the work environment, years
since graduation, or age (all 95% confidence intervals
included 0).

Table 1 Comparison of the Characteristics of the 2 Groups of Physicians

Characteristic                            Version ‘‘Group’’ (n = 34)   Version ‘‘Time’’ (n = 31)   Total (N = 65)
Position, n (%)
  Junior physician                        16 (48)                      14 (45)                     30 (46)
  Senior physician                        9 (26)                       7 (23)                      16 (25)
  Head physician                          9 (26)                       10 (32)                     19 (29)
Work environment, n (%)
  Private practice                        4 (11)                       4 (13)                      8 (12)
  Hospital                                18 (54)                      20 (64)                     38 (59)
  Research hospital                       12 (35)                      7 (23)                      19 (29)
Years since graduation, range (mean, SD)  1–35 (12.5, 9.4)             1–38 (15.2, 10.6)           1–38 (13.8, 10)
Age, range (mean, SD)                     27–65 (40.2, 9.8)            27–66 (44.8, 10.1)          27–66 (42.4, 10.1)

Responses were obtained through personal overtures
by one author (n = 41) and through approaching
physicians during further educational training
(n = 24). The response rate was 98% for personal
overtures and 65% for physicians approached during
training.
When physicians agreed to participate, they were
randomly allocated to one of the versions and were
tested either at their work site or at a separate place
within the facility where their training was
being held. After giving informed consent, physi-
cians were instructed to work individually,
complete all questions following each of the condi-
tions, and not to return to conditions that had
already been completed. No payment was made for
participation.
Analysis
All data were stored and analyzed with SPSS 16.
If a response was lacking for any outcome, except
recommendation of screening and judgment of
screening’s effectiveness, it was coded as a miss.
For the coding of reasoning behind recommenda-
tion (qualitative data), 2 independent raters first
reviewed all reasons to establish categories that
would cover these reasons. After the review, the
raters discussed and agreed on the categories.
Next, each rater independently assigned physicians’
responses to the categories. The interrater reliability
was r = 0.91. Explanations of lead-time bias and
length-time bias were coded as either correct or
incorrect depending on whether they resembled the
epidemiological definition.[10]
Because the study was
exploratory in nature, all outcomes were analyzed
on a descriptive level.
RESULTS
Screening Recommendation
What would an informed pattern of recommenda-
tion look like? The 5-y survival statistic alone does
not allow an unbiased judgment of the benefit of
screening. Thus, if physicians are aware of this
problem, they should choose either ‘‘no’’ or ‘‘I can’t
decide’’ in condition 5Y. The other 3 conditions pro-
vided information on disease-specific mortality,
which allows an evaluation of the effect. In accor-
dance with the SEER results, we presented
mortality data that showed a minimal increase
instead of a decrease for the screening group or the
later time point. Thus, if physicians knew that they
are advised to look at mortality data, one would
expect them to choose ‘‘no’’ in these conditions.
Therefore, an informed pattern of recommendation
over the 4 conditions should be the following: I
can’t decide/no, no, no, no. One could argue, how-
ever, that physicians could have been knowledge-
able about which statistic is best to look at but
responded to the questions according to the
practice of defensive medicine—a reaction to the
unpredictabilities of the legal system in which
a physician can be sued for doing too little but
rarely for doing too much.[3,16]
In such a case, the
minimal increase in mortality could be viewed as
negligible and thus, for defensive reasons, the
screening as recommendable. In this case, a defen-
sive yet informed pattern of recommendation over
the 4 conditions would be the following: I can’t
decide/no, yes, yes, yes.
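
The two reference patterns just described can be written as a small decision rule. The sketch below is illustrative only and is not the coding procedure used in the study; the function name `classify_pattern` and the response labels `"yes"`, `"no"`, and `"undecided"` are assumptions introduced here.

```python
# Order in which the conditions were presented to each physician.
CONDITIONS = ("5Y", "5YM", "M", "5YMI")


def classify_pattern(responses):
    """Classify one physician's 4 recommendations, given in CONDITIONS order.

    Each response is "yes", "no", or "undecided" (hypothetical labels).
    """
    if len(responses) != len(CONDITIONS):
        raise ValueError("expected one response per condition")
    first, rest = responses[0], tuple(responses[1:])
    # Both reference patterns require "no" or "I can't decide" for 5Y alone.
    if first not in ("no", "undecided"):
        return "other"
    if rest == ("no", "no", "no"):
        return "informed"
    if rest == ("yes", "yes", "yes"):
        return "defensive yet informed"
    return "other"


print(classify_pattern(("undecided", "no", "no", "no")))
print(classify_pattern(("yes", "yes", "no", "yes")))
```

Any pattern that starts with ‘‘yes’’ in condition 5Y falls outside both reference patterns, regardless of the later responses.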
Figure 1 shows the actual pattern of recom-
mendations for each physician across the 4
conditions—ordered by the number of ‘‘yes’’
choices—and for both versions. As can be seen, the
2 survey versions yielded roughly similar response
patterns. No more than 4 physicians in version
‘‘group’’ (n= 34) and 1 physician in version ‘‘time’’
(n= 31) showed the informed pattern. The defen-
sive yet informed pattern was not found with any
of the physicians. Instead, in both versions, the pat-
terns suggest that many physicians focused on the
changes in 5-y survival. All conditions that
included the 5-y survival statistic (5Y, 5YM, 5YMI)
prompted considerably more ‘‘yes’’ choices than
the condition presenting only disease-specific mor-
tality (M). Forty-three of 65 physicians recom-
mended screening when presented with only 5-y
survival data. In contrast, only 5 of the same physi-
cians recommended screening when presented
with only disease-specific mortality data. The com-
bined conditions, intended to foster physicians’
insight by making them aware of lead-time bias and
length-time bias, only partly attenuated the mis-
leading effect of the included 5-y survival rate:
Twenty of the physicians recommended screening
when presented with condition 5YM, and 36 of
them did so when later presented with condition
5YMI. One might conjecture that due to the
repeated measurement design, physicians would
show carryover effects. However, of all 65 physi-
cians, only 6 always gave the same response
throughout the 4 conditions.
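
The counts above map directly onto the percentages reported in the abstract. As a quick arithmetic check, the following sketch converts the number of physicians recommending screening in each condition into a rounded percentage of the 65 participants; the variable names are illustrative.

```python
N_PHYSICIANS = 65

# Number of physicians recommending screening, as reported in the text.
recommended = {"5Y": 43, "5YM": 20, "M": 5, "5YMI": 36}

# Rounded percentage of all 65 physicians per condition.
percent = {
    condition: round(100 * count / N_PHYSICIANS)
    for condition, count in recommended.items()
}

for condition, value in percent.items():
    print(f"{condition}: {value}%")
```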
Reasons for Recommendation
What reasons did physicians give for their recom-
mendations? For both versions, the most frequent
reason for a recommendation in favor of the screen-
ing was the increase in the 5-y survival rate over
time or groups. Often, the physicians described this
increase as ‘‘meaningful,’’ ‘‘clinically significant,’’
or ‘‘exemplifying the merits of early detection.’’ For
condition 5YMI only, another prominent reason was
what we call the incidence-mortality fallacy. Here,
a physician would infer from an increased 5-y sur-
vival rate, plus increased incidence and a stable
mortality rate, that screening must be effective, not
realizing that more screening inflated the incidence
and thereby the 5-y survival rate. For instance, one
physician wrote on his sheet, ‘‘5-year survival better
for screened group, mortality between groups
equates, yet incidence is higher in screened group,
thus fewer people die in the screened group.’’ How-
ever, unless tumor biology (aggressiveness of the
tumor) was suddenly to change, the number of peo-
ple who developed the disease (incidence) would
not be expected to influence the 5-y survival rate,
that is, the prognosis of an individual case. Thus, if
incidence and 5-y survival rates increase at the same
time while mortality rates remain unchanged,
increased incidence may reflect changes in clinical
detection practice rather than changes in true
occurrence.
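
A worked example may help to show why rising incidence and rising 5-y survival can coexist with unchanged mortality. All numbers below are hypothetical and chosen only for easy arithmetic; they are not SEER data. The sketch assumes that screening additionally detects 2,000 nonprogressive cancers, none of which leads to death within 5 years, while the number of cancer deaths per year stays the same.

```python
def survival_rate(alive_after_5y, diagnosed):
    """Disease-specific 5-y survival: survivors among the diagnosed."""
    return alive_after_5y / diagnosed


def annual_mortality(deaths_per_year, population):
    """Annual disease-specific mortality: deaths among the whole population."""
    return deaths_per_year / population


# Hypothetical population and death count; screening changes neither.
POPULATION = 100_000
DEATHS_PER_YEAR = 600

# Without screening: 1,000 diagnoses by symptoms, 400 alive after 5 years.
survival_unscreened = survival_rate(400, 1_000)

# With screening: the same 1,000 progressive cancers plus 2,000
# nonprogressive cancers (assumed), all of whom survive 5 years.
survival_screened = survival_rate(400 + 2_000, 1_000 + 2_000)

mortality_unscreened = annual_mortality(DEATHS_PER_YEAR, POPULATION)
mortality_screened = annual_mortality(DEATHS_PER_YEAR, POPULATION)

print(f"5-y survival: {survival_unscreened:.0%} vs {survival_screened:.0%}")
print(f"mortality unchanged: {mortality_unscreened == mortality_screened}")
```

In this constructed case incidence triples and 5-y survival doubles from 40% to 80%, yet the same number of people die, so inferring fewer deaths from the combination of higher incidence and higher survival would be the incidence-mortality fallacy.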
Recommendations against screening were mainly
triggered by physicians focusing on the mortality
rate and the subsequent impression that the benefit
is either negative or does not exist. For the occasions
in which physicians could not decide whether to
recommend screening, no prominent main reason
was detected. Table 2 gives an overview of the rea-
soning for each condition.
Judgment of Screening’s Effectiveness
Five-year survival rates affected the judgment of
screening’s effectiveness in the same way as for rec-
ommendation. Across versions, 51 of the 65 physi-
cians judged the screening effective when presented
with 5-y survival rates only. When presented with
the next condition providing 5-y survival and mor-
tality data together, 31 physicians changed their
minds—now, 20 of 65 considered the screening
effective. Shown the third condition, which
presented only mortality data, the number of physicians
who still judged the screening effective
reduced to 3. When given information on 5-y survival,
mortality, and incidence data together in the
last condition, the number of physicians judging the
screening effective rose again to 32. Figure 2 shows
the influence of the different statistics on the physicians’
judgments by version, which again are
roughly comparable.

[Figure 1: grid of each physician’s recommendation (yes, undecided, no) in conditions 5Y, 5YM, M, and 5YMI; panels: Version ‘‘Group’’ (physicians 1–34), Version ‘‘Time’’ (physicians 1–31), Informed pattern, Defensive yet informed pattern.]
Figure 1 Screening recommendations of 34 physicians receiving information for a comparison between a screened and an unscreened
group (version ‘‘group’’) and of 31 physicians receiving information for a comparison between the time points 1975 and 2004 (version
‘‘time’’). Conditions: 5Y = 5-y survival; 5YM = 5-y survival and annual disease-specific mortality; M = annual disease-specific mortality;
and 5YMI = 5-y survival, annual disease-specific mortality, and incidence.

Numerical Estimate of Screening’s Effectiveness
When a physician judged the screening to be
effective, he or she was subsequently asked how
many fewer people would die out of 1000 if they
were regularly screened. The change in the disease-
specific mortality rate over the 2 time points that we
drew from the SEER results suggested that the
answer is 0; no man has been saved from dying from
prostate cancer since the introduction of screening.
Thus, the best answer to the question of how many
fewer would die is 0. These results were based on
temporal changes, and one could argue that such
changes cannot be exclusively attributed to the
effect of screening. Results released later from 2 randomized
controlled trials on the effects of prostate-specific
antigen screening showed results somewhat comparable
to the temporal data, however.[14,15]
As Table 3 shows, 0 was not the answer that the
physicians arrived at when confronted with the 3
conditions that included 5-y survival rates. At the
median level, within these conditions, physicians
expected between 13 and 150 fewer deaths in 1000
screened people. Because numbers presented in the
conditions of version ‘‘group’’ were only halves of
those presented in version ‘‘time,’’ overestimations
for the former were smaller. In both versions, condi-
tion 5Y led to the highest overestimations of the
effectiveness and also to the largest variation among
estimates. The combined 5YM and 5YMI conditions
generated smaller variation and estimates; neverthe-
less, compared with the real reduction of cancer
mortality, they still yielded high overestimations.
Because 62 of 65 physicians in condition M had
already correctly judged the screening to be ineffective
in advance, only 3 physicians were asked for
a numerical estimate. The 1 physician who finally
gave an estimate would have been correct if we had
reported a decrease instead of an increase in
disease-specific mortality.

Table 2 Physicians’ Reasons for Each of the Recommendation Options (Yes, No, I Can’t Decide),
by Condition
5Y 5YM M 5YMI
Yes No / Yes No / Yes No / Yes No /
Significant increase in survival 43 18 16
No benefit in survival 17 8 39 9 11 5
Incidence-mortality fallacy 19
Therapy v. screening 4 7 2 5 1 3
Data inconclusive/insufficient 4 4 1 3 1 1 2 3
Possible improved quality of life 1 1 3 8 3
Lead-time bias 3 2
Poor therapy 2
Missing value 1 2 1 3 1 1 1 1
Note: Numbers express the frequency of mention. A slash indicates ‘‘I can’t decide.’’ Therapy v. screening indicates feeling unable to judge whether temporal
changes of data are due to the success of screening or therapy. 5Y = 5-y survival; 5YM = 5-y survival and annual disease-specific mortality;
M = annual disease-specific mortality; 5YMI = 5-y survival, annual disease-specific mortality, and incidence.

Figure 2 Number of physicians who assumed the screening to
be effective, shown by condition and for the 2 versions (group: n = 34;
time: n = 31). Conditions 5Y and 5YMI led the majority to
assume the screening to be effective.

Knowledge of Lead-Time Bias and
Length-Time Bias
After physicians had worked through the condi-
tions, they were asked if they knew about the lead-
time bias and the length-time bias and, if so, were
asked to give an explanation of each. Fifty-four of
the 65 physicians did not know what the lead-time
bias was. Of the remaining 11 physicians who indi-
cated they did know, only 2 explained the bias cor-
rectly. With only 1 exception, no physician knew
what the length-time bias was. However, when
asked to explain the bias, the 1 physician did not
explain it correctly either.
DISCUSSION
Our exploratory study assessed how 5-y survival
rates, mortality rates, and 2 combinations of 5-y sur-
vival rates, mortality rates, and incidence—varied
over time or group—would alter physicians’ recom-
mendations of cancer screening and their judgment
of its effectiveness. To recapitulate, across both sur-
vey versions, only 5 of 65 physicians showed
informed recommendation patterns. More than two-
thirds of the physicians erroneously based their
screening recommendations on changes in 5-y sur-
vival rates over time or group. Furthermore, the
majority judged the screening to be effective and
overestimated this effectiveness by up to 150 fewer
deaths in the screened group when presented with
5-y survival rates. Knowledge of the 5-y survival-
related biases in the context of screening evaluation
was scarce to nonexistent. Only a few physicians
felt that they required information other than sur-
vival rates to make a recommendation and to decide
whether the screening would be effective. Combined
conditions, intended to foster a detection of lead-
time bias (condition 5YM) and length-time bias
(condition 5YMI), only partly attenuated the mis-
leading effect of the included 5-y survival rates.
Results further showed that disease-specific mortal-
ity enabled physicians to judge the effectiveness of
the screening correctly. The 2 versions of the ques-
tionnaire yielded comparable results.
An open question concerns how physicians
arrived at the numerical estimates of the effective-
ness of screening. Across conditions, between 10%
and 50% of the physicians seemed to have based
their calculations on changes in 5-y survival rates
over time or group. In a few further instances, physi-
cians used the corresponding 5-y survival rate of
2004 or of the screened group, respectively. How-
ever, for most of the estimates, it was impossible to
reconstruct how physicians made their calculations.
A better understanding of how physicians based
their estimates of effectiveness on the 5-y
survival rate—alone and in combination with other
statistics—would help to reveal the nature of the
confusion in more detail.
Table 3 Physicians’ Numerical Estimates of the Effectiveness of Screening by Condition and Version

                        Condition
Version                 5Y        5YM        M        5YMI
Group
  Median estimate       30        13         —        14
  Range                 2–500     2–200      —        2–100
  Total n (misses)      23 (5)    7 (2)      0 (1)    9 (5)
Time
  Median estimate       150       25         .003     30
  Range                 2–980     0.01–250            1–500
  Total n (misses)      21 (2)    7 (4)      1 (1)    13 (5)
Note: Estimates were based on the question of how many fewer people would die out of 1000 if they regularly attended the screening. Only physicians
who indicated upfront that they judged the screening to be effective were asked to provide a quantitative estimate.

Strengths and Limitations of the Study
Our study was based on a repeated-measures
design. Such a design has the potential shortcoming
of carryover effects that might prevent participants
from changing their responses from one condition to
the other. However, across both versions, only 6 of
the 65 physicians adhered to their initial recommen-
dation, not changing it in the following conditions.
Also, of the 51 physicians who judged the screening
to be effective in condition 5Y, only 3 still did so in
condition M, and many changed their judgment
again when presented with condition 5YMI.
A limitation of the present study is that we used a convenience sample of 65 physicians in internal medicine. Thus, we do not know how generalizable our results are. A study with a representative sample would be needed to investigate whether the findings hold true for a wider range of physicians. At the same time, the study involved physicians at various levels of the hierarchy from a variety of work environments. In addition, the results are clear-cut, so we assume fair generalizability of our findings.
Another limitation might be seen in the fact that we told physicians in the introduction to each condition that the presented data were obtained from a randomized controlled trial (RCT; see the web appendices). Although RCTs produce the best available evidence on which to base medical actions, they do not necessarily make a wrong statistic in a wrong context right. At the beginning of our article, we made clear that 5-y survival rates are a valid measure for comparing cancer therapies in RCTs but not for comparing differently diagnosed groups (screening v. symptoms) in or outside of RCTs, due to lead-time bias and overdiagnosis bias. In addition, it might be argued that physicians used the data presented uncritically, overlooking the fact that the trial was on screening and not on therapies. However, we used words relating to screening (e.g., screened group, unscreened group) 4 times within each condition, but in none did we include any words relating to therapy or treatment.
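
Lead-time bias can be made concrete with a small worked example. The Python sketch below assumes a hypothetical cohort in which earlier diagnosis does not change when patients die; the ages, cohort size, and function name are illustrative assumptions, not data from the study.

```python
def five_year_survival_pct(ages_at_diagnosis, ages_at_death):
    """Percentage of patients who live at least 5 years after their diagnosis."""
    survivors = sum(
        1
        for diagnosis, death in zip(ages_at_diagnosis, ages_at_death)
        if death - diagnosis >= 5
    )
    return 100 * survivors // len(ages_at_diagnosis)


# Hypothetical cohort: every patient dies of the cancer at age 70,
# whether the cancer was found through symptoms or through screening.
ages_at_death = [70] * 100
symptom_detected = [67] * 100  # diagnosed at 67 because of symptoms
screen_detected = [60] * 100   # diagnosed at 60 because of screening

survival_symptoms = five_year_survival_pct(symptom_detected, ages_at_death)
survival_screening = five_year_survival_pct(screen_detected, ages_at_death)

print(survival_symptoms)   # 0
print(survival_screening)  # 100
```

In this assumed cohort, 5-y survival rises from 0% to 100% only because the clock starts earlier; no patient lives a day longer.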
Implications for Policy, Practice, and Medical Training
It is important to us that our work is not misconstrued: This article is not meant to suggest that there has been no real progress in cancer care. Instead, we want to highlight that survival rates have the potential to distort physicians’ perception of screening’s effectiveness and therefore wrongly influence their recommendations. Our findings lead to the important issue of which metrics should be presented, and in what context, in the medical literature. Policies exist, such as CONSORT (http://www.consort-statement.org/), that recommend the reporting of changes in mortality for screening evaluation, and an increasing number of medical journals are now subscribing to such policies. However, these policies appear not always to be enforced, and thus the reporting of survival rates in the context of screening is still common [12]. We believe that the implementation and enforcement of such policies merit serious consideration. Equally important, better training of physicians in understanding health statistics at medical schools may lessen their vulnerability to being misled.
For clinicians who wish to correctly inform their patients about the true benefit of screening, it is essential to learn that improving 5-y survival rates over time or across differently diagnosed groups may not reflect a reduced disease burden and should not be taken as evidence of improved prevention, screening, or therapy. Improved survival rates may instead reflect more cases being diagnosed and unchanged mortality, as suggested, for instance, for prostate cancer [14,15]. In contrast, mortality is a clear-cut number that decreases with improvement in cancer control, be it through successful early detection or better treatment. A health system that expects physicians to correctly inform their patients about medical interventions should thus encourage the reporting of mortality rates in medical journals and medical training.
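
The point that survival rates can improve while mortality stays unchanged can be shown with simple arithmetic. The following Python sketch uses hypothetical case counts (assumptions for illustration, not study data) in which screening additionally detects nonprogressive cancers that would never have caused death.

```python
def five_year_survival_pct(diagnosed, alive_after_5y):
    """5-y survival: share of diagnosed patients alive 5 years later, in percent."""
    return round(100 * alive_after_5y / diagnosed)


# Hypothetical figures (assumptions):
progressive_cases = 1000   # cancers that are diagnosed with or without screening
alive_progressive = 440    # of these, alive 5 years after diagnosis
overdiagnosed_cases = 2000 # nonprogressive cancers found only by screening, all alive

# Without screening, only progressive cancers are diagnosed.
survival_unscreened = five_year_survival_pct(progressive_cases, alive_progressive)
deaths_unscreened = progressive_cases - alive_progressive

# With screening, the overdiagnosed cases enter both numerator and denominator.
diagnosed_screened = progressive_cases + overdiagnosed_cases
alive_screened = alive_progressive + overdiagnosed_cases
survival_screened = five_year_survival_pct(diagnosed_screened, alive_screened)
deaths_screened = diagnosed_screened - alive_screened

print(survival_unscreened, deaths_unscreened)  # 44 560
print(survival_screened, deaths_screened)      # 81 560
```

Under these assumptions, 5-y survival rises from 44% to 81% although exactly the same number of people die, which is why the number of deaths, and not the survival rate, carries the information about benefit.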
Contributors: All authors declare that they participated in conceptualizing the study, analyzing the data, and writing the article. All authors had full access to all of the data (including statistical reports and tables) in the study and can take responsibility for the integrity of the data and the accuracy of the data analysis.
Conflicts of interest: None declared.
Ethical approval: The Ethics Committee of the Max Planck Institute for Human Development approved the study, and all participants consented to participation at the beginning of the survey.
Funding/support: This study was funded by the Harding Center for Risk Literacy at the Max Planck Institute for Human Development (Germany). The authors declare independence from these funding agencies.
Role of the funding source: The funding source did not affect the study design, data collection, analysis and interpretation of the data, writing of the report, or the decision to submit the article for publication.
REFERENCES
1. Charles CA, Gafni A, Whelan T. Shared decision-making in the medical encounter: what does it mean? (or, It takes at least two to tango). Soc Sci Med. 1997;44:681–92.
2. Mühlhauser I, Kasper J, Meyer G. FEND: understanding of diabetes prevention studies: questionnaire survey of professionals in diabetes care. Diabetologia. 2006;49:1742–6.
3. Gigerenzer G, Gaissmaier W, Kurz-Milcke E, Schwartz LM, Woloshin S. Helping doctors and patients to make sense of health statistics. Psychol Sci Public Interest. 2007;8:53–96.
4. Covey J. A meta-analysis of the effects of presenting treatment benefits in different formats. Med Decis Making. 2007;27:638–54.
5. Ghosh AK, Ghosh K. Translating evidence-based information into effective risk communication: current challenges and opportunities. J Lab Clin Med. 2005;145:171–80.
6. Fahey T, Griffiths S, Peters TJ. Evidence based purchasing: understanding results of clinical trials and systematic reviews. Br Med J. 1995;311:1056–9.
7. Naylor CD, Chen E, Strauss B. Measured enthusiasm: does the method of reporting trial results alter perceptions of therapeutic effectiveness? Ann Intern Med. 1992;117:916–21.
8. Hembroff LA, Holmes-Rovner M, Wills CE. Treatment decision-making and the form of risk communication: results of a factorial survey. BMC Med Inform Decis Mak. 2004;4:20.
9. Welch HG, Schwartz LM, Woloshin S. Are increasing 5-year survival rates evidence of success against cancer? JAMA. 2000;283:2975–8.
10. Gordis L. Epidemiology. 4th ed. Philadelphia: Saunders; 2008.
11. Extramural Committee to Assess Measures of Progress Against Cancer. Measurement of progress against cancer. J Natl Cancer Inst. 1990;82:825–35.
12. Henschke C, International Early Lung Cancer Action Program Investigators. Survival of patients with stage I lung cancer detected on CT screening. N Engl J Med. 2006;355:1763–71.
13. Ries LAG, Harkins D, Krapcho M, et al. SEER Cancer Statistics Review, 1975–2003 (National Cancer Institute). 2005 [updated 2005; cited March 26, 2008]. Available from: http://seer.cancer.gov/csr/1975_2003/
14. Andriole GL, Crawford ED, Grubb RL III, et al. Mortality results from a randomized prostate-cancer screening trial. N Engl J Med. 2009;360:1310–9.
15. Schröder FH, Hugosson J, Roobol MJ, et al. Screening and prostate-cancer mortality in a randomized European study. N Engl J Med. 2009;360:1320–8.
16. Steurer J, Held U, Schmidt M, Gigerenzer G, Tag B, Bachman LM. Legal concerns trigger PSA testing. J Eval Clin Pract. 2009;15:390–2.
... Unlike survival rates, however, mortality is a solid measure of cancer control success that avoids screening-induced bias because it does not depend on a time point of diagnosis. [5][6][7] The effectiveness of cancer screening is ultimately assessed by the reduction in mortality at the population level. 8 The problem is that, contrary to the expectation of a negative relationship between the 5-y survival rates and mortality, there is no correlation between the increase in 5-y survival rates and the change in mortality. ...
... Previous studies have demonstrated that mortality could be a solution to confusion about the actual effect of cancer screening among physicians. 7 Given the emphasis on shared decision making, 11,12 coupled with the public desire to make informed treatment decisions, [13][14][15][16][17][18] it is now imperative to examine how adult laypeople understand and perceive health statistics. ...
Article
Background Mortality is critical information in evaluating the benefits of cancer screening. However, 5-y survival rates and incidence, without mortality, have been frequently communicated to the public. Based on the literature that people’s perceptions and judgments can be altered by the way of presenting health statistics, the current study examined whether mortality alongside 5-y survival and incidence would influence laypeople’s perceptions of the effectiveness of cancer screening and screening intention. Methods In an online-based experimental survey conducted in South Korea in October 2022, 300 adults were randomly assigned to 1 of 2 groups (mortality: no v. yes) to be presented with 3 different cancers (A, B, and C). The perceived effectiveness of cancer screening and screening intention were measured using 7-point scales for each cancer. Results Across all cancers, participants in the no-mortality group perceived cancer screening to be more effective and were more willing to undergo screening compared with those in the mortality group, although the results were not statistically significant on the intention. Conclusions In general, mortality had an effect of decreasing the perceived effectiveness of cancer screening and screening intention compared with no mortality, although the effect on the intention was not statistically significant. Implications When communicating the benefits of cancer screening to the public, mortality statistics may play a role in mitigating the potentially inflated perception of the benefits of cancer screening and screening intention. Highlights Five-year survival rates, either alone or with incidence rates, are frequently communicated to the public in the context of the benefits of cancer screening. However, 5-y survival rates can sometimes be inflated without a reduction in mortality. Including mortality statistics in communications decreased the perceived effectiveness of cancer screening and screening intentions. 
Mortality information needs to be communicated in the benefits of cancer screening.
... (1) to examine PCPs' selfreported practices and point of view concerning prostate cancer screening with PSA tests, and (2) to assess older patients' points of view regarding PSA testing. 20 Self-reported practices and attitudes of PCP with regard to the use of PSA test + older patients point of view regarding PSA testing, ...
Book
INTRODUCTION 5 -- 2 UPDATE OF THE PREVIOUS REPORT 6 -- 2.1 METHODOLOGY 6 -- 2.1.1 Literature search 6 -- 2.1.2 Quality appraisal 6 -- 2.1.3 Data extraction 6 -- 2.2 RESULTS 7 -- 2.2.1 Systematic reviews 7 -- 2.2.2 Randomized controlled trials 7 -- 2.3 DISCUSSION 8 -- 2.4 KEY MESSAGES 9 -- 3 RISK COMMUNICATION AND SHARED DECISION MAKING 9 -- 3.1 INTRODUCTION 9 -- 3.2 RISK UNDERSTANDING AND COMMUNICATION 10 -- 3.2.1 Introduction 10 -- 3.2.2 Methodology 10 -- 3.2.3 Patients’ understanding of risk statistics 10 -- 3.2.4 Physicians’ understanding of risk statistics 10 -- 3.2.5 How to improve risk understanding? 11 -- 3.3 FROM INFORMED DECISION MAKING TOWARDS SHARED DECISION MAKING 14 -- 3.3.1 Introduction 14 -- 3.3.2 Methodology 14 -- 3.3.3 What is “shared decision making” and what is the aim of SDM? 14 -- 3.3.4 Shared decision making in PSA screening 15 -- 3.3.5 Barriers and the success factors to the implementation of shared decision making 15 -- 3.3.6 Effectiveness of interventions to improve SDM 16 -- 4 QUANTIFICATION OF THE BENEFIT AND HARMS OF THE SCREENING 16 -- 4.1 INTRODUCTION 16 -- 4.2 METHODOLOGY 16 4.3 RESULTS 17 -- 4.3.1 Burden of prostate cancer in the age-group 55-69 years 17 -- 4.3.2 Screening related benefit 18 -- 4.3.3 Screening related harms 19 -- 4.3.4 Treatment related harms 19 -- 5 ELABORATION OF A TOOL TO SUPPORT SDM 21 -- 5.1 INTRODUCTION 21 -- 5.2 METHODOLOGY 21 -- 5.2.1 First step 21 -- 5.2.2 Second step 22 -- 5.2.3 Fourth step 26 -- 5.3 RESULTS 31 -- 5.3.1 Part dedicated to practitioners 31 -- 5.3.2 Part to be discussed between patient and practitioner 32 -- APPENDIX 33 -- APPENDIX 1 UPDATE OF THE PREVIOUS REPORT 33 -- APPENDIX 1.1 REVIEW OF CLINICAL STUDIES 33 -- APPENDIX 1.2 SEARCH FOR SR AND MA 34 -- APPENDIX 1.4 QUALITY APPRAISAL 35 -- APPENDIX 1.5 DATA EXTRACTION TABLE 38 -- APPENDIX 3 RISK COMMUNICATION AND SHARED DECISION MAKING 43 -- APPENDIX 3.1 SEARCH STRATEGIES 43 -- APPENDIX 3.2 SELECTED STUDIES 44 -- APPENDIX 4 ELABORATION OF A 
TOOL 55 -- APPENDIX 4.1 INTERVIEW GUIDE IN-DEPTH DISCUSSIONS WITH GENERAL PRACTITIONERS- ACCEPTABILITY AND COMPREHENSION TEST 55 -- APPENDIX 4.2RESULTS ACCEPTABILITY AND COMPREHENSION TEST 62 -- APPENDIX 5 USABILITY OF THE TOOL 63 -- APPENDIX 5.1 PATIENT FORM TO BE FILLED OUT DURING THE USABILITY TEST OF SDM TOOL (FOURTH STEP). 63 -- APPENDIX 5.2 INTERVIEW GUIDE IN-DEPTH DISCUSSIONS WITH GENERAL PRACTITIONERS – CLOSING INTERVIEW 67 -- APPENDIX 5.3 ANALYSIS OF PATIENTS FORMS 72 -- BIBLIOGRAPHY 73
... A major challenge in health communication for lay information seekers, patients as well as health professionals is to get the numbers right. Difficulties understanding health statistics and risks have long been demonstrated and remain a prevailing concern (Jenny et al., 2018;Multmeier et al., 2014;Wegwarth et al., 2011). In a recent German survey, the interpretation of statistical information was the main issue for health professionals (Schaeffer et al., 2023). ...
Article
Full-text available
Static graphs of statistics are established visual aids in risk communication and decision support. Interactive information visualisations (InfoVis) and reflective tasks are supposed to enhance active processing, but the evidence is scarce and mixed. This mixed-methods research investigated the effectiveness and user experience of InfoVis and tasks in the context of mammography screening. In a web-based experiment prospective invitees of the screening program (N = 338; aged 30-49) tried a pre-tested web-based decision-aid with risk information either as text, static graph, or InfoVis with or without reflective tasks. The main outcomes were informed choice and risk knowledge, the latter operationalised according to the fuzzy-trace-theory. The accompanying qualitative evaluation with seven participants applied think-aloud protocols and focused interviews. There was no experimental evidence that InfoVis support risk knowledge or informed choice better than text or static graphs. There were even minor detrimental effects. The qualitative results showed problems with the InfoVis presenting risk of overdiagnosis, and negative reactions towards the- tasks. InfoVis processing was easy when the underlying concept was easy. While reflective tasks seem not advisable in this target group, limited and well-considered application of InfoVis with a low cognitive load can be an alternative, attention-directing visual aid format.
... A proven error in answering such questions is the so-called lead-time bias. To assess the effectiveness of screening programs against cancer, physicians use survival rates (Wegwarth et al. 2011). They do this by comparing the survival of a group of patients who received an earlier diagnosis through screening with that of another group of patients who did not receive a diagnosis until the onset of symptoms at a later point in time. ...
Article
Statistics are used in the police, for example, to show changes in crime trends over time and to provide evidence for the effectiveness of crime prevention measures. Statistical literacy is the ability to understand and draw sound conclusions from statistical data; therefore, it contributes to professional competencies in police work. In the present study, the handling of statistics and probabilities by police commissioner trainees is examined in more detail, focusing on risk literacy. Using police-related scenarios, it is examined whether biases can occur in the assessment of probabilities and risks, as is the case in other professions. Both the conjunction fallacy and the base rate neglect could be demonstrated and are discussed in terms of their relevance for training and police work in general.
... Our study has relevant implications for the debate on the suitable measures to condemn future pandemic scenarios (or other risk scenarios). First, communicating risks remains prone to misunderstandings and false inferences (Gigerenzer, Wegwarth, & Feufel, 2010;Wegwarth, Gaissmaier, & Gigerenzer, 2011). Following along the famous phrase from George Box that "all models are wrong, but some are useful', our results suggest that the dominance of specific expertise of opinion leaders in the public discourse seems to be unwarranted. ...
Article
The coronavirus disease 2019 pandemic has underscored the importance of scientific knowledge and highlighted the challenge for politicians: They had to rely on expert advice and still had to make decisions under uncertainty due to the lack of long-term health data. This article investigates how expert judgments and expert advice affect the choices between programs that are proposed to combat the outbreak of a viral disease by means of a between-subjects design embedded in a survey. We use the classic Asian disease experiment and extend earlier applications by varying the professional background of the experts (virologists vs. social scientists) within the experimental set-up. We use data from a university wide web-survey to show the persistence of framing effects and that the disciplinary background of the expert is not related to individual decision-making under risk.
... Indirect approaches via examinations of variations in prevalence of procedures, prescriptions and intensity of care, however, suggest that high-income countries face high rates of overuse across a wide range of services and prescriptions. 1 2 Overuse can detrimentally impact patients' health, both physically and psychologically, and strain the healthcare system by squandering resources and funds that could be more effectively allocated elsewhere. Past research indicates that physicians' level of medical risk literacy, [3][4][5][6][7][8][9][10] and, as a variant of risk literacy, their numeracy, [11][12][13] can considerably influence their recommendations and decisions. Medical risk literacy refers to the cognitive ability to understand and interpret numerical statistical information (eg, relative vs absolute risk) related to medical interventions. ...
Article
Background Overuse of medical care is a pervasive problem. Studies using hypothetical scenarios suggest that physicians’ risk literacy influences medical decisions; real-world correlations, however, are lacking. We sought to determine the association between physicians’ risk literacy and their real-world prescriptions of potentially hazardous drugs, accounting for conflicts of interest and perceptions of benefit–harm ratios in low-value prescribing scenarios. Setting and sample Cross-sectional study—conducted online between June and October 2023 via field panels of Sermo (Hamburg, Germany)—with a convenience sample of 304 English general practitioners (GPs). Methods GPs’ survey responses on their treatment-related risk literacy, conflicts of interest and perceptions of the benefit–harm ratio in low-value prescribing scenarios were matched to their UK National Health Service records of prescribing volumes for antibiotics, opioids, gabapentin and benzodiazepines and analysed for differences. Results 204 GPs (67.1%) worked in practices with ≥6 practising GPs and 226 (76.0%) reported 10–39 years of experience. Compared with GPs demonstrating low risk literacy, GPs with high literacy prescribed fewer opioids (mean (M ) : 60.60 vs 43.88 prescribed volumes/1000 patients/6 months, p=0.016), less gabapentin (M: 23.84 vs 18.34 prescribed volumes/1000 patients/6 months, p=0.023), and fewer benzodiazepines (M: 17.23 vs 13.58 prescribed volumes/1000 patients/6 months, p=0.037), but comparable volumes of antibiotics (M: 48.84 vs 40.61 prescribed volumes/1000 patients/6 months, p=0.076). High-risk literacy was associated with lower conflicts of interest (ϕ = 0.12, p=0.031) and higher perception of harms outweighing benefits in low-value prescribing scenarios (p=0.007). Conflicts of interest and benefit–harm perceptions were not independently associated with prescribing behaviour (all ps >0.05). 
Conclusions and relevance The observed association between GPs with higher risk literacy and the prescription of fewer hazardous drugs suggests the importance of risk literacy in enhancing patient safety and quality of care.
... For many people, the expression "survival rate" incorrectly relates only to the probability of survival. The cancer survival rate is an unclear metric that should be replaced by the mortality rate (Gigerenzer et al., 2007;Wegwarth, Gaissmaier, & Gigerenzer, 2011;Wegwarth et al., 2012). ...
Article
Full-text available
In order to make an informed, evidence-based decision, it is vital to recognize numbers, statistics and concepts that are apparently transparent but are not always adequately accounted for. Health professionals, patients, and others frequently misinterpret numbers, statistics and concepts in healthcare. Among the many repercussions of health literacy, appropriate decision-making and the reduction in the number of interventions and treatments stand out, resulting in an improvement in people's health and a decrease in overtreatment and health expenses. This study intends to evaluate how properly health professionals and the public, in general, comprehend and interpret some health-related numbers. To accomplish this goal, the researchers shared a questionnaire made available online in Portugal from January 2, 2019, until April 12, 2019. The final sample comprised 485 respondents; 154 physicians, 142 nurses, and 189 people from other professions. The findings suggest that there is a problem with widespread numerical illiteracy, which should not be the case, highlighting the need to improve the numerical and statistical health literacy of both health professionals and the general population. So, medical professionals and patients must thus comprehend the statistics and health-related concepts to obtain the proper consent.
Chapter
Full-text available
Disease Interception is based on the idea of stopping the development of diseases before the clinical manifestation by providing a targeted (medical) intervention. The concept is located between the categories of prevention and treatment of illness and addresses a new group of people: those who are no longer completely healthy, but who are not yet ill in the traditional sense implying a lack of symptoms or functional impairment. The articles focus on the framework needed for the development and application of Disease Interception, in particular the use of health data and digitalization, but also on the patients' perspective, the ethical implications, and the law of statutory health insurance. With contributions by Dr. Léon Beyer | Dr. Sarah Diner | Dr. Martin Danner | Dr. Anke Diehl, M.A.| Prof. Dr. Klaus Gerwert | Dr. Joschka Haltaufderheide | Prof. Dr. Stefan Huster | Prof. Dr. Thomas Jäschke | Prof. Dr. Alexandra Jorzig | Franz Knieps | Prof. Dr. Robert Ranisch | Dr. Nils Krochmann | Prof. Dr. Frank Stollmann | Prof. Dr. Jochen A. Werner | Lara Wiese | Dr. Silvia Woskowski, LL.M.
Article
Obwohl statistische Informationen allgegenwärtig sind und in verschiedensten Kontexten eingesetzt werden, kann nicht davon ausgegangen werden, dass die Adressaten diese korrekt einschätzen können – es fehlt an „statistical literacy“. Diese umfasst Fähigkeiten der Interpretation, Bewertung und Reflexion statistischer Aussagen, welche im Zusammenspiel mit kognitiven und dispositionellen Komponenten stehen. Frühere Studien zeigten bereits, dass Ärzte nicht fähig waren, Testergebnisse anhand statistischer Informationen korrekt einzuschätzen. Hinzu kommen verzerrte und/oder intransparente Darstellungen in medizinischer Berichterstattung. Am Beispiel des Mammographie-Screenings zeigt diese Arbeit, dass der Zweck von Statistiken, welcher in der objektiven Darstellung empirischer Daten liegt, nur bedingt erfüllt werden kann. Eine informierte Entscheidungsfindung vor allem auf der Seite der PatientInnen ist in diesem Sinne unmöglich. Die konzipierten Lösungsansätze könnten künftig zu größerer „statistical literarcy“ im Gesundheitswesen führen.
Article
Full-text available
The European Randomized Study of Screening for Prostate Cancer was initiated in the early 1990s to evaluate the effect of screening with prostate-specific-antigen (PSA) testing on death rates from prostate cancer. We identified 182,000 men between the ages of 50 and 74 years through registries in seven European countries for inclusion in our study. The men were randomly assigned to a group that was offered PSA screening at an average of once every 4 years or to a control group that did not receive such screening. The predefined core age group for this study included 162,243 men between the ages of 55 and 69 years. The primary outcome was the rate of death from prostate cancer. Mortality follow-up was identical for the two study groups and ended on December 31, 2006. In the screening group, 82% of men accepted at least one offer of screening. During a median follow-up of 9 years, the cumulative incidence of prostate cancer was 8.2% in the screening group and 4.8% in the control group. The rate ratio for death from prostate cancer in the screening group, as compared with the control group, was 0.80 (95% confidence interval [CI], 0.65 to 0.98; adjusted P=0.04). The absolute risk difference was 0.71 death per 1000 men. This means that 1410 men would need to be screened and 48 additional cases of prostate cancer would need to be treated to prevent one death from prostate cancer. The analysis of men who were actually screened during the first round (excluding subjects with noncompliance) provided a rate ratio for death from prostate cancer of 0.73 (95% CI, 0.56 to 0.90). PSA-based screening reduced the rate of death from prostate cancer by 20% but was associated with a high risk of overdiagnosis. (Current Controlled Trials number, ISRCTN49127736.)
Article
Full-text available
The effect of screening with prostate-specific-antigen (PSA) testing and digital rectal examination on the rate of death from prostate cancer is unknown. This is the first report from the Prostate, Lung, Colorectal, and Ovarian (PLCO) Cancer Screening Trial on prostate-cancer mortality. From 1993 through 2001, we randomly assigned 76,693 men at 10 U.S. study centers to receive either annual screening (38,343 subjects) or usual care as the control (38,350 subjects). Men in the screening group were offered annual PSA testing for 6 years and digital rectal examination for 4 years. The subjects and health care providers received the results and decided on the type of follow-up evaluation. Usual care sometimes included screening, as some organizations have recommended. The numbers of all cancers and deaths and causes of death were ascertained. In the screening group, rates of compliance were 85% for PSA testing and 86% for digital rectal examination. Rates of screening in the control group increased from 40% in the first year to 52% in the sixth year for PSA testing and ranged from 41 to 46% for digital rectal examination. After 7 years of follow-up, the incidence of prostate cancer per 10,000 person-years was 116 (2820 cancers) in the screening group and 95 (2322 cancers) in the control group (rate ratio, 1.22; 95% confidence interval [CI], 1.16 to 1.29). The incidence of death per 10,000 person-years was 2.0 (50 deaths) in the screening group and 1.7 (44 deaths) in the control group (rate ratio, 1.13; 95% CI, 0.75 to 1.70). The data at 10 years were 67% complete and consistent with these overall findings. After 7 to 10 years of follow-up, the rate of death from prostate cancer was very low and did not differ significantly between the two study groups. (ClinicalTrials.gov number, NCT00002540.)
Article
Full-text available
To assess whether the way in which the results of a randomised controlled trial and a systematic review are presented influences health policy decisions. A postal questionnaire to all members of a health authority within one regional health authority. Anglia and Oxford regional health authorities. 182 executive and non-executive members of 13 health authorities, family health services authorities, or health commissions. The average score from all health authority members in terms of their willingness to fund a mammography programme or cardiac rehabilitation programme according to four different ways of presenting the same results of research evidence--namely, as a relative risk reduction, absolute risk reduction, proportion of event free patients, or as the number of patients needed to be treated to prevent an adverse event. The willingness to fund either programme was significantly influenced by the way in which data were presented. Results of both programmes when expressed as relative risk reductions produced significantly higher scores when compared with other methods (P < 0.05). The difference was more extreme for mammography, for which the outcome condition is rarer. The method of reporting trial results has a considerable influence on the health policy decisions made by health authority members.
Article
The questions of the extent of progress against cancer and how to measure it have stimulated attention among scientists and in Congress. In this presentation, we summarize a report requested by the Senate Appropriations Committee to address the adequacy of the existing measures of progress against cancer. The report was prepared by an extramural committee convened by the National Cancer Institute at the request of Congress. It includes extensive findings and recommendations on the existing measures of progress against cancer, the systems used to develop the data reported through the measures, the frequency and content of reports addressing progress, and the need for analytic research on this topic. Although the Extramural Committee To Assess Measures of Progress Against Cancer found the measures and systems to be generally adequate, they also found that modification or expansion of the information base is needed in many areas. [J Natl Cancer Inst 82: 825–835, 1990]
Article
▪ Objective: To compare clinicians' ratings of therapeutic effectiveness when different trial end points were presented as percent reductions in relative compared with absolute risk and as numbers of patients treated to avoid one adverse outcome. ▪ Design: Survey, with random allocation of two questionnaires. ▪ Setting: Toronto teaching hospitals. ▪ Respondents: Convenience sample of 100 faculty and housestaff in internal medicine and family medicine. ▪ Intervention: One questionnaire presented results for three end points of the Helsinki Heart Study as separate drug trials using only absolute differences in events; the other showed the same end points as relative differences. Both questionnaires included a fourth "trial," showing person-years of treatment needed to prevent one myocardial infarction. ▪ Main Outcome Measure: The "trials" were each rated on an 11-point scale, from treatment "harmful" to "very effective." ▪ Results: Respondents' ratings of effectiveness varied with the end point. Controlling for end point, ratings of effectiveness by the 50 participants receiving absolute event data were lower than those by 50 participants responding to relative risk reductions (P < 0.001); however, no end-point difference was more than 0.6 scale points. For a "trial" reporting that 77 persons were treated for 5 years to prevent one myocardial infarction, mean ratings were 2.3 or 1.8 scale points lower, respectively (both P < 0.001), than when the same data were shown as relative or absolute risk reductions. ▪ Conclusions: Clinicians' views of drug therapies are affected by the common use of relative risk reductions in both trial reports and advertisements, by end-point emphasis, and, above all, by underuse of summary measures that relate treatment burden to therapeutic yields in a clinically relevant manner.
Article
Shared decision-making is increasingly advocated as an ideal model of treatment decision-making in the medical encounter. To date, the concept has been rather poorly and loosely defined. This paper attempts to provide greater conceptual clarity about shared treatment decision-making, identify some key characteristics of this model, and discuss measurement issues. The particular decision-making context that we focus on is potentially life-threatening illnesses, where there are important decisions to be made at key points in the disease process, and several treatment options exist with different possible outcomes and substantial uncertainty. We suggest as key characteristics of shared decision-making (1) that at least two participants, physician and patient, be involved; (2) that both parties share information; (3) that both parties take steps to build a consensus about the preferred treatment; and (4) that an agreement is reached on the treatment to implement. Some challenges to measuring shared decision-making are discussed as well as potential benefits of a shared decision-making model for both physicians and patients.
Article
Many doctors, patients, journalists, and politicians alike do not understand what health statistics mean or draw wrong conclusions without noticing. Collective statistical illiteracy refers to the widespread inability to understand the meaning of numbers. For instance, many citizens are unaware that higher survival rates with cancer screening do not imply longer life, or that the statement that mammography screening reduces the risk of dying from breast cancer by 25% in fact means that 1 less woman out of 1,000 will die of the disease. We provide evidence that statistical illiteracy (a) is common to patients, journalists, and physicians; (b) is created by nontransparent framing of information that is sometimes an unintentional result of lack of understanding but can also be a result of intentional efforts to manipulate or persuade people; and (c) can have serious consequences for health. The causes of statistical illiteracy should not be attributed to cognitive biases alone, but to the emotional nature of the doctor–patient relationship and conflicts of interest in the healthcare system. The classic doctor–patient relation is based on (the physician's) paternalism and (the patient's) trust in authority, which make statistical literacy seem unnecessary; so does the traditional combination of determinism (physicians who seek causes, not chances) and the illusion of certainty (patients who seek certainty when there is none). We show that information pamphlets, Web sites, leaflets distributed to doctors by the pharmaceutical industry, and even medical journals often report evidence in nontransparent forms that suggest big benefits of featured interventions and small harms. Without understanding the numbers involved, the public is susceptible to political and commercial manipulation of their anxieties and hopes, which undermines the goals of informed consent and shared decision making. What can be done? 
We discuss the importance of teaching statistical thinking and transparent representations in primary and secondary education as well as in medical school. Yet this requires familiarizing children early on with the concept of probability and teaching statistical literacy as the art of solving real-world problems rather than applying formulas to toy problems about coins and dice. A major precondition for statistical literacy is transparent risk communication. We recommend using frequency statements instead of single-event probabilities, absolute risks instead of relative risks, mortality rates instead of survival rates, and natural frequencies instead of conditional probabilities. Psychological research on transparent visual and numerical forms of risk communication, as well as training of physicians in their use, is called for. Statistical literacy is a necessary precondition for an educated citizenship in a technological democracy. Understanding risks and asking critical questions can also shape the emotional climate in a society so that hopes and anxieties are no longer as easily manipulated from outside and citizens can develop a better-informed and more relaxed attitude toward their health.
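The gap between a relative and an absolute risk statement, as in the mammography example above, can be sketched in a few lines. The baseline figures (4 deaths per 1,000 without screening and 3 per 1,000 with screening) are assumptions chosen only because they reproduce the 25% and "1 out of 1,000" figures quoted in the abstract; the abstract itself does not state them, and the function name is hypothetical.

```python
from fractions import Fraction


def risk_reductions(deaths_unscreened: int, deaths_screened: int, per: int = 1000):
    """Return (absolute, relative) risk reduction as exact fractions."""
    risk_unscreened = Fraction(deaths_unscreened, per)
    risk_screened = Fraction(deaths_screened, per)
    absolute = risk_unscreened - risk_screened
    relative = absolute / risk_unscreened
    return absolute, relative


# Assumed illustrative baseline: 4 vs. 3 deaths per 1,000 women (not stated in the abstract).
arr, rrr = risk_reductions(4, 3)

print(f"Relative risk reduction: {float(rrr):.0%}")
print(f"Absolute risk reduction: {arr * 1000} fewer death(s) per 1,000 women")
```

Under these assumed numbers the two statements describe the same data, which is why the abstract recommends reporting absolute risks rather than relative risks.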
Article
In the United States, lawsuits against physicians have had an impact on their behaviour, resulting in overdiagnosis and other forms of 'defensive medicine'. Does a similar situation exist in Switzerland? Using prostate-specific antigen (PSA) screening as an example, we surveyed Swiss physicians and assessed the extent to which liability fears influenced their recommendation for testing. At a continuing medical education conference we distributed a pilot-tested questionnaire to 552 participants. Two hundred and fifty of them (45%) completed the questionnaire. Of the participants, 158 (68%) were general practitioners and 73 (32%) specialists in internal medicine. Seventy-five per cent of both groups recommend regular PSA screening to men older than age 50. Yet only 56% of the general physicians and 53% of the internists believe that PSA measurement is an effective screening method. A substantial proportion of the physicians - 41% of general practitioners and 43% of internists - reported that they sometimes or often recommend this test for legal reasons. Defensive medicine is not a phenomenon particular to the USA, but is also observable in Switzerland. This result is surprising, given that in Switzerland and other European countries, a physician who does not recommend a test or treatment whose effectiveness is controversial need not fear litigation.