Abstract

Researchers have shown that people often miss the occurrence of an unexpected yet salient event if they are engaged in a different task, a phenomenon known as inattentional blindness. However, demonstrations of inattentional blindness have typically involved naive observers engaged in an unfamiliar task. What about expert searchers who have spent years honing their ability to detect small abnormalities in specific types of images? We asked 24 radiologists to perform a familiar lung-nodule detection task. A gorilla, 48 times the size of the average nodule, was inserted in the last case that was presented. Eighty-three percent of the radiologists did not see the gorilla. Eye tracking revealed that the majority of those who missed the gorilla looked directly at its location. Thus, even expert searchers, operating in their domain of expertise, are vulnerable to inattentional blindness.
Psychological Science
24(9) 1848–1853
© The Author(s) 2013
Reprints and permissions:
sagepub.com/journalsPermissions.nav
DOI: 10.1177/0956797613479386
pss.sagepub.com
Research Report
When one is engaged in a demanding task, attention can act like a set of blinders, making it possible for salient stimuli to pass unnoticed right in front of one’s eyes (Neisser & Becklen, 1975). This phenomenon of sustained inattentional blindness (IB) is best known from Simons and Chabris’s (1999) study in which observers attended to a ball-passing game while a human in a gorilla suit wandered through the field of play. Even though the gorilla walked through the center of the scene, a substantial portion of the observers did not report seeing it (the video can be viewed at http://www.theinvisiblegorilla.com/videos.html). Moving beyond such demonstrations, one might ask whether IB still occurs when the observers are experts who are highly trained on the primary task. There is some evidence that expertise mitigates the effect. For example, Memmert (2006) found a decreased rate of IB among basketball players who were asked to count the number of passes in an artificial basketball game. However, when Potchen (2006) asked radiologists to review cases as if for an annual exam and showed them chest x-rays with a clavicle (collarbone) removed, roughly 60% failed to notice the missing bone. Finally, a recent observational report documented that a misplaced femoral line was not detected by a variety of health-care professionals who evaluated the case (Lum, Fairbanks, Pennington, & Zwemer, 2005).
Both of these instances of apparent IB in the medical setting occurred when single-slice medical images were viewed. Modern medical imaging technologies, such as MRI, computed tomography (CT), and positron-emission tomography (PET), are increasingly complex: The single image of a chest x-ray has been replaced with hundreds of slices in a chest CT scan. It is therefore important to study whether IB occurs in these modern imaging modalities. These situations are interesting because the observer actively interacts with the stimulus, for example by scrolling through a stack of images of the lung. This degree of control may ameliorate the effects of IB because the searcher is able to return to and further examine any images that appear unusual.
Corresponding Author:
Trafton Drew, Visual Attention Lab, Harvard Medical School, 64
Sidney St., Suite 170, Cambridge, MA 02139
E-mail: tdrew1@partners.org
The Invisible Gorilla Strikes Again: Sustained Inattentional Blindness in Expert Observers
Trafton Drew, Melissa L.-H. Võ, and Jeremy M. Wolfe
Visual Attention Lab, Harvard Medical School, and Brigham and Women’s Hospital, Boston, Massachusetts
Keywords
visual attention, perception, selective attention
Received 8/1/12; Revision accepted 1/27/13
Moreover, whereas Potchen (2006) showed that radiologists could miss the unexpected absence of a stimulus, we wanted to know if they would miss the presence of a readily detectable, highly anomalous item while performing a task within their realm of expertise. In an homage to Simons and Chabris’s (1999) study, we made that item a gorilla. We compared the performance of radiologists with that of naive observers.
Design and Procedure
In CT lung-cancer screening, radiologists search a reconstructed “stack” of axial slices of the lung for nodules that appear as small light circles (Aberle et al., 2011). In Experiment 1, 24 radiologists (mean age = 48, range = 28–70) had up to 3 min to freely scroll through each of five chest CTs, searching for nodules as we tracked their eye position. The five trials contained an average of 10 nodules, and the observers were instructed to click on nodule locations with the computer mouse. In the final trial, we inserted a gorilla with a white outline into the lung (see Fig. 1). A typical stack of images from a chest CT contains 100 to 500 slices. In the current study, the stack that contained the gorilla had 239 slices.
Nine radiologists were tested at Brigham and Women’s Hospital in Boston, Massachusetts, and 15 expert examiners from the American Board of Radiology were tested at a meeting of that organization in Louisville, Kentucky. The gorilla measured 29 × 50 mm. Because of equipment differences, the image size was slightly different at the two sites, and consequently the size of the gorilla differed slightly (Boston: 0.9 × 0.5 degrees of visual angle; Louisville: 1.3 × 0.65 degrees of visual angle). To avoid large onset transients, we had the gorilla fade into and out of visibility over five 2-mm-thick slices (Fig. 1). The total volume of the rectangular box that could hold the gorilla was more than 7,400 mm³, roughly the size of a box of matches. The gorilla was centered in depth near a lung nodule such that both were clearly visible when the gorilla was at maximum opacity. That is, if someone pointed at the correct location in the static image and asked you, “What is that?” you would have no trouble answering, “That is a gorilla.” In the scans used in this study, which were taken from the Lung Image Database Consortium (Armato et al., 2011), the average volume of the lung nodules was 153 mm³. Thus, the gorilla was more than 48 times the size of the average nodule in the images (see Fig. 2a).
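The size comparison above is a simple ratio of the two reported volumes. As a rough check, the following Python sketch recomputes it. The constant and function names are illustrative and do not come from the study, and 7,400 mm³ is treated as an exact value even though the text gives it only as a lower bound.

```python
# Volumes as reported in the text, in cubic millimetres.
# Assumption: the gorilla figure is a lower bound ("more than 7,400 mm^3"),
# used here as if it were exact.
GORILLA_BOX_VOLUME_MM3 = 7400
MEAN_NODULE_VOLUME_MM3 = 153


def size_ratio(target_volume: float, reference_volume: float) -> float:
    """Return how many times larger the target is than the reference."""
    if reference_volume <= 0:
        raise ValueError("reference volume must be positive")
    return target_volume / reference_volume


ratio = size_ratio(GORILLA_BOX_VOLUME_MM3, MEAN_NODULE_VOLUME_MM3)
print(f"gorilla box / mean nodule = {ratio:.1f}x")
```

Because the box volume is a lower bound, the computed ratio is best read as a lower bound too, which fits the "more than 48 times" wording.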
Experiment 2 replicated Experiment 1 with 25 naive observers (mean age = 33.7, range = 19–55), who had no medical training. Prior to the experiment, the experimenter spent roughly 10 min teaching these observers how to identify lung nodules. This experiment began with a practice trial, during which the experimenter took time to point out several nodules. The experimenter then encouraged the observer to try to find nodules on his or her own. Once the observer was able to detect at least one nodule, the practice trial was concluded, and the experimental trials began. As in Experiment 1, a subset (12) of observers completed the study on a slightly smaller screen. We observed no difference in gorilla or nodule detection as a result of equipment differences.
Experiment 3 was a control experiment intended to ensure that the gorilla was, in fact, visible. Twelve naive observers (mean age = 37.3, range = 21–54) were shown movies that progressed from the top to the bottom of the same chest CT case that was used as the final trial in Experiments 1 and 2. The gorilla was inserted into the movies in the same location on 50% of the 20 trials, and observers were asked to judge whether the gorilla was present or absent on each trial. A circular cue indicated the possible location of the gorilla on each trial. The movies were presented at a rate of 35 or 70 ms per frame (manipulated within subjects).
Fig. 1. Illustration of the slices showing the gorilla in the final trial of Experiments 1 and 2. The opacity of the gorilla increased from 50% to 100%
and then decreased back down to 50% over the course of 5 slices within a stack of 239.
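The fade described in this caption can be thought of as alpha blending the gorilla into each slice with a slice-specific opacity. The sketch below is only an illustration. The caption gives the endpoints (50% and 100%) and the number of slices (5); the linear ramp, the resulting intermediate 75% values, the function names, and the normalized pixel intensities are all assumptions.

```python
from typing import List


def opacity_ramp(n_slices: int = 5, low: float = 0.5, high: float = 1.0) -> List[float]:
    """Symmetric ramp low -> high -> low across an odd number of slices.

    Assumption: opacity changes linearly between the endpoints; the source
    states only the endpoints and the slice count.
    """
    if n_slices < 1 or n_slices % 2 == 0:
        raise ValueError("n_slices must be a positive odd number")
    middle = n_slices // 2
    if middle == 0:
        return [high]
    step = (high - low) / middle
    return [high - step * abs(i - middle) for i in range(n_slices)]


def blend(background: float, overlay: float, alpha: float) -> float:
    """Standard alpha blend of one overlay pixel onto one background pixel."""
    return alpha * overlay + (1.0 - alpha) * background


ramp = opacity_ramp()
# Hypothetical intensities in [0, 1]: dark lung tissue and a bright outline.
blended = [blend(background=0.2, overlay=1.0, alpha=a) for a in ramp]
print(ramp)
```

With a gradual ramp of this kind, no single slice transition introduces the full contrast of the inserted item, which is the point of avoiding large onset transients.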
Results
Experiment 1
The nodule detection task was challenging, even for expert radiologists. The overall nodule detection rate was 55%. While engaged in this task, the radiologists freely scrolled through the slices containing the gorilla an average of 4.3 times. At the end of the final case, we asked a series of questions to determine whether they had noticed the gorilla: “Did the final trial seem any different than any of the other trials?” “Did you notice anything unusual on the final trial?” and, finally, “Did you see a gorilla on the final trial?” Twenty of the 24 radiologists failed to report seeing a gorilla. This was not due to the gorilla being difficult to perceive: All 24 radiologists reported seeing the gorilla when they were asked if they noticed anything unusual in Figure 1 after completing the experiment (see also the results for Experiment 3).
The radiologists had ample opportunity to find the gorilla. On average, those who missed the gorilla spent 5.8 s viewing the five slices containing it (range = 1.1–12 s) and spent an average of 329 ms looking at the gorilla’s location. Furthermore, eye tracking revealed that of the 20 radiologists who did not report the gorilla, 12 looked directly at the gorilla’s location when it was visible. The mean dwell time on the gorilla in this group was 547 ms. Figure 2b shows the eye positions of a radiologist who clearly fixated the gorilla but did not report it.
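The headline percentages for Experiment 1 follow directly from the counts given above. A minimal Python sketch of that arithmetic, with illustrative variable names that are not from the study:

```python
def percent(count: int, total: int) -> float:
    """Return count as a percentage of total."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * count / total


# Counts reported for Experiment 1.
N_RADIOLOGISTS = 24
N_MISSED_GORILLA = 20
N_MISSED_BUT_FIXATED = 12  # missed the gorilla yet looked at its location

miss_rate = percent(N_MISSED_GORILLA, N_RADIOLOGISTS)
fixated_share = percent(N_MISSED_BUT_FIXATED, N_MISSED_GORILLA)

print(f"inattentional-blindness rate: {miss_rate:.0f}%")
print(f"missers who fixated the gorilla location: {fixated_share:.0f}%")
```

The first figure corresponds to the 83% miss rate quoted in the abstract, and the second to the statement that a majority of those who missed the gorilla looked directly at its location.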
Experiment 2
None of the 25 naive observers reported noticing the gorilla. As was the case with the radiologists in Experiment 1, all of the naive observers reported seeing the gorilla when shown Figure 1. The results support the idea (Memmert, 2006) that experts are somewhat less prone to IB than novices are (Fisher’s exact test: p = .0497; see Fig. 3a). However, unlike in Memmert’s study, our two groups showed a sizable difference in performance on the primary task. As expected, radiologists were much better at detecting lung nodules (mean detection rate = 55%) than were naive observers (12%), t(47) = 12.3, p < .001 (see Fig. 3b).
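The group comparison above rests on a 2 × 2 table of counts: 4 of 24 radiologists and 0 of 25 naive observers reported the gorilla. The following standard-library Python sketch shows how a Fisher's exact test on such a table can be computed from the hypergeometric distribution. It is an illustration, not the authors' analysis code: the function name and the two-sided convention (summing all tables no more probable than the observed one) are assumptions, since the text does not say which software or convention was used.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact p-value for the table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins whose probability does not exceed that of the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def prob(x: int) -> float:
        # Probability that the top-left cell equals x given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )


# Rows: radiologists, naive observers. Columns: reported gorilla, did not.
p_value = fisher_exact_two_sided(4, 20, 0, 25)
print(f"Fisher's exact test, two-sided: p = {p_value:.4f}")
```

Under these assumptions the result lands close to the reported p = .0497; small differences in the last digits can arise from the software or convention used.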
Eye movement data followed the pattern seen with the radiologists. The naive observers spent an average of 4.9 s searching the frames in which the gorilla was visible and an average of 157 ms looking at the gorilla’s location. Although both measures showed that radiologists who missed the gorilla spent slightly more time searching in its vicinity than did the naive observers, neither difference was significant, t(43) = 1.26, p = .22, and t(43) = 1.23, p = .22, respectively. Of the 25 naive observers, 9 looked at the gorilla’s location. The mean dwell time on the gorilla in the latter group was 435 ms.
Fig. 2. Computed-tomography image containing the embedded gorilla (a) and eye-position plot of a radiologist who did not report seeing the gorilla (b). In (b), the circles represent eye positions recorded at 1-ms intervals.
Experiment 3
Although all observers in Experiments 1 and 2 reported seeing the gorilla when shown Figure 1 at the end of the experiment, given the very high rate of IB in both studies, there was some concern that the gorilla was too difficult to detect when embedded within a stack of chest CT images. We tested this possibility in Experiment 3. The movies played at a fast or slower frame rate such that the gorilla was visible for 175 or 350 ms, respectively, substantially less time than the 4.9 s that the average naive observer in Experiment 2 spent searching frames in which the gorilla was present. Despite this large difference, performance on the detection task was near ceiling (88% correct). Accuracy was not affected by the frame rate, t(11) = 1.1, p = .18 (see Fig. 3c).
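The visible durations quoted above follow from the number of slices containing the gorilla (five, per Fig. 1) multiplied by the per-frame presentation time. A short Python sketch of that arithmetic, with illustrative names that are not from the study:

```python
GORILLA_SLICES = 5              # slices in which the gorilla appears (Fig. 1)
FRAME_TIMES_MS = {"fast": 35, "slow": 70}
MEAN_SEARCH_TIME_MS = 4900      # Experiment 2: 4.9 s on the gorilla frames


def visible_duration_ms(n_frames: int, ms_per_frame: int) -> int:
    """Total time an item is on screen when shown for n_frames frames."""
    return n_frames * ms_per_frame


durations = {
    label: visible_duration_ms(GORILLA_SLICES, ms)
    for label, ms in FRAME_TIMES_MS.items()
}

for label, duration in durations.items():
    factor = MEAN_SEARCH_TIME_MS / duration
    print(f"{label}: {duration} ms visible, {factor:.0f}x shorter than free search")
```

The comparison makes the control explicit: observers in Experiment 3 detected the gorilla reliably with a small fraction of the viewing time available in Experiment 2.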
Discussion
In Experiment 1, 20 of 24 expert radiologists failed to note a gorilla, the size of a matchbook, embedded in a stack of CT images of the lungs. This is a clear illustration that radiologists, though they are expert searchers, are not immune to the effects of IB even when searching medical images within their domain of expertise. Potchen (2006) showed that radiologists could miss the absence of an entire bone. Results from laboratory search tasks have shown that it is harder to detect the absence of something than to detect its presence (Treisman & Souther, 1985). Our data show that under certain circumstances, experts can also miss the presence of a large, anomalous stimulus. In fact, there is some clinical evidence for errors of this sort in radiology. Lum et al. (2005) reported a case study in which multiple emergency radiologists failed to detect a misplaced femoral-line guide wire that was mistakenly left in a patient and was clearly visible on three different chest CT scans. Although these scans were viewed by radiologists, emergency physicians, internists, and intensivists, the guide wire was not detected for 5 days. Clearly, radiologists can miss an abnormality that is retrospectively visible when the abnormality is unexpected.
It is reassuring that our experts exhibited somewhat lower rates of IB than naive observers, as was reported by Memmert (2006). In that earlier study, expertise was defined as extensive basketball experience, and IB was measured during an artificial task in which two groups of individuals passed a ball back and forth while moving randomly about a small area. The observers were asked to count the number of passes completed by one group. In this rather abnormal basketball game, the rate of IB was lower for the experts than for observers with less basketball experience. In the current study, high rates of IB were obtained with a task and stimulus materials that were very familiar to our expert observers: searching a chest CT scan for signs of lung cancer.
Fig. 3. Experimental results. The graph in (a) shows the rate of inattentional blindness (i.e., the percentage of observers who did not report seeing the gorilla) among the radiologists in Experiment 1 and the naive observers in Experiment 2. The graph in (b) shows the percentage of nodules that were correctly marked by these same observers. The graph in (c) shows the rate at which observers in Experiment 3 detected the gorilla as a function of presentation rate (fast: 35 ms/frame; slow: 70 ms/frame). Error bars represent standard errors of the mean.
Experts may perform slightly better on this IB task than naive observers do because their attentional capacity is less completely occupied by the primary task. Simons and Jensen (2009) recently showed that the rate of IB decreased when the primary task (counting the number of times an object bounced) was made easier. Along similar lines, there is evidence that training on a specific task reduces the subsequent IB rate (Richards, Hannon, & Derakshan, 2010). In our study, the radiologists certainly had much more experience on the specific primary task, and were clearly better at it. Both factors are likely to have contributed to the reduced rate of IB observed in our experts. Nevertheless, even though the radiologists were slightly better than the naive observers, their miss rate of 83% indicates a striking level of IB.
Why do radiologists sometimes fail to detect such large anomalies? Of course, as is critical in all IB demonstrations, the radiologists were not looking for the unexpected stimulus. In most previous demonstrations of IB, observers engaged in a primary task that was unrelated to detection of the unexpected stimulus (e.g., counting the number of passes or bounces, as in Most et al., 2001; Richards et al., 2010; Simons & Chabris, 1999; Simons & Jensen, 2009). Here, too, though detection of aberrant structures in the lung would be a standard component of the radiologist’s task, observers were not looking for gorillas. Presumably, they would have done much better at detecting the gorilla had they been told to be prepared for such a target. Moreover, the observers were searching for small, light nodules. Previous work with naive observers has shown that IB is modulated by the degree of match between the designated targets and the unexpected item (Most et al., 2001). This suggests that our observers might have fared better if we had used an albino gorilla that better matched the luminance polarity of the designated targets. Counterintuitively, perhaps a smaller gorilla would have been more frequently detected because it would have more closely matched the size of the lung nodules.
Our results could be seen as an example of a phenomenon known as satisfaction of search, in which detection of one stimulus interferes with detection of subsequent stimuli (e.g., Berbaum et al., 1998). We placed the gorilla on a slice that contained a nodule that was detected by 71% of the radiologists. Perhaps the observed rate of IB was inflated by the presence of this nodule. Without running an additional experiment examining the detection rate for the gorilla in the absence of the nodule, it is difficult to be certain what role the presence of the nodule played. However, if satisfaction of search truly drove the IB effect, we would expect that radiologists who missed the nodule would have been more likely to detect the gorilla and that radiologists who found the nodule would have been more likely to miss the gorilla. Neither of these predictions held true: Of the 7 radiologists who missed the nodule, none detected the gorilla. Furthermore, all of the radiologists who detected the gorilla also detected the nodule on the same slice.
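The argument against a satisfaction-of-search account can be laid out as a 2 × 2 table built from the counts given in the text: 24 radiologists, 7 of whom missed the nodule, and 4 gorilla detections, all among those who found the nodule. The Python sketch below reconstructs that table; the variable names are illustrative, and the cell counts are derived from the reported totals rather than listed in the source.

```python
N_RADIOLOGISTS = 24
N_MISSED_NODULE = 7
N_SAW_GORILLA = 4
N_SAW_GORILLA_AND_MISSED_NODULE = 0  # "none detected the gorilla"

n_found_nodule = N_RADIOLOGISTS - N_MISSED_NODULE

# Derived cell counts: (saw gorilla, missed gorilla) per nodule outcome.
table = {
    "found nodule": (
        N_SAW_GORILLA - N_SAW_GORILLA_AND_MISSED_NODULE,
        n_found_nodule - (N_SAW_GORILLA - N_SAW_GORILLA_AND_MISSED_NODULE),
    ),
    "missed nodule": (
        N_SAW_GORILLA_AND_MISSED_NODULE,
        N_MISSED_NODULE - N_SAW_GORILLA_AND_MISSED_NODULE,
    ),
}


def gorilla_detection_rate(saw: int, missed: int) -> float:
    """Percentage of a group that reported the gorilla."""
    return 100.0 * saw / (saw + missed)


rates = {group: gorilla_detection_rate(*cells) for group, cells in table.items()}
nodule_hit_rate = 100.0 * n_found_nodule / N_RADIOLOGISTS

for group, rate in rates.items():
    print(f"{group}: {rate:.1f}% reported the gorilla")
print(f"nodule found by {nodule_hit_rate:.0f}% of radiologists")
```

Satisfaction of search predicts a higher gorilla detection rate among those who missed the nodule; the reconstructed table shows the opposite ordering, which is the point made in the paragraph above.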
It would be a mistake to regard these results as an indictment of radiologists. As a group, they are highly skilled practitioners of a very demanding class of visual search tasks. The message of the present set of results is that even this high level of expertise does not immunize individuals against inherent limitations of human attention and perception. Researchers should seek better understanding of these limits, so that medical and other man-made search tasks could be designed in ways that reduce the consequences of these limitations.
Author Contributions
T. Drew developed the study concept. All authors contributed
to the study design. T. Drew collected and analyzed the data.
T. Drew wrote the manuscript in collaboration with J. M. Wolfe
and M. L.-H. Võ. All authors approved the final version of the
manuscript for submission.
Declaration of Conflicting Interests
The authors declared that they had no conflicts of interest with
respect to their authorship or the publication of this article.
References
Aberle, D. R., Adams, A. M., Berg, C. D., Black, W. C., Clapp, J. D., Fagerstrom, R. M., . . . Sicks, J. D. (2011). Reduced lung-cancer mortality with low-dose computed tomographic screening. New England Journal of Medicine, 365, 395–409.
Armato, S. G., III, McLennan, G., Bidaut, L., McNitt-Gray, M. F., Meyer, C. R., Reeves, A. P., . . . Croft, B. Y. (2011). The Lung Image Database Consortium (LIDC) and Image Database Resource Initiative (IDRI): A completed reference database of lung nodules on CT scans. Medical Physics, 38, 915–931.
Berbaum, K. S., Franken, E. A., Dorfman, D. D., Miller, E. M., Caldwell, R. T., Kuehn, D. M., & Berbaum, M. L. (1998). Role of faulty visual search in the satisfaction of search effect in chest radiography. Academic Radiology, 5, 9–19.
Lum, T. E., Fairbanks, R. J., Pennington, E. C., & Zwemer, F. L. (2005). Profiles in patient safety: Misplaced femoral line guidewire and multiple failures to detect the foreign body on chest radiography. Academic Emergency Medicine, 12, 658–662.
Memmert, D. (2006). The effects of eye movements, age, and expertise on inattentional blindness. Consciousness and Cognition, 15, 620–627.
Most, S. B., Simons, D. J., Scholl, B. J., Jimenez, R., Clifford, E., & Chabris, C. F. (2001). How not to be seen: The contribution of similarity and selective ignoring to sustained inattentional blindness. Psychological Science, 12, 9–17.
Neisser, U., & Becklen, R. (1975). Selective looking: Attending to visually specified events. Cognitive Psychology, 7, 480–494.
Potchen, E. J. (2006). Measuring observer performance in chest radiology: Some experiences. Journal of the American College of Radiology, 3, 423–432.
Richards, A., Hannon, E. M., & Derakshan, N. (2010). Predicting and manipulating the incidence of inattentional blindness. Psychological Research, 74, 513–523.
Simons, D. J., & Chabris, C. F. (1999). Gorillas in our midst: Sustained inattentional blindness for dynamic events. Perception, 28, 1059–1074.
Simons, D. J., & Jensen, M. S. (2009). The effects of individual differences and task difficulty on inattentional blindness. Psychonomic Bulletin & Review, 16, 398–403.
Treisman, A., & Souther, J. (1985). Search asymmetry: A diagnostic for preattentive processing of separable features. Journal of Experimental Psychology: General, 114, 285–310.
... Both methods provide semi-quantitative assessments, which are inherently subjective, leading to inconsistencies in diagnosis [35][36][37]. Physicians depend on subjective experience to interpret vast amounts of image data, making it challenging to avoid errors arising from the intrinsic limitations of human attention and perception [38]. ...
Article
Full-text available
Knee osteoarthritis (KOA) is a leading cause of disability globally. Early and accurate diagnosis is paramount in preventing its progression and improving patients’ quality of life. However, the inconsistency in radiologists’ expertise and the onset of visual fatigue during prolonged image analysis often compromise diagnostic accuracy, highlighting the need for automated diagnostic solutions. In this study, we present an advanced deep learning model, OA-HybridCNN (OHC), which integrates ResNet and DenseNet architectures. This integration effectively addresses the gradient vanishing issue in DenseNet and augments prediction accuracy. To evaluate its performance, we conducted a thorough comparison with other deep learning models using five-fold cross-validation and external tests. The OHC model outperformed its counterparts across all performance metrics. In external testing, OHC exhibited an accuracy of 91.77%, precision of 92.34%, and recall of 91.36%. During the five-fold cross-validation, its average AUC and ACC were 86.34% and 87.42%, respectively. Deep learning, particularly exemplified by the OHC model, has greatly improved the efficiency and accuracy of KOA imaging diagnosis. The adoption of such technologies not only alleviates the burden on radiologists but also significantly enhances diagnostic precision.
... To help students draw connections between the invisible gorilla video and future clinical practice, the instructor shared the results of a study that used a similar activity to highlight how inattentional blindness can also impact helping professionals making clinical decisions (Drew et al., 2013). The study examined the eye movements of 24 radiologists tasked with identifying lung nodules in computed tomography (CT) scans. ...
Article
Full-text available
Purpose This study examined whether a course in critical thinking (CT) facilitates change in undergraduate students' self-reported CT dispositions (CTDs). This study also examined whether students' postcourse CTDs predict real-world outcomes after controlling for students' baseline grade point average, need for cognition, and precourse CTDs. Method One hundred thirty-eight undergraduate communication sciences and disorders (CSD) students participated in the study. All students were enrolled in a course that applied evidence-based characteristics of an effective CT course. Students completed the Student–Educator Negotiated Critical Thinking Disposition Scale (SENCTDS), Need for Cognition Scale–Short Form, Real-World Outcomes inventory, and CT subscale of the Motivated Strategies for Learning Questionnaire before and after taking the course. Data were analyzed using a paired-samples t test and structural equation modeling. Results There was a statistically significant difference between pre- and posttest scores on the SENCTDS. After controlling for all sources of influence in the structural model, postcourse CTD scores were significantly related to postcourse RWO scores. Conclusions The results of the study indicate the CT course may be effective for facilitating change in undergraduate CSD students' CTDs. The results suggest the possibility that students who receive direct instruction related to CT may be more likely to demonstrate reflective, attentive, open-minded, organized, and persistent dispositions and may be more internally motivated to solve complex problems. These findings suggest a robust effect of CTDs on real-world outcomes and that the CT course produced a clear benefit. More research is warranted to identify the active ingredients responsible for the self-reported change in students' CTDs. Supplemental Material https://doi.org/10.23641/asha.28711154
... Failure to engage System 2's deliberate processing in clinical imaging tasks can result in diagnostic errors. For instance, research on inattentional blindness [68], has been shown to affect clinical imaging [16,34,50,58,59,61,78]. Studies estimate the under readings to comprise 42% of errors in clinical imaging [37], including stopping after an initial instance is identified resulting in missing a second finding [1, 4] (e.g., a patient with tuberculosis died from undiagnosed lymphoma [58]), and efforts to identify rare findings [52,82]. ...
Preprint
Full-text available
Clinical systems operate in safety-critical environments and are not intended to function autonomously; however, they are currently designed to replicate clinicians' diagnoses rather than assist them in the diagnostic process. To enable better supervision of system-generated diagnoses, we replicate radiologists' systematic approach used to analyze chest X-rays. This approach facilitates comprehensive analysis across all regions of clinical images and can reduce errors caused by inattentional blindness and under reading. Our work addresses a critical research gap by identifying difficult-to-diagnose diseases for clinicians using insights from human vision, enabling these systems to serve as an effective "second pair of eyes". These improvements make the clinical imaging systems more complementary and combine the strengths of human and machine vision. Additionally, we leverage effective receptive fields in deep learning models to present machine-generated diagnoses with sufficient context, making it easier for clinicians to evaluate them.
... B. hervorstechende Merkmale oder Bewegung) lenkt die Aufmerksamkeit der Beobachterinnen und Beobachter so sehr auf sich, dass sie andere Dinge bzw. wichtige Details nicht wahrnehmen(Hewlett, Oezbek 2012;Drew et al. 2013). Dieses Phänomen kennen viele Naturfotografinnen und -fotografen, die erst beim Betrachten der Bilder am PC bemerken, welche weiteren Insekten außer dem gesuchten Insekt noch auf der abgelichteten Pflanze saßen.Werden Beobachtungen in Gruppen durchgeführt, können falsche Einschätzungen einzelner Personen durch die Gruppe zwar korrigiert werden, es kann aber auch Gruppendruck entstehen, der ...
Article
Naturbeobachtungen basieren auf einer Reihe kognitiver Prozesse − vom Lernen artspezifischen Wissens, über die konkrete Beobachtung und Entscheidung (z. B. um welche Art es sich handelt) bis hin zu Meldungen von Beobachtungen an Datenbanken und deren Plausibilisierung. In allen Schritten können Verzerrungen und Fehler auftreten, die nicht auf Unwissenheit, mangelnde Anstrengung oder absichtliche Täuschung zurückzuführen sind, sondern auf den grundlegenden Mechanismen menschlicher Informationsverarbeitung beruhen. So kann bspw. das indi- viduelle Vorwissen für bestimmte Erwartungshaltungen sorgen, die den Blick einengen, unvollständige Wahrnehmungen werden typischerweise subjektiv ergänzt oder Entscheidungen werden meist nicht analytisch getroffen. Auch die Urteile von Expertinnen oder Experten und Gruppen- druck können den Entscheidungsspielraum zu schnell einschränken. In der Regel sind wir uns dieser Einflüsse gar nicht bewusst. Sie können aber die Validität von Naturbeobachtungen deutlich beeinträchtigen und auch zu folgenreichen Entscheidungen für den Naturschutz führen. Wir beschreiben in diesem Beitrag solche potenziell verfälschenden Einflüsse und stellen eine Reihe von Maßnahmen und Strategien dar, die diesen Verfälschungen entgegenwirken. Unser Anliegen und unsere Hoffnung ist es, dass diese Maßnahmen in die alltägliche Beobachtungspraxis integriert werden und somit zu einer verbesserten Datenlage auch im Naturschutz beitragen.
... Despite the benefits of intuition, cognitive shortcuts and heuristic methods of problem solving can sometimes lead to systematic errors or biases, even among experts 6,10,[96][97][98][99][100][101][102][103][104][105][106][107] . For example, clinicians can overestimate the likelihood that they would have made a correct diagnosis when seeing a case for the first time (hindsight bias) 108,109 , ignore the prevalence of a symptom when making a diagnosis (base rate neglect) 110 , or seek evidence that confirms rather than disconfirms their initial diagnosis (confirmation bias) 111 . ...
Article
Determining which experts to trust is essential for both routine and high-stakes decisions, yet evaluating expertise can be difficult. In this Review, we examine the cognitive processes that underpin genuine expertise and explore the disconnect between psychological insights into expertise and the practical methods used to evaluate it. In settings where expertise must be evaluated by laypeople, such as adversarial legal trials, evaluators face substantial challenges, including knowledge disparities that hinder analysis, communication barriers that impact the clear explanation of expert methods, and procedural constraints that limit the scrutiny of expert evidence. These challenges complicate the assessment of expert claims and contribute to wrongful convictions and unjust outcomes. We suggest that a distinction between ‘show-it’ and ‘know-it’ expert performances that differ in their visibility, measurability and immediacy can be used as a heuristic for identifying when evaluations of expertise require greater care and should incorporate a variety of diagnostic factors including foundational and applied validity. Finally, we highlight key knowledge gaps and propose promising directions for future research to improve evaluations of expertise in a range of contexts.
... Although the saliency maps highlight regions such as the larynx, aortic region, and other connective tissues, none of the survey participants wrote comments about noticing these regions (see Supplementary material). 23 Our study has limitations, including the small dataset. Additionally, the radiographs were collected over a 20-year period, during which variability in imaging techniques and equipment could yield inconsistencies. ...
Preprint
Deep learning (DL) is increasingly used to analyze medical imaging, but is less refined for rare conditions, which require novel pre-processing and analytical approaches. To assess DL in the context of rare diseases, we focused on alkaptonuria (AKU), a rare disorder that affects the spine and involves other sequelae; treatments include the medication nitisinone. Since assessing X-rays to determine disease severity can be a slow, manual process requiring considerable expertise, we aimed to determine whether our DL methods could accurately identify overall spine severity and severity at specific regions of the spine, and whether they could detect if patients were receiving nitisinone. We evaluated DL performance versus clinical experts using cervical and lumbar spine radiographs. DL models predicted global severity scores (30-point scale) within 1.72 ± 1.96 points of expert clinician scores for cervical and 2.51 ± 1.96 points for lumbar radiographs. For region-specific metrics, we assessed the degree of narrowing, calcium, and vacuum phenomena at each intervertebral space (IVS). Our model's narrowing scores were within 0.191–0.557 points of clinician scores (6-point scale), calcium was predicted with 78–90% accuracy (present, absent, or disc fusion), while vacuum disc phenomenon predictions were less consistent (41–90%). Intriguingly, DL models predicted nitisinone treatment status with 68–77% accuracy, while expert clinicians appeared unable to discern nitisinone status (51% accuracy). This highlights the potential for DL to augment certain types of clinical assessments in rare disease and to identify occult features such as treatment status.
Article
Much of high-level cognition appears inaccessible to consciousness. Countless studies have revealed mental processes—like those underlying our choices, beliefs, judgments, intuitions, etc.—which people do not notice or report, and these findings have had a widespread influence on the theory and application of psychological science. However, the interpretation of these findings is uncertain. Making an analogy to perceptual consciousness research, I argue that much of the unconsciousness of high-level cognition is plausibly due to internal inattentional blindness: missing an otherwise consciously-accessible internal event because your attention was elsewhere. In other words, rather than being structurally unconscious, many higher mental processes might instead be “preconscious”, and would become conscious if a person attended to them. I synthesize existing indirect evidence for this claim, argue that it is a foundational and largely untested assumption in many applied interventions (such as therapy and mindfulness practices), and suggest that, with careful experimentation, it could form the basis for a long-sought-after science of introspection training.
Article
Background: Health professional education has recently begun using visual art observation to promote various observation-related technical skills. This article maps the studies on such interventions, scrutinizes what they measured as observational skills, and discusses their effectiveness. Methods: A scoping review was conducted following the PRISMA Extension for Scoping Reviews. Publications from 2001 onward were identified by searching four databases and by hand searching. The author screened each publication using pre-designed eligibility criteria: participants were novice healthcare learners enrolled in visual art observation training; the study aimed to evaluate the effect of the intervention on technical skills related to observation; and the skills were objectively measured. The author extracted relevant information from the included papers without making additional inquiries to the study authors. The extracted information was presented in both tabular and descriptive formats. Results: A total of 3,157 publications were identified, of which 18 articles were included. Few studies had valid and reliable experiments. The relatively valid evidence is that the participants listed more elements or signs for artistic or medical images. Conclusions: Sound evidence is lacking for all the technical skills intended to be fostered. Observation skills for artistic images have not been demonstrated to transfer to technical skills. Nor do the studies show that they promoted accurate diagnoses and reduced misdiagnoses. Additionally, the evidence on verbalizing skills is not isolated from the impact of discussions and is unclear regarding its transfer to actual communication. For the others, there are not enough valid studies on technical skills. This is also true for studies that directly examine promoting accurate diagnosis or reducing misdiagnosis. Moreover, there may be promising alternatives to visual art observation for cultivating such technical skills, but no comparative studies were conducted.
Article
The search rate for a target among distractors may vary dramatically depending on which stimulus plays the role of target and which that of distractors. For example, the time required to find a circle distinguished by an intersecting line is independent of the number of regular circles in the display, whereas the time to find a regular circle among circles with lines increases linearly with the number of distractors. The pattern of performance suggests parallel processing when the target has a unique distinguishing feature and serial self-terminating search when the target is distinguished only by the absence of a feature that is present in all the distractors. The results are consistent with feature-integration theory (Treisman & Gelade, 1980), which predicts that a single feature should be detected by the mere presence of activity in the relevant feature map, whereas tasks that require subjects to locate multiple instances of a feature demand focused attention. Search asymmetries may therefore offer a new diagnostic to identify the primitive features of early vision. Several candidate features are examined in this article: Colors, line ends or terminators, and closure (in the sense of a partly or wholly enclosed area) appear to be functional features; connectedness, intactness (absence of an intersecting line), and acute angles do not.
Article
Purpose: The development of computer-aided diagnostic (CAD) methods for lung nodule detection, classification, and quantitative assessment can be facilitated through a well-characterized repository of computed tomography (CT) scans. The Lung Image Database Consortium (LIDC) and Image Database Resource Initiative (IDRI) completed such a database, establishing a publicly available reference for the medical imaging research community. Initiated by the National Cancer Institute (NCI), further advanced by the Foundation for the National Institutes of Health (FNIH), and accompanied by the Food and Drug Administration (FDA) through active participation, this public-private partnership demonstrates the success of a consortium founded on a consensus-based process. Methods: Seven academic centers and eight medical imaging companies collaborated to identify, address, and resolve challenging organizational, technical, and clinical issues to provide a solid foundation for a robust database. The LIDC/IDRI Database contains 1018 cases, each of which includes images from a clinical thoracic CT scan and an associated XML file that records the results of a two-phase image annotation process performed by four experienced thoracic radiologists. In the initial blinded-read phase, each radiologist independently reviewed each CT scan and marked lesions belonging to one of three categories (“nodule ≥ 3 mm,” “nodule < 3 mm,” and “non-nodule ≥ 3 mm”). In the subsequent unblinded-read phase, each radiologist independently reviewed their own marks along with the anonymized marks of the three other radiologists to render a final opinion. The goal of this process was to identify as completely as possible all lung nodules in each CT scan without requiring forced consensus. Results: The Database contains 7371 lesions marked “nodule” by at least one radiologist. Of these lesions, 2669 were marked “nodule ≥ 3 mm” by at least one radiologist, of which 928 (34.7%) received such marks from all four radiologists. These 2669 lesions include nodule outlines and subjective nodule characteristic ratings. Conclusions: The LIDC/IDRI Database is expected to provide an essential medical imaging research resource to spur CAD development, validation, and dissemination in clinical practice.
Article
Subjects looked at two optically superimposed video screens, on which two different kinds of things were happening. In the principal condition, they were required to follow the action in one episode (by pressing keys when significant events occurred) and ignore the other. They could do this without difficulty, although both were present in the same fully overlapped visual field. Odd events in the unattended episode were rarely noticed. It was very difficult to monitor both episodes at once. Performance was no better when the two episodes were presented to different eyes (dichoptic condition) than when both were given binocularly. It is argued that selective attention does not involve special mechanisms to reject unwanted information, but is a direct consequence of skilled perceiving.
Article
The aggressive and heterogeneous nature of lung cancer has thwarted efforts to reduce mortality from this cancer through the use of screening. The advent of low-dose helical computed tomography (CT) altered the landscape of lung-cancer screening, with studies indicating that low-dose CT detects many tumors at early stages. The National Lung Screening Trial (NLST) was conducted to determine whether screening with low-dose CT could reduce mortality from lung cancer. From August 2002 through April 2004, we enrolled 53,454 persons at high risk for lung cancer at 33 U.S. medical centers. Participants were randomly assigned to undergo three annual screenings with either low-dose CT (26,722 participants) or single-view posteroanterior chest radiography (26,732). Data were collected on cases of lung cancer and deaths from lung cancer that occurred through December 31, 2009. The rate of adherence to screening was more than 90%. The rate of positive screening tests was 24.2% with low-dose CT and 6.9% with radiography over all three rounds. A total of 96.4% of the positive screening results in the low-dose CT group and 94.5% in the radiography group were false positive results. The incidence of lung cancer was 645 cases per 100,000 person-years (1060 cancers) in the low-dose CT group, as compared with 572 cases per 100,000 person-years (941 cancers) in the radiography group (rate ratio, 1.13; 95% confidence interval [CI], 1.03 to 1.23). There were 247 deaths from lung cancer per 100,000 person-years in the low-dose CT group and 309 deaths per 100,000 person-years in the radiography group, representing a relative reduction in mortality from lung cancer with low-dose CT screening of 20.0% (95% CI, 6.8 to 26.7; P=0.004). The rate of death from any cause was reduced in the low-dose CT group, as compared with the radiography group, by 6.7% (95% CI, 1.2 to 13.6; P=0.02). Screening with the use of low-dose CT reduces mortality from lung cancer. (Funded by the National Cancer Institute; National Lung Screening Trial ClinicalTrials.gov number, NCT00047385.)
Article
Inattentional blindness (IB) occurs when an observer, who is engaged in a resource-consuming task, fails to notice an unexpected although salient stimulus appearing in their visual field. The incidence of IB is affected by changes in stimulus-driven properties, but little research has examined individual differences in IB propensity. We examine working memory capacity (WMC), processing styles (flicker task), inhibition (Stroop task), and training in predicting IB. WMC is associated with IB (Experiments 1 and 2) but neither processing style (Experiment 1) nor inhibition (Experiment 2) was associated. In Experiment 2, prior training on a task reduced the incidence of IB compared to no prior training, and this effect was significantly larger when trained on the same tracking task as that used in the IB task rather than a different task. We conclude that IB is related to WMC and that training can influence the incidence of IB.
Article
Most studies of inattentional blindness (the failure to notice an unexpected object when attention is focused elsewhere) have focused on one critical trial. For that trial, noticing the unexpected object might be a result of random variability, so that any given individual would be equally likely to notice the unexpected object. On the other hand, individual differences in the ability to perform the primary task might make noticing more likely for some individuals than for others. Increasing the difficulty of the primary task has been shown to decrease noticing rates for both brief static displays (Cartwright-Finch & Lavie, 2007) and dynamic monitoring tasks (Simons & Chabris, 1999). However, those studies did not explore whether individual differences in noticing arise from differences in the ability to perform the primary task. For our Experiment 1, we used a staircase procedure to equate primary task performance across individuals in a dynamic inattentional blindness task and found that the demands of the primary task affected noticing rates when individual differences in accuracy were minimized. In Experiment 2, we found that individual differences in primary task performance did not predict noticing of an unexpected object. Together, these findings suggest that although the demands of the primary task do affect inattentional blindness rates, individual differences in the ability to meet those demands do not.
Article
PURPOSE: The Lung Image Database Consortium (LIDC) was created by the National Cancer Institute to create a public database of annotated thoracic computed tomography (CT) scans as a reference standard for imaging research. This effort was augmented by the Foundation for the National Institutes of Health through the Image Database Resource Initiative (IDRI). The LIDC/IDRI Database is intended to facilitate computer‐aided diagnosis (CAD) research for lung nodule detection, classification, and quantitative assessment. METHOD/MATERIALS: The LIDC/IDRI Database contains 1018 CT scans collected retrospectively from the clinical archives of seven academic institutions. Each scan was reviewed asynchronously by four thoracic radiologists through a two‐phase process. During the first “blinded read” phase, each radiologist independently reviewed the scans and marked lesions they identified according to one of three categories: “nodule ≥ 3 mm,” “nodule < 3 mm,” and “non‐nodule ≥ 3 mm.” The second “unblinded read” phase allowed each radiologist to review the marks of all other radiologists and confirm or modify their own marks. For any lesion that a radiologist marked as a “nodule ≥ 3 mm,” that radiologist constructed nodule outlines in every CT section in which the nodule appeared and provided subjective ratings of nodule characteristics such as subtlety, spiculation, solidity, and margin. The Database contains all images and radiologist marks for use by investigators. RESULTS: The Database contains 7371 lesions marked by at least one radiologist as either a “nodule ≥ 3 mm” or a “nodule < 3 mm.” 2669 lesions were marked by at least one radiologist as a “nodule ≥ 3 mm,” of which 777 (29.1%) were assigned such a mark by only a single radiologist, and 928 (34.8%) received such marks from all four radiologists. CONCLUSIONS: The LIDC/IDRI Database is expected to become a powerful resource as a reference standard for the medical imaging research community.
Article
Rationale and objectives: The authors tested the hypothesis that satisfaction of search effect, which is associated with the failure to detect native chest abnormalities in the presence of simulated nodules, is caused by reduced gaze on the native abnormalities. Materials and methods: Gaze dwell time of 20 radiologists was recorded for the region around abnormalities on images. Ten radiographs were reviewed, nine of which contained native abnormalities. Each image was seen with and without a simulated nodule. Results: The decrease in the rate of true-positive findings in the detection of native abnormalities on images that contained simulated nodules confirmed the occurrence of a satisfaction of search effect. Gaze times on native abnormalities (up to the time of report of the abnormalities) were the same for images with nodules in which native abnormalities were missed (gaze time, 9.4 seconds) as they were for images without nodules in which native abnormalities were detected (gaze time, 9.5 seconds). Gaze time on missed native abnormalities was not affected by the presence (7.80 seconds) or absence (7.45 seconds) of nodules. Conclusion: Reduction in gaze dwell time on the missed abnormalities is not the cause of satisfaction of search errors in chest radiographs.