To Engage or Not to Engage with AI for Critical Judgments:
How Professionals Deal with Opacity When Using AI for
Medical Diagnosis
Sarah Lebovitz,a Hila Lifshitz-Assaf,b Natalia Levinab
a McIntire School of Commerce, University of Virginia, Charlottesville, Virginia 22903; b Stern School of Business, New York University, New York, New York 10012
Contact: sl5xv@comm.virginia.edu, https://orcid.org/0000-0003-4853-3220 (SL); hlassaf@stern.nyu.edu, https://orcid.org/0000-0002-3461-003X (HL-A); nlevina@stern.nyu.edu (NL)
Received: January 15, 2020
Revised: December 29, 2020; June 19, 2021; September 13, 2021
Accepted: October 2, 2021
Published Online in Articles in Advance: January 10, 2022
https://doi.org/10.1287/orsc.2021.1549
Copyright: © 2022 INFORMS
Abstract. Artificial intelligence (AI) technologies promise to transform how professionals conduct knowledge work by augmenting their capabilities for making professional judgments. We know little, however, about how human-AI augmentation takes place in practice. Yet, gaining this understanding is particularly important when professionals use AI tools to form judgments on critical decisions. We conducted an in-depth field study in a major U.S. hospital where AI tools were used in three departments by diagnostic radiologists making breast cancer, lung cancer, and bone age determinations. The study illustrates the hindering effects of opacity that professionals experienced when using AI tools and explores how these professionals grappled with it in practice. In all three departments, this opacity resulted in professionals experiencing increased uncertainty because AI tool results often diverged from their initial judgment without providing underlying reasoning. Only in one department (of the three) did professionals consistently incorporate AI results into their final judgments, achieving what we call engaged augmentation. These professionals invested in AI interrogation practices—practices enacted by human experts to relate their own knowledge claims to AI knowledge claims. Professionals in the other two departments did not enact such practices and did not incorporate AI inputs into their final decisions, which we call unengaged “augmentation.” Our study unpacks the challenges involved in augmenting professional judgment with powerful, yet opaque, technologies and contributes to literature on AI adoption in knowledge work.
History: This paper has been accepted for the Organization Science Special Issue on Emerging Technologies and Organizing.
Keywords: artificial intelligence • opacity • explainability • transparency • augmentation • technology adoption and use • uncertainty • innovation • professional judgment • expertise • decision making • medical diagnosis
Introduction
Artificial intelligence (AI) technologies are edging
closer to human capabilities and are often positioned
as a revolutionary resource promising continuous im-
provements in problem-solving, perception, and rea-
soning (Rai et al. 2019). These technologies are seen as
enablers of a fundamental organizational transforma-
tion (Faraj et al. 2018, von Krogh 2018, Kellogg et al.
2019), especially when it comes to professional work
(Barley et al. 2017, Erickson et al. 2018). Heated de-
bates are emerging around whether, over time, AI
technologies are more likely to “automate” professional work on certain tasks by fully replacing human input or to “augment” it by keeping human experts in
the loop (e.g., Brynjolfsson and Mitchell 2017, Kellogg
et al. 2019, Seamans and Furman 2019). Private and
public organizations increasingly opt for human-AI
augmentation, assuming it will generate value through
the synergistic integration of the diverse expertise that
AI and experts each offer. In this paper, we study how
human-AI augmentation for critical decisions unfolds
in practice by closely investigating how professionals
use AI tools to form three different medical diagnosis
judgments.
Human-AI augmentation is increasingly depicted as
“human-AI collaboration” (e.g., Wilson and Daugherty
2018, Puranam 2021, Raisch and Krakowski 2021), em-
phasizing the need to integrate potentially divergent
viewpoints. Drawing on the organizational literature
on collaboration, we know that such integration in-
volves transforming knowledge—a process that re-
quires both understanding the meaning behind others’
inputs and being willing to change one’s initial position
(Carlile 2004, Maguire et al. 2004, Hardy et al. 2005, Levina 2005). It is well known that achieving effective collaboration in knowledge work is difficult as experts cannot always explain their reasoning because of the tacit nature of knowledge (Polanyi 1958, 1966), and their collaborators may not be willing to listen to unfamiliar viewpoints (Carlile 2004, Maguire et al. 2004, Levina 2005).
The problems of establishing an understanding
across diverse bases of expertise and being open to al-
ternative viewpoints are exacerbated in situations when
the reasoning behind them is inaccessible. This is partic-
ularly likely to occur when humans face a divergent
viewpoint expressed by an AI tool—the so-called
“opaque AI” problem. Modern AI tools, such as deep learning algorithms, often appear as “black boxes” to
users because it may be very difficult or even impossi-
ble to examine how the algorithm arrived at a particular
output (Pasquale 2015, Christin 2020, Diakopoulos 2020). Although experiencing opacity and using “black box” technologies (e.g., cars or computers) is ubiquitous
(Anthony 2021), problems arise when there is a need to
integrate diverse knowledge claims into a single deci-
sion that a human expert can stand behind. This is the
case for many scenarios of AI use for critical decisions,
such as in medicine, human resource management, and
criminal justice, where opacity associated with AI use
is particularly problematic (Waardenburg et al. 2018,
Christin 2020, Van Den Broek et al. 2021).
In professional collaboration, human experts inte-
grate diverse knowledge by developing joint practices
based on shared interests and common understand-
ings (Bechky 2003b). This enables them to engage in
dialogue, at least partially uncovering one another’s
reasoning in order to arrive at a joint decision. What
would it take for human experts to be able to trans-
form their knowledge based on inputs from black box
machines? We set out to explore how experts using AI
tools are dealing with opacity and considering wheth-
er to alter their initial knowledge claims based on the
AI input.
Following a rich tradition of organizational stud-
ies investigating technology in work practices (e.g.,
Orlikowski 1992, Leonardi and Bailey 2008, Barrett
et al. 2012, Mazmanian et al. 2013, Lifshitz-Assaf
2018), we conducted an ethnographic field study
within a major tertiary hospital in the United States
that is using AI technologies for diagnostic radio-
logy. Medical diagnosis in general and diagnostic
radiology in particular represent some of the pre-
mier examples of professional work that is expected
to undergo dramatic transformation as AI technologies continue advancing.1 We investigate radiologists’ use of AI tools for diagnostic processes in
three different departments, focusing on their work
practices in diagnosing lung cancer, breast cancer,
and bone age.
We show how radiologists invested their efforts
into reducing uncertainty when forming their diag-
nosis judgments and how the opacity they experi-
enced when using AI tools initially increased this
uncertainty in all three settings. Of the three depart-
ments we studied, only in one (when diagnosing
lung cancer) were the professionals able to use AI re-
sults to enhance their own expertise—the stated goal
of the human-AI augmentation. This was a case of
what we call engaged augmentation, where professio-
nals were regularly integrating the AI knowledge
claims with their own. They were able to relate AI
results to their initial judgment and reconcile diver-
gent knowledge claims by enacting “AI interroga-
tion practices,” which required a significant resource
investment on behalf of the professionals who were
already highly overextended in their daily work. In
the other two departments (when diagnosing breast
cancer and bone age), professionals enacted what we
call unengaged “augmentation,” where they were
either regularly ignoring AI’s input or accepting it
without much reflection. Our study contributes to
the nascent understanding of human-AI augmenta-
tion practices by unpacking how humans experience
and deal with opacity when using AI tools.
Background Literature
Augmenting Professional Expertise with AI
Two scenarios of AI use, either through automation or
augmentation, are increasingly debated across academic,
practitioner, and policy communities (e.g., Brynjolfsson
and Mitchell 2017, Benbya et al. 2021, Cremer and Kasparov 2021, Raisch and Krakowski 2021). In this
study, we concentrated on the augmentation scenario,
which the literature largely equates with “human in the loop” AI use, whereby human experts and AI technologies work together to accomplish a task. The word augmentation is defined as a process of enlargement or making something grander or superior. Indeed,
scholars describe human-AI augmentation as an expan-
sion of expertise or knowledge where humans and ma-
chines “combine their complementary strengths” and are “multiplying their capabilities” (Raisch and Krakowski 2021, p. 193). Through this expansion of expertise,
human-AI augmentation is expected to positively im-
pact organizations through superior performance or im-
proved efficiency (e.g., Brynjolfsson and McAfee 2014,
Davenport and Kirby 2016, Daugherty and Wilson 2018).
Embracing the vision of multiplying diverse
expertise, many scholars describe human-AI aug-
mentation as humans and machines “collaborating”
together (e.g., Wilson and Daugherty 2018, Boyaci
et al. 2020, Khadpe et al. 2020, Gao et al. 2021,
Puranam 2021). Prior organizational literature on ef-
fective collaboration among diverse human experts
shows how experts learn ways of working together
to leverage and combine the complementary capabil-
ities (Maguire et al. 2004, Hardy et al. 2005). Effective
collaboration in knowledge work involves trans-
forming and integrating knowledge through a pro-
cess of relating the knowledge of others to one’s
own knowledge (Carlile 2004, Levina 2005, Levina
and Vaast 2005). This requires collaborators to be
willing and able to understand the meaning behind
others’ input as well as to potentially change one’s knowledge claims (Carlile 2004, Levina 2005). A col-
laboration that effectively integrates divergent
knowledge results in individuals not only “adding
to” but also “challenging” one another’s input, which is distinguished from merely “ignoring” input
without reflection (Levina 2005). Extending this liter-
ature to AI use, the expectation is that human ex-
perts “collaborating” with AI tools are transforming
their knowledge by integrating AI results in a way
that potentially challenges an expert’s initial judgment. Indeed, Raisch and Krakowski (2021, p. 202)
assert this expectation when describing augmenta-
tion as a tight coupling of human experts and ma-
chines influencing one another, wherein “machine
outputs are used to challenge humans, and human
inputs to challenge machines.”
Transforming knowledge is challenging when
collaborators are unable to interrogate the other’s
knowledge claims. Human experts develop collabo-
ration practices based on their shared interests
and common understandings that allow them to de-
liberate each other’s knowledge claims (Carlile 2004,
Maguire et al. 2004, Levina 2005), despite their inabil-
ity to fully explicate their reasoning (Polanyi 1958,
1966). Although organizational scholarship has investigated how knowledge workers deal with tacit knowledge for over three decades
(e.g., Kogut and Zander 1992), we know relatively
little about dealing with the opacity of modern
technologies.
Opacity and AI Technologies
Issues of opacity, or the antithesis of transparency,
associated with organizational adoption of modern
technologies have increasingly been a topic of dis-
cussion and concern in many research and practi-
tioner communities (e.g., Zuboff 2015, Turco 2016,
Albu and Flyverbom 2019, Leonardi and Treem
2020). Opacity refers to the difficulty of understanding
the reasoning behind a given outcome when such
reasoning is obscured or hidden from view (Stohl
et al. 2016). Although researchers initially argued that the use of information technology would lead to
increased transparency—as more information about
activities and decision making was captured digital-
ly and could potentially be accessed and examined
by third parties—recent writings have pointed out
the fallacy of this thinking (Hansen and Flyverbom
2015, Stohl et al. 2016, Leonardi and Treem 2020).
Studying social media platforms as an example,
Stohl et al. (2016, p. 125) identify a “transparency paradox,” arguing that, although increased use of information technology may make information more visible, in certain cases it may actually
reduce transparency. This line of argument may be
extended to the adoption and use of modern AI
tools. Today, such tools are developed with the
aim of transforming the glut of “big data”into a di-
gestible piece of highly relevant information—the al-
gorithmic output. These outputs are often presented to users with minimal transparency into
how the AI tool generated them. Yet, because of con-
straints of limited time and bounded rationality,
even if all the data and logic underlying an algorith-
mic output became accessible, transparency may still
not be likely (Leonardi and Treem 2020).
The concept of opacity has gained prominence in the
context of organizational adoption of AI tools (e.g.,
Burrell 2016, Faraj et al. 2018, Christin 2020), especially
those tools that use deep learning methodologies.
These methods often rely on numerous algorithms cal-
culating weighted probabilities that are transferred and
transformed through complex multilayered networks
before a given output is generated for users. AI tools
using such methods are often referred to as “black
boxes” because they may generate unexpected or sur-
prising outputs that end users and even AI developers
are unable to explain or understand (Pasquale 2015,
Dourish 2016, Diakopoulos 2020). In the current litera-
ture, opacity of AI tools typically describes the lack of
explanations provided as to “why a specific decision was made that are understandable to users, even when they have little technical knowledge” (Glikson and Woolley 2020, p. 631). This focuses on enacted moments of AI use, whereby individuals lack the practical ability to know the reasoning behind a specific AI result
presented to them, which is distinct from how individ-
uals may lack the ability to evaluate a particular AI tool
when examining its technical methodology, training
and validation data, and performance measures (Lebo-
vitz et al. 2021).
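To make this concrete, the following minimal sketch (ours, purely illustrative, written in Python with NumPy and not drawn from the CT or Mammo AI tools studied here; all names and weights are hypothetical stand-ins) shows the kind of interaction such opacity produces: a small feed-forward classifier hands the user a single score, while whatever "reasoning" exists is distributed across weight matrices that are not interpretable to that user.

import numpy as np

# Illustrative sketch only: a tiny feed-forward network whose only
# user-facing output is one score. Weights are random stand-ins.
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(64, 16)), np.zeros(16)   # stand-in weights, layer 1
W2, b2 = rng.normal(size=(16, 8)), np.zeros(8)     # stand-in weights, layer 2
W3, b3 = rng.normal(size=(8, 1)), np.zeros(1)      # stand-in weights, output

def suspicion_score(image_patch):
    """Return one probability-like score for a flattened 64-pixel patch."""
    h1 = np.maximum(image_patch @ W1 + b1, 0)      # hidden layer 1 (ReLU)
    h2 = np.maximum(h1 @ W2 + b2, 0)               # hidden layer 2 (ReLU)
    logit = (h2 @ W3 + b3)[0]                      # single linear output
    return 1.0 / (1.0 + np.exp(-logit))            # sigmoid "probability"

patch = rng.normal(size=64)                        # stand-in for pixel data
print(round(suspicion_score(patch), 2))            # prints only a bare score; no reasoning attached

In this toy setting, as with the tools described above, the user receives the final number but has no practical way to trace which features of the input drove it.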
Although the goal of achieving transparency in AI
tools seems more necessary than ever—as more and
more critical judgments are involving AI tools—the
ability to achieve this goal seems more elusive than
ever. Scholars, including some computer scientists,
are now discussing AI’s “fundamental opacity,” argu-
ing that transparency may be technically infeasible
(e.g., Ananny and Crawford 2016, Xu et al. 2019).
Supporters of this view argue that, given the growing
complexity of methods and input data sets, “there
may be something, in the end, impenetrable about al-
gorithms” (Gillespie 2014, p. 192). Some scholars go so
far as to say that achieving transparency in the use of
AI is so difficult that it may be necessary “to avoid us-
ing machine learning algorithms in certain critical domains
of application” (Burrell 2016, p. 9, emphasis added).
Not only computer scientists but also scholars from a wide range of fields including law, ethics, political
science, information sciences, and management are
arguing that using AI for judgments with serious
individual or societal consequences may be prob-
lematic. This challenge has led to the creation of
multidisciplinary research communities focused on
issues of transparency, ethics, and fairness in tech-
nology (e.g., Caplan et al. 2018, Crawford et al.
2019). The research in this community broadly cov-
ers three areas.
The first area explores how the design of algorithmic
models can be more transparent to help address issues
of fairness and social justice (e.g., Barocas et al. 2020, Bird et al. 2020, Kaur et al. 2020). For instance, some scholars
in this area are focused on developing models that can
show how unjust outcomes produced by machine learn-
ing models are highly related to bias that exists in the
training data. Despite this community’s progress using
advanced computational methods to improve transpar-
ency toward fairness and equality (e.g., Hooker et al.
2019, Samek et al. 2019, Fernández-Loría et al. 2020),
most AI tools (and the potential impact of their results
on such issues) are still perceived as largely opaque by
their users. This is in part because of the growing com-
putational complexity of deep learning models and the
“curse of dimensionality”when attempting to assert
what features from massive sets of input data are yield-
ing specific predictions (Domingos 2015).
The second area of research explores the relationship
between algorithmic transparency and professional ac-
countability (Pasquale 2015, Diakopoulos 2020). This
work is based on the reasoning that a system can be
better governed if its inner workings are more transpar-
ent and known (Ananny and Crawford 2016). This is
critical because introducing AI tools into a professional
work setting may transform existing distributions of
responsibility and accountability without providing the
ability to view or understand the underlying logic (e.g.,
Scott and Orlikowski 2012, Ananny and Crawford
2016, Caplan et al. 2018). Related questions are also be-
ing raised about the impact of opacity on new forms of
algorithmic management and control, as workers are
often unaware of how algorithms are directing and
evaluating their work (Kellogg et al. 2019, Watkins
2020).
The third area of research focuses on classifying and
characterizing the types and sources of transparency
and opacity associated with AI systems. Some work in
this area has focused on distinguishing, for example,
between the transparency of a system’s training data
sets and the transparency about the specific features
and weights that led an algorithm to a given outcome
(e.g., von Krogh 2018, Diakopoulos 2020). Another
area within this topic has investigated the reasons be-
hind opacity of AI tools, such as intentional organiza-
tional or managerial secrecy, technical complexity of
the tools, and structural factors that preexisted the
AI system, among other reasons (Burrell 2016, Christin
2020).
Today, despite the enduring challenges of opacity,
AI tools are increasingly being implemented in con-
texts where professionals are expected to integrate
their own knowledge with AI results when forming
judgments in critical contexts (Nunn 2018, Razorthink
Inc. 2019). Prior research has shown knowledge work-
ers attempting to examine the underlying logic as
they encounter new technologies, such as digital sim-
ulation technology in manufacturing (Bailey et al.
2012) and engineering (Dodgson et al. 2007). Howev-
er, in modern contexts of human-AI augmentation,
professionals are expected to “collaborate” and trans-
form knowledge without the practical ability to exam-
ine or evaluate AI knowledge claims. Thus, our study
focuses on the following question: How do professionals experience and deal with opacity when using AI
tools to form critical judgments?
Investigating Opacity of AI-in-Use Through
Sociomaterial Practices of Knowledge Work
To investigate this question, we focus theoretically
on the sociomaterial practices of knowledge work
that AI tools are involved in. We adopt a relational
ontology that assumes the entangled nature of actors
and materials and foregrounds the performativity of
practices (Barad 2003, Suchman 2007). This perspec-
tive emphasizes the way in which technologies and
actors are inseparable and continually (re-)produce
one another through practices situated within partic-
ular social and historical contexts (Orlikowski 2007,
Suchman 2007, Orlikowski and Scott 2008, Leonardi
2011). This lens has been used to uncover important
insights when studying organizational uses and im-
pacts of other technologies, such as enterprise inte-
gration platforms (Wagner et al. 2010,2011), social
media tools (Scott and Orlikowski 2014), online com-
munityplatforms(Barrettetal.2016), and robotic
tools (Barrett et al. 2012, Beane and Orlikowski
2015). We follow the argument of Suchman (2007,
p. 1) to shift from “categorical debates,” in our case, around AI and opacity, to “empirical investigations of concrete practices,” in which individuals and
technologies act together.
Adopting this view means focusing on the genera-
tive materiality of technical infrastructures and treat-
ing the technologies-in-use (AI and otherwise) as part
of the sociomaterial configuration (Mazmanian et al.
2014, Scott and Orlikowski 2014, Barrett et al. 2016). In
particular, focusing on situated configurations empha-
sizes that individuals’ understandings about a given technology vary across local meaning systems (Pinch and Bijker 1987, Mol 2003, Leonardi and Barley 2010).
This means leaving opacity to be realized in practice
“depending on the actor’s situatedness” (Haraway
1988). Therefore, instead of conceptualizing opacity
as an inherent or fixed feature of AI tools, we view opac-
ity as something produced and enacted through prac-
tices situated in specific organizational configurations
(Orlikowski 2000, Leonardi 2011). Using this lens, we
set out to examine how opacity of AI-in-use is experi-
enced and dealt with when professionals use AI to form judgments.
Methods
Research Setting
We conducted an in-depth field study within three
different departments in a large diagnostic radiology
organization at Urbanside, a teaching hospital in a
major U.S. city. Diagnostic radiology is a specialized
medical field in which medical imaging is analyzed to
diagnose and treat diseases, and it has been at the
forefront of adopting cutting-edge technologies (AI
and non-AI) for decades (e.g., Barley 1986). Recently,
a great debate has been unfolding as to the impact of
AI tools on professionals in this field and how AI may
entirely replace the radiology profession (Mukherjee
2017, Recht and Bryan 2017, Grady 2019). We de-
signed our study, following the tradition of field stud-
ies of technologies, work, and organizations (Barley
1990, Orlikowski 2000, Lifshitz-Assaf 2018, Bechky
2020), to investigate three radiology departments
within the same organization and enable us to deepen
our investigation of professionals’work with AI tools.
Data Collection
Starting in late 2018, we immersed ourselves in the
field of diagnostic radiology, attending professional
conferences, symposia, and vendor events, to under-
stand the opportunities and challenges on the profes-
sional field’s horizon. Ethnographic field work began
in January 2019 and covered 40 radiologists (li-
censed doctors or senior fellows offered positions
upon completing their fellowship) across three de-
partments actively using AI tools: breast imaging,
chest imaging, and pediatric imaging.
Observation. The primary source of data for this
study is ten months (over 500 hours) of ethnographic
observation (Van Maanen 1988). We documented over
1,000 cases of radiologists forming diagnoses in de-
tailed written observational notes, which were tran-
scribed and supplemented upon leaving Urbanside
facilities each day. Because Urbanside radiologists
trained medical students and residents, we often
captured radiologists verbally articulating their diag-
nostic reasoning, drawing on past experiences and re-
search, describing common errors and strategies to
avoid them, and so forth. Radiologists often quizzed
trainees about important diagnostic practices and
philosophies (e.g., “What might hypoinflation indicate
in a newborn?” or “What might indicate stroke on MRI?”) and then offered their own thoughts. During
periods of observation, we paid close attention to the
technologies-in-use, capturing the role of the tools in
the diagnostic process, the results they produced,
what meanings emerged around the tools, and so
forth. Over the course of our field work, we observed
diagnostic cases involving and not involving AI tools.
Observing cases not involving AI tools strengthened
our understanding of radiologists’ analytical practices.
Even for diagnosis scenarios typically involving AI
tools, we also observed cases of radiologists not using
the tools, such as during technical outages or when
working for satellite locations with different technical
infrastructures.
Interviews. Observational data were enriched through
33 semistructured interviews (Spradley 1979). Twenty-
one informal interviews took place as radiologists con-
ducted their work or during short breaks, covering
questions about unclear aspects of diagnoses for recent
patient cases, interactions with their colleagues or pa-
tients, or specific moments of using or not using vari-
ous technologies. Twelve formal interviews allowed
us to deepen our understanding of what it means to
be a radiologist, how they go about their diagnostic
work, their perceptions of various technologies, and
so forth. All formal interviews and some informal in-
terviews were recorded (with informants’consent)
and transcribed.
Documentation and Artifacts. Finally, we collected
documentation and artifact data, which served multi-
ple purposes in our study. First, we captured artifacts
produced and used by radiologists in their daily
work, including medical notes and photographs or
drawings of medical images they were referencing.
These materials supplemented observational notes
and strengthened our analysis when reconstructing
their diagnosis process. Next, we collected technical
research papers, regulatory filings, and vendor docu-
mentation to study the three focal AI tools and the
nature of their outputs. In the United States, after reg-
ulators approve a clinical AI system, it can no longer
change (or “actively learn”). Vendors can request ad-
ditional approval for updated software versions,
which can then be deployed in clinical settings. Thus,
we observed the use of unchanging technologies
throughout our study.
Data Analysis
In keeping with the principles of grounded theory de-
velopment, we engaged in iterative rounds of data
analysis during and throughout our data collection
(Glaser and Strauss 1967, Charmaz 2014). In the early
stages, we conducted open coding to capture a broad
range of emerging themes. Within the first few
months, the prominence of radiologists expressing
doubt, asking questions, double checking, and con-
ducting deep analytical practices was striking in the
data. We were also struck by the frequency of ques-
tions and confusion surrounding the AI results radiol-
ogists viewed. We, therefore, conducted targeted
rounds of data collection and analysis to deepen our
understanding of these themes.
Although all radiologists appeared to be “using”
the AI tools (clicking to display its results after form-
ing their initial judgment), we noticed different pat-
terns in the degree that AI results were influencing
radiologists’ final judgments (e.g., “pausing to consider AI results,” “updating original diagnosis,” or “quickly disregarding AI results”). In all three de-
partments, the AI results and the radiologists’ opin-
ions often diverged, and confusion and frustration
often followed. Deeper analysis led us to relate their
frustration to the lack of understanding of why a
given AI result was produced (e.g., “questioning
what the AI is looking at” or “guessing factors be-
hind AI output”). When we investigated the three AI
tools and the nature of their output, we found many
similarities; each reported high-performance met-
rics, used neural network classification methods,
and offered no explanation of its results to users.
Yet, despite similarities in the tools and radiologists’
consistent frustration, only radiologists diagnosing
lung cancer were regularly incorporating AI results,
whereas the other radiologists mostly ignored the
tools’results.
Next, we set out to understand what was behind
these divergent patterns. We mapped step by step
how radiologists formed each different type of diagno-
sis and analyzed their process along multiple dimen-
sions, such as what aspects of the diagnosis prompted
doubt, how evidence was analyzed, perceptions of
the AI tool and its results, and so forth. We studied ra-
diologists’ similarities and differences among the diag-
nostic settings and their analytical practices and saw
noteworthy differences in the materialities of the imag-
ing technologies-in-use (computed tomography (CT)
scans, mammography, and x-ray) and the breadth and
depth of analysis that were afforded. Iterating with the
literature on professional adoption of technology led
us to analyze how senior and junior radiologists used
the tools similarly and how all radiologists held simi-
lar attitudes about AI adoption. Further analysis led us
to focus on a key difference in how radiologists inte-
grated the AI result (or not) using what is called “AI
interrogation practices.” We continued to sharpen our
analysis by consulting literatures on epistemic uncer-
tainty and opacity, which further enhanced our formal
theory development; we describe this in the following
section.
Findings
Diagnosing patients is a critical process that requires
the extensive expertise, training, and judgment of di-
agnostic radiology professionals. Radiologists devel-
op deep expertise in diagnosis through at least six
years of intense, immersive education after medical
school. In their daily work, they strive to provide the
best possible care to their patients and take their role
in patients’ health outcomes very seriously. They
work under resource constraints and time pressure,
as healthcare facilities respond to intense pressures
to increase patient volumes and reduce costs. In re-
cent years, powerful diagnostic AI tools have cap-
tured the attention of radiologists and healthcare
leadership. We present how radiologists in three de-
partments at Urbanside worked with AI tools to pro-
vide three critical types of medical diagnoses.
Producing Lung Cancer Diagnoses Using
AI Tools
Diagnosing lung cancer was a key focus of Urban-
side radiologists specializing in chest imaging. Like
others across the field, these radiologists were com-
mitted to producing the most accurate diagnoses
possible and positively impacting patients’ treat-
ment and health outcomes. As in other Urbanside
departments, they faced high workload demands
and felt strong pressure to work quickly. At the
same time, they provided thorough analyses, requir-
ing intense concentration and careful deliberation.
When diagnosing lung cancer, radiologists faced the
challenging task of identifying difficult-to-detect lung “nodules” and characterizing their likelihood
of malignancy. Radiologists were deeply aware of
the significant consequences of their diagnoses, both
the cost of falsely diagnosing a healthy patient and
the cost of missing signs of cancer, and worked with
high diligence.
Forming Critical Judgments (Without AI): Experienc-
ing High Uncertainty. While forming a lung cancer di-
agnosis, radiologists experienced three main sources
of uncertainty when reviewing their primary source
of evidence, CT imaging: multiple series of high-
resolution images (in five- and one-millimeter (mm)
“slices”) that digitally reconstructed three-dimensional
cross-sections of a patient’s upper body and supported
numerous settings and projections (e.g., from side or
overhead views with varying degrees of contrast).
First, they experienced great uncertainty while dis-
cerning “lung nodules” from the healthy lung tissue.
This involved searching for small white-appearing
circles within the varying shades of white to dark gray
lung tissue on the CT images. However, hundreds of
small white circular areas may be visible on a given CT
that represent normal tissue or bone (see Figure A.1),
and radiologists often wavered considerably while de-
ciding whether a particular area was a nodule or not.
One afternoon, a physician called Dr. E’s phone, re-
questing her opinion about a potential nodule on her
patient’s CT. After several moments of searching and
deliberating over the phone, Dr. E asked the physician,
“Do you mind if I look more closely and figure it out
and call you back?”Hanging up, she leaned closer to
the monitors and continued her analysis before finally
returning the call: “It’s very low density. It’s looking almost fat-like [which appears more gray than a typical nodule]. But it actually does look like a nodule. Sometimes I’m like, ‘Am I going crazy?’”
The second source of uncertainty emerged from ra-
diologists’ task of identifying each and every nodule
in the patient’s lung tissue. Very frequently, they ex-
pressed concern about the possibility of missing a
nodule, fearful of making consequential errors of
omission: “I don’t see anything major jumping out. Hopefully I’m not missing anything” (Dr. Y). This
struggle was related to the CT imaging not always
clearly capturing every region of a patient’s lung tis-
sue where nodules may be positioned, as in the
following case of Dr. E deliberating aloud: “Am I hal-
lucinating a nodule? … I think it’s there, but it’s hard to see. It’s in a bad location … It’s behind two ribs, so it’s impossible to get a good look there.” Dr. J explained how “there’s all this lung tucked in front and behind right there that you just don’t see [on CT imaging].” Radiologists’ difficulty examining these “impossible” areas of lung tissue using CT imaging
raised their uncertainty. In fact, they often concluded
that a seemingly nodule-free CT scan was not defini-
tive: “If you don’t see the nodule on one image, that
doesn’t mean it’s not there … A lot of missed cancers, like ten percent, were seen only from one view and not the other” (Dr. J). The CT, like other imaging tech-
nologies, may also be difficult to analyze when pa-
tients shift or fail to inhale deeply during the scan, as
in the case of Dr. S struggling to discern a particularly
blurry CT image: “It’s hard to tell because it’s such a
crappy study. He did not take a deep breath, did he?”
Radiologists worked to address these first two
sources of uncertainty by investing in various ana-
lytical practices during the “nodule search.” They
methodically combed through the CT images nu-
merous times, starting with the less granular set of
five-mm images and then the more granular one-
mm set, as Dr. J explained to medical students ob-
serving her work: “There is so much volume of data
on the images to deal with … We scroll faster at
first. It’s good to get a general overview first, then
we go to the smaller ones for deeper investigation.”
Then, their focus turned to further evaluating each
potential nodule they identified, scrolling slowly
through the neighboring slices to assess if it ap-
peared to “flow”in a continuous path (indicating
normal blood vessels) or disappear abruptly (indi-
cating a nodule): “You have to follow the vessels. If
it’s something you’re able to follow, then it’s proba-
bly just a vessel you’re catching, not a nodule.”They
increased their confidence using a technique called
“windowing” or assessing the different properties of
the tissue by adjusting the settings of the CT image
or changing its grayscale contrast: “Oh, I think it’s a vessel. Yeah, I don’t think it’s a nodule. Ah, yeah, I’m pretty sure. Windowing really helps” (Dr. E). As
a final measure to address lingering concern or con-
fusion, they may request additional imaging, as Dr. J
explained to onlooking medical students: “This area
looks ill-defined. So, somebody could call it a nod-
ule. We try to make a firm guess … but sometimes
we call for follow-up imaging because we really
can’t decide.”
Finally, radiologists faced a third, and relatively less
acute, source of uncertainty during the task of charac-
terizing each nodule’s likelihood of malignancy. Radi-
ologists applied fairly explicit criteria and standards
to each nodule they had identified: “Almost everyone
has nodules, but some of them can be cancer … You
go through each nodule and make sure it’s solid.
Then with the prior [CT images] that you’re compar-
ing to, you actually look at each nodule and visually
make sure they look the same” (Dr. Y). They first
gauged the patient’s overall risk level by reviewing
their medical details (e.g., clinical symptoms, age, his-
tory of illness). Next, they scrolled through the CT im-
ages several times to explore the nodule. They used
digital tools to precisely measure its dimensions and
noted whether it was larger than the five-mm stan-
dard associated with malignancy. They analyzed prior
CT imaging (if available), looking for changes or sta-
bility in the nodule’s appearance or size over time.
When Dr. J noted that a three-mm nodule was present
on a CT scan from five years earlier where it also mea-
sured three mm, she felt highly certain in characteriz-
ing the nodule as benign: “There it is [in the CT from
2014]! So it’s there [not new]. Oh, that’s stable [noting
the consistent 3mm measurement]. There it is. Okay,
now I’m good.”
Experiencing Opacity of AI-in-Use (and Increasing
Uncertainty). After completing their initial analysis,
radiologists then viewed the results of an AI tool im-
plemented to aid their lung cancer diagnosis. The Ur-
banside chest imaging department purchased an AI
tool several years prior, which we refer to as the “CT
AI tool,” as an add-on to the CT digital imaging tech-
nology from a leading healthcare technology provid-
er. Over the years, the tool was updated numerous
times to improve its technical sophistication and per-
formance. At the time of this study’s observation, the
tool performed imaging processing, segmentation,
and classification tasks utilizing artificial neural net-
works that were trained and validated using large
data sets of long-term radiological outcomes. Pub-
lished research showed these AI tools’ ability to iden-
tify and classify nodules was similar to radiologists’
cancer detection rates. Following regulatory guide-
lines, the CT AI tool was deployed as an “aid” to radi-
ologists, designated to be used after the radiologist
first formed his or her independent judgment.
When the radiologist clicked an icon on the digital workstation, the display instantly jumped to the first AI result, a
circle annotation placed on a precise location of the
CT image (Figure A.2). In the intermittent cases we observed where the AI result and the radiologists’ judgment converged, they quickly moved on to
complete the final report. Radiologists expressed de-
light and relief when the AI results confirmed their
previously uncertain assessment that no nodules were
present: “This time, [CT AI] found nothing. Any time
that happens, it puts a big smile on my face” (Dr. F).
They experienced a boost in confidence and certainty
after viewing the AI results, as Dr. W expressed: “If I
don't see any lung nodules, and [CT AI] doesn't see
any lung nodules, then okay, we're good! I now feel
very comfortable saying there's no lung nodules.”
However, in the majority of cases we observed, the
AI tool’s results diverged from the radiologist’s initial view. Regularly, the CT AI tool
did not mark a nodule the radiologist had identified.
Even more frequently, the tool flagged additional
areas that the radiologist had not identified. Radiolog-
ists began experiencing opacity, as they were unable
to understand these divergent AI results. They ques-
tioned what features of underlying lung tissue were
relevant to the tool’s decision: “How does [the AI
tool] know that this is a nodule, but this isn’t?” (Dr. V).
Radiologists were deeply committed to providing
judgments with maximum certainty, but they ex-
pressed difficulty feeling certain given the opacity
they experienced when considering divergent AI re-
sults: “I just don't know of any radiologist who's not
looking closely at the case because they have AI. Be-
cause at the end of the day, you're still responsible.
How can you trust the machine that much?” (Dr. E).
Dealing with AI Opacity: Enacting AI Interrogation
Practices and Incorporating AI Results. On the sur-
face, it may seem that using the AI tools (and
experiencing opacity) increased the overall uncertain-
ty these radiologists experienced; in fact, using the AI tool resulted in radiologists experiencing less uncertainty when making their final judgments. They
achieved this by using “AI interrogation practices,” or
practices that human experts enact to relate their own
knowledge claims to the AI knowledge claims. For
these radiologists, enacting AI interrogation practices
involved building an understanding of the AI result
and then reconciling the divergent viewpoints. They
examined the suspected area in question, zooming in
on that region of the CT image and scrolling forward
and backward to assess the tissue surrounding the
AI-marked region. They changed the contrast settings
on the CT to analyze the area’s size, shape, and densi-
ty and reviewed prior CT images to understand how
those features may have changed over time. They
were examining and probing the AI results in order to
understand them and ultimately, integrate them with
their own viewpoint.
Enacting AI interrogation practices led radiologists
to consistently integrate the AI results into their final
judgments. Radiologists regularly updated their initial
opinion after interrogating the AI results, either
through synthesizing the divergent opinions into a
new insight or through reflectively agreeing with the
AI result, as in the following case. After completing
his initial analysis, Dr. T was puzzled by three AI re-
sults suggesting nodules he had not initially flagged.
He began interrogating each area marked by the AI
tool, analyzing the CT imaging to try to understand
the AI result and how it related to his own view. He
decided to overrule one AI result and expand his orig-
inal opinion to include the two additional ones.
Even when radiologists decided to overrule the AI re-
sults, they experienced higher confidence reporting
that final diagnosis. This was the case after Dr. F swift-
ly interrogated two unexpected AI-marked areas and
related them to his own analysis: “This is what [CT
AI] picked up: there and there. It’s just normal stuff,
parts of the bones protruding from the chest which
sometimes looks like it could be a nodule.”
Enacting AI interrogation practices required radiol-
ogists to invest additional time and analysis. They
were willing to make that investment time and time
again, which reflected their positive views of the AI
results’ value, as expressed by Dr. F: “I know my limi-
tations and I know this [CT AI] is going to help them
[nodules] stand out a little better. It’s worth the extra
time in my mind.” They viewed the AI results as dis-
tinct and complementary to their own capabilities and
expressed strong positive opinions about the tool’s
value in their work. This was vividly expressed by Dr.
W, a senior radiologist who moved from another hos-
pital where they did not have a CT AI tool: “I actually
think [CT AI] is mission-critical. For me to read cases,
I absolutely love having the [CT AI]. I used to not
have it in my prior place [hospital]. I thought it was
the worst thing ever. And then when I came here, I
was amazed.”
Indeed, the practice of interrogating and integrating
the AI results had become a critical step in how these
experts formed their final judgments. This was re-
flected in Dr. V’s response one afternoon when the AI
results were unexpectedly unable to load for a CT she
was assessing. She instant messaged the CT techni-
cian, requesting the AI results for that study, and fol-
lowed up with a phone call when the technician did
not respond. She minimized that CT and began ana-
lyzing another case while she waited; a few minutes
later, she learned of technical issues disrupting the AI
services. Flustered, she returned to the minimized
case, scrolled through the CT several more times, and
reluctantly wrote the diagnosis report without AI in-
put: “Once in a while, it definitely picks up things that
you looked at yourself and you totally ignored, that
you just couldn’t see. Knowing that every now and
then it picks up something real makes you always
want to go back to it.”
Producing Breast Cancer Diagnoses Using
AI Tools
As breast cancer is prevalent and highly dangerous,
diagnosing it at the earliest and most treatable stage
was a great priority for radiologists specializing in
breast imaging. On a typical day, each Urbanside
breast radiologist evaluated over 100 patients—a
highly demanding workload—and was providing
life-or-death judgments in every case: “We have to
give our full attention to make the right call, but we
have so much volume we’re supposed to get through.
It’s a conflicting thing” (Dr. Q). On average, they
spent less than three minutes evaluating a case, an
amount of time that did not allow extensive delibera-
tions. The consequences of making these evaluations
were extremely high, as patients were either informed
they were not currently at risk or recommended to un-
dergo additional testing, biopsy, or treatment, which
resulted in patients bearing significant physical, emo-
tional, and financial costs.
Forming Critical Judgments (Without AI): Experienc-
ing High Uncertainty. While making critical judgments
about breast cancer, radiologists experienced two
main sources of uncertainty. First, like radiologists
conducting the lung nodule search, breast radiologists
wrestled with identifying abnormal areas within the
complex breast tissue anatomy. The main source of ev-
idence is mammography imaging: digital x-ray imag-
ing that provided four two-dimensional images and
four three-dimensional images (side and overhead
views of each breast). For certain patient scenarios, tar-
geted ultrasound imaging was also used.
Breast radiologists worked to detect every poten-
tial abnormality in the patient’s breast imaging and
knew that overlooking a single abnormality carried
extremely high consequences. On mammography,
abnormalities typically appear as small bright white
patches amidst normal tissue ranging from white to
dark gray (see Figure A.3). Because of the subtle dif-
ferences in tissue appearance, and the difficulty of
interpreting mammogram imaging, radiologists fre-
quently expressed concern about missing critical
findings. Dr. C explained: “The [abnormalities] we
worry about are really faint and tiny ones: those are
signs of early cancer. They’re the ones you can bare-
ly see … A[n abnormality] is going to be really really masked … you just can’t see the cancer … It’s like looking for a snowball in a snowstorm.” In some
cases, radiologists requested additional imaging to
be more sure, especially when the mammogram did
not capture areas of the patient’s body (often near
the armpit): “If I could see clearly in this area [point-
ing just outside the border of the image], I wouldn’t
be so concerned” (Dr. L).
To increase their confidence that they identified all
abnormalities, radiologists used careful analytical
practices. They combed over mammogram images,
zooming in closely on each region and scrolling
through each three-dimensional view multiple times.
They were searching for unusual patterns in the tissue
that may indicate “masses, calcifications, skin thicken-
ing, changes to the tissue, axillary lymph nodes, or
distortion” (Dr. K). They examined asymmetries in
the appearance of the left and right breast tissue, as
Dr. P described, referring to images on her screen:
“This is an area that caught my eye. This is the right
side, and this is the left. The right looks obviously dif-
ferent than the left. This is one of the things that our
eyes are trained to look for.”Using careful systematic
analysis that provided diverse views and evidence
helped ward off radiologists’ uncertainty, as Dr. G ex-
plained: “I zoom in even more, so I'm going to see
even the tiniest finding. I zoom in, like, a lot … until
I'm pretty sure I see all of them.”
The second, and more intense, source of uncertainty
was characterizing each abnormality’s likelihood of
being malignant or benign. Making this distinction for
breast cancer diagnosis was challenging. Radiologists
described breast cancer as a complex disease that may
develop in unexpected ways that often varied from
patient to patient. Breast tissue anatomy is complex,
and often, malignant breast tissue may closely resem-
ble healthy tissue on mammography. Numerous
pieces of evidence needed to be analyzed and synthe-
sized: the size, shape, edges, and density of the abnor-
mality on mammography, ultrasound, and MRI (mag-
netic resonance imaging) (if available); the degree of
change across prior imaging; a patient’s genetic make-
up; prior history of disease; and lifestyle choices, as
well as clinical symptoms and physical examinations.
Occasionally, the evidence would overwhelmingly
support a benign judgment, as in this case: “Those cal-
cifications are really big and chunky, so once I look at
it closely, then I can immediately ignore it [because it
is likely benign]” (Dr. C). More frequently, however,
some factors would suggest benign, whereas others
did not, and radiologists struggled to reduce the acute
uncertainty: “If I know something is fine or I know
something is bad, then I know right away. But there
are really a lot of cases I waffle on” (Dr. B). This is il-
lustrated by the following case. During her analysis,
Dr. C noted a small gray oval abnormality on a pa-
tient’s mammogram. Through her analysis, she noted
the oval’s small size and sharply defined edges, that
the area appeared stable for several years, and that
the medical history did not suggest increased risk (all
suggesting benign). Yet, she felt uncertain and ex-
haled deeply in frustration before ultimately recom-
mending the patient undergo additional testing: “It's
probably normal tissue, but it looks so oval. I’ve gotta
call her back [for additional testing]. I just can’t ignore
that spot.”
They expressed deeper anguish and deliberation
when judging the malignancy of abnormalities than
when searching for them, as Dr. Z explained:
“Deciding what to do with an abnormal finding [decid-
ing malignant versus benign]—as opposed to detect-
ing a finding in the mammogram—that takes much
more discerning.” Colleagues often disagreed about
an area’s likelihood of malignancy, especially because
the mammogram imaging and its various features
were open to multiple interpretations. Even after com-
pleting their full analysis, radiologists often second-guessed their final judgment, as portrayed by Dr. G’s
continued wavering: “Do I do a follow up or do I just
return to routine screening? That's really the difference
between being a cyst [benign] and something being a
solid mass [malignant]. And we can't always tell the
difference.”
To build certainty in this judgment, radiologists
used a variety of analytical practices. They zoomed in
on the mammogram to examine the appearance of the
abnormality and its density, size, shape, and edge clari-
ty. They gauged whether the abnormality had changed
or remained stable across prior years’ mammograms. They also studied the patient’s health records (e.g.,
physical symptoms, personal and family history, pa-
thology and surgical records) to gauge the patient’s
overall risk level and inform their emerging judgment.
In one case, Dr. L decided to recommend a biopsy after
considering a patient’s elevated risk factors, despite
the area’s otherwise benign appearance: “It’s not overt-
ly suspicious: it’s fairly circumscribed and it’s not
very oval [both suggesting benign diagnosis]. But
this patient is here because she just found out from a
genetic risk screen that she is at increased risk for
breast cancer.”
Experiencing Opacity of AI-in-Use (and Increasing
Uncertainty). After forming their initial judgment, ra-
diologists then reviewed the results of an AI tool im-
plemented to aid their diagnosis process. Several
years ago, Urbanside purchased an AI tool, which we
call the “Mammo AI tool,”as an add-on product to
the mammography software from the imaging tech-
nology vendor, one of the leading U.S. healthcare
technology providers. Since its implementation at Ur-
banside, the vendor provided numerous updates im-
proving the tool. During this study’s observations, the
Mammo AI tool performed imaging processing, seg-
mentation, and classification tasks utilizing artificial
neural networks trained and validated using large-
scale data sets with long-term radiological outcomes.
Published research reported that the tool could identi-
fy malignancies at similar rates as trained radiologists
and showed some indication of increasing radiolog-
ists’ overall cancer detection rates.² Following regula-
tory guidelines, the tools were deployed as an “aid”
to radiologists, who were required to view AI re-
sults only after forming their independent evaluation. The
tool was designed so that a single mouse click dis-
played the AI tool results: a series of shapes³ marking
the specific location on the mammogram that was
classified as malignant, with no further information
(Figure A.4).
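For illustration, the information conveyed by each on-screen mark can be rendered as a minimal data record. The sketch below is our own hypothetical Python rendering (the class and field names are invented, not the vendor’s format); it is meant only to underscore that a mark carried a location and a shape-coded classification type and nothing else.

```python
from dataclasses import dataclass

# Hypothetical sketch of what a single Mammo AI mark conveys to the
# radiologist: a location and a shape-coded classification (see endnote 3),
# with no confidence score, feature list, or rationale attached.
@dataclass
class MammoAIMark:
    x_pixel: int   # horizontal position of the flagged area on the mammogram
    y_pixel: int   # vertical position of the flagged area
    shape: str     # "star" = mass, "triangle" = calcification,
                   # "plus" = co-occurring mass and calcification

# A case's AI output, as experienced in use, amounts to a short list of marks.
example_case_output = [
    MammoAIMark(x_pixel=1412, y_pixel=880, shape="triangle"),
    MammoAIMark(x_pixel=964, y_pixel=1530, shape="star"),
]
```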
When the radiologist clicked the designated button,
the AI results appeared on the mammogram image,
which she compared with her initial judgment. In the
infrequent cases we observed where the opinions
converged, they swiftly proceeded to the final diag-
nostic report. However, in the large majority of cases
we observed, the AI results and the radiologists’
judgment diverged. On occasion, the AI tool did not
flag an area that was initially judged as abnormal,
and far more frequently, the AI tool flagged addi-
tional areas that the radiologist had not.
Radiologists experienced opacity as they encoun-
tered the AI tool’s unexplained results. They were un-
able to see what aspects of that tissue were causing
the AI tool to produce a given result: “I don’t know
why they marked these calcifications, what about all
these other calcifications (that the tool did not mark)?
They all look identical to me” (Dr. C). They expressed
frustration in their inability to understand the diver-
gent AI results: “What is it telling me to look at? At
this tissue? It looks just like the tissue over here, which
is perfectly normal … I have no idea what it’s
thinking” (Dr. K). Radiologists had no practical means
of knowing the underlying reasoning of a given AI re-
sult and experienced the opacity of AI-in-use, as Dr. H
explained: “[The AI tool] just points an area out and
leaves you to figure it out. It’s like it’s saying, ‘This is
weird; what do you want to do with it?’”
Dealing with AI Opacity: Not Enacting AI Interrogation
Practices and Not Incorporating AI Results. Like chest
radiologists’ use of the CT AI tool, in this department,
breast radiologists using the Mammo AI tool experi-
enced opacity and a surge in their level of uncertainty.
However, in this department, radiologists did not en-
act AI interrogation practices and ultimately, did not
regularly incorporate AI results into their final judg-
ments. Instead, when faced with divergent opinions,
the radiologists tended to review the image underly-
ing the AI result in a perfunctory way before ignoring
it, or “blowing it off,” as Dr. G described: “I blow so
many things [AI results] off. Like if there’s normal
scarring or stable calcs [benign tissue], it’s [AI tool]
going to pick up everything.” They quickly dismissed
AI-marked areas that they previously deemed normal
without deeper inspection, writing them off as “false
positives.” “I already knew that stuff it marked didn’t
matter. I saw the mass was there a couple of years ago
[in prior imaging]” (Dr. Z). It was also common for
them to ignore AI results when the AI tool did not
flag an area they initially considered abnormal: “If
there’s something that’s concerning to you, based on
your initial interpretation, that the [AI tool] is saying,
‘Oh, this looks normal,’ you couldn’t use that informa-
tion and say, ‘We’re not going to biopsy it’” (Dr. I).
Radiologists already faced extreme uncertainty and
intense time pressure, which suddenly multiplied
when they had to reconcile the (frequently) divergent
opinions of the Mammo AI tool: “So many different
factors are standing out to you all at once and giving
you conflicting information, and then there’s the result
from the software [Mammo AI]” (Dr. L). They ex-
pressed strong opinions that the Mammo AI results
did not add value to their process based on years of
repeatedly spending valuable time reviewing diver-
gent and unexplained AI results: “It isn’t helping any-
body. It’s actually just another step for me to do” (Dr.
K). Radiologists expressed negative views of having
to tediously check and ultimately, “blow off” AI re-
sults for every patient’s case, especially given the high
time pressure they faced: “It’s not worth my time to
evaluate it” (Dr. L). Only under specific conditions
(when analyzing highly dense breast tissue) did some
radiologists comment on the potentially complemen-
tary nature of the tool’s results: “Calcifications can be
really little and sometimes hard to see. It [Mammo AI
tool] sees those calcifications better than I do. But it
also sees all kinds of calcifications that are neither here
nor there” (Dr. B).⁴ Yet, in the same breath, Dr. B con-
veyed her view (shared by most of her colleagues)
that the AI results were often useless when making
her final judgments (they were “neither here nor
there”).
In the end, because of the lack of full feedback on
patients’ health over time, it is unclear whether radi-
ologists’ decisions to not incorporate AI results led
to more effective treatment or not. It is possible that
for some cases, had an AI result been incorporated,
additional patient testing may have been avoided.
For instance, Dr. L was examining new images for a
patient who had been recommended for additional
imaging by Dr. L’s colleague the week before. Dr. L
opened the patient’s original mammogram (from
the prior week) and reviewed the AI output, which
had not flagged the area that prompted her col-
league’s concern: “[Mammo AI] didn’t mark any-
thing on this one [the prior week’s image]. It didn’t
even mark the lesion that caught the radiologist’s
attention!” (Dr. L). Interestingly, after Dr. L review-
ed the patient’s new images, she recorded her opin-
ion that the area was benign. This pattern was not
uncommon; radiologists often recorded benign
judgments after reviewing additional imaging. In
this case, the original AI result was consistent with
the radiologist’s ultimate benign diagnosis; howev-
er, its accuracy is unclear without long-term patient
health outcomes.
Producing Bone Age Diagnoses Using AI Tools
“Bone age” evaluation is performed by radiologists
specializing in pediatric imaging, who assess the skeletal
maturity of children experiencing delays in growth or puberty.
This important diagnosis factors into considering
whether to treat the child with daily growth hormone
injections for a period of time. This diagnosis involves
comparing a child’s bone development with estab-
lished pediatric standards to determine whether it
falls within a “normal” or “abnormal” range for the
patient’s age. A pediatric radiologist at Urbanside
may perform seven or eight bone age evaluations on a
given day, among the variety of 40–50 other diagnoses
they provide (e.g., evaluating lung disease on CT
scans, gastrointestinal issues on ultrasound, or scolio-
sis on x-ray). Like in the previous departments, these
radiologists faced acute pressure to work quickly and
provide high-quality, time-sensitive assessments to
physician teams caring for young patients and their
concerned parents.
Forming Critical Judgments (Without AI): Experienc-
ing Lower Uncertainty. Unlike the previous two spe-
cialties, pediatric radiologists viewed this evaluation
as a straightforward comparison task and did not ex-
perience particularly high uncertainty, or as Dr. O de-
scribed, “I don’t think it’s a very sophisticated thing.”
After first quickly noting the patient’s age and gender,
they reviewed the sole source of evidence for this
evaluation: a single digital x-ray representing the pa-
tient’s hand, fingers, and wrist (see Figure A.5). They
studied the size, shape, and appearance of specific
bones visible on the x-ray and drew on their knowl-
edge of how certain parts of the hand develop differ-
ently over time: “I use the phalanges [fingers] as the
gold standard, but there’s also carpal bones [wrist]
and the radius ulna [forearm]. But they’re more vari-
able, so I don’t look at that as much” (Dr. R). Dr. D ex-
plained how she considers and weighs multiple bone
areas to build a more certain judgment: “I give more
credence to distal bones [closer to fingertips], al-
though endocrinologists like the proximal [lower fin-
gers or wrist], which is probably more representative
of overall height growth … If there’s variation, or if
there’s discordance between different bones, I mental-
ly give more weight to some than others.”
Then, they compared the patient’s bone develop-
ment with the curated set of x-ray images in the text-
book of standards used across pediatric radiology. A
single x-ray image was used to depict a child’s ex-
pected bone development at each one-year increment.
Radiologists compared the appearance of the patient’s
hand x-ray with the standard images in the book,
searching for the closest match: “I’m looking at the
different shapes and seeing these are bigger than sev-
en years” (Dr. D). In the following assessment, Dr. N
went back and forth between the 18- and 19-year
standards,⁵ noting slight differences in the bone devel-
opment: “You see here, the bones are all fissured
[pointing to patient’s x-ray]. And here [in the 18-year
standard image], there’s still a tiny physis.” A faint
white line (the “tiny physis”) ran horizontally be-
tween the knuckle and fingertip in the standard 18-
year-old image, but no horizontal line appeared on
the patient’s x-ray (it was “all fissured” or no gap be-
tween the bones). Dr. N interpreted this to mean the
patient’s bone age was greater than 18 and thus, re-
ported his judgment of 19 years.
Lastly, radiologists performed a calculation of the
“normal”range of bone ages using a data table of stan-
dard deviations for each consecutive age printed in the
textbook and reported whether their judgment of the
patient’s bone age fell within or outside that range.
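To make this final step concrete, the following is a minimal sketch, assuming for illustration a two-standard-deviation criterion and placeholder table values rather than the textbook’s actual standards, of how such a normal-range check can be computed.

```python
# Illustrative placeholder values only, not the published pediatric standards:
# chronological age (years) -> (mean bone age, standard deviation), in years.
STANDARDS = {
    7: (7.0, 0.8),
    8: (8.0, 0.9),
    9: (9.0, 1.0),
}

def within_normal_range(chronological_age: int, assessed_bone_age: float,
                        num_sd: float = 2.0) -> bool:
    """Return True if the assessed bone age falls within +/- num_sd standard
    deviations of the mean bone age for the patient's chronological age."""
    mean, sd = STANDARDS[chronological_age]
    return (mean - num_sd * sd) <= assessed_bone_age <= (mean + num_sd * sd)

# Example: with these placeholder values, an 8-year-old assessed at a bone age
# of 10 years falls outside the 6.2-9.8 year range and is reported as abnormal.
print(within_normal_range(8, 10.0))  # False
```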
Experiencing Opacity of AI-in-Use (and Increasing
Uncertainty). After forming their initial judgment,
the radiologist then viewed the result of the AI tool. In
2018, the Urbanside pediatric department imple-
mented a cutting-edge tool, which we refer to as the
“x-ray AI tool,” to aid in bone age diagnosis. Citing
the fairly straightforward comparison or “pattern rec-
ognition” nature of this task, Urbanside pediatric radi-
ologists expressed high enthusiasm for using the
x-ray AI tool, as Dr. N explained: “I think [the AI tool]
can be very useful … You have to look very finely
and carefully at a bunch of different images. It’s visu-
ally overwhelming. But I think it’s something a
computer is really good at … It’s just pattern recog-
nition.” The tool was developed by a reputable re-
search institution and used deep learning methods at
the forefront of diagnostic AI development at the
time, involving multiple stages of convolutional
neural networks performing image processing, seg-
mentation, and classification tasks. Published studies
reported that the tool’s results matched the “normal”
versus “abnormal” judgments of pediatric radiologists
in over 95% of test cases. Urbanside radiologists ea-
gerly agreed to participate in a multiinstitution effort
to further study the tool in settings of clinical use.
After implementation, every bone age evaluation
was automatically processed and analyzed by the
x-ray AI tool before entering the radiology work
queue. Upon opening a bone age case, the digital
x-ray displayed on the center monitor, and the diag-
nostic report software loaded on the side monitor. The
x-ray AI tool automatically populated the diagnostic
report with the AI result, a specific bone age measure-
ment, and its corresponding “normal” or “abnormal”
evaluation. Like in the previous two cases, radiolog-
ists first formed their initial opinion, and then, they
viewed the AI result and decided how to use it.
Upon viewing the AI results, radiologists suddenly
experienced a new surge of uncertainty, rooted in
their inability to understand or explain the AI result.
In about a third of the cases, the AI tool’s bone age
roughly converged with their initial judgment. How-
ever, in the majority of cases, the bone age opinions
diverged, and radiologists faced uncertainty in how to
respond: “It [x-ray AI] would give me bone ages that
would make me rethink what I said … I find that I’m
often disagreeing with the model. Maybe it’s just me
and I don’t know how to read bone ages” (Dr. D). Ra-
diologists were troubled by the discrepancies, which
led them to question their own judgments as well as
the AI tool’s, as in Dr. R remarking, “Sometimes I felt
that the algorithm was a little inaccurate, either too
old or too young … I couldn’t put my finger on what
it was that was off. Or maybe I was off, maybe the
algorithm was more accurate, and I wasn’t looking at
it right.”
Lacking the ability to understand or examine the
tool’s result left radiologists frustrated: “I have no
idea, I really don’t. I would be curious to know. I
don’t really know how it’s working” (Dr. M). They
were often questioning what the AI tool was consider-
ing and guessing at the image features the AI tool
may be weighing: “I’d be curious to find out what
parts of the image the algorithm actually uses … I felt
it was probably looking at—I wasn’t sure—but I felt
like it was probably looking at more of the hand than
I was … I don’t know how much weight the AI gives
to the different bones” (Dr. R). One afternoon, a spirit-
ed discussion broke out as Dr. D attempted to reason
about the tool’s underlying logic: “Is there a way to
tell what the algorithm used on an individual case to
come to its determination? … If it’s looking at the
wrist bones, we would maybe disagree with it.” Al-
though Dr. A agreed, she questioned Dr. D’s assump-
tion about how the tool was forming its judgment:
“Yeah, but is it looking at the wrist bones?” Experienc-
ing opacity of AI-in-use, Dr. D shrugged: “I don’t
know. I don’t know how it works!” Dr. A sighed in
frustration, agreeing that “[i]t’s a mystery.”
In particular, they were baffled at how the AI tool
was producing bone age measurements at a level of
precision far greater than they were capable of pro-
ducing. Pediatric radiologists report bone age results
using the one-year increment standards, but the AI
tool reported more granular results using combina-
tions of years and months (e.g., “6 years 4 months”),
which Dr. R explained, saying “[i]t [x-ray AI tool]
doesn’t always give you an exact number [of years]. It
gives you a kind of interpolation between standards.
We don’t typically do that.”They struggled to under-
stand or interpret how the AI tool was able to discern
these precise results that did not correspond with
their accepted language or approach: “How is it com-
ing up with this granular of a bone age? How does
this make sense? How does it know?” (Dr. A).
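To illustrate the granularity mismatch the radiologists describe, the sketch below contrasts the one-year-standard convention with a years-and-months rendering of a continuous estimate. It is a hypothetical post-processing step of our own, not the tool’s documented logic, and the numeric estimate is invented.

```python
# Hypothetical illustration of the two reporting conventions; the continuous
# estimate below is an invented example, not an actual tool output.
def to_nearest_standard(bone_age_years: float) -> int:
    """Radiologists' convention: report the nearest one-year standard."""
    return round(bone_age_years)

def to_years_and_months(bone_age_years: float) -> str:
    """The finer-grained style the tool produced, e.g., '6 years 4 months'."""
    years = int(bone_age_years)
    months = round((bone_age_years - years) * 12)
    if months == 12:  # guard against rounding up to a full extra year
        years, months = years + 1, 0
    return f"{years} years {months} months"

estimate = 6.33                       # invented continuous estimate
print(to_nearest_standard(estimate))  # 6
print(to_years_and_months(estimate))  # 6 years 4 months
```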
Dealing with AI Opacity: Not Enacting AI Interrogation
Practices and Not Incorporating AI Results. As in the
previous cases, pediatric radiologists encountered a
sudden surge of uncertainty as they experienced
opacity of AI-in-use. They struggled in the process
of relating the AI tools’ results to their own expert
knowledge, as Dr. O remarked: “I don’t really know
how to gauge the results from that software; I’m not
sure how it’s working.” Ultimately, in the cases we
observed, pediatric radiologists did not enact AI in-
terrogation practices and thus, rarely incorporated
AI results into their final judgments.
These radiologists faced a sudden increase in uncer-
tainty when viewing the AI results, despite the (previ-
ously) straightforward nature of the task. They were
unable to integrate the tool’s unfamiliar way of com-
municating bone age opinions with their own knowl-
edge about pediatric bone development: “It [x-ray AI
tool] gives me things like ‘11 years 8 months.’ How
does it get that? … If someone was going to ask me,
‘How do you know it was 11 years 8 months?’ I’d be
like, ‘I don’t really know’” (Dr. D). Moreover, they did
not enact a rich range of analytical practices to help
them interrogate the AI result and relate it to their
own opinions. When they viewed a divergent AI bone
age opinion, they resorted to rereviewing the same
images from the x-ray and textbook and rarely trans-
formed their initial opinion as a result. This is illustrat-
ed in the following case. Dr. D’s eyes flicked back and
forth between the standard images and the patient’s
x-ray as she formed her initial assessment: “I’m look-
ing at how wide is this area here [the areas separating
the bones of the fingers]. Looking at the different
shapes. This is bigger. This is the same. I think he’s be-
tween 8 and 9. The machine says between 9 and 10.
Closer to 10 actually!” Reacting to the divergent AI
opinion, Dr. D cocked her head to the side and exhaled
in frustration: “Now I’m going to try to find why it
said that.”She continued reviewing the same image
on her screen and the textbook again, which yielded
no new insights that would change her original view:
“I feel he’s not that close to that [10 years]. I think the
machine’s overestimating. To me, it’s 8 or 9.”
Discussion
Summary of Findings
This study brings to light a process of how profession-
al knowledge workers experienced and dealt with
opacity of AI-in-use when forming critical judgments.
In all three departments we studied, professionals’
key practice is producing knowledge claims with the
highest level of certainty possible. Professionals in
two departments faced intense uncertainty (during
lung cancer and breast cancer diagnoses) and worked
hard to reduce it using varied analytical practices. In
the third department (when evaluating bone age),
they experienced lower uncertainty and drew on few-
er analytical practices. In all three departments, pro-
fessionals first formed initial knowledge claims and
then considered the AI knowledge claim, which fre-
quently conflicted with their initial claim. In all three
departments, professionals experienced opacity of AI-
in-use because they had no insight into the underlying
reasoning of a given AI result, which in turn, height-
ened their experience of uncertainty.
Interestingly, the three departments had divergent
patterns of the degree to which they transformed their
own knowledge as a result of considering AI tool re-
sults. Only one department consistently integrated the
AI results, despite the opacity of AI-in-use (when diag-
nosing lung cancer), whereas professionals in two other
departments did not integrate the AI results (when di-
agnosing breast cancer and bone age). Upon closer
analysis, we found that it was critical that professionals
were enacting “AI interrogation practices” or practices
humans enact to relate their own knowledge claims to
AI’s knowledge claims. Enacting AI interrogation prac-
tices enabled professionals to reconcile the two knowl-
edge claims (by overruling the AI claim, reflectively
agreeing with it, or synthesizing the claims synergisti-
cally) and reduce their overall uncertainty. Professio-
nals who did not enact such practices struggled to
incorporate AI results because of the opacity and con-
sequently, formed their final judgments by either blind-
ly accepting or ignoring AI claims. We differentiate
these paths of human-AI use as engaged augmentation
and unengaged “augmentation.” Figure 1 summarizes
this process and the two paths.

Figure 1. Experts Using AI Tools for Critical Judgments
Theoretical and Practical Implications
Drawing on our conceptualization, we now outline
the implications of our study for two key areas of fo-
cus for organizational scholars of AI: AI opacity and
human-AI augmentation.
Opacity of AI-in-Use and the Importance of AI Interro-
gation Practices. Opacity associated with AI tools has
become a fiercely debated topic in academic and socie-
tal conversations (Pasquale 2015, von Krogh 2018,
Christin 2020, Diakopoulos 2020). Our study brings is-
sues of opacity to the center stage in studying how
professionals use AI tools to form critical judgments.
Most of the existing literature on opacity conceptual-
izes opacity as a property of AI tools, especially of tools
that use deep learning methods (Domingos 2015,
Burrell 2016, Pearl and Mackenzie 2018, Kellogg et al.
2019). Our study shifts the analytical focus from what
appears as an innate and fixed property of technology
to the broader sociomaterial practice that produces
opacity as a specific technology is used in a particular
context. This enables us to focus on the process of
how AI opacity emerges in practice and how, in some
cases, professionals can deal with it.
A growing community focusing on issues of AI
opacity proposes two approaches for dealing with it.
The first focuses on limiting the use of AI tools for
critical decisions if transparency is unattainable (e.g.,
Gillespie 2014, Domingos 2015, Burrell 2016, Teodor-
escu et al. 2021). The second approach is designing
“explainable” or “interpretable” AI tools that provide
greater transparency toward explaining AI outputs.
Although this work is critical (as we discuss), our
work uncovers a third approach. We illuminate a path
where professionals deal with opacity of AI-in-use by
enacting AI interrogation practices. These practices
provided professionals a way of validating AI results,
despite experiencing opacity, and resulted in an en-
gaged mode of human-AI augmentation.
Although many researchers are focused on devel-
oping “explainable AI” or “interpretable AI” (e.g.,
Guidotti et al. 2018, Hooker et al. 2019,Rudin2019,
Samek et al. 2019, Barredo Arrieta et al. 2020,
Fernández-Loría et al. 2020, Bauer et al. 2021, Teodor-
escu et al. 2021), some leading scholars (Simonite
2018, Cukier et al. 2021) and AI designers believe
there is no need for explanations. They argue that an
AI tool’s evidence-based performance results should
motivate experts to rely on the tool’s results with
confidence. This assumption was expressed by a
leader of AI research at Urbanside: “People talk
about explainability in AI a lot. My personal opinion
is I don’t think you need to do any explaining. As
long as you show users that the tool performs well.
When it performs well, I think people are really okay
working under that uncertainty.” Our study shows
how that point of view is disconnected from the real-
ity of how professionals are wrestling with opacity of
using AI in practice. Based on our study’s findings,
explainable or interpretable AI may enable but not
guarantee that professionals are able to integrate AI
knowledge (i.e., engaged augmentation). Our study
showed that despite AI tools’ high performance
documented in published literature, some professio-
nals chose to invest their valuable time into AI inter-
rogation practices rather than simply relying on
unexplained AI claims at face value.
If new generation AI tools provide explanations or
become more interpretable, this should impact ex-
perts’ ability to engage with AI but not necessarily
their motivation or willingness to do so. Such willing-
ness is influenced by many factors, such as profes-
sional norms, organizational and financial incen-
tives, and societal expectations. Making sense of AI
explanations requires an investment of time and re-
sources. Not only is this challenging given intense
organizational constraints (e.g., time, knowledge),
but such investment does not align with widely held
expectations that AI will make work faster and more
efficient. In medical practice, professionals solicit
opinions from their colleagues, investing in collabo-
ration only when they experience particularly high
doubt or uncertainty (on a regular but infrequent ba-
sis). In contrast, in our study (as in many leading
U.S. hospitals), the AI tool provides opinions on ev-
ery case, regardless of the professional’s degree of
uncertainty. Thus, professionals were spending ad-
ditional time coping with the heightened uncertain-
ty, even for simple and routine cases (where prom-
ises of AI efficiency are strongest). We hope future
research will further unpack the relationship be-
tween AI and time as the push for accelerating the
pace of work is increasing (Lifshitz-Assaf et al.
2021), yet implications on the nature and quality of
work are underexplored.
Our study also contributes to the debates and con-
versations on opacity by uncovering an important
relationship between opacity of AI-in-use and epi-
stemic uncertainty (Griffin and Grote 2020, Packard
and Clark 2020, Rindova and Courtney 2020). In
many knowledge fields, experts are keenly focused
on producing high-quality judgments, and they are
willing to invest resources to obtain additional evi-
dence and reduce their epistemic uncertainty—or
ignorance of unknown but knowable information
(Packard and Clark 2020). Contrary to prior litera-
ture, when professionals in our study obtained
additional “evidence” from AI tools, which often di-
verged from their prior judgment, their epistemic
uncertainty increased because of their experience of
opacity. In our study, professionals would regularly
integrate conflicting knowledge provided by their
colleagues by probing one another and building on
their common ground and participation in a shared
field (Carlile 2004, Maguire et al. 2004, Levina 2005).
However, when professionals’ opinions diverged
from AI tools’ opinions, no common ground or
shared field exists or can be created (as tools are de-
signed today). Enacting AI interrogation practices
was the only way some professionals were able to
overcome the opacity of AI-in-use and reduce the
uncertainty needed to integrate the AI knowledge
into their own.
Future research is warranted to explain why
some professionals enact AI interrogation practices,
whereas others do not. Our study suggests three
main potential factors: the AI tool’s ability to re-
duce professionals’ uncertainty, the presence of
time pressure (and other resource constraints) on
professionals’ work, and the richness of professio-
nals’ complementary technologies-in-use. Motiva-
tion to invest in AI interrogation practices may be
lower if professionals view the AI expertise as simi-
lar to (or worse than) their own. In such cases, there
is only increased pressure of investing additional
time without the benefit of reducing uncertainty (as
in the breast and pediatric departments in our
study). Moreover, the time required to enact AI
interrogation practices may further deter professio-
nals from investing in them (as in the breast depart-
ment, where time pressure was extremely high). It
is also possible that professionals may still develop
AI interrogation practices as they continue using
the AI tool over a longer period of time (as in the
pediatric department); on the other hand, it may
be difficult to develop such practices when the
complementary technologies are limited and lack
richness (e.g., when analyzing x-ray images). Fu-
ture research should investigate other motivators
or deterrents that were not apparent in our context,
such as the impact of regulation or perceived legal
and institutional risks. It could be, for example, that
regulatory or authority bodies that require profes-
sionals to articulate why they overrule an AI result
may motivate investment in AI interrogation
practices.
Importantly, we do not wish to suggest that AI
interrogation practices are a substitute for explain-
ability or interpretability. On the contrary, we urge
continued dedicated attention and resources toward
designing AI tools that enable professionals to more
readily integrate AI knowledge claims in practice.
For instance, when investigating the x-ray AI tool
for determining bone age, we as academic research-
ers learned from reading archival published materi-
als that it is possible to produce salience maps
showing what areas on the x-ray were most relevant
for producing a given AI result. Although this tech-
nology is available for the algorithm underlying the
x-ray tool, it was not implemented at Urbanside. By
highlighting the critical importance of AI interroga-
tion practices, we hope that managers, practitioners,
and researchers will focus on designing and adopt-
ing more transparent and interpretable AI tools.
This should help professionals to more easily devel-
op AI interrogation practices that build on these ex-
planations. In this study, all AI tools had a similar
degree of opacity of AI-in-use. Future studies can ex-
plore a variety of interrogation practices that may
emerge in response to different degrees of opacity.
Moreover, AI researchers can proactively design
new features that ease engagement.
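As one concrete example of the kind of explanation feature discussed above, a gradient-based salience map highlights which input pixels most influenced a model’s output. The sketch below, assuming PyTorch and an invented toy stand-in model rather than the actual x-ray AI tool, illustrates the basic idea.

```python
# Minimal sketch of a gradient-based salience map, assuming PyTorch.
# The tiny model below is an invented stand-in, not the x-ray AI tool.
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Conv2d(1, 8, kernel_size=3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, 1),
)
model.eval()

xray = torch.rand(1, 1, 224, 224, requires_grad=True)  # placeholder image
predicted_bone_age = model(xray)       # a single regression output
predicted_bone_age.sum().backward()    # gradients of the output w.r.t. pixels

# Pixels with large gradient magnitude influenced the prediction most;
# overlaying this map on the x-ray indicates where the model "looked."
salience = xray.grad.abs().squeeze()   # shape: (224, 224)
print(salience.shape)
```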
We also wish to highlight our focus on professio-
nals’ judgments for critical decisions, those with par-
ticularly high consequences or costs of errors (medical
diagnoses in our study). Our findings are relevant to
contexts where experts make critical decisions that re-
quire knowledge integration and transformation, such
as judges rendering verdicts and sentencing, human
resource managers evaluating employees, or military
experts carrying out targeted attacks. Our study
speaks to such contexts where engaged augmentation
is necessary versus those where experts may defer to
AI results even when opaque. We do not suggest that
our findings apply for decisions that do not require
knowledge transformation or when the cost of errors
is substantially low, such as using AI for supply chain
logistics, marketing and advertising, grammatical ed-
iting, or call center prioritization.
Future research is needed to explore potential dif-
ferences in how professionals in other contexts
experience opacity and enact AI interrogation prac-
tices. This is a study of a specific profession (physi-
cians) within a highly resourced U.S. organization
(a teaching hospital), and we believe other experts
in different legal and professional environments are
important to investigate. The organization we stud-
ied has world-leading experts with high standards
of quality and strong professional accountability. In
the past few years, there has been a gold rush to
purchase AI tools, especially in hospitals with
fewer resources and lower standards of care (Moran
2018, Gkeredakis et al. 2021, Roberts et al. 2021).
Based on this study, we suggest that such a gold
rush may give rise to unengaged “augmentation,”
which is highly risky from a learning and knowl-
edge perspective for experts, AI companies, and
consumers. In addition, our study is based in the
U.S. legal system where hospitals must adhere to
strict regulation and oversight, which is not the case
in many countries currently adopting AI tools. Fu-
ture research is warranted on the role of regulation
on the adoption and engagement of AI tools for
critical decisions. When such regulation is missing
and fewer checks and validations are in place, en-
gaged augmentation may be even less likely and
yet, even more important.
Challenging the Taken for Granted Concept of Aug-
mentation. Professional work is currently being dis-
rupted by AI technologies, as modern AI increasing-
ly pertains to processes of producing and evaluating
knowledge claims (Anthony 2018, Faraj et al. 2018,
von Krogh 2018, Pachidi et al. 2021). Debates are
emerging around the degree of automation or aug-
mentation that may result as AI tools are adopted
into professional work settings (e.g., Autor 2015,
Seamans and Furman 2019, Zhang et al. 2021). Our
study speaks to this important debate by problemat-
izing the taken for granted concept of augmentation
and its implication for the future of work and human
expertise.
Within the current literature, augmentation gener-
ally refers to human in the loop scenarios where ex-
perts and AI tools “collaborate” so as to “multiply”
and “combine their complementary strengths” (Raisch
and Krakowski 2021, p. 193). The results of our
study challenge the taken for granted equivalency of
augmentation with collaboration. Instead, we sug-
gest differentiating engaged augmentation from unen-
gaged “augmentation.” In engaged augmentation, ex-
perts integrated AI knowledge claims with their
own, which requires both building an understand-
ing of the AI claim and the ability and willingness to
transform one’s own knowledge based on the AI
claim (this took place in lung cancer diagnosis in our
study). By enacting AI interrogation practices, pro-
fessionals were able to understand the AI result, de-
spite the opacity they experienced, and demonstrated
their willingness to change their initial judgment
through reflectively agreeing with the AI claim,
overruling it, or synthesizing the two claims. From a
learning and knowledge perspective, engaged aug-
mentation scenarios could be productive and benefi-
cial, including cases of reflectively overruling the AI
results. Future research is needed to investigate the
learning that professionals (and AI tools) experience
when involved in engaged augmentation over ex-
tended periods of time. For example, if engaged
professionals routinely change their judgment by en-
dorsing AI results, they may be reproducing AI’s
shortcomings over time (e.g., AI errors or biases in
judgment).
In contrast, unengaged “augmentation” involved
professionals not relating the AI knowledge claims to
their own claims (this took place during breast cancer
and bone age diagnosis in our study). These professio-
nals appeared to be using the AI tool as they were
going through the act of opening the AI results.
However, they were not integrating knowledge claims
and mostly blindly accepting or blindly ignoring the
AI results. This path does not offer strong opportuni-
ties or benefits from a learning and knowledge per-
spective. We argue that human in the loop scenarios of
unengaged “augmentation” are essentially cases of au-
tomation; without the ability to relate AI knowledge
claims to experts’ own (through AI interrogation practices
or explainable or interpretable AI tools that enable in-
terrogation), what looks like augmentation on paper is
much closer to automation.
Moreover, there is an assumption that augmen-
tation will help organizations achieve positive
outcomes, usually through improving humans’
knowledge insights, efficiency, or both (Davenport
and Kirby 2016, Brynjolfsson and Mitchell 2017,
Daugherty and Wilson 2018, Raisch and Krakowski
2021). Researchers have already raised issues of AI
accuracy claims and “superior to human” knowl-
edge performance (Lebovitz et al. 2021). Using AI
tools may have reduced some human error but could
have introduced other errors into the human’s judg-
ment (e.g., because of biases or poor training data).
To truly understand the impact on accuracy, we
must be able to compare AI outputs and experts’
judgments. However, in many professional contexts,
such evaluations are limited because the knowledge
is highly uncertain, and many “ground truth” meas-
ures are based on knowledge claims that lack strong
external validation (Lebovitz et al. 2021).
Our study adds to these concerns by calling into
question the assumption of increased efficiency.
In all three cases we studied, experts using AI tools
spent additional time even on “simple” cases as they
experienced opacity and additional uncertainty. With-
in engaged augmentation, experts invested additional
time to reconcile the AI knowledge claims by enacting
AI interrogation practices. This additional time may
be justified by improvements to care quality, but it is
unclear whether healthcare systems are willing to
commit that additional time. AI vendors tend to pro-
mote their tools using promises of efficiency. If man-
agers implement these tools based on such claims,
they may pressure experts to reduce time spent per
judgment, which is likely to encourage unengaged
“augmentation” and potentially lead to a decline in
quality (e.g., patient health outcomes).
Another perspective on our study that warrants
future research is the impact of AI on an overall pro-
fessional field and its knowledge work over time.
Leading medical professionals have been claiming that
AI tools will eliminate the need for professional radiol-
ogists, explicitly citing the rise of diagnostic AI tools
as a case for automation. On the other hand, leading
radiologists are arguing that AI can enhance their pro-
fessional roles and abilities. In our study, we did not
find significant differences in attitudes toward AI
across departments, which all took positive ap-
proaches toward adopting new technologies (includ-
ing AI). Future research may explore the impact of
AI on the broader professional field of radiology
and other professions experiencing massive disruption
because of AI tools (e.g., human resource manage-
ment, criminal justice). It could be that engaged aug-
mentation and unengaged “augmentation” are reac-
tive responses of professionals dealing with the
potential disruption posed by AI and automation. It
will be important to investigate how professionals re-
spond to a new technological force that is challenging
the professional jurisdiction and knowledge bound-
aries of an existing profession: for instance, how pro-
fessionals enact professional identity work (Tripsas
2009, Lifshitz-Assaf 2018) or knowledge boundary
work (Bechky 2003a, Levina and Vaast 2005, Barrett
et al. 2012) or the strategies and responses that impact
the professional field (Nelson and Irwin 2014, Howard-
Grenville et al. 2017, Bechky 2020).
To conclude, we do not wish to convey that dealing
with the opacity related to AI tools, even by using AI
interrogation practices, should be viewed as the de-
sired or optimal path forward. From knowledge and
learning perspectives, the opacity of AI tools can be
seen as inhibiting knowledge workers’ full feedback
and reflective cycles (Schön 1983, Gherardi 2000).
When professionals cannot analyze the reasoning be-
hind AI decisions, they miss out on the learning pro-
cess (Beer 2017), lacking opportunities to reflect on,
deepen, or update their expertise (Beane 2019). Ulti-
mately, AI technologies are designed to create new
sorts of expertise to enable professionals, organiza-
tions, and even society to better address hard prob-
lems such as medical diagnosis. However, the opacity
experienced when professionals are using AI tools is
an increasingly critical problem in its own right. We
urge further researchers and policy makers to tackle
this problem, across domains and disciplines, to en-
sure that the path of new technological development
meets the needs of humanity and society.
Acknowledgments
The authors thank the special issue editors and the anony-
mous reviewers for their invaluable insights throughout
the review process. This research benefited from the help-
ful feedback provided by Beth Bechky and Foster Provost
as well as constructive comments from researchers at the
New York University (NYU) Qualitative Research Semi-
nar, the NYU Future of Work Seminar, the Stanford
Changing Nature of Work Workshop, and International
Conference of Information Systems 2020 AI in Practice
Professional Development Workshop and in the Work in
the Age of Intelligent Machines community. Finally, the
authors thank the individuals at “Urbanside”who gra-
ciously allowed them to study their daily work.
Appendix
Figure A.1. Single Image from a CT Scan Showing Various Lung Structures
Figure A.2. CT AI Tool Outputs
Figure A.3. Typical Display of the Digital Mammogram Images Used for Breast Cancer Diagnosis
Figure A.4. Mammo AI Tool Outputs
Figure A.5. Typical Display of the Digital Hand X-Ray Used for Bone Age Diagnosis
Endnotes
1. See the Radiological Society of North America’s journal Radiology:
Artificial Intelligence (https://pubs.rsna.org/journal/ai) overview of
the state of the field when it comes to AI use and other resources cu-
rated by the American College of Radiology (https://www.acrdsi.
org/).
2. This research prompted the U.S. government in 2003 to mandate
that insurance providers must reimburse the use of AI tools for
breast cancer screening, leading to wide purchasing of such tools
across U.S. breast imaging centers.
3. Three shapes were used to indicate the type of classification the
tool generated: star indicated “mass,” triangle indicated
“calcification,” and plus sign indicated the co-occurrence of mass
and calcification.
4. Calcifications are tiny flecks of calcium that can sometimes indi-
cate early signs of cancer. They are usually unable to be felt by a
physical examination. Large calcifications are not usually associated
with cancer. Clusters of small calcifications indicate extra breast cell
activity, which can be related to early cancer development but may
also be related to normal breast cell activity.
5. A physis is a growth plate located between bones. Over time, the
physis becomes thinner until eventually disappearing as one nears
full growth.
References
Albu OB, Flyverbom M (2019) Organizational transparency: Con-
ceptualizations, conditions, and consequences. Bus. Soc. 58(2):
268–297.
Ananny M, Crawford K (2016) Seeing without knowing: Limitations
of the transparency ideal and its application to algorithmic ac-
countability. New Media Soc. 20(3):973–989.
Anthony C (2018) To question or accept? How status differences in-
fluence responses to new epistemic technologies in knowledge
work. Acad. Management Rev. 43(4):661–679.
Anthony C (2021) When knowledge work and analytical technolo-
gies collide: The practices and consequences of black boxing
algorithmic technologies. Admin. Sci. Quart. ePub ahead of print
June 4, https://doi.org/10.1177/00018392211016755.
Autor DH (2015) Why are there still so many jobs? The history and
future of workplace automation. J. Econom. Perspect. 29(3):3–30.
Bailey D, Leonardi P, Barley S (2012) The lure of the virtual. Organ.
Sci. 23(5):1485–1504.
Barad K (2003) Posthumanist performativity: Toward an under-
standing of how matter comes to matter. Signs 28(3):801–831.
Barley S (1986) Technology as an occasion for structuring: Technical-
ly induced change in the temporal organization of radiological
work. Admin. Sci. Quart. 3(1):78–108.
Barley S (1990) The alignment of technology and structure through
roles and networks. Admin. Sci. Quart. 35(1):61–103.
Barley SR, Bechky BA, Milliken FJ (2017) The changing nature of
work: Careers, identities, and work lives in the 21st century.
Acad. Management Discoveries 3(2):111–115.
Barocas S, Selbst AD, Raghavan M (2020) The hidden assumptions
behind counterfactual explanations and principal reasons. Proc.
2020 Conf. Fairness, Accountability, Transparency (Association for
Computing Machinery, New York), 80–89.
Barredo Arrieta A, Díaz-Rodríguez N, Del Ser J, Bennetot A, Tabik
S, Barbado A, Garcia S, et al. (2020) Explainable artificial intelli-
gence (XAI): Concepts, taxonomies, opportunities and chal-
lenges toward responsible AI. Inform. Fusion 58(2020):82–115.
Barrett M, Oborn E, Orlikowski W (2016) Creating value in online
communities: The sociomaterial configuring of strategy, plat-
form, and stakeholder engagement. Inform. Systems Res. 27(4):
704–723.
Barrett M, Oborn E, Orlikowski WJ, Yates J (2012) Reconfiguring
boundary relations: Robotic innovations in pharmacy work. Or-
gan. Sci. 23(5):1448–1466.
Bauer K, Hinz O, van der Aalst W, Weinhardt C (2021) Expl(AI)n it
to me –explainable AI and information systems research. Bus.
Inform. Systems Engrg. 63(2):79–82.
Beane M (2019) Shadow learning: Building robotic surgical skill
when approved means fail. Admin. Sci. Quart. 64(1):87–123.
Beane M, Orlikowski WJ (2015) What difference does a robot make?
The material enactment of distributed coordination. Organ. Sci.
26(6):1553–1573.
Bechky B (2003a) Object lessons: Workplace artifacts as represen-
tations of occupational jurisdiction. Amer. J. Sociol. 109(3):
720–752.
Bechky B (2003b) Sharing meaning across occupational communi-
ties: The transformation of understanding on a production
floor. Organ. Sci. 14(3):312–330.
Bechky BA (2020) Evaluative spillovers from technological change:
The effects of “DNA Envy”on occupational practices in foren-
sic science. Admin. Sci. Quart. 65(3):606–643.
Beer D (2017) The social power of algorithms. Inform. Comm. Soc.
20(1):1–13.
Benbya H, Pachidi S, Jarvenpaa S (2021) Special issue editorial: Arti-
ficial intelligence in organizations: Implications for information
systems research. J. Assoc. Inform. Systems,https://aisel.aisnet.
org/jais/vol22/iss2/10.
Bird S, Dudík M, Edgar R, Horn B, Lutz R, Milan V, Sameki M,
Wallach H, Walker K (2020) Fairlearn: A toolkit for assess-
ing and improving fairness in AI. Report, Microsoft, Red-
mond, WA.
Boyaci T, Canyakmaz C, deVericourt F (2020) Human and Machine:
The Impact of Machine Input on Decision-Making Under Cognitive
Limitations (Social Science Research Network, Rochester, NY).
Brynjolfsson E, McAfee A (2014) The Second Machine Age: Work, Pro-
gress, and Prosperity in a Time of Brilliant Technologies (W. W.
Norton & Company, New York).
Brynjolfsson E, Mitchell T (2017) What can machine learning do?
Workforce implications. Science 358(6370):1530–1534.
Burrell J (2016) How the machine ‘thinks’: Understanding opacity in
machine learning algorithms. Big Data Soc.,https://doi.org/10.
1177/2053951715622512.
Caplan R, Donovan J, Hanson L, Matthews J (2018) Algorithmic
accountability: A primer (Data & Society). Accessed
January 10, 2020, https://datasociety.net/library/algorithmic-
accountability-a-primer/.
Carlile PR (2004) Transferring, translating, and transforming: An in-
tegrative framework for managing knowledge across bound-
aries. Organ. Sci. 15(5):555–568.
Charmaz K (2014) Constructing Grounded Theory (Sage, Thousand
Oaks, CA).
Christin A (2020) The ethnographer and the algorithm: Beyond the
black box. Theory Soc. 49(5):897–918.
Crawford K, Dobbe R, Dyer T, Fried G, Green B, Kaziunas E, Kak
A, et al. (2019) AI Now 2019 Report (AI Now Institute, New
York).
Cremer DD, Kasparov G (2021) AI should augment human intelligence,
not replace it. Harvard Bus. Rev. (March 18), https://hbr.org/2021/
03/ai-should-augment-human-intelligence-not-replace-it.
Cukier K, Mayer-Schonberger V, De Vericourt F (2021) Framers: Hu-
man Advantage in an Age of Technology and Turmoil (Dutton,
New York).
Daugherty PR, Wilson HJ (2018) Human +Machine: Reimagining Work
in the Age of AI (Harvard Business Press, Cambridge, MA).
Davenport TH, Kirby J (2016) Only Humans Need Apply: Winners
and Losers in the Age of Smart Machines (HarperBusiness,
New York).
Diakopoulos N (2020) Transparency. Dubber M, Pasquale F, Das S,
eds. The Oxford Handbook of Ethics in AI (Oxford University
Press, Oxford, United Kingdom), 197–214.
Dodgson M, Gann DM, Salter A (2007) “In case of fire, please use
the elevator”: Simulation technology and organization in fire
engineering. Organ. Sci. 18(5):849–864.
Domingos P (2015) The Master Algorithm: How the Quest for the Ulti-
mate Learning Machine Will Remake Our World, 1st ed. (Basic
Books, New York).
Dourish P (2016) Algorithms and their others: Algorithmic culture
in context. Big Data Soc.,https://doi.org/10.1177/20539517166
65128.
Erickson I, Robert L, Crowston K, Nickerson J (2018) Workshop:
Work in the Age of Intelligent Machines. GROUP ’18 Proc. 20th
ACM Internat. Conf. Supporting Groupwork (Sundial Island, FL),
359–361.
Faraj S, Pachidi S, Sayegh K (2018) Working and organizing in the
age of the learning algorithm. Inform. Organ. 28(1):62–70.
Fernández-Loría C, Provost F, Han X (2020) Explaining data-driven
decisions made by AI systems: The counterfactual approach. Pre-
print, submitted January 21, https://arxiv.org/abs/2001.07417v1.
Gao R, Saar-Tsechansky M, De-Arteaga M, Han L, Lee MK, Lease
M (2021) Human-AI collaboration with bandit feedback. Pre-
print, submitted May 22, https://arxiv.org/abs/2105.10614.
Gherardi S (2000) Practice-based theorizing on learning and know-
ing in organizations. Organization 7(2):211–223.
Gillespie T (2014) The relevance of algorithms. Gillespie T, Bocz-
kowski PJ, Foot KA, eds. Media Technologies: Essays on Communi-
cation, Materiality, and Society (MIT Press, Cambridge, MA),
167–194.
Gkeredakis M, Lifshitz-Assaf H, Barrett M (2021) Crisis as opportu-
nity, disruption and exposure: Exploring emergent responses to
crisis through digital technology. Inform. Organ. 31(1):100344.
Glaser B, Strauss A (1967) Discovering Grounded Theory (Aldine Pub-
lishing Company, Chicago).
Glikson E, Woolley AW (2020) Human trust in artificial intelli-
gence: Review of empirical research. Acad. Management Ann.
14(2):627–660.
Grady D (2019) A.I. took a test to detect lung cancer. It got an A.
New York Times (May 20), https://www.nytimes.com/2019/05/
20/health/cancer-artificial-intelligence-ct-scans.html.
Griffin M, Grote G (2020) When is more uncertainty better? A mod-
el of uncertainty regulation and effectiveness. Acad. Management
Rev. 45(4):745–765.
Guidotti R, Monreale A, Ruggieri S, Turini F, Giannotti F, Pedreschi
D (2018) A survey of methods for explaining black box models.
ACM Comput. Surveys 51(5):93:1–93:42.
Hansen HK, Flyverbom M (2015) The politics of transparency
and the calibration of knowledge in the digital age. Organi-
zation 22(6):872–889.
Haraway D (1988) Situated knowledges: The science question in
feminism and the privilege of partial perspective. Feminist Stud.
14(3):575–599.
Hardy C, Lawrence TB, Grant D (2005) Discourse and collaboration:
The role of conversations and collective identity. Acad. Manage-
ment Rev. 30(1):58–77.
Hooker S, Erhan D, Kindermans PJ, Kim B (2019) A benchmark for
interpretability methods in deep neural networks. Preprint,
submitted November 5, https://arxiv.org/abs/1806.10758.
Howard-Grenville J, Nelson AJ, Earle A, Haack J, Young D (2017)
“If chemists don’t do it, who is going to?”Peer-driven occupa-
tional change and the emergence of green chemistry. Admin.
Sci. Quart. 62(3):524–560.
Kaur H, Nori H, Jenkins S, Caruana R, Wallach H, Wortman
Vaughan J (2020) Interpreting interpretability: Understanding
data scientists’use of interpretability tools for machine learn-
ing. Proc. 2020 CHI Conf. Human Factors Comput. Systems, Hono-
lulu (Association for Computing Machinery, New York), 1–14.
Kellogg K, Valentine M, Christin A (2019) Algorithms at work: The
new contested terrain of control. Acad. Management Ann. 14(1):
366–410.
Khadpe P, Krishna R, Fei-Fei L, Hancock JT, Bernstein MS (2020)
Conceptual metaphors impact perceptions of human-AI collab-
oration. Proc. ACM Human Comput. Interactions, 163:1–163:26.
Kogut B, Zander U (1992) Knowledge of the firm, combinative ca-
pabilities, and the replication of technology. Organ. Sci. 3(3):
383–397.
Lebovitz S, Levina N, Lifshitz-Assaf H (2021) Is AI ground truth re-
ally “true”? The dangers of training and evaluating AI tools
based on experts’know-what. Management Inform. Systems
Quart. 45(3):1501–1525.
Leonardi P (2011) When flexible routines meet flexible technologies: Af-
fordance, constraint, and the imbrication of human and material
agencies. Management Inform. Systems Quart. 35(1):147–167.
Leonardi PM, Bailey DE (2008) Transformational technologies and
the creation of new work practices: Making implicit knowledge
explicit in task-based offshoring. Management Inform. Systems
Quart. 32(2):411–436.
Leonardi P, Barley S (2010) What’s under construction here? Social
action, materiality, and power in constructivist studies of tech-
nology and organizing. Acad. Management Ann. 4(1):1–51.
Leonardi PM, Treem JW (2020) Behavioral visibility: A new para-
digm for organization studies in the age of digitization, digitali-
zation, and datafication. Organ. Stud. 41(12):1601–1625.
Levina N (2005) Collaborating on multiparty information systems
development projects: A collective reflection-in-action view. In-
form. Systems Res. 16(2):109–130.
Levina N, Vaast E (2005) The emergence of boundary spanning
competence in practice: Implications for implementation and
use of information systems. Management Inform. Systems Quart.
29(2):335–363.
Lifshitz-Assaf H (2018) Dismantling knowledge boundaries at
NASA: The critical role of professional identity in open innova-
tion. Admin. Sci. Quart. 63(4):746–782.
Lifshitz-Assaf H, Lebovitz S, Zalmanson L (2021) Minimal and
adaptive coordination: How hackathons’projects accelerate in-
novation without killing it. Acad. Management J. 64(3):684–715.
Maguire S, Hardy C, Lawrence TB (2004) Institutional entrepre-
neurship in emerging fields: HIV/AIDS treatment advocacy
in Canada. Acad. Management J. 47(5):657–679.
Mazmanian M, Cohn M, Dourish P (2014) Dynamic reconfiguration
in planetary exploration: A sociomaterial ethnography. Manage-
ment Inform. Systems Quart. 38(3):831–848.
Mazmanian M, Orlikowski W, Yates J (2013) The autonomy para-
dox: The implications of mobile email devices for knowledge
professionals. Organ. Sci. 24(5):1337–1357.
Mol A (2003) The Body Multiple (Duke University Press, Durham,
NC).
Moran G (2018) This artificial intelligence won’t take your job, it
will help you do it better. Fast Company (October 24), https://
www.fastcompany.com/90253977/this-artificial-intelligence-
wont-take-your-job-it-will-help-you-do-it-better.
Mukherjee S (2017) A.I. Vs. M.D. New Yorker (March 27), https://
www.newyorker.com/magazine/2017/04/03/ai-versus-md.
Nelson AJ, Irwin J (2014) “Defining what we do—all over again”:
Occupational identity, technological change, and the librari-
an/internet-search relationship. Acad. Management J. 57(3):
892–928.
Nunn J (2018) How AI Is Transforming HR Departments. Forbes
(May 9), https://www.forbes.com/sites/forbestechcouncil/
2018/05/09/how-ai-is-transforming-hr-departments/.
Orlikowski W (1992) The duality of technology: Rethinking the con-
cept of technology in organizations. Organ. Sci. 3(3):398–427.
Orlikowski W (2000) Using technology and constituting structures:
A practice lens for studying technology in organizations. Organ.
Sci. 11(4):404–428.
Orlikowski W (2007) Sociomaterial practices: Exploring technology
at work. Organ. Stud. 28(9):1435–1448.
Orlikowski W, Scott S (2008) Sociomateriality: Challenging the sepa-
ration of technology, work and organization. Acad. Management
Ann. 2(1):433–474.
Pachidi S, Berends H, Faraj S, Huysman M (2021) Make way for the
algorithms: Symbolic actions and change in a regime of know-
ing. Organ. Sci. 32(1):18–41.
Packard MD, Clark BB (2020) On the mitigability of uncertainty and
the choice between predictive and nonpredictive strategy. Acad.
Management Rev. 45(4):766–786.
Pasquale F (2015) The Black Box Society: The Secret Algorithms That
Control Money and Information (Harvard University Press,
Cambridge, MA).
Pearl J, Mackenzie D (2018) The Book of Why: The New Science of
Cause and Effect, 1st ed. (Basic Books, New York)
Pinch T, Bijker W (1987) The social construction of facts and arti-
facts: Or how the sociology of science and the sociology of tech-
nology might benefit each other. Hughes TP, Bijker W, Pinch T,
eds. The Social Construction of Technological Systems: New Direc-
tions in the Sociology and History of Technology (MIT Press, Cam-
bridge, MA), 17–50.
Polanyi M (1958) Personal Knowledge: Toward a Post-Critical Philosophy
(University of Chicago Press, Chicago).
Polanyi M (1966) The Tacit Dimension (University of Chicago Press,
Chicago).
Puranam P (2021) Human–AI collaborative decision-making as an
organization design problem. J. Organ. Design 10(2021):75–80.
Rai A, Constantinides P, Sarker S (2019) Editor’s comments: Next-
generation digital platforms: Toward human–AI hybrids. Man-
agement Inform. Systems Quart. 43(1):iii–ix.
Raisch S, Krakowski S (2021) Artificial intelligence and manage-
ment: The automation–augmentation paradox. Acad. Manage-
ment Rev. 46(1):192–210.
Razorthink Inc. (2019) 4 major challenges facing fraud detection;
ways to resolve them using machine learning. Medium (April
25), https://medium.com/razorthink-ai/4-major-challenges-fac
ing-fraud-detection-ways-to-resolve-them-using-machine-
learning-cf6ed1b176dd.
Recht M, Bryan RN (2017) Artificial intelligence: Threat or boon to
radiologists? J. Amer. College Radiology 14(11):1476–1480.
Rindova V, Courtney H (2020) To shape or adapt: Knowledge prob-
lems, epistemologies, and strategic postures under Knightian
uncertainty. Acad. Management Rev. 45(4):787–807.
Roberts M, Driggs D, Thorpe M, Gilbey J, Yeung M, Ursprung S,
Aviles-Rivero AI, et al. (2021) Common pitfalls and recommen-
dations for using machine learning to detect and prognosticate
for COVID-19 using chest radiographs and CT scans. Nature
Machine Intelligence 3(3):199–217.
Rudin C (2019) Stop explaining black box machine learning models
for high stakes decisions and use interpretable models instead.
Nature Machine Intelligence 1(5):206–215.
Samek W, Montavon G, Vedaldi A, Hansen LK, M¨
uller KR, eds.
(2019) Explainable AI: Interpreting, Explaining and Visualizing
Deep Learning (Springer Nature, Cham, Switzerland).
Sch¨
on DA (1983) The Reflective Practitioner: How Professionals Think in
Action (Basic Books, New York).
Scott SV, Orlikowski WJ (2012) Reconfiguring relations of account-
ability: Materialization of social media in the travel sector.
Accounting Organ. Soc. 37(1):26–40.
Scott S, Orlikowski W (2014) Entanglements in practice: Performing
anonymity through social media. Management Inform. Systems
Quart. 38(3):873–893.
Seamans R, Furman J (2019) AI and the economy. Innovation Policy
Econom. 19(1):161–191.
Simonite T (2018) Google’s AI guru wants computers to think more
like brains. Wired Magazine (December 12), https://www.wired.
com/story/googles-ai-guru-computers-think-more-like-brains/.
Spradley (1979) The Ethnographic Interview (Holt, Rinehart and Win-
ston, New York).
Stohl C, Stohl M, Leonardi PM (2016) Managing opacity: Informa-
tion visibility and the paradox of transparency in the digital
age. Internat.J.Comm.10(2016):123–137.
Sarah Lebovitz received her PhD from New York University’s Stern School of Business and is an assistant professor at the University of Virginia’s McIntire School of Commerce. Her current research investigates how new technologies are adopted in organizations and how they impact professionals and their knowledge work practices. She studies how AI tools are evaluated and used in consequential decision making and how accelerating technologies are transforming innovation processes.
Hila Lifshitz-Assaf is an associate professor at New York University’s Stern School of Business and a faculty associate at Harvard University’s Laboratory for Innovation Science. Her research focuses on the microfoundations of scientific and technological innovation and knowledge creation processes in the digital age. She earned her doctorate from Harvard Business School. She won a prestigious award from the National Science Foundation for inspiring cross-disciplinary research.
Natalia Levina is a Toyota Motors Corp Term Professor of Information Systems at New York University’s Stern School of Business and a part-time research environment professor at the Warwick Business School. Her main research interests focus on how people span organizational, professional, cultural, and other boundaries while developing and using new technologies. Her current work explores AI adoption in professional work, open innovation, blockchain, and firm-crowd relationships.