ORIGINAL PAPER
Co-designing diagnosis: Towards a responsible integration
of Machine Learning decision-support systems in
medical diagnostics
Olya Kudina PhD¹ | Bas de Boer PhD²

¹Department of Values, Technology & Innovation, Section on Ethics and Philosophy of Technology, Delft University of Technology, Delft, the Netherlands
²Philosophy Department, University of Twente, Enschede, the Netherlands

Correspondence
Olya Kudina, Department of Values, Technology & Innovation, Section on Ethics and Philosophy of Technology, Delft University of Technology, Building 31, Jaffalaan 5, 2628 BX Delft, the Netherlands.
Email: o.kudina@tudelft.nl

Funding information
H2020 European Research Council, Grant/Award Number: 788321; 4TU Pride and Prejudice project under the High Tech for a Sustainable Future programme
Abstract
Rationale: This paper aims to show how the focus on eradicating bias from Machine
Learning decision-support systems in medical diagnosis diverts attention from the
hermeneutic nature of medical decision-making and the productive role of bias. We
want to show how the introduction of Machine Learning systems alters the diagnostic process. Reviewing the negative conception of bias and incorporating the mediating role of Machine Learning systems in medical diagnosis are essential for encompassing, critical and informed medical decision-making.
Methods: This paper presents a philosophical analysis, employing the conceptual frame-
works of hermeneutics and technological mediation, while drawing on the case of Machine
Learning algorithms assisting doctors in diagnosis. This paper unravels the non-neutral role
of algorithms in the doctor's decision-making and points to the dialogical nature of interac-
tion not only with the patients but also with the technologies that co-shape the diagnosis.
Findings: Following the hermeneutical model of medical diagnosis, we review the notion
of bias to show how it is an inalienable and productive part of diagnosis. We show how
Machine Learning biases join the human ones to actively shape the diagnostic process,
simultaneously expanding and narrowing medical attention, highlighting certain aspects,
while disclosing others, thus mediating medical perceptions and actions. Based on that,
we demonstrate how doctors can take Machine Learning systems on board for an
enhanced medical diagnosis, while being aware of their non-neutral role.
Conclusions: We show that Machine Learning systems join doctors and patients in
co-designing a triad of medical diagnosis. We highlight that it is imperative to exam-
ine the hermeneutic role of the Machine Learning systems. Additionally, we suggest
including not only the patient, but also colleagues to ensure an encompassing diag-
nostic process, to respect its inherently hermeneutic nature and to work productively
with the existing human and machine biases.
KEYWORDS
hermeneutics, Machine Learning, medical diagnosis, technological mediation
The authors have contributed equally to the development of this article.
Received: 8 September 2020 Revised: 15 December 2020 Accepted: 28 December 2020
DOI: 10.1111/jep.13535
This is an open access article under the terms of the Creative Commons Attribution-NonCommercial License, which permits use, distribution and reproduction in any
medium, provided the original work is properly cited and is not used for commercial purposes.
© 2021 The Authors. Journal of Evaluation in Clinical Practice published by John Wiley & Sons Ltd
1|INTRODUCTION
At the beginning of the 2010s, Drew et al asked a group of 24 radiologists to perform a familiar lung nodule detection task. The radiologists were asked to search for nodules on CT-scans in which, unbeknown to the doctors, a 29×50 mm image of a gorilla had been included. Strikingly, 83% of the radiologists did not notice the gorilla, even though eye-tracking revealed that most of the radiologists who missed the gorilla looked directly at the place where it was located. In the psychological literature, this phenomenon is known as inattentional blindness, during which attention to a particular task makes us blind to other salient phenomena. This suggests that expert radiologists search for particular anomalies located at particular places in the lungs, such that unexpected anomalies at unexpected locations might go unnoticed, potentially with severe medical consequences.[1]
While bias is no stranger to the medical encounters of doctors with patients, it is not clear what happens when it is coupled with the introduction of Machine Learning (ML) algorithms that assist medical diagnosis.
One of the central promises of the use of ML in medical diagnostics is that it will make medical diagnoses more objective by eliminating forms of human bias.[2] Bias might be caused by deficiencies inherent to human perception, such as those discussed in the example above, but it can also arise in doctor-patient relations, for instance through prejudices on the side of the doctor.[3] Because of this, so it is postulated, ML diagnostic systems will be a significant improvement to human capabilities in clinical decision-making in terms of diagnosis, prognosis, and treatment recommendation.[4(p3)] In sum, the introduction of ML systems in medical diagnostics is often presented as an important augmentation to, or even as threatening to replace, human medical expertise.
In this paper, we critically scrutinize the promise of ML in diagnostic practice by drawing attention to its relationship with medical expertise. First, we briefly discuss the ethical issues often raised in relation to the introduction of ML into healthcare broadly conceived. Second, we flesh out a hermeneutic understanding of medical expertise and the diagnostic process, in which biases have a productive, rather than a distorting, role. Third, we make clear how ML can be understood as mediating the hermeneutic process through which a medical diagnosis is established. On the basis of this, we suggest that the introduction of ML systems in medical diagnostics should not be framed as requiring a choice between the expertise of clinicians and the alleged objectivity of ML systems. Finally, we offer some starting points for how ML can be seen as a dialogical partner for medical experts.
2|FROM BIAS TO THE QUESTION OF EXPERTISE
Since ML systems are often presented as a solution to the problem of bias, it should not come as a surprise that both developers of ML systems and doctors who critically reflect on ML search for biases that might be present in algorithms used to make medical diagnoses. When ML systems also suffer from biases, they effectively undermine the promise of developing a more objective way of clinical decision-making. For example, the datasets on which ML systems rely might be biased towards particular healthcare systems, as was the case when IBM launched Watson for Oncology. This assistive system was based on data collected in the American healthcare system and had a bias towards specific ways of drug prescription that are deemed normal in the USA but do not align with cultures of drug prescription in other countries, such as Taiwan.[5] Furthermore, existing datasets typically cover medical problems suffered by white men, leading to poor performance when the resulting systems are applied to other groups, such as younger black women.[4(p5)] For ML systems to live up to their promise of objectivity, it is thus crucial to identify and eliminate such biases in datasets.
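To make the kind of dataset bias described above concrete, consider the following minimal sketch in Python (our illustration, not part of the cited studies; the records, group labels and numbers are hypothetical). It audits a diagnostic classifier by computing its sensitivity separately for each demographic subgroup, the sort of check that reveals when a model trained mostly on one group performs poorly on another:

    # Illustrative sketch: auditing per-subgroup performance of a diagnostic
    # classifier to surface dataset bias. All data and names are hypothetical.
    from collections import defaultdict

    def sensitivity_by_group(records):
        """records: iterable of dicts with keys 'group', 'label' and 'prediction',
        where 1 = disease present and 0 = disease absent.
        Returns the sensitivity (true positive rate) per group."""
        tp = defaultdict(int)  # true positives per group
        fn = defaultdict(int)  # false negatives per group
        for r in records:
            if r["label"] == 1:  # only positive cases matter for sensitivity
                if r["prediction"] == 1:
                    tp[r["group"]] += 1
                else:
                    fn[r["group"]] += 1
        groups = set(tp) | set(fn)
        return {g: tp[g] / (tp[g] + fn[g]) for g in groups if tp[g] + fn[g] > 0}

    # Hypothetical predictions for positive cases from two subgroups:
    records = [
        {"group": "group A", "label": 1, "prediction": 1},
        {"group": "group A", "label": 1, "prediction": 1},
        {"group": "group A", "label": 1, "prediction": 0},
        {"group": "group B", "label": 1, "prediction": 0},
        {"group": "group B", "label": 1, "prediction": 0},
        {"group": "group B", "label": 1, "prediction": 1},
    ]
    print(sensitivity_by_group(records))  # roughly {'group A': 0.67, 'group B': 0.33}

Such a per-group audit does not remove the bias, but it makes visible where the model's performance has been shaped by the data it was given.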
This also explains the centrality of another concern: the opacity of the algorithms on the basis of which clinical decisions are made. Algorithms can be opaque to users because users lack the appropriate training to understand how the algorithm arrives at a certain diagnosis, or because it is inherent to the design of the algorithm that its workings are not intelligible to humans. In both cases, opacity hampers the possibility of detecting potential biases. And if the opacity of algorithms indeed cannot be circumvented, then their potential to make medical diagnoses more objective by eliminating bias cannot be properly assessed either. As a result, researchers are worried about the potentially ethically problematic outcomes that can be expected when ML systems are constructed as black boxes, making it more likely that problems such as the ones mentioned in the previous paragraph remain unnoticed.[6]
The focus of clinicians and developers on bias is to a large extent mirrored in policy documents discussing the impact of ML, in healthcare and beyond.[7-10] Some of the frequently discussed risks concern the individual harm that can be induced by algorithms that make decisions about treatment on the basis of biased datasets, or the unfair advantage that people who are represented in (biased) datasets have over those who are typically under-represented.[11(pp23-24)] In order to avoid such biases and to prevent harm and unfairness, it is often stressed that algorithms should be designed in accordance with principles of transparency and/or explainability.[12]
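As an illustration of what explainability can amount to for an otherwise opaque model, the sketch below (our own, hedged example; the model is just a stand-in function, and permutation importance is only one of many possible techniques, not one prescribed by the cited policy documents) probes a black-box classifier by scrambling one input feature at a time and measuring how much its accuracy drops:

    # Illustrative sketch: permutation-based probing of a black-box diagnostic
    # model. The model is treated as an opaque function; we only observe how its
    # accuracy degrades when a given feature is scrambled. Names are hypothetical.
    import random

    def permutation_importance(predict, X, y, n_features, trials=20, seed=0):
        """predict: a black-box function mapping a feature vector to a class label.
        X: list of feature vectors (lists), y: list of true labels.
        Returns the baseline accuracy and the mean accuracy drop per feature."""
        rng = random.Random(seed)
        baseline = sum(predict(x) == t for x, t in zip(X, y)) / len(y)
        importances = []
        for f in range(n_features):
            drops = []
            for _ in range(trials):
                permuted = [row[:] for row in X]           # copy the data
                column = [row[f] for row in permuted]
                rng.shuffle(column)                        # scramble feature f
                for row, value in zip(permuted, column):
                    row[f] = value
                accuracy = sum(predict(x) == t for x, t in zip(permuted, y)) / len(y)
                drops.append(baseline - accuracy)
            importances.append(sum(drops) / trials)        # larger drop = more important
        return baseline, importances

A probe of this kind does not open the black box itself, but it gives clinicians and developers a rough, after-the-fact indication of which inputs drive the model's output, which is one way in which the transparency principles mentioned above can be operationalized.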
The opacity of ML systems is especially concerning since clinicians reportedly tend to find it challenging to counter algorithm-based judgements and provide independent diagnoses or suggestions for treatment, which affects how they value their own judgements.[13] As a consequence, if the decisions of ML systems are biased, then it seems likely that these biases are reproduced or reified because they remain effectively unchallenged.[7(p181)] Insofar as ethical discussions take objectivity (or the absence thereof) in clinical decision-making to be the central issue at stake, the negative impact of bias must be a focal point, as it is this issue that prevents ML systems from living up to their promises.
However, more recently, ethical discussions on the use of ML in clinical decision-making have started to address ethical concerns related to the introduction of ML beyond the narrow focus on bias.[4] In fact, so it is argued, the belief that algorithms are, in contrast with human beings, harbours of objectivity is a carefully crafted myth.[4(p4),14] While algorithms might outperform humans when it comes to pattern recognition, their ability to attach meaning to patterns or make inferences on the basis of them remains unclear.[2] In one way or another, this suggests that instead of speaking of a competition between humans and ML systems, discussions about how to integrate ML in healthcare practices should be augmented by exploring what kinds of collaboration between doctors and ML systems are desirable.[15] For instance, in the field of mental healthcare, physicians and patients engage in developing ML systems on the patients' smartphones for the detection of onset symptoms.[16] In pathology, collaborative efforts take place to design a diagnostic AI assistant that capitalizes on the mental models of clinicians while utilizing the optimization techniques of ML systems.[17] Radiologists propose strategies for practically integrating ML systems as collaborators in work practice: while such systems can reduce the workload by taking on normal examinations (eg, head CTs or MRIs for headache), current business strategies do not allow integrating the input of ML systems into administrative flows or reimbursement schemes.[18] While evidence on including ML systems as collaborators continues to surface, the early practice-driven efforts already hint at the adjustments to the healthcare process and the reconfiguration of the medical profession[19] that the recognition of ML systems as collaborators requires.
Insofar as a medical diagnosis is concerned with the interpretation of the patient and her health status, ethical discussions that narrowly conceive of an ethics of AI as an ethics of bias might neglect the way ML systems shape medical expertise. After all, if clinical decision-making is more than simple pattern recognition and requires another form of expertise, it is crucial to explore what this expertise is, and in what sense ML systems might contribute to it. This we will do in the next two sections of this paper.
3|EXPERTISE AND MEDICAL DIAGNOSIS: GADAMER'S NOTION OF FORE-UNDERSTANDING
Recently, it has been argued that the demands of transparency and explainability, while important, hold ML algorithms "to an unrealistically high standard [...], possibly owing to an unrealistically high estimate of the degree of transparency attainable from human decision-makers".[20(p662)] Regardless of whether it is justified that the standards we set for ML are exceptionally high, Zerilli et al importantly point to the need to clarify what we take medical (or diagnostic) expertise to be, and if and how it can be outperformed by ML. In other words, a discussion about the potential biases in ML systems must be informed by a discussion about what we consider good human forms of decision-making[21] and about the nature of the expertise exercised by clinicians.
Recently, Grote and Berens argued that the use of ML in diagnostic practice changes the epistemic conditions under which medical expertise is exercised.[22] They note that medical diagnosis is often not a solitary activity of a clinician, but one that also involves discussions with other clinicians who function as peers to diagnostic judgements. The peers offer epistemic import that might support or criticize a certain judgement, making diagnostic expertise effectively distributed among different individuals. These different individuals can engage in a dialogue, each providing reasons for or against a certain diagnosis, and this dialogical process will eventually improve the diagnostic process.[22(p207)] Within such diagnostic processes, clinicians use all kinds of technologies (eg, imaging technologies) that influence, and might support, their judgements, making those technologies also a crucial part of diagnostics already.[23] However, what is crucially different about the involvement of ML as a diagnostic peer is that, insofar as the inferences it makes cannot be articulated, clinicians are unable to judge whether or not their import is epistemically credible.[22(p207)]
The idea that medical diagnoses presuppose some form of situated or distributed expertise nicely illustrates the uncertainties that ML might introduce into diagnostic practices. However, and this is what they seemingly have in common with developers of ML systems, Grote and Berens[22] conceive of expertise as a form of knowledge that is propositionally available to its bearer, such that the steps one takes to come to a certain diagnosis (a) can be reconstructed as a logical argument, and (b) that this reconstruction adequately represents the expertise exercised to come to a diagnosis. Yet, and this is what we will further clarify in this section, there might be another way of thinking about expertise; one that conceives of it as a hermeneutic process.
An image of the physician as an objective judge who weighs different concerns in a logical, inductive manner and iteratively verifies the conclusions became dominant in medicine in the 19th century.[24] This approach to diagnostics and medical expertise was facilitated by the introduction of medical technologies such as the stethoscope and X-ray imaging. Medical tools facilitate the diagnostic process by providing a supposedly direct view into the body of the patient through medical imaging and the quantitative representation of bodily concerns. Leder challenged this model of diagnosis and expertise as untenable in view of the value-laden and historically situated nature of both the physician and the patient, as well as the tacit experiential knowledge that also shapes medical expertise and resists quantification.[25] Instead, building on Gadamer,[26] Leder puts forth a model of diagnosis as an inherently hermeneutic enterprise.
The hermeneutic model of the clinical encounter suggests that the doctor iteratively interprets the patient's symptoms and their visualization by instruments against her own background knowledge and experience to arrive at a diagnosis. The text to interpret here appears in the integration of the bodily signals of the patient (the experiential text), their stories combined with the doctor's hypotheses (the narrative text), the recorded results of the exams (the physical text) and the instrumental input of graphs and numbers (the instrumental text).[25] The doctor reads these texts as an embodied individual, bringing in her own conceptual and experiential frameworks that incorporate tacit knowledge and the relevant technological input. Medical diagnosis and expertise are thus hermeneutic not by method, but ontologically and epistemologically.[24(p131)] Following Leder, "[in] its attempt to expunge interpretive subjectivity, modern medicine thus threatens to expunge the subject [doctor and patient as the interpreters]. This can lead to an undermining of medicine's [...] hermeneutic telos".[25(p22)]
The hermeneutic model of diagnosis and expertise helps to reframe the nature and role of bias. Gadamer, whose work inspired Leder's hermeneutic approach to medicine, understood bias as a productive pre-judgement and fore-understanding that starts the process of interpretation.[26] Such pre-judgements form an "effective history" from which any act of interpretation departs, because they allow an entry into the mindset of another time, place or object. Gadamer discards the modern negative meaning of bias as prejudice and instead relies on its ancient meaning as prior awareness or pre-judgement.[26(p273)] Also in medicine, bias denotes the cumulative potential of the preconceptions, provisional judgements and prejudices that direct a physician to the patient and their illness, being an inalienable part of her hermeneutic situatedness.
However, acknowledging the productive role of bias for medical diagnosis and expertise does not mean that diagnoses are a matter of opinion or preference. As mentioned earlier, medical diagnosis always also presupposes following best practices of consensual validation with colleagues and with an eye to instrumental decision support. Gadamer similarly suggests that interpretation relies on making oneself aware of one's own biases and of how they direct us in viewing new phenomena, even though it is never possible to fully expel them. Viewed as such, bias appears as enabling clinicians to exercise expertise when coming to a medical diagnosis rather than as constituting a hindrance to clinical interpretation: "By acknowledging the interpretive nature of clinical understanding, we leave behind the dream of a pure objectivity. Where there is interpretation there is subjectivity, ambiguity, room for disagreement".[25(p10)]
A potential caveat to Gadamer's hermeneutics when applied to medical diagnosis is that it primarily concerns human bias. Becoming aware of the productive role of bias in decision-making becomes even more difficult when medical diagnosis concerns not just human but also machine bias, for example, in ML algorithms. However, Gadamer's hermeneutics simultaneously points to the impossibility of eradicating bias, because it is an inalienable by-product of the human engineers and designers who developed the AI-based decision-support systems, of the clinicians eventually using these systems, and of the ML systems themselves. Indeed, from the perspective of Gadamer's hermeneutics, the very idea of asking algorithms to be completely free of bias places far too high demands on them when compared with human actors. Just as Zerilli et al have argued that demands for algorithms to be fully transparent presuppose an unrealistically high degree of transparency in human decision-making,[20] the same can be said about the ability of humans to have full access to their own biases and those of others. Put differently, from a hermeneutic point of view, human decision-making also seems to be opaque to a large degree.
A hermeneutic perspective thus points to the need to anticipate and identify the productive role of ML in medical decision-making and act responsibly in light of the non-neutral hermeneutic role of algorithms, instead of focusing on expelling machine bias to ensure the objectivity of the medical diagnosis. This can be done by considering interactions between doctors and algorithms not in the abstract, but as embedded in specific practices. In such practices, once a bias in algorithmic suggestions is noted, doctors can start to identify its relevance within the intricacies of the case and compare it against their experiences and those of their colleagues. As will become clear below, this implies that ML systems should not be treated as offering immediately actionable suggestions before entering specific practices. In the next section, we show how the philosophical approach of postphenomenology can be helpful in this regard to reconceptualize the role of ML algorithms as active mediators in medical encounters.
4|POSTPHENOMENOLOGY AND MEDICAL DIAGNOSIS
In the previous section, we argued that a medical diagnosis can be fruitfully understood as a hermeneutic process in which doctors and patients work together towards a medical diagnosis. Having expertise in this process thus involves both a certain fore-understanding of medical diseases and classifications, and the capacity to match this knowledge with, and update it in light of, the patient's report and instrumental input. In this section, we make clear how ML must be considered as mediating the hermeneutic process through which medical diagnoses are established. To do so, we draw on postphenomenology, an approach within the philosophy of technology concerned with how technologies shape the world to which human beings relate.[27,28]
From a postphenomenological perspective, when people use technologies, these technologies always mediate human perceptions and actions in view of their design and inherent scripts.[27] However, technologies never fully determine how they are used, because the totality of human experiences and prior conceptions, coupled with specific sociocultural settings, productively informs specific technological mediations. Verbeek calls this phenomenon the "co-shaping of subjects and objects"[28] to designate that not only are technological use and its effects influenced by specific users, but the agency and subjectivity of those users also get shaped in relation to the technologies at hand. Viewed through the prism of the technological mediation approach, ML decision-support systems are thus not passive providers of data or neutral diagnostic instruments but actively take part in the diagnostic process, both by providing hermeneutic input and by being a co-interpreter alongside the doctors. ML decision-support systems thus help to shape specific diagnostic pre-judgements and biases, making medical expertise not solely a human affair but one that is mediated by technologies.
ML-based decision-support systems significantly expand and complicate the hermeneutic model of the clinical encounter as put forth by Leder.[25] Building on Leder's model, the doctor has to reconcile different streams of information about the patient in an iterative way: those from the initial anamnesis, the patient's account and the examination, and those that appear on the screen of the decision-support system, guided by numerical representations of lab results and correlations with evidence-based treatments in similar patient histories. However, as Tschandl et al[29] found in their empirical studies on the interaction of clinicians with ML-based support for skin cancer diagnosis, the line between supporting medical decisions and determining them may be thin if not carefully reflected upon. The statistically ranked and at times colour-coded manner in which ML systems visualize the results and suggest treatments can change the doctor's mind regarding their initial diagnosis.[5,29] Tschandl et al further found that the ML suggestions helped less experienced specialists and general practitioners improve the accuracy of their diagnoses by 26%, by changing their initial diagnosis in favour of the one suggested by the ML system when their initial diagnosis did not appear as at least the second or third option suggested by the ML system.[29] More experienced specialists, on the contrary, insisted on their original diagnosis after checking the suggested alternatives, and this diagnosis eventually turned out to be correct.[29(p4)] The experience and confidence of doctors when interpreting and combining the various stages of the diagnostic process were determining factors in an accurate diagnosis, whereby the ML suggestions were perceived as alternatives with which to consider and verify the diagnosis, as a matter of a second opinion. Viewed through the technological mediation lens, the doctors acknowledged and scrutinized the productive role of ML in the diagnostic process, making the decision a matter of weighing both inputs, as an intersection between the interpretative horizons of the doctor and the machine.
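A minimal sketch can make this second-opinion dynamic concrete (our illustration; the diagnoses, probabilities and the top-three cut-off are hypothetical and are not taken from Tschandl et al). The ML output is treated as a ranked list of options against which the clinician's initial diagnosis is checked, flagging it for reconsideration rather than overriding it:

    # Illustrative sketch: treating ranked ML class probabilities as a second
    # opinion. The initial diagnosis is flagged for reconsideration (not
    # overridden) when it falls outside the model's top-k suggestions.
    def second_opinion(ml_probs, initial_diagnosis, top_k=3):
        """ml_probs: dict mapping a diagnosis to the model's probability for it.
        Returns a suggestion to keep or reconsider the initial diagnosis."""
        ranked = sorted(ml_probs, key=ml_probs.get, reverse=True)
        top_suggestions = ranked[:top_k]
        if initial_diagnosis in top_suggestions:
            return {"action": "keep", "ml_top": top_suggestions}
        return {"action": "reconsider and discuss", "ml_top": top_suggestions}

    # Hypothetical multiclass output of a skin-lesion classifier:
    probs = {"melanoma": 0.52, "nevus": 0.31, "seborrheic keratosis": 0.10,
             "dermatofibroma": 0.07}
    print(second_opinion(probs, "dermatofibroma"))
    # -> flags the initial diagnosis for reconsideration, since it is not in the top 3

The point of such a set-up is precisely the one made above: the ranked suggestions enter the diagnostic dialogue as alternatives to weigh, and whether the clinician follows them remains a matter of experience and judgement.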
However, as Tschandl et al also note,[29(p4)] once doctors gain trust in the ML systems to help them reach a correct diagnosis, that trust may lead them astray when the ML systems become faulty, for example, when they are tainted by biased datasets, applied to an unintended target group, or subjected to malicious attacks. This further challenges the epistemic credibility of ML systems in medicine, as suggested by Grote and Berens,[22] and in parallel strengthens their proposal to introduce diagnostic sounding boards in the form of peer panels when ML systems are involved. The case of South Korean doctors as early users of ML-based decision-support systems in cancer treatment suggests that such collaborative diagnostic practices are possible and helpful in reaching a correct diagnosis.
In South Korea, ML systems became involved in accompanying the diagnosis in several hospitals starting from 2016.[30] To maintain diagnostic transparency and treat the ML system as a recommender and not as a definitive judge, a team of at least five doctors, senior and junior, would correlate the options suggested by the ML system with their own to jointly reach a decision.[5] As a positive side-effect, the open manner in which the ML system showed the diagnostic data and the treatment options on a big screen on the wall levelled out the decision-making process. It allowed junior doctors to reflect on the data openly, to debate the recommendation of the ML system and the hypotheses of their senior colleagues, and thus to flatten the hierarchy of the diagnostic process. Such a reflective and collaborative manner of introducing ML-based systems in medicine explicitly addresses both human and machine biases within the iterative diagnostic process: even though it does not offer a way to eliminate machine bias, it can help to productively integrate bias into medical practices by creating the opportunity to compare what the algorithm is offering against the expertise of a doctor and her colleagues. The South Korean case is supported by the recent findings of Tschandl et al, demonstrating that aggregated AI-based multiclass probabilities and crowd wisdom significantly increased the number of correct diagnoses in comparison to individual doctors or AI in isolation.[29(p4)]
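The following sketch pictures one possible way of aggregating clinicians' votes with AI-based multiclass probabilities (our illustration only; the equal weighting, the votes and the probabilities are assumptions, not the procedure used in the South Korean hospitals or by Tschandl et al):

    # Illustrative sketch: combining several clinicians' votes ("crowd wisdom")
    # with AI multiclass probabilities into a single ranking that the team can
    # then debate. The equal weighting of humans and AI is an assumption.
    from collections import Counter

    def aggregate(doctor_votes, ai_probs, ai_weight=0.5):
        """doctor_votes: list of diagnoses, one per clinician.
        ai_probs: dict mapping a diagnosis to the AI's probability for it.
        Returns diagnoses ranked by a weighted mix of human votes and AI output."""
        counts = Counter(doctor_votes)
        n = len(doctor_votes)
        candidates = set(doctor_votes) | set(ai_probs)
        combined = {
            dx: (1 - ai_weight) * (counts[dx] / n) + ai_weight * ai_probs.get(dx, 0.0)
            for dx in candidates
        }
        return sorted(combined.items(), key=lambda item: item[1], reverse=True)

    # Hypothetical team of five clinicians plus an AI system:
    votes = ["nevus", "melanoma", "nevus", "nevus", "melanoma"]
    ai = {"melanoma": 0.7, "nevus": 0.2, "dermatofibroma": 0.1}
    print(aggregate(votes, ai))  # a ranked starting point for team discussion, not a verdict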
Viewed through the prism of technological mediation, ML-based decision-support systems do not surround the doctor with a mute wall of numbers and graphs but help to bring the real world in through continuous feedback loops, learning, and engagement with the technology and other doctors. As becomes visible in the examples discussed above, it does not seem productive to think of ML systems as potential complete replacements of existing clinical practices, but instead as potential collaborators that function within the collective practice of coming to adequate diagnoses and treatment. ML systems can thus be said to mediate what medical expertise is: an integral part of it is being able not to consider the treatments and diagnoses offered by ML systems as immediately actionable, but as something to be integrated into collective diagnostic practices. Instead of treating ML systems as black boxes, medical expertise now also consists of developing the ability to treat them as conversational partners to enter into a dialogue with. This, then, requires contrasting an ML system with one's own biases and treating it as an equally biased dialogical partner. When doing so, medical diagnosis that is accompanied by ML systems becomes an even more nuanced hermeneutic enterprise, without blind trust either in human expertise or in the machine's suggestions. Potentially, this new way of diagnosing becomes less individual and more team-based, where the effectiveness of diagnosis depends on treating machines not as competitors but as collaborators.
5|DISCUSSION: HOW MACHINE LEARNING RE-DISTRIBUTES EXPERTISE AND CO-DESIGNS DIAGNOSIS
With the aid of the technological mediation approach, we showed how decision-supporting ML systems change the hermeneutic process through which medical diagnoses are made, as well as the role of expertise when coming to a diagnosis. Important to highlight is that this perspective implies that there is no need to choose between either the expertise of clinicians or the alleged objectivity of ML systems; a hermeneutic perspective on technological mediation reveals that clinical expertise and ML systems are co-extensive. This implies that we should recognize that ML systems and clinicians inevitably are dialogical partners during the diagnostic process.
Tschandl et al have recently demonstrated how ML systems can help doctors to better identify a specific type of skin lesion, pigmented actinic keratoses.[29] Reverse-engineering the algorithmic workings, Tschandl et al found that whereas the ML system focused on the blemish as well as on the area around it, doctors tend to focus only on the blemish itself. Expanding the area of attention allowed the ML system to spot the chronic UV damage surrounding the blemish, which causes actinic keratosis. The researchers integrated this finding into training medical residents, whose accuracy in detecting actinic keratoses consequently increased from 32.5% to 47.3%.[29(p4)] The researchers suggest that learning from ML systems helps expand the areas of doctors' attention and highlights the value of human collaboration with ML systems.
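To give a rough picture of the kind of analysis behind this finding (a hedged sketch: Tschandl et al worked with class-activation maps of a trained network, whereas below a generic occlusion test is shown, and the image and the model_confidence function are hypothetical stand-ins), one can mask patches of the image and record how much the model's confidence drops, revealing whether the model also relies on the skin surrounding the blemish:

    # Illustrative sketch: occlusion sensitivity. By masking patches of the image
    # and recording the drop in the model's confidence for a given class, one can
    # see which regions the model depends on. model_confidence is hypothetical.
    import numpy as np

    def occlusion_map(image, model_confidence, target_class, patch=16):
        """image: 2D or 3D numpy array; model_confidence: stand-in for a trained
        classifier, returning the confidence for target_class on a given image.
        Returns a coarse heat map of how much each occluded patch lowers that confidence."""
        height, width = image.shape[:2]
        baseline = model_confidence(image, target_class)
        heat = np.zeros((height // patch, width // patch))
        for i in range(0, height - patch + 1, patch):
            for j in range(0, width - patch + 1, patch):
                occluded = image.copy()
                occluded[i:i + patch, j:j + patch] = 0   # black out one patch
                drop = baseline - model_confidence(occluded, target_class)
                heat[i // patch, j // patch] = drop      # large drop = region matters
        return heat  # high values around the lesion would mirror the finding above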
This example suggests that a focus on human-machine collaboration rather than competition can help to improve the accuracy of medical diagnosis and expand the areas of medical attention. This new form of collaboration should acknowledge the mediating, non-neutral import of ML systems. On the one hand, it shows that doctors are not, and never have been, alone in making medical decisions. On the other hand, accounting for the productive role of ML systems in the doctor's decision-making dispels the idea of objectivity and de-biasing in medical practice, drawing attention instead to its inherently hermeneutic nature. From this perspective, any collaboration between clinicians and ML systems presupposes that medical expertise also consists of being able to treat the latter as a conversational partner (just like other team members) that does not offer immediately actionable input, but instead puts forward its own biases that can be compared against the biases of other team members.
The technological mediation lens helps to expand Leder's hermeneutic model of diagnosis with the active impact of technologies. Highlighting the mediating role of ML systems in medical diagnosis would help to make what Leder calls the "hermeneutic telos"[25(p22)] of medical decision-making more nuanced. It helps to retain a coherent overview of the patient by preventing her experience from getting lost in the troves of data and by increasing opportunities for hermeneutic dialogue with the patient, the colleagues and the machine. ML systems can contribute to the interpretative coherence, collaboration and effectiveness of the diagnosis by confronting the doctor with evidence-based alternative possibilities for diagnosis (which also mitigates the physician's biases), and by encouraging consultations with other physicians to account for the inaccuracies of ML systems and the broader social factors that they miss (which additionally mitigates the machine's biases).
Doctors' participation in the development and/or tailoring of ML-based decision-support systems to their specific practice can increase diagnostic effectiveness. The visual way in which ML systems communicate their findings may exert an undue influence on the doctor's decisions, while not all ML support features are relevant for the practice at hand.[29] As Tschandl et al suggest, the form of machine support should be proportional to the task, and physicians can effectively contribute to the joint development and tailoring of ML systems in medical practice.[29(pp2,4)] The increased interaction between doctors and ML systems essentially transforms medical diagnosis into a form of co-design, whereby all actors co-shape each other.
While in this paper we focused on the diagnostic moment, our research points to a further direction to explore in future research: how the technologically mediated diagnostic moment in parallel shapes the medical infrastructure, for example, doctor-nurse relations, the hidden costs of embedding AI technology in the hospital, the hospital organization, etc. Bringing attention to the productive nature of bias in medical diagnosis demonstrates that it is short-sighted to consider the technological factor alone; it needs to be seen in its systemic and sociocultural embedding.
Acknowledging the mediating role of ML systems in clinical decision-making essentially points to a triad of diagnostic co-design: an iterative hermeneutic process between doctors, patients and the ML system. The quality of the interaction between the doctor and the ML systems depends on examining the hermeneutic role of the technology: how it simultaneously expands and narrows medical attention, highlights certain aspects while disclosing others, and thus mediates medical perceptions and actions. Including not only the patient, but also other colleagues in the process helps to ensure an encompassing diagnostic process, to respect its inherently hermeneutic nature and to work productively with existing human and machine biases. In this paper, we have primarily focused on two parts of the triad of co-design: doctors and the ML systems. While elsewhere we have discussed in more detail how ML might shape the relation between doctors and patients,[31] a detailed analysis of this is beyond the scope of the current paper. However, let us conclude with a few words on how the understanding of medical expertise in the collaboration between clinicians and ML systems can be used to think about the role of patients in the diagnostic triad. It is argued that ML will reduce the time that doctors spend on making diagnoses and searching for treatments, time that doctors can consequently redirect to the interaction with patients.[2,32] Advocates of introducing ML in healthcare in general, and in medical diagnostics specifically, allude to the objectivity of ML as a means to make medical practice more human. Our analysis, however, suggests that instead of understanding ML as a way to solve such concerns, we should rather ask how it shapes medical expertise and how it shapes the interactions between doctors and patients. After all, the question of whether or not medical practice eventually will become more human crucially depends on how ML shapes how patients, the most important stakeholders in medical practice, are made present.
One of the potential pitfalls of ML is that it bears the threat of turning the triad of diagnostic co-design into a dyad: since ML systems rely on, and make decisions on the basis of, quantifiable datasets, they implicitly present patients as data, and potentially move the patient's own narrative and experiences to the background.[13] However, as we saw in Leder's account,[25] this information is crucial for how doctors test their hypotheses and eventually come up with a diagnosis. Therefore, ML places an extra demand on patients to be explicitly vocal about their (medical) biography and personal context, which otherwise remain invisible to ML systems. Not every patient can be expected to be capable of doing so, which points to an important concern for doctors working with ML systems that should be a critical part of medical expertise: the responsibility of ensuring that patients are able to narrate their experiences and context is magnified, as is the capability to continue integrating these narrations into the diagnostic triad. In other words, it requires active work to keep the diagnostic triad intact and to prevent patient experience from disappearing from view. From this perspective, keeping medicine "human" consists of maintaining the existence of the diagnostic triad between doctors, ML systems, and patients, rather than eliminating it through an unrealistic pursuit of purified objectivity.
ACKNOWLEDGEMENTS
Olya Kudina's work on this paper has been supported financially by the project Value Change, which received funding from the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation programme under grant agreement No 788321. Bas de Boer's work on this paper has been supported financially by the project Pride and Prejudice, which received funding from 4TU under the High Tech for a Sustainable Future programme.
CONFLICT OF INTEREST
The authors declare no conflict of interest.
ETHICAL APPROVAL
The research conducted in the paper did not involve any human
and/or animal participants.
DATA AVAILABILITY STATEMENT
Data sharing not applicable to this article as no datasets were gener-
ated or analysed during the current study.
ORCID
Olya Kudina https://orcid.org/0000-0001-5374-1687
Bas de Boer https://orcid.org/0000-0002-2009-2198
REFERENCES
1. Drew T, Vo MLH, Wolfe JM. The invisible gorilla strikes again:
sustained inattentional blindness in expert observers. Psychol Sci.
2013;24:1848-1853. https://doi.org/10.1177/0956797613479386.
2. Topol E. Deep Medicine. How Artificial Intelligence Can Make Healthcare
Human Again. New York: Basic Books; 2019.
3. O'Sullivan ED, Schofield SJ. Cognitive bias in clinical medicine. J R Coll
Physicians Edinb. 2018;48:225-232. https://doi.org/10.4997/JRCPE.
2018.306.
4. Morley J, Machado CCV, Burr C, et al. The ethics of AI in health care:
a mapping review. Soc Sci Med. 2020;260:113172. https://doi.org/10.
1016/j.socscimed.2020.113172.
5. Ross C, Swetlitz I. IBM pitched its Watson supercomputer as a revolution in cancer care. It's nowhere close. STAT. September 5, 2017. https://www.statnews.com/2017/09/05/watson-ibm-cancer/. Accessed August 4, 2020.
6. Char DS, Shah NH, Magnus D. Implementing machine learning in
health care - addressing ethical challenges. N Engl J Med. 2018;378:
981-983. https://doi.org/10.1056/NEJMp1714229.
7. Schönberger D. Artificial intelligence in healthcare: a critical analysis of the legal and ethical implications. Int J Law Inf Technol. 2019;27:171-203. https://doi.org/10.1093/ijlit/eaz004.
8. Rowley Y, Turpin R, Walton S. The emergence of artificial intelligence
and machine learning algorithms in healthcare: recommendations to
support governance and regulation [Position paper]. BSI, Association
for Advancement of Medical Instrumentation; 2019. https://www.
bsigroup.com/globalassets/localfiles/en-gb/about-bsi/nsb/
innovation/mhra-ai-paper-2019.pdf. Accessed September 2, 2020.
9. Floridi L, Cowls J, Beltrametti M, et al. AI4People - an ethical framework for a good AI society: opportunities, risks, principles, and recommendations. Mind Mach. 2018;28:689-707.
10. Jobin A, Ienca M, Vayena E. The global landscape of AI ethics guide-
lines. Nat Mach Intell. 2019;1:389-399.
11. Whittaker M, Crawford K, Dobbe R, et al. AI Now Report 2018. AI
Now Institute; 2018. https://ainowinstitute.org/AI_Now_2018_
Report.pdf. Accessed September 2, 2020.
12. Goodman B, Flaxman S. European Union regulations on algorithmic decision-making and a "right to explanation". AI Mag. 2017;38:50-57. https://doi.org/10.1609/aimag.v38i3.2741.
13. Cabitza F, Rasoini R, Gensini GF. Unintended consequences of
machine learning in medicine. JAMA. 2017;318:517-518. https://doi.
org/10.1001/jama.2017.7797.
14. Gillespie T, Boczkowski PJ, Foot KA. Media Technologies: Essays on
Communication, Materiality, and Society. Cambridge, MA: The MIT
Press; 2014.
15. Topol EJ. High-performance medicine: the convergence of human
and artificial intelligence. Nat Med. 2019;25:44-56.
16. Torous J, Wisniewski H, Bird B, et al. Creating a digital health
smartphone app and digital phenotyping platform for mental health
and diverse healthcare needs: an interdisciplinary and collaborative
approach. J Technol Behav Sci. 2019;4:73-85.
17. Cai CJ, Winter S, Steiner D, Wilcox L, Terry M. Hello AI: uncovering
the onboarding needs of medical practitioners for Human-AI collabo-
rative decision-making. Paper presented at: Proceedings of the ACM
on Human-Computer Interaction; November 2019:104. https://doi.
org/10.1145/3359206.
18. Paul HY, Hui FK, Ting DS. Artificial intelligence and radiology: collab-
oration is key. J Am Coll Radiol. 2018;15:781-783.
19. McCoy LG, Nagaraj S, Morgado F, Harish V, Das S, Celi LA. What do
medical students actually need to know about artificial intelligence?
npj Digit Med. 2020;3:86.
20. Zerilli J, Knott A, Maclaurin J, Gavaghan C. Transparency in algorithmic and human decision-making: is there a double standard? Philos Technol. 2019;32:661-683.
21. Coeckelbergh M. AI Ethics. Cambridge, MA: The MIT Press; 2020.
22. Grote T, Berens P. On the ethics of algorithmic decision-making in
healthcare. J Med Ethics. 2020;46:205-211. https://doi.org/10.1136/
medethics-2019-105586.
23. van Baalen S, Carusi A, Sabroe I, Kiely DG. A social-technological
epistemology of clinical decision-making as mediated by imaging.
J Eval Clin Pract. 2017;23:949-958. https://doi.org/10.1111/jep.
12637.
24. Svenaeus F. The Hermeneutics of Medicine and the Phenomenology of
Health: Steps Towards a Philosophy of Medical Practice. Vol 5. Dor-
drecht: Springer Science & Business Media; 2013.
25. Leder D. Clinical interpretation: the hermeneutics of medicine. Theor
Med. 1990;11:9-24.
26. Gadamer H-G. Truth and Method. New York: Crossroad; 2004 /1975.
27. Ihde D. Philosophy of Technology: An Introduction. New York: Paragon
House; 1993.
28. Verbeek P-P. What Things Do: Philosophical Reflections on Technology,
Agency, and Design. University Park, PA: Pennsylvania State Univer-
sity Press; 2005.
29. Tschandl P, Rinner C, Apalla Z, et al. Human-computer collaboration for skin cancer recognition. Nat Med. 2020;26:1229-1234. https://doi.org/10.1038/s41591-020-0942-0.
30. Yoon S-W. Korea's third AI-based oncology center to open next
month. The Korea Times. March 16, 2017. http://www.koreatimes.co.
kr/www/tech/2017/03/129_225819.html. Accessed August 4, 2020.
31. de Boer B, Kudina O. What is morally at stake when using algorithms
to make medical diagnoses? Expanding the discussion beyond risks
and harms. Theor Med Bioeth (in press).
32. Chung J, Zink A. Hey Watson - Can I sue you for malpractice? Exam-
ining the liability of Artificial Intelligence in medicine. Asia Pac J Health
Law Ethics. 2018;11:51-80.
How to cite this article: Kudina O, de Boer B. Co-designing diagnosis: Towards a responsible integration of Machine Learning decision-support systems in medical diagnostics. J Eval Clin Pract. 2021;1-8. https://doi.org/10.1111/jep.13535
... This iterative process is essential in studies employing machine learning algorithms, as it helps to bridge computational challenges with expert insights. However, as shown by our initial attempt using a single k-means clustering with three clusters, the application of machine learning to formulate hypotheses in medical or veterinary fields must be undertaken with caution [40][41][42]. ...
Article
Full-text available
Aquatic training has been integrated into equine rehabilitation and training programs for several decades. While the cardiovascular effects of this training have been explored in previous studies, limited research exists on the locomotor patterns exhibited during the swimming cycle. This study aimed to analyze three distinct swimming strategies, identified by veterinarians, based on the propulsion phases of each limb: (S1) two-beat cycle with lateral overlap, (S2) two-beat cycle with diagonal overlap, and (S3) four-beat cycle. 125 underwater videos from eleven horses accustomed to swimming were examined to quantify the differences in locomotor patterns between these strategies. Initially, a classifier was developed to categorize 125 video segments into four groups (CatA to CatD). The results demonstrated that these categories correspond to specific swimming strategies, with CatA aligning with S1, CatB with S2, and CatC and CatD representing variations of S3. This classification highlights that two key parameters, lateral and diagonal ratios, are indeed effective in distinguishing between the different swimming strategies. Additionally, coordination patterns were analyzed in relation to these swimming strategies. One of the primary findings is the variability in swimming strategies both within and between individual horses. While five horses consistently maintained the same strategy throughout their swimming sessions, six others exhibited variations in their strategy between laps. This suggests that factors such as swimming direction, pauses between laps, and fatigue may influence the selection of swimming strategy. This study offers new insights into the locomotor patterns of horses during aquatic training and has implications for enhancing the design of rehabilitation protocols.
... Rather, we should evaluate whether and how a technology should be introduced into our societies on a case-specific, context-dependent basis. Contributing to Technology Assessment (TA) is indeed one of the most fruitful assets of the postphenomenological approach (e.g., de Boer et al., 2018;Kudina & de Boer, 2021;Morrison, 2020;Mykhailov, 2023;Wellner & Mykhailov, 2023). I think that it could be rendered even more consistent by appreciating how technology shapes human evolution. ...
Article
Full-text available
In this paper, I aim to assess whether postphenomenology’s ontological framework is suitable for making sense of the most recent technoscientific developments, with special reference to the case of AI-based technologies. First, I will argue that we may feel diminished by those technologies seemingly replicating our higher-order cognitive processes only insofar as we regard technology as playing no role in the constitution of our core features. Secondly, I will highlight the epistemological tension underlying the account of this dynamic submitted by postphenomenology. On the one hand, postphenomenology’s general framework prompts us to conceive of humans and technologies as mutually constituting one another. On the other, the postphenomenological analyses of particular human-technology relations, which Peter-Paul Verbeek calls cyborg relations and hybrid intentionality, seem to postulate the existence of something exclusively human that technology would only subsequently mediate. Thirdly, I will conclude by proposing that postphenomenology could incorporate into its ontology insights coming from other approaches to the study of technology, which I label as human constitutive technicity in the wake of Peter Sloterdijk’s and Bernard Stiegler’s philosophies. By doing so, I believe, postphenomenology could better account for how developments in AI prompt and possibly even force us to revise our self-representation. From this viewpoint, I will advocate for a constitutive role of technology in shaping the human lifeform not only in the phenomenological-existential sense of articulating our relation to the world but also in the onto-anthropological sense of influencing our evolution.
... After all, medical ML systems for decision-support are not independent decision-makers, but sociotechnical agents that are (permanently) embedded in a collaborative decision-making process with one or more human clinicians. [54,55] Wherever a clinician decides to use a medical ML system while engaged in team-based decision-making themselves, the medical ML system will also become embedded in a team-based decision-making dynamic. Indeed, medical ML systems are more strongly regulated by team-based decision-making than human clinicians. ...
Article
Full-text available
It is commonly accepted that clinicians are ethically obligated to disclose their use of medical machine learning systems to patients, and that failure to do so would amount to a moral fault for which clinicians ought to be held accountable. Call this "the disclosure thesis." Four main arguments have been, or could be, given to support the disclosure thesis in the ethics literature: the risk-based argument, the rights-based argument, the materiality argument, and the autonomy argument. In this article, I argue that each of these four arguments are unconvincing, and therefore, that the disclosure thesis ought to be rejected. I suggest that mandating disclosure may also even risk harming patients by providing stakeholders with a way to avoid accountability for harm that results from improper applications or uses of these systems.
... Particularly, to counteract cognitive biases associated with an over-reliance on decision support technologies, ML algorithms have recently been utilized as tools for offering second opinions [8,9]. In this context, they are viewed as cognitive supports with specialized capacities, designed to confirm or revise (i.e., augment) decisions initially made by clinicians, rather than merely automating the clinical decision-making process [10]. Several studies have explored the impact of algorithmic assistance on clinicians' diagnostic performance when supplemented by a second opinion from an ML algorithm. ...
Article
Full-text available
Background The frequency of hip and knee arthroplasty surgeries has been rising steadily in recent decades. This trend is attributed to an aging population, leading to increased demands on healthcare systems. Fast Track (FT) surgical protocols, perioperative procedures designed to expedite patient recovery and early mobilization, have demonstrated efficacy in reducing hospital stays, convalescence periods, and associated costs. However, the criteria for selecting patients for FT procedures have not fully capitalized on the available patient data, including patient-reported outcome measures (PROMs). Methods Our study focused on developing machine learning (ML) models to support decision making in assigning patients to FT procedures, utilizing data from patients’ self-reported health status. These models are specifically designed to predict the potential health status improvement in patients initially selected for FT. Our approach focused on techniques inspired by the concept of controllable AI. This includes eXplainable AI (XAI), which aims to make the model’s recommendations comprehensible to clinicians, and cautious prediction, a method used to alert clinicians about potential control losses, thereby enhancing the models’ trustworthiness and reliability. Results Our models were trained and tested using a dataset comprising 899 records from individual patients admitted to the FT program at IRCCS Ospedale Galeazzi-Sant’Ambrogio. After training and selecting hyper-parameters, the models were assessed using a separate internal test set. The interpretable models demonstrated performance on par or even better than the most effective ‘black-box’ model (Random Forest). These models achieved sensitivity, specificity, and positive predictive value (PPV) exceeding 70%, with an area under the curve (AUC) greater than 80%. The cautious prediction models exhibited enhanced performance while maintaining satisfactory coverage (over 50%). Further, when externally validated on a separate cohort from the same hospital-comprising patients from a subsequent time period-the models showed no pragmatically notable decline in performance. Conclusions Our results demonstrate the effectiveness of utilizing PROMs as basis to develop ML models for planning assignments to FT procedures. Notably, the application of controllable AI techniques, particularly those based on XAI and cautious prediction, emerges as a promising approach. These techniques provide reliable and interpretable support, essential for informed decision-making in clinical processes.
... In this context of several interacting complex comorbidities and data generation, Artificial Intelligence (AI) technologies can support decision-making on how best to manage the different levels of disease activity and available resources [4]. For AI technologies to be effectively adopted, adapted and implemented, it is imperative that health care staff managers are involved in the design of these technologies [5]. ...
... These techniques have been promising in helping doctors by providing accurate prognostic predictions, disease detection, and image-based diagnosis. In addition, machine learning algorithms have become instruments in addressing challenges such as class imbalances in medical datasets, thereby improving the accuracy and reliability of predictive models [3], [23]. ...
Article
Full-text available
Nephropathy is a severe diabetic complication affecting the kidneys that presents a substantial risk to patients. It often progresses to renal failure and other critical health issues. Early and accurate prediction of nephropathy is paramount for effective intervention, patient well-being, and healthcare resource optimization. This research used medical records from 500 datasets of diabetic patients with imbalanced classes. The main goal of this study is to get high-performance predictive models for nephropathy. So, this study suggests a new way to deal with the common problem of having too little or too much data when trying to predict nephropathy: adding more data through adaptive synthetic sampling (ADASYN). This technique is particularly pertinent in ensemble machine-learning methods like Random Forest, AdaBoost, and bagging (Adabag). By increasing the number of instances of minority classes, it tries to reduce the bias that comes with imbalanced datasets, which should lead to more accurate and strong predictive models in the long run. The experimental results show an improving 4% rise in performance evaluation such as precision, recall, accuracy, and f1-score, especially for the ensemble methods. Two contributions of this research are highlighted here: first, the utilization of adaptive synthetic sampling data to improve the balance and diversity of the training dataset. The second contribution is incorporating ensemble methods within machine learning algorithms to enhance the accuracy and robustness of diabetic nephropathy detection.
... Innovative AI generative video technology is capable of creating photorealistic content not only for living celebrities but also for resurrecting famous people who have already passed away. Complex Machine Learning algorithms enable new scientific practices of data interpretation and so create a new situation of scientific explanation of nature and human beings (Kudina & de Boer, 2021). While postphenomenology has analyzed AI from various perspectives, such as technological intentionality in artificial neural networks (Mykhailov & Liberati, 2022), algorithmic biases and non-neutrality of AI models (Wellner & Rothman, 2020), the problem of the black-box (Friedrich et al., 2022) and recent postphenomenology of ChatGPT (Laaksoharju et al., 2023), the progress in the field of AI over the last several months has brought forth radically new challenges that must be addressed from a strong philosophical standpoint. ...
Research
Full-text available
This is a call for papers in the special issue "Postphenomenology in the Age of AI: Prospects, Challenges, Opportunities" for the Journal of Human-Technology Relations. Guest Editor - Dr. Dmytro Mykhailov For more details, please check the file attached or visit a webpage for the special issue - https://journals.open.tudelft.nl/jhtr/announcement/view/401
... 74,75 At present, machine learning algorithms can be roughly divided into three categories: supervised learning, unsupervised learning, and reinforcement learning. They play a crucial role in many application scenarios such as image recognition, [76][77][78] natural language processing, [79][80][81] traffic prediction, 82-84 medical diagnosis [85][86][87] and so on. In recent years, LIBS combined with machine learning algorithms has become a hot topic. ...
Article
Full-text available
In this paper, we examine the qualitative moral impact of machine learning-based clinical decision support systems in the process of medical diagnosis. To date, discussions about machine learning in this context have focused on problems that can be measured and assessed quantitatively, such as by estimating the extent of potential harm or calculating incurred risks. We maintain that such discussions neglect the qualitative moral impact of these technologies. Drawing on the philosophical approaches of technomoral change and technological mediation theory, which explore the interplay between technologies and morality, we present an analysis of concerns related to the adoption of machine learning-aided medical diagnosis. We analyze anticipated moral issues that machine learning systems pose for different stakeholders, such as bias and opacity in the way that models are trained to produce diagnoses, changes to how health care providers, patients, and developers understand their roles and professions, and challenges to existing forms of medical legislation. Albeit preliminary in nature, the insights offered by the technomoral change and the technological mediation approaches expand and enrich the current discussion about machine learning in diagnostic practices, bringing distinct and currently underexplored areas of concern to the forefront. These insights can contribute to a more encompassing and better informed decision-making process when adapting machine learning techniques to medical diagnosis, while acknowledging the interests of multiple stakeholders and the active role that technologies play in generating, perpetuating, and modifying ethical concerns in health care.
Article
Full-text available
The rapid increase in telemedicine coupled with recent advances in diagnostic artificial intelligence (AI) create the imperative to consider the opportunities and risks of inserting AI-based support into new paradigms of care. Here we build on recent achievements in the accuracy of image-based AI for skin cancer diagnosis to address the effects of varied representations of AI-based support across different levels of clinical expertise and multiple clinical workflows. We find that good quality AI-based support of clinical decision-making improves diagnostic accuracy over that of either AI or physicians alone, and that the least experienced clinicians gain the most from AI-based support. We further find that AI-based multiclass probabilities outperformed content-based image retrieval (CBIR) representations of AI in the mobile technology environment, and AI-based support had utility in simulations of second opinions and of telemedicine triage. In addition to demonstrating the potential benefits associated with good quality AI in the hands of non-expert clinicians, we find that faulty AI can mislead the entire spectrum of clinicians, including experts. Lastly, we show that insights derived from AI class-activation maps can inform improvements in human diagnosis. Together, our approach and findings offer a framework for future studies across the spectrum of image-based diagnostics to improve human–computer collaboration in clinical practice.
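Because this abstract points to class-activation maps (CAMs) as a source of insight for human diagnosis, here is a minimal PyTorch sketch of how such a map can be computed for a CNN with global average pooling. This is a generic illustration under stated assumptions, not the study's implementation: ResNet-18 and the 224x224 dummy input are placeholders for a dermoscopic-image classifier.

```python
# Hedged sketch of a class-activation map (CAM) for a GAP-based CNN.
# ResNet-18 and the random input tensor are stand-ins, not the study's model.
import torch
import torch.nn.functional as F
from torchvision import models

model = models.resnet18(weights=None)  # pretrained weights could be loaded here
model.eval()

feature_maps = {}

def capture(module, inputs, output):
    # Store the final convolutional feature maps, shape (N, C, H, W).
    feature_maps["last_conv"] = output.detach()

model.layer4.register_forward_hook(capture)

def class_activation_map(image, class_idx=None):
    """Return a normalized CAM for the predicted (or given) class."""
    with torch.no_grad():
        logits = model(image)                        # (1, num_classes)
    if class_idx is None:
        class_idx = logits.argmax(dim=1).item()
    fmap = feature_maps["last_conv"][0]              # (C, H, W)
    weights = model.fc.weight[class_idx]             # (C,) classifier weights
    cam = torch.einsum("c,chw->hw", weights, fmap)   # weighted sum over channels
    cam = F.relu(cam)                                # keep positive evidence only
    cam = (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)
    return cam, class_idx

# Example with a dummy image tensor; a real pipeline would normalize a photo.
cam, predicted_class = class_activation_map(torch.rand(1, 3, 224, 224))
print(cam.shape, predicted_class)   # e.g. torch.Size([7, 7]) and a class index
```

Upsampling the 7x7 map to the input resolution and overlaying it on the image is what lets clinicians see which regions drove the model's prediction.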
Article
Full-text available
With emerging innovations in artificial intelligence (AI) poised to substantially impact medical practice, interest in training current and future physicians about the technology is growing. Alongside comes the question of what, precisely, medical students should be taught. While competencies for the clinical usage of AI are broadly similar to those for any other novel technology, there are qualitative differences of critical importance to concerns regarding explainability, health equity, and data security. Drawing on experiences at the University of Toronto Faculty of Medicine and MIT Critical Data’s “datathons”, the authors advocate for a dual-focused approach: combining robust data science-focused additions to baseline health research curricula and extracurricular programs to cultivate leadership in this space.
Article
Full-text available
In recent years, a plethora of high-profile scientific publications has reported on machine learning algorithms outperforming clinicians in medical diagnosis or treatment recommendations. This has sparked interest in deploying relevant algorithms with the aim of enhancing decision-making in healthcare. In this paper, we argue that instead of straightforwardly enhancing the decision-making capabilities of clinicians and healthcare institutions, deploying machine learning algorithms entails trade-offs at the epistemic and the normative level. Whereas involving machine learning might improve the accuracy of medical diagnosis, it comes at the expense of opacity when trying to assess the reliability of a given diagnosis. Drawing on literature in social epistemology and moral responsibility, we argue that the uncertainty in question potentially undermines the epistemic authority of clinicians. Furthermore, we elucidate potential pitfalls of involving machine learning in healthcare with respect to paternalism, moral responsibility and fairness. Finally, we discuss how the deployment of machine learning algorithms might shift the evidentiary norms of medical diagnosis. In this regard, we hope to lay the grounds for further ethical reflection on the opportunities and pitfalls of machine learning for enhancing decision-making in healthcare.
Article
Full-text available
Although rapid advances in machine learning have made it increasingly applicable to expert decision-making, the delivery of accurate algorithmic predictions alone is insufficient for effective human-AI collaboration. In this work, we investigate the key types of information medical experts desire when they are first introduced to a diagnostic AI assistant. In a qualitative lab study, we interviewed 21 pathologists before, during, and after being presented with deep neural network (DNN) predictions for prostate cancer diagnosis, to learn the types of information that they desired about the AI assistant. Our findings reveal that, far beyond understanding the local, case-specific reasoning behind any model decision, clinicians desired upfront information about basic, global properties of the model, such as its known strengths and limitations, its subjective point-of-view, and its overall design objective, that is, what it is designed to be optimized for. Participants compared these information needs to the collaborative mental models they develop of their medical colleagues when seeking a second opinion: the medical perspectives and standards that those colleagues embody, and the compatibility of those perspectives with their own diagnostic patterns. These findings broaden and enrich discussions surrounding AI transparency for collaborative decision-making, providing a richer understanding of what experts find important in their introduction to AI assistants before integrating them into routine practice.
Article
Full-text available
In the past five years, private companies, research institutions and public sector organizations have issued principles and guidelines for ethical artificial intelligence (AI). However, despite an apparent agreement that AI should be ‘ethical’, there is debate about both what constitutes ‘ethical AI’ and which ethical requirements, technical standards and best practices are needed for its realization. To investigate whether a global agreement on these questions is emerging, we mapped and analysed the current corpus of principles and guidelines on ethical AI. Our results reveal a global convergence emerging around five ethical principles (transparency, justice and fairness, non-maleficence, responsibility and privacy), with substantive divergence in relation to how these principles are interpreted, why they are deemed important, what issue, domain or actors they pertain to, and how they should be implemented. Our findings highlight the importance of integrating guideline-development efforts with substantive ethical analysis and adequate implementation strategies.
Article
Full-text available
As the potential of smartphone apps and sensors for healthcare and clinical research continues to expand, there is a concomitant need for open, accessible, and scalable digital tools. While many current app platforms offer useful solutions for either clinicians or patients, fewer seek to serve both and support the therapeutic relationship between them. Thus, we aimed to create a novel smartphone platform at the intersection of patient demands for trust, control, and community and clinician demands for transparent, data driven, and translational tools. The resulting LAMP platform has evolved through numerous iterations and with much feedback from patients, designers, sociologists, advocates, clinicians, researchers, app developers, and philanthropists. As an open and free tool, the LAMP platform continues to evolve as reflected in its current diverse use cases across research and clinical care in psychiatry, neurology, anesthesia, and psychology. In this paper, we explore the motivation, features, current progress, and next steps to pair the platform for use in a new digital psychiatry clinic, to advance digital interventions for youth mental health, and to bridge gaps in available mental health care for underserved patient groups. The code for the LAMP platform is freely shared with this paper to encourage others to adapt and improve on our team’s efforts.
Article
Artificial intelligence (AI) is perceived as the most transformative technology of the 21st century. Healthcare has been identified as an early candidate to be revolutionized by AI technologies. Various clinical and patient-facing applications have already reached healthcare practice with the potential to ease the pressure on healthcare staff, bring down costs and ultimately improve the lives of patients. However, various concerns have been raised as regards the unique properties and risks inherent to AI technologies. This article aims at providing an early stage contribution with a holistic view on the ‘decision-making’ capacities of AI technologies. The possible ethical and legal ramifications will be discussed against the backdrop of the existing frameworks. I will conclude that the present structures are largely fit to deal with the challenges AI technologies are posing. In some areas, sector-specific revisions of the law may be advisable, particularly concerning non-discrimination and product liability.
Article
This article presents a mapping review of the literature concerning the ethics of artificial intelligence (AI) in health care. The goal of this review is to summarise current debates and identify open questions for future research. Five literature databases were searched to support the following research question: how can the primary ethical risks presented by AI-health be categorised, and what issues must policymakers, regulators and developers consider in order to be ‘ethically mindful’? A series of screening stages was carried out (for example, removing articles that focused on digital health in general, such as data sharing, data access, data privacy, surveillance/nudging, consent, ownership of health data, and evidence of efficacy), yielding a total of 156 papers that were included in the review. We find that ethical issues can be (a) epistemic, related to misguided, inconclusive or inscrutable evidence; (b) normative, related to unfair outcomes and transformative effects; or (c) related to traceability. We further find that these ethical issues arise at six levels of abstraction: individual, interpersonal, group, institutional, and societal or sectoral. Finally, we outline a number of considerations for policymakers and regulators, mapping these to the existing literature, and categorising each as epistemic, normative or traceability-related and at the relevant level of abstraction. Our goal is to inform policymakers, regulators and developers of what they must consider if they are to enable health and care systems to capitalise on the dual advantage of ethical AI: maximising the opportunities to cut costs, improve care, and improve the efficiency of health and care systems, whilst proactively avoiding the potential harms. We argue that if action is not swiftly taken in this regard, a new ‘AI winter’ could occur due to chilling effects related to a loss of public trust in the benefits of AI for health care.
Book
An accessible synthesis of ethical issues raised by artificial intelligence that moves beyond hype and nightmare scenarios to address concrete questions. Artificial intelligence powers Google's search engine, enables Facebook to target advertising, and allows Alexa and Siri to do their jobs. AI is also behind self-driving cars, predictive policing, and autonomous weapons that can kill without human intervention. These and other AI applications raise complex ethical issues that are the subject of ongoing debate. This volume in the MIT Press Essential Knowledge series offers an accessible synthesis of these issues. Written by a philosopher of technology, AI Ethics goes beyond the usual hype and nightmare scenarios to address concrete questions. Mark Coeckelbergh describes influential AI narratives, ranging from Frankenstein's monster to transhumanism and the technological singularity. He surveys relevant philosophical discussions: questions about the fundamental differences between humans and machines and debates over the moral status of AI. He explains the technology of AI, describing different approaches and focusing on machine learning and data science. He offers an overview of important ethical issues, including privacy concerns, responsibility and the delegation of decision making, transparency, and bias as it arises at all stages of data science processes. He also considers the future of work in an AI economy. Finally, he analyzes a range of policy proposals and discusses challenges for policymakers. He argues for ethical practices that embed values in design, translate democratic values into practices and include a vision of the good life and the good society.