
Abstract and Figures

With the rapid development of artificial intelligence have come concerns about how machines will make moral decisions, and the major challenge of quantifying societal expectations about the ethical principles that should guide machine behaviour. To address this challenge, we deployed the Moral Machine, an online experimental platform designed to explore the moral dilemmas faced by autonomous vehicles. This platform gathered 40 million decisions in ten languages from millions of people in 233 countries and territories. Here we describe the results of this experiment. First, we summarize global moral preferences. Second, we document individual variations in preferences, based on respondents’ demographics. Third, we report cross-cultural ethical variation, and uncover three major clusters of countries. Fourth, we show that these differences correlate with modern institutions and deep cultural traits. We discuss how these preferences can contribute to developing global, socially acceptable principles for machine ethics. All data used in this article are publicly available.
Robustness checks: external validation of three factors. Calculated values correspond to values in Fig. 2a (AMCE calculated using conjoint analysis). For example, ‘Sparing Pedestrians [Relation to AV]’ refers to the difference between the probability of sparing pedestrians, and the probability of sparing passengers (attribute name: Relation to AV), aggregated over all other attributes. Error bars represent 95% confidence intervals of the means.
a, Validation of textual description (seen versus not seen). By default, respondents see only the visual representation of a scenario. Interpretation of what type of characters they represent (for example, female doctor) may not be obvious. Optionally, respondents can read a textual description of the scenario by clicking on ‘see description’. This panel shows that direction and (except in one case) order of effect estimates remain stable. The magnitude of the effects increases for respondents who read the textual descriptions, which means that the effects reported in Fig. 2a were not overestimated because of visual ambiguity.
b, Validation of device used (desktop versus mobile). Direction and order of effect estimates remain stable regardless of whether respondents used desktop or mobile devices when completing the task.
c, Validation of data set (all data versus full first-session data versus survey-only data). Direction and order of effect estimates remain stable regardless of whether the data used in analysis are all data, data restricted to only the first completed (13-scenario) session by any user, or data restricted to completed sessions after which the demographic survey was taken. First completed session by any user is an interesting subset of the data because respondents had not seen their summary of results yet, and respondents ended up completing the session. Survey-only data are also interesting given that the conclusions about individual variations in the main paper and from Extended Data Fig. 3 and Extended Data Table 1 are drawn from this subset. See Supplementary Information for more details.
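For readers unfamiliar with the AMCE quantity plotted in these panels, the sketch below shows the basic contrast it summarizes: the difference between the probability of sparing one attribute level (for example, pedestrians) and the probability of sparing the other (passengers), aggregated over all other attributes, with a bootstrap 95% confidence interval. The column names and the bootstrap are illustrative assumptions; the published estimates come from a conjoint analysis, which this toy calculation does not reproduce.

```python
# Minimal sketch of an AMCE-style contrast for the 'Relation to AV' attribute:
# P(spared | pedestrian) - P(spared | passenger), aggregated over all other
# attributes, with a bootstrap 95% confidence interval. Column names
# ('relation_to_av', 'spared') are hypothetical; the paper's estimates are
# produced by conjoint analysis, not by this simplified calculation.
import numpy as np
import pandas as pd

def sparing_contrast(responses: pd.DataFrame, n_boot: int = 2000, seed: int = 0):
    """responses: one row per character group shown, with columns
    'relation_to_av' in {'pedestrian', 'passenger'} and 'spared' in {0, 1}."""
    rng = np.random.default_rng(seed)
    ped = responses.loc[responses["relation_to_av"] == "pedestrian", "spared"].to_numpy()
    pas = responses.loc[responses["relation_to_av"] == "passenger", "spared"].to_numpy()
    point = ped.mean() - pas.mean()  # probability difference
    boots = [rng.choice(ped, ped.size).mean() - rng.choice(pas, pas.size).mean()
             for _ in range(n_boot)]
    low, high = np.percentile(boots, [2.5, 97.5])
    return point, (low, high)
```

A positive value would indicate that, other things being equal, pedestrians are spared more often than passengers, which is the quantity labelled ‘Sparing Pedestrians [Relation to AV]’ in the panels above.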
… 
ARTICLE https://doi.org/10.1038/s41586-018-0637-6
The Moral Machine experiment
Edmond Awad1, Sohan Dsouza1, Richard Kim1, Jonathan Schulz2, Joseph Henrich2, Azim Shariff3*, Jean-François Bonnefon4* & Iyad Rahwan1,5*
We are entering an age in which machines are tasked not only to promote well-being and minimize harm, but also to distribute the well-being they create, and the harm they cannot eliminate. Distribution of well-being and harm inevitably creates tradeoffs, whose resolution falls in the moral domain1–3. Think of an autonomous vehicle that is about to crash, and cannot find a trajectory that would save everyone. Should it swerve onto one jaywalking teenager to spare its three elderly passengers? Even in the more common instances in which harm is not inevitable, but just possible, autonomous vehicles will need to decide how to divide up the risk of harm between the different stakeholders on the road. Car manufacturers and policymakers are currently struggling with these moral dilemmas, in large part because they cannot be solved by any simple normative ethical principles such as Asimov’s laws of robotics4.
Asimov’s laws were not designed to solve the problem of universal machine ethics, and they were not even designed to let machines distribute harm between humans. They were a narrative device whose goal was to generate good stories, by showcasing how challenging it is to create moral machines with a dozen lines of code. And yet, we do not have the luxury of giving up on creating moral machines5–8. Autonomous vehicles will cruise our roads soon, necessitating agreement on the principles that should apply when, inevitably, life-threatening dilemmas emerge. The frequency at which these dilemmas will emerge is extremely hard to estimate, just as it is extremely hard to estimate the rate at which human drivers find themselves in comparable situations. Human drivers who die in crashes cannot report whether they were faced with a dilemma; and human drivers who survive a crash may not have realized that they were in a dilemma situation. Note, though, that ethical guidelines for autonomous vehicle choices in dilemma situations do not depend on the frequency of these situations. Regardless of how rare these cases are, we need to agree beforehand how they should be solved.
The key word here is ‘we’. As emphasized by former US president Barack Obama9, consensus in this matter is going to be important. Decisions about the ethical principles that will guide autonomous vehicles cannot be left solely to either the engineers or the ethicists. For consumers to switch from traditional human-driven cars to autonomous vehicles, and for the wider public to accept the proliferation of artificial intelligence-driven vehicles on their roads, both groups will need to understand the origins of the ethical principles that are programmed into these vehicles10. In other words, even if ethicists were to agree on how autonomous vehicles should solve moral dilemmas, their work would be useless if citizens were to disagree with their solution, and thus opt out of the future that autonomous vehicles promise in lieu of the status quo. Any attempt to devise artificial intelligence ethics must be at least cognizant of public morality.
Accordingly, we need to gauge social expectations about how autonomous vehicles should solve moral dilemmas. This enterprise, however, is not without challenges11. The first challenge comes from the high dimensionality of the problem. In a typical survey, one may test whether people prefer to spare many lives rather than few9,12,13; or whether people prefer to spare the young rather than the elderly14,15; or whether people prefer to spare pedestrians who cross legally, rather than pedestrians who jaywalk; or yet some other preference, or a simple combination of two or three of these preferences. But combining a dozen such preferences leads to millions of possible scenarios, requiring a sample size that defies any conventional method of data collection.
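To see why the scenario space grows so quickly, consider the back-of-the-envelope count below. The attribute names and level counts are placeholders invented for illustration, not the Moral Machine's actual design; the point is only that allowing each side of a dilemma to contain a small group drawn from many character types already pushes the number of distinct scenarios far beyond what a conventional survey could sample.

```python
# Illustrative count of distinct dilemma scenarios. All numbers are hypothetical
# placeholders; they only show how combining a handful of factors explodes the
# scenario space well past what a conventional survey could cover.
from math import comb

character_types = 20   # e.g. man, woman, child, doctor, dog, ... (placeholder)
group_size = 5         # characters shown on one side of the dilemma (placeholder)
legality_levels = 3    # legal crossing, jaywalking, not applicable (placeholder)

# Number of possible character groups on one side: multisets of `group_size`
# characters drawn from `character_types` types, i.e. C(n + k - 1, k).
groups_per_side = comb(character_types + group_size - 1, group_size)

scenarios = groups_per_side ** 2 * legality_levels  # two sides, plus legality
print(f"{groups_per_side:,} groups per side, {scenarios:,} scenarios overall")
```

Even with these toy numbers the scenario space runs into the billions, which is why the platform generates scenarios by sampling (the exploration strategy mentioned later in the text) rather than by enumerating them.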
The second challenge makes sample size requirements even more daunting: if we are to make progress towards universal machine ethics (or at least to identify the obstacles thereto), we need a fine-grained understanding of how different individuals and countries may differ in their ethical preferences16,17. As a result, data must be collected worldwide, in order to assess demographic and cultural moderators of ethical preferences.
As a response to these challenges, we designed the Moral Machine, a multilingual online ‘serious game’ for collecting large-scale data on how citizens would want autonomous vehicles to solve moral dilemmas in the context of unavoidable accidents. The Moral Machine attracted worldwide attention, and allowed us to collect 39.61 million decisions from 233 countries, dependencies, or territories (Fig. 1a). In the main interface of the Moral Machine, users are shown unavoidable accident scenarios with two possible outcomes, depending on whether the autonomous vehicle swerves or stays on course (Fig. 1b). They then click on the outcome that they find preferable. Accident scenarios are generated by the Moral Machine following an exploration strategy that
1The Media Lab, Massachusetts Institute of Technology, Cambridge, MA, USA. 2Department of Human Evolutionary Biology, Harvard University, Cambridge, MA, USA. 3Department of Psychology, University of British Columbia, Vancouver, British Columbia, Canada. 4Toulouse School of Economics (TSM-R), CNRS, Université Toulouse Capitole, Toulouse, France. 5Institute for Data, Systems & Society, Massachusetts Institute of Technology, Cambridge, MA, USA. *e-mail: shariff@psych.ubc.ca; jean-francois.bonnefon@tse-fr.eu; irahwan@mit.edu
... 6 While some animal rights activists endorse similar moral positions (e.g., Francione 1995), this view is again unlikely to capture the commonsense view, given that most people hold that it is permissible to harm animals to benefit humans, e.g., via medical testing. In fact, a recent large-scale study employing sacrificial dilemmas involving autonomous vehicles confirmed, unsurprisingly, that willingness to sacrifice animals to save human lives is overwhelmingly endorsed by most people around the world (Awad et al., 2018). There are, however, at least three ways to capture this intuitive moral difference between humans and animals while still ascribing some moral significance to animals. One is the Nozickian View we already discussed above, on which deontology applies only to humans, while consequentialism applies to animals. ...
... In the studies we discussed so far, that priority was expressed in more stringent moral constraints against harming humans. Other psychological research has also demonstrated that people are willing to harm animals to benefit humans (Awad et al., 2018; Petrinovich et al., 1993; Topolski et al., 2013). But the priority is also manifested in the context of help. ...
Article
Full-text available
Robert Nozick famously raised the possibility that there is a sense in which both deontology and utilitarianism are true: deontology applies to humans while utilitarianism applies to animals. In recent years, there has been increasing interest in such hybrid views of ethics. Discussions of this Nozickian Hybrid View, and similar approaches to animal ethics, often assume that such an approach reflects the commonsense view, and best captures common moral intuitions. However, recent psychological work challenges this empirical assumption. We review evidence suggesting that the folk is deontological all the way down; it is just that the moral side constraints that protect animals from harm are much weaker than those that protect humans. In fact, it appears that people even attribute some deontological protections, albeit extremely weak ones, to inanimate objects. We call this view Multi-level Weighted Deontology. While such empirical findings cannot show that the Nozickian Hybrid View is false, or that it is unjustified, they do remove its core intuitive support. That support belongs to Multi-level Weighted Deontology, a view that is also in line with the view that Nozick himself seemed to favour. To complicate things, however, we also review evidence that our intuitions about the moral status of humans are, at least in significant part, shaped by factors relating to mere species membership that seem morally irrelevant. We end by considering the potential debunking upshot of such findings about the sources of common moral intuitions about the moral status of animals.
... These 4 countries were chosen because their citizens showed substantial differences in their responses to moral dilemmas in a previous survey. 19 Accordingly, they offered good prospects to capture cultural differences in triage preferences, if any. ...
... Second, we posted the same survey on the Moral Machine website (moralmachine.net). 19,20 The Moral Machine is a highly popular citizen science website that was designed in 2016 to collect public preferences related to the moral dilemmas of self-driving cars. It receives a constant flow of visitors interested in contributing responses to moral dilemmas and thus offers a convenient way to collect data from participants worldwide. ...
Article
Full-text available
Objective. When medical resources are scarce, clinicians must make difficult triage decisions. When these decisions affect public trust and morale, as was the case during the COVID-19 pandemic, experts will benefit from knowing which triage metrics have citizen support. Design. We conducted an online survey in 20 countries, comparing support for 5 common metrics (prognosis, age, quality of life, past and future contribution as a health care worker) to a benchmark consisting of support for 2 no-triage mechanisms (first-come-first-served and random allocation). Results. We surveyed nationally representative samples of 1000 citizens in each of Brazil, France, Japan, and the United States and also self-selected samples from 20 countries (total N = 7599) obtained through a citizen science website (the Moral Machine). We computed the support for each metric by comparing its usability to the usability of the 2 no-triage mechanisms. We further analyzed the polarizing nature of each metric by considering its usability among participants who had a preference for no triage. In all countries, preferences were polarized, with the 2 largest groups preferring either no triage or extensive triage using all metrics. Prognosis was the least controversial metric. There was little support for giving priority to healthcare workers. Conclusions. It will be difficult to define triage guidelines that elicit public trust and approval. Given the importance of prognosis in triage protocols, it is reassuring that it is the least controversial metric. Experts will need to prepare strong arguments for other metrics if they wish to preserve public trust and morale during health crises. Highlights. We collected citizen preferences regarding triage decisions about scarce medical resources from 20 countries. We find that citizen preferences are universally polarized. Citizens either prefer no triage (random allocation or first-come-first-served) or extensive triage using all common triage metrics, with “prognosis” being the least controversial. Experts will need to prepare strong arguments to preserve or elicit public trust in triage decisions.
Article
Widespread failures of replication and generalization are, ironically, a scientific triumph, in that they confirm the fundamental metascientific theory that underlies our field. Generalizable and replicable findings require testing large numbers of subjects from a wide range of demographics with a large, randomly-sampled stimulus set, and using a variety of experimental parameters. Because few studies accomplish any of this, meta-scientists predict that findings will frequently fail to replicate or generalize. We argue that to be more robust and replicable, developmental psychology needs to find a mechanism for collecting data at a greater scale and from more diverse populations. Luckily, this mechanism already exists: citizen science, in which large numbers of uncompensated volunteers provide data. While best-known for its contributions to astronomy and ecology, citizen science has also produced major findings in neuroscience and psychology, and increasingly in developmental psychology. We provide examples, address practical challenges, discuss limitations, and compare to other methods of obtaining large datasets. Ultimately, we argue that the range of studies where it makes sense *not* to use citizen science is steadily dwindling.
Article
At present, unmanned driving technology has made great progress, while research on its related ethical issues, laws, and traffic regulations lags behind. In particular, it remains an open problem how unmanned vehicles should decide when they encounter ethical dilemmas in which a traffic collision is inevitable, and this uncertainty hinders the application and development of unmanned driving technology. First, 1,048,575 survey responses collected by the Moral Machine online experiment platform are analyzed to calculate the prior probability that the straight-ahead party is protected or sacrificed in single-feature ethical dilemmas. Then, 116 multi-feature ethical dilemmas are designed and surveyed. The collected survey data are analyzed to determine decisions for these ethical dilemmas by adopting the majority principle and to calculate correlation coefficients between attributes, and an improved Naive Bayes algorithm based on attribute correlation (ACNB) is established to solve the problem of unmanned driving decisions in multi-feature ethical dilemmas. Furthermore, these ethical dilemmas are used to test and verify traditional NB, ADOE, WADOE, CFWNB, and ACNB, respectively. Classification decisions in these dilemmas are made according to the posterior probability that the straight-ahead party is protected or sacrificed, and the decisions produced by these algorithms are compared with human decisions to judge whether they are right. The test results show that ACNB and CFWNB are more consistent with human decisions than the other algorithms, and that ACNB improves the decision robustness of unmanned vehicles more than NB does. Applying ACNB to unmanned vehicles is therefore beneficial: it provides a new research direction for ethical decision-making in unmanned driving and a reference for traffic regulation authorities formulating and updating traffic laws and regulations related to unmanned driving technology.
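As a rough illustration of the kind of classifier described above, the following sketch implements a weighted Naive Bayes decision over binary dilemma attributes. The attribute names, class labels, and per-attribute weights are hypothetical; the weights merely stand in for ACNB's correlation-based weighting, whose exact formulation is defined in the cited paper and not reproduced here.

```python
# Hedged sketch: a weighted Naive Bayes decision over binary dilemma attributes.
# Attribute names, class labels and training data are hypothetical; the
# per-attribute weights only stand in for ACNB's correlation-based weighting.
import math
from collections import defaultdict

def train(records, labels, classes=("protect_straight", "sacrifice_straight")):
    """records: list of dicts of binary attributes; labels: chosen class per record."""
    prior = {c: (sum(l == c for l in labels) + 1) / (len(labels) + len(classes))
             for c in classes}                          # Laplace-smoothed class priors
    cond = defaultdict(lambda: defaultdict(lambda: 1))  # add-one counts per (class, attr)
    count = defaultdict(lambda: 2)                      # denominators for binary attributes
    for rec, lab in zip(records, labels):
        for attr, val in rec.items():
            cond[(lab, attr)][val] += 1
            count[(lab, attr)] += 1
    return prior, cond, count

def decide(scenario, prior, cond, count, weights=None):
    """Return the class with the highest weighted log-posterior for one scenario."""
    weights = weights or {}
    scores = {}
    for c in prior:
        score = math.log(prior[c])
        for attr, val in scenario.items():
            p = cond[(c, attr)][val] / count[(c, attr)]
            score += weights.get(attr, 1.0) * math.log(p)  # weight ~ attribute importance
        scores[c] = score
    return max(scores, key=scores.get)
```

Trained on records such as {"child_on_straight_side": 1, "crossing_is_legal": 0}, decide() returns whichever class has the higher weighted log-posterior; in the cited work the weights would instead be derived from the measured correlations between attributes.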
Article
The exploratory sandbox for blockchain services, Lithopy, provided an experimental alternative to the aspirational frameworks and guidelines regulating algorithmic services ex post or ex ante. To understand the possibilities and limits of this experimental approach, we compared the regulatory expectations in the sandbox with the real-life decisions about an “actual” intrusive service: contact tracing application. We gathered feedback on hypothetical and real intrusive services from a group of 59 participants before and during the first and second waves of the COVID-19 pandemic in the Czech Republic (January, June 2020, and April 2021). Participants expressed support for interventions based on an independent rather than government oversight that increases participation and representation. Instead of reducing the regulations to code or insisting on strong regulations over the code, participants demanded hybrid combinations of code and regulations. We discuss this as a demand for “no algorithmization without representation.” The intrusive services act as new algorithmic “territories,” where the “data” settlers must redefine their sovereignty and agency on new grounds. They refuse to rely upon the existing institutions and promises of governance by design and seek tools that enable engagement in the full cycle of the design, implementation, and evaluation of the services. The sandboxes provide an environment that bridges the democratic deficit in the design of algorithmic services and their regulations.
Article
Speciesism, like other forms of prejudice, is thought to be underpinned by biased patterns of language use. Thus far, however, psychological science has primarily focused on how speciesism is reflected in individuals' thoughts as opposed to wider collective systems of meaning such as language. We present a large‐scale quantitative test of speciesism by applying machine‐learning methods (word embeddings) to billions of English words derived from conversation, film, books, and the Internet. We found evidence of anthropocentric speciesism: words denoting concern (vs. indifference) and value (vs. valueless) were more closely associated with words denoting humans compared to many other animals. We also found evidence of companion animal speciesism: the same words were more closely associated with words denoting companion animals compared to most other animals. The work describes speciesism as a pervasive collective phenomenon that is evident in a naturally occurring expression of human psychology – everyday language.
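To make the method concrete, the sketch below runs an embedding-association comparison in the spirit of the analysis described above: it checks whether words denoting concern sit closer, in embedding space, to words denoting humans than to words denoting other animals. The word lists and the pretrained GloVe model loaded through gensim's downloader are illustrative assumptions, not the authors' corpora or stimuli.

```python
# Hedged sketch of an embedding-association comparison. Word lists and the
# pretrained model are illustrative assumptions; the published analysis uses
# its own corpora and stimulus sets.
import numpy as np
import gensim.downloader as api

model = api.load("glove-wiki-gigaword-100")   # any pretrained word embedding works

concern_words = ["care", "protect", "cherish", "value"]       # hypothetical list
human_words   = ["human", "person", "man", "woman", "child"]  # hypothetical list
animal_words  = ["pig", "cow", "chicken", "fish", "rat"]      # hypothetical list

def mean_similarity(targets, attributes):
    """Average cosine similarity over every target/attribute word pair."""
    sims = [model.similarity(t, a) for t in targets for a in attributes]
    return float(np.mean(sims))

human_assoc  = mean_similarity(human_words, concern_words)
animal_assoc = mean_similarity(animal_words, concern_words)
print(f"concern-human: {human_assoc:.3f}  concern-animal: {animal_assoc:.3f}")
# A larger human association would be consistent with the anthropocentric
# pattern reported in the abstract.
```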
Article
Full-text available
The ethics of autonomous cars and automated driving have been a subject of discussion in research for a number of years (cf. Lin 2015; Goodall in Transportation Research Record: Journal of the Transportation Research Board 2424:58–65, 2014; Goodall in IEEE Spectrum 53(6):28–58, 2016). As levels of automation progress, with partially automated driving already becoming standard in new cars from a number of manufacturers, the question of ethical and legal standards becomes virulent. For example, while automated and autonomous cars, being equipped with appropriate detection sensors, processors, and intelligent mapping material, have a chance of being much safer than human-driven cars in many regards, situations will arise in which accidents cannot be completely avoided. Such situations will have to be dealt with when programming the software of these vehicles. In several instances, internationally, regulations have been passed, based on legal considerations of road safety, mostly. However, to date, there have been few, if any, cases of a broader ethics code for autonomous or automated driving preceding actual regulation and being based on a broadly composed ethics committee of independent experts. In July 2016, the German Federal Minister of Transport and Digital Infrastructure, Alexander Dobrindt, appointed a national ethics committee for automated and connected driving, which began its work in September 2016. In June 2017, this committee presented a code of ethics which was published in German (with annotations, BMVI 2017a) and in English (cf. BMVI 2017b). It consists of 20 ethical guidelines. Having been a member of this committee, I will present the main ethical topics of these guidelines and the discussions that lay behind them.
Article
Full-text available
Self-driving cars offer a bright future, but only if the public can overcome the psychological challenges that stand in the way of widespread adoption. We discuss three (ethical dilemmas, overreactions to accidents, and the opacity of the cars’ decision-making algorithms) and propose steps towards addressing them.
Article
Full-text available
As intelligent systems are increasingly making decisions that directly affect society, perhaps the most important upcoming research direction in AI is to rethink the ethical implications of their actions. Means are needed to integrate moral, societal and legal values with technological developments in AI, both during the design process as well as part of the deliberation algorithms employed by these systems. In this paper, we describe leading ethics theories and propose alternative ways to ensure ethical behavior by artificial systems. Given that ethics are dependent on the socio-cultural context and are often only implicit in deliberation processes, methodologies are needed to elicit the values held by designers and stakeholders, and to make these explicit leading to better understanding and trust on artificial autonomous systems.
Article
Full-text available
AI is here now, available to anyone with access to digital technology and the Internet. But its consequences for our social order aren't well understood. How can we guide the way technology impacts society?
Article
Full-text available
Two separate bodies of work have examined whether culture affects cooperation in economic games and whether cooperative or non-cooperative decisions occur more quickly. Here, we connect this work by exploring the relationship between decision time and cooperation in American versus Indian subjects. We use a series of dynamic social network experiments in which subjects play a repeated public goods game: 80 sessions for a total of 1,462 subjects (1,059 from the United States, 337 from India, and 66 from other countries) making 13,560 decisions. In the first round, where subjects do not know if connecting neighbors are cooperative, American subjects are highly cooperative and decide faster when cooperating than when defecting, whereas a majority of Indian subjects defect and Indians decide faster when defecting than when cooperating. Almost the same is true in later rounds where neighbors were previously cooperative (a cooperative environment) except decision time among Indian subjects. However, when connecting neighbors were previously not cooperative (a non-cooperative environment), a large majority of both American and Indian subjects defect, and defection is faster than cooperation among both sets of subjects. Our results imply the cultural background of subjects in their real life affects the speed of cooperation decision-making differentially in online social environments.
Article
Full-text available
Autonomous vehicles (AVs) should reduce traffic accidents, but they will sometimes have to choose between two evils, such as running over pedestrians or sacrificing themselves and their passenger to save the pedestrians. Defining the algorithms that will help AVs make these moral decisions is a formidable challenge. We found that participants in six Amazon Mechanical Turk studies approved of utilitarian AVs (that is, AVs that sacrifice their passengers for the greater good) and would like others to buy them, but they would themselves prefer to ride in AVs that protect their passengers at all costs. The study participants disapprove of enforcing utilitarian regulations for AVs and would be less willing to buy such an AV. Accordingly, regulating for utilitarian algorithms may paradoxically increase casualties by postponing the adoption of a safer technology.
Conference Paper
As intelligent systems are increasingly making decisions that directly affect society, perhaps the most important upcoming research direction in AI is to rethink the ethical implications of their actions. Means are needed to integrate moral, societal and legal values with technological developments in AI, both during the design process as well as part of the deliberation algorithms employed by these systems. In this paper, we describe leading ethics theories and propose alternative ways to ensure ethical behavior by artificial systems. Given that ethics are dependent on the socio-cultural context and are often only implicit in deliberation processes, methodologies are needed to elicit the values held by designers and stakeholders, and to make these explicit leading to better understanding and trust on artificial autonomous systems.
Article
Deception is common in nature and humans are no exception. Modern societies have created institutions to control cheating, but many situations remain where only intrinsic honesty keeps people from cheating and violating rules. Psychological, sociological and economic theories suggest causal pathways to explain how the prevalence of rule violations in people's social environment, such as corruption, tax evasion or political fraud, can compromise individual intrinsic honesty. Here we present cross-societal experiments from 23 countries around the world that demonstrate a robust link between the prevalence of rule violations and intrinsic honesty. We developed an index of the 'prevalence of rule violations' (PRV) based on country-level data from the year 2003 of corruption, tax evasion and fraudulent politics. We measured intrinsic honesty in an anonymous die-rolling experiment(5). We conducted the experiments with 2,568 young participants (students) who, due to their young age in 2003, could not have influenced PRV in 2003. We find individual intrinsic honesty is stronger in the subject pools of low PRV countries than those of high PRV countries. The details of lying patterns support psychological theories of honesty. The results are consistent with theories of the cultural co-evolution of institutions and values, and show that weak institutions and cultural legacies that generate rule violations not only have direct adverse economic consequences, but might also impair individual intrinsic honesty that is crucial for the smooth functioning of society.
Article
We review contemporary work on cultural factors affecting moral judgments and values, and those affecting moral behaviors. In both cases, we highlight examples of within-societal cultural differences in morality, to show that these can be as substantial and important as cross-societal differences. Whether between or within nations and societies, cultures vary substantially in their promotion and transmission of a multitude of moral judgments and behaviors. Cultural factors contributing to this variation include religion, social ecology (weather, crop conditions, population density, pathogen prevalence, residential mobility), and regulatory social institutions such as kinship structures and economic markets. This variability raises questions for normative theories of morality, but also holds promise for future descriptive work on moral thought and behavior.