Minds and Machines

Published by Springer Nature
Online ISSN: 1572-8641
Print ISSN: 0924-6495
Recent publications
Predictive processing (PP) and embodied cognition (EC) have emerged as two influential approaches within cognitive science in recent years. Not only have PP and EC been heralded as “revolutions” and “paradigm shifts” but they have motivated a number of new and interesting areas of research. This has prompted some to wonder how compatible the two views might be. This paper looks to weigh in on the issue of PP-EC compatibility. After outlining two recent proposals, I argue that further clarity can be achieved on the issue by considering a model of scientific progress. Specifically, I suggest that Larry Laudan’s “problem solving model” can provide important insights into a number of outstanding challenges that face existing accounts of PP-EC compatibility. I conclude by outlining additional implications of the problem solving model for PP and EC more generally.
The Turing test has been studied and run as a controlled experiment and found to be underspecified and poorly designed. On the other hand, it has been defended and still attracts interest as a test for true artificial intelligence (AI). Scientists and philosophers regret the test’s current status, acknowledging that the situation is at odds with the intellectual standards of Turing’s works. This article refers to this as the Turing Test Dilemma, following the observation that the test has been under discussion for over seventy years and still is widely seen as either too bad or too good to be a valuable experiment for AI. An argument that solves the dilemma is presented, which relies on reconstructing the Turing test as a thought experiment in the modern scientific tradition. It is argued that Turing’s exposition of the imitation game satisfies Mach’s characterization of the basic method of thought experiments and that Turing’s uses of his test satisfy Popper’s conception of the critical and heuristic uses of thought experiments and Kuhn’s association of thought experiments to conceptual change. It is emphasized how Turing methodically varied the imitation game design to address specific challenges posed to him by other thinkers and how his test illustrates a property of the phenomenon of intelligence and suggests a hypothesis on machine learning. This reconstruction of the Turing test provides a rapprochement to the conflicting views on its value in the literature.
Schematic model of an algorithmic decision system
Average probabilistic predictions of (a) criminal behavior, (b) absence of criminal behavior among individuals who actually go on to commit a crime, and (c) criminal behavior among individuals who actually do not go on to commit a crime
Average probabilistic predictions of (a) university success, (b) university failure among individuals who would actually succeed, and (c) university success among individuals who would actually not succeed at university
The problem of algorithmic fairness is typically framed as the problem of finding a unique formal criterion that guarantees that a given algorithmic decision-making procedure is morally permissible. In this paper, I argue that this is conceptually misguided and that we should replace the problem with two sub-problems. If we examine how most state-of-the-art machine learning systems work, we notice that there are two distinct stages in the decision-making process. First, a prediction of a relevant property is made. Secondly, a decision is taken based (at least partly) on this prediction. These two stages have different aims: the prediction is aimed at accuracy, while the decision is aimed at allocating a given good in a way that maximizes some context-relative utility measure. Correspondingly, two different fairness issues can arise. First, predictions could be biased in discriminatory ways. This means that the predictions contain systematic errors for a specific group of individuals. Secondly, the system’s decisions could result in an allocation of goods that is in tension with the principles of distributive justice. These two fairness issues are distinct problems that require different types of solutions. I here provide a formal framework to address both issues and argue that this way of conceptualizing them resolves some of the paradoxes present in the discussion of algorithmic fairness.
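The two-stage structure described in this abstract can be sketched in code. The following is an illustrative toy of our own construction (function names, weights, and thresholds are hypothetical, not the paper's formal framework), separating the prediction stage, aimed at accuracy, from the decision stage, aimed at allocation, with one fairness check per stage:

```python
# Stage 1: prediction, aimed at accuracy (a simple linear score as stand-in).
def predict_risk(features, weights):
    score = sum(w * x for w, x in zip(weights, features))
    return max(0.0, min(1.0, score))  # clamp to a probability-like value

# Stage 2: decision, aimed at allocating a good given some utility measure.
def decide(prediction, threshold):
    return "deny" if prediction >= threshold else "grant"

# Stage-1 fairness issue: systematic prediction error for a specific group.
def group_calibration_error(group_predictions, group_outcomes):
    mean_pred = sum(group_predictions) / len(group_predictions)
    mean_outcome = sum(group_outcomes) / len(group_outcomes)
    return mean_pred - mean_outcome  # nonzero means biased predictions

# Stage-2 fairness issue: how the resulting allocation is distributed.
def allocation_rate(group_decisions):
    return group_decisions.count("grant") / len(group_decisions)
```

The point of the split is that each check targets a different normative question: calibration error concerns discriminatory bias in predictions, while allocation rates concern distributive justice in decisions.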
This paper deals with the control development of a wind energy conversion system (WECS) interfaced to a utility grid by using a doubly fed induction generator (DFIG), a back-to-back (B2B) converter and an RL filter for optimal power extraction. The aim was to design a sensorless controller to improve the system reliability and to simultaneously achieve the regulation of the generator speed, reactive power and DC-link voltage. The proposed global control scheme combines: (i) a high-gain observer employed to estimate the generator speed and the mechanical torque, usually regarded as inaccessible, (ii) a sensorless MPPT block developed to provide the optimal generator speed reference, which is designed on the basis of the mechanical observer and a polynomial wind-speed estimator and (iii) a finite-time controller (FTC) applied to the B2B converter to meet the output reference’s tracking objectives in a short predefined finite time by using the backstepping and Lyapunov approaches. The proposed controller performance is formally analysed, and its capabilities are verified by numerical simulations using a 2 MW DFIG wind turbine (WT) under different operating conditions.
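The core of the MPPT block mentioned in this abstract can be illustrated with the standard tip-speed-ratio relation: maximum power extraction is obtained by keeping the tip-speed ratio at its optimum, which fixes the generator speed reference as a function of the (estimated) wind speed. The following is a generic sketch under assumed parameter values, not the paper's 2 MW turbine data or its actual estimator:

```python
# Assumed illustrative parameters (not from the paper):
LAMBDA_OPT = 7.0      # optimal tip-speed ratio (dimensionless)
ROTOR_RADIUS = 40.0   # rotor radius in metres

def optimal_speed_reference(wind_speed_estimate):
    """MPPT speed reference: omega_ref = lambda_opt * v / R (rad/s,
    low-speed shaft), computed from the estimated wind speed."""
    return LAMBDA_OPT * wind_speed_estimate / ROTOR_RADIUS
```

In a sensorless scheme like the one described, the wind speed fed to such a function comes from an estimator rather than an anemometer, which is why the observer design matters.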
Robotic assisted dressing application. An autonomous robotic system is used to dress its end user (1), while monitoring their well-being (2). The system may communicate with the user (3) receiving instructions for action and providing information and prompts as appropriate. The autonomous robotic system is additionally able to monitor the status of the environment (4) and control the home automation system (5). An assistive-care support team may be contacted where external human input is necessary (6) and the team may periodically monitor the status of the mechanical system (7)
Rule Elicitation Process
A subset of the mappings between SLEEC principles and agent capabilities for our use case
With recent advancements in systems engineering and artificial intelligence, autonomous agents are increasingly being called upon to execute tasks that have normative relevance. These are tasks that directly, and potentially adversely, affect human well-being and demand of the agent a degree of normative sensitivity and compliance. Such norms and normative principles are typically of a social, legal, ethical, empathetic, or cultural ('SLEEC') nature. Whereas norms of this type are often framed in the abstract, or as high-level principles, addressing normative concerns in concrete applications of autonomous agents requires the refinement of normative principles into explicitly formulated practical rules. This paper develops a process for deriving specification rules from a set of high-level norms, thereby bridging the gap between normative principles and operational practice. This enables autonomous agents to select and execute the most normatively favourable action in the intended context, premised on a range of underlying relevant normative principles. In the translation and reduction of normative principles to SLEEC rules, we present an iterative process that uncovers normative principles, addresses SLEEC concerns, identifies and resolves SLEEC conflicts, and generates both preliminary and complex normatively relevant rules, thereby guiding the development of autonomous agents and better positioning them as normatively SLEEC-sensitive or SLEEC-compliant.
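To make the notion of an "explicitly formulated practical rule" concrete, here is a hypothetical sketch of how such a rule might be represented in an agent. The field names and the example rule are ours, chosen for the assisted-dressing use case, and are not the paper's notation:

```python
from dataclasses import dataclass, field

@dataclass
class SleecRule:
    """A high-level norm refined into an operational condition/action rule,
    with defeating conditions that can override it in context."""
    norm: str                                   # underlying high-level principle
    condition: str                              # when the rule applies
    action: str                                 # what the agent should do
    unless: list = field(default_factory=list)  # defeaters (conflict resolution)

rule = SleecRule(
    norm="respect user autonomy (ethical)",
    condition="user asks the robot to stop dressing them",
    action="pause the dressing task and ask for confirmation",
    unless=["pausing would leave the user at immediate physical risk"],
)
```

The `unless` list is one simple way to record the outcome of the conflict-identification and conflict-resolution steps the abstract describes: a rule derived from one principle can be explicitly subordinated to another.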
Since the announcement and establishment of the Oversight Board (OB) by the technology company Meta as an independent institution reviewing Facebook and Instagram’s content moderation decisions, the OB has been subjected to scholarly scrutiny ranging from praise to criticism. However, there is currently no overarching framework for understanding the OB’s various strengths and weaknesses. Consequently, this article analyses, organises, and supplements academic literature, news articles, and Meta and OB documents to understand the OB’s strengths and weaknesses and how it can be improved. Significant strengths include its ability to enhance the transparency of content moderation decisions and processes, to effect reform indirectly through policy recommendations, and its assertiveness in interpreting its jurisdiction and overruling Meta. Significant weaknesses include its limited jurisdiction, limited impact, Meta’s control over the OB’s precedent, and its lack of diversity. The analysis of a recent OB case in Ethiopia shows these strengths and weaknesses in practice. The OB’s relationship with Meta and governments will lead to challenges and opportunities shaping its future development. Reforms to the OB should improve the OB’s control over its precedent, apply OB precedent to currently disputed cases, and clarify the standards for invoking OB precedent. Finally, these reforms provide the foundation for an additional improvement to address the OB’s institutional weaknesses, by involving users in determining whether the OB’s precedent should be applied to decide current content moderation disputes.
Artificial Intelligence (AI) pervades humanity in 2022, and it is notoriously difficult to understand how certain aspects of it work. There is a movement—Explainable Artificial Intelligence (XAI)—to develop new methods for explaining the behaviours of AI systems. We aim to highlight one important philosophical significance of XAI: it has a role to play in the elimination of vagueness. To show this, consider that the use of AI in what has been labeled surveillance capitalism has resulted in humans quickly gaining the capability to identify and classify most of the occasions in which languages are used. We show that the knowability of this information is incompatible with what a certain theory of vagueness—epistemicism—says about vagueness. We argue that one way the epistemicist could respond to this threat is to claim that this process brought about the end of vagueness. However, we suggest an alternative interpretation, namely that epistemicism is false, but there is a weaker doctrine we dub technological epistemicism, which is the view that vagueness is due to ignorance of linguistic usage, but the ignorance can be overcome. The idea is that knowing more of the relevant data and how to process it enables us to know the semantic values of our words and sentences with higher confidence and precision. Finally, we argue that humans are probably not going to believe what future AI algorithms tell us about the sharp boundaries of our vague words unless the AI involved can be explained in terms understandable by humans. That is, if people are going to accept that AI can tell them about the sharp boundaries of the meanings of their words, then it is going to have to be XAI.
The COVID-19 pandemic and its related policies (e.g., stay at home and social distancing orders) have increased people’s use of digital technology, such as social media. Researchers have, in turn, utilized artificial intelligence to analyze social media data for public health surveillance. For example, through machine learning and natural language processing, they have monitored social media data to examine public knowledge and behavior. This paper explores the ethical considerations of using artificial intelligence to monitor social media to understand the public’s perspectives and behaviors surrounding COVID-19, including potential risks and benefits of an AI-driven approach. Importantly, investigators and ethics committees have a role in ensuring that researchers adhere to ethical principles of respect for persons, beneficence, and justice in a way that moves science forward while ensuring public safety and confidence in the process.
On the whole, the US Algorithmic Accountability Act of 2022 (US AAA) is a pragmatic approach to balancing the benefits and risks of automated decision systems. Yet there is still room for improvement. This commentary highlights how the US AAA can both inform and learn from the European Artificial Intelligence Act (EU AIA).
Flow of information through the different phases of the systematic review
Features contributing to contestable AI
Practices contributing to contestable AI
As the use of AI systems continues to increase, so do concerns over their lack of fairness, legitimacy and accountability. Such harmful automated decision-making can be guarded against by ensuring AI systems are contestable by design: responsive to human intervention throughout the system lifecycle. Contestable AI by design is a small but growing field of research. However, most available knowledge requires a significant amount of translation to be applicable in practice. A proven way of conveying intermediate-level, generative design knowledge is in the form of frameworks. In this article we use qualitative-interpretative methods and visual mapping techniques to extract from the literature sociotechnical features and practices that contribute to contestable AI, and synthesize these into a design framework.
SAE J3016 levels of driving automation
Proximity scale of reasons (from Mecacci & Santoni de Sio, 2020)
Vehicle core components for automated driving (Calvert et al., 2020)
Relational framework for the operationalisation of Tracking (Calvert & Mecacci, 2020)
Evaluation cascade table for the tracing criterion (Calvert et al., 2019)
The paper presents a framework to realise “meaningful human control” over Automated Driving Systems. The framework is based on an original synthesis of the results of the multidisciplinary research project “Meaningful Human Control over Automated Driving Systems”, led by a team of engineers, philosophers, and psychologists at Delft University of Technology from 2017 to 2021. Meaningful human control aims at protecting safety and reducing responsibility gaps. The framework is based on the core assumption that human persons and institutions, not hardware and software and their algorithms, should remain ultimately—though not necessarily directly—in control of, and thus morally responsible for, the potentially dangerous operation of driving in mixed traffic. We propose an Automated Driving System to be under meaningful human control if it behaves according to the relevant reasons of the relevant human actors (tracking), and if any potentially dangerous event can be related to a human actor (tracing). We operationalise the requirements for meaningful human control through multidisciplinary work in philosophy, behavioural psychology and traffic engineering. The tracking condition is operationalised via a proximal scale of reasons and the tracing condition via an evaluation cascade table. We review the implications and requirements for the behaviour and skills of human actors, in particular related to supervisory control and driver education. We show how the evaluation cascade table can be applied in concrete engineering use cases in combination with the definition of core components to expose deficiencies in traceability, thereby avoiding so-called responsibility gaps. Future research directions are proposed to expand the philosophical framework and use cases, supervisory control and driver education, real-world pilots and institutional embedding.
We put forth an account of when to believe causal and evidential conditionals. The basic idea is to embed a causal model in an agent’s belief state, for the evaluation of conditionals seems to be relative to beliefs about both particular facts and causal relations. Unlike other attempts using causal models, we show that ours accounts rather well not only for various causal conditionals but also for evidential ones.
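The basic idea of embedding a causal model in a belief state can be illustrated with a toy structural model. This is our own illustration, not the authors' formalism: beliefs about particular facts are a variable assignment, beliefs about causal relations are structural equations, and a causal conditional is evaluated by intervening on the antecedent and propagating:

```python
def evaluate_conditional(beliefs, equations, antecedent, consequent):
    """Intervene: fix the antecedent variables, propagate the structural
    equations in causal order, and read off the consequent."""
    state = dict(beliefs)
    state.update(antecedent)           # the hypothetical supposition
    for var, fn in equations:
        if var not in antecedent:      # interventions cut incoming arrows
            state[var] = fn(state)
    return state[consequent]

# Beliefs about facts: the match is dry, unstruck, unlit.
beliefs = {"dry": True, "struck": False, "lit": False}
# Belief about a causal relation: striking a dry match lights it.
equations = [("lit", lambda s: s["dry"] and s["struck"])]
# "If the match were struck, it would light" comes out true here.
result = evaluate_conditional(beliefs, equations, {"struck": True}, "lit")
```

Evidential conditionals would require a different update on the same belief state (conditioning rather than intervening), which is where the two kinds of conditional come apart.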
Explainable artificial intelligence (XAI) aims to help people understand black box algorithms, particularly their outputs. But what are these explanations, and when is one explanation better than another? The manipulationist definition of explanation from the philosophy of science offers good answers to these questions, holding that an explanation consists of a generalization that shows what happens in counterfactual cases. Furthermore, when it comes to explanatory depth, this account holds that a generalization that has more abstract variables, is broader in scope and/or more accurate is better. By applying these definitions and contrasting them with alternative definitions in the XAI literature, I hope to help clarify what a good explanation is for AI.
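The manipulationist idea that an explanation shows what happens in counterfactual cases can be given a minimal computational reading. The following toy is ours, not the paper's: a black-box classifier is probed with counterfactual inputs, and the resulting pattern is the generalization that explains its output:

```python
def counterfactual_probe(model, instance, feature, alternatives):
    """Ask: what would the model output if `feature` had taken other values?"""
    results = {}
    for value in alternatives:
        variant = dict(instance)
        variant[feature] = value       # manipulate one variable
        results[value] = model(variant)
    return results

# A stand-in "black box": loan approval above an (assumed) income threshold.
model = lambda x: "approve" if x["income"] >= 50 else "reject"
probe = counterfactual_probe(model, {"income": 40}, "income", [40, 50, 60])
# The probe exhibits the generalization: the decision flips as income crosses 50.
```

On the manipulationist account, a deeper explanation would replace the raw probe with a more abstract, broader, or more accurate generalization over such counterfactual patterns.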
In this paper we make the case for the emergence of a novel kind of bias arising from the use of algorithmic decision-making systems. We argue that the distinctive generative process of feature creation, characteristic of machine learning (ML), contorts feature parameters in ways that can lead to emerging feature spaces that encode novel algorithmic bias involving already marginalized groups. We term this bias assembled bias. Moreover, assembled biases are distinct from the much-discussed algorithmic bias, both in source (training data versus feature creation) and in content (mimics of extant societal bias versus reconfigured categories). As such, this problem is distinct from issues arising from bias-encoding training feature sets or proxy features. Assembled bias is not epistemically transparent in source or content. Hence, when these ML models are used as a basis for decision-making in social contexts, algorithmic fairness concerns are compounded.
The potential development of self-driving cars (also known as autonomous vehicles or AVs, particularly Level 5 AVs) has drawn the attention of various interested parties. Yet there are still only a few relevant international regulations on them, no emergency patterns accepted by communities and Original Equipment Manufacturers (OEMs), and no publicly accepted solutions to some of their pending ethical problems. Thus, this paper aims to provide some possible answers to these moral and practical dilemmas. In particular, we focus on what AVs should do in no-win scenarios and on who should be held responsible for these types of decisions. A naturalistic perspective on ethics informs our proposal, which, we argue, could represent a pragmatic and realistic solution to the regulation of AVs. We discuss the proposals already set out in the current literature regarding both policy-making strategies and theoretical accounts. In fact, we consider and reject descriptive approaches to the problem as well as the option of using either a strict deontological view or a solely utilitarian one to set AVs’ ethical choices. Instead, to provide concrete answers to AVs’ ethical problems, we examine three hierarchical levels of decision-making processes: country-wide regulations, OEM policies, and buyers’ moral attitudes. By appropriately distributing ethical decisions and considering their practical implications, we maintain that our proposal based on ethical naturalism recognizes the importance of all stakeholders and allows the most able of them (the OEMs and buyers) to take actions that reflect the moral leeway and weight of their options.
Tree of possible continuations. Top: the prompt (the text provided by us for continuation) is marked in bold. GPT’s continuation is color-coded, representing the conditional probability of this continuation based on the previous text. Bottom: the probabilities over continuations, from which GPT picks with a weighted random draw. The probabilities in each step are determined by the choice in the previous step and the entire text before it. Note that if at any step the choice had been different, the probabilities in every subsequent step would also be completely different
The effect that better specifying the prompt has on the probability of continuations. The question is the one used by Floridi and Chiriatti to show GPT’s lack of semantic abilities. Left: probabilities of the continuation without specifying the task, and a sample completion in which GPT spirals into irrelevant text. Right: probabilities when the prompt is written so as to encourage question answering; “one” has become the most probable continuation
GPT-3 prompted to truthfully continue ‘John Prescott was born’ outputs ‘in Hull on June 8th 1941.’ The probabilities for other possible continuations show that Hull is by far the most plausible continuation for GPT-3
This article contributes to the debate around the abilities of large language models such as GPT-3, dealing with three issues: firstly, evaluating how well GPT does in the Turing Test; secondly, the limits of such models, especially their tendency to generate falsehoods; and thirdly, the social consequences of the problems these models have with truth-telling. We start by formalising the recently proposed notion of reversible questions, which Floridi & Chiriatti (2020) propose allow one to 'identify the nature of the source of their answers', as a probabilistic measure based on Item Response Theory from psychometrics. Following a critical assessment of the methodology which led previous scholars to dismiss GPT's abilities, we argue against claims that GPT-3 completely lacks semantic ability. Using ideas of compression, priming, distributional semantics and semantic webs, we offer our own theory of the limits of large language models like GPT-3 and argue that GPT can competently engage in various semantic tasks. The real reason GPT's answers seem senseless is that truth-telling is not amongst those tasks. We claim that these kinds of models cannot be forced into producing only true continuations; rather, to maximise their objective function they strategize to be plausible instead of truthful. This, we moreover claim, can hijack our intuitive capacity to evaluate the accuracy of their outputs. Finally, we show how this analysis predicts that a widespread adoption of language generators as tools for writing could result in permanent pollution of our informational ecosystem with massive amounts of very plausible but often untrue texts.
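The sampling mechanism described in the figure captions above, a weighted random draw over a distribution of candidate continuations, can be sketched in a few lines. The tokens and probabilities below are illustrative assumptions, not GPT-3's actual vocabulary or outputs:

```python
import random

def sample_continuation(distribution, rng):
    """Weighted random draw over candidate next tokens: each token is picked
    with probability proportional to its weight in the distribution."""
    tokens = list(distribution)
    weights = [distribution[t] for t in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]

rng = random.Random(0)  # seeded for reproducibility
# Assumed step distribution, echoing the "one" example from the caption above:
step = {"one": 0.55, "two": 0.25, "a": 0.20}
token = sample_continuation(step, rng)
```

Because each draw conditions all subsequent distributions, a single different choice early on changes every later step, which is why the model optimises for plausible rather than true continuations.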
Christoph Jäger (Erkenntnis 61:187–201, 2004) has argued that Dretske’s (Knowledge and the flow of Information, MIT Press, Cambridge, 1981) information-based account of knowledge is committed to both knowledge and information closure under known entailment. However, in a reply to Jäger, Dretske (Erkenntnis 64:409–413, 2006) defended his view on the basis of a discrepancy between the relation of information and the relation of logical implication. This paper shares Jäger’s criticism that Dretske’s externalist notion of information implies closure, but provides an analysis based on different grounds. By means of a distinction between two perspectives, the mathematical perspective and the epistemological perspective, I present, in the former, a notion of logical implication that is compatible with the notion of information in the mathematical theory of information, and I show how, in the latter, Dretske’s logical reading of the closure principle is incompatible with his information-theoretic epistemological framework.
This essay addresses the question whether artificial speakers can perform speech acts in the technical sense of that term common in the philosophy of language. We here argue that under certain conditions artificial speakers can perform speech acts so understood. After (§1) explaining some of the issues at stake in these questions, we (§2) elucidate a relatively uncontroversial way in which machines can communicate, namely through what we call verbal signaling. But verbal signaling is not sufficient for the performance of a speech act. To explain the difference, we (§3) elucidate the notion of a speech act developed by Austin (How to Do Things with Words, 1962) in the mid-twentieth century and then discuss Strawson’s ("Intention and Convention in Speech Acts", 1964) influential proposal for how that notion may be related to Grice’s ("Meaning", 1957) conception of speaker meaning. We then refine Strawson’s synthesis in light of Armstrong’s ("Meaning and Communication", 1971) reconceptualization of speaker meaning in terms of objectives rather than intentions. We next (§4) extend this conception of speech acts to the cases of recorded, proxy, and conditional speech acts. On this basis, we propose (§5) that a characteristic role for artificial speakers is as proxies in the performance of speech acts on behalf of their human creators. We (§6) also consider two objections to our position, and compare our approach with others: while other authors appeal to notions such as “quasi-assertion,” we offer a sharp characterization of what artificial speakers can do that does not impute intentions or similarly controversial powers to them. We conclude (§7) by raising doubts that our strategy can be applied to speech acts generally.
Diagram of the two-level model for the construction of DESERT*. The columns show Barsalou’s theory for constructing ad hoc concepts (left), the RT approach (right), and our unified two-level account (middle). The top-down flow of the diagram represents the construction of the ad hoc concept DESERT* from the starting lexical meaning of DESERT through the contextual reconceptualization (A-Level) and recategorization (B-Level) steps, underlining the respective relationships with Barsalou’s framework and RT
Diagram of the two-level model for the construction of CLOTHING*. The left column describes Barsalou’s theory for constructing the ad hoc concept, the right column shows the RT approach, and our unified two-level account is presented in the middle. The top-down flow of the diagram represents the construction of the ad hoc concept CLOTHING* from the starting lexical meaning of CLOTHING through the contextual reconceptualization (A-Level) and recategorization (B-Level) steps
The domain D4 of population density (pop./km²) representing (a) the lexical concept DESERT and Milan average (Mavg), actual (Ma), and minimum (Mm) population, and the domain D4* representing (b) the ad hoc concept DESERT* and Milan population, following the conceptual transformation underlying the utterance of (1)
The domain DI of clothing insulation (in clo units) for (a) the category associated to CLOTHING, given IREQ20 = 0.75 clo, R (i.e., trousers, long-sleeved shirt, and suit jacket) = 0.96 clo, and T (i.e., insulated coveralls, long-sleeved thermal underwear, and long underwear bottoms) = 1.37 clo, and the domain DI* for (b) the ad hoc CLOTHING*, after the conceptual transformation with IREQ10 = 1 clo
Ad hoc concepts (like “clothes to wear in the snow”, for instance) are highly context-dependent representations humans construct to deal with novel or uncommon situations and to interpret linguistic stimuli in communication. In recent decades, such concepts have been investigated both in experimental cognitive psychology and within pragmatics by proponents of so-called relevance theory. These two research lines have, however, proceeded in parallel, proposing two unconnected strategies to account for the construction and use of ad hoc concepts. The present work explores the relations between these two approaches and the possibility of merging them into a unique account of the internal structure of ad hoc representations and of the key processes involved in their construction. To this purpose, we first present an integrated two-level account of the construction of ad hoc representations from lexical concepts; then, we show how our account can be embedded in a conceptual space framework that allows for a natural, geometrical interpretation of the main steps in such a construction process. After discussing in detail two main examples of the construction of ad hoc concepts within conceptual spaces, we conclude with some remarks on possible extensions of our approach.
Can autonomous systems replace humans in the performance of their activities? How does the answer to this question inform the design of autonomous systems? The study of technical systems and their features should be preceded by the study of the activities in which they play roles. Each activity can be described by its overall goals, governing norms and the intermediate steps which are taken to achieve the goals and to follow the norms. This paper uses the activity realist approach to conceptualize autonomous systems in the context of human activities. By doing so, it first argues for epistemic and logical conditions that illustrate the limitations of autonomous systems in tasks which they can and cannot perform, and then, it discusses the ramifications of the limitations of system autonomy on the design of autonomous systems.
The risk-based approach to AI governance proposed in the AIA
Ways to conduct conformity assessments for high-risk AI systems
Roles and responsibilities during conformity assessments with the involvement of third-party auditors
The proposed European Artificial Intelligence Act (AIA) is the first attempt to elaborate a general legal framework for AI carried out by any major global economy. As such, the AIA is likely to become a point of reference in the larger discourse on how AI systems can (and should) be regulated. In this article, we describe and discuss the two primary enforcement mechanisms proposed in the AIA: the conformity assessments that providers of high-risk AI systems are expected to conduct, and the post-market monitoring plans that providers must establish to document the performance of high-risk AI systems throughout their lifetimes. We argue that the AIA can be interpreted as a proposal to establish a Europe-wide ecosystem for conducting AI auditing, albeit in other words. Our analysis offers two main contributions. First, by describing the enforcement mechanisms included in the AIA in terminology borrowed from existing literature on AI auditing, we help providers of AI systems understand how they can prove adherence to the requirements set out in the AIA in practice. Second, by examining the AIA from an auditing perspective, we seek to provide transferable lessons from previous research about how to refine further the regulatory approach outlined in the AIA. We conclude by highlighting seven aspects of the AIA where amendments (or simply clarifications) would be helpful. These include, above all, the need to translate vague concepts into verifiable criteria and to strengthen the institutional safeguards concerning conformity assessments based on internal checks.
Amongst philosophers, there is ongoing debate about what successful event remembering requires. Causal theorists argue that it requires a causal connection to the past event. Simulation theorists argue, in contrast, that successful remembering requires only production by a reliable memory system. Both views must contend with the fact that people can remember past events they have experienced with varying degrees of accuracy. The debate between them thus concerns not only the account of successful remembering, but how each account explains the various forms of memory error as well. Advancing the debate therefore must include exploration of the cognitive architecture implicated by each view and whether that architecture is capable of producing the range of event representations seen in human remembering. Our paper begins by exploring these architectures, framing causal theories as best suited to the storage of event instances and simulation theories as best suited to the storage of schemas. While each approach has its advantages, neither can account for the full range of our event remembering abilities. We then propose a novel hybrid theory that combines both instance and schematic elements in the event memory. In addition, we provide an implementation of our theory in the context of a cognitive architecture. We also discuss an agent we developed using this system and its ability to remember events in the blocks world domain.
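The hybrid idea of combining instance and schematic elements in one event memory can be sketched as a data structure. This is our own illustrative toy (class and method names are hypothetical, not the paper's implementation): exact recall uses a stored instance when one survives, and otherwise falls back to schema-based reconstruction, which naturally yields graded accuracy:

```python
class HybridEventMemory:
    def __init__(self):
        self.instances = {}  # event id -> full event representation
        self.schemas = {}    # event type -> typical (shared) features

    def store(self, event_id, event_type, features):
        self.instances[event_id] = dict(features)
        schema = self.schemas.setdefault(event_type, dict(features))
        # Crude schema update: keep only features shared across instances.
        for key in list(schema):
            if features.get(key) != schema[key]:
                del schema[key]

    def remember(self, event_id, event_type):
        # Instance recall when possible; schematic reconstruction otherwise.
        return self.instances.get(event_id) or self.schemas.get(event_type, {})
```

A blocks-world agent using such a store can answer queries about specific stacking episodes it witnessed, and fall back to "what stacking events are typically like" for episodes it no longer holds as instances.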
It has been argued that neural data (ND) are an especially sensitive kind of personal information that could be used to undermine the control we should have over access to our mental states (i.e. our mental privacy), and therefore need a stronger legal protection than other kinds of personal data. The Morningside Group, a global consortium of interdisciplinary experts advocating for the ethical use of neurotechnology, suggests achieving this by legally treating ND as a body organ (i.e. protecting them through bodily integrity). Although the proposal is currently shaping ND-related policies (most notably, a Neuroprotection Bill of Law being discussed by the Chilean Senate), it is not clear what its conceptual and legal basis is. Legally treating something as something else requires some kind of analogical reasoning, which is not provided by the authors of the proposal. In this paper, I will try to fill this gap by addressing ontological issues related to neurocognitive processes. The substantial differences between ND and body organs or organic tissue cast doubt on the idea that the former should be covered by bodily integrity. Crucially, ND are not constituted by organic material. Nevertheless, I argue that the ND of a subject s are analogous to neurocognitive properties of her brain. I claim that (i) s’ ND are a ‘medium independent’ property that can be characterized as natural semantic personal information about her brain and that (ii) s’ brain not only instantiates this property but also has an exclusive ontological relationship with it: This information constitutes a domain that is unique to her neurocognitive architecture.
Stratified architecture on which ELES appears
Socio-technical system in which the explanation issue is framed. Agents interact to achieve a consensual explanation
A widespread need to explain the behavior and outcomes of AI-based systems has emerged, due to their ubiquitous presence, providing renewed momentum to the relatively new research area of eXplainable AI (XAI). Nowadays, the importance of XAI lies in the fact that the increasing transfer of control to this kind of system for decision making (or, at least, their use for assisting executive stakeholders) already affects many sensitive realms (such as politics, the social sciences, or law). The handover of decision-making power to opaque AI systems makes explaining them mandatory, primarily in application scenarios where the stakeholders are unaware of both the advanced technology applied and the basic principles governing the technological solutions. The issue should not be reduced to a merely technical problem; the explainer is compelled to transmit richer knowledge about the system (including its role within the informational ecosystem in which it operates). To achieve such an aim, the explainer could exploit, if necessary, practices from other scientific and humanistic areas. The first aim of the paper is to emphasize and justify the need for a multidisciplinary approach that benefits from part of the scientific and philosophical corpus on explaining, underscoring the particular nuances of the issue within the field of Data Science. The second objective is to develop arguments justifying the authors’ bet on a more relevant role for ideas inspired, on the one hand, by formal techniques from Knowledge Representation and Reasoning and, on the other, by the modeling of human reasoning when facing the explanation task. In this way, explanation-modeling practices would seek a sound balance between pure technical justification and explainer-explainee agreement.
Visual depiction of different levels of state space granularity. Grey boxes present hypotheses/predictions that are increasing in their state space granularity.
State space granularity and the number of exploitable relations are distinct concepts. The number of exploitable internal relations in map y and map x is the same. However, they differ in terms of state space granularity, i.e., map y is more detailed than map x in this respect.
Whilst the topic of representations is one of the key topics in philosophy of mind, it has only occasionally been noted that representations and representational features may be gradual. Apart from vague allusions, little has been said on what representational gradation amounts to and why it could be explanatorily useful. The aim of this paper is to provide a novel take on gradation of representational features within the neuroscientific framework of predictive processing. More specifically, we provide a gradual account of two features of structural representations: structural similarity and decoupling. We argue that structural similarity can be analysed in terms of two dimensions: number of preserved relations and state space granularity. Both dimensions can take on different values and hence render structural similarity gradual. We further argue that decoupling is gradual in two ways. First, we show that different brain areas are involved in decoupled cognitive processes to a greater or lesser degree depending on the cause (internal or external) of their activity. Second, and more importantly, we show that the degree of decoupling can be further regulated in some brain areas through precision weighting of prediction error. We lastly argue that gradation of decoupling (via precision weighting) and gradation of structural similarity (via state space granularity) are conducive to behavioural success.
There is growing evidence to support the claim that we react differently to robots than we do to other objects. In particular, we react differently to robots with which we have some form of social interaction. In this paper I critically assess the claim that, due to our tendency to become emotionally attached to social robots, permitting their harm may be damaging for society and as such we should consider introducing legislation to grant social robots rights and protect them from harm. I conclude that there is little evidence to support this claim and that legislation in this area would restrict progress in areas of social care where social robots are a potentially valuable resource.
The problem of epistemic opacity in Artificial Intelligence (AI) is often characterised as a problem of intransparent algorithms that give rise to intransparent models. However, the degrees of transparency of an AI model should not be taken as an absolute measure of the properties of its algorithms but of the model’s degree of intelligibility to human users. Its epistemically relevant elements are to be specified on various levels above and beyond the computational one. In order to elucidate this claim, I first contrast computer models and their claims to algorithm-based universality with cybernetics-style analogue models and their claims to structural isomorphism between elements of model and target system (in: Black, Models and metaphors, 1962). While analogue models aim at perceptually or conceptually accessible model-target relations, computer models give rise to a specific kind of underdetermination in these relations that needs to be addressed in specific ways. I then undertake a comparison between two contemporary AI approaches that, although related, distinctly align with the above modelling paradigms and represent distinct strategies towards model intelligibility: Deep Neural Networks and Predictive Processing. I conclude that their respective degrees of epistemic transparency primarily depend on the underlying purposes of modelling, not on their computational properties.
Necessity and sufficiency are the building blocks of all successful explanations. Yet despite their importance, these notions have been conceptually underdeveloped and inconsistently applied in explainable artificial intelligence (XAI), a fast-growing research area that is so far lacking in firm theoretical foundations. In this article, an expanded version of a paper originally presented at the 37th Conference on Uncertainty in Artificial Intelligence (Watson et al., 2021), we attempt to fill this gap. Building on work in logic, probability, and causality, we establish the central role of necessity and sufficiency in XAI, unifying seemingly disparate methods in a single formal framework. We propose a novel formulation of these concepts, and demonstrate its advantages over leading alternatives. We present a sound and complete algorithm for computing explanatory factors with respect to a given context and set of agentive preferences, allowing users to identify necessary and sufficient conditions for desired outcomes at minimal cost. Experiments on real and simulated data confirm our method’s competitive performance against state of the art XAI tools on a diverse array of tasks.
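The probabilistic readings of sufficiency and necessity that this line of work builds on can be sketched by brute-force enumeration over a toy binary classifier. The model below and the uniform input distribution are illustrative assumptions for exposition, not the authors' formulation or their sound-and-complete algorithm:

```python
from itertools import product

def toy_model(x):
    # Hypothetical loan classifier: approve iff (income AND employed) OR collateral
    income, employed, collateral = x
    return (income and employed) or collateral

def sufficiency(model, i, n=3):
    # P(output = 1 | feature i forced to 1), uniform over the other features
    hits = [model(tuple(1 if j == i else v for j, v in enumerate(x)))
            for x in product([0, 1], repeat=n)]
    return sum(hits) / len(hits)

def necessity(model, i, n=3):
    # P(output = 0 | feature i forced to 0), uniform over the other features
    misses = [1 - model(tuple(0 if j == i else v for j, v in enumerate(x)))
              for x in product([0, 1], repeat=n)]
    return sum(misses) / len(misses)
```

On this toy model, forcing collateral to 1 always yields approval (sufficiency 1.0), while removing it blocks approval in three quarters of the input space (necessity 0.75), which is the kind of quantitative contrast such a framework makes explicit.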
Reproduced from Li et al. (2019). A (top): A convolutional neural network, trained to detect and classify cancerous melanoma from images of skin discoloration. B (bottom): Input heatmaps depicting high-responsibility (red) and low-responsibility (blue) input regions for specific classifications
Reproduced from Cichy & Kaiser (2019). A (top): The architecture of the deep neural network for visual object-categorization. B (middle): The logic of Representational Similarity Analysis, affording direct comparisons between DNN unit activation, brain activity data, and behavior. C (bottom): Visual processing as a step-wise hierarchical process, in which early DNN layers correspond to early-stage cortical processing (left), and late DNN layers correspond to late-stage cortical processing (right)
Models developed using machine learning are increasingly prevalent in scientific research. At the same time, these models are notoriously opaque. Explainable AI aims to mitigate the impact of opacity by rendering opaque models transparent. More than being just the solution to a problem, however, Explainable AI can also play an invaluable role in scientific exploration. This paper describes how post-hoc analytic techniques from Explainable AI can be used to refine target phenomena in medical science, to identify starting points for future investigations of (potentially) causal relationships, and to generate possible explanations of target phenomena in cognitive science. In this way, this paper describes how Explainable AI—over and above machine learning itself—contributes to the efficiency and scope of data-driven scientific research.
The idea that “simplicity is a sign of truth”, and the related “Occam’s razor” principle, stating that, all other things being equal, simpler models should be preferred to more complex ones, have been long discussed in philosophy and science. We explore these ideas in the context of supervised machine learning, namely the branch of artificial intelligence that studies algorithms which balance simplicity and accuracy in order to effectively learn about the features of the underlying domain. Focusing on statistical learning theory, we show that situations exist for which a preference for simpler models (as modeled through the addition of a regularization term in the learning problem) provably slows down, instead of favoring, the supervised learning process. Our results shed new light on the relations between simplicity and truth approximation, which are briefly discussed in the context of both machine learning and the philosophy of science.
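The effect described can be illustrated in miniature. The sketch below is an elementary one-parameter least-squares example (not the paper's statistical-learning-theory construction): when the data are noiseless, an L2 regularization term biases the learned weight away from the true value, so the preference for simplicity works against truth approximation.

```python
def ridge_fit(xs, ys, lam):
    # Closed-form one-dimensional least squares with an L2 penalty lam:
    # w = sum(x*y) / (sum(x^2) + lam)
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)

# Data drawn exactly from y = 2x: the unregularized fit recovers the truth,
# while any lam > 0 shrinks the estimate toward zero, away from it.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]
w_plain = ridge_fit(xs, ys, lam=0.0)   # recovers w = 2.0
w_reg   = ridge_fit(xs, ys, lam=14.0)  # shrunk estimate
```

More data would eventually overwhelm the penalty, which is why the phenomenon is naturally described as slowing, rather than preventing, learning.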
As AI systems become increasingly complex it may become unclear, even to the designer of a system, why exactly a system does what it does. This leads to a lack of trust in AI systems. To solve this, the field of explainable AI has been working on ways to produce explanations of these systems’ behaviors. Many methods in explainable AI, such as LIME (Ribeiro et al. in Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2016), offer only a statistical argument for the validity of their explanations. However, some methods instead study the internal structure of the system and try to find components which can be assigned an interpretation. I believe that these methods provide more valuable explanations than those statistical in nature. I will try to identify which explanations can be considered internal to the system using the Chomskyan notion of tacit knowledge. I argue that each explanation expresses a rule, and through the localization of this rule in the system internals, we can take a system to have tacit knowledge of the rule. I conclude that the only methods which are able to sufficiently establish this tacit knowledge are those along the lines of Olah (Distill 2(11): 4901–4911, 2017), and therefore they provide explanations with unique strengths.
The images are taken from Van Looveren and Klaise (2019) and Papernot et al. (2017). They are generated from CNNs trained on the MNIST dataset. The first and third columns depict original images from the MNIST dataset. Column two depicts the corresponding CEs and column four shows the corresponding AEs
This figure depicts the decision behavior of a simple classifier. It describes the scenario from Sect. 2, which is inspired by Ballet et al. (2019). The classifier uses two features, salary and number of pets, to decide whether to approve or reject a loan application. The green dots are the training data labeled as approved, the red dots are the training data labeled as rejected. The blue line describes the decision boundary of the classifier
The same method that creates adversarial examples (AEs) to fool image-classifiers can be used to generate counterfactual explanations (CEs) that explain algorithmic decisions. This observation has led researchers to consider CEs as AEs by another name. We argue that the relationship to the true label and the tolerance with respect to proximity are two properties that formally distinguish CEs and AEs. Based on these arguments, we introduce CEs, AEs, and related concepts mathematically in a common framework. Furthermore, we show connections between current methods for generating CEs and AEs, and estimate that the fields will merge more and more as the number of common use-cases grows.
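For a linear classifier, the shared machinery behind CEs and AEs is easy to see: the same minimal-perturbation computation yields either one, depending on the use-case. Below is a minimal sketch (the classifier weights and input are illustrative assumptions, not taken from the paper):

```python
def predict(x, w, b):
    # Linear classifier: label 1 iff w.x + b > 0
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0

def flip_point(x, w, b, overshoot=1e-6):
    # Minimal L2 perturbation that moves x across the hyperplane w.x + b = 0.
    # Read it as a counterfactual explanation (nearest input with the other
    # label) or as an adversarial example, depending on intent.
    margin = sum(wi * xi for wi, xi in zip(w, x)) + b
    norm_sq = sum(wi * wi for wi in w)
    scale = (1 + overshoot) * margin / norm_sq
    return [xi - scale * wi for xi, wi in zip(x, w)]

w, b = [1.0, -2.0], 0.5
x = [2.0, 0.0]            # classified as 1
x_flip = flip_point(x, w, b)  # nearest point with label 0
```

The distinguishing properties argued for in the paper (relationship to the true label, tolerated proximity) are about how such a perturbed point is interpreted and constrained, not about the perturbation mechanics themselves.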
Certain characteristics make machine learning (ML) a powerful tool for processing large amounts of data, and also particularly unsuitable for explanatory purposes. There are worries that its increasing use in science may sideline the explanatory goals of research. We analyze the key characteristics of ML that might have implications for the future directions in scientific research: epistemic opacity and ‘theory-agnostic’ modeling. These characteristics are further analyzed in a comparison of ML with traditional statistical methods, in order to demonstrate what it is specifically that makes ML methodology substantially unsuitable for reaching explanations. The analysis is given broader philosophical context by connecting it with the views on the role of prediction and explanation in science, their relationship, and the value of explanation. We proceed to show, first, that ML disrupts the relationship between prediction and explanation commonly understood as a functional relationship. Then we show that the value of explanation is not exhausted in purely predictive functions, but rather has a ubiquitously recognized value for both science and everyday life. We then invoke two hypothetical scenarios with different degrees of automation of science, which help test our intuitions on the role of explanation in science. The main question we address is whether ML will reorient or otherwise impact our standard explanatory practice. We conclude with a prognosis that ML would diversify science into, on the one hand, purely predictively oriented research based on ML-like techniques and, on the other, anthropocentric research that remains focused on the search for explanation.
Shallow neural network capable of learning the exclusive or
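Such a shallow exclusive-or network can be written out directly. The sketch below uses hand-set weights rather than learned ones (an illustrative assumption): one hidden unit detects OR, the other detects AND, and the output fires exactly when OR holds but AND does not.

```python
def step(z):
    # Heaviside activation: fires iff the weighted input exceeds the threshold
    return 1 if z > 0 else 0

def xor_net(x1, x2):
    # 2-2-1 shallow network computing exclusive or with hand-set weights
    h_or  = step(x1 + x2 - 0.5)   # fires on x1 OR x2
    h_and = step(x1 + x2 - 1.5)   # fires on x1 AND x2
    return step(h_or - h_and - 0.5)  # fires on OR AND NOT AND, i.e. XOR
```

The hidden layer is essential: no single linear threshold unit can compute XOR, which is why the function is the classic example of a problem requiring at least one hidden layer.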
Differences between the interpretation of classical mathematical models (a) and DL models (b)
Illustration of steps (I)–(III) in the particle case, here illustrated with raw instead of reconstructed data (see fn. 15). Horizontal arrows indicate (formal) modeling steps; vertical arrows indicate conceptual/interpretive steps.
Raw data image taken from https://cds.cern.ch/record/1409759/files/event67hires.png (©CERN 2011)
Illustration of the metaphor explaining the difference between dimensions of opacity and the rough split between explainability methods. Dashed lines symbolize a surface of values O_p takes on over the h-w quarter-plane. The dotted line indicates a path traced out by an explainability method
Deep neural networks (DNNs) have become increasingly successful in applications from biology to cosmology to social science. Trained DNNs, moreover, correspond to models that ideally allow the prediction of new phenomena. Building in part on the literature on ‘eXplainable AI’ (XAI), I here argue that these models are instrumental in a sense that makes them non-explanatory, and that their automated generation is opaque in a unique way. This combination implies the possibility of an unprecedented gap between discovery and explanation: When unsupervised models are successfully used in exploratory contexts, scientists face a whole new challenge in forming the concepts required for understanding underlying mechanisms.
Proponents of the predictive processing (PP) framework often claim that one of the framework’s significant virtues is its unificatory power. What is supposedly unified are predictive processes in the mind, and these are explained in virtue of a common prediction error-minimisation (PEM) schema. In this paper, I argue against the claim that PP currently converges towards a unified explanation of cognitive processes. Although the notion of PEM systematically relates a set of posits such as ‘efficiency’ and ‘hierarchical coding’ into a unified conceptual schema, neither the framework’s algorithmic specifications nor its hypotheses about their implementations in the brain are clearly unified. I propose a novel way to understand the fruitfulness of the research program in light of a set of research heuristics that are partly shared with those common to Bayesian reverse engineering. An interesting consequence of this proposal is that pluralism is at least as important as unification to promote the positive development of the predictive mind.
I present an argument that propositional attitudes are not mental states. In a nutshell, the argument is that if propositional attitudes are mental states, then only minded beings could have them; but there are reasons to think that some non-minded beings could bear propositional attitudes. To illustrate this, I appeal to cases of genuine group intentionality. I argue that these are cases in which some group entities bear propositional attitudes, but they are not subjects of mental states. Although propositional attitudes are not mental states, I propose that they are typically co-instantiated with mental states. In an attempt to explain this co-instantiation, I suggest that propositional attitudes of minded beings are typically realized by mental states.
This paper proposes a model for implementation of intrinsic natural language sentence meaning in a physical language understanding system, where 'intrinsic' is understood as 'independent of meaning ascription by system-external observers'. The proposal is that intrinsic meaning can be implemented as a point attractor in the state space of a nonlinear dynamical system with feedback which is generated by temporally sequenced inputs. It is motivated by John Searle's well known (Behavioral and Brain Sciences, 3: 417–57, 1980) critique of the then-standard and currently still influential computational theory of mind (CTM), the essence of which was that CTM representations lack intrinsic meaning because that meaning is dependent on ascription by an observer. The proposed dynamical model comprises a collection of interacting artificial neural networks, and constitutes a radical simplification of the principle of compositional phrase structure which is at the heart of the current standard view of sentence semantics because it is computationally interpretable as a finite state machine.
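The point-attractor idea can be illustrated with a drastically reduced sketch: a single recurrent unit with feedback (the paper's model involves a collection of interacting networks; the weights here are illustrative assumptions). Because the update map is contractive, trajectories from different initial states settle into the same fixed point, which is the sense in which an attractor can serve as a stable, input-driven state.

```python
import math

def rnn_step(x, w=0.5, b=0.3):
    # One update of a one-unit recurrent net with feedback; |w| < 1 keeps
    # the map contractive, so repeated application converges.
    return math.tanh(w * x + b)

def settle(x0, steps=100):
    # Iterate the feedback dynamics until the state reaches the attractor
    x = x0
    for _ in range(steps):
        x = rnn_step(x)
    return x

# Widely separated initial states end up at the same point attractor
left  = settle(-2.0)
right = settle(2.0)
```

Sequencing the inputs over time, as the paper proposes, would make the attractor reached depend on the input history, which is what licenses the finite-state-machine interpretation.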
Users of sociotechnical systems often have no way to independently verify whether the system output which they use to make decisions is correct; they are epistemically dependent on the system. We argue that this leads to problems when the system is wrong, namely to bad decisions and violations of the norm of practical reasoning. To prevent this from occurring we suggest the implementation of defeaters: information that a system is unreliable in a specific case (undercutting defeat) or independent information that the output is wrong (rebutting defeat). Practically, we suggest to design defeaters based on the different ways in which a system might produce erroneous outputs, and analyse this suggestion with a case study of the risk classification algorithm used by the Dutch tax agency.
A screenshot of the practice question and the written advice for group ‘Text’
Means plot of Percent conformity for all three conditions. Each error bar is constructed using 1 standard deviation from the mean
Frequency histograms of conformity for the three conditions
Conversational artificial agents and artificially intelligent (AI) voice assistants are becoming increasingly popular. Digital virtual assistants such as Siri, or conversational devices such as Amazon Echo or Google Home, are permeating everyday life, and are designed to be more and more humanlike in their speech. This study investigates the effect this can have on one’s conformity with an AI assistant. In the 1950s, Solomon Asch already demonstrated the power and danger of conformity amongst people. In these classic experiments, test persons were asked to answer relatively simple questions, whilst others pretending to be participants tried to convince the test person to give wrong answers. These studies were later replicated with embodied robots, but such physical robots are still rare. In light of our increasing reliance on AI assistants, this study investigates to what extent an individual will conform to a disembodied virtual assistant. We also investigate if there is a difference between a group that interacts with an assistant that communicates through text, one that has a robotic voice and one that has a humanlike voice. The assistant attempts to subtly influence participants’ final responses in a general knowledge quiz, and we measure how often participants change their answer after having been given advice. Results show that participants conformed significantly more often to the assistant with a human voice than the one that communicated through text.
In this paper, we theoretically address the relevance of unintentional and inconsistent interactional elements in human-robot interactions. We argue that elements failing, or poorly succeeding, to reproduce a humanlike interaction create significant consequences in human-robot relational patterns and may affect human-human relations. When considering social interactions as systems, the absence of a precise interactional element produces a general reshaping of the interactional pattern, eventually generating new types of interactional settings. As an instance of this dynamic, we study the absence of metacommunicative abilities in social artifacts. Then, we analyze the pragmatic consequences of the aforementioned absence through the lens of Paul Watzlawick’s interactionist theory. We suggest that a fixed complementary interactional setting may be produced because of the asymmetric understanding, between robots and humans, of metacommunication. We highlight the psychological implications of this interactional asymmetry within Jessica Benjamin’s concept of “mutual recognition”. Finally, we point out the possible shift of dysfunctional interactional patterns from human-robot interactions to human-human ones.
Social robots are robots that can interact socially with humans. As social robots and the artificial intelligence (AI) that powers them become more advanced, they will likely take on more social and work roles. This has many important ethical implications. In this paper, we focus on one of the most central of these, the impacts that social robots can have on human autonomy. We argue that, due to their physical presence and social capacities, there is strong potential for social robots to enhance human autonomy, as well as several ways in which they can inhibit and disrespect it. We argue that social robots could improve human autonomy by helping us to achieve more valuable ends, make more authentic choices, and improve our autonomy competencies. We also argue that social robots have the potential to harm human autonomy by instead leading us to achieve fewer valuable ends ourselves, make less authentic choices, decrease our autonomy competencies, make our autonomy more vulnerable, and disrespect our autonomy. Whether the impacts of social robots on human autonomy are positive or negative overall will depend on the design, regulation, and use we make of social robots in the future.
Deep learning (DL) techniques have revolutionised artificial systems’ performance on myriad tasks, from playing Go to medical diagnosis. Recent developments have extended such successes to natural language processing, an area once deemed beyond such systems’ reach. Despite their different goals (technological development vs. theoretical insight), these successes have suggested that such systems may be pertinent to theoretical linguistics. The competence/performance distinction presents a fundamental barrier to such inferences. While DL systems are trained on linguistic performance, linguistic theories are aimed at competence. Such a barrier has traditionally been sidestepped by assuming a fairly close correspondence: performance as competence plus noise. I argue this assumption is unmotivated. Competence and performance can differ arbitrarily. Thus, we should not expect DL models to illuminate linguistic theory.
This paper explores aspects of GPT-3 that have been discussed as harbingers of artificial general intelligence and, in particular, linguistic intelligence. After introducing key features of GPT-3 and assessing its performance in the light of the conversational standards set by Alan Turing in his seminal paper from 1950, the paper elucidates the difference between clever automation and genuine linguistic intelligence. A central theme of this discussion on genuine conversational intelligence is that members of a linguistic community never merely respond “algorithmically” to queries through a selective kind of pattern recognition, because they must also jointly attend and act with other speakers in order to count as genuinely intelligent and trustworthy. This presents a challenge for systems like GPT-3, because representing the world in a way that makes conversational common ground salient is an essentially collective task that we can only achieve jointly with other speakers. Thus, the main difficulty for any artificially intelligent model of conversation is to account for the communicational intentions and motivations of a speaker through joint attention. These joint motivations and intentions seem to be completely absent from the standard way in which systems like GPT-3 and other artificially intelligent systems work. This is not merely a theoretical issue. Since GPT-3 and future iterations of similar systems will likely be available for commercial use through application programming interfaces, caution is needed regarding the risks created by these systems, which pass for “intelligent” but have no genuine communicational intentions, and can thereby produce fake and unreliable linguistic exchanges.
Machine behavior that is based on learning algorithms can be significantly influenced by exposure to data of different qualities. Up to now, those qualities have been measured solely in technical terms, but not in ethical ones, despite the significant role of training and annotation data in supervised machine learning. This is the first study to fill this gap by describing new dimensions of data quality for supervised machine learning applications. Based on the rationale that different social and psychological backgrounds of individuals correlate in practice with different modes of human–computer-interaction, the paper describes from an ethical perspective how varying qualities of behavioral data that individuals leave behind while using digital technologies have socially relevant ramifications for the development of machine learning applications. The specific objective of this study is to describe how training data can be selected according to ethical assessments of the behavior it originates from, establishing an innovative filter regime to transition from the big data rationale n = all to a more selective way of processing data for training sets in machine learning. The overarching aim of this research is to promote methods for achieving beneficial machine learning applications that could be widely useful for industry as well as academia.
Top-cited authors
Luciano Floridi
  • University of Oxford - University of Bologna
Ugo Pagallo
  • Università degli Studi di Torino
Virginia Dignum
  • Umeå University
Burkhard Schafer
  • The University of Edinburgh
Christoph Lütge
  • Technische Universität München