Article
PDF Available

A RIGHT TO REASONABLE INFERENCES: RE-THINKING DATA PROTECTION LAW IN THE AGE OF BIG DATA AND AI

Authors: Sandra Wachter, Brent Mittelstadt

Abstract

Columbia Business Law Review, 2019(2).
Big Data analytics and artificial intelligence (AI) draw non-intuitive and unverifiable inferences and predictions about the behaviors, preferences, and private lives of individuals. These inferences draw on highly diverse and feature-rich data of unpredictable value, and create new opportunities for discriminatory, biased, and invasive decision-making. Data protection law is meant to protect people’s privacy, identity, reputation, and autonomy, but is currently failing to protect data subjects from the novel risks of inferential analytics. The legal status of inferences is heavily disputed in legal scholarship, and marked by inconsistencies and contradictions within and between the views of the Article 29 Working Party and the European Court of Justice (ECJ). This Article shows that individuals are granted little control and oversight over how their personal data is used to draw inferences about them. Compared to other types of personal data, inferences are effectively ‘economy class’ personal data in the General Data Protection Regulation (GDPR). Data subjects’ rights to know about (Art 13-15), rectify (Art 16), delete (Art 17), object to (Art 21), or port (Art 20) personal data are significantly curtailed for inferences. The GDPR also provides insufficient protection against sensitive inferences (Art 9) or remedies to challenge inferences or important decisions based on them (Art 22(3)). This situation is not accidental. In standing jurisprudence the ECJ has consistently restricted the remit of data protection law to assessing the legitimacy of input personal data undergoing processing, and to rectify, block, or erase it. Critically, the ECJ has likewise made clear that data protection law is not intended to ensure the accuracy of decisions and decision-making processes involving personal data, or to make these processes fully transparent. Current policy proposals addressing privacy protection (the ePrivacy Regulation and the EU Digital Content Directive) and Europe’s new Copyright Directive and Trade Secrets Directive also fail to close the GDPR’s accountability gaps concerning inferences. This Article argues that a new data protection right, the ‘right to reasonable inferences’, is needed to help close the accountability gap currently posed by ‘high risk inferences’, meaning inferences drawn from Big Data analytics that damage privacy or reputation, or have low verifiability in the sense of being predictive or opinion-based while being used in important decisions. This right would require ex-ante justification to be given by the data controller to establish whether an inference is reasonable. This disclosure would address (1) why certain data form a normatively acceptable basis from which to draw inferences; (2) why these inferences are relevant and normatively acceptable for the chosen processing purpose or type of automated decision; and (3) whether the data and methods used to draw the inferences are accurate and statistically reliable. The ex-ante justification is bolstered by an additional ex-post mechanism enabling unreasonable inferences to be challenged.
... The lack of clear regulations and guidelines poses significant risks, including violations of privacy (Martin & Zimmermann, 2024). AI systems can collect and process vast amounts of personal data, potentially infringing on individuals' right to privacy (Wachter & Mittelstadt, 2019; Manheim & Kaplan, 2019). It is also documented that AI algorithms can perpetuate existing biases and discriminate against certain groups (Lewis, 2024; Nazer et al., 2023). ...
Chapter
This chapter investigates the legal and ethical frameworks for artificial intelligence (AI) in Ghana, emphasizing how the country can draw upon the EU AI Act 2024 to shape a harmonious AI legal regime. It is argued that the effective regulation of AI in Ghana requires a harmonious legal framework that addresses ethical concerns and human rights, transparency, and accountability. A desktop review makes it evident that Ghana's expanding AI landscape requires a well-aligned regulatory approach that prioritizes human well-being, safety, and dignity. Such a framework ensures that AI technologies respect citizens' rights and enhance public safety. By categorizing AI systems based on risk levels, Ghana can mitigate potentially harmful practices while fostering innovation and trust. Incorporating ethical frameworks into AI governance is vital for effective regulation, as it offers clear guidelines that align with human values. This alignment promotes global cooperation and safeguards human interests in society.
... The awareness that an impersonal algorithm can make biased decisions affecting one's opportunities can lead to feelings of intrusion and loss of control. ... AI systems collect vast amounts of sensitive information, including location data, personal preferences, and behavioral patterns [92]. AI algorithms analyze this data to identify patterns, make predictions, and automate decisions, often uncovering deeply personal insights, such as health conditions, financial status, and personal relationships, often without the individual's awareness [93]. This ability to infer information beyond what is explicitly shared heightens feelings of intrusion and loss of control over personal information. ...
Article
Full-text available
The rapid advancement of artificial intelligence (AI) has raised significant concerns regarding its impact on human psychology, leading to a phenomenon termed AI Anxiety—feelings of apprehension or fear stemming from the accelerated development of AI technologies. Although AI Anxiety is a critical concern, the current literature lacks a comprehensive analysis addressing this issue. This paper aims to fill that gap by thoroughly examining the psychological factors underlying AI Anxiety and proposing effective solutions to tackle the problem. We begin by comparing AI Anxiety with Automation Anxiety, highlighting the distinct psychological impacts associated with AI-specific advancements. We delve into the primary contributor to AI Anxiety—the fear of replacement by AI—and explore secondary causes such as uncontrolled AI growth, privacy concerns, AI-generated misinformation, and AI biases. To address these challenges, we propose multidisciplinary solutions, offering insights into educational, technological, regulatory, and ethical guidelines. Understanding the root causes of AI Anxiety and implementing strategic interventions are critical steps for mitigating its rise as society enters the era of pervasive AI.
... This requires careful design and testing of AI algorithms to mitigate biases. Data minimization and purpose limitation: balancing the collection and use of gender-related data for security purposes with principles of data minimization to protect user privacy [238]. Excessive data collection can lead to privacy violations, especially for vulnerable groups. ...
Chapter
Full-text available
From a legal perspective, next-generation wireless communication technologies such as Wireless Fidelity (WiFi), 5G and the upcoming 6G do not constitute a separate area of law. Thus, for legal scholars, these technologies do not constitute a distinct research area but are rather part of broader discussions about the regulation of digital systems. Wireless communication is based on and related to data transmission and processing, and therefore falls within the broader context of digitization and datafication. This chapter summarizes how legal aspects and regulatory frameworks are relevant for research on wireless systems, on cybercrime related to these systems and on humans as central actors – with a specific focus on aspects that are relevant in the framework of the COST Action BEiNG-WISE. In this context, the main purpose of legal research is to adopt legal approaches that have been developed for other areas of digitization and apply these to wireless systems. Another target of the current chapter is the identification of research gaps related to particularities of these emerging wireless technologies that require attention, modifications to existing laws and potential new regulatory approaches. As the BEiNG-WISE COST Action primarily includes researchers from European Union (EU) and Council of Europe (CoE) countries, the chapter more specifically looks at the legal frameworks established at EU and CoE level. However, certain legal issues that are relevant for wireless communication systems are not yet regulated at EU level, remaining under the member states’ legislative authority. This means that considerable differences persist concerning the regulatory frameworks and their application across different jurisdictions. This is particularly the case for criminal law and criminal procedure. Thus, in some respect, comparative approaches can help to understand the scope of the legal frameworks applicable to wireless communication systems.
... This requires careful design and testing of AI algorithms to mitigate biases. Data minimization and purpose limitation: balancing the collection and use of gender-related data for security purposes with principles of data minimization to protect user privacy [238]. Excessive data collection can lead to privacy violations, especially for vulnerable groups. ...
Technical Report
Full-text available
Next-Generation Wireless Networks and Systems (NGWN-Ss) are foundational to realizing a seamlessly connected world, unlocking transformative services and applications. However, the pervasive connectivity of NGWN-Ss introduces complex and new challenges in cybersecurity and privacy. Key concerns include the vast volumes of data exchanged, evolving user interactions with advanced technologies, and the increasing sophistication of cybercriminals utilizing these technologies for malicious purposes. The BEiNG-WISE Action highlights critical gaps in technologies, legislation, ethical considerations, and the integration of user-centric perspectives into technological development. Current regulatory frameworks lag behind the rapid pace of technological advancement, often neglecting the intricate needs of end users. During the first year of collaborative efforts, the WGs identified key interdependencies across technical, legal, and sociological dimensions, underscoring the need for multidisciplinary approaches to address cybersecurity challenges comprehensively. This document synthesizes findings from various domains, ranging from the technical evolution of wireless systems (WG1) and the sociological dynamics of cybercrime (WG2) to innovative cybersecurity frameworks (WG3) and user-centered methodologies (WG4). A central theme is the interplay between advanced technology, human factors, and the evolving legal landscape (WG5). The chapters explore these connections and provide a foundation for re-imagining cybersecurity through a holistic, responsible-by-design approach. By integrating human, ethical, and regulatory dimensions, this work sets the foundations for novel cybersecurity solutions that balance technological innovation with societal impact.
... Our position is challenged by scholars and data protection authorities who argue that AI models, in general, cannot be classified as personal data (Wachter & Mittelstadt, 2019; Leiser & Dechesne, 2020; Datatilsynet, 2023; Hamburg, 2024). Consequently, they contend that the legal implications outlined in Section 4 do not apply. ...
Preprint
Does GPT know you? The answer depends on your level of public recognition; however, if your information was available on a website, the answer is probably yes. All Large Language Models (LLMs) memorize training data to some extent. If an LLM training corpus includes personal data, it also memorizes personal data. Developing an LLM typically involves processing personal data, which falls directly within the scope of data protection laws. If a person is identified or identifiable, the implications are far-reaching: the AI system is subject to EU General Data Protection Regulation requirements even after the training phase is concluded. To back our arguments: (1.) We reiterate that LLMs output training data at inference time, be it verbatim or in generalized form. (2.) We show that some LLMs can thus be considered personal data on their own. This triggers a cascade of data protection implications such as data subject rights, including rights to access, rectification, or erasure. These rights extend to the information embedded within the AI model. (3.) This paper argues that machine learning researchers must acknowledge the legal implications of LLMs as personal data throughout the full ML development lifecycle, from data collection and curation to model provision on, e.g., GitHub or Hugging Face. (4.) We propose different ways for the ML research community to deal with these legal implications. Our paper serves as a starting point for improving the alignment between data protection law and the technical capabilities of LLMs. Our findings underscore the need for more interaction between the legal domain and the ML community.
Article
Full-text available
Background and Objective: The combination of artificial intelligence (AI) and big data has significantly altered data collection, data processing, and privacy concerns. This study looks into the changing viewpoints on data privacy and security before and after the introduction of these technologies. The core research question investigates how AI and big data have changed attitudes towards data protection, with an emphasis on changes in privacy perceptions and security measures. The study expands on previous research by identifying the problems and possibilities posed by these technologies, with the goal of filling knowledge gaps on their influence on privacy and security.
Chapter
Computer systems capable of executing activities that require human intelligence, often referred to as Artificial Intelligence (AI), have lately been making their mark across industries. The fact that enormous data sets can be used to teach these AI systems to recognize patterns and make predictions is highly advantageous, adding agility to existing human-operated systems at work. In recent times, artificial intelligence and machine learning technology have reshaped the healthcare sector as well. AI has the potential to transform the healthcare industry by increasing efficiency, lowering costs, and improving the prognosis for patients. Integrating AI in healthcare presents a few hurdles, of which the two most important are meeting compliance requirements and resolving issues of confidence in machine learning outcomes. Despite these obstacles, introducing machine learning, artificial intelligence, and other technologies to the healthcare business has resulted in various benefits for both healthcare organizations and the patients they serve. Both machine learning and AI have shown an array of advantages in the healthcare industry by optimizing processes and assisting with routine chores, as well as helping users promptly find solutions to critical concerns, enabling improved services for patients and consumers. Most healthcare providers are offering user-driven experiences and increasing operational efficiency, making the best possible use of collected data, assets, and resources by evaluating data trends, enhancing coherence, and improving the outcomes of clinical and operational procedures. Yet, even after nearly two decades of AI in the medical industry, insufficient information exists about consumer perceptions of AI in medical treatments and procedures. Our study aims to understand consumers' apprehensiveness about embracing AI-assisted healthcare in both tangible and hypothetical choices, whether in independent or collaborative evaluations.
Article
Full-text available
In reaction to concerns about a broad range of potential ethical issues, dozens of proposals for addressing ethical aspects of artificial intelligence (AI) have been published. However, many of them are too abstract for being easily translated into concrete designs for AI systems. The various proposed ethical frameworks can be considered an instance of principlism that is similar to that found in medical ethics. Given their general nature, principles do not say how they should be applied in a particular context. Hence, a broad range of approaches, methods, and tools have been proposed for addressing ethical concerns of AI systems. This paper presents a systematic analysis of more than 100 frameworks, process models, and proposed remedies and tools for helping to make the necessary shift from principles to implementation, expanding on the work of Morley and colleagues. This analysis confirms a strong focus of proposed approaches on only a few ethical issues such as explicability, fairness, privacy, and accountability. These issues are often addressed with proposals for software and algorithms. Other, more general ethical issues are mainly addressed with conceptual frameworks, guidelines, or process models. This paper develops a structured list and definitions of approaches, presents a refined segmentation of the AI development process, and suggests areas that will require more attention from researchers and developers.
Chapter
This informative Handbook provides a comprehensive overview of the legal, ethical, and policy implications of AI and algorithmic systems. As these technologies continue to impact various aspects of our lives, it is crucial to understand and assess the challenges and opportunities they present. Drawing on contributions from experts in various disciplines, the book covers theoretical insights and practical examples of how AI systems are used in society today. It also explores the legal and policy instruments governing AI, with a focus on Europe. The interdisciplinary approach of this book makes it an invaluable resource for anyone seeking to gain a deeper understanding of AI's impact on society and how it should be regulated. This title is also available as Open Access on Cambridge Core.
Article
Full-text available
Nowadays algorithms can decide if one can get a loan, is allowed to cross a border, or must go to prison. Artificial intelligence techniques (natural language processing and machine learning in the first place) enable private and public decision-makers to analyse big data in order to build profiles, which are used to make decisions in an automated way. This work presents ten arguments against algorithmic decision-making. These revolve around the concepts of ubiquitous discretionary interpretation, holistic intuition, algorithmic bias, the three black boxes, psychology of conformity, power of sanctions, civilising force of hypocrisy, pluralism, empathy, and technocracy. The lack of transparency of the algorithmic decision-making process does not stem merely from the characteristics of the relevant techniques used, which can make it impossible to access the rationale of the decision. It depends also on the abuse of and overlap between intellectual property rights (the “legal black box”). In the US, nearly half a million patented inventions concern algorithms; more than 67% of the algorithm-related patents were issued over the last ten years and the trend is increasing. To counter the increased monopolisation of algorithms by means of intellectual property rights (with trade secrets leading the way), this paper presents three legal routes that enable citizens to ‘open’ the algorithms. First, copyright and patent exceptions, as well as trade secrets are discussed. Second, the GDPR is critically assessed. In principle, data controllers are not allowed to use algorithms to take decisions that have legal effects on the data subject’s life or similarly significantly affect them. However, when they are allowed to do so, the data subject still has the right to obtain human intervention, to express their point of view, as well as to contest the decision. Additionally, the data controller shall provide meaningful information about the logic involved in the algorithmic decision. Third, this paper critically analyses the first known case of a court using the access right under the freedom of information regime to grant an injunction to release the source code of the computer program that implements an algorithm. Only an integrated approach – which takes into account intellectual property, data protection, and freedom of information – may provide the citizen affected by an algorithmic decision with an effective remedy as required by the Charter of Fundamental Rights of the EU and the European Convention on Human Rights. Recommended citation: Guido Noto La Diega, Against the Dehumanisation of Decision-Making – Algorithmic Decisions at the Crossroads of Intellectual Property, Data Protection, and Freedom of Information, 9 (2018) JIPITEC 3 para 1.
Article
Full-text available
Since approval of the EU General Data Protection Regulation (GDPR) in 2016, it has been widely and repeatedly claimed that a 'right to explanation' of decisions made by automated or artificially intelligent algorithmic systems will be legally mandated by the GDPR. This right to explanation is viewed as an ideal mechanism to enhance the accountability and transparency of automated decision-making. However, there are several reasons to doubt both the legal existence and the feasibility of such a right. In contrast to the right to explanation of specific automated decisions claimed elsewhere, the GDPR only mandates that data subjects receive limited information (Articles 13-15) about the logic involved, as well as the significance and the envisaged consequences of automated decision-making systems, what we term a 'right to be informed'. Further, the ambiguity and limited scope of the 'right not to be subject to automated decision-making' contained in Article 22 (from which the alleged 'right to explanation' stems) raises questions over the protection actually afforded to data subjects. These problems show that the GDPR lacks precise language as well as explicit and well-defined rights and safeguards against automated decision-making, and therefore runs the risk of being toothless. We propose a number of legislative steps that, if taken, may improve the transparency and accountability of automated decision-making when the GDPR comes into force in 2018.
Chapter
In this chapter, a critical analysis is undertaken of the provisions of Art. 22 of the European Union’s General Data Protection Regulation of 2016, with lines of comparison drawn to the predecessor for these provisions—namely Art. 15 of the 1995 Data Protection Directive. Article 22 places limits on the making of fully automated decisions based on profiling when the decisions incur legal effects or similarly significant consequences for the persons subject to them. The basic argument advanced in the chapter is that Art. 22 on its face provides persons with stronger protections from such decision making than Art. 15 of the Directive does. However, doubts are raised as to whether Art. 22 will have a significant practical impact on automated profiling.
Article
The criminal justice system is becoming automated. At every stage, from policing to evidence to parole, machine learning and other computer systems guide outcomes. Widespread debates over the pros and cons of these technologies have overlooked a crucial issue: ownership. Developers often claim that details about how their tools work are trade secrets and refuse to disclose that information to criminal defendants or their attorneys. The introduction of intellectual property claims into the criminal justice system raises undertheorized tensions between life, liberty, and property interests. This Article offers the first wide-ranging account of trade secret evidence in criminal cases and develops a framework to address the problems that result. In sharp contrast to the general view among trial courts, legislatures, and scholars alike, this Article argues that trade secrets should not be privileged in criminal proceedings. A criminal trade secret privilege is ahistorical, harmful to defendants, and unnecessary to protect the interests of the secret holder. Meanwhile, compared to substantive trade secret law, the privilege overprotects intellectual property. Further, privileging trade secrets in criminal proceedings fails to serve the theoretical purposes behind either trade secret law or privilege law. The trade secret inquiry sheds new light on how evidence rules do, and should, function differently in civil and criminal cases.
Chapter
This chapter focuses on big data analytics and, in this context, investigates the opportunity to consider informational privacy and data protection as collective rights. From this perspective, privacy and data protection are not interpreted as referring to a given individual, but as common to the individuals that are grouped into various categories by data gatherers. The peculiar nature of the groups generated by big data analytics requires an approach that cannot be exclusively based on individual rights. The new scale of data collection entails the recognition of a new layer, represented by groups’ need for the safeguard of their collective privacy and data protection rights. This dimension requires a specific regulatory framework, which should be mainly focused on the legal representation of these collective interests, on the provision of a mandatory multiple-impact assessment of the use of big data analytics and on the role played by data protection authorities.
Article
Perfect anonymization of data sets that contain personal information has failed. But the process of protecting data subjects in shared information remains integral to privacy practice and policy. While the deidentification debate has been vigorous and productive, there is no clear direction for policy. As a result, the law has been slow to adopt a holistic approach to protecting data subjects when data sets are released to others. Currently, the law is focused on whether an individual can be identified within a given set. We argue that the best way to move data release policy past the alleged failures of anonymization is to focus on the process of minimizing risk of reidentification and sensitive attribute disclosure, not preventing harm. Process-based data release policy, which resembles the law of data security, will help us move past the limitations of focusing on whether data sets have been “anonymized.” It draws upon different tactics to protect the privacy of data subjects, including accurate deidentification rhetoric, contracts prohibiting reidentification and sensitive attribute disclosure, data enclaves, and query-based strategies to match required protections with the level of risk. By focusing on process, data release policy can better balance privacy and utility where nearly all data exchanges carry some risk. © 2016, University of Washington School of Law. All rights reserved.