
Data Pollution & Power - White Paper for a Global Sustainable Development Agenda on AI by Gry Hasselbalch with contributions from the Data Pollution & Power (DPP) Group at Bonn Sustainable AI Lab


Abstract

Data Pollution is to the big data age what smog was to the industrial age. Our response to data pollution will develop much like our reaction to traditional forms of pollution—just much faster and hopefully with dedication and great force. This white paper describes a nascent environmental data pollution movement. It frames data pollution in the context of powers and interests, exploring eight domains in which data pollution has the greatest impact: Nature, Science & Innovation, Democracy, Human Rights, Infrastructure, Decision-Making, Global Opportunities, and Time. The main objective is to ensure that data pollution of AI in particular is included in the global sustainable development agenda.
White Paper for a Global Sustainable
Development Agenda on AI
Data Pollution & Power
Gry Hasselbalch
© 2022 Gry Hasselbalch
This work is licensed under Creative Commons (CC BY-NC 4.0)
Published by The Sustainable AI Lab, Bonn University
Citation: Hasselbalch, G. (2022) Data Pollution & Power – White
Paper for a Global Sustainable Development Agenda on AI, The Sustainable AI Lab,
Bonn University.
Front Cover Photo: Alina Grubnyak, Unsplash.com
Funding has been provided by the Alexander von Humboldt
Foundation in the framework of the Alexander von Humboldt
Professorship for Articial Intelligence endowed by the Federal
Ministry of Education and Research.
Gry Hasselbalch
DATA POLLUTION
& POWER
White Paper for a Global Sustainable
Development Agenda on AI
With contributions from the Data Pollution & Power
(DPP) Group:
Aimee van Wynsberghe
The Sustainable AI Lab, University of Bonn
Carolina Aguerre
Universidad Católica del Uruguay
Federica Lucivero
Oxford University
Jenny Brennan
The Ada Lovelace Institute
Lynn H. Kaack
Hertie School
Pak-Hang Wong
H&M Group
Sebnem Yardimci-Geyikci
The Sustainable AI Lab, University of Bonn
Signe Daugbjerg
Università Cattolica del Sacro Cuore
Table of Contents
WHITE PAPER GLOSSARY
INTRODUCTION
1. DATA POLLUTION TERMINOLOGY
    Data Pollution
    The Holistic Approach
    The Human-Centric Approach
    Data Ethics of Power
    Data Interests
2. THE GEO-POLITICS OF DATA POLLUTION
3. CATALOGUE OF DATA POLLUTION DOMAINS
    Nature
    Science and Innovation
    Democracy
    Human Rights
    Infrastructure
    Decision-Making
    Global Opportunities
    Time
4. DATA POLLUTION QUESTIONS
CONCLUSION
    A Data Pollution Movement
END NOTES & BIBLIOGRAPHY
Abstract
This white paper explores ‘data pollution’ as the adverse environmental
impact of the data of artificial intelligence (AI). It frames the environmental
problems of AI and big data in the context of powers and interests, explores
eight data pollution domains, and provides a set of questions that may be
further explored when including data pollution in the global sustainable
development agenda.
The rst chapter outlines shared terminology that can be used to address
the data pollution of AI and its contextual interrelated power dynamics.
The second chapter describes the current geopolitical landscape of AI ethics
and sustainability. The third chapter explores eight key domains of the
natural, social and personal environment in which data pollution has the
greatest impact: Nature, Science & Innovation, Democracy, Human Rights,
Infrastructure, Decision-Making, Global Opportunities, and Time. Lastly, in
the fourth chapter, a set of questions for the global sustainable development
agenda are presented.
About the Data Pollution & Power Initiative
(the DPP Initiative)
The Data Pollution & Power initiative was set up by Gry
Hasselbalch (the author of this paper) at the Sustainable AI Lab at
the Institute of Science and Ethics (IWE), Bonn University to explore
the power dynamics that shape AI data pollution across the UN
Sustainable Development Goals (SDGs). The project examines how
power dynamics and interests in AI data determine the handling and
distribution of data in data ecosystems and considers actions and
governance approaches that are intrinsically interrelated in systems
of power and interests.1
About the Data Pollution & Power Group
(the DPP Group)
The Data Pollution & Power Group is a group of experts that
was established in June 2021 to examine the data of AI as a human
and natural resource in ‘eco systems’ and environments—and ‘data
pollution’ as the interrelated (big) data deriving from AI which has
adverse eects on the Sustainable Development Goals (SDGs). It is a
cross-disciplinary group with diverse expertise and interests that cut
across several of the SDGs. The core aim of the group is to debate,
scope out, map and explore the data pollution of AI.
About the Sustainable AI Lab
The Sustainable AI Lab2 is an initiative of Professor Dr Aimee
van Wynsberghe, Director of the Institute of Science and Ethics at
Bonn University, and the result of her Humboldt Professorship for
the Applied Ethics of AI, which was awarded by the Alexander von
Humboldt Foundation.3
The Sustainable AI Lab brings together researchers from various
backgrounds to conduct projects aimed at: measuring & assessing
the environmental impact of AI, ways of making AI systems more
sustainable, and directing AI towards the SDGs.
WHITE PAPER GLOSSARY
Agency
In this white paper, ‘agency’ does not imply autonomous individual
agent agency in human or technical form. Instead, it is used to refer
to socio-technical processes and systems that consist of a complex
of actors and components that in combination constitute various
actions (such as making ‘decisions’, ‘infrastructuring’ (as a verb),
‘discriminating’ or ‘polluting’) and that have an identifiable impact
in their respective environments. These ‘socio-technical actions’ are
driven forward by dominant interests and cultural narratives that
are reinforced in socio-technical design. They are realised in more
or less humanly controlled contexts with different levels of human
involvement.
Autonomous Decision-Making (ADM) Systems
ADM systems are autonomous or semi-autonomous (AI) decision-
making systems. ADM systems are embedded in the socio-technical
information infrastructure of private and public decision-making sectors;
decisions are increasingly informed or replaced by big data AI systems that
predict and analyse risks or potential based on accumulated data.
Big Data Socio-Technical Infrastructures (BDSTIs)
BDSTIs are socio-technical infrastructures constituted by big
data technologies.4 They are the primary infrastructures of global
information economies and societies and are institutionalised in
systems requirements standards for ICT practices and in regulatory
frameworks, and they are invested with human imagination about
the challenges and opportunities of big data.
Articial Intelligence Socio-Technical Infrastructures (AISTIs)
AISTIs are an evolution of the analytical capabilities of BDSTIs.5
They are BDSTIs designed to sense their environment in real time,
learning and evolving with autonomous or semi-autonomous ‘agency’
(see above). BDSTIs extend space in digital data and AISTIs work in
time by acting on that data to form the past and present in the image
of the future.
Data Ethics of Power
Data Ethics of Power is an applied ethics approach concerned
with making the power dynamics of the big data society and the
conditions of their negotiation visible in order to point to design,
business, policy, and social and cultural processes that support a
human(-centric) distribution of power.6
Data Interests
A ‘data interest’ constitutes a specific need, value or goal centred
on data as a resource.7 These can be political, commercial or scientific
interests in data or an individual’s interest in protecting or making
use of their personal data. Data interests can be found in data design
and in data governance activities. (see also ‘agency’ above)
Data Pollution
Data pollution is the interrelated adverse impact that the
generation, storing, handling and processing of digital data has on our
natural environment, social environment and personal environment.
It is the unsustainable handling, distribution and generation of
data resources. Data pollution due diligence means managing—in
organisational, policy and design practice—the adverse effects and
risks of data exhaust on natural, social and personal ecosystems.
Data Pollution Domains
This white paper identies and explores the most prevalent and
urgent data pollution problems and challenges in a catalogue of
eight domains within the natural, social and personal environments:
Nature, Science & Innovation, Democracy, Human Rights, Infrastructure,
Decision-Making, Global Opportunities, and Time. These domains are
explored in terms of the material and immaterial power conditions
and contexts that are aected by data pollution.
Environmentally Sound Technologies (ESTs)
Environmentally sound technologies protect the environment, are less
polluting, use all resources in a more sustainable manner, recycle more of
their wastes and products, and handle residual wastes in a more acceptable
manner than the technologies for which they were substitutes.8
International Human Rights
International human rights have their origin in the UN Declaration
of Human Rights (UNDHR), which was drafted by representatives
from regions worldwide and adopted in 1948, and has been expanded
upon via other international instruments, treaties and covenants.
Furthermore, the European Convention on Human Rights (ECHR)
was signed by 47 member states and entered into force in 1953.
Mechanisms are in place for monitoring the compliance of states
party to the UNDHR, while member states that have signed the ECHR
are accountable to the European Court of Human Rights (ECtHR). In
the EU, the Charter of Fundamental Rights, which came into force in
2009, further embeds the rights of EU citizens into EU law. Here, the
protection of personal data (article 8) as a fundamental right of EU
citizens is delineated, for example, in an extensive data protection
regulatory framework (the GDPR).
Infrastructure
Infrastructure is the immaterial and material socio-technical
organisation of space. It constitutes social, cultural and spatial
architecture that is created and directed by humans in social,
economic, political and historical contexts. Socio-technical
infrastructures are human-made spaces composed of engineered
and non-engineered processes that evolve in contexts of negotiation
and struggle between dierent societal interests and aspirations.9
The Human-Centric Approach to AI
The human-centric approach to AI concerns the ethical
responsibility of humans and the preservation of human dynamic
qualities and empowerment in socio-technical AI infrastructures.
The approach gained momentum in the late 2010s in global policy
discourses on AI as a human rights and risk-based approach of the
EU’s Articial Intelligence strategies and policy instruments and
the ethics recommendations and principles of intergovernmental
organisations, such as the OECD and UNESCO.10
Socio-technical
Society and technology are intrinsically interlinked and cannot be
understood in isolation. Society is part of technology and technology
is part of society. Technology design is a complex process constituted
by diverse social, political, economic, cultural and technological
factors.11
Sustainable Development Goals (SDGs)
The 17 Sustainable Development Goals (SDGs) of the UN’s 2030
Agenda address the balance of three dimensions of sustainable
development—economic, social and environmental—with strategies
to tackle climate change, improve health and education, reduce
inequality, and stimulate economic growth.
INTRODUCTION
Data Pollution & Power
DATA POLLUTION IS TO the big data age what smog was to the
industrial age. Our response to data pollution will develop much like
our reaction to traditional forms of pollution—just much faster and
hopefully with dedication and great force. Only a few decades ago,
a ‘nice’ automobile was big, and its toxic exhaust a distant ghost in a
dark sky accumulating over cities. In contrast, today’s automobiles
are required to adhere to legal regulations and the market demands
environmental standards and friendliness. We have environmental
laws, standards for sustainable business conduct, and awareness
of air, water and land pollution. These risk mitigation strategies
and social responses have evolved alongside a growing number
of scientic studies and tools measuring the adverse impacts of
harmful pollutants on our natural environments. An environmental
movement to tackle the pollutants of the Industrial Age has matured
and materialised in law, policy, international agreements, consumer
demands, innovation and business practices.
It is time now for an environmental movement to tackle the data
pollution of the big data era.
In 1972, the urgency of global political attention and coordination
on environmental issues was recognised at the Conference on the
Human Environment in Stockholm, Sweden. This was also where
‘Environmentally Sound Technologies’ (ESTs) were defined as
technologies capable of reducing environmental damage, while at the
same time being designed for sustainability during implementation
and adoption.12 Echoing these early ideas about the role of human
science and technology in tackling environmental challenges with
a global coordinated sustainable approach, this white paper places
data pollution as part of a global sustainable development agenda
in relation to AI. It addresses a largely undefined and inconsistently
studied area of environmental concern: data pollution caused by
specic AI technologies. Based on desk research, the author’s active
participation in the emerging global AI governance and policy eld,
and invaluable interactions with a group of experts, the Data Pollution
& Power Group hosted by the Sustainable AI Lab of Bonn University,
this white paper conceptualises and presents a preliminary outline
of the interrelated components of data pollution and the power
dynamics that challenge our societal response to this environmental
problem.
Technological tools with AI capabilities can help tackle some of
the biggest challenges identied within the sustainable development
agenda. For example, AI can help in the green transition with policy
foresight, prediction of environmental impacts, more efficient use
of limited resources and optimisation of production processes.
However, we need to ensure that the original reflections on the
sustainability of the technologies we develop and deploy today are
not overshadowed by AI hype, power struggles and competition. AI
data pollution comes in many forms, from the carbon footprint
of processing and storing the data used to train AI systems to non-
representative big data sets and biased healthcare analysis, or
invisible data micro-targeting that pollutes democratic electoral
processes. Awareness of the various forms of the data pollution of
the big data age calls for a proactive approach to the technologies we
build, govern and embed in our socio-technical infrastructures.
Public awareness of data pollution in society and among the
companies and institutions responsible for it today lags behind other
environmental concerns. We therefore urgently need to develop
and increase awareness about it. However, to do so, a conceptual
framework is needed to address the power dynamics that shape
the conditions of data pollution. Data pollution constitutes an
unsustainable distribution and exploitation of data resources. In
a big data economy, data is the main ‘currency’ and ‘resource’, and
therefore also the locus of different societal interests and power
dynamics that do not always put human, social or environmental
interests rst. As a result, data resources are distributed unevenly and
exhausted while creating imbalances in delicate personal, social and
natural eco-systems.
This white paper explores the power dynamics that are
transformed, impacted and even produced by data in our natural,
social and personal environments. The transformation of the power
(im)balances between different actors on a local, regional and global
scale is at the heart of the matter. In modern democracies, power
asymmetries are breathing data pollution. On a global scale, data
pollution is an environmental problem constituted by and further
contributing to imbalances in ecosystems of power. Data pollution
is a human condition and creation that underpins new modes of
technological colonialism—AI, digital and data colonialism—which
impacts global opportunities, democratic participation, and the very
constitution of democracy.
An important note and caveat on the voice, the ‘we’, of the white
paper: although the focus is global, the point of departure is mainly
a European context, based on the embedded experience and
perspective of the author. This means that although the white paper
does address the core structural power dimension of the global data
pollution problem, it does not claim to speak with the experience
and voice of those who are most exposed to data pollution. It also
means that there is an underlying emphasis on European policies
and regulations on data protection, AI and platforms.
This white paper attempts to make visible the connections
between the dierent actors and components of a nascent data
pollution environmental movement, with ‘sustainability’ as
the thread that links the elements of the movement in shared
understanding and a common approach. Data pollution is
conceptualised holistically as the interrelated adverse eects on the
ecosystems of our natural, social and personal environments, and a
rst attempt is made to highlight the power dynamics that shape our
identication of data pollution and societal responses to it.
This white paper has three objectives:
1. Shared Terminology
The rst objective is to delineate common ground for debate on
data pollution by outlining shared terminology that can be used to
address the data pollution of AI and its contextual interrelated power
dynamics.
The aim is to frame the negative effects of AI data in the context of
sustainable development, and place data pollution as a problem on
par with other environmental issues. As a point of departure, data
pollution is therefore described here in terms of the entire field of
the environmental impact of AI data, from the carbon footprint of AI-
related energy consumption to privacy implications for individuals.
Furthermore, terminology for an approach to data pollution which
aims to make visible and tackle the societal power dynamics of
powerful actors, hierarchies and asymmetries is delineated.
2. A Catalogue of Power Domains Impacted by Data Pollution
The second objective is to explore the key domains of the
natural, social and personal environments in which data pollution
has the greatest impact. To create common ground for the debate
surrounding data pollution with a focus on the power dynamics that
shape the eld, this white paper identies and explores the most
prevalent and urgent data pollution problems in a catalogue of eight
domains of our natural, social and personal environments: Nature,
Science & Innovation, Democracy, Human Rights, Infrastructure, Decision-
Making, Global Opportunities, and Time. These domains are explored in
terms of the material and immaterial power conditions and contexts
that are impacted by data pollution.
3. Making Power Dynamics Visible
The third objective is to make the power dynamics that shape data
pollution visible by repositioning big data and AI as environmental
risks. In the last section of this white paper, a set of questions is
posed to open up the discussion about different aspects of data
pollution in different domains and among various power actors. The
questions were drafted by the members of the Data Pollution and
Power Group and edited by the author.
1. DATA POLLUTION TERMINOLOGY
THERE IS A TENDENCY to reduce the complexity of socio-technical
change in disciplinary and sectoral silos and specific stakeholder
interests. However, complex interrelated environments know no
boundaries and, as such, a lack of coordination and translation
between various fields of expertise, stakeholder groups and interests
can limit the mitigation of the adverse environmental impacts of
data pollution. We need shared terminology and a conceptual
platform from which to pose the most urgent data pollution questions
to be addressed within the global sustainable development agenda.
As a concept in policy and business discourse, sustainability has
been articulated over the last five decades alongside the identification
of the adverse impacts of the Industrial Age on social, economic and
natural environments. From the outset, it has represented a more
holistic approach to the management of environmental risks and
impacts.13 It includes the recognition that global environmental
problems are largely the result of the unsustainable consumption and
production patterns of the Global North, coupled with widespread
poverty in the Global South.14
The potential of AI and big data technologies to tackle traditional
environmental challenges and reach Green Deal (EU) or Sustainable
Development (UN) goals has been explored extensively and is time
and again highlighted in AI and data policies. The sustainability of
AI and big data, on the other hand, is predominantly treated as a
separate eld of action in policy as well as scientic research. Here, we
want to, as van Wynsberghe describes it, treat sustainability for and of
AI data as two sides of the same coin.15 That is: we need to recognise
that AI cannot help us reach sustainable development goals if it itself
is unsustainable.
We will herein explore data pollution in the context of the
development of AI technologies and the creation of Artificial
Intelligence Socio-Technical Infrastructures (AISTIs) in particular.16
Big data is the key resource of the big data society and Big Data Socio-
Technical Infrastructures (BDSTIs),17 but it is an empty one without
complex data processing systems for analysis. Today, in the early
2020s, AI systems give meaning to big data. They have increasingly
gained traction in public and private sectors as ‘sense makers’ in the
age of big data ows. Thus, AI is used to make sense of large amounts
of data, predict patterns, analyse risks and act on that knowledge in
healthcare, manufacturing, public administration, social networking,
finance and most other areas in society. A survey of AI uptake in
Europe found that four in ten enterprises (42%) have adopted at least
one AI program, with a quarter of them having already adopted at
least two.18 Business and technology companies have generally started
rebranding their big data efforts as ‘AI’19 and, in the policy-making field,
AI has gained strategic importance worldwide. In public and private
sectors, decision-making processes are progressively informed by
and even replaced by big data AI systems. Risk assessment systems
look for patterns in the backgrounds of defendants to inform judges
about who would be most likely to commit a crime in the future.
Personalisation and recommendation systems are creating profiles
based on our personal data to decide what we see and read, and whom
we engage with online. Triage systems analyse the medical records
and the demographic information of patients to decide who gets a
new kidney. In their current form, AI systems amount to very little
without data and most of them need data to be available, accessible,
collected and stored. As the Data Governance Working Group (WG)
of the Global Partnership of AI (GPAI) highlights in a report on AI
data:
…data availability (whether data exists) and accessibility (whether data
is accessible) are the main driver behind development of products that use AI
technologies.20
Businesses, economies and policies are changing alongside the
adoption of new AI and big data socio-technical infrastructures,
and with them, so are the moral decisions and choices which are
increasingly intertwined with the complex data processing of AI
systems. Accordingly, interests in the fuel of AI—data—as a resource
to acquire, protect and share come together in efforts to direct the
development of AI in society.21
Data pollution as a term speaks into a new green movement for data
sustainability. The global environmental movement originally took
form as a response to the tangible environmental impact of industrial
development and urbanization, such as the introduction of harmful
pollutants in our natural environments, habitat reduction/changes,
the extinction of dierent species, and damage to the land, water,
and forests. Tackling this sort of environmental impact became a
driver for entire new legal frameworks and policies, and national and
international environmental laws. It transformed entire sectors, like
the car industry, and drove forward the development of new fields and
sciences, like ESTs, ‘green tech’. Today, we are experiencing a similar
process when articulating our societal response to what computer
security and privacy technologist Bruce Schneier described in 2006
as the core environmental problem of the age of big data:
this tidal wave of data is the pollution problem of the information age.
All information processes produce it. If we ignore the problem, it will stay
around forever. And the only way to successfully deal with it is to pass laws
regulating its generation, use and eventual disposal.22
We have had policy and public debate on the privacy and social
implications of big data since the early 2000s, and we are now
having more serious conversations about the carbon footprint of
data storage and processing. Moreover, society is starting to have
a conversation about the most powerful actors in this field, such as
regions, governments, intergovernmental organisations and tech
giants. Nevertheless, there is still very little awareness about data
pollution as an ‘environmental problem’ and its disturbance of
entire ecosystems. What is needed is a new green movement for data
pollution and a better understanding of the power dynamics that
shape the eld across dierent data pollution issues. In that regard,
many of the concepts of the global environmental movement and
‘sustainable development’ discourse in policy and business can be
reappropriated to help map and identify data pollution and power.
Data Pollution
Data pollution is the interrelated adverse impact that the generation,
storing, handling and processing of digital data has on our natural
environment, social environment and personal environment. It is the
unsustainable handling, distribution and generation of data resources. Data
pollution due diligence means managing—in organisational, policy and
design practice—the adverse effects and risks of data exhaust on natural,
social and personal ecosystems.
Since the mid 1990s, we have seen a transformation of our societies
enabled by computer technologies and directed by a conversion of
just about everything into various data formats (datafication).23 Big
data is a movement driven by a particular vision of the role of
digitalised data in society.24 For many years, industries, governments
and scientists have perceived big data as an end in itself, with the
promise of unlimited future uses; an endless resource that will
never run out and therefore is distinct from other natural resources,
which can be exhausted (like oil or water).25 Nevertheless, big data
is increasingly also understood as a societal force for change that,
like industrialisation, not only has brought about growth, but also
has negative consequences, including the impact that we see on our
natural environment in the form of climate change.
Two traditional usages of the term data pollution can thus be
combined:
Firstly, data pollution can be understood as the adverse impact
on personal and social environments, for instance on individual rights,
such as data protection or the right to private life, and on democratic
institutions and balances of power. Secondly, data pollution can be
understood as the material adverse eects on our natural environment,
e.g., the carbon footprint of big data.26
1. Impact on social and personal environments
Originally, the term data pollution was used to refer to the
invisible data asymmetries of power of a growing big data economy
and the datafication of individual lives and societies. As such, data
pollution came to represent the concrete adverse consequences
of big data for personal and social environments. Thus, with this
term, Schneier emphasised the very real and material effects of the
massive collection and processing of big data by companies and
governments alike on people’s right to privacy.27 Following this, ‘data
pollution’ has been expanded in the definition of a more holistic
governance approach to the adverse effects of the big data economy,
recognising that not only are personal environments at stake, but
also social environments. As we stated in 2016 in Data Ethics. The New
Competitive Advantage when defining and carving out a role for the
term ‘data ethics’ in policy and public debates on big data:28
Individual privacy is not the only societal value under pressure in the
current data-saturated infrastructure. The effects of data practices without
ethics can be manifold – unjust treatment, discrimination and unequal
opportunities. But privacy is at its core. It’s the needle on the gauge of society’s
power balance.29
Since then, Ben-Shahar has introduced data pollution in the legal
field as a way to rethink the harms of the data economy to manage the
negative externalities of big data with an environmental law for data
protection recognizing that harmful data exhaust is not only disrupting
the privacy and data protection rights of individuals, but also has an
adverse impact on an entire digital ecosystem of social institutions
and public interest:30
The concept of data pollution invites us to expand the focus and examine
the ways that the collection of personal data affects institutions and groups
of people—beyond those whose data are taken, and apart from the harm to
their privacy.31
2. Impact on the natural environment
The other strand of usages of the term data pollution addresses
the more traditional environmental impact of big data on our
natural environment. This is what Lucivero and Samuel, along
with an interdisciplinary group of scholars, refer to as data driven
unsustainability.32 The impact on the natural environment caused by
the data pollution of digital technologies is complex and, however
difficult it may be to get a full picture of, undeniable.
The French think tank advocating a shift to a post-carbon economy,
The Shift Project, estimates that the share of global greenhouse
gas emissions produced by data had increased from 2.5% in 2013
to 3.7% in 2019.33 In that regard, data centres account for 1% (and
steadily growing) of total global electricity demand. The majority
of this growth is attributed to cloud computing by the largest big
data companies such as Amazon, Google and Microsoft.34 The
impact of data-intensive technologies, such as AI, is also significant.
For example, a famous study by Strubell et al. found that training
(including tuning and experimentation) a large AI model for natural
language processing, such as machine translation, emits seven times
more carbon than an average human does in one year.35 Importantly,
the environmental pollution of data-driven digital technologies,
such as AI, is not only an issue of data, but also ICT disposal and
consequences more dicult to discern (such as consumers’ energy
consumption when making use of digital services).36
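To make such figures concrete, the following minimal sketch shows the kind of back-of-the-envelope estimate that underlies them, multiplying the energy drawn by a training run by data-centre overhead (PUE) and grid carbon intensity. The function name and every parameter value are illustrative assumptions of this sketch, not figures from the studies cited above.

```python
# Illustrative estimate of the carbon footprint of training an AI model.
# All parameter values are hypothetical assumptions for demonstration.

def training_co2_kg(gpu_hours: float,
                    avg_gpu_power_kw: float = 0.3,    # assumed ~300 W per GPU
                    pue: float = 1.5,                 # assumed data-centre overhead
                    grid_kg_co2_per_kwh: float = 0.4  # assumed grid carbon intensity
                    ) -> float:
    """Estimate the kilograms of CO2 emitted by a training run."""
    energy_kwh = gpu_hours * avg_gpu_power_kw * pue
    return energy_kwh * grid_kg_co2_per_kwh

# e.g. 8 GPUs running around the clock for 30 days of tuning and experimentation
print(f"{training_co2_kg(8 * 30 * 24):,.0f} kg CO2")  # -> 1,037 kg CO2
```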
For the purposes of this white paper, these two usages of the term
data pollution are combined with the aim to identify data pollution in
a common ecosystem of power and, accordingly, to consider actions
with a more holistic governance approach to the pollution problem
caused by the age of big data. The UN 2030 Agenda for Sustainable
Development, which was adopted by all United Nations Member
States in 2015, established sustainability as an interrelated issue to be
tackled across various elds of action. Accordingly, the 17 Sustainable
Development Goals (SDGs) of the agenda address the balance of
three dimensions of sustainable development—economic, social
and environmental—with strategies to grapple with climate change,
improve health and education, reduce inequality, and stimulate
economic growth.37 In the white paper, data pollution is addressed
similarly as not only one type of environmental impact, but rather
as the interrelated adverse effects on delicate balances in our natural,
social and personal ecosystems and environments.38
As described, the term data pollution is currently used to
emphasise the very real and material adverse environmental impact
of big data on these environments. Accordingly, the goal of a new ‘green
movement’ for big data is ‘data sustainability’, which cuts across the
SDGs with sustainability considerations connected to the various
environmental changes caused by the volume and diversity of big
data, ranging from its eects on the natural landscape to our decisions
and democracy 39 (see also the eight data pollution domains).
The assumption is that data pollution does not take one easily
identiable form. It impacts entire eco systems of ‘material’ and ‘non-
material’ environments altogether. Thus, no matter the denition
being referred to, the impact of data pollution on our social, personal
or natural environments is as ‘real’ and ‘material’ as the pollutants of
the Industrial Age and must be managed as such. This also means
that we cannot tackle one adverse eect without also tackling others.
A company, for example, cannot claim to have ‘sustainable data
practices’ by reducing its carbon footprint alone, while at the same
time failing to manage the risks that big data handling, storage and
processing poses to our personal and social environments. True data
sustainability means taking into account the entire complex of an
interrelated ecosystem impacted by the datafication of our societies.
The Holistic Approach
A holistic approach to data pollution encompasses the complexity of the
socio-technical powers of our 21st-century big data society with micro, meso
and macro level analyses. The presumption is that data pollution is embedded
in complex, very real and material interrelated architectures of powers. It is
at once a design, cultural and social, organisational and geo-political issue.
The presumption is that our identication of data pollution and societal
responses to it are simultaneously enabled and inhibited by structural power
dynamics.
Alongside the introduction of big data and AI in society, we are
experiencing a concrete transformation of the objective qualities of a
material and non-material (social and personal) spatial environment.
That is, electronic global and local digital realities have real qualities
that form the architecture of our realities. They represent existing
forces of power while also transforming them. In this way, we may
also describe our present digitalised global space as an expression
of the expansion of 19th-century capitalism and industrialisation.
In other words: we are experiencing the compression of time and
space created mainly for the operations of capital.40 Ours is a global
society characterised by forms of power, materialised in the virtual
architecture and ows of global and digitalised networks.41 That is,
societal power is no longer xed in places, like the nation state, but
is distributed in the very information architecture of socio-technical
systems.
We may thus understand data pollution and power in the context
of larger socio-technical transformations in global societies, but at
the same time knots of power are unravelled in the context of design
and engineering practices that shape spatial digital infrastructures.
In this way, we might, for example, detect a link between a lack of
reection by AI practitioners regarding the impact of their design
choices on the personal, social or natural environment and data
pollution as a global environmental problem. The complexity of
issues that move beyond traditional boundaries of practice equally
complicates ethical reflection at a local/micro level. Data pollution is
indeed a complex environmental problem in need of more holistic
solutions.
We can here use a multi-level analysis42 that encompasses
micro, meso and macro perspectives on data pollution to grasp
the complexity of their socio-technical environments and power
dynamics. The aim is to move beyond a reductive analysis of complex
socio-technical developments focusing on either the micro dynamics
of, for example, designers and engineers of a technology or, on the
other hand, only focusing on larger macro-economic or ideological
patterns. A narrow focus on data pollution in the micro contexts of
design will not comprehend the wider social conditions and power
dynamics for change, while a correspondingly narrow analysis of
macro power dynamics and social change will reduce individual
nuances and factors by making sense of them only in terms of these
larger societal dynamics. A multi-level analysis, on the other hand,
allows for the exploration of a more complex environment.43
In terms of an analysis of data pollution, this also means approaching
the issue as a movement between different scales of time44 to detect
larger patterns of technological innovation and consolidation on a
historical scale, while simultaneously understanding their specic
life cycles.45 In this way, we can at the same time understand their
political, organisational and cultural contexts.
Thus, these three levels of analysis (micro, meso and macro) are
central to the delineation of the power dynamics that give shape to
data pollution and our response to it:
On the micro level, powers and interests in data can be identified
in the very design of an AI system. Here, we want to understand the data
pollution of the very data design of AI. Where is the data pollution in
the data ecosystem of an AI program? Which interests are embedded
in the data design process? How are data design choices made? Are
there alternative, more sustainable data design options available?
What are the barriers and enablers on a micro design level for
tackling data pollution and achieving sustainable AI data?
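As a thought experiment only, the micro-level questions above could be operationalised as a lightweight audit record attached to an AI system’s data design. The sketch below is hypothetical: the field names and thresholds are invented for illustration and do not reflect any established standard or the DPP Group’s methodology.

```python
# Hypothetical micro-level audit of the data design of an AI system,
# turning data pollution questions into checkable flags. All field
# names and thresholds are invented for illustration.
from dataclasses import dataclass, field

@dataclass
class DataDesignAudit:
    dataset_name: str
    sources: list[str]                   # where the data comes from
    contains_personal_data: bool         # personal-environment risk
    retention_days: int                  # how long raw data is kept
    storage_kwh_per_month: float         # assumes energy metering is available
    documented_interests: list[str] = field(default_factory=list)

    def pollution_flags(self) -> list[str]:
        """Return micro-level warning flags for this data design."""
        flags = []
        if self.contains_personal_data:
            flags.append("personal data: privacy due diligence needed")
        if self.retention_days > 365:
            flags.append("long retention: is indefinite storage justified?")
        if self.storage_kwh_per_month > 1_000:  # arbitrary demo threshold
            flags.append("high storage energy: consider pruning or archiving")
        if not self.documented_interests:
            flags.append("no documented data interests: whose needs drove this design?")
        return flags

audit = DataDesignAudit("triage-records", ["hospital EHR"], True, 730, 50.0)
print(audit.pollution_flags())
```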
On the meso level, institutions, companies, governments and
intergovernmental organisations will be negotiating the interests,
values and cultural frameworks of their practices in contexts of
standards and laws. How are laws and standards implemented
within an organisation? Which interests are emphasised in the
implementation of institutional, standardised frameworks? What
are the barriers and enablers on an institutional, organisational and
governmental meso level for tackling data pollution and achieving
sustainable AI data?
Sociotechnical change happens on a macro level in terms of
interest negotiations in what constitutes the technological momentum
that a larger socio-technical system needs to evolve and consolidate.46
Our increasing awareness of the data pollution of AI is integral
to a current critical moment in which different societal interests
are being negotiated on a macro level in society and expressed
in cultures, norms and histories on macro scales of time. This is
where we see the conicts between dierent systems, political and
business ‘narratives’ of change and innovation, and it is where
critical problems are exposed, solutions are negotiated, and dierent
interests are nally gathered around solutions to direct the evolution
of technological developments. In this regard, we want to understand
the power dynamics of the geo-political battle between different
approaches to data and AI. How do the political and social discourses,
legal twists and cultural tensions shape how we tackle data pollution
of AI on a macro level and scale? What are the barriers and enablers
on a historical and geo-political level for tackling data pollution and
achieving sustainable AI data?
[Figure: The Three Data Pollution Levels of Analysis]
The Human-Centric Approach
The human-centric approach to AI concerns the ethical responsibility of
humans and the preservation of human dynamic qualities and empowerment
in socio-technical AI infrastructures. The approach gained momentum in the
late 2010s in global policy discourses on AI as a human rights and risk-based
approach of the EU’s Articial Intelligence strategies and policy instruments
and the ethics recommendations and principles of intergovernmental
organisations, such as the OECD and UNESCO.47
The ‘people-centred’ approach to ICT governance was the
foundation of early international multistakeholder initiatives on
the regulation of the global information society. As stated in the
Declaration of Principles published in connection with the World
Summit on the Information Society that was supported by 50 heads
of state/governments and vice-presidents, 82 ministers, and 26 vice-
ministers and heads of delegation as well as high-level representatives
from international organisations, the private sector, and civil society:
We, the representatives of the peoples of the world, assembled in Geneva
from 10-12 December 2003 for the first phase of the World Summit on the
Information Society, declare our common desire and commitment to build
a people-centred, inclusive and development-oriented Information Society,
where everyone can create, access, utilize and share information and
knowledge, enabling individuals, communities and peoples to achieve their
full potential in promoting their sustainable development and improving
their quality of life, premised on the purposes and principles of the Charter
of the United Nations and respecting fully and upholding the Universal
Declaration of Human Rights.48
A similar human rights-based approach later gained momentum
more specifically in the late 2010s in global policy discourses on
AI as the ‘human-centric’ approach to AI. It was first emphasised
in the EU’s Artificial Intelligence strategies and policy instruments
published in 2018.49 The European Commission’s Communication
on AI published in the beginning of that same year, for example,
described an anticipatory approach that invests in people as a
cornerstone of a human-centric, inclusive approach to AI and also refers
to putting the human at the centre, based on the Responsible Research
and Innovation (RRI) principle that guides research funded within
the EU’s Framework Programmes.50 The Coordinated Plan on AI
published in December 2018 outlined the unique position and global
ambition of the EU: to become the world-leading region for developing
and deploying cutting-edge, ethical and secure AI, promoting a human-
centric approach in the global context.
The human-centric approach also became the guiding framework
for the EU High-Level Expert Group on AI ethics guidelines published
in 2019. This was importantly recognised as a ‘fundamental rights-
based’ approach stemming from the EU Charter of Fundamental
Rights. The same year, AI principles that emphasised human-centred
values and fairness in particular were adopted by OECD member
countries. In 2021, the Recommendation on the Ethics of Artificial
Intelligence was adopted by UNESCO’s General Conference guided
by the more traditional human rights values-based framework to
respect, protect and promote human rights and fundamental freedoms and
human dignity.51
Regardless of the dierent institutional settings, global positions
and historical paths towards a human-centric approach to AI, these
‘ethical governance’52 initiatives have a common objective to ‘do good’
while also managing the associated risks and ethical implications
of AI. However, what is also important to note is that, aside from
an emphasis on the special role and status of humans, no shared
conceptualisation of how to achieve that goal exists.
Policy debates on the role of people, ethics and values in
technological development have been ongoing since the 1990s as a
response to accelerated progress in science—biology and medicine
in particular53 The Council of Europe’s Oviedo Convention (the
Bioethics Convention), for instance, emphasises the interest of the
human being:
Primacy of the human being. The interests and welfare of the human
being shall prevail over the sole interest of society or science.54
One contemporary critique of the human-centric governance
approach to AI evidently concerns presumed anthropocentrism,
i.e., that this approach is primarily concerned with individual people
and the human species as such.55 Yet, there is also a different way
to understand this approach. Rather than ‘human-centrism’ we may
instead refer to a ‘human approach’.56 A ‘human approach’ is one that
is concerned with the role of the human as an ethical being with a
corresponding ethical responsibility for not only ourselves but for life
and being in general. It thus follows that people’s dynamic qualities
are prioritised when developing socio-technical infrastructures of
human empowerment.57 In design and engineering contexts, the
term ‘human-centred design’ (HCD) has existed for many decades,
encompassing an approach to ICT system development that focuses
on user needs, knowledge, well-being and other factors, including
adverse eects and risks to humans. While a human(-centric)
governance approach58 does indeed foster HCD in relation to AI
systems focused on individual human beings and their needs, the
ultimate objective goes beyond the individual human being only
when considering rst and foremost the role of people and human
governance in ecosystems—e.g. the role of the empowered citizen in
the ecosystem of a democracy, the role of an individual consumer in
the ecosystem of consumption and the natural environment when
making well-informed environmentally friendly choices, or the
role of a group of democratically elected policymakers that create
policies that support environmentally sound science and technology
development. This sort of ‘human(-centric) approach’ is thus not just
about humans—it is human.59
Importantly, the human(-centric) approach also constitutes a
foundational critique of more traditional utilitarian AI development
frameworks that do not encompass a humanist, holistic reflection
on the role of people and their ‘artefacts’ in delicately balanced
eco-systems. The impact of human science and technology on the
natural environment in the form of climate change is, for instance,
evidence of the foundational problems of the utilitarian approach.
As a species, throughout history humans have proven to be both a
creative and destructive force on Earth, and while climate change is
a product of the more destructive kind of human activity, humans
also comprise a creative productive force that can readjust and craft
changes through man-made tools (technology). AI is taking centre
stage as a technology that can mitigate environmental challenges.
It can be used to understand and lessen climate change with, for
example, predictions, pattern recognition, optimisation of resources
etc. However, like most human products, it also carries with it risks
and harm that may contribute to climate change. In this context,
we can also think of the human-centric approach that is expressed
in recent AI ethical governance initiatives as faith in humanity, as
recognition that humans have the power to reflect on their impact on
and disturbance of social ecosystems and those of planet Earth, and
as an appreciation of the human capacity to be critical, to readjust
and to cra alternative realities and steer sustainable developments.
We see this expressed in policies on sustainable development and
the green transition which recognise that digital technologies (AI in
particular) are enablers of sustainable development goals in many
dierent sectors, but simultaneously that we need to also create
sustainable alternative technologies and methodologies and to
reduce the energy consumed by AI.60
This is a critical moment in which human governance, so far
driven by tunnel-vision interests, can be replaced by a more holistic
approach to the environment that embraces the complexity of such
a sensitive ecosystem.
Data Ethics of Power
Values and interests are core components of sociotechnical
change. When identifying the data pollution of AI and exploring
constructive actions to tackle its environmental impact, we need to
consider the interests in data invested in the design of AI, and also
AI and data governance. Data Ethics of Power61 is an applied ethics
approach concerned with making the power dynamics of the big
data society and the conditions of their negotiation visible in order
to point to design, business, policy, and social and cultural processes
that support a human(-centric) distribution of power.
Data Interests
A ‘data interest’ constitutes a specic need, value or goal centred on data
as a resource.62 This can be political, commercial, or scientic interests in
data or an individual’s interest in protecting or making use of their personal
data. Data interests can be found in data design and in data governance
activities.63
To develop actions that tackle AI data pollution and support
sustainable AI and data, we rst need to explore how interests are
embedded in the knowledge and values-based worldviews and
frameworks that shape the practices, development and adoption of
big data and AI systems. Data interests can be explicitly examined
during dierent design and deployment phases of AI. They can be
examined as negotiations between dierent interests in digital data.
They represent micro individual stakeholder objectives, values and
needs (for instance, the interest of developers, users, institutions
or businesses in data) or they may even represent macro cultural
sentiments or social and legal requirements. Thus, examining them
with dierent levels of analytical interpretation (micro, meso, and
macro) is important if we are to understand their interrelated power
structures.
2. THE GEO-POLITICS OF DATA POLLUTION
THROUGHOUT HISTORY, AWARENESS OF the role of data
and AI in society and the economy has emerged in varied sectors,
ranging from scientific fields and discourses of the 1950s (e.g., early
mathematics and computer science) to the business culture and
mindset of the big data hype of the 1990s, to then take form in
the global AI policy discourses of the late 2010s. In particular, the
sustainability and ethical implications of AI and data are now integral
to the main governmental and intergovernmental agendas and policy
documents of global power actors. To politically govern the macro
challenges of data pollution, a holistic and coordinated global
approach among key power actors is needed. Nevertheless,
the role of big data storage and AI processing, and the impact on
the personal, social and natural environment are currently mostly
addressed separately, and managed in distinct policy fields.64
In public discourse, the geopolitics surrounding AI has been dubbed
the ‘global AI race’. This race between world regions can be
considered the result of the scientific paradigms, histories and design
cultures of the evolution of computer and information technologies
and big data environments. Imagined and developed within the
connes of the science lab decades ago, AI systems have today evolved
and moved outside the lab into private and public sectors, extending
key decision-making processes with new critical infrastructures
infused with AI prediction and analysis. When mathematics
professor John McCarthy coined the term ‘artificial intelligence’ at
the Dartmouth Summer Research Project seminar in 1956, computer
scientists and mathematicians working in the field were primarily
focused on the automation of computation processes. However,
McCarthy wanted to explore how computation could move beyond
processing information only, to think and learn from information
like humans.65 In the following years, the field was shaped by efforts
to develop expert systems that were based on programmed rules
and human expertise. In the 1990s and 2000s, the advancement of
digital technologies—with the conversion of all types of information
from photos to audio recordings into a set of numbers that could be
processed by computers—enabled the collection of huge amounts of
data.66 This digitalised big data environment became the foundation
of what are today called ‘machine learning’ systems, the most practical
application of AI in the early 21st century. With machine learning, an
AI system no longer needs a human expert as the basis of knowledge.
Instead, the system learns and evolves with data, thereby gaining
autonomy or semi-autonomy.
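As a minimal illustration of this shift from hand-coded expert rules to systems that learn from data, consider the sketch below (using the scikit-learn library; the tiny dataset and the fever threshold are invented for demonstration):

```python
# Expert system vs. machine learning: the same decision, two eras.
from sklearn.tree import DecisionTreeClassifier

def expert_rule(temperature: float) -> int:
    """Expert-system style: a human expert encodes the rule directly."""
    return 1 if temperature > 37.5 else 0  # threshold set by the expert

# Machine learning style: no rule is programmed; the system infers the
# threshold from labelled examples and can evolve as the data changes.
temperatures = [[36.2], [36.8], [37.1], [37.9], [38.4], [39.0]]
has_fever = [0, 0, 0, 1, 1, 1]

model = DecisionTreeClassifier().fit(temperatures, has_fever)
print(model.predict([[38.0]]))  # a learned decision, no hand-written rule
```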
The competition among regional players for global leadership
in the eld of AI research, development and innovation has always
been characterised by negotiations regarding the power, values and
interests that have materialised in the invention and consolidation
of our socio-technical environments. Those interests cover various
issues, such as AI data resources, capital investment, AI technical
innovation, practical and commercially viable research and education,
and even the ‘ethics’ of data and AI. Thus, not only AI research and
innovation, but also risk mitigation and governance have become
forms of cultural positioning and competitive advantages in the
momentum of 21st-century global AI.67 Essentially, it is a competition
between dierent powers that still does not yet represent a united
trajectory for humankind or the protection of the planet. 68
The current geo-political negotiations and regional positioning
on AI, especially in terms of ethical awareness and risk-based
approaches, may appear novel in a public discourse which until only
recently was characterised by the technology worship and thrill of
the 1990s and early 2000s. However, geo-political concerns with the
adverse eects and disturbance of the natural environment and eco-
systems caused by man-made technological progress have always
been at the core of the global sustainable development agenda. The
Human Environment Conference that took place in Stockholm,
Sweden in 1972 was the rst global conference to recognise the impact
of human science and technology on the environment, stating the
urgency to collaborate and act globally:69
In the long and tortuous evolution of the human race on this planet, a
stage has been reached when, through the rapid acceleration of science and
technology, man has acquired the power to transform his environment in
countless ways and on an unprecedented scale.70
Delegations from 114 governments attended the conference,
and the pre-conference activities in Stockholm brought together
thousands of unocial observers from all over the world. This was
also where the term ‘Environmentally Sound Technologies’ (ESTs)
was coined to represent technologies (or rather entire technological
systems) that can help reduce environmental pollution while
at the same time being sustainable by design and during their
implementation and adoption.
In 1992, the Agenda 21 action plan was created at the United
Nations Conference on Environment and Development (UNCED)
(also known as the Earth Summit) held in Rio de Janeiro, Brazil,
which brought together multiple stakeholders from 179 countries
to discuss the impact of human socio-economic activities on the
environment. The agenda called for governments and other powerful
stakeholders to implement a range of strategies to achieve sustainable
development in the 21st century (among other topics), restating the
need for the development and transfer of ESTs.
Environmentally sound technologies protect the environment, are less
polluting, use all resources in a more sustainable manner, recycle more of
their wastes and products, and handle residual wastes in a more acceptable
manner than the technologies for which they were substitutes.71
In 2015, the UN adopted the 17 Sustainable Development Goals
(SDGs), equally emphasising not only the need for ESTs to help
reach those goals, but also restating that this required the adoption
of alternative, environmentally sound development strategies and
technologies.72
The ‘trustworthy’ and ‘human-centric’ AI policy agenda (described
previously as the human-centric governance approach to AI) has
evolved alongside that of a global sustainable development agenda.
The recent political narrative on AI and sustainability in particular
is thus not an arbitrary emphasis. Intertwined with the awareness
of and intention to tackle the social and ethical implications of
AI, there is also a broader global policy agenda regarding the
environmental impact of science and technology which has evolved
over decades. However, only lately have political objectives on the
role of AI in sustainable development turned into a more globally
shared political goal. Along with other changes, this happened due
to a merger between the environmental global policy agenda and
the information society/internet governance agenda relating to the
human rights implications of ICTs and the ethics of AI.
As described previously, governments and intergovernmental organisations around the world have, in recent years, proposed and presented AI ethics principles and recommendations as well as general political strategies that, while aiming to ensure good market conditions, innovation and scientific development in the field of AI, have set in motion governance activities that address the social and ethical impacts of AI. Looking at them in detail, they include several early statements and intentions on the environmental impact of AI.
Europe in particular is leading the development of policies and regulation on trustworthy and sustainable AI. The European Green Deal (2019) mentions several environmental concerns with regard to AI and stresses that ‘sustainability’ must be a core point of departure for the development of not only AI technologies, but a digitised society in general. For example, it states:
[…] Europe needs a digital sector that puts sustainability at its heart. The Commission will also consider measures to improve the energy efficiency and circular economy performance of the sector itself, from broadband networks to data centres and ICT devices. The Commission will assess the need for more transparency on the environmental impact of electronic communication services, more stringent measures when deploying new networks and the benefits of supporting ‘take-back’ schemes to incentivise people to return their unwanted devices such as mobile phones, tablets and chargers.73
Furthermore, the EU’s Coordinated plan on AI (2018) stated:
[…] AI uptake requires access to dedicated low-power AI processors that provide the necessary processing power and are more efficient, by several orders of magnitude, than general-purpose processors.74
Moreover, it mentions the intention to support research in ‘greener AI’, simultaneously addressing the energy consumption of AI and potentially including ‘environmental score’ criteria in the public procurement of AI.75 This was reviewed in 2021 with the Communication on Fostering a European approach to artificial intelligence76 and a revised coordinated plan77 restating the role of AI in reaching European Green Deal objectives and the intention to build strategic leadership in sectors including climate change and the environment, as well as a focus on building a Green Deal data space and the incorporation of environmental questions in international coordination and cooperation on AI. Importantly, this also includes the intention to explore the definition of key performance indicators to identify and measure the negative and positive environmental impact of AI, building on the European Commission’s work on resource- and energy-efficient and sustainable infrastructure for data storage and processing.
Notably, following the first launch of the EU’s Coordinated AI plan in 2018, the EU High-Level Group on AI (HLEG) was established that same year, composed of 52 selected members consisting of individual experts and representatives from different stakeholder groups. Tasked with the development of ethical guidelines and policy and investment recommendations for the EU, the group developed seven key requirements that AI should meet in order to be deemed ‘trustworthy’, one of which specifically emphasises ‘social and environmental well-being’:
AI systems should benefit all human beings, including future generations. It must hence be ensured that they are sustainable and environmentally friendly. Moreover, they should take into account the environment, including other living beings, and their social and societal impact should be carefully considered.78
The establishment and negotiation of the requirements of the HLEG’s ethics guidelines illustrate this dawning awareness of the environmental impact of AI on the social and natural environment in the region’s policy-framed ethics work. At the time of the establishment of the HLEG on AI, the European Commission had around 700 active expert groups that were tasked with drafting opinions or reports advising it on particular subjects. The work of these high-level expert groups is not binding, and the EC is independent in how it makes use of it.79 Nevertheless, when the HLEG on AI presented its ethics guidelines to the European Commission in March 2019, a communication was published shortly after: Building Trust in Human-Centric AI. In it, the Commission stated its support for the seven key requirements of the guidelines and encouraged all stakeholders to implement them when developing, deploying or using an AI system. At the end of 2019, the then newly elected President of the European Commission, Ursula von der Leyen, stated: In my first 100 days in office, I will put forward legislation for a coordinated European approach on the human and ethical implications of Artificial Intelligence.80
Notwithstanding, while most of the HLEG’s recommendations were reflected in the European Commission’s AI Act proposal that followed in 2021 in the form of mandatory requirements for high-risk AI, social and environmental well-being was not. In another part of the world, the US Department of Commerce’s National Institute of Standards and Technology (NIST) drafted an AI Risk Management Framework which was presented in March 2022: similarly, it did not include a reference to the environmental impact and risks of AI. While this does not necessarily reflect a lack of political will to act on the environmental impact of AI, what is missing is the coordination of policy efforts within and across regions, in order to reach shared goals regarding sustainability in general and in relation to AI.
The sustainable development agenda (which considers both the positive and negative environmental impacts of green technologies) only recently moved into the geo-political agenda on AI. Though gradual, there is a growing recognition that technology such as AI can help us reach sustainable development goals, alongside an acknowledgement of the fact that, in and of itself, AI can have adverse environmental consequences (as a man-made component of the environment) and, accordingly, of the urgent need to address the sustainable development and implementation of AI. For example, once the EU AI Act proposal reached the EU parliament negotiation stage in 2022, environmental considerations were included. In addition, several other policy instruments in Europe with a geo-political impact and/or attention reflect an awareness of and political willingness to act on not only the environmental fallout of AI, but on data pollution in particular. For instance, the EU Data Strategy of 2020 describes the legal components and governance approach to create a common European data space and a single market where data can be shared for thematic areas, such as the European Green Deal. Furthermore, the 2021 ‘Digital Decade’ strategy that constitutes the European Commission’s vision for the development of the European digital economy and the transformation of European businesses by 2030 emphasises the need to address the sustainability of data.81
More specically, in 2021 an increasing number of initiatives
and statements were published worldwide with the aim of ensuring
global collaboration in areas such as the development of sustainable
AI and with an emphasis on the environmental impact of AI and
data. The US and EU Trade and Technology Council (TTC) came
44
out that year with an inaugural joint statement with several items
on AI and the creation of a working group on the climate and clean
tech that is meant to identify opportunities, measures and incentives
to support technology development, transatlantic trade and
investment in climate-neutral technologies, products and services,
and, importantly, to include collaboration with third countries (as
they are referred to in the TTC Inaugural Joint Statement) to jointly
explore methodologies and tools.82 In addition, in September 2021,
the European Commission’s Service for Foreign Policy Instruments
(FPI) and the Directorate General for Communications Networks,
Content and Technology (DG CONNECT), in collaboration with
the European External Action Services (EEAS), ocially launched
its International Outreach for Human-Centric Articial Intelligence
(InTouchAI.eu)83 initiative to help promote the EU’s vision on sustainable
and trustworthy AI, a large foreign policy instrument project engaging
with international partners on regulatory and ethical issues of
AI at a global level.84 Moreover, in December 2021, the European
Commission and the High Representative for Foreign Aairs and
Security Policy launched a new European Strategy (Global Gateway)
to boost smart, clean and secure links in digital, energy and transport and
strengthen health, education and research systems across the world.85
All of this happened a couple of years into the Covid-19 pandemic, which has forced global collaboration on global challenges in many policy spheres (including the digital sphere, with Covid-19 ‘passes’ and contact tracing apps). At the same time, a number of AI documents were produced with geo-political significance, illustrating a crucial awareness among global stakeholders of the environmental impact of data and AI. Notably, and with forceful global recognition, in November 2021 UNESCO member states adopted the Recommendation on the Ethics of Artificial Intelligence which, among other things, states that actors involved in the lifecycle of AI systems:
[…] should reduce the environmental impact of AI systems, including but not limited to its carbon footprint, to ensure the minimization of climate change and environmental risk factors, and prevent the unsustainable exploitation, use and transformation of natural resources contributing to the deterioration of the environment and the degradation of ecosystems.86
Furthermore, that same month, the Responsible AI group of the Global Partnership on Artificial Intelligence (GPAI)—a multi-stakeholder initiative set up by 15 countries in 2020 that expanded to 25 country members in 2021—published Climate Change & AI: Recommendations for Government. This influential report includes key recommendations on reducing the negative impact of AI on the climate by, for example, incorporating climate impact considerations into AI regulation strategies, funding mechanisms, and procurement programmes.87 Additionally, in late 2021 the voice of companies and enterprises around the globe, the World Economic Forum, published The AI Governance Journey: Development and Opportunities, an insight report acknowledging AI as an emitter of carbon and thus the need to address the issue through global collaboration:88
As knowledgeable as we have become in tackling some areas, a considerable amount of thought and work remains on other downstream effects of AI on the planet. Though certainly championed for its potential to help tackle global issues such as climate change, the infrastructure around AI systems has also come under scrutiny for its carbon output.
It should be noted that the conceptualisation and implementation of ‘data sharing’ infrastructures for Earth observation and climate data, among other things, has received attention as part of the sustainable development agenda. It is an essential foundation for global coordination on the mitigation of climate change in particular, with several international initiatives aiming at the creation of open data spaces with global reach. One example is the Destination Earth (DestinE) project: set up as part of the European Commission’s Green Deal and Digital Strategy, its aim is to develop, on a global scale, a digital model of the Earth to monitor and predict the interaction between natural phenomena and human activities.89 Global initiatives and statements on the sustainable use of data in particular have also been set up. For example, in October 2021, at the UN’s World Data Forum, leading statisticians working in the international statistical system called for a Global Data Convention to safeguard sustainable development.90
In conclusion, it has become increasingly evident that the impact of AI data on the natural, social and personal environment deserves a place in the geopolitics of the agenda on sustainable development. Because of its adverse environmental consequences, a global digital environment simply cannot evolve without a globally coordinated political agenda that aims to ensure sustainable design, innovation, business and implementation. It is also the reason why avid attention must be paid to the political and social discourses and interest negotiations that are today shaping the geopolitical response to the data pollution of AI. As of now, awareness of the adverse effects of big data storage and processing among dominant geopolitical powers is evolving, and diverse political responses are emerging in different regions and within different frameworks and fields of governance. However, a more holistic coordinated geo-political response is still nascent. For instance, most environmental data sharing initiatives are not asking if or how the energy costs of AI, and in particular of the data required for global AI systems for climate change, should be weighed against the benefits of their deployment.
What we need to understand now is how political, social and design cultures on data and AI evolve in the context of these different geo-political agendas and the negotiation that takes place between the various societal interests invested there. On this matter, we need a clearer view of the macro powers and interests that are competing in this field, i.e., the power dynamics of the geo-political battle between different approaches to data and AI. Thus, what also needs further exploration is the level of political awareness of data pollution, and how interests in AI and data are negotiated and finally gathered around solutions that may or may not support a sustainable AI global narrative of change.
The still-dominant narrative of the early aspirations of AI researchers to achieve human-level intelligence for computers and the big data thrills of the late 20th and early 21st centuries do not fully encompass the environmental problems of data and AI. This is why we need a new narrative for AI data, one that would help reveal the challenges we are unable to see today. Adding data pollution to the geopolitical sustainable development agenda, for instance, opens up yet another global power problem which rests in the very constitution of not only multistakeholder, but also multi-community and regional participation in the creation of the emerging sustainable AI agenda. That is: those who are currently defining problems, identifying solutions and determining the speed of their implementation are the ones who already have an AI infrastructural advantage.91 In today’s information-driven and networked global society, power is integrated in very real and material digital data architectures and, as is increasingly highlighted in public debate, these digital architectures most often sustain the already-powerful while putting others at a disadvantage. The asymmetries reproduced in the architecture of the big data society and the political narratives shaping it are replicated in very different experiences of data pollution and also in the exclusion from participation in global agenda-setting discourses on trustworthy AI and sustainability. For example, while citizens of developing countries are experiencing the adverse effects of data pollution on their natural environments, rarely do they have a say in the global sustainable development debate on AI.
Power asymmetries may be traced back to the formation of AI as an innovation and marketplace dominated by a few technology giants. Firstly, we may refer here to a form of ‘digital colonialism’ (which will be further explored in the next chapter in the context of ‘global opportunities’) powered by dependency on these few technology giants, e.g., the dominant cloud providers on which AI developers depend. A recent report drafted for the Africa Policy Research Institute by Rachel Adams on AI in Africa illustrates the gap between the production and development of AI, which requires the use, availability and extraction of local human and natural resources, and the true manifestation of the benefits of AI in that same region.92 It also outlines the dominance of foreign AI technologies and their incompatibility with local development priorities. This technology dependency not only further deepens the gap between the ‘Global North’ and the ‘Global South’; it also hampers the very act of measuring the carbon emissions of AI and data.93 Encouragingly, as Adams states in the report, national AI strategies are increasingly being launched with methods and promises regarding the sustainable local development of AI in the Global South. For example, countries across different regions of Africa are developing or about to develop national AI strategies to guide AI adoption alongside the development of local AI skills and research, talent attraction, management of the sustainability of AI, the development of open data platforms, etc.
In addition, as Trisha Ray, associate fellow at ORF’s Centre for Security Strategy and Technology, describes it, while technology giants (like Microsoft, Alphabet, Facebook and Amazon) have responded to climate concerns with ‘net zero’ policies and initiatives, they often rely on what she calls the decades-old inequitable carbon offset system. At the same time, measuring their carbon emissions is a nearly impossible task due to the lack of transparency around these companies’ operations, which should include not just their own facilities, but also their broader supply chains.94
While the double edge of ESTs has always been part of the global sustainable development agenda, attention to AI data pollution in particular has, until recently, been less of a concern within the sustainable AI agenda. Considering the dominant power dynamics and competing interests represented in the evolution of BDSTIs into AISTIs, we may argue that data pollution is not just a new term for the global sustainable development agenda, but also represents the voice and counter-narrative of the minority stakeholders participating (or not) in information society policymaking in general.
3.
CATALOGUE OF
DATA POLLUTION
DOMAINS
DATA POLLUTION ISSUES TAKE many forms. Neither data pollution practices nor the environmental impact of data pollution can be identified in one area, one sphere of practices or one type of environment. The following is a catalogue of different domains in the natural, social and personal environments where data pollution can have adverse effects. They are explored in terms of the power conditions and contexts that are impacted by data pollution. For the sake of clarity, the domains are described individually. However, it is important to note that they are often intrinsically interlinked. Thus, for example, data pollution in the domain of nature will also have an impact on the domain of global opportunities and vice versa. As will be illustrated, this is most often due to interlinked power dynamics in increasingly globalised personal, natural and social environments. The objective of this catalogue is to help frame data pollution as an environmental problem that extends across several domains. Among other things, the domains arose from discussions at meetings with the Sustainable AI Lab’s DPP Group from September 2021 to April 2022.
1. Nature
To measure and mitigate data pollution in the domain of nature, there is an urgent need to ‘re-materialise’ it.96 Information and communication technologies (ICTs), data design and AI all constitute not just virtual, but also material infrastructures that pollute the natural environment in very concrete ways. However, this ‘materiality’ of infrastructure is often neglected in science, policy and business discourse on the limitless data resources of the digital economy and society.97 Re-materialising98 AI and data means recognising its entire material infrastructure, including the energy it consumes, its hardware and the electronic waste, the mining of precious minerals, its data processing and storage, and the water needed to cool data centres.99
Even so, it is a highly complex task to measure the extent of AI data pollution and its impact on nature, as this depends on a larger whole made of many individual components, such as the actual power sources used.100 In fact, when it comes to the impact
of, for example, the carbon footprint of AI, what can be most easily measured, as Kaack et al. argue,101 is most likely not that which has the largest impact, and thus efforts to align machine learning development with climate change strategies, for example, are hampered. They therefore introduce a systematic framework for describing the greenhouse gas emissions of machine learning (ML).
Data pollution in the domain of nature is largely invisible to AI practitioners, but it can still be addressed as a component of their competencies, education and technological dependencies. Tools for researchers and engineers are being developed to measure the environmental impact of the AI models they develop and train. For instance, the CODECARBON tool is a software package that can be integrated into a Python codebase and estimates the amount of carbon dioxide (CO2) produced by the cloud or personal computing resources used to execute a given set of code.102 It then shows ways to reduce emissions by optimising the code or by hosting cloud infrastructure in geographical regions that use renewable energy sources. Additionally, sustainability management mechanisms can be introduced with ‘Sustainability Budgets’ implemented in software design, incentivising and rewarding developers and designers for coming up with solutions to reduce the carbon footprint of their programs.103 Another approach is suggested by the Green Software Foundation, which recommends the creation of alternatives to large AI models that require energy-intensive big data sets and computational infrastructures, and the drafting of standards to measure the carbon footprint of an AI model.104
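To make this concrete, the following is a minimal sketch of how a tracker like CodeCarbon can be wrapped around a compute-heavy task. The `EmissionsTracker` usage follows the package’s documented pattern; the training function and the project name are placeholders for illustration, not part of any specific project described in this paper.

```python
# pip install codecarbon
from codecarbon import EmissionsTracker

def train_model():
    # Placeholder for an actual training loop; any compute-heavy
    # code can be measured in the same way.
    return sum(i * i for i in range(10_000_000))

tracker = EmissionsTracker(project_name="data-pollution-demo")  # hypothetical name
tracker.start()
try:
    train_model()
finally:
    # stop() returns the estimated emissions in kg of CO2-equivalent.
    emissions_kg = tracker.stop()

print(f"Estimated emissions: {emissions_kg:.6f} kg CO2eq")
```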
Alternative technologies and methods, and tools and frameworks for AI practitioners to measure, report on and tackle data pollution belong to the micro context of the development of individual AI products and services. Yet, data pollution is not an isolated phenomenon that can be addressed in individual contexts of development and use alone. Data pollution in the domain of nature is the result of a complex set of processes, technology design and, in particular, power dynamics defined by globally distributed technological dependencies and concentrations of power. A global analysis is important not only of the extent of data pollution; the granularity of the problem is also pivotal, especially for coordinated responses that are sensitive to the micro contexts of design and use in regions worldwide. However, such an analysis is complicated by the concentration of power in the hands of a few dominant actors in AI design and development. For example, if we want to analyse data centre energy use in regions worldwide, regional data centre statistics can only be obtained from the four main cloud service providers that most AI developers depend on: Amazon Web Services, Google Cloud, IBM Cloud and Microsoft Azure.105 In other words, data on data pollution in the domain of nature on a global scale is concentrated in the data centres of the very few actors which dominate the technological services used by AI practitioners. This concentration of power adversely impacts the essential granular analysis of data pollution and inhibits coordinated global responses, as many regions with less powerful actors are not represented in statistics on data pollution.
A variety of ethical implications of AI can be mentioned, but one of the ethical issues that is under-researched and undervalued at the moment is environmental consequences, e.g., that there is a link between the ‘pollutant effect’ of data and AI. When we train algorithms, there are CO2 emissions, electronic waste, mining of precious minerals, etc.
- Aimee van Wynsberghe, Data Pollution & Power Group, 1st Meeting Mini Report, 2021
At the data centre level, there have been calls for more reflection by software engineers and data scientists on the choices that they make and what they run on. Everything is run on big player platforms that make environmental claims that are hard to scrutinise and understand. This is where accountability comes into play as an essential element of governance: issues of transparency and the ability to scrutinise.
- Jenny Brennan, Data Pollution & Power Group, 1st Meeting Mini Report, 2021
Many companies want to use AI to address their sustainability issues in innovative ways, but often the actual impact of such projects is unclear. It is really important to create a means for assessing and evaluating impact and to help industries actually improve things, and prevent projects where in reality nothing changes.
- Lynn H. Kaack, Data Pollution & Power Group, 3rd Meeting, 2022
2. Science and Innovation
Data pollution is the outcome of data ‘in use’ within AI socio-technical systems, but we may also consider ‘data exhaust’ and ‘data spills/waste’—what is also referred to as ‘dark data’—which form a majority of the big data stored worldwide. Dark data is data that is not in use but is kept by companies and organisations for compliance reasons or future potential uses. Most of this data is never put to use.106 As the accumulation and processing of big data is an environmental problem, the big data interest of AI engineers, industries, policymakers and data scientists in the domain of AI science and innovation can also be addressed as a fundamental data pollution problem. In a big data economy, the most powerful companies and institutions are guided by a big data mindset,107 and practices are dictated by the imagination of big data as an unlimited resource that will not run out like natural resources. Big data holds the promise of endless use and reuse,108 and the collection and storage of big data becomes an end in and of itself. Success equals the ability to go beyond the limits and borders that lock in data, and everything is translated into digital data to quantify the world.109 Thus, locked into the potential of data are problems sought and solved with novel modes of data storage, collection, processing and analysis. New ways of making sense of big data become goals in themselves and, accordingly, AI is perceived as a key to opening up the potential of big data. From this point of view, data pollution initiatives and policies are essentially obstacles, and lobbying efforts on specific legal provisions in the regulation negotiation process depart from this ‘locked-in data potential’ problem, which also becomes the root of ‘checklist’ and compliance-only cultures in big data and AI science and innovation.
Tackling data pollution in the science and innovation domain requires a countervailing science- and technology-based ‘culture of data sustainability’, which must be supported in policy, innovation and education. There is a need for environmentally sound development strategies for alternative technologies and, in this regard, we might present and insist on alternative ‘realities’ of AI design that, for example, do not rely solely on big data. Chahal and Toner illustrate how research on ‘small data’ approaches to AI has grown with methods such as transfer learning, data labelling, artificial data generation, Bayesian methods and reinforcement learning.110 They argue that these small data methods are relevant when enabling AI in areas where there is, by definition, little or no data (for example in forecasting rare natural hazards or in predicting disease for a population with no digital health records), and that pretrained transfer learning models can, for instance, also reduce training time and consequently the computational resources needed to train algorithms.
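As an illustration of that last point, here is a minimal, hypothetical PyTorch sketch of transfer learning: a model pretrained on a large dataset is reused with its feature extractor frozen, so only a small task-specific head needs training, cutting both the data and the compute required. The model choice, class count and learning rate are placeholders, not a method proposed by Chahal and Toner.

```python
import torch
from torchvision import models

NUM_CLASSES = 5  # hypothetical downstream task

# Reuse a model pretrained on ImageNet instead of training from scratch.
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)

# Freeze the pretrained feature extractor: no gradients, no retraining cost.
for param in model.parameters():
    param.requires_grad = False

# Replace only the classification head; this small layer is all we train.
model.fc = torch.nn.Linear(model.fc.in_features, NUM_CLASSES)

# Optimise only the head's parameters, reducing the compute and the
# amount of labelled data needed for the new task.
optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
```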
In healthcare data analysis, we have a tendency to think that the more data, the better, or that more data accumulation means more accuracy. Algorithms will be trained with as much data as possible to get that 1% more accuracy in devices and tools, without considering the environmental impact. This is of course not unique to the healthcare sector. It is the same in many other sectors.
- Signe Daugbjerg, Data Pollution & Power Group, 1st Meeting Mini Report, 2021
We talked to academics and stakeholders addressing the issue of what to do with the environmental impact of AI, the digital world and data. We want to understand the values that drive people and how they frame the problem they are trying to address. I think important questions are: Which values are driving this? Who should have responsibility for what? What are the relationships between the different stakeholders? These are descriptive and normative questions about power relationships and data pollution.
- Federica Lucivero, Data Pollution & Power Group, 1st Meeting Mini Report, 2021
3. Democracy
Technologies are highly political and embedded in the dynamics of ruling powers by design and implementation. However, in various ways and with different degrees of force, technologies not only reflect, represent, and reinforce visible politics, established norms and institutional standards of a society; they also transform powers. Thus, just like any technology, AI holds the potential and the risk to either strengthen constitutional democracies or challenge them. How do we understand data pollution in the context of the politics of modern democracies? A democracy is founded on sensitive information balances between citizens and the state, which are stipulated in laws (e.g., human rights laws, charters and conventions), state governance, institutional procedures and frameworks regarding the conduct of elected representatives and public servants. We can refer to this as the information ecosystem of a democracy.
Modern democratic societies need socio-technical infrastructures
that reinforce and ensure a democratic ecosystem of information
distribution between states and citizens. As a citizen in a democracy,
you have the right to access information about state conduct and
the right, for instance, not to be subjected to surveillance without
reasonable suspicion. Your democratic empowerment as a citizen is
based on knowledge about how your personal information is being
gathered and processed; it means that you have knowledge about the
policies that you are exposed to in order to make informed decisions,
for example during elections.
Today, AI is ingrained in society in multiple forms via increasingly complex digital systems that have been developed to contain and make sense of large amounts of data and to act on that knowledge. Socio-technical big data and AI infrastructures (BDSTIs and AISTIs) are also a key component of the politics and the very functioning of modern democracies. Big data and artificial intelligence were key tools in the election campaigns of US Presidents Barack Obama (2012)111 and Donald Trump (2016).112 The same can be said of Indian Prime Minister Narendra Modi’s election in 2014.113 Big data and AI do not only form the infrastructure of the elections of political representatives: a democracy is reinforced and implemented by administrations made up of public authorities and public servants. That is, a democracy and its public authorities are mutually supportive as they depend on each other for their everyday operations.114 In its Automating Society annual report, non-profit research and advocacy organisation AlgorithmWatch maps the use of automated decision-making applications in the public policy sphere in Europe, looking at everything from algorithms used for grading in schools, to those used for distributing family benefits, identifying individual risk factors related to social exclusion in young adults, and helping judges understand trends arising from previous court rulings.115
Now, if big data and AI form the socio-technical infrastructure of the information ecosystem of a democracy, its procedures and its institutions, then data pollution can be considered an environmental problem that challenges its very balance. That is, while citizens’ data waste/exhaust might not be useful to the commercial actors collecting and storing the data, the creation of data-intense socio-technical infrastructures can be and has often been repurposed, for example, to enforce mass surveillance activities of states and to provide asymmetric access to the data on citizens that can be transformed into black box manipulation of voter behaviour during elections.
As one of the instrumental EU policymakers behind the EU General Data Protection Regulation (GDPR), Paul Nemitz, argues, the accumulation of digital power that shapes the development and deployment of AI is a threat to the human rights, democracy and rule of law that are the cornerstones of liberal constitutions.116 Several examples can be provided to illustrate the adverse impact of data pollution in the domain of modern constitutional democracies. The Cambridge Analytica scandal is the most famous. It revealed a British consultancy firm’s use of machine learning to analyse the data of 87 million people worldwide to influence democratic processes
in the US and the UK.117 The Artificial Intelligence and Democratic Values Index (2021) issued by the Center for AI and Digital Policy (CAIDP) assesses AI adoption and policy strategies in 50 countries worldwide in terms of their democratic impact. It mentions the adverse effects of AI on the social environment, such as China’s use of AI to score citizens in terms of their allegiance to the state, and to control and target ethnic minorities and protesters as the source of widespread fear and scepticism.118 Other examples are ‘infrastructural’ in nature (see the Infrastructure domain), as when facial recognition systems in countries from Austria to the US are embedded in public spaces and used for surveillance purposes, or when AI is used for predictive policing, thereby predetermining human destinies based on historical data.
During the pandemic, the digitalisation of politics accelerated. Before that, we felt the effects of data pollution in the US presidential election in 2016 with Cambridge Analytica and Donald Trump’s election. We saw how a software program could influence choices and the ballots by targeting individuals with personalised ads. We also understood that data protection is not only about privacy protection; it is also about the public sphere, and we have seen that its impact can take form in terms of private, public and even global harm. It can actually damage an entire political and electoral system and our democracy.
- Sebnem Yardimci-Geyikci, Data Pollution & Power Group, 1st Meeting Mini Report, 2021
4. Human Rights
In terms of human rights, data pollution constitutes a corrosion of the international system that protects the rights of people everywhere. International human rights as we know them today were developed, institutionalised, standardised and embedded into international agreements, laws, procedures and practices over decades. Nevertheless, in the past ten years, not only was the legal implementation of human rights challenged in the context of emerging ICTs and the development of a digital sphere, but the very justification of specific human rights, such as the right to private life, has been questioned. International human rights have their origin in the UN Declaration of Human Rights (UNDHR) that was drafted by representatives from regions worldwide and proclaimed in 1948, in addition to other international instruments, treaties and covenants. Furthermore, the European Convention on Human Rights (ECHR) was signed by 47 member states and entered into force in 1953. Mechanisms are in place for monitoring the compliance of state parties to the UNDHR with their human rights obligations, while ECHR signatories are accountable to the European Court of Human Rights (ECtHR). In the EU, the Charter of Fundamental Rights, which came into force in 2009, further embeds the rights of EU citizens in EU law. Here, the protection of personal data (Article 8) as a fundamental right of EU citizens is also delineated in an extensive data protection regulatory framework (the GDPR).
The development of BDSTIs in the late 1990s to early 2000s was
enabled by the accumulation, centralisation and tracking of digital
data across geographical territories and legal jurisdictions. As they
were integrated in society, individual human rights protections were
increasingly challenged and held up against interests of, for instance,
states to control and gather intelligence, or the interests of data-based
business models of internet platforms. Arguments against the right
to privacy were legitimised by invasive state and business practices.
In this way, privacy got a bad name for itself119 and was at one point even
described as no longer a social norm.120
Data pollution in the human rights domain amounts to a gradual corrosion of the environment of an established human rights system. The ECtHR has on several occasions interpreted and made decisions on the challenges that the progress of digital technologies poses to the ECHR’s territorial definition of jurisdiction.121 In 2013, following Edward Snowden’s revelations of a global mass surveillance intelligence system, the United Nations General Assembly even had to reaffirm that the same rights that people have offline must also be protected online.122
Nevertheless, international human rights are also gaining a foothold as instruments for citizens and citizen advocacy organisations to formally challenge and hold accountable the entities responsible for environmental problems that are directly affecting human health and well-being. For instance, although the ECHR does not contain a specific right to a healthy environment, environmental issues that affect the rights of people, such as the more traditional forms of pollution (e.g., industrial emissions and hazardous waste), are increasingly being brought before the ECtHR.123 Similarly, data pollution issues that affect individual rights are also increasingly often being legally challenged as part of a human rights framework. Famously, in 2013 following the Snowden revelations, Austrian lawyer and founder of NOYB Max Schrems filed a complaint with the Irish Data Protection Commissioner stating that Facebook Inc. (today renamed Meta Platforms Inc.) was illegally sharing his personal data with the US. This case led to the invalidation of the Safe Harbour data-transfer agreement between the EU and US, which is still being renegotiated today.124
Data pollution and AI pollution is about the deterioration of human rights on an intra-generational as well as intergenerational scale. Under this umbrella of data and AI as a deterioration of human rights, we can think of data ethics. There is a list of ethical issues related to the data life cycle of AI: how data has been collected, how it has been acquired, how it has been sourced, how it is stored, how it has been labelled.
- Aimee van Wynsberghe, Data Pollution & Power Group, 1st Meeting Mini Report, 2021
5. Infrastructure
Our personal, social and natural environments are increasingly extended and transformed by BDSTIs and AISTIs (see glossary). The social environment is sustained and altered in and by social networking platforms that extend real-life social relations. The same happens in the personal environment when identity is expressed in profiles and feeds or accumulated through profiling algorithms. Moreover, the natural environment is transformed and extended by constantly evolving socio-technical infrastructures.
In this regard, we can think of the way in which the personal, social and natural environments of our global society have evolved. In the modern world, reduced travel times and costs shorten the distance between different places, and the development of global means of communication (from the telegraph to the World Wide Web) has transformed human experience and representations of time and space. This has amounted to an annihilation of space through time that has transformed the very objective qualities of time and space.125
Essentially, dierent types of socio-technical infrastructures
(such as BDSTIs and AISTIs) are the ‘human’ components of the
environment.126 What this means is that humans actively invent,
create, design and repair socio-technical infrastructures, and they
also negotiate, compete with and exercise power over it.127 The
infrastructure that shrunk the globe, from railroads to information
highways, was made by humans and, as such, it is imbued with
human controversies and interest negotiations, which many times
throughout history has resulted in the domination of one people
and social group over another. Infrastructure is made up of sites of
human power struggles and social conicts128 invested with dierent
visions and hopes regarding the human occupation of space.129
As argued elsewhere, AI and big data have become the socio-
technical infrastructure of power and access (AISTIs and BDSTIs) in
society today:
The computer hardware and soware of the AI systems that store and
handle big data are embedded in society and, like bridges, streets, parks,
railroads and airways, they form our spatial environment, although they
are dierent from traditional infrastructures. Roads and bridges, for
instance, form the basic material architectonics of society and thus provide
or limit access to places, but they are passive, so to speak, when mapping
and expressing human motives, morals and social laws. AISTIs transform the
very objective material qualities of space. Quite literally they transform space
into interconnected digital data […] AISTIs also lock us in specic positions,
providing or denying access based on the processing of personal data. They
are mediating spaces.130
The human infrastructural power component of personal, social and natural environments can also be affected by data pollution. And, just like air pollution, where human exposure is increased by the concentration of pollutants in the air, the adverse human impact of data pollution is increased by the concentration of data pollutants in BDSTIs and AISTIs.131 The space of flows of the network society consists of electronically linked places dominated by managerial elites.132 That is, power is distributed in the very design of information infrastructures, and thus data pollution emerges when the concentration of data power in specific ‘places’ in socio-technical infrastructure is high and adversely impacts the power distribution between different actors in society.
In previous sections, the accumulation and concentration of data possession, storage and access has been illustrated to negatively impact our social and personal environments in terms of chipping away at human rights (such as the right to privacy) and democratic societies. However, what is less obvious is that it also changes the natural environment in terms of response and mitigation: as mentioned above, it hampers a granular analysis of the carbon footprint of major AI models on a global scale, as regional data centre statistics can only be derived from the managerial elites.133
Issues to consider in relation to AI sustainability are therefore
who has access to data and data collection, which is currently
very asymmetrical, for designers and scientists, but also for
governments and companies that are not part of the big tech
ecosystem. The idea of pollution is associated with something
that is excessive, something that by its sheer size and volume
is damaging. How can that damage be better controlled? If not
through size, then it should be through other means.
- Carolina Aguerre, Data Pollution & Power
Group, 1st Meeting Mini Report, 2021
6. Decision-Making
Data pollution in the design of autonomous or semi-autonomous decision-making systems can have adverse impacts in the human decision-making domain. In the realm of everything from civic participation to social networking, judicial practice and many others, decision-making is increasingly extended by components of machine learning data analysis (also referred to as ADM systems) that create evidence for, support, and/or replace the process of arriving at a point in which a decision can be made, often even making the decision itself.
The effects of data pollution in the human decision-making domain are most profoundly expressed as the reinforcement or creation of bias and discrimination in society. Friedman and Nissenbaum explored bias in the design of early 1990s computer systems supporting decision-making processes in various domains.134 They illustrated how computer systems unjustly benefitted or put some groups at a disadvantage, and developed three categories with which to discern bias in the design of systems:
Pre-existing bias comes from outside the computer system. It can be individual or social, and it already exists in social contexts and in the personal biases and attitudes held by the developers of the system. This type of bias is embedded in a computer system either explicitly and deliberately or implicitly and unintentionally by institutions or individuals.
Technical bias comes from technical constraints or limitations, like imperfections in pseudorandom number generation that, for example, systematically favour those at the end of a database (see the sketch after this list).
Finally, emergent bias appears in the context of use of a computer system.
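To illustrate the kind of ‘technical bias’ Friedman and Nissenbaum describe, here is a small hypothetical Python sketch of modulo bias, a classic imperfection in naive pseudorandom selection. Which records end up favoured depends on how raw random values are mapped onto the database; in this particular construction the over-represented entries are the low indices.

```python
import random
from collections import Counter

DB_SIZE = 100  # hypothetical database of 100 records

# Naive selection: draw a random byte (0-255) and wrap it onto the
# database with modulo. Because 256 is not a multiple of 100,
# indices 0-55 can be produced by three byte values each, while
# indices 56-99 can only be produced by two -- a systematic,
# purely technical bias in who gets selected.
draws = Counter(random.randrange(256) % DB_SIZE for _ in range(1_000_000))

print("hits for index 10:", draws[10])  # ~1.5x more often than index 90
print("hits for index 90:", draws[90])
```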
We might similarly address the data pollution of decision-making environments as a simultaneously social and technical component of the very data design of an ADM system embedded in a decision-making environment: an ADM system trained on an existing societal language and discourse (for example news articles) pollutes a socio-technical decision-making environment when ‘existing bias’ in society is reproduced in the ADM system. Bolukbasi et al. (2016), for instance, found that the ‘word-embedding’ machine learning methods that are most commonly used for language processing in online search engines were reinforcing societal gender bias because they were trained on Google news articles. Words such as architect, philosopher, financier and similar titles were grouped together semantically as extreme he words, whereas words such as receptionist, housekeeper and nanny were grouped together as extreme she words.135 Similar data pollution in the decision-making domain may also happen when minorities are not included in data design teams and when their interests are not reflected in data design, or generally when the developmental phase of AI is framed within specific cultural contexts.136
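A minimal sketch of the kind of probe behind such findings follows, assuming `embeddings` is a mapping from words to pretrained vectors (loaded, for example, with gensim). The word list and the simple he-she axis are illustrative simplifications, not Bolukbasi et al.’s exact method.

```python
import numpy as np

def he_she_projection(embeddings, word):
    # Project a normalised word vector onto the he-she direction:
    # positive values lean towards 'he', negative towards 'she'.
    axis = embeddings["he"] - embeddings["she"]
    axis = axis / np.linalg.norm(axis)
    vec = embeddings[word] / np.linalg.norm(embeddings[word])
    return float(np.dot(vec, axis))

# Hypothetical usage, with embeddings loaded elsewhere:
# for w in ("architect", "philosopher", "receptionist", "nanny"):
#     print(w, round(he_she_projection(embeddings, w), 3))
```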
Data pollution can also emerge as a ‘technical bias’ of, for instance, a data classification system that fails to include representative data on minority groups. In one recent example, a study of patients in Boston (USA) revealed how an ADM system used to score the health status of patients waiting for a kidney transplant assigned healthier scores to African Americans on the list (thereby potentially affecting decisions regarding their eligibility for a transplant), because it included race as a category in the design of the system.137
Finally, socio-technical decision-making environments may be polluted when biases ‘emerge’, as ADM systems begin to take precedence over human decisions. For example, when exams were suspended due to the COVID-19 pandemic in the UK in 2020, the British exam board, Ofqual, deployed an algorithm to generate student grades. This meant that teacher assessments of each individual student were replaced by an algorithm which took into account their school’s past performance. The result was that the grades of students from large state schools decreased, while the grades of those attending smaller fee-paying schools increased.138 In another context, the Dutch digital welfare fraud detection system SyRI (Systeem Risico Indicatie) was used to detect the likelihood of an individual committing tax or benefit fraud by analysing large data sets from different sources. Discrimination emerged quite clearly when the system was deployed primarily in low-income neighbourhoods.139
In the human decision-making domain, we may consider two forms of data pollution:
Intentional data pollution
In the development of machine learning systems, data pollution is traditionally associated with training data that causes the system to learn an incorrect model, resulting in the misclassification of samples or actions that run contrary to set objectives. Data pollutants can be intentionally inserted in training data with malicious intent. For instance, Microsoft’s AI chat bot Tay (released on Twitter in 2016) was trained on interactions with people on Twitter. It suddenly started posting offensive tweets and had to be shut down. Microsoft claimed that Tay had been ‘attacked’ by internet trolls with offensive language and had evolved based on these interactions.140 Intentional data pollution often does not result in the data pollution of decision-making environments, as the polluted system simply will not work according to set objectives and is thus never deployed in practice—or only deployed briefly, as was the case with Microsoft’s Tay.
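The mechanics can be illustrated with a small, hypothetical scikit-learn experiment: flipping the labels of part of a synthetic training set (a crude stand-in for intentionally inserted data pollutants) and observing test accuracy degrade. The dataset, model and flip fractions are all placeholders, not a reconstruction of any real attack.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

def accuracy_after_poisoning(flip_fraction):
    # Flip the labels of a random fraction of the training set,
    # simulating maliciously inserted training-data pollutants.
    rng = np.random.default_rng(0)
    y_poisoned = y_tr.copy()
    idx = rng.choice(len(y_tr), int(flip_fraction * len(y_tr)), replace=False)
    y_poisoned[idx] = 1 - y_poisoned[idx]
    model = LogisticRegression(max_iter=1000).fit(X_tr, y_poisoned)
    return model.score(X_te, y_te)  # accuracy on clean test data

for f in (0.0, 0.2, 0.4):
    print(f"{f:.0%} labels flipped -> test accuracy {accuracy_after_poisoning(f):.2f}")
```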
Unintentional data pollution
The greatest impact on the human decision-making domain is most often the result of unintentional data pollution. That is, when ‘pre-existing’ social and individual biases are embedded in the data design of ADM systems, in training data or by non-representative development teams, or when bias ‘emerges’ after the ADM systems have been embedded as normative socio-technical infrastructure in human decision-making domains. In these cases, the reproduction and reinforcement of the bias of an ADM system will result in discrimination against individuals or groups in the specific domain.
There are several examples of decision-making domains being affected by data pollution. The most profound effects are found in examples where AI systems and tools are adopted and implemented as normative socio-technical infrastructures, taking precedence over human decision-making processes. For example, a report on algorithmic bias by a group of researchers from the University of Chicago states that biased algorithms are deployed throughout the US healthcare system, generally having an influence on decisions about how patients are treated by hospitals, insurers, etc. Such ADM systems are often allowed into the decision-making domain of healthcare without vetting and oversight and have been used for decades without assessment of their ethical impact.141,142 Another such example is the mobility service Uber’s ‘robo-firings’. The company employs algorithms to discover fraudulent activity by drivers, firing them without human intervention. Several such cases have illustrated unchallenged errors and their very real human consequences. In 2020, four drivers in the UK and Portugal who had been dismissed based on decisions made by the algorithm filed suit against Uber, claiming that algorithm-based firing violates Article 22 of the GDPR, which establishes a prohibition of decision-making based solely on automated processing.143
The human-centric approach stands out in particular in the data pollution decision-making domain as an approach that ensures the development of socio-technical infrastructures of human empowerment, involving human life, experience and critical empowerment in the very data design, governance, use and implementation.144 As Pasquale illustrates with a number of case studies in health care, education and media focusing on his first law of robotics, AI that complements rather than replaces human expertise realises important human values.145
Take, for example, the introduction of bias in our data sets. When data sets under-represent a minority group, because they were not included in trials. When you set up a research study or a trial, you will have very strict parameters for who will be included in each trial, and it is rarely the most ill patients, the elderly or the homeless, for example. The ones we need to help the most are often the least represented. When algorithms are made based on historical ‘medical facts’ from these studies, these population groups will not be represented.
- Signe Daugbjerg, Data Pollution & Power Group, 1st Meeting Mini Report, 2021
7. Global Opportunities
Building inclusive and sustainable economies and societies on a global scale by creating global opportunities is an ambition rooted in the UN Agenda on Sustainable Development. However, this ambition is being hampered by data pollution. Data pollution in the global opportunities domain is ‘data colonialism’, as it reinforces existing social hierarchies and colonial power dynamics that impact the distribution of global opportunities.146
Uneven world power dynamics, geopolitics, territorial expansion and globalisation processes have, throughout history, shaped the opportunities (or lack thereof) of nations and people. Likewise, although they offer new economic and social opportunities, unbalanced processes of digitalisation have reinforced divides between the Global North and the Global South from the outset. Accordingly, the first global summit on the information society, the ‘World Summit on the Information Society’ (WSIS), was established at the UN General Assembly in 2001 with the stated aim to develop and promote an inclusive global information society, enable the opportunities of new ICTs for all, and address emerging digital divides.147 Nevertheless, today in 2022 it is evident that the big data and AI ‘revolution’ has made the greatest difference in terms of opportunities in the economies of developed nations, while developing nations are being left behind. At the same time, the impact of data pollution is felt most profoundly in communities that have traditionally been the most exposed in terms of global power dynamics. Existing inequalities rooted in a colonial history are replicated and reinforced in the new digital data systems of power, with data pollution most profoundly impacting already-vulnerable communities once again. Browne, for instance, describes the African American experience of digital
surveillance as nothing new.148 In fact, it builds on histories of being subjugated to acts of surveillance, violent branding and control. Cieslik and Margócsy similarly describe the creation of datafication infrastructure during the colonial period as the foundation of modern-day asymmetries of power replicated in systems of datafication.149 They refer to cases in which data resources are extracted from the Global South to be analysed in the Global North, often exploiting the insufficiency of legal frameworks in the protection of the rights of local citizens.150
Data pollution is data colonialism, a colonisation and commodification of everyday life,151 and a component of new forms of digital colonialism,152 that is, the constitution of global power dynamics and geo-politics in which technological dominance creates the political, economic and social domination of some nations and peoples, or some communities, over others. It is a form of colonialism that is therefore also actively resisted in local communities and regions affected by it.153
In this regard, we may describe data pollution in the global opportunities domain much like Mejias and Couldry describe data colonialism, as a type of global exploitation of people and resources:
Instead of territories, natural resources, and enslaved labour, data colonialism appropriates social resources. While the modes, intensities, scales and contexts of data colonialism are different from those of historic colonialism, the function remains the same: to dispossess.154
In continuation of this line of reasoning, with an emphasis on
the specic socio-technical evolution of AI, Mhlambi describes
AI colonialism as a colonising impact shaped by dependencies.155 AI
implemented in the Global South is developed primarily by outside
companies and thus exacerbates imbalances in the distribution of
power and resources—for instance, when ADM systems implemented
in local contexts reflect Western values only. A series of articles
in MIT Technology Review by senior AI editor Karen Hao et al.
illustrates this new form of AI colonialism in case studies in which
the implementation of AI repeats colonial orders.156 In South Africa, AI
surveillance tools reinforce digital apartheid and racial hierarchies;
in Venezuela, AI industries exploit cheap labour for data-labelling;
and in Aotearoa (the Māori name for New Zealand), language
models trained on dominant languages are being challenged by an
indigenous couple and their non-profit radio station. Hao describes
AI colonialism as a new form of colonialism similar to the European
variety that began with the 16th-century expansion and ‘discovery’ of
land. Though they are different, both are driven by profit and
power interests:
The AI industry does not seek to capture land as the conquistadors of the
Caribbean and Latin America did, but the same desire for profit drives it to
expand its reach. The more users a company can acquire for its products, the
more subjects it can have for its algorithms, and the more resources—data—
it can harvest from their activities, their movements, and even their bodies.157
Data pollution in the ‘global opportunities domain’ is an adverse
effect that reinforces existing power imbalances in global society
across different levels. The early efforts to address global inequalities
and to create a more balanced information society in the development
of the WSIS processes in the early 2000s have now evolved into the
development of regional laws and requirements. This is particularly
true in Europe, where a tough legal stance on data protection, the
Digital Markets Act (DMA), Data Governance Act (DGA), Data Act (DA)
and Digital Services Act (DSA), and a new AI Act proposal are laying
down rules and requirements for digital platforms and developers,
turning the region into a ‘regulatory superpower’. As described in
Chapter 2 of this report, worldwide recommendations, policies
and strategies are addressing the ethical and social implications
and sustainability of AI. New responsibilities arise for companies
and developers, lawmakers and citizens. However, what is often not
accounted for in these global risk mitigation strategies (dictated by the
institutions, civil organisations and companies of Western developed
economies) is that countries and regions have moved at different
speeds and are at different stages of global ICT and digitalisation.
Developed countries are historically responsible for the majority
of data pollution in the personal, social and natural environment.
However, this is not accounted for when asking developing countries
(which are just now catching up in the global ICT race) to shoulder
the burden of newly established responsibilities and compliance.158
Thus, the global response to data pollution not only constitutes a
response to unequal opportunities and global power dynamics. It
is in and of itself a form of power that may even reinforce existing
global opportunity divides between regions, countries and peoples.
Globally coordinated responses to data pollution represent the
power to identify problems and design their solutions; the power to
decide how, and at what speed, to implement them; and the power to
set priorities that correspond with the cultural and economic context
and capacities of, for instance, developers and citizens. It is a power
that most often comes with the loudest voice and the greatest existing
force, backed by more resources and more political and economic power.
But it also comes at the expense of other, less powerful voices in the
global context. In this way, not only does data pollution deepen the
disadvantage of those already worst positioned by replicating existing
asymmetries of power; our data pollution mitigation strategies also
risk further bolstering these divides.
My rst attempt is to look at sustainability from the 2030
Agenda with its 17 SDGs, 169 targets and 232 indicators, but
more importantly to look at the gaps that emerge from a data
pollution lens in this programme. There is a need to assess how
aected and disenfranchised/marginalised communities may be
able to devise their own ideas as to what data about them and
their environment should be and look like.
- Carolina Aguerre, Data Pollution & Power
Group, 2nd Meeting Mini Report, 2021
The majority of the costs are hidden from us. They are in the
backyards of vulnerable communities. They are hidden from us
because we don’t live with those consequences every day. Here,
I think there is also a risk of imperialism. India and Africa are
trying to get into the AI space and they look at the European
Commission or the United Nations ethical guidelines for AI.
And they say: you have had all this time to develop your AI and
models and data and now you are telling us that we have to do it
in this ethical way. What is the solution to that? Because in this
way we are accelerating digital divides all over again.
- Aimee van Wynsberghe, Data Pollution &
Power Group, 1st Meeting Mini Report, 2021
8. Time
Data pollution can have damaging effects on various forms of space.
For instance, it may damage social spaces, altering the spatial outline
of the information architecture of modern liberal democracies; it
may damage the physical space of our natural environment with
greenhouse gas emissions; or it may intrude on the protective layers
of our personal spaces. However, the very temporal constitution of
our natural, social and personal environments can also be polluted
with data. The data design and classification models of AI will always
only take into account what is useful to the system, i.e., the specific
interest the system is designed for. In this way, AI systems may
ultimately reduce dynamic cultures and multiple experiences in the
interest of the AI model. Qualitative pasts and multiple futures do
not make sense in and of themselves in AI systems.159 The Ofqual
algorithm mentioned before, for example, reduced students’ personal
qualitative lives and performances to an effect of their schools’
historical performance. AI tools developed for judicial systems are
another example: created to support a judge’s decision, there are AI
systems that process case law and then present a summary decision.
When systems such as these become normative in judicial contexts,
privileging the quantitative AI analysis of case law decisions over the
qualitative contextual judgement of the individual judge, they also
lock his future choice into the mass of these “precedents”, as a Council
of Europe charter has described it.160
AI’s temporal rationality is a form of data pollution that limits human
action and individual responsibility. Most often it disempowers
the experiences and voices of less powerful communities, which are
reduced to mere instantaneous data that the systems use and act on
according to the dominant interests of society. For example: when
minority groups are underrepresented in data used as the basis for
decisions on social benefits, when critical scientific medical
analysis only benefits one privileged group, or, on the other hand,
when a minority group is overrepresented in data in a way that
puts them at a disadvantage in society, such as data from specific city
zones used for predictive policing.
Last but not least, it is important to consider the
intergenerational dimension – what we do with data right now
will affect the future. Or, to put it differently, we have power over
the future and over intergenerational balances with our current data
practices. We are shaping society with our current practice.
- Pak-Hang Wong, Data Pollution & Power
Group, 1st Meeting Mini Report, 2021
4.
DATA POLLUTION
QUESTIONS
PHILOSOPHER GILLES DELEUZE ONCE stated that ‘true
freedom lies in a power to decide, to constitute the problems themselves’.161
It is important to ask what kind of reality the problems we include
in the global sustainable development agenda present and who
has an interest in solving these specific problems. Who identifies
the problem and who creates the solutions? Crucially, which and
whose problems are not being considered? Who is left out of the
‘problem solving’ and ‘agenda setting’? In other words, to solve an
environmental problem for the planet and for all, we also need to be
ready to address new problems and ask new questions.
In an era of big data and AI innovation and competition, the
adverse side effects of data pollution on our social, natural and
personal environments are currently mostly addressed as isolated
side effects within specific domains. They are not considered at
large and together as components of socio-technical spaces of
empowerment or disempowerment.
Stating data pollution as a problem is in and of itself a challenge
to existing power dynamics, and the questions we ask based on this
new framing of an environmental problem will guide solutions and
global societal engagement in mitigating the environmental impacts
of AI data.
In this last section of the white paper, big data and AI are restated as
environmental problems, with questions that will open up a discussion
about different aspects of data pollution in different domains and
among various power actors. The questions were developed by the
members of the Data Pollution and Power Group and edited by the
author.
AI in Society
Whose narrative about the role of AI in society do
our AI tools, science and conceptualisation serve? And
who bears the greatest risk?
Is some environmental data pollution more
acceptable than others? Should we differentiate between
pollution in certain fields based on needs or importance?
For instance, are CO2 emissions from healthcare
algorithms more acceptable than CO2 emissions from
the use of social media, banking or shipping?
Who is responsible for finding solutions
to the different types of issues arising from the data
pollution domains?
What does a coordinated response to the complexity
of data pollution look like?
What is the current dominant viewpoint on data in AI
science and innovation? Which views remain invisible?
Science, development and standard-setting
Who are ‘tech workers’? How do their individual
cultural, social and economic contexts and skill sets
influence the data design of AI?
Can we describe the life cycle of AI for an environmental
and sustainability impact assessment?
Should dierent measures be taken or considered for
the training or development period of AI? If so, how can
this be made explicit?
Can standards for data infrastructures incorporate a
data pollution aspect?
How can we make data pollution that impacts the
natural environment transparent? Which parameters
should be included in a transparent CO2 analysis?
Is there a way to create a carbon footprint calculator
that easily, transparently and affordably reports on carbon
emissions, which can be compared within and across
sectors?
What are the existing technological dependencies
in society and for AI developers? Are there new
dependencies arising that may increase data pollution
in specic domains?
Who has an interest in the data of the design of a
specic AI system? And how are these interests met
by design? Are there conicts of interests and how are
they resolved by design?
How do we ensure equal representation of
impacted population groups in AI data design? How
do we ensure the transparency of baseline data used
for calculations/predictions?
How are dierent AI methods related to dierent
levels of energy consumption?
How do we redene the value criteria for AI developers to
include data pollution considerations?
What does a good AI model that is both ecient and sus-
tainable look like?
Policy, law and international collaboration
How does the changing relationship between states and
big technology companies impact politics? Does it require
a new social contract and how will this new social contract
be designed? What role will AI play in these new social
contracts?
How do we make the data pollution problem more
comprehensible to a wider field of policy makers?
How can the issue of data pollution be included
in national, regional and international sustainable
development policy and strategies?
What are the mechanisms for establishing a clearer
link between big data/AI sustainability and the SDGs?
The development of data-sharing infrastructures for
climate and planetary observation data with the use of AI
technology is a key focus among world powers. Is there a
way to weigh the environmental benefits of such
initiatives against their energy costs and impacts on
the natural environment?
Education
In many sectors, data pollution is not discussed or
deemed important. What is the best way to raise awareness
of the issue?
How do we change the conception of ‘just because we
can develop it, we should’ to ‘do we really need this? How
and when will it be benecial? Who will it benet?’
What can we ask from developing countries in terms
of data pollution responsibilities and compliance? Do we
need to develop a baseline for ‘survival emissions’ and
‘luxury emissions’?
CONCLUSION
A Data Pollution Movement
THIS WHITE PAPER OUTLINES the connections between the
dierent actors and components of a nascent environmental data
pollution movement with ‘sustainability’ as the thread that links
its elements in a shared understanding and approach. The main
objective is to ensure that data pollution of AI in particular is included
in the global sustainable development agenda.
Data pollution is an environmental problem with interrelated
adverse impacts on our natural, social and personal environments.
It is the unsustainable handling and distribution of data resources
defined in a global society with power dynamics that are transformed,
affected and even produced by interconnected streams of data. Data
pollution reinforces and affects asymmetric power balances between
actors on a local, regional and global scale. This is why we need a data
pollution movement.
The data pollution movement is already taking form. In the
policy and legal space, several policy initiatives have recently
been negotiated and put in place to address the sustainability and
ethical implications of the adoption and implementation of AI and
data-based systems and technologies. Governments worldwide
and intergovernmental organisations have presented AI ethics
principles and recommendations, several with a special focus
on the sustainability of AI and its environmental impact. Since
2017, no fewer than 60 countries worldwide have adopted artificial
intelligence policies.163 The EU in particular has taken the
strongest regulatory position: a comprehensive European data
protection regulatory framework was adopted in 2016 to curb
threats to privacy and individual empowerment in an age of massive
collection, storage and use of big data. In 2018 the EU’s AI Strategy was
adopted, and in 2021 the world’s first AI law proposal was published
with a risk-based approach.164
As expected, we have also seen the emergence in the tech industry
of new AI and data companies with an ethical agenda, such as the
Finnish privately held AI lab Silo AI, which builds human-centric
AI solutions to support rather than replace humans in various
work situations, all under the slogan ‘AI for People’. Larger,
more established companies are also increasingly differentiating their
business practices with an ethical stance on data. This includes
consumer tech giant Apple, whose CEO, Tim Cook, for years used
the argument that he ‘sells products, not user data’ to differentiate
the brand from its Silicon Valley competitors. In this sphere, we are
also seeing the emergence of AI ethics and sustainability claims and
initiatives. Unfortunately, however, many activities in this sphere do
not account for historical data pollution and the advantages it has
produced. They do not address the core structural power problems
of technological dependency creation and data power centralisation.
Moreover, while presenting sustainable data and AI practices in one
domain, they continue data pollution practices in others, enacting
little or no real, meaningful change.
In terms of technical data infrastructure, the ‘personal data store’,
‘trust’ and ‘stewardship’ movement has been ongoing for a while
now among innovative entrepreneurs with the aim to shift data
power asymmetries embedded in current data infrastructures. As a
result, a range of new services that by default respect people’s privacy
and empower individuals with their data have been developed.
The MyData global community includes organisations, SMEs,
individuals and local networks working with the aim to: … help people
and organisations to benefit from personal data in a human-centric way. To
create a fair, sustainable, and prosperous digital society for all.165 Many of
these are challenging the privacy implications and CO2 emissions
of an asymmetrical data economy that collects and stores data on
central servers. They call for ‘greener data’ with a decentralized data
trust model. Much is left to be explored both in terms of the basic
functioning, interoperability and, last but not least, legal framework
of data trusts and cooperatives, but the movement is growing and
expanding.
Tides are changing in the sea of big data, and society is starting
to understand and act on this shi. There is a sense of urgency to
develop and implement an ethical and sustainable approach to data
and AI, and the world’s most advanced companies and governments
are positioning themselves within this movement. Nevertheless, we
are still far from the kind of widespread societal awareness that will
lead to real change.
This white paper is a step in that direction. It explores the powers,
interests and impacts of data pollution in eight domains, which can
be summarised as follows:
Nature
Data pollution is a carbon footprint. It can be addressed in the
design phase of AI as a component of the competences, practices,
education and technological dependencies of AI practitioners.
However, the extent and impact of data pollution are increasingly
complex to measure, and mitigation strategies are accordingly
difficult to design and apply. We need a globally coordinated response
that recognises the power players shaping the contexts in which
data pollution and its impact on the natural environment can be
measured and tackled.
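Measurement is nonetheless becoming more tractable at the level of individual systems. As a minimal sketch of what such transparency could look like in practice, the following uses the open-source CodeCarbon tracker (endnote 102) to estimate the CO2-equivalent emissions of a single training run; the model and dataset are illustrative assumptions, not examples taken from this white paper:

```python
# Minimal sketch: estimating the carbon footprint of one model training
# run with the open-source CodeCarbon tracker (codecarbon.io).
# Assumes `pip install codecarbon scikit-learn`; the model and dataset
# below are illustrative placeholders.
from codecarbon import EmissionsTracker
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in for a real training dataset
X, y = make_classification(n_samples=50_000, n_features=40)

tracker = EmissionsTracker(project_name="data-pollution-demo")
tracker.start()
try:
    model = RandomForestClassifier(n_estimators=300).fit(X, y)
finally:
    emissions_kg = tracker.stop()  # estimated kg of CO2-equivalent

print(f"Estimated training emissions: {emissions_kg:.6f} kg CO2eq")
```

Aggregating such run-level estimates across the life cycle of an AI system is one building block of the comparable, cross-sector emissions reporting that the questions in Chapter 4 call for.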
Science & Innovation
Data pollution is part of the culture of big data science and
innovation. In a big data economy, the most powerful technology
companies and institutions, and accordingly also AI practices, are dictated
by the collective imagining of big data as an unlimited resource and
opportunity. Tackling data pollution in the science and innovation
domain requires a counter-balanced science and technology ‘data
sustainability culture’ supported in policy, innovation and education.
There is also a need for environmentally sound development
strategies for alternative technologies.
Democracy
Data pollution is an imbalance in the information ecosystems
of constitutional democracies. A democracy is founded on sensitive
information balances between citizens and the State, which are
stipulated in laws, state governance, institutional procedures and
frameworks for the conduct of elected representatives and public
servants. Modern democratic societies must ensure socio-technical
infrastructure that reinforces and ensures the democratic ecosystem
of information distribution between citizens, States and other
powerful actors.
Human Rights
Data pollution is a corrosion of the international human
rights system. As big data and AI socio-technical infrastructures
(BDSTIs and AISTIs) are integrated in society, individual human
rights protections are increasingly challenged and held up against,
for example, the interests of nation States to control and gather
intelligence, or the interests of the data-based business models of
internet platforms. Fortunately, data pollution issues that affect
people’s rights are also more and more often challenged in court via
human rights legal instruments.
Infrastructure
Data pollution is a concentration of data power in socio-
technical infrastructure. Just as human exposure to air pollution
increases with the concentration of pollutants in the air, the negative
effects of data pollution increase in correlation with the concentration
of data pollutants in the socio-technical infrastructure of personal,
social and natural environments.
Decision-Making
Data pollution is a bias in human decision-making with adverse
consequences for individuals and society. Decision-making in
domains ranging from civic participation and social networking to
judicial practice is increasingly extended with Autonomous
Decision-Making Systems (ADM Systems). The human impact of
data pollution in the various domains of human decision-making
is most profoundly expressed as the reinforcement or creation of
discrimination in society.
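As a minimal illustration of how such discrimination can at least be made visible, the sketch below applies one simple and widely used bias check to an ADM system’s outputs: comparing selection rates across two groups against the ‘four-fifths’ rule of thumb for disparate impact. The decision data, group labels and threshold are illustrative assumptions only:

```python
# Minimal sketch: a simple disparate-impact check on an ADM system's
# decisions, using the 'four-fifths' (80%) rule of thumb.
# The decisions and group labels below are illustrative placeholders.
import numpy as np

decisions = np.array([1, 0, 1, 1, 1, 1, 0, 0, 1, 0])  # 1 = benefit granted
groups    = np.array(["a", "a", "a", "a", "a", "b", "b", "b", "b", "b"])

rate_a = decisions[groups == "a"].mean()  # selection rate, group a
rate_b = decisions[groups == "b"].mean()  # selection rate, group b
ratio = min(rate_a, rate_b) / max(rate_a, rate_b)

print(f"Selection rates: a={rate_a:.2f}, b={rate_b:.2f}, ratio={ratio:.2f}")
if ratio < 0.8:  # common rule-of-thumb threshold, not a legal standard
    print("Potential adverse impact: the less-favoured group is selected "
          "at under 80% of the rate of the most-favoured group.")
```

A single metric like this cannot prove or rule out discrimination, but making such baseline rates transparent is a precondition for the equal representation in AI data design that the questions in Chapter 4 raise.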
Global Opportunities
Data pollution is colonialism. It reinforces existing social
hierarchies and colonial power dynamics that impact the distribution
of global opportunities. The big data and AI ‘revolution’ has made
the greatest difference in terms of opportunities in the economies of
the Global North, while leaving the Global South behind. At the same
time, the very experience and impact of data pollution are the most
intense in communities and among people that have traditionally
been the most exposed in local and global power dynamics.
Time
Data pollution is a disempowering rationalisation of time. The
data design and classication models of AI only take into account
what is useful to the system which is established by dominant
interests in it. In this way, AI systems ultimately reduce dynamic
cultures and multiple experiences. Qualitative pasts and multiple
futures do not make sense in and of themselves in AI systems. Data
pollution of time disempowers the experiences and voices of less
powerful communities that are reduced to mere instantaneous data
to be used and acted on according to dominant interests.
In conclusion, power is currently integrated in very real digital
data architectures and, as is increasingly highlighted in public debate,
it most oen upholds the world’s most powerful actors while putting
others at a disadvantage. These asymmetries of power are hidden in
the narratives shaping AI governance, business strategies, and even
science and innovation that result in very different experiences of
data pollution. This is why we need the data pollution movement.
Endnotes & Bibliography
1 www.datapollution.eu
2 www.sustainable-ai.eu
3 www.humboldt-foundation.de
4 Hasselbalch, G. (2021) Data Ethics of Power: A Human Approach in the Big Data and AI Era, Edward Elgar.
5 Ibid.
6 Hasselbalch, G. (2019) Making sense of data ethics. The powers behind
the data ethics debate in European policymaking. Internet Policy Review,
8(2).
7 Hasselbalch, G. (2021) “A framework for a data interest analysis of artificial
intelligence” FIRST MONDAY, Volume 26, Number 7 - 5 July 2021 doi:
http://dx.doi.org/10.5210/fm.v26i7.11091
8 34.1. of the Agenda 21 https://www.un.org/esa/dsd/agenda21/res_agenda21_34.shtml
9 Ibid. Hasselbalch (2021) endnote 4, p. 11; Star, S.L., Bowker, G. C. (2006)
How to Infrastructure? In L.A. Lievrouw and S. Livingstone (eds.) Hand-
book of New Media. Social Shaping and Social Consequences of ICTs, pp.
230–245. Updated student edition. SAGE Publications Ltd. Bowker, G.C.,
Baker, K., Millerand F., Ribes D. (2010) Toward Information Infrastructure
Studies: Ways of Knowing in a Networked Environment. In J. Hunsinger,
L. Klastrup, M.M. Allen, M. Matthew (eds.), International Handbook of In-
ternet Research.
Springer Netherlands. Harvey et al. (2017) Ibid. endnote 11.
10 See also Aguerre, C. (forthcoming, 2022) Baseline study and methodology
for the EU human-centric approach to AI in a global context, InTouchAI.eu. The
baseline study and this section of the white paper were developed in close
collaboration between G. Hasselbalch and C. Aguerre.
11 Hughes, T.P. (1983) Networks of power: Electrification in Western society
1880–1930. The John Hopkins University Press.; Bijker, W.E., Hughes, T.P.,
Pinch, T. (eds.) (1987) The social construction of technological systems.
MIT Press; Misa, T.J. (1992) Theories of Technological Change: Parame-
ters and Purposes. Science, Technology, and Human Values, 17(1) (Winter,
1992), 3–12.; Bijker, W.E., Law, J. (eds.) (1992) Shaping technology/building
society: Studies in sociotechnical change. MIT Press.; Edwards, P. (2002)
Infrastructure and modernity: scales of force, time, and social organiza-
tion in the history of sociotechnical systems. In T.J. Misa, P. Brey, A. Feen-
berg (eds.) Modernity and Technology (pp. 185–225) MIT Press; Harvey,
P., Jensen, C.B., Morita, A. (eds.) (2017) Infrastructures and Social Com-
plexity: A Companion. Routledge.
12 Report of the United Nations Conference on the Human Environment, Stockholm, 5-16 June 1972 (1973) https://digitallibrary.un.org/record/523249?ln=en
13 Lapenta, F. (2021) Our Common AI Future. https://dataethics.eu/our-common-ai-future/
14 ”Report of the World Commission on Environment and Development: Our Common Future” (1987), United Nations https://www.are.admin.ch/are/en/home/media/publications/sustainable-development/brundtland-report.html
15 van Wynsberghe, A. (2021) ”Sustainable AI: AI for sustainability and the sustainability of AI”. AI Ethics. https://doi.org/10.1007/s43681-021-00043-6
16 Ibid. Hasselbalch (2021), endnote 4
17 Ibid. Hasselbalch (2021), endnote 4
18 “European enterprise survey on the use of technologies based on artificial intelligence” (2020), IPSOS for European Commission https://digital-strategy.ec.europa.eu/en/library/european-enterprise-survey-use-technologies-based-artificial-intelligence.
19 Elish, M. C., Boyd, D. (2018) “Situating methods in the magic of big data
and articial intelligence”. Communication Monographs, 85(1), 57–80.
20 ”The Role of Data in AI Report for the Data Governance Working Group
of the Global Partnership of AI” (2020), Digital Curation Centre, Trilateral
Research, School of Informatics, The University of Edinburgh
https://gpai.ai/projects/data-governance/role-of-data-in-ai.pdf
21 Hasselbalch, G. (2021) “A framework for a data interest analysis of artificial
intelligence” FIRST MONDAY, Volume 26, Number 7 - 5 July 2021 doi:
http://dx.doi.org/10.5210/fm.v26i7.11091
22 Schneier, B. (2006), ”The Future of Privacy”, https://www.schneier.com/
blog/archives/2006/03/the_future_of_p.html
23 Mayer-Schonberger, V., & Cukier, K. (2013). Big data: A revolution that will
transform how we live, work and think. John Murray, p. 15.
24 Mai, J-E. (2019). “Situating Personal Information: Privacy in the Algo-
rithmic Age”. In R.F. Jørgensen (Ed.), Human Rights in the Age of Platforms
(pp. 95-116). MIT Press, p. 111.
25 Ibid. Mayer-Schonberger, V., & Cukier, K. (2013).
26 The UN 2030 Agenda for Sustainable Development (https://sdgs.un.org/2030agenda) is committed to achieving sustainable development in
its three dimensions – economic, social and environmental.
27 Schneier, B. (2006), ”The Future of Privacy”,
https://www.schneier.com/blog/archives/2006/03/the_future_of_p.html
28 Hasselbalch, G., Tranberg, P. (2016). Data Ethics – The New Competitive Advantage, Publishare.
29 Ibid. p. 183
30 Ben-Shahar, O. (2019) ”Data Pollution”, Oxford University Press on be-
half of The John M. Olin Center for Law, Economics and Business at Har-
vard Law School.
https://doi.org/10.1093/jla/laz005
31 Ibid. p. 106
32 Lucivero, F. and Samuel, G. et al. (June 19, 2020) “Data-Driven Unsus-
tainability? An Interdisciplinary Perspective on Governing the Environ-
mental Impacts of a Data-Driven Society”.
http://dx.doi.org/10.2139/ssrn.3631331
33 The Shi Project (2019). “Lean ICT – Towards Digital Sobriety”
https://theshiproject.org/wp-content/uploads/2019/03/Lean-ICT-Re-
port_The-Shi-Project_2019.pdf
34 Mytton, D. (2020) ”Hiding greenhouse gas emissions in the cloud”.
Nature Climate Change 10, 701 (2020). https://doi.org/10.1038/s41558-020-
0837-6
35 Wineld, A. F. T. (28th June 2019) ”Energy and Exploitation: AIs dirty
secrets”, Alan Wineld’s weblog; Strubell, E., Ganesh, A., McCallum, A.
(2019) ”Energy and Policy Considerations for Deep Learning” in NLP,arX-
iv:1906.02243
36 Lucivero, F. ”Big Data, Big Waste? A Reflection on the Environmental Sustainability of Big Data Initiatives.” Sci Eng Ethics 26, 1009–1030 (2020). https://doi.org/10.1007/s11948-019-00171-7
37 https://sdgs.un.org/2030agenda
38 Data pollution may also be delineated specifically in terms of economic impacts. In this white paper the ”economic” impacts are considered as components of the ”social environment” and ”personal environments”.
39 Corbett, C.J. (2018), ”How Sustainable Is Big Data?”. Prod Oper Manag,
27: 1685-1695.
https://doi.org/10.1111/poms.12837
40 Harvey, D. (1990) The Condition of Postmodernity: An Enquiry into the
Origins of Cultural Change. Basil Blackwell, pp. 286–307
41 Castells, M. (2010) The Rise of the Network Society (Second edition).
Wiley Blackwell.
42 Misa, T.J. (1988) How Machines Make History, and How Historians (And
Others) Help Them to Do So. Science, Technology, and Human Values,
13(3/4) (Summer –Autumn, 1988), 308–331; Misa, T.J. (1992) Theories of
Technological Change: Parameters and Purposes. Science, Technology,
and Human Values, 17(1) (Winter, 1992), 3–12; Misa, T.J. (2009) Findings
follow framings: navigating the empirical turn. Synthese, 168, 357–375.
43 Ibid. Misa, T.J. (2009) p. 367
44 Ibid. Edwards, P. (2002) endnote 11
45 Ibid.
46 Ibid. Hughes, T.P. (1983) endnote 11; Hughes, T.P. (1987) The evolution
of large technological systems. In W.E. Bijker, T.P. Hughes, T.Pinch (eds.)
The social construction of technological systems (pp. 51–82). MIT Press.
47 See also Aguerre, C. (2022) Baseline study and methodology for the EU hu-
man-centric approach to AI in a global context, InTouchAI.eu. The baseline
study and this section of the white paper were developed in close collabo-
ration between G. Hasselbalch and C. Aguerre.
48 https://www.itu.int/net/wsis/docs/geneva/official/dop.html
49 See: https://digital-strategy.ec.europa.eu/en/policies/european-approach-artificial-intelligence
50 COMMUNICATION FROM THE COMMISSION TO THE EUROPEAN PARLIAMENT, THE EUROPEAN COUNCIL, THE COUNCIL, THE EUROPEAN ECONOMIC AND SOCIAL COMMITTEE AND THE COMMITTEE OF THE REGIONS, Brussels, 25.4.2018 COM(2018) 237 final https://eur-lex.europa.eu/legal-content/EN/TXT/PDF/?uri=CELEX:52018DC0237&from=EN
51 UNESCO (2021) Recommendation on the Ethics of Artificial Intelligence https://unesdoc.unesco.org/ark:/48223/pf0000380455
52 Rainey, S., Goujon, P. (2011) “Toward a Normative Ethical Governance
of Technology. Contextual Pragmatism and Ethical Governance”. In R.
von Schomberg (ed.) Towards Responsible Research and Innovation in the
Information and Communication Technologies and Security Technologies Fields
(pp. 48–70). European Commission; Winfield, A.F.T., Jirotka, M. (2018)
“Ethical governance is essential to building trust in robotics and artificial
intelligence systems”. Philosophical Transactions of the Royal Society A:
Mathematical, Physical and Engineering Sciences, 376(2133); Hasselbalch,
G. (2021) Ibid. endnote 4
53 de Wachter, M.A.M. (1997). The European Convention on Bioethics.
Hastings Center Report, 27(1), 13–23. https://onlinelibrary.wiley.com/doi/full/10.1002/j.1552-146X.1997.tb00015, p. 14
54 Council of Europe (1997), Convention for the Protection of Human
Rights and Dignity of the Human Being with regard to the Application of
Biology and Medicine: Convention on Human Rights and Biomedicine,
article 2. European Treaty Series - No. 164, Oviedo, 4.IV.1997. https://www.coe.int/en/web/conventions/full-list/-/conventions/treaty/164
55 See e.g. Lori Witzel’s very valid critique of the human-centered approach proposing a different approach based in work being done by indigenous researchers and practitioners in AI where “human-centered” shifts from “human-at-the-top” to “human-as-part-of-a-biome”. https://medium.com/@loriaustex/is-human-centered-ai-enough-for-real-sustainability-29565395d2fa
56 Ibid. Hasselbalch (2021), endnote 4, p. 12: “The Human Approach: The
‘human-centric’ or ‘human-centred’ approach was a popular term in late 2010s’
policy and advocacy discourses on the ethics of AI and big data, used as a way
to recentre the sociotechnical developments in these fields on the human interest.
In this book, I further explore and conceptualise this term, but I refer to it as the
‘human approach’. I do this to emphasise the role of the human as an ethical being
with a corresponding ethical responsibility for not only the human living being
but also for life and being in general. In practical terms the human approach is
associated with the human interest in the data of AI through the involvement of
human actors in the very data design, use and implementation of AI. The human
approach of a data ethics of power specifically constitutes a critical reflection on
the power of technological progress as well as the big data and AI sociotechnical
systems we build and imagine.”
57 Ibid. Hasselbalch, G. (2021), endnote 4, p. 5
58 For more on the legal, cultural and social context of a European hu-
man-centric governance approach see in particular here Aguerre, C.
(2022) Baseline study and methodology for the EU human-centric approach to AI
in a global context, InTouchAI.eu.
59 Ibid. endnote 4. Hasselbalch, G. (2021).
60 See e.g. COMMUNICATION FROM THE COMMISSION TO THE EU-
ROPEAN PARLIAMENT, THE COUNCIL, THE EUROPEAN CENTRAL
BANK, THE EUROPEAN ECONOMIC AND SOCIAL COMMITTEE,
THE COMMITTEE OF THE REGIONS AND THE EUROPEAN INVEST-
MENT BANK, Annual Sustainable Growth Strategy 2020 https://eur-lex.europa.eu/legal-content/EN/TXT/?qid=1578392227719&uri=CELEX%3A52019DC0650
61 Ibid. Hasselbalch, G. (2021), endnote 4; Ibid. Hasselbalch, G. (2019), end-
note 6
62 Ibid. endnote 7. Hasselbalch, G. (2021).
63 Ibid. endnote 4. Hasselbalch, G. (2021).
64 This chapter focuses on the geo-politics of current socio-technical
changes. Nevertheless, it is recognised that “governance” consists of a va-
riety of components that do not only include the development of laws
and policies created by single policy actors. Technology is an expression
of social practice and human shaping taking form and changing in tech-
nological, material, social, economic, political and cultural environments.
Legislators are the most obvious actors of governance, but also citizens,
engineers, civil society actors and industries take part in the ‘governing’ of
the heterogeneous, complex processes of sociotechnical change.
65 Moor, J. (2006) “The Dartmouth College Artificial Intelligence Conference: The Next Fifty Years”. AI Magazine, 27(4), 87–91.
66 Alpaydin, E. (2016) Machine Learning. MIT Press.
67 Hasselbalch, G. (2020). ”Culture by design: A data interest analysis of the
European AI policy agenda”. First Monday, 25(12). https://doi.org/10.5210/
fm.v25i12.10861
68 Ibid. Lapenta, F. (2021), endnote 13
69 See: https://www.youtube.com/watch?v=jKaYPk5YnsU
70 Report of the United Nations Conference on the Human Environment,
Stockholm, 5-16 June 1972. https://undocs.org/en/A/CONF.48/14/Rev.1
71 United Nations Conference on Environment & Development, Rio de Janeiro, Brazil, 3 to 14 June 1992, AGENDA 21. https://sustainabledevelopment.un.org/content/documents/Agenda21.pdf
72 “Environmentally Sound Technologies”, UN Environment Programme https://www.unep.org/regions/asia-and-pacific/regional-initiatives/supporting-resource-efficiency/environmentally-sound
73 The European Green Deal (11.12.2019, p. 9) https://eur-lex.europa.eu/resource.html?uri=cellar:b828d165-1c22-11ea-8c1f-01aa75ed71a1.0002.02/DOC_1&format=PDF
74 COMMUNICATION FROM THE COMMISSION TO THE EUROPE-
AN PARLIAMENT, THE EUROPEAN COUNCIL, THE COUNCIL, THE
EUROPEAN ECONOMIC AND SOCIAL COMMITTEE AND THE COM-
MITTEE OF THE REGIONS Coordinated Plan on Artificial Intelligence,
p. 14.
https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=COM:2018:795:FIN.
75 Ibid. p. 40
76 See: https://digital-strategy.ec.europa.eu/en/library/communication-fostering-european-approach-artificial-intelligence
77 See: https://digital-strategy.ec.europa.eu/en/library/coordinated-plan-artificial-intelligence-2021-review
78 High-Level Expert Group on Artificial Intelligence (AI HLEG) (2019) Ethics Guidelines for Trustworthy AI. https://ec.europa.eu/digital-single-market/en/news/ethics-guidelines-trustworthy-ai
79 Smuha, N.A. (2019) The EU approach to ethics guidelines for trustworthy artificial intelligence. Computer Law Review International, 20(4), pp. 97–106.
80 Quoted in Hasselbalch, G. (2020). Culture by design: A data interest
analysis of the European AI policy agenda. First Monday, 25(12).
https://doi.org/10.5210/fm.v25i12.10861
81 See: https://digital-skills-jobs.europa.eu/en/actions/european-initiatives/digital-decade
82 https://ec.europa.eu/commission/presscorner/detail/en/STATEMENT_21_4951
83 Disclosure: The author of this paper is the Research Lead and Key
Expert on AI Ethics of this initiative.
84 https://digital-strategy.ec.europa.eu/en/policies/international-outreach-ai
85 https://ec.europa.eu/info/strategy/priorities-2019-2024/stronger-europe-world/global-gateway_en
86 RECOMMENDATION ON THE ETHICS OF ARTIFICIAL INTELLI-
GENCE, annex to REPORT OF THE SOCIAL AND HUMAN SCIENCES
COMMISSION (SHS), 41 C/73 (22 November 2021), p. 7
87 https://www.gpai.ai/projects/climate-change-and-ai.pdf
88 The AI Governance Journey: Development and Opportunities (2021),
World Economic Forum, https://www.weforum.org/reports/the-ai-governance-journey-development-and-opportunities/, p. 25
89 https://digital-strategy.ec.europa.eu/en/policies/destination-earth
90 https://unstats.un.org/unsd/undataforum/blog/a-global-data-convention-to-safeguard-sustainable-development
91 Sætra, H.S. (2022). AI for the Sustainable Development Goals (1st ed.).
CRC Press. p. 1. https://doi.org/10.1201/9781003193180
92 Adams R. (2022) AI in Africa KEY CONCERNS AND POLICY CONSIDERATIONS FOR THE FUTURE OF THE CONTINENT, Africa Policy Research Institute. https://afripoli.org/uploads/publications/AI_in_Africa.pdf
93 Kahn, J. (April 22, 2021) “A.I.’s carbon footprint is big, but easy to reduce, Google researchers say” https://fortune.com/2021/04/21/ai-carbon-footprint-reduce-environmental-impact-of-tech-google-research-study/
94 Ray, T. (2021) “Common but Different Futures: AI Inequity and Climate Change,” ORF Special Report No. 172, December 2021, Observer Research Foundation. https://www.orfonline.org/research/common-but-different-futures/#_ftn3
95 Picture: Unsplash.com Ivan Bandura
96 Berkhout, F., Hertin, J. (2004) De-materialising and re-materialising:
Digital technologies and the environment. Futures.
https://doi.org/10.1016/j.futures.2004.01.003
97 Ibid. endnote 36 Lucivero, F. (2020)
98 Ibid. endnote 96 Berkhout, F., Hertin, J. (2004)
99 Robbins, S.; van Wynsberghe, A. (2022) Our New Artificial Intelligence
Infrastructure: Becoming Locked into an Unsustainable Future. Sustain-
ability 2022, 14, 4829. https://doi.org/10.3390/su14084829
100 Walleser, E. (July 14th, 2021) Artificial Intelligence Has an Enormous Carbon Footprint https://towardsdatascience.com/artificial-intelligence-has-an-enormous-carbon-footprint-239290ebe
101 Kaack, L.; Donti, P.L.; Strubell, E.; Kamiya, G.; Creutzig, F.; Rolnick, D. (2022) “Aligning artificial intelligence with climate change mitigation”, Nature Climate Change 12, 518-527.
102 https://codecarbon.io/
103 Raper, R.; Boeddinghaus, J.; Coeckelbergh, M.; Gross, W.; Campigot-
to, P.; Lincoln, C.N. Sustainability Budgets: A Practical Management and
Governance Method for Achieving Goal 13 of the Sustainable Develop-
ment Goals for AI Development. Sustainability 2022, 14, 4019. https://doi.org/10.3390/su14074019
104 Gupta, A. (October 28th 2021) Why should sustainability be a first-class consideration for AI systems? https://greensoftware.foundation/articles/why-should-sustainability-be-a-first-class-consideration-for-ai-systems
105 Ibid. endnote 94, Ray, T. (2021)
106 Glanz, J. (22nd September 2012) ”Power, Pollution and the Internet”, New York Times https://www.nytimes.com/2012/09/23/technology/data-centers-waste-vast-amounts-of-energy-belying-industry-image.html
107 Ibid. endnote 23. Mayer-Schonberger, V., Cukier, K. (2013), p. 129.
108 Ibid. p. 101.
109 Ibid. endnote 24. Mai, J-E. (2019), p.111
110 Chahal, H., Toner, H. (2021) “‘Small Data’ Are Also Crucial for Machine
Learning”, Scientific American
https://www.scienticamerican.com/article/small-data-are-also-crucial-
for-machine-learning/
111 Issenberg, S. (December 19, 2012) “How Obama’s Team Used Big Data to
Rally Voters”, MIT Technology Review,
https://www.technologyreview.com/2012/12/19/114510/how-obamas-
team-used-big-data-to-rally-voters/
112 Detrow, S. (March 20, 2018) “What Did Cambridge Analytica Do During
The 2016 Election?” NPR
https://www.npr.org/2018/03/20/595338116/what-did-cambridge-analytica-do-during-the-2016-election?t=1644913832102
113 Jetley, N.P. (April 10, 2014) ”How big data has changed India elections”, CNBC
https://www.cnbc.com/2014/04/10/how-big-data-have-changed-india-
elections.html
114 Ventriss, C.; Perry J. L.; Nabatchi, T.; Milward, H.B.; Johnston, J.M.
(2019) Democracy, Public Administration, and Public Values in an Era of
Estrangement, PERSPECTIVES ON PUBLIC MANAGEMENT AND GOV-
ERNANCE, Volume 2, Issue 4, December 2019, Pages 275–282, https://doi.org/10.1093/ppmgov/gvz013
115 AlgorithmWatch (2020) Automating Society Report 2020. Chiusi, F., Fischer, S., Kayser-Bril, N., Spielkamp, M. (eds.) AlgorithmWatch gGmbH,
https://automatingsociety.algorithmwatch.org/
116 Nemitz, P. (2018) ”Constitutional democracy and technology in the age of artificial intelligence” Phil. Trans. R. Soc. A, 376: 20180089. http://doi.org/10.1098/rsta.2018.0089
117 Stupp, C. (6 April 2018) “Cambridge Analytica harvested 2.7 million
Facebook users’ data in the EU”. Euractiv.
https://www.euractiv.com/section/data-protection/news/cambridge-analytica-harvested-2-7-million-facebook-users-data-in-the-eu/
118 Articial Intelligence and Democratic Values (2021), Center for AI and
Digital Policy, p. 122 https://www.caidp.org/reports/aidv-2021/
119 Cohen, J.E. (2013) What privacy is for. Harvard Law Review, 126(7). p. 1904
120 Johnson, B. (11 January 2010) Privacy is no longer a social norm, says Facebook founder, The Guardian. https://www.theguardian.com/technology/2010/jan/11/facebook-privacy
121 Hasselbalch, G. (2010) Privacy and Jurisdiction in the Global Network Society. https://mediamocracy.files.wordpress.com/2010/05/privacy-and-jurisdiction-in-the-network-society.pdf
122 https://digitallibrary.un.org/record/764407/?ln=en
123 https://www.coe.int/en/web/impact-convention-human-rights/human-rights-and-the-environment
124 See Max Schrems statement (2022) on the latest ”agreement in principle” on a new data transfer agreement: https://noyb.eu/en/privacy-shield-20-first-reaction-max-schrems
125 Ibid. endnote 40 Harvey (1990), p. 241.
126 Ibid. endnote 4, Hasselbalch, G., (2021).
127 Star, L.S. (1999) The Ethnography of Infrastructure, The American
Behavioral Scientist, Nov/Dec 1999; 43(3), 377–392; Star, S.L., Bowker, G.
C. (2006) How to Infrastructure? In L.A. Lievrouw and S. Livingstone
(eds.) Handbook of New Media. Social Shaping and Social Consequences
of ICTs, pp. 230–245. Updated student edition. SAGE Publications Ltd;
Bowker, G.C., Star, S.L. (2000) Sorting Things out: Classification and Its
Consequences. Inside Technology. Cambridge. MIT Press.
128 Reeves, M. (2017) Infrastructural Hope: Anticipating ‘Independent
Roads’ and Territorial Integrity in Southern Kyrgyzstan. Ethnos, 82(4),
711–737.
129 Larkin, B. (2013) The Politics and Poetics of Infrastructure. The Annual
Review of Anthropology 42, 327–43.
130 Ibid. endnote 4, Hasselbalch (2021), p. 86
131 Wong, P.H., Data Pollution & Power 1st Meeting Mini Report, 2021
132 Ibid. endnote 41, Castells (2010), p. 445.
133 Ibid., endnote 94, Ray, T. (2021)
134 Friedman, B., Nissenbaum, H. (1996) Bias in Computer Systems. ACM
Transactions on Information Systems, 14(3), 330–47.
135 Bolukbasi, T., Chang, K-W., Zou, J.Y., Saligrama, V., Kalai, A.T. (2016)
Man Is to Computer Programmer as Woman Is to Homemaker? Debiasing
Word Embeddings, 30th Conference on Neural Information Processing
Systems (NIPS 2016), Barcelona, Spain.
136 Abdilla et al. use the term “Indigenous AI” to describe a process of AI
development that counters bias as such based on country-specific
conceptions and localised Indigenous laws and guided by local protocols. See
https://oldwaysnew.com/publications#new-page-5
137 Simonite, T. (26 October 2020) How an Algorithm Blocked Kidney Transplants to Black Patients. Wired. https://www.wired.com/story/how-algorithm-blocked-kidney-transplants-black-patients/
138 Hern, A. (14 August 2020) Do the maths: why England’s A-level grading system is unfair. The Guardian. https://www.theguardian.com/education/2020/aug/14/do-the-maths-why-englands-a-level-grading-system-is-unfair
139 https://algorithmwatch.org/en/syri-netherlands-algorithm/
140 https://www.theguardian.com/world/2016/mar/29/microsoft-tay-tweets-antisemitic-racism
141 Obermeyer, Z.; Nissan, R.; Stern, M.; Eaneff, S.; Bembeneck, E. J.; Mullainathan, S. (2021), Algorithmic Bias Playbook. Chicago Booth, The Center for Applied Artificial Intelligence https://www.chicagobooth.edu/-/media/project/chicago-booth/centers/caai/docs/algorithmic-bias-playbook-june-2021.pdf
142 Please also see the Ada Lovelace Institute’s proposal for an Algorithmic
Impact assessment in healthcare: https://www.adalovelaceinstitute.org/
report/algorithmic-impact-assessment-case-study-healthcare/
143 https://www.adcu.org.uk/news-posts/app-drivers-couriers-union-files-ground-breaking-legal-challenge-against-ubers-dismissal-of-drivers-by-algorithm-in-the-uk-and-portugal
144 Ibid. endnote 4. Hasselbalch (2021).
145 Frank Pasquale’s New Laws of Robotics: 1. Robotic Systems and AI
should complement professionals, not replace them 2. Robotic Systems
and AI should not counterfeit humanity 3. Robotic Systems and AI should
not intensify zero-sum arms races 4. Robotic Systems and AI must always
indicate the identity of their creator(s), controller(s), and owner(s). In Pasquale,
F. (2020) New Laws of Robotics: Defending Human Expertise in the Age of AI.
Harvard University Press.
146 Thatcher, J., O’Sullivan, D., Mahmoudi, D. (2016) ”Data colonialism through accumulation by dispossession: new metaphors for daily data.” Environment and Planning D, 34: 990–1006.
147 https://www.itu.int/net/wsis/docs/background/resolutions/56_183_
unga_2002.pdf
148 Browne, S. (2015) Dark Matters: On the Surveillance of Blackness. Duke University Press. p. 10
149 Cieslik K, Margócsy D. Datafication, Power and Control in Development: A Historical Perspective on the Perils and Longevity of Data. Progress in Development Studies. February 2022. doi:10.1177/14649934221076580
150 See e.g. Eileen Guo and Adi Renaldi’s investigation of WorldCoin’s prac-
tices in developing countries in “Deception, exploited workers, and cash
handouts: How Worldcoin recruited its first half a million test users” (April
6th, 2022) https://www.technologyreview.com/2022/04/06/1048981/
worldcoin-cryptocurrency-biometrics-web3/
151 Thatcher J, O’Sullivan D, Mahmoudi D. Data colonialism through accu-
mulation by dispossession: New metaphors for daily data. Environment and
Planning D: Society and Space. 2016;34(6):990-1006.
152 Aguerre, C. (forthcoming). Digital platforms and resistance, Dig-
ital Economy Network, SASE Conference 10 July 2022, Amsterdam.
Aguerre, C. (forthcoming). Revisiting Digital Colonialism. GCR21, Univer-
sity of Duisburg-Essen Document Series.
153 See e.g. Aguerre & Tarullo’s (2021) description of Latin American Civil
Society Organisations’ data activism in ”Unravelling Resistance: Data
Activism Configurations in Latin American Civil Society, Latin American
Civil Society Organizations’ (CSOs) resistance practices in the context of
datafication” Palabra Clave [online]. 2021, vol. 24, n. 3, e2435. Epub Oct 12,
2021. ISSN 0122-8285. https://doi.org/10.5294/pacla.2021.24.3.5.
154 Mejias, U. A. & Couldry, N. (2019). “Datafication”. Internet Policy Review, 8(4). https://doi.org/10.14763/2019.4.1428
155 https://hai.stanford.edu/news/movement-decolonize-ai-centering-dignity-over-dependency
156 Hao, K. (April 19, 2022) Artificial intelligence is creating a new colonial world order https://www.technologyreview.com/2022/04/19/1049592/artificial-intelligence-colonialism/
157 Ibid.
158 Ibid. endnote 94, Ray, T. (2021)
159 Ibid. Hasselbalch (2021) endnote 4, p. 157-58: ”Right now, AISTIs have
increasingly powerful agencies immobilising the living by making it predictable
in time. Just like Bergson’s clock. We always know not only that the clock will
strike 12, but exactly when it will strike. This is a comforting feeling, it empowers
us to manage and coordinate in social environments, but it does not mean that
everything must be known in advance. It does not mean that our futures are set.
The unpredictability of our human lives and societies is what we want to preserve,
because if injustice, wrongful treatment of minority groups and discrimination
are set in stone, in the algorithms and their data systems that pervade society, we
do want to have the power to resist the futures inscribed in their runes.”
160 CEPEJ (2018) European Ethical Charter on the Use of Artificial Intelligence
in Judicial Systems and their environment, by the European Commission
for the Efficiency of Justice (CEPEJ) of the Council of Europe.
Adopted at the 31st plenary meeting of the CEPEJ (Strasbourg, 3–4 December
2018), p. 67.
161 Deleuze, G. (1991) Bergsonism. Translated by H. Tomlinson, B. Habber-
jam. Urzone, Zone Books. (Originally published in French, 1966), p. 15.
162 Ibid. endnote 94 Ray, T. (2021).
163 “The EU and U.S. are starting to align on AI regulation” (February 1,
2022), Brookings. https://www.brookings.edu/blog/techtank/2022/02/01/
the-eu-and-u-s-are-starting-to-align-on-ai-regulation/
164 REGULATION OF THE EUROPEAN PARLIAMENT AND OF THE
COUNCIL LAYING DOWN HARMONISED RULES ON ARTIFICIAL
INTELLIGENCE (ARTIFICIAL INTELLIGENCE ACT) (2021) https://eur-lex.europa.eu/legal-content/EN/TXT/?qid=1623335154975&uri=CELEX%3A52021PC0206
165 https://www.mydata.org/about/
105
106
Data Pollution is to the big data age what smog was to the industrial age. Our
response to data pollution will develop much like our reaction to traditional
forms of pollution—just much faster and hopefully with dedication and
great force. This white paper describes a nascent environmental data
pollution movement. It frames data pollution in the context of powers
and interests exploring eight domains in which data pollution has the
greatest impact: Nature, Science & Innovation, Democracy, Human
Rights, Infrastructure, Decision-Making, Global Opportunities, and Time.
The main objective is to ensure that data pollution of AI in particular is
included in the global sustainable development agenda.
... Reduced pollutant emissions can primarily be achieved through technological progress; as China's economy enters the fourth industrial revolution, new technologies like digitalization and AI grow (Zhao et al., 2023). At the Conference on the Human Environment in Stockholm, Sweden, in 1972, world leaders realized they had to pay more attention to and work together on environmental challenges to address some of the most pressing issues highlighted by the sustainable development agenda, technological solutions equipped with AI capabilities are essential (Hasselbalch, 2022). Various studies have demonstrated the efficacy of using autonomous technologies to do tasks on a farm, including weeding, watering, monitoring, and planting (Usigbe et al., 2023). ...
... In 2022, the IEEE Standards Association published the Recommended Practice for Quality Management of Datasets for Medical Artificial Intelligence, developed under the leadership of the National Institutions for Food and Drug Control of China and was the first international standard for AI medical datasets (Xue et al., 2023). Foresight in policymaking, prediction of environmental repercussions, better utilization of scarce resources, and optimization of manufacturing processes are a few ways AI might aid in the transition to a greener economy (Hasselbalch, 2022). Nevertheless, computers, AI, and other coding forms could remedy this decision-making problem (Ali et al., 2019). ...
Article
Full-text available
Artificial intelligence (AI) is an umbrella term for a wide range of machine intelligence systems that can replicate the behavior of humans. AI and Big Data have emerged as defining characteristics of the fourth industrial revolution (IR). AI has developed tools. Because of the novelty, the investigation of IR, AI, and their environmental effects is still in the early stages of exploration. This study investigates how IR and AI affect human and environmental health and also discusses IR, AI, machine-human ideas, innovation, and AI's environmental benefits, further examines the challenges of these innovations, and recommends additional studies to explain their progress. As a result, the application of AI technology in environmental management, particularly concerning pollution, has become a significant advancement in reshaping our approach to monitoring the environment. Numerous countries are reaping substantial advantages by integrating AI in creating, executing, and assessing measures to address environmental degradation. These innovations can yield societal advantages and contribute to achieving the Sustainable Development Goals (SDGs) 2030; unfortunately, it is important to acknowledge that these benefits may not align well with environmental sustainability objectives, and the increasing number of electronic gadgets presents an additional concern. Conducting future research is crucial to investigate the growing prevalence of electronic devices utilized for AI, its potential ramifications for the future trajectory of climate change, and the approaches being taken to address the issue. Future research should prioritize conducting lifecycle environmental impact analyses, developing sustainable AI hardware, optimizing renewable energy usage, advancing climate modeling techniques, finding effective solutions for managing e-waste, utilizing AI for environmental monitoring and protection, conducting socio-environmental impact studies, developing policies and regulations, creating energy-efficient AI algorithms, and integrating circular economy principles to ensure that AI advancements align with environmental sustainability.
... AI is perceived as more than a technology in social science research space since it is a sociotechnical assemblage comprising politics, interests and virtual substructures that is tied to physical infrastructures and human entities elsewhere (Hasselbalch, 2021(Hasselbalch, , 2022. AI combines both conscious and unconscious decisions, agencies that define impressions such as access, value and socio-economic categorizations with implications on socio-material in societies where they are enacted (Burch & Legun, 2021;Carolan, 2017;Fourcade & Healy, 2017). ...
Preprint
Full-text available
The study examines how ontonorms propagate certain gender practices in digital spaces through character and the norms of spaces that shape AI design, training and use. Additionally the study explores the different user behaviours and practices regarding whether, how, when, and why different gender groups engage in and with AI driven spaces. By examining how data and content can knowingly or unknowingly be used to drive certain social norms in the AI ecosystems, this study argues that ontonorms shape how AI engages with the content that relates to women. Ontonorms specifically shape the image, behaviour, and other media, including how gender identities and perspectives are intentionally or otherwise, included, missed, or misrepresented in building and training AI systems.
... Still, the growing importance of AI in the global economy makes the use of such technologies unavoidable. The foreign imposition of norms and values implicit in design has therefore become a pressing issue (Hasselbalch 2022;van Wynsberghe 2021). Decoloniality seeks to disrupt the persisting power structures originating from colonialism and aims to replace them with plural and diversified conceptions of values and knowledge. ...
Article
This paper aims to show that dominant conceptions of intelligence used in artificial intelligence (AI) are biased by normative assumptions originating from the Global North, making it questionable whether AI can be uncritically applied elsewhere without risking serious harm to vulnerable people. After the introduction in Sect. 1, we briefly present the history of IQ testing in Sect. 2, focusing on its multiple discriminatory biases. To determine how these biases came into existence, we define intelligence ontologically and underline its constructed and culturally variable character. Turning to AI, specifically the Turing Test (TT), in Sect. 3, we critically examine its underlying conceptions of intelligence. The test has been of central influence in AI research and remains an important point of orientation. We argue that both the test itself and how it is used in practice risk promoting a limited conception of intelligence that originated solely in the Global North. Hence, this conception should be critically assessed in relation to the different global contexts in which AI technologies are and will be used. In Sect. 4, given the history of IQ testing and the TT's practical biases, we highlight how unequal power relations in AI research are a real threat rather than mere philosophical sophistry. In the last section, we examine the limits of our account and identify fields for further investigation. Tracing colonial continuities in AI intelligence research, this paper points to a more diverse and historically aware approach to the design, development, and use of AI.
Chapter
Populations are impacted differently by Artificial Intelligence (AI), owing to differing privileges and missing voices in the STEM space. Biased gender norms are continued through data and propagated by AI algorithmic activity at different sites. Specifically, women of colour continue to be underprivileged in relation to AI innovations. This chapter seeks to engage with the invisible and elemental ways in which AI is shaping the lives of women and girls in Africa. Building on Annemarie Mol's reflections on onto-norms, this chapter used informal sessions, participant observation, digital content analysis, and AI model character analysis to identify the gender norms that shape and are shaped by different AI social actors and algorithms in different social ontologies, using Kenya and Ghana as case studies. The study examines how onto-norms propagate certain gender practices in digital spaces through character and through the norms of the spaces that shape AI design, training, and use. Additionally, the study explores the different user behaviours and practices regarding whether, how, when, and why different gender groups engage in and with AI-driven spaces. By examining how data and content can knowingly or unknowingly be used to drive certain social norms in AI ecosystems, this study argues that onto-norms shape how AI engages with content that relates to women. Onto-norms specifically shape image, behaviour, and other media, including how gender identities and perspectives are, intentionally or otherwise, included, missed, or misrepresented in building and training AI systems. To address these AI biases affecting African women, we propose a framework for building intentionality into AI systems, ensuring that women's original intentions for their data are articulated and thereby preventing personal data from being used to perpetuate further gender biases in AI systems.
Chapter
Artificial intelligence (AI) is a general-purpose technology (GPT) that is enjoying increasing use in strategic decision-making and military affairs. The AI revolution brings significant changes to current and future socio-economic national and international systems, and AI applications are expected to tilt the global balance of power in favour of actors who strategically invest in and use this emerging technology. AI-assisted automation is also changing prevailing socio-economic production models on a global scale, and sooner or later these technologies are expected to exert systemic impacts on the current global order. However, the distribution of AI technologies and skills is not uniform: the Global North dominates the space, and even within the Global South, Africa lags far behind other continents.
Chapter
This informative Handbook provides a comprehensive overview of the legal, ethical, and policy implications of AI and algorithmic systems. As these technologies continue to impact various aspects of our lives, it is crucial to understand and assess the challenges and opportunities they present. Drawing on contributions from experts in various disciplines, the book covers theoretical insights and practical examples of how AI systems are used in society today. It also explores the legal and policy instruments governing AI, with a focus on Europe. The interdisciplinary approach of this book makes it an invaluable resource for anyone seeking to gain a deeper understanding of AI's impact on society and how it should be regulated. This title is also available as Open Access on Cambridge Core.
Chapter
In this paper we discuss and evaluate the social and economic impact of contemporary digital ecosystems. We extend the definition of a digital ecosystem following the idea of the circular economy, which is based on the enhanced and efficient use and reuse of resources and products. We consider an Extended or Circular Digital Ecosystem: a digital ecosystem in which final customers can create value by sharing their data and digital content through the digital ecosystem services themselves, so that data consumers also become data producers. This new concept of a digital ecosystem is closely integrated with the role and actions of the people involved in it as users, producers, and managers. A key point of our discussion is the new and very powerful AI algorithms that are the current focus of research and beyond. We argue that the focus should be not only on the algorithms themselves but also on the data used to train and feed them, and we draw on the perspective of data managers and data curators to address this issue. We compare the management of scientific research data, which is moving towards sound management under the FAIR principles, with the data produced every day on the internet by sharing information on social media and other platforms. Since 2009, following the idea of open science, the research community has focused on the quality of shared research data. In 2016, the FAIR (Findable, Accessible, Interoperable and Reusable) principles provided guidelines for sharing data that are machine-interoperable, i.e., data that can also be correctly used and interpreted by machines and algorithms. For example, generative AI based on Large Language Models (LLMs), such as the well-known ChatGPT, was trained on a large corpus of text (about 45 TB), consisting mainly of text obtained through web crawling, to understand and generate text coherently. Users of digital ecosystems should be aware that all available data are currently used to train AI algorithms and that the results these algorithms produce depend on the quality of the data used to train them. It is widely demonstrated that database classification by human intervention is prone to biases and errors; algorithms trained on such datasets inherit these biases, resulting in dangerous or inappropriate content. Collecting these insights, we argue that it is crucial to consider the technical experience of data managers and curators in the wider context of data sharing and management in society, given the immense amount of data produced and available on the network. We also argue that the FAIR principles for research data should be considered in this wider societal context, to avoid potential dangers related to new developments in artificial intelligence, which, in our belief and experience, are not harmful in themselves. Our general goal is to raise awareness among users of digital ecosystems that the correct sharing of their information can improve the quality of the data available on the network and reduce the potential dangers of data misuse.
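The chapter's call to apply FAIR principles beyond research data suggests a simple operational reading. The sketch below is a hypothetical illustration of how a metadata record might be scored against minimal FAIR-style cues; the field names and scoring rules are our assumptions, not an implementation of the FAIR specification or of the chapter's proposal.

```python
# Minimal, illustrative FAIR-style metadata check. Field names and scoring
# rules are assumptions for demonstration, not the official FAIR metrics.

FAIR_CHECKS = {
    "findable": ["identifier", "title", "keywords"],   # e.g. a DOI plus description
    "accessible": ["access_url", "license"],           # retrievable via standard protocols
    "interoperable": ["format", "schema"],             # machine-readable, shared vocabularies
    "reusable": ["license", "provenance", "creator"],  # clear terms and documented origin
}

def fair_score(metadata: dict) -> dict:
    """Return, per FAIR principle, the fraction of expected fields present."""
    return {
        principle: sum(field in metadata for field in fields) / len(fields)
        for principle, fields in FAIR_CHECKS.items()
    }

record = {
    "identifier": "doi:10.1234/example",  # hypothetical identifier
    "title": "Survey data",
    "format": "CSV",
    "license": "CC-BY-4.0",
    "creator": "J. Doe",
}
print(fair_score(record))  # fractions per principle; e.g. 'provenance' is missing
```

The contrast the chapter draws is visible here: curated research data would score highly on such checks, while most platform-generated data shared on the network carries little of this metadata at all.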
Chapter
By turning the spotlight on the intricate relationship between form, content, and outcomes in data-driven educational practices, this chapter explores the nuanced dimensions of value creation in such practices. A series of illustrative examples dealing with the use of data-driven tools to perform and deliver certain tasks is used to discuss the negotiation of standards during instructional interaction, as well as the cultural boundaries between epistemic and non-epistemic knowledge. More specifically, examples from driving simulator-based instruction in a Natural Resource Program show the complex alignment with set standards in practices of assessment and feedback. This chapter critically examines various dimensions, incorporating both political and ethical aspects of data value creation. It emphasizes the necessity for a meta-perspective that addresses the sensitive issues of access and transparency in educational practices. This serves as an ethical measure to make students aware of algorithmic control. Furthermore, the chapter argues for processes of data creation and evaluation that stay close to their users in space and time. By keeping the value of data bound to its context, we avoid the pitfalls of tools' market-orientation, and we create the conditions for the understanding and theorization of data-driven educational practices.
Research
The mainstreaming of AI and allied emerging technologies will be an emissions-intensive process. At the same time, AI capacity in terms of R&D, investment, data, and infrastructure is currently skewed, focused within a handful of countries, primarily in the developed West. This report examines the interplay of global inequities in AI and climate change, and concludes with recommendations.
Article
Artificial intelligence (AI) is becoming increasingly important for the infrastructures that support many of society's functions. Transportation, security, energy, education, the workplace, and government have all incorporated AI into their infrastructures for enhancement and/or protection. In this paper, we argue that AI is not only seen as a tool for augmenting existing infrastructures: AI itself is becoming an infrastructure on which many services of today and tomorrow will depend. Considering the vast environmental consequences associated with the development and use of AI, of which the world is only starting to learn, the necessity of addressing AI alongside the concept of infrastructure points toward the phenomenon of carbon lock-in. Carbon lock-in refers to society's constrained ability to reduce carbon emissions technologically, economically, politically, and socially. These constraints are due to the inherent inertia created by entrenched technological, institutional, and behavioral norms. That is, the drive for AI adoption in virtually every sector of society will create dependencies and interdependencies from which it will be hard to escape. The crux of this paper is this: in conceptualizing AI as infrastructure, we can recognize the risk of lock-in, not just carbon lock-in but lock-in with respect to all the physical requirements of achieving the infrastructure of AI. This does not exclude the possibility of solutions arising with the rise of these technologies; however, given these points, it is of the utmost importance that we ask inconvenient questions about these environmental costs before becoming locked into this new AI infrastructure.
Article
Climate change is a global priority. In 2015, the United Nations (UN) outlined its Sustainable Development Goals (SDGs), which stated that taking urgent action to tackle climate change and its impacts was a key priority. The 2021 World Climate Summit finished with calls for governments to take tougher measures towards reducing their carbon footprints. However, it is not obvious how governments can practically achieve this goal. One challenge is gaining awareness of how energy-intensive a system or mechanism is. Artificial intelligence (AI) is increasingly being used to solve global problems, and it could potentially help address climate change, but creating AI systems often requires vast amounts of up-front computing power and can thereby be a significant contributor to greenhouse gas emissions. If governments are to take the SDGs and calls to reduce carbon footprints seriously, they need a management and governance mechanism to (i) audit how much their AI systems 'cost' in terms of energy consumption and (ii) incentivise individuals to act on the auditing outcomes, in order to avoid or justify politically controversial restrictions that might be seen as bypassing the creativity of developers. The idea is thus to find a practical solution, implementable in software design, that incentivises and rewards while respecting the autonomy of developers and designers to come up with smart solutions. This paper proposes such a sustainability management mechanism by introducing the notion of 'Sustainability Budgets', akin to the Privacy Budgets used in Differential Privacy, and by using these to introduce a 'Game' in which participants are rewarded for designing energy-efficient systems. The participants in this game include the machine learning developers themselves, a new focus for this problem that this text introduces. The paper later expands this notion to sustainability management in general and outlines how it might fit into a wider governance framework.
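One way to picture how such a 'Sustainability Budget' might operate in software is as a simple allocation that each training run draws down, with the remaining balance serving as the reward signal for energy-efficient design. The sketch below is a minimal illustration under these assumptions; it is not the governance mechanism specified in the article, and all names are hypothetical.

```python
# Minimal sketch of a 'Sustainability Budget', loosely analogous to the
# privacy budget in differential privacy. Names and rules are assumptions.

class SustainabilityBudget:
    def __init__(self, budget_kwh: float):
        self.budget_kwh = budget_kwh
        self.spent_kwh = 0.0

    def charge(self, run_name: str, energy_kwh: float) -> None:
        """Deduct a training run's measured energy use from the team's budget."""
        if self.spent_kwh + energy_kwh > self.budget_kwh:
            raise RuntimeError(f"{run_name} would exceed the sustainability budget")
        self.spent_kwh += energy_kwh

    @property
    def remaining_kwh(self) -> float:
        """Unspent budget: the 'reward' preserved by energy-efficient designs."""
        return self.budget_kwh - self.spent_kwh

team = SustainabilityBudget(budget_kwh=10_000)
team.charge("baseline-model", 4_200)
team.charge("efficient-variant", 1_100)  # a cheaper design preserves more budget
print(team.remaining_kwh)                # 4700.0
```

The design choice mirrors the article's auditing-plus-incentive idea: measurement happens at the point of consumption, and the developer retains full freedom in how to stay under budget.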
Article
The collection, processing, storage, and circulation of data are fundamental elements of contemporary societies. While the positivistic literature on the 'data revolution' finds it essential for improving development delivery, critical data studies stress the threats of datafication. In this article, we demonstrate that datafication has been happening continuously throughout history, driven by political and economic pressures. We use historical examples to show how resource and personal data were extracted, accumulated, and commodified by colonial empires, national governments, and trade organizations, and argue that similar extractive processes are a present-day threat in the Global South. We argue that the decoupling of earlier and current datafication processes obscures the underlying, complex power dynamics of datafication. Our historical perspective shows how, once aggregated, data may become imperishable and can be appropriated for problematic purposes in the long run by both public and private entities. Using historical case studies, we challenge current regulatory approaches that view data as a commodity and frame it instead as a mobile, non-perishable, yet ideally inalienable right of people.
Article
This work examines the evolution of Latin American Civil Society Organizations' (CSOs) resistance practices in the context of datafication and how these relate to the notions of symbolic domination denounced by the Latin American School of Communication. Although CSOs in Latin America are still exploring the problems surrounding datafication, signs of vitality are already showing in broader debates around human rights, community development, and media policies. The study identifies the main themes underlying Latin American CSOs' datafication work and assesses how these themes shape resistance practices and CSOs' perceptions of asymmetrical power relations. While some patterns fall within existing conceptualisations of resistance practices and data activism, this paper identifies new conceptual and empirical approaches to the challenges posed by a datafied society.
Article
While there is a growing effort towards AI for sustainability (e.g. towards the Sustainable Development Goals), it is time to move beyond that and to address the sustainability of developing and using AI systems. In this paper I propose a definition of Sustainable AI: Sustainable AI is a movement to foster change in the entire lifecycle of AI products (i.e. idea generation, training, re-tuning, implementation, governance) towards greater ecological integrity and social justice. As such, Sustainable AI is focused on more than AI applications; rather, it addresses the whole sociotechnical system of AI. I suggest that Sustainable AI is not about how to sustain the development of AI per se, but about how to develop AI that is compatible with sustaining environmental resources for current and future generations, economic models for societies, and societal values that are fundamental to a given society. I articulate that the phrase Sustainable AI should be understood as having two branches: AI for sustainability, and the sustainability of AI (e.g. reduction of carbon emissions and computing power). I propose that Sustainable AI take sustainable development as the core of its definition, with three accompanying tensions: between AI innovation and equitable resource distribution; between inter- and intra-generational justice; and between environment, society, and economy. This paper is not meant to engage with each of the three pillars of sustainability (i.e. social, economic, environmental), and as such with the pillars of Sustainable AI. Rather, it is meant to inspire the reader, the policy maker, the AI ethicist, and the AI developer to connect with the environment: to remember that there are environmental costs to AI, and to direct funding towards sustainable methods of AI.
Chapter
Scholars from across law, internet, and media studies examine the human rights implications of today's platform society. Today, companies such as Apple, Facebook, Google, Microsoft, and Twitter play an increasingly important role in how users form and express opinions, encounter information, debate, disagree, mobilize, and maintain their privacy. What are the human rights implications of an online domain managed by privately owned platforms? According to the Guiding Principles on Business and Human Rights, adopted by the UN Human Rights Council in 2011, businesses have a responsibility to respect human rights and to carry out human rights due diligence. But this goal depends on the willingness of states to encode such norms into business regulations and of companies to comply. In this volume, contributors from across law, internet, and media studies examine the state of human rights in today's platform society. The contributors consider the 'datafication' of society, including the economic model of data extraction and the conceptualization of privacy. They examine online advertising, content moderation, corporate storytelling around human rights, and other platform practices. Finally, they discuss the relationship between human rights law and private actors, addressing such issues as private companies' human rights responsibilities and content regulation. Open access edition published with generous support from Knowledge Unlatched and the Danish Council for Independent Research. Contributors: Anja Bechmann, Fernando Bermejo, Agnès Callamard, Mikkel Flyverbom, Rikke Frank Jørgensen, Molly K. Land, Tarlach McGonagle, Jens-Erik Mai, Joris van Hoboken, Glen Whelan, Jillian C. York, Shoshana Zuboff, Ethan Zuckerman
Article
There is great interest in how the growth of artificial intelligence and machine learning may affect global GHG emissions. However, such emissions impacts remain uncertain, owing in part to the diverse mechanisms through which they occur, posing difficulties for measurement and forecasting. Here we introduce a systematic framework for describing the effects of machine learning (ML) on GHG emissions, encompassing three categories: computing-related impacts, immediate impacts of applying ML, and system-level impacts. Using this framework, we identify priorities for impact assessment and scenario analysis, and suggest policy levers for better understanding and shaping the effects of ML on climate change mitigation. The rapid growth of artificial intelligence (AI) is reshaping our society in many ways, and climate change is no exception; this Perspective presents a framework to assess how AI affects GHG emissions and proposes approaches to align the technology with climate change mitigation.
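The framework's three categories can be pictured as a simple tagging scheme for candidate emission effects. The sketch below is an illustrative rendering of the categories named in the abstract; the example effects and identifiers are our assumptions, not the authors' data.

```python
# Illustrative tagging of the emission effects of ML using the three
# categories named in the abstract. The example effects are assumed,
# not taken from the paper.

from enum import Enum

class EmissionCategory(Enum):
    COMPUTING = "computing-related impacts"          # e.g. energy for training and inference
    IMMEDIATE = "immediate impacts of applying ML"   # e.g. efficiency gains in one process
    SYSTEM_LEVEL = "system-level impacts"            # e.g. induced demand, rebound effects

effects = {
    "electricity used to train a language model": EmissionCategory.COMPUTING,
    "ML-optimised routing cuts truck fuel use": EmissionCategory.IMMEDIATE,
    "cheaper logistics increases total freight demand": EmissionCategory.SYSTEM_LEVEL,
}

for effect, category in effects.items():
    print(f"{category.value}: {effect}")
```

Separating the categories this way makes the measurement difficulty the abstract notes concrete: the first category can be metered directly, while the second and third require counterfactual and system-wide analysis.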