
Situating Questions of Data, Power, and Racial Formation



This special theme of Big Data & Society explores connections, relationships, and tensions that coalesce around data, power, and racial formation. This collection of articles and commentaries builds upon scholarly observations of data substantiating and transforming racial hierarchies. Contributors consider how racial projects intersect with interlocking systems of oppression across concerns of class, coloniality, dis/ability, gendered difference, and sexuality across contexts and jurisdictions. In doing so, this special issue illuminates how data can both reinforce and challenge colorblind ideologies as well as how data might be mobilized in support of anti-racist movements.
Renee Shelby and Kathryn Henne
Keywords: Data, racial formation, power, intersectionality, algorithmic bias, datafication
This article is a part of the special theme on Data, Power and Racial Formations.
School of Regulation and Global Governance (RegNet), The Australian National University, Canberra, Australian Capital Territory, Australia
Corresponding author: Renee Shelby, School of Regulation and Global Governance (RegNet), The Australian National University, HC Coombs Extension Building #8, 8 Fellows Road, Canberra, Australian Capital Territory, Australia.
Creative Commons NonCommercial-NoDerivs CC BY-NC-ND: This article is distributed under the terms of the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 License, which permits non-commercial use, reproduction and distribution of the work as published without adaptation or alteration, without further permission provided the original work is attributed as specified on the SAGE and Open Access page.
Big Data & Society, January-June: 1-4
© The Author(s) 2022
DOI: 10.1177/20539517221090938
Although datafication promises better-organized information that captures contextualized phenomena and expedites decision-making, big data is shaped by legacies of inequality that can enable material and representational harms. Critical observers have warned that artificial intelligence (AI), big data, and other so-called "smart" technologies threaten not only to automate discrimination and oppression but to become central mechanisms through which racism operates (Benjamin, 2019; Noble, 2018; Barocas and Selbst, 2016; Stark, 2018). Extending these insights, scholars of critical data studies have scrutinized how big data contributes to processes of racialization. They provide important analyses of the pervasiveness of whiteness in AI and machine learning (Birhane and Guest, 2020; Cave and Dihal, 2020; Phan, 2019; Schlesinger et al., 2018), the limitations of anti-discrimination and "fairness" approaches to race and other social hierarchies in machine learning (Hoffmann, 2019), strategies for operationalizing the multidimensionality of race in sociotechnical systems (Hanna et al., 2020), and frameworks for addressing racialized harms, such as algorithmic reparation (Davis et al., 2021).
This scholarship evinces how big data co-produces racialized social phenomena and inequalities, extending claims that data and datafication are cultural processes (e.g. Friedman and Nissenbaum, 1996; Gitelman, 2013; Kitchin, 2014). Racial co-production is not limited to critical data studies; it intersects with critical conversations that span critical race theory (CRT), postcolonial studies, the sociology of race and ethnicity, and science and technology studies (STS), among others. Work in these allied fields provides important insights into how race and racism are deeply entangled in the collection, use, and deployment of data, which have been gleaned through analyses of classification (Goldberg, 2001; Zuberi, 2001), methodology and disciplinary practice (Daniels, 2013, 2015; Walter and Andersen, 2013; Zuberi and Bonilla-Silva, 2008), racialized social control and surveillance (Browne, 2015; Monahan, 2010), and liberation (Coleman, 2009; Kadiri, 2021). They demonstrate that there is much to be gained within data studies, theoretically and practically, from deepening engagement with intellectual schools of thought that have long been concerned with race and racism.
This special theme explores how data and technological platforms constitutively contribute to contemporary racial hierarchies, attending to both sociocultural and material implications. The papers in this collection showcase interdisciplinary insights from scholars working across the fields of gender studies, library and information sciences, Internet and media studies, STS, socio-legal studies, and sociology. Drawing together case studies and theoretical explorations, the authors make productive inroads in new and emergent conversations regarding how data emerge in and through racial projects as they intersect with systems of class, colonialism, dis/ability, gender, and sexuality. They illustrate how explicit engagement with interdisciplinary theories of race and racism can enhance understandings of big data's material impacts and can inform means of addressing these impacts. In doing so, the articles and commentaries not only contribute to ongoing scholarly debates about how data are mobilized to innovate, interrupt, and even generate racisms, but also aid in identifying strategies to support anti-racist and sovereignty movements.
Unpacking data, power, and racial formation
Considering big data as a mechanism of racialized power prompts a range of critical questions. How do modes of datafication normalize racial classification systems and mask their sociocultural underpinnings? To what extent can big data work in the service of liberatory agendas? What are the opportunities and risks of practices, protections, and systems that promise more equitable outcomes? These questions are especially important when faced with the "seduction" of data-driven knowledge production and quantification (see Merry, 2016).
Recognizing that others are asking these questions in relation to data sets and data set development (Scheuerman et al., 2021; Buolamwini and Gebru, 2018), model development and racial classification (e.g. Hanna et al., 2020; Angwin and Larson, 2016), and the production of race on digital media platforms (e.g. Brock, 2009, 2020; Tynes et al., 2011), this special theme considers the dynamic relationships between datafication and racial formation. By racial formation, a term commonly associated with work by sociologists Omi and Winant (1994: 12), we mean how race becomes defined and contested throughout society, both in collective action and personal practice, with a focus on the processes by which social, economic, and political forces determine the content and importance of racial categories, and by which they are in turn shaped by racial meanings. Racial formations mark historical, political, and social processes through which power takes shape and becomes articulated in and through racial categories. Earlier analyses that share these concerns attend to how data have operated in the service of substantiating and transforming social categories of difference across contexts and jurisdictions (e.g. Chun, 2009; Goldberg, 1997; Hammonds, 1997; Reardon, 2004). More recent scholarship focuses on the sociocultural implications that affect racial hierarchies, challenges colorblind understandings of data and algorithms, interrogates how technological platforms discipline social interaction, and examines how data become animated through situated knowledge (Browne, 2015; M'charek et al., 2013; Muhammad, 2011; Noble and Tynes, 2016; Walter, 2016).
This collection captures connections and tensions between data and racial formation across different scales, sites, and structures, reflecting on how they manifest in lived experience and representational forms. Here, authors use and extend analyses of racial formation by illustrating how data can operate in the service of substantiating and transforming inequalities across contexts and jurisdictions. In sum, the papers in this special theme address how data become implicated within the interlocking systems of domination and oppression that affect everyday lives and
Overview of this special theme
The collection features analyses that illustrate how data are mobilized to innovate and interrupt forms of racism. Their findings illuminate how data can both instantiate and challenge colorblind ideologies. Providing nuanced insights about interlocking inequalities, this special theme advances theoretical understandings of data and racial formation and offers points of caution for anti-racist movements. As calls for data-driven systems for social good and demands for technology in the public interest have gained traction in recent years, these contributions are particularly timely: they provide many examples that demonstrate the importance of attending to sociopolitical, subjugated, and technical knowledges when disentangling the materialities of data production, advocacy, and critical data-related inquiry.
The opening commentary by Phan and Wark (2021) takes up Gilroy's provocative claim that "the time of 'race' may be coming to a close" (1998: 840) as a starting point for reconsidering how the mediated nature of datafied processes evinces shifts in racialization. They ask: As regimes of computation are largely opaque modes of classification, what does race become? The commentary documents epistemological shifts in which racialized subjects emerge through assemblages of data, revealing a new regime that they refer to as "racial formations as data formations."
Hatch (2022), author of the first article in the theme, examines how the governance of coronavirus disease 2019 (COVID-19) data became central to addressing racism in health and health care in the United States, acknowledging a common view that racialized COVID-19 health disparities would have been greater without these data. Hatch challenges this idea by querying whether the production and circulation of racial health data strengthened anti-Black racism. He traces how metrics of racial death are mobilized to institute racist social laws, policies, and systems. Using the metaphor of "racial antimatter" to capture how statistics can represent the social world in ways that fail to correspond to lived experiences, Hatch (2022: 6) examines how data work in the service of weaponizing knowledge of racial inequalities.
The third contribution to the theme, an article by Henne, Shelby, and Harb (2021), illustrates how racial capitalism can enhance understanding of data, capital, and inequality through an in-depth study of digital platforms used for intervening in gender-based violence. Examining how reporting apps use data to support institutionally legible narratives of violence, the authors draw attention to how these apps reinforce racialized property relations built on extraction and ownership, the capital accumulation that reinforces the inequitable distribution of benefits derived through and from data, and the commodification of diversity and inclusion.
Sooriyakumaran's (2022) commentary is similarly concerned with racialized inequalities etched and shaped by capitalist relations. Their scope and focus, however, begin with localized encounters, using an autoethnographically informed reflection to trace the impacts and implications of digitized residential tenancy databases in Australia. Demonstrating how residential tenancy databases are racialized technologies with colonial underpinnings, Sooriyakumaran's (2022) analysis articulates the need for multifaceted frameworks that attend to how racial capitalism, state surveillance, and colonialism continue to operate, in this case in and through tenancy databases.
The next article by Crooks (2021) examines non-profit efforts to make public schools data driven through the aggregation, analysis, and visualization of digital data. Drawing on theoretical explanations of racialized organizations (Ray, 2019), the analysis illuminates a form of "productive myopia," "a way of pursuing racial projects via seemingly independent, objective quantifications" (Crooks, 2021: 2), which enables claims that data can reduce the impacts of racial inequalities while also facilitating them. This grounded approach highlights how racial projects are taken up in public education through EdTech and data-driven
The concluding commentary by Anantharajah (2021) examines how racial formation takes shape through data projects, drawing on ethnographic research on climate finance governance conducted in Fiji. Her explanation of how climate finance organizations develop and use data projects to support flows of capital targeting the Pacific elaborates on how such practices are mediated through schemas with both colonial and racial contours, lenses that have racializing implications even though they are not visible on the surface.
Taken together, the articles and commentaries presented in this special theme engage longstanding and emergent concerns regarding data's role within racial formation, attending to recent cultural and political developments as well as geopolitical and sociotechnical shifts. They showcase not only how data are enrolled in processes of racial formation, but also how they intersect with projects of class, dis/ability, gender, and sexuality as well as other social categories of difference. We hope the collection serves as a productive resource for readers from a range of fields and contributes to a generative dialog that crosses disciplinary boundaries.
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
References
Anantharajah K (2021) Postracialism, coloniality, and climate finance organizations: Implications for emergent data projects in the Pacific. Big Data & Society 8: 1-7.
Angwin J and Larson J (2016) ProPublica responds to company's critique of machine bias story. ProPublica, 29 July. https:// critique-of-machine-bias-story (accessed 27 March 2022).
Barocas S and Selbst A (2016) Big data's disparate impact. California Law Review 104: 671-732.
Benjamin R (2019) Race After Technology: Abolitionist Tools for the New Jim Code. Cambridge, UK: Polity Press.
Birhane A and Guest O (2020) Towards decolonizing computational sciences. arXiv:2009.14258.
Brock A (2009) Life on the wire. Information, Communication & Society 12(3): 344-363.
Brock A (2020) Distributed Blackness. New York, NY: New York University Press.
Browne S (2015) Dark Matters: On the Surveillance of Blackness. Durham, NC: Duke University Press.
Buolamwini J and Gebru T (2018) Gender shades: Intersectional accuracy disparities in commercial gender classification. Proceedings of the 1st Conference on Fairness, Accountability and Transparency 81: 77-91.
Cave S and Dihal K (2020) The whiteness of AI. Philosophy & Technology 33(4): 685-703.
Chun WHK (2009) Introduction: Race and/as technology; or, how to do things to race. Camera Obscura: Feminism, Culture, and Media Studies 24(1): 7-35.
Coleman B (2009) Race as technology. Camera Obscura: Feminism, Culture, and Media Studies 24(1): 177-207.
Crooks R (2021) Productive myopia: Racial organizations and EdTech. Big Data & Society 8: 1-16.
Daniels J (2013) Race and racism in internet studies: A review and critique. New Media and Society 15(5): 695-719.
Daniels J (2015) "My brain database doesn't see skin color": Color-blind racism in the technology industry and in theorizing the web. American Behavioral Scientist 59(11): 1377-1393.
Davis JL, Williams A and Yang MW (2021) Algorithmic reparation. Big Data & Society 8: 2.
Friedman B and Nissenbaum H (1996) Bias in computer systems. ACM Transactions on Information Systems 14(3): 330-347.
Gilroy P (1998) Race ends here. Ethnic and Racial Studies 21(5):
Gitelman L (2013) "Raw Data" Is an Oxymoron. Cambridge, MA: The MIT Press.
Goldberg DT (1997) Racial Subjects: Writing on Race in America. London: Routledge.
Goldberg DT (2001) The Racial State. Malden, MA:
Hammonds EM (1997) New technologies of race. In: Calvert M and Terry J (eds) Processed Lives: Gender and Technology in Everyday Life. London: Routledge, pp. 107-122.
Hanna A, Denton E, Smart A, et al. (2020) Towards a critical race methodology in algorithmic fairness. Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency: 501-512. doi:10.1145/3351095.3372826.
Hatch A (2022) The data will not save us: Afropessimism in the COVID-19 archives. Big Data & Society 9: 1-13.
Henne K, Shelby R and Harb J (2021) The datafication of #MeToo: Whiteness, racial capitalism, and anti-violence technologies. Big Data & Society 8: 1-14.
Hoffmann AL (2019) Where fairness fails: Data, algorithms, and the limits of antidiscrimination discourse. Information, Communication & Society 22(7): 900-915.
Kadiri A (2021) Data and afrofuturism: An emancipated subject? Internet Policy Review 10(4): 1-26.
Kitchin R (2014) The Data Revolution: Big Data, Open Data, Data Infrastructures and Their Consequences. Thousand Oaks, CA: SAGE.
M'charek A, Schramm K and Skinner D (2013) Topologies of race: Doing territory, population, and identity in Europe. Science, Technology, and Human Values 39(4): 468-487.
Merry SE (2016) The Seductions of Quantification: Measuring Human Rights, Gender Violence, and Sex Trafficking. Chicago, IL: University of Chicago Press.
Monahan T (2010) Surveillance in the Time of Insecurity. New Brunswick, NJ: Rutgers University Press.
Muhammad KG (2011) The Condemnation of Blackness: Race, Crime, and the Making of Modern Urban America. Cambridge, MA: Harvard University Press.
Noble SU (2018) Algorithms of Oppression: How Search Engines Reinforce Racism. New York, NY: New York University
Noble SU and Tynes BM (2016) The Intersectional Internet: Race, Sex, Class, and Culture Online. New York, NY: Peter
Omi M and Winant H (1994) Racial Formation in the United States. New York, NY: Routledge.
Phan T (2019) Amazon Echo and the aesthetics of whiteness. Catalyst: Feminism, Theory, Technoscience 5(1): 1-38.
Phan T and Wark S (2021) Racial formations as data formations. Big Data & Society 8: 1-5.
Ray V (2019) A theory of racialized organizations. American Sociological Review 84(1): 26-53.
Reardon J (2004) Race to the Finish: Identity and Governance in an Age of Genomics. Princeton, NJ: Princeton University
Scheuerman MK, Denton E and Hanna A (2021) Do datasets have politics? Disciplinary values in computer vision dataset development. Proceedings of the ACM on Human-Computer Interaction (5): 1-37. doi:10.1145/3476058.
Schlesinger A, O'Hara KP and Taylor AS (2018) Let's talk about race: Identity, chatbots, and AI. Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems: 1-14.
Sooriyakumaran D (2022) Systems that never forget: Residential tenancy databases in the Australian private rental market. Big Data & Society 23(9).
Stark L (2018) Facial recognition, emotion, and race in animated social media. First Monday. doi:10.5210/fm.v23i9.9406.
Tynes BM, Garcia EL, Giang MT, et al. (2011) The racial landscape of social networking sites: Forging identity, community, and civic engagement. I/S: A Journal of Law and Policy for the Information Society 7: 71-100.
Walter M (2016) Data politics and indigenous representation in Australian statistics. In: Kukutai T and Taylor J (eds) Indigenous Data Sovereignty: Toward an Agenda. Canberra, ACT: ANU Press, pp. 79-98.
Walter M and Andersen C (2013) Indigenous Statistics: A Quantitative Research Methodology. Walnut Creek, CA: Left Coast Press.
Zuberi T (2001) Thicker Than Blood: How Racial Statistics Lie. Minneapolis, MN: University of Minnesota Press.
Zuberi T and Bonilla-Silva E (2008) White Logic, White Methods: Racism and Methodology. Plymouth, UK: Rowman &