Publisher: Routledge
Urban Geography
Publication details, including instructions for authors and
subscription information:
http://www.tandfonline.com/loi/rurb20
Dazzled by data: Big Data, the census
and urban geography
Richard Shearmur
School of Urban Planning, McGill University, Room 400
Macdonald-Harrington Building, 815 rue Sherbrooke Ouest,
Montréal, Québec H3A 0C2, Canada
Published online: 10 Jun 2015.
To cite this article: Richard Shearmur (2015): Dazzled by data: Big Data, the census and urban
geography, Urban Geography, DOI: 10.1080/02723638.2015.1050922
To link to this article: http://dx.doi.org/10.1080/02723638.2015.1050922
EDITORIAL
Dazzled by data: Big Data, the census and urban geography
Richard Shearmur*
School of Urban Planning, McGill University, Room 400 Macdonald-Harrington Building, 815 rue
Sherbrooke Ouest, Montréal, Québec H3A 0C2, Canada
About five years ago I wrote an editorial describing the role that the census (and other
data sources that are authoritative, open to scrutiny, representative of the entire population,
and resting on slowly evolving and relatively consensual definitions) play in building a
shared imaginary and vocabulary that can be used to describe society, highlight changes,
point out injustices, and identify geographic regularities, exceptions, and peculiarities
(Shearmur, 2010). At the time the Canadian long-form census, the data-gathering exercise
which had allowed Canadian social scientists to track changes over the years, had been
abandoned. Five years later, the results of the 2011 National Household Survey are in, and
they are virtually unusable to describe Canadian society, or to document trends, such as
the suspected spatial and social polarization of incomes, which have (probably) been
occurring.
I, and many other social science researchers, have voiced these concerns on and off
over the last five years. A common response has been to consider our concerns
anachronistic: why on earth worry about the loss of census data when Big Data are here? The
marvels, infinite possibilities and sheer newness of Big Data are contrasted with the staid
and limited information that, it is thought, can be gleaned from the census. For example,
Facebook can use its information to track the formation and dissolution of networks
in real time, and cell phone companies can map the movements of their customers: can the
census do that? New York City analysts, by searching their massive databases for
correlations, can predict which drain covers are likely to blow up (Mayer-Schönberger
& Cukier, 2014): can the census do that? Our security agencies are probably better
informed than we are about where we will be tomorrow: can the census do that? A
group of students has analysed Foursquare check-ins to establish “neighbourhood” zones
within cities[1]: can the census do that? Furthermore, we are told that things will only get
better: researchers at Google are building computers based on neural networks that can
digest almost infinite masses of Big Data and use them to learn new things (i.e. the
computers learn how to learn, following the probabilistic and hierarchical logic of neural
networks) and to “think”. A brave new world is upon us: more data and more computing
capacity are going to reveal all and solve our problems, hubris reminiscent of the post-war
cyberneticists (Mirowski, 2002), data-driven regional scientists of the 1960s and some
GIS analysts of the 1990s. The Big Data vision takes us right back to Laplace’s
positivistic demon, the imaginary—but now, we are told, realisable—entity which,
armed with perfect information, will be able to predict the future, taking all humanness,
imperfection and doubt out of our lives (Laplace, 1814). The Big Data view of the world
is one of absolute determinism, and the proposed solution to residual uncertainty is more
data and better computers.[2]
*Email: Richard.shearmur@mcgill.ca
© 2015 Taylor & Francis
Of course, these new data and computational techniques are powerful tools, and they
allow us to better apprehend and analyse the “thing” world. Businesses can use them to
analyse, influence and predict (in a limited way) the market behaviour of their customers.
Transport agencies can better manage flows and congestion, and combinations of topographical,
vegetation and other features can be analysed and visualised far better than
before. Most of these technologies have useful applications, but also have limitations: they
are just as error-prone as any other system that bases its decisions on past trends and
correlations, and, because they are complex and inscrutable, their errors are often difficult to pinpoint.
Individuals—such as innocent travellers algorithmically selected to be on no-fly lists—get
crushed, and whilst administrative systems have always done this, Big Data will only
exacerbate, magnify and accelerate the problem. Likewise, no amount of Big Data can
predict entirely novel phenomena, something that humans with imagination have sometimes
been able to do and, indeed, to create.
But do Big Data and neural computers tell us much about society? To what extent can
the most powerful neural network imaginable, which possesses all the Big Data in the
world, really grapple with human desires, with issues of justice, and with deliberative and
conflictual processes where there is no correct answer, no rabbit to pull out of a hat, but
clashes of will, persuasion, emotions, life and death? Furthermore, apart from learning
about these things second-hand, what can a computer—even one that has taught itself to
learn from Big Data—really know about fear, joy, hate or even, more prosaically, about
lying in the grass with the sun shining on its face?
There are two major flaws to the Big Data vision, flaws that have been discussed for
the last 200 years, but which are worth recalling given the role now assigned to Big Data.
First, Big Data can only be about codifiable and digitised information. Big Data and
powerful computers—even those capable of “learning”—abstract totally from the human
corporeal and emotional experience which is alien to them. Since society is not a machine
—pace Laplace’s demon—no amount of information and computing can be equated with
understanding. Understanding—unless one alters its definition to mean correlations and
predictions based on codified past experience—is intrinsically human and calls upon both
remembering and forgetting, calls upon choices, values and theories which carry meaning
for people. There are many understandings of social phenomena, not a single understanding,
and these differing perspectives are irreconcilable with the determinism implicit
behind Big Data rhetoric. It is only if Big Data become society, peripheralising the human,
that their claim to know all will have some (inhuman) validity.
Second, and in line with critiques of the surveillance society, Big Data may appear to
successfully understand and predict things, but only because powerful interests behind
them are shaping the world. Thus, for instance, if Big Data are used to identify terrorists,
but if the notion of “terrorist” is defined by the algorithms and ideologies fed into Big
Data computers, then—tautologically—Big Data will appear successful at this task.
Likewise, if Big Data pick up trends, and are then used to facilitate those trends, then
the data cease to be a tool for understanding the world but rather one for shaping it.
Given these two fundamental points, I will now return to the more limited question
which motivates this editorial: can Big Data—the type currently being generated and used
—tell us more about society than census-type data? There are no doubt divergent opinions
on this, but I suggest that Big Data can reveal different things, can generate some useful
ideas, but cannot replace census-type data (see Graham & Shelton, 2013).
The reasons for this are straightforward: however big the data, Big Data are not about
society, but about users and markets. They are therefore inherently biased in that they do
not track people who fall outside the particular markets or activities being tracked. This is
why these data are incredibly useful to operators, but only in narrow areas despite their
size. Furthermore, Big Data, collected passively, cannot provide complex cross-tabulations
linking individuals to a wide variety of attributes, to families, households, neighbourhoods
and jobs. They can often be used to infer relationships, gathering imperfect
information from a wide variety of sources, but such inferences run into common
statistical problems such as the ecological fallacy, non-representative sampling bias and
self-selection bias.
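
The point that size does not cure self-selection can be illustrated with a small simulation. This is only a sketch under invented assumptions: the lognormal income distribution, the rule that the chance of being a platform user rises with income, and the function name `simulate_selection_bias` are all hypothetical and do not describe any real data set.

```python
import random
import statistics

def simulate_selection_bias(pop_size=100_000, srs_size=1_000, seed=42):
    """Compare a large self-selected sample with a small random one.

    All numbers are hypothetical: incomes are drawn from a lognormal
    distribution, and the chance of appearing in the "Big Data" set is
    assumed to rise with income (capped at 1).
    """
    rng = random.Random(seed)
    incomes = [rng.lognormvariate(10.5, 0.6) for _ in range(pop_size)]

    # Self-selection: richer people are more likely to be platform users.
    users = [x for x in incomes if rng.random() < min(1.0, x / 100_000)]

    # Census-style simple random sample, far smaller than the user base.
    srs = rng.sample(incomes, srs_size)

    return {
        "true_mean": statistics.fmean(incomes),
        "big_data_mean": statistics.fmean(users),
        "srs_mean": statistics.fmean(srs),
        "n_users": len(users),
        "n_srs": len(srs),
    }

result = simulate_selection_bias()
big_error = abs(result["big_data_mean"] - result["true_mean"])
srs_error = abs(result["srs_mean"] - result["true_mean"])
print(result["n_users"], round(big_error), round(srs_error))
```

Under these assumptions the self-selected set is many times larger than the random sample, yet its mean sits further from the population mean: adding more self-selected records narrows the variance of the estimate but leaves the bias untouched.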
Another limitation of Big Data is fluidity of categories and of definitions: the
concepts that underpin Big Data collection make short-term operational sense to the
data gatherers, but do not reflect categories, classifications and concepts developed
through slow public deliberation and dialogue over a number of years, or centuries in
some cases when it comes to censuses (Shearmur, 2010). Thus, the concepts and
definitions that structure Big Data are rarely what researchers need: rather, researchers
may adapt their concepts to the data available, an opportunistic way of conducting
research that may lead to interesting observations but that will often bypass ideas
important to academic and social debate. The concepts will have been implicitly
shaped and constrained by the data gatherers and providers, a further way in which
Big Data shape, rather than reflect, society.[3]
Having said this, the power of Big Data should not be underestimated. They can
reveal new dynamics, can allow for the study of certain processes in real time and can
highlight relationships and correlations that may pass unnoticed using classical methods
and data. As such, for the purposes of social science (and urban geography in particular),
they can serve to generate hypotheses inductively: but however useful inductive
reasoning may be, theory and imagination are necessary to understand and interpret
observations. Furthermore, even inferential reasoning requires unbiased data: for inferences
that extend beyond the particular markets or operations for which data are
gathered, well-selected samples of the target population remain necessary. To understand
how society as a whole is structured and evolves, census-type data are inescapably
essential: and, of course, census data provide a sampling framework against which the
biases of Big Data can be estimated. Without the census there is no such thing as a
population against which sampling and representativity can be assessed.
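
One common way in which census figures serve as a benchmark is post-stratification: comparing the make-up of a non-representative data set with census proportions and reweighting accordingly. The sketch below is an illustration of that standard survey technique, not a method proposed in this editorial; the age groups, counts, outcome means and the function name `post_stratified_mean` are hypothetical.

```python
def post_stratified_mean(sample_counts, sample_means, census_shares):
    """Reweight stratum means from a biased sample to census proportions.

    sample_counts: observations per stratum in the non-representative data
    sample_means:  mean of the outcome within each stratum of that data
    census_shares: share of each stratum in the population (sums to 1)
    Returns (raw_mean, weighted_mean, coverage_ratios).
    """
    total = sum(sample_counts.values())
    if abs(sum(census_shares.values()) - 1.0) > 1e-9:
        raise ValueError("census shares must sum to 1")

    # Raw mean: each stratum counts in proportion to its presence in the data.
    raw_mean = sum(sample_counts[s] * sample_means[s] for s in sample_counts) / total

    # Weighted mean: each stratum counts in proportion to its census share.
    weighted_mean = sum(census_shares[s] * sample_means[s] for s in sample_counts)

    # Ratio above 1 means the stratum is over-represented relative to the census.
    coverage = {s: (sample_counts[s] / total) / census_shares[s] for s in sample_counts}
    return raw_mean, weighted_mean, coverage

# Hypothetical figures, for illustration only.
counts = {"18-34": 600, "35-64": 300, "65+": 100}
means = {"18-34": 10.0, "35-64": 20.0, "65+": 30.0}
census = {"18-34": 0.3, "35-64": 0.4, "65+": 0.3}

raw, weighted, coverage = post_stratified_mean(counts, means, census)
print(raw, weighted, coverage)
```

Neither the coverage ratios nor the weights can be computed without the census shares, which is the dependence described above. Reweighting of this kind corrects only for differences between the measured strata; it cannot repair selection within a stratum, nor account for people who are absent from the data altogether.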
Big Data are a welcome and interesting new tool which will provide new insights into
urban geographic and other social processes—if they become widely accessible to
researchers. However, given that they are generated by and for businesses, transport
authorities and similar bodies for operational purposes, their principal applications will
remain within the confines of those operations. The danger is that they are dazzling, they
are big and they look (to governments such as the Canadian one, whispered to by Big
Business, which understands the power of Big Data) pretty much like other data except
there are more of them and they are cheaper to collect.
The potentiality and limits of Big Data (and of associated advances in computing)
need to be critically explored and understood. Likewise, the continued importance of
gathering census-type data in order for society to be imagined, tracked and apprehended
as a whole—not as a series of superimposed markets and operations—needs to be
reaffirmed. And finally, the messy, human, emotion-driven, political nature of social
processes, which neither census, Big Data nor computers of any stripe can capture, should
be remembered every time hubristic projects such as Google’s DeepMind, the aim of
which is “to solve intelligence” (Simonite, 2014), are invoked.
Notes
1. http://www.fastcodesign.com/1669554/a-map-of-your-city-s-invisible-neighborhoods-according-to-foursquare
2. Set against this hubristic discourse, there is a growing body of work that looks critically at Big
Data. For instance, Boyd and Crawford (2011) present six “provocations” for—or limitations of
—Big Data which overlap some of the points being made in this editorial.
3. Mahrt and Scharkow (2013) discuss many of the limitations of using Big Data in social science
research, and Graham and Shelton (2013) provide a thoughtful discussion on the potentials and
pitfalls of Big Data in human geography.
References
Boyd, Danah, & Crawford, Kate (2011). Provocations for a cultural, technological and scholarly
phenomenon. Information, Communication and Society, 15(2), 662–679.
Graham, Mark, & Shelton, Taylor (2013). Geography and the future of Big Data, Big Data and the
future of geography. Dialogues in Human Geography, 3(3), 255–261.
Laplace, Pierre-Simon (1814). Essai philosophique sur les probabilités. Paris: Courcier. Retrieved
from http://books.google.fr/books?id=rDUJAAAAIAAJ&printsec=frontcover&hl=fr&&pg=PA2#v=onepage&q&f=false
Mahrt, Merja, & Scharkow, Michael (2013). The value of Big Data in digital media research.
Journal of Broadcasting & Electronic Media, 57(1), 20–33.
Mayer-Schönberger, Victor, & Cukier, Kenneth (2014). Big Data. New York, NY: Mariner Books.
Mirowski, Philip (2002). Machine dreams: Economics becomes a cyborg science. Cambridge:
Cambridge University Press.
Shearmur, Richard (2010). Editorial—A world without data? The unintended consequences of
fashion in geography. Urban Geography, 31(8), 1009–1017.
Simonite, Tom (2014, December 2). Google’s intelligence designer. MIT Technology Review.
Retrieved from http://www.technologyreview.com/news/532876/googles-intelligence-designer/