Fact Factories:
Wikipedia and the power to represent
Heather Ford
Kellogg College
Word count: 81,209
August 2015
Thesis submitted in partial fulfilment of the requirements for the degree of DPhil in
Information, Communication, and the Social Sciences in the Oxford Internet Institute at
the University of Oxford.
Acknowledgements
The previous page lists only one name, but so many people were part of the journey that
brought this thesis to fruition.
I thank my two supervisors, Mark Graham and Eric Meyer for challenging me to hone
my arguments and for responding so openly to my requests for feedback. Mark was
instrumental in my choice to come to Oxford to do my DPhil at the OII and Eric made
brilliant suggestions that made me think about Wikipedia in fresh ways. Victoria Nash at
the OII was also instrumental in supporting my application to the Clarendon Fund,
without which I would not have been able to pursue the DPhil at Oxford. Jenna Burrell
and Deidre Mulligan at UC Berkeley supported my application to the OII, and Jenna was
an early mentor and constant inspiration in the ethnography of online communities.
Thanks also to C. W. Anderson, whom I call my unofficial supervisor and whom I met when he
co-organised the Objects of Journalism pre-conference to the 2013 ICA conference in
London. Chris has become a significant source of inspiration and encouragement,
reading successive versions of articles, long rambling emails and chapters, and engaging
patiently with me in discussions about the value and shortcomings of different
theoretical frameworks and offering extremely valuable insights on why Wikipedia has
developed the way it has.
Writing can be a lonely experience, especially without family nearby, and as someone
who withers without the company of friends this was always going to be my most
significant challenge. I am therefore hugely indebted to my extended family of friends
around the world who sent messages of support and encouragement. In particular I
thank Darja Groselj and Isis Hjorth: Darja, for being a kindred spirit in the long and
trying days writing up at Oxford’s Social Science Library, and Isis, for her steadfast belief
(and example) in getting the thesis written on time. Thanks also to my friend Rachelle
Annechino who helped edit some of the chapters and to Hannah Harris for being an
enthusiastic reader and supportive friend throughout my time at Oxford.
Part of this thesis was written up in Devon at the wonderful Urban Writers Retreat.
Thanks to Charlie Haynes for cake and company, to James Alexandrou who furnished me
with the title of the thesis, and to the inimitable Louise Ells who wrote opposite (in her
slipstream, as I like to think) at writing desks in Devon and Oxford. Thanks also to all
my Facebook friends who sent encouraging postcards from around the world that kept
me on track and to my close friends across three continents (especially Carrie, Olivia,
Dan, Thomas, Rachel, Amy, Meg, Lisa, Roseanne, Steph, Vicky, Lara A. and Lara M.,
Clarence, and so many others) who have sent writing wishes from across the waves.
Thanks must also go to the many dedicated Wikipedians whom I have met and talked to
over the past five years and more. Special thanks to Dror Kamir who taught me so much
about what it is like to be a Wikipedian in our many hours of conversation, to the
participants of the OII’s Wikipedia in the Middle East project in Egypt and Jordan, and to
Ocaasi and many others who continue to inspire me with their positive outlook and
hopes for the future.
Lastly, to my parents: Ken and Jenny Ford. Two incredible people who have answered
both tearful and ecstatic calls, and who have given everything they have to support a
journey that they never had the benefit of enjoying themselves. They remain my
greatest inspirations.
Abstract
Wikipedia is no longer just another source of knowledge about the world. It is fast
becoming a central source, used by other powerful knowledge brokers like Google and
Bing to offer authoritative answers to search queries about people, places and things
and as information infrastructure for a growing number of Web applications and
services. Researchers have found that Wikipedia offers a skewed representation of the
world that favours some groups at the expense of others so that representations on the
platform have repercussions for the subjects of those representations beyond
Wikipedia’s domain. It becomes critical in this context to understand how exactly
Wikipedia’s representations come about, what practices give rise to them and what
socio-technical arrangements lead to their expression.
This ethnographic study of Wikipedia explores the values, principles and practices that
guide what knowledge Wikipedia represents. It follows the foundational principles of
Wikipedia in its identity both as an encyclopaedia and a product of the free and open
source software and internet freedom rhetoric of the early 2000s. Two case studies are
analysed against the backdrop of this ideology, illustrating how different sets of actors
battle to extend or reject the boundaries of Wikipedia, and in doing so, affect who are
defined as the experts, subjects and revolutionaries of the knowledge that is taken up.
The findings of this thesis indicate that Wikipedia’s process of decision-making is
neither hierarchical nor is it egalitarian; rather, the power to represent on Wikipedia is
rhizoid: it happens at the edges rather than in the centre of the network. Instead
of everyone having the same power to represent their views on Wikipedia, those who
understand how to perform and speak according to Wikipedia’s complex technical,
symbolic and policy vocabulary tend to prevail over those who possess disciplinary
knowledge about the subject being represented. Wikipedians are no amateurs as many
would have us believe; nor are they passive collectors of knowledge held in sources;
Wikipedians are, instead, active co-creators of knowledge in the form of facts that they
support using specially chosen sources.
The authority of Wikipedia and Wikipedians is garnered through the performative acts
of citation, through the ability of individual editors to construct the traces that represent
citation, and through the stabilization and destabilization of facts according to the
ideological viewpoints of its editors. In venerating and selecting certain sources among
others, Wikipedians also serve to reaffirm traditional centres of authority, while at the
same time amplifying new centres of knowledge and denying the authority of
knowledge that is not codified in practice. As a result, Wikipedia is becoming the site of
new centres of expertise and authoritative knowledge creation, and is signalling a move
towards the professionalization of the expertise required to produce factual data in the
context of digital networks.
Contents:
Abstract ................................................................................................................................................................. 3
Chapter 1: Introduction .............................................................................................................................. 11
1.1 Wikipedia becomes authoritative ....................................................................................... 14
1.2 The demise of the gatekeeper and the rise of the amateur ....................................... 21
1.3 Stratification and skew ............................................................................................................ 27
1.4 The rise of new expertise? ...................................................................................................... 29
1.5 Research design and objectives ............................................................................................ 40
1.6 Conclusion ..................................................................................................................................... 45
Chapter 2: Theoretical framework ......................................................................................................... 47
2.1 Social vs. technological determinism ................................................................................. 49
2.2 Facts and knowledge................................................................................................................. 58
2.3 Language, ideology and rhetoric .......................................................................................... 67
2.4 Software and new tools for language construction ...................................................... 76
2.5 Co-production .............................................................................................................................. 81
2.6 Conclusion ..................................................................................................................................... 84
Chapter 3: Research design and methodology .................................................................................. 87
3.1 A multi-sited approach............................................................................................................. 91
3.2 Case selection ............................................................................................................................... 95
3.3 Methods and principles ......................................................................................................... 100
3.4 Data analysis............................................................................................................................... 109
3.5 Navigating challenges ............................................................................................................. 112
3.6 Ensuring quality in qualitative enquiry .......................................................................... 113
3.7 Ethical considerations ............................................................................................................ 116
3.8 Conclusion ................................................................................................................................... 118
Chapter 4: Encyclopaedic identity and the verifiability principle ........................................... 119
4.1 Wikipedia’s encyclopaedic heritage ................................................................................. 120
4.2 Verifiability ................................................................................................................................. 127
4.3 Verifiability as ideology ......................................................................................................... 137
4.4 Ideological strains .................................................................................................................... 147
4.5 Conclusion ................................................................................................................................... 151
Chapter 5: The slow progress of surr .................................................................................................. 153
5.1 An account of the fact’s travels ........................................................................................... 156
5.2 How well did the fact travel? ............................................................................................... 184
5.3 The fact’s terrain ....................................................................................................................... 190
5.4 Conclusion ................................................................................................................................... 195
Chapter 6: The rapid and fruitful travel of the Egyptian Revolution of 2011 facts .......... 199
6.1 An account of the fact’s travels ........................................................................................... 202
6.2 Traveling companions ............................................................................................................ 213
6.3 True companions ...................................................................................................................... 230
6.4 How well did the fact travel? ............................................................................................... 242
6.5 Conclusion ................................................................................................................................... 246
Chapter 7: Wikipedia’s search for encyclopaedic identity ......................................................... 249
7.1 Wikipedia’s traveling facts ................................................................................................... 252
7.2 The reconfiguration of authority and expertise .......................................................... 262
7.3 The distribution and sources of power on Wikipedia ............................................... 273
7.4 Conclusion ................................................................................................................................... 280
Chapter 8: Conclusion ................................................................................................................................ 283
8.1 Key contributions ..................................................................................................................... 286
8.2 Implications ................................................................................................................................ 294
8.3 Limitations .................................................................................................................................. 298
8.4 Future work ................................................................................................................................ 300
8.5 Concluding remarks ................................................................................................................ 301
References ...................................................................................................................................................... 303
Primary sources ........................................................................................................................................... 321
Appendix A: Participant information sheet ...................................................................................... 323
Appendix B: Consent form ....................................................................................................................... 324
Figures:
Figure 1.1 Screenshot of results of user query for ‘London’ on Google .................................. 15
Figure 1.2 Screenshot of search results on Facebook for ‘London’ displaying information
extracted from Wikipedia .......................................................................................................................... 17
Figure 1.3 Wikipedia’s organizational chart according to Arazy, Nov & Ortega (2014) .. 34
Figure 1.4 Wikimedia Foundation projects ........................................................................................ 42
Figure 2.1 The stabilization of facts and artifacts according to Bijker, Hughes & Pinch
(1987) ................................................................................................................................................................. 50
Figure 2.2 Factors influencing how far a fact travels according to Morgan (2010b) ........ 64
Figure 3.1 Screenshot of Twitter feeds for four keyword searches ....................................... 101
Figure 3.2 Graph used in one of the interviews for trace interviewing ................................ 108
Figure 3.3 Memo from 31 January 2015 ............................................................................................ 109
Figure 4.1 Screenshot of the entry for ‘encyclopaedia’ on Encyclopaedia Britannica
online ................................................................................................................................................................ 125
Figure 4.2 Screenshot of the Wikipedia entry for ‘encyclopaedia’ .......................................... 126
Figure 4.3 Hierarchy of rules on Wikipedia ...................................................................................... 128
Figure 4.4 Core content policies from English Wikipedia........................................................... 129
Figure 4.5 ‘Wikipedian protester’ by Randall Munroe, xkcd ...................................................... 135
Figure 4.6 Google Trends report showing the use of the term ‘citation needed’ (outside
of Wikipedia) ................................................................................................................................................. 137
Figure 5.1 The facts and traveling companions of surr ............................................................... 157
Figure 5.2: Interview with Mr Deepak Tripathi by [Siddharath] ............................................. 159
Figure 5.3 Surr images created by [Siddharth.tripathi] and [Utcursch] respectively ..... 162
Figure 5.4 Infobox for the Oral Citations Project hosted by the Wikimedia Foundation in
2011 .................................................................................................................................................................. 163
Figure 5.5 Screenshot from the Oral Citations (‘People are Knowledge’) Project page . 163
Figure 5.6 The consolidation of surr on English Wikipedia ....................................................... 165
Figure 5.7 Revision as of 13 February 2012 ..................................................................................... 178
Figure 5.8 Revision as of 24 February 2012 ..................................................................................... 178
Figure 5.9 Revision as of 7 March 2012 by [MER-C] ..................................................................... 179
Figure 5.10 Revision as of 18:27, 8 May 2013 ................................................................................. 181
Figure 5.11 Screenshot of the Huggle interface .............................................................................. 182
Figure 6.1 Facts and traveling companions of the Egyptian Revolution of 2011 article ........ 203
Figure 6.2 Infobox from the first version of the Egyptian Revolution of 2011 (then
‘Protests’) article .......................................................................................................................................... 205
Figure 6.3 Screenshot of the three cleanup tags automatically appended to the head of
the Egyptian Revolution of 2011 article when it was published on 25 January 2011 .... 206
Figure 6.4 Death count table from the 31 January 2011 version of the Egyptian
Revolution of 2011 Wikipedia article ................................................................................................. 210
Figure 6.5 Numbers of edits to the Egyptian Revolution of 2011 article on English
Wikipedia in 2011 ....................................................................................................................................... 213
Figure 6.6 10 most prolific editors of the article and talk page of the Egyptian Revolution
of 2011 article (as of 5 May 2015) ........................................................................................................ 216
Figure 6.7 The three most prolific bot editors of the Egyptian Revolution of 2011 English
Wikipedia article .......................................................................................................................................... 219
Figure 6.8 Domains of citations added to the Egyptian Revolution of 2011 article from
25 January 2011 to 14 July 2014. .......................................................................................................... 221
Figure 6.9 20 Most popular sources in the 2011 Egyptian Revolution article ................... 223
Figure 6.10 Citations for the Egyptian Revolution of 2011 English Wikipedia article
according to author type .......................................................................................................................... 224
Figure 6.11 Most popular media sources from the 2011 Egyptian Revolution article ... 225
Figure 6.12 Academic citations from the Egyptian Revolution of 2011 Wikipedia article
as of 14 July 2014 ........................................................................................................................................ 228
Figure 6.13 Mohamed ElBaradei’s tweet linked as a citation to the Egyptian Revolution
of 2011 article as of 14 July 2014 ......................................................................................................... 236
Figure 6.14 Citation counts of social media sources removed over time from Egyptian
Revolution of 2011 article ....................................................................................................................... 236
Figure 6.15 Rate at which traditional media vs. social media citations were removed
from the Egyptian Revolution of 2011 Wikipedia article ........................................................... 237
Figure 6.16 The 2011 Egyptian Revolution as represented in Wikidata as Q29198 ...... 243
Tables:
Table 3.1 Comparing the traveling companions of surr and the 2011 Egyptian
Revolution on English Wikipedia ............................................................................................................ 97
Table 3.2 Data and documents analysed .............................................................................................. 99
Table 3.3 Table of key sources from the field .................................................................................. 102
Table 3.4 Creator categories codebook .............................................................................................. 111
Table 5.1 Wikipedia articles created or added to as part of the Oral Citations Project . 155
Table 5.2 Editors of surr and their appearance in the spaces used for discussing or
editing the article ......................................................................................................................................... 186
Table 7.1 Strategies for stabilizing and destabilizing articles by editors ............................. 268
Chapter 1: Introduction
Source: https://twitter.com/Pawelotti/status/609718088242606082
My interest in Wikipedia began when I heard of a case in which Wikipedia editors had
repeatedly deleted an article about the Kenyan superhero character Makmende from
English Wikipedia. Makmende is the Sheng (Swahili slang) word for ‘hero’ that
originated from the character Dirty Harry, played by Clint Eastwood in the 1983 film
‘Sudden Impact’. The term ‘Makmende’ (an amalgam of the phrase ‘Make my day’ uttered
by Eastwood’s character in the film) was popular in the 1990s in Kenya but enjoyed a
resurgence when the Nairobi-based band ‘Just a Band’ featured Makmende in their
YouTube music video for the song ‘Ha He’ in March 2010.¹
The music video went viral in
Kenya and inspired a series of remixes on Facebook and Twitter. Ethan Zuckerman
(2010) explained how Kenyans had tried to start an article dedicated to Makmende on
Wikipedia, only to have their contributions repeatedly deleted. I wrote a follow-up
to the deletion story in an article about Kenyans’ motivations to participate in
Wikipedia (Ford, 2011).
My article prompted debate on Wikipedia forums. Through this, and through further
research into the practice of article deletions, I learned that my initial reaction to the
¹ See https://www.youtube.com/watch?v=_mG1vIeETHc.
case was misguided. I originally believed that ideological interests were the primary
force behind the actions taken by editors to remove the article from Wikipedia.
I assumed that Wikipedia editors, being predominantly white, Western men, were simply
unwilling to recognise the importance of cultural phenomena taking place far away from
them.
Learning more about the process of article deletion, however, I began to recognise two
key factors that I had not previously understood. The first was that the actions taken by
editors to delete the article were the actions of a very small number of individuals acting
on their own interpretation of Wikipedia policy, rather than the entire body of editors
interpreting policy in the same way. What I originally recognised as ‘Wikipedia deleting’
Makmende, I came to understand as a few, widely distributed editors interpreting policy
in particular ways that allowed them to legitimise their deletion of the Makmende
article and, by extension, many other editorial acts as well.
The second was that Wikipedia’s processes are not solely human activities carried out
on a neutral platform, but are strictly mediated by software code and that these
interactions between humans and code follow a predetermined logic that strongly
shapes how phenomena within articles are accepted. When the Makmende article was
first created it was evaluated by editors in a semi-automated process that employs
particular heuristics and filters to determine whether the content met Wikipedia
standards. When content moves through Wikipedia’s socio-technical system, it is
transposed by different work groups, each focusing on particular aspects and using
different technologically mediated lenses with which to evaluate it. The representation
of Makmende on Wikipedia, in other words, was decontextualized and broken up into
myriad different pieces through the semi-automated process, which prevented editors
from recognising what this knowledge represented for those who were attempting to have it
seen by the world.
Furthermore, intricate details about the process were exposed by the debate on the
Wikimedia-l mailing list which suggested numerous policy reasons why the article could
have been deleted, and equally numerous reasons why it could have been accepted.
There was already a Wikipedia article about ‘Just a Band’. According to some editors, the
best place for facts about Makmende was in that article. On the other hand, it was
argued that the Makmende character and the music video that was inspired by him were
significant to Kenya’s cultural history since they represented Kenya’s first Internet
meme. In a first attempt at creating an article about Makmende, the author had written a
sentence that looked to be vandalism since it took the form of the Chuck Norris-style
jokes that emerged from the Makmende meme. The second attempt to create the article
constituted a copyright violation because its content, although factual, was taken from a
music website and was unattributed. According to editors contributing to the discussion
on Wikimedia-l, however, Wikipedians are encouraged to improve articles that are
weak, rather than to dismiss them entirely through deletion.
What was clear throughout this passionate debate was that it was really important to
Kenyans to have Makmende represented on English Wikipedia. When I interviewed
Kenyan Wikipedians in 2011, they saw the deletions as just another series of actions
taken by Westerners to dismiss a fact of Kenyan culture as unimportant and un-notable.
For the Kenyans whom I spoke with, Wikipedia had originally promised an opportunity
for them to have facts about their world displayed equally with others on a global
platform. What was originally recognised as an opportunity for recognition, to be seen
as a creative participant in a world, became a disappointment through the example of
Makmende and other examples of deletionism that I have written about.
More than anything, this event demonstrated how English Wikipedia is being seen as an
authoritative platform for facts about the world, that English Wikipedia has significant
power to represent knowledge, and that Wikipedia editors have become significant
power brokers for the dissemination of authoritative knowledge. It therefore becomes
important to understand how Wikipedia filters knowledge about the world. The answer
to this question requires more than simplistic theories about individuals with
idiosyncratic and unpredictable interests. Instead, Wikipedia is highly complex,
distributed and mediated, and the power to represent one’s knowledge within
Wikipedia’s socio-technical system is a result of the mastery of both its technical and
social features.
This thesis engages with the question of how an ostensibly egalitarian platform for
creating a reliable reference source has reconfigured notions of expertise, authority and
the power to represent knowledge in the age of the network. Specifically, it is an attempt
to understand the impact of Wikipedia’s representational system on who are considered
the experts, the research subjects and the revolutionaries of the facts that it represents.
Before suggesting how such questions might be answered, it is important to understand
exactly how and why Wikipedia has become an important venue for the representation
of facts about the world. In order to understand this, we must first look at how
Wikipedia has become a critical feature of the infrastructure of the Internet as a whole,
particularly in its relationship to search engines such as Google.
1.1 Wikipedia becomes authoritative
On the 16th of May 2012, Google announced a new project called the Knowledge Graph
that signalled a new trend in search. In a blog post entitled
‘Introducing the Knowledge Graph: things, not strings’, the Senior Vice President of
Google Engineering, Amit Singhal, wrote that Google would be using ‘public sources such
as Freebase, Wikipedia and the CIA World Factbook’ to enable more efficient resolution
of queries by users within Google’s domain rather than them having to navigate away
(Singhal, 2012). Instead of offering the user a long list of possible answers to their
queries, Google would also present facts about a user’s query in a summarized infobox.
A search for ‘London’, for example, would result in a prominent infobox on the right-hand
side of the page listing facts about London extracted from Wikipedia, among other
sites, as can be seen in Figure 1.1.
Figure 1.1 Screenshot of results of user query for ‘London’ on Google
Source: Google.com, 11 June 2015
This move was significant for two key reasons. Firstly, it established Google as a source
of facts rather than an indexer of unverified information, and secondly, it validated
Wikipedia as an authoritative source of those facts. Instead of Google presenting a list of
alternative sources that might provide the answer to a user’s query, Google was making a
statement that it could also know the answer to that query. With Singhal’s comment that
Wikipedia was a ‘public source’ of information, Google implied that Wikipedia was
representative of a collective consensus that made it authoritative as a source of facts
(rather than mere claims) about the world.
Google had always prioritized Wikipedia in its search results. According to a 2012 study
by search engine optimisation company Intelligent Positioning, Wikipedia pages appear
on the first page of Google for 99% of searches and Wikipedia is the first result on
Google for 56% of searches (Silverwood-Cope, 2012). Google has been an important ally
to Wikipedia, sending 61% of its traffic to Wikipedia between 2003 and 2008, a period
during which Wikipedia’s traffic grew by 8,000% (Johnson, 2008).
The Knowledge Graph
was different, however, because it changed the fundamental communicative
relationship between the user, Google and Wikipedia. Instead of a user asking the search
engine for all the possible venues in which they could find an answer to their query, the
user was being presented with the answer and there was an assumption that Wikipedia
was the authoritative source of such answers.
Since 2012, other search engines such as Bing, AOL and Lycos have replicated this
service by using Wikipedia and other knowledge bases to present facts about a user’s
query. Facebook similarly presents information from Wikipedia about people, places
and things using its Graph Search tool (see Figure 1.2) and multiple software
applications are starting to extract facts from Wikipedia in order to enrich their services.
Such extraction activities have been enhanced by a project of the Wikimedia Foundation
called Wikidata, started in 2012. Funded in part by Google, Wikidata is a database that
extracts information from Wikipedia and other Wikimedia Foundation projects, as well
as other sources of open data, and makes it available in both human- and machine-readable
formats. Wikidata’s format and open license enable third parties such as
Google to reuse structured information from Wikipedia and/or to use the platform as a
host for their own structured data. A number of websites and applications now embed
data extracted via Wikidata, and some external data repositories, most notably Google’s
Freebase, have been discontinued and their data and APIs migrated to Wikidata. This has
led to a growing centralisation of factual data housed in Wikimedia projects.

[Footnote: It isn’t only Google that prioritizes Wikipedia in search results. Microsoft’s search engine, Bing, was found to favour Wikipedia in search results even more than Google (Goodwin, 2012).]
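What makes this kind of reuse possible is that facts sit in fixed, named fields rather than in free prose. The sketch below illustrates the idea with a small, invented record modelled loosely on the shape of Wikidata’s entity JSON (labels and property claims held in fixed fields); it is not live Wikidata output, and the population figure shown is made up for illustration.

```python
import json

# An invented record modelled loosely on Wikidata's entity JSON format:
# a label in a fixed field, and facts stored as property "claims".
entity_json = """
{
  "id": "Q84",
  "labels": {"en": {"language": "en", "value": "London"}},
  "claims": {
    "P1082": [
      {"mainsnak": {"property": "P1082",
                    "datavalue": {"value": {"amount": "+8673713"},
                                  "type": "quantity"}}}
    ]
  }
}
"""

def extract_fact(entity, prop):
    """Return the first claimed value for a property, or None if absent."""
    for claim in entity.get("claims", {}).get(prop, []):
        datavalue = claim.get("mainsnak", {}).get("datavalue")
        if datavalue is not None:
            return datavalue["value"]
    return None

entity = json.loads(entity_json)
label = entity["labels"]["en"]["value"]       # the entity's English label
population = extract_fact(entity, "P1082")    # the claimed population value
print(label, population["amount"])
```

Because every consumer reads the same fixed fields, a search engine, a social network and a mobile application can all extract the same fact without parsing encyclopaedic prose, which is precisely what drives the centralisation described above.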
Figure 1.2 Screenshot of search results on Facebook for ‘London’ displaying information
extracted from Wikipedia
Source: Facebook.com, 11 June 2015
Because these changes had been implemented relatively recently at the time of this study,
and their impacts on the perception of facts online were largely unexplored,
this study was designed to take advantage of the moment. Facts have enormous power to
determine who the winners and losers are in important political battles. Some of the
more notable battles over facts have taken place over the existence of climate change
and the dangers of tobacco smoking (Oreskes & Conway, 2011), but battles about how
authoritative statements are structured are a feature of daily life, from the ways in
which statements about our health are made by medical practitioners to the naming of
political groups like the one that calls itself the Islamic State (Dathan, 2015).

[Footnote: Data is structured when it resides in fixed fields according to a data model that standardizes how different data elements relate to one another.]
Because of this power, the platforms most trusted to display facts
about the world have become important sites of struggle in which different groups of
actors vie for control over representation. As Morgan (2010b, p. 4) notes, certain
facts travel better than others, and it is the possibility of facts travelling well that is
important to our lives. In countries where facts about HIV/AIDS have not travelled well
because they have been deemed illegitimate, for example, the dangers of the
epidemic have been exacerbated (Morgan, 2010b, p. 4).
These struggles are increasingly coming to the attention of the news media as they start
to investigate the representation of phenomena that are important features of public
debates. Recently, journalists have covered the speed at which English Wikipedia
changed the article on Bruce Jenner to represent her new name and gender; one
journalist wrote that this signified the urgency of updating an information source that
everybody uses (Ramos, 2015). Wikipedia administrators’ management of the
Gamergate controversy was also recently subject to review when the Guardian
published a story about Wikipedia’s arbitration committee voting to ban certain editors
from editing articles about feminism (Hern, 2015).

[Footnote: Caitlyn Jenner (formerly Bruce Jenner) is an American athlete who came out as a transgender woman in April 2015.]

[Footnote: The Gamergate controversy occurred when several women from the video game industry were subjected to a campaign of misogynist attacks from August 2014.]
Not only are the news media starting to cover how Wikipedia represents the world; they
are also covering how politicians, public officials and corporations edit articles in which they
have a potential conflict of interest. In March 2015, a number of media outlets in the
United States covered how the New York Police Department was caught editing articles
on police brutality (Mathis-Lilley, 2015), and during the 2015 British election campaign
there was a controversy around an IP address linked to the Tory MP Grant Shapps being
banned from editing (Ramesh, 2015). The Wikimedia Foundation strongly discourages
editors who have a conflict of interest from editing articles directly, and banned
hundreds of accounts from a public relations firm in 2013 for making edits that were
determined to be subverting Wikipedia policies (Arthur, 2013).
Wikipedia’s portrayal of phenomena has become so influential that many governments
recognise it as a critical platform for propaganda. Journalists are regularly tipped off by
Wikipedia editors or by automated engines that publish details of articles relating to
local politics edited by users under government IP addresses. In one example, editors
from inside the Kremlin and Kremlin supporters tried to change an article on Russian
Wikipedia relating to the MH17 disaster to reflect the narrative advocated by the
Russian government that a Ukrainian jet had tried to shoot down the plane. In the
United States, the entire US Congress IP address range was banned after repeated
vandalism, among which was an edit claiming that Donald Rumsfeld was an alien lizard
who eats Mexican babies (Miller, 2014).
Scientists and academics are also starting to recognise the importance of having their
research reflected in Wikipedia articles as an effective method of amplifying their ability
to communicate research results to the broader public (Teplitskiy, Lu, & Duede, 2015).
With the increased pressure on scientists and academics to indicate the public impact of
their work, researchers are becoming increasingly interested in whether Wikipedia
articles contain citations to their work (Reich, 2011) and in promoting themselves in
articles about their work (Elvebakk, 2008). Academics also use Wikipedia for their own
research (Weller, Dornstädter, Freimanis, Klein, & Perez, 2010), either as a source of
background reading or in citations (Dooley, 2010).
Although some academics have been able to successfully edit Wikipedia articles with
citations to their research, there have been some notable failures. In 2012, history
professor Timothy Messer-Kruse wrote about his experience trying (and failing) to edit
the article about the Haymarket riot, a subject that he had been researching for 10 years
(Messer-Kruse, 2012). Editors of the article repeatedly reverted Messer-Kruse’s edits
when he wrote that earlier analyses of the riots were incorrect. Editors first declared
that his edits constituted original research and needed to be attributed to a reliable
source. When Messer-Kruse published an academic book on the subject years later
and cited the publication, his edits were once again reverted, this time because editors
declared that he had a conflict of interest.
It isn’t only in the areas of history and biography that conflict occurs. A list of
controversial subject areas on Wikipedia includes science, economics, linguistics and the
environment, along with politics, religion and people. The list of controversies regarding
science, biology and health includes topics relating to AIDS, aspartame, intelligent design
and obesity, indicating the ways in which Wikipedia mirrors debates in larger
society. Wikipedia is, however, not a perfect mirror of knowledge; no representation
is. It is therefore important to understand how such representation occurs and who (or
what) has the power and authority to determine what Wikipedia represents. Wikipedia
seems to be signalling the rise of new centres of expertise and authority in those
who are able to effectively edit it, while at the same time reaffirming certain
knowledge authorities in its choice of sources. Analysing empirically exactly how authority on
Wikipedia is constructed is the goal of this thesis.

[Footnote: Citations to Wikipedia are still controversial, and in many fields a citation to Wikipedia would be frowned upon.]
1.2 The demise of the gatekeeper and the rise of the amateur
The dominant theory surrounding Wikipedia and other participatory production
systems, including free and open source software, citizen science, citizen journalism and
volunteered geographic information, is founded on the idea that the Internet has enabled
the removal of the gatekeeper figure and the rise of the everywo/man. This idea is
present in both scholarship and media discourse, and is highlighted by imagery of
power being wrested from traditional publishing institutions,
institutions that house knowledge workers, and institutions supporting the
education and training of knowledge workers. The Internet and its free and open
principles meant that no longer did people require traditional academic and media
publishers to distribute their writing and creative work, no longer were institutions
required in order to gain access to expensive equipment, and no longer did people require
educational institutions in order to be educated or certified. When information was free
and accessible, institutions would crumble, and with them the gatekeepers who
prevented the free flow of information.

[Footnote: For a list of controversial subject areas on Wikipedia, see https://en.wikipedia.org/wiki/Wikipedia:List_of_controversial_issues.]
The 2006 Time Magazine Person of the Year Award perhaps best illustrates this
zeitgeist. In December 2006, Time Magazine declared that its person of the year was
‘you’. The cover of the magazine featured a computer screen with the words, ‘You. Yes,
you. You control the Information Age. Welcome to your world’ (Grossman, 2006). The
editorial argued that ordinary people now controlled the means of producing
information and media because they dissolved the power of the gatekeepers who had
previously controlled the public’s access to information.
[2006 is] a story about community and collaboration on a scale never seen
before. It's about the cosmic compendium of knowledge Wikipedia and the
million-channel people's network YouTube and the online metropolis MySpace.
It's about the many wresting power from the few and helping one another for
nothing and how that will not only change the world, but also change the way
the world changes. (Grossman, 2006)
The idea that the ‘many’ were ‘wresting power from the few’ was shared by a host of
commentators and scholars at the time. This democratic ideal was inspired by the belief
that many more people were now doing work that had previously been done by
credentialed individuals within large institutions. Now, non-academics could write
encyclopaedia articles, laypeople could produce films, and concerned citizens could
produce news articles. With the decrease in the costs of the means of production and the
connection of millions of ordinary people to a network of potential audience members,
co-producers, employees and publishers, anyone could be a journalist, an engineer or a
scientist (Gillmor, 2008; Leadbeater & Miller, 2004; Shirky, 2009); now anyone could be
an expert in something (Weinberger, 2011). We had moved from a ‘read only’ culture to
a ‘read write’ culture (Lessig, 2009) that was characterised by active cultures of
participation (Jenkins, 2006) instead of passive consumption.
Yochai Benkler (2006) offers one of the prevailing theories for explaining Wikipedia as a
platform for the free expression of people who are unencumbered by the gatekeepers of
the past. Benkler argues that there have been significant changes in the organization of
information production that have resulted in the rise of nonmarket and non-proprietary
production, where individuals are able to take a more active role than was previously
possible in the production of information goods. This rise of individual agency leads to
an inevitable clash with the hierarchical, market-driven industries of the past, a clash
that will decide the fate of each of these models. Benkler uses Wikipedia as a key
example of nonmarket, non-proprietary and non-hierarchical peer production in which
incentives to create cultural goods are based not on price signals but on pro-social goals.
He extends the idea of ‘pro-social goals’ in a paper with Helen Nissenbaum (Benkler &
Nissenbaum, 2006) which asserts that peer production offers opportunities for people
to exhibit and experience virtuous behaviour, and enables positive character formation
among those who participate in projects like Wikipedia.
Benkler’s framing of peer production is based on two key theories of power relating
firstly to the power of individuals who co-create peer production products like
Wikipedia, and secondly to the products that are developed through peer production.
Firstly, Benkler argues that individuals’ autonomy is on the rise. Because it is cheap to
communicate in the networked public sphere, individuals can represent their own
interests as well as loosely associate with others with similar interests, and thereby
avoid the gatekeeping power of the media.
The various formats of the networked public sphere provide anyone with an
outlet to speak, to inquire, to investigate, without need to access the resources of
a major media organization. (Benkler, 2006, p. 11)
Because individuals are independent from organisations, corporations, and the market,
the assumption is that individuals have equal power to speak because they can very
easily (that is, cheaply) access the means of producing information. Peer production
products such as Wikipedia are able to avoid bias, according to Benkler, because
corporate media pressures are not there to taint their editors. Benkler argues that the
new networked communication environment enables ‘many more people [to] connect
their perspectives to many others and to do so in a way that cannot be controlled by
media owners and is not as easily corruptible by money as were the mass media’
(Benkler, 2006, p. 11).
Secondly, Benkler writes that information products in the networked public sphere can
never have too much power (unlike their corporate media counterparts) because there
are many alternative products to choose from, and it is difficult to buy attention or use
money to squelch an opposing view. In other words, the networked public sphere
enables parallel alternative visions of the world where individuals are not reliant or
dependent on the mass media. Benkler compares the power of individuals and
information products within the non-market sphere with that of corporate media, which
he argues restricts particular points of view because there are too few gatekeepers in
relation to the interests that need to be served, and because corporate media tends to
concentrate on politically unengaged programming in order to gain the most profit.
In each of these areas, Benkler compares non-market peer production favourably to
corporate media. Whereas corporate media suffer from too few gatekeepers to
represent diverse interests, individuals can represent their own interests in the
networked information sphere; whereas corporate media owners have too much power
to shape opinion, diverse groups of individuals collaborating together produce a range
of alternatives; whereas corporate media produces politically unengaged programming,
the networked public sphere enables individuals who are unhindered by these
restrictions to produce politically engaged, reflexive programming, thereby performing
the watchdog function the media was previously tasked with.
The image of the destruction of the gatekeeper, in the form of either the knowledge
institution or the knowledge worker has been emphasised by other scholars. David
Weinberger (2011), for example, writes that the Internet has enabled the destruction of
the gatekeeper figure who took the form of the editor or curator.
No editors and curators who get to decide what is in or out. No agreed-upon
walls to let us know that knowledge begins here, while outside uncertainty
reigns – at least none that everyone accepts… The Internet is what you get when
everyone is a curator and everything is linked. (Weinberger, 2011, p. 45)
Weinberger’s belief is that, in the past, ‘(e)xperts were a special class’ (Weinberger,
2011, p. 67) in which, in order to publish books, people had to pass through editorial
filters, but that ‘On the Net, everyone is potentially an expert in something’ (Weinberger,
2011, p. 67).
Accompanying the theme of the gatekeeper’s demise was an image of the rise of a
different figure, that of the amateur (Keen, 2008; Leadbeater & Miller, 2004; Shirky,
2009). Amateur identity is defined by the independence of an individual from the
institutions and organisations that previously housed expert identities, and is signified
by non-professionals becoming involved in the production of news (‘citizen journalism’),
science (‘citizen science’) and engineering through free and open source software
projects. Previously, the work of journalism, science and engineering had been confined
to those working within (and certified by) institutions that employed them as
journalists, scientists and engineers. The rapidly diminishing cost of the means of
producing information led to disruption, not only in the ways in which information was
being accessed, but also in the identities of those who were said to be producing
information.
For Dan Gillmor (2008), a new cadre of citizen journalists (previously the audience) was
now producing news outside the purview of the few large media conglomerates. Clay
Shirky (2009) noted that the wide availability of tools for organizing and
communicating has led to the ‘mass amateurization of society’ that breaks previous
definitions of journalism, journalists and journalistic privilege. This is the age, or the
‘cult’ according to Andrew Keen (2007), of the amateur.
A number of scholars have responded to Benkler, Shirky and others who have heralded
participatory media as liberatory. Nathaniel Tkacz (2012), for example, argues that
‘discourses of collaboration are, like openness, depoliticized’ (p. 82) by those who tend
to downplay the organising forces within collaborative work but that there is an
invisible politics at work in projects like Wikipedia. Kreiss, Finn and Turner (2011)
doubt peer production’s revolutionary potential and note that there are consequences of
‘peer production’s failure to develop institutional mechanisms that secure bureaucratic
values such as inclusion, explicit rule-making, accountability and institution persistence’
(p. 14). Others have developed empirical studies in order to engage with theories of
peer production projects. Studying the Wreckamovie peer production film community,
Isis Hjorth (2013) challenged core assumptions associated with the peer production of
culture, particularly in relation to the supposed distinction between networked cultural
production and peer production. In the realm of empirical research on Wikipedia, there
are similarly conflicting reports from the field.
1.3 Stratification and skew
Despite the chorus of voices declaring this to be an age of active participation in the
production of knowledge and culture, there is evidence of a significant process of social
stratification along a number of different lines within peer production communities
where participation is, according to the rhetoric, open to everyone. Whereas Facebook
and Twitter have more or less equal numbers of female and male users (Duggan, Ellison,
Lampe, Lenhart, & Madden, 2015), demographic studies of Wikipedians (Glott & Ghosh,
2010; Hill & Shaw, 2013; Lam et al., 2011), citizen scientists (TTFNROB, 2015) and map-
makers (Stephens, 2013) indicate that there is a significant gender, geographic and
socio-economic skew in who participates in open projects. Demographic studies of
Wikipedians indicate the most severe skews: between 84% and 90% of Wikipedia
editors are men (Glott & Ghosh, 2010; Hill & Shaw, 2013; Lam et al., 2011), the majority
have tertiary education, and a significant number of editors across language versions
speak English (Wikimedia Foundation, 2011).
Some scholars argue that Wikipedia’s representations of the world reproduce existing
asymmetries. Mapping geotagged articles in English Wikipedia, Mark Graham finds that
almost all of Africa is poorly represented in Wikipedia (Graham, 2011, p. 275). On the
issue of Wikipedia’s gender skew, Lam et al. (2011) find that the low proportion of
females participating in English Wikipedia has resulted in measurable imbalances in
content quality, such that articles relevant to women are significantly shorter and have
lower assessment ratings than those of interest to men (Lam et al., 2011, p. 6).
Similarly, Reagle and Rhue (2011) find that, although Wikipedia’s biographies of women
are longer and more numerous than those of Encyclopaedia Britannica in absolute terms,
‘Wikipedia articles on women are more likely to be missing than are articles on men
relative to Britannica’ (p. 1138). Analysing accounts of Singaporean and Philippine
history on Wikipedia, Luyt (2011) argues that, despite the potential of new media for
making visible previously marginalized voices, a more likely outcome is a reproduction
of the status quo in historical representation.
Other studies have indicated that Wikipedia’s representations do not only mirror
existing asymmetries but can actually exacerbate them. Graham, Hogan,
Straumann, & Medhat (2014) argue that it isn’t only issues of connectivity that prevent
people in developing countries from contributing to Wikipedia: representation
is a self-reinforcing cycle in which countries with strong editing cultures in local languages
consolidate their presence, while those on their peripheries fail to reach critical mass (Graham et al., 2014, p. 14).
Similarly, Joseph Reagle (2013) extends these findings with a qualitative study in which
he argues that low female participation in free culture communities, particularly within
Wikipedia, is the product of a culture that is alienating towards women and that gender
disparities are even worse in Wikipedia than in the computing culture from which it
arose. Reasons for this include that geek stereotypes can be alienating, open
communities are especially susceptible to difficult people, and the ideas of freedom and
openness can be used to dismiss concerns and rationalize the gender gap as a matter of
preference and choice.
Other scholars have investigated the process by which Wikipedia editors decide what
should be included or excluded from the encyclopaedia. A number of articles are deleted
every day on Wikipedia, either according to a process of deliberation amongst editors or
a process in which an administrator unilaterally decides to delete an article according to
criteria for speedy deletion. A study of article deletions on the English Wikipedia that I
undertook with Stuart Geiger in 2011, for example, showed a clear division between
experienced and inexperienced users. The article deletion process is managed by a
relatively small number of longstanding users, and the majority of deleted articles are
deleted under the criterion of ‘no indication of importance’ rather than for spam,
copyright violations or ‘patent nonsense’, which together constituted only about 6% of all deletions.
We also found that the majority of deletion discussions have very few participants, most
of whom are experienced users who have previously participated in such discussions (Geiger
& Ford, 2011).
This growing stratification of users along the lines of experience is extended in work by
Schneider, Samp, Passant & Decker (2013). The authors analysed argumentation
patterns that are used in evaluation and decision-making on Wikipedia using a sample
of the 500 deletion discussions (on average) that take place every week on Wikipedia.
They found that familiarity with Wikipedia’s policies and norms correlates with
newcomers’ ability to craft persuasive arguments, and that acceptable arguments
employ community-appropriate rhetoric that demonstrates knowledge of policies and
values. There are 56 English Wikipedia policies, about a hundred guidelines and
hundreds of essays about Wikipedia norms and values. Knowing how to speak in
Wikipedia’s complex language seems to be increasingly important to being a productive
member of the community.
1.4 The rise of new expertise?
Expert identity is not only the result of an individual’s technical mastery over a
particular subject, independent from her social context. Numerous scholars have
recognised that identity is, in fact, as much a social phenomenon as it is an outcome of
technical prowess (Goffman, 1959; Jasanoff, 2004; Maasen & Weingart, 2006). A person
can call herself a journalist on her blog, for example, but she may very well not be able to
claim journalistic privilege if the state calls on her to release information about her
sources. There is no intrinsic quality to any particular representation of knowledge;
what is important is understanding why certain knowledge claims are successful and
others are unsuccessful (Bloor, 1984).
Expertise is socially constructed and is not directly related to any intrinsic quality of an
idea, but rather to whether society (in the form of institutions such as academia or the
state) gives credence to what one says or does. Ordinary people may now be performing
work previously done by professionals, but it doesn’t follow that they have all been
accorded the same credibility or the same power as the gatekeepers they are
supposedly supplanting.
The narrative about ordinary people taking over from the mass media assumes a clean
break between one social structure and another, but historically, technologies have rarely
produced such radical change. Societies still show dependencies on old technologies and
traditional power centres long after they appear to have become unnecessary (Edgerton, 2008). Putting
the question of expertise within its historical context provides a much more complex
picture about the origins of the disruptions that we are currently witnessing. A more
contextual, historical approach to the formation of identities is necessary in order to
understand which groups are in control of knowledge representation in the age of the
network.
According to Maasen and Weingart (2006), debates about the nature of expertise are not
new, but are re-surfacing because of three key parameters that have changed in society.
Firstly, industrialised countries have become increasingly democratized with a growing
proliferation of organisations operating outside the sphere of formal political parties.
Secondly, the 1960s anti-nuclear and environmental debates saw scientists being drawn
in to represent both poles of a debate, with the consequence that the public saw science
as not presenting a single view, and that scientific knowledge could be contradictory,
incomplete and biased. This resulted in a decrease in the authority accorded to scientific
knowledge, and the conclusion that scientific knowledge could no longer be taken as
neutral, objective and reliable. Finally, the democratization of expertise has seen the
demystification of scientific knowledge and of scientists, so that now scientific
knowledge is seen as uncertain, risky and incomplete (Maasen & Weingart, 2006, p. 2).
These three key changes have led to a situation where, in spite of a loss of authority of
the traditional scientific expert, the reliance of policy makers on experts to help craft
policy regarding an ever-expanding number of niche topics has increased significantly
(Maasen & Weingart, 2006, p. 4). Technology has certainly helped to enable access to
expertise beyond the academy and to enable experts to widely disseminate their
findings, but Maasen and Weingart make an important point about the need for
historically situated accounts of expertise that account for the ways in which there is
stasis as well as change in the way that notions of expertise circulate in society.
Sheila Jasanoff (2004) offers the theory of co-production for conceptualising changes in
technology and their impact on expert identities. According to Jasanoff, at moments of
significant social and technological change there is a re-ordering of social structures,
often by reaffirming the legitimacy of existing social arrangements. During times of
change, a host of new actors begin to participate in the production and distribution of
knowledge in order to fill gaps in expertise that result from the adoption of new
technologies. In response to change, societies must redraft rules of social order by
reaffirming some roles and introducing others.
The theory of co-production is developed out of the field of Science and Technology
Studies (STS) which problematizes the role of science and technology in social change
by laying bare the socio-technical processes by which science and technologies advance,
and by interrogating how the social and the technical are imbricated with one another
during such change. STS scholars investigate the social and technical contexts in which
scientific knowledge and technologies are produced, asking questions about why certain
knowledges are accepted as true, scientific facts, and others as myths, opinions or
ideologies.
In order to understand how knowledge is taken up and who is involved in its
construction and performance, a fruitful method is to study the ideological discourse
that actors engage in as they construct and debate competing claims about the world
(Geertz, 1973; Gieryn, 1983, 2001). Discourse does not only involve words, however,
but also the enactment of conventions. We do things with words but we also do words
with things (Latour, 1991). Digital speech acts (Isin & Ruppert, 2015) are thus a critical
component of understanding how representations of knowledge are both constructed
and performed in the online environment.
This discursive process by which knowledge claims are constructed and debated is
neither purely social nor purely technological. Wikipedia’s socio-technical environment,
for example, is heavily mediated by code and code plays a significant role in determining
what Wikipedia represents. Some scholars have started to investigate the effects of
Wikipedia’s coded environment in the area of bots, the automated actants developed by
Wikipedia editors in order to perform automated editing tasks on the encyclopaedia.
Geiger and Ribes (2010), for example, demonstrate the role of non-human actors
in the process of vandal banning, arguing that the decentralized activity
enabled by automated and semi-automated tools on Wikipedia is a type of distributed
cognition (p. 117). Geiger (2011) extends this work to show how bots are exercising
control over a vision of what the encyclopaedia should be and how editors should work
together within it. In describing the increasing role of algorithmic actors on Wikipedia,
Geiger demonstrates the ways in which bots produce order and enforce rules on the
encyclopaedia.
Similarly, Niederer and van Dijck (2010) investigate the increasingly important role of
bots in the rise of Wikipedia, arguing that it is impossible to understand Wikipedia’s
response to vandalism without an appreciation of the encyclopaedia as a sociotechnical
system driven by collaboration between users and bots. The authors provide a
schematic overview of Wikipedia users according to their permission levels, and note
that the permission level of bots is below that of administrators but well above the
authority of registered users (Niederer & van Dijck, 2010, p. 1373). Wikipedia’s content
management system, argue Niederer and van Dijck, allows for protocological control, a
mode of control that is at once social and technological: one cannot exist without the
other (p. 1373).
Power in these studies is viewed as a product mainly of technical permissions.
Unregistered users, for example, do not have permission to edit protected pages, and
administrators have the ability to ban users and award permissions to other users.
Arazy, Nov and Ortega (2014) attempt to reveal Wikipedia’s organisational hierarchy by
studying the access privileges of almost 5 million Wikipedia members of which about
10,500 hold special access privileges. The authors identify a set of privileges that are
granted to users with particular roles, including the ability to grant (or remove) the
access privileges of other contributors (Arazy, Nov & Ortega, 2014, p. 10). The result is a
series of six levels of power, from the ‘benevolent dictator’ at the top of the hierarchy to
ordinary users at the bottom (see figure 1.3 below). According to the authors, this
hierarchy represents ‘an ecology of roles necessary for quality assurance, coordination,
and conflict resolution in online communities’ (p. 12).
Figure 1.3 Wikipedia’s organizational chart according to Arazy, Nov & Ortega (2014)
Note: Thickness of borders corresponds to the number of participants performing the role. Note
that it is not possible to determine the number of unregistered participants, since an IP address
cannot be linked to a single user.
Source: Arazy, Nov & Ortega, ‘The (Wikipedia) world is not flat: On the organizational structure of
online production communities’ (2014)
The power to have one’s edits prevail on Wikipedia, however, is not solely a function of
editing permissions. Contributors with the same or similar levels of technical
permissions are often in debate with one another about how phenomena should be
represented, and even administrators, who have some of the highest levels of
permissions, do not always use the power granted to them without following processes
dictated by policy and norms.
Another group of studies has focused on the role of social forces in guiding decision-
making on Wikipedia, with a particular focus on the role of policy. Policies are an
important element of social interaction on Wikipedia (Bryant, Forte, & Bruckman, 2005;
Pentzold & Seidenglanz, 2006; Viegas, Wattenberg, Kriss, & van Ham, 2007). Analysts
have found that policies are used to appeal to authority in order to justify a contributor’s
changes to an article, and that policies provide a common resource for new users to
learn about editing and behavioural conventions (Viegas et al., 2007). There has been a
significant rise in the number of policy and other administration pages on Wikipedia:
between mid-2003 and late 2005, the number of administrative pages grew at a rate of
nearly eight times that of main article pages (Viegas et al., 2007). Wikipedia policies are
numerous and complex, and encompass so many levels of authority that a user’s
relatively greater understanding of policy enables them to more effectively participate
in debates in order to influence representation (Ford & Geiger, 2012).
A study of talk pages by Pentzold and Seidenglanz (2006) relies on the writings of
Foucault to argue that Wikipedia policy is an important feature in defining the rules by
which participants ‘delimit the sayable, define legitimate perspectives and fix the norms
for the elaboration of concepts’ (p. 65). In response, Kriplean, Beschastnikh, McDonald,
& Golder (2007) argue that Pentzold and Seidenglanz overemphasize the power of
policy in defining what editors can do on Wikipedia. Instead, they argue that, although
policy enables collaboration ‘by providing a common language and strategies for action
that contributors can draw on to interpret and apply to difficult or unanticipated
situations’ (p. 171) policies don’t translate into obvious actions, and their ambiguity
leads to ‘power play (p. 172) rather than already defined outcomes.
Power play is defined by the authors as ‘an attempt by an individual or a group to claim
legitimate control over an article’ (p. 172) and the authors develop a list of seven forms
of power play from their sample of talk pages. These include the delimitation of an
article’s scope by certain contributors (article scope), the presentation of decisions
made in the past as absolute and uncontested (prior consensus), the bolstering or
undermining of a position based on the legitimacy of a contributor in terms of their
expertise (legitimacy of contributor), and the discrediting of a source (legitimacy of
source) (p. 172).
Kriplean et al. argue that their approach is based on Giddens’ criticism of Foucault:
instead of overemphasizing the power of the institution and neglecting the actions of
individuals, they focus on how groups of contributors claim legitimate control over
content through ‘the discourse of policy’ (p. 170). These two approaches are not,
however, as distinct from one another as they might appear. Whereas Pentzold and
Seidenglanz focus on the role of policy in determining what can be said on Wikipedia,
Kriplean et al. investigate how language is used strategically in order to retain power by
some over others.
One group of key stakeholders in Wikipedia that has received less attention from
scholars comprises the authors of scientific reports, the academics writing books about
subjects covered by Wikipedia, and the journalists, bloggers, website builders and social
media authors who constitute Wikipedia’s sources and citations. Since policy dictates
that all Wikipedia articles must be based on reliable sources (Wikipedia:Verifiability;
URLs and other citation information for Wikipedia documents are provided in the
‘primary sources’ section of the references), the availability of sources, the perspectives
that those sources represent and how they are evaluated by editors are all important
factors in deciding which facts Wikipedia represents and which it ignores.
Empirical research in the analysis of Wikipedia sources has so far analysed the types of
sources that are represented on Wikipedia (Ford, Sen, Musicant & Miller, 2013; Luyt,
2011, 2012; Luyt & Tan, 2010; Nielsen, 2007) as well as how source use relates to the
geographic region in which contributors are located (Sen et al., 2015). Scholars have
found that certain types of sources are preferred by Wikipedia editors and that this has
an impact on how Wikipedia covers different topics. Luyt and Tan (2010), for example,
found that Wikipedians’ preference for United States government sources and online
sources represents a particular point of view about world history, and Luyt (2012)
found that Wikipedians’ preference for short texts that can be more easily mined means
that Wikipedia articles suffer from the ‘congealed consensus’ of the institutions hosting
them (p. 1873). Luyt argues, in conclusion, that Wikipedia source preference is based
on the assumption that texts are ‘undifferentiated bearers of extractable facts’ (p. 1876).
As a consequence, Wikipedia articles tend to include long reams of facts rather than
persuasive analysis and interpretations (Rosenzweig, 2006).
Other studies have investigated the role of source conventions and norms in the
representation of facts. Digital artists Scott Kildall and Nathaniel Stern (2011) provide a
notable example in discussing their artwork entitled Wikipedia Art, which they call ‘a
collaborative performance and a public intervention’ (Kildall & Stern, n.d.). Kildall and
Stern created a kind of future fact on Wikipedia’s platform by writing an article about
the work and citing sources that they simultaneously created in order to reference the
work. They used the piece to demonstrate the authoritative nature of Wikipedia, where
something becomes true once it appears on Wikipedia and citations are used as a
performative act (Kildall & Stern, 2011, p. 174).
Both the technical and the social studies of power and decision-making on Wikipedia
offer useful insights into the process by which decisions are made, but there are two
clear gaps in the way that power is defined and analysed within these studies.
The first is that there is a clear division between scholars’ conceptions of power on
Wikipedia as either technologically determined (predominantly through technical
permissions or the actions of automated agents) or determined via social mechanisms
(predominantly relating to policies and norms). Methodologically, this means that
studies involve analysis of either talk pages or edits but not both of these together.
These approaches tend to treat technical agents and contributors separately, without
accounting for the ways in which the social and the technical are imbricated with one
another. On Wikipedia, symbolic action takes numerous forms, only one of which is the
type of talk that takes place on the talk pages of articles.
Secondly, studies tend not to focus on the subjects of the article themselves, focusing
rather on the process by which editors make decisions within the context of Wikipedia’s
socio-technical structure. The subjects of the article about the city of Johannesburg or
the victims of the latest natural disaster or the sufferers of HIV-AIDS are important
stakeholders in the flow of facts through Wikipedia’s socio-technical system. Wikipedia
is an encyclopaedia made up of facts that are becoming authoritative for a wide range of
stakeholders. Understanding the impact of the representation of facts on the subjects of
those facts is therefore as important as understanding who participates in Wikipedia’s
community.
This thesis responds to each of these gaps by analysing the factors that play a role in the
travel of facts (following Morgan, 2010, as touched on earlier and explained in more
detail in Chapter 2) through Wikipedia’s socio-technical system. In doing so, we come to
understand how Wikipedia is reaffirming certain spokespersons of facts (in the form of
sources and citations) while relying on particular types of expertise (in the form of
Wikipedia editors) as well as who is left out of this process.
Instead of dealing with technological and social forces separately, I analyse how the
technological and the social are imbricated with one another in complex ways in the
form of policies, tools, norms and discourses. Instead of analysing only talk pages or edit
logs, I analyse all actions taken by actors and actants as the article progresses over time,
including discussions taking place outside of the talk page on Wikipedia’s meta pages
that are used to discuss issues of common concern, as well as on the multiple mailing
lists, blogs and media reports in which discussions about articles take place. Instead of
only focusing on editors, I also account for actors such as sources (and their authors and
publishers) as well as the subjects themselves, who may or may not have a voice in
making decisions over the article.
1.5 Research design and objectives
This thesis considers Wikipedia as a socio-technical system composed of multiple
intersecting networks that comprise both social relations between people and technical
aspects relating to organisational structure and processes. The term ‘socio-technical
system’ was originally derived from the field of organisational science, particularly
research by Trist, Bamforth and Emery on workers in English coal mines at the
Tavistock Institute in post-World War II Britain (see, for example, Trist & Bamforth,
1951). Wikipedia’s socio-technical system comprises Wikipedia’s complex
infrastructures and the human behaviour that accompanies those infrastructures.
Infrastructure, here, is defined according to
definitions posited by infrastructure studies scholars who take a broad relational view
of infrastructure that extends beyond bricks and mortar, tubes and wires, to encompass
a wide range of institutional and social practices, norms, standards and other artifacts
(Jackson, Edwards, Bowker, & Knobel, 2007; Sandvig, 2013; Star, 1999). Information
infrastructure, in other words, consists of the often-invisible network of objects, tools,
standards and practices that govern the way that we use the internet (Sandvig, 2013).
Wikipedia can be seen as consisting of infrastructures that maintain its functioning; it
can also be seen as infrastructure for the Web itself. Wikipedia is comprised of a large
network of artifacts, norms and practices that both support large-scale distributed work
and reflect people, places, events, and concepts from the world around us. Wikipedia
infrastructures include those for citation management, vandalism prevention, and the
distribution of tasks, amongst others. It is also, in itself, an
infrastructure for other knowledge communities who use the data extracted from
Wikipedia to seed their own knowledge bases (for example, search engines and other
apps as indicated earlier in this chapter).
In designing this research project, particular attention was paid to exposing the often-
invisible network of objects and processes, tools and practices, conventions and
vocabulary that are critical to Wikipedia’s daily functioning. Wikipedia is just one of
sixteen projects supported by the Wikimedia Foundation (see figure 1.4). Rather than
being bound by a particular project or language version (although numerous versions
of facts represented by Wikipedia’s multiple language versions were followed, English
Wikipedia was privileged in this study), this study followed the progress of sets of facts
as they moved through the network of sites housed by Wikimedia and beyond those
sites as facts find new users and uses.
Facts in Wikipedia are represented in the title of articles, the categories inserted into
articles, the series of facts that make up the body of articles and the links that cross-
reference the article with other articles inside and outside of Wikipedia. Furthermore,
facts travel beyond their initial publication on Wikipedia to other language versions, to
other Wikimedia projects, and outside of Wikipedia as facts are extracted for use on
external websites and applications. Conventions that originate in Wikipedia are also
taken up by other organisations and individuals but have origins in earlier
epistemologies and discourses (as explained in chapter five).
The following of facts in this thesis, then, involved the analysis of representations within
a variety of Wikimedia’s separate sites: the images and audio files embedded in
Wikipedia articles that were originally published on Wikimedia Commons, discussions
on mailing lists administered by the Wikimedia Foundation that referred to project
documents on Meta-Wiki, discursive structures embedded within Wikimedia
Foundation documents, and Wikidata’s structured representations of facts that are
embedded within numerous language editions of Wikipedia.
Figure 1.4 Wikimedia Foundation projects
Source: https://wikimediafoundation.org/wiki/Our_projects
The two sets of research questions that this thesis answers are defined according to the
practical tracing of facts as they traverse the networks of actors and practices that
become enrolled after facts are first published on Wikipedia, as well as the impact of the
dynamics of such travel on the sources of authority and expertise. The research
questions are thus defined as follows:
RQ1: Can facts be shown to travel within Wikipedia's socio-technical system? If
so, do some facts travel further and with greater integrity than others, and why
is this so?
RQ2: Is there evidence that the roles of knowledge authorities and experts are
being reconfigured within Wikipedia's socio-technical system, and if so, how has
this reconfiguration taken place?
Analysing how facts travel within Wikipedia’s socio-technical system at a granular level
provides evidence that demonstrates how certain authorities are more acceptable than
others to Wikipedians and how particular expertise is required in order to enable the
travel of facts within Wikipedia. Wikipedia constitutes a filter of facts that is both
socially and technologically constituted. Understanding the mechanics of that filter is the
primary objective of this thesis.
In order to implement this objective, I have employed the methods and principles of
ethnography. The field site has been constructed as a network (Burrell, 2009) by
following the progress of a series of articles over time, and by defining the human actors,
the non-human actants or objects, as well as the processes and practices that play a role
in how the subject is represented. Once the network was established, I investigated each
of the actors and actants through interviews and further following of related documents,
and initiated a series of participant observation activities in order to understand the
mechanics of related practices.
Three networks or followings were studied in the collection of data for this project. The
first involved following the reliable sources policy on Wikipedia as a way of
understanding the terrain in which facts travel. The second and third networks were
constructed by following the progress of two groups of facts through Wikipedia’s socio-
technical structure. Surr is a sport played in rural northern India and the article
representing it faced enormous challenges, stopping relatively short in its travels
through Wikipedia and to related databases. The 2011 Egyptian Revolution, on the other
hand, travelled rapidly and widely to a number of different language versions and
produced numerous related facts/articles.
In both cases, surr and the Egyptian Revolution are representative of more than just the
subjects they portray. Surr represents an attempt to extend the boundaries of Wikipedia
to cover issues about which there are few (if any) reliable, published sources according
to Wikipedia’s definitions. The Egyptian Revolution represents a continued expansion of
the boundaries of Wikipedia to include facts relating to breaking news. Both facts have a
political impact on their subjects. The acceptance or rejection of a cultural activity in
rural India is symbolic of whether colonial attitudes towards former colonies have been
abandoned or whether those attitudes have merely taken on a new form. The
representation of popular protests in Egypt’s Tahrir Square has an impact on the
legitimacy of future governments. While this is not a study designed to argue that these
cases are representative (in a probabilistic sense) of what happens in Wikipedia, I show
how they are emblematic of a particular range of ways that facts can, and as I will show,
do travel.
1.6 Conclusion
Every day, thousands of decisions are being made on Wikipedia about which facts
should be included and how they should be represented. These decisions have an impact
on the visibility (or invisibility) of subjects not only on Wikipedia’s platform but on the
sites and databases that extract Wikipedia facts for their own uses. Wikipedia’s
authority has become so great that it has become a battleground in which different
interests vie for control over the facts that structure our lives.
Understanding how these decisions are made and whom Wikipedia benefits most is
important for two key reasons. The first is that search engines are recognising
Wikipedia as public infrastructure for facts; Wikipedia’s representations, and those
advancing such representations, therefore need to be critiqued. Facts produced within Wikipedia
are increasingly fading into the background and becoming black boxed, so that readers
often accept Wikipedia’s representation of the world as natural and obvious, when it is
actually limited and constructed. Although some well-resourced media organisations
and academics are starting to critique Wikipedia’s representation of subjects, there is a
growing divide between those who understand Wikipedia’s representations in the
context of its complex rules and affordances, and those who do not. This thesis aims to
add to the practical stock of knowledge about how Wikipedia works and for whom it
works.
Beyond its identification as public infrastructure, Wikipedia is also recognised by many
as a model for the development of public infrastructure. As more and more private
interests are represented on Wikipedia, it becomes important to understand whether
Wikipedia can, indeed, represent public interests, whether Wikipedia’s biases can be
discussed and deemed acceptable or not, or whether they will remain hidden in the
black box.
This thesis is a step towards understanding how expert identities are being
reconfigured in the current information environment, and what the political
implications of such changes are for the representation of knowledges around the world.
Initially portrayed as David to traditional media’s Goliath, Wikipedia was not always
recognised as authoritative. Since anyone could edit Wikipedia, it seemed to offer the
opportunity for everyone to have their views represented, free from the biases and
strains of capitalist media. Wikipedia was part of a growing number of participatory
platforms that were considered open for ordinary people to be able to control.
Today it has become increasingly clear that Wikipedia is possibly a new Goliath,
reconfiguring the network of actors involved in the production of knowledge about the
world. Wikipedia has become a powerful authority over the domain of facts, but
questions remain regarding how decisions are made about what to include and exclude
in Wikipedia’s representation of phenomena, and who (or what) is in charge of such
representations. This thesis challenges theories that position Wikipedia and other
participatory media projects as foregoing the need for information gatekeepers, giving
rise to a group of unnamed amateurs and giving voice to all who participate in their on-
going construction. Recognising Wikipedia as a socio-technical phenomenon that has
established new centres of authority and expertise through the enactment of symbolic
performances, this thesis demonstrates how digital networks give rise to new centres of
expertise while reaffirming traditional centres of authority.
Chapter 2: Theoretical framework
‘(P)eople enact themselves as subjects of power through the Internet, and at the
same time bring cyberspace into being’ (Isin and Ruppert, 2015, p. 12)
Wikipedia has, in the past, been theorised as an example of ‘social media’ (Fuchs, 2013;
van Dijck, 2013), ‘peer production’ (Benkler, 2006) and ‘produsage’ (Bruns, 2008).
Benkler and Fuchs, for example, classify Wikipedia in terms of its non-profit
collaborative model. Benkler (2006) groups Wikipedia with other volunteer-driven
projects such as Linux (a computer operating system assembled under the model of
free and open-source software development and distribution) and SETI@home (a
scientific experiment that uses internet-connected computers in the Search for
Extraterrestrial Intelligence; see http://setiathome.ssl.berkeley.edu/)
in order to argue that these initiatives
represent a new democratic force in society. Similarly, Fuchs (2013) argues that
Wikipedia has socialist potential because, unlike exploitative sites such as Facebook and
Google, Wikipedia is free of adverts, open source, communally constructed and freely
available to all. Van Dijck (2013), on the other hand, considers Wikipedia along with
Facebook, Twitter, Flickr and YouTube as a social media platform in which particular
cultural, economic and ideological forces sustain an ecosystem of connective media on
the Internet.
There is value, however, in theorising Wikipedia on its own terms. Wikipedia defines
itself in policy and practice primarily as an encyclopaedia, as distinct from sites like
Facebook or Twitter (the policy Wikipedia:What Wikipedia is Not declares that
‘Wikipedia is not a social networking service like Facebook and Twitter’), and
specifically as a platform for the representation of facts.
Science and Technology Studies, particularly scholarship relating to the communication
of scientific facts, is therefore relevant to theorising how Wikipedia influences the travel
of facts within its socio-technical system and what this means for the distribution of
power to represent knowledge.
This chapter provides the theoretical framework for situating the two research
questions that have been posed: the first concerning the reasons why some facts travel
further than others in Wikipedia; the second concerning the impact of such conditions
on the ways in which knowledge authorities and experts are being reconfigured. The
chapter begins with an outline of the debates within STS about the social versus
technological shaping of technology in order to situate the current research.
The chapter then introduces Morgan’s (2010b) framework for the factors influencing
the travel of facts, which includes facts’ ‘good companions’, ‘boundaries and terrain’
and ‘character’. The second part of the chapter highlights the ways in which the
boundaries between authoritative and non-authoritative companions, verifiable and
unverifiable truth, and reproducible or un-reproducible facts have been defined by
knowledge communities in the construction of everyday discourse in the past. Such
discourse proves to be not only descriptive (whereby rivals denigrate one another
using descriptive language, for example) but also performative in the sense of Isin and
Ruppert’s (2015) digital speech acts.
The final section provides an analysis of co-production (Jasanoff, 2004) as a way of
framing the second research question regarding authority and expertise. As Jasanoff
writes, reconfirming traditional authorities during times of significant socio-technical
change is a way of putting things back together, although the identity of the expert will
change. Empirically examining how such authorities and identities are being
constructed within Wikipedia is highlighted as the core goal of the chapters that follow.
2.1 Social vs. technological determinism
The extent to which technologies and facts develop according to either social or
technological forces constitutes a key debate in the field of science and technology
studies. On the social constructivist end of the scale, distinctions are made between
‘mild’ and ‘radical’ social constructivism (Sismondo, 1993), with mild social
constructivists arguing that social factors are involved in science and technology, and
radical social constructivists holding that technologies are purely the product of
socially negotiated meanings. The Social Construction of Technology (SCOT) approach
developed by Bijker, Hughes and Pinch (1987), for example, recognizes the ways in
which social forces influence science and technology:
(B)oth science and technology are socially constructed cultures and bring to
bear whatever cultural resources are appropriate for the purposes at hand.
(Bijker, Hughes & Pinch, 1998, p. 20)
According to the SCOT approach, claims become accepted as scientific truth in three key
steps. When scientists initially make claims, and technologies are developed, there is a
high degree of interpretive flexibility. Scientific facts can be widely interpreted and
technologies variously shaped at this stage of their development. Relevant social groups
associated with the fact or artifact attach different meanings to it because it fulfils
particular functions and solves specific problems for them. These meanings
inevitably come into conflict with one another as they suggest different alternatives for
the fact or the artifact.
A single truth emerges or technologies are stabilized when consensus emerges and
controversies are terminated. This process of closure and stabilization occurs through
either rhetorical closure (whereby relevant social groups see the problem as being
solved) or through a redefinition of the problem within a wider context.
Figure 2.1 The stabilization of facts and artifacts according to Bijker, Hughes & Pinch (1987):
fact/artifact → controversy → stabilization through closure/redefinition
Source: Adapted from Bijker, Hughes & Pinch (1987)
MacKenzie & Wajcman (1985) reiterated the importance of the social forces
underpinning the ways in which science and technologies develop in their influential
volume titled ‘The Social Shaping of Technology’ (MacKenzie & Wajcman, 1985). The
authors stressed that technology does not affect society in some independent way;
rather, technology is itself socially shaped. According to MacKenzie and Wajcman, it is
not that technology has no social effects, but that the effects of technology are complex
and contingent (MacKenzie & Wajcman, 1985, p. 4). Included in the volume were
articles such as Langdon Winner’s ‘Do artifacts have politics?’, which indicated how
technologies are not neutral but are designed in ways that open certain social options
and close others.
In addition to recognising the importance of social and cultural factors in influencing the
development of technology, MacKenzie and Wajcman noted in the second volume (1999,
p. 23) that they had originally neglected the influence of technology upon social
relations and argued that it is important to recognise that social relations are not
independent of technology. They explain that actor-network theory, developed by
scholars including Latour, Callon, Akrich and Law is useful in understanding how
technology and society are mutually constituted, that artifacts and technologies are
what makes society possible and that both society and technology are made of the same
'stuff': networks linking humans and non-humans together.
Actor network theory (ANT) proposes that social issues or practices must be recognized
as networks of relationships among human actors and non-human actants. ANT scholars
reject the classification of issues or practices as either nature or culture, science or
politics, but rather suggest that material artifacts can be agents within networks of
relations. Although the concept that artifacts can have agency is controversial, the
heterogeneity of actor networks places artifacts and objects centrally within the theory.
According to MacKenzie (1996), what is important about ANT is that it reminds us to
keep two aspects of technical change in mind: the first is the way in which the physical
aspects of heterogeneous engineering are influenced by the demands of its social
aspects, that is the ‘social shaping of technology’; the second is that ‘artifacts have
politics' (Winner, 1980), that is, technologies are not neutral; their adoption and operation often involve changes to the existing social order.
(T)he actor-network perspective offers a useful critique of the fact that much
social theory conceives of social relations as if they were simply unmediated
relationships between naked human beings, rather than being made possible
and stable by artifacts and technologies. (MacKenzie, 1996, p. 14)
Actor Network Theory proposes that what we define as social issues or phenomena are
actually networks of human actors and non-human actants interacting together (and
often at odds with one another). Michel Callon (1986), one of the proponents of ANT,
highlights the ways in which controversies are resolved through the example of the
controversy about the causes for the decline in the population of scallops in St. Brieuc
Bay in France. Callon describes how the researchers who participated in the debate
about what were the causes of the decline in the population of scallops each developed
contradictory arguments and points of view which led them to propose different
versions of the social and natural worlds. He demonstrates how the researchers who
were successful in imposing their definition of the situation on others were those who
were able to define their identities in such a way that they became an obligatory passage
point in the network of relationships that they were building. Furthermore, successful
researchers were those whose determinations of the roles and identities of different
actors were enacted and unopposed. The fishermen of St. Brieuc Bay, the researchers’
scientific colleagues, the scallops of St. Brieuc Bay and the researchers themselves
became enrolled in a representation of the controversy according to the researchers'
own interpretation of actors’ interests and behavior.
A process of translation occurred when the researchers imposed themselves and their definition of the situation onto others through problematisation, interessement, enrolment and mobilization. Problematisation refers to the movement in which certain actors become indispensable in the network; interessement, to the processes by which actors lock other actors into the roles that they propose; enrolment, to the device by which a set of interrelated roles is defined and attributed to actors who accept them;
and mobilization, a set of methods used by actors to ensure that supposed spokespeople
for various relevant collectivities are properly able to represent those collectivities.
Translation is the mechanism by which the social and natural worlds
progressively take form. The result is a situation in which certain entities control
others. Understanding what sociologists generally call power relationships
means describing the way in which actors are defined, associated and
simultaneously obliged to remain faithful to their alliances. (Callon, 1986, p.
224)
Latour and Woolgar’s ‘Laboratory Life’ (1979) similarly analysed the ways in which
science was being constructed within another actor network, that of the laboratory. In
particular, Latour and Woolgar recognized the role of the material environment of the
laboratory in influencing the kinds of facts that are created. Investigating patterns of
communication in a scientific laboratory, the authors traced the routine practices of
publishing papers, negotiating research finances and seeking scientific prestige in the
trajectory of scientific facts. Latour and Woolgar note that there is a heavy focus on
documents and literature in the laboratory with scientists spending ‘the greatest part of
their day coding, marking, altering, correcting, reading, and writing.’ (p. 49)
Latour extended these findings in his book 'We have never been modern' (1993),
where he proposes that in order to study the relationship between science and society,
we need to maintain the unity between science and the social. Modern (Western) society
has constructed this division but it continuously threatens to unravel with the existence
of what Latour calls ‘hybrids’ that mix politics, technology, science and culture. Objects
such as the cloned sheep Dolly or government experts are neither natural nor social but are instead linked together in heterogeneous actor-networks constituted by the public interaction between people, objects and discourses.
According to Sheila Jasanoff (2004), however, the problem with Latour’s conception of
power within actor-networks is that it is determined by the size and concentration of
the network. Latour writes that power tends to concentrate in ‘centres of calculation’
such as the printing presses, statistical formulas, maps and other ‘inscription devices’
which render dominant perceptions of the world into portable representations (Latour,
1987, 1992, 1993). According to Jasanoff, there are significant gaps in Latour's articulation of power relations because he does not account for the diversity in the acceptance of scientific claims by some groups as opposed to others.
[Latour is] silent on why technological practices or the credibility of scientific
claims varies across cultures; why some actor-networks remain contested and
unstable for long periods while others settle quickly; why work at some nodes
stabilizes a network more effectively than at others; or what role memories,
beliefs, values and ideologies play in sustaining some representations of nature
and the social world at the expense of others. (Jasanoff, 2004, p. 39)
Instead, Jasanoff argues for a co-productionist approach to power in which power
becomes lodged in representations, discourses, identities and institutions during times
of significant socio-technical change. Underlying the theory of co-production is a theory
of the impact of social and technological change that sees both social and technological
factors as imbricated with one another, rather than happening in sequence. Instead of
technologies causing social change, technologies are themselves created as a result of
changes that cannot be narrowed down as either social or technical.
Co-production can be seen as an attempt to bring together strains of social and
technological determinism over the history of STS, as well as to bring in a normative
lens that recognizes the role of power in socio-technical change which has been lacking
in the field.
Jasanoff's expansion of the co-production idiom draws on a Foucauldian notion of power, in which power is not the ability to exert violence or force but rather the ability to classify. Foucault (1980) argues that society
tends to value particular forms of knowledge over others, to recognise some forms of
knowing and not others, to accord some people with the designation of expert and
others as lacking true knowledge. The power to represent knowledge is therefore the
power to define the rules underlying determinations of what is true and false. Foucault
uses the term power/knowledge to signify how truth is constructed by societies
according to the power of certain groups.
Truth is a thing of this world: it is produced only by virtue of multiple forms of
constraint. And it induces regular effects of power. Each society has its regime
of truth, its “general politics” of truth: that is, the types of discourse which it
accepts and makes function as true; the mechanisms and instances which enable
one to distinguish true and false statements, the means by which each is
sanctioned; the techniques and procedures accorded value in the acquisition of
truth; the status of those who are charged with saying what counts as true.
(Foucault, 1980, p. 131)
Foucault explains how these regimes of truth are the result of scientific discourse and
institutions and are reinforced and redefined continuously through the media, the
education system, political and economic ideologies. The battles that are waged over
truth are not only about which are the absolute truths but rather about the ‘rules
according to which the true and false are separated and specific effects of power are
attached to the true’; a battle about ‘the status of truth and the economic and political
role it plays' (Foucault, 1980, p. 132). 'Truth regimes', in other words, are sets of rules
that define truth in particular ways.
Scientific objectivity is a ‘truth regime’ that Donna Haraway (1988) critiques in her
work on ‘situated knowledges’. According to Haraway, objectivity as it is practiced in
male-dominated science is an illusion. There can be no ‘infinite vision’ – it is a ‘god trick’
(Haraway, 1988, p. 581). On the other hand, recognising the situated nature of all
knowledge enables a new understanding of objectivity which is about ‘limited location
and situated knowledge, not about transcendence and splitting of subject and object’
(Haraway, 1988, p. 583). Haraway believes that such 'subjugated standpoints' are preferred because they 'seem to promise more adequate, sustained, objective, transforming accounts of the world' (Haraway, 1988, p. 584). From this perspective, all knowledges are local knowledges; there can be no universal knowledge.
The benefit of Jasanoff's co-production idiom, according to Lievrouw (2014, p. 31), is that it helps in understanding the relationship between technology and society, in particular how power is materialized not as 'an abstract "force" or institutional "structure," but… as observable in the physical forms of social practices, relations, and material objects and artifacts' (Lievrouw, 2014, p. 31). Citing Irwin (2008, p. 589), Lievrouw writes that,
Co-production encourages a move away from strong social determinism, and the assumption that 'social controversies around science are "really" all about politics or that complex areas of innovation can be reduced to "social construction"' (Lievrouw, 2014, p. 31).
According to the co-production idiom, the making of identities is one of the key ways in which technology and society co-produce one another. Expert identities are the result not only of mastery over particular skills, but also of the tacit knowledge and authority accorded to particular identities by society. As such, it is important to understand how scientific authority involves the power to define what counts as the truth and how scientific authority is conferred by society.
Although co-production is useful in framing identity as a co-product of
society and technology, it has mostly been applied to cases in which there are already
stable categories of science and scientist, and there is consequently a gap in
understanding the particularities of contexts in which activity is almost entirely
mediated by software and code. Mediated environments afford very different types of
activity, identity and power relations.
Users are operating under varying levels of anonymity and pseudonymity that provide
different levels of identity construction and certification; users have different levels of
technical permissions that afford different types of activities; and the architecture of the spaces in which users operate is determined by a particular type of materiality: the
materiality of code. These materialities have an effect on the kinds of identities that are
either re-affirmed or co-produced because they affect which actors are able to
undertake which actions in particular environments.
In addition, the process of fact building in particular needs to be explored in order to understand both the social and technical forces that are relevant to the travel of facts
within Wikipedia’s socio-technical structure. In the next section I highlight research
relating to the travel of facts, followed by analyses of how the environment in which
facts travel is negotiated and architected.
2.2 Facts and knowledge
Although Wikipedia aims to represent ‘the sum of all human knowledge’ (Wikimedia
Foundation, n.d.), one of the site’s foundational policy documents (Wikipedia:What
Wikipedia is Not) indicates that Wikipedia is an encyclopaedia that represents facts,
rather than a series of other types of formats, media or systems. There are three key
differences between knowledge and facts that provide insight into some of the
foundational limitations that Wikipedia encounters when trying to represent knowledge
about the world.
Firstly, facts are a representation of knowledge rather than a mirror of reality. Reality
must be represented and communicated in language and, according to social
psychologist Sandra Jovchelovitch who writes about knowledge in context,
representation is the only path to knowledge that we have.
Our knowledge of the world depends on representational processes; as mediating
structures bridging the world of subjects and the world of objects they deeply affect the
structure of knowledge. Knowledge in this sense is neither a copy of the world nor is it the world, but it is in the world. Knowledge systems are proposals of the world – literally representations whose processes of construction we need to understand and to unpack if we are to understand their complexity and variability in social life. (Jovchelovitch, 2006, p. 100)
Secondly, the social features of facts are progressively removed as they travel to other
contexts. Whereas knowledge resides in the mind or in the flow of practice (Lave &
Wenger, 1991), facts are structured and perceived as distinct from the context from
which they arise and can therefore be moved around from one context to another.
Jovchelovitch writes that, although 'knowledge is a plural and plastic phenomenon', for
many ‘the trick behind the construction of true knowledge seems to be the progressive
detachment of the internal structures of knowing from the subjects, communities and
cultures that give knowledge its substance and its raison d’être’ (p. 98).
Thirdly, there are different ways of knowing the world, but only certain groups and
identities are attributed with being able to develop true knowledge in the form of facts,
whereas other groups are said to merely produce opinion, myth, or belief. Jovchelovitch argues that it is in the realm of science that contemporary societies tend to locate true knowledge in the form of facts and that people other than scientists ('a housewife, a five-year-old child, or a peasant living in a rural community') are taken rather to hold 'lay beliefs, ideologies, myths, or superstitions, but not knowledge' (p. 99).
In order to understand statements that are taken as factual and therefore authoritative,
we need to understand the process by which facts travel from their point of origin to
other contexts. The travel of facts is critical to their power; facts that don’t travel well
cannot become known and therefore do not have the power to define the phenomena they describe.
Facts need to move out of the laboratory or other origins of their production in order to
become known, independent and authoritative. According to Steven Shapin, the
communication (or representation) of facts is not distinct from, but part of their
construction.
(S)peech about natural reality is a means of generating knowledge about reality,
of securing assent to that knowledge, and of bounding domains of certain
knowledge from areas of less certain standing. (Shapin, 1984, p. 481)
In the context of Wikipedia, where editors are forbidden from producing their own
research, all facts represented on Wikipedia are produced elsewhere. Like any other
representation of knowledge, representations of facts on Wikipedia are not a perfect
mirror but rather a (re)presentation of knowledge with its own unique characteristics.
In communicating facts through the medium of the Wikipedia article and in response to
Wikipedia’s policies, norms and local interactions with other editors, Wikipedians add
particular elements to the fact as it moves through the Wikipedia system.
The way in which Wikipedians summarise a fact in a scientific article, for example,
imposes a particular emphasis and classification, a particular association as the fact is
grouped with other facts. The fact needs to remain largely intact as it travels, but its companions and the environment in which it travels are instrumental to that travel and therefore to its on-going construction.
A research project entitled ‘How Well Do “Facts” Travel?’ (Morgan, 2010a) investigated
the transmission and reception of facts ‘traveling across time, between disciplines,
between academia and policy, between the lay public and the specialist professional,
and in the physical sense across countries, embodied in people, spoken, written,
performed and executed, or disembodied in books, diagrams, and technologies’ (see the
project website at http://www.lse.ac.uk/economicHistory/Research/facts). In an
introduction to the book from the project, Mary S. Morgan (2010b) defines facts as
'autonomous, short, specific and reliable pieces of knowledge' (p. 8). Facts come in many forms. They are distinguished by their diversity in expression, scope and size across the different communities that employ them. Facts can be expressed and represented either linguistically or in the form of objects and artifacts; they can be little, big, singular, multiple or generic (p. 8).
The travel of facts is the central remit of the research project, with travelling ‘well’ being
defined in two key ways. Firstly, when a fact travels well it travels ‘with integrity’ so that
it travels more or less intact from one place to another. Secondly, a fact travels well
when it travels ‘fruitfully’, that is, beyond its spatial or disciplinary boundaries to find
new users and uses of the fact (p. 12). On the other hand, facts may be impeded from
traveling well by ‘bad companions, companions who alter the fact to subvert it, re-label
it, cast doubt on it, and otherwise discredit it as they see it on its way’ (p. 30).
Expanding on the 'fruitful' travel of facts, Morgan articulates her position with regard to the construction or discovery of facts that has been a feature of debates in the sociology of knowledge and the so-called 'science wars' of the late twentieth century.
According to Morgan, there are two core ways of thinking about facts from the
production side. On the one hand, there is the idea that facts are found or discovered
after heavy labour in the laboratory, field, archive or museum. On the other hand, facts
are viewed as being constructed out of activities of social networks and practical
instruments. Since Morgan is focused specifically on the travel of facts after production,
she looks to two theoretical positions regarding whether (or how) facts change as they
travel.
According to Latour (1987), the ‘marks’ of science are mobile and will travel only if they
are immutable, presentable, readable and combinable. Fleck (1979), on the other hand,
argues that facts are developed and understood only within knowledge communities. As
knowledge travels from one community to another it has to be translated, and in the process its meaning changes to some degree; it thus loses its integrity in traveling.
Morgan argues that her sense of what happens to facts as they travel 'fits untidily between Latour's marks and Fleck's community facts' (p. 13). Her notion of facts is that they are not always as mobile as Latour would suggest, since they are not necessarily linguistic (marks on a headstone, for example). Her conception is that facts are also more independent than Fleck suggests. According to Morgan, the travel of facts may involve a transformation in meaning, since a community defines its own facts, but the circulation of such knowledge doesn't necessarily involve its transformation (p. 15).
Morgan uses the analogy of facts as rubber balls to articulate the changeability of facts
as they travel:
They have a certain shape; they can be carried, rolled, squeezed, bounced, kicked
and thrown without harm to them; and they can be used in many different ways
and in different situations. (Morgan, 2010b, p. 15-16)
In addition to their steadfast and mutable characteristics, facts have accompanying
details or qualifications that constitute contextual elements that may be shed on a fact’s
travels. In addition to losing contextual elements, facts may also pick up extra elements
on their travels and become sharpened (p. 16).
In the context of a Wikipedia article, facts are reflected in the article's title and its redirects14, in the content of the article and the associations of facts with particular headings, in categories, images, captions, infoboxes and hyperlinks.

14 A redirect is a page that has no content itself but sends the reader to another page, usually an article or section of an article. Redirects enable a page to be reached under alternative titles on Wikipedia.

A fact travels well within Wikipedia if the source is accurately reflected in the context of the article, if the fact is translated into other languages on Wikipedia and linked to other articles. The fact travels well if it and its constituent facts were extracted so that they could be represented in Wikidata and other platforms that rely on Wikipedia data. A
fact travels less well within Wikipedia if the source is inaccurately summarised in the
article, if it is not translated into other languages, if its viability as a reliable fact is
questioned by other editors using disruption mechanisms, and if the facts were not extracted by Wikidata and thus not available to other platforms such as Google.
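The role that redirects play in a fact's travel can be illustrated concretely: the MediaWiki software behind Wikipedia exposes a standard web API through which the redirects pointing at an article can be listed. The following is a minimal, illustrative Python sketch (not part of the study's method; the article title is arbitrary, and actually fetching the URL assumes network access) that constructs such a query:

```python
from urllib.parse import urlencode

def redirect_query_url(title, lang="en"):
    """Build a MediaWiki API URL that lists the redirects pointing to `title`.

    Uses the standard `action=query&list=backlinks` module with
    `blfilterredir=redirects`, so the response names the alternative
    titles under which the article can be reached.
    """
    params = {
        "action": "query",
        "format": "json",
        "list": "backlinks",
        "bltitle": title,
        "blfilterredir": "redirects",  # only backlinks that are redirects
        "bllimit": "50",
    }
    return f"https://{lang}.wikipedia.org/w/api.php?" + urlencode(params)

# Example: construct (but do not fetch) the query for the article "Scallop"
url = redirect_query_url("Scallop")
```

Fetching the resulting URL returns JSON listing each alternative title, one observable trace of how widely a fact's article can be reached within the encyclopaedia.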
Facts are present in every domain, and in every domain, facts are contested. Although
much research in STS has focused on the construction and flow of facts in the field of
science, facts are the bedrock of both academic and non-academic enterprises that are
based on the creation and dissemination of facts and knowledge. Like natural science,
history advances by scholars interrogating facts established in the past. Facts are a form
of shared knowledge. Facts do not represent belief or opinion but rather knowledge that,
according to (a community’s) standards of evidence of discipline, time and place,
that community has good reason to take those things as facts and will be likely to
have the confidence to act upon them as facts. (Morgan, 2010b, p. 11)
Facts are produced according to these standards of evidence imposed by the community, such as rules of observation, experimentation or inscription in scientific disciplines. In order for facts to become accepted, however, they have to be communicated to others.
According to Morgan, there is no free market for facts in which 'good facts' will necessarily crowd out 'bad facts'. Rather, some facts meet fierce resistance among people with opposing interests and ideologies. Morgan highlights three key factors that
impact the travel of facts. They include ‘good company’, ‘boundaries and terrain’, and
‘character’ (see figure 2.2).
Figure 2.2 Factors influencing how far a fact travels according to Morgan (2010b)
Source: adapted from Morgan (2010b)
‘Good company’ refers to the people and structures that support a fact's travels but are
not part of the facts themselves and can be discarded when the fact reaches a new
destination. According to Morgan, good travelling companions are required for facts to
travel well and with authority.
(Travelling companions) range from the mundane level of labels and packaging,
to the more material vehicles of transportation, as well as to the people involved
in chaperoning, and from the various kinds of institutional structures that
support travelling knowledge, to the technical standards that carry facts with
them (Morgan, 2010b, p. 27)
The packaging of facts ensures that they travel well in terms of retaining their integrity
as they pass from one location to another. Facts traveling in images, in specimens, in bio-
informatics data all need to be carried by people, by data processes and technical
standards. This is not about the agencies of individual producers of facts but about those who package facts for travel, the users or audiences who unpack them, the network of people and things via which they travel, and the social arrangements within which these travels are embedded (p. 21).
(Figure 2.2 distinguishes three sets of factors: traveling companions – authorities, labels and packaging, technical standards, processes, speakers; boundaries and terrain – norms, policies, epistemologies; and character – attributes, characteristics, functions.)
In Wikipedia's socio-technical environment, 'traveling companions' can be identified in the editors who summarise sources and reflect them as facts in articles, in the classification standards and techniques operating within Wikipedia and its associated projects including Wikidata, and in the technical permissions that give certain editors greater control over editing all aspects of the article, including the tools used to frame content such as templates.
‘Boundaries and terrain’ refers to the disciplinary landscape, material elements, and
requirements for a specific technical understanding that limit the range of traveling
facts. According to Morgan, the terrain metaphor can be defined in terms of numerous
features.
We can construct the terrain in sociological terms, for example as a disciplinary
landscape in which expertise, trust, and power form the features of the terrain
and define the barriers to be overcome. Or we can construct it in terms of the
material elements of the science or humanities in which models, instruments and
experiments - or archives and previous historical authorities - constitute the
terrain. A third possibility is to interpret the terrain and boundaries in cognitive
and epistemic terms, where the requirements for a specific technical
understanding, or a knowledge of historical period, limit the range of the
travelling facts or their ability to remain intact as they travel. (Morgan, 2010b, p.
31)
Applied to Wikipedia, I define boundaries and terrain as the norms, policies and epistemological standards applied within Wikipedia's environment that enable certain facts to travel well and others to be stopped in their tracks. How Wikipedia defines its
boundaries is a product of its encyclopaedic identity, its origins in the free and open source software movement and the idea of the Internet as liberatory, and its relationship to
authoritative sources of knowledge. This environment plays a very particular role not
only in how facts travel through it, but which facts will travel more easily than others
based on how they are defined by different actors as being located within or outside of
the boundaries being set.
Finally, the character of facts refers to the specific attributes, characteristics, and
function of facts. Facts have unique qualities with the potential to develop their scope or
become generic. Facts can be ‘understandable’, ‘surprising’, ‘colourful’, ‘reproducible’,
and ‘adaptable’. Facts with character travel well, that is, they travel far and with
integrity.
Within Wikipedia’s socio-technical system, certain facts travel further than others
because of their character. Thus, facts that are reproducible such as demographic data
about cities are well represented in Wikipedia. Reproducibility depends on the
availability of data about cities as well as the automated agents on Wikipedia that
produce and translate particular types of articles and facts. Not all countries in the
world maintain the same level of detail for cities and municipalities, towns and villages,
and so not all cities are represented by facts that are reproducible.
Morgan produces a clear framework for understanding how facts travel, but research into the construction and popularization of facts has shown that the processes of defining who counts as 'good' traveling companions, where the boundaries lie between fact and opinion, and which facts are more reproducible than others are equally important to understanding the travel of facts. Studies in this arena have considered the social
(Bijker, Hughes & Pinch, 1987) and socio-technical (Latour, 1987; Jasanoff, 2006; Isin
and Ruppert, 2015) interactions that result in certain facts being spread further than
others, the ideological impetus behind the travel of facts (Geertz, 1973; Gieryn, 1983,
1999) and the infrastructure required for the spread of facts (Shapin, 1984).
Common to all of these studies is a conception of rhetoric as the basis of definitional
practice and boundary work. Studies have predominantly been done in the realm of
science studies where scholars have investigated how stakeholders collectively
articulate the boundaries between facts and opinions, science and non-science, and
ideology and science. Also relevant are studies of controversies in which particular facts are contested through debates over who are the appropriate authorities to speak on behalf of particular subjects.
2.3 Language, ideology and rhetoric
Clifford Geertz (1973) developed a theory of ideology out of a concern that ideology was
becoming a pejorative term rather than an analytical concept. For many scholars, ideology marks a deviation from social scientific objectivity, and the term is frequently used pejoratively for statements that the speaker believes to be biased, emotional or distorted as opposed to independent, unaffected and scientific. Geertz argues that the boundary between ideology and science is unclear.
Where, if anywhere, ideology leaves off and science begins has been the Sphinx’s
Riddle of much of modern sociological thought and the rustless weapon of its
enemies. (Geertz, 1973, p. 194)
According to Geertz, what is missing is a way of understanding how some ideological
statements achieve resonance and success, while others are determined to be lies or
propaganda. Instead of evaluating statements according to whether they are ideological
or scientific, we need to separate out the concept of ideology and understand why
certain statements are amplified while others are stopped in their tracks.
Two key theories dominate our understanding of ideology, according to Geertz. The first
is the ‘interest theory’; the second, the ‘strain theory’. Interest theories of ideology,
advanced by Marxist scholars, argue that ideological statements are expressions of
individuals’ strategic moves to garner advantage in the social (class) system. Strain
theory, on the other hand, frames ideology as a response to the inherent inconsistencies
(strains) of everyday life.
For the first, ideology is a mask and a weapon; for the second, a symptom and a
remedy. In the interest theory, ideological pronouncements are seen against the
background of a universal struggle for advantage; in the strain theory, against
the background of a chronic effort to correct sociopsychological disequilibrium.
In the one, men pursue power; in the other, they flee anxiety. (Geertz, 1973, p.
201)
Interest and strain theories are not incompatible for Geertz. Whereas interest theories
emphasise strains on the individual, strain theories emphasise strains on society. Both
are symbolic, both see ideology as distorting social reality, and both require that, in
order to explain ideological statements, one examines their social context.
Geertz provides a definition of ideology as ‘systems of interacting symbols, as patterns
of interworking meanings’ (p. 207). People become attached to certain symbolic systems
rather than others, argues Geertz, and analysts require a deeper understanding of how
symbolic formulation works in order to understand their meaning. We need, according
to Geertz, an understanding of style, something that is currently missing from the
sociologist’s toolbox.
With no notion of how metaphor, analogy, irony, ambiguity, pun, paradox,
hyperbole, rhythm, and all the other elements of what we lamely call “style”
operate even, in a majority of cases, with no recognition that these devices are
of any importance in casting personal attitudes into public form, sociologists
lack the symbolic resources out of which to construct a more incisive
formulation. (Geertz, 1973, p. 209)
In order to understand meaning making, Geertz recommends that we look at the social
realities in which the symbol appears in order to understand how socio-psychological
strains are expressed in symbolic forms. Symbolic systems are frameworks for
understanding how to act in the world; they are ‘extrapersonal mechanisms for the
perception, understanding, judgment, and manipulation of the world’ (p. 216). Geertz
writes that during times of significant change we have the most use for ideologies in
order to make our way.
It is precisely at the point at which a political system begins to free itself from
the immediate governance of received tradition, from the direct and detailed
guidance of religious or philosophical canons on the one hand and from the
unreflective precepts of conventional moralism on the other, that formal
ideologies tend first to emerge and take hold. (Geertz, 1973, p. 219)
According to Geertz, this doesn’t mean societies return to naïve traditionalism when
faced with significant change but rather ‘ideological retraditionalization’ (p. 220). It is
when socio-psychological strain meets with an absence of cultural resources by which to
make sense of the strain that ideologies are formulated and advanced.
Gieryn (1983, 1999) adopts Geertz’s definition of ideologies as systems of interacting
symbols to indicate the main ways in which the boundaries between science and non-
science are determined. Studying the rhetoric engaged in by scientists in ousting their
rivals during public debates, Gieryn writes that boundary-work is used by ideologists of
a profession or occupation such that:
when the goal is expansion of authority or expertise into domains claimed by
other professions or occupations, boundary-work heightens the contrast
between rivals in ways flattering to the ideologists' side; when the goal is
monopolization of professional authority and resources, boundary-work
excludes rivals from within by defining them as outsiders with labels such as
‘pseudo’, ‘deviant’, or ‘amateur’; when the goal is protection of autonomy over
professional activities, boundary-work exempts members from responsibility
for consequences of the work by putting the blame on scapegoats from outside.
(Gieryn, 1983, p. 792)
Scientists’ authority, according to Gieryn, is maintained by the public articulation of
scientific practice and episodes of boundary-work which involves attributing particular
characteristics to the institution of science in order to construct ‘a social boundary that
distinguishes some intellectual activity as non-science’ (p. 782). Gieryn theorised that
scientists establish boundaries between science and non-science, scientists and non-
scientists through a process of classification work in which they tried to show how
science was unique from rival forms.
Gieryn acknowledges that the characteristics of science vary; science is no single thing.
‘The boundaries of science are ambiguous, flexible, historically changing, contextually
variable, internally inconsistent, and sometimes disputed’ (p. 792). The characteristics
assigned to science are sometimes inconsistent because scientists need to erect separate
boundaries in response to different obstacles to their pursuit of authority and resources.
Boundaries are sometimes contested by scientists with different professional ambitions
and there is ambiguity from the simultaneous pursuit of separate professional goals,
each requiring a boundary to be built in different ways. Whereas some speakers use
certain characteristics to define science according to their own interests and strains,
others may attribute science with completely different characteristics.
Furthermore, success in boundary-work, according to Gieryn, is temporary. Success,
defined as the successful attribution of particular features of science to the field, is
always provisional and contextual. Instead of operating as ‘determinants of who wins’,
‘Scientific practices and antecedent representations of it form a repertoire of
characteristics available for selective attribution on later occasions’ (Gieryn, 2001, p.
406). Descriptions of science and scientists, in other words, operate as specific,
concrete practices that can be more easily pointed to in order to advance interests. Such
descriptions act as rhetorical evidence or maps of knowledge work.
Gieryn’s understanding of how struggles to define science are won and lost is useful
because it points to the ways in which success is determined by the ability to define
particular roles using rhetorical moves that set up a territorial arrangement and have
those definitions accepted. Strategies, then, are particular to the context and the
environment; they require situationally specific vocabularies in order to succeed.
Although Geertz and Gieryn predominantly talk about boundaries between science and
other knowledges, the theory of ideologies as symbolic systems and boundaries being
determined by ideological statements or rhetoric can be applied to other knowledge
communities.
The problem with Gieryn and Geertz’s conception of the methods used to construct
boundaries between different types of knowledge and different types of knowers is that
they focus only on the descriptive nature of language, without recognizing that
language can also be performative (Austin, 1975). Knowledge claims aren’t just made,
they are made to perform; the utterance of statements does work in the world.
Shapin (1984) recognises the importance of literary and social conventions, in addition
to material technologies required to legitimate knowledge. Shapin recounts what was
required in order for Robert Boyle's experiments in pneumatics in the late 1650s and
early 1660s to succeed in being seen as the legitimate means by which knowledge was
to be generated and evaluated. Infrastructure needed to be constructed in order to
enable the authentication of factual claims. Such infrastructure included the material
technology of the air-pump that produced the experiments, a literary technology
whereby the phenomena produced by the pump were made known to those who
weren't direct witnesses, and the social technology of conventions people should
employ in dealing with one another and considering knowledge claims. Boyle succeeded
in building, exemplifying and defending the technology necessary for producing
legitimate knowledge by developing material, social and literary technologies.
Shapin recounts how Boyle suggested that matters of fact were to be produced in a
public space where experiments were collectively performed and directly witnessed and
an abstract space constituted through virtual witnessing. Virtual witnessing was
enabled by the use of literary techniques that produced realistic images of the scene of
the experiment with a great deal of circumstantial detail, using prolixity and
iconography. An authenticated matter of fact was treated as a mirror of nature; a theory,
by contrast, was clearly man-made and could be contested. Boyle's linguistic boundaries
thereby acted to segregate what could be disputed from what could not. Shapin argues
that literary conventions appropriate for virtual witnessing needed to be constructed in
order to support consensus-building, reconciliation and scale.
An appropriate language had to perform several functions. First, it had to be a
resource for managing dissent and conflict in such a way as to make it possible
for philosophers to express divergent views while leaving the foundations of
knowledge intact, and, in fact, buttressing these foundations... Second, it had to
facilitate reconciliation amongst existing sects of philosophers, mobilizing that
reconciliation so as to reinforce the foundational status of matters of fact… Third,
such a language had to constitute a vehicle whereby matters of fact could
effectively be generated and validated by a community whose size was, in
principle, unlimited. (Shapin, 1984, p. 507)
The product of the pump was not an inscription as Latour and Woolgar (1979) had
argued; it was, instead, a visual experience that had to be transformed into an
inscription by a witness. Witnesses were chosen among the ranks of professors rather
than peasants and followed taken-for-granted conventions for reliability. Membership
was also limited to those who could use the appropriate conventions in conducting their
‘virtual witnessing’.
Not everyone may speak; the ability to speak entails the mastering of special
linguistic competences; and the use of ordinary speech is taken as a sign of non-
membership and non-competence. (Shapin, 1984, p. 509)
Latour (1987) demonstrates the importance of literary conventions of fact building
within modern science where scientists use tools such as literature, numbers, images
and figures in order to convince others of their claims. According to Latour, although
the field of rhetoric is ‘despised’ because it mobilises external allies such as ‘passion,
style, emotions, interests, lawyers’ tricks and so on’ in the service of an argument,
rhetoric is the primary way in which scientists convince others of their claims (p. 61).
The difference between rhetoric and science, according to Latour, is not that rhetoric
makes use of allies that science refrains from using but that rhetoric ‘uses only a few and
(science) very many’ (p. 61). External allies (or resources) for scientists come in the
form not only of style but also in the form of literature, numbers, geometrical figures,
equations, images, mathematical objects etc.
Latour’s study highlights the ways in which the performance of conventions is part of
what it means to be a scientist, a part of the profession’s identity. He outlines how
scientists must continually work to build the number of nodes in their network by
‘bringing friends in’ (by alluding to what other authorities said or wrote), referring to
former texts (by either supporting or attacking them), and being referred to by later
texts since a claim or statement needs to be cited by the next generation of papers in
order to become taken for granted (pp. 30-44). In order to persuade, scientists initially
connect their claims with the claims of other, more established writers and their
writings, and then eventually remove those citations as a fact becomes taken-for-
granted.
The value of Shapin’s focus on literary conventions and Latour’s highlighting of the
actions taken by scientists in the development of facts is their recognition of the
performativity of language. Language does not only act to describe the world; language
can also perform. J. L. Austin (1962) developed the concept of ‘performative utterances’
to describe the kinds of speech acts performed by statements such as promising,
greeting, warning, ordering and congratulating. Since then, the fields of linguistics and the
philosophy of language have investigated how speech acts are distinguished by the
intentions behind the speech act: ‘there is the act of saying something, what one does in
saying it, such as requesting or promising, and how one is trying to affect one's
audience’ (Bach, 1998, p. 81).
Isin and Ruppert (2015) extend the theory of speech acts to encompass a theory of
digital speech acts. According to the authors, digital speech acts involve conventions that
have been developed in online environments that include not only words but also
images and ‘sounds and various actions such as liking, coding, clicking, downloading,
sorting, blocking, and querying’ (p. 13). These constitute a type of speech act that has
performative (and political) force. In particular, digital speech acts, according to Isin and
Ruppert, are acts in which a digital citizen subject articulates that either ‘I, we, (or) they
have a right to’ (p. 12) perform these acts.
The authors base their theory of digital performativity on Foucault’s notion of power as
‘action upon action’ or ‘conduct upon conduct’ where individuals carve out social
positions through the iterative enacting of conventions over time. Some speech acts
have performative force: we don’t only talk about the world, but we also act through
talking, and in the online space, we talk through doing.
Isin and Ruppert argue that ‘Digital citizens come into being through the meshing of
online and offline lives’ (p. 19) and that cyberspace is not separate from ‘real’ space, but
that it is ‘a space of relations between and among bodies interacting through the
Internet’ (p. 12). Despite previous debate about the unproductiveness of the term
‘cyberspace’ (Graham, 2012), Isin and Ruppert argue that the term is useful in
demonstrating how ‘people enact themselves as subjects of power through the Internet,
and at the same time bring cyberspace into being’ (p. 12).
Similarly, Dodge and Kitchin (2005) argue that space needs to be theorized as
ontogenetic rather than as absolute. An absolute theory of space is that it is ‘a
geometrical system of organization, a kind of absolute grid within which objects are
located and events occur’ (Dodge & Kitchin, 2005, p. 171) but Dodge and Kitchin argue
that this is reductionist and depicts space as ‘natural and given’ (Dodge & Kitchin, 2005,
p. 171). Instead, they argue for an ontogenetic reading of space where space is
understood ‘as continually being brought into existence through transductive practices
(practices that change the conditions under which space is (re)made)’ (Dodge & Kitchin,
2005, p. 162). In this sense, space is given meaning through human endeavor and
produced through social relations. The authors argue that the
objects, infrastructures, processes and assemblages (produced by code)…
engender, transduce space beckon(ing) new spatial formations and spatiality
into existence. (Dodge & Kitchin, 2005, p. 171)
Liking a post on Facebook, for example, constitutes a digital act mediated through the
network in which real bodies need to be present. In Wikipedia’s terms, such actions
involve the vast set of conventions that have been developed by users over time,
including warning and proposing for deletion, amongst others. These acts constitute a
means of social struggle among actors with a variety of goals, interests and intentions.
Digital acts raise four political questions about the Internet, according to Isin and
Ruppert (2015): anonymity, extensity, traceability, and velocity (p. 13). The
Internet’s affordances or materiality are an important feature that affects socio-
technical relations and the representations, discourses and identities that have been
produced as a result of such mediation. The study of software demonstrates that, not
only are digital speech acts constitutive of a new type of language known to those who
act, but digital speech acts can also be multiplied because of software’s capacity for
secondary agency through automated processing.
2.4 Software and new tools for language construction
Coded environments are virtual in the sense that they are constructed out of bits and
bytes rather than bricks and mortar, but they nonetheless have stability and
materiality. Although digital, code has
effects in the world; code enables certain behaviour while preventing other behaviour.
Code is the stuff out of which identities are created, the spaces in which knowledge and
expertise are enacted and the tools that are wielded by different groups in the
construction of ideology.
The materiality of the coded environment is important to the construction of ideologies
necessary to maintain the boundaries between expert identities. Although the concept
of materiality has been widely defined in multiple disciplines, I adopt Lievrouw’s (2014)
definition in relation to communication and technology studies as ‘the physical
character and existence of objects and artifacts that makes them useful and usable for
certain purposes under particular conditions' (p. 25). The focus on artifacts enables a
consideration of how the objects of digital environments are critical to understanding
practice, but this requires an understanding of how digital objects have materiality.
Faulkner and Runde (2013) recognise that the defining features of objects are that they
endure and that they are structured. Objects differ from events in that they are the same
at each point in time at which they exist, and they are structured in that they are
composed of distinct parts. Each object can be composed of other objects and an object’s
constituent parts aren’t always the same over time, for example when living organisms
change over time. The authors go on to show how technological objects are identified
according to their function or use for a particular group of people.
The difference between physical objects (for example, a bicycle) and digital objects
(such as an algorithm) is that digital objects don’t possess spatial characteristics (such
as location, shape or volume) but instead consist of symbols that are organized to
conform to the rules of the language in which they are expressed (Faulkner & Runde,
2013). Technological identity suggests that identity flows from both function and form.
In other words, the function of a technological object flows from that which is ascribed
to an object by particular social groups, and in order for a function to be sustained over
time, an object must possess physical capabilities. Human activities and social structure
are recursively organised so that we do things either consciously or through routine,
without thinking about the form and function of particular objects. In this way, ‘social
structure is at once drawn on in human activities and at the same time reproduced (and
potentially transformed) as a largely unintended consequence of those activities’
(Faulkner & Runde, 2013, p. 13).
Faulkner and Runde extend this way of conceptualising technological objects’ identity
to digital objects by substituting an object’s material form with its structure (consisting
of its constituent parts, their arrangement and interactions, but not physical attributes).
In this way, they argue, digital objects deserve close theoretical attention because of
their special qualities.
Thus, although artifacts are social and gain their identity from their social context, they
have sustained identities and functions that can be examined at particular points in
time.
Software represents new tools by which rhetorical conventions can be constructed.
There are two key ways in which code has power according to Kitchin and Dodge
(2011). Software is an actant with technicity; it does work in the world. Although code is
invisible it ‘produces visible and tangible effects’ (p. 4). Software operates by codifying
the world into ‘rules, routines, algorithms, and captabases’. In this way, code is ‘the
manifestation of a system of thought, an expression of how the world can be captured,
represented, processed, and modelled computationally’ (p. 26). Software doesn’t only
represent the world but participates in it (Dourish, 2001).
The ideals that software reflects in its categorisation of the world become a reality when
such categorisation becomes the common-sense way of framing particular social
phenomena. When a census form enables respondents to choose only female or male
categories, for example, this reinforces the gender binary. The ideal world as framed by
the software reinforces a world categorized in this way. According to Kitchin and Dodge,
one of the effects of the abstraction by software algorithms and data models in
rendering aspects of the world is that ‘the world starts to structure itself in the image of
the capta and code; a self-fulfilling, recursive relationship develops’ (Kitchin & Dodge,
2011, p. 41).
The second point is that code has what Adrian Mackenzie (2006) calls ‘secondary
agency’ in that it can be programmed to operate automatically without human oversight
(Kitchin & Dodge, 2011, p. 5). An example here is the work of an algorithm that
automatically parses data according to a set of encoded rules. Tarleton Gillespie (2014)
writes that a class of algorithms that he calls ‘public relevance algorithms’, such as
Google’s search algorithms are a method of ‘producing and certifying knowledge’ and
should be subject to greater public debate since they represent ‘a particular knowledge
logic, built on specific presumptions about what knowledge is and how one should
identify its most relevant components’ (Gillespie, 2014, p. 168). Algorithms have this
effect by selecting what information is considered most relevant, managing our
interactions by highlighting some contributions and excluding others, and reflecting
the public back to itself, thereby shaping a public’s sense of itself, with implications for
who benefits from that representation.
According to Kitchin and Dodge, the power of code to execute its vision of the world in
this way is dependent on human oversight or authorization. Technological systems are
not entirely autonomous, but rather the autonomy of the system is a function of input,
the sophistication of processing and the range of outputs that the code can produce.
Software may possess secondary agency but it does not have intent. For this reason, we
need to look to those who design and deploy software, since they have the potential to
wield it for their own goals. David Berry (2014) writes that ‘the new gatekeepers to the
centres of knowledge in the information age are given by technologies, cognitive and
data-processing algorithms, data visualization tools and high-tech companies’ (Berry,
2014, p. 181). Algorithms and tools are seen to require technical ‘black boxes’ in order
to simplify systems, but as a result they obscure how systems are constituted (Berry,
2014, p. 183). Black boxes consist of technologies that ‘hide what is inside, sometimes
productively, sometimes not, in order to simplify systems by hiding complexity or to
create abstraction layers.’ (Berry, 2014, p. 183) They are a defining feature of
computationality, according to Berry.
The idea of technical obfuscation lies in stark contrast to the principle of transparency
and openness that has become a key feature of the digital environment. Berry counters
that the digital commodity is often available as an end with the means veiled and
backgrounded.
(T)he code(s) are themselves hidden behind an interface or surface which
remains eminently readable, but completely inscrutable in its depths. (Berry,
2014, p. 197)
It is in this environment, where code is obscured, that the ‘postmodern rich’ will gain
access to better cognitive support from computing rather than being better
educated.
They will have the power to affect the system, to change the algorithms and even write
their own code, whereas the dominated will be forced to use partial knowledge,
incomplete data and commodified off-the-shelf algorithms which may paradoxically
provide a glitch between appearance and reality such that the dominated will
understand their condition in the spaces created by this mediation (Berry, 2014, p. 177).
Berry writes that truth is increasingly tied to expenditure and power, because the
pursuit of knowledge is tied to the use of advanced (expensive) technologies and that
this is already happening through the technological efforts to restructure the Web. In
order to build this new computational world order, the existing gatekeepers of
knowledge are already restructuring their data, information and media to enable the
computational systems to scour the world’s knowledge bases to prepare it for this new
augmented age (p. 177).
As we will see in Chapter five and six, Wikipedia editing is almost entirely mediated by
code. The travel of facts within this environment is, therefore, influenced by the
materiality of the code that produces the objects, tools, processes and conventions that
drive Wikipedia. From the way in which editors’ profiles are presented and made
searchable within Wikipedia’s history pages, to the technical permissions of registered
users to edit warning templates and the tools that have been developed to enable vandal
fighting, the materiality of software code is vital to understanding how facts travel in
Wikipedia. Code determines facts’ travelling companions, their boundaries and terrain
and even how the reproducibility of facts is governed. Along with discourse, ideology
and boundary work, an analysis of coded objects and processes is vital to any framework
for understanding how facts travel on Wikipedia.
2.5 Co-production
Earlier in this chapter, I outlined the co-production idiom as a useful response to the
weaknesses of some of the STS literature and as a practical framework for
understanding the effects of socio-technical change on the material landscape. Jasanoff
(2006) writes that, during times of significant social and technological change, traditional
authorities are often reaffirmed, while at the same time there is a redefinition of
identities as societies redraft social order in order to fill gaps in expertise that result
from the adoption of new technologies. New power relations become lodged within
identities, institutions, languages and representations.
The identities, institutions, languages and representations created by science
and technology can be politically sustaining, by helping societies to
accommodate new knowledge and technological capabilities without tearing
apart (by reaffirming) the legitimacy of existing social arrangements… When the
world one knows is in disarray, redefining identities is a way of putting things
back into familiar place. (Jasanoff, 2004, p. 39)
Technological and social changes, in other words, do not only involve the introduction of
new roles, but also the re-affirmation of existing authorities. The identity of the expert,
in particular, is a core focus of co-productionist writings. The expert is a ‘quintessential
bridging figure of modernity’ (Jasanoff, 2004, p. 39), arising out of the need to translate
the growing complexities of science and technology for broader social goals.
In order to accommodate significant change, Jasanoff shows that there tends to be a re-
drafting of the rules of social order regarding trustworthiness and authority. This
redrafting is neither clear-cut nor does it happen immediately. Instead, boundaries
between expert and non-expert, scientist and non-scientist are being continually
reinscribed. In some cases the authority of certain experts is reaffirmed; in others, there
is a translation of one type of expertise into the context of another as new expert
identities are introduced.
Co-production is a useful framework for investigating the conditions under which facts
travel and authorities are reconfigured. There are two clear benefits to
the framework. Firstly, co-production avoids the charges of natural and social
determinism because facts (emerging from science or other authorities) are neither a
mirror of reality nor a result only of social and political interests.
Secondly, co-production responds to criticisms that STS is too internalist and lacks
normativity to be critical (Jasanoff, 2004, p. 4). According to Jasanoff, co-production
offers new ways of thinking about power, ‘highlighting the often invisible role of
knowledges, expertise, technical practices and material objects in shaping, sustaining,
subverting or transforming relations of authority’ (p. 4).
In order to understand how authority is being reconfigured within Wikipedia, however,
there is a need to bring together the co-production idiom with theories relating to the
strategies, practices and discourses that define the knowledge environment, particularly
in the realm of fact travel. Three core themes relating to Wikipedia’s socio-technical
system are explicated in the chapters that follow:
1. The traveling companions, boundaries and terrain, and character of facts;
2. The identities, discourses and representations in which power becomes lodged;
3. The social and technological sources of power and authority.
These three themes frame identities, discourses and representations within Wikipedia
as the locations in which power becomes lodged and that are co-produced by social and
technological forces. Identities, discourses and representations also influence the ways
in which facts travel: identities determine who the authorities are and therefore which
are considered ‘good company’; discourses are the primary means by which communities
express and reproduce the ideologies that constitute the terrain; and the
characteristics of representations (or facts, in Wikipedia’s case) can help ensure their
travel or put them at a significant disadvantage.
The landscape in which facts travel is constructed through both
discursive and performative means. Within the digital environment, performativity is a
process of enacting digital speech conventions as well as defining actors according to
their own ideological interests. As a result, power becomes lodged in identities,
discourses and representations, with social and technological forces continually
enacting the power relations that define them.
This does not, however, mean that the outcome of such networked relations is final,
since other actors are continually joining networks and changing their dynamics. The
boundaries between science and non-science, professionals and amateurs, facts and
theories are not dualistic, nor are their identities decided rapidly or once and for all.
Instead, such categories are enacted and performed in multiple locales in which
particular identities and roles feature. In Lynch’s (2006) terms, expertise and the
associated domains of expert knowledge are a local interactional production rather than
an expression of rules, policies or laws.
What becomes important, then, is discovering the local battlefronts where territorial
claims are being fought for and negotiated. Such battles play out in defining the roles
and identities of knowledge actors. The tools used to define roles and identities are
garnered from the socio-technical system in which knowledge claims are made. In
Wikipedia’s case, the socio-technical system is almost entirely mediated by code, which
means that we need to look to the particular affordances of that system, as well as how
those affordances interact with users and practices, in order to fully understand the
process by which Wikipedia and Wikipedians maintain their authority over factual
information.
2.6 Conclusion
Can facts be shown to travel within Wikipedia's socio-technical system? If so, do some
facts travel further and with greater integrity than others, and why is this so? Is there
evidence that the roles of knowledge authorities and experts are being reconfigured
within Wikipedia's socio-technical system, and if so, how has this reconfiguration taken
place?
In order to answer these questions, this thesis uses theory from Science and Technology
Studies relating to the ways in which knowledge communities construct the boundaries
between truth and fiction, science and ideology, empirical facts and theoretical
conjecture. Central to this construction is the identity of the expert as a ‘bridging figure
of modernity’ (Jasanoff, 2004, p. 39). Instead of new expertise being immediately
recognized as technologies change, however, there is usually an intermediate process in
which traditional authorities are reaffirmed while social order is reconfigured. Experts
are recognized as such when they are able to speak the language of membership. In the
digital environment, this language is both descriptive and performative, involving digital
speech acts and conventions that need to be learned through the sharing of tacit
knowledge.
In order to understand how expertise is being reconfigured within Wikipedia, there is a
need to analyse the descriptive and performative language that is being spoken and
enacted as facts travel through Wikipedia’s socio-technical system. Facts travel well
when accompanied by good traveling companions, a favorable environment and
advantageous characteristics. The definitions of ‘good’, ‘favorable’ and ‘advantageous’,
however, are dependent on the socio-technical context in which facts are traveling. The
context that defines these terms can be found not only in what a knowledge community
thinks it does, but also how those ideals play out in practice. In Gieryn’s terms, we need to look not only at the ideals Wikipedians say they uphold but also at what happens in practice around particular facts and articles.
Using this framework, the next chapters provide an analysis of the role of each of these
factors. Chapter four describes the boundaries and terrain of Wikipedia with particular
emphasis on Wikipedia policies and norms. The principle of verifiability is discussed as
foundational to Wikipedia’s claims to authority and in Shapin’s terms represents the
literary, social and material technologies that support the travel of facts within
Wikipedia’s socio-technical system. Wikipedia’s policies are instructive as to how it accords authority to different sources of knowledge, and what is required in order to become an effective speaker of the language of Wikipedians, but it is at the edges of the network, in Jasanoff’s discourses and representations, that power is lodged.
Chapters five and six analyse two examples from the edges of Wikipedia’s network, discussing two attempts to extend the boundaries of what facts
Wikipedia will include in its corpus. Chapter five deals with the travel of facts about