All content in this area was uploaded by Bongwon Suh on Jul 25, 2014
Power of the Few vs. Wisdom of the Crowd: Wikipedia and
the Rise of the Bourgeoisie
Aniket Kittur
University of California,
Los Angeles
Los Angeles, CA 90095 USA
nkittur@ucla.edu
Ed Chi, Bryan A. Pendleton,
Bongwon Suh
Palo Alto Research Center Inc.
Palo Alto, CA 94304 USA
{echi, bp, suh}@parc.com
Todd Mytkowicz
University of Colorado at
Boulder
Boulder, CO 80309 USA
mytkowit@colorado.edu
ABSTRACT
Wikipedia has been a resounding success story as a
collaborative system with a low cost of online participation.
However, it is an open question whether the success of
Wikipedia results from a “wisdom of crowds” type of effect
in which a large number of people each make a small
number of edits, or whether it is driven by a core group of
“elite” users who do the lion’s share of the work. In this
study we examined how the influence of “elite” vs.
“common” users changed over time in Wikipedia. The
results suggest that although Wikipedia was driven by the
influence of “elite” users early on, more recently there has
been a dramatic shift in workload to the “common” user.
We also show the same shift in del.icio.us, a very different
type of social collaborative knowledge system. We discuss
how these results mirror the dynamics found in more
traditional social collectives, and how they can influence
the design of new collaborative knowledge systems.
Author Keywords
Wikipedia, Wiki, collaboration, collaborative knowledge
systems, social tagging, delicious.
ACM Classification Keywords
H.5.3:n. [Information Interfaces]: Group and Organization
Interfaces - Collaborative computing, Web-based
interaction, Computer-supported cooperative work; H.3.5
[Information Storage and Retrieval]: Online Information
Systems; K.4.3 [Computers and Society]: Organizational
Impacts – Computer-supported collaborative work.
INTRODUCTION
Wikipedia is an online collaborative encyclopedia whose
most distinctive feature has been its low cost of
participation -- users do not even have to register to
contribute. This openness to new users has been cited as
both a source of strength and weakness [6]. Despite or
because of it, Wikipedia has grown exponentially in users
and information since 2002 [14] and has been highlighted
as a success story of low-cost collaborative knowledge
systems.
The distinctive openness of Wikipedia suggests that one of
its key strengths lies in attracting contributions from new
users who may make few edits. This suggests a kind of
“wisdom of crowds” effect [12] in which a large number
of people making small contributions can create a quality
product.
However, many prominent Wikipedians argue that a small
number of prolific users, rather than a large crowd, are the
driving force behind the success of Wikipedia. For
example, Jimmy Wales, the founder of Wikipedia, argues
that most of the work on Wikipedia is done by a small
number of users, citing that as of December 2004, 2.5% of
the registered users on the site made half of the edits [15].
In a Sept. 4, 2006 post to his blog [11], Aaron Swartz
published the results of his study of several articles on
Wikipedia, suggesting that, measured by the change in
content of each edit, less-active users of Wikipedia are
actually creating much of the text in these articles.
Swartz’s blog entry was slashdotted, and only deepened
the mystery of who really writes Wikipedia. Is it the work
of a few elites or is it the wisdom of the crowd?
Who does the work in Wikipedia has important
implications both for the allocation of resources within
Wikipedia and for the design of novel collaborative
knowledge systems. Jimmy Wales has been quoted as
saying “I spend a lot of time listening to those four or five
hundred” top users [11], suggesting that the development of
tools and features within Wikipedia may be targeted for the
user groups that are most influential. Similarly, when
designing a collaborative knowledge system it is important
to predict who will be using the system for what purposes,
and to make design decisions and feature choices that
support important users.
In this study we examine the distribution of work in
Wikipedia over time to answer the question of who does the
work in Wikipedia. We examine “elite” vs. “common” user
contributions over time, with the elite defined either by
status (administrators) or by participation level (high-edit
users). Two different metrics (number of edits and change
in content) provide converging evidence on an answer.
Finally, to see whether the results found on Wikipedia
generalize, we examine del.icio.us, a very different type of
collaborative knowledge system.
RELATED WORK
A number of studies have quantified the growth of
Wikipedia as a network or graph [1][2][19]. These studies
suggest that the dynamics of Wikipedia are consistent with
those typically found in complex networks. They also find
many characteristics in common across Wikipedias in
different languages [19], and with the structure of the
World Wide Web [1][2][19].
Voss showed that the content on Wikipedia has been
growing exponentially since 2002 [14], whether measured
by articles, words, links, bytes, or users (though he only
examined two classes of users: those making more than 5
edits in a month or more than 100 edits in a month). He
also showed that the number of unique authors per article
follows a power law, as does the number of articles per
author. Interestingly, these measures also appear consistent
across Wikipedias of different languages (though with
slightly different parameters), suggesting similar underlying
generation processes.
Buriol et al. [1] found article and user growth consistent
with Voss’ findings. They additionally characterized user
edits over time, showing that the average number of edits
peaked in January 2003, and has been steadily declining
since then. However, this analysis was aggregated across
all users, precluding a more detailed breakdown.
Viegas et al. studied the edit patterns of articles through
“history flow visualizations” [13]. In this technique they
visualized how article edit histories changed, identifying
sections of articles that changed or remained constant over
time. They also examined the growth of 273 articles in
Wikipedia, showing that only 21% of edits reduced the size
of a page, with 6% reducing by more than 50 characters.
However, their data was collected using the May 2003
Wikipedia; as we shall describe below, much has changed
since then.
METHODS
In the following analyses, we used a history dump of the
English Wikipedia that was generated on July 2, 2006. The
dump included over 58 million revisions, from more than
4.7 million wiki pages, of which 2.4 million are article-related
entries in the encyclopedia. To process this data, we
imported the raw text into the Hadoop [7] distributed
computing environment running on a cluster of commodity
machines, and imported the structure into a clone of
Wikipedia’s own databases for direct analysis. The Hadoop
infrastructure allowed us to quickly explore new content
analysis techniques while minimizing code optimization
time; the database allowed us to inspect Wikipedia
statistics in their native format.
To calculate the work done while editing an article, we
calculated both the number of edits made and the change in
content between edits. We model change as the number of
words added and removed, as calculated by a traditional
“diff” operation [9]. However, we used words as units
instead of lines, allowing greater precision than previous
studies (e.g., in [13], where the change of a comma would
count an entire line as different). For both measures we
aggregated edits over all 58+ million revisions, grouping by
time and user participation level. User participation level
was calculated based on the total number of edits made by a
user.
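The word-level change measure described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the pipeline used in the study: it uses Python's difflib.SequenceMatcher, whose matching algorithm differs from the diff algorithm cited in [9], and it assumes that words are whitespace-separated tokens. The function name word_change is hypothetical.

```python
from difflib import SequenceMatcher


def word_change(old_text, new_text):
    """Return (words_added, words_removed) between two revisions.

    Assumption: a "word" is a whitespace-separated token.
    """
    old_words = old_text.split()
    new_words = new_text.split()
    # autojunk=False disables difflib's popularity heuristic so that
    # frequent words are still matched in long texts.
    matcher = SequenceMatcher(a=old_words, b=new_words, autojunk=False)
    added = 0
    removed = 0
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("replace", "delete"):
            removed += i2 - i1  # words dropped from the old revision
        if tag in ("replace", "insert"):
            added += j2 - j1    # words introduced in the new revision
    return added, removed
```

With words as units, removing a comma from one word counts as one word removed and one added, rather than marking the whole line as changed.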
ANALYSIS
Rise and Fall of Admins’ Influence
We first examined the influence of Wikipedia
administrators (admins). Admins consist of a small group
of power users who have gone through a stringent peer
selection process and can perform more types of actions
than a regular user, such as temporarily blocking a page
from being edited. Admins typically have an established
track record of heavy editing and commitment to improving
Wikipedia. In our Wikipedia data, there were 967 admins
averaging 12,280 edits each. The admins represent an
interesting “elite” group for these reasons: there are
relatively few of them; they have a strong record of editing;
and they have been peer-selected as belonging to a class
trusted with more power than a normal user.
[Figure 1 plot omitted. x-axis: year (2001-2006); y-axis: proportion of total edits made by admins (0 to 1).]
Figure 1. Percentage of total edits made by admins.
For each month in Wikipedia’s history, we calculated
admin influence as the number of edits made by admins
divided by the total number of edits made in that month.
Figure 1 shows the percentage of edits made by admins out
of the total edits in Wikipedia. The figure shows a rise in
the percentage of total edits made by admins to a peak of
59% of total edits in late 2002. This period of high
influence lasted until 2004, at which time the data shows a
decline in the percentage of edits made by admins that
continues through the latest 2006 data, to a low of 10% of
total edits.
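The monthly admin share defined above (admin edits divided by all edits in that month) can be illustrated with a short sketch. The input layout, a list of (month, user) pairs with one pair per edit, and the name admin_share_by_month are assumptions made for illustration. The sketch also assumes that admin status is a fixed set of users taken from the dump.

```python
from collections import defaultdict


def admin_share_by_month(revisions, admins):
    """Fraction of each month's edits that were made by admins.

    revisions: iterable of (month, user) pairs, one per edit (assumed layout).
    admins: set of user names holding admin status (assumed fixed).
    """
    total = defaultdict(int)
    by_admin = defaultdict(int)
    for month, user in revisions:
        total[month] += 1
        if user in admins:
            by_admin[month] += 1
    # Months with no admin edits get a share of 0.0.
    return {month: by_admin[month] / total[month] for month in total}
```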
Why is there such a dramatic decline in the proportion of
edits made by administrators?
Some Hypotheses for the Phenomenon
Decrease in number of admins’ edits
One possibility is that this decline in admins’ influence is
driven by a decrease in the absolute number of edits made
by admins. For example, admins may have a limited
lifespan on Wikipedia and the decline could be a result of
fewer admins making edits, or the same number of admins
making fewer edits. To answer this question we calculated
the number of edits made per month by admins. Figure 2
shows that the number of edits made by admins per month
has been steadily rising. Although there is a dropoff in the
graph toward the end in 2006, this cannot account for the
dramatic decline which began in 2004.
[Figure 2 plot omitted. x-axis: year (2001-2006); y-axis: total edits made by admins (0 to 800,000).]
Figure 2. Number of edits per month made by admins.
This admin edit dropoff is an intriguing trend that merits
further study. However, we believe that it may merely
reflect the start-up time associated with becoming an
admin. That is, some of the admins whose edits would
contribute to that part of the curve will not attain admin
status until sometime in the future, and so their edits are not
yet counted in the graph. For example, a user joining in,
say, February 2006 is unlikely to have become an
admin by July 2006, the date of our latest data. We
could not count this user’s edits as admin edits, even though
she might become an admin later.
Bots made maintenance easier
Another potential reason for the decline in admin edits is a
reduction in the maintenance workload for administrators.
There have been a number of automated bots created for
use in Wikipedia which help with maintenance functions
such as identifying and reverting vandalism and spam [18].
If these bots are taking over some of the workload that
previously had to be done by admins, that might account for
the decline in edit percentage seen in Figure 1. However,
Figure 3 shows that this is not
the case. The percentage of edits made by bots is fairly low
and does not fit the declining admin pattern. Furthermore,
the percentage of vandalism in Wikipedia does not appear
to be decreasing [4].
[Figure 3 plot omitted. x-axis: year (2001-2006); y-axis: proportion of total edits made by bots (0 to 1).]
Figure 3. Percentage of total edits made by bots.
RISE OF THE CROWD
From the data above, the rise and decline of the percentage
of edits made by admin users is a phenomenon that is not
explained by a decrease in admin editing or workload.
Instead, it suggests the hypothesis that the decline could be
due to a rise in the number of edits made by non-admins,
which would support the idea of the growing influence of
the masses. In the following, we use a different way of
analyzing the distribution of work done in Wikipedia in
order to test whether this is truly the case.
While the previous analyses dealt with the administrator
user class, there are some advantages that can be gained by
creating user classes by a different metric; specifically, the
total number of edits made by a user. First, this allows us
to verify that the rise and decline in influence found in the
admin group applies to “elite” users and is not an artifact of
being an admin. Second, this provides a data-driven metric
which is not dependent on particularities of the admin
selection process.
We classified users into one of five groups based on the
total number of edits they made in Wikipedia: more than
10,000 edits (10k+); between 5,001 and 10,000 edits (5-
10k); between 1,001 and 5,000 edits (1-5k); between 101
and 1000 edits (100-1k); and 100 or fewer edits (100-). We
then calculated the percentage of total edits that each group
made.
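The grouping above can be sketched as a simple threshold function together with a per-month share calculation. The group boundaries follow the text; the (month, user) input layout and the function names are assumptions made for illustration, and a user's group is taken from that user's total edit count over the whole data set.

```python
from collections import Counter, defaultdict


def participation_group(total_edits):
    """Map a user's total edit count to one of the five groups."""
    if total_edits > 10_000:
        return "10k+"
    if total_edits > 5_000:
        return "5-10k"
    if total_edits > 1_000:
        return "1-5k"
    if total_edits > 100:
        return "100-1k"
    return "100-"  # 100 or fewer edits


def group_share_by_month(revisions):
    """Share of each month's edits made by each participation group.

    revisions: list of (month, user) pairs, one per edit (assumed layout).
    """
    lifetime = Counter(user for _, user in revisions)
    counts = defaultdict(Counter)
    for month, user in revisions:
        counts[month][participation_group(lifetime[user])] += 1
    return {
        month: {group: n / sum(c.values()) for group, n in c.items()}
        for month, c in counts.items()
    }
```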
These percentages are shown in Figure 4. Importantly, the
same pattern of rise, dominance, and decline as seen in the
admins appears for the user class with the most edits (10k+)
– the expert “elite”. The decline of the “elite” users appears
to be accompanied by an increase in the percentage of edits
made by users with less than 100 edits – the novice
“masses”.
[Figure 4 plot omitted. x-axis: year (2001-2006); y-axis: % total edits (0% to 100%); series: <100, 100-1k, 1-5k, 5-10k, 10k+.]
Figure 4. Percentage of total edits made by users with
differing editing levels.
A different view of the interactions between groups can be
seen in Figure 5, which shows the raw number of edits
made by each user group per month. The number of edits
made per month by each group increases over time to 2006.
From this plot it is possible to see that the number of edits
made by users with less than 100 edits has been growing
much faster than the growth of the 10k+ group (or, indeed,
any other group).
[Figure 5 plot omitted. x-axis: year (2001-2006); y-axis: edits (0 to 1,800,000); series: 10k+, 5-10k, 1-5k, 100-1k, <100.]
Figure 5. Number of edits per month made by users with
differing editing levels.
Note that the declining influence of high-edit users is not
accounted for by a decrease in their absolute activity: their edit rate
increases from 2004 through 2006, while their proportion of
edits is in decline. This is consistent with the admin data
above.
The above analyses demonstrate that the rise in edits by
users with less than 100 edits is driving the declining
proportion of high-edit user influence. However, what is
accounting for the rise in edits by the low-edit group? Is
this growth due to an increase in the population of low-edit
users, or does it mark a shift in their editing pattern?
The editing rate for each user group is shown in Figure 6
(essentially, Figure 5 normalized by the number of users per
group per month). The average number of edits per month
for each user group appears to be relatively stable for much
of the history of Wikipedia. While the low-edit group lines
are bounded in their possible range (e.g., the group with
less than 100 edits could not make an average of 100 or
more edits per month), they are remarkably flat throughout.
The 10k+ group also shows a non-decreasing pattern,
providing further evidence that their decline in influence is
not a result of a decline in absolute activity.
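The normalization behind the editing rate can be sketched as follows. This is an illustration only: it assumes that "the number of users per group per month" means users in the group who made at least one edit in that month, which the text does not state explicitly. The names average_edits_per_user and group_of are hypothetical.

```python
from collections import defaultdict


def average_edits_per_user(revisions, group_of):
    """Average edits per active user, keyed by (month, group).

    revisions: iterable of (month, user) pairs, one per edit (assumed layout).
    group_of: dict mapping each user to a participation group label.
    Assumption: a user counts toward a month only if they edited in it.
    """
    edits = defaultdict(int)
    users = defaultdict(set)
    for month, user in revisions:
        key = (month, group_of[user])
        edits[key] += 1
        users[key].add(user)
    return {key: edits[key] / len(users[key]) for key in edits}
```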
Figure 7 shows the raw population growth for each user
group. All groups show exponential population growth,
with a small leveling out of high-edit groups in 2006 that
likely reflects the lag in a user being counted as part of that
group.
[Figure 6 plot omitted. x-axis: year (2001-2006); y-axis: edits (logarithmic scale, 1 to 10,000); series: 10k+, 5-10k, 1-5k, 100-1k, <100.]
Figure 6. Average number of edits per user per month.
[Figure 7 plot omitted. x-axis: year (2001-2006); y-axis: users (logarithmic scale, 1 to 1,000,000); series: 10k+, 5-10k, 1-5k, 100-1k, <100.]
Figure 7. Population growth for each user group.
However, plotting the percentage of the total population
made up by each user group shows that the low-edit group
is increasing in size faster than the high-edit group (Figure
8). This is consistent with and accounts for the growth in
total edits made by the low-edit group, and the proportional
decline of edits made by the high-edit group.
[Figure 8 plot omitted. x-axis: year (2001-2006); y-axis: % users (0% to 35%); series: <100, 100-1k, 1-5k, 5-10k, 10k+.]
Figure 8. Percentage of users in each user group over time.
CHANGE IN EDIT CONTENT
The previous analyses looked at the number of edits made
by different types of users. However, an issue with these
analyses is that edits themselves could differ greatly in the
amount of changes to an article. By counting each edit
instead of the length of each edit, we effectively treat, say,
the deletion of a comma as equivalent to the addition of
three paragraphs of text. Thus to characterize the amount
and kinds of work done by different user types we need to
analyze the change in content of each edit. Using
distributed processing we were able to calculate the change
in content for all 58+ million revisions on a word-by-word
basis (see Methods for more details).
We first analyzed changes in content length made by
admins. The percentage of words changed by admins out
of the total changed words is shown in Figure 9. This
shows that the share of words changed by admins peaked
in mid-2002 at 63% of all changed words, but then declined
to 13% in the current data. Thus it appears consistent with
the data on raw edits shown in Figure 1. However, if we
discount the 2006 data due to the lag effect described
earlier, it looks like the percentage of words changed by
admins during 2005 remained stable at approximately 30%.
This is in marked contrast to the percentage of total edits
made by admins, which declined steadily from about 30%
to 10% during 2005 (see Figure 1).
[Figure 9 plot omitted. x-axis: year (2001-2006); y-axis: proportion of changed words, admins (0 to 1).]
Figure 9. Proportion of words changed by admins.
Figure 10 shows the reason for this difference. Admins
increased sharply in the number of words changed per
month in the 2005-2006 period (again, the drop in 2006 is
likely due to the lag effect). Thus, while the number of
edits made by admins did not keep pace with the number
made by other users, the average number of words changed
per month made up for it, and resulted in what looks like a
stable period.
[Figure 10 plot omitted. x-axis: year (2001-2006); y-axis: average changed words, admins (0 to 120,000,000).]
Figure 10. Average words changed per month by admins.
We also analyzed the data using the data-driven breakdown
of users described earlier. Figure 11 shows the distribution
of changed words over time as a function of user editing
levels. The overall rise and decline of elite (10k+) user
influence (from a peak of about 50% to the latest level of
near 30%) is consistent with the trend found in Figure 4.
However, like the analysis of the admins above, the
percentage of work as measured by changed words remains
higher than measured by total edits, remaining stable at
about 30% during 2005.
[Figure 11 plot omitted. y-axis: changed words (0% to 100%); series: <100, 100-1k, 1-5k, 5-10k, 10k+.]
Figure 11. Percentage of changed words in edits made by users
with differing editing levels.
The average number of words changed per month is shown
in Figure 12. Comparing this graph to Figure 5 shows that,
remarkably, the number of words changed by elite users has
kept up with changes made by novice users, even though
the number of edits made by novice users has grown
proportionately faster.
[Figure 12 plot omitted. x-axis: year (2001-2006); y-axis: changed words (0 to 1,800,000); series: 10k+, 5-10k, 1-5k, 100-1k, <100.]
Figure 12. Average words changed per month.
The above data demonstrate that the rise and decline of the
influence of elite users found above does not depend on the
type of metric used (either percentage of edits or percentage
of changed content). However, while the percentage of
edits declined sharply in the 2005-2006 period, the
percentage of changed content has remained remarkably
stable. Thus though their influence may have waned in
recent years, elite users appear to continue to contribute a
sizeable portion of the work done in Wikipedia.
Furthermore, based on the above data, edits by elite users
appear to be substantial in nature. That is, they appear to be
doing more than just fixing spelling errors or reformatting
citations. One possibility accounting for this is that they
simply revert more than other users, and while reverting
only takes a few clicks, it can look like many words have
changed. However, an analysis removing revert edits does
not substantially change the findings.
Another question is how different user editing levels differ
in the type of edits they make. Swartz proposed that
although elite users make many edits, novice users are the
ones contributing most of the new content [11]. In contrast,
Wales suggests that elite users drive content creation while
contributions from novice users tend to be more of the
spelling error fixing variety [11]. We examined this issue
by separately counting the total number of words added and
deleted by different user types. The ratios of words added
to words removed per revision are shown in Figure 13. As
the user participation level increases, the ratio also rises,
with novice (<100 edit) users adding 0.86 words for every
word removed, but elite (10k+) and admin users having much
higher ratios (1.81 and 1.76, respectively). These data
suggest that the more experienced the user, the more
content is contributed. Indeed, novice users appear to
remove more content than they create. While this does not
mean that their contributions are not valuable (removing
unnecessary or low quality content can be an effective way
of improving quality), it does suggest that experienced
users tend to add more new content than novice users.
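The added-to-removed ratio can be illustrated with a short sketch. It assumes the ratio for a user class is the total number of words added divided by the total number of words removed across that class's revisions; the text does not specify whether per-revision ratios were averaged instead. The function name added_removed_ratio is hypothetical.

```python
def added_removed_ratio(changes):
    """Ratio of words added to words removed for one user class.

    changes: iterable of (words_added, words_removed), one pair per revision.
    Assumption: the ratio is computed over class totals, not as a mean of
    per-revision ratios. Returns None when no words were removed.
    """
    added = 0
    removed = 0
    for words_added, words_removed in changes:
        added += words_added
        removed += words_removed
    return added / removed if removed else None
```

Under this definition, a class whose revisions add 86 words for every 100 removed has a ratio of 0.86; any value below 1 means the class removes more words than it adds.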
[Figure 13 plot omitted. x-axis: user class (<100, 100-1k, 1-5k, 5-10k, >10k, Admins); y-axis: ratio of added:removed words (0 to 2).]
Figure 13. Ratio of words added to words removed per
revision for different user classes.
SHIFTS IN OTHER ONLINE SYSTEMS: DEL.ICIO.US
Is the rise and decline of elite users specific to Wikipedia or
is it a more general phenomenon found in growing
collaborative knowledge systems? To address this question
we examined the distribution of work over time in another
social collaborative system: del.icio.us.
Del.icio.us is a popular site on which users bookmark web
pages using free-form tags rather than fixed categories.
Web pages can exist with multiple tags, and tags can have
multiple associated web pages, unlike a traditional
classification organization. The social nature of del.icio.us
arises from users’ ability to see what others have tagged.
They can also see the most popular pages overall or for
specific tags, leading to an impromptu ranking system for
highly tagged pages.
A key difference between del.icio.us and Wikipedia is that
the former does not promote direct interaction between
users; instead, its power derives from the aggregation of
many users’ individual data. As such it is an interesting
contrast case to the high degree of interaction found in
Wikipedia.
We examined the distribution of work over time in
del.icio.us as measured by the number of bookmarks added
per user. As in the earlier analysis, users were split into
classes based on their total number of bookmarks added.
Figure 14 shows the percentage of bookmarks made by
different user classes. As in Wikipedia, we see a marked
decline in the percentage of bookmarks made by the highest-bookmark
class, from a high of 78% to a low of 27% in the latest data
(June 2006). There is a corresponding rise in the lowest-bookmark
class, from a low of 3% to the current high of 31%.
Note that del.icio.us shows only a steady decline in the
influence of elite users, with no initial rise as seen in
Wikipedia. This is an intriguing difference that merits
further study.
[Figure 14 plot omitted. x-axis: date (1/04 to 1/06); y-axis: % total bookmarks (0% to 100%); series: 0-161, 161-271, 271-415, 415-628, 628+.]
Figure 14. Percentage of bookmarks made by different user
classes in del.icio.us.
Figure 15 shows the number of bookmarks per week for the
different user classes. This figure is evidence that, like
Wikipedia, the steady decline of elite user influence is not
due to a decrease in their participation: the highest-
bookmark users continue to increase in participation
throughout the years. (The dip in 2006 is likely due to lag
effects in amassing enough bookmarks to be considered
part of the elite group, just as we saw in Wikipedia. It
cannot account for the continued decline of elite influence
since 2004.) Instead, the effect appears to be driven by the
growth in low participation users. Thus although
del.icio.us, like Wikipedia, continues to grow, there is a
dramatic shift in influence from the power of the few to the
rise of the crowds.
[Figure 15 plot omitted. x-axis: date (1/04 to 1/06); y-axis: bookmarks per week (0 to 250,000); series: 0-161, 161-271, 271-415, 415-628, 628+.]
Figure 15. Number of bookmarks per week for different user
classes.
DISCUSSION
Although the population and content of Wikipedia appear to
be in continued exponential growth, a closer look revealed a
major shift in the distribution of work in the system. We
discovered an initial rise and subsequent decline in the
influence of “elite” users. This result held true whether
elite users were defined by peer-selected groups
(administrators) or data-driven groups (high-edit users).
We demonstrated that this decline was not due to a decrease
in elite user activity or to shifts in user group editing
patterns, but instead was driven by marked growth in the
population of low-edit users – the rise of the bourgeoisie.
These results were consistent whether the data were
analyzed by edit count or by the actual change in content.
We also examined del.icio.us, a social collaborative
bookmarking site which has also experienced tremendous
growth. Again we discovered a shift in the distribution of
work from the elite (high bookmark) to the novice (low
bookmark) users. This raises the intriguing hypothesis that
this change of influence over time may be a typical
phenomenon of online collaborative knowledge systems
and may occur despite what appears to be constant
continued overall growth.
One way of viewing the shift in influence from elite users
to novice users is as a process of technology adoption [10].
Elite users are the early adopters who select and refine the
technology. They are followed by a majority of novice
users who begin to be the primary users of the system.
However, collaborative products like Wikipedia are
different from traditional technology products in that the
product itself changes as a direct result of adoption. That
is, the end user who begins participating in Wikipedia
immediately has an effect on it. In this sense collaborative
products resemble dynamic social systems more than fixed
products, as they are in a state of constant change based on
the prevailing opinions of the population.
For such systems to spread, early participants must generate
sufficient utility in the system for the larger masses to find
value in low cost participation. Like the first pioneers or
the founders of a startup company, the elite few who drove
the early growth of Wikipedia generated enough utility for
it to take off as a more commons-oriented production
model; without them, it is unlikely that Wikipedia would
have succeeded. Just as the first pioneers built
infrastructure which diminished future migration costs, the
early elite users of Wikipedia built up enough content,
procedures, and guidelines to make Wikipedia into a useful
tool that promoted and rewarded participation by new users.
To carry the analogy further, as emerging social systems
grow, the influence of the early founders begins to wane.
The people who start a company are rarely the same as
those who run it; the pioneers were dwarfed by the influx of
settlers. Similarly, the influence of elite users whose
contributions drove Wikipedia until recently has been
shifting to the novice masses. With such population growth
comes the need for structure, procedure, and hierarchies.
Already there is evidence of increasing structure and
bureaucracy evolving to handle system growth. Until 2004,
the arbiter of serious disputes and the only person with the
ability to ban non-vandal users was Jimmy Wales [16];
since then an Arbitration committee has been established to
do so, as well as a Mediation committee which focuses on
helping users resolve their disputes before they reach the
level of needing arbitration. Informal structures have also
evolved, such as the Mediation Cabal (an unofficial
group of normal users who try to help mediate disputes)
and the Association of Members’ Advocates.
This view of Wikipedia as an emerging social system
suggests that it may be entering a critical period. The
recent massive influx of low-participation users has resulted
in a large shift in the distribution of work done in the
system. How Wikipedia reacts to this shift may be a major
determinant of its future viability and continued growth.
Future Directions and Application
These findings suggest additional avenues for further
research. First, does social stratification (the hierarchical
arrangement of social classes within a society) happen in
other social collaborative systems? There is some
anecdotal evidence that social stratification does happen in
open-source development [3], multi-player online games
[5], and bulletin board systems such as Slashdot [8].
Second, another research question is “what causes the
social stratification in the Wikipedia society?” Do
stratifications from other online communities result directly
from an increase in participation by common classes of
users? Interestingly, in sociology, social stratification is
believed by proponents of structural-functional theory to be
beneficial in stabilizing the existence of societies. Conflict
theorists such as Max Weber believe stratification occurs
due to status and power differentials [17]. Viewed from
this perspective, the invention of the admin class in
Wikipedia could have predicted the stratification of the
Wiki-society. The clear subsequent shift in power among
levels of stratification is an intriguing trend that merits
study in other online social systems.
The results described here also have implications for the
design of collaborative knowledge systems. One
recommendation is that during the early phase of the system
resources should initially be allocated towards building
tools for power users and improving expert features, as this
is the population driving early growth. However, as the
population increases resources should be shifted towards
improving ease of use and effectiveness for novice users, as
well as developing structures and procedures that can
support a large influx of users. It also suggests that
designers should continue to reevaluate the user population
in anticipation of the shifts seen here.
CONCLUSION
Wikipedia’s growth as a reference tool and an online
community has caught the attention of researchers
worldwide. Little is currently known about the dynamics of
its social structure. A currently raging debate is “who writes
Wikipedia?” Is it the work of a small group of elite users,
or is it the input from the wisdom of a large crowd?
In this paper, we show that the story is more complex than
explanations offered before. In the beginning, elite users
contributed the majority of the work in Wikipedia.
However, beginning in 2004 there was a dramatic shift in
the distribution of work to the common users, with a
corresponding decline in the influence of the elite. These
results did not depend on whether work was measured by
edits or by actual change in content, though the content
analysis showed that elite users add more words per edit
than novice users (who on average remove more words than
they add). The decline of elite user influence was also
shown to occur in del.icio.us, a social collaborative
knowledge system with a very different participation
structure from Wikipedia, suggesting that it may be a
common phenomenon in the evolution of online
collaborative knowledge systems. The data presented in
this paper suggest that user dynamics in Wiki-society merit
further study and provide insights into allocating resources
when building online collaborative knowledge systems.
ACKNOWLEDGMENTS
We would like to thank Peter Pirolli and Stuart Card for
valuable advice and the User Interface Research Group for
engaging discussion on this topic.
REFERENCES
1. Buriol, L., Castillo, C., Donato, D., Leonardi, S., and
Millozzi, S.: "Temporal Evolution of the Wikigraph".
To appear in Proc. of the Web Intelligence Conference.
Hong Kong, December 2006. IEEE CS Press.
2. Capocci, A., Servedio, V.D.P., Colaiori, F., Buriol L.S.,
Donato, D., Leonardi S., Caldarelli, G. “Preferential
attachment in the growth of social networks: the case of
Wikipedia”. arXiv.org/physics/0602026 (2006).
3. Chance, Tom. The social structure of open source
development.
http://programming.newsforge.com/article.pl?sid=05/01/25/1859253
(Retrieved Sept 20, 2006).
4. Kittur, A., Suh, B., Chi, E., Pendleton, B. A., “He says,
she says: Conflict and coordination in Wikipedia”.
Submitted.
5. Koster, Raph. Small Worlds: Competitive and
Cooperative Structures in Online Worlds. Talk given at
GDC2003.
http://www.raphkoster.com/gaming/smallworlds.html
(Retrieved Sept 21, 2006).
6. Hafner, Katie. Growing Wikipedia Refines Its 'Anyone
Can Edit' Policy. New York Times, June 17, 2006.
http://www.nytimes.com/2006/06/17/technology/17wiki.html
7. Hadoop Project, http://lucene.apache.org/hadoop/
8. Malda, Rob. (aka CmdrTaco) Slashdot Moderation.
http://slashdot.org/moderation.shtml (Retrieved Sept 21,
2006).
9. Myers, E., "An O(ND) Difference Algorithm and its
Variations", Algorithmica Vol. 1 No. 2, (1986), p 251.
10. Rogers, Everett M. (1962 and 1995). Diffusion of
Innovations. New York: Free Press.
11. Swartz, Aaron. Who Writes Wikipedia?
http://www.aaronsw.com/weblog/whowriteswikipedia
(Blog retrieved Sept 20, 2006).
12. Surowiecki, James (2004). The Wisdom of Crowds: Why
the Many Are Smarter Than the Few and How
Collective Wisdom Shapes Business, Economies,
Societies and Nations. New York: Doubleday, 2004.
13. Viégas, F. B., Wattenberg M., Dave K., Studying
cooperation and conflict between authors with history
flow visualizations, In Proc. of the SIGCHI conference
on Human factors in computing systems, p.575-582,
April 24-29, 2004, Vienna, Austria.
14. Voss, J. Measuring Wikipedia. In Proceedings of the
ISSI 2005 (Stockholm, Sweden, July 24-28, 2005).
15. Wales, J., Wikipedia, Emergence, and The Wisdom of
Crowds.
http://mail.wikipedia.org/pipermail/wikipedia-l/2005-May/039397.html
(2005). (Retrieved Sept 21, 2006).
16. Wikipedia.org. Wikipedia: Arbitration Committee.
http://en.wikipedia.org/wiki/Wikipedia:Arbitration
(Retrieved Sept. 29, 2006).
17. Wikipedia.org. Max Weber.
http://en.wikipedia.org/wiki/Max_Weber. (Retrieved
Sept. 20, 2006).
18. Wikipedia.org. Wikipedia: Registered bots.
http://en.wikipedia.org/wiki/Wikipedia:Registered_bots
19. Zlatic, V., Bozicevic, M., Stefancic, H., Domazet, M.,
Wikipedias: Collaborative web-based encyclopedias as
complex networks, Physical Review E, vol 74 (2006).