Detecting Wikipedia Vandalism using WikiTrust*
Lab Report for PAN at CLEF 2010
B. Thomas Adler1, Luca de Alfaro2, and Ian Pye3
firstname.lastname@example.org, Fujitsu Labs of America, Inc.
email@example.com, Google, Inc. and UC Santa Cruz (on leave)
firstname.lastname@example.org, CloudFlare, Inc.
Abstract. WikiTrust is a reputation system for Wikipedia authors and content.
WikiTrust computes three main quantities: edit quality, author reputation, and
content reputation. The edit quality measures how well each edit, that is, each
change introduced in a revision, is preserved in subsequent revisions. Authors
who perform good quality edits gain reputation, and text which is revised by sev-
eral high-reputation authors gains reputation. Since vandalism on the Wikipedia
is usually performed by anonymous or new users (not least because long-time
vandals end up banned), and is usually reverted in a reasonably short span of
time, edit quality, author reputation, and content reputation are obvious candi-
dates as features to identify vandalism on the Wikipedia. Indeed, using the full
set of features computed by WikiTrust, we have been able to construct classifiers
that identify vandalism with a recall of 83.5%, a precision of 48.5%, and a false
positive rate of 8%, for an area under the ROC curve of 93.4%. If we limit our-
selves to the set of features available at the time an edit is made (when the edit
quality is still unknown), the classifier achieves a recall of 77.1%, a precision of
36.9%, and a false positive rate of 12.2%, for an area under the ROC curve of
Using these classifiers, we have implemented a simple Web API that provides
the vandalism estimate for every revision of the English Wikipedia. The API can
be used both to identify vandalism that needs to be reverted, and to select high-
quality, non-vandalized recent revisions of any given Wikipedia article. These
recent high-quality revisions can be included in static snapshots of the Wikipedia,
or they can be used whenever tolerance to vandalism is low (as in a school setting,
or whenever the material is widely disseminated).
1 Introduction
Between 4% and 6% of Wikipedia edits are considered vandalism. Although most
First, locating and reverting vandalism ends up consuming the scarce time of Wikipedia
editors. Second, the presence of vandalism on the Wikipedia lessens its perceived qual-
ity and can be an obstacle to the wider use of its content. For instance, in spite of
Wikipedia being an outstanding educational resource, its use in schools is hampered
by the risk of exposing students to inappropriate or incorrect material inserted by
vandals. Third, any static compilation of Wikipedia articles, such as those produced by the
Wikipedia 1.0 project, is liable to contain a non-negligible number of vandalized
revisions. The presence of vandalism in the compilations reduces their appeal to schools
and other environments, where their static, no-surprise nature would otherwise be most
appreciated. As the compilations are static, and published on media such as DVDs
or USB keys, the only way to remedy the vandalism is to publish new compilations,
which both incurs significant cost and risks the inclusion of new vandalism.
*The authors like to sign in alphabetical order; the order of the authors may not
necessarily reflect the relative sizes of the contributions. This research has been in part
supported by the Institute for Scalable Scientific Data Management, an educational
collaboration between LANL and the University of California Santa Cruz.
Automated tools help reduce the impact of vandalism on the Wikipedia by identifying
vandalized revisions, both facilitating the work of editors and allowing the automated
exclusion of vandalized revisions from static compilations and other settings. For these
two use cases (helping editors, and filtering revisions), we distinguish two aspects
of the Wikipedia vandalism detection problem:
– Zero-delay vandalism detection. The goal of the automated tool is to identify
vandalism as soon as it is inserted. Thus, to identify vandalism the tool can make use
only of features available at the time the edit takes place: in particular, no features
that can only be acquired after the edit can be used. This type of vandalism
detection is most useful to help the work of editors, who can be alerted to
the vandalism and take appropriate action in a timely fashion. Zero-delay vandalism
detection was the focus of the PAN 2010 Workshop Task 2 evaluation.
– Historical vandalism detection. The goal of the automated tool is to find vandalized
revisions wherever they may occur in the revision history of Wikipedia articles.
This type of vandalism detection is most useful when filtering vandalism out of the
revisions that are displayed to visitors, as in the Flagged Revisions project, or
included in a static compilation.
The goal of this work was to build, and evaluate the performance of, a vandalism detection
tool that relies on the features computed by WikiTrust. WikiTrust is a reputation
system for Wikipedia authors and content, based on the algorithmic analysis of the
evolution of Wikipedia content [1,2]. WikiTrust computes the quality of each revision,
according to how much of the change introduced by the revision is preserved in subse-
quent revisions. This computation involves the comparison of each revision with both
previous and subsequent revisions. It results in a quality index between −1,
for revisions that are entirely reverted, and +1, for revisions whose contribution is kept
unchanged. Authors gain or lose reputation according to the quality of the revisions they
make. WikiTrust then uses author reputation to compute the reputation of the text com-
prising each revision, at the granularity of the individual word, according to how well
the text has been revised by high-reputation authors. In particular, after each revision,
the text that has been inserted or heavily modified has a small amount of reputation, in
proportion to the reputation of the revision’s author, while the text that has been left un-
changed has gained a small amount of reputation, again in proportion to the reputation
of the revision’s author.
We decided to base our vandalism detection tool on a simple and efficient archi-
tecture. WikiTrust stores the information about revision quality, author reputation, and
text reputation in database tables that complement the standard database tables used
by the Mediawiki software to implement the Wikipedia. To classify a revision as
vandalism or as a regular revision, our tool reads information from the WikiTrust and
Mediawiki database tables about the revision and its author, and feeds this information to
a decision-tree classifier, which produces the desired output. We relied on the
machine-learning toolset Weka to train and evaluate a classifier. Our decision to rely only
on information readily available in the Mediawiki and WikiTrust database tables has
enabled us to produce an efficient Web-based API for our classifier: given the revision
id of a Wikipedia revision, the API performs the database lookups and returns the
classification of the revision in milliseconds. This makes our classifier well-suited to the
real-time identification of vandalism and filtering of revisions up to the scale of the
Since vandalism tends to be performed by anonymous or novice authors, who have
little or no reputation, and since vandalism tends to be reverted promptly, corresponding
to revisions of low quality as measured by WikiTrust, we expected our tool to perform
fairly well at historical vandalism detection. Indeed, when evaluated on the PAN 2010
Wikipedia vandalism corpus, our tool was able to achieve a recall of 83.5%,
with a precision of 48.5% and a false positive rate of 8.2%, corresponding to
an area under the ROC curve of 93.4%. To our surprise, our tool performed
reasonably well even at the task of zero-delay vandalism detection, achieving a recall of
82.8% with a precision of 28.6% and false positive rate of 14.4%, leading to an area
under the ROC curve of 90.9% (these results are summarized in Table 1). The surprise
is due to the fact that, in evaluating the performance for zero-delay vandalism, we have
had to exclude the two most potent classification features we had: revision quality and
user reputation. We had to discard the revision quality feature because it is based on a
comparison between the given revision and future revisions, and these future revisions
are of course not available when a revision is inserted.
On the other hand, the author reputation feature is available at the time a revision
is made, and it would be usable in any real use of our tools. Unfortunately, we had to
exclude this from the evaluation performed for the PAN 2010 Workshop, due to the
time lag between the revisions being used for the evaluation, and the evaluation itself.
The problem is that WikiTrust keeps track only of the current value of user reputation.
At the time the PAN 2010 Workshop Task 2 evaluation took place, the values of user
reputation in the WikiTrust database reflected author reputation as of May 2010. The
PAN 2010 Task 2 evaluation dataset was instead based on revisions that had been en-
tered in November or December 2009. Thus, the author reputation values available to
us were in the future of the revisions to be evaluated, and we deemed them unsuitable
for the evaluation of the performance of zero-delay vandalism detection. Undoubtedly,
the performance we report for the zero-delay tool is lower than the real performance we
can achieve by also including the user reputation feature.
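The recall, precision, and false positive rate quoted above all derive from the same confusion matrix. The following Python sketch shows how they relate. The names `Confusion` and `precision_from_rates` are hypothetical, and the counts in the usage note are invented for illustration; they are not taken from the PAN 2010 corpus.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """Confusion-matrix counts for a binary vandalism classifier."""
    tp: int  # vandalism flagged as vandalism
    fp: int  # regular revisions flagged as vandalism
    tn: int  # regular revisions left unflagged
    fn: int  # vandalism left unflagged

    def recall(self) -> float:
        # Share of all vandalism that the classifier catches.
        return self.tp / (self.tp + self.fn)

    def precision(self) -> float:
        # Share of flagged revisions that really are vandalism.
        return self.tp / (self.tp + self.fp)

    def false_positive_rate(self) -> float:
        # Share of regular revisions that are wrongly flagged.
        return self.fp / (self.fp + self.tn)


def precision_from_rates(recall: float, fpr: float, prevalence: float) -> float:
    """Precision implied by recall, false positive rate, and the
    fraction of revisions that are vandalism (prevalence)."""
    flagged_vandalism = recall * prevalence
    flagged_regular = fpr * (1.0 - prevalence)
    return flagged_vandalism / (flagged_vandalism + flagged_regular)
```

With the illustrative counts `Confusion(tp=80, fp=90, tn=910, fn=20)`, recall is 0.8 and the false positive rate is 0.09, yet precision is only 80/170. Because vandalism is a minority of revisions, even a modest false positive rate produces many false alarms relative to true detections.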
2 Features and Classification
The WikiTrust vandalism detection tool follows a standard two-phase machine learning
architecture, consisting of a feature-extraction component followed by a classifier.
In selecting the features to feed to the classifier, we have limited our consideration to
the features that can be readily derived from the information available in the database
tables used by WikiTrust, or by the Mediawiki software that serves the Wikipedia. This
constraint was imposed so that the resulting tool could work on-line, in real-time, pro-
viding vandalism detection for any Wikipedia revision in a fraction of a second. As the
WikiTrust database tables replicate some of the information present in the Mediawiki
database tables, in practice we could derive all features from the WikiTrust tables alone:
this enabled us to implement the vandalism detection tool as a self-contained web API
on top of the WikiTrust database at UC Santa Cruz.
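The two-phase architecture described above can be sketched in Python as a feature lookup followed by a classifier call. This is a minimal illustration under stated assumptions, not the WikiTrust implementation: `FeatureRow`, `extract_features`, `vandalism_score`, and `toy_classifier` are hypothetical names, an in-memory mapping stands in for the database tables, and the toy classifier is a placeholder rather than the trained decision tree.

```python
from dataclasses import dataclass
from typing import Callable, Mapping


@dataclass(frozen=True)
class FeatureRow:
    """Hypothetical per-revision feature record (illustrative fields only)."""
    anon: bool           # whether the author is anonymous
    logtime_prev: float  # log(1 + t) to the previous revision
    hour_of_day: int     # hour at which the revision was created


Classifier = Callable[[FeatureRow], float]


def extract_features(rev_id: int, table: Mapping[int, FeatureRow]) -> FeatureRow:
    """Phase 1: look up the precomputed features for one revision id."""
    try:
        return table[rev_id]
    except KeyError:
        raise LookupError(f"no features stored for revision {rev_id}") from None


def vandalism_score(rev_id: int,
                    table: Mapping[int, FeatureRow],
                    classifier: Classifier) -> float:
    """Phase 2: pass the extracted features to a classifier."""
    return classifier(extract_features(rev_id, table))


def toy_classifier(row: FeatureRow) -> float:
    """Placeholder rule for demonstration; NOT the trained model."""
    return 0.9 if row.anon else 0.1
```

Keeping the lookup separate from the classifier means the same precomputed rows can be served to different classifiers, for instance one per detection task.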
We describe below the features we extracted. We annotate with “H” the features
that were extracted for use by the historical classifier, and we annotate with “Z” those
that were extracted for use by the zero-delay classifier; we also indicate in brackets the
feature name used by the classifier. Not all features we extracted for use by a classifier
ended up being used: many were discarded by the classifier training process, as they
were of too little significance to be worth using.
– Author reputation [Reputation] (H). Author reputation is an obvious feature to
use, since vandalism tends to be performed predominantly by anonymous or novice
authors. In our vandalism detection tool, this feature is included both for zero-delay and
for historical vandalism detection: author reputation is in fact available at any time for
any user. However, for the purposes of the PAN 2010 Workshop evaluation, we have had to
forgo this feature, due to the time lag between the revisions, entered in November-December
2009, and the values of reputation available to us, updated as of May 2010.
– Author is anonymous [Anon] (H,Z). In addition to author reputation, we also
considered whether the author was anonymous. Interestingly, whenever
author reputation was included as a feature, the feature stating whether the author
was anonymous or not was not used by the classifier. Evidently, knowing that a
revision was authored by a low-reputation author was enough information: whether
the author was anonymous, or a novice, did not seem to matter.
– Time interval to the previous revision [Logtime_prev] (H,Z), time interval to
the next revision [Logtime_next] (H). We provided as features the quantities
log(1+t), where t is the amount of time from the preceding revision, or to the
following revision. We thought this feature might be useful, as spam is usually reverted
promptly. Indeed, the Logtime_next feature was used, but with a very low threshold
of 2.74, corresponding to a delay of only a dozen seconds between a revision and
the next one.
– Hour of day when revision was created [Hour_of_day] (H,Z). We observed a
correlation between the probability of vandalism, and the hour of the day at which
the revision was created (timing signals have been used in a more sophisticated way
for vandalism detection elsewhere). The classifier did not use this feature: either it was
unable to exploit it, or the information it contained was subsumed by that contained
in other, more significant features.
– Minimum revision quality [Min_quality] (H). In WikiTrust, every revision $r$ is
judged with respect to several past and future revisions. In detail, the quality
$q(r \mid r_-, r_+)$ of $r$ with respect to a past revision $r_-$ and a future revision
$r_+$ is defined as
\[ q(r \mid r_-, r_+) = \frac{d(r_-, r_+) - d(r, r_+)}{d(r_-, r)}, \]
where $d(r, r')$ represents the edit distance between $r$ and $r'$. To understand this
formula, it is useful to consider it from the point of view of the author $A_+$ of the
future revision $r_+$. From the point of view of $A_+$, the difference
$d(r_-, r_+) - d(r, r_+)$ represents how much closer to $A_+$'s work the revision has
become, and thus it measures the improvement made by $r$ upon $r_-$. The amount
$d(r_-, r)$ measures the amount of change introduced by $r$. Thus, $q(r \mid r_-, r_+)$
is a measure of the improvement, divided by the total change: it is equal to $-1$ for
entirely reverted revisions (where $r_- = r_+$), and to $+1$ if the change introduced
by $r$ with respect to $r_-$ is perfectly preserved in $r_+$.
Every revision is evaluated with respect to up to 6 past and 6 future revisions.
The minimum revision quality is the minimum quality computed with respect to all
past and future revisions considered. A low value for the minimum revision quality
indicates that at least one future author has reverted, in part or entirely, the edit that
led to the revision. Minimum revision quality was the most influential feature for
detecting vandalism in the historical vandalism detection tool.
– Total weight of judges [Judge_weight] (H). Not all triples $(r_-, r, r_+)$ used to
compute the quality of revision $r$ are given the same weight. The higher the reputation
$\mathrm{rep}(A_+)$ of the author $A_+$ of $r_+$, the higher the weight we give to the
computed quality $q(r \mid r_-, r_+)$. Additionally, if $r_+$ is very different from both
$r_-$ and $r$, then the computed quality is given less weight, as it is difficult to
compute what fraction of the change from $r_-$ to $r$ has been preserved in $r_+$. Thus,
we give to each judging triple $(r_-, r, r_+)$ the weight
The total weight of the judges is the total weight of all triples used to judge the
revision r. This feature was not used by any classifier.
– Average revision quality [Avg_quality] (H). In addition to the minimum revision
quality mentioned above, we have also considered the average quality of a revision,
with respect to the past and future revisions with which it has been compared,
weighted as above. In cases in which the minimum revision quality was above the
−0.662 threshold, the average quality was a strong signal, with a discrimination
threshold of 0.156.
– Maximum dissent [Max_dissent] (H). The maximum dissent of a revision mea-
sures how close the average revision quality is to the minimum revision quality.
This feature turned out to be useful in the classifier.
3 · (1 + d(r−,r))
· log(1 + rep(A+)).
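The Logtime and revision-quality features described above can be sketched in Python as follows. This is an illustration under stated assumptions, not WikiTrust's code. It takes the logarithm to be natural and t to be in seconds, and it uses a plain word-level Levenshtein distance as a stand-in for WikiTrust's own edit distance. All function names are hypothetical. Quality is computed as the improvement divided by the total change, as the text describes. The judge weights are taken as inputs, since the weight formula is not reproduced here.

```python
import math
from typing import Sequence, Tuple


def logtime(seconds: float) -> float:
    """Logtime feature: log(1 + t), natural logarithm assumed."""
    return math.log1p(seconds)


def seconds_from_logtime(value: float) -> float:
    """Invert the Logtime feature to recover t."""
    return math.expm1(value)


def word_edit_distance(a: str, b: str) -> int:
    """Word-level Levenshtein distance (stand-in for WikiTrust's distance)."""
    x, y = a.split(), b.split()
    prev = list(range(len(y) + 1))
    for i, wx in enumerate(x, start=1):
        cur = [i]
        for j, wy in enumerate(y, start=1):
            cur.append(min(prev[j] + 1,                # delete wx
                           cur[j - 1] + 1,             # insert wy
                           prev[j - 1] + (wx != wy)))  # substitute or keep
        prev = cur
    return prev[-1]


def revision_quality(r_minus: str, r: str, r_plus: str) -> float:
    """q(r | r-, r+): improvement toward r+ divided by the change made by r."""
    change = word_edit_distance(r_minus, r)
    if change == 0:
        raise ValueError("r is identical to r_minus; quality is undefined")
    improvement = (word_edit_distance(r_minus, r_plus)
                   - word_edit_distance(r, r_plus))
    return improvement / change


def min_quality(r: str, judges: Sequence[Tuple[str, str]]) -> float:
    """Minimum quality over all (past, future) judging pairs."""
    return min(revision_quality(rm, r, rp) for rm, rp in judges)


def avg_quality(qualities: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted average quality; weights are supplied by the caller."""
    return sum(q * w for q, w in zip(qualities, weights)) / sum(weights)
```

Under these assumptions, `revision_quality("the cat sat", "the cat sat LOL", "the cat sat")` evaluates to −1 (the edit was fully reverted), while passing the edited text again as the future revision gives +1. `seconds_from_logtime(2.74)` is about 14.5, consistent with the delay of roughly a dozen seconds mentioned for the Logtime_next threshold.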