To Sign Up, or not to Sign Up? Maximizing Citizen Science
Contribution Rates through Optional Registration
Caroline Jay1, Robert Dunne2, David Gelsthorpe3, Markel Vigo1
1School of Computer Science, 2Research IT, 3Manchester Museum
University of Manchester
Manchester, UK
[caroline.jay, rob.dunne, david.gelsthorpe, markel.vigo]@manchester.ac.uk
ABSTRACT
Many citizen science projects ask people to create an account before they participate – some require it. What effect does the registration process have on the number and quality of contributions? We present a controlled study comparing the effects of mandatory registration with an interface that enables people to participate without registering, but allows them to sign up to ‘claim’ contributions. We demonstrate that removing the requirement to register increases the number of visitors to the site contributing to the project by 62%, without reducing data quality. We also discover that contribution rates are the same for people who choose to register, and those who remain anonymous, indicating that the interface should cater for differences in participant motivation. The study provides evidence that to maximize contribution rates, projects should offer the option to create an account, but the process should not be a barrier to immediate contribution, nor should it be required.
Author Keywords
Citizen Science; Gamification; Productivity; Crowdsourcing
ACM Classification Keywords
H.5.m. Information Interfaces and Presentation (e.g. HCI):
Miscellaneous
INTRODUCTION
Citizen Science – the participation of ‘lay’ volunteers in scientific endeavour – is now an important means of collecting, curating and analysing data [31]. Notable successes in this domain include Foldit, a game to illuminate protein structure [4], and Galaxy Zoo [19], a web app for classifying galaxies. Projects cover a wide range of topics and activities, and whilst some require interaction with the external environment [35, 5, 33, 18, 23], many are conducted entirely online, on platforms such as Zooniverse [29]. This brings huge opportunities –
in many cases people need only an Internet connection and
the motivation to participate. It also brings challenges: how
can we ensure data quality, retain participants and maximize
contributions in this domain?
In this note we examine the role of online registration in determining contribution patterns and participation rates in citizen science. Whilst many projects allow people to participate without signing up in advance, it is also common to encourage – or require – users to create an account. This has obvious advantages for the platform, as it makes it easier to keep out automated traffic, monitor contribution quality and prompt contributors to return to the project after a period of absence [20, 3, 28, 27, 9]. It is also beneficial for citizen scientists, who are able to keep track of their work, and obtain information about how their contributions are being used [20, 24, 25, 13]. Registration is also necessary for the functioning of certain ‘gamified’ UI components used in many citizen science projects, such as badges and leaderboards, which reward those who make significant contributions [2].
While gamification in citizen science has been demonstrated to be effective and motivating [1, 14, 8], concerns have also been raised that encouraging competitive behaviour may reduce altruism [7], and that game interfaces may have a negative effect on intrinsic motivation, and alienate traditional citizen science volunteers [10, 22, 34, 2, 21]. Competition is now an established part of citizen science, but it is only one aspect of the complex set of factors motivating participation, which include not only extrinsic, reward- or reputation-based factors, but also inherent interest in the task or subject matter, and the satisfaction of contributing to a collective goal [20].
Previous research has shown that increasing the ‘work’ done during registration for an online community decreases the number of people prepared to go through it [17, 6]. Here we examine the effects of making registration, and participation in the ‘game’ of contribution, completely optional. In a study conducted on a palaeontology-focused data cataloguing application, visitors are presented, at random, with a mandatory registration page that they must complete before entering the project, or allowed to contribute straight away, with the option of signing up to ‘claim’ their contributions if they wish. We hypothesize that removing the barrier of account creation will increase the number of people who make at least one contribution, but that to resolve cognitive dissonance [11] people who go to the trouble of signing up, either through necessity or choice, will make more contributions, on average, than those who do not. We discover that whilst people are, indeed, more likely to contribute when they do not have to register, registration status (whether someone has to sign up, chooses to sign up, or chooses not to sign up) does not appear to affect the number of contributions they make. We therefore recommend that, where possible, registration be kept optional: this removes a barrier to entry for those who are motivated by personal interest, while still offering the possibility of recognition and competition for participants who are more extrinsically motivated.
Digitization of a museum fossil collection
Manchester Museum has a world-class fossil collection comprising around 100,000 fossils. Half of these objects are not recorded in the museum’s database, and as such accessing, cataloguing, and generating knowledge from the collection is problematic and time-consuming. Staff and volunteers are photographing the fossils with their corresponding labels as a stepping-stone to making the artefacts more widely available. To make these images accessible and useful to the public and scientists, they must first be catalogued in digital format.
To achieve a fully digitised record of the fossil collection, a web application was created to crowdsource the entry of fossil information from the photographs. Its goal was to engage citizen scientists interested in palaeontology, or otherwise keen to help with curation, in contributing to the scientific goals of the museum. The application was built using an agile participatory design process. It was constructed initially with feedback from curation staff at Manchester Museum to ensure it functioned correctly from a scientific and technical perspective, and then refined iteratively during a two-week beta testing period with a convenience sample of people who had not used the app before.
The application has two stages for serving images to users. Firstly, it shows images in the image queue that have not yet been completed (see Figure 2). A contributor checks the information on the label, and then enters it into a form underneath the image. Secondly, it shows images that have been completed and moved to the review queue. These images can be checked by other contributors, who can assess and edit the data (see Figure 1). This feature was included to allow users to self-regulate the quality of the data supplied, ultimately resulting in fewer inaccuracies [12], and was used in addition to presenting images multiple times for cross-checking results [32].
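The two-stage serving logic can be summarised in a minimal sketch. The state names, data structures and priority order below are illustrative assumptions on our part – the production system was a PHP web application backed by a database, whose schema is not reproduced here:

```python
from enum import Enum

class Stage(Enum):
    """Lifecycle of a photographed fossil label in the application."""
    IMAGE_QUEUE = "image_queue"    # label data not yet entered
    REVIEW_QUEUE = "review_queue"  # entered once, awaiting checking by others
    COMPLETE = "complete"          # reviewed and accepted

def next_task(images):
    """Serve an uncompleted image first; otherwise serve one awaiting review.

    `images` is a list of dicts with a 'stage' key -- an illustrative
    stand-in for the application's database query.
    """
    for wanted in (Stage.IMAGE_QUEUE, Stage.REVIEW_QUEUE):
        for image in images:
            if image["stage"] is wanted:
                return image
    return None
```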
Contributors receive a point for each task completed (either submission or review), and a leaderboard displays the names of the 10 people with the highest number of points. An activity feed was added to give a sense of immediacy to the user experience, enabling the user to see that other users are currently completing tasks. Where contributors were registered, their name was displayed on the feed or leaderboard; if a contributor was not registered, ‘Secret Scientist’ was displayed instead.
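The scoring and display rules (one point per completed task, a top-ten leaderboard, and ‘Secret Scientist’ shown for unregistered contributors) can be expressed in a short illustrative sketch; the function and variable names are ours, not the project's:

```python
from collections import Counter

def leaderboard(completed_task_log, registered_names, top_n=10):
    """Top-N contributors by points, one point per submission or review.

    `completed_task_log` is a list of contributor IDs, one entry per task;
    `registered_names` maps contributor ID -> display name for registered
    users. Unregistered contributors are displayed as 'Secret Scientist'.
    """
    points = Counter(completed_task_log)
    return [(registered_names.get(cid, "Secret Scientist"), score)
            for cid, score in points.most_common(top_n)]
```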
Figure 1. Data in a labelled item available for review.
Figure 2. A sample image.
STUDY
The study compares two landing pages: Interface A requires people to create an account or log in before they can see the rest of the app; Interface B takes people straight to the app, and allows them to interact with it and contribute straight away, with the option of signing up if they wish. Data were collected for the study for six weeks directly following the project launch on August 12th 2015. The application was promoted via SciStarter (http://scistarter.com), Manchester Museum’s Twitter and Facebook accounts, the citizen science and palaeontology forums on Reddit, and internal mailing lists and email bulletins within the University of Manchester. We tested two core hypotheses:
H1: A participant is more likely to make at least one contribution if he/she does not have to register.
H2: A participant with an account will make a greater number of contributions, as he/she is able to get credit for them.
Method
We used A/B testing (split testing), and the pseudo-random number generator function built into PHP, to assign visitors at random to the following groups:
Group A: directed to the registration/login pages to access the web application (https://natureslibrary.co.uk) to complete or review image label data.
Group B: directed straight to the web application (https://natureslibrary.co.uk/share/index/0) to complete or review image label data.
A further category, Group C, consists of those users initially
allocated to Group B who decide to register. The time of the
switch is logged in the database.
For each visitor, a unique ID was set as a tracking cookie, to allow us to monitor any participant who was not logged in (this applies to all participants in Group B, but also to members of Group A or C who were not logged in). This method has some limitations: if the same individual accessed the application through different browsers or computers this would result in different entries. On the other hand, it allows us to unequivocally identify individuals using several IP addresses (e.g., different Wi-Fi networks) or sharing IPs with other individuals (e.g., corporate IPs).
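Group assignment and visitor tracking can be sketched as follows. The study used PHP's built-in pseudo-random number generator and a cookie carrying a unique ID; this Python version is purely illustrative, and the UUID-based identifier is our assumption rather than the implementation actually used:

```python
import random
import uuid

def assign_group(rng=random):
    """Assign a new visitor at random to Group A (mandatory registration)
    or Group B (contribute immediately, optional sign-up)."""
    return "A" if rng.random() < 0.5 else "B"

def get_or_create_visitor_id(request_cookies, response_cookies):
    """Reuse the tracking cookie if the visitor presents one, else mint an ID.

    Visitors who reject the cookie still receive an ID for the current
    session so their contributions can be logged, but will be counted as a
    new visitor if they return (see the caveat on returning visitors below).
    """
    visitor_id = request_cookies.get("visitor_id")
    if visitor_id is None:
        visitor_id = str(uuid.uuid4())
        response_cookies["visitor_id"] = visitor_id
    return visitor_id
```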
Additionally, controlling the traffic generated by robots – which can account for 40% of the traffic on some websites [16] – is a well-known challenge of A/B testing experiments [15]. These entries may introduce noise into the data, risking the reliability of the results. We therefore followed a systematic approach to distinguish robots from humans:
1. Robots were removed by comparing their user agent string against the entries in a dictionary (http://useragentstring.com) which included common robot identifiers such as bot, proxy, spider, slurp, etc. (a minimal sketch of this step follows the list).
2. People who accepted a cookie were classified as members of ‘A’ or ‘B’.
3. People who did not accept the cookie were still allocated to a group on their first visit, and had a unique ID logged, which remained the same until they left the application. It is key to consider these individuals, as the application allowed people to enter data without accepting the cookie. A caveat is that if the user did not accept a cookie and returned, he/she would be classified as a new user.
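The user-agent matching in step 1 amounts to a simple token check; the token list below is a small illustrative sample, whereas the study matched against a fuller dictionary from http://useragentstring.com:

```python
# Illustrative subset of robot identifiers; the study used a larger
# dictionary sourced from http://useragentstring.com.
ROBOT_TOKENS = ("bot", "proxy", "spider", "slurp", "crawl")

def is_robot(user_agent: str) -> bool:
    """Classify a visit as automated if any known robot token appears in the
    lower-cased user-agent string."""
    ua = user_agent.lower()
    return any(token in ua for token in ROBOT_TOKENS)

# e.g. is_robot("Mozilla/5.0 (compatible; Googlebot/2.1)") returns True
```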
Results
In total, 383 individuals were allocated to Group A (registration was required), while 445 were free to enter data without registration (Group B). Thirty-two individuals (8.36%) who were allocated to Group A made at least one contribution, compared with 57 (13%) of those allocated to condition B – a 62% increase in contribution rate. A Mann-Whitney test indicates that there is an effect of having to register on likelihood of contributing (W = 3.8, p = 0.05), which confirms H1.
Figure 3. Number of contributions per individual in each group.
As shown in Figure 3, participation rates followed a typical pattern for a citizen science application. Most people tended to be ‘dabblers’, contributing in smaller numbers overall (see the mode in Table 1), with a small group of highly engaged participants who made large contributions [9].
A Kruskal-Wallis test suggests that, in terms of number of contributions per individual, there is no difference between the groups (χ² = 4.77, p = 0.09). In line with this, if we consider individuals from C as members of B, a Mann-Whitney test finds no evidence of group dissimilarity (W = 972.5, p = 0.6). Consequently H2 is rejected, indicating that having registered does not have an effect on the number of contributions.
Group        N     % total   M       Mdn   Mo   Max
A (32/383)   269   30.22     8.41    5     1    30
B (57/445)   267   30.00     5.56    3     1    56
C            354   39.78     25.29   6.5   1    137
B+C          621   69.78     10.89   3     1    137
Table 1. Descriptive statistics of contributions per group. Column 1 shows the group (number contributing / number initially allocated to the group). N = total contributions; M = mean, Mdn = median, Mo = mode and Max = maximum contributions per individual.
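For readers wishing to reproduce the tests from the released data, the analysis follows this general shape (a sketch only: the grouping of per-individual contribution counts is assumed from the description above, and SciPy reports the Mann-Whitney U statistic rather than the W values quoted here):

```python
from scipy import stats

def compare_contribution_counts(group_a, group_b, group_c):
    """Non-parametric comparisons of contributions per individual.

    Each argument is a list of contribution counts, one entry per contributor.
    """
    # Do the three groups differ in contributions per individual? (H2)
    h_stat, p_kw = stats.kruskal(group_a, group_b, group_c)

    # Treating optional registrants (C) as members of B, compare A vs B+C.
    u_stat, p_mw = stats.mannwhitneyu(group_a, group_b + group_c,
                                      alternative="two-sided")
    return {"kruskal_wallis": (h_stat, p_kw), "mann_whitney": (u_stat, p_mw)}
```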
Analysis of Group C
Several individuals (N = 19) who fell in Group B registered despite the fact this was not mandatory for participation. Fourteen of them made at least one contribution; this accounts for 29% of participants in Group B who contributed at least once. Interestingly, 9 of them registered before making any contribution, whereas the remaining five made at least one contribution before registering – Figure 4 (left) shows the distribution of these individuals. Figure 4 (right) shows the number of contributions before and after registration per individual: on average, the number of contributions made before registering accounts for 18% of total contributions, in a broad range that extends from 1.4% to 44%.
It is worth mentioning that out of the 890 labelled pictures, 269 were submitted by individuals from Group A, 267 by Group B and 354 by Group C. Table 1 shows that the number of contributions made by those who registered even though they did not have to clearly stands out: these 14 participants account for almost 40% of the contributions.
Figure 4. Group C behaviour: on the left, the distribution of individuals
based on the number of contributions they made before they registered;
on the right, the total number of contributions of those who contributed
before and after they registered.
Data Quality
Data needed to be entered in up to seven fields of the form. A scoring system was developed to determine whether there was a difference between groups in terms of data quality: 0 = nothing/garbage; 1 = one field correct; 2 = two or more fields correct; 3 = completely correct. 5% of the total contributions for each group were retrieved at random from the database using an SQL query. The mean quality scores out of three were 2.87 for Group A and 2.71 for Group B (excluding Group C). On the whole the data entered for the image labels were of high quality, with no garbage entries. A Mann-Whitney test indicates there was no significant difference between groups A and B when it came to data quality (W = 543.5, p = 0.62).
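The 0-3 scoring scheme maps onto a small function; this is a sketch in which the per-field equality check stands in for the manual correctness judgements actually made:

```python
def quality_score(entry, reference):
    """Score a submitted label against a reference transcription.

    0 = nothing/garbage, 1 = one field correct, 2 = two or more fields
    correct, 3 = completely correct. `entry` and `reference` are dicts keyed
    by the (up to seven) form fields.
    """
    fields = list(reference)
    correct = sum(1 for field in fields if entry.get(field) == reference[field])
    if correct == len(fields):
        return 3
    if correct >= 2:
        return 2
    return correct  # 0 or 1
```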
DISCUSSION
The results support H1: A participant is more likely to make at least one contribution if he/she does not have to register. In our study, 8% of people assigned to Group A made at least one contribution; for those assigned to Group B, this was 13%, equating to an additional 5% of site visitors participating – a 62% increase in the contribution rate.
A concern around trying to increase this potentially more casual participation is that it might result in lower quality data. Would participants take the task seriously if they are not accountable for their results? Previous work has linked online reputation to provision of higher quality data [30], but there is also concern that competition based on the number of contributions can lead to less care taken in entering data [10]. In this study we find that data quality is high, and does not vary as a function of registration status.
We do not have strong support for H2, as the difference in contribution rates between the groups is not statistically significant. The contribution rates show the long-tail distribution typical of citizen science projects, and it is interesting to note that this is true of all groups, providing further evidence that this pattern of activity is characteristic of the domain. It is, however, interesting to consider some of the descriptive data from the perspective of motivation and topic interest. Of the people who moved to Group C (optional registration), two-thirds did so before making any contributions, which potentially indicates that they wished to ensure they were able to get ‘credit’ before starting any work. It therefore appears important to cater to a group of people who may be keen to keep track of their contributions, or wish to publicly participate in the project via the leaderboard and feed.
Additionally, we see contributors in Group C making more than 70 submissions, and another high contributor (48 submissions) in Group B, whilst everyone in Group A made fewer than 40 submissions. It is difficult to draw conclusions based on a small number of participants, but it is reasonable to assume that these four contributors were interested enough in the task to complete it in high volumes. The contributions of the 14 individuals from Group C account for almost 40% of the 890 pictures that were labelled. If they had been allocated to Group A, where they would not have had a chance even to see the task before signing up, it is possible the project would have lost a large number of entries.
DESIGN RECOMMENDATIONS
The results point to two clear recommendations:
Allow people to start contributing as soon as possible. Visitors should be able to see the task, and contribute, before being required to register. We saw a 62% rise in the number of site visitors contributing when we removed the registration barrier, and allowed people to get straight to the task.
Allow people to register. There are many reasons that it is helpful to sign up to a project. Citizen scientists get to stay in touch with the project and become part of a community; platforms get to understand more about their contributors, and are better able to communicate with them [26]. There is also evidence that more extrinsically-motivated people may want to register, so they can get explicit credit for their contributions, and participate in the game aspects of an app [1]. It is possible this explains the behaviour of at least some of Group C, most of whom signed up before making any contribution.
Methodological Considerations
This study is relatively small in scale, although the number of participants is similar to other controlled studies in this area [21]. The long-tail distribution of contributions also matches that of other projects, indicating that the sample could be viewed as representative of ‘typical’ citizen science. The task was a relatively straightforward classification task, however, and it is therefore necessary to be cautious about applying the results to more complex or involved tasks.
The study was controlled and conducted in the wild, lending it both internal and external validity, but it was purely quantitative, and did not collect any data regarding participants’ thoughts and motivations, so it is not possible to be certain why participants made particular decisions. It should also be noted that the results apply to purely online studies; where field work is involved, or there is some other reason that it is important to identify contributors, optional registration would not be recommended.
CONCLUSION
This work demonstrates that it is possible to increase contributions to online citizen science by more than 60%, by allowing people to participate in a project without obliging them to officially sign up. It also provides evidence that being able to record contributions, and potentially gain some form of recognition for them, is important for some people, and therefore registration should be offered. Many citizen science projects follow this model, but the way in which registration is handled by projects varies considerably, and has not previously been investigated systematically. We report an empirical study demonstrating that the way in which account creation is handled really does make a difference, and propose an evidence-based model for the registration process that a new project can default to.
PROJECT DATA
We practise open science, and have made project code and data available at https://github.com/refractiveco/natureslibrary and http://iam-data.cs.manchester.ac.uk/investigations/13.
REFERENCES
1. Anne Bowser, Derek Hansen, Yurong He, Carol Boston,
Matthew Reid, Logan Gunnell, and Jennifer Preece.
2013. Using Gamification to Inspire New Citizen
Science Volunteers. In Proceedings of the First
International Conference on Gameful Design, Research,
and Applications (Gamification ’13). ACM, New York,
NY, USA, 18–25.
2. Anne Bowser, Derek Hansen, Jennifer Preece, Yurong
He, Carol Boston, and Jen Hammock. 2014. Gamifying
Citizen Science: A Study of Two User Groups. In
Proceedings of the Companion Publication of the 17th
ACM Conference on Computer Supported Cooperative
Work & Social Computing (CSCW Companion ’14).
ACM, New York, NY, USA, 137–140.
3. Justin Cheng, Jaime Teevan, Shamsi T. Iqbal, and
Michael S. Bernstein. 2015. Break It Down: A
Comparison of Macro- and Microtasks. In Proceedings
of the 33rd Annual ACM Conference on Human Factors
in Computing Systems (CHI ’15). ACM, New York, NY,
USA, 4061–4064.
4. Seth Cooper, Firas Khatib, Adrien Treuille, Janos Barbero, Jeehyung Lee, Michael Beenen, Andrew Leaver-Fay, David Baker, Zoran Popović, and others. 2010. Predicting protein structures with a multiplayer online game. Nature 466, 7307 (2010), 756–760.
5. Mark Cottman-Fields, Margot Brereton, and Paul Roe.
2013. Virtual Birding: Extending an Environmental
Pastime into the Virtual World for Citizen Science. In
Proceedings of the SIGCHI Conference on Human
Factors in Computing Systems (CHI ’13). ACM, New
York, NY, USA, 2029–2032.
6. Sara Drenner, Shilad Sen, and Loren Terveen. 2008.
Crafting the Initial User Experience to Achieve
Community Goals. In Proceedings of the 2008 ACM
Conference on Recommender Systems (RecSys ’08).
ACM, New York, NY, USA, 187–194.
7. John Duffy and Tatiana Kornienko. 2010. Does
competition affect giving? Journal of Economic
Behavior & Organization 74, 12 (2010), 82 – 103.
8. David Easley and Arpita Ghosh. 2013. Incentives,
Gamification, and Game Theory: An Economic
Approach to Badge Design. In Proceedings of the
Fourteenth ACM Conference on Electronic Commerce
(EC ’13). ACM, New York, NY, USA, 359–376.
9. Alexandra Eveleigh, Charlene Jennett, Ann Blandford,
Philip Brohan, and Anna L. Cox. 2014. Designing for
Dabblers and Deterring Drop-outs in Citizen Science. In
Proceedings of CHI ’14. 2985–2994.
10. Alexandra Eveleigh, Charlene Jennett, Stuart Lynn, and
Anna L. Cox. 2013. “I Want to Be a Captain! I Want to
Be a Captain!”: Gamification in the Old Weather Citizen
Science Project. In Proceedings of the First
International Conference on Gameful Design, Research,
and Applications (Gamification ’13). ACM, New York,
NY, USA, 79–82.
11. Leon Festinger. 1957. A Theory of Cognitive
Dissonance. Stanford University Press.
12. Derek L. Hansen, Patrick J. Schone, Douglas Corey,
Matthew Reid, and Jake Gehring. 2013. Quality Control
Mechanisms for Crowdsourcing: Peer Review,
Arbitration, & Expertise at Familysearch Indexing.
In Proceedings of the 2013 Conference on Computer
Supported Cooperative Work (CSCW ’13). ACM, New
York, NY, USA, 649–660.
13. Ioanna Iacovides, Charlene Jennett, Cassandra
Cornish-Trestrail, and Anna L. Cox. 2013. Do Games
Attract or Sustain Engagement in Citizen Science?: A
Study of Volunteer Motivations. In CHI ’13 Extended
Abstracts on Human Factors in Computing Systems
(CHI EA ’13). ACM, New York, NY, USA, 1101–1106.
14. Nicole Immorlica, Greg Stoddard, and Vasilis
Syrgkanis. 2015. Social Status and Badge Design. In
Proceedings of the 24th International Conference on
World Wide Web (WWW ’15). International World Wide
Web Conferences Steering Committee, Republic and
Canton of Geneva, Switzerland, 473–483.
15. Ron Kohavi, Roger Longbotham, Dan Sommerfield, and
Randal Henne. 2009. Controlled experiments on the
web: survey and practical guide. Data Mining and
Knowledge Discovery 18, 1 (2009), 140–181.
16. Ron Kohavi and Rajesh Parekh. 2003. Ten Supplementary Analyses to Improve E-commerce Web Sites. In Proceedings of the Fifth WEBKDD Workshop.
17. Robert Kraut and Paul Resnick. 2012. Building
Successful Online Communities: Evidence-Based Social
Design. MIT Press.
18. Stacey Kuznetsov, Carrie Doonan, Nathan Wilson,
Swarna Mohan, Scott E. Hudson, and Eric Paulos. 2015.
DIYbio Things: Open Source Biology Tools As
Platforms for Hybrid Knowledge Production and
Scientific Participation. In Proceedings of the 33rd
Annual ACM Conference on Human Factors in
Computing Systems (CHI ’15). ACM, New York, NY,
USA, 4065–4068.
19. Chris J Lintott, Kevin Schawinski, Anže Slosar, Kate Land, Steven Bamford, Daniel Thomas, M Jordan Raddick, Robert C Nichol, Alex Szalay, Dan Andreescu, and others. 2008. Galaxy Zoo: morphologies derived from visual inspection of galaxies from the Sloan Digital Sky Survey. Monthly Notices of the Royal Astronomical Society 389, 3 (2008), 1179–1189.
20. Oded Nov, Ofer Arazy, and David Anderson. 2014.
Scientists@Home: What Drives the Quantity and
Quality of Online Citizen Science Participation? PLoS
ONE 9, 4 (2014), e90375.
21. Chris Preist, Elaine Massung, and David Coyle. 2014.
Competing or Aiming to Be Average?: Normification
As a Means of Engaging Digital Volunteers. In
Proceedings of the 17th ACM Conference on Computer
Supported Cooperative Work & Social Computing
(CSCW ’14). ACM, New York, NY, USA, 1222–1233.
22. Nathan Prestopnik and Kevin Crowston. 2012.
Purposeful Gaming & Socio-computational Systems: A
Citizen Science Design Case. In Proceedings of the 17th
ACM International Conference on Supporting Group
Work (GROUP ’12). ACM, New York, NY, USA, 75–84.
23. Christine Robson, Marti Hearst, Chris Kau, and Jeffrey
Pierce. 2013. Comparing the Use of Social Networking
and Traditional Media Channels for Promoting Citizen
Science. In Proceedings of the 2013 Conference on
Computer Supported Cooperative Work (CSCW ’13).
ACM, New York, NY, USA, 1463–1468.
24. Dana Rotman, Jen Hammock, Jenny J. Preece, Carol L.
Boston, Derek L. Hansen, Anne Bowser, and Yurong
He. 2014. Does Motivation in Citizen Science Change
with Time and Culture?. In Proceedings of the
Companion Publication of the 17th ACM Conference on
Computer Supported Cooperative Work & Social
Computing (CSCW Companion ’14). ACM, New York,
NY, USA, 229–232.
25. Dana Rotman, Jenny Preece, Jen Hammock, Kezee
Procita, Derek Hansen, Cynthia Parr, Darcy Lewis, and
David Jacobs. 2012. Dynamic Changes in Motivation in
Collaborative Citizen-science Projects. In Proceedings
of CSCW ’12. 217–226.
26. Avi Segal, Ya’akov (Kobi) Gal, Robert J. Simpson,
Victoria Victoria Homsy, Mark Hartswood, Kevin R.
Page, and Marina Jirotka. 2015. Improving Productivity
in Citizen Science Through Controlled Intervention. In
Proceedings of the 24th International Conference on
World Wide Web (WWW ’15 Companion). International
World Wide Web Conferences Steering Committee,
Republic and Canton of Geneva, Switzerland, 331–337.
27. S. Andrew Sheppard and Loren Terveen. 2011. Quality
is a Verb: The Operationalization of Data Quality in a
Citizen Science Community. In Proceedings of the 7th
International Symposium on Wikis and Open
Collaboration (WikiSym ’11). ACM, New York, NY,
USA, 29–38.
28. S. Andrew Sheppard, Andrea Wiggins, and Loren
Terveen. 2014. Capturing Quality: Retaining
Provenance for Curated Volunteer Monitoring Data. In
Proceedings of the 17th ACM Conference on Computer
Supported Cooperative Work & Social Computing
(CSCW ’14). ACM, New York, NY, USA, 1234–1245.
29. Robert Simpson, Kevin R. Page, and David De Roure.
2014. Zooniverse: Observing the World’s Largest
Citizen Science Platform. In Proceedings of the 23rd
International Conference on World Wide Web (WWW
’14 Companion). International World Wide Web
Conferences Steering Committee, Republic and Canton
of Geneva, Switzerland, 1049–1054.
30. Yla R. Tausczik and James W. Pennebaker. 2011.
Predicting the Perceived Quality of Online Mathematics
Contributions from Users’ Reputations. In Proceedings
of the SIGCHI Conference on Human Factors in
Computing Systems (CHI ’11). ACM, New York, NY,
USA, 1885–1888.
31. Ramine Tinati, Max Van Kleek, Elena Simperl, Markus Luczak-Rösch, Robert Simpson, and Nigel Shadbolt. 2015. Designing for Citizen Data Analysis: A Cross-Sectional Case Study of a Multi-Domain Citizen Science Platform. In Proceedings of CHI ’15. 4069–4078.
32. Luis von Ahn and Laura Dabbish. 2004. Labeling
Images with a Computer Game. In Proceedings of the
SIGCHI Conference on Human Factors in Computing
Systems (CHI ’04). ACM, New York, NY, USA,
319–326.
33. Jon Whittle. 2014. How Much Participation is Enough?:
A Comparison of Six Participatory Design Projects in
Terms of Outcomes. In Proceedings of the 13th
Participatory Design Conference: Research Papers -
Volume 1 (PDC ’14). ACM, New York, NY, USA,
121–130.
34. Andrea Wiggins. 2013. Free As in Puppies:
Compensating for ICT Constraints in Citizen Science. In
Proceedings of the 2013 Conference on Computer
Supported Cooperative Work (CSCW ’13). ACM, New
York, NY, USA, 1469–1480.
35. Andrea Wiggins and Kevin Crowston. 2011. From
Conservation to Crowdsourcing: A Typology of Citizen
Science. In Proceedings of HICSS ’11. 1–10.