Revisiting Linus’s Law: Benefits and challenges of open source software peer review

Article in International Journal of Human-Computer Studies 77:52-65 · May 2015
DOI: 10.1016/j.ijhcs.2015.01.005
Abstract
Open source projects leverage a large number of people to review products and improve code quality. Differences among participants are inevitable and important to this collaborative review process—participants with different expertise, experience, resources, and values approach the problems differently, increasing the likelihood of finding more bugs and fixing the particularly difficult ones. To understand the impacts of member differences on the open source software peer review process, we examined bug reports of Mozilla Firefox. These analyses show that these various types of differences increase workload as well as frustration and conflicts. However, they facilitate situated learning, problem characterization, design review, and boundary spanning. We discuss implications for work performance and community engagement, and suggest several ways to leverage member differences in the open source software peer review process.
Jing Wang*, Patrick C. Shih, John M. Carroll
College of Information Sciences and Technology, The Pennsylvania State University, University Park, PA 16802, United States
article info
Article history:
Received 14 January 2014
Received in revised form 18 December 2014
Accepted 18 January 2015
Communicated by Francoise Detienne
Available online 28 January 2015
Keywords:
Online collaboration
Software peer review
Open source
© 2015 Elsevier Ltd. All rights reserved.
1. Introduction
“Given enough eyeballs, all bugs are shallow” (Raymond, 2001). Linus’s law highlights the power of open source software (OSS) peer review. As a high-profile model of large-scale online collaboration, OSS development often involves globally dispersed experts, mostly volunteers, collaborating over the Internet to produce software with source code freely available. Peer review is one of the core collaborative practices of OSS development: distributed participants evaluate and test the released software products, and report any problems they discovered or experienced; others jointly analyze and identify software defects or deficiencies, and generate solutions for repairing or improving the software products.
Large diverse communities are considered paramount to OSS peer review processes. “More users find more bugs because adding more users adds more different ways of stressing the program. [...] Each one approaches the task of bug characterization with a slightly different perceptual set and analytical toolkit, a different angle on the problem” (Raymond, 2001). Extensive studies on OSS have shown the existence of other dimensions of member differences, such as heterogeneous motivations (Feller et al., 2005; Roberts et al., 2006), different expertise in software engineering and usability (Twidale and Nichols, 2005), and divergent perspectives (Sandusky and Gasser, 2005). The advances of social media provide opportunities for engaging an even larger audience in OSS development, and these potential contributors are likely to differ along even more dimensions (Begel et al., 2010; Storey et al., 2010).
Thus, understanding the role of member differences in the collaboration and social processes of OSS peer review, and particularly how they may be better leveraged, is important for enhancing the understanding of OSS and online large-scale collaboration. However, little research has directly addressed diverse characteristics of members; existing work is largely focused on differences caused by roles (e.g., Daniel et al., 2013), distance (e.g., Cataldo et al., 2006), or national cultures (e.g., Shachaf, 2008).
To enhance the understanding of the OSS peer review process, we focus on the differences of participants and their impacts on the process, building on our previous study that identified and characterized the common activities constituting the OSS peer review process (Wang and Carroll, 2011). We are especially interested in the less discernible or quantifiable attributes (e.g., informational and value diversity) in the OSS development context, rather than more readily observable ones (e.g., tenure within the site and the community, roles, language) as other studies did. To unfold the impacts of various types of differences, we conducted a case study of OSS peer review processes in Mozilla Firefox, a high-profile OSS project involving a massive number of participants with a wide range of attributes. We analyzed member interactions recorded in bug reports, the central space for Mozilla’s peer review. Participants who contributed to bug reports are valuable assets for OSS projects to retain, as they are probably more motivated than generic Firefox users, because using bug tracking systems to report, analyze, and fix bugs requires more effort than simply using the browser.
* Corresponding author. Tel.: +1 814 863 8856.
E-mail addresses: jzw143@ist.psu.edu (J. Wang), patshih@ist.psu.edu (P.C. Shih),
jmcarroll@psu.edu (J.M. Carroll).
Our findings indicate that informational diversity and value diversity result in both benefits and challenges to the work-related processes of OSS peer review as well as the well-being of open source communities. We disentangle the member differences that are often confounded with the prevalent dichotomic view of core/periphery and developer/user in current large-scale online collaboration literature. Such efforts create opportunities to understand and support overlooked groups of community participants, including triagers and co-developers. We also suggest implications for designing socio-technical interventions to mitigate the negative effects and augment the positive impacts of member differences. Distinct from prior research aiming at converting peripheral participation into active contributions, our design proposals offer an alternative way to embrace, leverage, and support member differences in communities that thrive on diversity.
2. Related work
2.1. Open source software peer review
OSS peer review is widely believed to benefit remarkably from a large community (“many eyeballs”) of members with different perspectives (Raymond, 2001). In general, the OSS peer review process begins with someone submitting a bug report to the bug tracking system, an application that helps developers keep track of reported defects or deficiencies of source code, design, and documents. Others examine the defect causes and request additional information to determine whether the bug should be fixed. Once a solution is reached, they then commit a change set (mostly a patch) to the current software product. Our earlier work (Wang and Carroll, 2011) has codified the process as consisting of four common activities: submission (i.e., bug reporting), identification, solution, and evaluation. These activities were externalized and made available in bug reports. They serve similar purposes as individual reviews, review meetings, rework, and follow-up in traditional software review, respectively, but fundamentally rely on web-based technologies. In addition to bug tracking systems in which people record and comment on bugs and issues, version control systems manage and synchronize committed software changes, while communication tools such as mailing lists and Internet Relay Chat (IRC) enable developers to discuss bugs.
Most studies related to the OSS peer review process were conducted from the software engineering perspective, deliberately modeling the information needs in bug report quality (Bettenburg et al., 2008a; Breu et al., 2010), inaccurate bug assignment (Jeong et al., 2009), efficiency and effectiveness of patch review (Rigby et al., 2008), and distribution of contributions in bug reporting and fixing (Mockus et al., 2002). Rigby et al. articulated stakeholders’ involvement and their approaches for managing broadcast-based patch review (Rigby and Storey, 2011; Rigby et al., 2008). They also found that stakeholders interacted differently when discussing technical issues and when discussing the project scope.
With respect to the nature of a collaborative practice, much research effort related to OSS peer review has been devoted to explaining the coordination mechanisms (Yamauchi et al., 2000; Crowston and Scozzi, 2008; Sandusky and Gasser, 2005), negotiation (Sandusky and Gasser, 2005), leadership and governance (Fielding, 1999; Moon and Sproull, 2000), and the role of bug tracking systems (Bertram et al., 2010). Recent work by Ko and Chilana (2010) analyzed the reports of Mozilla contributors who reported problems but were never assigned problems to fix, indicating different competences of members in reporting bugs. Wang et al.’s analysis (Wang and Carroll, 2011) also showed that a large volume of bug reports failed to identify real bugs, increasing the cost of filtering them out. This study extends current understanding of OSS peer review by focusing on member differences of various attributes, particularly the impacts of these differences on the ways members interact and collaborate during the review process.
2.2. Diversity in collocated and distributed groups
Diversity is commonly defined as the differences of any attributes among individuals. As a complex construct, it has been studied in multiple disciplines, such as organizational behavior, sociology, and psychology. A complete review of this large body of literature is beyond the scope of this paper. However, regardless of variations between typologies, diversity can be of the readily visible attributes (e.g., gender, ethnicity, and age), of informational attributes (e.g., education, work experience, and knowledge), and of attitudes and values (e.g., whether members agree on what is important within the community and whether they have similar goals) (Van Knippenberg and Schippers, 2007; Jehn et al., 1999; Williams and O’Reilly, 1998). Diversity of an attribute can be further classified into three types: separation, variety, and disparity (Harrison and Klein, 2007). Separation refers to the differences in (lateral) position or opinion among members, primarily of value, belief, or attitude. Variety is the categorical differences, often unique or distinctive information, while disparity represents proportional differences along a continuum, mostly of socially valued assets or resources held among members. This conceptualization has important implications for measurement: different types of diversity call for different measures.
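To illustrate why the three types call for different measures, the following minimal sketch computes one commonly used index for each: the standard deviation for separation, Blau's index for variety, and the coefficient of variation for disparity. The choice of indices, the function names, and the example values are illustrative assumptions; none of them is a measure applied in this study, which analyzes diversity qualitatively.

```python
from collections import Counter
from statistics import mean, pstdev


def separation(positions):
    """Separation: spread of positions on an opinion scale (population SD)."""
    return pstdev(positions)


def variety(categories):
    """Variety: Blau's index, 1 - sum(p_k ** 2) over category proportions."""
    counts = Counter(categories)
    total = len(categories)
    return 1.0 - sum((count / total) ** 2 for count in counts.values())


def disparity(resources):
    """Disparity: coefficient of variation (population SD divided by the mean)."""
    return pstdev(resources) / mean(resources)


# Hypothetical four-member group, invented purely for illustration.
opinions = [1, 1, 5, 5]                       # two opposed camps on a 1-5 scale
roles = ["dev", "dev", "user", "triager"]     # categorical attribute
patches = [10, 10, 10, 10]                    # equally distributed resource

print(separation(opinions), variety(roles), disparity(patches))
```

The same group can thus score high on one type and zero on another, which is why a single index cannot stand in for all three.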
Research on collaboration in collocated groups has a long history of analyzing diversity of various dimensions. Reviews and meta-analyses of this large volume of work suggested that the effects of diversity are contingent on the context: diversity can affect work processes, performance, and member identification in both positive and negative ways, and effects of the same diversity dimension may vary greatly across contexts (Harrison and Klein, 2007; Joshi and Roh, 2009). For instance, diverse perspectives tend to benefit work performance in short-term tasks, but these positive effects become much less significant in longer-term teams, and conflicts start to arise (Joshi and Roh, 2009). In general, a broad range of expertise and knowledge can enhance problem solving, decision-making, and even creativity and innovation, while differences of perspectives and values can result in dysfunctional group processes, conflicts, and poor performance (Milliken et al., 2003; Williams and O’Reilly, 1998; Van Knippenberg and Schippers, 2007). However, these studies were largely conducted in collocated groups in organizations or laboratory experiments.
A few other researchers looked into diversity in virtual teams. Shachaf (2008) explored the heterogeneity of national cultures of members from ad hoc global virtual teams at a corporation. The interviews showed such cultural diversity has positive influences on decision-making and negative influences on communication. Damian and Zowghi (2003) also focused on cultural diversity and found that it increased team members’ difficulty in achieving common understanding of software requirements. Several chapters in Hinds and Kiesler (2002) discussed conflicts caused by differences of organizational cultures and informational diversity. However, such work still analyzed diversity in settings in which group or organizational boundaries were clearly defined. There have been very few studies on volunteer-based large-scale online communities, such as OSS projects and Wikipedia, and therefore, they warrant additional examination.
Another theme of relevant research on virtual teams did not specifically analyze diversity but differences that were caused by distance, such as different information about remote contexts and different time zones. Unlike diversity literature focused on personal attributes, this body of work examined environmental factors, suggesting that dispersed locations led to conflicts (Cramton, 2001), communication and coordination difficulties (Herbsleb and Mockus, 2003; Cataldo et al., 2006; Olson and Olson, 2000), as well as uninformed decisions (Grinter et al., 1999). Our goal is to investigate member differences beyond the language and location dimensions, in order to understand how other aspects of variations may influence collaboration.
2.3. Member differences in large-scale online peer production
A crucial advantage of online peer production lies in its openness, engaging large-scale communities beyond a few experts. Members in such communities are likely to differ in various ways. Limited research has been done to articulate these variances and understand the challenges and benefits they bring. The relevant studies fall into three major themes: core versus periphery, developers versus users, and other types of variances. We summarize them in the following subsections, respectively.
2.3.1. Differences between core and periphery
A body of research on large-scale online peer production touching on member differences was focused on contrasting core and periphery. One type of studies in this research theme modeled the structural characteristics in OSS development collaboration. These efforts distinguished core developers from peripheral members and indicated the relationship between project centrality and performance. Uzzi and his collaborators (Uzzi, 1997; Uzzi et al., 2007; Guimera et al., 2005) conceptualized social network structures of teams and their effects on collaboration in organizational settings. Subsequently, researchers have applied social network analysis to collaboration among developers in OSS development. For example, Borgatti and Everett (2000) proposed an analytical model for detecting core and peripheral developers of OSS projects. Using a similar model, Tan et al. (2007) reported that both direct and indirect ties positively influence the productivity of teams, and the greater the cohesive ties that the team members form in their social network the more productive they are in OSS development. Adhering to this theme, Dahlander and Frederiksen (2012) investigated how one’s position in the core/periphery structure affects innovation. Their findings suggested an inverted U-shaped relationship between an individual’s position in the core/periphery structure and his/her innovation. In addition, spanning multiple external boundaries remains a positive delineator for most people, but is detrimental for the most core individuals. Crowston and Howison (2006) reported the emergence of a hierarchical structure through analyzing interactions around bug fixing of SourceForge projects. This hierarchy incorporated an additional tier, co-developers who submit patches, into the core/periphery dichotomy. They also found that the level of centralization is negatively correlated with project size, suggesting that larger projects become more modular. In general, OSS development is a highly decentralized activity, and research has found no significant relationship between closeness and betweenness centralities of the project teams and their success in the majority of the OSS projects (Singh, 2010).
Aside from the effort of analyzing network structure, some other scholars characterized the distinct activities core and periphery performed in OSS development. In their widely cited article, Mockus et al. (2002) reported the differences of contribution types and scale between core developers and the other contributors: a group larger than the core by about an order of magnitude contributed to defect fixing, and a group larger by another order of magnitude contributed to defect reporting. Rullani and Haefliger (2013) described the roles of core developers and those in the peripheries, and how the propagation of such standards is communicated through non-material artifacts such as code and virtual discussions as a social practice.
Our study disentangles the member differences confounded with status in the core/periphery structure, unfolding disparity of knowledge and separation of values (Harrison and Klein, 2007). One type of disparity regards technical expertise, namely knowledge of programming and the code repository. Another type of disparity relates to process knowledge, which pertains to a specific community, such as the awareness of community norms, practices, and agenda. Core developers dominate the higher end of both spectrums, while the peripheral members spread over the other side. The separation of values is instantiated when core and peripheral members hold opposing beliefs about what is important to the software application. Conceiving member differences in these more nuanced ways, beyond the dichotomic view of core and periphery, creates opportunities to further segment the participants involved in OSS peer review and to provide better support for their distinct needs. For instance, a group of people could fall at the higher end of process knowledge but the lower end of technical expertise, like triagers. They have no intention to convert to core developers, but contribute significantly as coordinators and gatekeepers.
2.3.2. Differences between developers and users
Another group of researchers described OSS communities as comprised of developers and users. Unlike the core/periphery structure that is commonly observed in other types of online communities, this developer/user distinction exclusively fits the software development domain. Human-computer interaction scholars particularly emphasized the importance of users. Through examining usability discussions in OSS communities, they argued that end-users and usability experts could provide special expertise and knowledge the other developers do not have (Bach et al., 2009; Twidale and Nichols, 2005). In addition, Ko and Chilana (2010) studied user contributions to OSS bug reporting, suggesting that the valuable bug reports primarily came from a comparably small group of experienced and frequent reporters.
A few studies considered developers and users two different project roles. Barcellini et al. (2008) characterized how user-proposed designs were mediated between the user community and the developer community, identifying the emerging role of boundary spanners who participated in parallel in both communities and coordinated the discussions. A recent study analyzing 337 SourceForge OSS projects showed that variety diversity of project roles was positively associated with user participation (Daniel et al., 2013). Specifically, the number of users participating in a project is higher when members are more equally spread across developers and active users.
In line with thinking of member differences as variety, our work adds the diversity attribute of specific information about an issue to the attributes of technical expertise and project roles. For example, when describing a crash, members, regardless of being a developer or a user, probably can contribute information about different parameters of the system environment in which the crash occurs. Moreover, while the OSS peer review process is different from feature discussions and coding and development reported in Barcellini et al.’s (2008) article, we observed that a group of active volunteers, formally bug triagers, serve a similar role for bridging user and developer sub-communities to resolve bugs. We discuss the design recommendations to support this role, which contributes to the literature on role emerging design.
2.3.3. Differences of other attributes
A relevant group of studies provided indirect evidence of the existence of member differences; however, none was focused on elaborating these differences, particularly how the varying types of member differences impact the collaborative process. Early surveys on OSS revealed the variant motivations among individual participants. They subsumed utilitarian reasons (own needs for software functions), joy and challenge, and reputation (Feller et al., 2005; Roberts et al., 2006). Work on negotiation and conflicts also hinted at different opinions participants might hold with respect to a wide range of issues. Sandusky and Gasser (2005) examined instances of negotiations in OSS bug finding and fixing processes in depth, showing with 3 concrete examples how negotiation in different contexts affects the sense-making and problem-solving processes. Research on Wikipedia, another high-profile example of large-scale online peer production, was mostly focused on characteristics of conflicts and negotiation among article authors. For instance, using history flow visualizations, Viégas et al. (2004) identified action patterns of users who had competing perspectives over article editing. Kittur et al. (2007) also used visualizations to model what properties of articles were highly likely to lead to conflicts as well as to cluster users into opinion groups.
The more direct efforts of analyzing member diversity centered on a few personal attributes that are relatively easy to quantify, such as tenure, language, and activity characteristics. In addition to variety of project roles, Daniel et al.’s (2013) study assessed the effects of diversity of tenure, languages, and activity levels on user participation. The results showed that tenure diversity (i.e., variance of the amount of time since one’s registration at SourceForge) positively affected user participation, whereas variation in participants’ activity levels was negatively associated with it. The authors did not find any significant impact of language diversity on user participation. In the context of Wikipedia, Chen et al. found that increased tenure diversity (i.e., variance of the amount of time since one’s first edit at Wikipedia) and interest diversity (i.e., variety of one’s edits in different topics) increase group productivity and decrease member withdrawal, but after a point increased tenure diversity will increase member withdrawal (Chen et al., 2010).
Our investigation complements the prior studies with a focus on member differences of other types, particularly informational diversity and value diversity, which are relatively difficult to gauge in a quantitative way. Analysis on these dimensions also enhances the understanding of those more quantifiable diversity types, like tenure. For instance, member differences with respect to knowledge about the community relate to tenure diversity. We found these knowledge differences may increase workload of developers but create opportunities of situated learning for relatively inexperienced participants.
3. Methods
3.1. Case selection and description
To address the richness and complexity of the differences among the massive number of participants in open source, we conducted a case study of Mozilla, a large and well-recognized OSS community. We focused our analysis on its core project, Firefox, expecting that its end-user orientation would provide substantial instances of member differences, such as different expertise and experience.
The Mozilla community consists of both employees from Mozilla Foundation and Mozilla Corporation and volunteers. Core developers, mostly employees, are the ones who contribute significantly to the project. They are divided into a variety of roles, including module owners, module peers, super reviewers, security group members, quality assurance members, and support team members. Each individual has to maintain and facilitate the evolution of his or her specific areas. For example, Firefox had a sub-module, Session Restore, which was supported by one sub-module owner and three peers. Some other developers also actively contribute to the project, but they are not affiliated with any of the organizations of Mozilla. Super reviewers assess significant architectural refactoring, any change to any API or pseudo-API, and all changes that affect how code modules interact. Peripheral members rarely participate in developing the software. They comprise end-users, web developers who design extensions or other applications on top of Mozilla’s technology, network administrators, third-party developers, and developers from competitor products.
The peer review process in Mozilla follows a strict review-then-commit fashion. Any changes (e.g., patches) to a module or sub-module had to be reviewed by its owner or peers. Similar to other OSS projects, Mozilla’s peer review involves a wide range of work and communication tools, including version control systems (i.e., Mercurial), bug tracking systems (i.e., Bugzilla), IRC, mailing lists, and wikis. The bug tracking system is the primary space for peer review in Mozilla, unlike some other projects that mainly rely on mailing lists for this purpose.
3.2. Data collection and sampling
Bug reports archived in Bugzilla, the bug tracking system Mozilla primarily uses for issue reporting, analyzing, and fixing, served as our major data source. Although our analysis focused on data from Bugzilla, our interpretation of those data was also informed by informal conversations with active contributors in the peer review process, examination of the design of Bugzilla, and relevant work artifacts that were shared publicly. These artifacts included 41 weekly meeting notes, 22 web documents describing work procedures and member roles, and 6 blog posts from individual members discussing Mozilla’s peer review processes.
We retrieved from Bugzilla the bug reports created between two stable releases of Firefox. This sampling strategy was intended to capture the possible behavioral changes near releases (Francalanci and Merlo, 2008). The retrieval was performed on July 28, 2010. It includes 7322 reports filed between the releases of Firefox 3.5 final (June 30, 2009) and 3.6 final (Jan 21, 2010). Additionally, we examined emails, blogs, wikis, documents, and bug reports that were mentioned or linked in the bug reports we retrieved.
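The retrieval window described above amounts to a date filter over report creation dates. The sketch below shows one way to express it; the record layout (`id`, `created`) is a hypothetical stand-in rather than Bugzilla's export format, and treating both release dates as inclusive bounds is an assumption, since the text does not say how reports filed on the release days were handled.

```python
from datetime import date

# Release dates as reported in the text.
FIREFOX_3_5_FINAL = date(2009, 6, 30)
FIREFOX_3_6_FINAL = date(2010, 1, 21)


def in_sampling_window(created, start=FIREFOX_3_5_FINAL, end=FIREFOX_3_6_FINAL):
    """Return True if a report was created between the two releases.

    Both bounds are inclusive here, which is an assumption.
    """
    return start <= created <= end


def select_reports(reports):
    """Keep only reports whose (hypothetical) 'created' field is in the window."""
    return [report for report in reports if in_sampling_window(report["created"])]


# Invented example records, for illustration only.
example = [
    {"id": 1, "created": date(2009, 6, 29)},
    {"id": 2, "created": date(2009, 6, 30)},
    {"id": 3, "created": date(2009, 12, 1)},
    {"id": 4, "created": date(2010, 1, 22)},
]
print([report["id"] for report in select_reports(example)])
```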
We used email addresses to identify contributors in Mozilla’s peer review process. Core developers were defined as the ones listed on Mozilla’s websites as module and sub-module owners and peers, Firefox development team members, super reviewers, security group members and members with additional security access, quality assurance team lead, and platform engineers during the time between Firefox 3.5 and Firefox 3.6 releases. 176 developers were identified in this category. Crowston and Scozzi (2002) suggested that it may be more useful to define core developers based on the amount of their actual contributions rather than their job titles, but our approach may better differentiate the values and knowledge of core developers from other volunteers.
Overall, the 7322 bug reports in our data collection were created by 5418 unique reporters. The number of bug reports created per reporter ranged from 1 to 82, with the median of 1 report per reporter. 1286 additional contributors participated by commenting in the discussion space of the bug reports. The number of bug reports contributed per contributor varied between 1 and 990, with the median of 1 report per contributor. Each bug report on average involved 2 Mozilla members in the discussion (median = 2; skewness = 6.765). This suggests that the peer review process was often a collaborative effort once a bug report was submitted, in which member differences are likely to surface.
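Descriptive statistics of this kind can be reproduced with a few lines of code. The sketch below computes the median and the moment-based (population) skewness of a list of per-report participant counts. The counts are invented for illustration, and the skewness estimator is an assumption: the text does not state which formula produced the reported value of 6.765, and bias-adjusted estimators used by statistical packages give slightly different numbers.

```python
from statistics import median


def skewness(values):
    """Moment-based skewness: m3 / m2 ** 1.5, using population moments."""
    n = len(values)
    mu = sum(values) / n
    m2 = sum((v - mu) ** 2 for v in values) / n
    m3 = sum((v - mu) ** 3 for v in values) / n
    if m2 == 0:
        raise ValueError("skewness is undefined when all values are equal")
    return m3 / m2 ** 1.5


# Invented participant counts per bug report: most reports attract very few
# people, while a handful attract many, giving a long right tail.
participants = [1, 1, 1, 2, 2, 2, 2, 3, 3, 30]

print(median(participants))        # middle of the distribution
print(skewness(participants) > 0)  # True for a right-skewed distribution
```

A large positive skewness alongside a small median indicates that a few heavily discussed reports pull the distribution to the right, which is the pattern the reported figures describe.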
To accommodate the possible variations of work processes and social interactions across bug reports, we performed stratified sampling with respect to bug status and resolutions. In Bugzilla, open bug reports had 4 types of status: unconfirmed, new, assigned, and reopened. Unconfirmed bug reports were bugs that had recently been added to the bug tracking system but not been validated to be true by anyone. In contrast, the other 3 types of open bug reports were all confirmed at least by someone, even though their validity might still be subject to others’ judgment. Closed bug reports had 6 types of resolutions: fixed, duplicate, invalid, incomplete, worksforme, and wontfix. Tables 1 and 2 summarize the distribution of the status of open bug reports and the resolutions of closed bug reports, respectively.
For each of the 10 types of bug status and resolutions, we sampled 10% of the reports, which returned 732 bug reports in total. The sample size was just an initial estimation for starting our qualitative analysis, allowing us to perform in-depth examination instead of being overwhelmed by voluminous texts as well as to reach saturation. It eventually turned out to be a satisfactory estimation. For each type, we first intentionally selected the bug reports with number of comments at the 98th percentile in order to observe fairly complex social dynamics; then for the rest of the sample we randomly chose reports with fewer comments. All bug reports in Mozilla were created by a human reporter; they included bugs detected by automated tests, which was the case for 70 bug reports in our retrieved data set and 21 sampled for the qualitative analysis. Thus, we treated all the bug reports the same when sampling them. Each bug report had its unique ID and often consisted of multiple comments. In the discussion space of each bug report, the first comment was labeled “description”, which was generated by the bug reporter to describe the issue. All the other comments were indexed in order starting from 1. When quoting discourses in the next section, we use the format (bug ID; comment number; contributor type).
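The sampling procedure above can be sketched as follows. This is an approximation under stated assumptions rather than the exact procedure used: the field names (`stratum`, `comments`) are hypothetical, "at the 98th percentile" is read as the top 2% of each stratum by comment count, and quotas are rounded half up, none of which the text specifies.

```python
import random
from collections import defaultdict


def stratified_sample(reports, sample_percent=10, top_percent=2, seed=0):
    """Sample each stratum: most-commented reports first, then random fill.

    Integer arithmetic avoids floating-point surprises when computing quotas.
    The rounding rules and the reading of the percentile are assumptions.
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for report in reports:
        strata[report["stratum"]].append(report)

    sample = []
    for name in sorted(strata):
        members = sorted(strata[name], key=lambda r: r["comments"], reverse=True)
        quota = (len(members) * sample_percent + 50) // 100          # round half up
        n_top = min(quota, (len(members) * top_percent + 99) // 100)  # ceiling
        top, rest = members[:n_top], members[n_top:]
        sample.extend(top)                              # heavily discussed reports
        sample.extend(rng.sample(rest, quota - n_top))  # random remainder
    return sample


# Invented data: 100 "fixed" reports and 20 "invalid" reports.
reports = [{"id": i, "stratum": "fixed", "comments": i} for i in range(100)]
reports += [{"id": 100 + i, "stratum": "invalid", "comments": i} for i in range(20)]
print(len(stratified_sample(reports)))
```

Fixing the random seed makes the draw repeatable, which is convenient when a sample has to be revisited during later coding rounds.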
3.3. Data analysis
We carried out our qualitative analysis through three phases over
the 732 bug reports with 8484 comments from 2742 contributors.
First, the first and the second authors randomly selected 50 bug
reports and read them separately. They discussed their observations
and established a shared understanding of the peer review process in
Mozilla. Then, during the second phase, the first author inductively
coded the 732 bug reports, iteratively generated 628 codes occurring
a total of 1623 times, and discussed and consolidated them with the
other two authors during their weekly meetings. We counted the
frequency of occurrences on the units that a code referred to. A unit
was a complete episode demonstrating the impacts of member
diversity on the OSS peer review process. It could be part of a
comment. It could also span several comments, because participants
sometimes had to comment on each other to clarify or elaborate
their points in our study context. For example, one episode we
coded as "repeating how to avoid invalid bug reports" included two
comments, from the bug reporter and an active contributor,
respectively. The reporter asked what information should be reported, and
the response comment provided links to documented instructions.
When one comment was divided into several units for different
codes, we assigned each unit only the single code that best described
the dynamics the unit demonstrated. Finally, we selected the 15 groups
of codes that were pertinent to the impacts of member differences,
occurred the most frequently, and were deemed the most relevant to
member differences. Overall, the 15 groups of codes occurred 424
times, accounting for 26.12% of total occurrences. These codes and
themes are mutually exclusive because each unit received a single code.
Therefore, no unit was counted more than once across
different codes. Then the authors integrated them into 6 themes.
Table 3 summarizes the 6 themes with their definitions and the 15
groups of sub-level codes with the number of their occurrences. We
further describe each theme in Section 4, in which the quotes appear
in the same order as the sub-level codes listed in Table 3.
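The counting rule above (one code per unit, so that no unit is counted twice) and the reported share of occurrences can be checked with a short sketch. The helper names and the toy unit identifiers are hypothetical; only the totals 424 and 1623 come from the text.

```python
from collections import Counter


def code_frequencies(unit_codes):
    """Count occurrences per code.

    unit_codes maps each unit id to its single code, so every unit
    contributes to exactly one code (mutually exclusive coding).
    """
    return Counter(unit_codes.values())


def share_of_total(selected, total):
    """Percentage of all code occurrences covered by the selected codes."""
    return round(100 * selected / total, 2)


# Toy data for illustration only (not from the study).
toy_units = {
    "unit-a": "repeating how to avoid invalid bug reports",
    "unit-b": "repeating how to avoid invalid bug reports",
    "unit-c": "some other code",
}
toy_counts = code_frequencies(toy_units)

# Totals reported in the text: 424 of 1623 occurrences.
reported_share = share_of_total(424, 1623)
print(f"{reported_share:.2f}%")
```

With the reported totals, the computed share rounds to 26.12%, which matches the figure given in the text.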
We identified the existence of member differences and their
various types by interpreting participants' discourse in the
peer review process. The interpretation was inductive in that
we did not bind it to a specific scheme but rather remained open to any
unique information or different value statement that emerged. Inferences
generated during this process were subsequently consolidated with
guidance from the work of Jehn et al. (1999) (informational diversity and
value diversity in particular) and Harrison and Klein (2007).
Adapting their definitions of different types of
diversity, we summarize the kinds of diversity that emerged from our
analysis in Table 4. For example, an episode in which a commenter
stated he could not program and another commenter indicated a
willingness to take over and submitted a patch would be identified as
disparity of proficiency in software engineering.
4. Results
Our qualitative analysis converged onto six major themes,
unfolding the impacts of member differences on OSS peer review
with respect to work performance and community engagement.
These impacts indicate both challenges and benefits for OSS
communities: challenges include increased workload as well as
frustration and conflicts, while benefits range from situated learning,
problem characterization, and design review to boundary spanning.
They are associated with disparity and variety of expertise,
information and resources as well as separation of values and
beliefs among members. We describe these associations individu-
ally in the following subsections.
4.1. Challenges
4.1.1. Increased workload associated with informational diversity
Much of the workload of the peer review participants in Mozilla went
into screening the large number of "non-real" bugs, as Table 2
suggests. Our qualitative examination found that these bugs were
largely associated with participants' informational diversity with
respect to levels of technical expertise and experience with the
standard software review practices and community norms. In our
retrieved bug reports, nearly half (n = 3406; 46.51%) lacked
attention or actions (i.e., open bug reports) to move forward in the peer
review process. In contrast, the "non-real" bugs had already
cost a significant amount of participants' effort (n = 3418; 46.68%).
These bugs constituted 87.28% of the closed bug reports, including
the redundant reports (duplicate), issues caused by reasons other
Table 1
Distribution of the status of open bug reports.

Status        Number of bugs   % of open bugs   % of all bugs
Unconfirmed   2680             78.68            36.60
New            676             19.85             9.23
Assigned        39              1.15             0.53
Reopened        11              0.32             0.15
Total         3406            100               46.51

Table 2
Distribution of the resolutions of closed bug reports.

Resolution    Number of bugs   % of closed bugs   % of all bugs
Fixed          498             12.72               6.80
Duplicate     1346             34.37              18.38
Invalid        937             23.93              12.80
Worksforme     514             13.13               7.02
Incomplete     489             12.49               6.68
Wontfix        132              3.37               1.80
Total         3916            100                 53.49
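The percentages quoted in the text can be recomputed from the counts in Tables 1 and 2. The sketch below is illustrative only; the variable names and the `pct` helper are assumptions. It treats the "non-real" bugs as all closed reports other than "fixed", which appears to be how the count of 3418 arises.

```python
# Counts taken from Tables 1 and 2.
open_counts = {"Unconfirmed": 2680, "New": 676, "Assigned": 39, "Reopened": 11}
closed_counts = {
    "Fixed": 498, "Duplicate": 1346, "Invalid": 937,
    "Worksforme": 514, "Incomplete": 489, "Wontfix": 132,
}


def pct(part, whole):
    """Percentage rounded to two decimals (hypothetical helper)."""
    return round(100 * part / whole, 2)


total_open = sum(open_counts.values())      # open bug reports
total_closed = sum(closed_counts.values())  # closed bug reports
total_all = total_open + total_closed       # all retrieved bug reports

# Assumption: "non-real" bugs are every closed resolution except "Fixed".
non_real = total_closed - closed_counts["Fixed"]

print(total_open, total_closed, total_all, non_real)
print(pct(non_real, total_closed), pct(non_real, total_all))
```

Under that assumption, the non-real bugs come to 3418 reports, or 87.28% of closed reports and 46.68% of all reports, in agreement with the figures in the text.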