
Review participation in modern code review: An empirical study of the Android, Qt, and OpenStack projects


Abstract

Software code review is a well-established software quality practice. Recently, Modern Code Review (MCR) has been widely adopted in both open source and proprietary projects. Our prior work shows that review participation plays an important role in MCR practices, since the amount of review participation shares a relationship with software quality. However, little is known about which factors influence review participation in the MCR process. Hence, in this study, we set out to investigate the characteristics of patches that: (1) do not attract reviewers, (2) are not discussed, and (3) receive slow initial feedback. Through a case study of 196,712 reviews spread across the Android, Qt, and OpenStack open source projects, we find that the amount of review participation in the past is a significant indicator of patches that will suffer from poor review participation. Moreover, we find that the description length of a patch shares a relationship with the likelihood of receiving poor reviewer participation or discussion, while the purpose of introducing new features can increase the likelihood of receiving slow initial feedback. Our findings suggest that the patches with these characteristics should be given more attention in order to increase review participation, which will likely lead to a more responsive review process.
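To make the kind of analysis described above concrete, the following is a minimal, illustrative sketch (in Python, with entirely synthetic data and hypothetical column names) of fitting a logistic regression that relates patch-level factors such as description length and past participation to whether a patch attracts no reviewers; it is not the paper's actual model or dataset.

# Illustrative sketch only: not the paper's actual models or data.
# Relates hypothetical patch-level factors to whether a patch attracted
# zero reviewers, mirroring the explanatory modeling the abstract describes.
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 1000
patches = pd.DataFrame({
    # Hypothetical explanatory factors named after those discussed in the abstract.
    "description_length": rng.integers(0, 400, n),    # words in the patch description
    "past_reviewer_count": rng.integers(0, 20, n),    # reviewers of prior patches touching these files
    "is_new_feature": rng.integers(0, 2, n),          # purpose of the patch
})
# Synthetic outcome: patches with short descriptions and little past
# participation are more likely to attract no reviewers.
logit = (-1.5 - 0.005 * patches["description_length"]
         - 0.2 * patches["past_reviewer_count"] + 0.3 * patches["is_new_feature"])
patches["no_reviewers"] = rng.random(n) < 1 / (1 + np.exp(-logit))

features = ["description_length", "past_reviewer_count", "is_new_feature"]
model = LogisticRegression().fit(patches[features], patches["no_reviewers"])
print(dict(zip(features, model.coef_[0])))  # sign/magnitude of each factor's association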
... Neutron (providing networking as a service for interface devices) and Nova (a controller for providing cloud virtual servers) are mainly written in Python; Qt Base (offering the core UI functionality) and Qt Creator (the Qt IDE) are mainly developed in C++. The four selected projects are the most active projects in the OpenStack and Qt communities, respectively, and they are widely known for the wealth of code review data recorded in the Gerrit code review system [27,28]. ...
Article
Context: As a software system evolves, its architecture tends to degrade, gradually impeding software maintenance and evolution activities and negatively impacting the system's quality attributes. The main root cause of the architecture erosion phenomenon lies in violation symptoms (i.e., various architecturally-relevant violations, such as violations of an architecture pattern). Previous studies focus on detecting violations in software systems using architecture conformance checking approaches. However, code review comments are also rich sources that may contain extensive discussions of architecture violations, yet there is limited understanding of violation symptoms from the developers' viewpoint. Objective: In this work, we investigated the characteristics of architecture violation symptoms in code review comments from the developers' perspective. Method: We employed a set of keywords related to violation symptoms to collect 606 (out of 21,583) code review comments from four popular OSS projects in the OpenStack and Qt communities. We manually analyzed the collected 606 review comments to derive the categories and linguistic patterns of violation symptoms, as well as how developers reacted to and addressed them. Results: Our findings show that: (1) three main categories of violation symptoms are discussed by developers during the code review process; (2) the terms most frequently used to express violation symptoms are "inconsistent" and "violate", and the most common linguistic pattern is Problem Discovery; (3) refactoring and removing code are the main measures (90%) taken to tackle violation symptoms, while a few violation symptoms were ignored by developers. Conclusions: Our findings suggest that investigating violation symptoms can help researchers better understand the characteristics of architecture erosion and facilitate development and maintenance activities, and that developers should explicitly manage violation symptoms, not only to address existing architecture violations but also to prevent future ones.
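As an illustration of the keyword-based collection step described in the Method above, the following is a minimal sketch; the keyword list is a hypothetical subset inspired by terms named in the abstract, not the authors' actual search set.

# Illustrative keyword filter, assuming review comments are available as plain strings;
# the keyword list here is an assumed subset inspired by terms named in the abstract.
import re

VIOLATION_KEYWORDS = ["violat", "inconsistent", "layer", "architecture"]  # stems/terms, assumed
pattern = re.compile("|".join(map(re.escape, VIOLATION_KEYWORDS)), re.IGNORECASE)

def candidate_violation_comments(comments):
    """Return review comments that mention any violation-related keyword."""
    return [c for c in comments if pattern.search(c)]

comments = [
    "This call violates the plugin/driver separation, move it down a layer.",
    "Nit: trailing whitespace.",
    "Naming is inconsistent with the rest of the module.",
]
print(candidate_violation_comments(comments))  # two candidates survive the filter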
... For the exploratory study, we used four active projects from two large and popular open-source communities: OpenStack and Qt. OpenStack is a set of software components that provide common services for cloud infrastructure and receives contributions from many well-known software companies, e.g., IBM, VMware, and NEC (Thongtanunam et al., 2017). Qt is a cross-platform tool and UI framework for creating graphical user interfaces and developing multi-platform applications. ...
Preprint
Code review is widely known as one of the best practices for software quality assurance in software development. In a typical code review process, reviewers check the code committed by developers to ensure its quality, and reviewers and developers communicate with each other in review comments to exchange necessary information. As a result, understanding the information in review comments is a prerequisite for reviewers and developers to conduct an effective code review. Code snippets, as a special form of code, can be used to convey necessary information in code reviews. For example, reviewers can use code snippets to make suggestions or elaborate on their ideas to meet developers' information needs in code reviews. However, little research has focused on the practices of providing code snippets in code reviews. To bridge this gap, we conducted a mixed-methods study to mine information and knowledge related to code snippets in code reviews, which can help practitioners and researchers better understand how code snippets are used in code review. Specifically, our study includes two phases: mining code review data and conducting a practitioners' survey. The results highlight that reviewers can provide code snippets in appropriate scenarios to meet developers' specific information needs in code reviews, which can facilitate and accelerate the code review process.
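To illustrate the mining phase, the following is a toy heuristic for flagging review comments that likely contain a code snippet; the patterns are assumptions and do not reproduce the study's actual identification procedure.

# Toy heuristic for flagging review comments that likely contain a code snippet;
# the real study's identification procedure is not reproduced here.
import re

SNIPPET_HINTS = [
    re.compile(r"```"),                  # fenced block (markdown-style)
    re.compile(r"^\s{4,}\S", re.M),      # indented block
    re.compile(r"\b\w+\([^)]*\)\s*;?"),  # something that looks like a function call
]

def looks_like_snippet(comment: str) -> bool:
    return any(p.search(comment) for p in SNIPPET_HINTS)

print(looks_like_snippet("Maybe use dict.get(key, default) here instead?"))  # True
print(looks_like_snippet("Please rebase onto master."))                      # False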
Article
Increasing code velocity is a common goal for a variety of software projects. The efficiency of the code review process significantly impacts how fast the code gets merged into the final product and reaches customers. We conducted a qualitative survey to study the code velocity-related beliefs and practices in place. We analyzed 75 completed surveys from participants in the industry and 36 from the open-source community. Our critical findings are (a) the industry and the open-source community hold a similar set of beliefs, (b) quick reaction time is of utmost importance and applies to both the tooling infrastructure and the behavior of other engineers, (c) time-to-merge is the essential code review metric to improve, (d) engineers are divided about the benefits of increased code velocity for their career growth, and (e) the controlled application of the commit-then-review model can increase code velocity. Our study supports the continued need to invest in and improve code velocity regardless of the underlying organizational ecosystem.
Article
Context: Because code reviews (CR) involve significant effort, even a minor improvement in their effectiveness can yield significant savings for a software development organization. Objective: This study aims to develop a finer-grained understanding of what makes a code review comment useful to OSS developers, to what extent a code review comment is considered useful to them, and how various contextual and participant-related factors influence its degree of usefulness. Method: To this end, we conducted a three-stage mixed-method study. We randomly selected 2,500 CR comments from the OpenDev Nova project and manually categorized the comments. We designed a survey of OpenDev developers to better understand their perspectives on useful CRs. Combining our survey-obtained scores with our manually labeled dataset, we trained two regression models: one to identify factors that influence the usefulness of CR comments, and the other to identify factors that improve the odds of 'Functional' defect identification over the others. Results: The results of our study suggest that a CR comment's usefulness is dictated not only by its technical contributions, such as defect findings or quality improvement tips, but also by its linguistic characteristics, such as comprehensibility and politeness. While a reviewer's coding experience is positively associated with CR usefulness, the number of mutual reviews, comment volume in a file, the total number of lines added/modified, and CR interval have the opposite associations. While authorship and reviewership experience for the files under review have been the most popular attributes for reviewer recommendation systems, we do not find any significant association of those attributes with CR usefulness. Conclusion: We recommend discouraging frequent code review associations between two individuals, as such associations may decrease CR usefulness. We also recommend authoring CR comments in a constructive and empathetic tone. As several of our results deviate from prior studies, we recommend further investigation to identify context-specific attributes for building reviewer recommendation models.
Article
Open source software development has become more social and collaborative, as is evident on GitHub. Since 2016, GitHub has supported more informal methods such as emoji reactions, with the goal of reducing commenting noise when reviewing code changes to a repository. From a code review context, the extent to which emoji reactions facilitate a more efficient review process is unknown. We conduct an empirical study mining 1,850 active repositories across seven popular languages to analyze 365,811 Pull Requests (PRs) for their emoji reactions against review time, first-time contributors, comment intentions, and the consistency of sentiments. Answering these four research perspectives, we first find that the number of emoji reactions has a significant correlation with review time. Second, our results show that a PR submitted by a first-time contributor is less likely to receive emoji reactions. Third, the results reveal that comments with an intention of information giving are more likely to receive an emoji reaction. Fourth, we observe that only a small proportion of sentiments are inconsistent between comments and emoji reactions, i.e., 11.8% of instances. In these cases, the prevalent reason is reviewers cheering up authors who admit to a mistake. Apart from reducing commenting noise, our work suggests that emoji reactions play a positive role in facilitating collaborative communication during the review process.
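The correlation analysis reported in the first finding can be sketched as follows, using synthetic pull request data rather than the 365,811 PRs mined in the study.

# Illustrative sketch of the kind of correlation analysis the abstract reports,
# using synthetic PR data instead of the mined repositories.
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(1)
n = 500
emoji_reactions = rng.poisson(1.5, n)                              # reactions per PR (synthetic)
review_hours = np.exp(rng.normal(3, 1, n)) + 2 * emoji_reactions   # synthetic review time

rho, p = spearmanr(emoji_reactions, review_hours)
print(f"Spearman rho={rho:.2f}, p={p:.3g}")  # a monotonic association, as in the study's first finding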
Conference Paper
Background: Despite the widespread use of automated security defect detection tools, software projects still contain many security defects that could result in serious damage. Such tools are largely context-insensitive and may not cover all possible scenarios in testing potential issues, which makes them susceptible to missing complex security defects. Hence, thorough detection entails a synergistic cooperation between these tools and human-intensive detection techniques, including code review. Code review is widely recognized as a crucial and effective practice for identifying security defects. Aim: This work aims to empirically investigate security defect detection through code review. Method: To this end, we conducted an empirical study by analyzing code review comments derived from four projects in the OpenStack and Qt communities. Through manually checking 20,995 review comments obtained by keyword-based search, we identified 614 comments as security-related. Results: Our results show that (1) security defects are not prevalently discussed in code review, (2) more than half of the reviewers provided explicit fixing strategies/solutions to help developers fix security defects, (3) developers tend to follow reviewers' suggestions and act on the changes, and (4) "not worth fixing the defect now" and "disagreement between the developer and the reviewer" are the main causes for not resolving security defects. Conclusions: Our results demonstrate that (1) software security practices should combine manual code review with automated detection tools to achieve more comprehensive coverage in identifying and addressing security defects, and (2) promoting appropriate standardization of practitioners' behaviors during code review remains necessary for enhancing software security.
Article
Modern Code Review (MCR) is being adopted in both open-source and proprietary projects as a common practice. MCR is a widely acknowledged quality assurance practice that allows early detection of defects as well as poor coding practices. It also brings several other benefits such as knowledge sharing, team awareness, and collaboration. For a successful review process, peer reviewers should perform their review tasks promptly while providing relevant feedback about the code change being reviewed. In practice, however, code reviews can experience significant delays due to various socio-technical factors, which can affect project quality and cost. Moreover, existing MCR frameworks lack tool support to help developers estimate the time required to complete a code review before accepting or declining a review request. In this paper, we aim to build and validate an automated approach to predict code review completion time in the context of MCR. We believe that the predictions of our approach can improve the engagement of developers by raising their awareness of potential delays while doing code reviews. To this end, we formulate the prediction of code review completion time as a learning problem. In particular, we propose a framework of regression machine learning (ML) models built on 69 features spanning 8 dimensions to (i) effectively estimate code review completion time, and (ii) investigate the main factors influencing it. We conduct an empirical study on more than 280K code reviews spanning five projects hosted on Gerrit. Results indicate that the ML models significantly outperform baseline approaches, with a relative improvement ranging from 7% to 49%. Furthermore, our experiments show that features related to the date of the code review request, the prior activities of the owner and reviewers, and the history of their interactions are the most important. Our approach can help further engage the change owner and reviewers by raising their awareness of potential delays based on the predicted code review completion time.
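A minimal sketch of framing review completion time prediction as a regression problem is shown below; the feature names are hypothetical stand-ins for the paper's 69 features, and the model choice and data are illustrative assumptions.

# Illustrative regression setup on synthetic data; feature names are assumed
# stand-ins for the dimensions the abstract describes, not the paper's features.
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error

rng = np.random.default_rng(2)
n = 2000
X = pd.DataFrame({
    "submitted_weekday": rng.integers(0, 7, n),
    "owner_prior_reviews": rng.integers(0, 200, n),
    "reviewer_owner_past_interactions": rng.integers(0, 50, n),
    "patch_churn": rng.integers(1, 1000, n),
})
# Synthetic target (hours to completion), loosely tied to the features.
y = (24 + 0.05 * X["patch_churn"]
     - 0.2 * X["reviewer_owner_past_interactions"] + rng.normal(0, 10, n))

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X_train, y_train)
print("MAE:", mean_absolute_error(y_test, model.predict(X_test)))
print(dict(zip(X.columns, model.feature_importances_.round(3))))  # which features matter most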
Conference Paper
Software code review is a well-established software quality practice. Recently, Modern Code Review (MCR) has been widely adopted in both open source and proprietary projects. To evaluate the impact that characteristics of MCR practices have on software quality, this paper comparatively studies MCR practices in defective and clean source code files. We investigate defective files along two perspectives: 1) files that will eventually have defects (i.e., future-defective files) and 2) files that have historically been defective (i.e., risky files). Through an empirical study of 11,736 reviews of changes to 24,486 files from the Qt open source project, we find that both future-defective files and risky files tend to be reviewed less rigorously than their clean counterparts. We also find that the concerns addressed during the code reviews of both defective and clean files tend to enhance evolvability, i.e., ease future maintenance (like documentation), rather than focus on functional issues (like incorrect program logic). Our findings suggest that although functionality concerns are rarely addressed during code review, the rigor of the reviewing process that is applied to a source code file throughout a development cycle shares a link with its defect proneness.
Conference Paper
Code ownership establishes a chain of responsibility for modules in large software systems. Although prior work uncovers a link between code ownership heuristics and software quality, these heuristics rely solely on the authorship of code changes. In addition to authoring code changes, developers also make important contributions to a module by reviewing code changes. Indeed, recent work shows that reviewers are highly active in modern code review processes, often suggesting alternative solutions or providing updates to the code changes. In this paper, we complement traditional code ownership heuristics using code review activity. Through a case study of six releases of the large Qt and OpenStack systems, we find that: (1) 67%--86% of developers did not author any code changes for a module, but still actively contributed by reviewing 21%--39% of the code changes, (2) code ownership heuristics that are aware of reviewing activity share a relationship with software quality, and (3) the proportion of reviewers without expertise shares a strong, increasing relationship with the likelihood of having post-release defects. Our results suggest that reviewing activity captures an important aspect of code ownership, and should be included in approximations of it in future studies.
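A review-aware contribution profile of the kind the abstract argues for can be computed along these lines; the data structures and numbers are toy examples, not the paper's heuristics.

# Illustrative computation of a review-aware contribution profile for a module:
# the share of a module's changes each developer authored versus reviewed.
from collections import defaultdict

# Each change record: (module, author, [reviewers])  -- toy data
changes = [
    ("net/driver", "alice", ["bob", "carol"]),
    ("net/driver", "bob",   ["alice"]),
    ("net/driver", "alice", ["carol"]),
]

authored = defaultdict(int)
reviewed = defaultdict(int)
for module, author, reviewers in changes:
    authored[author] += 1
    for r in reviewers:
        reviewed[r] += 1

total = len(changes)
for dev in sorted(set(authored) | set(reviewed)):
    print(f"{dev}: authored {authored[dev]/total:.0%}, reviewed {reviewed[dev]/total:.0%}")
# carol authored nothing, yet reviewed 67% of the module's changes --
# the kind of contribution that authorship-only ownership heuristics miss.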
Article
Code review is the process of having other team members examine changes to a software system in order to evaluate its technical content and quality. A lightweight variant of this practice, often referred to as Modern Code Review (MCR), is widely adopted by software organizations today. Previous studies have established a relation between the practice of code review and the occurrence of post-release bugs. While the prior work studies the impact of code review practices on software release quality, it is still unclear what impact code review practices have on software design quality. Therefore, using the occurrence of 7 different types of anti-patterns (i.e., poor solutions to design and implementation problems) as a proxy for software design quality, we set out to investigate the relationship between code review practices and software design quality. Through a case study of the Qt, VTK and ITK open source projects, we find that software components with low review coverage or low review participation are often more prone to the occurrence of anti-patterns than those components with more active code review practices.
Conference Paper
Open source software projects often rely on code contributions from a wide variety of developers to extend the capabilities of their software. Project members evaluate these contributions and often engage in extended discussions to decide whether to integrate changes. These discussions have important implications for project management regarding new contributors and evolution of project requirements and direction. We present a study of how developers in open work environments evaluate and discuss pull requests, a primary method of contribution in GitHub, analyzing a sample of extended discussions around pull requests and interviews with GitHub developers. We found that developers raised issues around contributions over both the appropriateness of the problem that the submitter attempted to solve and the correctness of the implemented solution. Both core project members and third-party stakeholders discussed and sometimes implemented alternative solutions to address these issues. Different stakeholders also influenced the outcome of the evaluation by eliciting support from different communities such as dependent projects or even companies. We also found that evaluation outcomes may be more complex than simply acceptance or rejection. In some cases, although a submitter's contribution was rejected, the core team fulfilled the submitter's technical goals by implementing an alternative solution. We found that the level of a submitter's prior interaction on a project changed how politely developers discussed the contribution and the nature of proposed alternative solutions.
Article
Peer code review is a software quality assurance activity followed in several open-source and closed-source software projects. Rietveld and Gerrit are the most popular peer code review systems used by open-source software projects. Despite the popularity and usefulness of these systems, they do not record or maintain cost and effort information for a submitted patch review activity. Currently, there are no formal or standard metrics available to measure the effort and contribution of a patch review activity. We hypothesize that the size and complexity of modified files and patches are significant indicators of the effort and contribution of patch reviewers in a patch review process. We propose a metric for computing the effort and contribution of a patch reviewer based on modified file size, patch size, and program complexity variables. We conduct a survey of developers involved in peer code review activity to test our hypothesis of a causal relationship between the proposed indicators and effort. We employ the proposed model and conduct an empirical analysis using the proposed metrics on the open-source Google Android project.
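For illustration only, one plausible shape of such an effort indicator is sketched below; the weights and functional form are assumptions, not the authors' proposed metric.

# Illustrative only: one plausible shape of an effort indicator built from the
# variables the abstract names (modified file size, patch size, complexity).
# The weights and functional form are assumptions, not the authors' metric.
import math

def review_effort(file_loc: int, patch_loc: int, cyclomatic_complexity: int,
                  w_file: float = 0.2, w_patch: float = 0.5, w_cc: float = 0.3) -> float:
    """Weighted combination of log-scaled size and complexity indicators."""
    return (w_file * math.log1p(file_loc)
            + w_patch * math.log1p(patch_loc)
            + w_cc * math.log1p(cyclomatic_complexity))

print(review_effort(file_loc=1200, patch_loc=85, cyclomatic_complexity=14))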
Article
Code review is an important part of the software development process. Recently, many open source projects have begun practicing code review through 'modern' tools such as GitHub pull requests and Gerrit. Many commercial software companies use similar tools for code review internally. These tools enable the owner of a source code change to request individuals to participate in the review, i.e., reviewers. However, this task comes with a challenge. Prior work has shown that the benefits of code review depend on the expertise of the reviewers involved. Thus, a common problem faced by authors of source code changes is identifying the best reviewers for their change. To address this problem, we present an approach, named cHRev, to automatically recommend reviewers who are best suited to participate in a given review, based on their historical contributions as demonstrated in their prior reviews. We evaluate the effectiveness of cHRev on three open source systems as well as a commercial codebase at Microsoft and compare it to the state of the art in reviewer recommendation. We show that by leveraging the specific information in previously completed reviews (i.e., the quantification of review comments and their recency), we improve dramatically on the performance of prior approaches, which operate only on generic review information (i.e., reviewers of similar source code file and path names) or source code repository data. We also present insights into why cHRev outperforms the existing approaches.
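A simplified scoring in the spirit of what the abstract describes (counting a candidate's prior review comments on the changed files and weighting them by recency) might look as follows; the decay function and data are illustrative assumptions, not the actual cHRev model.

# Simplified, assumption-laden sketch: rank candidate reviewers by how many
# comments they previously left on the files touched by a new change,
# weighted by recency. The exact cHRev model is more elaborate.
from collections import defaultdict
from datetime import date

# Past review comments: (reviewer, file, comment_date)  -- toy data
history = [
    ("dana", "ui/view.cpp", date(2024, 5, 1)),
    ("dana", "ui/view.cpp", date(2024, 6, 15)),
    ("eli",  "ui/view.cpp", date(2023, 1, 10)),
    ("eli",  "core/db.py",  date(2024, 6, 1)),
]

def recommend(changed_files, history, today=date(2024, 7, 1), half_life_days=180):
    scores = defaultdict(float)
    for reviewer, path, when in history:
        if path in changed_files:
            age = (today - when).days
            scores[reviewer] += 0.5 ** (age / half_life_days)  # recency decay
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

print(recommend({"ui/view.cpp"}, history))  # dana ranks above eli: more recent activity on the file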