Figure - uploaded by Dong Wang
The five most common domains in OpenStack and Qt. review.openstack.org and codereview.qt-project.org are the most common domains within the OpenStack and Qt
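A tally of link domains like the one summarized in this caption could be computed roughly as follows. This is a minimal sketch, not the study's actual procedure: the sample URLs and the `assumed_internal` set are hypothetical, and which domains count as internal to a project is an assumption made here for illustration.

```python
from collections import Counter
from urllib.parse import urlparse


def domain_counts(links):
    """Count how often each domain (network location) appears in a list of URLs."""
    return Counter(urlparse(link).netloc.lower() for link in links)


def internal_share(links, internal_domains):
    """Return the fraction of links whose domain is in internal_domains."""
    if not links:
        return 0.0
    counts = domain_counts(links)
    internal = sum(n for domain, n in counts.items() if domain in internal_domains)
    return internal / len(links)


# Hypothetical sample; these URLs are illustrative, not taken from the study's data.
sample_links = [
    "https://review.openstack.org/#/c/12345/",
    "https://review.openstack.org/#/c/67890/",
    "https://bugs.launchpad.net/nova/+bug/1",
    "https://stackoverflow.com/questions/1",
]
# Assumption: the set of domains treated as internal to the project.
assumed_internal = {"review.openstack.org", "bugs.launchpad.net"}

top_domains = domain_counts(sample_links).most_common(5)
share = internal_share(sample_links, assumed_internal)
```

Here `top_domains` lists the most frequent domains first, and `share` is the internal-link ratio for the sample.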
Source publication
Code reviews serve as a quality assurance activity for software teams. Especially for Modern Code Review, sharing a link during a review discussion serves as an effective awareness mechanism where "Code reviews are good FYIs [for your information]." Although prior work has explored link sharing and the information needs of a code review, the exten...
Contexts in source publication
Context 1
... answer RQ1, we analyze (1) the trend of link sharing (i.e., how often reviews have shared links over time), (2) the common domains of the shared links, and (3) the types of link targets. Figures 4, 5, and Table 4 show the results of our analysis which is described in Section 3.3. We now discuss our results below. ...
Context 2
... Domain in the Project. 93% and 80% of links that are shared in code reviews are internal links. Table 4 shows the ratio between internal and external links that are shared in OpenStack and Qt. We find the majority of links that are shared in the code reviews are internal links (i.e., links that are directly related to the projects). ...
Context 3
... find the majority of links that are shared in the code reviews are internal links (i.e., links that are directly related to the projects). More specifically, Table 4 shows that 93% of links shared in OpenStack reviews are internal links, while only 7% of the links are external links (i.e., not directly related to OpenStack). Qt also has a similar ratio, where 80% of the links shared in Qt reviews are internal, and 20% of the links are external. ...
Context 4
... results indicate that links that are often shared in the review discussion are directly related to the project. In addition, Table 4 shows that the most common domains for the internal links are review.openstack.org and codereview.qt-project.org, which account for 51% and 79% of the internal links shared in OpenStack and Qt, respectively. ...
Similar publications
[Background/Context] The continuous inflow of bug reports is a considerable challenge in large development projects. Inspired by contemporary work on mining software repositories, we designed a prototype bug assignment solution based on machine learning in 2011-2016. The prototype evolved into an internal Ericsson product, TRR, in 2017-2018. TRR's...
Citations
... The authors of the study find that some of the information needs can be satisfied by current tools and research results, but some aspects seem not to be solved yet and need further investigation. Studies investigated the use of links in review comments [143,259]. A case study of the OpenStack and Qt projects indicated that the links provided in code review discussion served as an important resource to fulfill various information needs such as providing context and elaborating patch information [259]. ...
... Studies investigated the use of links in review comments [143,259]. A case study of the OpenStack and Qt projects indicated that the links provided in code review discussion served as an important resource to fulfill various information needs such as providing context and elaborating patch information [259]. Jiang et al. [143] found that 5.25% of pull requests in 10 popular open source projects have links. ...
Background: Modern Code Review (MCR) is a lightweight alternative to traditional code inspections. While secondary studies on MCR exist, it is unknown whether the research community has targeted themes that practitioners consider important.
Objectives: The objectives are to provide an overview of MCR research, analyze the practitioners’ opinions on the importance of MCR research, investigate the alignment between research and practice, and propose future MCR research avenues.
Method: We conducted a systematic mapping study to survey state-of-the-art until and including 2021, employed the Q-Methodology to analyze the practitioners’ perception of the relevance of MCR research, and analyzed the primary studies’ research impact.
Results: We analyzed 244 primary studies, resulting in five themes. As a result of the 1300 survey data points, we found that the respondents are positive about research investigating the impact of MCR on product quality and MCR process properties. In contrast, they are negative about human factor- and support systems-related research.
Conclusion: These results indicate a misalignment between the state-of-the-art and the themes deemed important by most survey respondents. Researchers should focus on solutions that can improve the state of MCR practice. We provide an MCR research agenda, which can potentially increase the impact of MCR research.
... Ebert et al. [5] observed that the inclusion of more people in the code review increases their awareness of the code change, i.e., confusion resolution contributes to knowledge sharing. Recently, Wang et al. [6] observed that developers are likely to share links during review discussions with several intentions to fulfill information needs. Meanwhile, Hirao et al. [7] shed light that the patch linkage (i.e., posting a patch link to another patch) is used to indicate patch dependency, competing solutions, or provide broader context. ...
... OpenStack is an open-source software ecosystem where many well-known organizations and companies, e.g., IBM, VMware, and NEC, collaboratively develop a platform for cloud computing. OpenStack actively performs code reviews through Gerrit, a tool-based code review tool, and is widely studied in the prior work [6], [8], [9]. ...
... Specifically, we use the list of the automated tools that is provided in the work of Thongtanunam et al. [10]. Extract Patch Linkage. To identify the patch links, similar to prior work [6], we applied the regular expression to search all messages in the review discussions that include a patch URL in the following format: https?://review.openstack|opendev.org/#/c/[1-9]+[0-9]*. ...
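The patch-link search described in this excerpt can be sketched as below. The pattern is an assumed reading of the quoted format, not the authors' exact expression: the alternation between `openstack` and `opendev` is grouped, the dots are escaped, and the trailing part is taken to be a positive change number. The function name `extract_patch_ids` and the sample comment are hypothetical.

```python
import re

# Assumed reading of the quoted format: grouped host alternation, escaped dots,
# and a change number that starts with a non-zero digit.
PATCH_LINK = re.compile(
    r"https?://review\.(?:openstack|opendev)\.org/#/c/([1-9]+[0-9]*)"
)


def extract_patch_ids(message):
    """Return the change numbers of all patch links found in a review message."""
    return [int(change_id) for change_id in PATCH_LINK.findall(message)]


# Hypothetical review comment, written for illustration only.
comment = (
    "This depends on https://review.openstack.org/#/c/12345/ and duplicates "
    "http://review.opendev.org/#/c/678 (see also https://example.org/#/c/9)."
)
patch_ids = extract_patch_ids(comment)
```

Without the grouping, the `|` would split the whole expression into two unrelated alternatives, which is why the sketch makes the grouping explicit.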
Contemporary development projects benefit from code review as it improves the quality of a project. Large ecosystems of inter-dependent projects like OpenStack generate a large number of reviews, which poses new challenges for collaboration (improving patches, fixing defects). Review tools allow developers to link between patches, to indicate patch dependency, competing solutions, or provide broader context. We hypothesize that such patch linkage may also simulate cross-collaboration. With a case study of OpenStack, we take a first step to explore collaborations that occur after a patch linkage was posted between two patches (i.e., cross-patch collaboration). Our empirical results show that although patch linkage that requests collaboration is relatively less prevalent, the probability of collaboration is relatively higher. Interestingly, the results also show that collaborative contributions via patch linkage are non-trivial, i.e., contributions can affect the review outcome (such as voting) or even improve the patch (i.e., revising). This work opens up future directions to understand barriers and opportunities related to this new kind of collaboration, which assists with code review and development tasks in large ecosystems.
... To do so, we classify the different state changes of a PR. Similar to prior work (Wang et al., 2021), a PR can be in one of several states: Accepted PR, where the PR has been closed and merged, and Abandoned PR, where the PR has been closed but has not been merged. After identifying the state of each PR, we performed two analyses. ...
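The two PR states named in this excerpt can be expressed as a small classification rule. This is a minimal sketch under stated assumptions: the `PullRequest` record and its field names are hypothetical, and treating still-open PRs as having no final state is a choice made here, not something the excerpt specifies.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PullRequest:
    # Hypothetical minimal record; the field names are assumptions for this sketch.
    number: int
    closed: bool
    merged: bool


def classify_state(pr: PullRequest) -> Optional[str]:
    """Map a PR to 'Accepted' or 'Abandoned'; open PRs have no final state yet."""
    if not pr.closed:
        return None
    return "Accepted" if pr.merged else "Abandoned"


# Illustrative records only.
states = [
    classify_state(PullRequest(number=1, closed=True, merged=True)),
    classify_state(PullRequest(number=2, closed=True, merged=False)),
    classify_state(PullRequest(number=3, closed=False, merged=False)),
]
```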
The risk of using third-party libraries in a software application is that much needed maintenance is solely carried out by library maintainers. These libraries may rely on a core team of maintainers (who might be a single maintainer that is unpaid and overworked) to serve a massive client user-base. On the other hand, being open source has the benefit of receiving contributions (in the form of External PRs) to help fix bugs and add new features. In this paper, we investigate the role by which External PRs (contributions from outside the core team of maintainers) contribute to a library. Through a preliminary analysis, we find that External PRs are prevalent, and just as likely to be accepted as maintainer PRs. We find that 26.75% of External PRs submitted fix existing issues. Moreover, fixes also belong to labels such as breaking changes, urgent, and on-hold. Unlike Internal PRs, External PRs cover documentation changes (44 out of 384 PRs), while not having as much refactoring (34 out of 384 PRs). On the other hand, External PRs also cover new features (380 out of 384 PRs) and bugs (120 out of 384). Our results lay the groundwork for understanding how maintainers decide which external contributions they select to evolve their libraries and what role they play in reducing the workload.
... Qt is a cross-platform application and UI framework developed by the Digia corporation, but welcomes contributions from the community at large. Since these two communities have made a big investment in code reviews for several years [15] and are widely used in many studies related to code reviews [12,34], we deemed them to be appropriate and representative for our analysis. The OpenStack and Qt communities are composed of several projects, and we selected one of the most active projects from each community (based on the highest number of closed code changes), i.e., Nova from OpenStack and Qt Base from Qt. ...
Background: Technical Debt (TD) refers to the situation where developers make trade-offs to achieve short-term goals at the expense of long-term code quality, which can have a negative impact on the quality of software systems. In the context of code review, such sub-optimal implementations have chances to be timely resolved during the review process before the code is merged. Therefore, we could consider them as Potential Technical Debt (PTD) since PTD will evolve into TD when it is injected into software systems without being resolved.
Aim: To date, little is known about the extent to which PTD is identified in code reviews. Many tools have been provided to detect TD, but these tools lack consensus and a large amount of PTD are undetectable by tools while code review could help verify the quality of code that has been committed by identifying issues, such as PTD. To this end, we conducted an exploratory study in an attempt to understand the nature of PTD in code reviews and track down the resolution of PTD after being identified.
Method: We randomly collected 2,030 review comments from the Nova project of OpenStack and the Qt Base project of Qt. We then manually checked these review comments, and obtained 163 PTD-related review comments for further analysis.
Results: Our results show that: (1) PTD can be identified in code reviews but is not prevalent. (2) Design, defect, documentation, requirement, test, and code PTD are identified in code reviews, in which code and documentation PTD are the dominant. (3) 81.0% of the PTD identified in code reviews has been resolved by developers, and 78.0% of the resolved TD was resolved by developers within a week. (4) Code refactoring is the main practice used by developers to resolve the PTD identified in code reviews.
Conclusions: Our findings indicate that: (1) review-based detection of PTD is seen as one of the trustworthy mechanisms in development, and (2) there is still a significant proportion of PTD (19.0%) remaining unresolved when injected into the software systems. Practitioners and researchers should establish effective strategies to manage and resolve PTD in development.
... A recent study has been conducted to study the practice of link sharing and their intentions in code reviews, and to explore what information could be provided through link sharing [13]. Similar to links, code snippets are also considered as one of the measures to convey necessary information during the code review process. ...
... They identified the presence of seven high-level reviewers' information needs. Wang et al. investigated link sharing and their intentions in code reviews [13]. They identified seven intentions behind link sharing in code reviews, in which providing context and elaborating are the most common intentions. ...
Code review is a mature practice for software quality assurance in software development with which reviewers check the code that has been committed by developers, and verify the quality of code. During the code review discussions, reviewers and developers might use code snippets to provide necessary information (e.g., suggestions or explanations). However, little is known about the intentions and impacts of code snippets in code reviews. To this end, we conducted a preliminary study to investigate the nature of code snippets and their purposes in code reviews. We manually collected and checked 10,790 review comments from the Nova and Neutron projects of the OpenStack community, and finally obtained 626 review comments that contain code snippets for further analysis. The results show that: (1) code snippets are not prevalently used in code reviews, and most of the code snippets are provided by reviewers. (2) We identified two high-level purposes of code snippets provided by reviewers (i.e., Suggestion and Citation) with six detailed purposes, among which, Improving Code Implementation is the most common purpose. (3) For the code snippets in code reviews with the aim of suggestion, around 68.1% was accepted by developers. The results highlight promising research directions on using code snippets in code reviews.
... Thus, recent work leveraged machine learning techniques to support various activities throughout the code review process, for example, reviewer recommendation [13,43,46,49,56,59], review task prioritization based on code change characteristics [23,38,58] and defect-proneness [30,39,44,45,65]. Several studies also proposed approaches to support reviewers when reading and examining code [14,27,55,63,64]. Although these approaches can reduce the manual effort of reviewers, code authors still need to manually modify the source code until it is approved by reviewers. ...
Code review is effective, but human-intensive (e.g., developers need to manually modify source code until it is approved). Recently, prior work proposed a Neural Machine Translation (NMT) approach to automatically transform source code to the version that is reviewed and approved (i.e., the after version). Yet, its performance is still suboptimal when the after version has new identifiers or literals (e.g., renamed variables) or has many code tokens. To address these limitations, we propose AutoTransform which leverages a Byte-Pair Encoding (BPE) approach to handle new tokens and a Transformer-based NMT architecture to handle long sequences. We evaluate our approach based on 14,750 changed methods with and without new tokens for both small and medium sizes. The results show that when generating one candidate for the after version (i.e., beam width = 1), our AutoTransform can correctly transform 1,413 changed methods, which is 567% higher than the prior work, highlighting the substantial improvement of our approach for code transformation in the context of code review. This work contributes towards automated code transformation for code reviews, which could help developers reduce their effort in modifying source code during the code review process.
... Table 6 shows that our Cox proportional-hazard model achieves an adjusted R² of 42.2%. Based on the prior quantitative empirical studies in open source projects [58], [36], [61], this score is considered as acceptable since our model is supposed to be explanatory not predictive. In addition, the concordance index (another popular metric that quantifies the correlation between risk predictions and event times [29]) of our model is 0.672, suggesting that our model performs equally well. ...
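For readers unfamiliar with the concordance index mentioned in this excerpt, the following is a simplified sketch of Harrell's C, where 0.5 corresponds to random ordering and 1.0 to perfect ordering. It is not the citing paper's implementation: pairs with tied times are skipped here for brevity, and the sample data are hypothetical.

```python
from itertools import combinations


def concordance_index(times, events, risks):
    """Harrell's C: share of comparable pairs ordered consistently by risk score."""
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # tied times are skipped in this simplified sketch
        first, second = (i, j) if times[i] < times[j] else (j, i)
        if not events[first]:
            continue  # the earlier subject was censored, so the pair is not comparable
        comparable += 1
        if risks[first] > risks[second]:
            concordant += 1.0  # higher predicted risk had the earlier event
        elif risks[first] == risks[second]:
            concordant += 0.5  # tied risk scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical data: observed times, event flags (1 = event, 0 = censored), risk scores.
sample_times = [2, 4, 6, 8]
sample_events = [1, 1, 0, 1]
sample_risks = [0.9, 0.4, 0.6, 0.1]
c_index = concordance_index(sample_times, sample_events, sample_risks)
```

In this sample, five pairs are comparable and four of them are ordered consistently with the risk scores, so `c_index` is 0.8.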
The widespread adoption of third-party libraries for contemporary software development has led to the creation of large inter-dependency networks, where sustainability issues of a single library can have widespread network effects. Maintainers of these libraries are often overworked, relying on the contributions of volunteers to sustain these libraries. To understand these contributions, in this work, we leverage socio-technical techniques to introduce and formalise dependency-contribution congruence (DC congruence) at both ecosystem and library level, i.e., to understand the degree and origins of contributions congruent to dependency changes, analyze whether they contribute to library dormancy (i.e., a lack of activity), and investigate similarities between these congruent contributions compared to typical contributions. We conduct a large-scale empirical study to measure the DC congruence for the npm ecosystem using 1.7 million issues, 970 thousand pull requests (PRs), and over 5.3 million commits belonging to 107,242 npm libraries. We find that the most congruent contributions originate from contributors who can only submit (not commit) to both a client and a library. At the project level, we find that DC congruence shares an inverse relationship with the likelihood that a library becomes dormant. Specifically, a library is less likely to become dormant if the contributions are congruent with upgrading dependencies. Finally, by comparing the source code of contributions, we find statistical differences in the file path and added lines in the source code of congruent contributions when compared to typical contributions. Our work has implications to encourage dependency contributions, especially to support library maintainers in sustaining their projects.