Measuring Program Comprehension: A Large-Scale Field Study with Professionals

Abstract

During software development and maintenance, developers spend a considerable amount of time on program comprehension activities. Previous studies show that program comprehension takes up as much as half of a developer's time. However, most of these studies are performed in a controlled setting or with a small number of participants, and they investigate program comprehension activities only within IDEs, even though developers' program comprehension activities go well beyond their IDE interactions. In this paper, we extend our ActivitySpace framework to collect and analyze Human-Computer Interaction (HCI) data across many applications (not just IDEs). We follow Minelli et al.'s approach to assign developers' activities to four categories: navigation, editing, comprehension, and other. We then measure comprehension time by calculating the time that developers spend on program comprehension, e.g., inspecting the console and breakpoints in the IDE, or reading and understanding tutorials in web browsers. Using this approach, we can perform a more realistic investigation of program comprehension activities through a field study of program comprehension in practice, covering seven real projects, 78 professional developers, and 3,148 working hours. Our study leverages interaction data that is collected from the developers across many applications. Our study finds that on average developers spend ∼58% of their time on program comprehension activities, and that they frequently use web browsers and document editors to perform program comprehension activities. We also investigate the impact of programming language, developers' experience, and project phase on the time that is spent on program comprehension, and we find that senior developers spend a significantly smaller percentage of their time on program comprehension than junior developers. Our study also highlights the importance of several research directions needed to reduce program comprehension time, e.g., building automatic detection and improvement of low-quality code and documentation, constructing software-engineering-specific search engines, and designing better IDEs that help developers navigate code and browse information more efficiently.
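The measurement described in the abstract can be sketched as follows. This is a minimal illustration, not the ActivitySpace implementation: the event format of (category, seconds) pairs, the function name `time_share`, and the sample durations are all assumptions made for this example.

```python
from collections import defaultdict

# The four activity categories named in the abstract.
CATEGORIES = ("navigation", "editing", "comprehension", "other")


def time_share(events):
    """Return the fraction of total time spent in each category.

    `events` is an iterable of (category, seconds) pairs. This record
    format is hypothetical; the paper's actual data model may differ.
    """
    totals = defaultdict(float)
    for category, seconds in events:
        if category not in CATEGORIES:
            raise ValueError(f"unknown category: {category}")
        if seconds < 0:
            raise ValueError("duration must be non-negative")
        totals[category] += seconds
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {c: 0.0 for c in CATEGORIES}
    return {c: totals[c] / grand_total for c in CATEGORIES}


# Made-up sample data, chosen only to illustrate the calculation.
sample_events = [
    ("comprehension", 1200),  # e.g. reading a tutorial in a web browser
    ("editing", 600),
    ("navigation", 300),
    ("comprehension", 540),   # e.g. inspecting breakpoints in the IDE
    ("other", 360),
]
shares = time_share(sample_events)
print(f"{shares['comprehension']:.1%}")  # 58.0%
```

Because the shares are computed over all recorded applications rather than the IDE alone, time spent in browsers or document editors counts toward the comprehension total.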
Published in IEEE Transactions on Software Engineering, 2017 July, Volume PP, Issue 99, Pages 1-26
http://doi.org/10.1109/TSE.2017.2734091
[Figure: Effective Working Hours (axis 0 to 100). Group labels: A to G; Java, C#; Low, Medium, High; Maintenance, Development; Hengtian, IGS.]

... Furthermore, a live demo of our tool is available online. We invite other researchers to extend our open-source software. Video URL: https://youtu.be/3qZVSehnEug ...
... Source code comprehension is still the primary method to come to an understanding of a software system's behavior [1]. This is not unexpected, because developers are trained to recognize recurring patterns and resulting behavior in source code. ...
... We conclude that the participants used the SV as a supplement to the code editor for specific comprehension tasks. Traditionally, understanding a software system's behavior is primarily achieved by comprehending the source code [1]. For this experiment, the results related to RQ1 show that our approach was, for example, used by the participants to gain an overview of the target system. ...
Preprint
Software visualizations are usually realized as standalone and isolated tools that use embedded code viewers within the visualization. In the context of program comprehension, only a few approaches integrate visualizations into code editors, such as integrated development environments. This is surprising, since professional developers consider reading source code one of the most important ways to understand software and therefore spend a lot of time with code editors. In this paper, we introduce the design and proof-of-concept implementation for a software visualization approach that can be embedded into code editors. Our contribution differs from related work in that we use dynamic analysis of a software system's runtime behavior. Additionally, we incorporate distributed tracing. This enables developers to understand how, for example, the currently handled source code behaves as a fully deployed, distributed software system. Our visualization approach enhances common remote pair programming tools and is collaboratively usable by employing shared code cities. As a result, user interactions are synchronized between code editor and visualization, as well as broadcast to collaborators. To the best of our knowledge, this is the first approach that combines code editors with collaboratively usable code cities. Therefore, we conducted a user study to collect first-time feedback regarding the perceived usefulness and perceived usability of our approach. We additionally collected logging information to provide more data regarding time spent in code cities that are embedded in code editors. Seven teams with two students each participated in that study. The results show that the majority of participants find our approach useful and would employ it for their own use. We provide each participant's video recording, raw results, and all steps to reproduce our experiment as a supplementary package. Furthermore, a live demo of our tool is available online. We invite other researchers to extend our open-source software.
... However, changing the code is difficult when the project's source code is large or when the programmer making the changes is not the original author and has never worked on the code. For this reason, programmers first need to perform program comprehension, and studies show that this takes approximately 21.75 hours a week (58% of a 37.5-hour week) [1]. ...
Article
Feature location is a technique for determining the source code that implements specific features in software. It was developed to help minimize the effort spent on program comprehension. The main challenge of feature location research is how to bridge the gap between abstract keywords in use cases and details in source code. Use case scenarios are software requirements artifacts that state the input, logic, rules, actor, and output of a function in the software. A sentence in one use case scenario is sometimes described by a sentence in another use case scenario. This study contributes to creating expansion queries in feature location by finding the relationships between use case scenarios. The relationships include inner association, outer association, and intratoken association. The research employs latent Dirichlet allocation (LDA) to build a topic model of the source code. Query expansion using inner, outer, and intratoken associations was tested for finding feature locations in a Java-based open-source project. The best precision rate was 50%. The best recall was 100%, which was found in several use case scenarios implemented in a few files. The best average precision rate was 16.7%, which was found in inner association experiments. The best average recall rate was 68.3%, which was found in all compound association experiments.
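For readers unfamiliar with the metrics reported above, the following minimal sketch shows how precision and recall are computed for one feature location query. The function name `precision_recall` and the file names are hypothetical and are not taken from the study.

```python
def precision_recall(retrieved, relevant):
    """Compute precision and recall for one query.

    `retrieved` holds the files returned for a query and `relevant` holds
    the files that actually implement the feature.
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical query result: four files returned, two of them relevant,
# and the feature is implemented in exactly those two files.
p, r = precision_recall(
    retrieved=["Login.java", "User.java", "Menu.java", "Util.java"],
    relevant=["Login.java", "User.java"],
)
print(p, r)  # 0.5 1.0
```

When a feature is implemented in only a few files, retrieving all of them is enough to reach 100% recall, even if other returned files lower the precision.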
... Previous studies show that more than half of the time in the software development and maintenance process is spent on program comprehension and related tasks (Xia et al. 2017; Wong et al. 2013). In this process, developers understand the meaning of the program mainly by looking at comments. ...
Article
Code summarization aims to generate concise natural language descriptions for a piece of code, which can help developers comprehend the source code. Analysis of current work shows that the extraction of syntactic and semantic features of source code is crucial for generating high-quality summaries. To provide a more comprehensive feature representation of source code from different perspectives, we propose an approach named EnCoSum, which enhances semantic features for the multi-scale multi-modal code summarization method. This method complements our previously proposed M2TS approach (multi-scale multi-modal approach based on Transformer for source code summarization), which uses the multi-scale method to capture the structural information of Abstract Syntax Trees (ASTs) more completely and accurately at multiple local and global levels. In addition, we devise a new cross-modal fusion method to fuse source code and AST features, which can highlight key features in each modality that help generate summaries. To obtain richer semantic information, we improve M2TS. First, we add data flow and control flow edges to ASTs, producing added-edge ASTs that we call Enhanced-ASTs (E-ASTs). In addition, we introduce method name sequences extracted from the source code, which carry more knowledge about critical tokens in the corresponding summaries and can help the model generate higher-quality summaries. We conduct extensive experiments on processed Java and Python datasets and evaluate our approach via the four most commonly used machine translation metrics. The experimental results demonstrate that EnCoSum is effective and outperforms current state-of-the-art methods. Further, we perform ablation experiments on each of the model’s key components, and the results show that they all contribute to the performance of EnCoSum.
... Differences in abstraction levels: Practitioners often consider overarching goals, which may involve a workflow of many tasks, each further broken down into micro-tasks. For example, the DevOps workflow consists of multiple phases, and each phase, e.g., coding, includes multiple activities, e.g., navigation, editing, comprehension, etc. [65], [66]. In contrast, current AI4SE solutions usually target specific, narrower micro-tasks, such as fault localization, clone detection, API recommendation, code summarization, duplicate bug report detection, etc. ...
Preprint
For decades, much software engineering research has been dedicated to devising automated solutions aimed at enhancing developer productivity and elevating software quality. The past two decades have witnessed an unparalleled surge in the development of intelligent solutions tailored for software engineering tasks. This momentum established the Artificial Intelligence for Software Engineering (AI4SE) area, which has swiftly become one of the most active and popular areas within the software engineering field. This Future of Software Engineering (FoSE) paper navigates through several focal points. It commences with a succinct introduction and history of AI4SE. Thereafter, it underscores the core challenges inherent to AI4SE, particularly highlighting the need to realize trustworthy and synergistic AI4SE. Progressing, the paper paints a vision for the potential leaps achievable if AI4SE's key challenges are surmounted, suggesting a transition towards Software Engineering 2.0. Two strategic roadmaps are then laid out: one centered on realizing trustworthy AI4SE, and the other on fostering synergistic AI4SE. While this paper may not serve as a conclusive guide, its intent is to catalyze further progress. The ultimate aspiration is to position AI4SE as a linchpin in redefining the horizons of software engineering, propelling us toward Software Engineering 2.0.
... Research shows that software developers and maintainers spend 59% [1] of their time on program understanding. A good code comment can improve the efficiency of software development and maintenance. ...
Article
The article examines the risks associated with the use of generative artificial intelligence (GenAI) systems. The authors emphasize that countries with technologically advanced legal systems already regulate the use of GenAI with regard to data protection and cybersecurity. They also point to a draft on the administration of generative services in China, which stresses the responsibility of GenAI service providers for the safety and accuracy of generated content. In this context, the authors discuss the risks associated with the development of software and IT products, in particular the use of LLMAP (Large Language Models for Application Programming). The proposed risk classification distinguishes passive risks, which arise when working with GenAI, from active risks, which are associated with deliberate misuse. The authors argue for the importance of a deliberate approach to the use of GenAI and of developing appropriate control and security measures. The results of the study are contrasted with advertising claims about generative systems (GenAI) and point to their potential incompleteness as well as to the unpredictability of code quality. The need to account for the passive and active risks associated with the use of such systems is emphasized. Passive risks include the possibility of errors and "hallucinations" in GenAI output, problems with generating complex code, and the uncontrolled spread of the results of their work. Active risks include the possibility of reverse engineering databases, breaching the system, and obtaining "forbidden" data. The authors recommend strict control over the use of GenAI in critical sectors that require uninterrupted operation and a low probability of errors. They also point to the need to improve technical, organizational, and legislative measures for the effective use of GenAI, such as database quality control, open access to source code, and the development of audit and control systems that keep pace with progress.
Article
Recent years have witnessed an increasing emphasis on human aspects in software engineering research and practice. Our survey of existing studies on human aspects in software engineering shows that screen-captured videos have been widely used to record developers’ behavior and study software engineering practices. The screen-captured videos provide direct information about which software tools the developers interact with and which content they access or generate during the task. Such Human-Computer Interaction (HCI) data can help researchers and practitioners understand and improve software engineering practices from a human perspective. However, extracting time-series HCI data from screen-captured task videos requires manual transcribing and coding of videos, which is tedious and error-prone. In this paper, we report a formative study to understand the challenges in manually transcribing screen-captured videos into time-series HCI data. We then present a computer-vision-based video scraping technique to automatically extract time-series HCI data from screen-captured videos. We also present a case study of our scvRipper tool, which implements the video scraping technique, using 29 hours of task videos of 20 developers in two development tasks. The case study not only evaluates the runtime performance and robustness of the tool, but also performs a detailed quantitative analysis of the tool’s ability to extract time-series HCI data from screen-captured task videos. We also study developers’ micro-level behavior patterns in software development based on the quantitative analysis.
Article
Turnover is the phenomenon of continuous influx and retreat of human resources in a team. Despite being well-studied in many settings, turnover has not been characterized for open-source software projects. We study the source code repositories of five open-source projects to characterize patterns of turnover and to determine the effects of turnover on software quality. We define the base concepts of both external and internal turnover, which are the mobility of developers in and out of a project, and the mobility of developers inside a project, respectively. We provide a qualitative analysis of turnover patterns. We also found, in a quantitative analysis, that the activity of external newcomers negatively impacts software quality.
Conference Paper
Background: The relevance of ESEM research to industry practitioners is key to the long-term health of the conference. Aims: The goal of this work is to understand how ESEM research is perceived within the practitioner community and to provide feedback to the ESEM community to ensure our research remains relevant. Method: To understand how practitioners perceive ESEM research, we replicated previous work by sending a survey to several hundred industry practitioners at a number of companies around the world. We asked the survey participants to rate the relevance of the research described in 156 ESEM papers published between 2011 and 2015. Results: We received 9,941 ratings by 437 practitioners who labeled ideas as Essential, Worthwhile, Unimportant, or Unwise. The results showed that overall, industrial practitioners find the work published in ESEM to be valuable: 67% of all ratings were essential or worthwhile. We found no correlation between citation count and perceived relevance of the papers. Through a qualitative analysis, we also identified a number of research themes on which practitioners would like to see an increased research focus. Conclusions: The work published in ESEM is generally relevant to industrial practitioners. There are a number of topics for which those practitioners would like to see additional research undertaken.
Conference Paper
The utility of source code, as of other knowledge artifacts, is predicated on the existence of individuals skilled enough to derive value by using or improving it. Developers leaving a software project deprive the project of the knowledge of the decisions they have made. Previous research shows that the survivors and newcomers maintaining abandoned code have reduced productivity and are more likely to make mistakes. We focus on quantifying the extent of abandoned source files and adapt methods from financial risk analysis to assess the susceptibility of the project to developer turnover. In particular, we measure the historical loss distribution and find (1) that projects are susceptible to losses that are more than three times larger than the expected loss. Using historical simulations we find (2) that projects are susceptible to large losses that are over five times larger than the expected loss. We use Monte Carlo simulations of disaster loss scenarios and find (3) that simplistic estimates of the 'truck factor' exaggerate the potential for loss. To mitigate loss from developer turnover, we modify Cataldo et al.'s coordination requirements matrices. We find (4) that we can recommend the correct successor 34% to 48% of the time. We also find that having successors reduces the expected loss by as much as 15%. Our approach helps large projects assess the risk of turnover thereby making risk more transparent and manageable.
Article
As code search is a frequent developer activity in software development practice, improving the performance of code search is a critical task. In the text-retrieval-based search techniques employed in code search, the term mismatch problem is a critical language issue for retrieval effectiveness. By reformulating the queries, query expansion provides effective ways to solve the term mismatch problem. In this paper, we propose Query Expansion based on Crowd Knowledge (QECK), a novel technique to improve the performance of code search algorithms. QECK identifies software-specific expansion words from the high-quality pseudo relevance feedback question and answer pairs on Stack Overflow to automatically generate the expansion queries. Furthermore, we incorporate QECK into the classic Rocchio's model and propose a QECK-based code search method, QECKRocchio. We conduct three experiments to evaluate our QECK technique and investigate QECKRocchio in a large-scale corpus containing real-world code snippets and a question and answer pair collection. The results show that QECK improves the performance of three code search algorithms by up to 64 percent in Precision, and 35 percent in NDCG. Meanwhile, compared with the state-of-the-art query expansion method, the improvement of QECKRocchio is 22 percent in Precision, and 16 percent in NDCG.
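The general idea of Rocchio-style query expansion with pseudo relevance feedback can be sketched as below. This is a simplified textbook form using only the positive-feedback term, not the QECK or QECKRocchio method itself; the function name `rocchio_expand`, the parameter defaults, and the sample documents are assumptions made for illustration.

```python
from collections import Counter


def rocchio_expand(query_terms, feedback_docs, alpha=1.0, beta=0.75, top_k=2):
    """Expand a query with the highest-weighted terms from feedback documents.

    Each term weight is alpha * (query term frequency)
    + beta * (mean term frequency over the feedback documents).
    The negative-feedback term of the full Rocchio formula is omitted.
    """
    query_vec = Counter(query_terms)
    centroid = Counter()
    for doc in feedback_docs:
        for term, count in Counter(doc).items():
            centroid[term] += count / len(feedback_docs)
    weights = {
        term: alpha * query_vec[term] + beta * centroid[term]
        for term in set(query_vec) | set(centroid)
    }
    # Only terms not already in the query are candidates for expansion;
    # ties are broken alphabetically so the result is deterministic.
    candidates = [t for t in weights if t not in query_vec]
    candidates.sort(key=lambda t: (-weights[t], t))
    return list(query_terms) + candidates[:top_k], weights


# Hypothetical tokenized feedback documents (e.g. top-ranked Q&A pairs).
docs = [
    ["read", "file", "bufferedreader", "java"],
    ["file", "inputstream", "bufferedreader"],
]
expanded, weights = rocchio_expand(["read", "file"], docs)
print(expanded)  # ['read', 'file', 'bufferedreader', 'inputstream']
```

Terms that recur across the feedback documents, such as `bufferedreader` here, receive the largest weights and are added first, which is how expansion helps with term mismatch between a query and the code it should retrieve.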
Conference Paper
The number of software engineering research papers has grown significantly over the last few years. An important question here is: how relevant is software engineering research to practitioners in the field? To address this question, we conducted a survey at Microsoft where we invited 3,000 industry practitioners to rate the relevance of research ideas contained in 571 ICSE, ESEC/FSE and FSE papers that were published over a five-year period. We received 17,913 ratings by 512 practitioners who labelled ideas as essential, worthwhile, unimportant, or unwise. The results from the survey suggest that practitioners are positive towards studies done by the software engineering research community: 71% of all ratings were essential or worthwhile. We found no correlation between the citation counts and the relevance scores of the papers. Through a qualitative analysis of free-text responses, we identify several reasons why practitioners considered certain research ideas to be unwise. The survey approach described in this paper is lightweight: on average, a participant spent only 22.5 minutes responding to the survey. At the same time, the results can provide useful insight to conference organizers, authors, and participating practitioners.