Example of a bug fix in which a new condition is added. In Fix 1, the line a.foo() is modified by added whitespace and is therefore part of the textual difference. In Fix 2, a.foo() is not part of the diff.
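The figure itself is not reproduced here, so the following is only a minimal sketch of the situation the caption describes. Everything except the call a.foo() is an assumption made for illustration: the surrounding statements, the two fix variants, and the helper name `changed_lines` are hypothetical and are not taken from the source publication. The sketch uses Python's standard `difflib` to show that a purely textual diff reports a.foo() as changed once the line is re-indented, but not when the new condition is added as a separate guard.

```python
import difflib

# Hypothetical buggy code; only the call a.foo() comes from the caption.
BUGGY = """\
a = get_a()
a.foo()
"""

# Assumed shape of Fix 1: the new condition wraps the call,
# so a.foo() gains leading whitespace.
FIX_1 = """\
a = get_a()
if a is not None:
    a.foo()
"""

# Assumed shape of Fix 2: the new condition is a guard placed
# before the call, so the a.foo() line is left untouched.
FIX_2 = """\
a = get_a()
if a is None:
    return
a.foo()
"""


def changed_lines(before: str, after: str) -> list[str]:
    """Return the stripped text of every line a unified diff marks with + or -."""
    diff = difflib.unified_diff(
        before.splitlines(), after.splitlines(), lineterm="", n=0
    )
    return [
        line[1:].strip()
        for line in diff
        # Keep added/removed lines, skip the '---' and '+++' file headers.
        if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
    ]


print("a.foo()" in changed_lines(BUGGY, FIX_1))  # True: whitespace change is in the diff
print("a.foo()" in changed_lines(BUGGY, FIX_2))  # False: the line is untouched
```

In both variants the behavioral change is the added condition. Whether a.foo() counts as part of the fix depends only on how the change was written, which is why line-based diffs can overstate what a bug fix touched.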


Source publication
Article
Context: Tangled commits are changes to software that address multiple concerns at once. For researchers interested in bugs, tangled commits mean that they actually study not only bugs, but also other concerns irrelevant for the study of bugs. Objective: We want to improve our understanding of the prevalence of tangling and the types of changes that...

Citations

... This hypothesis assumes that all changes in a bug-fixing commit are about the bug fix. However, research projects have shown that bug-fixing commits contain other changes that are not related to bug fixes [30]. ...
Preprint
Language constructs inspired by functional programming have made their way into most mainstream programming languages. Many researchers and developers consider that these constructs lead to programs that are more concise, reusable, and easier to understand. However, few studies investigate the implications of using them in mainstream programming languages. This paper quantifies the prevalence of four concepts typically associated with functional programming in JavaScript: recursion, immutability, lazy evaluation, and functions as values. We focus on JavaScript programs due to the availability of some of these concepts in the language since its inception, its inspiration from functional programming languages, and its popularity. We mine 91 GitHub repositories (22+ million LOC) written mostly in JavaScript (over 50% of the code), measuring the usage of these concepts from both static and temporal perspectives. We also measure the likelihood of bug-fixing commits removing uses of these concepts (which would hint at bug-proneness) and their association with the presence of code comments (which would hint at code that is hard to understand). We find that these concepts are in widespread use (1 for every 46.65 LOC, 43.59% of LOC). In addition, the usage of higher-order functions, immutability, and lazy evaluation-related structures has been growing throughout the years for the analyzed projects, while the usage of recursion and callbacks & promises has decreased. We also find statistical evidence that removing these structures, with the exception of the ones associated with immutability, is less common in bug-fixing commits than in other commits. In addition, their presence is not correlated with comment size. Our findings suggest that functional programming concepts are important for developers using a multi-paradigm language, and their usage does not make programs harder to understand.
... Datasets of 150k JavaScript files [62] and 150k methods [51] were used for program generation. More examples are datasets for vulnerability [54,7], tangling commits [34], text summarization [21], Truck Factor Developers Detachment (TFDD) [13], commit classification [6,44], code translation [47] and many more [64,66]. ...
Preprint
End-to-end learning is machine learning that starts from raw data and predicts a desired concept, with all steps done automatically. In the software engineering context, we see it as starting from the source code and predicting process metrics. This framework can be used for predicting defects, code quality, productivity and more. End-to-end learning improves over feature-based machine learning by not requiring domain experts and by being able to extract new knowledge. We describe a dataset of 5M files from 15k projects constructed for this goal. The dataset is constructed in a way that enables not only predicting concepts but also investigating their causes.