Chanchal K. Roy

University of Saskatchewan | U of S · Department of Computer Science

Publications: 237
Reads: 47,203
Citations: 6,902
Citations since 2017: 4,920 (136 research items)

Publications (237)
Conference Paper
Software architecture is defined as the structural construction, design decisions implementation, evolution and knowledge sharing mechanisms of a system. Software architecture documentation helps architects with decision making, guides developers during implementation, and preserves architectural decisions so that future caretakers are able to better...
Preprint
Full-text available
The content quality of shared knowledge in Stack Overflow (SO) is crucial in supporting software developers with their programming problems. Thus, SO allows its users to suggest edits to improve the quality of a post (i.e., question and answer). However, existing research shows that many suggested edits in SO are rejected due to undesired contents/...
Article
Full-text available
In this paper, we survey state-of-the-art architectural change detection and categorization techniques and identify future research directions. To the best of our knowledge, our survey is the first comprehensive report on this area. However, in this survey, we compare available techniques using various quality attributes relevant to software archit...
Article
Full-text available
Software engineering (SE) methodologies are widely used in both academia and industry to manage the software development life cycle. A number of studies of SE methodologies involve interviewing stakeholders to explore the real‐world practice. Although these interview‐based studies provide us with a user's perspective of an organization's practice,...
Article
Full-text available
The most common use of data visualization is to minimize the complexity for proper understanding. A graph is one of the most commonly used representations for understanding relational data. It produces a simplified representation of data that is challenging to comprehend if kept in a textual format. In this study, we propose a methodology to utiliz...
Article
Full-text available
Software developers often look for solutions to their code-level problems using the Stack Overflow Q&A website. To receive help, developers frequently submit questions that contain sample code segments along with the description of the programming issue. Unfortunately, it is not always possible to reproduce the issues from the code segments they pr...
Preprint
Full-text available
Source code repositories allow developers to manage multiple versions (or branches) of a software system. Pull-requests are used to modify a branch, and backporting is a regular activity used to port changes from a current development branch to other versions. In open-source software, backports are common and often need to be adapted by hand, which...
Conference Paper
Full-text available
Source code repositories such as GitHub allow developers to manage multiple versions (or branches) of a software system. Pull-requests are used to modify a branch, and backporting is a regular activity used to port changes from a current development branch to other versions. In open-source software, backports are common and often need to be adap...
Article
Sentiment analysis in software engineering (SE) has shown promise to analyze and support diverse development activities. Recently, several tools have been proposed to detect sentiments in software artifacts. While the tools improve accuracy over off-the-shelf tools, recent research shows that their performance could still be unsatisfactory. A more accura...
Article
Full-text available
Co-change candidates are the group of code fragments that require a change if any of these fragments experience a modification in a commit operation during software evolution. The cloned co-change candidates are a subset of the co-change candidates, and the members in this subset are clones of one another. The cloned co-change candidates are usuall...
Preprint
Full-text available
The most common use of data visualization is to minimize the complexity for proper understanding. A graph is one of the most commonly used representations for understanding relational data. It produces a simplified representation of data that is challenging to comprehend if kept in a textual format. In this study, we propose a methodology to utiliz...
Preprint
Full-text available
Software developers often look for solutions to their code-level problems using the Stack Overflow Q&A website. To receive help, developers frequently submit questions containing sample code segments and the description of the programming issue. Unfortunately, it is not always possible to reproduce the issues from the code segments that may impede...
Preprint
Full-text available
Sentiment analysis in software engineering (SE) has shown promise to analyze and support diverse development activities. We report the results of an empirical study that we conducted to determine the feasibility of developing an ensemble engine by combining the polarity labels of stand-alone SE-specific sentiment detectors. Our study has two phases...
Article
Full-text available
Being light-weight and cost-effective, IR-based approaches for bug localization have shown promise in finding software bugs. However, the accuracy of these approaches heavily depends on the bug reports they use. A significant number of bug reports contain only plain natural language texts. According to existing studies, IR-based approaches cannot per...
Conference Paper
Full-text available
Software architectural changes are involved in the structure of more than one module or component and are complex to analyze compared to local code changes. Development teams aim to review architectural aspects (design) of a change commit considering many essential scenarios such as access rules and restrictions on usage of program entities across...
Chapter
In the original version of the book, the following belated correction is to be incorporated: In chapter “NiCad: A Modern Clone Detector”, the affiliation “Queen’s University, Belfast, Northern Ireland” of author “J. R. Cordy” is to be changed to “Queen's University, Kingston, Canada”. The author’s affiliation is corrected in the chapter. The erratu...
Conference Paper
Full-text available
Applications of image registration tasks are computation-intensive, memory-intensive and communication-intensive. Robust efforts are required on error recovery and re-usability of both the data and the operations, along with performance optimization. Considering these, we explore various programming models aiming to minimize the folding operations...
Preprint
Full-text available
Software architectural changes involve more than one module or component and are complex to analyze compared to local code changes. Development teams aiming to review architectural aspects (design) of a change commit consider many essential scenarios such as access rules and restrictions on usage of program entities across modules. Moreover, design...
Preprint
Full-text available
Software developers often fix critical bugs to ensure the reliability of their software. They might also need to add new features to their software at a regular interval to stay competitive in the market. These bugs and features are reported as change requests (i.e., technical documents written by software users). Developers consult these documents...
Preprint
Full-text available
Being light-weight and cost-effective, IR-based approaches for bug localization have shown promise in finding software bugs. However, the accuracy of these approaches heavily depends on the bug reports they use. A significant number of bug reports contain only plain natural language texts. According to existing studies, IR-based approaches cannot per...
Chapter
Many clone detection tools and techniques have been created to tackle various use-cases, including syntactical clone detection, semantic clone detection, inter-project clone detection, large-scale clone detection and search, and so on. While a few clone benchmarks are available, none target this breadth of usage. BigCloneBench is a clone benchmark...
Chapter
Code clones are exactly or nearly similar code pieces in the source code files of a software system. These mainly get created because of the frequent copy/paste activities of the programmers during development. Many studies have been done on realizing the impacts of code clones on software evolution and maintenance. We performed a comprehensive stu...
Chapter
Code clones are exactly or nearly similar code fragments in the code-base of a software system. Studies have revealed that such code fragments can have mixed impact (both positive and negative) on software evolution and maintenance. In order to reduce the negative impact of code clones and benefit from their advantages, researchers have suggested a...
Chapter
Clone detection is an active area of research. However, there is a marked lack of clone detectors that scale to very large repositories of source code, in particular for detecting near-miss clones where significant editing activities may take place in the cloned code. SourcererCC was developed as an attempt to fill this gap. It is a widely used tok...
Article
Full-text available
The online technical Q&A site Stack Overflow (SO) is popular among developers to support their coding and diverse development needs. To address shortcomings in API official documentation resources, several research works have thus focused on augmenting official API documentation with insights (e.g., code examples) from SO. The techniques propose to...
Conference Paper
Full-text available
The success of developer forums like Stack Overflow (SO) depends on the participation of users and the quality of shared knowledge. SO allows its users to suggest edits to improve the quality of the posts (i.e., questions and answers). Such posts can be rolled back to an earlier version when the current version of the post with the suggested edit d...
Preprint
Code snippets available on question answering sites (e.g., Stack Overflow) are a great source of information for learning how to use APIs. However, it is difficult to determine which APIs are discussed in those code snippets because they often suffer from declaration ambiguities and missing external references. In this paper, we introduce COSTER, a...
Preprint
Full-text available
We propose a framework to mine API usage scenarios from Stack Overflow. Each task consists of a code example, the task description, and the reactions of developers towards the code example. First, we present an algorithm to automatically link a code example in a forum post to an API mentioned in the textual contents of the forum post. Second, we ge...
Preprint
Full-text available
The online technical Q&A site Stack Overflow (SO) is popular among developers to support their coding and diverse development needs. To address shortcomings in API official documentation resources, several research works have thus focused on augmenting official API documentation with insights (e.g., code examples) from SO. The techniques propose to add co...
Preprint
Full-text available
Stack Overflow is one of the largest and most popular question-answering (Q&A) websites. It accumulates millions of programming-related questions and answers to support the developers in software development. Unfortunately, a large number of questions are not answered at all, which might hurt the quality or purpose of this community-oriented knowle...
Article
Full-text available
Evolutionary coupling is a well investigated phenomenon in software maintenance research and practice. Association rules and two related measures, support and confidence, have been used to identify evolutionary coupling among program entities. However, these measures only emphasize the co-change (i.e., changing together) frequency of entities and c...
Article
Full-text available
Developers often search for relevant code examples on the web for their programming tasks. Unfortunately, they face three major problems. First, they frequently need to read and analyse multiple results from the search engines to obtain a satisfactory solution. Second, the search is impaired due to a lexical gap between the query (task description)...
Conference Paper
Full-text available
Context. The online technical Q&A site Stack Overflow has changed the way software developers look for solutions. As such, content quality in Stack Overflow is paramount. Users in Stack Overflow can suggest improvements to a post (i.e., answer or question) by suggesting edits to the post. Problem. Recent research shows that a large numbe...
Preprint
Full-text available
There are a great many clone detection tools proposed in the literature. In this paper, we investigate the state of clone detection tool evaluation. We begin by surveying the clone detection benchmarks, and performing a multi-faceted evaluation and comparison of their features and capabilities. We then survey the existing clone detection tool and t...
Preprint
Full-text available
Scientific workflow management systems such as Galaxy, Taverna and Workspace, have been developed to automate scientific workflow management and are increasingly being used to accelerate the specification, execution, visualization, and monitoring of data-intensive tasks. For example, the popular bioinformatics platform Galaxy is installed on over 1...
Article
Full-text available
A code clone is a pair of code fragments, within or between software systems, that are similar. Since code clones often negatively impact the maintainability of a software system, several code clone detection techniques and tools have been proposed and studied over the last decade. However, the clone detection tools are not always perfect and their...
Preprint
Full-text available
Duplicated code or code clones are a kind of code smell that have both positive and negative impacts on the development and maintenance of software systems. Software clone research in the past mostly focused on the detection and analysis of code clones, while research in recent years extends to the whole spectrum of clone management. In the last de...
Preprint
Full-text available
A code clone is a pair of code fragments, within or between software systems, that are similar. Since code clones often negatively impact the maintainability of a software system, several code clone detection techniques and tools have been proposed and studied over the last decade. To detect all possible similar source code patterns in general, the...
Preprint
Full-text available
The fork-based development mechanism provides the flexibility and the unified processes for software teams to collaborate easily in a distributed setting without too much coordination overhead. Currently, multiple social coding platforms support fork-based development, such as GitHub, GitLab, and Bitbucket. Although these different platforms virtu...
Article
Full-text available
The design and maintenance of APIs (Application Programming Interfaces) are complex tasks due to the constantly changing requirements of their users. Despite the efforts of their designers, APIs may suffer from a number of issues (such as incomplete or erroneous documentation, poor performance, and backward incompatibility). To maintain a healthy c...
Article
Full-text available
Detecting large-variance code clones (i.e., clones with many modifications) in large-scale code repositories is difficult because most current tools can only detect almost identical or very similar clones. Such detection has an important impact on downstream software applications such as bug detection, code completion, software analysis, etc. Recently, CCAligne...
Article
Full-text available
Context: APIs play a central role in software development. The seminal research of Carroll et al. [15] on minimal manual and subsequent studies by Shull et al. [79] showed that developers prefer task-based API documentation instead of traditional hierarchical official documentation (e.g., Javadoc). The Q&A format in Stack Overflow offers developers...
Preprint
Full-text available
Code reuse by copying and pasting from one place to another place in a codebase is a very common scenario in software development which is also one of the most typical reasons for introducing code clones. There is a huge availability of tools to detect such cloned fragments and a lot of studies have already been done for efficient clone detection....
Chapter
Full-text available
Big Data analytics or systems developed with parallel distributed processing frameworks (e.g., Hadoop and Spark) are becoming popular for finding important insights from a huge amount of heterogeneous data (e.g., image, text, and sensor data). These systems offer a wide range of tools and connect them to form workflows for processing Big Data. Inde...
Chapter
Full-text available
A scientific workflow management system (SWFMS) is one of the inherent parts of Big Data analytics systems. Analyses in such data intensive research using workflows are very costly. SWFMSs or workflows keep track of every bit of executions through logs, which later could be used on demand. For example, in the case of errors, security breaches, or eve...
Article
Full-text available
While there are novel approaches for detecting and categorizing similar software applications, previous research focused on detecting similarity in applications written in the same programming language and not on detecting similarity in applications written in different programming languages. Cross-language software similarity detection is inherent...
Conference Paper
Full-text available
In modern days, mobile applications (apps) have become omnipresent. Components of mobile apps (such as 3rd party libraries) need to be separated and analyzed differently for security issue detection, repackaged app detection, tumor code purification and so on. Various techniques are available to automatically analyze mobile apps. However, analys...
Conference Paper
Developers often reuse code snippets from online forums, such as Stack Overflow, to learn API usages of software frameworks or libraries. These code snippets often contain ambiguous undeclared external references. Such external references make it difficult to learn and use those APIs correctly. In particular, reusing code snippets containing such a...