Abram Hindle's research while affiliated with University of Alberta and other places
What is this page?
This page lists the scientific contributions of an author, who either does not have a ResearchGate profile, or has not yet added these contributions to their profile.
It was automatically created by ResearchGate to create a record of this author's body of work. We create such pages to advance our goal of creating and maintaining the most comprehensive scientific repository possible. In doing so, we process publicly available (personal) data relating to the author as a member of the scientific community.
If you're a ResearchGate member, you can follow this page to keep up with this author's work.
If you are this author, and you don't want us to display this page anymore, please let us know.
Publications (172)
Electrocardiogram (ECG) abnormalities are linked to cardiovascular diseases, but may also occur in other non-cardiovascular conditions such as mental, neurological, metabolic and infectious conditions. However, most of the recent success of deep learning (DL) based diagnostic predictions in selected patient cohorts has been limited to a small set...
Defect detection at commit check-in time prevents the introduction of defects into software systems. Current defect detection approaches rely on metric-based models which are not very accurate and whose results are not directly useful for developers. We propose a method to detect bug-inducing commits by comparing the incoming changes with all past...
Sentiment analysis is a popular technique to identify the sentiment of a piece of text. Several different domains have been targeted by sentiment analysis research, such as Twitter, movie reviews, and mobile app reviews. Although several techniques have been proposed, the performance of current sentiment analysis techniques is still far from accept...
Docker is becoming ubiquitous with containerization for developing and deploying applications. Previous studies have analyzed Dockerfiles that are used to create container images in order to better understand how to improve Docker tooling. These studies obtain Dockerfiles using either Docker Hub or GitHub. In this paper, we revisit the findings of...
Single-statement bugs (SStuBs) can have a severe impact on developer productivity. Despite usually being simple and not offering much of a challenge to fix, these bugs may still disturb a developer’s workflow and waste precious development time. However, few studies have paid attention to these simple bugs, focusing instead on bugs of any size and...
Informal communication channels like mailing lists, IRC and instant messaging play a vital role in open source software development by facilitating communication within geographically diverse project teams, e.g., to discuss issue reports to facilitate the bug-fixing process. More recently, chat systems like Slack and Gitter have gained a lot of popu...
Software is prone to bugs and failures. Security bugs are those that expose or share privileged information and access in violation of the software's requirements. Given the seriousness of security bugs, there are centralized mechanisms for supporting and tracking these bugs across multiple products; one such mechanism is the Common Vulnerabilities...
Researchers in empirical software engineering often make claims based on observable data such as defect reports. Unfortunately, in many cases, these claims are generalized beyond the data sets that have been evaluated. Will the researcher’s conclusions hold a year from now for the same software projects? Perhaps not. Recent studies show that in the...
The 12-lead electrocardiogram (ECG) is a commonly used tool for detecting cardiac abnormalities such as atrial fibrillation, blocks, and irregular complexes. For the PhysioNet/CinC 2020 Challenge, we built an algorithm using gradient boosted tree ensembles fitted on morphology and signal processing features to classify ECG diagnosis. For each lead,...
Can we generate drum synthesizers automatically? We present an approach for the automatic generation of synthesizer programs for one-shot percussive sounds. Recent advancements in digital synthesis, heuristic search, and neural networks can be utilized for sound generation. Yet the need for data, the problem of open set recognition, and high comput...
Software energy consumption is a performance related non-functional requirement that complicates building software on mobile devices today. Energy hogging applications (apps) are a liability to both the end-user and software developer. Measuring software energy consumption is non-trivial, requiring both equipment and expertise, yet researchers have...
One problem when studying how to find and fix syntax errors is how to get natural and representative examples of syntax errors. Most syntax error datasets are not free, open, and public, or they are extracted from novice programmers and do not represent syntax errors that the general population of developers would make. Programmers of all skill lev...
Online resources today contain an abundant amount of code snippets for documentation, collaboration, learning, and problem-solving purposes. Their executability in a "plug and play" manner enables us to confirm their quality and use them directly in projects. But, in practice that is often not the case due to several requirements violations or inco...
Massive Open Online Courses are educational programs that are open and accessible to a large number of people through the internet. To facilitate learning, MOOC discussion forums exist where students and instructors communicate questions, answers, and thoughts related to the course. The primary objective of this paper is to investigate tracing disc...
Machine learning is a popular method of learning functions from data to represent and to classify sensor inputs, multimedia, emails, and calendar events. Smartphone applications have been integrating more and more intelligence in the form of machine learning. Machine learning functionality now appears on most smartphones as voice recognition, spell...
Bug deduplication or duplicate bug report detection is a hot topic in software engineering information retrieval research, but it is often not deployed. Typically, to de-duplicate bug reports, developers rely upon the search capabilities of the bug report software they employ, such as Bugzilla, Jira, or GitHub Issues. These search capabilities range...
Examines the concept of "keeping it simple" with respect to software engineering. It’s funny how keeping it simple in software development can often mean revising and refactoring an existing system until it is elegant enough to afford adaptation and change. Simplicity and elegance are the goals of many developers when they’re designing software. De...
Background: An Android smartphone is an ecosystem of applications, drivers, operating system components, and assets. The volume of the software is large and the number of test cases needed to cover the functionality of an Android system is substantial. Enormous effort has been already taken to properly quantify "what features and apps were tested a...
Testing is an integral part of the software development lifecycle, approached with varying degrees of rigor by different process models. Agile process models recommend Test Driven Development (TDD) as a key practice for reducing costs and improving code quality. The objective of this work is to perform a cost-benefit analysis of this practice. To t...
On the worldwide web, not only are webpages connected but source code is too. Software development is becoming more accessible to everyone and the licensing for software remains complicated. We need to know if software licenses are being maintained properly throughout their reuse and evolution. This motivated the development of the Sourcerer's Appr...
Execution logs are debug statements that developers insert into their code. Execution logs are used widely to monitor and diagnose the health of software applications. However, logging comes with costs, as it uses computing resources and can have an impact on an application’s performance. Compared with desktop applications, one additional critical...
Testing is an integral part of the software development lifecycle, approached with varying degrees of rigor by different process models. Agile process models recommend Test Driven Development (TDD) as a key practice for reducing costs and improving code quality. The objective of this work is to perform a cost-benefit analysis of this practice. Prev...
On mobile phones, users and developers rely on official app marketplaces, which serve as repositories of apps. The Google Play Store and Apple Store are the official marketplaces of Android and Apple products, which offer more than a million apps. Although both repositories offer descriptions of apps, information concerning performance is not available. Due...
Minor syntax errors are made by novice and experienced programmers alike; however, novice programmers lack the years of intuition that help them resolve these tiny errors. Standard LR parsers typically resolve syntax errors and their precise location poorly. We propose a methodology that helps locate where syntax errors occur, but also suggests pos...
Context: Virtual machines provide isolation of services at the cost of hypervisors and more resource usage. This spurred the growth of systems like Docker that enable single hosts to isolate several applications, similar to VMs, within a low-overhead abstraction called containers. Motivation: Although containers tout low overhead performance, do th...
Music transcription involves the transformation of an audio recording to common music notation, colloquially referred to as sheet music. Manually transcribing audio recordings is a difficult and time-consuming process, even for experienced musicians. In response, several algorithms have been proposed to automatically analyze and transcribe the note...
Software energy consumption is a performance related non-functional requirement that complicates building software on mobile devices today. Energy hogging applications are a liability to both the end-user and software developer. Measuring software energy consumption is non-trivial, requiring both equipment and expertise, yet many researchers have f...
This work investigates the properties of crash reports collected from Ubuntu Linux users. Understanding crash reports is important to better store, categorize, prioritize, parse, triage, assign bugs to, and potentially synthesize them. Understanding what is in a crash report, and how the metadata and stack traces in crash reports vary will help sol...
Bug deduplication, i.e., recognizing bug reports that refer to the same problem, is a challenging task in the software-engineering life cycle. Researchers have proposed several methods primarily relying on information-retrieval techniques. Our work is motivated by the intuition that domain knowledge can provide the relevant context to enhance effectiven...
Bug deduplication is a hot topic in software engineering information retrieval research, but it is often not deployed. Typically, to de-duplicate bug reports, developers rely upon the search capabilities of the bug report software they employ, such as Bugzilla, Jira, or GitHub Issues. These search capabilities range from simple SQL string search to I...
Testing is an integral part of the software development lifecycle, approached with varying degrees of rigor by different process models. Agile process models advocate Test Driven Development (TDD) as one of their key practices for reducing costs and improving code quality. In this paper we comparatively analyze GitHub repositories that adopt TDD...
We created detailed profiles of the energy consumed by common operations done on Java List, Map, and Set abstractions. The results show that the alternative data types for these abstractions differ significantly in terms of energy consumption depending on the operations. For example, an ArrayList consumes less energy than a LinkedList if items are...
Software energy consumption is a relatively new concern for mobile application developers. Poor energy performance can harm adoption and sales of applications. Unfortunately for the developers, the measurement of software energy consumption is expensive in terms of hardware and difficult in terms of expertise. Many prior models of software energy c...
Developers summarize their changes to code in commit messages. When a message seems "unusual", however, this puts doubt into the quality of the code contained in the commit. We trained n-gram language models and used cross-entropy as an indicator of commit message "unusualness" of over 120,000 commits from open source projects. Build statuses colle...
Organizations like Mozilla, Microsoft, and Apple are flooded with thousands of automated crash reports per day. Although crash reports contain valuable information for debugging, there are often too many for developers to examine individually. Therefore, in industry, crash reports are often automatically grouped together in buckets. Ubuntu's reposi...
The improvement in battery technology for battery-driven devices is insignificant compared to their computing ability. In spite of the overwhelming advances in processing ability, adoption of sophisticated applications is hindered by the fear of shorter battery life. This is one of the several reasons software developers are becoming conscious of w...
Server energy consumption has been a subject of research for more than a decade now. With the Internet scaling rapidly all over the world, more servers are being added continuously. With global warming and the financial cost associated with running servers, it has now become a more pressing concern to optimize the power consumption of these servers while s...
Natural languages like English are rich, complex, and powerful. The highly creative and graceful use of languages like English and Tamil, by masters like Shakespeare and Avvaiyar, can certainly delight and inspire. But in practice, given cognitive constraints and the exigencies of daily life, most human utterances are far simpler and much more repe...
The business models of software/platform as a service have contributed to developers' dependence on the Internet. Developers can rapidly point each other and consumers to the newest software changes with the power of the hyperlink. But developers are not limited to referencing software changes to one another through the web. Other shared hypermedi...
This paper studies the relationship between Test Driven Development (TDD), productivity and developer sentiment in order to assess the impact of TDD on software development. We used a set of 256572 Java repositories archived from GitHub in September 2015 and made available through the Boa language and infrastructure. This research found that of the...
Bug triaging, i.e., assigning a bug report to the “best” person to address it, involves identifying a list of developers that are qualified to understand and address the bug report, and then ranking them according to their expertise. Most research in this area examines the description of the bug report and the developers’ prior development and bug-...
Apache Hadoop has evolved significantly over the last years, with more than 60 releases bringing new features. By implementing the MapReduce programming paradigm and leveraging HDFS, its distributed file system, Hadoop has become a reliable and fault tolerant middleware for parallel and distributed computing over large datasets. Nevertheless, Hadoo...
Nine rising stars in software engineering describe how software engineering research will evolve, highlighting emerging opportunities and groundbreaking solutions. They predict the rise of end-user programming, the monitoring of developers through neuroimaging and biometrics sensors, analysis of data from unstructured documents, the mining of mobil...
Computer Science often seems distant from its natural science cousins, especially software engineering which feels closer to sociology and psychology than to physics. Physical measurements are often rare in software engineering, except in a few niches. One such important niche is that of software energy consumption, green mining, green IT, and sust...