Abram Hindle’s research while affiliated with University of Alberta and other places

Publications (197)


Federated Learning and Differential Privacy Techniques on Multi-hospital Population-scale Electrocardiogram Data
  • Conference Paper

September 2024 · 4 Reads · 2 Citations

Vikhyat Agrawal · [...]


IRJIT’s Approach
Performance evolution of a typical JIT-SDP approach over time showing a declining trend. x-axis shows time step and y-axis shows G-mean values. The model was trained once and the subsequent predictions were made without retraining the model
An overview of IRJIT operation. IRJIT extracts past changes using SZZ and indexes them using inverted indexes. On arrival of a new commit, IRJIT classifies the commit based on its similarity with the past changes in the index. Finally, IRJIT ranks changed lines according to bugginess
Comparing CPU/GPU run time of the evaluated approaches in seconds. Only JITFine was evaluated on GPU. ORB is not included in the plot because of its negligible run time
Comparing G-mean and difference of recalls (|R0-R1|) performance of IRJIT$_{online}$ and ORB for all datasets. X-axis shows timesteps


IRJIT: A simple, online, information retrieval approach for just-in-time software defect prediction
  • Article

August 2024 · 21 Reads

Empirical Software Engineering

Just-in-Time software defect prediction (JIT-SDP) prevents the introduction of defects into the software by identifying them at commit check-in time. Current software defect prediction approaches rely on manually crafted features such as change metrics and involve expensive-to-train machine learning or deep learning models. These models typically involve extensive training processes that may require significant computational resources and time. These characteristics can pose challenges when attempting to update the models in real time as new examples become available, potentially impacting their suitability for fast online defect prediction. Furthermore, the reliance on a complex underlying model often makes these approaches less explainable, which means that developers cannot understand the reasons behind the models’ predictions. An approach that is not explainable might not be adopted in real-life development environments because of developers’ lack of trust in its results. To address these limitations, we propose an approach called IRJIT that employs information retrieval on source code and labels new commits as buggy or clean based on their similarity to past buggy or clean commits. The IRJIT approach is online and explainable, as it can learn from new data without expensive retraining, and developers can see the documents that support a prediction, providing additional context. By evaluating 10 open-source datasets in a within-project setting, we show that our approach is up to 112 times faster than the state-of-the-art ML and DL approaches, offers explainability at the commit and line level, and has comparable performance to the state-of-the-art.
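The idea in the abstract above, labelling a new commit by its similarity to indexed past commits, can be illustrated with a small sketch. This is not IRJIT's implementation: the `CommitIndex` class, the `tokenize` helper, the Jaccard similarity, the majority vote over `k` neighbours, and the "clean" default are all assumptions chosen for illustration. The abstract and captions only state that past changes are held in inverted indexes and that new commits are classified by similarity.

```python
from collections import Counter, defaultdict
import re


def tokenize(text):
    """Split changed source lines into lowercase identifier-like tokens."""
    return re.findall(r"[A-Za-z_]\w*", text.lower())


class CommitIndex:
    """Toy inverted index over past commits (hypothetical, not IRJIT's code)."""

    def __init__(self):
        self.postings = defaultdict(set)  # token -> ids of commits containing it
        self.tokens = []                  # commit id -> set of tokens
        self.labels = []                  # commit id -> "buggy" or "clean"

    def add(self, changed_lines, label):
        """Index one labelled commit; learning is just an index update."""
        commit_id = len(self.tokens)
        toks = set(tokenize(changed_lines))
        self.tokens.append(toks)
        self.labels.append(label)
        for tok in toks:
            self.postings[tok].add(commit_id)

    def classify(self, changed_lines, k=3):
        """Return (label, supporting commit ids) from the k most similar commits."""
        query = set(tokenize(changed_lines))
        candidates = set()
        for tok in query:
            candidates |= self.postings.get(tok, set())
        # Jaccard similarity is an assumed stand-in for the real ranking function.
        scored = sorted(
            ((len(query & self.tokens[c]) / len(query | self.tokens[c]), c)
             for c in candidates),
            reverse=True,
        )[:k]
        if not scored:
            return "clean", []  # assumed default when nothing similar is indexed
        votes = Counter(self.labels[c] for _, c in scored)
        return votes.most_common(1)[0][0], [c for _, c in scored]


index = CommitIndex()
index.add("if (ptr == NULL) return ptr->value;", "buggy")
index.add("for i in range(n): total += values[i]", "clean")
index.add("return ptr->value without null check", "buggy")
label, evidence = index.classify("x = ptr->value; if (ptr == NULL) return;", k=2)
print(label, evidence)
```

The returned commit ids play the role of the supporting documents mentioned in the abstract: a developer could inspect them to see why a commit was flagged. Adding a newly labelled commit only requires calling `add`, which is the sense in which such an approach can stay online without retraining.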




Fig. 3 | Comparison of AUROC performances for DL: ECG, age, sex model for 15 cardiovascular conditions for specific subgroups. Evaluations are performed separately for the males and females of the holdout patients (a) as well as ECGs without pacemakers and all ECGs in the holdout set (b). The height of bars represents the performance in external holdout validation and the models with statistically higher performance are indicated with a star.
Fig. 4 | The GradCAM plots for the DL model in diagnosis of different cardiovascular conditions. Representative ECG traces were chosen for a selected group of diagnoses. GradCAM results do not extend to the entire population, but are indicative of the DL model's prediction for a single representative case. The darker areas in each trace on GradCAM denote the areas with the most contribution to the DL model's diagnostic prediction. PR intervals and QRS complexes in STEMI, T waves in NSTEMI, QRS complexes in PHTN, VT beats in patients with non-sustained VT, QRS complexes in AS, P waves in AVB, and ST segment region in HF contributed the most to the diagnosis of each condition. AS aortic stenosis, AVB atrioventricular block, DL deep learning, ECG electrocardiogram, HF heart failure, NSTEMI non-ST-elevation myocardial infarction, STEMI ST-elevation myocardial infarction, PHTN pulmonary hypertension, VT ventricular tachycardia.
Fig. 5 | Heatmap of feature importance analyses of XGBoost models with ECG measurements, age, and sex. Information gain-based feature importance for various cardiovascular conditions with XGBoost models based on ECG measurements showed substantial information gain with P-duration for prediction of AF, heart rate for SVT, RR interval for UA etc. ECG electrocardiogram. Abbreviations for ECG measurements and diseases are provided in Supplementary Tables 8 and 9.
Development and validation of machine learning algorithms based on electrocardiograms for cardiovascular diagnoses at the population level

May 2024 · 60 Reads · 5 Citations

npj Digital Medicine

Artificial intelligence-enabled electrocardiogram (ECG) algorithms are gaining prominence for the early detection of cardiovascular (CV) conditions, including those not traditionally associated with conventional ECG measures or expert interpretation. This study develops and validates such models for simultaneous prediction of 15 different common CV diagnoses at the population level. We conducted a retrospective study that included 1,605,268 ECGs of 244,077 adult patients presenting to 84 emergency departments or hospitals, who underwent at least one 12-lead ECG from February 2007 to April 2020 in Alberta, Canada, and considered 15 CV diagnoses, as identified by International Classification of Diseases, 10th revision (ICD-10) codes: atrial fibrillation (AF), supraventricular tachycardia (SVT), ventricular tachycardia (VT), cardiac arrest (CA), atrioventricular block (AVB), unstable angina (UA), ST-elevation myocardial infarction (STEMI), non-STEMI (NSTEMI), pulmonary embolism (PE), hypertrophic cardiomyopathy (HCM), aortic stenosis (AS), mitral valve prolapse (MVP), mitral valve stenosis (MS), pulmonary hypertension (PHTN), and heart failure (HF). We employed ResNet-based deep learning (DL) using ECG tracings and extreme gradient boosting (XGB) using ECG measurements. When evaluated on the first ECGs per episode of 97,631 holdout patients, the DL models had an area under the receiver operating characteristic curve (AUROC) of <80% for 3 CV conditions (PE, SVT, UA), 80–90% for 8 CV conditions (CA, NSTEMI, VT, MVP, PHTN, AS, AF, HF) and an AUROC > 90% for 4 diagnoses (AVB, HCM, MS, STEMI). DL models outperformed XGB models with about 5% higher AUROC on average. Overall, ECG-based prediction models demonstrated good-to-excellent prediction performance in diagnosing common CV conditions.


Patterns of multi-container composition for service orchestration with Docker Compose

May 2024 · 165 Reads · 1 Citation

Empirical Software Engineering

Software design patterns present general code solutions to common software design problems. Modern software systems rely heavily on containers for running their constituent service components. Yet, despite the prevalence of ready-to-use Docker service images ready to participate in multi-container service compositions of applications, developers do not have much guidance on how to compose their own Docker service orchestrations. Thus in this work, we curate a dataset of successful projects that employ Docker Compose as an orchestration tool to run multiple service containers; then, we engage in qualitative and quantitative analysis of Docker Compose configurations. The collection of data and analysis enables the identification and naming of repeating multi-container composition patterns that are used in numerous successful open-source projects, much like software design patterns. These patterns highlight how software systems are orchestrated in the real-world and can give examples to anybody wishing to compose their own service orchestrations. These contributions also advance empirical research in software engineering patterns as evidence is provided about how Docker Compose is used.


Predicting Individual Survival Distributions Using ECG: A Deep Learning Approach Utilizing Features Extracted by a Learned Diagnostic Model

January 2024 · 52 Reads · 1 Citation

Proceedings of the AAAI Symposium Series

In the field of healthcare, individual survival prediction is important for personalized treatment planning. This study presents machine learning algorithms for predicting Individual Survival Distributions (ISD) using electrocardiography (ECG) data in two different formats. The models, which predict time until death, are developed and evaluated on a large, population-based cohort from Alberta, Canada. Our results demonstrate that models trained on raw ECG waveforms significantly outperform those trained on traditional ECG measurements in several metrics, including concordance index, hinge L1 loss, margin L1 loss, and margin truncated L1 loss. Additionally, the integration of predicted probabilities from wide-range diagnostic tasks not only enhances our ISD models' performance but also makes them significantly superior to other models across all evaluation metrics in individual survival prediction tasks. This innovative approach highlights the potential to leverage insights from diagnostic models for prognostic tasks, such as individual survival prediction. These findings could have far-reaching implications for the development of personalized treatment plans and open new avenues for future research in survival prediction using ECGs.
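Of the metrics named in the abstract above, the concordance index is the easiest to show in a few lines. The sketch below computes Harrell's C-index from a single risk score per patient. How the paper derives a risk score from a predicted survival distribution is not stated here, so the `risk_scores` input (for example, a negated predicted median survival time) and the sample data are assumptions made for illustration only.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C: share of comparable pairs whose risk ordering matches outcome.

    times       -- observed time for each subject (event or censoring time)
    events      -- 1 if death was observed, 0 if the subject was censored
    risk_scores -- higher score means shorter predicted survival (assumed input)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when subject i had an observed event
            # strictly before subject j's observed time.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # ties in predicted risk count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy cohort: the third subject is censored at time 6.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
risk_scores = [0.9, 0.4, 0.5, 0.1]
print(concordance_index(times, events, risk_scores))  # 4 of 5 comparable pairs -> 0.8
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering. The censored subject never starts a comparable pair, but can still appear as the later member of one.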





Citations (64)


... Moreover, while high accuracy and validation rates in federated learning (FL) demonstrate the model's capacity to effectively classify or predict outcomes, their implications in real-world healthcare applications, such as in smart hospitals, extend beyond these numerical indicators [90]. The high accuracy achieved by FL suggests its potential to handle sensitive neuroimaging data without centralizing it, a critical advantage in environments where patient privacy and data security are paramount [91]. ...

Reference:

Decoding Schizophrenia: How AI-Enhanced fMRI Unlocks New Pathways for Precision Psychiatry
Federated Learning and Differential Privacy Techniques on Multi-hospital Population-scale Electrocardiogram Data
  • Citing Conference Paper
  • September 2024

... However, most of these studies have primarily focused on easily accessible clinical and laboratory parameters, neglecting a more comprehensive evaluation of the patient's nutritional and inflammatory status [9, 15-19]. Accumulating evidence suggests that nutritional and inflammatory indices are key predictors of the outcomes of various cardiovascular and pulmonary diseases. One such index, the advanced lung cancer inflammation index (ALI), is a composite marker that reflects the balance between the nutritional and inflammatory parameters. ...

Development and validation of machine learning algorithms based on electrocardiograms for cardiovascular diagnoses at the population level

npj Digital Medicine

... As microservices architectures grow in complexity, observability becomes crucial for maintaining system health and performance. Observability encompasses three main pillars: logging, monitoring, and distributed tracing [7]. According to a 2023 survey by the Cloud Native Computing Foundation, 90% of organizations consider observability critical or important for their production environments [8]. ...

Patterns of multi-container composition for service orchestration with Docker Compose

Empirical Software Engineering

... To mark a software change as defect-inducing, defect-fixing changes can be identified from issue tracking systems or commit messages. Defect-fixing changes can be linked to defect-inducing changes using variants of the SZZ algorithm [22,23]. The choice of how defect-inducing changes are found can affect the outcomes of a defect prediction model [24,25]. ...

Identifying Defect-Inducing Changes in Visual Code
  • Citing Conference Paper
  • October 2023

... | Year | Publication Venue
PS01 | Carvalho et al. [11] | 2014 | Computer Science and Information Systems
PS02 | Ding et al. [20] | 2014 | International Conference on Engineering of Complex Computer Systems
PS03 | Bigliardi et al. [9] | 2014 | International Conference on Quality Software
PS04 | Aversano et al. [8] | 2017 | International Conference on Evaluation of Novel Approaches to Software Engineering
PS05 | Ma et al. [40] | 2018 | International Conference on Mining Software Repositories
PS06 | Prana et al. [53] | 2019 | Empirical Software Engineering
PS07 | AlOmar et al. [4] | 2021 | Journal of Software: Evolution and Process
PS08 | Pasuksmit et al. [46] | 2022 | International Conference on Mining Software Repositories
PS09 | Puhlfürß et al. [54] | 2022 | International Conference on Software Maintenance and Evolution
PS10 | Sun et al. [60] | 2023 | International Conference on Mining Software Repositories
PS11 | Wermke et al. [65] | 2023 | Symposium on Security and Privacy
PS12 | Ciurumelea et al. [14] | 2023 | Empirical Software Engineering ...

An Empirical Study to Investigate Collaboration Among Developers in Open Source Software (OSS)
  • Citing Conference Paper
  • May 2023

... In contrast to our research, their scope encompasses a broader range of projects, whereas we concentrate solely on Python-based deep learning projects and investigate the usage of tools within source code. Finally, Lin et al. [8] present a comprehensive study on test automation practices in open-source Android apps, focusing on over 12,000 projects from various app markets over a period of 5 years. The main objective was to investigate the adoption of test automation in non-trivial apps, with a particular emphasis on UI and unit tests. ...

Evolution of the Practice of Software Testing in Java Projects
  • Citing Conference Paper
  • May 2023

... GSM, mobile device usage patterns, and energy usage at different states (on and off). Also, [17][18][19] investigated video service energy usage, focusing on quality delivery without specifying the network under investigation. From [16][17][18][19], the relationship between network quality and energy utilisation of smart mobile devices is crucial for users to understand and comprehend its impact on their device's performance. ...

Energy Consumption Estimation of API-usage in Smartphone Apps via Static Analysis
  • Citing Conference Paper
  • May 2023

... [32] employed similarity-based modeling for heart failure prediction via wearable sensors. [33] used ResNet-based deep learning and XGBoost for ECG-based predictions, outperforming traditional models. [34] combined convolutional neural networks with clinical risk factors to predict atrial fibrillation. ...

Towards artificial intelligence-based learning health system for population-level mortality prediction using electrocardiograms

npj Digital Medicine

... In a reflections article (Posnett et al. 2021) on their original paper of 2011 (Posnett et al. 2011), Posnett et al. point out the distinction between readability and program comprehension and while the research area is actively worked on, not all work in readability is related to comprehension. Scalabrino et al. for example investigate program comprehension with the code readability aspect (Scalabrino et al. 2016, 2017). ...

Reflections on: A Simpler Model of Software Readability
  • Citing Article
  • July 2021

ACM SIGSOFT Software Engineering Notes

... We apply SentenceTransformer (Reimers and Gurevych 2019), a pre-trained BERT (Bidirectional Encoder Representations from Transformers) model, to generate a sentence embedding for each SATD comment. This model is broadly used to produce document vectors of SATD clones in source code comments and similar questions in Q&A websites (Yasmin et al. 2022;Kamienski et al. 2023). ...

Analyzing Techniques for Duplicate Question Detection on Q&A Websites for Game Developers

Empirical Software Engineering