John Grundy’s research while affiliated with Monash University (Australia) and other places


Publications (842)


Demystifying React Native Android Apps for Static Analysis
  • Article

November 2024 · 13 Reads · ACM Transactions on Software Engineering and Methodology

Yonghui Liu · [...]
React Native, an open-source framework, simplifies cross-platform app development by allowing JavaScript-side code to interact with native-side code. Previous studies disregarded React Native, resulting in insufficient static analysis of React Native app code. This study initiates the investigation of challenges when statically analyzing React Native apps. We propose ReuNify to improve Soot-based static analysis coverage for JavaScript-side and native-side code. ReuNify converts Hermes bytecode to Soot’s intermediate representation. Hermes bytecode, compiled from JavaScript code and integrated into React Native apps, possesses a unique syntax that eludes current JavaScript analyzers. Additionally, we investigate opcode distribution and conduct in-depth analyses of opcode usage between popular apps and malware. We also propose a benchmark consisting of 97 control-flow-related cases to validate the control-flow recovery of the generated intermediate representation. Furthermore, we model the cross-language communication mechanisms of React Native to expand the static analysis coverage for native-side code. Our evaluation demonstrates that ReuNify enables an average increase of 84% in reached nodes within the call graph and further identifies an average of two additional privacy leaks in taint analysis. In summary, this paper demonstrates that ReuNify significantly improves static analysis of React Native Android apps.
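The core idea of converting bytecode into an analyzable intermediate representation can be illustrated with a toy sketch. This is purely illustrative and assumes nothing about ReuNify's actual implementation: the opcode names and the three-address statement format below are hypothetical stand-ins for real Hermes bytecode and Soot's Jimple IR.

```python
# Toy lowering of a Hermes-like bytecode sequence into three-address
# statements, loosely mirroring how a converter might emit Jimple-like IR.
# Opcode names and statement syntax are invented for illustration.

def lower_to_ir(bytecode):
    """Translate (opcode, *operands) tuples into IR statement strings."""
    stmts = []
    for op, *args in bytecode:
        if op == "LoadConst":        # rX = constant
            reg, const = args
            stmts.append(f"r{reg} = {const}")
        elif op == "Add":            # rX = rY + rZ
            dst, a, b = args
            stmts.append(f"r{dst} = r{a} + r{b}")
        elif op == "Call":           # rX = invoke f(rY, ...)
            dst, fn, *params = args
            joined = ", ".join(f"r{p}" for p in params)
            stmts.append(f"r{dst} = invoke {fn}({joined})")
        else:
            raise ValueError(f"unsupported opcode: {op}")
    return stmts

ir = lower_to_ir([
    ("LoadConst", 0, 1),
    ("LoadConst", 1, 2),
    ("Add", 2, 0, 1),
    ("Call", 3, "sendData", 2),
])
```

Once JavaScript-side behavior is expressed in the analyzer's own IR, existing call-graph and taint-analysis machinery can reach it, which is where the reported coverage gains come from.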


Figure 1: Systematic Literature Review Process
Figure 2: Systematic Literature Review Screening Strategy
Figure 3: Study Distribution by Year and Type
Figure 4: Number of Older Adults Participating in Each Study
Figure 5: Fuzzy Age Group and Mean Age of Older Adults Participating in Each Study


Requirements Engineering for Older Adult Digital Health Software: A Systematic Literature Review
  • Preprint
  • File available

November 2024 · 26 Reads
Growth of the older adult population has led to increasing interest in technology-supported aged care. However, the area faces challenges such as a shortage of caregivers and limited understanding of the emotional, social, physical, and mental well-being needs of seniors. Furthermore, there is a gap between developers' and ageing people's understanding of their requirements. Digital health can be important in supporting older adults' well-being, emotional requirements, and social needs. Requirements Engineering (RE) is a major software engineering field that can help identify, elicit, and prioritize stakeholder requirements and ensure that systems meet standards for performance, reliability, and usability. We carried out a systematic review of the literature on RE for older adult digital health software, to capture the current state of understanding of the needs of older adults in aged care digital health. Following established guidelines (the Kitchenham method, PRISMA, and the PICO framework), we developed a protocol and then systematically explored eight databases. This resulted in 69 highly relevant primary studies, which were subsequently subjected to data extraction, synthesis, and reporting. We highlight key RE processes in digital health software for ageing people, explore the use of technology for older users' well-being and care, and examine evaluations of such solutions. The review also identifies key limitations in existing primary studies that suggest future research opportunities. The results indicate that requirement gathering and understanding vary significantly between studies, in the quality, depth, and techniques adopted; these differences are largely due to uneven adoption of RE methods.



Just-In-Time TODO-Missed Commits Detection

November 2024 · 8 Reads · 3 Citations · IEEE Transactions on Software Engineering
TODO comments play an important role in helping developers manage their tasks and communicate with other team members. TODO comments are often introduced by developers as a form of technical debt, such as a reminder to add/remove features or a request to optimize a code implementation. These can all be considered notifications for developers to revisit current suboptimal solutions. TODO comments often bring short-term benefits, such as higher productivity or lower development cost, and indicate that attention needs to be paid to long-term software quality. Unfortunately, due to a lack of knowledge or experience and/or time constraints, developers sometimes forget, or may not even be aware of, suboptimal implementations. The loss of TODO comments for these suboptimal solutions may hurt software quality and reliability in the long term. Therefore it is beneficial to remind developers of suboptimal solutions whenever they change the code. In this work, we refer to this problem as the task of detecting TODO-missed commits, and we propose a novel approach named TDReminder (TODO comment Reminder) to address it. With the help of TDReminder, developers can identify possibly missing TODO comments just-in-time when submitting a commit. Our approach has two phases: offline training and online inference. We first embed the code change and the commit message into contextual vector representations using two neural encoders, respectively. The association between these representations is learned by our model automatically. In the online inference phase, TDReminder leverages the trained model to compute the likelihood of a commit being a TODO-missed commit. We evaluate TDReminder on datasets crawled from 10k popular Python and Java repositories on GitHub. Our experimental results show that TDReminder outperforms a set of benchmarks by a large margin in TODO-missed commit detection.
Moreover, to better help developers use TDReminder in practice, we have incorporated Large Language Models (LLMs) into our approach to provide explainable recommendations. A user study shows that our tool can effectively inform developers not only "when" to add TODOs, but also "where" and "what" TODOs should be added, verifying the value of our tool in practical applications.
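The two-encoder scoring scheme described above can be sketched minimally. This assumes nothing about TDReminder's real architecture: the encoders here are toy hashed bag-of-words embeddings rather than trained neural encoders, and the association score is simply a dot product squashed through a sigmoid.

```python
# Minimal sketch of scoring a (code change, commit message) pair for the
# likelihood of being a TODO-missed commit. Toy encoders stand in for the
# trained neural encoders a real system would use.
import math

DIM = 32

def encode(text):
    """Toy encoder: hash tokens into a fixed-size, L2-normalised vector."""
    vec = [0.0] * DIM
    for tok in text.lower().split():
        vec[hash(tok) % DIM] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def todo_missed_score(code_change, commit_message):
    """Likelihood-style score in (0, 1) that the commit misses a TODO."""
    c = encode(code_change)
    m = encode(commit_message)
    dot = sum(a * b for a, b in zip(c, m))
    return 1.0 / (1.0 + math.exp(-dot))

score = todo_missed_score("def fetch(): pass  # temporary stub",
                          "add quick workaround for fetch")
```

In the offline phase the association between the two representations would be learned from labelled commits; here the geometry is fixed, which is enough to show the inference-time shape of the task.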




Figure 1: Demonstration of existing model obfuscations [37]: the weights of the conv2d operator are hidden, the operator name is renamed to a random string (conv2d → wripyx), and an extra obfuscating operator (mjzdmh) is injected. A customised DL API library is generated to execute inference of the obfuscated model; the extra operator only has an obfuscating function that copies its input to its output.
Figure 2: Overview of the proposed model deobfuscation method DLModelExplorer; the mjzdmh operator (red dotted block) is an extra obfuscating operator.
Figure 4: Meta-model of our method.
Table: Performance of DynaMO in defending against the instrumentation attack DLModelExplorer. 'TN': true negative rate (correct identification of obfuscating operators). 'Difference': average value difference between attack performance under DynaMO and under existing obfuscation methods (Table 1). Lower is better for WER, NIR, OCA, SS; higher is better for WEE.
Table: Overhead of DynaMO in RAM cost (MB per model) compared with the existing obfuscation method [37], reported as the increment of RAM usage to eliminate the influence of other processes on the test machine.
DynaMO: Protecting Mobile DL Models through Coupling Obfuscated DL Operators

October 2024 · 14 Reads
Deploying DL models in mobile apps has become ever more popular. However, existing studies show attackers can easily reverse-engineer mobile DL models in apps to steal intellectual property or generate effective attacks. A recent approach, model obfuscation, has been proposed to defend against such reverse engineering by obfuscating DL model representations, such as weights and computational graphs, without affecting model performance. These existing model obfuscation methods use static methods to obfuscate the model representation, or use half-dynamic methods that require users to restore the model information through additional input arguments. However, neither static nor half-dynamic methods can provide enough protection for on-device DL models: because the correct model information and intermediate results must be recovered at runtime, attackers can use dynamic analysis to mine sensitive information from the inference code. We assess the vulnerability of existing obfuscation strategies using an instrumentation method and tool, DLModelExplorer, that dynamically extracts correct sensitive model information at runtime. Experiments show it achieves very high attack performance. To defend against such attacks based on dynamic instrumentation, we propose DynaMO, a Dynamic Model Obfuscation strategy similar to homomorphic encryption. Obfuscation and recovery are done through a simple linear transformation of the weights of randomly coupled eligible operators, making it a fully dynamic obfuscation strategy. Experiments show that our proposed strategy dramatically improves model security compared with existing obfuscation strategies, with only negligible overheads for on-device models.
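The coupled linear-transformation idea can be demonstrated with a scalar toy, under the assumption that two consecutive operators compose linearly: scale the first operator's weights by a secret factor and the second's by its inverse, so the composed output is unchanged while neither stored weight matches the original. This is a sketch of the scheme the abstract describes, not DynaMO's actual code.

```python
# Couple two 1-D "linear layers" with a hidden scaling factor so that
# obfuscated weights differ from the originals but inference is preserved.
import random

def obfuscate_pair(w1, w2, rng=random):
    """Scale one weight list by a secret factor and the other by its inverse."""
    a = rng.uniform(1.1, 2.0)          # secret obfuscation coefficient
    w1_obf = [a * w for w in w1]       # first operator scaled by a
    w2_obf = [w / a for w in w2]       # second operator scaled by 1/a
    return w1_obf, w2_obf

def compose(w1, w2, x):
    """Two stacked elementwise linear ops: y_i = w2_i * (w1_i * x)."""
    return [b * (a * x) for a, b in zip(w1, w2)]

w1, w2 = [0.5, 1.5], [2.0, 4.0]
o1, o2 = obfuscate_pair(w1, w2)
# compose(o1, o2, x) matches compose(w1, w2, x) for any x,
# yet neither o1 nor o2 reveals the original weights on its own.
```

An attacker dumping either operator's stored weights at rest recovers only the scaled values; the coupling factor never appears explicitly, which is the intuition behind the "fully dynamic" claim.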


Patch Overfitting in Program Repair: A Survey

October 2024 · 351 Reads

Automatic program repair (APR) has established itself as a promising approach for enhancing software maintenance and reducing manual bug-fixing effort. Despite its potential, a large body of state-of-the-art APR techniques generate patches that are overfitted to the test oracle. These overfitting patches can degrade the original program by introducing security vulnerabilities or eliminating beneficial features, greatly hindering APR's adoption in practical settings. Consequently, there is a strong demand to enhance the practicality of APR techniques by addressing the overfitting issue. This has resulted in the development of a broad spectrum of techniques and empirical studies across different stages of the APR pipeline, each designed to make APR more practical and effective by uniquely addressing patch overfitting. In this article, we provide a comprehensive overview and catalog of APR studies and techniques, and discuss the primary concerns and approaches to date for addressing the patch overfitting problem.


Fig. 3: Screenshots of our PPGen tool in use
Interactive GDPR-Compliant Privacy Policy Generation for Software Applications

October 2024 · 24 Reads

Software applications are designed to assist users in conducting a wide range of tasks or interactions. They have become prevalent and play an integral part in people's lives in this digital era. To use those software applications, users are sometimes requested to provide their personal information. As privacy has become a significant concern and many data protection regulations exist worldwide, software applications must provide users with a privacy policy detailing how their personal information is collected and processed. We propose an approach that generates a comprehensive and compliant privacy policy with respect to the General Data Protection Regulation (GDPR) for diverse software applications. To support this, we first built a library of privacy clauses based on existing privacy policy analysis. We then developed an interactive rule-based system that prompts software developers with a series of questions and uses their answers to generate a customised privacy policy for a given software application. We evaluated privacy policies generated by our approach in terms of readability, completeness and coverage and compared them to privacy policies generated by three existing privacy policy generators and a Generative AI-based tool. Our evaluation results show that the privacy policy generated by our approach is the most complete and comprehensive.
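The interactive rule-based mechanism described above can be sketched as a mapping from developer answers to clauses in a clause library. The questions and clause texts below are invented for illustration and do not come from PPGen's actual library.

```python
# Toy rule-based policy generator: developer answers select clauses from a
# (hypothetical) pre-built library, and the policy is assembled from them.

CLAUSE_LIBRARY = {
    "collects_email": "We collect your email address to create and manage your account.",
    "uses_analytics": "We use usage analytics data to improve our services.",
    "shares_third_party": "We share personal data with third-party processors under GDPR Art. 28.",
}

def generate_policy(answers):
    """Return a policy built from clauses whose question was answered 'yes'."""
    selected = [CLAUSE_LIBRARY[q] for q, yes in answers.items()
                if yes and q in CLAUSE_LIBRARY]
    return "\n".join(selected)

policy = generate_policy({
    "collects_email": True,
    "uses_analytics": False,
    "shares_third_party": True,
})
```

A real system would chain questions (answers gating follow-up questions) and parameterise clause text with app-specific details, but the answer-to-clause mapping is the core of the rule-based approach.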



Citations (40)


... While ChatGPT has shown promising results in code refinement tasks, the OpenAI service's closed-source and proprietary nature poses challenges to its applicability in software development projects [18]. First, concerns about data ownership and privacy would prevent organizations from using this solution in their code review workflow [19]. ...

Reference:

Exploring the Potential of Llama Models in Automated Code Refinement: A Replication Study
Large Language Models for Software Engineering: A Systematic Literature Review
  • Citing Article
  • September 2024

ACM Transactions on Software Engineering and Methodology

... Recent model obfuscation approaches propose using static or dynamic methods to obfuscate the representation of on-device models [3, 36, 37]. DL model representations produced by model obfuscation methods cannot be understood by automatic tools or humans, but model performance is not affected [37]. ...

Model-less Is the Best Model: Generating Pure Code Implementations to Replace On-Device DL Models
  • Citing Conference Paper
  • September 2024

... Beyond the aforementioned research on accessibility issue detection and repair, several studies have focused on comprehensive reviews of accessibility issues. These reviews can be broadly categorized into two main areas: those that concentrate on specific disability groups, such as visually impaired individuals, in their use of digital technologies [33], [34], [35], [36], [37] and the other that explores accessibility issues faced by individuals with disabilities in specific contexts [38], [39], [40], [41], [42]. ...

Accessibility of low-code approaches: A systematic literature review
  • Citing Article
  • September 2024

Information and Software Technology

... This threshold is set based on a checklist of eight criteria, with a score of five signifying that the paper meets more than half of these standards [28]. Furthermore, this scoring criterion is established to enhance the findings' reliability, integrity, and validity, thus ensuring that the research results and evidence are trustworthy [26,29]. Therefore, the results of this assessment obtained 23 final papers. ...

The impact of human aspects on the interactions between software developers and end-users in software engineering: A systematic literature review
  • Citing Article
  • September 2024

Information and Software Technology

... They are more representative of real edge environments when compared to simulators, and are more easily accessible and cost-effective when compared to real edge deployments. The main aim of existing edge computing emulators is to create a staging environment that achieves compute and network realism similar to a real edge environment and facilitate testing of IoT applications before deploying them into production [9]- [14]. However, these general-purpose emulators do not incorporate in their design, tools and mechanisms required to autonomously and transparently generate large-scale performance anomaly datasets useful for model training and evaluation. ...

iContinuum: An Emulation Toolkit for Intent-Based Computing Across the Edge-to-Cloud Continuum

... For example, teams can synthesize and derive requirement engineering artifacts, such as user scenarios, user stories, concept mindmaps, and user personas [14]. Among these artifacts, user personas help teams portray the characteristics of actual users, develop empathy towards them, reduce assumptions about them, and focus on design decisions [7,17]. User personas have been used as references in clarifying requirements, designing user interfaces and user experiences, selecting participants for usability testing, and refining testing scenarios [8,15]. ...

Lessons Learned from Persona Usage in Requirements Engineering Practice
  • Citing Conference Paper
  • June 2024

... Meanwhile, [41] suggests involving stakeholders early in the development process and maintaining continuous communication as strategies for better requirements management. The research proposed in [42] focuses on reducing requirement uncertainty and aims to decrease RV and enhance project stability by applying cognitive approaches. According to the authors, clear and thorough requirements elicitation can significantly reduce the uncertainty that leads to RV. ...

Advancing Requirements Engineering Through Generative AI: Assessing the Role of LLMs
  • Citing Chapter
  • June 2024

... Inspired by the great advancements and potential of Large Language Models (LLMs) [9], [10], [11], [12], [13], [14], [15], [16], [17] in code generation [18], [19], [20], [21], [22], [23], [24], in this work, we proposed a novel LLM-based framework, named SOUP (Stack Overflow Updator for Post), to perform the VCP and APU tasks. For the VCP task, we first manually annotated 5K comment-edit pairs, we then finetuned a LLM for this task, and the trained model is denoted as SOUP p . ...

Just-In-Time TODO-Missed Commits Detection
  • Citing Article
  • November 2024

IEEE Transactions on Software Engineering

... Inspired by the great advancements and potential of Large Language Models (LLMs) [9], [10], [11], [12], [13], [14], [15], [16], [17] in code generation [18], [19], [20], [21], [22], [23], [24], in this work, we proposed a novel LLM-based framework, named SOUP (Stack Overflow Updator for Post), to perform the VCP and APU tasks. For the VCP task, we first manually annotated 5K comment-edit pairs, we then finetuned a LLM for this task, and the trained model is denoted as SOUP p . ...

What Makes a Good TODO Comment?
  • Citing Article
  • May 2024

ACM Transactions on Software Engineering and Methodology

... Requirements changes are inevitable in software development [26,27]. They may arise at any stage of the software development process due to different internal and external factors, e.g., customer needs, technological changes, market change, budget change, and global competition [28,29]. ...

Supporting Emotional Intelligence, Productivity and Team Goals while Handling Software Requirements Changes
  • Citing Article
  • May 2024

ACM Transactions on Software Engineering and Methodology