Hyunsook Do’s research while affiliated with University of North Texas and other places


Publications (53)


Figure 2. Dataset.
Figure 3. Prediction times of each model.
Figure 4. Validation loss of each model.
Figure 5. LIME plot for predicting satisfaction class.
Figure 6. LIME plot for predicting effectiveness class.
LIME-Aided Automated Usability Issue Detection from User Reviews: Leveraging LLMs for Enhanced User Experience Analysis
  • Conference Paper

May 2024 · 372 Reads · Stephanie Ludi · Hyunsook Do

Mobile applications have become essential in today's digital landscape, so optimizing their User eXperience (UX) is critical. Our study explored the application of Large Language Models (LLMs) for detecting usability issues in user reviews, including several architectures from the Bidirectional Encoder Representations from Transformers (BERT) family and advanced pre-trained models such as OpenAI's Generative Pre-trained Transformer (GPT) models GPT-3.5 and GPT-4 and Meta's Llama 2 (zero- and few-shot). The methodology encompassed data preprocessing, sentiment analysis, fine-tuning LLMs, and interpretability techniques, notably Local Interpretable Model-Agnostic Explanations (LIME). The findings indicated that the fine-tuned LLMs, particularly Robustly Optimized BERT Approach (RoBERTa), XLNet, and DistilBERT, were relatively successful in identifying the usability issues, achieving an accuracy rate of 96%. The study also assessed the advanced pre-trained models Llama 2, GPT-3.5, and GPT-4, which generally fell short of the performance achieved by the fine-tuned models. Finally, we found that LIME helped in understanding the decision-making processes of the fine-tuned models.


Fig. 1. A simple snippet of code showing the lockObject method in LockFactory. The child nodes (i.e., the methods lock and unlock) are called by lockObject. For both lock and unlock, lockObject was regarded as the most adequate parent node using HDP.
Capturing Contextual Relationships of Buggy Classes for Detecting Quality-Related Bugs

Quality concerns are critical for addressing system-wide issues related to reliability, security, and performance, among others. However, these concerns often become scattered across the codebase, making it challenging for software developers to address quality bugs effectively. In this paper, we propose a holistic approach to detecting and clustering quality-related content hidden within the codebase. By leveraging the Hierarchical Dirichlet Process (HDP) and complementary techniques such as information retrieval and machine learning, including structural and textual analysis, we create a meaningful hierarchy that detects classes containing relevant information for addressing quality bugs. This approach allows us to uncover rich synergies between complex structured artifacts and infer bug-fixing classes for repairing quality bugs. The reported results show that our approach improves over the state of the art, achieving a precision of 83%, a recall of 82%, and an F1 score of 83%.
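The three figures reported above are linked by the standard definition of F1 as the harmonic mean of precision and recall. As a minimal sketch (the function name `f1_score` is illustrative and not taken from the paper), the relationship can be checked from the rounded values quoted in the abstract:

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (standard F1 definition)."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both inputs are zero
    return 2 * precision * recall / (precision + recall)


# Rounded values as quoted in the abstract; the unrounded values are not given.
reported_precision = 0.83
reported_recall = 0.82

f1 = f1_score(reported_precision, reported_recall)
print(f"F1 from rounded inputs: {f1:.4f}")
```

Because the inputs are already rounded, the result is only approximate: it comes out near 0.825, which is consistent with the reported F1 up to rounding.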


Generalizability of NLP-based Models for Modern Software Development Cross-Domain Environments

March 2023 · 167 Reads · 2 Citations

Natural Language Processing (NLP) has been shown to be effective for solving complex problems in the Software Engineering (SE) domain, such as building chatbots and translating between multiple languages. Despite the advances enabled by NLP, there are technical loopholes that keep it from reaching its full potential within the SE domain. An open problem remains in the generalizability of NLP models to modern software development tasks that typically operate in dynamic environments, such as AWS and SaaS platforms. The problem with these setups is that they may not contain labeled data. This poses a challenge when applying the most prominent data-centric NLP models, such as BERT transformer models. This position paper highlights some of the most pressing challenges at the intersection of the NLP and SE domains. Our vision revolves around improving NLP model generalizability for dynamic cross-domain environments that contain little or no labeled target-domain data. We discuss these challenges and propose a research roadmap for tackling this problem as a research community, viewed through an SE lens.


Exploring Generalizability of NLP-based Models for Modern Software Development Cross-Domain Environments

March 2023 · 2 Reads



Towards Semantically Enhanced Detection of Emerging Quality-Related Concerns in Source Code

February 2023 · 354 Reads · 2 Citations

Software Quality Journal

Quality concerns defined by ISO/IEC 9126 that focus on the quality aspects of the product, such as efficiency, usability, and security, among others, tend to be neglected until they are retrofitted later at the implementation level. This retrofitting strategy poses a major challenge and hinders developers from efficiently detecting and understanding quality concerns, because they are frequently implemented with no particular structure and are bound to low cohesion (qualities scattered across the codebase). To address these problems, we propose an alternative approach for detecting scattered quality-related content in the codebase. We introduce SoftQualDetector, a lightweight framework that combines three unsupervised techniques for extracting a rich set of logical text units from the code in terms of semantics, importance, and textual features, in order to detect quality-related classes and generate short keyword summaries pertaining to those classes. SoftQualDetector also provides a 3D visualization for monitoring automatically detected quality-related concerns across the codebase so that developers can easily locate the emerging quality concerns and the associated classes. Our evaluation on 1,248 annotated Java classes shows that SoftQualDetector outperforms several state-of-the-art methods.


A Multi-Model Framework for Semantically Enhancing Detection of Quality-Related Bug Report Descriptions

February 2023 · 437 Reads · 4 Citations

Empirical Software Engineering

Maintaining and delivering a high-quality software system is a delicate process. One way to ensure that a software system achieves the desired quality is to systematically monitor and promptly address quality-related concerns. Quality concerns, such as reliability, usability, performance, and maintainability, among others, can have a broad impact in ensuring that a system remains consistently reliable and available at all times. In contrast, when such concerns are overlooked or become difficult to navigate or maintain, system-wide failures can emerge. Typically, these failures can severely hinder the core functionality of the system and produce a large number of quality bug reports. For developers, manually examining these high-impact quality-related bug reports in open-source issue tracking systems can become a prohibitively expensive and impractical task, partly because such bugs often require expert knowledge to address. A more perplexing concern is that these bugs are difficult to detect due to their intertwined relationship with functional bugs. Even worse, there are instances in which several types of quality concerns are intertwined with each other. These scenarios make quality concerns hard to discern. To ease this problem, we built a multi-model framework (BugReportSoftQualDetector) to automatically detect quality-related content in bug report descriptions. Specifically, we leveraged a weighted combination of semantic, lexical, and shallow features in conjunction with the Random Forest model to detect the six most emerging quality concerns present in bug report descriptions. Our results indicate that our approach outperformed two state-of-the-art approaches, one that leveraged lexical features and another that leveraged shallow features. To assess our approach, we examined six diverse open-source domains hosted on two issue-tracking systems, Jira and Bugzilla.
Through a grounded theory approach, we created a catalog of rules and employed the ISO 25010 taxonomy and the FURPS taxonomy to categorize bug reports into six quality types: performance, maintainability, reliability, portability, usability, and security. We then employed content analysis to manually label 5400 bug reports. Finally, we included a case study for tracing and visually mapping quality concerns into the codebase.


Fig. 1. Left: specific quality concerns scattered across 3 classes. Right: the detected, ranked classes affected by specific qualities.
Fig. 3. Architecture Overview of SOFTQUALTOPICDETECTOR
Fig. 4. Hierarchical Quality-Topic Meta Model (HQ-TMM). The Overlapping Quality Concerns Returned by matrix multiplication (M1 • M2) Infer Class(es) for Fixing Quality Bugs
Fig. 5. Quality-Trace Link Meta Model (Q-TLMM)
Fig. 6. Avg Prec and Rec across six Quality Types for SOFTQUALTOPICDETECTOR and State-of-the-Arts [18], [48]
A Hierarchical Topical Modeling Approach for Recommending Repair of Quality Bugs

January 2023 · 226 Reads · 1 Citation

Quality bugs are difficult to detect because the implemented quality-related features are commonly scattered across the codebase. Unfortunately, this scattered information prevents software developers from holistically understanding the root cause of quality bugs. The traditional view of a system does not support a hierarchical code view for monitoring and tracing how quality features are topically related and how they interact with each other. In this paper, we show how these limitations can be overcome by leveraging a Hierarchical Dirichlet Process (HDP) topic modeling technique, along with supporting intermediary techniques such as structural and textual analyses, to capture hierarchical topical relationships among quality features across the codebase that lead to the detection of quality bugs. We present SOFTQUALTOPICDETECTOR, which is capable of clustering scattered quality concerns into a meaningful hierarchy to infer a set of candidate classes relevant for recommending repair of quality bugs. The higher a class ranks in the hierarchy, the more relevant it is regarded to be for the bug under investigation. Additionally, SOFTQUALTOPICDETECTOR incorporates three rich visualization features for monitoring, prioritizing, and 3-D tracing of suspicious classes to enhance aspects of maintainability, functional suitability, and traceability. We conduct an empirical evaluation of SOFTQUALTOPICDETECTOR that shows an improvement over the baseline and the state of the art by ≈17% and ≈21% in terms of average precision and recall, respectively.


Investigating the User Experience and Evaluating Usability Issues in AI-Enabled Learning Mobile Apps: An Analysis of User Reviews

January 2023 · 3,588 Reads · 21 Citations

International Journal of Advanced Computer Science and Applications

Integrating artificial intelligence (AI) has become crucial in modern mobile application development. However, the current integration of AI in mobile learning applications presents several challenges regarding mobile app usability. This study aims to identify critical usability issues of AI-enabled mobile learning apps by analyzing user reviews. We conducted a qualitative and content analysis of user reviews for two groups of AI apps from the education category: language learning apps and educational support apps. Our findings reveal that while users generally report positive experiences, several AI-related usability issues impact user satisfaction, effectiveness, and efficiency. These challenges include AI-related functionality issues, performance, bias, explanation, and ineffective features. To enhance user experience and learning outcomes, developers must improve AI technology and adapt learning methodologies to meet users’ diverse demands and preferences while addressing these issues. By overcoming these challenges, AI-powered mobile learning apps can continue to evolve and provide users with engaging and personalized learning experiences.


ADSA – Association-Driven Safety Analysis to Expose Unknown Safety Issues

January 2023 · 36 Reads · 1 Citation

IEEE Transactions on Dependable and Secure Computing

Autonomous systems are susceptible to unknown safety issues due to overlooked dependencies among components of the system and the entities that are part of its operating environment. Current safety analysis techniques aid in identifying known safety issues but not overlooked or unknown ones. To identify unknown safety issues due to problematic interactions between components, we proposed safety assessment for concurrent components (SACC) in our previous work. Despite being more effective than FMEA and goal modeling, SACC suffers from some limitations, such as not considering environmental entities and their properties and relying on a manual process for identifying associated components for the collective analysis. For a complex system with a large number of components, such an analysis can result in overlooking safety issues. To address these limitations, in this paper we propose an association-driven safety analysis (ADSA) approach, which extends and builds on SACC. The approach uses a property-relation (PR) table and a modified association rule mining algorithm to identify components and environmental entities that need to be considered together to detect overlooked or unknown safety issues. We evaluated our approach using four robotic systems and compared it with SACC and systems theoretic process analysis (STPA). Our results show that our proposed approach, in particular when using behavioral dependencies, is effective at exposing unknown safety issues.
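To make the idea of association rule mining concrete, the following is a minimal sketch of the generic, textbook pairwise version based on support and confidence. It is not the modified algorithm or the PR table format from the paper; the function `pair_rules`, the thresholds, and the example component and entity names are all hypothetical and chosen only for illustration.

```python
from collections import Counter
from itertools import combinations


def pair_rules(transactions, min_support=0.5, min_confidence=0.8):
    """Return (lhs, rhs, support, confidence) rules over item pairs.

    support(a, b)  = share of transactions containing both a and b
    confidence(a -> b) = count(a and b) / count(a)
    """
    n = len(transactions)
    item_counts = Counter()
    pair_counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue  # pair co-occurs too rarely to be considered
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_counts[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return sorted(rules)


# Hypothetical observations: each set lists components and environmental
# entities that were involved in the same behavior.
observations = [
    {"lidar", "brake", "pedestrian"},
    {"lidar", "brake"},
    {"camera", "pedestrian"},
    {"lidar", "brake", "wet_road"},
]

rules = pair_rules(observations)
for lhs, rhs, support, confidence in rules:
    print(f"{lhs} -> {rhs}: support={support:.2f}, confidence={confidence:.2f}")
```

In this toy data only the `lidar` and `brake` pair clears both thresholds, which suggests the kind of grouping an analyst might then examine together; how the paper actually derives and uses such associations is not specified in the abstract.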


Identifying safety issues from energy conservation requirements

October 2022 · 33 Reads · 4 Citations

Journal of Software: Evolution and Process

In cyber‐physical systems such as robots and automated vehicles that rely heavily on batteries, safety and energy conservation can result in conflicting requirements when not considered together. In systems engineering, development begins at a concept phase, where we have high‐level information about the different components of the system. During the concept phase, we perform hazard analysis and risk assessment, define safety goals, and derive safety requirements. This means requirements engineering occurs at the end of the concept phase. However, energy conservation recommendations are not taken into consideration until the detailed design with specific hardware and software is known. Hence, it is possible to recommend energy conservation behaviors that can compromise the system's safety. If we perform a trade‐off analysis between safety and energy conservation at the concept phase, we can propose various design alternatives and choose the one that best offers a safe and energy‐saving architecture. To achieve this goal, we propose an approach for identifying safety issues that can be caused by energy conservation recommendations. To evaluate the effectiveness of our approach, we performed an empirical study on four robotic systems. Our results show that we can find energy conservation recommendations that can compromise safety at the concept phase. Safety and energy conservation both play an important role in autonomous systems, but these aspects are not considered together during systems engineering. As a result, it is possible to create energy conservation requirements that might compromise safety. This paper discusses how we can identify energy conservation issues that can potentially compromise safety and proposes an approach that allows engineers to come up with design alternatives that are safe and save energy.


Citations (38)


... Login plays a critical role in Android apps, typically representing the first point of interaction between the application and the users [1]. Issues in login processes can significantly degrade the user experience [2]-[5]. ...

Reference:

Characterizing Bugs in Login Processes of Android Applications: An Empirical Study
Investigating the User Experience and Evaluating Usability Issues in AI-Enabled Learning Mobile Apps: An Analysis of User Reviews

International Journal of Advanced Computer Science and Applications

... The first paper session contained two position papers presentations about (i) privacy implications when training LLMs with unsanitized corpora of code [2], presented by Ali Al-Kaswan (Delft University of Technology, the Netherlands), and (ii) generalizability of NLP-based models for software development cross-domain environments [23], presented by Rrezarta Krasniqi (University of North Texas, USA). Each paper presentation was followed by a general discussion involving the presenter, the NLBSE chairs, and the other NLBSE participants. ...

Generalizability of NLP-based Models for Modern Software Development Cross-Domain Environments

... Component or Function Specific Failure and Effects [153,154,155,156,157,158,159,160,74] [161,162,163,164,165,166,167,168,169] [170,171,172,173,174,175,176,177] [178,179,180,181,182,183,184,185,186] Fault Propagation and Representations that Support Propagation Analysis [187,188,189,190,191,192,193,194,195,196] [197,198,199,200,201,202,203,204,205,206] [207,208,209,210,211,212,213,214] Domain Specific Methods for Analysis, Design, and Operation [215,216,217,218,219,220,221] Methods and Tools that Expand Traditional System Design Tools for Risk or Safety [222,223,224,225,226,227,228,229,230] flows as nouns (energy, material, and signals). Further, functional descriptions can be understood as input-output relationships of black boxes, leading to function block diagrams. ...

ADSA – Association-Driven Safety Analysis to Expose Unknown Safety Issues
  • Citing Article
  • January 2023

IEEE Transactions on Dependable and Secure Computing

... These bug reports had previously been manually labeled by Krasniqi and Agrawal [14]. We chose these projects and bug reports because they were publicly available and had been used in previous studies for bug triaging of quality bugs [16], semantic detection of quality bugs [17], and semantic detection of quality concerns in source code [18]. ...

Towards Semantically Enhanced Detection of Emerging Quality-Related Concerns in Source Code

Software Quality Journal

... The most common way to report bugs is through issue tracking systems, but in the majority of cases there is no standard, which causes misinformation for the development team due to unclear or insufficient data [7]. Steps to reproduce the bug, stack trace errors, test case scenarios, logs, and images are factors that impact the quality of the bug reports [6,8,9,10,11,12]. ...

A Multi-Model Framework for Semantically Enhancing Detection of Quality-Related Bug Report Descriptions

Empirical Software Engineering

... The semantics feature was further augmented by instantiating a BERT transformation model [8]. We then introduced SOFTQUALTOPICDETECTOR [31], that could cluster scattered quality concerns across various artifacts into a meaningful topic hierarchy to infer a set of candidate classes relevant for recommending repair of quality bugs. Finally, we introduced an NLP-based framework SOFTQUALDETECTOR [23] that focused on detecting quality classes scattered across the codebase. ...

A Hierarchical Topical Modeling Approach for Recommending Repair of Quality Bugs

... IMUs typically contain sensors such as accelerometers and gyroscopes. Accelerometers measure the acceleration of an object in three directions, including the acceleration component generated by gravity [7]. ...

Identifying safety issues from energy conservation requirements
  • Citing Article
  • October 2022

Journal of Software: Evolution and Process

... These bug reports had previously been manually labeled by Krasniqi and Agrawal [14]. We chose these projects and bug reports because they were publicly available and had been used in previous studies for bug triaging of quality bugs [16], semantic detection of quality bugs [17], and semantic detection of quality concerns in source code [18]. ...

Automatically Capturing Quality-Related Concerns in Bug Report Descriptions for Efficient Bug Triaging

... Different malware detection techniques [6], [7], [8] have been proposed in literature focusing, particularly, on generating new malware. These can be categorized into two heads. ...

Combinatorial Testing of Context Aware Android Applications

Renee Bryce · [...]

... The ISO/PAS 21448 process was organized and a study on the development of a framework was carried out, but it was limited in that it focused only on scenario derivation and the framework [24]. A study of ISO/PAS 21448 scenarios for the safety of autonomous vehicles was also carried out, but it was limited in that it focused on the operating environment and scenarios [25]. To apply ISO/PAS 21448 effectively, it is necessary to analyze and study the entire ISO/PAS 21448 process, along with intensive research and an improvement plan for each stage. ...

A Dependency-based Combinatorial Approach for Reducing Effort for Scenario-based Safety Analysis of Autonomous Vehicles
  • Citing Conference Paper
  • January 2021