Hye-young Paik

UNSW Sydney | UNSW · School of Computer Science and Engineering

PhD, Computer Science

133
Publications
36,197
Reads
1,429
Citations

Publications (133)
Article
Blockchain eliminates the need for trusted third-party intermediaries in business by enabling decentralised architecture design in software applications. However, the vulnerabilities in on-chain autonomous decision-making and cumbersome off-chain coordination lead to serious concerns about blockchain’s ability to behave in a trustworthy and effici...
Article
Federated learning has received fast-growing interest from academia and industry as a way to tackle the challenges of data hungriness and privacy in machine learning. A federated learning system can be viewed as a large-scale distributed system with different components and stakeholders, as numerous client devices participate in federated learning. Designin...
Preprint
Federated learning is growing fast in both academia and industry to resolve data hungriness and privacy issues in machine learning. A federated learning system being widely distributed with different components and stakeholders requires software system design thinking. For instance, multiple patterns and tactics have been summarised by researchers...
Preprint
Blockchain technology has been exploited to build next-generation applications with its decentralised nature. However, serious concerns are raised about the trustworthiness of the blockchain due to vulnerabilities in on-chain algorithmic mechanisms and tedious debates in the off-chain community. Accordingly, the governance of bloc...
Article
Little attention has been paid to how to investigate crimes related to the Internet of Things (IoT). We propose a forensic investigation framework that considers various aspects of IoT devices and evaluate it with 32 users, including investigators, law enforcement officers, and incident responders.
Article
The push for digitising personal health records needs to occur with serious consideration of privacy in order to instill public confidence. However, the healthcare sector still experiences leakage of Personally Identifiable Information (PII) due to improper data protection practices and security failures by data custodians. Data minimisation refers...
Preprint
The ability to evaluate uncertainties in evolving data streams has become as crucial as, if not more crucial than, building a static predictor. For instance, during the pandemic, a forecast model should always estimate its uncertainty around dynamic factors such as governmental policies, meteorological features and vaccination schedules. Targeting this,...
Preprint
Blockchain eliminates the need for trusted third party intermediaries in business by enabling decentralised architecture in software applications. However, vulnerabilities in on-chain autonomous decision-making and cumbersome off-chain coordination have led to serious concerns about blockchain's ability to behave and make decisions in a trustworthy...
Preprint
The ability to deal with uncertainty in machine learning models has become as crucial as, if not more crucial than, their predictive ability itself. For instance, during the pandemic, governmental policies and personal decisions are constantly made around uncertainties. Targeting this, Neural Process Families (NPFs) have recently shone a light on predictio...
Chapter
Federated learning is an emerging machine learning paradigm that enables multiple devices to train models locally and formulate a global model, without sharing the clients’ local data. A federated learning system can be viewed as a large-scale distributed system, involving different components and stakeholders with diverse requirements and constrai...
Preprint
Federated learning is an emerging privacy-preserving AI technique where clients (i.e., organisations or devices) train models locally and formulate a global model based on the local model updates without transferring local data externally. However, federated learning systems struggle to achieve trustworthiness and embody responsible AI principles....
Preprint
Federated learning is an emerging machine learning paradigm that enables multiple devices to train models locally and formulate a global model, without sharing the clients' local data. A federated learning system can be viewed as a large-scale distributed system, involving different components and stakeholders with diverse requirements and constrai...
Article
Federated learning is an emerging machine learning paradigm where clients train models locally and formulate a global model based on the local model updates. To identify the state-of-the-art in federated learning and explore how to develop federated learning systems, we perform a systematic literature review from a software engineering perspective,...
Preprint
Blockchain has been increasingly used as a software component to enable decentralisation in software architecture for a variety of applications. Blockchain governance has received considerable attention to ensure the safe and appropriate use and evolution of blockchain, especially after the Ethereum DAO attack in 2016. To understand the state-of-th...
Article
Edge computing, as part of a distributed computing architecture, has become an increasingly popular paradigm. It expands the capacity of the cloud by allowing data from end devices to be stored and processed at the edge of the network, closer to the data, instead of delivering it to the cloud. Data integrity is a big concern in edge computing. As...
Chapter
Crimes sabotage various societal aspects, such as social stability, public safety, economic development, and individuals’ quality of life. Accurately predicting crime occurrences can not only bring peace of mind to individuals but also help authorities distribute and manage police resources effectively. We aim to take into account plenty of...
Book
This book constitutes revised and selected papers from the scientific satellite events held in conjunction with the 18th International Conference on Service-Oriented Computing, ICSOC 2020. The conference was held virtually during December 14-17, 2020. A total of 125 submissions were received for the satellite events. The volume includes 9 papers fr...
Conference Paper
Self-sovereign identity is a new identity management paradigm that allows entities to truly own their identity data and control its use without involving any intermediary. Blockchain is an enabling technology for building self-sovereign identity systems by providing a neutral and trustable storage and computing infrastructure,...
Preprint
Self-sovereign identity is a new identity management paradigm that allows entities to truly own their identity data and control its use without involving any intermediary. Blockchain is an enabling technology for building self-sovereign identity systems by providing a neutral and trustable storage and computing infrastructure a...
Article
Self-sovereign identity (SSI) is considered to be a “killer application” of blockchain. However, there is a lack of systematic architecture designs for blockchain-based SSI systems to support methodical development. One aspect of this gap is demonstrated in current solutions, which are considered coarse-grained and may increase data security risks....
Preprint
Self-sovereign identity (SSI) is considered to be a "killer application" of blockchain. However, there is a lack of systematic architecture designs for blockchain-based SSI systems to support methodical development. One aspect of this gap is demonstrated in current solutions, which are considered coarse-grained and may increase data security risks....
Article
In a blockchain-based system, data and the consensus-based process of recording and updating them over distributed nodes are central to enabling the trustless multi-party transactions. Thus, properly understanding what and how the data are stored and manipulated ultimately determines the degree of utility, performance, and cost of a blockchain-base...
Chapter
The business world is becoming increasingly dynamic. Information processing using knowledge-, service-, and cloud-based systems makes the use of complex, dynamic and often knowledge-intensive activities an inevitable task. Knowledge-intensive processes contain a set of coordinated tasks and activities, controlled by knowledge workers to achieve a busine...
Chapter
Despite its popularity, the decision making process of a Deep Neural Network (DNN) model is opaque to users, making it difficult to understand the behaviour of the model. We present the design of a Web-based DNN interpretability framework which is based on the core notions in case-based reasoning approaches where exemplars (e.g., data points consid...
Chapter
Deciding on the optimal architecture of a software system is difficult, as the number of design alternatives and component interactions can be overwhelmingly large. Adding security considerations can make architecture evaluation even more challenging. Existing model-based approaches for architecture optimisation usually focus on performance and cos...
Article
Tables in documents are a widely-available and rich source of information, but not yet well-utilised computationally because of the difficulty in automatically extracting their structure and data content. There has been a plethora of systems proposed to solve the problem, but current methods present low usability and accuracy and lack precision in...
Preprint
A large amount of the public data produced by enterprises is in semi-structured PDF form. Tabular data extraction from reports and other published data in PDF format is of interest for various data consolidation purposes such as analysing and aggregating financial reports of a company. Queries into the structured tabular data in PDF format are normally...
Chapter
A fundamental assumption of improvement in Business Process Management (BPM) is that redesigns deliver refined and improved versions of business processes. These improvements can be validated online through sequential experiment techniques like AB Testing, as we have shown in earlier work. Such approaches have the inherent risk of exposing customer...
Article
A fundamental assumption of Business Process Management (BPM) is that redesign delivers refined and improved versions of business processes. This assumption, however, does not necessarily hold, and any required compensatory action may be delayed until a new round in the BPM life-cycle completes. Current approaches to process redesign face this prob...
Conference Paper
Business process improvement ideas can be validated through sequential experiment techniques like AB Testing. Such approaches have the inherent risk of exposing customers to an inferior process version, which is why the inferior version should be discarded as quickly as possible. In this paper, we propose a contextual multi-armed bandit algorithm t...
Chapter
Tables in documents are a rich and under-exploited source of structured data in otherwise unstructured documents. The extraction and understanding of tabular data is a challenging task which has attracted the attention of researchers from a range of disciplines such as information retrieval, machine learning and natural language processing. In this...
Chapter
Business process improvement ideas can be validated through sequential experiment techniques like AB Testing. Such approaches have the inherent risk of exposing customers to an inferior process version, which is why the inferior version should be discarded as quickly as possible. In this paper, we propose a contextual multi-armed bandit algorithm t...
Article
The presence of the Internet of Things (IoT) in healthcare through the use of mobile medical applications and wearable devices allows patients to capture their healthcare data and enables healthcare professionals to stay up to date with a patient’s status. Ambient Assisted Living (AAL), which is considered one of the major applications of IoT, is...
Conference Paper
Privacy-preserving data analytics is an emerging technology which allows multiple parties to perform joint data analytics without disclosing source data to each other or a trusted third-party. A variety of platforms and protocols have been proposed in this domain. However, these systems are not yet widely used, and little is known about them from a...
Conference Paper
A fundamental assumption of Business Process Management (BPM) is that redesign delivers new and improved versions of business processes. This assumption, however, does not necessarily hold, and required compensatory action may be delayed until a new round in the BPM life-cycle completes. Current approaches to process redesign face this problem in o...
Chapter
In this chapter, we explore the concept of data services. After clarifying the main concepts, we introduce key enabling technologies for building data services, namely XSLT and XQuery. These two XML-based languages are used to transform and query potentially heterogeneous data into well-understood standard XML documents. The lab exercises included...
Chapter
In this chapter, we provide concluding remarks offering readers our perspective for continued exploration in the field of Service Oriented Computing.
Chapter
In this chapter, we begin by understanding Service Oriented Architecture (SOA), its key values and goals to modern and evolving business ecosystems. We then describe the SOA architectural stack in reference to software application integration layers. This is followed by an introduction to service composition and data-flow techniques, including end...
Chapter
In this chapter, we present BPEL and BPMN as two main languages of Web service composition. Both BPEL and BPMN allow the codification of control flow logic of a composite service. We will introduce the core syntax elements of the two languages and their usage examples. The lab activities will show how to build a simple BPEL service by composing oth...
Chapter
In this chapter, we introduce REST, an alternative Web service implementation technique. Unlike SOAP and WSDL with clearly defined standards, REST contains a set of generic Web service design principles and guidelines that can be interpreted and implemented differently. In this chapter, we present the fundamentals of the said principles, explaining...
Chapter
In this chapter, we introduce a framework known as Service Component Architecture (SCA) that provides a technology-agnostic capability for composing applications from distributed services. Building a successful SOA solution in practice can be complex, due to the lack of standards and specifications, especially when integrating many different techno...
Chapter
In this chapter, we examine the data-flow aspects of Web service composition, which specifies how data is exchanged between services. The data-flow description encapsulates the data movement from one service to another and the transformations applied on this data. We introduce two different paradigms based on the message passing style, namely, blac...
Chapter
In this chapter, we introduce the motivation behind Web service composition technologies – going from an atomic to a composite service. In doing so, we discuss the two main paradigms of multiple service interactions: Web service orchestration and Web service choreography. In the rest of the book, we will focus on Web service orchestration as the ma...
Chapter
In this chapter, SOAP and WSDL are explained as important standards that lay the foundation for standardized descriptions of messages and operations of a Web service. We first describe the core elements of SOAP and WSDL standards with examples, then present how the two standards fit together to form the common message communication styles, name...
Book
This book embarks on a mission to dissect, unravel and demystify the concepts of Web services, including their implementation and composition techniques. It provides a comprehensive perspective on the fundamentals of implementation standards and strategies for Web services (in the first half of the book), while also presenting composition technique...
Conference Paper
We propose a PDF document wrapper system that is specifically targeted at table processing applications. We (i) review the PDF specifications and identify particular challenges from the table processing point of view, (ii) specify a table-oriented document model containing the required atomic elements for table extraction and understanding applicat...
Conference Paper
People share various processes from their daily lives on-line in natural language form (e.g., cooking recipes, “how-to guides” in eHow). We refer to them as personal process descriptions. Previously, we proposed Personal Process Description Graph (PPDG) to concretely represent the personal process descriptions as graphs, along with query processing techniq...
Conference Paper
Although there is an abundance of how-to guides online, systematically utilising the collective knowledge represented in such guides has been limited. This is primarily due to how-to guides (effectively, informal process descriptions) being expressed in natural language, which complicates the process of extracting actions and data. This paper descr...
Conference Paper
Tables in documents are a rich source of information, but not yet well-utilised computationally because of the difficulty of extracting their structure and data automatically. In this paper, we progress the state-of-the-art in automatic table extraction by identifying common patterns in table headers to develop rules and heuristics for determining...
Conference Paper
People are involved in various processes in their daily lives, such as cooking a dish, applying for a job or opening a bank account. With the advent of easy-to-use Web-based sharing platforms, many of these processes are shared as step-by-step instructions (e.g., “how-to guides” in eHow and wikiHow) on-line in natural language form. We refer to the...
Conference Paper
In this paper, we propose a precise, comprehensive model of table processing which aims to remedy some of the problems in the discussion of table processing in the literature. The model targets application-independent, end-to-end table processing, and thus encompasses a large subset of the work in the area. The model can be used to aid the design o...
Conference Paper
In this paper, we propose Processbook, a social-network-based management system for personal processes (ad hoc processes carried out to achieve a personal goal). A simple modelling interface is introduced based on ToDoLists to help users plan towards their goals. We describe how the system can capture a user’s experience in managing their ToDoList...
Conference Paper
Personal processes are ad-hoc to the point where each process may have a unique structure and is certainly not as strictly defined as a workflow process. In order to describe, share and analyze personal processes more effectively, in this paper, we propose Personal Process Description Graph (PPDG) and a personal process query. The personal process...
Conference Paper
Crowd sourcing is changing the way people work and solve problems from "in-house working" to "public out-sourcing". Most online crowd sourcing platforms perform two main functions: (i) allowing users to advertise their tasks and (ii) helping them find candidate workers. However, they do not support crowd sourcing of complex work consisting of inter...
Conference Paper
The goal of this paper is to improve Named Entity Recognition for automatic information extraction related to record-based data in text documents.