Jin Song Dong's research while affiliated with National University of Singapore and other places

Publications (300)

Article
The Unified Modeling Language (UML) is a standard for modeling dynamic systems. UML behavioral state machines are used for modeling the dynamic behavior of object-oriented designs. The UML specification, maintained by the Object Management Group (OMG), is documented in natural language (in contrast to formal language). The inherent ambiguity of nat...
Article
Due to the surging popularity of various cryptocurrencies in recent years, a large number of browser extensions have been developed as portals to access relevant services, such as cryptocurrency exchanges and wallets. This has stimulated a wild growth of cryptocurrency-themed malicious extensions that cause heavy financial losses to the users and l...
Article
Full-text available
Ensemble trees are a popular machine learning model which often yields high prediction performance when analysing structured data. Although individual small decision trees are deemed explainable by nature, an ensemble of large trees is often difficult to understand. In this work, we propose an approach called optimised explanation (OptExplain) that...
Article
Although deep learning has demonstrated astonishing performance in many applications, there are still concerns about its dependability. One desirable property of deep learning applications with societal impact is fairness (i.e., non-discrimination). Unfortunately, discrimination might be intrinsically embedded into the models due to the discriminat...
Conference Paper
Time-travelling visualization answers how the predictions of a deep classifier are formed during training. It visualizes in two- or three-dimensional space how the classification boundaries and sample embeddings evolve during training. In this work, we propose TimeVis, a novel time-travelling visualization solution for deep classifiers. Com...
Article
Understanding how the predictions of deep learning models are formed during the training process is crucial to improve model performance and fix model defects, especially when we need to investigate nontrivial training strategies such as active learning, and track the root cause of unexpected training results such as performance degeneration. In th...
Preprint
Full-text available
Neural networks have been widely applied in security applications such as spam and phishing detection, intrusion prevention, and malware detection. This black-box method, however, often has uncertainty and poor explainability in applications. Furthermore, neural networks themselves are often vulnerable to adversarial attacks. For those reasons, the...
Preprint
Full-text available
Formal methods for verification of programs are extended to testing of programs. Their combination is intended to lead to benefits in reliable program development, testing, and evolution. Our geometric theory of testing is intended to serve as the specification of a testing environment, included as the last stage of a toolchain that assists profess...
Chapter
It is known that neural networks are subject to attacks through adversarial perturbations. Worse yet, such attacks are impossible to eliminate, i.e., the adversarial perturbation is still possible after applying mitigation methods such as adversarial training. Multiple approaches have been developed to detect and reject such adversarial inputs. Rej...
Article
Automated model repair techniques enable machines to synthesise patches that ensure models meet given requirements. B-repair, which is an existing model repair approach, assists users in repairing erroneous models in the B formal method, but repairing large models is inefficient due to successive applications of repair. In this work, we improve the...
Preprint
Full-text available
When deploying pre-trained neural network models in real-world applications, model consumers often encounter resource-constraint platforms such as mobile and smart devices. They typically use the pruning technique to reduce the size and complexity of the model, generating a lighter one with less resource consumption. Nonetheless, most existing prun...
Preprint
Recent years have seen the wide application of NLP models in crucial areas such as finance, medical treatment, and news media, raising concerns about model robustness and vulnerabilities. In this paper, we propose a novel prompt-based adversarial attack to compromise NLP models and a robustness enhancement technique. We first construct malicious pro...
Article
Full-text available
Neural networks have been widely applied in security applications such as spam and phishing detection, intrusion prevention, and malware detection. This black-box method, however, often has uncertainty and poor explainability in applications. Furthermore, neural networks themselves are often vulnerable to adversarial attacks. For those reasons, the...
Article
Deep Neural Networks (DNNs) have been widely adopted, yet DNN models are surprisingly unreliable, which raises significant concerns about their use in critical domains. In this work, we propose that runtime DNN mistakes can be quickly detected and properly dealt with in deployment, especially in settings like self-driving vehicles. Just as softwar...
Article
Fuzzing is a widely used software vulnerability discovery technique, and many fuzzers are optimized using coverage feedback. Recently, some techniques have proposed training deep learning (DL) models to predict the branch coverage of an arbitrary input, using the models' always-available gradients as a guide. Those techniques have proved their success in im...
Preprint
Full-text available
Understanding how the predictions of deep learning models are formed during the training process is crucial to improve model performance and fix model defects, especially when we need to investigate nontrivial training strategies such as active learning, and track the root cause of unexpected training results such as performance degeneration. In th...
Preprint
It is known that neural networks are subject to attacks through adversarial perturbations, i.e., inputs which are maliciously crafted through perturbations to induce wrong predictions. Furthermore, such attacks are impossible to eliminate, i.e., the adversarial perturbation is still possible after applying mitigation methods such as adversarial tra...
Preprint
Full-text available
Trained with a sufficiently large training and testing dataset, Deep Neural Networks (DNNs) are expected to generalize. However, inputs may deviate from the training dataset distribution in real deployments. This is a fundamental issue with using a finite dataset. Even worse, real inputs may change over time from the expected distribution. Taken to...
Preprint
Full-text available
Bug datasets consisting of real-world bugs are important artifacts for researchers and programmers, which lay the empirical and experimental foundation for various SE/PL research such as fault localization, software testing, and program repair. All known state-of-the-art datasets are constructed manually, which inevitably limits their scalability, repr...
Preprint
Although deep learning has demonstrated astonishing performance in many applications, there are still concerns about its dependability. One desirable property of deep learning applications with societal impact is fairness (i.e., non-discrimination). Unfortunately, discrimination might be intrinsically embedded into the models due to discrimination i...
Article
Full-text available
The SPARC instruction set architecture (ISA) has been used in various processors in workstations, embedded systems, and in mission-critical industries such as aviation and space engineering. Hence, it is important to provide formal frameworks that facilitate the verification of hardware and software that run on or interface with these processors. I...
Preprint
Full-text available
Ensemble trees are a popular machine learning model which often yields high prediction performance when analysing structured data. Although individual small decision trees are deemed explainable by nature, an ensemble of large trees is often difficult to understand. In this work, we propose an approach called optimised explanation (OptExplain) that...
Preprint
Full-text available
The widespread adoption of Deep Neural Networks (DNNs) in important domains raises questions about the trustworthiness of DNN outputs. Even a highly accurate DNN will make mistakes some of the time, and in settings like self-driving vehicles these mistakes must be quickly detected and properly dealt with in deployment. Just as our community has dev...
Article
Full-text available
This paper introduces a new high-performance machine learning tool named Silas, which is built to provide a more transparent, dependable and efficient data analytics service. We discuss the machine learning aspects of Silas and demonstrate the advantage of Silas in its predictive and computational performance. We show that several customised algori...
Article
Full-text available
This work follows the "verification as planning" paradigm and proposes to use model-checking techniques to solve planning and goal reasoning problems for autonomous systems with a high degree of assurance. It presents a novel modelling framework, Goal Task Network (GTN), that encompasses both goal reasoning and planning under a unified formal description t...
Article
Adversarial examples threaten to fool deep learning models into outputting erroneous predictions with high confidence. Optimization-based methods for constructing such samples have been extensively studied. While effective in terms of aggression, they typically lack clear interpretation of and constraints on their underlying generation process, whic...
Article
Full-text available
Service composition aims at achieving a business goal by composing existing service-based applications or components. The response time of a service is crucial, especially in time-critical business environments, which is often stated as a clause in service-level agreements between service providers and service users. To meet the guaranteed response...
Chapter
Full-text available
N-PAT is a new model-checking tool that supports the verification of nested-models, i.e. models whose behaviour depends on the results of verification tasks. In this paper, we describe its operation and discuss mechanisms that are tailored to the efficient verification of nested-models. Further, we motivate the advantages of N-PAT over traditional...
Preprint
N-PAT is a new model-checking tool that supports the verification of nested-models, i.e. models whose behaviour depends on the results of verification tasks. In this paper, we describe its operation and discuss mechanisms that are tailored to the efficient verification of nested-models. Further, we motivate the advantages of N-PAT over traditional...
Article
Blockchain technology has rapidly emerged as a decentralized trusted network to replace the traditional centralized intermediary. In particular, smart contracts based on blockchain allow users to define the agreed behaviour among them, the execution of which is enforced by the smart contracts. Based on this, we propose a decentraliz...
Preprint
Service composition aims at achieving a business goal by composing existing service-based applications or components. The response time of a service is crucial, especially in time-critical business environments, which is often stated as a clause in service-level agreements between service providers and service users. To meet the guaranteed response...
Chapter
This chapter starts with the inspiration and main mechanisms of one of the most well-regarded combinatorial optimization algorithms called Ant Colony Optimizer (ACO). This algorithm is then employed to find the optimal path for an AUV. In fact, the problem investigated is a real-world application of the Traveling Salesman Problem (TSP).
Chapter
This chapter is an introduction to nature-inspired algorithms. It first discusses the reason why such algorithms have been very popular in the last decade. Then, different classifications of such methods are given. The chapter also includes the algorithms and problems investigated in this book.
Chapter
Evolutionary Algorithms mimic the evolutionary process in nature. One of the most well-regarded evolutionary algorithms is the Genetic Algorithm (GA) [1]. This algorithm was inspired by Darwin's theory of evolution. This theory states that natural organisms develop through natural selection. Natural selection refers to the process o...
Chapter
Swarm Intelligence (SI) refers to the collective behaviour of a group of creatures without a centralized control unit. This field was first established in 1989 in a robotic project [1]. Systems built based on SI typically have independent intelligent agents that interact locally to achieve a goal as a team [2]. Most of the algorithms in this field...
Chapter
The Genetic Algorithm (GA) is one of the most well-regarded evolutionary algorithms in history. This algorithm mimics the Darwinian theory of survival of the fittest in nature. This chapter presents the most fundamental concepts, operators, and mathematical models of this algorithm. The most popular improvements in the main component of this algorithm...
Chapter
Optimization refers to the process of finding an optimal solution from the set of all possible solutions for a given problem. An algorithm, called an optimization algorithm, is normally developed to find such a solution. Regardless of their specific structure, such algorithms are required to compare two solutions at some stage to decide which one is better. An obje...
Chapter
Metaheuristics have become very popular in the last two decades. This class of problem-solving techniques includes a wide range of algorithms to find reasonably good solutions for problems where deterministic methods are not efficient. Their name comes from their mechanism, in which they do not require problem-specific heuristic information. Such m...
Chapter
This chapter covers the fundamental concepts of the recently proposed Grasshopper Optimization Algorithm (GOA). The inspiration, mathematical model, and the algorithm are presented in detail. A brief literature review of this algorithm, including different variants, improvements, hybrids, and applications, is also given. The performance of GOA is tes...
Chapter
In the field of Artificial Intelligence (AI), search algorithms have been popular since their invention. A search algorithm is typically designed to search and find a desired solution from a given set of all possible solutions to maximize/minimize one or multiple objectives. Depending on the mechanism of a search method, this set of solutions can be...
Book
Full-text available
This book covers the conventional and most recent theories and applications in the area of evolutionary algorithms, swarm intelligence, and meta-heuristics. Each chapter offers a comprehensive description of a specific algorithm, from the mathematical model to its practical application. Different kinds of optimization problems are solved in this boo...
Chapter
Full-text available
Feature selection is a preprocessing step that aims to eliminate the features that may negatively influence the performance of machine learning techniques. The negative influence is due to the possibility of having many irrelevant and/or redundant features. In this chapter, a binary variant of the recent Harris hawks optimizer (HHO) is proposed to...
Chapter
Particle Swarm Optimization (PSO) is one of the most well-regarded algorithms in the literature of meta-heuristics. This algorithm mimics the navigation and foraging behaviour of birds in nature. Despite its simple mathematical model, it has been widely used in diverse fields of study to solve optimization problems. There is a tremendous numb...
Chapter
Full-text available
This chapter proposes a new efficient moth-flame-embedded multilayer perceptrons (MLP) neuroevolution model to deal with classification problems. Moth-flame optimizer (MFO) is one of the effective swarm-based metaheuristic methods inspired by the natural direction-finding behaviours of moth insects and their well-known entrapment phenomena when the...
Book
This book focuses on the most well-regarded and recent nature-inspired algorithms capable of solving optimization problems with multiple objectives. Firstly, it provides preliminaries and essential definitions in multi-objective problems and different paradigms to solve them. It then presents in-depth explanations of the theory, literature revie...
Article
A key feature of the booming smart home is the integration of a wide assortment of technologies, including various standards, proprietary communication protocols and heterogeneous platforms. Due to customization, unsatisfied assumptions and incompatibility in the integration, critical security vulnerabilities are likely to be introduced by the inte...
Article
Statistical Model Checking (SMC) is an approximate verification method that overcomes the state space explosion problem for probabilistic systems by Monte Carlo simulations. Simulations might, however, be costly if many samples are required. It is thus necessary to implement efficient algorithms to reduce the sample size while preserving precision...
Preprint
Full-text available
Deep neural networks (DNNs) are increasingly applied in safety-critical systems, e.g., for face recognition, autonomous car control and malware detection. It has also been shown that DNNs are subject to attacks such as adversarial perturbation and thus must be properly tested. Many coverage criteria for DNNs have since been proposed, inspired by the succes...
Article
Full-text available
Regression faults, which make working code stop functioning, are often introduced when developers make changes to the software. Many regression fault localization techniques have been proposed. However, issues like inaccuracy and lack of explanation are still obstacles for their practical application. In this work, we propose a trace-based approach...
Preprint
Full-text available
While AI techniques have found many successful applications in autonomous systems, many of them permit behaviours that are difficult to interpret and may lead to uncertain results. We follow the "verification as planning" paradigm and propose to use model checking techniques to solve planning and goal reasoning problems for autonomous systems. We g...
Preprint
Full-text available
This paper introduces a new classification tool named Silas, which is built to provide a more transparent and dependable data analytics service. A focus of Silas is on providing a formal foundation of decision trees in order to support logical analysis and verification of learned prediction models. This paper describes the distinct features of Sila...
Preprint
Full-text available
Neural networks are becoming the dominant approach for solving many real-world problems like computer vision and natural language processing due to their exceptional performance as end-to-end solutions. However, deep learning models are complex and work in a black-box manner in general. This hinders humans from understanding how such systems make dec...
Chapter
Android devices are equipped with various sensors. Permissions from users must be explicitly granted for apps to obtain sensitive information, e.g., geographic location. However, some of the sensors are considered trivial such that no permission control is enforced over them, e.g., the ambient light sensor. In this work, we present a novel side cha...
Preprint
SPARC processors have many applications in mission-critical industries such as aviation and space engineering. Hence, it is important to provide formal frameworks that facilitate the verification of hardware and software that run on or interface with these processors. This paper presents the first mechanised SPARC Total Store Ordering (TSO) memory...
Article
Full-text available
Smart grid (SG) networks are newly upgraded networks of connected objects that greatly improve reliability, efficiency and sustainability of the traditional energy infrastructure. In this respect, the smart metering infrastructure (SMI) plays an important role in controlling, monitoring and managing multiple domains in the SG. Despite the salient f...
Chapter
Full-text available
Several schemes have been provided in Statistical Model Checking (SMC) for the estimation of property occurrence based on predefined confidence and absolute or relative error. Simulations might, however, be costly if many samples are required, and the usual algorithms implemented in statistical model checkers tend to be conservative. Bayesian and rare...
Article
Full-text available
Multi-objective problems with conflicting objectives cannot be effectively solved by aggregation-based methods. The answer to such problems is a Pareto optimal solution set. Due to the difficulty of solving multi-objective problems using multi-objective algorithms and the lack of enough expertise, researchers in different fields tend to aggregative...
Article
CSP# (communicating sequential programs) is a modelling language designed for specifying concurrent systems by integrating CSP-like compositional operators with sequential programs updating shared variables. In this work, we define an observation-oriented denotational semantics in an open environment for the CSP# language based on the UTP framework...
Article
Full-text available
Robust optimisation refers to the process of finding optimal solutions that have the lowest sensitivity to possible perturbations. In a multi-objective search space the robust optimal solutions should have the least dispersion on all of the objectives, making it a more challenging problem than in a single-objective search space. This paper establis...