Bruce William Watson

Stellenbosch University (SUN) · Centre for AI Research, School for Data-Science and Computational Thinking

Doctor of Philosophy
Chair of the Centre for AI Research, specialised in Cybersecurity and Algorithms

About

178
Publications
57,884
Reads
1,718
Citations
Citations since 2017
39 Research Items
523 Citations

Publications (178)
Article
Full-text available
‘Long COVID’ is the term used to describe the phenomenon in which patients who have survived a COVID-19 infection continue to experience prolonged SARS-CoV-2 symptoms. Millions of people across the globe are affected by Long COVID. Solving the Long COVID conundrum will require drawing upon the lessons of the COVID-19 pandemic, during which thousand...
Article
Full-text available
The data ecosystem is complex and involves multiple stakeholders. Researchers and scientists engaging in data-intensive research collect, analyse, store, manage and share large volumes of data. Consequently, capturing researchers’ and scientists’ views from multidisciplinary fields on data use, sharing and governance adds an important African persp...
Article
Full-text available
In the process of data modelling, interpretation of the results in a straightforward manner is often challenging. The model's statistical performance is linked to specific parameters, while the interpretation of the results in the context of the problem often relies on visualisation aids. The aim of this study was to develop an analytical pipeline...
Article
Full-text available
Background: Fibrin(ogen) amyloid microclots and platelet hyperactivation previously reported as a novel finding in South African patients with the coronavirus 2019 disease (COVID-19) and Long COVID/Post-Acute Sequelae of COVID-19 (PASC), might form a suitable set of foci for the clinical treatment of the symptoms of Long COVID/PASC. A Long COVID/PAS...
Chapter
Deductive program verification is a post-hoc quality assurance technique following the design-by-contract paradigm, where correctness of the program is proven only after it was written. In contrast, correctness-by-construction (CbC) is an incremental program construction technique. Starting with the functional specification, the program’s correctness i...
Conference Paper
Full-text available
Contract Driven Development formalizes functional requirements within component contracts. The process aims to produce higher quality software, reduce quality assurance costs and improve reusability. However, the perceived complexity and cost of requirements formalization has limited the adoption of this approach in industry. In this article, we co...
Conference Paper
Real-time distributed Internet of Things (IoT) systems are increasingly using complex event processing to make inferences about the environment. This mode of operation is able to reduce communication requirements, improve robustness and scalability, and avoid the need for big data storage and processing. With systems making many inferences about...
Article
Full-text available
Under the coronavirus pandemic, governments and corporations around the world have adopted a work-from-home (WFH) mode of operations to continue governing and operating. Over two years into the COVID-19 pandemic, many of us continue to work from home and a large majority have few plans to return to the office. Early on, governments and companies s...
Preprint
Full-text available
This is a study of microclots in long-COVID patients, including treatment thereof, and a data-driven correlation of comorbidities with long-COVID symptoms. This is a corrected version. The preprint original is also at: https://doi.org/10.21203/rs.3.rs-1205453/v1
Preprint
Full-text available
Background: Fibrin(ogen) amyloid microclots and platelet hyperactivation previously reported as a novel finding in South African patients with the coronavirus 2019 disease (COVID-19) and Long COVID/Post-Acute Sequelae of COVID-19 (PASC), might form a suitable set of foci for the clinical treatment of the symptoms of long COVID/PASC. A Long COVID/PA...
Conference Paper
Full-text available
In early 2020, the rapid adoption of remote working and communications tools by governments, companies, and individuals around the world increased dependency on cyber infrastructure for the normal functioning of States, businesses, and societies. For some, the urgent need to communicate whilst safeguarding human life took priority over ensuring tha...
Conference Paper
Full-text available
Internationally, digital technology is widely used in support of elections. While most countries depend on technological advances in some form or other, electronic voting as such has been far less universally adopted. Thus far, only about 20 per cent of the world has used electronic voting for national elections, and with mixed success. While ove...
Chapter
In recent years, researchers have started to investigate X-by-Construction (XbC) as a refinement approach to engineer systems that by-construction satisfy certain non-functional properties, beyond correctness as considered by the more traditional Correctness-by-Construction (CbC). In line with increasing attention for fault-tolerance and the use of...
Chapter
Correctness-by-construction (CbC) is a refinement-based methodology to incrementally create formally correct programs. Programs are constructed using refinement rules which guarantee that the resulting implementation is correct with respect to a pre-/postcondition specification. In contrast, with post-hoc verification (PhV) a specification and a pr...
Conference Paper
Full-text available
Governments around the world commonly use Cloud Service Providers (CSPs) that are headquartered in other nations. How do they ensure data sovereignty when these CSPs, storing a nation’s data within that nation’s borders, are subject to long-arm statutes on data stored abroad? And what if, in turn, the governmental data is stored abroad, would acces...
Conference Paper
Full-text available
Cloud Service Providers (CSP) offer the opportunity for individuals, companies, and governments to rapidly leverage current capabilities dynamically and with great elasticity. At the time of writing, unlike the U.S., Canada does not have large sovereign CSPs with global presence. Although one may debate overall cost effectiveness and value of mov...
Chapter
Information system security threats persist in organisations in spite of enormous investments in security measures. The academic literature and the media reflect the huge financial loss and reputational harm to organisations due to computer related security breaches. Although technical safeguards are indispensable, the academic literature highligh...
Article
In many software applications, it is necessary to preserve confidentiality of information. Therefore, security mechanisms are needed to enforce that secret information does not leak to unauthorized users. However, most language-based techniques that enable information flow control work post-hoc, deciding whether a specific program violates a confid...
Preprint
Full-text available
Regularities in strings are often related to periods and covers, which have extensively been studied, and algorithms for their efficient computation have broad application. In this paper we concentrate on computing cyclic regularities of strings, in particular, we propose several efficient algorithms for computing: (i) cyclic periodicity; (ii) all...
Chapter
Within the last decades, the dead-zone algorithms have emerged as being highly performant on certain types of data. Such algorithms solve the keyword exact matching problem over strings, though extensions to trees and two-dimensional data have also been devised. In this short paper, we give an overview of such algorithms.
Chapter
Regularities in strings are often related to periods and covers, which have extensively been studied, and algorithms for their efficient computation have broad application. In this paper we concentrate on computing cyclic regularities of strings, in particular, we propose several efficient algorithms for computing: (i) cyclic periodicity; (ii) all...
Chapter
Full-text available
Correctness-by-Construction (CbC) is an approach to incrementally create formally correct programs guided by pre- and postcondition specifications. A program is created using refinement rules that guarantee the resulting implementation is correct with respect to the specification. Although CbC is supposed to lead to code with a low defect rate, it...
Chapter
After decades of progress on Correctness-by-Construction (CbC) as a scientific discipline of engineering, it is time to look further than correctness and investigate a move from CbC to XbC, i.e., considering also non-functional properties. X-by-Construction (XbC) is concerned with a step-wise refinement process from specification to code that autom...
Chapter
Guaranteeing that information processed in computing systems remains confidential is vital for many software applications. To this end, language-based security mechanisms enforce fine-grained access control policies for program variables to prevent secret information from leaking through unauthorized access. However, approaches for language-based s...
Article
Full-text available
In this paper, we propose a reduction of the minimization problem for a bottom-up deterministic tree automaton (DFTA), making the latter a minimization of a string deterministic finite automaton (DFA). To achieve this purpose, we proceed first by the transformation of the tree automaton into a particular string automaton, followed by minimizing thi...
Conference Paper
Full-text available
Over the last few decades, several technology specialists have collected computer viruses and other malware. Today, if one desires, they can download current malware collections from Internet-based sources. It could be argued that a large majority of older malware would not be as effective as the day they were written, due to the target systems o...
Article
Full-text available
The increasingly large volumes of publicly available sensory descriptions of wine raise the question whether this source of data can be mined to extract meaningful domain-specific information about the sensory properties of wine. We introduce a novel application of formal concept lattices, in combination with traditional statistical tests, to visu...
Chapter
A method for developing concurrent software is advocated that centres on using CSP to specify the behaviour of the system. A small example problem is used to illustrate the method. The problem is to develop a simulation system that keeps track of and reports on the least unique bid of multiple streams of randomly generated incoming bids. The proble...
Conference Paper
Full-text available
Technologies have evolved so rapidly that companies and governments seem to be regularly trying to catch up to new capabilities and thereby making quick decisions that have the potential to set precedents and present international challenges. Is cyber capability changing so fast that our sensemaking is lagging? Is cyber shape-shifting? With the op...
Article
Full-text available
A degenerate or indeterminate string on an alphabet $\Sigma$ is a sequence of non-empty subsets of $\Sigma$. Given a degenerate string $t$ of length $n$, we present a new method based on the Burrows--Wheeler transform for searching for a degenerate pattern of length $m$ in $t$ running in $O(mn)$ time on a constant size alphabet $\Sigma$. Furthermor...
Article
Full-text available
Failure deterministic finite automata (FDFAs) represent regular languages more compactly than deterministic finite automata (DFAs). Four algorithms that convert arbitrary DFAs to language-equivalent FDFAs are empirically investigated. Three are concrete variants of a previously published abstract algorithm, the DFA-Homomorphic Algorithm (DHA). The...
Article
Full-text available
The data explosion problem continues to escalate requiring novel and ingenious solutions. Pattern inference focusing on repetitive structures in data is a vigorous field of endeavor aimed at shrinking volumes of data by means of concise descriptions. The Burrows–Wheeler transformation computes a permutation of a string of letters over an alphabet,...
Chapter
Most regular expression matching engines have operators and features to enhance the succinctness of classical regular expressions, such as interval quantifiers and regular lookahead. In addition, matching engines in, for example, Perl, Java, Ruby and .NET also provide operators, such as atomic operators, that constrain the backtracking behavior of t...
Article
Full-text available
Being an unsupervised machine learning and data mining technique, biclustering and its multimodal extensions are becoming popular tools for analysing object-attribute data in different domains. Apart from conventional clustering techniques, biclustering is searching for homogeneous groups of objects while keeping their common description, e.g., in...
Chapter
Growing SmartCities means that the amount of information processed and stored to manage a city’s infrastructure (e.g., traffic, public transport, electricity) is growing as well. To manage this, SmartCities are deploying truly distributed and highly scalable information and communication technology (ICT) infrastructure, connecting a conglomerate of smart devi...
Conference Paper
Correctness-by-construction (CbC) is an approach for developing algorithms in line with rigorous correctness arguments. A high-level specification is evolved into an implementation in a sequence of small, tractable refinement steps guaranteeing the resulting implementation to be correct. CbC facilitates the design of algorithms that are more efficie...
Conference Paper
Correctness-by-construction (CbC), traditionally based on weakest precondition semantics, and post-hoc verification (PhV) aspire to ensure functional correctness. We argue for a lightweight approach to CbC where lack of formal rigour increases productivity. In order to mitigate the risk of accidentally introducing errors during program construction...
Conference Paper
We apply results from ambiguity of non-deterministic finite automata to the problem of determining the asymptotic worst-case matching time, as a function of the length of the input strings, when attempting to match input strings with a given regular expression, where the matcher being used is a backtracking regular expression matcher.
Conference Paper
Modern software systems, in particular in mobile and cloud-based applications, exist in many different variants in order to adapt to changing user requirements or application contexts. Software product line engineering allows developing these software systems by managed large-scale reuse in order to achieve shorter time to market. Traditional softw...
Conference Paper
Taxonomy-Based Software Construction (TABASCO) applies extensive domain analyses to create conceptual hierarchies of algorithmic domains. Those are used as basis for the implementation of software toolkits. The monolithic structure of TABASCO-based toolkits restricts their adoption on resource-constrained or special-purpose devices. In this paper,...
Conference Paper
This extended abstract sketches some of the most recent advances in hardware implementations (and surrounding issues) of finite automata and regular expressions.
Patent
Full-text available
A method of determining a set of prescribed actions includes receiving a configuration script identifying a set of influencers, a set of performance indicators, a model type, a target time, and a prescription method. The method further includes deriving a model of the model type based on data associated with the set of influencers or with the set o...
Conference Paper
Full-text available
We propose a reduction of the minimization problem for a bottom-up deterministic tree automaton (DFTA) to the minimization problem for a string deterministic finite automaton (DFA). We proceed by a transformation of the tree automaton into a particular string automaton and then minimize the string automaton. We show that for our transformation, the...
Data
Figures 16-1 to 16-7.
Article
• We discuss the correctness-by-construction approach to software development.
• We discuss our experience with this approach in various algorithmic settings.
• We argue that its application to algorithmically complex system parts is worthwhile.
Conference Paper
The timing performance data of ten related algorithms (solving the single keyword pattern matching problem) executing under a wide variety of operating conditions, was gathered and analysed. Using the resulting 15 million items of timing data, various metrics to estimate algorithm performance were computed and compared. An assessment is made of whe...
Article
Full-text available
In indexing of, and pattern matching on, DNA and text sequences, it is often important to represent all factors of a sequence. One efficient, compact representation is the factor oracle (FO). At the same time, any classical deterministic finite automaton (DFA) can be transformed to a so-called failure one (FDFA), which may use failure transitions to...
Article
Full-text available
As long as software has been produced, there have been efforts to strive for quality in software products. In order to understand quality in software products, researchers have built models of software quality that rely on metrics in an attempt to provide a quantitative view of software quality. The aim of these models is to provide software produc...
Conference Paper
Deep packet inspection (DPI) systems are required to perform at or near network line-rate speeds, matching thousands of rules against the network traffic. The engineering performance and price trade-offs are such that DPI is difficult to virtualize, either because of very high memory consumption or the use of custom hardware; similarly, a running D...
Conference Paper
In indexing of, and pattern matching on, DNA sequences, representing all factors of a sequence is important. One efficient, compact representation is the factor oracle (FO). At the same time, any classical deterministic finite automaton (DFA) can be transformed to a so-called failure one (FDFA), which may use failure transitions to replace multiple sy...
Book
Full-text available
http://dl.acm.org/citation.cfm?id=2564892 TABLE OF CONTENTS: 1. Abstraction, Refinement, Enrichment 2. Ceteris Paribus Preferences: Prediction via Abduction 3. Minimal Weighted Automata over the Galois Field with Two Elements 4. Symmetric Difference NFA: the State of the Art 5. Analyzing Strings with Ordered Lyndon-like Structures 6. Verifying an E...
Article
The design and implementation of FireμSat2, an algorithm to detect microsatellites (short approximate tandem repeats) in DNA, are discussed. The algorithm relies on deterministic finite automata. The parameters are designed to support requirements expressed by molecular biologists in data exploration. By setting the parameters of FireμSat2 as lib...
Conference Paper
Most software packages with regular expression matching engines offer operators that extend the classical regular expressions, such as counting, intersection, complementation, and interleaving. Some of the most popular engines, for example those of Java and Perl, also provide operators that are intended to control the nondeterminism inherent in reg...
Article
The factor oracle [3] is a data structure for weak factor recognition. It is a deterministic finite automaton (DFA) built on a string p of length m that is acyclic, recognizes at least all factors of p, has m + 1 states which are all final, is homogeneous, and has m to 2m - 1 transitions. The factor storacle [6] is an alternative automaton that satisf...
Article
I introduce two performance improvements to the Commentz-Walter family of multiple-keyword (exact) pattern matching algorithms. These algorithms (which are in the Boyer-Moore-Horspool style of keyword pattern matchers) consist of two nested loops: the outer one to process the input text and the inner one processing possible keyword matches. The gua...
Conference Paper
A so-called dead-zone pattern matching family of algorithms has previously been proposed as a concept. Here the performance of several instances of the family is empirically investigated. An abstract description of the algorithm family is given, as well as of these instances. This leads to a total of five different implementations of the algorithm...
Conference Paper
Earlier publications provided an abstract specification of a family of single keyword pattern matching algorithms [18] which search unexamined portions of the text in a divide-and-conquer fashion, generating dead-zones in the text as they progress. These dead-zones are areas of text that require no further examination. Here the results are described...
Article
Full-text available
The consequences of regular expression hashing as a means of finite state automaton reduction are explored, based on variations of Brzozowski's algorithm. In this approach, each hash collision results in the merging of the automaton's states, and it is subsequently shown that a super-automaton will always be constructed, regardless of the hash funct...
Chapter
The previous chapter illustrated the potency of software correctness by construction for developing a new and elegant algorithm. In this chapter we focus on classifying and taxonomising algorithmic problems by relying on correctness by construction thinking.
Chapter
In this chapter, a number of fairly elementary algorithms are developed. They are, namely: linear search; finding the maximal element in an array; a version of binary search; a simple pattern matching algorithm; raising a number to a specific integer power; and finding the integer approximation of a logarithm to the base 2.
Chapter
This chapter provides further examples of the software correctness by construction method. The examples are fairly diverse. They range from sorting in a specialised context (the Dutch National Flag problem), discovering segmental properties of an array (the longest segment and the longest palindrome problems), raster drawing algorithms, the majorit...
Chapter
The correctness by construction methodology advocated by this book starts off with a predicate-based specification of the problem at hand, and then incrementally refines that specification to code. However, to be able to do this, several preliminary notational and theoretical matters have to be in place.
Chapter
Procedures offer a well-known way of reusing code. (Synonyms are subprocedure, subprogram, routine, subroutine, function and method. In this text, we will keep to the terms procedure and function as they were classically used in languages such as Pascal.) A procedure may be viewed as a named block of code, characterised by its pre- and postconditions...
Chapter
In this chapter, the correctness by construction approach is applied to an algorithmic problem that lies well off the beaten track of classical text book examples. The algorithm has been in the public domain since about 2000, but was only clearly explained and its correctness shown in 2010 [26]. The algorithm has also been shown to be considerably...
Book
The focus of this book is on bridging the gap between two extreme methods for developing software. On the one hand, there are texts and approaches that are so formal that they scare off all but the most dedicated theoretical computer scientists. On the other, there are some who believe that any measure of formality is a waste of time, resulting in...
Article
Full-text available
Inspired by failure functions found in classical pattern matching algorithms, a failure deterministic finite automaton (FDFA) is defined as a formalism to recognise a regular language. An algorithm, based on formal concept analysis, is proposed for deriving from a given deterministic finite automaton (DFA) a language-equivalent FDFA. The FDFA's tra...
Article
Formal concept analysis is used as the basis for two new multiple keyword string pattern matching algorithms. The algorithms addressed are built upon a so-called position encoded pattern lattice (PEPL). The algorithms presented are in conceptual form only; no experimental results are given. The first algorithm to be presented is easily understood a...
Article
Full-text available
In this paper two concurrent versions of Brzozowski's deterministic finite automaton (DFA) construction algorithm are developed from first principles, the one being a slight refinement of the other. We rely on Hoare's CSP as our notation. The specifications that are proposed of the Brzozowski algorithm are in terms of the concurrent composition of...
Chapter
Full-text available
In model-based testing (MBT, also known as “specification-based” testing, or “model-driven” testing: MDT), the test cases, according to which a hardware or software unit (module, component) shall be tested after its development or implementation, are typically not created “ab initio”. They are derived by some means of formal reasoning from a formal...
Article
Full-text available
Many keyword pattern matching algorithms use precomputation subroutines to produce lookup tables, which in turn are used to improve performance during the search phase. If the keywords to be matched are known at compile time, the precomputation subroutines can be implemented to be evaluated at compile time rather than at run time. This will provide a pe...
Conference Paper
Full-text available
This paper describes an experimental study to compare the performance of various dynamically resizable bit-vector implementations for the C++ programming language. We compare the std::vector from the Standard Template Library (STL), boost::dynamic_bitset from Boost, Qt::QBitArray from QT Software, and BitMagic's bm::bvector. We also compare std::vec...
Conference Paper
Full-text available
Previous work on implementations of FA-based string recognizers suggested a range of implementation strategies (and therefore, algorithms) aiming at improving their performance for fast string recognition. However, an efficient exploitation of suggested algorithms by domain-specific FA-implementers requires prior knowledge of the behaviour (perform...
Conference Paper
The tourist slogan used to market South Africa, "A World in One Country", cuts across many more dimensions than just those of interest to tourists. Everywhere in the country, there is evidence of both a highly advanced and sophisticated economy and lifestyle, as well as of poverty and underdevelopment. The purpose of this session is to reflect on wheth...
Article
An object-oriented framework is proposed for constructing a virtual machine (VM) to be used in the context of incrementally and iteratively developing a domain-specific language (DSL). The framework is written in C#. It includes abstract instruction and environment classes. By extending these, a concrete layer of classes is obtained whose instances...
Book
These proceedings contain the final versions of the papers presented at the 7th International Workshop on Finite-State Methods and Natural Language Processing, FSMNLP 2008. The workshop was held in Ispra, Italy, on September 11–12, 2008. The event was the seventh instance in the series of FSMNLP workshops, and the third that was arranged as a stand...
Conference Paper
Full-text available
We introduce a new CSP operator for modeling scenarios characterised by partial or optional parallelism. We provide examples of such scenarios and sketch the semantics of our operator. Relevant properties are proven.
Article
An incremental algorithm to construct a lattice from a collection of sets is derived, refined, analyzed, and related to a similar previously published algorithm for constructing concept lattices. The lattice constructed by the algorithm is the one obtained by closing the collection of sets with respect to set intersection. The analysis explains the...
Conference Paper
We propose a concept lattice-based approach to multiple two dimensional pattern matching problems. It is assumed that a pattern can be described as a set of vertices (or pixels) and that a small set of vertices around each vertex corresponds to an attribute in a concept lattice. Typically, an attribute should be a succinct characterisation of domai...
Conference Paper
Full-text available
We present two algorithms for minimizing deterministic frontier-to-root tree automata (dfrtas) and compare them with their string counterparts. The presentation is incremental, starting out from definitions of minimality of automata and state equivalence, in the style of earlier algorithm taxonomies by the authors. The first algorithm is the classi...