Steve Kroon

Stellenbosch University | SUN · Division of Computer Science

Doctor of Philosophy

About

36
Publications
23,757
Reads
244
Citations
Additional affiliations
January 2020 - present
Stellenbosch University
Position
  • Professor (Associate)

Publications (36)
Preprint
Full-text available
Initial work on variational autoencoders assumed independent latent variables with simple distributions. Subsequent work has explored incorporating more complex distributions and dependency structures: including normalizing flows in the encoder network allows latent variables to entangle non-linearly, creating a richer class of distributions for th...
Preprint
Full-text available
Graphical flows add further structure to normalizing flows by encoding non-trivial variable dependencies. Previous graphical flow models have focused primarily on a single flow direction: the normalizing direction for density estimation, or the generative direction for inference. However, to use a single flow to perform tasks in both directions, th...
Preprint
Full-text available
This paper is concerned with distributed detection of central nodes in complex networks using closeness centrality. Closeness centrality plays an essential role in network analysis. Evaluating closeness centrality exactly requires complete knowledge of the network; for large networks, this may be inefficient, so closeness centrality should be appro...
Chapter
Valuable data are often spread out over different similar spreadsheets. Consolidating this data for further analysis can take considerable effort for a spreadsheet user without programming skills. We introduce Spreadsheet Layout Constraint Integration (SpLyCI), a system to semi-automatically merge multiple spreadsheets and lay the result out in a s...
Preprint
Full-text available
We propose a method for combining probabilistic outputs of classifiers to make a single consensus class prediction when no further information about the individual classifiers is available, beyond that they have been trained for the same task. The lack of relevant prior information rules out typical applications of Bayesian or Dempster-Shafer metho...
Article
Recent work in signal propagation theory has shown that dropout limits the depth to which information can propagate through a neural network. In this paper, we investigate the effect of initialisation on training speed and generalisation for ReLU networks within this depth limit. We ask the following research question: given that critical initialis...
Article
Recent work has established the equivalence between deep neural networks and Gaussian processes (GPs), resulting in so-called neural network Gaussian processes (NNGPs). The behaviour of these models depends on the initialisation of the corresponding network. In this work, we consider the impact of noise regularisation (e.g. dropout) on NNGPs, and r...
Article
Full-text available
Existing algorithms like nested sampling and annealed importance sampling are able to produce accurate estimates of the marginal likelihood of a model, but tend to scale poorly to large data sets. This is because these algorithms need to recalculate the log-likelihood for each iteration by summing over the whole data set. Efficient scaling to large...
Preprint
Full-text available
We consider estimating the marginal likelihood in settings with independent and identically distributed (i.i.d.) data. We propose estimating the predictive distributions in a sequential factorization of the marginal likelihood in such settings by using stochastic gradient Markov Chain Monte Carlo techniques. This approach is far more efficient than...
Article
Full-text available
We consider estimating the marginal likelihood in settings with independent and identically distributed (i.i.d.) data. We propose estimating the predictive distributions in a sequential factorization of the marginal likelihood in such settings by using stochastic gradient Markov Chain Monte Carlo techniques. This approach is far more efficient than...
Preprint
Bayesian neural networks (BNNs) have developed into useful tools for probabilistic modelling due to recent advances in variational inference enabling large scale BNNs. However, BNNs remain brittle and hard to train, especially: (1) when using deep architectures consisting of many hidden layers and (2) in situations with large weight variances. We u...
Preprint
Full-text available
Recent work in signal propagation theory has shown that dropout limits the depth to which information can propagate through a neural network. In this paper, we investigate the effect of initialisation on training speed and generalisation for ReLU networks within this depth limit. We ask the following research question: given that critical initialis...
Preprint
Recent work has established the equivalence between deep neural networks and Gaussian processes (GPs), resulting in so-called neural network Gaussian processes (NNGPs). The behaviour of these models depends on the initialisation of the corresponding network. In this work, we consider the impact of noise regularisation (e.g. dropout) on NNGPs, and r...
Preprint
Full-text available
The problem of coordination without a priori information about the environment is important in robotics. Applications vary from formation control to search and rescue. This paper considers the problem of search by a group of solitary robots: self-interested robots without a priori knowledge about each other, and with restricted communication capaci...
Poster
This poster was about a coordination strategy for autonomous solitary robots.
Preprint
Stochastic regularisation is an important weapon in the arsenal of a deep learning practitioner. However, despite recent theoretical advances, our understanding of how noise influences signal propagation in deep neural networks remains limited. By extending recent work based on mean field theory, we develop a new framework for signal propagation in...
Preprint
Denoising autoencoders (DAEs) have proven useful for unsupervised representation learning, but a thorough theoretical understanding is still lacking of how the input noise influences learning. Here we develop theory for how noise influences learning in DAEs. By focusing on linear DAEs, we are able to derive analytic expressions that exactly describ...
Article
Full-text available
Missing data values and differing sampling rates, particularly for important parameters such as particle size and stream composition, are a common problem in minerals processing plants. Missing data imputation is used to avoid information loss (due to downsampling or discarding incomplete records). A recent deep-learning technique, variational auto...
Article
Full-text available
In this paper, we present a method for computing the marginal likelihood, also known as the model likelihood or Bayesian evidence, from Markov Chain Monte Carlo (MCMC), or other sampled posterior distributions. In order to do this, one needs to be able to estimate the density of points in parameter space, and this can be challenging in high numbers...
Article
We compute the Bayesian Evidence for the theoretical models considered in the main analysis of Planck cosmic microwave background data. By utilising carefully-defined nearest-neighbour distances in parameter space, we reuse the Monte Carlo Markov Chains already produced for parameter inference to compute Bayes factors $B$ for many different models...
Conference Paper
Full-text available
Unsupervised pre-training of neural networks has been shown to act as a regularization technique, improving performance and reducing model variance. Recently, fully convolutional networks (FCNs) have shown state-of-the-art results on various semantic segmentation tasks. Unfortunately, there is no efficient approach available for FCNs to benefit fro...
Article
Full-text available
Reinforcement Learning (RL) is a powerful technique to develop intelligent agents in the field of Artificial Intelligence (AI). This paper proposes a new RL algorithm called the Temporal-Difference value iteration algorithm with state-value functions and presents applications of this algorithm to the decision-making problems challenged in the Robo...
Conference Paper
Full-text available
Consider a single camera mounted on the inside of a vehicle's windscreen used for detecting potholes and other obstacles on the road surface. This paper outlines three approaches to the depth estimation problem of determining the distance to these obstacles in the range of 5 m to 30 m. We provide an empirical evaluation of the accuracy of these app...
Data
Pothole dataset compiled at Electrical and Electronic Department, Stellenbosch University, 2015. Dataset available at https://goo.gl/wfj1B2. The entire dataset consists of two different sets: one was considered simple and the other more complex. These datasets do share some files and there are a few instances where two different images would...
Conference Paper
Accurate classifiers for short texts are valuable assets in many applications. Especially in online communities, where users contribute to content in the form of posts and comments, an effective way of automatically categorising posts proves highly valuable. This paper investigates the use of N-grams as features for short text classification, and c...
Conference Paper
Building sophisticated computer players for games has been of interest since the advent of artificial intelligence research. Monte Carlo tree search (MCTS) techniques have led to recent advances in the performance of computer players in a variety of games. Without any refinements, the commonly-used upper confidence bounds applied to trees (UCT) sel...
Conference Paper
Monte-Carlo Tree Search (MCTS) is currently the dominant algorithm in Computer Go. MCTS is an asymmetric tree search technique employing stochastic simulations to evaluate leaves and guide the search. Using features to further guide MCTS is a powerful approach to improving performance. In Computer Go, these features are typically comprised of a num...
Conference Paper
Parallelisation of computationally expensive algorithms, such as Monte-Carlo Tree Search (MCTS), has become increasingly important in order to increase algorithm performance by making use of commonplace parallel hardware. Oakfoam, an MCTS-based Computer Go player, was extended to support parallel processing on multi-core and cluster systems. This w...
Conference Paper
The Twitter lists feature was launched in late 2009 and enables the creation of curated groups containing Twitter users. Each user can be a list author and decide the basis on which other users are added to a list. The most popular lists are those that associate with a topic. Twitter lists can be used as a powerful organisation tool, but its widesp...
Article
The Binary Jumbled String Matching problem is defined as: Given a string $s$ over $\{a,b\}$ of length $n$ and a query $(x,y)$, with $x,y$ non-negative integers, decide whether $s$ has a substring $t$ with exactly $x$ $a$'s and $y$ $b$'s. Previous solutions created an index of size O(n) in a pre-processing step, which was then used to answer queries...
Article
Full-text available
The support vector machine (SVM) is a technique for function estimation which was proposed in the early 1990s. The technique provided state-of-the-art performance on many problems familiar to the machine learning community, and has hence gained enormous popularity among them. Despite the solid theoretical justification for the SVM as the solution...
Article
The support vector machine (SVM) is a technique for function estimation which was proposed in the early 1990s. This paper is a follow-up to the paper Getting to grips with Support Vector Machines: Theory, which gave an introductory overview of SVMs. This paper discusses issues arising when applying the SVM in practice, giving useful hints and sugge...
Article
The binary jumbled string matching problem is defined as: Given a binary string s of length n and a query (x, y), with x, y non-negative integers, decide whether s has a substring t with exactly x a's and y b's. Previous solutions created an index of size O(n) in a pre-processing step, which was then used to answer queries in constant time. The fas...
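The index-then-query idea in the abstract above can be illustrated with a minimal sketch. It relies on a standard property of binary strings: sliding a fixed-length window by one position changes the number of a's by at most one, so for each length the achievable counts form a contiguous interval. Storing the minimum and maximum count per length gives an index of size O(n) that answers queries in constant time. The function names `build_index` and `has_jumbled_match` are hypothetical, and the quadratic-time construction shown here is the naive one, not the faster construction the paper is concerned with.

```python
def build_index(s):
    """For each substring length k, record the min and max number of 'a's.

    Assumes s contains only the characters 'a' and 'b'.
    Naive O(n^2) construction, used here for illustration only.
    """
    n = len(s)
    # prefix[i] = number of 'a's in s[:i]
    prefix = [0] * (n + 1)
    for i, ch in enumerate(s):
        prefix[i + 1] = prefix[i] + (ch == "a")

    lo = [0] * (n + 1)  # lo[k] = fewest 'a's in any substring of length k
    hi = [0] * (n + 1)  # hi[k] = most 'a's in any substring of length k
    for k in range(1, n + 1):
        counts = [prefix[i + k] - prefix[i] for i in range(n - k + 1)]
        lo[k], hi[k] = min(counts), max(counts)
    return lo, hi


def has_jumbled_match(index, x, y):
    """Decide in O(1) whether some substring has exactly x 'a's and y 'b's."""
    lo, hi = index
    k = x + y
    if k >= len(lo):  # requested length exceeds the string length
        return False
    # Counts for length k form a contiguous interval [lo[k], hi[k]].
    return lo[k] <= x <= hi[k]


index = build_index("abbab")
print(has_jumbled_match(index, 2, 2))  # substring "abba" has 2 a's and 2 b's
print(has_jumbled_match(index, 2, 0))  # "aa" never occurs
```

The index stores two integers per substring length, which is where the O(n) size comes from; only the construction time differs between this sketch and faster methods.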
