Michael Spratling

Verified
Michael verified their affiliation via an institutional email.
  • PhD
  • Research Associate at University of Luxembourg

About

Publications: 120
Reads: 25,554
Citations: 4,031
Current institution: University of Luxembourg
Current position: Research Associate

Publications (120)
Preprint
Large language models have achieved remarkable success in recent years, primarily due to the implementation of self-attention mechanisms. However, traditional Softmax attention suffers from numerical instability and reduced performance as the length of inference tokens increases. This paper addresses these issues by decomposing the Softmax operatio...
Preprint
Full-text available
Cross-entropy (CE) loss is the de-facto standard for training deep neural networks to perform classification. However, CE-trained deep neural networks struggle with robustness and generalisation issues. To alleviate these issues, we propose high error margin (HEM) loss, a variant of multi-class margin loss that overcomes the training issues of othe...
Preprint
Multi-Task Learning (MTL) involves the concurrent training of multiple tasks, offering notable advantages for dense prediction tasks in computer vision. MTL not only reduces training and inference time as opposed to having multiple single-task models, but also enhances task accuracy through the interaction of multiple tasks. However, existing metho...
Article
Most existing anomaly detection (AD) methods require a dedicated model for each category. Such a paradigm, despite its promising results, is computationally expensive and inefficient, thereby failing to meet the requirements for real-world applications. Inspired by how humans detect anomalies, by comparing a query image to known normal ones, this a...
Article
‘Cellular psychology’ is a new field of inquiry that studies dendritic mechanisms for adapting mental events to the current context, thus increasing their coherence, flexibility, effectiveness, and comprehensibility. Apical dendrites of neocortical pyramidal cells have a crucial role in cognition – those dendrites receive input from diverse sources...
Article
Full-text available
Deep neural networks are vulnerable to adversarial examples. Adversarial training (AT) is an effective defense against adversarial examples. However, AT is prone to overfitting which degrades robustness substantially. Recently, data augmentation (DA) was shown to be effective in mitigating robust overfitting if appropriately designed and optimized...
Article
Modelling one‐to‐many type mappings in problems with a temporal component can be challenging. Backpropagation is not applicable to networks that perform discrete sampling and is also susceptible to gradient instabilities, especially when applied to longer sequences. In this paper, we propose two recurrent neural network architectures that leverage...
Preprint
Most existing anomaly detection methods require a dedicated model for each category. Such a paradigm, despite its promising results, is computationally expensive and inefficient, thereby failing to meet the requirements for real-world applications. Inspired by how humans detect anomalies, by comparing a query image to known normal ones, this paper...
Article
Multitask learning (MTL) aims to learn multiple tasks simultaneously while exploiting their mutual relationships. By using shared resources to simultaneously calculate multiple outputs, this learning paradigm has the potential to have lower memory requirements and inference times compared to the traditional approach of using separate methods for ea...
Article
Bowers et al. eloquently describe issues with current deep neural network (DNN) models of vision, claiming that there are deficits both with the methods of assessment, and with the models themselves. I am in agreement with both these claims, but propose a different recipe to the one outlined in the target article for overcoming these issues.
Preprint
Full-text available
Reliable and robust evaluation methods are a necessary first step towards developing machine learning models that are themselves robust and reliable. Unfortunately, current evaluation protocols typically used to assess classifiers fail to comprehensively evaluate performance as they tend to rely on limited types of test data, and ignore others. For...
Preprint
Multi-Task Learning (MTL) aims to learn multiple tasks simultaneously while exploiting their mutual relationships. By using shared resources to simultaneously calculate multiple outputs, this learning paradigm has the potential to have lower memory requirements and inference times compared to the traditional approach of using separate methods for e...
Preprint
Full-text available
Deep neural networks are vulnerable to adversarial examples. Adversarial training (AT) is an effective defense against adversarial examples. However, AT is prone to overfitting which degrades robustness substantially. Recently, data augmentation (DA) was shown to be effective in mitigating robust overfitting if appropriately designed and optimized...
Preprint
Full-text available
The N400 event-related component has been widely used to investigate the neural mechanisms underlying real-time language comprehension. However, despite decades of research, there is still no unifying theory that can explain both its temporal dynamics and functional properties. In this work, we show that predictive coding – a biologically plausible...
Preprint
Full-text available
Deep neural networks can be easily fooled into making incorrect predictions through corruption of the input by adversarial perturbations: human-imperceptible artificial noise. So far adversarial training has been the most successful defense against such adversarial attacks. This work focuses on improving adversarial training to boost adversarial ro...
Preprint
Tiny object detection has become an active area of research because images with tiny targets are common in several important real-world scenarios. However, existing tiny object detection methods use standard deep neural networks as their backbone architecture. We argue that such backbones are inappropriate for detecting tiny objects as they are des...
Article
Full-text available
Few-shot segmentation (FSS) aims to segment unseen classes using a few annotated samples. Typically, a prototype representing the foreground class is extracted from annotated support image(s) and is matched to features representing each pixel in the query image. However, models learnt in this way are insufficiently discriminatory, and often produce...
Preprint
Full-text available
Adversarial training suffers from the issue of robust overfitting, which seriously impairs its generalization performance. Data augmentation, which is effective at preventing overfitting in standard training, has been observed by many previous works to be ineffective in mitigating overfitting in adversarial training. This work proves that, contrary...
Preprint
Full-text available
Adversarial training is widely used to improve the robustness of deep neural networks to adversarial attack. However, adversarial training is prone to overfitting, and the cause is far from clear. This work sheds light on the mechanisms underlying overfitting through analyzing the loss landscape w.r.t. the input. We find that robust overfitting res...
Article
Adversarial training is widely used to improve the robustness of deep neural networks to adversarial attack. However, adversarial training is prone to overfitting, and the cause is far from clear. This work sheds light on the mechanisms underlying overfitting through analyzing the loss landscape w.r.t. the input. We find that robust overfitting res...
Chapter
This paper considers few-shot anomaly detection (FSAD), a practical yet under-studied setting for anomaly detection (AD), where only a limited number of normal images are provided for each category at training. So far, existing FSAD studies follow the one-model-per-category learning paradigm used for standard AD, and the inter-category commonality...
Preprint
Full-text available
Initialising the synaptic weights of artificial neural networks (ANNs) with orthogonal matrices is known to alleviate vanishing and exploding gradient problems. A major objection against such initialisation schemes is that they are deemed biologically implausible as they mandate factorization techniques that are difficult to attribute to a neurobio...
Preprint
Few-shot segmentation (FSS) aims to segment unseen classes using a few annotated samples. Typically, a prototype representing the foreground class is extracted from annotated support image(s) and is matched to features representing each pixel in the query image. However, models learnt in this way are insufficiently discriminatory, and often produce...
Preprint
Few-shot segmentation aims to segment images containing objects from previously unseen classes using only a few annotated samples. Most current methods focus on using object information extracted, with the aid of human annotations, from support images to identify the same objects in new query images. However, background information can also be usef...
Article
Full-text available
Finding a template in a search image is an important task underlying many computer vision applications. This is typically solved by calculating a similarity map using features extracted from the separate images. Recent approaches perform template matching in a deep feature space, produced by a convolutional neural network (CNN), which is found to p...
Preprint
This paper considers few-shot anomaly detection (FSAD), a practical yet under-studied setting for anomaly detection (AD), where only a limited number of normal images are provided for each category at training. So far, existing FSAD studies follow the one-model-per-category learning paradigm used for standard AD, and the inter-category commonality...
Article
Most current trackers utilise an appearance model to localise the target object in each frame. However, such approaches often fail when there are similar looking distractor objects in the surrounding background. This paper promotes an approach that can be combined with many existing trackers to tackle this issue and improve tracking robustness. The...
Article
Full-text available
Many current trackers utilise an appearance model to localise the target object in each frame. However, such approaches often fail when there are similar-looking distractor objects in the surrounding background, meaning that target appearance alone is insufficient for robust tracking. In contrast, humans consider the distractor objects as additiona...
Chapter
Finding a template in a search image is an important task underlying many computer vision applications. Recent approaches perform template matching in a deep feature-space, produced by a convolutional neural network (CNN), which is found to provide more tolerance to changes in appearance. In this article, we investigate whether enhancing the CNN’s...
Preprint
Finding a template in a search image is an important task underlying many computer vision applications. Recent approaches perform template matching in a feature-space, such as that produced by a convolutional neural network (CNN), that provides more tolerance to changes in appearance. In this article we investigate combining features from different...
Article
Full-text available
Recognising and locating image patches or sets of image features is an important task underlying much work in computer vision. Traditionally this has been accomplished using template matching. However, template matching is notoriously brittle in the face of changes in appearance caused by, for example, variations in viewpoint, partial occlusion, an...
Article
Full-text available
Recurrent Neural Networks have been widely used to process sequence data, but have long been criticized for their biological implausibility and training difficulties related to vanishing and exploding gradients. This paper presents a novel algorithm for training recurrent networks, target propagation through time (TPTT), that outperforms standard b...
Preprint
Recognising and locating image patches or sets of image features is an important task underlying much work in computer vision. Traditionally this has been accomplished using template matching. However, template matching is notoriously brittle in the face of changes in appearance caused by, for example, variations in viewpoint, partial occlusion, an...
Article
Recent neurophysiological data showing the effects of locomotion on neural activity in mouse primary visual cortex has been interpreted as providing strong support for the predictive coding account of cortical function. Specifically, this work has been interpreted as providing direct evidence that prediction-error, a distinguishing property of pred...
Article
Full-text available
This paper proposes two types of recommender systems based on sparse dictionary coding. Firstly, a novel predictive recommender system that attempts to predict a user’s future rating of a specific item. Secondly, a top-n recommender system which finds a list of items predicted to be most relevant for a given user. The proposed methods are assessed...
Article
Full-text available
Sparse representations have been widely used for many image processing tasks. In this paper, a sparse reconstruction-based discrimination (SRBD) method, which was previously proposed for the classification of image patches, is utilized to improve boundary detection in colour images. This method is applied to refining the results generated by three...
Article
Full-text available
A comprehensive model of gaze control must account for a number of empirical observations at both the behavioural and neurophysiological levels. The computational model presented in this article can simulate the coordinated movements of the eye, head, and body required to perform horizontal gaze shifts. In doing so it reproduces the predictable rel...
Article
Full-text available
The coordinated movement of the eyes, the head and the arm is an important ability in both animals and humanoid robots. To achieve this, the brain and the robot control system need to be able to perform complex non-linear sensory-motor transformations in the forward and inverse directions between many degrees of freedom. In this article, we apply a...
Article
Full-text available
Predictive coding has been proposed as a model of the hierarchical perceptual inference process performed in the cortex. However, results demonstrating that predictive coding is capable of performing the complex inference required to recognise objects in natural images have not previously been presented. This article proposes a hierarchical neural...
Article
Full-text available
Gaze shifts require the coordinated movement of both the eyes and the head in both animals and humanoid robots. To achieve this, the brain and the robot control system need to be able to perform complex non-linear sensory-motor transformations between many degrees of freedom and resolve the redundancy in such a system. In this article we propose a...
Article
Full-text available
Background: The predictive coding/biased competition (PC/BC) model of V1 has previously been applied to locate boundaries defined by local discontinuities in intensity within an image. Objective: Here PC/BC is extended to perform contour detection for colour images. Methods: The proposed extensions are inspired by neurophysiological data from single n...
Article
Full-text available
Predictive coding (PC) is a leading theory of cortical function that has previously been shown to explain a great deal of neurophysiological and psychophysical data. Here it is shown that PC can perform almost exact Bayesian inference when applied to computing with population codes. It is demonstrated that the proposed algorithm, based on PC, can:...
Article
The Hough transform (HT) is widely used for feature extraction and object detection. However, during the HT individual image elements vote for many possible parameter values. This results in a dense accumulator array and problems identifying the parameter values that correspond to image features. This article proposes a new method for implementing...
Article
Full-text available
Previous work has shown that predictive coding can provide a detailed explanation of a very wide range of low-level perceptual processes. It is also widely believed that predictive coding can account for high-level, cognitive, abilities. This article provides support for this view by showing that predictive coding can simulate phenomena such as cat...
Article
Full-text available
Predictive coding is a leading theory of how the brain performs probabilistic inference. However, there are a number of distinct algorithms which are described by the term “predictive coding”. This article provides a concise review of these different predictive coding algorithms, highlighting their similarities and differences. Five algorithms are...
Article
Full-text available
The human visual system uses saccadic and vergence eye movements to foveate visual targets. To mimic this aspect of the biological visual system the PC/BC-DIM neural network is used as an omni-directional basis function network for learning and performing sensory-sensory and sensory-motor transformations without using any hard-coded geometric infor...
Article
Inspired by the probability of boundary (Pb) algorithm, a simplified texture gradient method has been developed to locate texture boundaries within grayscale images. Despite considerable simplification, the proposed algorithm’s ability to locate texture boundaries is comparable with Pb’s texture boundary method. The proposed texture gradient method...
Article
Full-text available
Visual speech recognition aims to identify the sequence of phonemes from continuous speech. Unlike the traditional approach of using 2D image feature extraction methods to derive features of each video frame separately, this paper proposes a new approach using a 3D (spatio-temporal) Discrete Cosine Transform to extract features of each feasible sub...
Article
Representing signals as linear combinations of basis vectors sparsely selected from an overcomplete dictionary has proven to be advantageous for many applications in pattern recognition, machine learning, signal processing, and computer vision. While this approach was originally inspired by insights into cortical information processing, biologicall...
Article
A distinction is commonly made between synaptic connections capable of evoking a response ("drivers") and those that can alter ongoing activity but not initiate it ("modulators"). Here it is proposed that, in cortex, both drivers and modulators are an emergent property of the perceptual inference performed by cortical circuits. Hence, it is propose...
Article
It is often helpful to distinguish between a theory (Marr's computational level) and a specific implementation of that theory (Marr's physical level). However, in the target article, a single implementation of predictive coding is presented as if this were the theory of predictive coding itself. Other implementations of predictive coding have been...
Article
Algorithms that encode images using a sparse set of basis functions have previously been shown to explain aspects of the physiology of primary visual cortex (V1), and have been used for applications such as image compression, restoration, and classification. Here, a sparse coding algorithm that has previously been used to account for the response p...
Article
Full-text available
In multimodal integration and sensorimotor transformation areas of the posterior parietal cortex (PPC), neural responses often appear encoded in spatial reference frames that are intermediate to the intrinsic sensory reference frames, for example, eye-centered for visual or head-centered for auditory stimulation. Many sensory responses in these are...
Article
PC/BC ("Predictive coding/Biased competition") is a simple computational model that has previously been shown to explain a very wide range of V1 response properties. This article extends work on the PC/BC model of V1 by showing that it can also account for V1 response properties measured using the reverse correlation methodology. Reverse correlatio...
Article
Full-text available
A method is presented for learning the reciprocal feedforward and feedback connections required by the predictive coding model of cortical function. When this method is used, feedforward and feedback connections are learned simultaneously and independently in a biologically plausible manner. The performance of the proposed algorithm is evaluated by...
Article
The predictive coding/biased competition (PC/BC) model is a specific implementation of the predictive coding theory that has previously been shown to provide a detailed account of the response properties of orientation tuned cells in primary visual cortex (V1). Here it is shown that the same model can successfully simulate psychophysical data relat...
Article
Full-text available
The combination of two or more population-coded signals in a neural model of predictive coding can give rise to multiplicative gain modulation in the response properties of individual neurons. Synaptic weights generating these multiplicative response properties can be learned using an unsupervised, Hebbian learning rule. The behavior of the model i...
Article
Cross-orientation suppression and surround suppression have been extensively studied in primary visual cortex (V1). These two forms of suppression have some distinct properties which has led to the suggestion that they are generated by different underlying mechanisms. Furthermore, it has been suggested that mechanisms other than intracortical inhib...
Article
Full-text available
A simple model is shown to account for a large range of V1 classical, and nonclassical, receptive field properties including orientation tuning, spatial and temporal frequency tuning, cross-orientation suppression, surround suppression, and facilitation and inhibition by flankers and textured surrounds. The model is an implementation of the predict...
Article
A hierarchical neural network model is used to learn, without supervision, sensory-sensory coordinate transformations like those believed to be encoded in the dorsal pathway of the cerebral cortex. The resulting representations of visual space are invariant to eye orientation, neck orientation, or posture in general. These posture invariant spatial...
Article
Full-text available
This paper demonstrates that nonnegative matrix factorisation is mathematically related to a class of neural networks that employ negative feedback as a mechanism of competition. This observation inspires a novel learning algorithm which we call Divisive Input Modulation (DIM). The proposed algorithm provides a mathematically simple and computation...
Article
Past physiological and psychophysical experiments have shown that attention can modulate the effects of contextual information appearing outside the classical receptive field of a cortical neuron. Specifically, it has been suggested that attention, operating via cortical feedback connections, gates the effects of long-range horizontal connections u...
Article
Full-text available
Book synopsis: What are the processes, from conception to adulthood, that enable a single cell to grow into a sentient adult? The processes that occur along the way are so complex that any attempt to understand development necessitates a multi-disciplinary approach, integrating data from cognitive studies, computational work, and neuroimaging - an...
Article
Full-text available
In this response, we consider four main issues arising from the commentaries to the target article. These include further details of the theory of interactive specialization, the relationship between neuroconstructivism and selectionism, the implications of neuroconstructivism for the notion of representation, and the role of genetics in theories o...
Article
Attention acts, through cortical feedback pathways, to enhance the response of cells encoding expected or predicted information. Such observations are inconsistent with the predictive coding theory of cortical function which proposes that feedback acts to suppress information predicted by higher-level cortical regions. Despite this discrepancy, thi...
Article
Full-text available
A simple variation of the standard biased competition model is shown, via some trivial mathematical manipulations, to be identical to predictive coding. Specifically, it is shown that a particular implementation of the biased competition model, in which nodes compete via inhibition that targets the inputs to a cortical region, is mathematically equ...
Article
Full-text available
Neuroconstructivism is a theoretical framework focusing on the construction of representations in the developing brain. Cognitive development is explained as emerging from the experience-dependent development of neural structures supporting mental representations. Neural development occurs in the context of multiple interacting constraints acting o...
Article
Full-text available
In order to perform object recognition it is necessary to learn representations of the underlying components of images. Such components correspond to objects, object-parts, or features. Non-negative matrix factorisation is a generative model that has been specifically proposed for finding such meaningful representations of image data, through the u...
Article
Top-down, feedback, influences are known to have significant effects on visual information processing. Such influences are also likely to affect perceptual learning. This paper employs a computational model of the cortical region interactions underlying visual perception to investigate possible influences of top-down information on learning. The r...
Article
In order to perform object recognition, it is necessary to form perceptual representations that are sufficiently specific to distinguish between objects, but that are also sufficiently flexible to generalize across changes in location, rotation, and scale. A standard method for learning perceptual representations that are invariant to viewpoint is...
Article
In order to perform object recognition, it is necessary to form perceptual representations that are sufficiently specific to distinguish between objects, but that are also sufficiently flexible to generalise across changes in location, rotation and scale. A standard method for learning perceptual representations that are invariant to viewpoint is t...
Article
Page is to be congratulated for challenging some misconceptions about neural representation. However, his target article, and the commentaries to it, highlight that the terms “local” and “distributed” are open to misinterpretation. These terms provide a poor description of neural coding strategies and a better taxonomy might resolve some of the iss...
Article
A long running debate has concerned the question of whether neural representations are encoded using a distributed or a local coding scheme. In both schemes individual neurons respond to certain specific patterns of pre-synaptic activity. Hence, rather than being dichotomous, both coding schemes are based on the same representational mechanism. We...
