About
71
Publications
51,593
Reads
11,687
Citations
Introduction
Current institution
NNAISENSE SA
Current position
- CEO
Additional affiliations
September 2007 - January 2014
April 2004 - August 2014
Publications
Conventional similarity metrics used to sustain diversity in evolving populations are not well suited to sequential decision tasks. Genotypes and phenotypic structure are poor predictors of how solutions will actually behave in the environment. In this paper, we propose measuring similarity directly on the behavioral trajectories of evolving...
Many real-world sequence learning tasks require the prediction of sequences of labels from noisy, unsegmented input data. In speech recognition, for example, an acoustic signal is transcribed into words or sub-word units. Recurrent neural networks (RNNs) are powerful sequence learners that would seem well suited to such tasks. However, because th...
The idea of using evolutionary computation to train artificial neural networks, or neuroevolution (NE), for reinforcement learning (RL) tasks has now been around for over 20 years. However, as RL tasks become more challenging, the networks required become larger, as do their genomes. But, scaling NE to large nets (i.e. tens of thousands of weights)...
Efficient exploration is an unsolved problem in Reinforcement Learning. We introduce Model-Based Active eXploration (MAX), an algorithm that actively explores the environment. It minimizes data required to comprehensively model the environment by planning to observe novel events, instead of merely reacting to novelty encountered by chance. Non-stat...
This paper introduces Bayesian Flow Networks (BFNs), a new class of generative model in which the parameters of a set of independent distributions are modified with Bayesian inference in the light of noisy data samples, then passed as input to a neural network that outputs a second, interdependent distribution. Starting from a simple prior and iter...
With the goal of designing novel inhibitors for SARS-CoV-1 and SARS-CoV-2, we propose the general molecule optimization framework, Molecular Neural Assay Search (MONAS), consisting of three components: a property predictor which identifies molecules with specific desirable properties, an energy model which approximates the statistical similarity of...
Control applications present hard operational constraints. Violating these constraints can result in unsafe behavior. This paper introduces Safe Interactive Model Based Learning (SiMBL), a framework to refine an existing controller and a system model while operating on the real environment. SiMBL is composed of the following trainable components: a Lyapuno...
This paper introduces Non-Autonomous Input-Output Stable
Network (NAIS-Net), a very deep architecture where each stacked
processing block is derived from a time-invariant non-autonomous
dynamical system. Non-autonomy is implemented by skip connections
from the block input to each of the unrolled processing stages and
allows stability to be enforced...
This paper introduces "Non-Autonomous Input-Output Stable Network" (NAIS-Net), a very deep architecture where each stacked processing block is derived from a time-invariant non-autonomous dynamical system. Non-autonomy is implemented by skip connections from the block input to each of the unrolled processing stages and allows stability to be enforc...
Recently proposed neural network activation functions such as rectified
linear, maxout, and local winner-take-all have allowed for faster and more
effective training of deep neural architectures on large and complex datasets.
The common trait among these functions is that they implement local competition
between small groups of units within a layer...
This paper introduces Kernel-based Information Criterion (KIC) for model selection in regression analysis. The kernel-based complexity measure in KIC efficiently computes the interdependency between parameters of the model using a novel variable-wise variance and yields selection of better, more robust regressors. Experimental results show superior...
Dealing with high-dimensional input spaces, like visual input, is a challenging task for reinforcement learning (RL). Neuroevolution (NE), used for continuous RL problems, has to either reduce the problem dimensionality by (1) compressing the representation of the neural network controllers or (2) employing a pre-processor (compressor) that transfo...
Planning movements for humanoid robots is still a major challenge due to the very high degrees-of-freedom involved. Most humanoid control frameworks incorporate dynamical constraints related to a task that require detailed knowledge of the robot’s dynamics, making them impractical for efficient planning. In previous work, we introduced a novel plann...
Traditional convolutional neural networks (CNN) are stationary and
feedforward. They neither change their parameters during evaluation nor use
feedback from higher to lower layers. Real brains, however, do. So does our
Deep Attention Selective Network (dasNet) architecture. DasNet's feedback
structure can dynamically alter its convolutional filter s...
Sequence prediction and classification are ubiquitous and challenging
problems in machine learning that can require identifying complex dependencies
between temporally distant inputs. Recurrent Neural Networks (RNNs) have the
ability, in theory, to cope with these temporal dependencies by virtue of the
short-term memory implemented by their recurre...
Local competition among neighboring neurons is common in biological neural networks (NNs). In this paper, we apply the concept to gradient-based, backprop-trained artificial multilayer NNs. NNs with competing linear units tend to outperform those with non-competing nonlinear units, and avoid catastrophic forgetting when training sets change over ti...
Neuroevolution has yet to scale up to complex reinforcement learning tasks
that require large networks. Networks with many inputs (e.g. raw video) imply a
very high dimensional search space if encoded directly. Indirect methods use a
more compact genotype representation that is transformed into networks of
potentially arbitrary size. In this paper,...
The idea of using evolutionary computation to train artificial neural networks, or neuroevolution (NE), has now been around for over 20 years. The main appeal of this approach is that, because it does not rely on gradient information (e.g. backpropagation), it can potentially harness the universal function approximation capability of neural network...
We present a new way of converting a reversible finite Markov chain into a
non-reversible one, with a theoretical guarantee that the asymptotic variance
of the MCMC estimator based on the non-reversible chain is reduced. The method
is applicable to any reversible chain whose states are not connected through a
tree, and can be interpreted graphicall...
This paper presents initial results of Generalized Compressed Network Search (GCNS), a method for automatically identifying the important frequencies for neural networks encoded as a set of Fourier-type coefficients (i.e. "compressed" networks). GCNS achieves better compression than our previous approach, and promises better generalization capabili...
The Natural Evolution Strategies (NES) family of search algorithms have been shown to be efficient black-box optimizers, but the most powerful version xNES does not scale to problems with more than a few hundred dimensions. And the scalable variant, SNES, potentially ignores important correlations between parameters. This paper introduces Block Dia...
Indirect encoding schemes for neural network phenotypes can represent large networks compactly. In previous work, we presented a new approach where networks are encoded indirectly as a set of Fourier-type coefficients that decorrelate weight matrices such that they can often be represented by a small number of genes, effectively reducing the search...
In this paper, we introduce a method, called Compressed Network Complexity Search (CNCS), for automatically determining the complexity of compressed networks (neural networks encoded indirectly by Fourier-type coefficients) that favors parsimonious solutions. CNCS maintains a probability distribution over complexity classes that it uses to select w...
We analyze the size of the dictionary constructed from online kernel
sparsification, using a novel formula that expresses the expected determinant
of the kernel Gram matrix in terms of the eigenvalues of the covariance
operator. Using this formula, we are able to connect the cardinality of the
dictionary with the eigen-decay of the covariance opera...
Traditional Reinforcement Learning (RL) has focused on problems involving
many states and few actions, such as simple grid worlds. Most real world
problems, however, are of the opposite type, involving few relevant states and
many actions. For example, to return home from a conference, humans identify
only few subgoal states such as lobby, taxi, ai...
Neuroevolution, the artificial evolution of neural networks, has shown great promise on continuous reinforcement learning tasks that require memory. However, it is not yet directly applicable to realistic embedded agents using high-dimensional (e.g. raw video images) inputs, requiring very large networks. In this paper, neuroevolution is combined w...
Deep belief networks (DBNs) are popular for learning compact representations of high-dimensional data. However, most approaches so far rely on having a single, complete training set. If the distribution of relevant features changes during subsequent training stages, the features learned in earlier stages are gradually forgotten. Often it is desirab...
A major limitation in applying evolution strategies to black box optimization is the possibility of convergence into bad local optima. Many techniques address this problem, mostly through restarting the search. However, deciding the new start location is nontrivial since neither a good location nor a good scale for sampling a random restart positio...
The principle of artificial curiosity directs active exploration towards the most informative or most interesting data. We show its usefulness for global black box optimization when data point evaluations are expensive. Gaussian process regression is used to model the fitness function based on all available observations so far. For each candidate p...
We present a novel Natural Evolution Strategy (NES) variant, the Rank-One NES
(R1-NES), which uses a low rank approximation of the search distribution
covariance matrix. The algorithm allows computation of the natural gradient
with cost linear in the dimensionality of the parameter space, and excels in
solving high-dimensional non-separable problem...
The idea of evolving novel rather than fit solutions has recently been offered as a way to automatically discover the kind
of complex solutions that exhibit truly intelligent behavior. So far, novelty search has only been studied in the context
of problems where the number of possible “different” solutions has been limited. In this paper, we show,...
To maximize its success, an AGI typically needs to explore its initially
unknown world. Is there an optimal way of doing so? Here we derive an
affirmative answer for a broad class of environments.
In many reinforcement learning (RL) systems, the value function is approximated as a linear combination of a fixed set of basis functions. Performance can be improved by adding to this set. Previous approaches construct a series of basis functions that in sufficient number can eventually represent the value function. In contrast, we show that there...
We propose a new indirect encoding scheme for neural networks in which the weight matrices are represented in the frequency domain by sets of Fourier coefficients. This scheme exploits spatial regularities in the matrix to reduce the dimensionality of the representation by ignoring high-frequency coefficients, as is done in lossy image compression. We...
The principle of minimum description length suggests looking for the simplest network that works well on the training examples, where simplicity is measured by network description size based on a reasonable programming language for encoding networks. Previous work used an assembler-like universal network encoding language (NEL) and Speed Prior-...
Model complexity is a key concern to any artificial learning system due to its critical impact on generalization. However, EC research has only focused on phenotype structural complexity for static problems. For sequential decision tasks, phenotypes that are very similar in structure can produce radically different behaviors, and the trade-off between f...
This paper presents a reinforcement learning (RL) approach for anemia management in patients undergoing chronic renal failure. Erythropoietin (EPO) is the treatment of choice for this kind of anemia, but it is an expensive drug with some dangerous side-effects that should be considered, especially for patients who do not respond to the treatment....
Applied to certain problems, neuroevolution frequently gets stuck in local optima with very low fitness; in particular, this
is true for some reinforcement learning problems where the input to the controller is a high-dimensional and/or ill-chosen
state description. Evidently, some controller inputs are “poisonous”, and their inclusion induce such...
We present the memetic climber, a simple search algorithm that learns topology and weights of neural networks on different time scales. When applied to the problem of learning control for a simulated racing task with carefully selected inputs to the neural network, the memetic climber outperforms a standard hill-climber. When inputs to the network...
Many complex control problems require sophisticated solutions that are not amenable to traditional controller design. Not only is it difficult to model real world systems, but often it is unclear what kind of behavior is required to solve the task. Reinforcement learning (RL) approaches have made progress by using direct interaction with the task e...
Tying suture knots is a time-consuming task performed frequently during Minimally Invasive Surgery (MIS). Automating this task could greatly reduce total surgery time for patients. Current solutions to this problem replay manually programmed trajectories, but a more general and robust approach is to use supervised machine learning to smooth surgeon...
In recent years, gradient-based LSTM recurrent neural networks (RNNs) solved many previously RNN-unlearnable tasks. Sometimes, however, gradient information is of little use for training RNNs, due to numerous local minima. For such cases, we present a novel method: EVOlution of systems with LINear Outputs (Evolino). Evolino evolves weights to the n...
Tying suture knots is a time-consuming task performed frequently during minimally invasive surgery (MIS). Automating this task could greatly reduce total surgery time for patients. Current solutions to this problem replay manually programmed trajectories, but a more general and robust approach is to use supervised machine learning to smooth surgeon...
Many complex control problems are not amenable to traditional controller design. Not only is it difficult to model real systems, but often it is unclear what kind of behavior is required. Reinforcement learning (RL) has made progress through direct interaction with the task environment, but it has been difficult to scale it up to large and partially o...
We address the problem of autonomously learning controllers for vision-capable mobile robots. We extend McCallum's (1995) Nearest-Sequence Memory algorithm to allow for general metrics over state-action trajectories. We demonstrate the feasibility of our approach by successfully running our algorithm on a real mobile robot. The algorithm is novel a...
Traditional Support Vector Machines (SVMs) need pre-wired finite time windows to predict and classify time series. They do not have an internal state necessary to deal with sequences involving arbitrary long-term dependencies. Here we introduce a new class of recurrent, truly sequential SVM-like devices with internal adaptive states, trained by a n...
Existing Support Vector Machines (SVMs) need pre-wired finite time windows to predict and classify time series. They do not have an internal state necessary to deal with sequences involving arbitrary long-term dependencies. Here we introduce the first recurrent, truly sequential SVM-like devices with internal adaptive states, trained by a novel m...
Recurrent neural networks are theoretically capable of learning complex temporal sequences, but training them through gradient-descent is too slow and unstable for practical use in reinforcement learning environments. Neuroevolution, the evolution of artificial neural networks using genetic algorithms, can potentially solve real-world reinforcement...
Existing Recurrent Neural Networks (RNNs) are limited in their ability to model dynamical systems with nonlinearities and hidden internal states. Here we use our general framework for sequence learning, EVOlution of recurrent systems with LINear Outputs (Evolino), to discover good RNN hidden node weights through evolution, while using linear regres...
Current Neural Network learning algorithms are limited in their ability to model non-linear dynamical systems. Most supervised gradient-based recurrent neural networks (RNNs) suffer from a vanishing error signal that prevents learning from inputs far in the past. Those that do not, still have problems when there are numerous local minima. W...
Current Neural Network learning algorithms are limited in their ability to model non-linear dynamical systems. Most supervised gradient-based recurrent neural networks (RNNs) suffer from a vanishing error signal that prevents learning from inputs far in the past. Those that do not, still have problems when there are numerous local minima. We introd...
In recent years, the evolution of artificial neural networks or neuroevolution has brought promising results in solving difficult reinforcement learning problems. But, like standard RL methods, it requires that solutions be discovered in simulation and then be transferred to the real world. To date, transfer has been studied primarily in mobile robo...
Finless rockets are more efficient than finned designs, but are too unstable to fly unassisted. These rockets require an active guidance system to control their orientation during flight and maintain stability. Because rocket dynamics are highly non-linear, developing such a guidance system can be prohibitively costly, especially for relatively sm...
Many complex control problems require sophisticated solutions that are not amenable to traditional controller design. Not only is it difficult to model real world systems, but often it is unclear what kind of behavior is required to solve the task. Reinforcement learning (RL) approaches have made progress by utilizing direct interaction with the ta...
Technology-driven limitations will soon force microprocessor chips to contain multiple processing cores, as the scalability of individual cores peaks but transistor counts continue to increase. To obtain best performance, flexible management of the on-chip resources, such as cache memory and off-chip bandwidth, is needed. However, control for the d...
Technology-driven limitations will soon force microprocessor chips
to contain multiple processing cores, as the scalability of individual
cores peaks while transistor counts continue to increase. To obtain the
best performance, flexible management of the on-chip resources, such as
cache memory and off-chip bandwidth, is needed. However, the control...
The success of evolutionary methods on standard control learning tasks has created a need for new benchmarks. The classic pole balancing problem is no longer difficult enough to serve as a viable yardstick for measuring the learning efficiency of these systems. The double pole case, where two poles connected to the cart must be balanced simultaneou...
The success of evolutionary methods on standard control learning tasks has created a need for new benchmarks. The classic pole balancing problem is no longer difficult enough to serve as a viable yardstick for measuring the learning efficiency of these systems. The double pole case, where two poles connected to the cart must be balanced simultane...
The success of evolutionary methods on standard control learning tasks has created a need for new benchmarks. The classic pole balancing problem is no longer difficult enough to serve as a viable yardstick for measuring the learning efficiency of these systems. In this paper we present a more difficult version of the classic problem where the cart...
Several researchers have demonstrated how complex behavior can be learned through neuro-evolution (i.e. evolving neural networks with genetic algorithms). However, complex general behavior such as evading predators or avoiding obstacles, which is not tied to specific environments, turns out to be very difficult to evolve. Often the system discovers...
In practice, almost all control systems in use today implement some form of linear control. However, there are many tasks
for which conventional control engineering methods are not directly applicable because there is not enough information about
how the system should be controlled (i.e. reinforcement learning problems). In this paper, we explore a...