ABSTRACT: This paper addresses the general problem of reinforcement learning (RL) in
partially observable environments. In 2013, our large RL recurrent neural
networks (RNNs) learned from scratch to drive simulated cars from
high-dimensional video input. However, real brains are more powerful in many
ways. In particular, they learn a predictive model of their initially unknown
environment, and somehow use it for abstract (e.g., hierarchical) planning and
reasoning. Guided by algorithmic information theory, we describe RNN-based AIs
(RNNAIs) designed to do the same. Such an RNNAI can be trained on never-ending
sequences of tasks, some of them provided by the user, others invented by the
RNNAI itself in a curious, playful fashion, to improve its RNN-based world
model. Unlike our previous model-building RNN-based RL machines dating back to
1990, the RNNAI learns to actively query its model for abstract reasoning and
planning and decision making, essentially "learning to think." The basic ideas
of this report can be applied to many other cases where one RNN-like system
exploits the algorithmic information content of another. They are taken from a
grant proposal submitted in Fall 2014, and also explain concepts such as
"mirror neurons." Experimental results will be described in separate papers.
ABSTRACT: Disentangled distributed representations of data are desirable for machine
learning, since they are more expressive and can generalize from fewer
examples. However, for complex data, the distributed representations of
multiple objects present in the same input can interfere and lead to
ambiguities, which is commonly referred to as the binding problem. We argue for
the importance of the binding problem to the field of representation learning,
and develop a probabilistic framework that explicitly models inputs as a
composition of multiple objects. We propose an unsupervised algorithm that uses
denoising autoencoders to dynamically bind features together in multi-object
inputs through an Expectation-Maximization-like clustering process. The
effectiveness of this method is demonstrated on artificially generated datasets
of binary images, showing that it can even generalize to bind together new
objects never seen by the autoencoder during training.
ABSTRACT: To stimulate progress in automating the reconstruction of neural circuits, we organized the first international challenge on 2D segmentation of electron microscopic (EM) images of the brain. Participants submitted boundary maps predicted for a test set of images, and were scored based on their agreement with a consensus of human expert annotations. The winning team had no prior experience with EM images, and employed a convolutional network. This “deep learning” approach has since become accepted as a standard for segmentation of EM images. The challenge has continued to accept submissions, and the best so far has resulted from cooperation between two teams. The challenge has probably saturated, as algorithms cannot progress beyond limits set by ambiguities inherent in 2D scoring and the size of the test dataset. Retrospective evaluation of the challenge scoring system reveals that it was not sufficiently robust to variations in the widths of neurite borders. We propose a solution to this problem, which should be useful for a future 3D segmentation challenge.
Full-text · Article · Nov 2015 · Frontiers in Neuroanatomy
ABSTRACT: Theoretical and empirical evidence indicates that the depth of neural
networks is crucial for their success. However, training becomes more difficult
as depth increases, and training of very deep networks remains an open problem.
Here we introduce a new architecture designed to overcome this. Our so-called
highway networks allow unimpeded information flow across many layers on
information highways. They are inspired by Long Short-Term Memory recurrent
networks and use adaptive gating units to regulate the information flow. Even
with hundreds of layers, highway networks can be trained directly through
simple gradient descent. This enables the study of extremely deep and efficient
ABSTRACT: Dependable cyber-physical systems strive to deliver anticipative, multi-objective performance anytime, facing deluges of inputs with varying and limited resources. This is even more challenging for life-long learning rational agents as they also have to contend with the varying and growing know-how accumulated from experience. These issues are of crucial practical value, yet have been only marginally and unsatisfactorily addressed in AGI research. We present a value-driven computational model of anytime bounded rationality robust to variations of both resources and knowledge. It leverages continually learned knowledge to anticipate, revise and maintain concurrent courses of action spanning over arbitrary time scales for execution anytime necessary.
ABSTRACT: Convolutional Neural Networks (CNNs) can be shifted across 2D images or 3D
videos to segment them. They have a fixed input size and typically perceive
only small local contexts of the pixels to be classified as foreground or
background. In contrast, Multi-Dimensional Recurrent NNs (MD-RNNs) can perceive
the entire spatio-temporal context of each pixel in a few sweeps through all
pixels, especially when the RNN is a Long Short-Term Memory (LSTM). Despite
these theoretical advantages, however, unlike CNNs, previous MD-LSTM variants
were hard to parallelize on GPUs. Here we re-arrange the traditional cuboid
order of computations in MD-LSTM in pyramidal fashion. The resulting
PyraMiD-LSTM is easy to parallelize, especially for 3D data such as stacks of
brain slice images. PyraMiD-LSTM achieved best known pixel-wise brain image
segmentation results on MRBrainS13 (and competitive results on EM-ISBI12).
ABSTRACT: There is plenty of theoretical and empirical evidence that depth of neural
networks is a crucial ingredient for their success. However, network training
becomes more difficult with increasing depth and training of very deep networks
remains an open problem. In this extended abstract, we introduce a new
architecture designed to ease gradient-based training of very deep networks. We
refer to networks with this architecture as highway networks, since they allow
unimpeded information flow across several layers on "information highways". The
architecture is characterized by the use of gating units which learn to
regulate the flow of information through a network. Highway networks with
hundreds of layers can be trained directly using stochastic gradient descent
and with a variety of activation functions, opening up the possibility of
studying extremely deep and efficient architectures.
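The gating idea in the two highway-network abstracts can be illustrated with a minimal sketch. The abstracts give no equations, so this assumes the commonly cited form y = T(x)·H(x) + (1 − T(x))·x, with a tanh transform H and a sigmoid transform gate T. All function names, the weight range, and the bias values below are illustrative assumptions, not the authors' code.

```python
import math
import random


def sigmoid(z):
    """Logistic function used for the gate."""
    return 1.0 / (1.0 + math.exp(-z))


def affine(weights, bias, x):
    """Compute W x + b for a square weight matrix given as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def highway_layer(x, w_h, b_h, w_t, b_t):
    """One highway layer (assumed form): y = T(x) * H(x) + (1 - T(x)) * x."""
    h = [math.tanh(v) for v in affine(w_h, b_h, x)]   # candidate transform H(x)
    t = [sigmoid(v) for v in affine(w_t, b_t, x)]     # transform gate T(x)
    return [ti * hi + (1.0 - ti) * xi for ti, hi, xi in zip(t, h, x)]


def make_params(n, gate_bias, rng):
    """Small random weights; gate_bias sets how open the gate starts (hypothetical helper)."""
    def matrix():
        return [[rng.uniform(-0.1, 0.1) for _ in range(n)] for _ in range(n)]
    return matrix(), [0.0] * n, matrix(), [gate_bias] * n


def run_stack(x, depth, gate_bias, seed=0):
    """Pass x through `depth` independently initialised highway layers."""
    rng = random.Random(seed)
    for _ in range(depth):
        x = highway_layer(x, *make_params(len(x), gate_bias, rng))
    return x
```

With a strongly negative gate bias each layer is close to the identity, so the input passes almost unchanged through a deep stack; this is one way to read the "information highways" of the abstract. With a strongly positive bias each layer behaves like an ordinary tanh layer.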
ABSTRACT: Several variants of the Long Short-Term Memory (LSTM) architecture for
recurrent neural networks have been proposed since its inception in 1995. In
recent years, these networks have become the state-of-the-art models for a
variety of machine learning problems. This has led to a renewed interest in
understanding the role and utility of various computational components of
typical LSTM variants. In this paper, we present the first large-scale analysis
of eight LSTM variants on three representative tasks: speech recognition,
handwriting recognition, and polyphonic music modeling. The hyperparameters of
all LSTM variants for each task were optimized separately using random search
and their importance was assessed using the powerful fANOVA framework. In
total, we summarize the results of 5400 experimental runs (about 15 years of
CPU time), which makes our study the largest of its kind on LSTM networks. Our
results show that none of the variants can improve upon the standard LSTM
architecture significantly, and demonstrate the forget gate and the output
activation function to be its most critical components. We further observe that
the studied hyperparameters are virtually independent and derive guidelines for
their efficient adjustment.
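To make the two components named in the abstract above concrete, here is a minimal single-unit LSTM step. It is a simplified sketch: the parameter layout and names are assumptions, and features that some LSTM variants include, such as peephole connections, are omitted.

```python
import math


def sigmoid(z):
    """Logistic function used for the three gates."""
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, p):
    """One time step of a one-unit LSTM.

    p maps each of "input", "forget", "output", "candidate" to a tuple
    (input weight, recurrent weight, bias). This layout is hypothetical.
    """
    def pre(name):
        w, u, b = p[name]
        return w * x + u * h_prev + b

    i = sigmoid(pre("input"))        # input gate
    f = sigmoid(pre("forget"))       # forget gate: how much old cell state survives
    o = sigmoid(pre("output"))       # output gate
    g = math.tanh(pre("candidate"))  # candidate cell update

    c = f * c_prev + i * g           # new cell state
    h = o * math.tanh(c)             # tanh here is the output activation function
    return h, c
```

In this sketch, a forget gate held near 1 with an input gate near 0 preserves the cell state across steps, while a forget gate near 0 discards it.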
ABSTRACT: In the absence of external guidance, how can a robot learn to map the many raw pixels of high-dimensional visual inputs to useful action sequences? We propose here Continual Curiosity driven Skill Acquisition (CCSA). CCSA makes robots intrinsically motivated to acquire, store and reuse skills. Previous curiosity-based agents acquired skills by associating intrinsic rewards with world model improvements, and used reinforcement learning to learn how to get these intrinsic rewards. CCSA also does this, but unlike previous implementations, the world model is a set of compact low-dimensional representations of the streams of high-dimensional visual information, which are learned through incremental slow feature analysis. These representations augment the robot's state space with new information about the environment. We show how this information can have a higher-level (compared to pixels) and useful interpretation, for example, if the robot has grasped a cup in its field of view or not. After learning a representation, large intrinsic rewards are given to the robot for performing actions that greatly change the feature output, which has the tendency otherwise to change slowly in time. We show empirically what these actions are (e.g., grasping the cup) and how they can be useful as skills. An acquired skill includes both the learned actions and the learned slow feature representation. Skills are stored and reused to generate new observations, enabling continual acquisition of complex skills. We present results of experiments with an iCub humanoid robot that uses CCSA to incrementally acquire skills to topple, grasp and pick-place a cup, driven by its intrinsic motivation from raw pixel vision.
No preview · Article · Feb 2015 · Artificial Intelligence
ABSTRACT: The proliferative activity of breast tumors, which is routinely estimated by
counting of mitotic figures in hematoxylin and eosin stained histology
sections, is considered to be one of the most important prognostic markers.
However, mitosis counting is laborious, subjective and may suffer from low
inter-observer agreement. With the wider acceptance of whole slide images in
pathology labs, automatic image analysis has been proposed as a potential
solution for these issues. In this paper, the results from the Assessment of
Mitosis Detection Algorithms 2013 (AMIDA13) challenge are described. The
challenge was based on a data set consisting of 12 training and 11 testing
subjects, with more than one thousand annotated mitotic figures by multiple
observers. Short descriptions and results from the evaluation of eleven methods
are presented. The top performing method has an error rate that is comparable
to the inter-observer agreement among pathologists.
Full-text · Article · Nov 2014 · Medical Image Analysis
ABSTRACT: Recently proposed neural network activation functions such as rectified
linear, maxout, and local winner-take-all have allowed for faster and more
effective training of deep neural architectures on large and complex datasets.
The common trait among these functions is that they implement local competition
between small groups of units within a layer, so that only part of the network
is activated for any given input pattern. In this paper, we attempt to
visualize and understand this self-modularization, and suggest a unified
explanation for the beneficial properties of such networks. We also show how
our insights can be directly useful for efficiently performing retrieval over
large datasets using neural networks.
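The three activation functions named in the abstract above can be sketched side by side to show the shared pattern of local competition. This is a minimal illustration on plain lists; the function names, the block size k, and the tie-breaking rule (first unit wins) are assumptions rather than details from the paper.

```python
def _groups(xs, k):
    """Split activations into consecutive blocks of k units (hypothetical helper)."""
    if k <= 0 or len(xs) % k != 0:
        raise ValueError("length of xs must be a positive multiple of k")
    return [xs[i:i + k] for i in range(0, len(xs), k)]


def relu(xs):
    """Rectified linear: each unit competes against a fixed threshold of zero."""
    return [max(0.0, x) for x in xs]


def maxout(xs, k):
    """Maxout: each block of k units emits only its largest activation."""
    return [max(g) for g in _groups(xs, k)]


def lwta(xs, k):
    """Local winner-take-all: the block winner keeps its value, the rest are zeroed."""
    out = []
    for g in _groups(xs, k):
        winner = max(range(len(g)), key=g.__getitem__)  # first index wins ties
        out.extend(v if j == winner else 0.0 for j, v in enumerate(g))
    return out
```

For example, with activations `[1.0, 3.0, -2.0, -1.0]` and `k = 2`, `relu` returns `[1.0, 3.0, 0.0, 0.0]`, `maxout` returns `[3.0, -1.0]`, and `lwta` returns `[0.0, 3.0, 0.0, -1.0]`. In each case only a subset of units carries signal for a given input, which is the partial activation the abstract describes.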
ABSTRACT: The automatic reconstruction of neurons from stacks of electron microscopy sections is an important computer vision problem in neuroscience. Recent advances are based on a two-step approach: First, a set of possible 2D neuron candidates is generated for each section independently based on membrane predictions of a local classifier. Second, the candidates of all sections of the stack are fed to a neuron tracker that selects and connects them in 3D to yield a reconstruction. The accuracy of the result is currently limited by the quality of the generated candidates. In this paper, we propose to replace the heuristic set of candidates used in previous methods with samples drawn from a conditional random field (CRF) that is trained to label sections of neural tissue. We show on a stack of Drosophila melanogaster neural tissue that neuron candidates generated with our method produce 30% fewer reconstruction errors than current candidate generation methods. Two properties of our CRF are crucial for the accuracy and applicability of our method: (1) The CRF models the orientation of membranes to produce more plausible neuron candidates. (2) The interactions in the CRF are restricted to form a bipartite graph, which allows a great sampling speed-up without loss of accuracy.
ABSTRACT: We propose a system incorporating a tight integration between computer vision and robot control modules on a complex, high-DOF humanoid robot. Its functionality is showcased by having our iCub humanoid robot pick up objects from a table in front of it. An important feature is that the system can avoid obstacles (other objects detected in the visual stream) while reaching for the intended target object. Our integration also allows for non-static environments, i.e., the reaching is adapted on the fly from the visual feedback received, e.g., when an obstacle is moved into the trajectory. Furthermore, we show that this system can be used both in autonomous and tele-operation scenarios.
ABSTRACT: Four principal features of autonomous control systems are left both unaddressed and unaddressable by present-day engineering methodologies: (1) The ability to operate effectively in environments that are only partially known at design time; (2) A level of generality that allows a system to reassess and redefine the fulfillment of its mission in light of unexpected constraints or other unforeseen changes in the environment; (3) The ability to operate effectively in environments of significant complexity; and (4) The ability to degrade gracefully, that is, to continue striving to achieve its main goals when resources become scarce, or in light of other expected or unexpected constraining factors that impede its progress. We describe new methodological and engineering principles for addressing these shortcomings, which we have used to design a machine that becomes increasingly better at behaving in underspecified circumstances, in a goal-directed way, on the job, by modeling itself and its environment as experience accumulates. The work provides an architectural blueprint for constructing systems with high levels of operational autonomy in underspecified circumstances, starting from only a small amount of designer-specified code (a seed). Using value-driven dynamic priority scheduling to control the parallel execution of a vast number of lines of reasoning, the system accumulates increasingly useful models of its experience, resulting in recursive self-improvement that can be autonomously sustained after the machine leaves the lab, within the boundaries imposed by its designers. A prototype system named AERA has been implemented and demonstrated to learn a complex real-world task, real-time multimodal dialogue with humans, by on-line observation. Our work presents solutions to several challenges that must be solved for achieving artificial general intelligence.
ABSTRACT: Dealing with high-dimensional input spaces, like visual input, is a challenging task for reinforcement learning (RL). Neuroevolution (NE), used for continuous RL problems, has to either reduce the problem dimensionality by (1) compressing the representation of the neural network controllers or (2) employing a pre-processor (compressor) that transforms the high-dimensional raw inputs into low-dimensional features. In this paper we extend the approach in . The Max-Pooling Convolutional Neural Network (MPCNN) compressor is evolved online, maximizing the distances between normalized feature vectors computed from the images collected by the recurrent neural network (RNN) controllers during their evaluation in the environment. These two interleaved evolutionary searches are used to find MPCNN compressors and RNN controllers that drive a race car in the TORCS racing simulator using only visual input.
ABSTRACT: We present here a simulated model of a mobile Kuka Youbot which makes use of Dynamic Field Theory for its underlying perceptual and motor control systems, while learning behavioral sequences through Reinforcement Learning. Although dynamic neural fields have previously been used for robust control in robotics, high-level behavior has generally been pre-programmed by hand. In the present work we extend a recent framework for integrating reinforcement learning and dynamic neural fields, by using the principle of shaping, in order to reduce the search space of the learning agent.
ABSTRACT: An important part of human intelligence is the ability to use language. Humans learn how to use language in a society of language users, which is probably the most effective way to learn a language from the ground up. Principles that might allow artificial agents to learn language this way are not known at present. Here we present a framework which begins to address this challenge. Our auto-catalytic, endogenous, reflective architecture (AERA) supports the creation of agents that can learn natural language by observation. We present results from two experiments where our S1 agent learns human communication by observing two humans interacting in a real-time mock television interview, using gesture and situated language. Results show that S1 can learn multimodal complex language and multimodal communicative acts, using a vocabulary of 100 words with numerous sentence formats, by observing unscripted interaction between the humans, with no grammar being provided to it a priori, and only high-level information about the format of the human interaction in the form of high-level goals of the interviewer and interviewee and a small ontology. The agent learns the pragmatics, semantics, and syntax of complex sentences spoken by the human subjects on the topic of recycling of objects such as aluminum cans, glass bottles, plastic, and wood, as well as use of manual deictic reference and anaphora.
ABSTRACT: Dealing with high-dimensional input spaces, like visual input, is a challenging task for reinforcement learning (RL). Neuroevolution (NE), used for continuous RL problems, has to either reduce the problem dimensionality by (1) compressing the representation of the neural network controllers or (2) employing a pre-processor (compressor) that transforms the high-dimensional raw inputs into low-dimensional features. In this paper, we are able to evolve extremely small recurrent neural network (RNN) controllers for a task that previously required networks with over a million weights. The high-dimensional visual input, which the controller would normally receive, is first transformed into a compact feature vector through a deep, max-pooling convolutional neural network (MPCNN). Both the MPCNN preprocessor and the RNN controller are evolved successfully to control a car in the TORCS racing simulator using only visual input. This is the first use of deep learning in the context of evolutionary RL.
ABSTRACT: Traditional convolutional neural networks (CNN) are stationary and
feedforward. They neither change their parameters during evaluation nor use
feedback from higher to lower layers. Real brains, however, do. So does our
Deep Attention Selective Network (dasNet) architecture. DasNet's feedback
structure can dynamically alter its convolutional filter sensitivities during
classification. It harnesses the power of sequential processing to improve
classification performance, by allowing the network to iteratively focus its
internal attention on some of its convolutional filters. Feedback is trained
through direct policy search in a huge million-dimensional parameter space,
through scalable natural evolution strategies (SNES). On the CIFAR-10 and
CIFAR-100 datasets, dasNet outperforms the previous state-of-the-art model.
Full-text · Article · Jul 2014 · Advances in Neural Information Processing Systems