A Geometrical Analysis of Global Stability in Trained Feedback Networks

Article in Neural Computation 31(6):1-43 · April 2019
Abstract
Recurrent neural networks have been extensively studied in the context of neuroscience and machine learning due to their ability to implement complex computations. While substantial progress in designing effective learning algorithms has been achieved, a full understanding of trained recurrent networks is still lacking. Specifically, the mechanisms that allow computations to emerge from the underlying recurrent dynamics are largely unknown. Here we focus on a simple yet underexplored computational setup: a feedback architecture trained to associate a stationary output to a stationary input. As a starting point, we derive an approximate analytical description of global dynamics in trained networks, which assumes uncorrelated connectivity weights in the feedback and in the random bulk. The resulting mean-field theory suggests that the task admits several classes of solutions, which imply different stability properties. Different classes are characterized in terms of the geometrical arrangement of the readout with respect to the input vectors, defined in the high-dimensional space spanned by the network population. We find that such an approximate theoretical approach can be used to understand how standard training techniques implement the input-output task in finite-size feedback networks. In particular, our simplified description captures the local and the global stability properties of the target solution, and thus predicts training performance.
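The setup described in the abstract can be illustrated with a small numerical sketch. This is not the authors' code: it assumes a standard rate model with a tanh nonlinearity, a Gaussian random bulk `chi` of strength `g`, a feedback vector `m`, an input vector `u`, and a minimum-norm readout `n`. All names, sizes, and normalizations are illustrative assumptions. The sketch finds the target fixed point with the feedback clamped to the target output `A`, builds a readout that reproduces `A` at that point, and then checks local stability of the closed loop through the Jacobian.

```python
import numpy as np

rng = np.random.default_rng(0)
N, g, A = 200, 0.4, 1.0                         # size, bulk strength, target output (illustrative)
chi = rng.standard_normal((N, N)) / np.sqrt(N)  # random bulk with variance 1/N (assumed scaling)
m = rng.standard_normal(N)                      # feedback vector (assumed Gaussian)
u = rng.standard_normal(N)                      # stationary input vector (assumed Gaussian)


def target_fixed_point(steps=2000, dt=0.1):
    """Relax the network with the feedback signal clamped to the target A."""
    x = np.zeros(N)
    for _ in range(steps):
        x += dt * (-x + g * chi @ np.tanh(x) + m * A + u)
    return x


x_star = target_fixed_point()
phi = np.tanh(x_star)

# Minimum-norm readout satisfying n . phi(x*) = A (one of many possible choices).
n = A * phi / (phi @ phi)
z = n @ phi

# Residual of the closed-loop dynamics at x*, with feedback z in place of A.
residual = -x_star + g * chi @ phi + m * z + u

# Jacobian of the closed-loop dynamics at x*: column j is scaled by phi'(x*_j).
gain = 1.0 - phi**2
jacobian = -np.eye(N) + (g * chi + np.outer(m, n)) * gain
max_real = np.linalg.eigvals(jacobian).real.max()

print(f"readout z = {z:.6f}, residual norm = {np.linalg.norm(residual):.2e}")
print("locally stable" if max_real < 0 else "locally unstable", f"(max Re = {max_real:.3f})")
```

A negative largest real part only indicates local stability of the target solution. Probing global stability would additionally require, for example, simulating the closed loop from many initial conditions to see whether other attractors coexist with the target.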

  • ... Our model in contrast exhibits an interplay between low rank structured connectivity implementing balance, and high rank disordered connectivity inducing chaos, each with independently adjustable strengths. In general, how computation emerges from an interplay between structured and random connectivity has been a subject of recent interest in theoretical neuroscience [18,23,24,25]. Here we show how structure and randomness interact by obtaining analytic insights into the efficacy of predictive coding, dissecting the individual contributions of balance, noise, weight disorder, chaos, delays and nonlinearity, in a model where all ingredients can coexist and be independently adjusted. ...
    ... We find: (1) strong balance is a key requirement for superclassical error scaling with network size; (2) without delays, increasing balance always suppresses errors via power laws with different exponents (-1 for noise, -2 for chaos); (3) delays yield an oscillatory instability and a tradeoff between noise suppression and resonant amplification; (4) this tradeoff sets a maximal critical balance level which decreases with delay; (5) noise or chaos can increase this maximal level by promoting desynchronization; (6) the competition between noise suppression and resonant amplification sets an optimal balance level that is half the maximal level in the case of noise; (7) but is close to the maximal level in the case of chaos for small delays, because the slow chaos has small power at the high resonant frequency; (8) the optimal decoder error rises as a power law with delay (with exponent 1/2 for noise and 1 for chaos). Also, our model unifies a variety of perspectives in theoretical neuroscience, spanning classical synaptic balance [17,29,42,43,44,45], efficient coding in tight balance [7,46], the interplay of structured and random connectivity in computation [18,23,24,47,48], the relation between oscillations and delays in neural networks [49,50,51] and predictive coding [8,10]. Moreover, the mean-field theory developed here can be extended to spiking neurons with strong recurrent balance and delays [52], analytically explaining relations between delays, coding and oscillations observed in simulations but previously not understood [21,22]. Acknowledgments: JK thanks the Swartz Foundation for Theoretical Neuroscience for funding; JT thanks the National Science Foundation for funding. ...
    ... We now turn to compute the statistics of the fluctuations of a random network in its chaotic phase, when the variance of the weight distribution is above the critical transition point $g > g_c$. The dynamic mean field theory for a chaotic neural network was first introduced by [15] and re-derived later by [35,38,36,53,24,54]. The connectivity in the subspace orthogonal to the readout direction is randomly distributed, thus the properties of the fluctuations in this subspace, $\delta h_\perp(t)$, are equivalent to previous studies of random neural networks. ...
    Preprint
    Biological neural networks face a formidable task: performing reliable computations in the face of intrinsic stochasticity in individual neurons, imprecisely specified synaptic connectivity, and nonnegligible delays in synaptic transmission. A common approach to combatting such biological heterogeneity involves averaging over large redundant networks of $N$ neurons resulting in coding errors that decrease classically as $1/\sqrt{N}$. Recent work demonstrated a novel mechanism whereby recurrent spiking networks could efficiently encode dynamic stimuli, achieving a superclassical scaling in which coding errors decrease as $1/N$. This specific mechanism involved two key ideas: predictive coding, and a tight balance, or cancellation between strong feedforward inputs and strong recurrent feedback. However, the theoretical principles governing the efficacy of balanced predictive coding and its robustness to noise, synaptic weight heterogeneity and communication delays remain poorly understood. To discover such principles, we introduce an analytically tractable model of balanced predictive coding, in which the degree of balance and the degree of weight disorder can be dissociated unlike in previous balanced network models, and we develop a mean field theory of coding accuracy. Overall, our work provides and solves a general theoretical framework for dissecting the differential contributions of neural noise, synaptic disorder, chaos, synaptic delays, and balance to the fidelity of predictive neural codes, reveals the fundamental role that balance plays in achieving superclassical scaling, and unifies previously disparate models in theoretical neuroscience.
  • ... From a system perspective, the feedforward neural network model has limited computing power, whereas its feedback dynamics give a feedback neural network stronger computing power than a feedforward neural network, relying on feedback to enhance global stability [88]. In feedback neural networks, all neurons have the same status and there is no hierarchical difference. ...
    Article
    Compared with von Neumann’s computer architecture, neuromorphic systems offer more unique and novel solutions to the artificial intelligence discipline. Inspired by biology, this novel system has implemented the theory of human brain modeling by connecting feigned neurons and synapses to reveal the new neuroscience concepts. Many researchers have vastly invested in neuro-inspired models, algorithms, learning approaches, operation systems for the exploration of the neuromorphic system and have implemented many corresponding applications. Recently, some researchers have demonstrated the capabilities of Hopfield algorithms in some large-scale notable hardware projects and seen significant progression. This paper presents a comprehensive review and focuses extensively on the Hopfield algorithm’s model and its potential advancement in new research applications. Towards the end, we conclude with a broad discussion and a viable plan for the latest application prospects, to give developers a better understanding of the aforementioned model so that they can build their own artificial intelligence projects.
  • ... Maximizing memory does not necessarily lead to performance (e.g., prediction) maximization [11]. In recent years, a large effort has been devoted to tackle these problems, by studying the dynamical systems underlying RNNs [2,12,13,3,14,15]. ...
    Preprint
    Reservoir computing is a popular approach to design recurrent neural networks, due to its training simplicity and its approximation performance. The recurrent part of these networks is not trained (e.g. via gradient descent), making them appealing for analytical studies, raising the interest of a vast community of researchers spanning from dynamical systems to neuroscience. It emerges that, even in the simple linear case, the working principle of these networks is not fully understood and the applied research is usually driven by heuristics. A novel analysis of the dynamics of such networks is proposed, which allows one to express the state evolution using the controllability matrix. Such a matrix encodes salient characteristics of the network dynamics: in particular, its rank can be used as an input-independent measure of the memory of the network. Using the proposed approach, it is possible to compare different architectures and explain why a cyclic topology achieves favourable results.
  • ... Our approach allows us to gain mechanistic insight into the computations underlying echo state and FORCE learning models which have the same connectivity structure as our model [20,21]. Here, the readout vector n is trained, which leads to correlations to the random part J [3,34]. Our results on multiple fixed points and oscillations show that these correlations are crucial for the rich functional repertoire. ...
    Article
    A given neural network in the brain is involved in many different tasks. This implies that, when considering a specific task, the network's connectivity contains a component which is related to the task and another component which can be considered random. Understanding the interplay between the structured and random components and their effect on network dynamics and functionality is an important open question. Recent studies addressed the coexistence of random and structured connectivity but considered the two parts to be uncorrelated. This constraint limits the dynamics and leaves the random connectivity nonfunctional. Algorithms that train networks to perform specific tasks typically generate correlations between structure and random connectivity. Here we study nonlinear networks with correlated structured and random components, assuming the structure to have a low rank. We develop an analytic framework to establish the precise effect of the correlations on the eigenvalue spectrum of the joint connectivity. We find that the spectrum consists of a bulk and multiple outliers, whose location is predicted by our theory. Using mean-field theory, we show that these outliers directly determine both the fixed points of the system and their stability. Taken together, our analysis elucidates how correlations allow structured and random connectivity to synergistically extend the range of computations available to networks.
  • ... This simple training protocol is not sufficient in many applications, e.g. when it is required to learn memory states. To this end, training mechanisms based on output feedback [24,25] and online training [10,26] have been proposed, with successful applications in physics [27,28], complex systems modeling [29,30], and neuroscience [31], just to name a few. ...
    Preprint
    A recurrent neural network (RNN) possesses the echo state property (ESP) if, for a given input sequence, it "forgets" any internal states of the driven (nonautonomous) system and asymptotically follows a unique, possibly complex trajectory. The lack of ESP is conventionally understood as a lack of reliable behaviour in RNNs. Here, we show that RNNs can reliably perform computations under a more general principle that accounts only for their local behaviour in phase space. To this end, we formulate a generalisation of the ESP and introduce an echo index to characterise the number of simultaneously stable responses of a driven RNN. We show that it is possible for the echo index to change with inputs, highlighting a potential source of computational errors in RNNs due to characteristics of the inputs driving the dynamics.
  • ... Our approach allows us to gain mechanistic insight into the computations underlying echo state and FORCE learning models which have the same connectivity structure as our model [20,21]. Here, the readout vector n is trained, which leads to correlations to the random part J [3,31]. Our results on multiple fixed points and oscillations show that these correlations are crucial for the rich functional repertoire. ...
    Preprint
    A given neural network in the brain is involved in many different tasks. This implies that, when considering a specific task, the network's connectivity contains a component which is related to the task and another component which can be considered random. Understanding the interplay between the structured and random components, and their effect on network dynamics and functionality is an important open question. Recent studies addressed the co-existence of random and structured connectivity, but considered the two parts to be uncorrelated. This constraint limits the dynamics and leaves the random connectivity non-functional. Algorithms that train networks to perform specific tasks typically generate correlations between structure and random connectivity. Here we study nonlinear networks with correlated structured and random components, assuming the structure to have a low rank. We develop an analytic framework to establish the precise effect of the correlations on the eigenvalue spectrum of the joint connectivity. We find that the spectrum consists of a bulk and multiple outliers, whose location is predicted by our theory. Using mean-field theory, we show that these outliers directly determine both the fixed points of the system and their stability. Taken together, our analysis elucidates how correlations allow structured and random connectivity to synergistically extend the range of computations available to networks.
  • Preprint
    An emerging paradigm proposes that neural computations can be understood at the level of dynamical systems that govern low-dimensional trajectories of collective neural activity. How the connectivity structure of a network determines the emergent dynamical system however remains to be clarified. Here we consider a novel class of models, Gaussian-mixture low-rank recurrent networks, in which the rank of the connectivity matrix and the number of statistically-defined populations are independent hyper-parameters. We show that the resulting collective dynamics form a dynamical system, where the rank sets the dimensionality and the population structure shapes the dynamics. In particular, the collective dynamics can be described in terms of a simplified effective circuit of interacting latent variables. While having a single, global population strongly restricts the possible dynamics, we demonstrate that if the number of populations is large enough, a rank $R$ network can approximate any $R$-dimensional dynamical system.
  • Kubin, G. (1988). Stabilization of the RLS algorithm in the absence of persistent excitation. ICASSP, 3:1369-1372.
  • Lukosevicius, M. (2012). A Practical Guide to Applying Echo State Networks. Neural Networks: Tricks of the Trade. Lecture Notes in Computer Science. Springer.
  • Martens, J. and Sutskever, I. (2011). Learning recurrent neural networks with Hessian-free optimization. ICML, pages 1033-1040.
  • Article
    Autonomous, randomly coupled, neural networks display a transition to chaos at a critical coupling strength. Here, we investigate the effect of a time-varying input on the onset of chaos and the resulting consequences for information processing. Dynamic mean-field theory yields the statistics of the activity, the maximum Lyapunov exponent, and the memory capacity of the network. We find an exact condition that determines the transition from stable to chaotic dynamics and the sequential memory capacity in closed form. The input suppresses chaos by a dynamic mechanism, shifting the transition to significantly larger coupling strengths than predicted by local stability analysis. Beyond linear stability, a regime of coexistent locally expansive but nonchaotic dynamics emerges that optimizes the capacity of the network to store sequential input.
  • Preprint
    We present a simple model for coherent, spatially correlated chaos in a recurrent neural network. Networks of randomly connected neurons are known to exhibit chaotic fluctuations and have been studied as a model for the temporal variability of neocortical activity. The dynamics generated by such networks, however, are spatially uncorrelated and do not generate coherent fluctuations, which are commonly observed across spatial scales of the neocortex. In our model we introduce a structured component of connectivity, in addition to random connections, which effectively embeds a feedforward structure via unidirectional coupling between a pair of orthogonal modes. Local fluctuations driven by the random connectivity are summed by an output mode and drive coherent activity along an input mode. The orthogonality between input and output mode preserves chaotic fluctuations even as coherence develops. In the regime of weak structured connectivity we apply a perturbative approach to solve the dynamic mean-field equations, showing that in this regime coherent fluctuations are driven passively by the chaos of local residual fluctuations. Strikingly, the chaotic dynamics are not subdued by even very strong structured connectivity as long as we add a detailed balance constraint on the random connectivity. In this regime the system displays longer time-scales and switching-like activity reminiscent of "Up-Down" states observed in cortical circuits. The level of coherence grows with increasing strength of structured connectivity until the dynamics are almost entirely constrained to a single spatial mode. We describe how in this regime the network achieves intermittent self-organized criticality in which the coherent component of the dynamics self-adjusts to yield periods of slow chaos. 
Furthermore, we show how the dynamics depend qualitatively on the particular realization of the connectivity matrix: a complex leading eigenvector can yield coherent oscillatory chaotic fluctuations while a real leading eigenvector can yield chaos with broken symmetry. We examine the effects of network-size scaling and show that these results are not finite-size effects. Finally, we show that in the regime of weak structured connectivity, coherent chaos emerges also for a generalized structured connectivity with multiple input-output modes.
  • Article
    Spiking activity of neurons engaged in learning and performing a task shows complex spatiotemporal dynamics. While the output of recurrent network models can learn to perform various tasks, the possible range of recurrent dynamics that emerge after learning remains unknown. Here we show that modifying the recurrent connectivity with a recursive least squares algorithm provides sufficient flexibility for synaptic and spiking rate dynamics of spiking networks to produce a wide range of spatiotemporal activity. We apply the training method to learn arbitrary firing patterns, stabilize irregular spiking activity of a balanced network, and reproduce the heterogeneous spiking rate patterns of cortical neurons engaged in motor planning and movement. We identify sufficient conditions for successful learning, characterize two types of learning errors, and assess the network capacity. Our findings show that synaptically-coupled recurrent spiking networks possess a vast computational capability that can support the diverse activity patterns in the brain.
  • Article
    Musicians can perform at different tempos, speakers can control the cadence of their speech, and children can flexibly vary their temporal expectations of events. To understand the neural basis of such flexibility, we recorded from the medial frontal cortex of nonhuman primates trained to produce different time intervals with different effectors. Neural responses were heterogeneous, nonlinear, and complex, and they exhibited a remarkable form of temporal invariance: firing rate profiles were temporally scaled to match the produced intervals. Recording from downstream neurons in the caudate and from thalamic neurons projecting to the medial frontal cortex indicated that this phenomenon originates within cortical networks. Recurrent neural network models trained to perform the task revealed that temporal scaling emerges from nonlinearities in the network and that the degree of scaling is controlled by the strength of external input. These findings demonstrate a simple and general mechanism for conferring temporal flexibility upon sensorimotor and cognitive functions.
  • Article
    Large scale recordings of neural activity in behaving animals have established that the transformation of sensory stimuli into motor outputs relies on low-dimensional dynamics at the population level, while individual neurons generally exhibit complex, mixed selectivity. Understanding how low-dimensional computations on mixed, distributed representations emerge from the structure of the recurrent connectivity and inputs to cortical networks is a major challenge. Classical models of recurrent networks fall in two extremes: on one hand balanced networks are based on fully random connectivity and generate high-dimensional spontaneous activity, while on the other hand strongly structured, clustered networks lead to low-dimensional dynamics and ad-hoc computations but rely on pure selectivity. A number of functional approaches for training recurrent networks however suggest that a specific type of minimal connectivity structure is sufficient to implement a large range of computations. Starting from this observation, here we study a new class of recurrent network models in which the connectivity consists of a combination of a random part and a minimal, low dimensional structure. We show that in such low-rank recurrent networks, the dynamics are low-dimensional and can be directly inferred from connectivity using a geometrical approach. We exploit this understanding to determine minimal connectivity structures required to implement specific computations. We find that the dynamical range and computational capacity of a network quickly increase with the dimensionality of the structure in the connectivity, so that a rank-two structure is already sufficient to implement a complex behavioral task such as context-dependent decision-making.
  • Article
    It had previously been shown that generic cortical microcircuit models can perform complex real-time computations on continuous input streams, provided that these computations can be carried out with a rapidly fading memory. We investigate in this article the computational capability of such circuits in the more realistic case where not only readout neurons, but in addition a few neurons within the circuit have been trained for specific tasks. This is essentially equivalent to the case where the output of trained readout neurons is fed back into the circuit. We show that this new model overcomes the limitation of a rapidly fading memory. In fact, we prove that in the idealized case without noise it can carry out any conceivable digital or analog computation on time-varying inputs. But even with noise the resulting computational model can perform a large class of biologically relevant real-time computations that require a non-fading memory. We demonstrate these computational implications of feedback both theoretically and through computer simulations of detailed cortical microcircuit models. We show that the application of simple learning procedures (such as linear regression or perceptron learning) enables such circuits, in spite of their complex inherent dynamics, to represent time over behaviorally relevant long time spans, to integrate evidence from incoming spike trains over longer periods of time, and to process new information contained in such spike trains in diverse ways according to the current internal state of the circuit. In particular we show that such generic cortical microcircuits with feedback provide a new model for working memory that is consistent with a large set of biological constraints. We have shown that feedback increases significantly the computational power of neural circuits. 
Although this article examines primarily the computational role of feedback in circuits of neurons, the mathematical principles on which its analysis is based apply to a large variety of dynamical systems. Hence they may also throw new light on the computational role of feedback in other complex biological dynamical systems, such as for example genetic regulatory networks.
  • Article
    Networks of randomly connected neurons are among the most popular models in theoretical neuroscience. The connectivity between neurons in the cortex is however not fully random, the simplest and most prominent deviation from randomness found in experimental data being the overrepresentation of bidirectional connections among pyramidal cells. Using numerical and analytical methods, we investigated the effects of partially symmetric connectivity on dynamics in networks of rate units. We considered the two dynamical regimes exhibited by random neural networks: the weak-coupling regime, where the firing activity decays to a single fixed point unless the network is stimulated, and the strong-coupling or chaotic regime, characterized by internally generated fluctuating firing rates. In the weak-coupling regime, we computed analytically for an arbitrary degree of symmetry the auto-correlation of network activity in presence of external noise. In the chaotic regime, we performed simulations to determine the timescale of the intrinsic fluctuations. In both cases, symmetry increases the characteristic asymptotic decay time of the autocorrelation function and therefore slows down the dynamics in the network.
  • Article
    Recurrent neural networks (RNNs) are a class of computational models that are often used as a tool to explain neurobiological phenomena, considering anatomical, electrophysiological and computational constraints. RNNs can either be designed to implement a certain dynamical principle, or they can be trained by input–output examples. Recently, there has been large progress in utilizing trained RNNs both for computational tasks, and as explanations of neural phenomena. I will review how combining trained RNNs with reverse engineering can provide an alternative framework for modeling in neuroscience, potentially serving as a powerful hypothesis generation tool. Despite the recent progress and potential benefits, there are many fundamental gaps towards a theory of these networks. I will discuss these challenges and possible methods to attack them.
  • Article
    Recurrent networks of non-linear units display a variety of dynamical regimes depending on the structure of their synaptic connectivity. A particularly remarkable phenomenon is the appearance of strongly fluctuating, chaotic activity in networks of deterministic, but randomly connected rate units. How this type of intrinsically generated fluctuations appears in more realistic networks of spiking neurons has been a long standing question. The comparison between rate and spiking networks has in particular been hampered by the fact that most previous studies on randomly connected rate networks focused on highly simplified models, in which excitation and inhibition were not segregated and firing rates fluctuated symmetrically around zero because of built-in symmetries. To ease the comparison between rate and spiking networks, we investigate the dynamical regimes of sparse, randomly-connected rate networks with segregated excitatory and inhibitory populations, and firing rates constrained to be positive. Extending the dynamical mean field theory, we show that network dynamics can be effectively described through two coupled equations for the mean activity and the auto-correlation function. As a consequence, we identify a new signature of intrinsically generated fluctuations on the level of mean firing rates. We moreover found that excitatory-inhibitory networks develop two different fluctuating regimes: for moderate synaptic coupling, recurrent inhibition is sufficient to stabilize fluctuations; for strong coupling, firing rates are stabilized solely by the upper bound imposed on activity. These results extend to more general network architectures, and to rate networks receiving noisy inputs mimicking spiking activity. Finally, we show that signatures of those dynamical regimes appear in networks of integrate-and-fire neurons.
  • Article
    Sequential activation of neurons is a common feature of network activity during a variety of behaviors, including working memory and decision making. Previous network models for sequences and memory emphasized specialized architectures in which a principled mechanism is pre-wired into their connectivity. Here we demonstrate that, starting from random connectivity and modifying a small fraction of connections, a largely disordered recurrent network can produce sequences and implement working memory efficiently. We use this process, called Partial In-Network Training (PINning), to model and match cellular resolution imaging data from the posterior parietal cortex during a virtual memory-guided two-alternative forced-choice task. Analysis of the connectivity reveals that sequences propagate by the cooperation between recurrent synaptic interactions and external inputs, rather than through feedforward or asymmetric connections. Together our results suggest that neural sequences may emerge through learning from largely unstructured network architectures.
  • Article
    Computational properties of use to biological organisms or to the construction of computers can emerge as collective properties of systems having a large number of simple equivalent components (or neurons). The physical meaning of content-addressable memory is described by an appropriate phase space flow of the state of a system. A model of such a system is given, based on aspects of neurobiology but readily adapted to integrated circuits. The collective properties of this model produce a content-addressable memory which correctly yields an entire memory from any subpart of sufficient size. The algorithm for the time evolution of the state of the system is based on asynchronous parallel processing. Additional emergent collective properties include some capacity for generalization, familiarity recognition, categorization, error correction, and time sequence retention. The collective properties are only weakly sensitive to details of the modeling or the failure of individual devices.
  • Article
    Learning a task induces connectivity changes in neural circuits, thereby changing their dynamics. To elucidate task related neural dynamics we study trained Recurrent Neural Networks. We develop a Mean Field Theory for Reservoir Computing networks trained to have multiple fixed point attractors. Our main result is that the dynamics of the network's output in the vicinity of attractors is governed by a low order linear Ordinary Differential Equation. Stability of the resulting ODE can be assessed, predicting training success or failure. Furthermore, a characteristic time constant, which remains finite at the edge of chaos, offers an explanation of the network's output robustness in the presence of variability of the internal neural dynamics. Finally, the proposed theory predicts state dependent frequency selectivity in network response.
  • Article
    Firing patterns in the central nervous system often exhibit strong temporal irregularity and heterogeneity in their time-averaged response properties. Previous studies suggested that these properties are the outcome of intrinsic chaotic dynamics. Indeed, simplified rate-based large neuronal networks with random synaptic connections are known to exhibit a sharp transition from fixed point to chaotic dynamics when the synaptic gain is increased. However, the existence of a similar transition in neuronal circuit models with more realistic architectures and firing dynamics has not been established. In this work we investigate rate-based dynamics of neuronal circuits composed of several subpopulations and random connectivity. Nonzero connections are either positive, for excitatory neurons, or negative, for inhibitory ones, while single neuron output is strictly positive, in line with known constraints in many biological systems. Using Dynamic Mean Field Theory, we find the phase diagram depicting the regimes of stable fixed point, unstable dynamic and chaotic rate fluctuations. We characterize the properties of systems near the chaotic transition and show that dilute excitatory-inhibitory architectures exhibit the same onset to chaos as a network with Gaussian connectivity. Interestingly, the critical properties near transition depend on the shape of the single-neuron input-output transfer function near firing threshold. Finally, we investigate network models with spiking dynamics. When synaptic time constants are slow relative to the mean inverse firing rates, the network undergoes a sharp transition from fast spiking fluctuations and static firing rates to a state with slow chaotic rate fluctuations. When the synaptic time constants are finite, the transition becomes smooth and obeys scaling properties, similar to crossover phenomena in statistical mechanics.
  • Article
    The brain exhibits temporally complex patterns of activity with features similar to those of chaotic systems. Theoretical studies over the last twenty years have described various computational advantages for such regimes in neuronal systems. Nevertheless, it still remains unclear whether chaos requires specific cellular properties or network architectures, or whether it is a generic property of neuronal circuits. We investigate the dynamics of networks of excitatory-inhibitory (EI) spiking neurons with random sparse connectivity operating in the regime of balance of excitation and inhibition. Combining Dynamical Mean-Field Theory with numerical simulations, we show that chaotic, asynchronous firing rate fluctuations emerge generically for sufficiently strong synapses. Two different mechanisms can lead to these chaotic fluctuations. One mechanism relies on slow I-I inhibition which gives rise to slow subthreshold voltage and rate fluctuations. The decorrelation time of these fluctuations is proportional to the time constant of the inhibition. The second mechanism relies on the recurrent E-I-E feedback loop. It requires slow excitation but the inhibition can be fast. In the corresponding dynamical regime all neurons exhibit rate fluctuations on the time scale of the excitation. Another feature of this regime is that the population-averaged firing rate is substantially smaller in the excitatory population than in the inhibitory population. This is not necessarily the case in the I-I mechanism. Finally, we discuss the neurophysiological and computational significance of our results.
  • Article
    Providing the neurobiological basis of information processing in higher animals, spiking neural networks must be able to learn a variety of complicated computations, including the generation of appropriate, possibly delayed reactions to inputs and the self-sustained generation of complex activity patterns, e.g. for locomotion. Many such computations require previous building of intrinsic world models. Here we show how spiking neural networks may solve these different tasks. Firstly, we derive constraints under which classes of spiking neural networks lend themselves to substrates of powerful general purpose computing. The networks contain dendritic or synaptic nonlinearities and have a constrained connectivity. We then combine such networks with learning rules for outputs or recurrent connections. We show that this allows them to learn even difficult benchmark tasks such as the self-sustained generation of desired low-dimensional chaotic dynamics or memory-dependent computations. Furthermore, we show how spiking networks can build models of external world systems and use the acquired knowledge to control them.
  • Article
    Deep learning allows computational models that are composed of multiple processing layers to learn representations of data with multiple levels of abstraction. These methods have dramatically improved the state-of-the-art in speech recognition, visual object recognition, object detection and many other domains such as drug discovery and genomics. Deep learning discovers intricate structure in large data sets by using the backpropagation algorithm to indicate how a machine should change its internal parameters that are used to compute the representation in each layer from the representation in the previous layer. Deep convolutional nets have brought about breakthroughs in processing images, video, speech and audio, whereas recurrent nets have shone light on sequential data such as text and speech.
  • Article
    Two observations about the cortex have puzzled neuroscientists for a long time. First, neural responses are highly variable. Second, the level of excitation and inhibition received by each neuron is tightly balanced at all times. Here, we demonstrate that both properties are necessary consequences of neural networks that represent information efficiently in their spikes. We illustrate this insight with spiking networks that represent dynamical variables. Our approach is based on two assumptions: We assume that information about dynamical variables can be read out linearly from neural spike trains, and we assume that neurons only fire a spike if that improves the representation of the dynamical variables. Based on these assumptions, we derive a network of leaky integrate-and-fire neurons that is able to implement arbitrary linear dynamical systems. We show that the membrane voltage of the neurons is equivalent to a prediction error about a common population-level signal. Among other things, our approach allows us to construct an integrator network of spiking neurons that is robust against many perturbations. Most importantly, neural variability in our networks cannot be equated to noise. Despite exhibiting the same single unit properties as widely used population code models (e.g. tuning curves, Poisson distributed spike trains), balanced networks are orders of magnitude more reliable. Our approach suggests that spikes do matter when considering how the brain computes, and that the reliability of cortical representations could have been strongly underestimated.
  • Article
    Prefrontal cortex is thought to have a fundamental role in flexible, context-dependent behaviour, but the exact nature of the computations underlying this role remains largely unknown. In particular, individual prefrontal neurons often generate remarkably complex responses that defy deep understanding of their contribution to behaviour. Here we study prefrontal cortex activity in macaque monkeys trained to flexibly select and integrate noisy sensory inputs towards a choice. We find that the observed complexity and functional roles of single neurons are readily understood in the framework of a dynamical process unfolding at the level of the population. The population dynamics can be reproduced by a trained recurrent neural network, which suggests a previously unknown mechanism for selection and integration of task-relevant inputs. This mechanism indicates that selection and integration are two aspects of a single dynamical process unfolding within the same prefrontal circuits, and potentially provides a novel, general framework for understanding context-dependent computations.
  • Article
    The brain's ability to tell time and produce complex spatiotemporal motor patterns is critical for anticipating the next ring of a telephone or playing a musical instrument. One class of models proposes that these abilities emerge from dynamically changing patterns of neural activity generated in recurrent neural networks. However, the relevant dynamic regimes of recurrent networks are highly sensitive to noise; that is, chaotic. We developed a firing rate model that tells time on the order of seconds and generates complex spatiotemporal patterns in the presence of high levels of noise. This is achieved through the tuning of the recurrent connections. The network operates in a dynamic regime that exhibits coexisting chaotic and locally stable trajectories. These stable patterns function as 'dynamic attractors' and provide a feature that is characteristic of biological systems: the ability to 'return' to the pattern being generated in the face of perturbations.
  • Article
    Dynamical systems driven by strong external signals are ubiquitous in nature and engineering. Here we study "echo state networks," networks of a large number of randomly connected nodes, which represent a simple model of a neural network, and have important applications in machine learning. We develop a mean-field theory of echo state networks. The dynamics of the network is captured by the evolution law, similar to a logistic map, for a single collective variable. When the network is driven by many independent external signals, this collective variable reaches a steady state. But when the network is driven by a single external signal, the collective variable is non-stationary but can be characterized by its time-averaged distribution. The predictions of the mean-field theory, including the value of the largest Lyapunov exponent, are compared with the numerical integration of the equations of motion.
  • Article
    Recurrent neural networks (RNNs) are useful tools for learning nonlinear relationships between time-varying inputs and outputs with complex temporal dependencies. Recently developed algorithms have been successful at training RNNs to perform a wide variety of tasks, but the resulting networks have been treated as black boxes: their mechanism of operation remains unknown. Here we explore the hypothesis that fixed points, both stable and unstable, and the linearized dynamics around them, can reveal crucial aspects of how RNNs implement their computations. Further, we explore the utility of linearization in areas of phase space that are not true fixed points but merely points of very slow movement. We present a simple optimization technique that is applied to trained RNNs to find the fixed and slow points of their dynamics. Linearization around these slow regions can be used to explore, or reverse-engineer, the behavior of the RNN. We describe the technique, illustrate it using simple examples, and finally showcase it on three high-dimensional RNN examples: a 3-bit flip-flop device, an input-dependent sine wave generator, and a two-point moving average. In all cases, the mechanisms of trained networks could be inferred from the sets of fixed and slow points and the linearized dynamics around them.
  • Article
    In this paper, after giving definitions for a set of commonly used terms in recurrent neural networks (RNNs), all possible RNN architectures based on these definitions are enumerated, and described. Then, most existing RNN architectures are categorized under these headings. Four general neural network architectures, in increasing degree of complexity, are introduced. It is shown that all the existing RNN architectures can be considered as special cases of the general RNN architectures. Furthermore, it is shown how these existing architectures can be transformed to the general RNN architectures. Some open issues concerning RNN architectures are discussed.
  • Article
    We investigate the dynamical behaviour of neural networks with asymmetric synaptic weights, in the presence of random thresholds. We inspect low gain dynamics before using mean-field equations to study the bifurcations of the fixed points and the change of regime that occurs when varying control parameters. We infer different areas with various regimes summarized by a bifurcation map in the parameter space. We numerically show the occurrence of chaos that arises generically by a quasi-periodicity route. We then discuss some features of our system in relation with biological observations such as low firing rates and refractory periods.
  • Article
    To answer the questions of how information about the physical world is sensed, in what form is information remembered, and how does information retained in memory influence recognition and behavior, a theory is developed for a hypothetical nervous system called a perceptron. The theory serves as a bridge between biophysics and psychology. It is possible to predict learning curves from neurological variables and vice versa. The quantitative statistical approach is fruitful in the understanding of the organization of cognitive systems. 18 references.
  • Conference Paper
    Output feedback is crucial for autonomous and parameterized pattern generation with reservoir networks. Read-out learning can lead to error amplification in these settings and therefore regularization is important for both generalization and reduction of error amplification. We show that regularization of the inner reservoir network mitigates parameter dependencies and boosts the task-specific performance.
  • Article
    It is known that if one perturbs a large iid random matrix by a bounded rank error, then the majority of the eigenvalues will remain distributed according to the circular law. However, the bounded rank perturbation may also create one or more outlier eigenvalues. We show that if the perturbation is small, then the outlier eigenvalues are created next to the outlier eigenvalues of the bounded rank perturbation; but if the perturbation is large, then many more outliers can be created, and their law is governed by the zeroes of a random Laurent series with Gaussian coefficients. On the other hand, these outliers may be eliminated by enforcing a row sum condition on the final matrix.
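    The small-perturbation case described above can be checked numerically. The following is a minimal sketch, not the construction used in the cited work: it assumes Gaussian iid entries with variance 1/N (so the bulk fills roughly the unit disc), a rank-one perturbation `strength * u u^T` with a uniform unit vector `u`, and an arbitrary modulus threshold of 1.5 to separate outliers from the bulk. The function name and all parameter values are illustrative choices.

```python
import numpy as np

def spectrum_with_rank_one(n=500, strength=3.0, seed=0):
    """Eigenvalues of an iid Gaussian matrix plus a rank-one perturbation.

    Assumptions (illustrative only): entries have variance 1/n, and the
    perturbation is strength * u u^T with u the normalized all-ones vector,
    whose single nonzero eigenvalue equals `strength`.
    """
    rng = np.random.default_rng(seed)
    bulk = rng.normal(0.0, 1.0 / np.sqrt(n), size=(n, n))
    u = np.ones(n) / np.sqrt(n)
    perturbed = bulk + strength * np.outer(u, u)
    return np.linalg.eigvals(perturbed)

eigs = spectrum_with_rank_one()
# Eigenvalues well outside the unit disc are treated as outliers.
outliers = eigs[np.abs(eigs) > 1.5]
print("number of outliers:", len(outliers))
print("outlier location:", outliers)
```

    Under these assumptions, a single eigenvalue is expected to leave the bulk and sit close to the nonzero eigenvalue of the perturbation, while the remaining eigenvalues stay near the unit disc.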
  • Article
    Neuronal activity arises from an interaction between ongoing firing generated spontaneously by neural circuits and responses driven by external stimuli. Using mean-field analysis, we ask how a neural network that intrinsically generates chaotic patterns of activity can remain sensitive to extrinsic input. We find that inputs not only drive network responses, but they also actively suppress ongoing activity, ultimately leading to a phase transition in which chaos is completely eliminated. The critical input intensity at the phase transition is a nonmonotonic function of stimulus frequency, revealing a "resonant" frequency at which the input is most effective at suppressing chaos even though the power spectrum of the spontaneous activity peaks at zero and falls exponentially. A prediction of our analysis is that the variance of neural responses should be most strongly suppressed at frequencies matching the range over which many sensory systems operate.
  • Article
    Neural circuits display complex activity patterns both spontaneously and when responding to a stimulus or generating a motor output. How are these two forms of activity related? We develop a procedure called FORCE learning for modifying synaptic strengths either external to or within a model neural network to change chaotic spontaneous activity into a wide variety of desired activity patterns. FORCE learning works even though the networks we train are spontaneously chaotic and we leave feedback loops intact and unclamped during learning. Using this approach, we construct networks that produce a wide variety of complex output patterns, input-output transformations that require memory, multiple outputs that can be switched by control inputs, and motor patterns matching human motion capture data. Our results reproduce data on premovement activity in motor and premotor cortex, and suggest that synaptic plasticity may be a more rapid and powerful modulator of network activity than generally appreciated.
  • Article
    A continuous-time dynamic model of a network of N nonlinear elements interacting via random asymmetric couplings is studied. A self-consistent mean-field theory, exact in the N → ∞ limit, predicts a transition from a stationary phase to a chaotic phase occurring at a critical value of the gain parameter. The autocorrelations of the chaotic flow as well as the maximal Lyapunov exponent are calculated.
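    The transition described above can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the model specification of the cited work: it uses the commonly studied form dx/dt = -x + g J tanh(x) with Gaussian couplings of variance 1/N, a finite network of 200 units, and forward-Euler integration. In this parametrization the transition is expected near g = 1 for large N; the function name and parameter values are illustrative.

```python
import numpy as np

def simulate_rate_network(g, n=200, t_max=50.0, dt=0.05, seed=0):
    """Integrate dx/dt = -x + g * J @ tanh(x) and return the final RMS activity.

    Assumptions (illustrative only): J has iid Gaussian entries with
    variance 1/n, and the gain g multiplies the recurrent input.
    """
    rng = np.random.default_rng(seed)
    J = rng.normal(0.0, 1.0 / np.sqrt(n), size=(n, n))
    x = rng.normal(0.0, 1.0, size=n)  # random initial condition
    for _ in range(int(t_max / dt)):
        # Forward-Euler step of the rate dynamics.
        x = x + dt * (-x + g * J @ np.tanh(x))
    return float(np.sqrt(np.mean(x ** 2)))

rms_low = simulate_rate_network(g=0.5)   # below the expected transition
rms_high = simulate_rate_network(g=2.0)  # above the expected transition
print("final RMS activity, g=0.5:", rms_low)
print("final RMS activity, g=2.0:", rms_high)
```

    With low gain the activity relaxes to the stationary state at zero, whereas with high gain it remains of order one. Distinguishing chaos from other persistent activity would additionally require an estimate of the maximal Lyapunov exponent, which this sketch does not compute.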
  • Article
    We study discrete parallel dynamics of a fully connected network of nonlinear elements interacting via long-range random asymmetric couplings under the influence of external noise. Using dynamical mean-field equations, which become exact in the thermodynamic limit, we calculate the activity and the maximal Lyapunov exponent of the network as a function of a nonlinearity (gain) parameter and the noise intensity.
  • Article
    We present a method for learning nonlinear systems, echo state networks (ESNs). ESNs employ artificial recurrent neural networks in a way that has recently been proposed independently as a learning mechanism in biological brains. The learning method is computationally efficient and easy to use. On a benchmark task of predicting a chaotic time series, accuracy is improved by a factor of 2400 over previous techniques. The potential for engineering applications is illustrated by equalizing a communication channel, where the signal error rate is improved by two orders of magnitude.
  • Article
    The dynamics of neural networks is influenced strongly by the spectrum of eigenvalues of the matrix describing their synaptic connectivity. In large networks, elements of the synaptic connectivity matrix can be chosen randomly from appropriate distributions, making results from random matrix theory highly relevant. Unfortunately, classic results on the eigenvalue spectra of random matrices do not apply to synaptic connectivity matrices because of the constraint that individual neurons are either excitatory or inhibitory. Therefore, we compute eigenvalue spectra of large random matrices with excitatory and inhibitory columns drawn from distributions with different means and equal or different variances.
  • Article
    It has previously been shown that generic cortical microcircuit models can perform complex real-time computations on continuous input streams, provided that these computations can be carried out with a rapidly fading memory. We investigate the computational capability of such circuits in the more realistic case where not only readout neurons, but in addition a few neurons within the circuit, have been trained for specific tasks. This is essentially equivalent to the case where the output of trained readout neurons is fed back into the circuit. We show that this new model overcomes the limitation of a rapidly fading memory. In fact, we prove that in the idealized case without noise it can carry out any conceivable digital or analog computation on time-varying inputs. But even with noise, the resulting computational model can perform a large class of biologically relevant real-time computations that require a nonfading memory. We demonstrate these computational implications of feedback both theoretically, and through computer simulations of detailed cortical microcircuit models that are subject to noise and have complex inherent dynamics. We show that the application of simple learning procedures (such as linear regression or perceptron learning) to a few neurons enables such circuits to represent time over behaviorally relevant long time spans, to integrate evidence from incoming spike trains over longer periods of time, and to process new information contained in such spike trains in diverse ways according to the current internal state of the circuit. In particular we show that such generic cortical microcircuits with feedback provide a new model for working memory that is consistent with a large set of biological constraints. Although this article examines primarily the computational role of feedback in circuits of neurons, the mathematical principles on which its analysis is based apply to a variety of dynamical systems. 
    Hence they may also throw new light on the computational role of feedback in other complex biological dynamical systems, such as, for example, genetic regulatory networks.
  • Article
    How to efficiently train recurrent networks remains a challenging and active research topic. Most of the proposed training approaches are based on computational ways to efficiently obtain the gradient of the error function, and can be generally grouped into five major groups. In this study we present a derivation that unifies these approaches. We demonstrate that the approaches are only five different ways of solving a particular matrix equation. The second goal of this paper is to develop a new algorithm based on the insights gained from the novel formulation. The new algorithm, which is based on approximating the error gradient, has lower computational complexity in computing the weight update than the competing techniques for most typical problems. In addition, it reaches the error minimum in a much smaller number of iterations. A desirable characteristic of recurrent network training algorithms is to be able to update the weights in an on-line fashion. We have also developed an on-line version of the proposed algorithm, which is based on updating the error gradient approximation in a recursive manner.
  • Article
    Recurrent neural networks can be used to map input sequences to output sequences, such as for recognition, production or prediction problems. However, practical difficulties have been reported in training recurrent neural networks to perform tasks in which the temporal contingencies present in the input/output sequences span long intervals. We show why gradient based learning algorithms face an increasingly difficult problem as the duration of the dependencies to be captured increases. These results expose a trade-off between efficient learning by gradient descent and latching on information for long periods. Based on an understanding of this problem, alternatives to standard gradient descent are considered.
  • Article
    The convergence properties of a fairly general class of adaptive recursive least-squares algorithms are studied under the assumption that the data generation mechanism is deterministic and time invariant. First, the (open-loop) identification case is considered. By a suitable notion of excitation subspace, the convergence analysis of the identification algorithm is carried out with no persistent excitation hypothesis, i.e. it is proven that the projection of the parameter error on the excitation subspace tends to zero, while the orthogonal component of the error remains bounded. The convergence of an adaptive control scheme based on the minimum variance control law is then dealt with. It is shown that under the standard minimum-phase assumption, the tracking error converges to zero whenever the reference signal is bounded. Furthermore, the control variable turns out to be bounded.
  • Article
    Echo state networks (ESN) are a novel approach to recurrent neural network training. An ESN consists of a large, fixed, recurrent "reservoir" network, from which the desired output is obtained by training suitable output connection weights. Determination of optimal output weights becomes a linear, uniquely solvable task of MSE minimization. This article reviews the basic ideas and describes an online adaptation scheme based on the RLS algorithm known from adaptive linear systems. As an example, a 10-th order NARMA system is adaptively identified. The known benefits of the RLS algorithms carry over from linear systems to nonlinear ones; specifically, the convergence rate and misadjustment can be determined at design time.
  • Article
    Gradient descent algorithms in recurrent neural networks can have problems when the network dynamics experience bifurcations in the course of learning. The possible hazards caused by the bifurcations of the network dynamics and the learning equations are investigated. The roles of teacher forcing, preprogramming of network structures, and the approximate learning algorithms are discussed.
    1 Introduction. Supervised learning in recurrent neural networks has been extensively applied to speech recognition, language processing [2, 5, 6], and the modeling of biological neural networks [1, 11, 16, 18]. Although gradient descent algorithms for recurrent networks are considered as a simple extension to the back-propagation learning for feed-forward networks, there is an essential difference between the learning processes in feed-forward and recurrent networks. The output of a feed-forward network is a continuous function of the weights if each unit has a smooth output function, such as a sig...