Recurrent neural networks have been extensively studied in the context of neuroscience and machine learning due to their ability to implement complex computations. While substantial progress in designing effective learning algorithms has been achieved, a full understanding of trained recurrent networks is still lacking. Specifically, the mechanisms that allow computations to emerge from the underlying recurrent dynamics are largely unknown. Here we focus on a simple yet underexplored computational setup: a feedback architecture trained to associate a stationary output to a stationary input. As a starting point, we derive an approximate analytical description of global dynamics in trained networks, which assumes uncorrelated connectivity weights in the feedback and in the random bulk. The resulting mean-field theory suggests that the task admits several classes of solutions, which imply different stability properties. Different classes are characterized in terms of the geometrical arrangement of the readout with respect to the input vectors, defined in the high-dimensional space spanned by the network population. We find that such an approximate theoretical approach can be used to understand how standard training techniques implement the input-output task in finite-size feedback networks. In particular, our simplified description captures the local and the global stability properties of the target solution, and thus predicts training performance.
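
As a concrete, deliberately simplified illustration of the setup in this abstract, the sketch below trains a minimum-norm least-squares readout to hold a stationary output in a random feedback network, then verifies the closed loop. All parameter values (`N`, `g`, the target `A`, the weak feedback scaling) are illustrative choices for the example, not the paper's.

```python
import numpy as np

# Minimal sketch of a feedback network trained on a stationary
# input-output association; parameters are illustrative, not the paper's.
rng = np.random.default_rng(0)
N, g, A = 200, 0.8, 0.5                            # size, bulk gain, target output
J = g * rng.standard_normal((N, N)) / np.sqrt(N)   # random bulk connectivity
u = rng.standard_normal(N)                         # stationary input vector
m = rng.standard_normal(N) / np.sqrt(N)            # (weak) feedback vector

def settle(z, steps=2000, dt=0.05):
    """Relax x' = -x + J*phi(x) + m*z + u with the feedback clamped at z."""
    x = np.zeros(N)
    for _ in range(steps):
        x += dt * (-x + J @ np.tanh(x) + m * z + u)
    return x

# Open loop: clamp the feedback at the target and record the fixed point
phi_star = np.tanh(settle(A))

# Minimum-norm least-squares readout satisfying w . phi(x*) = A
w = A * phi_star / (phi_star @ phi_star)

# Closed loop: the readout itself now drives the feedback
x = np.zeros(N)
for _ in range(4000):
    x += 0.05 * (-x + J @ np.tanh(x) + m * (w @ np.tanh(x)) + u)

closed_loop_error = abs(w @ np.tanh(x) - A)
```

With the bulk gain below the chaotic transition, the closed loop relaxes back to the same fixed point found in the open loop, so the readout reproduces the target.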

... What kind of closed-loop dynamics can feedback networks implement? Despite some theoretical advances [4,9,10,11,12], computational properties of feedback networks are still poorly understood. Early theoretical work has indicated that most feedback models are expected to be able to approximate readout signals characterized by arbitrarily complex dynamics [4]. ...

... Early theoretical work has indicated that most feedback models are expected to be able to approximate readout signals characterized by arbitrarily complex dynamics [4]. However, it has been reported that not all feedback architectures and target dynamics result in the same performance: trained networks can experience dynamical instabilities [10,11] and converge to fragile solutions for certain choices of the feedback architecture and parameters [7,13]. ...

... This makes the readout solution highly sensitive to training initial conditions and transient network dynamics. Note that online algorithms come with the advantage of being able to avoid readout solutions that correspond to local instabilities in the closed-loop dynamics [7,11]. This advantage does not seem to play a role for the current task, for which LS-based algorithms typically result in stable dynamics (Fig. 4). ...

A fundamental feature of complex biological systems is the ability to form feedback interactions with their environment. A prominent model for studying such interactions is reservoir computing, where learning acts on low-dimensional bottlenecks. Despite the simplicity of this learning scheme, the factors contributing to or hindering the success of training in reservoir networks are in general not well understood. In this work, we study nonlinear feedback networks trained to generate a sinusoidal signal, and analyze how learning performance is shaped by the interplay between internal network dynamics and target properties. By performing exact mathematical analysis of linearized networks, we predict that learning performance is maximized when the target is characterized by an optimal, intermediate frequency which monotonically decreases with the strength of the internal reservoir connectivity. At the optimal frequency, the reservoir representation of the target signal is high-dimensional, desynchronized, and thus maximally robust to noise. We show that our predictions successfully capture the qualitative behavior of performance in nonlinear networks. Moreover, we find that the relationship between internal representations and performance can be further exploited in trained nonlinear networks to explain behaviors which do not have a linear counterpart. Our results indicate that a major determinant of learning success is the quality of the internal representation of the target, which in turn is shaped by an interplay between parameters controlling the internal network and those defining the task.
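
The reservoir-computing scheme in this abstract can be sketched with a standard echo-state recipe: teacher-force the feedback with the sinusoidal target, fit the readout by ridge regression, then close the loop. This is a generic recipe rather than the paper's exact protocol, and all parameters (`N`, `g`, `omega`, the ridge constant) are illustrative.

```python
import numpy as np

# Echo-state-style sketch: teacher forcing, ridge readout, closed loop.
rng = np.random.default_rng(1)
N, g, omega = 300, 0.9, 0.1
J = g * rng.standard_normal((N, N)) / np.sqrt(N)   # fixed reservoir
m = rng.standard_normal(N)                         # feedback weights

T = 3000
y = np.sin(omega * np.arange(T + 1))               # sinusoidal target
X = np.zeros((T, N))
x = np.zeros(N)
for t in range(T):                                 # open loop: feedback clamped to target
    x = np.tanh(J @ x + m * y[t])
    X[t] = x

# Ridge regression (after a washout) for w such that w . x_t ~ y[t+1]
W0, alpha = 100, 1e-4
XX, yy = X[W0:], y[W0 + 1:T + 1]
w = np.linalg.solve(XX.T @ XX + alpha * np.eye(N), XX.T @ yy)
train_nrmse = np.sqrt(np.mean((XX @ w - yy) ** 2)) / np.std(yy)

# Closed loop: the readout output replaces the teacher signal
z = np.empty(500)
for t in range(500):
    x = np.tanh(J @ x + m * (w @ x))
    z[t] = w @ x
```

The abstract's point is that how well this works depends on the interplay between the target frequency `omega` and the reservoir gain `g`; sweeping those two knobs in this sketch is the natural follow-up experiment.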

... Our model in contrast exhibits an interplay between low-rank structured connectivity implementing balance, and high-rank disordered connectivity inducing chaos, each with independently adjustable strengths. In general, how computation emerges from an interplay between structured and random connectivity has been a subject of recent interest in theoretical neuroscience [18,23,24,25]. Here we show how structure and randomness interact by obtaining analytic insights into the efficacy of predictive coding, dissecting the individual contributions of balance, noise, weight disorder, chaos, delays and nonlinearity, in a model where all ingredients can coexist and be independently adjusted. ...

... We find: (1) strong balance is a key requirement for superclassical error scaling with network size; (2) without delays, increasing balance always suppresses errors via power laws with different exponents (-1 for noise, -2 for chaos); (3) delays yield an oscillatory instability and a tradeoff between noise suppression and resonant amplification; (4) this tradeoff sets a maximal critical balance level which decreases with delay; (5) noise or chaos can increase this maximal level by promoting desynchronization; (6) the competition between noise suppression and resonant amplification sets an optimal balance level that is half the maximal level in the case of noise; (7) but is close to the maximal level in the case of chaos for small delays, because the slow chaos has small power at the high resonant frequency; (8) the optimal decoder error rises as a power law with delay (with exponent 1/2 for noise and 1 for chaos). Also, our model unifies a variety of perspectives in theoretical neuroscience, spanning classical synaptic balance [17,29,42,43,44,45], efficient coding in tight balance [7,46], the interplay of structured and random connectivity in computation [18,23,24,47,48], the relation between oscillations and delays in neural networks [49,50,51] and predictive coding [8,10]. Moreover, the mean-field theory developed here can be extended to spiking neurons with strong recurrent balance and delays [52], analytically explaining relations between delays, coding and oscillations observed in simulations but previously not understood [21,22]. Acknowledgments: JK thanks the Swartz Foundation for Theoretical Neuroscience for funding; JT thanks the National Science Foundation for funding. ...

... We now turn to compute the statistics of the fluctuations of a random network in its chaotic phase, when the variance of the weight distribution is above the critical transition point g > g_c. The dynamic mean-field theory for a chaotic neural network was first introduced by [15] and re-derived later by [35,38,36,53,24,54]. The connectivity in the subspace orthogonal to the readout direction is randomly distributed, thus the properties of the fluctuations in this subspace, δh⊥(t), are equivalent to previous studies of random neural networks. ...

Biological neural networks face a formidable task: performing reliable computations in the face of intrinsic stochasticity in individual neurons, imprecisely specified synaptic connectivity, and nonnegligible delays in synaptic transmission. A common approach to combatting such biological heterogeneity involves averaging over large redundant networks of $N$ neurons resulting in coding errors that decrease classically as $1/\sqrt{N}$. Recent work demonstrated a novel mechanism whereby recurrent spiking networks could efficiently encode dynamic stimuli, achieving a superclassical scaling in which coding errors decrease as $1/N$. This specific mechanism involved two key ideas: predictive coding, and a tight balance, or cancellation between strong feedforward inputs and strong recurrent feedback. However, the theoretical principles governing the efficacy of balanced predictive coding and its robustness to noise, synaptic weight heterogeneity and communication delays remain poorly understood. To discover such principles, we introduce an analytically tractable model of balanced predictive coding, in which the degree of balance and the degree of weight disorder can be dissociated unlike in previous balanced network models, and we develop a mean field theory of coding accuracy. Overall, our work provides and solves a general theoretical framework for dissecting the differential contributions of neural noise, synaptic disorder, chaos, synaptic delays, and balance to the fidelity of predictive neural codes, reveals the fundamental role that balance plays in achieving superclassical scaling, and unifies previously disparate models in theoretical neuroscience.
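
For orientation, the classical $1/\sqrt{N}$ baseline mentioned in this abstract is just the error scaling of an averaged noisy population. The Monte Carlo sketch below illustrates only that baseline; the superclassical $1/N$ scaling requires the balanced predictive-coding circuit itself and is not reproduced here. All sizes are illustrative.

```python
import numpy as np

# Classical baseline: averaging N independent noisy neurons encoding a
# constant stimulus gives a readout error shrinking as 1/sqrt(N).
rng = np.random.default_rng(3)
stimulus, trials = 1.0, 500

def readout_error(N):
    # each trial: N neurons = stimulus + unit noise; decoder = population mean
    estimates = (stimulus + rng.standard_normal((trials, N))).mean(axis=1)
    return np.sqrt(np.mean((estimates - stimulus) ** 2))

e_small, e_large = readout_error(100), readout_error(10000)
ratio = e_small / e_large      # expected ~ sqrt(10000/100) = 10
```
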

... Investigating these effects in our tasks would require the addition of recurrent connectivity to the model. Mathematical tools for analyzing learning dynamics in recurrent networks are starting to become available (Mastrogiuseppe and Ostojic, 2019; Schuessler et al., 2020; Dubreuil et al., 2022; Susman et al., 2021), which could allow our analysis to be extended in that direction. ...

The ability to associate sensory stimuli with abstract classes is critical for survival. How are these associations implemented in brain circuits? And what governs how neural activity evolves during abstract knowledge acquisition? To investigate these questions, we consider a circuit model that learns to map sensory input to abstract classes via gradient-descent synaptic plasticity. We focus on typical neuroscience tasks (simple, and context-dependent, categorization), and study how both synaptic connectivity and neural activity evolve during learning. To make contact with the current generation of experiments, we analyze activity via standard measures such as selectivity, correlations, and tuning symmetry. We find that the model is able to recapitulate experimental observations, including seemingly disparate ones. We determine how, in the model, the behaviour of these measures depends on details of the circuit and the task. These dependencies make experimentally testable predictions about the circuitry supporting abstract knowledge acquisition in the brain.

... Echo-state and FORCE networks therefore correspond to low-rank networks with an additional full-rank, random term in the connectivity (Mastrogiuseppe and Ostojic, 2018, 2019). Because the feedback loops are trained to produce specific outputs, the low-rank part of the connectivity is typically correlated to the random connectivity term (but see Mastrogiuseppe and Ostojic (2019)). Such correlations increase the dimensionality and the range of the dynamics (Schuessler et al., 2020a; Logiaco et al., 2019), although the low-rank connectivity structure and the number of populations still generate strong constraints. ...

Neural activity in awake animals exhibits a vast range of timescales giving rise to behavior that can adapt to a constantly evolving environment. How are such complex temporal patterns generated in the brain, given that individual neurons function with membrane time constants in the range of tens of milliseconds? How can neural computations rely on such activity patterns to produce flexible temporal behavior? One hypothesis posits that long timescales at the level of neural network dynamics can be inherited from long timescales of underlying biophysical processes at the single neuron level, such as adaptive ionic currents and synaptic transmission. We analyzed large networks of randomly connected neurons taking into account these slow cellular processes, and characterized the temporal statistics of the emerging neural activity. Our overarching result is that the timescales of different biophysical processes do not necessarily induce a wide range of timescales in the collective activity of large recurrent networks. Conversely, complex temporal patterns can be generated by structure in synaptic connectivity. In the second chapter of the dissertation, we considered a novel class of models, Gaussian-mixture low-rank recurrent networks, in which connectivity structure is characterized by two independent properties, the rank of the connectivity matrix and the number of statistically defined populations. We show that such networks act as universal approximators of arbitrary low-dimensional dynamical systems, and therefore can generate temporally complex activity. In the last chapter, we investigated how dynamical mechanisms at the network level implement flexible sensorimotor timing tasks. We first show that low-rank networks trained on such tasks generate low-dimensional invariant manifolds, where dynamics evolve slowly and can be flexibly modulated.
We then identified the core dynamical components and tested them in simplified network models that carry out the same flexible timing tasks. Overall, we uncovered novel dynamical mechanisms for temporal flexibility that rely on minimal connectivity structure and can implement a vast range of computations.

... In particular, it was proven that RCN are universal function approximators [19] and that their representations are rich enough to correctly embed dynamical systems through their state-space representation [20,21]. Theoretical analysis of this learning principle led to many results about their expressive power [22,23,24,25,26]. Moreover, interesting results can be derived when assuming linear dynamics [27,28,29,30,31]. ...

In recent years, the artificial intelligence community has seen continued interest in research aimed at investigating dynamical aspects of both training procedures and machine learning models. Of particular interest among recurrent neural networks, we have the Reservoir Computing (RC) paradigm, characterized by conceptual simplicity and a fast training scheme. Yet, the guiding principles under which RC operates are only partially understood. In this work, we analyze the role played by Generalized Synchronization (GS) when training a RC to solve a generic task. In particular, we show how GS allows the reservoir to correctly encode the system generating the input signal into its dynamics. We also discuss necessary and sufficient conditions for the learning to be feasible in this approach. Moreover, we explore the role that ergodicity plays in this process, showing how its presence allows the learning outcome to apply to multiple input trajectories. Finally, we show that satisfaction of the GS can be measured by means of the mutual false nearest neighbors index, which makes the theoretical derivations accessible to practitioners.
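
The core idea, that a synchronized reservoir's response is a function of the input history rather than of its own initial state, can be checked directly: two copies of the same reservoir driven by the same signal collapse onto a common trajectory. A minimal sketch with illustrative parameters (not the paper's mutual-false-nearest-neighbors machinery):

```python
import numpy as np

# Synchronization-to-the-input sketch: identical reservoirs, different
# initial states, same drive -> state difference collapses to zero.
rng = np.random.default_rng(4)
N, g = 200, 0.8
J = g * rng.standard_normal((N, N)) / np.sqrt(N)
w_in = rng.standard_normal(N)

x = rng.standard_normal(N)        # two different initial conditions
y = rng.standard_normal(N)
d0 = np.linalg.norm(x - y)
for t in range(500):
    u = np.sin(0.1 * t)           # common driving input
    x = np.tanh(J @ x + w_in * u)
    y = np.tanh(J @ y + w_in * u)
d_final = np.linalg.norm(x - y)
```
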

... Echo-state and FORCE networks therefore correspond to low-rank networks with an additional full-rank, random term in the connectivity (Mastrogiuseppe & Ostojic, 2018, 2019). Because the feedback loops are trained to produce specific outputs, the low-rank part of the connectivity is typically correlated to the random connectivity term (but see Mastrogiuseppe & Ostojic, 2019). Such correlations increase the dimensionality and the range of the dynamics (Schuessler et al., 2020a; Logiaco, Abbott, & Escola, 2019), although the low-rank connectivity structure and the number of populations still generate strong constraints. ...

An emerging paradigm proposes that neural computations can be understood at the level of dynamical systems that govern low-dimensional trajectories of collective neural activity. How the connectivity structure of a network determines the emergent dynamical system, however, remains to be clarified. Here we consider a novel class of models, Gaussian-mixture low-rank recurrent networks, in which the rank of the connectivity matrix and the number of statistically defined populations are independent hyperparameters. We show that the resulting collective dynamics form a dynamical system, where the rank sets the dimensionality and the population structure shapes the dynamics. In particular, the collective dynamics can be described in terms of a simplified effective circuit of interacting latent variables. While having a single global population strongly restricts the possible dynamics, we demonstrate that if the number of populations is large enough, a rank R network can approximate any R-dimensional dynamical system.
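
A minimal sketch of the rank-one case described here, with illustrative values: in x' = -x + (m nᵀ/N) tanh(x), activity collapses onto the structured direction m, so the collective state is summarized by a single latent variable κ, the coordinate along m.

```python
import numpy as np

# Rank-one low-rank network: the state converges onto span(m) and is
# described by one latent variable kappa. Values are illustrative.
rng = np.random.default_rng(5)
N = 1000
m = rng.standard_normal(N)
n = 2.0 * m                            # overlap n.m/N ~ 2 > 1: nontrivial fixed point

x = 0.5 * m + rng.standard_normal(N)   # start partly off the m direction
dt = 0.05
for _ in range(1000):
    x += dt * (-x + m * (n @ np.tanh(x)) / N)

kappa = (m @ x) / (m @ m)                            # latent coordinate along m
off_m = np.linalg.norm(x - kappa * m) / np.linalg.norm(x)   # residual off-direction
```

The off-m component decays autonomously (the recurrent drive is proportional to m), while κ relaxes to the nonzero fixed point of the one-dimensional effective circuit.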

... Memory capacity maximization does not necessarily imply performance (e.g., prediction) maximization [11]. In recent years, large efforts have been devoted to tackling these problems, by studying the dynamical systems behind RNNs [2,12,13,3,14,15]. ...

Reservoir computing is a popular approach to design recurrent neural networks, due to its training simplicity and approximation performance. The recurrent part of these networks is not trained (e.g., via gradient descent), making them appealing for analytical studies by a large community of researchers with backgrounds spanning from dynamical systems to neuroscience. However, even in the simple linear case, the working principle of these networks is not fully understood and their design is usually driven by heuristics. A novel analysis of the dynamics of such networks is proposed, which allows the investigator to express the state evolution using the controllability matrix. Such a matrix encodes salient characteristics of the network dynamics; in particular, its rank represents an input-independent measure of the memory capacity of the network. Using the proposed approach, it is possible to compare different reservoir architectures and explain why a cyclic topology achieves favorable results as verified by practitioners.
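
The controllability-matrix idea can be sketched in a few lines: for a linear reservoir x_{t+1} = W x_t + b u_t, the rank of K = [b, Wb, ..., W^{N-1}b] is the input-independent memory measure, and a cyclic topology attains full rank while a degenerate spectrum does not. Sizes and weights below are illustrative.

```python
import numpy as np

# Rank of the controllability (Krylov) matrix as a memory measure
# for a linear reservoir x_{t+1} = W x_t + b u_t.
N = 20

def ctrl_rank(W, b):
    K = np.column_stack([np.linalg.matrix_power(W, k) @ b for k in range(N)])
    return np.linalg.matrix_rank(K)

b = np.zeros(N); b[0] = 1.0
W_cyclic = 0.9 * np.roll(np.eye(N), 1, axis=0)   # ring: a delay line of couplings
W_degenerate = 0.9 * np.eye(N)                   # single repeated eigenvalue

rank_cyclic = ctrl_rank(W_cyclic, b)             # full rank: N steps of memory
rank_degen = ctrl_rank(W_degenerate, b)          # rank 1: powers of W keep b parallel
```

The cyclic reservoir shifts the input around the ring, so the Krylov columns stay orthogonal; with a repeated-eigenvalue reservoir every power of W maps b onto the same direction.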

... In particular, it was proven that RCN are universal function approximators [19] and that their representations are rich enough to correctly embed dynamical systems through their state-space representation [20,21]. Theoretical analysis of this learning principle led to many results about their expressive power [22,23,24,25,26]. Moreover, interesting results can be derived when assuming linear dynamics [27,28,29,30,31]. ...

In recent years, the machine learning community has seen a continuously growing interest in research aimed at investigating dynamical aspects of both training procedures and perfected models. Of particular interest among recurrent neural networks, we have the Reservoir Computing (RC) paradigm for its conceptual simplicity and fast training scheme. Yet, the guiding principles under which RC operates are only partially understood. In this work, we study the properties behind learning dynamical systems with RC and propose a new guiding principle based on Generalized Synchronization (GS) granting its feasibility. We show that the well-known Echo State Property (ESP) implies and is implied by GS, so that theoretical results derived from the ESP still hold when GS does. However, by using GS one can profitably study the RC learning procedure by linking the reservoir dynamics with the readout training. Notably, this allows us to shed light on the interplay between the input encoding performed by the reservoir and the output produced by the readout optimized for the task at hand. In addition, we show that - as opposed to the ESP - satisfaction of the GS can be measured by means of the Mutual False Nearest Neighbors index, which makes the theoretical derivations accessible to practitioners.

... Theoretical clarity on the nature of individuality and universality in nonlinear RNN dynamics is largely lacking, with some exceptions [29,30,31,32]. Therefore, with the above neuroscientific and theoretical motivations in mind, we initiate an extensive numerical study of the variations in RNN dynamics across thousands of RNNs with varying modelling choices. ...

Task-based modeling with recurrent neural networks (RNNs) has emerged as a popular way to infer the computational function of different brain regions. These models are quantitatively assessed by comparing the low-dimensional neural representations of the model with the brain, for example using canonical correlation analysis (CCA). However, the nature of the detailed neurobiological inferences one can draw from such efforts remains elusive. For example, to what extent does training neural networks to solve common tasks uniquely determine the network dynamics, independent of modeling architectural choices? Or alternatively, are the learned dynamics highly sensitive to different model choices? Knowing the answer to these questions has strong implications for whether and how we should use task-based RNN modeling to understand brain dynamics. To address these foundational questions, we study populations of thousands of networks, with commonly used RNN architectures, trained to solve neuroscientifically motivated tasks and characterize their nonlinear dynamics. We find the geometry of the RNN representations can be highly sensitive to different network architectures, yielding a cautionary tale for measures of similarity that rely on representational geometry, such as CCA. Moreover, we find that while the geometry of neural dynamics can vary greatly across architectures, the underlying computational scaffold (the topological structure of fixed points, transitions between them, limit cycles, and linearized dynamics) often appears universal across all architectures.
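
The notion of a fixed-point "computational scaffold" can be illustrated on a toy network: relax the dynamics from many initial conditions and collect the distinct attractors reached. The 2-neuron weight matrix below is an arbitrary illustrative choice, not one of the paper's trained RNNs.

```python
import numpy as np

# Enumerate the stable fixed points of a small rate network
# x' = -x + W tanh(x) by relaxing from a grid of initial conditions.
W = np.array([[0.0, 1.5],
              [1.5, 0.0]])       # mutual excitation: two symmetric attractors

def relax(x0, steps=3000, dt=0.05):
    x = np.array(x0, dtype=float)
    for _ in range(steps):
        x += dt * (-x + W @ np.tanh(x))
    return x

inits = [(1, 0), (0, 1), (-1, 0), (0, -1), (2, 2), (-2, -2)]
fps = {tuple(np.round(relax(x0), 3)) for x0 in inits}   # distinct attractors

# residual of the fixed-point equation at each attractor found
residuals = [np.linalg.norm(-np.array(p) + W @ np.tanh(p)) for p in fps]
```

Here the scaffold is two symmetric stable fixed points separated by an unstable origin; the same census (plus linearization at each point) is the kind of topological summary the abstract argues is shared across architectures.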

... This simple training protocol is not sufficient in many applications, e.g. when it is required to learn memory states. To this end, training mechanisms based on output feedback [24,25] and online training [10,26] have been proposed, with successful applications in physics [27,28], complex systems modelling [29,30], and neuroscience [31], just to name a few. ...

A recurrent neural network (RNN) possesses the echo state property (ESP) if, for a given input sequence, it “forgets” any internal states of the driven (nonautonomous) system and asymptotically follows a unique, possibly complex trajectory. The lack of ESP is conventionally understood as a lack of reliable behaviour in RNNs. Here, we show that RNNs can reliably perform computations under a more general principle that accounts only for their local behaviour in phase space. To this end, we formulate a generalisation of the ESP and introduce an echo index to characterise the number of simultaneously stable responses of a driven RNN. We show that it is possible for the echo index to change with inputs, highlighting a potential source of computational errors in RNNs due to characteristics of the inputs driving the dynamics.
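
The echo index can be illustrated with a one-neuron driven system, where the number of coexisting stable responses is easy to count; the values below are illustrative, not from the paper.

```python
import numpy as np

# Echo-index sketch: x_{t+1} = tanh(w*x_t + u) with strong self-excitation.
# Weak constant input: two coexisting stable responses (echo index 2, no
# classical ESP). Strong input: the branches merge (echo index 1).
w = 3.0

def response(x0, u, steps=200):
    x = x0
    for _ in range(steps):
        x = np.tanh(w * x + u)
    return x

gap_weak = abs(response(1.0, u=0.0) - response(-1.0, u=0.0))     # two branches survive
gap_strong = abs(response(1.0, u=2.0) - response(-1.0, u=2.0))   # single branch
```

This also shows the abstract's point that the echo index can change with the input: the same network has two reliable responses for one input level and a unique one for another.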

... From a system perspective, the feedforward neural network model has limited computing power, whereas a feedback neural network has stronger computing power owing to its feedback dynamics, which enhance global stability [88]. In feedback neural networks, all neurons have the same status and there is no hierarchical difference. ...

Compared with von Neumann’s computer architecture, neuromorphic systems offer more unique and novel solutions to the artificial intelligence discipline. Inspired by biology, this novel system implements the theory of human brain modeling by connecting feigned neurons and synapses to reveal new neuroscience concepts. Many researchers have invested heavily in neuro-inspired models, algorithms, learning approaches, and operation systems for the exploration of neuromorphic systems, and have implemented many corresponding applications. Recently, some researchers have demonstrated the capabilities of Hopfield algorithms in some large-scale, notable hardware projects, with significant progress. This paper presents a comprehensive review that focuses extensively on the Hopfield algorithm’s model and its potential advancement in new research applications. Towards the end, we conclude with a broad discussion and a viable plan for the latest application prospects, to give developers a better understanding of the aforementioned model as they build their own artificial intelligence projects.
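
As a reminder of the model under review, here is a minimal Hopfield sketch: Hebbian storage of a few random ±1 patterns followed by recall from a corrupted cue via asynchronous sign updates. Sizes and the corruption level are arbitrary illustrative choices.

```python
import numpy as np

# Minimal Hopfield network: Hebbian storage + asynchronous recall.
rng = np.random.default_rng(6)
N, P = 100, 3
patterns = rng.choice([-1.0, 1.0], size=(P, N))

W = (patterns.T @ patterns) / N          # Hebbian outer-product storage
np.fill_diagonal(W, 0.0)                 # no self-coupling

cue = patterns[0].copy()
flip = rng.choice(N, size=10, replace=False)
cue[flip] *= -1                          # corrupt 10% of the bits

x = cue.copy()
for _ in range(5):                       # asynchronous update sweeps
    for i in range(N):
        x[i] = 1.0 if W[i] @ x >= 0 else -1.0

overlap = (x @ patterns[0]) / N          # 1.0 means perfect recall
```

At this low storage load (P/N = 0.03, well below the classical ~0.138 capacity), the corrupted cue is cleaned up to the stored pattern.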

... Our approach allows us to gain mechanistic insight into the computations underlying echo state and FORCE learning models, which have the same connectivity structure as our model [20,21]. Here, the readout vector n is trained, which leads to correlations with the random part J [3,34]. Our results on multiple fixed points and oscillations show that these correlations are crucial for the rich functional repertoire. ...

A given neural network in the brain is involved in many different tasks. This implies that, when considering a specific task, the network's connectivity contains a component which is related to the task and another component which can be considered random. Understanding the interplay between the structured and random components and their effect on network dynamics and functionality is an important open question. Recent studies addressed the coexistence of random and structured connectivity but considered the two parts to be uncorrelated. This constraint limits the dynamics and leaves the random connectivity nonfunctional. Algorithms that train networks to perform specific tasks typically generate correlations between structure and random connectivity. Here we study nonlinear networks with correlated structured and random components, assuming the structure to have a low rank. We develop an analytic framework to establish the precise effect of the correlations on the eigenvalue spectrum of the joint connectivity. We find that the spectrum consists of a bulk and multiple outliers, whose location is predicted by our theory. Using mean-field theory, we show that these outliers directly determine both the fixed points of the system and their stability. Taken together, our analysis elucidates how correlations allow structured and random connectivity to synergistically extend the range of computations available to networks.
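
The bulk-plus-outlier picture is easy to visualize numerically. The sketch below draws the structured part independently of the random part (the uncorrelated case), so it illustrates the spectrum's anatomy rather than the correlation effects that are this paper's contribution; all parameters are illustrative.

```python
import numpy as np

# Spectrum of random bulk + rank-one structure:
# J = g*X/sqrt(N) + m n^T/N has a bulk of radius ~g plus an outlier
# near the rank-one eigenvalue m.n/N.
rng = np.random.default_rng(7)
N, g, lam = 500, 0.5, 2.0
X = rng.standard_normal((N, N))
m = rng.standard_normal(N)
n = lam * N * m / (m @ m)                # makes m n^T / N have eigenvalue lam

J = g * X / np.sqrt(N) + np.outer(m, n) / N
mags = np.sort(np.abs(np.linalg.eigvals(J)))

outlier = mags[-1]                       # ~ lam, well outside the bulk
n_outside = int(np.sum(mags > 1.2))      # eigenvalues beyond radius 1.2
```

In the correlated setting analyzed in the abstract, the outlier locations shift away from the naive rank-one eigenvalue, which is exactly what the paper's theory predicts.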

... This simple training protocol is not sufficient in many applications, e.g. when it is required to learn memory states. To this end, training mechanisms based on output feedback [24,25] and online training [10,26] have been proposed, with successful applications in physics [27,28], complex systems modeling [29,30], and neuroscience [31], just to name a few. ...

A recurrent neural network (RNN) possesses the echo state property (ESP) if, for a given input sequence, it "forgets" any internal states of the driven (nonautonomous) system and asymptotically follows a unique, possibly complex trajectory. The lack of ESP is conventionally understood as a lack of reliable behaviour in RNNs. Here, we show that RNNs can reliably perform computations under a more general principle that accounts only for their local behaviour in phase space. To this end, we formulate a generalisation of the ESP and introduce an echo index to characterise the number of simultaneously stable responses of a driven RNN. We show that it is possible for the echo index to change with inputs, highlighting a potential source of computational errors in RNNs due to characteristics of the inputs driving the dynamics.

The neural mechanisms that generate an extensible library of motor motifs and flexibly string them into arbitrary sequences are unclear. We developed a model in which inhibitory basal ganglia output neurons project to thalamic units that are themselves bidirectionally connected to a recurrent cortical network. We model the basal ganglia inhibitory patterns as silencing some thalamic neurons while leaving others disinhibited and free to interact with cortex during specific motifs. We show that a small number of disinhibited thalamic neurons can control cortical dynamics to generate specific motor output in a noise-robust way. Additionally, a single “preparatory” thalamocortical network can produce fast cortical dynamics that support rapid transitions between any pair of learned motifs. If the thalamic units associated with each sequence component are segregated, many motor outputs can be learned without interference and then combined in arbitrary orders for the flexible production of long and complex motor sequences.

An emerging paradigm proposes that neural computations can be understood at the level of dynamical systems that govern low-dimensional trajectories of collective neural activity. How the connectivity structure of a network determines the emergent dynamical system however remains to be clarified. Here we consider a novel class of models, Gaussian-mixture low-rank recurrent networks, in which the rank of the connectivity matrix and the number of statistically-defined populations are independent hyper-parameters. We show that the resulting collective dynamics form a dynamical system, where the rank sets the dimensionality and the population structure shapes the dynamics. In particular, the collective dynamics can be described in terms of a simplified effective circuit of interacting latent variables. While having a single, global population strongly restricts the possible dynamics, we demonstrate that if the number of populations is large enough, a rank $R$ network can approximate any $R$-dimensional dynamical system.
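
The collapse onto low-dimensional latent dynamics can be illustrated numerically. This is an assumed rank-one sketch (single global population, n chosen parallel to m so that non-zero fixed points exist), not the paper's full Gaussian-mixture construction: with connectivity J = m nᵀ / N, trajectories of dx/dt = -x + J tanh(x) converge to the one-dimensional subspace spanned by m.

```python
import numpy as np

rng = np.random.default_rng(2)
N = 500
m = rng.normal(0, 1, N)
n = 2.5 * m                    # overlap between n and m opens non-zero fixed points
J = np.outer(m, n) / N         # rank-one connectivity

x = rng.normal(0, 1, N)        # random initial condition, mostly off the m axis
dt = 0.1
for _ in range(500):
    x = x + dt * (-x + J @ np.tanh(x))

# Fraction of activity variance captured by the direction m: components
# orthogonal to m decay as e^{-t}, so this should approach 1.
along_m = (x @ m) ** 2 / (m @ m)
frac = float(along_m / (x @ x))
```

After the transient, the state is described by the single latent variable kappa = (n · tanh(x)) / N, consistent with the claim that the rank sets the dimensionality of the collective dynamics.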

Autonomous, randomly coupled, neural networks display a transition to chaos at a critical coupling strength. Here, we investigate the effect of a time-varying input on the onset of chaos and the resulting consequences for information processing. Dynamic mean-field theory yields the statistics of the activity, the maximum Lyapunov exponent, and the memory capacity of the network. We find an exact condition that determines the transition from stable to chaotic dynamics and the sequential memory capacity in closed form. The input suppresses chaos by a dynamic mechanism, shifting the transition to significantly larger coupling strengths than predicted by local stability analysis. Beyond linear stability, a regime of coexistent locally expansive but nonchaotic dynamics emerges that optimizes the capacity of the network to store sequential input.
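
The autonomous transition to chaos referenced above can be reproduced with a hedged sketch of the classic random rate model (a Sompolinsky-style network, not necessarily the paper's exact parameters): for dx/dt = -x + g J tanh(x) with J_ij ~ N(0, 1/N), activity decays to the zero fixed point below the critical coupling (g < 1) and sustains chaotic fluctuations above it (g > 1).

```python
import numpy as np

rng = np.random.default_rng(1)
N, dt, steps = 200, 0.1, 2000
J = rng.normal(0.0, 1.0 / np.sqrt(N), (N, N))  # spectral radius ~ 1

def late_time_variance(g):
    x = rng.normal(0.0, 1.0, N)                # random initial condition
    for _ in range(steps):                     # Euler integration to t = 200
        x = x + dt * (-x + g * J @ np.tanh(x))
    return float(np.var(x))

var_sub = late_time_variance(0.5)    # subcritical: decays to zero
var_super = late_time_variance(1.8)  # supercritical: sustained fluctuations
```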

We present a simple model for coherent, spatially correlated chaos in a recurrent neural network. Networks of randomly connected neurons exhibit chaotic fluctuations and have been studied as a model for capturing the temporal variability of cortical activity. The dynamics generated by such networks, however, are spatially uncorrelated and do not generate coherent fluctuations, which are commonly observed across spatial scales of the neocortex. In our model we introduce a structured component of connectivity, in addition to random connections, which effectively embeds a feedforward structure via unidirectional coupling between a pair of orthogonal modes. Local fluctuations driven by the random connectivity are summed by an output mode and drive coherent activity along an input mode. The orthogonality between input and output mode preserves chaotic fluctuations even as coherence develops. In the regime of weak structured connectivity we apply a perturbative approach to solve the dynamic mean-field equations, showing that in this regime coherent fluctuations are driven passively by the chaos of local residual fluctuations. Strikingly, the chaotic dynamics are not subdued even by very strong structured connectivity if we add a row balance constraint on the random connectivity. In this regime the system displays longer time-scales and switching-like activity reminiscent of “Up-Down” states observed in cortical circuits. The level of coherence grows with increasing strength of structured connectivity until the dynamics are almost entirely constrained to a single spatial mode. We describe how in this regime the model achieves intermittent self-tuned criticality in which the coherent component of the dynamics self-adjusts to yield periods of slow chaos. 
Furthermore, we show how the dynamics depend qualitatively on the particular realization of the connectivity matrix: a complex leading eigenvalue can yield coherent oscillatory chaos while a real leading eigenvalue can yield chaos with broken symmetry. We examine the effects of network-size scaling and show that these results are not finite-size effects. Finally, we show that in the regime of weak structured connectivity, coherent chaos emerges also for a generalized structured connectivity with multiple input-output modes.
Author Summary
Neural activity observed in the neocortex is temporally variable, displaying irregular temporal fluctuations at every accessible level of measurement. Furthermore, these temporal fluctuations are often found to be spatially correlated whether at the scale of local measurements such as membrane potentials and spikes, or global measurements such as EEG and fMRI. A thriving field of study has developed models of recurrent networks which intrinsically generate irregular temporal variability, the paradigmatic example being networks of randomly connected rate neurons which exhibit chaotic dynamics. These models have been examined analytically and numerically in great detail, yet until now the intrinsic variability generated by these networks have been spatially uncorrelated, yielding no large-scale coherent fluctuations. Here we present a simple model of a recurrent network of firing-rate neurons that intrinsically generates spatially correlated activity yielding coherent fluctuations across the entire network. The model incorporates random connections and adds a structured component of connectivity that sums network activity over a spatial “output” mode and projects it back to the network along an orthogonal “input” mode. We show that this form of structured connectivity is a general mechanism for producing coherent chaos.

Spiking activity of neurons engaged in learning and performing a task shows complex spatiotemporal dynamics. While the outputs of recurrent network models can be trained to perform various tasks, the possible range of recurrent dynamics that emerge after learning remains unknown. Here we show that modifying the recurrent connectivity with a recursive least squares algorithm provides sufficient flexibility for the synaptic and spiking rate dynamics of spiking networks to produce a wide range of spatiotemporal activity. We apply the training method to learn arbitrary firing patterns, stabilize irregular spiking activity of a balanced network, and reproduce the heterogeneous spiking rate patterns of cortical neurons engaged in motor planning and movement. We identify sufficient conditions for successful learning, characterize two types of learning errors, and assess the network capacity. Our findings show that synaptically coupled recurrent spiking networks possess a vast computational capability that can support the diverse activity patterns in the brain.

Musicians can perform at different tempos, speakers can control the cadence of their speech, and children can flexibly vary their temporal expectations of events. To understand the neural basis of such flexibility, we recorded from the medial frontal cortex of nonhuman primates trained to produce different time intervals with different effectors. Neural responses were heterogeneous, nonlinear, and complex, and they exhibited a remarkable form of temporal invariance: firing rate profiles were temporally scaled to match the produced intervals. Recording from downstream neurons in the caudate and from thalamic neurons projecting to the medial frontal cortex indicated that this phenomenon originates within cortical networks. Recurrent neural network models trained to perform the task revealed that temporal scaling emerges from nonlinearities in the network and that the degree of scaling is controlled by the strength of external input. These findings demonstrate a simple and general mechanism for conferring temporal flexibility upon sensorimotor and cognitive functions.

It had previously been shown that generic cortical microcircuit models can perform complex real-time computations on continuous input streams, provided that these computations can be carried out with a rapidly fading memory. We investigate in this article the computational capability of such circuits in the more realistic case where not only readout neurons, but in addition a few neurons within the circuit have been trained for specific tasks. This is essentially equivalent to the case where the output of trained readout neurons is fed back into the circuit. We show that this new model overcomes the limitation of a rapidly fading memory. In fact, we prove that in the idealized case without noise it can carry out any conceivable digital or analog computation on time-varying inputs. But even with noise the resulting computational model can perform a large class of biologically relevant real-time computations that require a non-fading memory. We demonstrate these computational implications of feedback both theoretically and through computer simulations of detailed cortical microcircuit models. We show that the application of simple learning procedures (such as linear regression or perceptron learning) enables such circuits, in spite of their complex inherent dynamics, to represent time over behaviorally relevant long time spans, to integrate evidence from incoming spike trains over longer periods of time, and to process new information contained in such spike trains in diverse ways according to the current internal state of the circuit. In particular we show that such generic cortical microcircuits with feedback provide a new model for working memory that is consistent with a large set of biological constraints. We have shown that feedback increases significantly the computational power of neural circuits. 
Although this article examines primarily the computational role of feedback in circuits of neurons, the mathematical principles on which its analysis is based apply to a large variety of dynamical systems. Hence they may also throw new light on the computational role of feedback in other complex biological dynamical systems, such as for example genetic regulatory networks.

Recurrent networks of non-linear units display a variety of dynamical regimes depending on the structure of their synaptic connectivity. A particularly remarkable phenomenon is the appearance of strongly fluctuating, chaotic activity in networks of deterministic, but randomly connected rate units. How this type of intrinsically generated fluctuations appears in more realistic networks of spiking neurons has been a long-standing question. To ease the comparison between rate and spiking networks, recent works investigated the dynamical regimes of randomly-connected rate networks with segregated excitatory and inhibitory populations, and firing rates constrained to be positive. These works derived general dynamical mean field (DMF) equations describing the fluctuating dynamics, but solved these equations only in the case of purely inhibitory networks. Using a simplified excitatory-inhibitory architecture in which DMF equations are more easily tractable, here we show that the presence of excitation qualitatively modifies the fluctuating activity compared to purely inhibitory networks. In the presence of excitation, intrinsically generated fluctuations induce a strong increase in mean firing rates, a phenomenon that is much weaker in purely inhibitory networks. Excitation moreover induces two different fluctuating regimes: for moderate overall coupling, recurrent inhibition is sufficient to stabilize fluctuations; for strong coupling, firing rates are stabilized solely by the upper bound imposed on activity, even if inhibition is stronger than excitation. These results extend to more general network architectures, and to rate networks receiving noisy inputs mimicking spiking activity. Finally, we show that signatures of the second dynamical regime appear in networks of integrate-and-fire neurons.

Learning a task induces connectivity changes in neural circuits, thereby changing their dynamics. To elucidate task-related neural dynamics we study trained Recurrent Neural Networks. We develop a Mean Field Theory for Reservoir Computing networks trained to have multiple fixed-point attractors. Our main result is that the dynamics of the network's output in the vicinity of attractors is governed by a low-order linear Ordinary Differential Equation. Stability of the resulting ODE can be assessed, predicting training success or failure. Furthermore, a characteristic time constant, which remains finite at the edge of chaos, offers an explanation of the network's output robustness in the presence of variability of the internal neural dynamics. Finally, the proposed theory predicts state-dependent frequency selectivity in the network response.

Firing patterns in the central nervous system often exhibit strong temporal irregularity and heterogeneity in their time-averaged response properties. Previous studies suggested that these properties are the outcome of intrinsic chaotic dynamics. Indeed, simplified rate-based large neuronal networks with random synaptic connections are known to exhibit a sharp transition from fixed-point to chaotic dynamics when the synaptic gain is increased. However, the existence of a similar transition in neuronal circuit models with more realistic architectures and firing dynamics has not been established.

In this work we investigate rate-based dynamics of neuronal circuits composed of several subpopulations and random connectivity. Nonzero connections are either positive, for excitatory neurons, or negative, for inhibitory ones, while single-neuron output is strictly positive, in line with known constraints in many biological systems. Using Dynamic Mean Field Theory, we find the phase diagram depicting the regimes of stable fixed points, unstable dynamics, and chaotic rate fluctuations. We characterize the properties of systems near the chaotic transition and show that dilute excitatory-inhibitory architectures exhibit the same onset of chaos as a network with Gaussian connectivity. Interestingly, the critical properties near the transition depend on the shape of the single-neuron input-output transfer function near the firing threshold. Finally, we investigate network models with spiking dynamics. When synaptic time constants are slow relative to the mean inverse firing rates, the network undergoes a sharp transition from fast spiking fluctuations and static firing rates to a state with slow chaotic rate fluctuations. When the synaptic time constants are finite, the transition becomes smooth and obeys scaling properties, similar to crossover phenomena in statistical mechanics.

The brain exhibits temporally complex patterns of activity with features similar to those of chaotic systems. Theoretical studies over the last twenty years have described various computational advantages for such regimes in neuronal systems. Nevertheless, it still remains unclear whether chaos requires specific cellular properties or network architectures, or whether it is a generic property of neuronal circuits. We investigate the dynamics of networks of excitatory-inhibitory (EI) spiking neurons with random sparse connectivity operating in the regime of balance of excitation and inhibition. Combining Dynamical Mean-Field Theory with numerical simulations, we show that chaotic, asynchronous firing rate fluctuations emerge generically for sufficiently strong synapses. Two different mechanisms can lead to these chaotic fluctuations. One mechanism relies on slow I-I inhibition which gives rise to slow subthreshold voltage and rate fluctuations. The decorrelation time of these fluctuations is proportional to the time constant of the inhibition. The second mechanism relies on the recurrent E-I-E feedback loop. It requires slow excitation but the inhibition can be fast. In the corresponding dynamical regime all neurons exhibit rate fluctuations on the time scale of the excitation. Another feature of this regime is that the population-averaged firing rate is substantially smaller in the excitatory population than in the inhibitory population. This is not necessarily the case in the I-I mechanism. Finally, we discuss the neurophysiological and computational significance of our results.

Providing the neurobiological basis of information processing in higher animals, spiking neural networks must be able to learn a variety of complicated computations, including the generation of appropriate, possibly delayed reactions to inputs and the self-sustained generation of complex activity patterns, e.g. for locomotion. Many such computations require the prior building of intrinsic world models. Here we show how spiking neural networks may solve these different tasks. Firstly, we derive constraints under which classes of spiking neural networks lend themselves as substrates of powerful general-purpose computing. The networks contain dendritic or synaptic nonlinearities and have a constrained connectivity. We then combine such networks with learning rules for outputs or recurrent connections. We show that this allows the learning of even difficult benchmark tasks, such as the self-sustained generation of desired low-dimensional chaotic dynamics or memory-dependent computations. Furthermore, we show how spiking networks can build models of external world systems and use the acquired knowledge to control them.

Deep learning allows computational models that are composed of multiple processing layers to learn representations of data with multiple levels of abstraction. These methods have dramatically improved the state-of-the-art in speech recognition, visual object recognition, object detection and many other domains such as drug discovery and genomics. Deep learning discovers intricate structure in large data sets by using the backpropagation algorithm to indicate how a machine should change its internal parameters that are used to compute the representation in each layer from the representation in the previous layer. Deep convolutional nets have brought about breakthroughs in processing images, video, speech and audio, whereas recurrent nets have shone light on sequential data such as text and speech.

Two observations about the cortex have puzzled neuroscientists for a long time. First, neural responses are highly variable. Second, the level of excitation and inhibition received by each neuron is tightly balanced at all times. Here, we demonstrate that both properties are necessary consequences of neural networks that represent information efficiently in their spikes. We illustrate this insight with spiking networks that represent dynamical variables. Our approach is based on two assumptions: We assume that information about dynamical variables can be read out linearly from neural spike trains, and we assume that neurons only fire a spike if that improves the representation of the dynamical variables. Based on these assumptions, we derive a network of leaky integrate-and-fire neurons that is able to implement arbitrary linear dynamical systems. We show that the membrane voltage of the neurons is equivalent to a prediction error about a common population-level signal. Among other things, our approach allows us to construct an integrator network of spiking neurons that is robust against many perturbations. Most importantly, neural variability in our networks cannot be equated to noise. Despite exhibiting the same single unit properties as widely used population code models (e.g. tuning curves, Poisson distributed spike trains), balanced networks are orders of magnitude more reliable. Our approach suggests that spikes do matter when considering how the brain computes, and that the reliability of cortical representations could have been strongly underestimated.

Prefrontal cortex is thought to have a fundamental role in flexible, context-dependent behaviour, but the exact nature of the computations underlying this role remains largely unknown. In particular, individual prefrontal neurons often generate remarkably complex responses that defy deep understanding of their contribution to behaviour. Here we study prefrontal cortex activity in macaque monkeys trained to flexibly select and integrate noisy sensory inputs towards a choice. We find that the observed complexity and functional roles of single neurons are readily understood in the framework of a dynamical process unfolding at the level of the population. The population dynamics can be reproduced by a trained recurrent neural network, which suggests a previously unknown mechanism for selection and integration of task-relevant inputs. This mechanism indicates that selection and integration are two aspects of a single dynamical process unfolding within the same prefrontal circuits, and potentially provides a novel, general framework for understanding context-dependent computations.

The brain's ability to tell time and produce complex spatiotemporal motor patterns is critical for anticipating the next ring of a telephone or playing a musical instrument. One class of models proposes that these abilities emerge from dynamically changing patterns of neural activity generated in recurrent neural networks. However, the relevant dynamic regimes of recurrent networks are highly sensitive to noise; that is, chaotic. We developed a firing rate model that tells time on the order of seconds and generates complex spatiotemporal patterns in the presence of high levels of noise. This is achieved through the tuning of the recurrent connections. The network operates in a dynamic regime that exhibits coexisting chaotic and locally stable trajectories. These stable patterns function as 'dynamic attractors' and provide a feature that is characteristic of biological systems: the ability to 'return' to the pattern being generated in the face of perturbations.

Recurrent neural networks (RNNs) are useful tools for learning nonlinear relationships between time-varying inputs and outputs with complex temporal dependencies. Recently developed algorithms have been successful at training RNNs to perform a wide variety of tasks, but the resulting networks have been treated as black boxes: their mechanism of operation remains unknown. Here we explore the hypothesis that fixed points, both stable and unstable, and the linearized dynamics around them, can reveal crucial aspects of how RNNs implement their computations. Further, we explore the utility of linearization in areas of phase space that are not true fixed points but merely points of very slow movement. We present a simple optimization technique that is applied to trained RNNs to find the fixed and slow points of their dynamics. Linearization around these slow regions can be used to explore, or reverse-engineer, the behavior of the RNN. We describe the technique, illustrate it using simple examples, and finally showcase it on three high-dimensional RNN examples: a 3-bit flip-flop device, an input-dependent sine wave generator, and a two-point moving average. In all cases, the mechanisms of trained networks could be inferred from the sets of fixed and slow points and the linearized dynamics around them.
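
The fixed- and slow-point technique described above amounts to minimizing a "kinetic energy" of the dynamics. Below is a hedged sketch applied to an assumed toy network (random, weakly coupled) rather than a trained RNN: for dx/dt = F(x) = -x + J tanh(x), gradient descent on q(x) = ½|F(x)|² finds points with q ≈ 0 (fixed points; local minima with small nonzero q would be slow points), and the linearization at the solution classifies its stability.

```python
import numpy as np

rng = np.random.default_rng(3)
N = 20
J = rng.normal(0, 0.2 / np.sqrt(N), (N, N))  # weak coupling: a stable toy net

def F(x):
    return -x + J @ np.tanh(x)

def jacobian(x):
    # dF/dx = -I + J diag(sech^2(x)); broadcasting scales column j by sech^2(x_j)
    return -np.eye(N) + J / np.cosh(x) ** 2

x = rng.normal(0, 1, N)
for _ in range(2000):                 # gradient descent on q(x) = 0.5 |F(x)|^2
    f = F(x)
    x = x - 0.1 * jacobian(x).T @ f   # grad q = Jac(x)^T F(x)

q_final = float(0.5 * F(x) @ F(x))    # ~0 at a true fixed point
max_re = float(np.max(np.linalg.eigvals(jacobian(x)).real))  # <0: stable
```

In this weakly coupled regime the only fixed point is the origin and it is stable; on a trained RNN the same descent, started from many points along trajectories, maps out the full set of fixed and slow points.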

To answer the questions of how information about the physical world is sensed, in what form information is remembered, and how information retained in memory influences recognition and behavior, a theory is developed for a hypothetical nervous system called a perceptron. The theory serves as a bridge between biophysics and psychology. It is possible to predict learning curves from neurological variables and vice versa. The quantitative statistical approach is fruitful in the understanding of the organization of cognitive systems.

Output feedback is crucial for autonomous and parameterized pattern generation with reservoir networks. Read-out learning can lead to error amplification in these settings, and therefore regularization is important both for generalization and for the reduction of error amplification. We show that regularization of the inner reservoir network mitigates parameter dependencies and boosts task-specific performance.

Neuronal activity arises from an interaction between ongoing firing generated spontaneously by neural circuits and responses driven by external stimuli. Using mean-field analysis, we ask how a neural network that intrinsically generates chaotic patterns of activity can remain sensitive to extrinsic input. We find that inputs not only drive network responses, but they also actively suppress ongoing activity, ultimately leading to a phase transition in which chaos is completely eliminated. The critical input intensity at the phase transition is a nonmonotonic function of stimulus frequency, revealing a "resonant" frequency at which the input is most effective at suppressing chaos even though the power spectrum of the spontaneous activity peaks at zero and falls exponentially. A prediction of our analysis is that the variance of neural responses should be most strongly suppressed at frequencies matching the range over which many sensory systems operate.

Computational properties of use to biological organisms or to the construction of computers can emerge as collective properties of systems having a large number of simple equivalent components (or neurons). The physical meaning of content-addressable memory is described by an appropriate phase space flow of the state of a system. A model of such a system is given, based on aspects of neurobiology but readily adapted to integrated circuits. The collective properties of this model produce a content-addressable memory which correctly yields an entire memory from any subpart of sufficient size. The algorithm for the time evolution of the state of the system is based on asynchronous parallel processing. Additional emergent collective properties include some capacity for generalization, familiarity recognition, categorization, error correction, and time sequence retention. The collective properties are only weakly sensitive to details of the modeling or the failure of individual devices.
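
The content-addressable memory described above can be demonstrated with a minimal Hopfield-style sketch (assumed ±1 units, Hebbian outer-product weights, and synchronous sweeps in place of the asynchronous updates of the original model): a full memory is recovered from a corrupted subpart.

```python
import numpy as np

rng = np.random.default_rng(4)
N, P = 100, 3
patterns = rng.choice([-1.0, 1.0], size=(P, N))
W = patterns.T @ patterns / N      # Hebbian outer-product learning rule
np.fill_diagonal(W, 0.0)           # no self-connections

cue = patterns[0].copy()
cue[:30] = rng.choice([-1.0, 1.0], 30)  # corrupt 30% of the stored pattern

s = cue
for _ in range(5):                 # a few update sweeps suffice at low load
    s = np.sign(W @ s)
    s[s == 0] = 1.0                # break ties deterministically

overlap = float(s @ patterns[0]) / N   # 1.0 means perfect recall
```

At this low memory load (P/N = 0.03), the corrupted cue lies well inside the basin of attraction of the stored pattern, so the overlap returns to nearly 1.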

Recurrent neural networks can be used to map input sequences to output sequences, such as for recognition, production or prediction problems. However, practical difficulties have been reported in training recurrent neural networks to perform tasks in which the temporal contingencies present in the input/output sequences span long intervals. We show why gradient based learning algorithms face an increasingly difficult problem as the duration of the dependencies to be captured increases. These results expose a trade-off between efficient learning by gradient descent and latching on information for long periods. Based on an understanding of this problem, alternatives to standard gradient descent are considered.
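
The core difficulty is easy to reproduce numerically. The sketch below (network size, horizon, and the contractive weight scale are arbitrary illustrative choices, not from this article) backpropagates a gradient through many steps of a tanh recurrence; because every step multiplies the gradient by a Jacobian of norm below one, the long-range signal vanishes exponentially:

```python
import numpy as np

rng = np.random.default_rng(6)
n, T = 50, 100
W = 0.4 * rng.normal(0, 1 / np.sqrt(n), (n, n))  # contractive recurrent weights

# forward pass of h_t = tanh(W h_{t-1}), storing states for backprop
h = rng.normal(0, 1, n)
states = []
for _ in range(T):
    h = np.tanh(W @ h)
    states.append(h)

# backpropagate a unit gradient through all T steps:
# each step applies the Jacobian W^T diag(1 - h_t^2)
grad = np.ones(n)
for h_t in reversed(states):
    grad = W.T @ ((1 - h_t ** 2) * grad)

print(np.linalg.norm(grad))  # exponentially small after T steps
```

Driving the spectral norm of `W` above one instead produces the opposite hazard, exploding gradients; the trade-off described above is between these two regimes.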

Echo state networks (ESN) are a novel approach to recurrent neural network training. An ESN consists of a large, fixed, recurrent "reservoir" network, from which the desired output is obtained by training suitable output connection weights. Determination of optimal output weights becomes a linear, uniquely solvable task of MSE minimization. This article reviews the basic ideas and describes an online adaptation scheme based on the RLS algorithm known from adaptive linear systems. As an example, a 10th-order NARMA system is adaptively identified. The known benefits of the RLS algorithms carry over from linear systems to nonlinear ones; specifically, the convergence rate and misadjustment can be determined at design time.
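
As a rough sketch of such an online scheme (reservoir size, forgetting factor, and the sinusoidal input/target pair are arbitrary toy choices, not taken from the article), the following adapts the output weights of a small echo state network with RLS:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 100                                      # reservoir size (arbitrary)
W = rng.normal(0, 0.9 / np.sqrt(N), (N, N))  # fixed random reservoir
w_in = rng.normal(0, 0.5, N)                 # fixed input weights
w_out = np.zeros(N)                          # trainable readout
P = 100.0 * np.eye(N)                        # RLS inverse-correlation estimate
lam = 0.999                                  # forgetting factor

x = np.zeros(N)
errs = []
for t in range(2000):
    u = np.sin(0.1 * t)                      # toy input signal
    target = 0.5 * np.sin(0.1 * t - 0.2)     # toy teacher signal
    x = np.tanh(W @ x + w_in * u)            # reservoir state update
    e = target - w_out @ x                   # a-priori readout error
    errs.append(abs(e))
    # RLS update of the readout weights only
    k = P @ x / (lam + x @ P @ x)
    P = (P - np.outer(k, x @ P)) / lam
    w_out = w_out + e * k

print(np.mean(errs[-200:]))                  # small residual after adaptation
```

Only `w_out` is trained; `W` and `w_in` stay fixed, which is what makes determining the output weights a linear problem.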

Large-scale recordings of neural activity in behaving animals have established that the transformation of sensory stimuli into motor outputs relies on low-dimensional dynamics at the population level, while individual neurons generally exhibit complex, mixed selectivity. Understanding how low-dimensional computations on mixed, distributed representations emerge from the structure of the recurrent connectivity and inputs to cortical networks is a major challenge. Classical models of recurrent networks fall into two extremes: on the one hand, balanced networks are based on fully random connectivity and generate high-dimensional spontaneous activity, while on the other, strongly structured, clustered networks lead to low-dimensional dynamics and ad-hoc computations but rely on pure selectivity. A number of functional approaches for training recurrent networks, however, suggest that a specific type of minimal connectivity structure is sufficient to implement a large range of computations. Starting from this observation, here we study a new class of recurrent network models in which the connectivity consists of a combination of a random part and a minimal, low-dimensional structure. We show that in such low-rank recurrent networks, the dynamics are low-dimensional and can be directly inferred from connectivity using a geometrical approach. We exploit this understanding to determine minimal connectivity structures required to implement specific computations. We find that the dynamical range and computational capacity of a network quickly increase with the dimensionality of the structure in the connectivity, so that a rank-two structure is already sufficient to implement a complex behavioral task such as context-dependent decision-making.
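
A minimal numerical illustration of this geometrical picture (a pure rank-one network with no random bulk; the vectors and their overlap are arbitrary choices): the steady state collapses onto the direction of the left connectivity vector, so the dynamics reduce to a single collective variable.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 500
m = rng.normal(0, 1, N)                  # left connectivity vector
n = 2 * m + 0.5 * rng.normal(0, 1, N)    # right vector, overlapping with m
J = np.outer(m, n) / N                   # rank-one connectivity, random bulk omitted

x = rng.normal(0, 1, N)
for _ in range(500):
    x = x + 0.1 * (-x + J @ np.tanh(x))  # Euler steps of the rate dynamics

# the dynamics are one-dimensional: x converges to kappa * m
kappa = n @ np.tanh(x) / N
print(np.allclose(x, kappa * m, atol=1e-3))
```

The scalar `kappa` plays the role of the collective variable that mean-field analyses of such networks track; adding a random bulk to `J` perturbs but does not destroy this picture at moderate coupling.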

Networks of randomly connected neurons are among the most popular models in theoretical neuroscience. The connectivity between neurons in the cortex is, however, not fully random: the simplest and most prominent deviation from randomness found in experimental data is the overrepresentation of bidirectional connections among pyramidal cells. Using numerical and analytical methods, we investigated the effects of partially symmetric connectivity on dynamics in networks of rate units. We considered the two dynamical regimes exhibited by random neural networks: the weak-coupling regime, where the firing activity decays to a single fixed point unless the network is stimulated, and the strong-coupling or chaotic regime, characterized by internally generated fluctuating firing rates. In the weak-coupling regime, we computed analytically for an arbitrary degree of symmetry the autocorrelation of network activity in the presence of external noise. In the chaotic regime, we performed simulations to determine the timescale of the intrinsic fluctuations. In both cases, symmetry increases the characteristic asymptotic decay time of the autocorrelation function and therefore slows down the dynamics in the network.

Recurrent neural networks (RNNs) are a class of computational models that are often used as a tool to explain neurobiological phenomena, considering anatomical, electrophysiological and computational constraints.
RNNs can either be designed to implement a certain dynamical principle, or they can be trained by input–output examples. Recently, there has been substantial progress in utilizing trained RNNs both for computational tasks and as explanations of neural phenomena. I will review how combining trained RNNs with reverse engineering can provide an alternative framework for modeling in neuroscience, potentially serving as a powerful hypothesis generation tool.
Despite the recent progress and potential benefits, there are many fundamental gaps towards a theory of these networks. I will discuss these challenges and possible methods to attack them.

Sequential activation of neurons is a common feature of network activity during a variety of behaviors, including working memory and decision making. Previous network models for sequences and memory emphasized specialized architectures in which a principled mechanism is pre-wired into their connectivity. Here we demonstrate that, starting from random connectivity and modifying a small fraction of connections, a largely disordered recurrent network can produce sequences and implement working memory efficiently. We use this process, called Partial In-Network Training (PINning), to model and match cellular resolution imaging data from the posterior parietal cortex during a virtual memory-guided two-alternative forced-choice task. Analysis of the connectivity reveals that sequences propagate by the cooperation between recurrent synaptic interactions and external inputs, rather than through feedforward or asymmetric connections. Together our results suggest that neural sequences may emerge through learning from largely unstructured network architectures.

Computational properties of use to biological organisms or to the construction of computers can emerge as collective properties of systems having a large number of simple equivalent components (or neurons). The physical meaning of content-addressable memory is described by an appropriate phase space flow of the state of a system. A model of such a system is given, based on aspects of neurobiology but readily adapted to integrated circuits. The collective properties of this model produce a content-addressable memory which correctly yields an entire memory from any subpart of sufficient size. The algorithm for the time evolution of the state of the system is based on asynchronous parallel processing. Additional emergent collective properties include some capacity for generalization, familiarity recognition, categorization, error correction, and time sequence retention. The collective properties are only weakly sensitive to details of the modeling or the failure of individual devices.
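
The content-addressable property is straightforward to reproduce in a minimal sketch (network size, number of stored patterns, and corruption level are arbitrary choices): store a few binary patterns with the Hebbian outer-product rule, corrupt one, and let asynchronous updates restore it.

```python
import numpy as np

rng = np.random.default_rng(2)
N = 200
patterns = rng.choice([-1, 1], size=(3, N))       # three stored memories

# Hebbian storage rule: W_ij = (1/N) sum_mu p^mu_i p^mu_j, no self-coupling
W = patterns.T @ patterns / N
np.fill_diagonal(W, 0)

# corrupt a stored pattern, then run asynchronous threshold updates
state = patterns[0].copy()
flip = rng.choice(N, size=30, replace=False)
state[flip] *= -1                                 # 15% of bits corrupted

for _ in range(5):                                # a few full sweeps
    for i in rng.permutation(N):
        state[i] = 1 if W[i] @ state >= 0 else -1

print(np.array_equal(state, patterns[0]))         # entire memory recovered from a subpart
```

At this low storage load the corrupted cue lies well inside the basin of attraction, so the full pattern is retrieved; packing in many more patterns eventually destroys retrieval.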

Dynamical systems driven by strong external signals are ubiquitous in nature and engineering. Here we study "echo state networks," networks of a large number of randomly connected nodes, which represent a simple model of a neural network, and have important applications in machine learning. We develop a mean-field theory of echo state networks. The dynamics of the network are captured by an evolution law, similar to a logistic map, for a single collective variable. When the network is driven by many independent external signals, this collective variable reaches a steady state. But when the network is driven by a single external signal, the collective variable is nonstationary but can be characterized by its time-averaged distribution. The predictions of the mean-field theory, including the value of the largest Lyapunov exponent, are compared with the numerical integration of the equations of motion.

Stability problems associated with recursive least-squares (RLS) algorithms due to a lack of persistently exciting input data are considered. It is shown by an example that quantization of data and finite-word-length computations assist each other in destroying persistent excitation. A projection-operator formalism is used to interpret this effect for the square-root-factorized autocorrelation matrix estimator. This estimator is modified so that only the single dimension observed through the current input data is updated, thereby preventing divergence of unobserved modes. This new O(N^2) RLS algorithm with selective memory is computationally simple and stable even for small values of the forgetting factor.

In this paper, after giving definitions for a set of commonly used terms in recurrent neural networks (RNNs), all possible RNN architectures based on these definitions are enumerated and described. Then, most existing RNN architectures are categorized under these headings. Four general neural network architectures, in increasing degree of complexity, are introduced. It is shown that all the existing RNN architectures can be considered as special cases of the general RNN architectures. Furthermore, it is shown how these existing architectures can be transformed to the general RNN architectures. Some open issues concerning RNN architectures are discussed.

We investigate the dynamical behaviour of neural networks with asymmetric synaptic weights, in the presence of random thresholds. We inspect low-gain dynamics before using mean-field equations to study the bifurcations of the fixed points and the change of regime that occurs when varying control parameters. We infer different areas with various regimes, summarized by a bifurcation map in the parameter space. We numerically show the occurrence of chaos, which arises generically by a quasi-periodicity route. We then discuss some features of our system in relation to biological observations such as low firing rates and refractory periods.

It is known that if one perturbs a large iid random matrix by a bounded-rank error, then the majority of the eigenvalues will remain distributed according to the circular law. However, the bounded-rank perturbation may also create one or more outlier eigenvalues. We show that if the perturbation is small, then the outlier eigenvalues are created next to the outlier eigenvalues of the bounded-rank perturbation; but if the perturbation is large, then many more outliers can be created, and their law is governed by the zeroes of a random Laurent series with Gaussian coefficients. On the other hand, these outliers may be eliminated by enforcing a row sum condition on the final matrix.

Neural circuits display complex activity patterns both spontaneously and when responding to a stimulus or generating a motor output. How are these two forms of activity related? We develop a procedure called FORCE learning for modifying synaptic strengths either external to or within a model neural network to change chaotic spontaneous activity into a wide variety of desired activity patterns. FORCE learning works even though the networks we train are spontaneously chaotic and we leave feedback loops intact and unclamped during learning. Using this approach, we construct networks that produce a wide variety of complex output patterns, input-output transformations that require memory, multiple outputs that can be switched by control inputs, and motor patterns matching human motion capture data. Our results reproduce data on premovement activity in motor and premotor cortex, and suggest that synaptic plasticity may be a more rapid and powerful modulator of network activity than generally appreciated.
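
A minimal sketch of the RLS-based variant of FORCE training (network size, gain, and the sine target are arbitrary illustrative choices, not the authors' setup): the readout `z` is fed back into the network with the feedback loop left intact, and the output weights are updated at every step, so the error is kept small while learning proceeds.

```python
import numpy as np

rng = np.random.default_rng(3)
N, g, dt = 300, 1.5, 0.1
W = g * rng.normal(0, 1 / np.sqrt(N), (N, N))   # chaotic random bulk (g > 1)
w_fb = rng.uniform(-1, 1, N)                    # fixed feedback weights
w_out = np.zeros(N)                             # trained readout
P = np.eye(N)                                   # RLS inverse-correlation matrix

x = 0.5 * rng.normal(0, 1, N)
errs = []
for t in range(3000):
    f = np.sin(2 * np.pi * t * dt / 10)         # target: slow sine wave
    r = np.tanh(x)
    z = w_out @ r                               # readout, fed back into the network
    x = x + dt * (-x + W @ r + w_fb * z)
    # FORCE: recursive least-squares update applied at every step,
    # keeping the readout error small throughout learning
    e = z - f
    Pr = P @ r
    P -= np.outer(Pr, Pr) / (1 + r @ Pr)
    w_out -= e * (P @ r)
    errs.append(abs(e))

print(np.mean(errs[-500:]))                     # small tracking error late in training
```

The key feature illustrated here is that the feedback loop is never clamped: the network is driven by its own (initially wrong) output while the weights are corrected fast enough to keep the error small.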

A continuous-time dynamic model of a network of N nonlinear elements interacting via random asymmetric couplings is studied. A self-consistent mean-field theory, exact in the N → ∞ limit, predicts a transition from a stationary phase to a chaotic phase occurring at a critical value of the gain parameter. The autocorrelations of the chaotic flow as well as the maximal Lyapunov exponent are calculated.
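
The transition is easy to observe numerically. This sketch (sizes, integration scheme, and parameters are arbitrary choices) simulates dx/dt = -x + g J tanh(x) with Gaussian couplings scaled by 1/sqrt(N): below the critical gain g = 1 the activity decays to the zero fixed point, while above it the network sustains chaotic fluctuations.

```python
import numpy as np

def asymptotic_activity(g, N=400, T=2000, dt=0.1, seed=4):
    """Simulate dx/dt = -x + g*J*tanh(x) and return the late-time RMS activity."""
    rng = np.random.default_rng(seed)
    J = rng.normal(0, 1 / np.sqrt(N), (N, N))
    x = rng.normal(0, 1, N)
    for _ in range(T):
        x = x + dt * (-x + g * (J @ np.tanh(x)))
    return np.sqrt(np.mean(x ** 2))

print(asymptotic_activity(0.5))   # below the transition: activity decays to zero
print(asymptotic_activity(2.0))   # above: self-sustained fluctuations of order one
```

The same random matrix is used for both gains (same seed), so the difference in the asymptotic activity is due to the gain parameter alone.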

We study discrete parallel dynamics of a fully connected network of nonlinear elements interacting via long-range random asymmetric couplings under the influence of external noise. Using dynamical mean-field equations, which become exact in the thermodynamic limit, we calculate the activity and the maximal Lyapunov exponent of the network as functions of the nonlinearity (gain) parameter and the noise intensity.

We present a method for learning nonlinear systems, echo state networks (ESNs). ESNs employ artificial recurrent neural networks in a way that has recently been proposed independently as a learning mechanism in biological brains. The learning method is computationally efficient and easy to use. On a benchmark task of predicting a chaotic time series, accuracy is improved by a factor of 2400 over previous techniques. The potential for engineering applications is illustrated by equalizing a communication channel, where the signal error rate is improved by two orders of magnitude.

The dynamics of neural networks is influenced strongly by the spectrum of eigenvalues of the matrix describing their synaptic connectivity. In large networks, elements of the synaptic connectivity matrix can be chosen randomly from appropriate distributions, making results from random matrix theory highly relevant. Unfortunately, classic results on the eigenvalue spectra of random matrices do not apply to synaptic connectivity matrices because of the constraint that individual neurons are either excitatory or inhibitory. Therefore, we compute eigenvalue spectra of large random matrices with excitatory and inhibitory columns drawn from distributions with different means and equal or different variances.
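
A loose numerical check in the spirit of this computation (Gaussian entries, so the sign constraint holds only on average; all parameters are arbitrary): with balanced excitatory and inhibitory column means and zero row sums enforced, the bulk of the spectrum fills a disk whose radius is set by the variance of the couplings, not by their means.

```python
import numpy as np

rng = np.random.default_rng(5)
N, f = 500, 0.5                        # size and excitatory fraction (arbitrary)
sig = 1.0                              # common standard deviation

M = np.empty((N, N))
n_e = int(f * N)
M[:, :n_e] = rng.normal(+1.0, sig, (N, n_e))      # excitatory columns: positive mean
M[:, n_e:] = rng.normal(-1.0, sig, (N, N - n_e))  # inhibitory columns: negative mean
M -= M.mean(axis=1, keepdims=True)     # enforce zero row sums (eliminates outliers)
M /= np.sqrt(N)                        # standard 1/sqrt(N) scaling

eig = np.linalg.eigvals(M)
bulk_radius = np.quantile(np.abs(eig), 0.99)
print(bulk_radius)                     # close to sig: a disk of radius ~1
```

Enforcing the zero row-sum condition removes the eigenvalue outliers that the deterministic mean structure could otherwise create, leaving a spectrum governed by the variances alone.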

How to efficiently train recurrent networks remains a challenging and active research topic. Most of the proposed training approaches are based on computational ways to efficiently obtain the gradient of the error function, and can generally be grouped into five major classes. In this study we present a derivation that unifies these approaches. We demonstrate that the approaches are only five different ways of solving a particular matrix equation. The second goal of this paper is to develop a new algorithm based on the insights gained from the novel formulation. The new algorithm, which is based on approximating the error gradient, has lower computational complexity in computing the weight update than the competing techniques for most typical problems. In addition, it reaches the error minimum in a much smaller number of iterations. A desirable characteristic of recurrent network training algorithms is to be able to update the weights in an on-line fashion. We have also developed an on-line version of the proposed algorithm, which is based on updating the error gradient approximation in a recursive manner.

The convergence properties of a fairly general class of adaptive recursive least-squares algorithms are studied under the assumption that the data-generation mechanism is deterministic and time invariant. First, the (open-loop) identification case is considered. By a suitable notion of excitation subspace, the convergence analysis of the identification algorithm is carried out with no persistent-excitation hypothesis: it is proven that the projection of the parameter error on the excitation subspace tends to zero, while the orthogonal component of the error remains bounded. The convergence of an adaptive control scheme based on the minimum-variance control law is then dealt with. It is shown that, under the standard minimum-phase assumption, the tracking error converges to zero whenever the reference signal is bounded. Furthermore, the control variable turns out to be bounded.

Gradient descent algorithms in recurrent neural networks can have problems when the network dynamics experience bifurcations in the course of learning. The possible hazards caused by the bifurcations of the network dynamics and of the learning equations are investigated. The roles of teacher forcing, preprogramming of network structures, and approximate learning algorithms are discussed.

A Practical Guide to Applying Echo State Networks. Neural Networks: Tricks of the Trade. Lecture Notes in Computer Science

- M Lukosevicius

Lukosevicius, M. (2012). A Practical Guide to Applying Echo State Networks. In Neural Networks: Tricks of the Trade, Lecture Notes in Computer Science. Springer.

Learning recurrent neural networks with Hessian-free optimization

- J Martens
- I Sutskever

Martens, J. and Sutskever, I. (2011). Learning recurrent neural networks with Hessian-free optimization. ICML, pages 1033–1040.