
This paper considers on-line training of feedforward neural networks. Training examples are only available through sampling from a certain, possibly infinite, distribution. To make the learning process autonomous, one can employ the Extended Kalman Filter or stochastic steepest descent with adaptively adjusted step-sizes; here the latter is considered. A scheme of determining step-sizes is introduced that satisfies the following requirements: (i) it does not need any auxiliary problem-dependent parameters, (ii) it does not assume any particular loss function that the training process is intended to minimize, (iii) it makes the learning process stable and efficient. An experimental study with several approximation problems is presented, in which the approach is compared with the Extended Kalman Filter and LF I, with satisfactory results.


... But every method may have its limitations. In recent decades, other techniques have been developed to investigate stability, among which the fixed point method is one of the main alternatives [21][22][23][24][25][26][27]. For example, in 2015, Zhou utilized Brouwer's fixed point theorem to prove the existence and uniqueness of the equilibrium of hybrid BAM neural networks with proportional delays, and then constructed appropriate delay differential inequalities to derive the stability of the equilibrium [28]. ...

... Remark 1. In contrast to the methods of [28,29], this is the first time the contraction mapping principle is used to derive directly an LMI-based stability criterion for BAM neural networks, which is convenient for computer programming. Recently there have been many good results and methods [21][22][23][24][25][26][27][28][29] that inform the current work. In this paper, we propose an LMI-based criterion that is novel relative to the existing results published from 2013 to 2016 (see Remark 10 and Table 1). ...

... Hence, we have proved (36) and (34). And so we can conclude (27) from (28), (33), and (34). This means that condition (d) is satisfied, too. ...

The fixed point technique has been employed in the stability analysis of time-delay bidirectional associative memory (BAM) neural networks with impulses. By formulating a contraction mapping in a product space, a new LMI-based exponential stability criterion is derived. Recently, fixed point methods have produced various good results that inspired this work, but those criteria cannot be programmed on a computer. In this paper, the LMI conditions of the obtained result can be handled by the Matlab LMI toolbox, which meets the need for large-scale computation in real engineering. Moreover, a numerical example and a comparison table are presented to illustrate the effectiveness of the proposed methods.

... The prediction is based on the learned experience for a given motor control task. For the second part, we combine the actor-critic reinforcement learning algorithm with experience replay [Wawrzynski, 2009] and online step-size estimation [Wawrzynski and Papis, 2011] for real-time efficient learning. ...

... In this paper, we first briefly describe the fundamentals and important methods of solving the reinforcement learning problem in a model-free way. The challenges in its design lead us to present a real-time reinforcement learning algorithm that makes learning considerably more efficient by: 1) experience replay - using the robot's past experience to speed up computations [Wawrzynski, 2009], and 2) fixed point step-size estimation - autonomously estimating the learning rate parameters of the algorithm [Wawrzynski and Papis, 2011]. ...

... ξ_t is a sequence of data samples and the function g defines the direction of parameter improvement, given by the gradient ∇J(θ), i.e., E g(θ, ξ) = ∇J(θ). We apply the fixed-point algorithm for step-size estimation introduced in [Wawrzynski and Papis, 2011]. The step-size at every time instant t is estimated by computing two displacement vectors G_{t,n} and G*_{t,n} given by: ...

... In theory, the Boltzmann Machine requires performing computation on binary states and real-valued weights. Prior work, however, has shown that the Boltzmann machine can still solve a broad range of optimization and machine learning problems with a negligible loss in solution quality when the weights are represented in a fixed-point, multi-bit format [50,51]. Nevertheless, we expect that storing a large number of bits within each memristive storage element will prove difficult [52,38]. ...

... We observed that a 32-bit fixed point representation causes a negligible degradation (<1%) in the outcome of the optimization process and the accuracy of the learning tasks. This result confirms similar observations reported in prior work [50,51]. ...

... By computing gradient estimates always at two points, it captures global characteristics of the process, thereby obtaining robustness and independence from any parameters. In (Wawrzyński & Papis, 2011) this approach was specialized to on-line neural-network training, i.e., each weight of the network was assigned a separate step-size. Here the original algorithm is enhanced and adapted to provide autonomy to RL with experience replay. ...

... This creates large differences between g(θ_t, ξ_{t+i}) and g(θ_{t+i}, ξ_{t+i}), manifested by a large discrepancy between G_{t,n} and G*_{t,n}. The papers (Wawrzyński, 2010; Wawrzyński & Papis, 2011) present a statistical analysis that formalizes the above intuitive principles in a simple, unidimensional model. Since each g(θ_t, ξ_{t+i}) in (12) is an unbiased estimate of ∇J(θ_t), the value (1/n) G*_{t,n} is also an unbiased estimator of ∇J(θ_t), and its quality increases with n. ...
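The two-displacement comparison described in the excerpts above can be sketched in a toy setting. The adaptation rule below (halve the step when the trajectory displacement G disagrees strongly with the frozen-parameter displacement G*, otherwise grow it, with an arbitrary cap) is an illustrative stand-in for the published fixed-point scheme; the quadratic objective, window length, and all thresholds are assumptions, not values from the papers:

```python
import numpy as np

rng = np.random.default_rng(0)

def grad(theta, xi):
    # Noisy gradient of the toy objective J(theta) = 0.5 * ||theta||^2.
    return theta + xi

theta = np.ones(4)
step = 1.0   # deliberately too large at the start
n = 20       # window length for the displacement vectors

for _ in range(50):
    xs = rng.normal(scale=0.1, size=(n, 4))
    theta0 = theta.copy()
    G = np.zeros(4)       # displacement along the actual trajectory
    G_star = np.zeros(4)  # displacement with parameters frozen at theta0
    for xi in xs:
        g = grad(theta, xi)
        G += g
        G_star += grad(theta0, xi)
        theta = theta - step * g
    # Illustrative adaptation: a large discrepancy between G and G_star
    # signals that the step-size is too long; the 0.5/1.1/1.5 constants
    # are arbitrary safeguards, not the published rule.
    mismatch = np.linalg.norm(G - G_star) / (np.linalg.norm(G_star) + 1e-12)
    step = step * 0.5 if mismatch > 0.5 else min(step * 1.1, 1.5)

print(np.linalg.norm(theta), step)
```

Starting from a divergently large step, the mismatch test drives the step down until the iterate settles near the noise floor of the toy problem.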

This paper considers the issues of efficiency and autonomy that are required to make reinforcement learning suitable for real-life control tasks. A real-time reinforcement learning algorithm is presented that repeatedly adjusts the control policy with the use of previously collected samples, and autonomously estimates the appropriate step-sizes for the learning updates. The algorithm is based on the actor-critic with experience replay whose step-sizes are determined on-line by an enhanced fixed point algorithm for on-line neural network training. An experimental study with simulated octopus arm and half-cheetah demonstrates the feasibility of the proposed algorithm to solve difficult learning control problems in an autonomous way within reasonably short time.

In this paper the classic momentum algorithm for stochastic optimization is considered. A method is introduced that adjusts the coefficients of this algorithm during its operation. The method does not depend on any preliminary knowledge of the optimization problem. In the experimental study, the method is applied to on-line learning in feed-forward neural networks, including deep auto-encoders, and outperforms any fixed coefficients. The method eliminates coefficients that are difficult to determine yet have a profound influence on performance. While the method itself has some coefficients, they are easy to determine and the sensitivity of performance to them is low. Consequently, the method makes on-line learning a practically parameter-free process and broadens the area of potential application of this technology.

The running state of the fan has a significant influence on the safety and economy of a power plant unit, so it is necessary to monitor the fan's performance and running state in real time. According to basic fan theory, there is a stable, well-behaved nonlinear mapping between the inlet pressure difference and the flow, which can be utilized to monitor the flow of the fan. Thus, the fan differential pressure-flow curve model is established by an optimized BP neural network and a modified Support Vector Machine (SVM). The fitting error shows that the improved SVM model is better. Finally, on-line fan monitoring system software is built using Visual Basic (VB) and Matlab programming based on the improved SVM differential pressure-flow curve model, which can accurately monitor fan operation.

This paper considers on-line training of feedforward neural networks. Training examples are only available by random sampling from a given generator. What emerges in this setting is the problem of adapting step-sizes, or learning rates. A scheme of determining step-sizes is introduced here that satisfies the following requirements: (i) it does not need any auxiliary problem-dependent parameters, (ii) it does not assume any particular loss function that the training process is intended to minimize, (iii) it makes the learning process stable and efficient. An experimental study with the 2D Gabor function approximation is presented.

Reinforcement learning by direct policy gradient estimation is attractive in theory but in practice leads to notoriously ill-behaved optimization problems. We improve its robustness and speed of convergence with stochastic meta-descent, a gain vector adaptation method that employs fast Hessian-vector products. In our experiments the resulting algorithms outperform previously employed online stochastic, offline conjugate, and natural policy gradient methods.

In this article, we propose a method to adapt step-size parameters used in reinforcement learning for dynamic environments. In general reinforcement learning situations, a step-size parameter is decreased to zero during learning, because the environment is generally supposed to be noisy but stationary, such that the true expected rewards are fixed. On the other hand, we assume that in the real world the true expected reward changes over time and hence the learning agent must adapt to the change through continuous learning. We derive the higher-order derivatives of the exponential moving average (which is used to estimate the expected values of states or actions in major reinforcement learning methods) with respect to the step-size parameters. We also illustrate a mechanism to calculate these derivatives in a recursive manner. Using this mechanism, we construct a precise and flexible adaptation method for the step-size parameter in order to minimize square errors or maximize a certain criterion. The proposed method is validated both theoretically and experimentally.

This paper introduces a learning method for two-layer feedforward neural networks based on sensitivity analysis, which uses a linear training algorithm for each of the two layers. First, random values are assigned to the outputs of the first layer; later, these initial values are updated based on sensitivity formulas, which use the weights in each of the layers; the process is repeated until convergence. Since these weights are learnt solving a linear system of equations, there is an important saving in computational time. The method also gives the local sensitivities of the least square errors with respect to input and output data, with no extra computational cost, because the necessary information becomes available without extra calculations. This method, called the Sensitivity-Based Linear Learning Method, can also be used to provide an initial set of weights, which significantly improves the behavior of other learning algorithms. The theoretical basis for the method is given and its performance is illustrated by its application to several examples in which it is compared with several learning algorithms and well known data sets. The results have shown a learning speed generally faster than other existing methods. In addition, it can be used as an initialization tool for other well known methods with significant improvements.

We provide a variable metric stochastic approximation theory. In doing so, we provide a convergence theory for a large class of online variable metric methods, including the recently introduced online versions of the BFGS algorithm and its limited-memory LBFGS variant. We also discuss the implications of our results for learning from expert advice. Comment: Correction of Theorem 3.4 from the AISTATS 2009 article.

This paper investigates new learning algorithms (LF I and LF II) based on a Lyapunov function for the training of feedforward neural networks. It is observed that such algorithms have an interesting parallel with the popular backpropagation (BP) algorithm, where the fixed learning rate is replaced by an adaptive learning rate computed using a convergence theorem based on Lyapunov stability theory. LF II, a modified version of LF I, has been introduced with the aim of avoiding local minima. This modification also helps improve the convergence speed in some cases. Conditions for achieving the global minimum for this kind of algorithm have been studied in detail. The performances of the proposed algorithms are compared with the BP algorithm and extended Kalman filtering (EKF) on three benchmark function approximation problems: XOR, 3-bit parity, and 8-3 encoder. The comparisons are made in terms of the number of learning iterations and the computational time required for convergence. It is found that the proposed algorithms (LF I and II) converge much faster than the other two algorithms to attain the same accuracy. Finally, a comparison is made on a complex two-dimensional (2-D) Gabor function, and the effect of the adaptive learning rate on faster convergence is verified. In a nutshell, the investigations made in this paper help us better understand the learning procedure of feedforward neural networks in terms of adaptive learning rate, convergence speed, and local minima.

In this work, two modifications of the Levenberg-Marquardt (LM) algorithm for feedforward neural networks are studied. One modification is made to the performance index, while the other concerns the calculation of gradient information. The modified algorithm gives a better convergence rate than the standard LM method, is less computationally intensive, and requires less memory. The performance of the algorithm has been checked on several example problems.

Appropriate bias is widely viewed as the key to efficient learning and generalization. I present a new algorithm, the Incremental Delta-Bar-Delta (IDBD) algorithm, for the learning of appropriate biases based on previous learning experience. The IDBD algorithm is developed for the case of a simple, linear learning system---the LMS or delta rule with a separate learning-rate parameter for each input. The IDBD algorithm adjusts the learning-rate parameters, which are an important form of bias for this system. Because bias in this approach is adapted based on previous learning experience, the appropriate testbeds are drifting or non-stationary learning tasks. For particular tasks of this type, I show that the IDBD algorithm performs better than ordinary LMS and in fact finds the optimal learning rates. The IDBD algorithm extends and improves over prior work by Jacobs and by me in that it is fully incremental and has only a single free parameter. This paper also extends previous work by pr...
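The IDBD update for the LMS rule is compact enough to sketch directly: each input i keeps a log learning rate β_i, a weight w_i, and a memory trace h_i, and the rates are adapted by a meta-gradient step. A minimal version on a noiseless linear task; the meta-rate, initial rates, and clipping bounds are illustrative choices, not values from the paper:

```python
import numpy as np

rng = np.random.default_rng(1)
n_in = 5
w_true = rng.normal(size=n_in)      # target weights of a noiseless linear task

w = np.zeros(n_in)                  # LMS weights
beta = np.full(n_in, np.log(0.05))  # per-input log learning rates
h = np.zeros(n_in)                  # IDBD memory trace
theta_meta = 0.02                   # the single free (meta) parameter

for _ in range(3000):
    x = rng.normal(size=n_in)
    delta = w_true @ x - w @ x          # prediction error of the delta rule
    beta += theta_meta * delta * x * h  # meta-gradient step on the log rates
    alpha = np.exp(np.clip(beta, -10.0, 2.0))  # clip only for numerical safety
    w += alpha * delta * x              # LMS step with per-input rates
    h = h * np.clip(1.0 - alpha * x * x, 0.0, None) + alpha * delta * x

err = float(np.mean((w - w_true) ** 2))
print(err)
```

On a drifting task (the intended testbed), inputs that change often end up with larger α_i than inputs whose weights are stable.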

Contents: applications and issues; application to learning, state-dependent noise, and queueing; applications to signal processing and adaptive control; mathematical background; convergence with probability one - martingale difference noise; convergence with probability one - correlated noise; weak convergence - introduction; weak convergence methods for general algorithms; applications - proofs of convergence; rate of convergence; averaging of the iterates; distributed/decentralized and asynchronous algorithms.

To understand the substituent effects of 3-picoline derivatives on reaction equilibrium, the interactions between a series of 3-picoline-like ligands and [OV(O2)2(D2O)]−/[OV(O2)2(HOD)]− in solution were explored by multinuclear (1H, 13C, and 51V) magnetic resonance, COSY, and HSQC in 0.15 mol L−1 NaCl ionic medium for mimicking physiological conditions. The relative reactivity among the 3-picoline derivatives is 3-methyl pyridine > nicotinate > nicotinamide > ethyl nicotinate. Competitive coordination results in the formation of a series of new six-coordinated peroxovanadate species [OV(O2)2L]n− (L = 3-picoline derivatives, n = 1 or 2). Density functional calculations provide a reasonable explanation of the relative reactivity of the 3-picoline derivatives. Solvation effects play an important role in these reactions.

A new type of attractor—terminal attractors—for content-addressable memory, associative memory, and pattern recognition in artificial neural networks operating in continuous time is introduced. The idea of a terminal attractor is based upon a violation of the Lipschitz condition at a fixed point. As a result, the fixed point becomes a singular solution which envelopes the family of regular solutions, while each regular solution approaches such an attractor in finite time. It will be shown that terminal attractors can be incorporated into neural networks such that any desired set of these attractors with prescribed basins is provided by an appropriate selection of the synaptic weights. The applications of terminal attractors for content-addressable and associative memories, pattern recognition, self-organization, and for dynamical training are illustrated.
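The finite-time behavior that defines a terminal attractor is easy to verify numerically. For the scalar system dx/dt = −x^{1/3}, the quantity x^{2/3} decreases at the constant rate 2/3, so the origin is reached at t* = (3/2)·x₀^{2/3}, unlike dx/dt = −x, which only converges asymptotically. A forward-Euler check (the step size is an arbitrary choice):

```python
# Terminal attractor: dx/dt = -x**(1/3) violates the Lipschitz condition
# at x = 0, so the fixed point is reached in finite time.
# Closed form: d(x**(2/3))/dt = -2/3, hence x hits 0 at t* = 1.5 * x0**(2/3).
x0 = 1.0
t_star = 1.5 * x0 ** (2.0 / 3.0)   # = 1.5 for x0 = 1

dt = 1e-4
x, t = x0, 0.0
while x > 0.0 and t < 2.0:
    x -= dt * x ** (1.0 / 3.0)     # forward Euler; exits once x overshoots 0
    t += dt

print(t_star, t)                    # the numerical hitting time tracks t_star
```

The loop terminates well before the t = 2.0 guard, confirming convergence in finite time rather than in the limit.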

The major drawbacks of the backpropagation algorithm are local minima and slow convergence. This paper presents an efficient technique, ANMBP, for training single-hidden-layer neural networks that improves convergence speed and escapes from local minima. The algorithm is based on a modified backpropagation algorithm in a neighborhood-based neural network, replacing fixed learning parameters with adaptive learning parameters. The developed learning algorithm is applied to several problems, and in all of them the proposed algorithm performs well.

While there exist many techniques for finding the parameters that minimize an error function, only those methods that solely perform local computations are used in connectionist networks. The most popular learning algorithm for connectionist networks is the back-propagation procedure, which can be used to update the weights by the method of steepest descent. In this paper, we examine steepest descent and analyze why it can be slow to converge. We then propose four heuristics for achieving faster rates of convergence while adhering to the locality constraint. These heuristics suggest that every weight of a network should be given its own learning rate and that these rates should be allowed to vary over time. Additionally, the heuristics suggest how the learning rates should be adjusted. Two implementations of these heuristics, namely momentum and an algorithm called the delta-bar-delta rule, are studied and simulation results are presented.
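The delta-bar-delta rule implements exactly these heuristics: each weight gets its own rate, increased additively while the current gradient agrees in sign with an exponential trace of past gradients, and decreased multiplicatively when they disagree. A minimal sketch on a deterministic ill-conditioned quadratic; all constants (κ, φ, θ, the matrix) are illustrative assumptions:

```python
import numpy as np

# Ill-conditioned quadratic: a "deep narrow valley" with very different
# curvatures along the two axes.
A = np.diag([1.0, 50.0])
w = np.array([1.0, 1.0])

rates = np.full(2, 0.01)            # one learning rate per weight
bar = np.zeros(2)                   # exponential trace of past gradients
kappa, phi, theta = 0.002, 0.3, 0.7 # increase, decrease, trace decay

for _ in range(400):
    g = A @ w
    agree = bar * g
    rates = np.where(agree > 0, rates + kappa,                # same sign: grow linearly
             np.where(agree < 0, rates * (1 - phi), rates))   # sign flip: shrink geometrically
    bar = (1 - theta) * g + theta * bar
    w = w - rates * g

print(w)
```

The low-curvature coordinate ends up with a much larger rate than the high-curvature one, which is what lets the method escape the valley-oscillation problem described above.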

Like other gradient descent techniques, backpropagation converges slowly, even for medium sized network problems. This fact results from the usually large dimension of the weight space and from the particular shape of the error surface in each iteration point. Oscillation between the sides of deep and narrow valleys, for example, is a well known case where gradient descent provides poor convergence rates.
In this work, we present an acceleration technique for the backpropagation algorithm based on individual adaptation of the learning rate parameter of each synapse. The efficiency of the method is discussed and several related issues are analyzed.

In this letter, an improvement of the recently developed neighborhood-based Levenberg-Marquardt (NBLM) algorithm is proposed and tested for neural network (NN) training. The algorithm is modified by allowing local adaptation of a different learning coefficient for each neighborhood. This simple add-in to the NBLM training method significantly increases the efficiency of training episodes carried out with small neighborhood sizes, thus allowing important savings in memory occupation and computational time while obtaining better performance than the original Levenberg-Marquardt (LM) and NBLM methods.

In this paper, an improved training algorithm based on the terminal attractor concept for feedforward neural network learning is proposed. A condition to avoid the singularity problem is proposed. The effectiveness of the proposed algorithm is evaluated by various simulation results for a function approximation problem and a stock market index prediction problem. It is shown that the terminal attractor based training algorithm performs consistently in comparison with other existing training algorithms.

In this paper, the problem of stochastic synchronization analysis is investigated for a new array of coupled discrete-time stochastic complex networks with randomly occurred nonlinearities (RONs) and time delays. The discrete-time complex networks under consideration are subject to: 1) stochastic nonlinearities that occur according to the Bernoulli distributed white noise sequences; 2) stochastic disturbances that enter the coupling term, the delayed coupling term as well as the overall network; and 3) time delays that include both the discrete and distributed ones. Note that the newly introduced RONs and the multiple stochastic disturbances can better reflect the dynamical behaviors of coupled complex networks whose information transmission process is affected by a noisy environment (e.g., internet-based control systems). By constructing a novel Lyapunov-like matrix functional, the idea of delay fractioning is applied to deal with the addressed synchronization analysis problem. By employing a combination of the linear matrix inequality (LMI) techniques, the free-weighting matrix method and stochastic analysis theories, several delay-dependent sufficient conditions are obtained which ensure the asymptotic synchronization in the mean square sense for the discrete-time stochastic complex networks with time delays. The criteria derived are characterized in terms of LMIs whose solution can be solved by utilizing the standard numerical software. A simulation example is presented to show the effectiveness and applicability of the proposed results.

Actor-Critics constitute an important class of reinforcement learning algorithms that can deal with continuous actions and states in an easy and natural way. This paper shows how these algorithms can be augmented by the technique of experience replay without degrading their convergence properties, by appropriately estimating the policy change direction. This is achieved by truncated importance sampling applied to the recorded past experiences. It is formally shown that the resulting estimation bias is bounded and asymptotically vanishes, which allows the experience replay-augmented algorithm to preserve the convergence properties of the original algorithm. The technique of experience replay makes it possible to utilize the available computational power to reduce the required number of interactions with the environment considerably, which is essential for real-world applications. Experimental results are presented that demonstrate that the combination of experience replay and Actor-Critics yields extremely fast learning algorithms that achieve successful policies for non-trivial control tasks in considerably short time. Namely, the policies for the cart-pole swing-up [Doya, K. (2000). Reinforcement learning in continuous time and space. Neural Computation, 12(1), 219-245] are obtained after as little as 20 min of the cart-pole time and the policy for Half-Cheetah (a walking 6-degree-of-freedom robot) is obtained after four hours of Half-Cheetah time.

The Marquardt algorithm for nonlinear least squares is presented and is incorporated into the backpropagation algorithm for training feedforward neural networks. The algorithm is tested on several function approximation problems, and is compared with a conjugate gradient algorithm and a variable learning rate algorithm. It is found that the Marquardt algorithm is much more efficient than either of the other techniques when the network contains no more than a few hundred weights.
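The core Marquardt update is Δw = −(JᵀJ + μI)⁻¹ Jᵀe, with μ lowered when a step reduces the error (toward Gauss-Newton) and raised when it does not (toward gradient descent). A small self-contained sketch on a toy curve-fitting problem; the model and constants are assumptions for illustration, not the paper's benchmarks:

```python
import numpy as np

# Fit y = exp(a*x) + b to noiseless data with true (a, b) = (0.7, 0.3).
x = np.linspace(0.0, 1.0, 30)
y = np.exp(0.7 * x) + 0.3

def residuals(p):
    a, b = p
    return np.exp(a * x) + b - y

def jacobian(p):
    a, b = p
    return np.column_stack([x * np.exp(a * x), np.ones_like(x)])

p = np.array([0.0, 0.0])
mu = 1e-2
for _ in range(50):
    r = residuals(p)
    J = jacobian(p)
    step = np.linalg.solve(J.T @ J + mu * np.eye(2), -J.T @ r)
    if np.sum(residuals(p + step) ** 2) < np.sum(r ** 2):
        p, mu = p + step, mu * 0.5   # accept: move toward Gauss-Newton
    else:
        mu *= 2.0                    # reject: move toward gradient descent

print(p)
```

For neural networks, J is the Jacobian of the outputs with respect to the weights, which is why the method is practical only up to a few hundred weights, as the abstract notes.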

A novel approach is presented for the training of multilayer feedforward neural networks, using a conjugate gradient algorithm incorporating an appropriate line search algorithm. The algorithm updates the input weights to each neuron in an efficient parallel way, similar to the one used by the well known backpropagation algorithm. The performance of the algorithm is superior to that of the conventional backpropagation algorithm and is based on strong theoretical reasons supported by the numerical results of three examples.

A novel real-time learning algorithm for a multilayered neural network is derived from the extended Kalman filter (EKF). Since this EKF-based learning algorithm approximately gives the minimum variance estimate of the link weights, the convergence performance is improved in comparison with the backwards error propagation algorithm using steepest descent techniques. Furthermore, tuning parameters which crucially govern the convergence properties are not included, which makes its application easier. Simulation results for the XOR and parity problems are provided.

Stochastic meta-descent (SMD) is a new technique for online adaptation of local learning rates in arbitrary twice-differentiable systems. Like matrix momentum it uses full second-order information while retaining O(n) computational complexity by exploiting the efficient computation of Hessian-vector products. Here we apply SMD to independent component analysis, and employ the resulting algorithm for the blind separation of time-varying mixtures. By matching individual learning rates to the rate of change in each source signal's mixture coefficients, our technique is capable of simultaneously tracking sources that move at very different, a priori unknown speeds.


Appendix A. Properties of ϕ_t defined by Equation (42)

In this appendix, we prove that for a constant step-size β, a stable process (7), δ ∈ (0, 0.5), γ_i = (1 − δβa)^i, and ϕ_t defined in (42), it holds that

lim_{t→∞} E(ϕ_t − θ_t)² ≤ lim_{t→∞} 8δVθ_t.    (70)

Equation (43) is also justified here.
