John Sum

National Chung Hsing University, Taichung, Taiwan

Publications (70)

  • ABSTRACT: The dual neural network (DNN)-based k-winner-take-all (kWTA) model is an effective approach for finding the k largest inputs from n inputs. Its major assumption is that the threshold logic units (TLUs) can be implemented in a perfect way. However, when differential bipolar pairs are used for implementing TLUs, the transfer function of the TLUs is a logistic function. This brief studies the properties of the DNN-kWTA model under this imperfect situation. We prove that, given any initial state, the network settles down at the unique equilibrium point. Besides, the energy function of the model is revealed. Based on the energy function, we propose an efficient method to study the model performance when the inputs are drawn from continuous distributions. Furthermore, for uniformly distributed inputs, we derive a formula to estimate the probability that the model produces the correct outputs. Finally, for the case that the minimum separation ∆min of the inputs is given, we prove that if the gain of the activation function is greater than (1/(4∆min)) max(ln 2n, 2 ln((1-ϵ)/ϵ)), then the network produces the correct outputs, with winner outputs greater than 1-ϵ and loser outputs less than ϵ, where ϵ is a threshold less than 0.5.
    IEEE Transactions on Neural Networks and Learning Systems 11/2014; · 4.37 Impact Factor
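    A minimal simulation sketch of a kWTA network of this kind, assuming the commonly used single-state dynamics dy/dt = Σᵢ g(gain·(uᵢ − y)) − k with logistic units g; the gain value, variable names, and Euler integration below are illustrative and are not taken from the brief.

      # Illustrative sketch only: single-state kWTA dynamics with logistic TLUs.
      import numpy as np

      def dnn_kwta_logistic(u, k, gain=50.0, dt=1e-3, steps=50000):
          """Approximate kWTA outputs for inputs u using logistic threshold units."""
          y = 0.0                                        # single state variable (threshold)
          for _ in range(steps):
              x = 1.0 / (1.0 + np.exp(-gain * (u - y)))  # logistic TLU outputs
              y += dt * (x.sum() - k)                    # push the number of winners toward k
          return x

      u = np.array([0.12, 0.90, 0.35, 0.88, 0.40])
      print(np.round(dnn_kwta_logistic(u, k=2), 3))      # winners near 1, losers near 0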
  • Source
    ABSTRACT: Finding the location of a mobile source from a number of separated sensors is an important problem in global positioning systems and wireless sensor networks. It can be solved by making use of time-of-arrival (TOA) measurements. However, solving this problem is not a trivial task because the TOA measurements have nonlinear relationships with the source location. This paper adopts an analog neural network technique, namely the Lagrange programming neural network, to locate a mobile source. We also investigate the stability of the proposed neural model. Simulation results demonstrate that the mean-square error performance of our devised location estimator approaches the Cramér–Rao lower bound in the presence of uncorrelated Gaussian measurement noise.
    Neural Computing and Applications 01/2014; 24:109–116. · 1.76 Impact Factor
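    For reference, the generic TOA measurement model and the least-squares localization objective it leads to are sketched below; this is only the standard formulation, and the exact constraints used in the paper's LPNN are not reproduced here.

      % Generic TOA localization model (not the paper's exact LPNN formulation).
      % Sensor i at known position s_i records the arrival time t_i of a signal emitted
      % at known time t_0 by a source at unknown position x; c is the propagation speed.
      \begin{align}
        r_i &= c\,(t_i - t_0) = \lVert \mathbf{x} - \mathbf{s}_i \rVert + n_i,
              \qquad i = 1,\dots,N, \\
        \hat{\mathbf{x}} &= \arg\min_{\mathbf{x}}
              \sum_{i=1}^{N} \bigl( r_i - \lVert \mathbf{x} - \mathbf{s}_i \rVert \bigr)^2 .
      \end{align}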
  • Source
    ABSTRACT: This paper develops two neural network models, based on Lagrange programming neural networks (LPNNs), for recovering sparse signals in compressive sampling. The first model is for the standard recovery of sparse signals. The second one is for the recovery of sparse signals from noisy observations. Their properties, including the optimality of the solutions and the convergence behavior of the networks, are analyzed. We show that for the first case, the network converges to the global minimum of the objective function. For the second case, the convergence is locally stable.
    Neurocomputing 01/2014; 129:298–305. · 2.01 Impact Factor
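    A rough sketch of LPNN-style dynamics for the first (standard) recovery problem, min ‖x‖₁ subject to Ax = b; the smoothing of the ℓ₁ term and the augmented penalty are assumptions of this illustration rather than the paper's construction.

      # Illustrative LPNN-style dynamics for  min ||x||_1  subject to  Ax = b,
      # with |x_i| smoothed as sqrt(x_i^2 + eps) so the gradient exists, and an
      # augmented penalty term (a common stabilization) added to the x-dynamics.
      # This is a sketch under those assumptions, not the paper's exact model.
      import numpy as np

      def lpnn_sparse_recovery(A, b, dt=1e-3, steps=30000, eps=1e-4, rho=1.0):
          m, n = A.shape
          x = np.zeros(n)      # variable neurons (signal estimate)
          lam = np.zeros(m)    # Lagrangian neurons (multipliers)
          for _ in range(steps):
              grad_l1 = x / np.sqrt(x**2 + eps)                      # gradient of smoothed ||x||_1
              residual = A @ x - b
              x_dot = -(grad_l1 + A.T @ lam + rho * A.T @ residual)  # descend in x
              lam_dot = residual                                     # ascend in lambda
              x += dt * x_dot
              lam += dt * lam_dot
          return x

      rng = np.random.default_rng(0)
      A = rng.standard_normal((20, 50))
      x_true = np.zeros(50); x_true[[3, 17, 41]] = [1.0, -2.0, 0.5]
      x_hat = lpnn_sparse_recovery(A, A @ x_true)   # approximate sparse estimate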
  • ABSTRACT: Recently, an analog neural network model, namely Wang's kWTA, was proposed. In this model, the output nodes are defined by the Heaviside function. Its finite-time convergence property and exact convergence time have been analyzed. However, the discovered characteristics of this model are based on the assumption that there are no physical defects during operation. In this brief, we analyze the convergence behavior of Wang's kWTA model when defects exist during operation. Two defect conditions are considered. The first is input noise. The second is stochastic behavior in the output nodes. The convergence of Wang's kWTA under these two defects is analyzed and the corresponding energy function is revealed.
    IEEE Transactions on Neural Networks and Learning Systems 01/2013; 24(9):1472-1478. · 4.37 Impact Factor
  • Chi-Sing Leung, John Pui-Fai Sum
    ABSTRACT: Fault tolerance is an interesting topic in neural networks. However, many existing results on this topic focus only on the situation of a single fault source. In fact, a trained network may be affected by multiple fault sources. This brief studies the performance of faulty radial basis function (RBF) networks that suffer from multiplicative weight noise and open weight fault concurrently. We derive a mean prediction error (MPE) formula to estimate the generalization ability of faulty networks. The MPE formula provides a way to understand the generalization ability of faulty networks without using a test set or generating a number of potential faulty networks. Based on the MPE result, we propose methods to optimize the regularization parameter, as well as the RBF width.
    IEEE Transactions on Neural Networks and Learning Systems 07/2012; 23(7):1148-1155. · 4.37 Impact Factor
  • ABSTRACT: Improving fault tolerance of a neural network has been studied for more than two decades, and various training algorithms have been proposed along the way. The on-line node fault injection-based algorithm is one of these algorithms, in which hidden nodes randomly output zeros during training. While the idea is simple, theoretical analyses of this algorithm are far from complete. This paper presents its objective function and a convergence proof. We consider three cases for multilayer perceptrons (MLPs): 1) MLPs with a single linear output node; 2) MLPs with multiple linear output nodes; and 3) MLPs with a single sigmoid output node. For the convergence proof, we show that the algorithm converges with probability one. For the objective function, we show that the corresponding objective functions of cases 1) and 2) are of the same form. They both consist of a mean square error term, a regularizer term, and a weight decay term. For case 3), the objective function is slightly different from that of cases 1) and 2). With the objective functions derived, we can compare the similarities and differences among various algorithms and cases.
    IEEE Transactions on Neural Networks and Learning Systems 01/2012; 23:211-222. · 4.37 Impact Factor
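    A hypothetical sketch of this style of training for case 1) above (a single linear output node): at each online update, hidden-node outputs are randomly set to zero before the gradient step. The fault rate, step-size schedule, and network size below are illustrative, not the paper's setting.

      # Illustrative sketch: online node-fault-injection training of an MLP with
      # one sigmoid hidden layer and a single linear output node. The fault rate p
      # and decaying step size are assumptions of this sketch.
      import numpy as np

      def sigmoid(z):
          return 1.0 / (1.0 + np.exp(-z))

      def train_with_node_fault_injection(X, y, n_hidden=10, p=0.2,
                                          epochs=50, mu0=0.1, seed=0):
          rng = np.random.default_rng(seed)
          d = X.shape[1]
          W = 0.1 * rng.standard_normal((n_hidden, d))   # input-to-hidden weights
          v = 0.1 * rng.standard_normal(n_hidden)        # hidden-to-output weights
          t = 0
          for _ in range(epochs):
              for x_k, y_k in zip(X, y):
                  t += 1
                  mu = mu0 / (1.0 + t / 1000.0)          # decaying step size
                  mask = rng.random(n_hidden) > p        # randomly "fault" hidden nodes
                  a = sigmoid(W @ x_k)
                  h = a * mask                           # zeros injected at faulty hidden nodes
                  err = y_k - v @ h                      # single linear output node
                  grad_v = err * h
                  grad_W = err * ((v * mask * a * (1 - a))[:, None] * x_k)
                  v += mu * grad_v                       # standard online update on the faulty net
                  W += mu * grad_W
          return W, v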
  • ABSTRACT: Injecting weight noise during training is a simple technique that has been proposed for almost two decades. However, little is known about its convergence behavior. This paper studies the convergence of two weight noise injection-based training algorithms: multiplicative weight noise injection with weight decay and additive weight noise injection with weight decay. We consider their application to multilayer perceptrons with either linear or sigmoid output nodes. Let w(t) be the weight vector, let V(w) be the corresponding objective function of the training algorithm, let α > 0 be the weight decay constant, and let μ(t) be the step size. We show that if μ(t) → 0, then with probability one E[‖w(t)‖₂²] is bounded and lim_{t→∞} ‖w(t)‖₂ exists. Based on these two properties, we show that if μ(t) → 0, Σ_t μ(t) = ∞, and Σ_t μ(t)² < ∞, then with probability one these algorithms converge. Moreover, w(t) converges with probability one to a point where ∇_w V(w) = 0.
    IEEE Transactions on Neural Networks and Learning Systems 01/2012; 23(11):1827-1840. · 4.37 Impact Factor
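    As a concrete example, the familiar schedule μ(t) = μ₀/t satisfies all three step-size conditions quoted above (this particular choice is illustrative, not prescribed by the paper):

      % Example: mu(t) = mu_0 / t (for t >= 1) satisfies the three step-size conditions.
      \begin{align}
        \mu(t) = \frac{\mu_0}{t} \;\Longrightarrow\;
        \mu(t) \to 0, \qquad
        \sum_{t \ge 1} \mu(t) = \mu_0 \sum_{t \ge 1} \frac{1}{t} = \infty, \qquad
        \sum_{t \ge 1} \mu(t)^2 = \mu_0^2 \sum_{t \ge 1} \frac{1}{t^2} = \frac{\mu_0^2 \pi^2}{6} < \infty .
      \end{align}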
  • Source
    ABSTRACT: An illumination adjustable image (IAI), containing a set of pre-captured reference images under various light directions, represents the appearance of a scene with adjustable illumination. One of the drawbacks of the IAI representation is that an IAI consumes a lot of memory. Although some previous works proposed using blockwise principal component analysis for compressing IAIs, they did not consider the spherical nature of the extracted eigen-coefficients. This paper utilizes the spherical nature of the extracted eigen-coefficients to improve the compression efficiency. Our compression scheme consists of two levels. In the first level, the reference images are converted into a few eigen-images (floating-point images) and a number of eigen-coefficients. In the second level, the eigen-images are compressed by a wavelet-based method. The eigen-coefficients are organized into a number of spherical functions. Those spherical coefficients are then compressed by the proposed HEALPix discrete cosine transform technique.
    Neural Computing and Applications 01/2012; 21. · 1.76 Impact Factor
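    An illustrative sketch of the first level of such a scheme: the stack of reference images is reduced to a few eigen-images plus per-image eigen-coefficients via PCA (computed here with an SVD). The second level (wavelet coding of the eigen-images and HEALPix/DCT coding of the spherically organized coefficients) is not shown, and the function names are hypothetical.

      # Illustrative first level only: PCA of the reference-image stack via SVD.
      import numpy as np

      def pca_compress(images, num_eigen=8):
          """images: (L, H, W) reference images captured under L light directions."""
          L, H, W = images.shape
          X = images.reshape(L, H * W).astype(np.float64)
          mean = X.mean(axis=0)
          U, S, Vt = np.linalg.svd(X - mean, full_matrices=False)
          eigen_images = Vt[:num_eigen]                 # (num_eigen, H*W) eigen-images
          coeffs = (X - mean) @ eigen_images.T          # (L, num_eigen) eigen-coefficients
          return mean, eigen_images, coeffs

      def pca_reconstruct(mean, eigen_images, coeffs, shape):
          X_hat = mean + coeffs @ eigen_images          # approximate reference images
          return X_hat.reshape((-1,) + shape)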
  • ABSTRACT: Recently, an extended Wang's kWTA network with stochastic output nodes has been studied, and it has been shown that this extended model converges asymptotically. In this paper, we further study the convergence time of this model. The convergence time is defined as the time taken for the model to reach a finite region around the point of convergence. With this definition, an approximation of the convergence time is derived and its validity is shown by simulation results.
    Mobile, Ubiquitous, and Intelligent Computing (MUSIC), 2012 Third FTRA International Conference on; 01/2012
  • ABSTRACT: Injecting weight noise during training has been a simple strategy to improve the fault tolerance of multilayer perceptrons (MLPs) for almost two decades, and several online training algorithms have been proposed in this regard. However, there are some misconceptions about the objective functions being minimized by these algorithms. Some existing results misinterpret that the prediction error of a trained MLP affected by weight noise is equivalent to the objective function of a weight noise injection algorithm. In this brief, we would like to clarify these misconceptions. Two weight noise injection scenarios will be considered: one is based on additive weight noise injection and the other is based on multiplicative weight noise injection. To avoid the misconceptions, we use their mean updating equations to analyze the objective functions. For injecting additive weight noise during training, we show that the true objective function is identical to the prediction error of a faulty MLP whose weights are affected by additive weight noise. It consists of the conventional mean square error and a smoothing regularizer. For injecting multiplicative weight noise during training, we show that the objective function is different from the prediction error of a faulty MLP whose weights are affected by multiplicative weight noise. With our results, some existing misconceptions regarding MLP training with weight noise injection can now be resolved.
    IEEE Transactions on Neural Networks 03/2011; · 2.95 Impact Factor
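    For the additive-noise case, the small-noise approximation commonly used in this literature yields an objective of the following schematic form (mean square error plus a smoothing regularizer); the exact regularizer derived in the brief may differ, so this is only an indication of the structure. Here S_b denotes the variance of the injected additive weight noise and f(x, w) the MLP output.

      % Schematic small-noise form only; not necessarily the exact expression in the brief.
      \begin{align}
        J(\mathbf{w}) \approx
        \frac{1}{N}\sum_{k=1}^{N}\bigl(y_k - f(\mathbf{x}_k,\mathbf{w})\bigr)^{2}
        \;+\;
        \frac{S_b}{N}\sum_{k=1}^{N}
        \bigl\lVert \nabla_{\mathbf{w}} f(\mathbf{x}_k,\mathbf{w}) \bigr\rVert_2^{2}.
      \end{align}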
  • Neural Information Processing - 18th International Conference, ICONIP 2011, Shanghai, China, November 13-17, 2011, Proceedings, Part III; 01/2011
  • Source
    Tommy W. S. Chow, John Sum
    ABSTRACT: This paper proposes a novel dynamic obstacle recognition system combining global features with local features to identify vehicles, pedestrians and unknown backgrounds for a driver assistance system. The proposed system consists of two main procedures: ...
    Neural Computing and Applications 01/2011; 20:923-924. · 1.76 Impact Factor
  • ABSTRACT: In this paper, an objective function for training a radial basis function (RBF) network to handle single-node open fault is presented. Based on this objective function, we propose a training method whose computational complexity is the same as that of the least mean squares (LMS) method. Simulation results indicate that our method can greatly improve the fault tolerance of RBF networks compared with networks trained by the LMS method. Moreover, even if the tuning parameter is misspecified, the performance deviation is not significant.
    Neurocomputing 01/2011;
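    For orientation, a hypothetical sketch of an LMS-complexity training loop for the RBF output weights; the weight-decay term below is a generic stand-in for a regularizer, not the objective derived in the paper for single-node open fault.

      # Illustrative sketch only: online LMS-style update of RBF output weights
      # with a simple weight-decay term standing in for a fault-tolerant regularizer.
      import numpy as np

      def rbf_features(x, centers, width):
          # Gaussian RBF activations for a single input x; centers has shape (M, d).
          return np.exp(-((x - centers) ** 2).sum(axis=1) / width**2)

      def train_rbf_lms(X, y, centers, width, lr=0.05, decay=1e-3, epochs=30):
          w = np.zeros(len(centers))
          for _ in range(epochs):
              for x_k, y_k in zip(X, y):
                  phi = rbf_features(x_k, centers, width)
                  err = y_k - phi @ w
                  w += lr * (err * phi - decay * w)   # LMS step with weight decay
          return w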
  • Chi-Sing Leung, John Sum
    Neural Information Processing - 18th International Conference, ICONIP 2011, Shanghai, China, November 13-17, 2011, Proceedings, Part III; 01/2011
  • Source
    Neural Information Processing - 18th International Conference, ICONIP 2011, Shanghai, China, November 13-17, 2011, Proceedings, Part III; 01/2011
  • Source
    ABSTRACT: Improving fault tolerance of a neural network is an important issue that has been studied for more than two decades. Various algorithms have been proposed over the years, and many of them succeed in attaining a fault-tolerant neural network. Among them, on-line node fault injection-based algorithms form one such family. Despite their simple implementation, theoretical analyses of these algorithms are far from complete. In this paper, an on-line node fault injection training algorithm is studied. In node fault injection training, the hidden nodes are treated as random neurons whose outputs can be zero in a random manner. So, in each update step, we randomly set some hidden outputs to zero. The network output and the gradient vector are calculated with these zero-output hidden nodes, and the standard online weight update is then applied to the weight vector. The corresponding objective function is derived and the convergence of the algorithm is proved. By a theorem from H. White, we show that the weight vector obtained by this algorithm converges with probability one to a local minimum of the derived objective function.
    Technologies and Applications of Artificial Intelligence (TAAI), 2010 International Conference on; 12/2010
  • Source
    ABSTRACT: The weight-decay technique is an effective approach to handle overfitting and weight fault. For fault-free networks, without an appropriate value of the decay parameter, the trained network is either overfitted or underfitted. However, many existing results on the selection of the decay parameter focus on fault-free networks only. It is well known that the weight-decay method can also suppress the effect of weight fault. For the faulty case, using a test set to select the decay parameter is not practical because there is a huge number of possible faulty networks for a trained network. This paper develops two mean prediction error (MPE) formulae for predicting the performance of faulty radial basis function (RBF) networks. Two fault models, multiplicative weight noise and open weight fault, are considered. Our MPE formulae involve the training error and trained weights only. Hence, our method does not need to generate a huge number of faulty networks to measure the test error for the fault situation. The MPE formulae allow us to select appropriate values of the decay parameter for faulty networks. Our experiments show that, although there are small differences between the true test errors (from the test set) and the MPE values, the MPE formulae can accurately locate the appropriate value of the decay parameter for minimizing the true test error of faulty networks.
    IEEE Transactions on Neural Networks 08/2010; 21(8):1232-44. · 2.95 Impact Factor
  • Source
    ABSTRACT: In the last two decades, many online fault/noise injection algorithms have been developed to attain a fault-tolerant neural network. However, not much theoretical work related to their convergence and objective functions has been reported. This paper studies six common fault/noise-injection-based online learning algorithms for radial basis function (RBF) networks, namely 1) injecting additive input noise, 2) injecting additive/multiplicative weight noise, 3) injecting multiplicative node noise, 4) injecting multiweight fault (random disconnection of weights), 5) injecting multinode fault during training, and 6) weight decay with injecting multinode fault. Based on the Gladyshev theorem, we show that the convergence of these six online algorithms is almost sure. Moreover, their true objective functions being minimized are derived. For injecting additive input noise during training, the objective function is identical to that of the Tikhonov regularizer approach. For injecting additive/multiplicative weight noise during training, the objective function is the simple mean square training error; thus, injecting additive/multiplicative weight noise during training cannot improve the fault tolerance of an RBF network. Similar to injecting additive input noise, the objective functions of the other fault/noise-injection-based online algorithms contain a mean square error term and a specialized regularization term.
    IEEE Transactions on Neural Networks 04/2010; 21(6):938-47. · 2.95 Impact Factor
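    A hypothetical sketch of scheme 1) above (injecting additive input noise during online training of the RBF output weights), which the paper relates to the Tikhonov regularizer approach; the noise level, learning rate, and function names are illustrative.

      # Illustrative sketch: online LMS training of RBF output weights with additive
      # input noise injected at each step. Variable names and sigma are assumptions.
      import numpy as np

      def online_train_with_input_noise(X, y, centers, width, lr=0.05,
                                        sigma=0.1, epochs=20, seed=0):
          rng = np.random.default_rng(seed)
          w = np.zeros(len(centers))
          for _ in range(epochs):
              for x_k, y_k in zip(X, y):
                  x_noisy = x_k + sigma * rng.standard_normal(x_k.shape)    # inject input noise
                  phi = np.exp(-((x_noisy - centers) ** 2).sum(axis=1) / width**2)
                  err = y_k - phi @ w
                  w += lr * err * phi          # standard LMS update on the noisy sample
          return w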
  • Source
    ABSTRACT: Compressive sampling is a sampling technique for sparse signals. The advantage of compressive sampling is that signals are compactly represented by a small number of measured values. This paper adopts an analog neural network technique, Lagrange programming neural networks (LPNNs), to recover data in compressive sampling. We propose LPNN dynamics to handle three scenarios in compressive sampling: the standard recovery of sparse signals, the recovery of non-sparse signals, and recovery from noisy measurement values. Simulation examples demonstrate that our approach effectively recovers the signals from the measured values in both noise-free and noisy environments.
    Neural Information Processing. Models and Applications - 17th International Conference, ICONIP 2010, Sydney, Australia, November 22-25, 2010, Proceedings, Part II; 01/2010
  • Source
    ABSTRACT: Injecting weight noise during training has been proposed for almost two decades as a simple technique to improve the fault tolerance and generalization of a multilayer perceptron (MLP). However, little has been done regarding the convergence behavior of such algorithms. Therefore, we present in this paper the convergence proofs of two of these algorithms for MLPs. One is based on combining multiplicative weight noise injection and weight decay (MWN-WD) during training. The other is based on combining additive weight noise injection and weight decay (AWN-WD) during training. Let m be the number of hidden nodes of an MLP, a be the weight decay constant, and S_b be the noise variance. It is shown that the MWN-WD algorithm converges with probability one if a > √(S_b) m, while the AWN-WD algorithm converges with probability one if a > 0.
    01/2010;

Publication Stats

221 Citations
70.95 Total Impact Points

Institutions

  • 2007–2014
    • National Chung Hsing University
Taichung, Taiwan
  • 2010–2011
    • Providence University
Taichung, Taiwan
  • 2001–2010
    • The University of Hong Kong
      • Department of Electrical and Electronic Engineering
      Hong Kong, Hong Kong
    • Nankai University
      • Institute of Modern Optics (IMO)
      Tianjin, Tianjin Shi, China
  • 2008
    • City University of Hong Kong
      • Department of Electronic Engineering
Kowloon, Hong Kong
  • 2006–2007
    • Chung Shan Medical University
Taichung, Taiwan
  • 2003
    • The Hong Kong Polytechnic University
      • Department of Computing
      Hong Kong, Hong Kong
  • 1999
    • Hong Kong Baptist University
      • Department of Computer Science
      Kowloon, Hong Kong
  • 1994–1998
    • The Chinese University of Hong Kong
      • Department of Computer Science and Engineering
      Hong Kong, Hong Kong