Commun. Fac. Sci. Univ. Ank. Series A2-A3
V.50(1) pp 11-17 (2006)
A BRIEF REVIEW OF FEED-FORWARD NEURAL NETWORKS
MURAT H. SAZLI
Ankara University, Faculty of Engineering, Department of Electronics Engineering
06100 Tandoğan, Ankara, TURKEY
E-mail: sazli@eng.ankara.edu.tr
(Received: Jan. 03, 2006; Revised: Jan. 27, 2006; Accepted: Feb. 06, 2006)
ABSTRACT
Artificial neural networks, or neural networks for short, find applications across a very wide spectrum of fields. In this paper, following a brief presentation of the basic aspects of feed-forward neural networks, their most commonly used learning/training algorithm, the so-called back-propagation algorithm, is described.
KEYWORDS: Artificial neural networks, feed-forward neural networks, back-propagation algorithm
1. INTRODUCTION
Artificial neural networks, or neural networks for short, have been successfully applied to many diverse fields. Pattern classification/recognition, system modeling and identification, signal processing, image processing, control systems, and stock market prediction are some of the main fields of engineering and science in which they are used [1]. This can be attributed to the many useful properties of neural networks, such as their parallel structure, learning and adaptive capabilities, Very Large Scale Integration (VLSI) implementability, and fault tolerance, to name a few [2], [3].
The outline of the paper is as follows. In the next section, a brief introduction to feed-forward neural networks is presented. Feed-forward neural networks are the most commonly encountered type and are used in many diverse applications; therefore, they are chosen here to exemplify artificial neural networks. The back-propagation algorithm is described in detail in Section 3. Back-propagation is the algorithm most commonly used to train feed-forward neural networks, but it is also used, along with modified versions of the algorithm, to train other types of neural networks.
2. FEED-FORWARD NEURAL NETWORKS
Artificial neural networks, as the name implies, are inspired by their biological counterparts, the biological brain and the nervous system. The biological brain is entirely different from a conventional digital computer in terms of its structure and the way it processes information. In many ways, the biological brain (with the human brain as its most refined example) is far more advanced than and superior to conventional computers. The most important distinctive feature of a biological brain is its ability to “learn” and “adapt”, while a conventional computer does not have such abilities. Conventional computers accomplish specific tasks based upon the instructions loaded into them, the so-called “programs” or “software”.
The basic building block of a neural network is a “neuron”. A neuron can be perceived as a processing unit. In a neural network, neurons are connected to each other through “synaptic weights”, or “weights” in short. Each neuron in a network receives “weighted” information via these synaptic connections from the neurons it is connected to, and produces an output by passing the weighted sum of its input signals (either external inputs from the environment or the outputs of other neurons) through an “activation function”.
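To make the neuron model concrete, the following minimal sketch computes the output of a single neuron, assuming a logistic sigmoid as the activation function (one common choice; the weight and input values are illustrative, not from the paper):

```python
import numpy as np

def neuron_output(weights, bias, inputs):
    # Weighted sum of the inputs plus the bias (the induced local field),
    # passed through a logistic sigmoid activation function.
    v = np.dot(weights, inputs) + bias
    return 1.0 / (1.0 + np.exp(-v))

# A neuron with three inputs (illustrative values).
w = np.array([0.5, -0.3, 0.8])
x = np.array([1.0, 2.0, 0.5])
print(neuron_output(w, bias=0.1, inputs=x))
```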
There are two main categories of network architectures, depending on the type of the connections between the neurons: “feed-forward neural networks” and “recurrent neural networks”. If there is no “feedback” from the outputs of the neurons towards the inputs anywhere in the network, then the network is referred to as a “feed-forward neural network”. Otherwise, if there exists such feedback, i.e. a synaptic connection from the outputs towards the inputs (either their own inputs or the inputs of other neurons), then the network is called a “recurrent neural network”. Usually, neural networks are arranged in the form of “layers”. Feed-forward neural networks fall into two categories depending on the number of layers: either “single layer” or “multi-layer”.
Figure 1. A single layer feed-forward neural network
In Figure 1, a single layer feed-forward neural network (fully connected) is shown. Including the input layer, there are two layers in this structure. However, the input layer is not counted, because no computation is performed in that layer. Input signals are passed on to the output layer via the weights, and the neurons in the output layer compute the output signals.
Figure 2. A multi-layer feed-forward neural network
In Figure 2, a multi-layer feed-forward neural network with one “hidden layer” is depicted. As opposed to a single-layer network, there is (at least) one layer of “hidden neurons” between the input and output layers. According to Haykin [1], the function of hidden neurons is to intervene between the external input and the network output in some useful manner. The existence of one or more hidden layers enables the network to extract higher-order statistics. In the example given in Figure 2, there is only one hidden layer, and the network is referred to as a 5-3-2 network because it has 5 input neurons, 3 hidden neurons, and 2 output neurons.
In both Figure 1 and Figure 2, the networks are “fully connected”, because every neuron in each layer is connected to every neuron in the next forward layer. If some of the synaptic connections were missing, the network would be called “partially connected”.
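As an illustration of the layered structure, a forward pass through the fully connected 5-3-2 network of Figure 2 can be sketched as follows (assuming sigmoid activations in both computing layers; the weights are random placeholders):

```python
import numpy as np

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

rng = np.random.default_rng(0)
W1 = rng.normal(size=(3, 5))   # hidden layer: 3 neurons, 5 inputs each
b1 = np.zeros(3)
W2 = rng.normal(size=(2, 3))   # output layer: 2 neurons, 3 hidden inputs each
b2 = np.zeros(2)

x = np.array([0.2, -0.1, 0.4, 0.7, -0.5])  # the 5 external input signals
h = sigmoid(W1 @ x + b1)                    # outputs of the 3 hidden neurons
y = sigmoid(W2 @ h + b2)                    # the 2 network outputs
print(y)
```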
The most important feature of a neural network that distinguishes it from a conventional computer is its “learning” capability. A neural network can learn from its environment and improve its performance through learning. Haykin [1] defined learning in the context of neural networks as follows:
“Learning is a process by which the free parameters of a neural network are
adapted through a process of stimulation by the environment in which the network is
embedded. The type of learning is determined by the manner in which the parameter
changes take place [1].”
3. BACK-PROPAGATION ALGORITHM
Among many learning algorithms, the “back-propagation algorithm” is the most popular and most widely used one for the training of feed-forward neural networks. It is, in essence, a means of updating a network's synaptic weights by back-propagating a gradient vector in which each element is defined as the derivative of an error measure with respect to a parameter. Error signals are usually defined as the difference between the desired outputs and the actual network outputs. Therefore, a set of desired outputs must be available for training; for that reason, back-propagation is a supervised learning rule. A brief explanation of the back-propagation algorithm to train a feed-forward neural network is presented in the following.
Let us consider a multi-layer feed-forward neural network as shown in Figure 2. Let us take a neuron in the output layer and call it neuron $j$. The error signal at the output of neuron $j$ at the $n$th iteration is defined by Equation (1) [1]:

$$e_j(n) = d_j - y_j(n) \qquad (1)$$

where $d_j$ is the desired output for neuron $j$ and $y_j(n)$ is the actual output of neuron $j$, calculated by using the current weights of the network at iteration $n$. For a certain input there is a certain desired output, which the network is expected to produce. The presentation of each training example from the training set is defined as an “iteration”.
The instantaneous value of the error energy for neuron $j$ is given in Equation (2):

$$\varepsilon_j(n) = \frac{1}{2}\, e_j^2(n) \qquad (2)$$
Since the only visible neurons are the ones in the output layer, error signals for those neurons can be calculated directly. Hence, the instantaneous value $\varepsilon(n)$ of the total error energy is the sum of $\varepsilon_j(n)$ over all neurons in the output layer, as given in Equation (3):

$$\varepsilon(n) = \frac{1}{2} \sum_{j \in Q} e_j^2(n) \qquad (3)$$
where $Q$ is the set of all neurons in the output layer.
Suppose there are $N$ patterns (examples) in the training set. The average squared error energy for the network is found by Equation (4):

$$\varepsilon_{av} = \frac{1}{N} \sum_{n=1}^{N} \varepsilon(n) \qquad (4)$$
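As a quick numerical check of Equations (1)-(4), the following sketch computes the error signals and energies for the two output neurons of Figure 2 over a hypothetical set of $N = 3$ training examples (the numbers are illustrative, not from the paper):

```python
import numpy as np

# Desired and actual outputs of the two output neurons over N = 3 examples.
D = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])   # desired outputs d_j
Y = np.array([[0.8, 0.1], [0.3, 0.7], [0.9, 0.6]])   # actual outputs y_j(n)

E = D - Y                               # error signals e_j(n), Eq. (1)
eps_n = 0.5 * np.sum(E**2, axis=1)      # total error energies eps(n), Eq. (3)
eps_av = eps_n.mean()                   # average error energy, Eq. (4)
print(eps_n, eps_av)
```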
It is important to note that the instantaneous error energy $\varepsilon(n)$, and therefore the average error energy $\varepsilon_{av}$, is a function of all the free parameters (i.e., synaptic weights and bias levels) of the network. The back-propagation algorithm, as explained in the following, provides the means to adjust the free parameters of the network so as to minimize the average error energy $\varepsilon_{av}$. There are two different modes of the back-propagation algorithm: “sequential mode” and “batch mode”. In sequential mode, weight updates are performed after the presentation of each training example. One complete presentation of the training set is called an “epoch”. In batch mode, weight updates are performed after the presentation of all training examples, i.e. after an epoch is completed. Sequential mode is also referred to as on-line, pattern, or stochastic mode. It is the most frequently used mode of operation and is explained in the following.
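To illustrate the difference between the two modes, here is a minimal runnable sketch using a single sigmoid neuron on a toy OR problem (the dataset, learning rate, and epoch count are illustrative assumptions, not from the paper):

```python
import numpy as np

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])  # toy inputs
D = np.array([0., 1., 1., 1.])                          # desired outputs (OR)
eta, epochs = 0.5, 500

# Sequential (on-line) mode: one weight update per training example.
w, b = np.zeros(2), 0.0
for _ in range(epochs):
    for x, d in zip(X, D):
        y = sigmoid(w @ x + b)
        delta = (d - y) * y * (1 - y)    # error times sigmoid derivative
        w, b = w + eta * delta * x, b + eta * delta

# Batch mode: gradients accumulated over the whole epoch, then one update.
wb, bb = np.zeros(2), 0.0
for _ in range(epochs):
    y = sigmoid(X @ wb + bb)
    delta = (D - y) * y * (1 - y)
    wb, bb = wb + eta * X.T @ delta, bb + eta * delta.sum()

print(sigmoid(X @ w + b), sigmoid(X @ wb + bb))  # both approach D
```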
Let us start by giving the output expression for neuron $j$ in Equation (5):

$$y_j(n) = f\!\left( \sum_{i=0}^{m} w_{ji}(n)\, y_i(n) \right) \qquad (5)$$

where $m$ is the total number of inputs to neuron $j$ (excluding the bias) from the previous layer and $f$ is the activation function used in neuron $j$, which is some nonlinear function. Here $w_{j0}$ equals the bias $b_j$ applied to neuron $j$, and it corresponds to the fixed input $y_0 = +1$.
The weight update to be applied to a weight of neuron $j$ is proportional to the partial derivative of the instantaneous error energy $\varepsilon(n)$ with respect to the corresponding weight, i.e. $\partial \varepsilon(n)/\partial w_{ji}(n)$; using the chain rule of calculus, it can be expressed as in Equation (6):

$$\frac{\partial \varepsilon(n)}{\partial w_{ji}(n)} = \frac{\partial \varepsilon(n)}{\partial e_j(n)}\, \frac{\partial e_j(n)}{\partial y_j(n)}\, \frac{\partial y_j(n)}{\partial w_{ji}(n)} \qquad (6)$$
Differentiating Equations (2), (1), and (5), respectively, yields Equations (7), (8), and (9):

$$\frac{\partial \varepsilon(n)}{\partial e_j(n)} = e_j(n) \qquad (7)$$

$$\frac{\partial e_j(n)}{\partial y_j(n)} = -1 \qquad (8)$$
$$\frac{\partial y_j(n)}{\partial w_{ji}(n)} = \frac{\partial f\!\left( \sum_{i=0}^{m} w_{ji}(n)\, y_i(n) \right)}{\partial w_{ji}(n)} = f'\!\left( \sum_{i=0}^{m} w_{ji}(n)\, y_i(n) \right) y_i(n) \qquad (9)$$

where

$$f'\!\left( \sum_{i=0}^{m} w_{ji}(n)\, y_i(n) \right) = \frac{\partial f\!\left( \sum_{i=0}^{m} w_{ji}(n)\, y_i(n) \right)}{\partial \left( \sum_{i=0}^{m} w_{ji}(n)\, y_i(n) \right)} \qquad (10)$$
Substituting Equations (7), (8), and (9) into Equation (6) yields Equation (11):

$$\frac{\partial \varepsilon(n)}{\partial w_{ji}(n)} = -\, e_j(n)\, f'\!\left( \sum_{i=0}^{m} w_{ji}(n)\, y_i(n) \right) y_i(n) \qquad (11)$$
The correction $\Delta w_{ji}(n)$ applied to $w_{ji}(n)$ is defined by the delta rule, given in Equation (12):

$$\Delta w_{ji}(n) = -\eta\, \frac{\partial \varepsilon(n)}{\partial w_{ji}(n)} \qquad (12)$$
In Equation (12), $\eta$ is the learning-rate parameter of the back-propagation algorithm, which is usually set to a pre-determined value and kept constant during the operation of the algorithm.
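Putting Equations (1)-(12) together, the following is a minimal runnable sketch of sequential-mode back-propagation for the 5-3-2 network of Figure 2, assuming sigmoid activations in both layers; the training data, learning rate, and epoch count are illustrative assumptions, not values from the paper:

```python
import numpy as np

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

rng = np.random.default_rng(1)
W1, b1 = rng.normal(scale=0.5, size=(3, 5)), np.zeros(3)  # hidden layer
W2, b2 = rng.normal(scale=0.5, size=(2, 3)), np.zeros(2)  # output layer
eta = 0.5                                                  # learning rate

# Toy training set: N = 4 patterns, 5 inputs and 2 desired outputs each.
X = rng.uniform(-1.0, 1.0, size=(4, 5))
D = np.array([[1., 0.], [0., 1.], [1., 1.], [0., 0.]])

for epoch in range(2000):                  # sequential (on-line) mode
    for x, d in zip(X, D):
        h = sigmoid(W1 @ x + b1)           # forward pass, Eq. (5)
        y = sigmoid(W2 @ h + b2)
        e = d - y                          # error signal, Eq. (1)
        # Output-layer local gradient e_j * f'(v_j), from Eqs. (7)-(9);
        # for a sigmoid, f'(v) = y(1 - y).
        delta2 = e * y * (1 - y)
        # Hidden-layer local gradient: errors back-propagated through W2.
        delta1 = (W2.T @ delta2) * h * (1 - h)
        # Delta rule, Eq. (12): w <- w - eta*d(eps)/dw = w + eta*delta*input.
        W2 += eta * np.outer(delta2, h); b2 += eta * delta2
        W1 += eta * np.outer(delta1, x); b1 += eta * delta1

# After training, the network outputs should approach the targets D.
H = sigmoid(X @ W1.T + b1)
print(sigmoid(H @ W2.T + b2).round(2))
```

The hidden-layer update uses the error back-propagated from the output layer; this is the step that gives the algorithm its name, and the full derivation for hidden neurons can be found in Haykin [1].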
4. CONCLUSIONS
Feed-forward neural networks are the most commonly encountered type of artificial neural network and are applied in many diverse fields. In this paper, a brief introduction to artificial neural networks, followed by a presentation of feed-forward neural networks, has been given. A detailed description of the back-propagation algorithm, the most commonly used learning/training algorithm for feed-forward neural networks, has been presented as well. The interested reader is referred to the literature [1] for a thorough discussion of artificial neural networks and the back-propagation algorithm.
ÖZET
Artificial neural networks, or neural networks for short, find applications across a very wide spectrum. In this paper, following a brief introduction to the basic properties of feed-forward neural networks, the back-propagation algorithm, the most widely used learning/training algorithm in such networks, is described.
ANAHTAR KELİMELER: Artificial neural networks, feed-forward neural networks, back-propagation algorithm
REFERENCES
[1] S. Haykin, "Neural Networks: A Comprehensive Foundation", 2nd edition, Prentice Hall, 1999.
[2] M. H. Sazlı, "Neural Network Applications to Turbo Decoding", Ph.D. Dissertation, Syracuse University, 2003.
[3] M. H. Sazlı, C. Işık, "Neural Network Implementation of the BCJR Algorithm", accepted for publication in Digital Signal Processing Journal, Elsevier, 2005.
M. H. Sazlı, C. Işık. “Neural Network Implementation of the BCJR Algorithm”. accepted for publication in Digital Signal Processing Journal, Elsevier, 2005. 17 )(n is defined by the delta rule, given in )( n nw ji ∂ −=∆ ε η (12)