J. Cabestany, A. Prieto, and D.F. Sandoval (Eds.): IWANN 2005, LNCS 3512, pp. 289 – 296, 2005.
© Springer-Verlag Berlin Heidelberg 2005
Approximating I/O Data Using Radial Basis Functions:
A New Clustering-Based Approach
Mohammed Awad, Héctor Pomares, Luis Javier Herrera, Jesús González,
Alberto Guillén, and Fernando Rojas
Dept. of Computer Architecture and Computer Technology,
University of Granada, Granada, Spain
Abstract. In this paper, we deal with the problem of function approximation from a given set of input/output data. This problem consists of analyzing these training examples so that we can predict the output of the model given new inputs. We present a new method for function approximation of the I/O data using radial basis functions (RBFs). This approach is based on a new, efficient method for clustering the centres of the RBF Network (RBFN); it uses the objective output of the RBFN to move the clusters, instead of just the input values of the I/O data. This clustering method, especially designed for function approximation problems, improves the performance of the resulting approximator compared with models derived from traditional algorithms.
1 Introduction
Function approximation is the name given to a computational task that is of interest to many science and engineering communities. Function approximation consists of synthesizing a complete model from samples of the function and its independent variables. In supervised learning, the task is that of learning a mapping from one vector space to another, with the learning based on a set of instances of such mappings. We assume that a function F does exist, and we endeavour to synthesize a computational model of that function. As a general mathematical problem, function approximation has been studied for centuries. However, some knowledge of the function to be approximated is usually assumed, depending on the specific problem. For example, in pattern recognition, a function mapping is built whose objective is to assign each pattern in a feature space to a specific label in a class space.
When one makes no assumptions about a model of the function to be approximated, mathematical theory can only provide interpolation techniques such as splines, Taylor expansions, Fourier series, etc. Under this assumption, we can also make use of so-called model-free systems. These systems include neural networks and fuzzy systems, among others.
Radial Basis Function Networks (RBFNs) can be seen as a particular class of Artificial Neural Networks (ANNs). They are characterized by a transfer function in the hidden unit layer having radial symmetry with respect to a centre. The basic architecture of an RBFN is a 3-layer network. The output of the net is given by the following expression:

$$F(\vec{x}) = \sum_{i=1}^{m} w_i \, \phi_i(\vec{x}, \vec{c}_i, r_i) \qquad (1)$$

where the $\phi_i(\vec{x}, \vec{c}_i, r_i)$ are the basis functions and $w_i$ is the weight associated with every RBF. Each basis function $\phi$ can be calculated as a Gaussian function:

$$\phi(\vec{x}, \vec{c}, r) = \exp\!\left(-\frac{\|\vec{x} - \vec{c}\,\|^2}{r^2}\right) \qquad (2)$$

where $\vec{c}$ is the central point of the function $\phi$ and $r$ is its radius.
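To make expressions (1) and (2) concrete, the following minimal sketch in Python/NumPy evaluates the forward pass of such a network; all function and variable names here are ours, not part of the paper:

    import numpy as np

    def rbfn_output(X, centres, radii, weights):
        """Evaluate eq. (1): F(x) = sum_i w_i * phi_i(x, c_i, r_i).

        X:       (p, n) array of p input vectors
        centres: (m, n) array of RBF centres
        radii:   (m,)   array of RBF radii
        weights: (m,)   array of output weights
        """
        # Squared distances between every input and every centre -> (p, m)
        sq_dist = ((X[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
        # Gaussian activations, eq. (2)
        phi = np.exp(-sq_dist / radii ** 2)
        # Weighted sum over the hidden layer, eq. (1)
        return phi @ weights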
RBFNs are universal approximators and are thus well suited for function approximation problems. In general, an approximator is said to be universal if it can approximate any continuous function on a compact set to a desired degree of precision.
Finding the suitable number of radial functions is a difficult task, since we must be careful not to produce excessively large networks, which are inefficient, sensitive to over-fitting, and exhibit poor performance.
When the values of $\vec{c}$ and $r$ of the basis functions are known, it is possible to use a linear optimization method to find the values of $w_i$ that minimize the cost function computed on the sample set. This method relies on the computation of the pseudo-inverse of the activation matrix.
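Since the output (1) is linear in the weights once the centres and radii are fixed, this step reduces to a linear least-squares problem. A sketch under the same assumptions as above (solve_weights is our name; NumPy's lstsq computes the SVD-based pseudo-inverse solution):

    def solve_weights(X, y, centres, radii):
        """Optimal weights for fixed centres and radii (linear least squares)."""
        sq_dist = ((X[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
        phi = np.exp(-sq_dist / radii ** 2)      # (p, m) activation matrix
        # lstsq solves min_w ||phi @ w - y||^2 via an SVD of phi
        w, *_ = np.linalg.lstsq(phi, y, rcond=None)
        return w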
Other methods proposed in the literature also try to optimize the centre values of the RBFs. For instance, in [3] Chen et al. propose an alternative learning procedure based on the orthogonal least-squares method. The procedure chooses radial basis function centres one by one in a rational way until an adequate network has been constructed; each selected centre maximizes the increment to the explained variance or energy of the desired output, and the method does not suffer from numerical ill-conditioning problems. Orr in [4] selects the centres among the samples of the basis functions that most contribute to the output variance. Another solution to this problem is to cluster similar samples of the input data together. Every cluster has a centroid, which can then be chosen as the centre of a new RBF. We can find in the literature some unsupervised clustering algorithms such as k-means [5], fuzzy c-means [6], and enhanced LBG (ELBG) [7], and also some supervised clustering algorithms such as the Clustering for Function Approximation method (CFA) [8], the Conditional Fuzzy Clustering algorithm (CFC) [9], and the Alternating Cluster Estimation method (ACE) [10].
In this paper we present a new method for function approximation from a set of I/O data using radial basis functions (RBFs). This approach is based on a new, efficient method for clustering the centres of the RBF Network; it uses the target output of the RBFN to migrate and fine-tune the clusters, instead of just the input values of the I/O data. This clustering method, especially designed for function approximation problems, calculates the error committed in every cluster using the real output of the RBFN, trying to concentrate more clusters in those input regions where the approximation error is bigger, thus attempting to homogenize the contribution of every cluster to the error.
After this introduction, the organization of the rest of this paper is as follows. Section 2 presents an overview of the proposed algorithm. In Section 3, we present in detail the proposed algorithm for the determination of the pseudo-optimal RBF parameters. Then, in Section 4, we show some results that confirm the goodness of the proposed methodology. Some final conclusions are drawn in Section 5.
2 Overview of the Proposed Algorithm
As mentioned before, the problem of function approximation consists of synthesizing a complete model from samples of the function and its independent variables. Consider a function $y = f(\vec{x})$, where $\vec{x}$ is a vector $(x_1, \ldots, x_n)$ in $n$-dimensional space, for which a set of input/output data pairs is available. The idea is to approximate these data with another function $F(\vec{x})$. The accuracy of the approximation is generally measured by a cost function which takes into account the error between the output of the RBFN and the target output. In this paper, the cost function we are going to use is the so-called Normalized Root Mean Squared Error (NRMSE). This performance index is defined as:

$$\mathrm{NRMSE} = \sqrt{\frac{\sum_{i=1}^{p} \left(y_i - F(\vec{x}_i)\right)^2}{\sum_{i=1}^{p} \left(y_i - \bar{y}\right)^2}} \qquad (3)$$

where $\bar{y}$ is the mean of the target output and $p$ is the number of data points.
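For reference, eq. (3) translates directly into code in the same style as the earlier sketches (nrmse is our name):

    def nrmse(y_true, y_pred):
        """Normalized root mean squared error, eq. (3)."""
        num = np.sum((y_true - y_pred) ** 2)
        den = np.sum((y_true - np.mean(y_true)) ** 2)
        return np.sqrt(num / den)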
The objective of our algorithm, which is inspired by the CFA algorithm, is to increase the density of clusters in the input domain areas where the target function is less accurately approximated, rather than just in the zones where there are more input examples, as most unsupervised clustering algorithms would do, or in the zones where more variability of the output is found, as is the case with CFA.
The RBFN universal approximation property states that an optimal solution to the approximation problem can be found which minimizes the NRMSE. In order to find the minimum of the error function, the RBFN is completely specified by choosing the following parameters: the number $m$ of radial basis functions, the centre $\vec{c}_i$ of every RBF, the radii $r_i$, and the weights $w_i$.
The number of RBFs is a critical choice. In our algorithm we have used a simple incremental method to determine the number of RBFs: we stop adding new RBFs when the approximation error falls below a certain target error, in our case $\mathrm{NRMSE}_{\mathrm{TARGET}} = 0.1$. As for the rest of the parameters of the RBFN, in Section 3 we present a new clustering technique especially suited for function approximation problems.
The basic idea we have developed is to calculate the error committed in every cluster, using the real output of the RBFN to compute the error for each training datum belonging to the cluster, and to concentrate more clusters in those input regions where the cluster error is bigger. Fig. 1 presents a flow chart with the general description of the complete incremental algorithm.
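As an illustration of this incremental loop (Fig. 1), the sketch below reuses the helpers above. It is only an approximation of the authors' procedure: the centres here come from plain k-means (via SciPy), where the paper would use the clustering of Section 3, and knn_radii and fit_incremental are hypothetical names of ours.

    from scipy.cluster.vq import kmeans2

    def knn_radii(centres, k=1):
        """k-nearest-neighbour heuristic: radius = distance to the k-th nearest centre."""
        if len(centres) == 1:
            return np.array([1.0])               # arbitrary radius for a single RBF
        d = np.sqrt(((centres[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2))
        d.sort(axis=1)                           # column 0 is the zero self-distance
        return d[:, min(k, len(centres) - 1)]

    def fit_incremental(X, y, target_nrmse=0.1, max_rbfs=50):
        """Add one RBF at a time until the training NRMSE falls below the target."""
        for m in range(1, max_rbfs + 1):
            centres, _ = kmeans2(X, m, minit='points')
            radii = knn_radii(centres)
            w = solve_weights(X, y, centres, radii)
            err = nrmse(y, rbfn_output(X, centres, radii, w))
            if err < target_nrmse:
                break
        return centres, radii, w, err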
3 Parameter Adjustment of the RBF Network
The locality property inherent to radial basis functions allows us to use a clustering algorithm to obtain the RBF centres. Clustering algorithms may get stuck in a local minimum, ignoring a better placement of some of the clusters, i.e., the algorithm is trapped in a local minimum which is not the global one. For this reason, we need a clustering algorithm capable of solving this local minima problem. To avoid it, we endow our supervised algorithm with a migration technique. This modification allows the algorithm to escape from local minima and to obtain a prototype allocation independent of the initial configuration.
To optimize the other parameters of the RBFN (the radii $r$ and the weights $w$), we use well-known heuristics such as the k-nearest neighbours technique (knn) for the initialization of the radius of each RBF, and conventional techniques such as singular value decomposition (SVD) to directly optimize the weights. Finally, local minimization routines such as the Levenberg-Marquardt algorithm are used to fine-tune the obtained RBFN.
Therefore, in this section we will concentrate on the proposed clustering algorithm.
In Fig. 2, we show a flowchart with the general description of our clustering algorithm.

Fig. 1. General description of the algorithm

Fig. 2. General description of the proposed clustering algorithm
As can be seen from this figure, the initial values of the clusters are calculated using the k-means clustering algorithm, followed by a local displacement process which locally minimizes the distortion $D$ within each cluster (see Fig. 3), which is defined as:

$$D = \sum_{j=1}^{m} \sum_{\vec{x}_i \in C_j} E_{ij} \, \|\vec{x}_i - \vec{c}_j\|^2 \qquad (4)$$

where $m$ is the number of RBFs (clusters), $\vec{c}_j$ is the centre of cluster $C_j$, and $E_{ij}$ is the error committed by the net when the input vector $\vec{x}_i$ belongs to cluster $C_j$.
In the local displacement of the cluster centres, we start by making a hard partition of the training set, just as in the k-means algorithm. This produces a Voronoi partition of the training data set. The second step of the local displacement process is the calculation of the error of the RBFN, using the k-nearest neighbours algorithm to initialize the radii and singular value decomposition to calculate the weights of the RBFs. After this process, we must update the cluster centres in order to minimize the total distortion (4). The algorithm stops when the decrease in the distortion is less than a threshold ε.
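Under the same assumptions as the previous sketches (we take the per-sample error $E_{ij}$ to be the squared residual of the current net), the local displacement step of Fig. 3 could look as follows; the error-weighted centre update described after the figures appears at the end of the loop:

    def local_displacement(X, y, centres, eps=1e-3):
        """Sketch of Fig. 3: partition, fit the net, move centres, stop on eps."""
        centres = centres.copy()
        d_prev = np.inf
        while True:
            # Hard (Voronoi) partition of the training set
            sq = ((X[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
            labels = sq.argmin(axis=1)
            # Error of the current net: knn radii + SVD-based weights
            radii = knn_radii(centres)
            w = solve_weights(X, y, centres, radii)
            errors = (y - rbfn_output(X, centres, radii, w)) ** 2   # our E_ij
            # Total distortion, eq. (4)
            d = float((errors * sq[np.arange(len(X)), labels]).sum())
            if d_prev - d < eps:
                return centres, labels, errors
            d_prev = d
            # Error-weighted mean update of every non-empty cluster centre
            for j in range(len(centres)):
                mask = labels == j
                if mask.any():
                    e = errors[mask] + 1e-12     # avoid division by zero
                    centres[j] = (e[:, None] * X[mask]).sum(axis=0) / e.sum()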
Fig. 3. Local Displacement of the Clusters

Fig. 4. The Migration Process
This is carried out by an iterative process that updates each cluster centre as the weighted mean of the training data belonging to that cluster; we repeat this process until the total distortion of the net reaches a minimum.
The migration process migrates clusters from the better approximated zones toward those zones where the approximation error is worse, thus attempting to equalize their contributions to the total distortion. Our main hypothesis is that the best initial cluster configuration is the one that equalizes the approximation error committed by every cluster. To avoid local minima, the migration process uses a pseudo-random selection of the cluster to migrate, the probability of choosing a given cluster being inversely proportional to what we call the "utility" of that cluster, i.e., the ratio between the distortion of that cluster and the mean distortion over all clusters.
In this way, the proposed algorithm selects one cluster with utility less than one and moves it to the zone of a selected cluster with utility greater than one (see Fig. 4). This migration step is necessary because the local displacement of clusters only moves clusters in a local manner. It should be noted that using the k-means algorithm to divide the data belonging to the zone that receives the new cluster is less complex and needs less execution time than other migration algorithms such as those of ELBG and CFA.
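A sketch of one migration step, under our reading of Fig. 4: utility is assumed to be the cluster's distortion divided by the mean distortion (as in ELBG), the source and destination clusters are drawn by roulette wheel, and the paper's final confirm/reject test (keep the move only if the total distortion decreases) is left to the caller. All names are ours.

    def migrate_once(X, errors, labels, centres, rng=None):
        """Move one low-utility cluster (U < 1) into the zone of a high-utility one (U > 1)."""
        rng = rng or np.random.default_rng()
        centres = centres.copy()
        # Per-cluster distortion D_j and utility U_j = D_j / mean(D)
        sq = ((X - centres[labels]) ** 2).sum(axis=1)
        contrib = errors * sq
        d = np.array([contrib[labels == j].sum() for j in range(len(centres))])
        u = d / max(d.mean(), 1e-12)
        low, high = np.flatnonzero(u < 1), np.flatnonzero(u > 1)
        if low.size == 0 or high.size == 0:
            return centres                       # nothing to migrate
        # Roulette wheel: source inversely proportional to U, destination proportional
        p_low = 1.0 / (u[low] + 1e-12)
        src = rng.choice(low, p=p_low / p_low.sum())
        dst = rng.choice(high, p=u[high] / u[high].sum())
        # Drop the migrated centre next to the destination; a k-means repartition
        # of the destination zone then splits its data between the two centres
        centres[src] = centres[dst] + 1e-3 * rng.standard_normal(centres.shape[1])
        return centres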
4 Example of the Proposed Procedure
Let us consider the function shown in Fig. 5a. It has been chosen to demonstrate the importance of the equidistribution of the approximation errors throughout the clusters, since it has a very variable output when the input x is near zero.
To test the effect of the proposed algorithm on the initialization and its ability to avoid local minima in the placement of the clusters, a training set of 2000 samples of the function was generated by evaluating inputs taken uniformly from the interval [0, 10], from which 1000 points were set aside for validation.
The results of the proposed algorithm, compared with the CFA algorithm (which was reported to be the best algorithm for this example in [8]), with ε = 0.001, are presented in Table 1. In the table, NRMSE_C is the NRMSE of the training data after the clustering process is concluded. NRMSE_T is the final error index (for 10000 test data) obtained after the application of the Levenberg-Marquardt method. It must be noted that both clustering algorithms were designed to provide an initial RBF configuration to be subsequently optimized using a local optimization method in order to find the global minimum. Std is the standard deviation of the error indices over 5 executions of both algorithms. Finally, Time_C is the clustering process execution time (in seconds). As can be seen from the table, the proposed algorithm reaches better approximations using less time than the CFA algorithm, in all cases.
Fig. 5. a) Objective function. b) Approximation with 6 RBFs
Table 1. Comparison between CFA and the proposed approach. For each algorithm, the columns give m, NRMSE_C, Std, NRMSE_T, Std, and Time_C.
As an example of the learning process, Fig. 6a shows the initial distortion distribution for the case of 6 equally distributed RBFs, which is the first configuration whose approximation error falls under the target error. Fig. 6b represents the same information when the clustering process has ended. We can now see the advantage to be expected from making each cluster have an equal contribution to the total distortion, which is the objective of the proposed clustering algorithm. Finally, Fig. 5b represents the approximation of the net using 6 RBFs. We can see how the net is capable of producing a practically perfect approximation.
Fig. 6. a) The distortion before the migration. b) The distortion after the migration
5 Conclusions
In this paper we have proposed a clustering algorithm especially suited for function approximation problems. This method calculates the error committed in every cluster using the real output of the RBFN, and not just an approximate value of that output, trying to concentrate more clusters in those input regions where the approximation error is bigger, thus attempting to homogenize the contribution of every cluster to the error. The algorithm is easy to implement and is superior in both performance and computation time to other algorithms such as the CFA method. We have also shown how this algorithm can be used to find the minimal number of RBFs that satisfies a given error target for a given function approximation problem.
Acknowledgements. This work has been partially supported by the Spanish CICYT Project TIN2004-
References
1. Pomares, H., Rojas, I., Ortega, J., González, J., Prieto, A.: A systematic approach to a self-generating fuzzy rule-table for function approximation. IEEE Trans. Syst., Man, and Cybern., Part B 30(3) (2000) 431-447
2. Higgins, C.: Classification and Approximation with Rule-Based Networks. Ph.D. Thesis
3. Chen, S., Cowan, C.F.N., Grant, P.M.: Orthogonal least squares learning algorithm for radial basis function networks. IEEE Trans. Neural Networks 2(2) (1991) 302-309
4. Orr, M.J.L.: Regularization in the selection of radial basis function centers. Neural Computation 7(3) (1995) 606-623
5. Duda, R.O., Hart, P.E.: Pattern Classification and Scene Analysis. Wiley, New York (1973)
6. Bezdek, J.C.: Pattern Recognition with Fuzzy Objective Function Algorithms. Plenum, New York (1981)
7. Russo, M., Patanè, G.: Improving the LBG algorithm. In: Lecture Notes in Computer Science, vol. 1606. Springer-Verlag, New York (1999) 621-630
8. González, J., Rojas, I., Pomares, H., Ortega, J., Prieto, A.: A new clustering technique for function approximation. IEEE Trans. Neural Networks 13(1) (2002) 132-142
9. Pedrycz, W.: Conditional fuzzy c-means. Pattern Recognition Lett. 17, 625-
10. Runkler, T.A., Bezdek, J.C.: Alternating cluster estimation: a new tool for clustering and function approximation. IEEE Trans. Fuzzy Syst. 7 (1999) 377-393