Article

Evolving artificial neural network ensembles. IEEE Comput Intell Mag


Abstract

Using a coordinated group of simple solvers to tackle a complex problem is not an entirely new idea. Its roots can be traced back hundreds of years, to when the ancient Chinese suggested a team approach to problem solving. For a long time, engineers have used the divide-and-conquer strategy to decompose a complex problem into simpler sub-problems and then solve them with a group of solvers. However, knowing the best way to divide a complex problem into simpler ones relies heavily on the available domain knowledge, and the division is often done manually by an experienced engineer. Few automatic divide-and-conquer methods have been reported in the literature. Fortunately, evolutionary computation provides some interesting avenues to automatic divide-and-conquer methods. An in-depth study of such methods reveals a deep underlying connection between evolutionary computation and artificial neural network (ANN) ensembles. Ideas in one area can be usefully transferred to the other to produce effective algorithms. For example, using speciation to create and maintain diversity has inspired the development of negative correlation learning for ANN ensembles, as well as an in-depth study of diversity in ensembles. This paper reviews some of the recent work on evolutionary approaches to designing ANN ensembles.
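The abstract names negative correlation learning without giving its form. As a rough illustration only, the sketch below uses the penalty commonly associated with that method, in which each ensemble member's squared error is augmented by a term that rewards deviating from the ensemble mean. The function name `ncl_losses` and the default `lam` value are assumptions made for this example and do not come from the article.

```python
import numpy as np

def ncl_losses(outputs, target, lam=0.5):
    """Per-member loss for one training sample (illustrative sketch).

    outputs: outputs f_i of the M ensemble members on one sample
    target:  desired output d for that sample
    lam:     penalty strength (hypothetical default)
    """
    outputs = np.asarray(outputs, dtype=float)
    dev = outputs - outputs.mean()          # f_i - f_bar
    # p_i = (f_i - f_bar) * sum_{j != i} (f_j - f_bar)
    penalty = dev * (dev.sum() - dev)
    return 0.5 * (outputs - target) ** 2 + lam * penalty
```

Because the deviations from the mean sum to zero, the penalty reduces to minus the squared deviation, so a member lowers its loss by disagreeing with the ensemble average while still fitting the target.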

... Deep learning has achieved great success in many fields [1], such as speech recognition [2], semantic segmentation [3], [4], image recognition [5], [6], and natural language processing [7]. With the excellent performance in these fields, convolutional neural networks (CNNs) have become one of the most widely used models in deep learning [8]. ...
... However, the number of parameters from the final model which is found by our algorithm is much lower than Firefly-CNN. The proposed algorithm has an error rate 1.6% [Figure 6: Three examples from CIFAR10 (airplane, automobile, horse).] ...
Article
Full-text available
Recently, convolutional neural networks (CNNs) have achieved great success in the field of artificial intelligence, including speech recognition, image recognition, and natural language processing. CNN architecture plays a key role in CNNs' performance. Most previous CNN architectures are hand-crafted, which requires designers to have rich expert domain knowledge. The trial-and-error process consumes a lot of time and computing resources. To solve this problem, researchers proposed neural architecture search, which searches for CNN architectures automatically to satisfy different requirements. However, the blindness of the search strategy causes a 'loss of experience' in the early stage of the search process, and ultimately affects the results of the later stage. In this paper, we propose a self-adaptive mutation neural architecture search algorithm based on ResNet blocks and DenseNet blocks. The self-adaptive mutation strategy makes the algorithm adaptively adjust the mutation strategies during the evolution process to achieve better exploration. In addition, the whole search process is fully automatic, and users do not need expert knowledge about CNN architecture design. In this paper, the proposed algorithm is compared with 17 state-of-the-art algorithms, including manually designed CNNs and automatic search algorithms, on CIFAR10 and CIFAR100. The results indicate that the proposed algorithm outperforms the competitors in terms of classification performance and consumes fewer computing resources.
... [Table I: Parameters used in reply to LUBE method to train the PI-NN model.] COMBINATION OF PI-BASED FORECAST. The combination of point forecast from several models to improve the overall prediction accuracy has received huge attention in the NN community. It has been reported in [34] ...
... that the NN ensemble technique is an effective method to improve the prediction performance, even if it is a simple averaging method (combined forecast through simple averaging of forecasts from individual members). The application of the NN ensemble method can be found in many different applications, such as electricity load forecasting, machine learning, finance and economics, and medical science [34]–[36]. However, all of these applications deal with point-based forecasting. ...
Article
Full-text available
Neural networks (NNs) are an effective tool to model nonlinear systems. However, their forecasting performance significantly drops in the presence of process uncertainties and disturbances. NN-based prediction intervals (PIs) offer an alternative solution to appropriately quantify uncertainties and disturbances associated with point forecasts. In this paper, an NN ensemble procedure is proposed to construct quality PIs. A recently developed lower–upper bound estimation method is applied to develop NN-based PIs. Then, constructed PIs from the NN ensemble members are combined using a weighted averaging mechanism. Simulated annealing and a genetic algorithm are used to optimally adjust the weights for the aggregation mechanism. The proposed method is examined for three different case studies. Simulation results reveal that the proposed method improves the average PI quality of individual NNs by 22%, 18%, and 78% for the first, second, and third case studies, respectively. The simulation study also demonstrates that a 3%–4% improvement in the quality of PIs can be achieved using the proposed method compared to the simple averaging aggregation method.
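The weighted averaging of prediction intervals described in this abstract can be sketched as follows. This is a minimal illustration, assuming the weights are already given and are normalized to sum to one; the abstract states that simulated annealing and a genetic algorithm tune the weights, and that tuning is not reproduced here. The function name `combine_intervals` and the array layout are assumptions for this example.

```python
import numpy as np

def combine_intervals(lower, upper, weights):
    """Weighted average of prediction intervals from M ensemble members.

    lower, upper: arrays of shape (M, T) with each member's bounds
                  for T forecast points
    weights:      M non-negative weights (normalized here; an assumption)
    """
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()                      # force the weights to sum to one
    return w @ lower, w @ upper          # combined bounds, each of shape (T,)
```

Setting all weights equal recovers the simple averaging baseline that the abstract compares against.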
... For example, a set of multi-layer perceptron neural networks can be trained with initial weights, a number of layers and nodes, different error criteria, and so on. Setting such parameters can control individual model instability and, ultimately, diversify them [38]. The ability to ...
... The R package ForecastComb (Weiss et al., 2018) provides tools for rank-based combinations. Another weighting scheme which attaches a weight proportional to exp(β(N + 1 − i)) to the ith ordered constituent forecast was adopted in Yao and Islam (2008) and Donate et al. (2013) to combine forecasts obtained from artificial neural networks (ANNs), where β is a scaling factor. However, as mentioned by Andrawis et al. (2011), this class of combination methods limits the weights to only a discrete set of possible values. ...
Article
Full-text available
Forecast combinations have flourished remarkably in the forecasting community and, in recent years, have become part of the mainstream of forecasting research and activities. Combining multiple forecasts produced from a single (target) series is now widely used to improve accuracy through the integration of information gleaned from different sources, thereby mitigating the risk of identifying a single "best" forecast. Combination schemes have evolved from simple combination methods without estimation, to sophisticated methods involving time-varying weights, nonlinear combinations, correlations among components, and cross-learning. They include combining point forecasts and combining probabilistic forecasts. This paper provides an up-to-date review of the extensive literature on forecast combinations, together with reference to available open-source software implementations. We discuss the potential and limitations of various methods and highlight how these ideas have developed over time. Some important issues concerning the utility of forecast combinations are also surveyed. Finally, we conclude with current research gaps and potential insights for future research.
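The rank-based scheme quoted in the excerpt for this entry, with weights proportional to exp(β(N + 1 − i)) for the ith ordered forecast, can be illustrated with a short sketch. The excerpt does not say how the forecasts are ordered; the code assumes rank 1 goes to the forecast with the lowest validation error, and the function name `rank_weights` and the default `beta` are likewise assumptions.

```python
import numpy as np

def rank_weights(errors, beta=0.5):
    """Weights proportional to exp(beta * (N + 1 - i)), i = rank of a forecast.

    errors: validation errors of the N constituent forecasts
            (assumption: the smallest error receives rank 1)
    beta:   scaling factor (hypothetical default)
    """
    errors = np.asarray(errors, dtype=float)
    n = errors.size
    order = np.argsort(errors)               # indices from best to worst
    ranks = np.empty(n, dtype=int)
    ranks[order] = np.arange(1, n + 1)       # rank i for each forecast
    raw = np.exp(beta * (n + 1 - ranks))
    return raw / raw.sum()                   # normalize to sum to one
```

Since the weights depend only on ranks, they can take only N distinct values for a fixed β, which is the limitation the excerpt attributes to Andrawis et al. (2011).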
... While it is highly improbable that all of the input parameters (pH, electrical conductivity, water level, water temperature, and air temperature) had the exact same values and the output was different, this is not impossible. This situation did not exist in our observed dataset; in case it did, a possible solution would have been to explore the possibility of using ensembles of neural networks [54][55][56] to ensure that instead of one deterministic value the output will be a range of possible values. In this way, the output parameter could have different values even for identical sets of input parameters. ...
Article
Full-text available
The scope of the present study is the estimation of the concentration of nitrates in groundwater using artificial neural networks (ANNs) based on easily measurable in situ data. For the purpose of the current study, two feedforward neural networks were developed to determine whether including land use variables would improve the model results. In the first network, easily measurable field data were used, i.e., pH, electrical conductivity, water temperature, air temperature, and aquifer level. This model achieved a fairly good simulation based on the root mean squared error (RMSE in mg/L) and the Nash–Sutcliffe Model Efficiency (NSE) indicators (RMSE = 26.18, NSE = 0.54). In the second model, the percentages of different land uses within a radius of 1000 m from each well were included in an attempt to obtain a better description of nitrate transport in the aquifer system. When these variables were used, the performance of the model increased significantly (RMSE = 15.95, NSE = 0.70). For the development of the models, data from chemical and physical analyses of groundwater samples from wells located in the Kopaidian Plain and the wider area of the Asopos River Basin, both in Greece, were used. The simulation that the models achieved indicates that they are potentially useful tools for the estimation of groundwater contamination by nitrates and may therefore constitute a basis for the development of groundwater management plans.
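For readers unfamiliar with the two indicators reported above, the sketch below computes them from their standard definitions: RMSE is the square root of the mean squared difference between observed and simulated values, and NSE is one minus the ratio of the residual sum of squares to the variance of the observations about their mean. The function names are illustrative and the data in the usage note are made up, not taken from the study.

```python
import numpy as np

def rmse(observed, simulated):
    """Root mean squared error, in the units of the observations."""
    observed = np.asarray(observed, dtype=float)
    simulated = np.asarray(simulated, dtype=float)
    return float(np.sqrt(np.mean((observed - simulated) ** 2)))

def nse(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the mean."""
    observed = np.asarray(observed, dtype=float)
    simulated = np.asarray(simulated, dtype=float)
    residual = np.sum((observed - simulated) ** 2)
    spread = np.sum((observed - observed.mean()) ** 2)
    return float(1.0 - residual / spread)
```

For example, `nse([1, 2, 3, 4], [2.5, 2.5, 2.5, 2.5])` is 0, because predicting the observed mean everywhere is exactly the baseline that NSE measures against.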
... When combining meta-heuristics with ANNs, evolution can be performed at three levels: (i) connection weights (which corresponds to the training phase and is formulated as the minimization of the Mean Squared Error (MSE)-described by Equation (3)); (ii) architecture (which corresponds to the process of identifying the optimal topology); and (iii) learning rules (which aims to adapt the learning rules using the evolutive process) [34]. In the current work, mBFO is applied for a simultaneous optimization of connection weights and architecture. ...
Article
Full-text available
The glass transition temperature (Tg) is an important decision parameter when synthesizing polymeric compounds or when selecting their applicability domain. In this work, the glass transition temperature of more than 100 homopolymers with saturated backbones was predicted using a neuro-evolutive technique combining Artificial Neural Networks with a modified Bacterial Foraging Optimization Algorithm. In most cases, the selected polymers have a vinyl-type backbone substituted with various groups. A few samples with an oxygen atom in a linear non-vinyl hydrocarbon main chain were also considered. Eight structural, thermophysical, and entanglement properties estimated by the quantitative structure–property relationship (QSPR) method, along with other molecular descriptors reflecting polymer composition, were considered as input data for Artificial Neural Networks. The neural model for Tg has a 7.30% average absolute error for the training data and 12.89% for the testing data. The sensitivity analysis showed that, of all the independent parameters, cohesive energy has the highest influence on the modeled output.
... Yet in many experiments addressing topology evolution, the crossover operator is absent (Angeline et al., 1994). Such exchange is considered not merely inefficient but also liable to produce less fit individuals (Yao and Islam, 2008), because of the permutation problem (Angeline et al., 1994; Belew et al., 1990; Hancock, 1992; Haflidason and Neville, 2009). This problem arises because the same network can be represented in a genetic encoding by many different codings. ...
Article
Evolutionary robotics aims at building machines that can continually learn new knowledge in a continuous, uncontrolled, and changing world. This method has made it possible to build real robots showing complex reactive behaviors. Further research includes the design of control architectures with more cognitive abilities. Memory is a central component of cognition, and finding methods that allow robots to acquire memory could be a necessary first step toward developing higher cognitive behavior. On this basis, the goal of this thesis is to study the synthesis of control architectures for robots that are able to achieve tasks requiring the development of internal memory. We hypothesize that building an internal form of memory in a control architecture is a deceptive problem. In other words, evolutionary robotics tends to generate agents that only take into account their current perceptions. We propose an approach based on the use of different selective pressures in order to avoid premature convergence to individuals having reactive behavior. We show that in order to promote the emergence of internal memory, it is necessary: (1) to use a discrete fitness that does not introduce gradients which may lead to local optima; (2) to develop behavioral diversity mechanisms in order to explore the search space more efficiently; (3) to develop different helper objectives that ensure robust memory.
Article
The object of the research is artificial neural networks (ANNs) with convolutional architecture for image classification. The subject of the research is the study and development of algorithms for constructing ensembles of convolutional neural networks (CNNs) under conditions of a limited training sample. The aim of the study is to develop an algorithm for forming an effective model based on an ensemble of CNNs that averages the results of each model, avoids overfitting while improving forecast accuracy, and can be trained on a small amount of data (fewer than 10 thousand examples). As the basic network within the ensemble, an effective CNN architecture was developed, which showed good results as a single model. The article also examines methods for combining the results of ensemble models and provides recommendations for forming the CNN architecture. The research methods used are the theory of neural networks, the theory of machine learning, artificial intelligence, methods of algorithmization and programming of machine learning models, and a comparative analysis of models based on different algorithms, using classical ensembling with simple averaging and combining the results of basic algorithms by weighted averaging under conditions of limited sampling. The field of application of the obtained algorithm and model is medical diagnostics in medical institutions and sanatoriums during primary diagnostic admission; as an example research task, the model is trained to classify dermatological diseases from input photographs.
The novelty of the study lies in the development of an effective algorithm and image classification model based on an ensemble of CNNs that exceeds the prediction accuracy of the basic classifiers. The process of retraining an ensemble of classifiers with deep architecture on a small sample volume is investigated, from which conclusions are drawn on the design of an optimal network architecture and the choice of methods for combining the results of several basic classifiers. As a result of the research, an algorithm has been developed for forming an ensemble of CNNs based on an effective basic architecture and weighted averaging of the results of each model for image classification under conditions of limited sampling.
Chapter
This book presents research focused on the design of fractal antennas using bio-inspired computing techniques. The authors present designs for fractal antennas having desirable features like size reduction characteristics, enhanced gain, and improved bandwidths. The research is summarized in six chapters which highlight the important issues related to fractal antenna design and the mentioned computing techniques. Chapters demonstrate several applied concepts and techniques used in the process such as Artificial Neural Networks (ANNs), Genetic Algorithms (GAs), Particle Swarm Optimization (PSO) and Bacterial Foraging Optimization (BFO). The work aims to provide cost-effective and efficient solutions to the demand for compact antennas due to the increasing demand for reduced sizes of components in modern wireless communication devices. A key feature of the book includes an extensive literature survey to understand the concept of fractal antennas, their features, and design approaches. Another key feature is the systematic approach to antenna design. The book explains how the IE3D software is used to simulate various fractal antennas, and how the results can be used to select a design. This is followed by ANN model development and testing for optimization, and an exploration of ANN ensemble models for the design of fractal antennas. The bio-inspired computing techniques based on GA, PSO, and BFO are developed to find the optimal design of the proposed fractal antennas for the desired applications. The performance comparison of the given computing techniques is also explained to demonstrate how to select the best algorithm for a given bio-inspired design. Finally, the book explains how to evaluate antenna designs. This book is a valuable resource for students (from UG to PG levels) and research scholars undertaking learning modules or projects on microstrip and patch antenna design in communications or electronics engineering courses.
Article
This paper presents a comprehensive review of evolutionary algorithms that learn an ensemble of predictive models for supervised machine learning (classification and regression). We propose a detailed four-level taxonomy of studies in this area. The first level of the taxonomy categorizes studies based on which stage of the ensemble learning process is addressed by the evolutionary algorithm: the generation of base models, model selection, or the integration of outputs. The next three levels of the taxonomy further categorize studies based on methods used to address each stage. In addition, we categorize studies according to the main types of objectives optimized by the evolutionary algorithm, the type of base learner used and the type of evolutionary algorithm used. We also discuss controversial topics, like the pros and cons of the selection stage of ensemble learning, and the need for using a diversity measure for the ensemble’s members in the fitness function. Finally, as conclusions, we summarize our findings about patterns in the frequency of use of different methods and suggest several new research directions for evolutionary ensemble learning.
Article
Full-text available
With the development of deep learning, the design of an appropriate network structure becomes fundamental. In recent years, the successful practice of Neural Architecture Search (NAS) has indicated that an automated design of the network structure can efficiently replace the design performed by human experts. Most NAS algorithms make the assumption that the overall structure of the network is linear and focus solely on accuracy to assess the performance of candidate networks. This paper introduces a novel NAS algorithm based on a multi-objective modeling of the network design problem to design accurate Convolutional Neural Networks (CNNs) with a small structure. The proposed algorithm makes use of a graph-based representation of the solutions which enables a high flexibility in the automatic design. Furthermore, the proposed algorithm includes novel ad-hoc crossover and mutation operators. We also propose a mechanism to accelerate the evaluation of the candidate solutions. Experimental results demonstrate that the proposed NAS approach can design accurate neural networks with limited size.
Article
Full-text available
In the process of building electrical load data collection, it is inevitable that different kinds of noise are introduced, which makes the observed values deviate from the actual values and results in high levels of uncertainty. Such uncertainties make it difficult to achieve accurate point prediction of the short-term building electrical load. To improve the rationality of the prediction results and offer more effective information for decision makers, this paper proposes a novel modular fuzzy method, optimized by a multi-objective algorithm, which can accomplish interval prediction for the short-term electrical load. First, a novel single-input-rule-modules (SIRMs)-based distributed interval fuzzy model (SIRM-DIFM) is proposed by replacing the original functional weights of the traditional SIRMs-based fuzzy inference system (SIRM-FIS) with interval functional weights. Then, a data-driven learning scheme is presented for constructing the SIRM-DIFM. This learning scheme includes two main steps. The first step utilizes the iterative least square method to generate fuzzy rules for the SIRMs and determine the centers of the interval functional weights, while in the second step, the genetic algorithm (GA)-based multi-objective optimization algorithm is adopted to determine the widths of the interval functional weights. Through these two steps, accurate point estimation and reasonable interval prediction results can be achieved. Finally, two building electrical load prediction experiments are conducted to verify the effectiveness of the presented SIRM-DIFM. Simulation results indicate that the proposed SIRM-DIFM can compensate for the low accuracy of the point estimation and that the predicted interval can effectively cover the observed data, providing decision makers with more reliable and useful information.
Article
Structural collapse performance assessment has been at the center of many researchers’ interest due to the complexity of this phenomenon and the uncertainties involved in modeling and simulating the structural collapse response. This research aims to predict the structural collapse responses, including mean collapse capacity, collapse standard deviation, and collapse drift, by considering modeling uncertainties, and then to estimate collapse fragility curves, collapse risk, and reliability using the Response Surface Method (RSM) and Artificial Neural Network (ANN). The modeling uncertainties considered in evaluating collapse responses are the parameters of the modified Ibarra-Krawinkler moment-rotation curve. Moreover, to analyze the structural uncertainty, the correlation between the model parameters in one component and between two structural components was considered. The Latin Hypercube Sampling (LHS) method and Cholesky decomposition were used to produce independent and dependent random variables, respectively. Predicting the collapse responses of the structure while accounting for the uncertainties becomes costly: as the number of uncertainties increases, the number of simulations also increases, leading to a significant increase in the computational effort required to estimate the structural responses. Therefore, given a limited number of samples for the uncertainties, a hybrid of ANN and the PSO algorithm was used to reduce the computational effort of estimating the collapse fragility curves, collapse risk, and structural reliability. The results show that structural collapse responses can be predicted with appropriate accuracy by producing a limited number of samples for uncertainties and using an ANN-PSO algorithm.
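The sampling step mentioned in this abstract, Latin Hypercube Sampling for independent variables followed by a Cholesky factor to impose correlation, can be sketched roughly as below. This is an illustration under several assumptions not stated in the abstract: the variables are taken to be standard normal, each stratum is represented by its midpoint, and the names `lhs_uniform` and `correlated_normals` are hypothetical.

```python
import numpy as np
from statistics import NormalDist

def lhs_uniform(n, d, rng):
    """Latin hypercube on (0, 1)^d: one stratum midpoint per row and column."""
    u = np.empty((n, d))
    for j in range(d):
        # shuffle the n strata independently for each variable
        u[:, j] = (rng.permutation(n) + 0.5) / n
    return u

def correlated_normals(n, corr, seed=0):
    """Independent LHS normals, then correlated through a Cholesky factor."""
    rng = np.random.default_rng(seed)
    corr = np.asarray(corr, dtype=float)
    u = lhs_uniform(n, corr.shape[0], rng)
    z = np.vectorize(NormalDist().inv_cdf)(u)   # independent standard normals
    chol = np.linalg.cholesky(corr)             # corr = chol @ chol.T
    return z @ chol.T                           # rows now follow corr (approx.)
```

The target correlation is only approximated in a finite sample, because the independently shuffled columns carry a small spurious correlation of their own.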
Article
Full-text available
Evolutionary optimization aims to tune the hyper-parameters during learning in a computationally fast manner. For optimization of multi-task problems, evolution is done by creating a unified search space with a dimensionality that can include all the tasks. Multi-task evolution is achieved via selective imitation where two individuals with the same type of skill are encouraged to crossover. Due to the relatedness of the tasks, the resulting offspring may have a skill for a different task. In this way, we can simultaneously evolve a population where different individuals excel in different tasks. In this paper, we consider a type of evolution called Genetic Programming (GP) where the genes in the population have a tree-like structure and can be of different lengths and hence can naturally represent multiple tasks. We apply the model to multi-task neuroevolution that aims to determine the optimal hyper-parameters of a neural network, such as number of nodes, learning rate, and number of training epochs, using evolution. Here each gene is encoded with the hyper-parameters for a single neural network. Previously, optimization was done by enabling or disabling individual connections between neurons during evolution. This method is extremely slow and does not generalize well to new neural architectures such as Seq2Seq. To overcome this limitation, we follow a modular approach where each sub-tree in a GP can be a sub-neural architecture that is preserved during crossover across multiple tasks. Lastly, in order to leverage the inter-task covariance for faster evolutionary search, we project the features from both tasks to a common space using fuzzy membership functions. The proposed model is used to determine the optimal topology of a feed-forward neural network for classification of emotions in physiological heart signals and also a Seq2Seq chatbot that can converse with kindergarten children. We can outperform baselines by over 10% in accuracy.
Article
Full-text available
The performance of a Convolutional Neural Network (CNN) highly depends on its architecture and corresponding parameters. Manually designing a CNN is a time-consuming process, given the various layers that it can have and the variety of parameters that must be set up. Increasing the complexity of the network structure by employing various types of connections makes designing a network even more challenging. Evolutionary computation as an optimisation technique can be applied to arrange the CNN layers and/or initiate its parameters automatically or semi-automatically. Dense network and Residual network are two popular network structures that were introduced to facilitate the training of deep networks. In this paper, leveraging the potentials of Dense and Residual blocks, and using the capability of evolutionary computation, we propose an automatic evolutionary model to detect an optimum and accurate network structure and its parameters for medical image segmentation. The proposed evolutionary DenseRes model is employed for segmentation of six publicly available MRI and CT medical datasets. The proposed model obtained high accuracy while employing networks with minimal parameters for the segmentation of medical images and outperformed manual and automatic designed networks, including U-Net, Residual U-Net, Dense U-Net, Non-Bypass Dense, NAS U-Net, AdaresU-Net, and EvoU-Net.
Article
Developing a Deep Convolutional Neural Network (DCNN) is a challenging task that involves deep learning with significant effort required to configure the network topology. The design of a 3D DCNN requires not only a good, complicated structure but also a considerable number of appropriate parameters to run effectively. Evolutionary computation is an effective approach that can find an optimum network structure and/or its parameters automatically. Note that the Neuroevolution approach is computationally costly, even for developing 2D networks. As it is expected to require even more massive computation to develop 3D Neuroevolutionary networks, this research topic has not been investigated until now. In this paper, in addition to developing 3D networks, we investigate the possibility of using 2D images and 2D Neuroevolutionary networks to develop 3D networks for 3D volume segmentation. In doing so, we propose to first establish new evolutionary 2D deep networks for medical image segmentation and then convert the 2D networks to 3D networks in order to obtain optimal evolutionary 3D deep convolutional neural networks. The proposed approach results in a massive saving in the computational and processing time needed to develop 3D networks, while achieving high accuracy for 3D medical image segmentation on nine different datasets.
Article
Conventional artificial neural network (ANN) learning algorithms for classification tasks, whether derivative-based or derivative-free optimization algorithms, work by first training the ANN (or training and validating it) and then testing it, which is a two-stage, one-pass learning mechanism. Thus, this learning mechanism may not guarantee the generalization ability of a trained ANN. In this article, a novel bilevel learning model is constructed for self-organizing feed-forward neural networks (FFNNs), in which the training and testing processes are integrated into a unified framework. In this bilevel model, the upper level optimization problem is built for testing error on the testing data set and network architecture based on network complexity, whereas the lower level optimization problem is constructed for network weights based on training error on the training data set. For the bilevel framework, an interactive learning algorithm is proposed to optimize the architecture and weights of an FFNN with consideration of both training error and testing error. In this interactive learning algorithm, a hybrid binary particle swarm optimization (BPSO), taken as the upper level optimizer, is used to self-organize the network architecture, whereas the Levenberg-Marquardt (LM) algorithm, as the lower level optimizer, is utilized to optimize the connection weights of an FFNN. The bilevel learning model and algorithm have been tested on 20 benchmark classification problems. Experimental results demonstrate that the bilevel learning algorithm produces significantly more compact FFNNs with better generalization ability than conventional learning algorithms.
Chapter
Images in biomedical articles are often referenced for clinical decision support, educational purposes, and medical research. Author-marked annotations such as text labels and symbols overlaid on these images are used to highlight regions of interest, which are then referenced in the caption text or figure citations in the articles. Detecting and recognizing such symbols is valuable for improving biomedical information retrieval. In this research, image processing and computational intelligence methods are integrated for object segmentation and discrimination and applied to the problem of detecting arrows on these images. Evolving Artificial Neural Networks (EANNs) and Evolving Artificial Neural Network Ensembles (EANNEs) computational intelligence-based algorithms are developed to recognize overlays, specifically arrows, in medical images. For these discrimination techniques, EANNs use particle swarm optimization and genetic algorithm for artificial neural network (ANN) training, and EANNEs utilize the number of ANNs generated in an ensemble and negative correlation learning for neural network training based on averaging and Linear Vector Quantization (LVQ) winner-take-all approaches. Experiments performed on medical images from the imageCLEFmed’08 data set yielded area under the receiver operating characteristic curve and precision/recall results as high as 0.988 and 0.928/0.973, respectively, using the EANNEs method with the winner-take-all approach.
Chapter
Ensemble learning is a powerful paradigm that has been used in the top state-of-the-art machine learning methods like Random Forests and XGBoost. Inspired by the success of such methods, we have developed a new Genetic Programming method called Ensemble GP. The evolutionary cycle of Ensemble GP follows the same steps as other Genetic Programming systems, but with differences in the population structure, fitness evaluation and genetic operators. We have tested this method on eight binary classification problems, achieving results significantly better than standard GP, with much smaller models. Although other methods like M3GP and XGBoost were the best overall, Ensemble GP was able to achieve exceptionally good generalization results on a particularly hard problem where none of the other methods was able to succeed.
Chapter
In ensemble learning, accuracy and diversity are two conflicting objectives. As the number of base learners increases, the prediction speed of ensemble learning machines drops significantly and the required storage space increases rapidly. How to balance these two goals in selective ensemble learning is an essential problem. In this paper, ensemble learning based on multimodal multiobjective optimization is studied in detail. The significance of a multimodal multiobjective optimization algorithm is that it can find different classifier ensembles that balance accuracy and diversity, including different classifier ensembles that correspond to the same accuracy and diversity. Experimental results show that the multimodal multiobjective optimization algorithm can find more ensemble combinations than unimodal optimization algorithms.
Article
Network security plays an essential role in secure communication and helps avoid financial loss and crippled services due to network intrusions. Intruders generally exploit the flaws of popular software to mount a variety of attacks against networked computer systems. The damage caused by network attacks may vary from a minor disruption in service to financial loss. Recently, intrusion detection systems (IDSs) incorporating machine learning techniques have emerged for handling unauthorized usage of and access to network resources. Over time, a wide variety of machine learning techniques have been designed and integrated with IDSs. Still, most IDSs have reported poor intrusion detection results in terms of false positive rate and detection rate. To solve these issues, researchers have focused on the development of ensemble classifiers, which integrate the predictions of multiple individual classifiers. Ensemble classifiers compensate for the weaknesses of individual classifiers and use their combined knowledge to enhance performance. This study presents the motivation for, and a comprehensive review of, intrusion detection systems based on ensembles in machine learning as an extension of our previous work in the field. In particular, different ensemble methods in the field are analysed, taking into consideration different types of ensembles and various approaches for integrating the predictions of individual classifiers into an ensemble classifier. The representative studies are compared in chronological order for systematic and critical analysis, to understand the current challenges and status of research in the field. Finally, the study presents essential future research directions for the development of effective IDSs.
Article
Service-oriented architecture is becoming a major software framework for complex applications, and it can be dynamically and flexibly composed by integrating existing component web services provided by different providers with standard protocols. The rapid introduction of new web services into a dynamic business environment can adversely affect service quality and user satisfaction. Therefore, how to leverage and aggregate the quality of service (QoS) information of individual components to derive the optimal QoS of a composite service that meets the needs of users is still an active research problem. This study aims to review the current state of the art and to inspire possible new ideas for web service selection and composition, especially with nature-inspired computing approaches. Firstly, the background knowledge of web services is presented. Secondly, various nature-inspired web service selection and composition approaches are systematically reviewed and analysed for QoS-aware web services. Finally, challenges, remarks and discussions about QoS-aware web service composition are presented.
Article
Accurate design of miniaturized antennas is constrained by the limited availability of well-formulated exact mathematical expressions. Demands for smart devices with features like portability, implantability, and configurability have placed further challenges in front of antenna design engineers and scientists. In the search for solutions, many innovative approaches have been proposed by various authors in the literature. The application of soft computing is another approach to the accurate design of fractal antennas. Here, the authors attempt to propose a better solution for miniaturized antennas and their design. A fractal antenna based on circular outer geometry is proposed as a miniaturized antenna, and a particle swarm optimization–based selective artificial neural network ensemble is developed, which is employed as the objective function of a bacterial foraging optimization algorithm, leading to a hybridized algorithm. The developed hybrid algorithm is used to design the proposed antenna at 2.45 GHz. Good agreement among the simulated, desired, and experimental results validates the proposed design approach.
Article
Recent works in evolutionary robotics have shown the viability of evolution driven by behavioural novelty and diversity. These evolutionary approaches have been successfully used to generate repertoires of diverse and high-quality behaviours, instead of driving evolution towards a single, task-specific solution. Having repertoires of behaviours can enable new forms of robotic control, in which high-level controllers continually decide which behaviour to execute. To date, however, only the use of repertoires of open-loop locomotion primitives has been studied. We propose EvoRBC-II, an approach that enables the evolution of repertoires composed of general closed-loop behaviours, that can respond to the robot's sensory inputs. The evolved repertoire is then used as a basis to evolve a transparent higher-level controller that decides when and which behaviours of the repertoire to execute. Relying on experiments in a simulated domain, we show that the evolved repertoires are composed of highly diverse and useful behaviours. The same repertoire contains sufficiently diverse behaviours to solve a wide range of tasks, and the EvoRBC-II approach can yield a performance that is comparable to the standard tabula-rasa evolution. EvoRBC-II enables automatic generation of hierarchical control through a two-step evolutionary process, thus opening doors for the further exploration of the advantages that can be brought by hierarchical control.
Article
The traditional methods of designing antennas are not suitable for fractal antennas because accurate mathematical design expressions are not available. Recently, an ANN model relating the physical and electromagnetic parameters of the fractal antenna to be designed has been used as the objective function of an optimization algorithm, and this has been shown to be an effective approach. In the present paper, an ANN ensemble model is used as the objective function of a PSO algorithm to calculate the optimal dimensions of a circular fractal antenna for a desired resonant frequency. It is established that the ANN ensemble has better performance than the constituent ANN models. The design accuracy of the proposed hybrid algorithm is validated through the simulation and experimental results of the designed antenna. The size reduction capability of the proposed fractal antenna is used to design an antenna for the 5.8 GHz WLAN band with a size reduction of 41.64% compared to a simple circular microstrip antenna. The miniaturization of the antenna will lead to the design of compact devices for wireless communication systems.
Article
Multi-Modal Optimization (MMO), which aims to locate multiple optimal (or near-optimal) solutions in a single simulation run, has practical relevance to problem solving across many fields. Population-based meta-heuristics have been shown to be particularly effective in solving MMO problems if equipped with specifically designed diversity-preserving mechanisms, commonly known as niching methods. This paper provides an updated survey on niching methods. The paper first revisits the fundamental concepts of niching and its most representative schemes, then reviews the most recent developments in niching methods, including novel and hybrid methods, performance measures, and benchmarks for their assessment. Furthermore, the paper surveys previous attempts at leveraging the capabilities of niching to facilitate various optimization tasks (e.g., multi-objective and dynamic optimization) and machine learning tasks (e.g., clustering, feature selection, and learning ensembles). A list of successful applications of niching methods to real-world problems is presented to demonstrate the capabilities of niching methods in providing solutions that are difficult for other optimization methods to offer. The significant practical value of niching methods is clearly exemplified through these applications. Finally, the paper poses challenges and research questions on niching that are yet to be appropriately addressed. Providing answers to these questions is crucial before we can bring more fruitful benefits of niching to real-world problem solving.
Article
This paper proposes a framework to obtain ensembles of classifiers from a Multi-objective Evolutionary Algorithm (MOEA), improving the restrictions imposed by two non-cooperative performance measures for multiclass problems: (1) the Correct Classification Rate or Accuracy (CCR) and, (2) the Minimum Sensitivity (MS) of all classes, i.e., the lowest percentage of examples correctly predicted as belonging to each class with respect to the total number of examples in the corresponding class. The proposed framework is based on collecting Pareto fronts of Artificial Neural Networks models for multiclass problems by the Memetic Pareto Evolutionary NSGA2 (MPENSGA2) algorithm, and it builds a new Pareto front (ensemble) from stored fronts. The ensemble built significantly improves the closeness to the optimum solutions and the diversity of the Pareto front. For verifying it, the performance of the new front obtained has been measured with the habitual use of weighting methodologies, such as Majority Voting, Simple Averaging and Winner Takes All. In addition to CCR and MS measures, three trade-off measures have been used to obtain the goodness of a Pareto front as a whole: Hyperarea, Laumanns’s Hyperarea (LAUMANNS) and Zitzler’s Spread (M3). The proposed framework can be adapted for any MOEA that aims to improve the compaction and diversity of its Pareto front, and whose fitness functions impose severe restrictions for multiclass problems.
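The two performance measures named in the abstract above, CCR and MS, can be illustrated with a minimal sketch. It assumes a confusion matrix whose rows are true classes and whose columns are predicted classes, and that every class has at least one example; the function name `ccr_and_min_sensitivity` and the sample matrix are hypothetical and are not taken from the cited paper.

```python
def ccr_and_min_sensitivity(confusion):
    """Return (CCR, MS) for a square confusion matrix.

    confusion[i][j] is the number of class-i examples predicted as class j
    (an assumed layout; each row is assumed to be non-empty).
    """
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(n_classes))
    # Sensitivity of class i: fraction of class-i examples predicted correctly.
    sensitivities = [confusion[i][i] / sum(confusion[i]) for i in range(n_classes)]
    return correct / total, min(sensitivities)


# Hypothetical three-class example: overall accuracy looks good,
# but the third class is recognised only half of the time.
confusion = [
    [50, 0, 0],
    [5, 40, 5],
    [10, 0, 10],
]
ccr, ms = ccr_and_min_sensitivity(confusion)
print(f"CCR = {ccr:.4f}, MS = {ms:.2f}")
```

The example shows why the two measures can pull in different directions: a classifier can raise CCR by favouring large classes while MS stays low for a small class.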
Chapter
This article introduces multimodal optimization (MMO) methods aiming to locate multiple optimal (or close to optimal) solutions for an optimization problem. MMO is an important topic that has practical relevance in problem solving across many fields. Many real-world optimization problems are multimodal by nature –in other words, processing in more than one mode. There often exist multiple satisfactory solutions. For such an optimization problem, it may be desirable to locate all global optima and/or some local optima that are considered as being satisfactory. MMO has practical relevance to many engineering problems. Optimization methods specifically designed for solving MMO problems, often called niching methods, are predominantly developed from the field of evolutionary computation that belongs to a family of stochastic optimization algorithms (or metaheuristic algorithms), including genetic algorithms, evolutionary strategies, particle swarm optimization, differential evolution, and so on. This article covers selected classic niching methods, along with performance measures and benchmark test function suites developed for evaluating niching methods. The article also presents a list of niching application examples and suggestions on further readings of niching methods.
Conference Paper
With the increasing complexity of modern industrial processes and equipment, a single fault diagnosis technology can no longer meet diagnostic needs. A complex diagnostic system that brings together a variety of different technologies is the future development trend of fault diagnosis. Because difficult fault diagnosis problems involve a large amount of characteristic information, a method combining principal component analysis with a BP neural network optimized by a genetic algorithm is presented in this paper. The method uses the dimension reduction idea of principal component analysis to select representative feature data from large amounts of raw data. The genetic algorithm then finds better initial weights and biases for the neural network and addresses the problem of the BP neural network falling into local extrema. A fault diagnosis simulation using the diesel fuel system of industrial equipment verifies the validity of the algorithm and leads to the conclusion that the method is effective.
Chapter
Evolutionary programming (EP) has a long history of development and application. This chapter provides a basic background in EP in terms of its procedure, history, extensions, and application. As one of the founding approaches in the field of evolutionary computation (EC), EP shares important similarities and differences with other EC approaches.
Article
The use of artificial neural networks as the objective function of optimization algorithms has been proposed in the recent past. In this paper, the use of an artificial neural network ensemble as the objective function in place of a single artificial neural network is proposed. An ensemble hybrid algorithm is developed using artificial neural networks and the bacterial foraging optimization algorithm for designing a fractal antenna of a rectenna system working at 2.45 GHz. Closed-form expressions are not available for fractal antennas, so the use of artificial intelligence techniques for their design is appropriate. Because size reduction in rectenna systems used in wireless devices is a significant research domain, driven by the demand for smaller handheld devices, the geometry of the antenna is selected to achieve this objective, and a size reduction of 34.39 % is attained. The bandwidth of the proposed antenna is also enhanced so that it can be used over a wide band. The performance of the proposed optimized fractal antenna is verified using simulation and experimental results.
Conference Paper
In this paper, we first survey the theoretical and historical backgrounds related to ensemble neural network rule extraction. Then we propose a new rule extraction method for ensemble neural networks. We also demonstrate that the use of ensemble neural networks produces higher recognition accuracy than individual neural networks; moreover, the extracted rules are more comprehensible. The rule extraction method we use is the Ensemble-Recursive-Rule eXtraction (E-Re-RX) algorithm. The E-Re-RX algorithm is an effective rule extraction algorithm for dealing with data sets that mix discrete and continuous attributes. In this algorithm, primary rules are generated, followed by secondary rules to handle only those instances that do not satisfy the primary rules, and then these rules are integrated. We show that this reduces the complexity of using multiple neural networks. This method achieves extremely high recognition rates, even with multiclass problems.
Article
Improving the performance of optimization algorithms is a continuously growing trend: powerful and stable algorithms are always in demand, especially nowadays when, in the majority of cases, computational power is not an issue. In this context, differential evolution (DE) is optimized by employing different approaches belonging to different research directions. The focus of the current review is on two main directions: (a) the replacement of manual control parameter setting with adaptive and self-adaptive methods; and (b) hybridization with other algorithms. The control parameters have a big influence on the algorithm's performance, and setting them correctly is crucial when striving to obtain optimal solutions. Since their values are problem dependent, setting them is not an easy task. The trial and error method initially used is time and resource consuming and, at the same time, does not guarantee optimal results. Therefore, new approaches were proposed, automatic control being one of the best solutions developed by researchers. Concerning hybridization, the aim was to combine two or more algorithms in order to eliminate or reduce the drawbacks of each individual algorithm. In this manner, different combinations at different levels were proposed. This work presents the main approaches mixing DE with global algorithms, DE with local algorithms and DE with global and local algorithms. In addition, special attention was given to the situations in which DE is employed as a local search procedure or DE principles are included in other global search methods.
Conference Paper
With advancement in the renewable energy sector being a focus of research these days, a novel neuro-evolutionary technique is proposed for modeling wind power forecasters. The paper uses the robust technique of Cartesian Genetic Programming to evolve ANNs for the development of forecasting models. These models predict the power generation of a wind-based power plant from a single hour up to a year ahead, taking a big lead over other proposed models by reducing the MAPE to as low as 1.049% for a single-day hourly prediction. Results, when compared with other models in the literature, demonstrate that the proposed models are among the best estimators of wind-based power generation proposed to date.
Conference Paper
An enhancement to the growth curve approach based on neuro-evolution is proposed to develop various forecasting models to investigate the state and worth of the producer, to market a new product. The forecasting model is obtained using a newly introduced neuro-evolutionary approach called Cartesian Genetic Programming based ANN (CGPANN). CGPANN helps in obtaining an optimum model for all the necessary parameters of an ANN. An accurate and computationally efficient model is obtained, achieving an accuracy as high as 93.37% on the time devised terrains, providing a general mechanism for forecasting models in mathematical agreement with its application in econometrics. Comparison with other contemporary models demonstrates the strength of the proposed model and thus its value in developing the growth curve approach for predicting the sustainability of new products. © IFIP International Federation for Information Processing 2014.
Article
Using a surrogate model to evaluate the expensive fitness of candidate solutions in an evolutionary algorithm can significantly reduce the overall computational cost of optimization tasks. In this paper we present a recurrent neural network ensemble that is used as a surrogate for the long-term prediction of computational fluid dynamic simulations. A hybrid multi-objective evolutionary algorithm that trains and optimizes the structure of the recurrent neural networks is introduced. Selection and combination of individual prediction models in the Pareto set of solutions is used to create the ensemble of predictors. Five selection methods are tested on six data sets and the accuracy of the ensembles is compared to the converged computational fluid dynamic data, as well as to the delta change between two flow conditions. Intermediate computational fluid dynamic data is used for training and the method presented can produce accurate and stable results using a third of the intermediate data needed for convergence.
Article
Using a coordinated group of simple solvers to tackle a complex problem is not an entirely new idea. Its root could be traced back hundreds of years ago when ancient Chinese suggested a team approach to problem solving. For a long time, engineers have used the divide-and-conquer strategy to decompose a complex problem into simpler sub-problems and then solve them by a group of solvers. However, knowing the best way to divide a complex problem into simpler ones relies heavily on the available domain knowledge. It is often a manual process by an experienced engineer. There have been few automatic divide-and-conquer methods reported in the literature. Fortunately, evolutionary computation provides some of the interesting avenues to automatic divide-and-conquer methods [15]. An in-depth study of such methods reveals that there is a deep underlying connection between evolutionary computation and ANN ensembles. Ideas in one area can be usefully transferred into another in producing effective algorithms. For example, using speciation to create and maintain diversity [15] had inspired the development of negative correlation learning for ANN ensembles [33], [34] and an in-depth study of diversity in ensembles [12], [51]. This paper will review some of the recent work in evolutionary approaches to designing ANN ensembles.
Chapter
This paper introduces a Genetic Algorithm (GA) for training Artificial Neural Networks (ANNs) using the electromagnetic spectrum signal of a combustion process for flame pattern classification. Combustion requires identification systems that provide information about the state of the process in order to make combustion more efficient and clean. Combustion is complex to model using conventional deterministic methods, which motivates the use of heuristics in this domain. ANNs have been successfully applied to combustion classification systems; however, traditional ANN training methods often get trapped in local minima of the error function and are inefficient for multimodal and non-differentiable functions. A GA is used here to overcome these problems. The proposed GA finds the weights of an ANN that best fit the training patterns with the highest classification rate.
Article
In this paper, we first review the theoretical and historical backgrounds on rule extraction from neural network ensembles. Because the structures of previous neural network ensembles were quite complicated, research on efficient rule extraction algorithms for neural network ensembles has been sparse, even though a practical need exists for rule extraction from Big Data datasets. We describe the Recursive-Rule extraction (Re-RX) algorithm, which is an important step toward handling large datasets. Then we survey the family of the Recursive-Rule extraction algorithm, i.e. the Multiple-MLP Ensemble Re-RX algorithm, and present concrete applications in financial and medical domains that require extremely high accuracy for classification rules. Finally, we mention two promising ideas to considerably enhance the accuracy of the Multiple-MLP Ensemble Re-RX algorithm. We also discuss developments in the near future that will allow the Multiple-MLP Ensemble Re-RX algorithm to extract much more accurate, concise, and comprehensible rules from mixed datasets.
Article
The neural network ensemble (NNE) is a very effective way to obtain good prediction performance by combining the outputs of several independently trained neural networks. Swarm intelligence is applied here to model the population of interacting agents or swarms that are able to self-organize. In this paper, we combine NNE and multi-population swarm intelligence to construct our improved neural network ensemble (INNE). First, each component forward neural network (FNN) is optimized by chaotic particle swarm optimization (CPSO) and the gradient descent (GD) algorithm. Second, in contrast to most existing NNE training algorithms, we adopt multiple, markedly different populations to construct swarm intelligence. As an example, one population is trained by particle swarm optimization (PSO) and the others are trained by differential evolution (DE) or the artificial bee colony algorithm (ABC). The ensemble weights are trained by a multi-population co-evolution PSO–ABC–DE chaotic searching algorithm (M-PSO–ABC–DE–CS). Our experiments demonstrate that the proposed novel INNE algorithm is superior to existing popular NNEs in function prediction.
Article
Artificial Neuron-Glia Networks (ANGNs) are a novel bio-inspired machine learning approach. They extend classical Artificial Neural Networks (ANNs) by incorporating recent findings and suppositions about the way information is processed by neural and astrocytic networks in the most evolved living organisms. Although ANGNs are not a consolidated method, their performance against the traditional approach, i.e. without artificial astrocytes, has already been demonstrated on classification problems. However, the corresponding learning algorithms developed so far strongly depend on a set of glial parameters which are manually tuned for each specific problem. As a consequence, preliminary experimental tests have to be done in order to determine an adequate set of values, making such manual parameter configuration time-consuming, error-prone, biased and problem dependent. Thus, in this paper, we propose a novel learning approach for ANGNs that fully automates the learning process and gives the possibility of testing any kind of reasonable parameter configuration for each specific problem. This new learning algorithm, based on coevolutionary genetic algorithms, is able to properly learn all the ANGN parameters. Its performance is tested on five classification problems, achieving significantly better results than ANGN and competitive results with ANN approaches.
Article
Within the context of learning a rule from examples, we study the general characteristics of learning with ensembles. The generalization performance achieved by a simple model ensemble of linear students is calculated exactly in the thermodynamic limit of a large number of input components and shows a surprisingly rich behavior. Our main findings are the following. For learning in large ensembles, it is advantageous to use underregularized students, which actually overfit the training data. Globally optimal generalization performance can be obtained by choosing the training set sizes of the students optimally. For smaller ensembles, optimization of the ensemble weights can yield significant improvements in ensemble generalization performance, in particular if the individual students are subject to noise in the training process. Choosing students with a wide range of regularization parameters makes this improvement robust against changes in the unknown level of corruption of the training data.
Article
The use of multi-objective evolutionary algorithms for the construction of neural ensembles is a relatively new area of research. We recently proposed an ensemble learning algorithm called DIVACE (DIVerse and ACcurate Ensemble learning algorithm). It was shown that DIVACE tries to find an optimal trade-off between diversity and accuracy as it searches for an ensemble for some particular pattern recognition task by treating these two objectives explicitly separately. A detailed discussion of DIVACE together with further experimental studies forms the essence of this paper. A new diversity measure which we call Pairwise Failure Crediting (PFC) is proposed. This measure forms one of the two evolutionary pressures being exerted explicitly in DIVACE. Experiments with this diversity measure as well as comparisons with previously studied approaches are hence considered. Detailed analysis of the results shows that DIVACE, as a concept, has promise.
Article
Ensemble approaches to classification and regression have attracted a great deal of interest in recent years. These methods can be shown both theoretically and empirically to outperform single predictors on a wide range of tasks. One of the elements required for accurate prediction when using an ensemble is recognised to be error “diversity”. However, the exact meaning of this concept is not clear from the literature, particularly for classification tasks. In this paper we first review the varied attempts to provide a formal explanation of error diversity, including several heuristic and qualitative explanations in the literature. For completeness of discussion we include not only the classification literature but also some excerpts of the rather more mature regression literature, which we believe can still provide some insights. We proceed to survey the various techniques used for creating diverse ensembles, and categorise them, forming a preliminary taxonomy of diversity creation methods. As part of this taxonomy we introduce the idea of implicit and explicit diversity creation methods, and three dimensions along which these may be applied. Finally we propose some new directions that may prove fruitful in understanding classification error diversity.
Conference Paper
We study the formal basis behind Negative Correlation (NC) Learning, an ensemble technique developed in the evolutionary computation literature. We show that by removing an assumption made in the original work, NC can be shown to be a derivative technique of the Ambiguity decomposition by Krogh and Vedelsby. From this formalisation, we calculate parameter bounds, and show significant improvements in empirical tests. We hypothesize that the reason for its success lies in rescaling an estimate of ensemble covariance; then show that during this rescaling, NC varies smoothly between a single neural network and an ensemble system. Finally we unify several other works in the literature, all of which have exploited the Ambiguity decomposition in some way, and term them the Ambiguity Family.
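The Ambiguity decomposition mentioned in the abstract above can be checked numerically. The sketch below uses the standard form of the decomposition for a convex combination of member outputs on one input: the squared error of the ensemble equals the weighted average member error minus the weighted spread of the members around the ensemble output. The function name and the sample values are hypothetical, and the sketch does not reproduce the NC penalty or the parameter bounds derived in the cited work.

```python
def ambiguity_decomposition(outputs, weights, target):
    """Return (ensemble_error, weighted_member_error, ambiguity) for one input.

    outputs: real-valued predictions of the ensemble members
    weights: non-negative combination weights, assumed to sum to 1
    target:  the desired output
    """
    ensemble = sum(w * f for w, f in zip(weights, outputs))
    # Weighted average of the members' individual squared errors.
    weighted_member_error = sum(
        w * (f - target) ** 2 for w, f in zip(weights, outputs)
    )
    # Ambiguity: weighted spread of the members around the ensemble output.
    ambiguity = sum(w * (f - ensemble) ** 2 for w, f in zip(weights, outputs))
    return (ensemble - target) ** 2, weighted_member_error, ambiguity


# Hypothetical values: three members, convex weights.
ens_err, member_err, amb = ambiguity_decomposition(
    outputs=[0.2, 0.6, 1.0], weights=[0.5, 0.25, 0.25], target=0.4
)
# The identity ens_err == member_err - amb holds up to rounding.
print(ens_err, member_err - amb)
```

Because the ambiguity term is never negative, the ensemble error on any input is no larger than the weighted average member error, which is one common way of explaining why disagreement among members can help.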
Article
Full-text available
Diversity among the base classifiers is deemed to be important when constructing a classifier ensemble. Numerous algorithms have been proposed to construct a good classifier ensemble by seeking both the accuracy of the base classifiers and the diversity among them. However, there is no generally accepted definition of diversity, and measuring the diversity explicitly is very difficult. Although researchers have designed several experimental studies to compare different diversity measures, usually confusing results were observed. In this paper, we present a theoretical analysis on six existing diversity measures (namely disagreement measure, double fault measure, KW variance, inter-rater agreement, generalized diversity and measure of difficulty), show underlying relationships between them, and relate them to the concept of margin, which is more explicitly related to the success of ensemble learning algorithms. We illustrate why confusing experimental results were observed and show that the discussed diversity measures are naturally ineffective. Our analysis provides a deeper understanding of the concept of diversity, and hence can help design better ensemble learning algorithms.
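Two of the pairwise measures named in this abstract can be sketched in a few lines. The example assumes the definitions commonly used in the diversity literature: the disagreement measure is the fraction of examples on which exactly one of two classifiers is correct, and the double-fault measure is the fraction on which both are wrong. The function name and the toy data are hypothetical.

```python
def pairwise_diversity(correct_a, correct_b):
    """Disagreement and double-fault measures for two classifiers.

    Each argument is a list of booleans: True where that classifier
    labelled the example correctly (so-called oracle outputs).
    """
    if len(correct_a) != len(correct_b) or len(correct_a) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(correct_a)
    pairs = list(zip(correct_a, correct_b))
    disagreement = sum(1 for a, b in pairs if a != b) / n
    double_fault = sum(1 for a, b in pairs if not a and not b) / n
    return disagreement, double_fault


# Hypothetical correctness records of two classifiers on five examples.
clf_a = [True, True, False, False, True]
clf_b = [True, False, True, False, True]
disagreement, double_fault = pairwise_diversity(clf_a, clf_b)
print(disagreement, double_fault)
```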
Article
Full-text available
This paper reviews research on combining artificial neural nets, and provides an overview of, and an introduction to, the papers contained in this special issue, and its companion (Connection Science, 9, 1). Two main approaches, ensemble-based, and modular, are identified and considered. An ensemble, or committee, is made up of a set of nets, each of which is a general function approximator. The members of the ensemble are combined in order to obtain better generalization performance than would be achieved by any of the individual nets. The main issues considered here under the heading of ensemble-based approaches are (a) how to combine the outputs of the ensemble members, (b) how to create candidate ensemble members, and (c) which methods lead to the most effective ensembles. Under the heading of modular approaches, we begin by considering a divide-and-conquer approach by which a function is automatically decomposed into a number of subfunctions which are treated by specialist modules. Other modular approaches are also identified and considered, for while the divide-and-conquer approach is designed to improve performance, the term modularity can be given a wider interpretation. The broadly defined topic of modularity includes the explicit decomposition of a task based on the designer's understanding, and the exploitation of specialist modules in order to accomplish tasks which could not be performed by a monolithic net.
Article
Full-text available
Standard methods for simultaneously inducing the structure and weights of recurrent neural networks limit every task to an assumed class of architectures. Such a simplification is necessary since the interactions between network structure and function are not well understood. Evolutionary computations, which include genetic algorithms and evolutionary programming, are population-based search methods that have shown promise in many similarly complex tasks. This paper argues that genetic algorithms are inappropriate for network acquisition and describes an evolutionary program, called GNARL, that simultaneously acquires both the structure and weights for recurrent networks. GNARL's empirical acquisition method allows for the emergence of complex behaviors and topologies that are potentially excluded by the artificial architectural constraints imposed in standard network induction methods.
Conference Paper
Full-text available
In this paper, we present a comparison between two multiobjective formulations to the formation of neuro-ensembles. The first formulation splits the training set into two nonoverlapping stratified subsets and forms an objective to minimize the training error on each subset, while the second formulation adds random noise to the training set to form a second objective. A variation of the memetic Pareto artificial neural network (MPANN) algorithm is used. MPANN is based on differential evolution for continuous optimization. The ensemble is formed from all networks on the Pareto frontier. It is found that the first formulation outperformed the second. The first formulation is also found to be competitive with other methods in the literature.
Conference Paper
Full-text available
The Pareto differential evolution (PDE) algorithm was introduced and showed competitive results. The behavior of PDE, as in many other evolutionary multiobjective optimization (EMO) methods, varies according to the crossover and mutation rates. In this paper, we present a new version of PDE with self-adaptive crossover and mutation. We call the new version self-adaptive Pareto differential evolution (SPDE). The emphasis of this paper is to analyze the dynamics and behavior of SPDE. The experiments also show that the algorithm is very competitive with other EMO algorithms.
Conference Paper
Full-text available
The use of evolutionary algorithms (EAs) to solve problems with multiple objectives (known as multi-objective optimization problems (MOPs)) has attracted much attention. Being population-based approaches, EAs offer a means to find a group of Pareto-optimal solutions in a single run. Differential evolution (DE) is an EA that was developed to handle optimization problems over continuous domains. The objective of this paper is to introduce a novel Pareto-frontier differential evolution (PDE) algorithm to solve MOPs. The solutions provided by the proposed algorithm for two standard test problems outperform the Strength Pareto Evolutionary Algorithm, one of the state-of-the-art evolutionary algorithms for solving MOPs.
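The notion of a Pareto-optimal set that this abstract relies on can be illustrated with a plain dominance filter. This is only a sketch of the dominance test, not the PDE algorithm itself; it assumes every objective is minimised, and the function names and sample points are hypothetical.

```python
def dominates(a, b):
    """True if objective vector a Pareto-dominates b (all objectives minimised)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points):
    """Keep the points that no other point in the list dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical two-objective values, e.g. (error on subset 1, error on subset 2).
candidates = [(1, 5), (2, 2), (3, 3), (5, 1)]
front = pareto_front(candidates)
print(front)
```

Here (3, 3) is dropped because (2, 2) is at least as good on both objectives and strictly better on one; the remaining points are mutually non-dominated.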
Conference Paper
Full-text available
Various schemes for combining genetic algorithms and neural networks have been proposed and tested in recent years, but the literature is scattered among a variety of journals, proceedings and technical reports. Activity in this area is clearly increasing. The authors provide an overview of this body of literature drawing out common themes and providing, where possible, the emerging wisdom about what seems to work and what does not.
Article
Full-text available
Studies evolutionary programming with mutations based on the Levy probability distribution. The Levy probability distribution has an infinite second moment and is, therefore, more likely to generate an offspring that is farther away from its parent than the commonly employed Gaussian mutation. Such likelihood depends on a parameter α in the Levy distribution. We propose an evolutionary programming algorithm using adaptive as well as nonadaptive Levy mutations. The proposed algorithm was applied to multivariate functional optimization. Empirical evidence shows that, in the case of functions having many local optima, the performance of the proposed algorithm was better than that of classical evolutionary programming using Gaussian mutation.
Chapter
Learning and evolution are two fundamental processes of adaptation. Various models have been proposed to explain their behaviour. Rather than discussing these models in detail, this paper concentrates on the interaction between learning and evolution as well as the interaction between different levels of evolution. We will argue that the evolution of learning rules and its interaction with other evolutionary developments (in either artificial or biological systems) plays a key role in accounting for the creativity of those systems. We will concentrate on two models of learning and evolution: connectionist learning (artificial neural networks, or ANNs) and genetic algorithms (GAs).
Article
In this paper, we study a number of objective functions for training new hidden units in constructive algorithms for multilayer feedforward networks. The aim is to derive a class of objective functions the computation of which and the corresponding weight updates can be done in O(N) time, where N is the number of training patterns. Moreover, even though input weight freezing is applied during the process for computational efficiency, the convergence property of the constructive algorithms using these objective functions is still preserved. We also propose a few computational tricks that can be used to improve the optimization of the objective functions under practical situations. Their relative performance in a set of two-dimensional regression problems is also discussed.
Article
Machine-learning research has been making great progress in many directions. This article summarizes four of these directions and discusses some current open problems. The four directions are (1) the improvement of classification accuracy by learning ensembles of classifiers, (2) methods for scaling up supervised learning algorithms, (3) reinforcement learning, and (4) the learning of complex stochastic models.
Article
Learning and evolution are two fundamental forms of adaptation. There has been a great interest in combining learning and evolution with artificial neural networks (ANN's) in recent years. This paper: 1) reviews different combinations between ANN's and evolutionary algorithms (EA's), including using EA's to evolve ANN connection weights, architectures, learning rules, and input features; 2) discusses different search operators which have been used in various EA's; and 3) points out possible future research directions. It is shown, through a considerably large literature review, that combinations between ANN's and EA's can lead to significantly better intelligent systems than relying on ANN's or EA's alone.
Article
In trying to solve multiobjective optimization problems, many traditional methods scalarize the objective vector into a single objective. In those cases, the obtained solution is highly sensitive to the weight vector used in the scalarization process and demands that the user have knowledge about the underlying problem. Moreover, in solving multiobjective problems, designers may be interested in a set of Pareto-optimal points, instead of a single point. Since genetic algorithms (GAs) work with a population of points, it seems natural to use GAs in multiobjective optimization problems to capture a number of solutions simultaneously. Although a vector evaluated GA (VEGA) has been implemented by Schaffer and has been tried to solve a number of multiobjective problems, the algorithm seems to have bias toward some regions. In this paper, we investigate Goldberg's notion of nondominated sorting in GAs along with a niche and speciation method to find multiple Pareto-optimal points simultaneously. The proof-of-principle results obtained on three problems used by Schaffer and others suggest that the proposed method can be extended to higher dimensional and more difficult multiobjective problems. A number of suggestions for extension and application of the algorithm are also discussed.
Book
1 Introduction.- 1.1 Adaptive Signal Processing.- 1.2 The Adaptive Filter.- 1.3 Modes of Operation.- 1.4 Application of Adaptive Filters.- 1.5 Summary.- 2 Adaptive Fir Filter Algorithms.- 2.1 Introduction.- 2.2 Optimum Linear Estimation.- 2.2.1 The Optimum FIR Filter.- 2.2.2 FIR System Identification.- 2.3 Sampled Matrix Inversion.- 2.4 Least Squares Estimation.- 2.4.1 Recursive Least Squares.- 2.4.2 Data Windows.- 2.4.3 Fast Algorithms.- 2.4.4 Properties of the Least Squares Estimate.- 2.5 Stochastic Gradient Methods.- 2.5.1 The Least Mean Squares Algorithm.- 2.5.2 The Block Least Mean Squares Algorithm.- 2.6 Self-Orthogonalising Algorithms.- 2.6.1 The Sliding DFT Adaptive Filter.- 2.7 Summary and Complexity Comparison.- 3 Performance Comparisons.- 3.1 Introduction.- 3.2 System Identification.- 3.3 Channel Equalisation.- 3.4 Summary and Conclusions.- 4 A Self-Orthogonalising Block Adaptive Filter.- 4.1 Introduction.- 4.2 Theoretical Development.- 4.2.1 Comparison of Theory with Simulation.- 4.3 A Practical Algorithm.- 4.4 Computational Complexity.- 4.5 Simulation Results.- 4.6 Conclusions.- 5 The Infinite Impulse Response Linear Equaliser.- 5.1 Introduction.- 5.2 The Linear Equaliser.- 5.2.1 Structure of an IIR Equaliser.- 5.3 FIR and IIR Equaliser Performance.- 5.4 System Identification.- 5.4.1 Adaptive IIR Solutions.- 5.5 Conclusions.- 6 An Adaptive IIR Equaliser.- 6.1 Introduction.- 6.2 The Kalman Filter.- 6.3 The Kalman Filter as an IIR Equaliser.- 6.4 An Adaptive Kalman Equaliser.- 6.4.1 System Identification.- 6.4.2 Model Uncertainty.- 6.4.3 Verification of Compensation Technique.- 6.4.4 Comparison with an RLS FIR Equaliser.- 6.4.5 Computational Complexity.- 6.5 RLS System Identification.- 6.6 Conclusions.- 7 Conclusions.- 7.1 Summary.- 7.2 Limitations and Further Work.- Appendix A The Fast Kalman Algorithm.- Appendix B The RLS Lattice Algorithm.- Appendix C Circular and Linear Convolution.- References.
Article
Multi-network systems, i.e. multiple neural network systems, can often solve complex problems more effectively than their monolithic counterparts. Modular neural networks (MNNs) tackle a complex problem by decomposing it into simpler subproblems and then solving them. Unlike the decomposition in MNNs, a neural network ensemble usually includes redundant component nets and is often inspired by statistical theories. This paper presents different types of problem decompositions and discusses the suitability of various multi-network systems for different decompositions. A classification of various multi-network systems, in the context of problem decomposition, is obtained by exploiting these differences. Then a specific type of problem decomposition, which gives no information about the subproblems and is often ignored in literature, is discussed in detail and a novel MNN architecture for problem decomposition is presented. Finally, a co-evolutionary model is presented, which is used to design and optimize such MNNs with subtask specific modules. The model consists of two populations. The first population consists of a pool of modules and the second population synthesizes complete systems by drawing elements from the pool of modules. Modules represent a part of the solution, which co-operate with each other to form a complete solution. Using two artificial supervised learning tasks, constructed from smaller subtasks, it can be shown that if a particular task decomposition is better than others, in terms of performance on the overall task, it can be evolved using the co-evolutionary model.
Article
Evolutionary programming is a method for simulating evolution that emphasizes the behavioral rather than the genetic relationship of parents and their offspring. In a typical evolutionary program, every parent simultaneously generates a number of offspring, which are all subsequently placed in competition. Evolution can be abstracted as a more continuous process by generating only a single offspring from one parent and then immediately placing it in competition with all existing solutions. Some theoretical observations are made with respect to this new model. The results of empirical trials on a test landscape with multiple local minima indicate that the standard method of reproduction and selection may be more appropriate for practical optimization problems.
Article
The use of backpropagation for training artificial neural networks (ANNs) is usually associated with a long training process. The user needs to experiment with a number of network architectures; with larger networks, more computational cost in terms of training time is required. The objective of this letter is to present an optimization algorithm, comprising a multiobjective evolutionary algorithm and a gradient-based local search. In the rest of the letter, this is referred to as the memetic Pareto artificial neural network algorithm for training ANNs. The evolutionary approach is used to train the network and simultaneously optimize its architecture. The result is a set of networks, with each network in the set attempting to optimize both the training error and the architecture. We also present a self-adaptive version with lower computational cost. We show empirically that the proposed method is capable of reducing the training time compared to gradient-based techniques.
Article
Many techniques for model selection in the field of neural networks correspond to well established statistical methods. For example, architecture modifications based on test variables calculated after convergence of the training process can be viewed as part of a hypothesis testing search, and the use of complexity penalty terms is essentially a type of regularization or biased regression. The method of “stopped” or “cross-validation” training, on the other hand, in which an oversized network is trained until the error on a further validation set of examples deteriorates, then training is stopped, is a true innovation since model selection doesn't require convergence of the training process. Here, the training process is used to perform a directed search of the parameter space for a model which doesn't overfit the data and thus demonstrates superior generalization performance. In this paper we show that this performance can be significantly enhanced by expanding the “nonconvergent method” of stopped training to include dynamic topology modifications (dynamic weight pruning) and modified complexity penalty term methods in which the weighting of the penalty term is adjusted during the training process. On an extensive sequence of simulation examples we demonstrate the general superiority of the “extended” nonconvergent methods compared to classical penalty term methods, simple stopped training, and methods which only vary the number of hidden units.
Article
The paper describes changes in the structure (anatomy) of a multilevel neural network which is trained by the back-propagation algorithm. A structural change is both a random process and an environment-dependent process. The mechanisms of cell propagation and degeneration of inactive synapses and inactive cells are considered. The change in the neural network structure occurs during the training and running of the system, which enables dynamic adaptation of the network capacity to the complex problem of the network interaction with the environment. Structural changes in a multilevel neural network are tested on several tasks of the network training.
Article
The number of digits it takes to write down an observed sequence x1, …, xN of a time series depends on the model with its parameters that one assumes to have generated the observed data. Accordingly, by finding the model which minimizes the description length one obtains estimates of both the integer-valued structure parameters and the real-valued system parameters.
Conference Paper
The formation of a neural network ensemble has attracted much attention in the machine learning literature. A set of trained neural networks are combined using a post-gate to form a single super-network. One main challenge in the literature is to decide on which network to include in, or exclude from, the ensemble. Another challenge is how to define an optimum size for the ensemble. Some researchers also claim that for an ensemble to be effective, the networks need to be different. However, there is not a consistent definition of what “different” means. Some take it to mean weakly correlated networks, networks with different bias-variance trade-off, and/or networks which are specialized on different parts of the input space. In this paper, we present a theoretically sound approach for the formation of neural network ensembles. The approach is based on the dominance concept that determines which network to include/exclude, identifies a suitable size for the ensemble, and provides a mechanism for quantifying differences between networks. The approach was tested on a number of standard datasets and showed competitive results.
Book
This Third Edition provides the latest tools and techniques that enable computers to learn. The Third Edition of this internationally acclaimed publication provides the latest theory and techniques for using simulated evolution to achieve machine intelligence. As a leading advocate for evolutionary computation, the author has successfully challenged the traditional notion of artificial intelligence, which essentially programs human knowledge fact by fact, but does not have the capacity to learn or adapt as evolutionary computation does. Readers gain an understanding of the history of evolutionary computation, which provides a foundation for the author's thorough presentation of the latest theories shaping current research. Balancing theory with practice, the author provides readers with the skills they need to apply evolutionary algorithms that can solve many of today's intransigent problems by adapting to new challenges and learning from experience. Several examples are provided that demonstrate how these evolutionary algorithms learn to solve problems. In particular, the author provides a detailed example of how an algorithm is used to evolve strategies for playing chess and checkers. As readers progress through the publication, they gain an increasing appreciation and understanding of the relationship between learning and intelligence. Readers familiar with the previous editions will discover much new and revised material that brings the publication thoroughly up to date with the latest research, including the latest theories and empirical properties of evolutionary computation. The Third Edition also features new knowledge-building aids. Readers will find a host of new and revised examples. New questions at the end of each chapter enable readers to test their knowledge. Intriguing assignments that prepare readers to manage challenges in industry and research have been added to the end of each chapter as well.
This is a must-have reference for professionals in computer and electrical engineering; it provides them with the very latest techniques and applications in machine intelligence. With its question sets and assignments, the publication is also recommended as a graduate-level textbook. © 2006 The Institute of Electrical and Electronics Engineers, Inc.
Article
New neural learning algorithms are often benchmarked only poorly. This article gathers some important DOs and DON'Ts for researchers in order to improve on that situation. The essential requirements are (1) Volume: benchmarking has to be broad enough, i.e. must use several problems; (2) Validity: common errors that invalidate the results have to be avoided; (3) Reproducibility: benchmarking has to be documented well enough to be completely reproducible; and (4) Comparability: benchmark results should, if possible, be directly comparable with the results achieved by others using different algorithms.
Article
Evolutionary artificial neural networks (EANNs) can be considered as a combination of artificial neural networks (ANNs) and evolutionary search procedures such as genetic algorithms (GAs). This paper distinguishes among three levels of evolution in EANNs, i.e. the evolution of connection weights, architectures and learning rules. It first reviews each kind of evolution in detail and then analyses major issues related to each kind of evolution. It is shown in the paper that although there is a lot of work on the evolution of connection weights and architectures, research on the evolution of learning rules is still in its early stages. Interactions among different levels of evolution are far from being understood. It is argued in the paper that the evolution of learning rules and its interactions with other levels of evolution play a vital role in EANNs.
Article
The influence of a macroscopic time-dependent threshold on the retrieval dynamics of attractor associative memory models with ternary neurons {-1, 0, +1} is examined. If the threshold is chosen appropriately as a function of the cross-talk noise and of the activity of the memorized patterns in the model, adapting itself in the course of the time evolution, it guarantees an autonomous functioning of the model. Especially in the limit of sparse coding, it is found that this self-control mechanism considerably improves the quality of the fixed-point retrieval dynamics, in particular the storage capacity, the basins of attraction and the information content. The mutual information is shown to be the relevant parameter to study the retrieval quality of such sparsely coded models. Numerical results confirm these observations.
Article
Neural network-based modeling often involves trying multiple networks with different architectures and training parameters in order to achieve acceptable model accuracy. Typically, one of the trained networks is chosen as best, while the rest are discarded. [Hashem and Schmeiser (1995)] proposed using optimal linear combinations of a number of trained neural networks instead of using a single best network. Combining the trained networks may help integrate the knowledge acquired by the component networks and thus improve model accuracy. In this paper, we extend the idea of optimal linear combinations (OLCs) of neural networks and discuss issues related to the generalization ability of the combined model. We then present two algorithms for selecting the component networks for the combination to improve the generalization ability of OLCs. Our experimental results demonstrate significant improvements in model accuracy, as a result of using OLCs, compared to using the apparent best network. Copyright 1997 Elsevier Science Ltd.
Article
Neural network methods have proven to be powerful tools in modelling of nonlinear processes. One crucial part of modelling is the training phase where the model parameters are adjusted so that the model performs the desired operation as well as possible. Besides parameter estimation, an important problem is to select a suitable model structure. With a bad structure we potentially run into problems like underfitting, overfitting or wasting computational resources. One approach for structure learning is to use constructive methods, where training begins with minimal structure, and then more parameters are added when needed according to some predefined rule. This kind of constructive solution has also become more attractive in neural networks literature where one of the most well known constructive techniques is cascade-correlation (CC) learning. Inspired by CC we propose and study a similar technique called constructive backpropagation (CBP). We show that CBP is computationally just as efficient as the CC algorithm even though we need to backpropagate the error through no more than one hidden layer. Further, CBP has the same constructive benefits as CC, but in addition CBP benefits from simpler implementation and the ability to utilize stochastic optimization routines. Moreover, we show how CBP can be extended to allow addition of multiple new units simultaneously and how it can be used to perform continuous automatic structure adaptation. This includes both addition and deletion of units. The performance of CBP learning is studied with time series modelling experiments which demonstrate that CBP can provide significantly better modelling capabilities compared to CC learning.
Article
This paper presents a learning approach, i.e. negative correlation learning, for neural network ensembles. Unlike previous learning approaches for neural network ensembles, negative correlation learning attempts to train individual networks in an ensemble and combines them in the same learning process. In negative correlation learning, all the individual networks in the ensemble are trained simultaneously and interactively through the correlation penalty terms in their error functions. Rather than producing unbiased individual networks whose errors are uncorrelated, negative correlation learning can create negatively correlated networks to encourage specialisation and cooperation among the individual networks. Empirical studies have been carried out to show why and how negative correlation learning works. The experimental results show that negative correlation learning can produce neural network ensembles with good generalisation ability.
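The abstract refers to correlation penalty terms without writing them out. The sketch below assumes the form usually quoted for negative correlation learning, in which network i minimises e_i = 0.5 (F_i - d)^2 + lambda * p_i with p_i = (F_i - F_mean) * sum over j != i of (F_j - F_mean), evaluated here at a single data point. The formula, the function name, and the toy numbers should be read as assumptions for illustration, not as a transcription of the paper.

```python
def nc_errors(outputs, target, lam):
    """Per-network error with a correlation penalty, for one data point.

    outputs: list of network outputs F_i
    target:  desired value d
    lam:     penalty strength lambda (lam = 0 gives independent training)
    """
    m = len(outputs)
    mean = sum(outputs) / m
    errors = []
    for i, f_i in enumerate(outputs):
        # Sum of the other networks' deviations from the ensemble mean.
        others = sum(f_j - mean for j, f_j in enumerate(outputs) if j != i)
        penalty = (f_i - mean) * others
        errors.append(0.5 * (f_i - target) ** 2 + lam * penalty)
    return errors


# Hypothetical outputs of three networks for a target value of 3.0.
with_penalty = nc_errors([1.0, 2.0, 4.0], 3.0, lam=0.5)
without_penalty = nc_errors([1.0, 2.0, 4.0], 3.0, lam=0.0)
print(with_penalty, without_penalty)
```

Since the deviations from the mean sum to zero, the penalty for network i reduces to minus its own squared deviation, so a positive lambda rewards a network for moving away from the ensemble mean while the first term still pulls it toward the target.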
Article
This paper presents a new evolutionary system, i.e., EPNet, for evolving artificial neural networks (ANNs). The evolutionary algorithm used in EPNet is based on Fogel's evolutionary programming (EP). Unlike most previous studies on evolving ANN's, this paper puts its emphasis on evolving ANN's behaviors. Five mutation operators proposed in EPNet reflect such an emphasis on evolving behaviors. Close behavioral links between parents and their offspring are maintained by various mutations, such as partial training and node splitting. EPNet evolves ANN's architectures and connection weights (including biases) simultaneously in order to reduce the noise in fitness evaluation. The parsimony of evolved ANN's is encouraged by preferring node/connection deletion to addition. EPNet has been tested on a number of benchmark problems in machine learning and ANNs, such as the parity problem, the medical diagnosis problems, the Australian credit card assessment problem, and the Mackey-Glass time series prediction problem. The experimental results show that EPNet can produce very compact ANNs with good generalization ability in comparison with other algorithms.
Conference Paper
Differential Evolution (DE) has recently proven to be an efficient method for optimizing real-valued multi-modal objective functions. Besides its good convergence properties and suitability for parallelization, DE's main assets are its conceptual simplicity and ease of use. This paper describes two variants of DE which were used to minimize the real test functions of the ICEC'96 contest.
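The abstract does not say which two DE variants were used, so the following is only a minimal sketch of one widely described scheme (often labelled DE/rand/1/bin): each trial vector is built from three other randomly chosen population members and replaces its parent only if it is no worse. The parameter defaults, the clamping to bounds, and the sphere test function are assumptions chosen for illustration.

```python
import random


def differential_evolution(objective, bounds, pop_size=20, f=0.5, cr=0.9,
                           generations=200, seed=0):
    """Minimise objective over box bounds with a basic DE/rand/1/bin loop."""
    rng = random.Random(seed)
    dim = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    scores = [objective(x) for x in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # Three distinct members, all different from the parent i.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            forced = rng.randrange(dim)  # at least one mutated coordinate
            trial = []
            for k in range(dim):
                if k == forced or rng.random() < cr:
                    value = pop[a][k] + f * (pop[b][k] - pop[c][k])
                    lo, hi = bounds[k]
                    value = min(max(value, lo), hi)  # clamp to the box
                else:
                    value = pop[i][k]
                trial.append(value)
            trial_score = objective(trial)
            if trial_score <= scores[i]:  # greedy one-to-one replacement
                pop[i], scores[i] = trial, trial_score
    best = min(range(pop_size), key=scores.__getitem__)
    return pop[best], scores[best]


def sphere(x):
    """Simple unimodal test function with its minimum of 0 at the origin."""
    return sum(v * v for v in x)


best_x, best_score = differential_evolution(sphere, [(-5.0, 5.0)] * 3)
print(best_x, best_score)
```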
Conference Paper
The specification of neural net architectures by genetic algorithm (GA) is thought to be hampered by difficulties with crossover. This is the `permutation' or `competing conventions' problem: similar nets may have the hidden units defined in different orders so that they have very dissimilar genetic strings, preventing successful recombination of building blocks. Previous empirical tests of a number of recombination operators using a simulated net-building task indicated the superiority of one that sorts hidden unit definitions by overlap prior to crossover. However, simple crossover also fared well, suggesting that the permutation problem is not serious in practice. This is supported by an observed reduction in performance when the permutation problem is removed. The GA is shown to be able to resolve the permutations, so that the advantages of an increase in the number of maxima outweigh the difficulties of recombination.
Article
Based on negative correlation learning and evolutionary learning, this paper presents evolutionary ensembles with negative correlation learning (EENCL) to address the issues of automatic determination of the number of individual neural networks (NNs) in an ensemble and the exploitation of the interaction between individual NN design and combination. The idea of EENCL is to encourage different individual NNs in the ensemble to learn different parts or aspects of the training data so that the ensemble can learn better the entire training data. The cooperation and specialization among different individual NNs are considered during the individual NN design. This provides an opportunity for different NNs to interact with each other and to specialize. Experiments on two real-world problems demonstrate that EENCL can produce NN ensembles with good generalization ability.