Article · PDF Available

Abstract

In Nature, living beings improve their adaptation to their surrounding environment by means of two main orthogonal processes: evolution and lifetime learning. Within the Artificial Intelligence arena, both mechanisms have inspired the development of non-orthodox problem solving tools, namely Genetic and Evolutionary Algorithms (GEAs) and Artificial Neural Networks (ANNs). In the past, several gradient-based methods have been developed for ANN training, with considerable success. However, in some situations, these may become trapped in local minima of the error surface. In such cases, combining evolution and learning techniques may yield better results, ideally reaching the global optimum. Comparative tests carried out on classification and regression tasks support this claim.
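The hybrid the abstract describes can be illustrated in miniature: a genetic algorithm evolves a population of candidate solutions, while each individual also undergoes lifetime learning (a few gradient steps), and, in the Lamarckian variant, the learned improvement is written back into the genome. The sketch below is purely illustrative (a linear model on a toy regression task, with hypothetical population size and learning rates), not the paper's exact algorithm.

```python
import random

random.seed(0)

# Toy regression target: y = 2*x + 1, fit by a single linear "network"
# with weight w and bias b. Illustrative only; all parameters are assumptions.
DATA = [(x / 10.0, 2 * x / 10.0 + 1) for x in range(-10, 11)]

def rmse(w, b):
    return (sum((w * x + b - y) ** 2 for x, y in DATA) / len(DATA)) ** 0.5

def learn(w, b, steps=20, lr=0.1):
    """Lifetime learning: a few gradient-descent steps on the MSE."""
    n = len(DATA)
    for _ in range(steps):
        gw = sum(2 * (w * x + b - y) * x for x, y in DATA) / n
        gb = sum(2 * (w * x + b - y) for x, y in DATA) / n
        w, b = w - lr * gw, b - lr * gb
    return w, b

def lamarckian_ga(pop_size=10, generations=15):
    pop = [(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(pop_size)]
    for _ in range(generations):
        # Lamarckian step: learned improvements are written back to the genome.
        pop = [learn(w, b) for w, b in pop]
        pop.sort(key=lambda g: rmse(*g))          # selection: keep the fittest half
        parents = pop[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            (w1, b1), (w2, b2) = random.sample(parents, 2)
            w, b = (w1 + w2) / 2, (b1 + b2) / 2   # crossover: midpoint of parents
            w += random.gauss(0, 0.1)             # mutation: gaussian noise
            b += random.gauss(0, 0.1)
            children.append((w, b))
        pop = parents + children
    return min(pop, key=lambda g: rmse(*g))

best = lamarckian_ga()
print(round(rmse(*best), 3))
```

Because the elite individuals are re-learned every generation, gradient steps accumulate across generations and the population converges far faster than either mechanism alone, which is the effect the comparative tests measure.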
[Figures: (1) a schematic of the Lamarckian hybrid cycle — the genetic operators (Selection, Crossover, Mutation) act on the Population, with Lamarckian Learning linked to the genotype through Encode/Decode steps; (2) a surface plot over X, Y, Z on the interval [-1, 1]; (3) eight convergence plots of error (RMSE) versus time (seconds), comparing the Connectionist Model, Darwinian Model, Lamarckian Model, and a Population of Connectionist Models.]
... These findings confirm those obtained by Sasaki and Tokoro (1997): Lamarckian evolution is efficient in static environments, but performs poorly and is unstable in dynamic environments. Further studies of Lamarckian models can be found in (Cortez et al., 2002; Rocha et al., 2000), where the Lamarckian approach achieved good results on machine learning benchmarks. ...
... Genetic and evolutionary computation comprises a group of techniques that includes genetic algorithms, evolution strategies, genetic programming and evolutionary programming (Cortez et al., 2002). As the authors state, there is no clear division between these techniques, and they overlap. Among them, genetic algorithms (GA) have proven to be robust and efficient methods for solving optimization problems. GAs, like artificial neural networks (ANN), were also inspired by ...
Article
Full-text available
We review the integration of genetic and evolutionary techniques with artificial neural networks. A Lamarckian model based on genetic algorithms and artificial neural networks is proposed: the genetic algorithm evolves the population while the artificial neural network performs the learning process. A direct encoding scheme was used. The model was applied to several data sets and provided good results, exhibiting superior robustness compared with the Levenberg-Marquardt and Scaled Conjugate Gradient algorithms. It also achieved the best solutions on the regression problems.
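A direct encoding scheme, as mentioned in the abstract, maps every connection weight of a fixed-topology network one-to-one onto a position in a flat real-valued chromosome. The sketch below shows one plausible way to implement the encode/decode pair; the 3-4-1 topology and the helper names are assumptions for illustration, not the paper's code.

```python
import numpy as np

np.random.seed(1)

# Hypothetical 3-4-1 network: hidden weights W1, hidden biases b1,
# output weights W2, output bias b2.
SHAPES = [(3, 4), (4,), (4, 1), (1,)]

def encode(params):
    """Network weights -> chromosome (one flat vector, direct encoding)."""
    return np.concatenate([p.ravel() for p in params])

def decode(chromosome):
    """Chromosome -> network weights; the exact inverse of encode()."""
    params, i = [], 0
    for shape in SHAPES:
        size = int(np.prod(shape))
        params.append(chromosome[i:i + size].reshape(shape))
        i += size
    return params

params = [np.random.randn(*s) for s in SHAPES]
chrom = encode(params)          # genotype the GA operates on
restored = decode(chrom)        # phenotype the ANN trains with
print(len(chrom))               # 3*4 + 4 + 4*1 + 1 = 21 genes
```

Because the mapping is a bijection, weights improved by the learning step can be encoded straight back into the chromosome, which is exactly what the Lamarckian scheme requires.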
... A Darwinian neural network learns specific tasks through interactions with an unknown environment, and its behavior develops according to the experience gained (Manderick, 1991). A Lamarckian neural network, in contrast, is a non-orthodox problem solving tool that combines evolution and learning techniques (Cortez et al., 2002). ...
Poster
Full-text available
To this day, the scientific community has firmly rejected the Theory of Inheritance of Acquired Characteristics, a theory mostly associated with the name of Jean-Baptiste Lamarck (1744-1829). Though largely dismissed when applied to biological organisms, this theory has found a place in the young discipline of Artificial Life. Based on two models of Darwinian and Lamarckian evolutionary theory using neural networks and genetic algorithms, this paper presents a notion of how life might be if knowledge were inherited according to a Lamarckian scheme. Such evolutionary models turn out to be useful not only in engineering, but also seem to have significant philosophical implications.
... A MO was used in Chabbouh et al. (2019) to evolve DTs but it only optimized predictive performance measures (Precision and Recall), while in Czajkowski and Kretowski (2019b) a MO was adopted to optimize oblique and mixed DT model complexity and predictive performance for regression tasks. Moreover, LE can use a local learning procedure to accelerate evolution, where the improved solution is encoded back into the chromosome (Cortez et al., 2002). Our work is the only study that introduces a LE, which uses a fast local ML search to improve the GE solutions. ...
Article
The worldwide adoption of mobile devices is raising the value of Mobile Performance Marketing, which is supported by Demand-Side Platforms (DSP) that match mobile users to advertisements. In these markets, monetary compensation only occurs when there is a user conversion. Thus, a key DSP issue is the design of a data-driven model to predict user conversion. To handle this nontrivial task, we propose a novel Multi-objective Optimization (MO) approach to evolve Decision Trees (DT) using a Grammatical Evolution (GE), under two main variants: a pure GE method (MGEDT) and a GE with Lamarckian Evolution (MGEDTL). Both variants evolve variable-length DTs and perform a simultaneous optimization of the predictive performance and model complexity. To handle big data, the GE methods include a training sampling and parallelism evaluation mechanism. The algorithms were applied to a recent database with around 6 million records from a real-world DSP. Using a realistic Rolling Window (RW) validation, the two GE variants were compared with a standard DT algorithm (CART), a Random Forest and a state-of-the-art Deep Learning (DL) model. Competitive results were obtained by the GE methods, which present affordable training times and very fast predictive response times.
... Lamarckian evolution, in contrast to Darwinian, explicitly stores the locally learned improvements in the individual genomes, so that lifetime learning can directly accelerate the evolutionary process and vice versa [1]. So far, the Lamarckian approach to evolution has only seen an initial investigation [9]. While this mechanism has largely not been accepted as a correct description of biological evolution, some recent research has reported a Lamarckian type of evolution in nature [10]. ...
Conference Paper
Full-text available
Morphological evolution in a robotic system produces novel robot bodies after each reproduction event. This implies the necessity for lifetime learning, so that newborn robots can acquire a controller that fits their body. Thus, we obtain a system where evolution and learning are combined. This combination can be Darwinian or Lamarckian, and in this paper we compare the two. In particular, we investigate the morphologies evolved under these regimes for modular robots evolved for good locomotion. Using eight quantifiable morphological descriptors to characterize the physical properties of the robots, we compare the regions of attraction in the resulting 8-dimensional space. The results show prominent differences in symmetry, size, proportion, and coverage.
... Several authors (Sutton, 1986; Whitley et al., 1990) have reported that gradient descent backpropagation has drawbacks due to the possibility of getting trapped in a local minimum of the error function. Some researchers (Yao, 1999; Cortez et al., 2002) have also proposed using an evolutionary approach (GA) instead of backpropagation for ANN training. In this study, no such limitation was found; based on our experience, it seems that gradient descent has a good capability in data modeling. ...
Conference Paper
The major aim of this study was to model the effect of two causal factors, i.e. coating weight gain and the amount of pectin–chitosan in the coating solution, on the in vitro release profile of theophylline for bimodal drug delivery. An artificial neural network (ANN), as a multilayer perceptron feedforward network, was used to develop a predictive model of the formulations. Five training algorithms belonging to three classes, gradient descent, quasi-Newton (Levenberg–Marquardt, LM) and genetic algorithm (GA), were used to train an ANN containing a single hidden layer of four nodes. The next objective of the study was to compare the performance of the aforementioned algorithms with regard to predictive ability. The ANNs were trained with those algorithms using the available experimental data as the training set. The divergence of the RMSE between the output and target values of the test set was monitored and used as a criterion to stop training. Two versions of gradient descent backpropagation, i.e. incremental backpropagation (IBP) and batch backpropagation (BBP), outperformed the others. No significant differences were found between the predictive abilities of IBP and BBP, although the convergence speed of BBP is three- to four-fold higher than that of IBP. Although both gradient descent backpropagation and LM gave comparable results for the data modeling, training of ANNs with the genetic algorithm was erratic. The precision of predictive ability was measured for each training algorithm, and their performances were in the order: IBP, BBP > LM > QP (quick propagation) > GA. According to the BBP–ANN implementation, an increase in coating levels and a decrease in the amount of pectin–chitosan generally retarded the drug release. Moreover, the latter causal factor, namely the amount of pectin–chitosan, played a slightly more dominant role in determining the dissolution profiles.
... Another approach to hybrid training is to use a local search algorithm to guide the evolution process. There are two main approaches that follow this concept: Baldwinian [23] and Lamarckian [24] evolution. In the former, at every iteration of the evolutionary algorithm, a local search algorithm is executed for each member in the population of candidate solutions. ...
Conference Paper
The evolution in the use of digital identities has brought several advancements. However, it has also contributed to the rise of identity theft. One alternative to curb identity theft is the identification of anomalous user behavior on the computer, known as behavioral intrusion detection. Among the features that can be extracted from user behavior, this paper focuses on keystroke dynamics, which analyzes the user's typing rhythm. This work uses a neural network to recognize users by keystroke dynamics and draws a comparison among several training algorithms: plain backpropagation, three approaches based on genetic algorithms and three approaches based on immune algorithms.
... Ackley & Littman (1994) reported similar positive results for Lamarckian evolution in a large spatially distributed population. Studies in recurrent ANNs (Ku & Mak 1997), application to the TSP (Ross 1999), the 4-Cycle problem (Julstrom 1999) and a set of regression and optimisation problems (Cortez et al. 2002) confirm that Lamarckism accelerates evolution, and gives quicker results. ...
Article
Abstract In this thesis, I report on a study of the interactions between evolution and learning. The methodology used is to build an abstract computational model of these processes and study them from an Artificial Life perspective. Restrictions imposed by biology as we know it are relaxed, and experiments are carried out in the spirit of exploring life as it could be. Also included is a study of the inheritance of acquired characteristics, or Lamarckism. The Baldwin effect was revisited. Experiments were carried out to explore the factors that affect the evolution of the rate of learning. I show how the rate of change of the environment, the costs associated with learning, the different selection pressures and the rate of mutation affect the evolution of learning. Interesting features such as cycling in the rate of learning were observed. Lamarckism was explored in a similar vein. Its advantage in stationary and slowly changing environments was shown; at the same time it was found that if there were any costs associated with Lamarckism, the degree of Lamarckism always evolved to zero. This leads to an interesting explanation of why Lamarckism will never be favoured by evolution by natural selection. I round off with a few experiments that explore the dynamics of mixed populations of individuals in competition with each other, to determine the best evolutionary strategy for a given combination of costs and environmental dynamics. Finally I tie together all these threads and present, through the outcomes, a unified picture of the interactions between evolution and learning.
... In addition, the proposed approach opens room for the development of automatic tools for clinical decision support (Table 9). In future research we intend to improve the ORMs' performance by exploring different ANN topologies (e.g., Radial Basis Functions). Another interesting direction is the use of training algorithms that can optimize other learning functions (e.g., Evolutionary Algorithms [11] or Particle Swarms [12]), since gradient-based methods (such as RPROP [6]) work by minimizing the Sum Squared Error, a target which does not necessarily correspond to maximizing the sensitivity and specificity rates. Finally, we intend to extend the ANN experiments to other ICU applications (e.g., predicting life expectancy). ...
Chapter
Most Computer Vision applications used for diagnosis in Medical Imaging involve real-time analysis and description of object behaviour from image sequences. To fulfil this goal, it is shown how the formalism of Extended Logic Programming (ELP) for the evaluation and manipulation of conceptual descriptions from image sequences can be translated into ELP programs, thereby reducing the problem of recognizing such conceptual descriptions to proving ELP goals. Indeed, as more health-care providers invest in computerised medical records, more clinical data is made accessible and clinical insights become more reliable. As information technologies advance, a plethora of data becomes available in almost all domains of our lives; of particular interest to this work is that of medical practice. Intelligent Diagnosis Systems with built-in functions for Knowledge Discovery and Data Mining, concerned with extracting and abstracting useful rules from such huge repositories of data, are becoming increasingly important for providing better service or care. In particular, embedding Machine Learning technology into Intelligent Diagnosis Systems seems well suited for medical diagnostics in specialized domains, such as medical imaging.
Article
190 articles about neural network learning algorithms published in 1993 and 1994 are examined for the amount of experimental evaluation they contain. 29% of them employ not even a single realistic or real learning problem. Only 8% of the articles present results for more than one problem using real world data. Furthermore, one third of all articles do not present any quantitative comparison with a previously known algorithm. These results suggest that we should strive for better assessment practices in neural network learning algorithm research. For the long-term benefit of the field, the publication standards should be raised in this respect and easily accessible collections of benchmark problems should be built.
Keywords: algorithm evaluation, science, experiment
1 Introduction
A large body of research in artificial neural networks is concerned with finding good learning algorithms to solve practical application problems. Such work tries to improve for instance the quality of solu...
Book
Ripley brings together two crucial ideas in pattern recognition: statistical methods and machine learning via neural networks. He brings unifying principles to the fore, and reviews the state of the subject. Ripley also includes many examples to illustrate real problems in pattern recognition and how to overcome them.
Article
Baldwinian evolution and Lamarckism hold that behaviour is not evolved solely through genetics, but also through learning. In Baldwinian evolution, learned behaviour causes changes only to the fitness landscape, whilst in Lamarckism, learned behaviour also causes changes to the parents' genotypes. Although the biological plausibility of these positions remains arguable, they provide a potentially useful framework for the construction of artificial systems. As an example, we describe the use of Baldwinian and Lamarckian evolution in the design of the hidden layer of an RBF network.
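The Baldwinian/Lamarckian distinction described above can be pinned down in a few lines: both regimes run lifetime learning during fitness evaluation, but only the Lamarckian one writes the learned result back into the genotype. The helper names and the scalar genotype below are hypothetical, chosen only to make the contrast explicit.

```python
# Hypothetical lifetime learning: nudge a scalar genotype halfway toward 0,
# the optimum of the toy fitness function below.
def local_search(genotype):
    return genotype * 0.5  # improved phenotype after learning

def fitness(x):
    return -abs(x)  # closer to 0 is fitter

def evaluate_baldwinian(genotype):
    learned = local_search(genotype)
    # Learning reshapes the fitness landscape only; the genotype is unchanged.
    return fitness(learned), genotype

def evaluate_lamarckian(genotype):
    learned = local_search(genotype)
    # The acquired improvement is inherited: write it back into the genome.
    return fitness(learned), learned

print(evaluate_baldwinian(0.8))   # (-0.4, 0.8): same fitness, genome untouched
print(evaluate_lamarckian(0.8))   # (-0.4, 0.4): same fitness, genome rewritten
```

Both evaluations report the same (learned) fitness, so selection pressure is identical; the regimes differ only in what the next generation inherits.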
Article
A neural network learning procedure has been applied to the classification of sonar returns from two undersea targets, a metal cylinder and a similarly shaped rock. Networks with an intermediate layer of hidden processing units achieved a classification accuracy as high as 100% on a training set of 104 returns. These networks correctly classified up to 90.4% of 104 test returns not contained in the training set. This performance was better than that of a nearest neighbor classifier, which was 82.7%, and was close to that of an optimal Bayes classifier. Specific signal features extracted by hidden units in a trained network were identified and related to coding schemes in the pattern of connection strengths between the input and the hidden units. Network performance and classification strategy was comparable to that of trained human listeners.