In this paper the optimization of additively decomposed discrete functions is investigated. For these functions genetic algorithms have exhibited poor performance. First, the schema theory of genetic algorithms is reformulated in terms of probability theory: a schema defines the structure of a marginal distribution. Then the conceptual algorithm BEDA is introduced. BEDA uses a Boltzmann distribution to generate search points. From BEDA a new algorithm, FDA, is derived. FDA uses a factorization of the distribution, and the factorization captures the structure of the given function. The factorization problem is closely connected to the theory of conditional independence graphs. For the test functions considered, the number of generations until convergence of FDA is similar to that of a genetic algorithm for the OneMax function. This result is theoretically explained.
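For orientation, the two distributions named in the abstract can be written out explicitly. The notation below (inverse temperature β, blocks b_i with overlaps c_i taken from the additive decomposition f(x) = Σ_i f_i(x_{s_i})) is a reconstruction from this summary, not a quotation of the paper:

    p_\beta(x) = \frac{e^{\beta f(x)}}{\sum_{y} e^{\beta f(y)}}        % Boltzmann distribution sampled by BEDA

    p(x) = \prod_{i=1}^{m} p\!\left(x_{b_i} \mid x_{c_i}\right)        % factorization used by FDA

Raising β over time concentrates the Boltzmann distribution on the maximizers of f; FDA approximates this distribution by a product of conditional marginals whose blocks mirror the structure of f.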
[Figure: Marginal probabilities p(t) for OneMax, plotted against generation (0 to 20); curves labelled I=0.8, I=1.6, v=1.2, and v=1.5.]
... The Univariate Marginal Distribution Algorithm (UMDA) is an algorithm that uses univariate marginal distributions, without operations such as mutation and recombination [17]. In general, it is an algorithm that estimates a probability distribution and uses part of this information to generate new points or candidate solutions [18]. It builds a probability model from statistical information calculated over a selected population and samples the model to build the next generation [13]. ...
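A minimal UMDA sketch in Python, assuming a binary search space and a OneMax-style objective; the population size, selection rate, and probability clamping below are illustrative choices, not taken from the cited works:

    import numpy as np

    def umda(f, n_vars, pop_size=100, sel_rate=0.5, generations=50, rng=None):
        # Univariate Marginal Distribution Algorithm (sketch): estimate
        # independent Bernoulli marginals from the selected individuals
        # and sample the next population from them.
        rng = rng or np.random.default_rng()
        p = np.full(n_vars, 0.5)                              # initial marginals
        for _ in range(generations):
            pop = (rng.random((pop_size, n_vars)) < p).astype(int)
            fitness = np.apply_along_axis(f, 1, pop)
            sel = pop[np.argsort(fitness)[::-1][:int(sel_rate * pop_size)]]
            p = sel.mean(axis=0)                              # re-estimate marginals
            p = np.clip(p, 1.0 / n_vars, 1.0 - 1.0 / n_vars)  # avoid fixation
        return p

    # Example: maximize OneMax (the number of ones in the string).
    final_marginals = umda(lambda x: int(x.sum()), n_vars=20)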
... Require: ϕ0 adaptation-to-environment coefficient, ϕ1 neighborhood memory coefficient, ϕ2 memory coefficient, n swarm size, Vmax maximum particle velocity, Random a random number in (0, 1)
1: Initialize the position and velocity of each particle i on j dimensions at random
2: while t < max function calls do
3:   for each particle do
4:     if f(actual particle) < f(P_ij) then
5:       Set the best position to the actual particle: P_ij ← x_i
6:     end if
7:     if f(actual particle) < f(P_gj) then
8:       Set the best global position: P_gj ← x_i
9:     end if
10:  end for
11:  for each particle i do
12:    for each dimension j do
13:      Compute the new velocity v_ij of the particle according to Eq. (1)
14:      if v_ij > Vmax then
15:        v_ij = Vmax
16:      end if
17:      if v_ij <= Vmax then
18:        if Random <= s(v_id) then
22:          x_ij = 1
23:        else
24:          x_ij = 0
25:        end if
26:    end for
27:  end for
28: end while
29: return P_gj, the best solution found

For each individual X_i,G a mutant vector V_i,G is created, as shown in Eq. (4) for the binary version of DE [4]: ...
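The bit-sampling step quoted above (velocity clamping followed by a sigmoid transfer s(v)) can be sketched in Python as follows; the inertia weight w and the coefficient values are illustrative assumptions, not parameters from the cited work:

    import numpy as np

    def sigmoid(v):
        return 1.0 / (1.0 + np.exp(-v))

    def binary_pso_step(x, v, p_best, g_best, w=0.7, phi1=1.5, phi2=1.5,
                        v_max=4.0, rng=None):
        # One velocity/position update of binary PSO: velocities stay
        # real-valued, and each bit is resampled with probability sigmoid(v).
        rng = rng or np.random.default_rng()
        r1, r2 = rng.random(x.shape), rng.random(x.shape)
        v = w * v + phi1 * r1 * (p_best - x) + phi2 * r2 * (g_best - x)
        v = np.clip(v, -v_max, v_max)                      # velocity clamping
        x = (rng.random(x.shape) < sigmoid(v)).astype(int)
        return x, v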
... Require: ϕ0 adaptation-to-environment coefficient, ϕ1 neighborhood memory coefficient, ϕ2 memory coefficient, n swarm size, sb scout bees, r search radius, Vmax maximum bee velocity, Random a random number in (0, 1).
1: Initialize the position and velocity of each bee i on j dimensions at random
2: while t < max function calls do
3:   for each bee do
4:     if f(actual bee) < f(P_ij) then
5:       Set the best position to the actual bee: P_ij ← x_i
6:     end if
7:     if f(actual bee) < f(P_gj) then
8:       Set the best global position: P_gj ← x_i
9:     end if
10:  end for
11:  for each bee i do
12:    for each dimension j do
13:      Compute the new velocity v_ij of the bee according to Eq. (1)
14:      if v_ij > Vmax then
15:        v_ij = Vmax
16:      end if
17:      if v_ij <= Vmax then
18:        if Random <= s(v_id) then
22:          x_ij = 1
23:        else
24:          x_ij = 0
25:        end if
26:    end for
27:    Choose the best sb bees
28:    for each sb bee do
29:      Apply the search radius r
30:      if f(sb bee) < f(rbee) from the search radius then
31:        Replace the sb bee by the rbee from the search radius: sb ← rbee_ij
32:      end if
33:    end for
34:  end for
35: end while
36: return P_gj, the best solution found ...
Intelligent buildings are at the forefront due to their main objective of providing comfort to users and saving energy through intelligent control systems. Reported intelligent systems offer comfort to a single user, or average the comfort of multiple users without considering that one user's needs may differ from another's. This work defines a versatile model for a multi-user intelligent system that negotiates with the resources of the environment to offer visual comfort to multiple users with different profiles, activities, and priorities using soft-computing algorithms. In addition, the model uses external lighting to provide the recommended amount of illumination for each user without depending entirely on artificial lighting, which suggests an energy saving, although this is not measured.
... To study the target problems, we apply hybrid genetic algorithms (GAs) and variants of estimation of distribution algorithms (EDAs) based on factorizations [3,18,26]. This strategy allows us to investigate EAs that are blind to the problem structure and other variants that exploit the information about this structure for a more efficient search. ...
... 4. Factorized distribution algorithm (FDA): an EDA that uses the same structure as BWCX-GA. Marginal probabilities for all configurations of the blocks are learned from the selected solutions, and new solutions are generated by sampling from a junction tree [18] constructed from this factorization. ...
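The learning-and-sampling step described in this snippet can be sketched as follows, under the simplifying assumption of non-overlapping blocks, so that no junction-tree propagation is needed; the block definitions and the 0/1 solution encoding are hypothetical:

    import numpy as np
    from collections import Counter

    def learn_block_marginals(selected, blocks):
        # Estimate a probability table over the configurations of each block
        # from the selected solutions (rows of a 0/1 matrix).
        tables = []
        for block in blocks:
            counts = Counter(tuple(row[block]) for row in selected)
            total = sum(counts.values())
            tables.append({cfg: c / total for cfg, c in counts.items()})
        return tables

    def sample_factorization(tables, blocks, n_vars, n_samples, rng=None):
        # Draw new solutions as a product of independent block marginals.
        rng = rng or np.random.default_rng()
        out = np.zeros((n_samples, n_vars), dtype=int)
        for table, block in zip(tables, blocks):
            cfgs = list(table)
            probs = np.array([table[c] for c in cfgs])
            idx = rng.choice(len(cfgs), size=n_samples, p=probs)
            for row, k in enumerate(idx):
                out[row, block] = cfgs[k]
        return out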
Chimera graphs define the topology of one of the first commercially available quantum computers. A variety of optimization problems have been mapped to this topology to evaluate the behavior of quantum-enhanced optimization heuristics in relation to other optimizers; instances that can be solved efficiently by classical methods can then serve as benchmarks for the quantum machines. In this paper we investigate for the first time the use of Evolutionary Algorithms (EAs) on Ising spin glass instances defined on the Chimera topology. Three genetic algorithms (GAs) and three estimation of distribution algorithms (EDAs) are evaluated over 1000 hard instances of the Ising spin glass constructed from Sidon sets. We focus on determining whether information about the topology of the graph can be used to improve the results of EAs, and on identifying the characteristics of the Ising instances that influence the success rate of GAs and EDAs.
... Zhang et al. [98] proposed an evolutionary algorithm with guided mutation (EA/G) for the maximal clique problem (MCP), where the guided mutation is a new offspring-generating operator that combines the conventional mutation operator with the EDA offspring-generating scheme [66]. Evolutionary algorithms such as genetic algorithms (GA) [30], scatter search [30], and estimation of distribution algorithms (EDAs) [4,5,8,46,59,60,65,95] are less suitable because global statistical information and location information are not directly used to guide the search. To address these issues, the guided mutation operator alters the parent solution to generate offspring by combining local and global information about the parent solution, and EA/G searches different areas in different search phases; EA/G is used to find a maximal clique. ...
Detection of communities is one of the prominent characteristics of vast and complex networks such as social networks, collaborative networks, and web graphs. In the modern era, new users are continually added to these complex networks, which results in an expansion of application-generated networks. Extracting relevant information from these large networks has become one of the most prominent research areas. Community detection tries to reduce the application-generated graph into smaller communities in which nodes within a community are similar. Most recent proposals focus on detecting overlapping communities in the network with higher accuracy. An integral issue in graph theory is the enumeration of cliques in a larger graph. Since a clique is a group of completely connected nodes, it reveals explicit communities, in the sense that these nodes share the same types of information. Clique-based community detection algorithms, by exploiting the clique property of the graph, also identify implicit communities that are not directly visible in the graph. Many overlapping community detection algorithms proposed by researchers rely on cliques. The goal of this paper is to offer a comparative analysis of clique-based community detection algorithms. It provides a comprehensive survey of research works that identify cliques in a network for detecting overlapping communities. We bring together most of the state-of-the-art clique-based community detection algorithms into a single article along with their accessible benchmark data sets. The paper presents a detailed description of methods based on K-cliques, maximal cliques, and triad percolation, and addresses the challenges of these approaches. Finally, a comparative analysis of overlapping community detection methodologies is also reported.
... From this distribution, some interesting properties can be extracted for various applications (Nash (1982); Novak & Bortz (1970); Kozliak (2004); Richmond & Solomon (2001)). In fact, it has applications in almost every field of science: resource scheduling, the spatial dynamics of diffusion systems, chemical kinetics, Gibbs entropy and thermodynamic entropy, quantum science and technology, lattice-Boltzmann networks, Boltzmann workflow generators, simulated annealing and parallel annealing algorithms, neuromorphic systems, quantum machine learning, graphical models, and probabilistic modelling (Liang et al. (2014); Ernst et al. (2019); Gao et al. (2019); Takeda et al. (2017); Rabbani & Babaei (2019); Aarts & Korst (1988); Neftci et al. (2014); Biamonte et al. (2017); Mühlenbein et al. (1999); Yunpeng et al. (2006)). The Boltzmann distribution was a precursor to the Boltzmann machine. A Boltzmann machine can be viewed as a recurrent neural network (Medsker & Jain (2001)); for its binary decisions, each node has a bias. ...
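For reference, the distribution in question and the stochastic unit update of a Boltzmann machine take the standard forms below, where E denotes an energy function, T a temperature, and ΔE_i the energy gap for unit i:

    P(x) = \frac{e^{-E(x)/T}}{\sum_{x'} e^{-E(x')/T}}        % Boltzmann distribution over states

    p(s_i = 1) = \frac{1}{1 + e^{-\Delta E_i / T}}           % unit update in a Boltzmann machine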
Perfect learning exists. By this we mean a learning model that generalizes and, moreover, can always fit the test data perfectly, as well as the training data. In this thesis we have performed many experiments that validate this concept in several ways. The tools are provided throughout the chapters that contain our developments. The classical multilayer feedforward model has been reconsidered and a novel architecture is proposed to fit any multivariate regression task. This model can easily be extended to thousands of layers without loss of predictive power, and has the potential to overcome the simultaneous difficulties of building a model that fits the test data well and does not overfit. Its hyperparameters, the learning rate, the batch size, the number of training epochs, the size of each layer, and the number of hidden layers, can all be chosen experimentally with cross-validation methods. There is a great advantage in building a more powerful model using the properties of mixture models: they can self-classify high-dimensional data into a small number of mixture components. This is also the case for the Shallow Gibbs Network model, which we built as a Random Gibbs Network Forest to reach the performance of the multilayer feedforward neural network with few parameters and fewer backpropagation iterations. To make this happen, we propose a novel optimization framework for our Bayesian shallow network, called the Double Backpropagation Scheme (DBS), which can also fit the data perfectly with an appropriate learning rate, and which is convergent and universally applicable to any Bayesian neural network problem. The contribution of this model is broad. First, it integrates all the advantages of the Potts model, a very rich random partition model, which we have also modified to propose its Complete Shrinkage version using agglomerative clustering techniques. The model also takes advantage of Gibbs fields for the structure of its weight precision matrix, mainly through Markov random fields, and ultimately has five variant structures: the Full-Gibbs, the Sparse-Gibbs, the Between-layer Sparse Gibbs (B-Sparse Gibbs for short), the Compound Symmetry Gibbs (CS-Gibbs for short), and the Sparse Compound Symmetry Gibbs (Sparse-CS-Gibbs) model. The Full-Gibbs structure mainly mirrors fully connected models, and the other structures show how the model can be reduced in complexity through sparsity and parsimony. All of these models have been tested experimentally, and the results arouse interest in these structures, in the sense that different structures lead to different results in terms of mean squared error (MSE) and relative root mean squared error (RRMSE). For the Shallow Gibbs Network model, we have found the perfect learning framework: it is the configuration that combines the Universal Approximation Theorem and DBS optimization, coupled with the (dist)-Nearest Neighbor-(h)-Taylor Series-Perfect Multivariate Interpolation (dist-NN-(h)-TS-PMI) model, which in turn combines the search for nearest neighbors for a good train-test association, the Taylor approximation theorem, and the multivariate interpolation method.
It indicates that, with an appropriate number of neurons in the hidden layer, an optimal number of DBS updates, an optimal DBS learning rate, an optimal distance dist in the search for the nearest neighbor in the training dataset for each test point x_i^test, and an optimal order of the Taylor approximation for the Perfect Multivariate Interpolation (dist-NN-(h)-TS-PMI) model once the DBS has overfitted the training dataset, the training and test errors converge to zero. As the Potts models and many random partition models are based on a similarity measure, we open the door to finding sufficient invariant descriptors for any recognition problem involving complex objects such as images, using metric learning and invariance-descriptor tools, in order to always reach 100% accuracy. This is also possible with invariant networks, which are also universal approximators. Our work closes the gap between theory and practice in artificial intelligence, in the sense that it confirms that it is possible to learn with a very small allowed error.
Project planning can be treated as an optimization problem focused on organizing a set of tasks while respecting precedence constraints and the limited availability of renewable and non-renewable resources. The resulting schedule must be executable in the least possible time and cost and with a balanced solution quality. In this research, a set of new estimation of distribution algorithms is presented to solve this problem. Furthermore, the behavior of different evolutionary algorithms in the construction of schedules is compared. The experimental results show the viability of evolutionary algorithms for the rapid construction of schedules and their potential use in BIM environments. In the experiments, a set of 150 instances collected in 15 databases from the PSPLib library was used. The sensitivity of the algorithms' behavior is evaluated in the following scenarios: variation in the number of execution modes, in the number of tasks, in the number of renewable resources, and in the number of non-renewable resources, demonstrating the feasibility of the solutions.
Project Scheduling Problems (PSP) constitute a family of problems with different variants, ranging from simple task planning that ignores the resources consumed, to more sophisticated variants that consider several processing modes for project tasks, generalized precedence relationships, multiple simultaneous projects, and projects with variable resources. Various algorithms, both exact and heuristic, have been used to find optimal or quasi-optimal project schedules. This research proposes a new Constraints Learning Univariate Estimation of Distribution Algorithm (CL_UMDA) as an extension of the Univariate Marginal Distribution Algorithm (UMDA). The new algorithm incorporates constraint handling inside the probabilistic model for the solution of the multimode variant of the PSP (MMRCPSP). For this purpose, a group of experiments was conducted on five databases of the PSPLib library, comparing the proposed algorithm with others reported in the literature. The experimental results show the superior performance of CL_UMDA over the other algorithms used in the experiments.
In this paper, a method combining three techniques is proposed to reduce the number of features used to train and predict over a handwritten digits data set. The proposal uses typical testors and searches through an evolutionary strategy to find a reduced set of features that preserves the essential information of all the classes that compose the data set. Once found, this reduced subset is strengthened for classification. To achieve this, the prediction accuracy of a neural network plays the role of the fitness function: when a subset reaches a threshold prediction accuracy, it is returned as the solution of this step. The evolutionary strategy makes this intensive search over features viable in terms of computational complexity and time. The discriminator construction algorithm is proposed as a strategy to obtain a smaller feature subset that preserves the accuracy of the overall data set. The proposed method is tested on the public MNIST data set. The best result found a subset of 171 of the 784 features, which represents only 21.81% of the total number of characteristics. The average accuracy was 97.83% on the test set. The results are also contrasted with the error rates of other reported classifiers, such as PCA, over the same data set.
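A compact sketch of an evolutionary search over binary feature masks of the kind described above; fitness(mask) stands in for the neural-network validation accuracy and is a hypothetical callback, as are the mutation rate and the accuracy threshold:

    import numpy as np

    def evolve_feature_mask(fitness, n_features, lam=20, p_flip=0.02,
                            accuracy_threshold=0.97, generations=200, rng=None):
        # (1+lambda) evolutionary strategy over binary feature masks:
        # flip a few bits of the best mask, keep a child if it scores at
        # least as well, and stop once the threshold accuracy is reached.
        rng = rng or np.random.default_rng()
        parent = rng.integers(0, 2, n_features)
        best = fitness(parent)
        for _ in range(generations):
            for _ in range(lam):
                child = parent ^ (rng.random(n_features) < p_flip)
                score = fitness(child)
                if score >= best:
                    parent, best = child, score
            if best >= accuracy_threshold:
                break
        return parent, best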
Building on the study of estimation of distribution algorithms (EDAs) based on polytrees, we propose and evaluate the class of EDAs that use independence tests to learn the probabilistic model, known as constraint-based estimation of distribution algorithms (CBEDA). As a result, a new CBEDA TPDA algorithm is proposed, which uses the three-phase dependency detection method for learning Bayesian networks. The experimental results show that the new proposal has adequate numerical qualities for solving optimization problems with integer representation, such as deceptive functions and the protein structure prediction (PSP) problem. The results are compared with other state-of-the-art algorithms in evolutionary computation, including proposals from the EDA field.
The Breeder Genetic Algorithm (BGA) is based on the equation for the response to selection. In order to use this equation for prediction, the variance of the fitness of the population has to be estimated. For the usual sexual recombination this computation can be difficult. In this paper we briefly state the problem and investigate several modifications of sexual recombination. The first method is gene pool recombination, which leads to marginal distribution algorithms. In the last part of the paper we discuss more sophisticated methods based on estimating the distribution of promising points.
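The equation referred to here has the standard quantitative-genetics form below (R the response, S the selection differential, b_t the realized heritability, I the selection intensity, σ_p the fitness standard deviation); this is a reconstruction of the textbook form rather than a quotation of the paper:

    R(t) = b_t \, S(t), \qquad S(t) = I \, \sigma_p(t)
    \quad\Rightarrow\quad R(t) = I \, b_t \, \sigma_p(t)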
Many combinatorial optimization algorithms have no mechanism for capturing inter-parameter dependencies. However, modeling such dependencies may allow an algorithm to concentrate its sampling more effectively on regions of the search space which have appeared promising in the past. We present an algorithm which incrementally learns pairwise probability distributions from good solutions seen so far, uses these statistics to generate optimal (in terms of maximum likelihood) dependency trees to model these distributions, and then stochastically generates new candidate solutions from these trees. We test this algorithm on a variety of optimization problems. Our results indicate superior performance over other tested algorithms that either (1) do not explicitly use these dependencies, or (2) use these dependencies to generate a more restricted class of dependency graphs.
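A sketch of the tree-construction step described in this abstract, using empirical pairwise mutual information and a greedy maximum-weight spanning tree; the binary encoding is an assumption, and the incremental statistics update of the original algorithm is omitted:

    import numpy as np

    def mutual_information(samples, i, j):
        # Empirical mutual information between binary variables i and j.
        mi = 0.0
        for a in (0, 1):
            for b in (0, 1):
                p_ab = np.mean((samples[:, i] == a) & (samples[:, j] == b))
                if p_ab > 0:
                    p_a = np.mean(samples[:, i] == a)
                    p_b = np.mean(samples[:, j] == b)
                    mi += p_ab * np.log(p_ab / (p_a * p_b))
        return mi

    def dependency_tree(samples):
        # Maximum-weight spanning tree on pairwise mutual information
        # (Prim's algorithm); returns (child, parent) edges rooted at 0.
        n = samples.shape[1]
        in_tree, edges = {0}, []
        while len(in_tree) < n:
            child, parent = max(
                ((i, j) for i in range(n) if i not in in_tree for j in in_tree),
                key=lambda e: mutual_information(samples, e[0], e[1]))
            edges.append((child, parent))
            in_tree.add(child)
        return edges

New candidate solutions are then drawn ancestrally: the root from its empirical marginal, and every other variable from its conditional distribution given the sampled value of its parent.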
A variety of problems in machine learning and digital communication deal with complex but structured natural or artificial systems. In this book, Brendan Frey uses graphical models as an overarching framework to describe and solve problems of pattern classification, unsupervised learning, data compression, and channel coding. Using probabilistic structures such as Bayesian belief networks and Markov random fields, he is able to describe the relationships between random variables in these systems and to apply graph-based inference techniques to develop new algorithms. Among the algorithms described are the wake-sleep algorithm for unsupervised learning, the iterative turbodecoding algorithm (currently the best error-correcting decoding algorithm), the bits-back coding method, the Markov chain Monte Carlo technique, and variational inference.
In many optimization problems, the structure of solutions reflects complex relationships between the different input parameters. For example, experience may tell us that certain parameters are closely related and should not be explored independently. Similarly, experience may establish that a subset of parameters must take on particular values. Any search of the cost landscape should take advantage of these relationships. We present MIMIC, a framework in which we analyze the global structure of the optimization landscape. A novel and efficient algorithm for the estimation of this structure is derived. We use knowledge of this structure to guide a randomized search through the solution space and, in turn, to refine our estimate of the structure. Our technique obtains significant speed gains over other randomized optimization procedures. Advances in Neural Information Processing Systems, 1997. MIT Press, Cambridge, MA.
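A sketch of the structure-estimation step MIMIC uses: order the variables into a chain by greedily minimizing entropies estimated from the good samples; the binary encoding and the greedy tie-breaking are assumptions:

    import numpy as np

    def entropy(samples, i):
        # Marginal entropy of binary variable i, estimated from samples.
        p = samples[:, i].mean()
        return 0.0 if p in (0.0, 1.0) else -(p * np.log(p) + (1 - p) * np.log(1 - p))

    def cond_entropy(samples, child, parent):
        # Conditional entropy H(child | parent) for binary variables.
        h = 0.0
        for b in (0, 1):
            mask = samples[:, parent] == b
            if not mask.any():
                continue
            p_b = mask.mean()
            q = samples[mask, child].mean()
            for p in (q, 1.0 - q):
                if p > 0:
                    h -= p_b * p * np.log(p)
        return h

    def mimic_chain(samples):
        # Greedy chain: start with the lowest-entropy variable, then keep
        # appending the variable with the smallest conditional entropy
        # given the previously chosen one.
        n = samples.shape[1]
        order = [min(range(n), key=lambda i: entropy(samples, i))]
        remaining = set(range(n)) - set(order)
        while remaining:
            nxt = min(remaining, key=lambda i: cond_entropy(samples, i, order[-1]))
            order.append(nxt)
            remaining.remove(nxt)
        return order

Sampling then proceeds along the chain, each variable drawn from its conditional distribution given the previously sampled one.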