Using evolutionary techniques to hunt for snakes and coils
ABSTRACT The snake-in-the-box problem is a difficult problem in mathematics and computer science that deals with finding the longest possible constrained path that can be formed by following the edges of a multidimensional hypercube. This problem was first described by Kautz in the late 1950s (Kautz, 1958). Snake-in-the-box codes, or 'snakes,' are the node or transition sequences of constrained open paths through an n-dimensional hypercube. Coil-in-the-box codes, or 'coils,' are the node or transition sequences of constrained closed paths, or cycles, through an n-dimensional hypercube. Snakes and coils have many applications in electrical engineering, coding theory, and computer network topologies. Generally, the longer the snake or coil for a given dimension, the more useful it is in these applications (Klee, 1970). By applying a relatively recent evolutionary search algorithm known as a population-based stochastic hill-climber, new lower bounds were achieved for (1) the longest-known snake in each of the dimensions nine through twelve and (2) the longest-known coil in each of the dimensions nine through eleven.
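The abstract does not spell out the path constraint, so the sketch below assumes the usual one: consecutive nodes differ in exactly one bit, and any two non-consecutive nodes differ in at least two bits (that is, the path or cycle is induced in the hypercube). The function names `hamming`, `nodes_from_transitions`, `is_snake`, and `is_coil` are illustrative and not taken from the paper; nodes are encoded as integers whose bits are the hypercube coordinates. This is a validity check only, not the search algorithm the paper describes.

```python
from itertools import combinations


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two node labels differ."""
    return bin(a ^ b).count("1")


def nodes_from_transitions(transitions, start=0):
    """Expand a transition sequence (bit index flipped at each step)
    into the corresponding node sequence."""
    nodes = [start]
    for t in transitions:
        nodes.append(nodes[-1] ^ (1 << t))
    return nodes


def is_snake(nodes) -> bool:
    """Open path: neighbours in the sequence are hypercube-adjacent,
    all other pairs are at Hamming distance >= 2 (assumed constraint)."""
    for i, j in combinations(range(len(nodes)), 2):
        d = hamming(nodes[i], nodes[j])
        if j == i + 1:
            if d != 1:
                return False
        elif d < 2:
            return False
    return True


def is_coil(nodes) -> bool:
    """Closed path: same rule, but the last node is also a neighbour
    of the first. The start node is listed once, not repeated."""
    k = len(nodes)
    if k < 4:
        return False
    for i, j in combinations(range(k), 2):
        d = hamming(nodes[i], nodes[j])
        cyclic_neighbours = (j == i + 1) or (i == 0 and j == k - 1)
        if cyclic_neighbours:
            if d != 1:
                return False
        elif d < 2:
            return False
    return True


# Dimension-3 examples: a 4-edge snake and a 6-edge coil.
snake_nodes = nodes_from_transitions([0, 1, 2, 0])           # [0, 1, 3, 7, 6]
coil_nodes = nodes_from_transitions([0, 1, 2, 0, 1, 2])[:-1]  # drop repeated start
print(is_snake(snake_nodes), is_coil(coil_nodes))
```

A search procedure such as a hill-climber would typically call a check like this, or an incremental version of it, to score candidate sequences; the pairwise loop here is quadratic in path length and is meant for clarity rather than speed.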
- Source available from: Michael Stich
ABSTRACT: The evolution and adaptation of molecular populations are constrained by the diversity accessible through mutational processes. RNA is a paradigmatic example of a biopolymer where genotype (sequence) and phenotype (approximated by the secondary structure fold) are identified in a single molecule. The extreme redundancy of the genotype-phenotype map leads to large ensembles of RNA sequences that fold into the same secondary structure and can be connected through single-point mutations. These ensembles define neutral networks of phenotypes in sequence space. Here we analyze the topological properties of neutral networks formed by 12-nucleotide RNA sequences, obtained through the exhaustive folding of sequence space. The full set of 4^12 sequences fragments into 645 subnetworks that correspond to 57 different secondary structures. The topological analysis reveals that each subnetwork is far from being random: it has a degree distribution with a well-defined average and a small dispersion, a high clustering coefficient, and an average shortest path between nodes close to its minimum possible value, i.e. the Hamming distance between sequences. RNA neutral networks are assortative due to the correlation in the composition of neighboring sequences, a feature that, together with the symmetries inherent to the folding process, explains the existence of communities. Several topological relationships can be analytically derived from structural restrictions and generic properties of the folding process. The average degree of these phenotypic networks grows logarithmically with their size, such that abundant phenotypes have the additional advantage of being more robust to mutations. This property prevents fragmentation of neutral networks and thus enhances the navigability of sequence space. In summary, RNA neutral networks show unique topological properties, unknown to other networks previously described.
PLoS ONE 01/2011; 6(10):e26324. · 3.53 Impact Factor
Conference Paper: Pruning the Search Space for the Snake-in-the-Box Problem.
ABSTRACT: This paper explores methods for reducing the search space when hunting for snakes using a Genetic Algorithm (GA). The first method attempts to reinterpret individuals in an effort to utilize the high degree of symmetry inherent in the problem and, thereby, make crossover more effective. The second method centers on removing snake blockers, which are sequences that prevent snakes, and examines the effects of snake blockers on the search space and GA effectiveness. Testing of these methods is limited to dimension 8; however, the concepts are applicable to all dimensions.
Trends in Applied Intelligent Systems - 23rd International Conference on Industrial Engineering and Other Applications of Applied Intelligent Systems, IEA/AIE 2010, Cordoba, Spain, June 1-4, 2010, Proceedings, Part III; 01/2010