Discovering Relational Structure in Program
Synthesis Problems with Analogical Reasoning
Jerry Swan and Krzysztof Krawiec
Abstract Much recent progress in Genetic Programming (GP) can be ascribed to work in semantic GP, which facilitates program induction by considering program behavior on individual fitness cases. It is therefore interesting to consider whether alternative decompositions of fitness cases might also provide useful information. The one we present here is motivated by work in analogical reasoning. So-called proportional analogies ('gills are to fish as lungs are to mammals') have a hierarchical relational structure that can be captured using the formalism of Structural Information Theory. We show how proportional analogy problems can be solved with GP and, conversely, how analogical reasoning can be engaged in GP to provide for problem decomposition. The idea is to treat pairs of fitness cases as if they formed a proportional analogy problem, identify relational consistency between them, and use it to inform the search process.
Key words: Program Synthesis; Genetic Programming; Proportional Analogy; Inductive Logic Programming; Machine Learning
1 Introduction
Perhaps the strongest reason for favouring Genetic Programming (GP) over alternative machine learning approaches is the explanatory power afforded by the resulting symbolic descriptions. Whilst other approaches may be faster or more accurate, GP can provide more compelling insights into observed data than numerically-driven approaches constrained to a specific model class.
Jerry Swan
Department of Computer Science, University of York, UK
Krzysztof Krawiec
Institute of Computing Science, Poznan University of Technology, Poznań, Poland
To maximize the explanatory power of GP, it is highly desirable to obtain symbolic explanations which appear to the human reader to be not only comprehensible but also natural. In respect of comprehensibility, there has been considerable work in combating expression bloat [39]. However, there has been relatively little emphasis on building human bias into the search process. Since much human bias originates in universal observations that stem from the specific constitution of the natural world, its inclusion may actually lead to both quantitative and qualitative improvements [41]. Since GP is often used to search for regularities in real-world data, equipping it with such biases may be desirable, at the least in extracting more compelling explanations from experimental results [40].
In this chapter, we explore a mechanism for the discovery of a problem's relational structure, framed in terms of existing work on analogical reasoning. Analogy can be considered as 'a mapping between systems or processes' and has been described as 'the core of cognition' [14]. In cognitive science, it is understood to provide a flexible mechanism for re-contextualising situations in terms of prior (or hypothetical) experience, and is also considered a key mechanism for escaping dichotomies of representation [31], which is argued to be of general importance for Computational Intelligence [22].
We start with a brief overview of analogy as a computational mechanism in Section 2. In Section 3, we present the formalism of Structural Information Theory for building the relational structures needed for the proposed approach. In Section 4, we present GPCAT, a framework for solving proportional analogy problems using GP, and experimentally assess its performance in Section 5. In Section 6, we explain how similar mechanisms can be used to aid GP applied to conventional program synthesis problems. In Section 7 we discuss related work, and we summarize the study in Section 8.
2 Analogical reasoning
The use of analogy as a computational mechanism dates back to Evans’ famous
geometric reasoner [9]. More recent computational models include the Structure
Mapping Engine (SME) [10], the connectionist models ACME [15] and LISA [16],
Heuristic-Driven Theory Projection [38] and some matching techniques used in
Case-Based Reasoning [1]. A short article can only provide a brief overview of the
wide range of literature: considerably more detail is available in the recent volume
by Prade and Richard [33]. As distinct from predictive analogy, which is concerned
with inferring properties of a target object as a function of its similarity to a source
object, our interest here is in the application of proportional analogy.
The roots of analogical proportion can be traced as far back as Aristotle [3]. A proportional analogy problem, denoted A : B :: C : D, is concerned with finding D such that D is to C as B is to A. The 'microdomain' of Letter String Analogy (LSA) problems (e.g. abc : abd :: ijk : ?) can be considered exemplary and is of long-standing interest: although seemingly simple, the domain can require remarkable sophistication [11].

Fig. 1 Commutativity of proportional analogy [37]

As can be seen in Fig. 1, proportional analogy problems can
also be considered to form a commutative diagram [37]. Notable approaches to LSA
problems include Hofstadter and Mitchell’s CopyCat [14] and the Anti-Unification
based approach of Weller and Schmid [45]. When studied in the context of core AI
research and cognitive science, LSAs are often left ‘open-ended’:
abc : abd :: ijk : ?
abc : abd :: iijjkk : ?
abc : abd :: mrrjjj : ?
abc : abd :: xyz : ?
Posed in this way, LSAs are unlike traditional instances of computational problem solving — in general, an LSA problem has no singular 'Platonic' solution, so it is therefore difficult to define an objective measure of solution quality in a 'top down' fashion. Nevertheless, humans confronted with LSA problems typically converge on a few answers that occur with relatively stable frequencies. For instance, the most common answers to the above LSAs are respectively ijl, iijjll, mrrkkk and xya, which corroborates the existence of human bias.
3 Capturing relational structure
Any method that is intended to deal with proportional analogy problems requires some (formal or informal) means of capturing the relational structure of objects in the domain (here: letter strings). Ideally, such a mechanism should take into account the natural biases discussed in Section 1. One means of representing and quantifying such bias is via the use of Structural Information Theory (SIT) [24]. SIT is a formalism of relational structure which also provides a complexity metric. In contrast to the complexity metrics of Algorithmic Information Theory (e.g. Kolmogorov complexity), SIT is explicitly designed to correspond to the principles of human Gestalt perception [8], intended to explain the human propensity to prefer certain perceptual groupings. The rules of Gestalt are readily illustrated in visual perception, where they explain the inclination for grouping smaller objects into larger shapes, grouping objects by proximity, closing partially occluded curves, and others.
The original description of SIT due to Leeuwenberg [24] describes linear, one-dimensional patterns of objects in terms of repetition, alternation and symmetry, subsequently extended to a recursive algebraic description by Dastani et al. [4]. It is the latter that we use here: repetition is denoted by the iterated application of some designated function, e.g. Iter(ab, id, 2) (where id is the identity function) denotes the pattern abab and Iter(a, succ, 3) (where succ is the successor function) denotes abc. Alternation denotes a sequence into which an object is interleaved. It has 'left' and 'right' variants: for example, AltL(a, xyz) describes axayaz and AltR(a, xyz) describes xayaza. Symmetry denotes a sequence followed by its reversal, and occurs in an 'even' form (SymE(ab) = abba) and an 'odd' form (SymO(ab, c) = abcba).

A SIT term determines a unique string, but the opposite does not hold: a given sequence may clearly be representable by many different SIT descriptions. For example, abccba can be represented both by SymE(Iter(a, succ, 3)) and SymO(ab, Iter(c, id, 2)). Associated with each structural description is the notion of information load, intended to quantify human preference between alternative relational descriptions — those with lower information loads being preferable. The measure of information load we adopt here is due to Dastani et al. [4], which modifies the previous formulation of van der Helm and Leeuwenberg [43] and is defined as the sum of occurrences of individual symbols in a SIT description, not including the SIT operators themselves. Thus, while Iter(ab, id, 2) and AltL(a, bb) both represent SIT descriptions of abab, the former has an information load of 2 and the latter 3.
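As a concrete illustration, the SIT operators above can be sketched as string-producing functions. The sketch below is our own (names such as `iter_term` and `alt_l` are illustrative, not from any SIT library), assuming string operands and per-character functions:

```python
# Hypothetical sketch of the SIT operators as string producers.

def succ(c): return chr(ord(c) + 1)   # successor function
def pred(c): return chr(ord(c) - 1)   # predecessor function
def ident(c): return c                # identity function

def apply_str(f, s):
    """Apply a character function to every symbol in a string."""
    return ''.join(f(c) for c in s)

def iter_term(seed, f, n):
    """Iter: concatenate seed, f(seed), f(f(seed)), ... (n occurrences)."""
    out, cur = [], seed
    for _ in range(n):
        out.append(cur)
        cur = apply_str(f, cur)
    return ''.join(out)

def alt_l(x, s):
    """AltL(a, xyz) -> axayaz (object interleaved on the left)."""
    return ''.join(x + c for c in s)

def alt_r(x, s):
    """AltR(a, xyz) -> xayaza (object interleaved on the right)."""
    return ''.join(c + x for c in s)

def sym_e(s):
    """SymE(ab) -> abba (even symmetry)."""
    return s + s[::-1]

def sym_o(s, pivot):
    """SymO(ab, c) -> abcba (odd symmetry around a pivot)."""
    return s + pivot + s[::-1]
```

For example, `iter_term('a', succ, 3)` expresses to `abc`, and `sym_e(iter_term('a', succ, 3))` expresses to `abccba`, matching the abccba example above.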
3.1 Finding SIT terms with GP
In the recursive variant of SIT described above, the patterns appearing in a SIT relation can themselves be SIT relations. This lends itself to a direct representation of SIT relations as nodes in a tree structure, allowing the use of GP to find a SIT description for a given string [4]. As mentioned above, it is desirable to search for SIT structures of low complexity, as given by the information load measure. However, this quantity alone cannot effectively drive the search, as the relations found by GP must produce the target string in the first place. We therefore define our fitness function as:
f(t) = Lev(t, s) + 0.001 · InfLoad(t),    (1)

where t is the SIT term being evaluated, s is the string to be reproduced, Lev(t, s) is the Levenshtein distance between the string produced by t and s, and InfLoad(t) is the information load. The fitness function effectively realizes a lexicographic ordering of search objectives, prioritizing matching of the target string. Alternatively, a multiobjective evolutionary search could be engaged here.
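Eq. (1) can be sketched directly; the helper below is a standard dynamic-programming Levenshtein distance (our own illustrative code, not tied to the EpochX implementation):

```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance between strings a and b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                  # deletion
                           cur[j - 1] + 1,               # insertion
                           prev[j - 1] + (ca != cb)))    # substitution
        prev = cur
    return prev[-1]

def fitness(expressed, target, info_load):
    # Eq. (1): string mismatch dominates; information load breaks ties
    return levenshtein(expressed, target) + 0.001 * info_load
```

Because the load term is scaled by 0.001, any term that fails to reproduce the target string is worse than any term that does, regardless of its complexity.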
The instruction set of our GP setup includes all the algebraic relations presented above, i.e. Iter, AltR, AltL, SymE, SymO, and the Sequence and Group relations that respectively cater for flat and nested (hierarchical) sequences. There are also the numeric constants that Iter needs to determine the number of iterations, and the function literals succ, pred, and id. Terms, numeric constants and function literals form three types handled by strongly-typed GP mechanisms. Using EpochX GP [30], we evolve a population of 100 individual SIT relations, initialized with Koza's 'Grow' method with program height set to 3. The upper limit on expression height during evolution is 8. Evolution lasts for 100 generations. All other parameters are as per the EpochX defaults.
We applied the above GP setup to all 35 unique letter strings occurring in the
problems originally considered by Mitchell [28] (Section 2), i.e.
abc abd ace cab cba cde cmg
cmz edc glz ijk kji mrr qbc
rst rsu xcg xlg xyz aabc aabd
abcd abcm abcn ijkk rijk aababc aabbcc
aabbcd hhwwqq iijjkk lmfgop mrrjjj rssttt xpqdef
repeating each run 10 times. On average, GP finds a correct SIT term (i.e. one reproducing the target string perfectly, with Lev(t, s) = 0) in 93.4% of runs. For most strings, the success rate is 10/10, and the worst success rate is 4/10 (for lmfgop). The average information load amounts to 2.835, and the average number of nodes in a term is 7.190. GP managed to find SIT terms with minimal or close-to-minimal load for many problems, for instance:
ijkk: Group(Iter(i, Succ, 3), k)
aabbcc: Iter(Group(a, a), Succ, 3)
xpqdef: Seq(Group(x, Group(Iter(p, Succ, 2), Iter(d, Succ, 2))), f)
Arguably, optimal SITs for these small problems could be found via exhaustive
search. However, for more complex problems that we wish to handle prospectively,
resorting to heuristic search is likely to be unavoidable.
4 Solving proportional analogies with GP
The CopyCat program [14] is a cognitive model of proportional analogy. Although very carefully engineered, the specifics of the interactions between its architectural elements (described in more detail in Section 7) are somewhat complex. While they have been described at length [28, 14], this has nonetheless been done in a relatively informal fashion. It is therefore interesting to see whether comparable results can be obtained by combining more readily-demarcated methods.
We therefore propose GPCAT, a GP-based method for tackling proportional analogies, with which we intend to achieve several goals:
1. Compose well-known formalisms like SIT, Anti-Unification, and GP, rather than the mechanisms that are specific to CopyCat.
2. Verify GP's usefulness for solving proportional analogies.
3. Prospectively extend/substitute GPCAT's components with formalisms for handling other domains more common to program synthesis.
Algorithm 1 Anti-unification algorithm for two terms.
function AU(x, y)
    if x = y then
        return x
    else if x = f(x1, ..., xn) ∧ y = f(y1, ..., yn) then
        return f(AU(x1, y1), ..., AU(xn, yn))
    else
        return φ
    end if
end function
There are three main components of GPCAT:
1. A domain-specific relational formalism (in this case SIT, Section 3).
2. An Anti-Unification algorithm.
3. A GP algorithm.
Anti-Unification (AU) is a procedure that extracts the common substructure of a set of terms T. The AU of T is itself a term, with some subterms replaced with variables. The defining property of such a term u (the Anti-Unifier) is that for each t ∈ T there exists a substitution σ (i.e. a mapping from variables to terms) such that, when applied to u, it makes it equal to t, i.e., uσ = t. In fact, u has the important property of being the most specific such term — informally, it preserves as much of the common structure as possible.
Algorithms for n-ary Anti-Unification were invented more-or-less simultaneously by Reynolds [35] and Plotkin [32]. For our purposes, anti-unification of two terms (as per Algorithm 1) will suffice. The value φ denotes a so-called 'fresh variable', which maps to x under some substitution σx and to y under σy. The expressiveness of AU is dependent on how equality between terms is defined: in the case of the syntactic AU that we consider here, function symbols are simply unique labels, with no intrinsic meaning.
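A minimal sketch of the syntactic AU of Algorithm 1 follows (our own illustration, not the implementation used in the experiments). Terms are tuples whose first element is the function symbol; for simplicity, a fresh variable is introduced at every mismatch rather than being shared between repeated mismatch pairs:

```python
def anti_unify(x, y):
    """Return (anti-unifier, subs_x, subs_y) for two terms x and y.

    Terms are tuples (symbol, arg1, ..., argn); leaves are strings.
    Variables are strings '$1', '$2', ... (the 'fresh variables' phi).
    """
    subs_x, subs_y, counter = {}, {}, [0]

    def au(x, y):
        if x == y:
            return x
        if (isinstance(x, tuple) and isinstance(y, tuple)
                and x[0] == y[0] and len(x) == len(y)):
            # same function symbol and arity: recurse over arguments
            return (x[0],) + tuple(au(a, b) for a, b in zip(x[1:], y[1:]))
        counter[0] += 1
        var = f'${counter[0]}'   # fresh variable
        subs_x[var], subs_y[var] = x, y
        return var

    return au(x, y), subs_x, subs_y
```

Applied to SIT terms for the abcg/ccbbaah example that follows, this yields Seq(Iter($1, $2, 3), $3) together with the two substitutions given in the text.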
Anti-Unification has been used in the solution of proportional analogy problems by Weller and Schmid [45]. Their algorithm is as follows [44]:
1. Use AU to compute the common structure of the terms A and C (Fig. 1), with associated substitutions σA, σC.
2. Determine D as σC(σA⁻¹(B)).
For illustration, consider letter strings A = abcg and C = ccbbaah. Their natural representations in terms of SITs are respectively the terms Seq(Iter(a, succ, 3), g) and Seq(Iter(Group(c, c), pred, 3), h). The above algorithm returns the following AU of these terms: Seq(Iter($1, $2, 3), $3), with substitutions σA = {$1 ↦ a, $2 ↦ succ, $3 ↦ g} and σC = {$1 ↦ Group(c, c), $2 ↦ pred, $3 ↦ h}.
4.1 The GPCAT algorithm
We now describe the application of GPCAT to the LSA domain. As we argue later,
it can be also generalized to handle certain types of program synthesis problem.
Given a proportional analogy problem, GPCAT generates a formal description of
detected analogies/relationships, i.e. a set of expressions with variables, which can
be then instantiated to generate the answers (i.e˙
, the possible values for D). For
some analogy of the form A:B::C:D(Fig. 1), GPCAT maintains a population
of solutions, each of which is a triple of SIT terms, (tA,tB,tC), intended to capture
the respective structures for A,B, and C. The terms are subject to the same genetic
search operators as in the single-term experiment presented in Section 3.1. The mu-
tation operator randomly picks the term to be modified from tA,tB, and tC; then, the
selected term undergoes mutation as in Section 3.1, while the remaining two terms
remain intact. Crossover operates analogously, i.e. the resulting offspring solutions
diverge from the parents in only one of the terms.
The search goal is to synthesize a triple of SIT terms that not only reproduce the strings in the LSA problem, but also together form a plausible analogical structure and ultimately yield the correct D. To this end, we attempt to capture the analogy between the horizontal and vertical mappings (h and v in Fig. 1) by performing Anti-Unification of their outcomes. As D is not given, the only explicitly known mappings are h(tA) = tB and v(tA) = tC. These mappings share the same left-hand side tA, so we perform Anti-Unification of their right-hand sides only, i.e. of tB and tC. This is also motivated by the fact that in most LSA problems, A plays the role of a mere 'anchor' for the symbols occurring in B and C; for instance, in all but three LSA problems considered in [28], A is a sequence of three consecutive characters, typically abc.
We embed these computations into a fitness function which, for a given candidate solution (tA, tB, tC), proceeds as follows:
1. Perform Anti-Unification of tB and tC to factor out their common substructure. This results in a term u with a number of variables $i, i = 1, ..., k, and two substitutions σB and σC, such that uσB = tB and uσC = tC. Technically, both σB and σC are sets of mappings from variables to subterms, e.g., σB = {$1 ↦ a, $2 ↦ Group(a, b)}. Symbols in the right-hand sides of substitutions are represented as integer offsets w.r.t. the 'lowest' character occurring in the term (the importance of this will become clear in the example that follows).
2. Generate all 2^k combinations of mappings from σB and σC, resulting in 2^k 'artificial' substitutions σj, j = 1, ..., 2^k (for the low values of k in typical LSA problems, this can be done exhaustively).
3. Apply each σj independently to u, which results in a list of 2^k SIT terms. Express (i.e. 'flatten') the terms, obtaining up to 2^k letter strings (distinct SIT terms, when expressed, may result in the same letter string). The resulting letter strings are the candidate answers, i.e. the proposed values of D, for the considered LSA problem.
4. Characterize the candidate solution (tA, tB, tC) and the formal objects obtained in the above steps using the following indicators:
L = Lev(tA, A) + Lev(tB, B) + Lev(tC, C), the total Levenshtein distance between the expressed tA, tB, and tC and respectively A, B and C — to be minimized (cf. Lev in Section 3.1).
I = InfLoad(tA) + InfLoad(tB) + InfLoad(tC), the total information load — to be minimized.
M — the total number of variables in u (equal also to the number of mappings in σB and σC) — to be maximized, as the presence of multiple mappings may signal good structural correspondence of tB and tC.
N — the number of mappings to the null value (i.e. $j ↦ ε) — to be minimized, as such mappings signal structural inconsistency between u and the SIT terms it has been obtained from.
The indicators computed in step 4 form a multiobjective characterization of the evaluated candidate solution, and can be either aggregated into a single scalar fitness or handled by a multiobjective selection procedure. In this study, we follow the former option, and define the minimized fitness as:

f((tA, tB, tC)) = L + N + 0.01(I − M)    (2)

By taking into account several indicators, we mandate evolution to optimize all aspects of the analogy models simultaneously, i.e., conformance of SIT terms with the underlying LSA problem (L), low complexity of terms (I), and good Anti-Unification (M and N). Our fitness prioritizes L and N, i.e., it puts solution correctness first. Note that the proposed fitness function does not involve D, even if it is known. The correct D is expected to appear in the letter string list obtained in step 3.
We work through these steps for abc : abd :: ijk : ? (Fig. 1) and a candidate solution

(tA, tB, tC) = (Iter(a, Succ, 3), Seq(Iter(a, Succ, 2), d), Iter(i, Succ, 3))

Note that this solution reproduces all three strings perfectly, so its L = 0. The anti-unifier of tB and tC (step 1 of GPCAT), calculated using the first-order, rigid, unranked AU algorithm [2], is Seq(Iter($1, Succ, $2), $3), with σB = {$1 ↦ a, $2 ↦ 2, $3 ↦ d} and σC = {$1 ↦ i, $2 ↦ 3, $3 ↦ ε}. Now, as signalled in step 1 of GPCAT, the symbols in the right-hand sides of substitutions are represented as offsets w.r.t. the lowest characters (here a and i, respectively), so the substitutions take the following form (note that the right-hand sides are now integer offsets): σB = {$1 ↦ 0, $2 ↦ 2, $3 ↦ 3} and σC = {$1 ↦ 0, $2 ↦ 3, $3 ↦ ε}. With k = 3 variables, there are 2³ = 8 artificial substitutions σj that can be built by combining the individual mappings from σB and σC (step 2 of GPCAT). Among them is σ3 = {$1 ↦ 0, $2 ↦ 2, $3 ↦ 3}, which for initial character i produces ijl, the most natural answer to this LSA problem.
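Steps 2 and 3 of GPCAT, for this worked example, can be sketched as follows (our own illustrative code; `express` hard-codes the flattening of u = Seq(Iter($1, Succ, $2), $3) for the offset-form substitutions, with `None` standing for the null mapping ε):

```python
from itertools import product

# offset-form substitutions obtained in step 1 of the worked example
sigma_B = {'$1': 0, '$2': 2, '$3': 3}
sigma_C = {'$1': 0, '$2': 3, '$3': None}

def express(sub, base):
    """Flatten u = Seq(Iter($1, Succ, $2), $3) starting from `base`."""
    start = chr(ord(base) + sub['$1'])          # $1: offset of the seed
    s = ''.join(chr(ord(start) + i) for i in range(sub['$2']))  # $2: length
    if sub['$3'] is not None:                   # epsilon contributes nothing
        s += chr(ord(base) + sub['$3'])         # $3: offset of the tail symbol
    return s

variables = ['$1', '$2', '$3']
# step 2: all 2^k combinations of mappings drawn from sigma_B and sigma_C
answers = {express(dict(zip(variables, choice)), base='i')
           for choice in product(*[(sigma_B[v], sigma_C[v])
                                   for v in variables])}
```

The candidate set contains ijl, the answer produced by σ3, alongside the other expressed combinations.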
5 The experiment
We applied GPCAT to 32 out of the 35 LSA problems originally considered by Mitchell [28], i.e. those problems with A being a sequence of three consecutive letters. The instruction set (SIT operators) and evolutionary parameters were set as in Section 3.1, except for a higher initial tree height (5, to promote diversity in the initial population) and lower than usual selection pressure (tournament of size 2, in order to promote exploration and lower the risk of premature convergence). This time we relied on an implementation based on the FUEL evolutionary computation library, written in Scala.
The best-of-run solutions resulting from particular runs were subject to evaluation, and the lists of answers to the problem (i.e. element 'D' in A : B :: C : D) were collected over 30 runs for each LSA problem. The following table presents the top five most frequently occurring answer strings per 30 runs of GPCAT for selected problems from the considered suite. Each string is accompanied by the percentage of times it has occurred.
Problem Most frequent answers
abc:abd::ijk ijl:100 ik:7 bcd:7 abbd:7 ac:7
abc:abd::xyz xya:100 bcd:7 abbd:7 xz:0.07 ac:7
abc:abd::kji ijl:70 cba:57 kln:17 bce:10 jl:7
abc:qbc::iijjkk aabbcc:53 ijl:43 ab:23 ij:23 ik:10
abc:abd::mrrjjj jkm:67 iiaaa:33 rrjjj:33 jrrjjj:17 diiaaa:17
By contrast, CopyCat's responses [28] are (also as percentages of runs, but this time summing to 100%, as a CopyCat run produces a single answer):
Problem Most frequent answers
abc:abd::ijk ijl:96.9 ijd:2.7 ijk:0.2 hjk:0.1 ijj:0.1
abc:abd::xyz xyd:81.1 wyz:11.4 yyz:6 dyz:0.7 xyz:0.4
abc:abd::kji kjh:56.1 kjj:23.8 lji:18.6 kjd:1.1 kki:0.3
abc:abd::iijjkk iijjll: 81.0 iijjkl: 16.5 iijjdd: 0.9 iikkll: 0.9 iijkll: 0.3
abc:abd::mrrjjj mrrkkk:70.5 mrrjjk:19.7 mrrjkk:4.8 mrrjjjj:4.2 mrrjjd:0.6
GPCAT's outcomes tend to only partially coincide with those of CopyCat: for instance, for the first problem, ijl is the most common answer in both methods, while for the second problem their outcomes do not overlap at all (one of the reasons being that GPCAT's process of variable alignment has the concept of modulo built in, whereas, by design, CopyCat's domain knowledge excludes a successor to 'z'). One possible research direction is thus tweaking and extending GPCAT in order to match the distribution of human answers (of which CopyCat, despite being concerned with plausible solutions rather than slavish reproduction of human bias, is arguably the best known computational model).
However, exact mimicking of human behaviour, though interesting from the viewpoint of cognitive science, might be of lesser importance for program synthesis. What may be more essential in the latter context is the very concept of proportional analogy, together with a generative mechanism for creating such analogies based on structural Anti-Unification. We discuss this perspective in the following section.
6 Analogies in program synthesis
Let us now illustrate why we find analogical reasoning a useful concept for test-based program synthesis. Consider the domain of list manipulation and the task of synthesizing the append function. Let the desired behaviour of that function be specified by the following set of tests:
append([1,2], []) = [1,2]
append([1,2], [3]) = [1,2,3]
append([1,2,3], []) = [1,2,3]
append([a,b], [c]) = [a,b,c]
By selecting pairs of tests from this list, we may form the following proportional analogies:

([1,2],[3]) : [1,2,3] :: ([a,b],[c]) : [a,b,c]
([1,2],[]) : [1,2] :: ([1,2,3],[]) : [1,2,3]
([1,2],[3]) : [1,2,3] :: ([1,2,3],[]) : [1,2,3]
These analogies capture three unrelated characteristics of the synthesis task. The first is type-related, and says that append takes no notice of the nature of the list elements: in a sense, it behaves 'modulo' type, whether the list elements are characters or numbers. The second analogy concerns the operational characteristics of append, and signals that if the second argument of append is an empty list, then the expected result is the first argument. The third analogy might be seen as expressing an invariant, i.e. that moving the head of the second list to the end of the first list does not change the outcome.
On the face of it, these analogies express quite trivial facts. Nevertheless, our point is that just by juxtaposing existing tests (i.e., without recourse to any source of extra knowledge), we obtain concepts that capture various qualities of the desired behaviour. We claim that (i) identification of such qualities and (ii) their separation can make program synthesis more efficient. Conventional GP has all these test cases at its disposal, yet is completely oblivious to this opportunity.
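The pairing idea can be made concrete with a small sketch (ours, not from the chapter's implementation): enumerate pairs of fitness cases as candidate analogies, and check a simple invariant such as the one expressed by the second analogy:

```python
from itertools import combinations

# the append fitness cases from above, as ((arg1, arg2), output)
tests = [
    (([1, 2], []), [1, 2]),
    (([1, 2], [3]), [1, 2, 3]),
    (([1, 2, 3], []), [1, 2, 3]),
    ((['a', 'b'], ['c']), ['a', 'b', 'c']),
]

# every pair of distinct tests is a candidate proportional analogy
# A : B :: C : D, with A and C the inputs and B and D the outputs
analogies = list(combinations(tests, 2))

def empty_second_argument(test):
    """Invariant of the second analogy: append(xs, []) == xs."""
    (xs, ys), out = test
    return ys == [] and out == xs
```

Here two of the four tests satisfy the invariant, and juxtaposing exactly those two tests yields the second analogy above.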
We believe that this can provide a basis for the induction of high-level, 'global' descriptions of a set of fitness cases from repeated encounters with local ones by the search process [19]. This then begs the wide-ranging research question of how to exploit such induced invariants for use as search drivers [21, 20, 23], i.e. additional quasi-objectives that guide the search process. Depending on the domain, it may be possible to express them as predicates in the same function set as is used to solve the problem. Alternatively, it may be desirable to add induced invariants to a competitive co-evolutionary population of constraints. In either case, our approach yields relational linkages in a functional, hierarchical manner, as opposed to the traditional models of relational linkage occasionally used in stochastic program induction, which are primarily probabilistic [46].
From a broader perspective, of particular interest here is the prospect of using the generative aspects of GP to help address a persistent problem in formal methods. As observed by Luqi and Goguen [25], "formal methods tend to be brittle or discontinuous – a small change in the domain can require a great deal of new work". Since formal approaches can be sensitive to the particular manner in which their input is presented, the ability to generate alternative representations for inputs may bring benefits not available to either approach in isolation. Conversely, it was observed by Kocsis and Swan that the formal structure of inductively-defined datatypes can usefully be exploited for GP purposes, e.g. to eliminate otherwise stochastic operations [18]. We might also hope to make use of this kind of structure for our current purposes. For example, it is well known that lists (and indeed algebraic datatypes in general) can be expressed in a relational manner, in this case via the type constructors Nil and Cons. Hence [1,2,3] can be expressed as:

Cons(1, Cons(2, Cons(3, Nil)))

Using SIT-style relations, this can be represented as Iter(Nil, succ, 3)². Hence, the second analogy can be represented by the Anti-Unifier:

App(Iter(Nil, succ, $1), Nil) = Iter(Nil, succ, $1)

with substitutions σ1 = {$1 ↦ 2}, σ2 = {$1 ↦ 3}, in congruence with the fact that appending Nil preserves structure.
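The relational view of lists can be sketched directly (our own illustration): Cons/Nil terms with append defined by structural recursion, so that appending Nil is the identity — the invariant expressed by the Anti-Unifier above.

```python
# lists as algebraic terms: Nil and Cons(head, tail)
Nil = None

def cons(h, t):
    return (h, t)

def from_list(xs):
    """Build Cons(x1, Cons(x2, ... Cons(xn, Nil))) from a Python list."""
    out = Nil
    for x in reversed(xs):
        out = cons(x, out)
    return out

def append(xs, ys):
    # structural recursion on the first argument;
    # append(Nil, ys) == ys, so appending Nil preserves structure
    if xs is Nil:
        return ys
    h, t = xs
    return cons(h, append(t, ys))
```

For instance, `from_list([1, 2, 3])` yields the term corresponding to Cons(1, Cons(2, Cons(3, Nil))), and appending the Nil term to it returns it unchanged.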
Finally, we note that while the 'mixing' properties of binary recombination have been widely examined in the EC community, the notion of an 'abstracting' binary operator has not, to our knowledge, been further explored, even though Yu notes that structure abstraction can contribute to success [47]. It would therefore be interesting to consider the generalization of two programs as an addition to the traditional palette of binary recombination operators.
² Strictly, Iter here is slightly more complex than previously, in that it expresses an inductive construction known as a catamorphism [27].
7 Related work
Foundational computational work in proportional analogy was done by van der Helm and Leeuwenberg [12], describing the problem in terms of path search in directed acyclic graphs and giving an algorithm which is O(n⁴) in the size of the input. This was subsequently extended by Dastani et al. [4] to incorporate the algebraic approach to SIT adopted in this article. Dastani also applied GP (with an uncharacteristically high mutation rate of 0.4) to the induction of SIT structures for linear line patterns [5], i.e. polylines which can be encoded as letter strings.
CopyCat [14] is perhaps the most well-known architecture for solving proportional analogies. It has a tripartite structure, consisting of a blackboard ('the workspace'), a priority queue of programs for updating blackboard state ('the coderack') and a semantic network with dynamically re-weighted link strengths ('the slipnet'). CopyCat is entirely concerned with (predominantly local) mechanisms that have cognitive plausibility.
Closest to the current work is the algorithm of Weller and Schmid [45] for solving proportional analogies, which performs anti-unification via E-generalization. The representation for E-generalization is a regular tree grammar, which means that the result is a (potentially infinite) equivalence class of terms for D. The claimed advantages of their approach are twofold:
1. There is no need to explicitly induce SIT representations for A, B, C, since all are represented simultaneously via the regular tree grammar.
2. All consistent values for D are likewise represented simultaneously.
However, this approach suffers from the severe disadvantage that no mechanism is provided for enumerating the resulting regular tree grammar in preference order (e.g. by information load). Since it is not possible to distinguish certain specific representations for D as being more compelling, it does not appear to be of practical use. In contrast, our approach induces SIT representations with low information load via GP driven by a multi-aspect fitness function, and then uses syntactic AU to determine D.
Early use of analogical mechanisms for program synthesis predominantly operated on specifications rather than concrete programs [26, 7, 42, 6]. More recently, Schmid learned programs from fitness cases via planning [36], and Raza et al. [34] used Anti-Unification to address scalability issues in synthesising DSL programs for XML transformation.
IGOR II [13] is currently considered the exemplar of program synthesis by Inductive Functional Programming (IFP) [29]. It creates a recursive program to generalize a set of fitness cases via a pattern-based rewriting system, having first obtained the least general generalization of the fitness cases by AU. Katayama
[17] categorized approaches to IFP into analytical approaches based on analysis of
fitness cases and generate-and-test approaches that create many candidate programs.
The IGOR II algorithm is further extended by Katayama to hybridize these two approaches.
8 Conclusions
In this chapter, we discussed two-way liaisons between GP-based program synthe-
sis and analogical reasoning. We showed that, on one hand, GP can be employed
to solve proportional analogy problems with the aid of structural representations (SIT terms) and a formal Anti-Unification mechanism. On the other hand – and more importantly – we pointed to potential ways of improving the efficiency of a GP
search process via detection and structural characterization of analogies between
fitness cases.
In this study, we have only scratched the surface regarding the exploitation of
analogical reasoning for GP-based program synthesis. For example, we have limited
our attention to analogies built on pairs of tests. Arguably, other interesting and
potentially useful structures could be obtained by working with multiple tests at a
time. We hypothesize that one way of attaining this goal could be via hierarchically
aggregating analogies, i.e. forming analogies of the form Case1 : Case2 ::
Case3 : Case4. Another possibility is to exploit the knowledge captured by
analogies for parent (mate) selection: arguably, two programs that happen to ‘solve’
analogies based on different pairs of tests feature complementary characteristics
that may be worth combining. These observations point to next steps in the research
agenda of analogy-based programming.
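As a minimal sketch of the pairing scheme outlined above (the fitness cases shown are hypothetical, and the disjointness constraint between aggregated pairs is our own assumption):

```python
from itertools import combinations

# Hypothetical fitness cases, each an (input, output) pair.
cases = [('abc', 'abd'), ('ijk', 'ijl'), ('xyz', 'xya'), ('pqr', 'pqs')]

# First-order analogies: each pair of cases yields a proportional
# analogy problem of the form  in1 : out1 :: in2 : out2.
first_order = [(a, b) for a, b in combinations(cases, 2)]

# Hierarchical aggregation: two first-order analogies become the sides
# of a higher-order analogy  Case1 : Case2 :: Case3 : Case4, here
# restricted (by assumption) to analogies over disjoint case pairs.
second_order = [(p, q) for p, q in combinations(first_order, 2)
                if set(p).isdisjoint(q)]
```

With four cases this produces six first-order analogy problems and three disjoint second-order aggregations, illustrating how the search space of analogies grows with the number of tests considered.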
Acknowledgements Thanks are due to Dave Bender and the CRCC in Bloomington
for providing us with the original list of letter-string analogy examples. K. Krawiec acknowledges support from grant 2014/15/B/ST6/05205 funded by the National
Science Centre, Poland. Both authors thank the reviewers for valuable and insightful
suggestions and comments.
References
1. Aamodt, A., Plaza, E.: Case-based reasoning: Foundational issues, methodological variations, and system approaches. AI Commun. 7(1), 39–59 (1994)
2. Baumgartner, A., Kutsia, T.: A Library of Anti-Unification Algorithms. RISC Report Series 14-07, Research Institute for Symbolic Computation (RISC), Johannes Kepler University Linz, Schloss Hagenberg, 4232 Hagenberg, Austria (2014)
3. Cooke, H., Tredennick, H.: Aristotle: The Organon, vol. 1. Harvard University Press (1938)
4. Dastani, M., Indurkhya, B., Scha, R.: Analogical projection in pattern perception. J. Exp.
Theor. Artif. Intell. 15(4), 489–511 (2003). DOI 10.1080/09528130310001626283
5. Dastani, M., Marchiori, E., Voorn, R.: Finding perceived pattern structures using Genetic Programming. In: L. Spector, et al. (eds.) Proceedings of the Genetic and Evolutionary Computation Conference (GECCO-2001), pp. 3–10. Morgan Kaufmann, San Francisco, California, USA (2001)
6. Dershowitz, N.: The evolution of programs: Program abstraction and instantiation. In: Pro-
ceedings of the 5th International Conference on Software Engineering, ICSE ’81, pp. 79–
88. IEEE Press, Piscataway, NJ, USA (1981)
7. Dershowitz, N., Manna, Z.: On automating structured programming. In: G. Huet, G. Kahn
(eds.) IRIA Symposium on Proving and Improving Programs, pp. 167–193. Arc-et-Senans,
France (1975)
8. Ehrenfels, C.v.: Über Gestaltqualitäten. Vierteljahresschr. für Philosophie 14, 249–292 (1890)
9. Evans, T.G.: A heuristic program to solve geometric-analogy problems. In: Proceedings
of the April 21-23, 1964, spring joint computer conference, AFIPS ’64 (Spring), pp. 327–
338. ACM, New York, NY, USA (1964)
10. Falkenhainer, B., Forbus, K.D., Gentner, D.: The Structure-Mapping Engine: Algorithm and
Examples. Artificial Intelligence 41(1), 1–63 (1989)
11. French, R.M.: The subtlety of sameness: A theory and computer model of analogy-making.
The MIT Press (1995)
12. van der Helm, P., Leeuwenberg, E.: Avoiding explosive search in automatic selection of sim-
plest pattern codes. Pattern Recognition 19(2), 181–191 (1986)
13. Hofmann, M.: Igor II - an analytical inductive functional programming system. In: Proceedings of the 2010 ACM SIGPLAN Workshop on Partial Evaluation and Program Manipulation,
pp. 29–32 (2010)
14. Hofstadter, D.R.: Fluid Concepts and Creative Analogies: Computer Models of the Funda-
mental Mechanisms of Thought. Basic Books, Inc., New York, NY, USA (1996)
15. Holyoak, K.J., Thagard, P.: Analogical mapping by constraint satisfaction. Cognitive Science
13(3), 295–355 (1989). DOI 10.1207/s15516709cog1303_1
16. Hummel, J.E., Holyoak, K.J.: Distributed representations of structure: A Theory of Analogical
Access and Mapping. Psychological Review pp. 427–466 (1997)
17. Katayama, S.: An analytical inductive functional programming system that avoids unintended
programs. In: Proceedings of the ACM SIGPLAN 2012 Workshop on Partial Evaluation
and Program Manipulation, PEPM ’12, pp. 43–52. ACM, New York, NY, USA (2012).
DOI 10.1145/2103746.2103758
18. Kocsis, Z.A., Swan, J.: Asymptotic Genetic Improvement programming via type functors and catamorphisms. In: C. Johnson, K. Krawiec, A. Moraglio, M. O’Neill (eds.) Semantic Methods in Genetic Programming. Ljubljana, Slovenia (2014). Workshop at the Parallel Problem Solving from Nature 2014 conference
19. Kovitz, B., Swan, J.: Structural stigmergy: A speculative pattern language for metaheuristics.
In: Proceedings of the Companion Publication of the 2014 Annual Conference on Genetic
and Evolutionary Computation, GECCO Comp ’14, pp. 1407–1410. ACM, New York, NY,
USA (2014). DOI 10.1145/2598394.2609845
20. Krawiec, K.: Behavioral Program Synthesis with Genetic Programming, 1st edn. Springer
Publishing Company, Incorporated (2015)
21. Krawiec, K., O’Reilly, U.M.: Behavioral Programming: A broader and more detailed take on
Semantic GP. In: Proceedings of the 2014 Annual Conference on Genetic and Evolutionary
Computation, GECCO ’14, pp. 935–942. ACM, New York, NY, USA (2014). DOI 10.1145/2576768.2598288
22. Krawiec, K., Swan, J.: Guiding evolutionary learning by searching for regularities in behav-
ioral trajectories: A case for representation agnosticism. In: AAAI Fall Symposium: How
Should Intelligence be Abstracted in AI Research (2013)
23. Krawiec, K., Swan, J., O’Reilly, U.M.: Behavioral program synthesis: Insights and prospects. In: R. Riolo, W.P. Worzel, K. Groscurth (eds.) Genetic Programming Theory and Practice XIII, Genetic and Evolutionary Computation. Springer, Ann Arbor, USA (2015). Forthcoming
24. Leeuwenberg, E., van der Helm, P.: Structural Information Theory: The Simplicity of Visual
Form. Cambridge University Press (2015)
25. Luqi, Goguen, J.A.: Formal methods: Promises and problems. IEEE Softw. 14(1), 73–85
(1997). DOI 10.1109/52.566430
26. Manna, Z., Waldinger, R.: Knowledge and reasoning in program synthesis. In: Programming
Methodology, 4th Informatik Symposium, pp. 236–277. Springer-Verlag, London, UK, UK
(1975)
27. Meijer, E., Fokkinga, M., Paterson, R.: Functional programming with bananas, lenses, en-
velopes and barbed wire. In: Proceedings of the 5th ACM Conference on Functional Pro-
gramming Languages and Computer Architecture, pp. 124–144. Springer-Verlag New York,
Inc., New York, NY, USA (1991)
28. Mitchell, M.: Analogy-making as perception: a computer model. MIT Press (1993)
29. Muggleton, S.: Inductive Logic Programming: Derivations, successes and shortcomings.
SIGART Bull. 5(1), 5–11 (1994). DOI 10.1145/181668.181671
30. Otero, F., Castle, T., Johnson, C.: EpochX: Genetic Programming in Java with statistics and
event monitoring. In: Proceedings of the 14th Annual Conference Companion on Genetic
and Evolutionary Computation, GECCO ’12, pp. 93–100. ACM, New York, NY, USA (2012).
DOI 10.1145/2330784.2330800
31. Phillips, S., Wilson, W.H.: Categorial compositionality: A category theory explanation for the
systematicity of human cognition. PLoS Computational Biology 6(7) (2010)
32. Plotkin, G.D.: A note on inductive generalization. Machine Intelligence 5, 153–163 (1970)
33. Prade, H., Richard, G.: Computational Approaches to Analogical Reasoning: Current Trends.
Springer Publishing Company, Incorporated (2014)
34. Raza, M., Gulwani, S., Milic-Frayling, N.: Programming by example using least general generalizations. In: Proceedings of the Twenty-Eighth AAAI Conference on Artificial Intelligence, July 27-31, 2014, Québec City, Québec, Canada, pp. 283–290 (2014)
35. Reynolds, J.C.: Transformational Systems and the Algebraic Structure of Atomic Formulas.
In: B. Meltzer, D. Michie (eds.) Machine Intelligence 5, pp. 135–151. Edinburgh University
Press, Edinburgh, Scotland (1969)
36. Schmid, U.: Inductive Synthesis of Functional Programs, Universal Planning, Folding of Fi-
nite Programs, and Schema Abstraction by Analogical Reasoning, Lecture Notes in Computer
Science, vol. 2654. Springer (2003). DOI 10.1007/b12055
37. Schmid, U., Burghardt, J.: An algebraic framework for solving proportional and predictive
analogies. In: F. Schmalhofer (ed.) Proceedings of the European Conference on
Cognitive Science, pp. 295–300. Erlbaum (2003)
38. Schmidt, M., Krumnack, U., Gust, H., Kühnberger, K.: Heuristic-Driven Theory Projection: An overview. In: H. Prade, G. Richard (eds.) Computational Approaches to Analogical Reasoning: Current Trends, vol. 548, pp. 163–194. Springer (2014). DOI 10.1007/978-3-642-54516-0_7
39. Silva, S., Dignum, S., Vanneschi, L.: Operator equalisation for bloat free genetic programming
and a survey of bloat control methods. Genetic Programming and Evolvable Machines 13(2),
197–238 (2012). DOI 10.1007/s10710-011-9150-5
40. Stewart, I., Cohen, J.: Figments of Reality: The Evolution of the Curious Mind. Cambridge
University Press (1999)
41. Swan, J., Drake, J., Krawiec, K.: Semantically-meaningful numeric constants for genetic programming. In: C. Johnson, K. Krawiec, A. Moraglio, M. O’Neill (eds.) Semantic Methods in Genetic Programming. Ljubljana, Slovenia (2014). Workshop at the Parallel Problem Solving from Nature 2014 conference
42. Ulrich, J.W., Moll, R.: Program synthesis by analogy. SIGART Bull. (64), 22–28 (1977). DOI
10.1145/872736.806928
43. Van Der Helm, P.A., Leeuwenberg, E.L.J.: Accessibility: A criterion for regularity and hi-
erarchy in visual pattern codes. J. Math. Psychol. 35(2), 151–213 (1991). DOI 10.1016/0022-2496(91)90025-O
44. Weller, S., Schmid, U.: Analogy by abstraction. In: Proceedings of the Seventh International Conference on Cognitive Modeling (ICCM). Trieste, Italy (2006)
45. Weller, S., Schmid, U.: Solving proportional analogies by E-generalization. In: KI 2006:
Advances in Artificial Intelligence, 29th Annual German Conference on AI, KI 2006,
Bremen, Germany, June 14-17, 2006, Proceedings, pp. 64–75 (2006). DOI 10.1007/978-3-540-69912-5_6
46. Yanai, K., Iba, H.: Towards a New Evolutionary Computation: Advances in the Estimation
of Distribution Algorithms, chap. Estimation of Distribution Programming: EDA-based Ap-
proach to Program Generation, pp. 103–122. Springer Berlin Heidelberg, Berlin, Heidelberg
(2006). DOI 10.1007/3-540-32494-1_5
47. Yu, T.: Structure abstraction and genetic programming. In: P.J. Angeline, Z. Michalewicz,
M. Schoenauer, X. Yao, A. Zalzala (eds.) Proceedings of the Congress on Evolution-
ary Computation, vol. 1, pp. 652–659. IEEE Press, Mayflower Hotel, Washington D.C.,
USA (1999). DOI 10.1109/CEC.1999.781995