All content in this area was uploaded by Krzysztof Krawiec on Oct 20, 2016


Discovering Relational Structure in Program Synthesis Problems with Analogical Reasoning

Jerry Swan and Krzysztof Krawiec

Abstract Much recent progress in Genetic Programming (GP) can be ascribed to work in semantic GP, which facilitates program induction by considering program behavior on individual fitness cases. It is therefore interesting to consider whether alternative decompositions of fitness cases might also provide useful information. The one we present here is motivated by work in analogical reasoning. So-called proportional analogies (‘gills are to fish as lungs are to mammals’) have a hierarchical relational structure that can be captured using the formalism of Structural Information Theory. We show how proportional analogy problems can be solved with GP and, conversely, how analogical reasoning can be engaged in GP to provide for problem decomposition. The idea is to treat pairs of fitness cases as if they formed a proportional analogy problem, identify relational consistency between them, and use it to inform the search process.

Key words: Program Synthesis; Genetic Programming; Proportional Analogy; Inductive Logic Programming; Machine Learning

1 Introduction

Perhaps the strongest reason for favouring Genetic Programming (GP) over alterna-

tive machine learning approaches is the explanatory power afforded by the resulting

symbolic descriptions. Whilst other approaches may be faster or more accurate, GP

can provide more compelling insights into observed data than numerically-driven

approaches constrained to a specific model class.

Jerry Swan

Department of Computer Science, University of York, UK

Krzysztof Krawiec

Institute of Computing Science, Poznan University of Technology, Poznań, Poland


To maximize the explanatory power of GP, it is highly desirable to obtain sym-

bolic explanations which appear to the human reader to be not only comprehensi-

ble but also natural. In respect of comprehensibility, there has been considerable

work in combating expression bloat [39]. However, there has been relatively little

emphasis on building human bias into the search process. Since much human bias

originates in universal observations that stem from the speciﬁc constitution of the

natural world, its inclusion may actually lead to both quantitative and qualitative

improvements [41]. Since GP is often used to search for regularities in real-world

data, equipping it with such biases may be desirable, at the least in extracting more

compelling explanations from experimental results [40].

In this chapter, we explore a mechanism for the discovery of a problem’s relational

structure, framed in terms of existing work on analogical reasoning. Analogy can

be considered as ‘a mapping between systems or processes’ and has been described

as ‘the core of cognition’ [14]. In cognitive science, it is understood to provide a

ﬂexible mechanism for re-contextualising situations in terms of prior (or hypotheti-

cal) experience and is also considered a key mechanism for escaping dichotomies of

representation [31], which is argued to be of general importance for Computational

Intelligence [22].

We start with a brief overview of analogy as a computational mechanism in Sec-

tion 2. In Section 3, we present the formalism of Structural Information Theory for

building the relational structures needed for the proposed approach. In Section 4, we

present GPCAT, a framework for solving proportional analogy problems using GP,

and experimentally assess its performance in Section 5. In Section 6, we explain

how similar mechanisms can be used to aid GP applied to conventional program

synthesis problems. In Section 7 we discuss the related work, and summarize this

study in Section 8.

2 Analogical reasoning

The use of analogy as a computational mechanism dates back to Evans’ famous

geometric reasoner [9]. More recent computational models include the Structure

Mapping Engine (SME) [10], the connectionist models ACME [15] and LISA [16],

Heuristic-Driven Theory Projection [38] and some matching techniques used in

Case-Based Reasoning [1]. A short article can only provide a brief overview of the

wide range of literature: considerably more detail is available in the recent volume

by Prade and Richard [33]. As distinct from predictive analogy, which is concerned

with inferring properties of a target object as a function of its similarity to a source

object, our interest here is in the application of proportional analogy.

The roots of analogical proportion can be traced as far back as Aristotle [3]. A

proportional analogy problem, denoted A:B::C:D, is concerned with ﬁnding D

such that D is to C as B is to A. The ‘microdomain’ of Letter String Analogy (LSA)

Problems (e.g. abc : abd :: ijk : ?) can be considered exemplary and is of long-

standing interest: although seemingly simple, the domain can require remarkable


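[Figure 1: two commutative squares. Left: A -h-> B and C -h-> D (horizontal mappings), A -v-> C and B -v-> D (vertical mappings). Right: the same square instantiated as abc -h-> abd, abc -v-> ijk, abd -v-> ?, ijk -h-> ?.]

Fig. 1 Commutativity of proportional analogy [37]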

sophistication [11]. As can be seen in Fig. 1, proportional analogy problems can

also be considered to form a commutative diagram [37]. Notable approaches to LSA

problems include Hofstadter and Mitchell’s CopyCat [14] and the Anti-Uniﬁcation

based approach of Weller and Schmid [45]. When studied in the context of core AI

research and cognitive science, LSAs are often left ‘open-ended’:

abc : abd :: ijk : ?

abc : abd :: iijjkk : ?

abc : abd :: mrrjjj : ?

abc : abd :: xyz : ?

Posed in this way, LSAs are unlike traditional instances of computational problem solving: in general, an LSA problem has no singular ‘Platonic’ solution, so it is difficult to define an objective measure for solution quality in a ‘top down’

fashion. Nevertheless, humans confronted with LSA problems typically converge on

a few answers that occur with relatively stable frequencies. For instance, the most

common answers to the above LSAs are respectively ijl, iijjll, mrrkkk and

xya, which corroborates the existence of human bias.

3 Capturing relational structure

Any method that is intended to deal with proportional analogy problems requires

some (formal or informal) means of capturing the relational structure of objects in

the domain (here: letter strings). Ideally, such a mechanism should take into account

the natural biases discussed in Section 1. One means of representing and quantifying

such bias is via the use of Structural Information Theory (SIT) [24]. SIT is a formal-

ism of relational structure which also provides a complexity metric. In contrast to the

complexity metrics of Algorithmic Information Theory (e.g. Kolmogorov complexity), SIT

is explicitly designed to correspond to the principles of human Gestalt perception

[8], intended to explain human propensity to prefer certain perceptual groupings.

The rules of Gestalt are readily illustrated in visual perception, where they explain

the inclination for grouping smaller objects into larger shapes, grouping objects by

proximity, closing partially occluded curves, and others.

The original description of SIT due to Leeuwenberg [24] describes linear, one-

dimensional patterns of objects in terms of repetition, alternation and symmetry,

subsequently extended to a recursive algebraic description by Dastani et al. [4]. It is


the latter that we use here: repetition is denoted by the iterated application of some designated function, e.g. Iter(ab, id, 2) (where id is the identity function) denotes the pattern abab and Iter(a, succ, 3) (where succ is the successor function) denotes abc. Alternation denotes a sequence into which an object is interleaved. It has ‘left’ and ‘right’ variants: for example, AltL(a, xyz) describes axayaz and AltR(a, xyz) describes xayaza. Symmetry denotes a sequence followed by its reversal, and occurs in an ‘even’ form (SymE(ab) = abba) and an ‘odd’ form (SymO(ab, c) = abcba).
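To make the operators concrete, the following Python sketch expands each of them into a letter string. The function names, the representation of patterns as plain strings, and the wrap-around of succ past ‘z’ are illustrative assumptions rather than part of the SIT formalism of [4].

```python
from typing import Callable

def shift(s: str, k: int) -> str:
    """Shift every lowercase letter of s by k positions (assumed to wrap a..z)."""
    return "".join(chr((ord(c) - ord("a") + k) % 26 + ord("a")) for c in s)

# Function literals: each maps a pattern to the next pattern of an iteration.
succ: Callable[[str], str] = lambda s: shift(s, 1)
pred: Callable[[str], str] = lambda s: shift(s, -1)
ident: Callable[[str], str] = lambda s: s

def iter_(x: str, f: Callable[[str], str], n: int) -> str:
    """Iter(x, f, n): x, f(x), f(f(x)), ... concatenated, n items in total."""
    out = []
    for _ in range(n):
        out.append(x)
        x = f(x)
    return "".join(out)

def alt_l(a: str, xs: str) -> str:
    """AltL(a, xs): a is placed before every element of xs."""
    return "".join(a + x for x in xs)

def alt_r(a: str, xs: str) -> str:
    """AltR(a, xs): a is placed after every element of xs."""
    return "".join(x + a for x in xs)

def sym_e(x: str) -> str:
    """SymE(x): x followed by its reversal."""
    return x + x[::-1]

def sym_o(x: str, pivot: str) -> str:
    """SymO(x, pivot): x, then the pivot, then the reversal of x."""
    return x + pivot + x[::-1]
```

Because each function returns a string, nested terms are written as nested calls, e.g. `sym_e(iter_("a", succ, 3))`. For simplicity the alternation helpers treat every character of `xs` as one element; nested groups would need a list-based representation.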

A SIT term determines a unique string, but the opposite does not hold: a given sequence may clearly be representable by many different SIT descriptions. For example, abccba can be represented both by SymE(Iter(a, succ, 3)) and SymO(ab, Iter(c, id, 2)). Associated with each structural description is the notion of information load, intended to quantify human preference between alternative relational descriptions, those with lower information loads being preferable. The measure of information load we adopt here is due to Dastani et al. [4], which modifies the previous formulation of van der Helm and Leeuwenberg [43] and is defined as the sum of occurrences of individual elements in a SIT description, not including the SIT operators themselves. Thus, while Iter(ab, id, 2) and AltL(a, bb) both represent SIT descriptions of abab, the former has an information load of 2 and the latter 3.
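The two load values quoted above can be reproduced with a short Python sketch. It assumes that the load counts letter occurrences only, which is a reading consistent with those two values; the authoritative definition is the one in [4]. The nested-tuple encoding of terms and the `Fn` enumeration are hypothetical conveniences.

```python
from enum import Enum

class Fn(Enum):
    """Function literals, kept distinct from letter strings."""
    SUCC = "succ"
    PRED = "pred"
    ID = "id"

def info_load(term) -> int:
    """Count letter occurrences in a term given as nested tuples.

    A term is either a letter string or a tuple (operator_name, arg1, ...).
    Operator names, function literals and numeric constants add nothing
    (an assumption; see the lead-in).
    """
    if isinstance(term, tuple):
        return sum(info_load(arg) for arg in term[1:])
    if isinstance(term, str):
        return len(term)
    return 0

iter_term = ("Iter", "ab", Fn.ID, 2)   # describes abab
alt_term = ("AltL", "a", "bb")         # also describes abab
```

Under this reading, `info_load(iter_term)` is 2 and `info_load(alt_term)` is 3, so the Iter description would be preferred.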

3.1 Finding SIT terms with GP

In the recursive variant of SIT described above, the patterns appearing in a SIT relation can themselves be SIT relations. This lends itself to a direct representation of SIT relations as nodes in a tree structure, allowing the use of GP to find a SIT

description for a given string [4]. As mentioned above, it is desirable to search for

SIT structures of low complexity, as given by the information load measure. How-

ever, this quantity alone cannot effectively drive the search, as the relations found

by GP have to produce the target string in the ﬁrst place. Therefore, we deﬁne our

ﬁtness function as:

$$f(t) = \mathrm{Lev}(t, s) + 0.001 \cdot \mathrm{InfLoad}(t), \qquad (1)$$

where t is the SIT term being evaluated, s is the string to be reproduced, Lev(t, s) is the Levenshtein distance between the string produced by t and s, and InfLoad(t)

is the information load. The ﬁtness function effectively realizes lexicographic or-

dering of search objectives, prioritizing matching the target string. Alternatively, a

multiobjective evolutionary search could be engaged here.
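As a minimal sketch of Eq. (1), the following Python code computes the Levenshtein distance with the standard dynamic-programming recurrence and adds the weighted information load. The function names are illustrative, and the string produced by a term and its information load are assumed to be computed elsewhere and passed in.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b (unit-cost insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                # delete ca
                           cur[j - 1] + 1,             # insert cb
                           prev[j - 1] + (ca != cb)))  # substitute or match
        prev = cur
    return prev[-1]

def fitness(produced: str, target: str, info_load: int) -> float:
    """Eq. (1): distance to the target plus 0.001 times the information load."""
    return levenshtein(produced, target) + 0.001 * info_load
```

Since the distance is an integer, the load term can only break ties between terms with equal distance as long as the information load stays below 1000, which is what makes the weighted sum behave like a lexicographic ordering for strings of this size.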

The instruction set of our GP setup includes all the algebraic relations presented

above, i.e. Iter, AltR, AltL, SymE, SymO, and the Sequence and Group relations

that respectively cater for ﬂat and nested (hierarchical) sequences. There are also

numeric constants that Iter needs to determine the number of iterations and function

literals: succ, pred, and id. Terms, numeric constants and function literals form three types handled by strongly-typed GP mechanisms. Using EpochX GP [30], we evolve a population of 100 individual SIT relations, initialized with Koza’s

‘Grow’ method with program height set to 3. The upper limit on expression height

in evolution is 8. Evolution lasts for 100 generations. All other parameters are as per

the EpochX defaults.

We applied the above GP setup to all 35 unique letter strings occurring in the

problems originally considered by Mitchell [28] (Section 2), i.e.

abc abd ace cab cba cde cmg

cmz edc glz ijk kji mrr qbc

rst rsu xcg xlg xyz aabc aabd

abcd abcm abcn ijkk rijk aababc aabbcc

aabbcd hhwwqq iijjkk lmfgop mrrjjj rssttt xpqdef

and repeated each run 10 times. On average, GP finds a correct SIT term (i.e. reproducing the target string perfectly, with Lev(t, s) = 0) in 93.4% of runs. For most

strings, the success rate is 10/10, and the worst success rate is 4/10 (for lmfgop).

The average information load amounts to 2.835, and the average number of nodes

in a term is 7.190. GP managed to find SIT terms with minimal or close-to-minimal

load for many problems, for instance:

• ijkk: Group(Iter(i, Succ, 3), k)

• aabbcc: Iter(Group(a, a), Succ, 3)

• xpqdef: Seq(Group(x, Group(Iter(p, Succ, 2), Iter(d, Succ, 2))), f)

Arguably, optimal SITs for these small problems could be found via exhaustive

search. However, for more complex problems that we wish to handle prospectively,

resorting to heuristic search is likely to be unavoidable.

4 Solving proportional analogies with GP

The CopyCat program [14] is a cognitive model of proportional analogy. Although

very carefully engineered, the speciﬁcs of the interactions between its architectural

elements (described in more detail in Section 7) are somewhat complex. While they

have been described at length [28, 14], this has nonetheless been done in a relatively informal fashion. It is therefore interesting to see if comparable results can be

obtained by combining more readily-demarcated methods.

We therefore propose GPCAT, a GP-based method for tackling proportional

analogies, with which we intend to achieve several goals:

1. Compose well-known formalisms like SITs, Anti-Uniﬁcation, and GP, rather

than the mechanisms that are speciﬁc to CopyCat.

2. Verify GP’s usefulness for solving proportional analogies.

3. Prospectively extend/substitute GPCAT’s components with formalisms for handling other domains more common to program synthesis.


Algorithm 1 Anti-unification algorithm for two terms.

function AU(x, y)
    if x = y then
        return x
    else if x = f(x1, ..., xn) ∧ y = f(y1, ..., yn) then
        return f(AU(x1, y1), ..., AU(xn, yn))
    else
        return φ
    end if
end function

There are three main components of GPCAT:

1. A domain-speciﬁc relational formalism (in this case SIT, Section 3).

2. An Anti-Uniﬁcation algorithm.

3. A GP algorithm.

Anti-Unification (AU) is a procedure that extracts the common substructure of a set of terms T. The AU of T is itself a term, with some subterms replaced with variables. The defining property of such a term u (the Anti-Unifier) is that for each t ∈ T there exists a substitution σ (i.e. a mapping from variables to terms) such that, when applied to u, it makes it equal to t, i.e., uσ = t. In fact, u has the important property of being the most specific such term: informally, it preserves as much of the common structure as possible.

Algorithms for n-ary Anti-Unification were invented more-or-less simultaneously by Reynolds [35] and Plotkin [32]. For our purposes, anti-unification of two terms (as per Algorithm 1) will suffice. The value φ denotes a so-called ‘fresh variable’, which maps to x under some substitution σ_x and to y under σ_y. The expressiveness of AU is dependent on how equality between terms is defined: in the case of syntactic AU that we consider here, function symbols are simply unique labels, with no intrinsic meaning.

Anti-Uniﬁcation has been used in the solution of proportional analogy problems

by Weller and Schmid [45]. Their algorithm is as follows [44]:

1. Use AU to compute the common structure of the terms A and C (Fig. 1), with associated substitutions σ_A, σ_C.

2. Determine D as σ_C(σ_A^{-1}(B)).

For illustration, consider letter strings A = abcg and C = ccbbaah. Their natural representations in terms of SITs are respectively the following terms:

• Seq(Iter(a, succ, 3), g)

• Seq(Iter(Group(c, c), pred, 3), h)

The above algorithm returns the following AU of these terms: Seq(Iter($1, $2, 3), $3) with substitutions σ_A = {$1 ↦ a, $2 ↦ succ, $3 ↦ g} and σ_C = {$1 ↦ Group(c, c), $2 ↦ pred, $3 ↦ h}.
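The worked example above can be reproduced with a small Python sketch of syntactic anti-unification of two terms. Terms are encoded as nested tuples `(functor, arg1, ...)` with strings and integers as leaves; this encoding, the `$n` variable naming and the helper `substitute` are assumptions made for illustration. Unlike Algorithm 1 as printed, the sketch reuses the same variable when the same pair of mismatching subterms recurs, which is a design choice that keeps the result as specific as possible.

```python
def anti_unify(x, y):
    """Return (u, sigma_x, sigma_y) such that applying sigma_x to u gives x
    and applying sigma_y to u gives y."""
    table = {}  # (subterm of x, subterm of y) -> variable name

    def au(a, b):
        if a == b:
            return a
        if (isinstance(a, tuple) and isinstance(b, tuple)
                and a[0] == b[0] and len(a) == len(b)):
            # Same functor and arity: recurse argument-wise.
            return (a[0],) + tuple(au(p, q) for p, q in zip(a[1:], b[1:]))
        if (a, b) not in table:
            table[(a, b)] = f"${len(table) + 1}"  # fresh variable
        return table[(a, b)]

    u = au(x, y)
    sigma_x = {var: a for (a, _), var in table.items()}
    sigma_y = {var: b for (_, b), var in table.items()}
    return u, sigma_x, sigma_y

def substitute(u, sigma):
    """Apply a substitution (variable name -> term) to the term u."""
    if isinstance(u, tuple):
        return (u[0],) + tuple(substitute(arg, sigma) for arg in u[1:])
    return sigma.get(u, u)

t_a = ("Seq", ("Iter", "a", "succ", 3), "g")                  # abcg
t_c = ("Seq", ("Iter", ("Group", "c", "c"), "pred", 3), "h")  # ccbbaah
u, sigma_a, sigma_c = anti_unify(t_a, t_c)
```

Here `u` is the tuple form of Seq(Iter($1, $2, 3), $3), and applying `substitute(u, sigma_a)` or `substitute(u, sigma_c)` recovers the two original terms.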


4.1 The GPCAT algorithm

We now describe the application of GPCAT to the LSA domain. As we argue later,

it can be also generalized to handle certain types of program synthesis problem.

Given a proportional analogy problem, GPCAT generates a formal description of detected analogies/relationships, i.e. a set of expressions with variables, which can then be instantiated to generate the answers (i.e., the possible values for D). For some analogy of the form A : B :: C : D (Fig. 1), GPCAT maintains a population of solutions, each of which is a triple of SIT terms, (t_A, t_B, t_C), intended to capture the respective structures for A, B, and C. The terms are subject to the same genetic search operators as in the single-term experiment presented in Section 3.1. The mutation operator randomly picks the term to be modified from t_A, t_B, and t_C; then, the selected term undergoes mutation as in Section 3.1, while the remaining two terms remain intact. Crossover operates analogously, i.e. the resulting offspring solutions diverge from the parents in only one of the terms.

The search goal is to synthesize a triple of SIT terms that not only reproduces the strings in the LSA problem, but also forms a plausible analogical structure and ultimately yields the correct D. To this end, we attempt to capture the analogy between the horizontal and vertical mappings (h and v in Fig. 1) by performing Anti-Unification of their outcomes. As D is not given, the only explicitly known mappings are h(t_A) = t_B and v(t_A) = t_C. These mappings share the same left-hand side t_A, so we perform Anti-Unification of their right-hand sides only, i.e., of t_B and t_C. This is also motivated by the fact that in most LSA problems, A plays the role of a mere ‘anchor’ for the symbols occurring in B and C; for instance, in all but three LSA problems considered in [28], A is a sequence of three consecutive characters, typically abc.

We embed these computations into a ﬁtness function which, for a given candidate

solution (tA,tB,tC), proceeds as follows:

1. Perform Anti-Unification of t_B and t_C to factor out their common substructure. This results in a term u with a number of variables $i, i = 1, ..., k, and two substitutions σ_B and σ_C, such that uσ_B = t_B and uσ_C = t_C. Technically, both σ_B and σ_C are sets of mappings from variables to subterms, e.g., σ_B = {$1 ↦ a, $2 ↦ Group(a, b)}. Symbols in right-hand sides of substitutions are represented as integer offsets w.r.t. the ‘lowest’ character occurring in the term (the importance of this will become clear in the example that follows).

2. Generate all 2^k combinations of mappings from σ_B and σ_C, resulting in 2^k ‘artificial’ substitutions σ_j, j = 1, ..., 2^k (for low values of k in typical LSA problems, this can be done exhaustively).

3. Apply each σ_j independently to u, which results in a list of 2^k SIT terms. Express (i.e. ‘flatten’) the terms, thus obtaining up to 2^k letter strings (distinct SIT terms, when expressed, may result in the same letter string). The resulting letter strings are the candidate answers, i.e. the proposed values of D, for the considered LSA problem.


4. Characterize the candidate solution (t_A, t_B, t_C) and the formal objects obtained in the above steps using the following indicators:

• L = Lev(t_A, A) + Lev(t_B, B) + Lev(t_C, C), the total Levenshtein distance between the expressed t_A, t_B, and t_C and, respectively, A, B and C; to be minimized (cf. Lev in Section 3.1).

• I = InfLoad(t_A) + InfLoad(t_B) + InfLoad(t_C), the total information load; to be minimized.

• M, the total number of variables in u (equal also to the number of mappings in σ_B and σ_C); to be maximized, as the presence of multiple mappings may signal good structural correspondence of t_B and t_C.

• N, the number of mappings to null value (i.e. $j ↦ ε); to be minimized, as such mappings signal structural inconsistency between u and the SIT terms it has been obtained from.

The indicators computed in step 4 form a multiobjective characterization of the

evaluated candidate solution, and can be either aggregated into a single scalar ﬁtness

or handled by a multiobjective selection procedure. In this study, we follow the

former option, and deﬁne minimized ﬁtness as:

$$f((t_A, t_B, t_C)) = L + N + 0.01 \cdot (I - M) \qquad (2)$$

By taking into account several indicators, we mandate evolution to optimize all as-

pects of the analogy models simultaneously, i.e., conformance of SIT terms with

the underlying LSA problem (L), low complexity of terms (I), and good Anti-Unification (M and N). Our fitness prioritizes L and N, i.e., puts solution correctness

ﬁrst.
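The prioritization expressed by Eq. (2) can be checked numerically with a one-line Python sketch. The parameter names are illustrative and the indicator values used in the comparison are hypothetical, chosen only to show the effect of the weights.

```python
def gpcat_fitness(lev_total: int, null_maps: int, info_load: int, num_vars: int) -> float:
    """Eq. (2): f = L + N + 0.01 * (I - M), to be minimized."""
    return lev_total + null_maps + 0.01 * (info_load - num_vars)

# Hypothetical candidates: a complex but exact one, and a simple inexact one.
exact_but_complex = gpcat_fitness(lev_total=0, null_maps=0, info_load=10, num_vars=1)
simple_but_inexact = gpcat_fitness(lev_total=1, null_maps=0, info_load=2, num_vars=3)
```

The exact candidate wins despite its higher information load. In general the integer-valued L and N dominate as long as the magnitude of I − M stays below 100.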

Note that the proposed ﬁtness function does not involve D, even if it is known.

The correct D is expected to appear in the letter string list obtained in step 3.

We work through these steps for abc : abd :: ijk : ? (Fig. 1) and a candidate

solution

(t_A, t_B, t_C) = (Iter(a, Succ, 3), Seq(Iter(a, Succ, 2), d), Iter(i, Succ, 3))

Note that this solution reproduces all three strings perfectly, so its L = 0.

The anti-unifier of t_B and t_C (step 1 of GPCAT), calculated using a first-order, rigid, unranked AU algorithm [2], is given by:

Seq(Iter($1, succ, $2), $3),

with σ_B = {$1 ↦ a, $2 ↦ 2, $3 ↦ d}, and σ_C = {$1 ↦ i, $2 ↦ 3, $3 ↦ ε}. Now, as signalled in Step 1 of GPCAT, the symbols in right-hand sides of substitutions are represented as offsets w.r.t. the lowest characters (here a and i, respectively), so the substitutions take the following form (note the changed mappings of $1 and $3): σ_B = {$1 ↦ 0, $2 ↦ 2, $3 ↦ 3}, and σ_C = {$1 ↦ 0, $2 ↦ 3, $3 ↦ ε}. With k = 3 variables, there are 2^3 = 8 artificial substitutions σ_j that can be built by combining the individual mappings from σ_B and σ_C (step 2 of GPCAT). Among them, there is σ_3 = {$1 ↦ 0, $2 ↦ 2, $3 ↦ 3}, which for initial character i produces ijl, the most natural answer to this LSA problem.
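Steps 2 and 3 for this example can be sketched in Python as follows. The offset-based substitutions are taken from the text; the function names are hypothetical, `express` is hard-coded for the anti-unifier Seq(Iter($1, succ, $2), $3) rather than being a general SIT interpreter, and the modulo-26 wrap-around is an assumption about how offsets are turned back into letters.

```python
from itertools import product

EPS = None  # stands for the null mapping (epsilon)

sigma_b = {"$1": 0, "$2": 2, "$3": 3}
sigma_c = {"$1": 0, "$2": 3, "$3": EPS}

def artificial_substitutions(sb, sc):
    """Step 2: all 2^k ways of taking each variable's mapping from sb or sc."""
    names = sorted(sb)
    for choice in product(*[(sb[v], sc[v]) for v in names]):
        yield dict(zip(names, choice))

def letter(base: str, offset: int) -> str:
    """Letter at the given offset from base, wrapping around a..z (assumed)."""
    return chr((ord(base) - ord("a") + offset) % 26 + ord("a"))

def express(sigma, base: str) -> str:
    """Step 3: flatten Seq(Iter($1, succ, $2), $3) for a given initial character."""
    run = "".join(letter(base, sigma["$1"] + i) for i in range(sigma["$2"]))
    tail = "" if sigma["$3"] is EPS else letter(base, sigma["$3"])
    return run + tail

subs = list(artificial_substitutions(sigma_b, sigma_c))
candidates_ijk = {express(s, "i") for s in subs}
candidates_xyz = {express(s, "x") for s in subs}
```

The eight substitutions collapse to four distinct strings for initial character i, among them ijl. With the wrap-around assumption, the same substitutions applied to initial character x include xya.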

5 The experiment

We applied GPCAT to 32 out of 35 LSA problems originally considered by Mitchell [28], i.e. those problems with A being a sequence of three consecutive letters. The instruction set (SIT operators) and evolutionary parameters were set as in Section 3.1, except for a higher initial tree height (5, to promote diversity in the initial population) and lower than usual selection pressure (tournament of size 2, in order to promote exploration and lower the risk of premature convergence). This time we relied on an implementation based on the FUEL evolutionary computation library written in Scala¹.

The best-of-run solutions resulting from particular runs were subject to evaluation, and the lists of answers to the problem (i.e. element ‘D’ in A : B :: C : D) were collected over 30 runs for each LSA problem. The following table presents the five most frequently occurring answer strings per 30 runs of GPCAT for selected problems from the considered suite. Each string is accompanied by the percentage of runs in which it occurred.

Problem             Most frequent answers (% of runs)
abc:abd::ijk        ijl: 100     ik: 7        bcd: 7       abbd: 7      ac: 7
abc:abd::xyz        xya: 100     bcd: 7       abbd: 7      xz: 0.07     ac: 7
abc:abd::kji        ijl: 70      cba: 57      kln: 17      bce: 10      jl: 7
abc:qbc::iijjkk     aabbcc: 53   ijl: 43      ab: 23       ij: 23       ik: 10
abc:abd::mrrjjj     jkm: 67      iiaaa: 33    rrjjj: 33    jrrjjj: 17   diiaaa: 17

By contrast, CopyCat’s responses [28] are as follows (also as percentages of runs, but this time they sum to 100%, as a CopyCat run produces a single answer):

Problem             Most frequent answers (% of runs)
abc:abd::ijk        ijl: 96.9      ijd: 2.7       ijk: 0.2       hjk: 0.1       ijj: 0.1
abc:abd::xyz        xyd: 81.1      wyz: 11.4      yyz: 6         dyz: 0.7       xyz: 0.4
abc:abd::kji        kjh: 56.1      kjj: 23.8      lji: 18.6      kjd: 1.1       kki: 0.3
abc:abd::iijjkk     iijjll: 81.0   iijjkl: 16.5   iijjdd: 0.9    iikkll: 0.9    iijkll: 0.3
abc:abd::mrrjjj     mrrkkk: 70.5   mrrjjk: 19.7   mrrjkk: 4.8    mrrjjjj: 4.2   mrrjjd: 0.6

GPCAT’s outcomes tend to only partially coincide with those of CopyCat: for instance, for the first problem ijl is the most common answer in both methods, while for the second problem their outcomes do not overlap at all (one of the reasons being that GPCAT’s process of variable alignment has the concept of modulo built in, whereas, by design, CopyCat’s domain knowledge excludes a successor to ‘z’). One

possible research direction is thus tweaking and extending GPCAT in order to match

the distribution of human answers (of which CopyCat, despite being concerned with

¹ https://github.com/kkrawiec/fuel


plausible solutions rather than slavish reproduction of human bias, is arguably the

best known computational model).

However, exact mimicking of human behaviour, though interesting from the

viewpoint of cognitive science, might be of lesser importance for program synthesis. What might be more essential in the latter context is the concept of proportional analogy itself, and a generative mechanism for creating such analogies based on structural Anti-Unification. We discuss this perspective in the following section.

6 Analogies in program synthesis

Let us now illustrate why we ﬁnd analogical reasoning a useful concept for test-

based program synthesis. Consider the domain of list manipulation and the task of

synthesizing the append function. Let the desired behaviour of that function be

speciﬁed by the following set of tests:

append([1,2], []) = [1,2]

append([1,2], [3]) = [1,2,3]

append([1,2,3], []) = [1,2,3]

append([a,b], [c]) = [a,b,c]

By selecting pairs of tests from this list, we may form the following proportional

analogies:

(1) ([1,2],[3]) → [1,2,3]
    ([a,b],[c]) → [a,b,c]

(2) ([1,2],[]) → [1,2]
    ([1,2,3],[]) → [1,2,3]

(3) ([1,2],[3]) → [1,2,3]
    ([1,2,3],[]) → [1,2,3]

These analogies capture three unrelated characteristics of the synthesis task. The

ﬁrst one is type-related and says that append takes no notice of the nature of the

list elements: in a sense, it behaves ‘modulo’ type, whether list elements are charac-

ters or numbers. The second analogy concerns more the operational characteristics

of append, and signals that if the second argument of append is an empty list,

then the expected result is the ﬁrst argument. The third analogy might be seen as

expressing an invariant; i.e. that moving the head of the second list to the end of the

ﬁrst list does not change the outcome.

On the face of it, these analogies express quite trivial facts. Nevertheless, our case

in point is that just by juxtaposing existing tests (i.e., without reaching to any source

of extra knowledge), we obtain concepts that capture various qualities of desired

behaviour. We claim that (i) identiﬁcation of such qualities and (ii) their separation

can make program synthesis more efﬁcient. Conventional GP has all these test cases

at its disposal, yet is completely oblivious to this opportunity.
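As a rough illustration of juxtaposing tests, the Python sketch below enumerates all pairs of the four append tests and checks three hand-written relational properties corresponding to the analogies above. The predicates and their names are hypothetical stand-ins: in the approach argued for here such regularities would be induced (e.g. via Anti-Unification) rather than coded by hand.

```python
from itertools import combinations

# Each test is ((first_list, second_list), expected_output).
tests = [
    (([1, 2], []), [1, 2]),
    (([1, 2], [3]), [1, 2, 3]),
    (([1, 2, 3], []), [1, 2, 3]),
    ((["a", "b"], ["c"]), ["a", "b", "c"]),
]

# Each pair of tests (A -> B, C -> D) is read as an analogy A : B :: C : D.
analogies = list(combinations(range(len(tests)), 2))

def same_shape(i, j):
    """Inputs have equal list lengths but different elements (‘modulo’ type)."""
    (xs1, ys1), _ = tests[i]
    (xs2, ys2), _ = tests[j]
    return (len(xs1), len(ys1)) == (len(xs2), len(ys2)) and (xs1, ys1) != (xs2, ys2)

def empty_second_arg(i, j):
    """In both tests the second argument is empty and the output is the first."""
    return all(ys == [] and out == xs for (xs, ys), out in (tests[i], tests[j]))

def same_output(i, j):
    """Different inputs lead to the same output (an invariant)."""
    return tests[i][1] == tests[j][1]

found = {p.__name__: [pair for pair in analogies if p(*pair)]
         for p in (same_shape, empty_second_arg, same_output)}
```

Of the six possible pairs, each property singles out exactly one, matching the three analogies in the text (indices refer to positions in `tests`).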

We believe that this can provide a basis for the induction of high-level, ‘global’

descriptions of a set of ﬁtness cases from repeated encounters with local ones by


the search process [19]. This then raises the wide-ranging research question of how

to exploit such induced invariants for use as search drivers [21, 20, 23], i.e. addi-

tional quasi-objectives that guide the search process. Depending on the domain, it

may be possible to express them as predicates in the same function set as is used

to solve the problem. Alternatively, it may be desirable to add induced invariants

to a competitive co-evolutionary population of constraints. In either case, our ap-

proach yields relational linkages in a functional, hierarchical manner, as opposed to

the traditional models of relational linkage occasionally used in stochastic program

induction, which are primarily probabilistic [46].

In a broader perspective, of particular interest here is the prospect of using the

generative aspects of GP to help address a persistent problem in formal methods. As

observed by Luqi and Goguen [25], “formal methods tend to be brittle or discontin-

uous – a small change in the domain can require a great deal of new work”. Since

formal approaches can be sensitive to the particular manner in which their input is

presented, the ability to generate alternative representations for inputs may bring

beneﬁts not available to either approach in isolation. Conversely, it was observed

by Kocsis and Swan that the formal structure of inductively-deﬁned datatypes can

usefully be exploited for GP purposes, e.g. to eliminate otherwise stochastic opera-

tions [18]. We might also hope to make use of this kind of structure for our current

purposes. For example, it is well-known that lists (and indeed algebraic datatypes in

general) can be expressed in a relational manner, in this case via the type construc-

tors Nil and Cons. Hence [1,2,3] can be expressed as:

Cons(1, Cons(2, Cons(3, Nil)))

Using SIT-style relations, this can be represented as Iter(Nil, succ, 3)². Hence, the second analogy can be represented by the Anti-Unifier:

(App(Iter(Nil, succ, $1), Nil), Iter(Nil, succ, $1))

with substitutions σ_1 = {$1 ↦ 2}, σ_2 = {$1 ↦ 3}, in congruence with the fact that

appending Nil preserves structure.

Finally, we note that while the ‘mixing’ properties of binary recombination have been widely examined in the EC community, and although Yu notes that structure abstraction can contribute to success [47], the notion of an ‘abstracting’ binary operator has not, to our knowledge, been further explored. It would therefore be interesting to consider generalization of two programs as an addition to the traditional

palette of binary recombination operators.

² Strictly, Iter here is slightly more complex than previously, in that it expresses an inductive construction known as a catamorphism [27].


7 Related work

Foundational computational work in proportional analogy was done by van der

Helm and Leeuwenberg [12], describing the problem in terms of path search in

directed acyclic graphs and giving an algorithm which is O(n^4) in the size of the

input. This was subsequently extended by Dastani et al. [4] to incorporate the al-

gebraic approach to SIT adopted in this article. Dastani also applied GP (with an

uncharacteristically high mutation rate of 0.4) to the induction of SIT structures for

linear line patterns [5], i.e. polylines which can be encoded as letter strings.

CopyCat [14] is perhaps the most well-known architecture for solving pro-

portional analogies. It has a tripartite structure, consisting of a blackboard (‘the

workspace’), a priority queue of programs for updating blackboard state (‘the coderack’) and a semantic network with dynamically re-weighted link strengths (‘the slipnet’). CopyCat is entirely concerned with (predominantly local) mechanisms that

have cognitive plausibility.

Closest to the current work is the algorithm of Weller and Schmid [45] for solv-

ing proportional analogies, which performs anti-uniﬁcation via E-generalization.

The representation for E-generalization is a regular tree grammar, which means that

the result is a (potentially inﬁnite) equivalence class of terms for D. The claimed

advantages for their approach are twofold:

1. There is no need to explicitly induce SIT representations for A, B, C, since all are represented simultaneously via the regular tree grammar.

2. All consistent values for D are likewise represented simultaneously.

However, this approach suffers from the severe disadvantage that no mechanism is provided for enumerating the resulting regular tree grammar in preference order (e.g. by information load). Since it is not possible to distinguish certain specific representations for D as being more compelling, it also does not appear to be of practical use. In contrast, our approach induces SIT representations with low information load via GP driven by a multi-aspect fitness function, then uses syntactic AU to determine D.

Early use of analogical mechanisms for program synthesis predominantly operated on specifications rather than concrete programs [26, 7, 42, 6]. More recently, Schmid learned programs from fitness cases via planning [36] and Raza et al. [34]

used Anti-Uniﬁcation to address scalability issues in synthesising DSL programs

for XML transformation.

IGOR II [13] is currently considered the exemplar of program synthesis by Inductive Functional Programming (IFP) [29]. It creates a recursive program to generalize a set of fitness cases via a pattern-based rewriting system, having first obtained the least general generalization of the set of fitness cases by AU. Katayama

[17] categorized approaches to IFP into analytical approaches based on analysis of

ﬁtness cases and generate-and-test approaches that create many candidate programs.

The IGOR II algorithm is further extended by Katayama to hybridize these two ap-

proaches.

8 Conclusions

In this chapter, we discussed two-way liaisons between GP-based program synthesis and analogical reasoning. We showed that, on one hand, GP can be employed to solve proportional analogy problems with the aid of structural representations (SIT terms) and a formal Anti-Unification mechanism. On the other hand – and more importantly – we pointed to potential ways of improving the efficiency of a GP search process via detection and structural characterization of analogies between fitness cases.

In this study, we have only scratched the surface regarding the exploitation of

analogical reasoning for GP-based program synthesis. For example, we have limited

our attention to analogies built on pairs of tests. Arguably, other interesting and

potentially useful structures could be obtained by working with multiple tests at a

time. We hypothesize that one way of attaining this goal could be via hierarchically

aggregating analogies, i.e. forming analogies of the form Case1 : Case2 ::

Case3 : Case4. Another possibility is to exploit the knowledge captured by

analogies for parent (mate) selection: arguably, two programs that happen to ‘solve’

analogies based on different pairs of tests feature complementary characteristics

that may be worth combining. These observations point to the next steps in the research agenda of analogy-based programming.

Acknowledgements Thanks are due to Dave Bender and the CRCC in Bloomington for providing us with the original list of letter-string analogy examples. K. Krawiec acknowledges support from grant 2014/15/B/ST6/05205 funded by the National Science Centre, Poland. Both authors thank the reviewers for valuable and insightful suggestions and comments.

References

1. Aamodt, A., Plaza, E.: Case-based reasoning: Foundational issues, methodological variations, and system approaches. AI Commun. 7(1), 39–59 (1994). URL http://dl.acm.org/citation.cfm?id=196108.196115

2. Baumgartner, A., Kutsia, T.: A Library of Anti-Unification Algorithms. RISC Report Series 14-07, Research Institute for Symbolic Computation (RISC), Johannes Kepler University Linz, Schloss Hagenberg, 4232 Hagenberg, Austria (2014). URL http://www.risc.jku.at/publications/download/risc_5003/au_library.pdf

3. Cooke, H., Tredennick, H.: Aristotle: The Organon. No. v. 1 in Aristotle: The Organon. Harvard University Press (1938). URL https://books.google.co.uk/books?id=TgeISwAACAAJ

4. Dastani, M., Indurkhya, B., Scha, R.: Analogical projection in pattern perception. J. Exp.

Theor. Artif. Intell. 15(4), 489–511 (2003). DOI 10.1080/09528130310001626283

5. Dastani, M., Marchiori, E., Voorn, R.: Finding perceived pattern structures using Genetic Programming. In: L. Spector, et al. (eds.) Proceedings of the Genetic and Evolutionary Computation Conference (GECCO-2001), pp. 3–10. Morgan Kaufmann, San Francisco, California, USA (2001). URL http://www.cs.bham.ac.uk/~wbl/biblio/gecco2001/d01.pdf

6. Dershowitz, N.: The evolution of programs: Program abstraction and instantiation. In: Proceedings of the 5th International Conference on Software Engineering, ICSE ’81, pp. 79–88. IEEE Press, Piscataway, NJ, USA (1981). URL http://dl.acm.org/citation.cfm?id=800078.802519

7. Dershowitz, N., Manna, Z.: On automating structured programming. In: G. Huet, G. Kahn

(eds.) IRIA Symposium on Proving and Improving Programs, pp. 167–193. Arc-et-Senans,

France (1975)

8. Ehrenfels, C.v.: Über Gestaltqualitäten. Vierteljahresschr. für Philosophie, 14, 249–292 (1890)

9. Evans, T.G.: A heuristic program to solve geometric-analogy problems. In: Proceedings of the April 21-23, 1964, spring joint computer conference, AFIPS ’64 (Spring), pp. 327–338. ACM, New York, NY, USA (1964). URL http://doi.acm.org/10.1145/1464122.1464156

10. Falkenhainer, B., Forbus, K.D., Gentner, D.: The Structure-Mapping Engine: Algorithm and Examples. Artificial Intelligence 41(1), 1–63 (1989). DOI 10.1016/0004-3702(89)90077-5

11. French, R.M.: The subtlety of sameness: A theory and computer model of analogy-making.

The MIT Press (1995)

12. van der Helm, P., Leeuwenberg, E.: Avoiding explosive search in automatic selection of simplest pattern codes. Pattern Recognition 19(2), 181–191 (1986). DOI 10.1016/0031-3203(86)90022-1

13. Hofmann, M.: Igor II - an analytical inductive functional programming system. In: Proceedings of the 2010 ACM SIGPLAN Workshop on Partial Evaluation and Program Manipulation, pp. 29–32 (2010)

14. Hofstadter, D.R.: Fluid Concepts and Creative Analogies: Computer Models of the Fundamental Mechanisms of Thought. Basic Books, Inc., New York, NY, USA (1996)

15. Holyoak, K.J., Thagard, P.: Analogical mapping by constraint satisfaction. Cognitive Science 13(3), 295–355 (1989). DOI 10.1207/s15516709cog1303_1. URL http://dx.doi.org/10.1207/s15516709cog1303_1

16. Hummel, J.E., Holyoak, K.J.: Distributed representations of structure: A Theory of Analogical

Access and Mapping. Psychological Review pp. 427–466 (1997)

17. Katayama, S.: An analytical inductive functional programming system that avoids unintended programs. In: Proceedings of the ACM SIGPLAN 2012 Workshop on Partial Evaluation and Program Manipulation, PEPM ’12, pp. 43–52. ACM, New York, NY, USA (2012). DOI 10.1145/2103746.2103758. URL http://doi.acm.org/10.1145/2103746.2103758

18. Kocsis, Z.A., Swan, J.: Asymptotic Genetic Improvement programming via type functors and catamorphisms. In: C. Johnson, K. Krawiec, A. Moraglio, M. O’Neill (eds.) Semantic Methods in Genetic Programming. Ljubljana, Slovenia (2014). URL http://www.cs.put.poznan.pl/kkrawiec/smgp2014/uploads/Site/Kocsis.pdf. Workshop at Parallel Problem Solving from Nature 2014 conference

19. Kovitz, B., Swan, J.: Structural stigmergy: A speculative pattern language for metaheuristics. In: Proceedings of the Companion Publication of the 2014 Annual Conference on Genetic and Evolutionary Computation, GECCO Comp ’14, pp. 1407–1410. ACM, New York, NY, USA (2014). DOI 10.1145/2598394.2609845. URL http://doi.acm.org/10.1145/2598394.2609845

20. Krawiec, K.: Behavioral Program Synthesis with Genetic Programming, 1st edn. Springer

Publishing Company, Incorporated (2015)

21. Krawiec, K., O’Reilly, U.M.: Behavioral Programming: A broader and more detailed take on Semantic GP. In: Proceedings of the 2014 Annual Conference on Genetic and Evolutionary Computation, GECCO ’14, pp. 935–942. ACM, New York, NY, USA (2014). DOI 10.1145/2576768.2598288. URL http://doi.acm.org/10.1145/2576768.2598288

22. Krawiec, K., Swan, J.: Guiding evolutionary learning by searching for regularities in behavioral trajectories: A case for representation agnosticism. In: AAAI Fall Symposium: How Should Intelligence be Abstracted in AI Research (2013)

23. Krawiec, K., Swan, J., O’Reilly, U.M.: Behavioral program synthesis: Insights and prospects. In: R. Riolo, W.P. Worzel, K. Groscurth (eds.) Genetic Programming Theory and Practice XIII, Genetic and Evolutionary Computation. Springer, Ann Arbor, USA (2015). URL http://www.cs.put.poznan.pl/kkrawiec/wiki/uploads/Research/2015GPTP.pdf. Forthcoming

24. Leeuwenberg, E., van der Helm, P.: Structural Information Theory: The Simplicity of Visual

Form. Cambridge University Press (2015)

25. Luqi, Goguen, J.A.: Formal methods: Promises and problems. IEEE Softw. 14(1), 73–85

(1997). DOI 10.1109/52.566430. URL http://dx.doi.org/10.1109/52.566430

26. Manna, Z., Waldinger, R.: Knowledge and reasoning in program synthesis. In: Programming Methodology, 4th Informatik Symposium, pp. 236–277. Springer-Verlag, London, UK (1975). URL http://dl.acm.org/citation.cfm?id=647950.742874

27. Meijer, E., Fokkinga, M., Paterson, R.: Functional programming with bananas, lenses, envelopes and barbed wire. In: Proceedings of the 5th ACM Conference on Functional Programming Languages and Computer Architecture, pp. 124–144. Springer-Verlag New York, Inc., New York, NY, USA (1991). URL http://dl.acm.org/citation.cfm?id=127960.128035

28. Mitchell, M.: Analogy-making as perception: a computer model. MIT Press (1993). URL

http://portal.acm.org/citation.cfm?id=152203

29. Muggleton, S.: Inductive Logic Programming: Derivations, successes and shortcomings. SIGART Bull. 5(1), 5–11 (1994). DOI 10.1145/181668.181671. URL http://doi.acm.org/10.1145/181668.181671

30. Otero, F., Castle, T., Johnson, C.: EpochX: Genetic Programming in Java with statistics and

event monitoring. In: Proceedings of the 14th Annual Conference Companion on Genetic

and Evolutionary Computation, GECCO ’12, pp. 93–100. ACM, New York, NY, USA (2012).

DOI 10.1145/2330784.2330800

31. Phillips, S., Wilson, W.H.: Categorial compositionality: A category theory explanation for the

systematicity of human cognition. PLoS Computational Biology 6(7) (2010)

32. Plotkin, G.D.: A note on inductive generalization. Machine Intelligence 5, 153–163 (1970)

33. Prade, H., Richard, G.: Computational Approaches to Analogical Reasoning: Current Trends.

Springer Publishing Company, Incorporated (2014)

34. Raza, M., Gulwani, S., Milic-Frayling, N.: Programming by example using least general generalizations. In: Proceedings of the Twenty-Eighth AAAI Conference on Artificial Intelligence, July 27–31, 2014, Québec City, Québec, Canada, pp. 283–290 (2014). URL http://www.aaai.org/ocs/index.php/AAAI/AAAI14/paper/view/8520

35. Reynolds, J.C.: Transformational Systems and the Algebraic Structure of Atomic Formulas.

In: B. Meltzer, D. Michie (eds.) Machine Intelligence 5, pp. 135–151. Edinburgh University

Press, Edinburgh, Scotland (1969)

36. Schmid, U.: Inductive Synthesis of Functional Programs, Universal Planning, Folding of Finite Programs, and Schema Abstraction by Analogical Reasoning, Lecture Notes in Computer Science, vol. 2654. Springer (2003). DOI 10.1007/b12055. URL http://dx.doi.org/10.1007/b12055

37. Schmid, U., Burghardt, J.: An algebraic framework for solving proportional and predictive analogies. In: F. Schmalhofer, et al. (eds.) Proceedings of the European Conference on Cognitive Science, pp. 295–300. Erlbaum (2003)

38. Schmidt, M., Krumnack, U., Gust, H., Kühnberger, K.: Heuristic-Driven Theory Projection: An overview. In: H. Prade, G. Richard (eds.) Computational Approaches to Analogical Reasoning: Current Trends, vol. 548, pp. 163–194. Springer (2014). DOI 10.1007/978-3-642-54516-0_7

39. Silva, S., Dignum, S., Vanneschi, L.: Operator equalisation for bloat free genetic programming and a survey of bloat control methods. Genetic Programming and Evolvable Machines 13(2), 197–238 (2012). DOI 10.1007/s10710-011-9150-5. URL http://dx.doi.org/10.1007/s10710-011-9150-5

40. Stewart, I., Cohen, J.: Figments of Reality: The Evolution of the Curious Mind. Cambridge

University Press (1999)

41. Swan, J., Drake, J., Krawiec, K.: Semantically-meaningful numeric constants for genetic programming. In: C. Johnson, K. Krawiec, A. Moraglio, M. O’Neill (eds.) Semantic Methods in Genetic Programming. Ljubljana, Slovenia (2014). URL http://www.cs.put.poznan.pl/kkrawiec/smgp2014/uploads/Site/Swan.pdf. Workshop at Parallel Problem Solving from Nature 2014 conference

42. Ulrich, J.W., Moll, R.: Program synthesis by analogy. SIGART Bull. (64), 22–28 (1977). DOI

10.1145/872736.806928. URL http://doi.acm.org/10.1145/872736.806928

43. Van Der Helm, P.A., Leeuwenberg, E.L.J.: Accessibility: A criterion for regularity and hierarchy in visual pattern codes. J. Math. Psychol. 35(2), 151–213 (1991). DOI 10.1016/0022-2496(91)90025-O. URL http://dx.doi.org/10.1016/0022-2496(91)90025-O

44. Weller, S., Schmid, U.: Analogy by abstraction. In: Proceedings of the seventh International

conference on cognitive modeling (ICCM). Trieste, Italy (2006)

45. Weller, S., Schmid, U.: Solving proportional analogies by E-generalization. In: KI 2006: Advances in Artificial Intelligence, 29th Annual German Conference on AI, KI 2006, Bremen, Germany, June 14-17, 2006, Proceedings, pp. 64–75 (2006). DOI 10.1007/978-3-540-69912-5_6

46. Yanai, K., Iba, H.: Towards a New Evolutionary Computation: Advances in the Estimation of Distribution Algorithms, chap. Estimation of Distribution Programming: EDA-based Approach to Program Generation, pp. 103–122. Springer Berlin Heidelberg, Berlin, Heidelberg (2006). DOI 10.1007/3-540-32494-1_5

47. Yu, T.: Structure abstraction and genetic programming. In: P.J. Angeline, Z. Michalewicz, M. Schoenauer, X. Yao, A. Zalzala (eds.) Proceedings of the Congress on Evolutionary Computation, vol. 1, pp. 652–659. IEEE Press, Mayflower Hotel, Washington D.C., USA (1999). DOI 10.1109/CEC.1999.781995. URL http://www.cs.mun.ca/~tinayu/index_files/addr/public_html/cec99.pdf