A DISTRIBUTED BASIS FOR ANALOGICAL MAPPING

Ross W. Gayler
School of Communication, Arts and Critical Enquiry
La Trobe University
Victoria 3086, Australia

Simon D. Levy
levys@wlu.edu
Department of Computer Science
Washington and Lee University
Lexington, Virginia, USA
ABSTRACT

We are concerned with the practical feasibility of the neural basis of analogical mapping. All existing connectionist models of analogical mapping rely to some degree on localist representation (each concept or relation is represented by a dedicated unit/neuron). These localist solutions are implausible because they need too many units for human-level competence or require the dynamic re-wiring of networks on a sub-second time-scale. Analogical mapping can be formalised as finding an approximate isomorphism between graphs representing the source and target conceptual structures. Connectionist models of analogical mapping implement continuous heuristic processes for finding graph isomorphisms. We present a novel connectionist mechanism for finding graph isomorphisms that relies on distributed, high-dimensional representations of structure and mappings. Consequently, it does not suffer from the problems of the number of units scaling combinatorially with the number of concepts or requiring dynamic network re-wiring.
GRAPH ISOMORPHISM
Researchers tend to divide the process of analogy into three stages: retrieval (finding an appropriate source situation), mapping (identifying the corresponding elements of the source and target situations), and application. Our concern is with the mapping stage, which is essentially about structural correspondence. If the source and target situations are formally represented as graphs, the structural correspondence between them can be described as approximate graph isomorphism. Any mechanism for finding graph isomorphisms is, by definition, a mechanism for finding structural correspondence and a possible mechanism for implementing analogical mapping. We are concerned with the formal underpinning of analogical mapping (independently of whether any particular researcher chooses to describe their specific model in these terms).
It might be supposed that representing situations as graphs is unnecessarily restrictive. However, anything that can be formalised can be represented by a graph. Category theory, which is effectively a theory of structure and graphs, is an alternative to set theory as a foundation for mathematics (Marquis, 2009), so anything that can be mathematically represented can be represented as a graph.
It might also be supposed that by working solely with graph isomorphism we favour structural correspondence to the exclusion of other factors that are known to influence analogical mapping, such as semantic similarity and pragmatics. However, as any formal structure can be represented by graphs, it follows that semantics and pragmatics can also be encoded as graphs. For example, some models of analogical mapping are based on labelled graphs with the process being sensitive to label similarity. However, any label value can be encoded as a graph and label similarity captured by the degree of approximate isomorphism. Further, the mathematics of graph isomorphism has been extended to include attribute similarity and is commonly used this way in computer vision and pattern recognition (Bomze, Budinich, Pardalos & Pelillo, 1999).
The extent to which analogical mapping based on graph isomorphism is sensitive to different types of information depends on what information is encoded into the graphs. Our current research is concerned only with the practical feasibility of connectionist implementations of graph isomorphism. The question of what information is encoded in the graphs is separable. Consequently, we are not concerned with modelling the psychological properties of analogical mapping, as such questions belong to a completely different level of inquiry.
CONNECTIONIST IMPLEMENTATIONS
It is possible to model analogical mapping as a purely algorithmic process. However, we are concerned with physiological plausibility and consequently limit our attention to connectionist models of analogical mapping such as ACME (Holyoak & Thagard, 1989), AMBR (Kokinov, 1988), DRAMA (Eliasmith & Thagard, 2001), and LISA (Hummel & Holyoak, 1997). These models vary in their theoretical emphases and the details of their connectionist implementations, but they all share a problem in the scalability of the representation or construction of the connectionist mapping network. We contend that this is a consequence of using localist connectionist representations or processes. In essence, they either have to allow in advance for all combinatorial possibilities, which requires too many units (Stewart & Eliasmith, in press), or they have to construct the required network for each new mapping task in a fraction of a second.
Problems with localist implementation
Rather than review all the major connectionist models of analogical mapping, we will use ACME and DRAMA to illustrate the problem with localist representation. Localist and distributed connectionist models have often been compared in terms of properties such as neural plausibility and robustness. Here, we are concerned only with a single issue: dynamic re-wiring (i.e., the need for connections to be made between neurons as a function of the source and target situations to be mapped).
ACME constructs a localist network to represent possible mappings between the source and target structures. The network is a function of the source and target representations, and a new network has to be constructed for every source and target pair. A localist unit is constructed to represent each possible mapping between a source vertex and target vertex. The activation of each unit indicates the degree of support for the corresponding vertex mapping being part of the overall mapping between the source and target. The connections between the network units encode compatibility between the corresponding vertex mappings. These connections are a function of the source and target representations and constructed anew for each problem. Compatible vertex mappings are linked by excitatory connections so that support for plausibility of one vertex mapping transmits support to compatible mappings. Similarly, inhibitory connections are used to connect the units representing incompatible mappings. The network implements a relaxation labelling that finds a compatible set of mappings. The operation of the mapping network is neurally plausible, but the process of its construction is not.
The inputs to ACME are symbolic representations of the source and target structures. The mapping network is constructed by a symbolic process that traverses the source and target structures. The time complexity of the traversal will be a function of the size of the structures to be mapped. Given that we believe analogical mapping is a continually used core part of cognition and that all cognitive information is encoded as (large) graph structures, we strongly prefer mapping network setup to require approximately constant time independent of the structures to be mapped.
DRAMA is a variant of ACME with distributed source and target representations. However, it appears that the process of constructing the distributed representation of the mapping network is functionally localist, requiring a decomposition and sequential traversal of the source and target structures.
Ideally, the connectionist mapping network should have a fixed neural architecture. The units and their connections should be fixed in advance and not need to be re-wired in response to the source and target representations. The structure of the current mapping task should be encoded entirely in activations generated on the fixed neural architecture by the source and target representations, and the set-up process should be holistic rather than requiring decomposition of the source and target representations. Our research aims to achieve this by using distributed representation and processing from the VSA family of connectionist models.
We proceed by introducing replicator equations, a localist heuristic for finding graph isomorphisms. Then we introduce Vector Symbolic Architectures (VSA), a family of distributed connectionist mechanisms for the representation and manipulation of structured information. Our novel contribution is to implement replicator equations in a completely distributed fashion based on VSA. We conclude with a proof-of-concept demonstration of a distributed re-implementation of the principal example from the seminal paper on graph isomorphism via replicator equations.
REPLICATOR EQUATIONS
The approach we are pursuing for graph isomorphism is based on the work of Pelillo (1999), who casts subgraph isomorphism as the problem of finding a maximal clique (set of mutually adjacent vertices) in the association graph derived from the two graphs to be mapped. Given a graph $G$ of size $N$ with an $N \times N$ adjacency matrix $A = (a_{ij})$ and a graph $G'$ of size $N$ with an $N \times N$ adjacency matrix $A' = (a'_{hk})$, their association graph $G_\times$ of size $N^2$ can be represented by an $N^2 \times N^2$ adjacency matrix $(a_{ih,jk})$ whose edges encode pairs of edges from $G$ and $G'$:

$$a_{ih,jk} = \begin{cases} 1 & \text{if } i \neq j,\ h \neq k,\ \text{and } a_{ij} = a'_{hk} \\ 0 & \text{otherwise} \end{cases} \qquad (1)$$
The elements of the association matrix are 1 if the corresponding edges in $G$ and $G'$ have the same state of existence and 0 if the corresponding edges have different states of existence. Defined this way, the edges of the association graph $G_\times$ provide evidence about potential mappings between the vertices of $G$ and $G'$ based on whether the corresponding edges and non-edges are consistent. The presence of an edge between two vertices in one graph and an edge between two vertices in the other graph supports a possible mapping between the members of each pair of vertices (as does the absence of such an edge in both graphs).
By treating the graph isomorphism problem as a maximal-clique-finding problem, Pelillo exploits an important result in graph theory. Consider a graph $G$ with adjacency matrix $A$, a subset $C$ of vertices of $G$, and a characteristic vector $x^C$ (indicating membership of the subset $C$) defined as

$$x^C_i = \begin{cases} 1/|C| & \text{if } i \in C \\ 0 & \text{otherwise} \end{cases} \qquad (2)$$

where $|C|$ is the cardinality of $C$. It turns out that $C$ is a maximum clique of $G$ if and only if $x^C$ maximizes the function $f(x) = x^T A x$, where $x^T$ is the transpose of $x$, $x \in \mathbb{R}^N$, $\sum_{i=1}^{N} x_i = 1$, and $x_i \geq 0$ for all $i$.
Starting at some initial condition (typically the barycenter, $x_i = 1/N$, corresponding to all $x_i$ being equally supported as part of the solution), $x$ can be obtained through iterative application of the following equation:

$$x_i(t+1) = \frac{x_i(t)\,\pi_i(t)}{\sum_{j=1}^{N} x_j(t)\,\pi_j(t)} \qquad (3)$$

where

$$\pi_i(t) = \sum_{j=1}^{N} w_{ij}\,x_j(t) \qquad (4)$$

and $W$ is a matrix of weights $w_{ij}$, typically just the adjacency matrix $A$ of the association graph or a linear function of $A$. The $x$ vector can thus be considered to represent the state of the system's belief about the vertex mappings at a given time, with Equations 3 and 4 representing a dynamical system parameterized by the weights in $W$. $\pi_i$ can be interpreted as the evidence for $x_i$ obtained from all the compatible $x_j$, where the compatibility is encoded by $w_{ij}$. The denominator in Equation 3 is a normalizing factor ensuring that $\sum_{i=1}^{N} x_i = 1$.
Pelillo borrows Equations 3 and 4 from the literature on evolutionary game theory, in which $\pi_i$ is the overall payoff associated with playing strategy $i$, and $w_{ij}$ is the payoff associated with playing strategy $i$ against strategy $j$. In the context of the maximum-clique problem, these replicator equations can be used to derive a vector $x$ (vertex mappings) that maximizes the "payoff" (edge consistency) encoded in the adjacency matrix. Vertex mappings correspond to strategies, and as Equation 3 is iterated, mappings with higher fitness (consistency of mappings) come to dominate ones with lower fitness.
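To make the dynamics concrete, here is a minimal sketch of Equations 3 and 4 in Python/NumPy (the paper's downloadable code is in MATLAB; this fragment, including the toy graph, is our own illustration). The example weight matrix is the adjacency matrix of a hypothetical four-vertex graph whose maximum clique is {0, 1, 2}; iterating from the barycenter converges to the characteristic vector of Equation 2, with value 1/3 on the clique members and 0 elsewhere.

```python
import numpy as np

def replicator_step(W, x):
    """One iteration of Equations 3-4."""
    pi = W @ x          # Equation 4: payoff gathered from compatible entries
    x = x * pi          # numerator of Equation 3
    return x / x.sum()  # denominator of Equation 3: keep sum(x) == 1

# Toy graph: edges 0-1, 0-2, 1-2, 2-3; the maximum clique is {0, 1, 2}.
W = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)

x = np.full(4, 1 / 4)        # barycenter: all vertices equally supported
for _ in range(200):
    x = replicator_step(W, x)
print(x.round(3))            # -> [0.333 0.333 0.333 0.   ]
```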
Figure 1. A simple graph isomorphism problem.
Consider the simple graphs in Figure 1, used as the principal example by Pelillo (1999) and which we will later re-implement in a distributed fashion. The maximal isomorphism between these two graphs is {A=P, B=Q, C=R, D=S} or {A=P, B=Q, C=S, D=R}. Table 1 shows the first and last rows of the adjacency matrix for the association graph of these graphs, generated using Equation 1. Looking at the first row of the table, we see that the mapping A=P is consistent with the mappings B=Q, B=R, B=S, C=Q, C=R, C=S, D=Q, D=R, and D=S, but not with A=Q, A=R, A=S, B=P, etc.
     AP AQ AR AS BP BQ BR BS CP CQ CR CS DP DQ DR DS
AP    0  0  0  0  0  1  1  1  0  1  1  1  0  1  1  1
DS    1  0  1  0  0  1  0  0  1  0  1  0  0  0  0  0

Table 1. Fragment of adjacency matrix for Fig. 1.
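As a sketch of Equation 1 in code (Python/NumPy here, not the paper's MATLAB), the fragment below builds the full 16 x 16 association-graph adjacency matrix for the Figure 1 graphs. The edge sets used (A-B, A-C, A-D, C-D and P-Q, P-R, P-S, R-S) are our reading of Figure 1, inferred from Equation 8 and checked against Table 1; the printed rows reproduce the AP and DS rows of the table.

```python
import numpy as np
from itertools import product

g1, g2 = "ABCD", "PQRS"
edges1 = [("A", "B"), ("A", "C"), ("A", "D"), ("C", "D")]
edges2 = [("P", "Q"), ("P", "R"), ("P", "S"), ("R", "S")]

def adjacency(names, edges):
    A = np.zeros((len(names), len(names)), dtype=int)
    for u, v in edges:
        i, j = names.index(u), names.index(v)
        A[i, j] = A[j, i] = 1
    return A

A1, A2 = adjacency(g1, edges1), adjacency(g2, edges2)

# Equation 1: association-graph vertices are vertex pairs (i, h);
# (i, h) and (j, k) are adjacent iff i != j, h != k, and a_ij == a'_hk.
pairs = list(product(range(4), repeat=2))      # AP, AQ, ..., DS
W = np.zeros((16, 16), dtype=int)
for a, (i, h) in enumerate(pairs):
    for b, (j, k) in enumerate(pairs):
        if i != j and h != k and A1[i, j] == A2[h, k]:
            W[a, b] = 1

print(W[0])    # row AP of Table 1
print(W[15])   # row DS of Table 1
```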
Initially, all values in the state vector $x$ are set to 0.0625 (1/16). Repeated application of Equations 3 and 4 produces a final state vector that encodes the two maximal isomorphisms, with 0.3 in the positions for A=P and B=Q, 0.1 in the positions for C=R, C=S, D=R, and D=S, and 0 in the others. The conflicting mappings for C, D, R, and S correspond to a saddle point in the dynamics of the replicator equations, created by the symmetry in the graphs. Adding a small amount of noise to the state breaks this symmetry, producing a final state vector with values of 0.25 for the optimal mappings A=P, B=Q, and C=R, D=S or C=S, D=R, and zero elsewhere. The top graph of Figure 4 shows the time course of the settling process from our implementation of Pelillo's localist algorithm.
This example is trivially small. However, the same approach has been successfully applied to graphs with more than 65,000 vertices (Pelillo & Torsello, 2006). It has also been extended to match hierarchical, attributed structures for computer vision problems (Pelillo, Siddiqi & Zucker, 1999). Thus, we are confident that replicator equations are a reasonable candidate mechanism for the structure matching at the heart of analogical mapping.
DISTRIBUTED IMPLEMENTATION
The replicator equation mechanism can be easily implemented as a localist connectionist circuit. This is qualitatively very similar to ACME and suffers the same problems due to the localist representation. In this section we present a distributed connectionist scheme for representing edges, vertices, and mappings that does not suffer from these problems.
Vector Symbolic Architecture
Vector Symbolic Architecture is a name that we coined (Gayler, 2003) to describe a class of connectionist models that use high-dimensional vectors (typically around 10,000 dimensions) of low-precision numbers to encode structured information as distributed representations. VSA can represent complex entities such as trees and graphs as vectors. Every such entity, no matter how simple or complex, is represented by a pattern of activation distributed over all the elements of the vector.
This general class of architectures traces its origins to the tensor product work of Smolensky (1990), but avoids the exponential growth in dimensionality of tensor products. VSAs employ three types of vector operator: a multiplication-like operator, an addition-like operator, and a permutation-like operator. The multiplication operator is used to associate or bind vectors. The addition operator is used to superpose vectors or add them to a set. The permutation operator is used to quote or protect vectors from the other operations.
The use of hyperdimensional vectors to represent symbols and their combinations provides a number of mathematically desirable and biologically realistic features (Kanerva, 2009). A hyperdimensional vector space contains as many mutually orthogonal vectors as there are dimensions and exponentially many almost-orthogonal vectors (Hecht-Nielsen, 1994), thereby supporting the representation of astronomically large numbers of distinct items. Such representations are also highly robust to noise. Approximately 30% of the values in a vector can be randomly changed before it becomes more similar to another meaningful (previously-defined) vector than to its original form. It is also possible to implement such vectors in a spiking neuron model (Eliasmith, 2005).
The main difference among types of VSAs is in the type of numbers used as vector elements and the related choice of multiplication-like operation. Holographic Reduced Representations (Plate, 2003) use real numbers and circular convolution. Kanerva's (1996) Binary Spatter Codes (BSC) use Boolean values and elementwise exclusive-or. Gayler's (1998) Multiply, Add, Permute coding (MAP) uses values from $\{-1, +1\}$ and elementwise multiplication. A useful feature of BSC and MAP is that each vector is its own multiplicative inverse: multiplying any vector by itself elementwise yields the identity vector. As in ordinary algebra, multiplication and addition are associative and commutative, and multiplication distributes over addition.

We use MAP in the work described here.
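The MAP operations are simple enough to state directly in code. The following Python/NumPy sketch (our own illustration; the function names are ours) shows binding by elementwise multiplication, bundling by addition, permutation, and the self-inverse property just described.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 10_000                                # vector dimensionality

def random_vec():
    """A random MAP vector with elements drawn from {-1, +1}."""
    return rng.choice([-1, 1], size=D)

def cos(u, v):
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

A, B = random_vec(), random_vec()

bound = A * B              # multiplication binds A and B
bundle = A + B             # addition superposes vectors into a set
perm = np.roll(A, 1)       # a fixed permutation "quotes" a vector

print(cos(A, B))           # ~0: random vectors are almost orthogonal
print(cos(bound, A))       # ~0: a binding resembles neither factor
print(cos(bundle, A))      # ~0.71: a superposition resembles its members
print(np.all(A * A == 1))  # True: each vector is its own inverse
print(cos(A * bound, B))   # 1.0: multiplying by A recovers B exactly
```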
As an illustration of how VSA can be used to represent graph structure, consider again the optimal mapping {A=P, B=Q, C=R, D=S} for the graphs in Figure 1. We represent this set of mappings as the vector

$$A*P + B*Q + C*R + D*S \qquad (5)$$
where $A$, $B$, $C$, ... are arbitrarily chosen (random) vectors over $\{-1, +1\}$ and $*$ and $+$ represent elementwise vector multiplication and addition respectively. For any mapped vertex pair X=Y, the representation $Y$ of vertex Y can be retrieved by multiplying the mapping vector (which contains $X*Y$) by $X$, and vice versa. The resulting vector will contain the representation of $Y$ plus a set of representations not corresponding to any vertex, which can be treated as noise; e.g.:

$$A*(A*P + B*Q + C*R + D*S) = A*A*P + A*B*Q + A*C*R + A*D*S = P + \text{noise} \qquad (6)$$
The noise can be removed from the retrieved vector by passing it through a "cleanup memory" that stores only the meaningful vectors $(A, B, C, D, P, Q, R, S)$. Cleanup memory can be implemented in a biologically plausible way as a Hopfield network that associates each meaningful vector to itself (a variant of Hebbian learning). Such networks can reconstruct the original form of a vector from a highly degraded exemplar, via self-reinforcing feedback dynamics.
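A minimal sketch of Equations 5 and 6 follows, with a nearest-neighbour table standing in for the Hopfield cleanup memory (as the implementation described later in the paper also does):

```python
import numpy as np

rng = np.random.default_rng(1)
D = 10_000
names = list("ABCDPQRS")
vec = {n: rng.choice([-1, 1], size=D) for n in names}  # meaningful vectors

# Equation 5: the mapping {A=P, B=Q, C=R, D=S} as a single vector.
mapping = (vec["A"] * vec["P"] + vec["B"] * vec["Q"]
           + vec["C"] * vec["R"] + vec["D"] * vec["S"])

# Equation 6: multiplying by A retrieves P plus noise terms.
noisy = vec["A"] * mapping

def cleanup(v):
    """Table lookup standing in for the Hopfield cleanup memory."""
    return max(names, key=lambda n: vec[n] @ v)

print(cleanup(noisy))          # -> P
print(vec["P"] @ noisy / D)    # ~1.0: strong signal for P
print(vec["Q"] @ noisy / D)    # ~0.0: other atoms look like noise
```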
Note that although the vectors depicted in Equations 5 and 6 appear complex, they are just vector values like any other. From the point of view of the implementing hardware all vectors are of equal computational complexity. This has profound implications for the resource requirements of VSA-based systems. For example, the computational cost of labelling a graph vertex with a simple attribute or a complex structure is exactly the same.
Our Model
Our goal is to build a distributed implementation of the replicator Equations 3 and 4 by representing the problem as distributed patterns of fixed, high dimension in VSA such that the distributed system has the same dynamics as the localist formulation. As in the localist version, we need a representation $x$ of the evolving state of the system's belief about the vertex mappings, and a representation $w$ of the adjacencies in the association graph.
In the VSA representation of a graph we represent vertices by random hyperdimensional vectors, edges by products of the vectors representing the vertices, and mappings by products of the mapped entities. It is natural to represent the set of vertices as the sum of the vectors representing the vertices. The product of the vertex sets of the two graphs is then identical to the sum of the possible mappings of vertices (Equation 7). That is, the initial value of $x$ can be calculated holistically from the representations of the graphs using only one product operation that does not require decomposition of the vertex set into component vertices. For the graphs in Figure 1:

$$x = (A+B+C+D) * (P+Q+R+S) = A*P + A*Q + \ldots + B*P + B*Q + \ldots + D*R + D*S \qquad (7)$$
For VSA it is natural to represent the set of edges of a graph as the sum of the products of the vertices connected by each edge. The product of the edge sets of the two graphs is identical to a sum of products of four vertices. This encodes information about mappings of edges, or equivalently, about compatibility of vertex mappings. That is, one holistic product operation applied to the edge sets is able to encode all the possible edge mappings in constant time no matter how many edges there are.
The reader may have noticed that the description above refers only to edges, whereas Pelillo's association graph also encodes information about the mapping of non-edges in the two graphs. We believe the explicit representation of non-edges is cognitively implausible. However, Pelillo was not concerned with cognitive plausibility. Since our aim here is to reproduce his work, we include non-edges in Equation 8. The distributed vector $w$ functions as the localist association matrix $W$. For the graphs in Figure 1:
$$\begin{aligned} w &= (A*B + A*C + A*D + C*D) * (P*Q + P*R + P*S + R*S) + (B*C + B*D) * (Q*R + Q*S) \\ &= A*B*P*Q + A*B*P*R + \ldots + A*B*R*S + A*C*P*Q + A*C*P*R + \ldots + A*C*R*S + \ldots + C*D*R*S \\ &\quad + B*C*Q*R + B*C*Q*S + B*D*Q*R + B*D*Q*S \end{aligned} \qquad (8)$$
The terms of this sum correspond to the nonzero elements of Table 1 (allowing for the symmetries due to commutativity). With $x$ and $w$ set up this way, we can compute the payoff vector $\pi$ as the product of $x$ and $w$. As in the localist formulation (Equation 4), this product causes consistent mappings to reinforce each other. Evidence is propagated from each vertex mapping to consistent vertex mappings via the edge compatibility information encoded in $w$. (The terms of Equation 9 have been rearranged to highlight this cancellation.)
$$\pi = x * w = \ldots + (A*P)*\big((A*P)*(B*Q)\big) + (A*P)*\big((A*P)*(B*R)\big) + \ldots = \ldots + B*Q + B*R + \ldots \qquad (9)$$
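The following Python/NumPy sketch (our illustration, with hypothetical helper names) builds $x$ and $w$ holistically for the Figure 1 graphs per Equations 7 and 8 and reads out the payoff of Equation 9 by cosine against candidate mapping vectors. Consistent mappings such as A=P receive visibly more payoff than inconsistent ones such as A=Q.

```python
import numpy as np

rng = np.random.default_rng(2)
D = 10_000
V = {n: rng.choice([-1, 1], size=D) for n in "ABCDPQRS"}

def s(names):
    """Sum (superposition) of the named vertex vectors."""
    return sum(V[n] for n in names)

def e(pairs):
    """Sum of edge vectors: products of the vertices joined by each edge."""
    return sum(V[a] * V[b] for a, b in pairs)

# Equation 7: one product of vertex-set sums yields all 16 mappings.
x = s("ABCD") * s("PQRS")

# Equation 8: edge products plus (for fidelity to Pelillo) non-edge products.
w = (e([("A","B"), ("A","C"), ("A","D"), ("C","D")])
     * e([("P","Q"), ("P","R"), ("P","S"), ("R","S")])
     + e([("B","C"), ("B","D")]) * e([("Q","R"), ("Q","S")]))

pi = x * w   # Equation 9: the payoff as a single holistic product

def payoff(v1, v2):
    m = V[v1] * V[v2]
    return m @ pi / (np.linalg.norm(pi) * np.sqrt(D))

print(round(payoff("A", "P"), 2))   # high: A=P has many consistent partners
print(round(payoff("A", "Q"), 2))   # low:  A=Q has few consistent partners
```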
Implementing the update of $x$ (Equation 3) is more challenging for the VSA formulation. As in the localist version, the idea is for corresponding vertex mappings in $x$ and $\pi$ to reinforce each other multiplicatively, in a kind of multiset intersection (denoted here as $\cap$): if $x = (k_1\,A*P + k_2\,B*Q + k_3\,B*R)$ and $\pi = (k_4\,A*P + k_5\,B*Q)$ then $x \cap \pi$ equals $(k_1 k_4\,A*P + k_2 k_5\,B*Q)$, for non-negative weights $k_1$, $k_2$, $k_3$, $k_4$, and $k_5$. Because of the self-cancellation property of the MAP architecture, simple elementwise multiplication of $x$ and $\pi$ will not work. We could extract the $k_i$ by iterating through each of the pairwise mappings $(A*P, A*Q, \ldots, D*S)$ and dividing $x$ and $\pi$ elementwise by each mapping, but this is the kind of functionally localist approach we argue is neurally implausible. Instead, we need a holistic distributed intersection operator. This can be construed as a special case of lateral inhibition, a winner-takes-all competition, which has traditionally been considered a localist operation (Page, 2000; Levy & Gayler, in press).
Figure 2. A neural circuit for vector intersection.
To implement this intersection operator in a holistic, distributed manner we exploit the third component of the MAP architecture: permutation. Our solution, shown in Figure 2, works as follows: 1: and 2: are registers (vectors of units) loaded with the vectors representing the multisets to be intersected. $P_1()$ computes some arbitrary, fixed permutation of the vector in 1:, and $P_2()$ computes a different fixed permutation of the vector in 2:. Register 3: contains the product of these permuted vectors. Register 4: is a memory (a constant vector value) pre-loaded with each of the possible multiset elements transformed by multiplying it with both permutations of itself. That is, $4{:} = \sum_{i=1}^{M} X_i * P_1(X_i) * P_2(X_i)$, where $M$ is the number of items in the memory vector (4:).
To implement the replicator equations the clean-up memory 4: must be loaded with a pattern based on the sum of all the possible vertex mappings (similar to the initial value of the mapping vector $x$).
To see how this circuit implements intersection, consider the simple case of a system with three meaningful vectors $X$, $Y$, and $Z$ where we want to compute the intersection of $k_1 X$ with $(k_2 X + k_3 Y)$. The first vector is loaded into register 1:, the second into 2:, and the sum $X*P_1(X)*P_2(X) + Y*P_1(Y)*P_2(Y) + Z*P_1(Z)*P_2(Z)$ is loaded into 4:. After passing the register contents through their respective permutations and multiplying the results, register 3: will contain

$$P_1(k_1 X) * P_2(k_2 X + k_3 Y) = k_1 k_2\,P_1(X)*P_2(X) + k_1 k_3\,P_1(X)*P_2(Y)$$
Multiplying registers 3: and 4: together will then result in the desired intersection (shown in bold) plus noise, which can be removed by standard cleanup techniques:

$$\big(k_1 k_2\,P_1(X)*P_2(X) + k_1 k_3\,P_1(X)*P_2(Y)\big) * \big(X*P_1(X)*P_2(X) + Y*P_1(Y)*P_2(Y) + Z*P_1(Z)*P_2(Z)\big) = \mathbf{k_1 k_2\,X} + \text{noise}$$
In brief, the circuit in Figure 2 works by guaranteeing that the permutations will cancel only for those terms $X_i$ that are present in both input registers, with other terms being rendered as noise.
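Here is a sketch of the Figure 2 circuit in Python/NumPy (our own illustration), using the three-vector example above with assumed weights k1 = 2, k2 = 3, k3 = 1.5. The output is similar only to X, the element present in both registers.

```python
import numpy as np

rng = np.random.default_rng(3)
D = 10_000
X, Y, Z = (rng.choice([-1, 1], size=D) for _ in range(3))

p1, p2 = rng.permutation(D), rng.permutation(D)  # two fixed permutations

r1 = 2.0 * X                  # register 1: k1*X
r2 = 3.0 * X + 1.5 * Y        # register 2: k2*X + k3*Y
r3 = r1[p1] * r2[p2]          # register 3: product of permuted registers

# Register 4: each meaningful vector bound to both permutations of itself.
r4 = sum(v * v[p1] * v[p2] for v in (X, Y, Z))

r5 = r3 * r4                  # register 5: k1*k2*X + noise

def cos(u, v):
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

print(round(cos(r5, X), 2))   # clearly positive: X survives
print(round(cos(r5, Y), 2))   # ~0: Y was absent from register 1
print(round(cos(r5, Z), 2))   # ~0: Z was absent from both registers
```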
In order to improve noise-reduction it is necessary to sum over several such intersection circuits, each based on different permutations. This sum over permutations has a natural interpretation in terms of sigma-pi units (Rumelhart, Hinton & McClelland, 1986), where each unit calculates the sum of many products of a few inputs from units in the prior layer.
The apparent complexity of Figure 2 results from drawing it for ease of explanation rather than correspondence to implementation. The intersection network of Figure 2 could be implemented as a single layer of sigma-pi units.
COMPARING THE APPROACHES
Figure 3 shows the replicator equation approach to graph isomorphism as a recurrent neural circuit. Common to Pelillo's approach and ours is the initialization of a weight vector $w$ with evidence of compatibility of edges and non-edges from the association graph, as well as the computation of the payoff vector $\pi$ from multiplication ($*$) of $x$ and $w$, the computation of the intersection of $x$ and $\pi$ ($\cap$), and the normalization of $x$ ($/$). The VSA formulation additionally requires a cleanup memory ($c$) and an intersection-cleanup memory ($\cap c$), each initialized to a constant value.
Figure 3. A neural circuit for graph isomorphism.
Figure 3 also shows the commonality of the localist and VSA approaches, with the VSA-only components depicted in dashed lines. Note that the architecture is completely fixed and the specifics of the mapping problem to be solved are represented entirely in the patterns of activation loaded into the circuit. Likewise, the circuit does not make any decisions based on the contents of the vectors being manipulated. The product and intersection operators are applied to whatever vectors are present on their inputs and the circuit settles to a stable state representing the solution.
To demonstrate the viability of our approach, we used this circuit with a 10,000-dimensional VSA to deduce isomorphisms for the graphs in Figure 1. This example was chosen to allow direct comparison with Pelillo's results. Although it was not intended as an example of analogical mapping, it does directly address the underlying mechanism of graph isomorphism. Memory and processor limitations made it impractical to implement the main cleanup memory as a Hopfield net ($10^8$ weights), so we simulated the Hopfield net with a table that stored the meaningful vectors and returned the one closest to the noisy version. To implement the intersection circuit from Figure 2 we summed over 50 replicates of that circuit, differing only in their arbitrary permutations. The updated mapping vector was passed back through the circuit until the Euclidean distance between $x_t$ and $x_{t+1}$ was less than 0.001. At each iteration we computed the cosine of $x$ with each item in cleanup memory, in order to compare our VSA implementation with the localist version; however, nothing in our implementation depended on this functionally localist computation.
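For readers who want the whole pipeline in one place, the following is a compact end-to-end sketch in Python/NumPy. It is our own reconstruction, not the authors' MATLAB code: cleanup is done with a single linear associative-memory pass over the stored mapping vectors (standing in for the Hopfield net simulated above by table lookup), and the coefficient read-out is used to renormalize and monitor convergence, a simplification of the holistic circuit.

```python
import numpy as np

rng = np.random.default_rng(4)
D, R = 10_000, 50                     # dimensionality, circuit replicates
V = {n: rng.choice([-1, 1], size=D) for n in "ABCDPQRS"}
maps = {a + p: V[a] * V[p] for a in "ABCD" for p in "PQRS"}
M = np.array(list(maps.values()))     # the 16 stored mapping vectors

def e(pairs):
    return sum(V[a] * V[b] for a, b in pairs)

# w (Equation 8) and the initial x (Equation 7), built holistically.
w = (e([("A","B"), ("A","C"), ("A","D"), ("C","D")])
     * e([("P","Q"), ("P","R"), ("P","S"), ("R","S")])
     + e([("B","C"), ("B","D")]) * e([("Q","R"), ("Q","S")]))
x = (sum(V[n] for n in "ABCD") * sum(V[n] for n in "PQRS")).astype(float)

# R replicates of the Figure 2 intersection circuit.
perms = [(rng.permutation(D), rng.permutation(D)) for _ in range(R)]
mems = [sum(m * m[p1] * m[p2] for m in M) for p1, p2 in perms]

def intersect(u, v):
    return sum(u[p1] * v[p2] * mem for (p1, p2), mem in zip(perms, mems))

coeffs = np.full(16, 1 / 16)
for t in range(100):
    pi = x * w                        # payoff (Equation 9)
    x = intersect(x, pi)              # Equation 3 numerator, holistically
    new = np.maximum(M @ x / D, 0)    # cleanup pass: coefficient read-out
    new /= new.sum()                  # normalization (Equation 3 denominator)
    x = new @ M                       # reconstruct the cleaned state vector
    done = np.linalg.norm(new - coeffs) < 0.001
    coeffs = new
    if done:
        break

print({k: round(c, 2) for k, c in zip(maps, coeffs) if c > 0.05})
# Expect ~0.25 on A=P, B=Q and one of {C=R, D=S} or {C=S, D=R}.
```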
Figure 4. Convergence of localist (top) and VSA (bottom) implementation.
Figure 4 compares the results of Pelillo's localist approach to ours, for the graph isomorphism problem shown in Figure 1. Time (iterations) $t$ is plotted on the abscissa, and the corresponding values in the mapping vector on the ordinate. For the localist version we added a small amount of Gaussian noise to the state vector on the first iteration in order to keep it from getting stuck on a saddle point; the VSA version, which starts with a noisy mapping vector, does not suffer from this problem. In both versions one set of consistent vertex mappings (shown in marked lines) comes to dominate the other, inconsistent mappings (shown in solid lines) in less than 100 iterations.
The obvious difference between the VSA and localist versions is that the localist version settles into a "clean" state corresponding to the characteristic vector in Equation 2, with four values equal to 0.25 and the others equal to zero; whereas in the VSA version the final state approximates this distribution. (The small negative values are an artifact of using the cosine as a metric for comparison.)
CONCLUSIONS AND FUTURE WORK
The work presented here has demonstrated a proof-of-concept that a distributed representation (Vector Symbolic Architecture) can be applied successfully to a problem (graph isomorphism) that until now has been considered the purview of localist modelling. The results achieved with VSA are qualitatively similar to those with the localist formulation. In the process, we have provided an example of how a distributed representation can implement an operation reminiscent of lateral inhibition, winner-takes-all competition, which likewise has been considered to be a localist operation. The ability to model competition among neurally encoded structures and relations, not just individual items or concepts, points to promising new directions for cognitive modelling in general.
The next steps in this research will be to demonstrate the technique on larger graphs and investigate how performance degrades as the graph size exceeds the representational capacity set by the vector dimensionality. We will also investigate the performance of the system in finding subgraph isomorphisms.
Graph isomorphism by itself does not constitute a psychologically realistic analogical mapping system. There are many related problems to be investigated in that broader context. The question of what conceptual information is encoded in the graphs, and how, is foremost. It also seems reasonable to expect constraints on the graphs encoding cognitive structures (e.g. constraints on the maximum and minimum numbers of edges from each vertex). It may be possible to exploit such constraints to improve some aspects of the mapping circuit. For example, it may be possible to avoid the cognitively implausible use of non-edges as evidence for mappings.
Another area we intend to investigate is the requirement for population of the clean-up memories. In this system the clean-up memories are populated from representations of the source and target graphs. This is not unreasonable if retrieval is completely separate from mapping. However, we wish to explore the possibility of intertwining retrieval and mapping. For this to be feasible we would need to reconfigure the mapping so that cleanup memory can be populated with items that have been previously encountered rather than items corresponding to potential mappings.

We expect this approach to provide fertile lines of research for many years to come.
SOFTWARE DOWNLOAD
MATLAB code implementing the algorithm in (Pelillo, 1999) and our VSA version can be downloaded from tinyurl.com/gidemo
ACKNOWLEDGMENTS
We thank Pentti Kanerva, Tony Plate, and Roger Wales for many useful suggestions.
REFERENCES
Bomze, I. M., Budinich, M., Pardalos, P. M., & Pelillo, M. (1999). The maximum clique problem. In D.-Z. Du & P. M. Pardalos (Eds.), Handbook of combinatorial optimization, Supplement Volume A (pp. 1-74). Boston, MA, USA: Kluwer Academic Publishers.

Eliasmith, C. (2005). Cognition with neurons: A large-scale, biologically realistic model of the Wason task. In G. Bara, L. Barsalou, & M. Bucciarelli (Eds.), Proceedings of the 27th Annual Meeting of the Cognitive Science Society.

Eliasmith, C., & Thagard, P. (2001). Integrating structure and meaning: A distributed model of analogical mapping. Cognitive Science, 25, 245-286.

Gayler, R. (1998). Multiplicative binding, representation operators, and analogy. In K. Holyoak, D. Gentner, & B. Kokinov (Eds.), Advances in analogy research: Integration of theory and data from the cognitive, computational, and neural sciences (p. 405). Sofia, Bulgaria: New Bulgarian University.

Gayler, R. W. (2003). Vector Symbolic Architectures answer Jackendoff's challenges for cognitive neuroscience. In Peter Slezak (Ed.), ICCS/ASCS International Conference on Cognitive Science (pp. 133-138). Sydney, Australia: University of New South Wales.

Hecht-Nielsen, R. (1994). Context vectors: general purpose approximate meaning representations self-organized from raw data. In J. Zurada, R. M. II, & B. Robinson (Eds.), Computational intelligence: Imitating life (pp. 43-56). IEEE Press.

Holyoak, K. J., & Thagard, P. (1989). Analogical mapping by constraint satisfaction. Cognitive Science, 13, 295-355.

Hummel, J., & Holyoak, K. (1997). Distributed representations of structure: A theory of analogical access and mapping. Psychological Review, 104, 427-466.

Kanerva, P. (1996). Binary spatter-coding of ordered k-tuples. In C. von der Malsburg, W. von Seelen, J. Vorbrüggen, & B. Sendhoff (Eds.), Artificial neural networks (Proceedings of ICANN 96) (pp. 869-873). Berlin: Springer-Verlag.

Kanerva, P. (2009). Hyperdimensional computing: An introduction to computing in distributed representation with high-dimensional random vectors. Cognitive Computation, 1, 139-159.

Kokinov, B. (1988). Associative memory-based reasoning: How to represent and retrieve cases. In T. O'Shea & V. Sgurev (Eds.), Artificial intelligence III: Methodology, systems, applications (pp. 51-58). Amsterdam: Elsevier Science Publishers B.V. (North Holland).

Levy, S. D., & Gayler, R. W. (in press). "Lateral inhibition" in a fully distributed connectionist architecture. In Proceedings of the Ninth International Conference on Cognitive Modeling (ICCM 2009). Manchester, UK.

Marquis, J.-P. (2009). Category theory. In E. N. Zalta (Ed.), The Stanford Encyclopedia of Philosophy (Spring 2009 Edition), http://plato.stanford.edu/archives/spr2009/entries/category-theory/

Page, M. (2000). Connectionist modelling in psychology: A localist manifesto. Behavioral and Brain Sciences, 23, 443-512.

Pelillo, M. (1999). Replicator equations, maximal cliques, and graph isomorphism. Neural Computation, 11, 1933-1955.

Pelillo, M., Siddiqi, K., & Zucker, S. W. (1999). Matching hierarchical structures using association graphs. IEEE Transactions on Pattern Analysis and Machine Intelligence, 21, 1105-1120.

Pelillo, M., & Torsello, A. (2006). Payoff-monotonic game dynamics and the maximum clique problem. Neural Computation, 18, 1215-1258.

Plate, T. A. (2003). Holographic reduced representation: Distributed representation for cognitive science. Stanford, CA, USA: CSLI Publications.

Rumelhart, D. E., Hinton, G. E., & McClelland, J. L. (1986). A general framework for parallel distributed processing. In D. E. Rumelhart & J. L. McClelland (Eds.), Parallel distributed processing: Explorations in the microstructure of cognition. Volume 1: Foundations (pp. 45-76). Cambridge, MA, USA: The MIT Press.

Smolensky, P. (1990). Tensor product variable binding and the representation of symbolic structures in connectionist systems. Artificial Intelligence, 46, 159-216.

Stewart, T., & Eliasmith, C. (in press). Compositionality and biologically plausible models. In M. Werning, W. Hinzen, & E. Machery (Eds.), The Oxford handbook of compositionality. Oxford, UK: Oxford University Press.