

Infants are curious learners who drive their own cognitive development by imposing structure on their learning environment as they explore. Understanding the mechanisms by which infants structure their own learning is therefore critical to our understanding of development. Here we propose an explicit mechanism for intrinsically motivated information selection that maximizes learning. We first present a neurocomputational model of infant visual category learning, capturing existing empirical data on the role of environmental complexity on learning. Next we “set the model free”, allowing it to select its own stimuli based on a formalization of curiosity and three alternative selection mechanisms. We demonstrate that maximal learning emerges when the model is able to maximize stimulus novelty relative to its internal states, depending on the interaction across learning between the structure of the environment and the plasticity in the learner itself. We discuss the implications of this new curiosity mechanism for both existing computational models of reinforcement learning and for our understanding of this fundamental mechanism in early development.
Developmental Science. 2017;e12629.   
DOI: 10.1111/desc.12629
Curiosity-based learning in infants: a neurocomputational
Katherine E. Twomey1 | Gert Westermann2
1Division of Human
Communication, Development and
Communication, Development and Hearing,
Funding Information
Communicative Development (LuCiD), an
• Wepresentanovelformalizationofthe mechanismunderlyingin-
• Weimplementthismechanisminaneural networkthatcaptures
• In the same model we test four potential selection mechanisms and
show that learning is maximized when the model selects stimuli
based on its learning history, its current plasticity and its learning
• Themodeloffersnewinsightintohowinfantsmaydrivetheirown
Formorethan halfacentury,infants’information selectionhasbeen
documentedinlab-basedexperiments.These carefullydesigned,rig-
orously controlled paradigms allow researchers to isolate a variable
offering a fine- grained picture of the range of factors that affect early
learning. Decades of developmental research have brought about a
broad consensus that infants’ information selection and subsequent
tions, the learning environment, and discrepancies between the two
(for a review, see Mather, 2013). On the one hand, there is substantial
evidence that infants’ performance in these studies depends heav-
ily on the characteristics of the learning environment. For example,
earlyworkdemonstrated that infants under 6monthsofage prefer
to look at patterned over homogenous grey stimuli (Fantz, Ordy, &
Udelf, 1962), and in a seminal series of categorization experiments
with 3- month- old infants, Quinn and colleagues demonstrated that
the category representations infants form are directly related to
the visual variability of the familiarization stimuli they see (Quinn,
4- month- old infants were shown to learn animal categories when fa-
miliarized with paired animal images, but not when presented with
©2017TheAuthors.Developmental SciencePublishedbyJohnWiley&SonsLtd.
alsoKovack-Lesh& Oakes, 2007). Thus, therepresentationsinfants
infants’existing knowledge has a profound effect on their behavior
inthese experiments.For example,whilenewborns respondequiva-
lently to images of faces irrespective of the race of those faces, by
8 months infants show holistic processing of images of faces from
their own race, but not of other- race faces, which they process fea-
turally (Ferguson, Kulkofsky, Cashon, & Casasola, 2009). Similarly,
4-month-old infants with pets at home exhibit more sophisticated
visual sampling of pet images than infants with no such experience
Lesh, McMurray, & Oakes, 2014). Effects of learning history also
emerge when infants’ experience is controlled experimentally. For
example,afteraweekoftrainingwith onenamed andone unnamed
novel object, 10-month-old infants exhibited increased visual sam-
plingof the previouslynamedobject in a subsequentsilentlooking-
2010;Gliga,Volein,&Csibra,2010). Thus, learning depends on the
interaction between what infants encounter in- the- moment and what
1.1 | Active learning in curious infants
A long history of experiments, starting with Piaget’s (1952) notion of
childrenas“little scientists”,has shownthatchildrenaremore thanpas-
siveobservers;rather, theytakeanactiverole inconstructingtheirown
learning.Recent work demonstrates this active learning in infants also.
For example, allowing 16-month-old infants to choose between two
to elicit help from their caregivers in finding a hidden object when they
were unable to see the hiding event than when they saw the object
beinghidden(Goupil, Romand-Monnier,& Kouider,2016).Indeed,even
to 8- month- olds increased their visual sampling of a sequence of images
when those images are moderately—but not maximally or minimally—
predictable (Kidd, Piantadosi, & Aslin, 2012; see also Kidd, Piantadosi,
&Aslin,2014).However, as a newly developing field active learning in
Critically, outside the lab infants interact with their environment
freely and largely autonomously, learning about stimuli in whichever
order they choose (Oudeyer & Smith, 2016). Thisexploration is not
drivenbyan external motivation such as finding foodtosatiatehun-
ger. Rather, it is intrinsically motivated(Baldassarreetal.,2014;Berlyne,
1960;Oudeyer& Kaplan, 2007; Schlesinger, 2013): in the real world
infants learn based on their own curiosity. Consequently, in construct-
ingtheirownlearningenvironment,infantsshape theknowledgethey
acquire. However, in the majority of studies on early cognitive devel-
opment,infants’experiencein alearningsituation isfullyspecified by
about the cognitive processes underlying infants’ curiosity as a form of
intrinsicmotivation,or indeed the extent towhichwhat infants learn
fromcuriosity-drivenexplorationdiffersfromwhattheylearn inmore
riences—and consequently, their mental representations—is fundamen-
tal to our understanding of development more broadly.
1.2 | Computational studies of intrinsic motivation
In contrast to the relative scarcity of research into infant curiosity,
recent years have seen a surge in interest in the role of intrinsic mo-
tivation in autonomous computational systems. Equipping artificial
learningsystems withintrinsic motivationmechanismsis likelyto be
2013;Oudeyer,Kaplan, &Hafner,2007),andconsequently arapidly
expandingbody of computational androboticwork now focuses on
the intrinsic motivation mechanisms that may underlie a range of
behaviors; for example, low-level perceptual encoding (Lonini etal.,
2013; Schlesinger & Amso, 2013), novelty detection (Marsland,
Nehmzow, & Shapiro, 2005), and motion planning (Frank, Leitner,
Computational work in intrinsicmotivation has suggested a wide
range of possible formal mechanisms for artificial curiosity- based learn-
couldbe underpinned bya drive tomaximizelearning progressby in-
teracting with the environment in a novel manner relative to previously
be driven by prediction mechanisms, allowing the system to engage in
activitiesforwhichpredictabilityis maximal(Lefort&Gepperth,2015)
or minimal (Botvinick, Niv,& Barto, 2009). Still other approaches as-
sume that curiosity involves maximizing a system’s competence or
abilitytoperforma task(Murakami,Kroger,Birkholz,&Triesch,2015).
osity algorithms, it remains largely agnostic as to the psychological plau-
arate “reward” module in which the size and timing of the reward are
defined a priori by the modeler. Only recently has research highlighted
the value of incorporating developmental constraints in curiosity-based
computational and robotic learning systems (Oudeyer & Smith, 2016;
Seepanomwan, Caligiore, Cangelosi, & Baldassarre, 2015). While this
research shows great promise in incorporating developmentally inspired
curiosity-driven learning mechanisms into artificial learning systems, a
mechanism for curiosity in human infants has yet to be specified. The
aim of this paper therefore is to develop a theory of curiosity-based
learning in infants, and to implement these principles in a computational
1.3 | The importance of novelty to curiosity-based learning
Fromvery early indevelopment,infants show a novelty preference;
that is, they prefer new items to items they have already encountered
becomes less novel; that is, the child habituates. During habituation,
if a further new stimulus appears, and that stimulus is more novel
to the infant than the currently attended item, the infant abandons
arelinked: broadly,increases innovelty elicitincreasesinattention
that excessive novelty leads to a decrease in attention). Here, we
propose that curiosity in human infants consists of intrinsically mo-
On this view, infants will selectively attend to stimuli that best
supportthisdiscrepancyminimization. However,to date thereisno
agreement in the empirical literature as to what an optimal learn-
ing environment might be. For example, Bulf,Johnson, and Valenza
(2011) demonstrated that newborns learned from highly predictable
sequences of visual stimuli, but not from less predictable sequences.
In contrast, 10-month-old infants in a categorizationtask formed a
uncovered a “Goldilocks” effect in which learning is optimal when
alsoKinney& Kagan, 1976;Twomey,Ranson, & Horst,2014).From
ronment that best supports learning is unclear.
Across these studies, novelty and complexity are operational-
izeddifferently;for example, as objective environmentalpredictability
infants who are engaged in curiosity- driven learning, novelty is not a
perceptualenvironmental characteristics and what the learner knows.
Importantly, each infant has a different learning history which can affect
their exploratorybehavior. Forexample, infant A playswith blocks at 
B’s favoritetoy is a rattle, and she is familiar with the noise it makes
ple,sequencepredictabilityordifferencesinvisual features(Kiddetal.,
elty based both on the learner’s internal representations (what infants
know) and the learning environment(what infants experience). In the
following paragraphs we provide a mechanistic account of this learner–
environment interaction using a neurocomputational model.
1.4 | Computational mechanisms for infant curiosity
Computational models have been widely used to investigate
various cognitive processes, lending themselves in particular to
capturing early developmental phenomena such as category learning
(e.g., Althaus & Mareschal, 2013; Colunga & Smith, 2003; Gliozzi,
Thomas, 2007; Munakata & McClelland, 2003; Rogers & McClelland,
2008; Westermann & Mareschal, 2004, 2012, 2014). Here we take
a connectionist or neurocomputational approach in which abstract
simulations of biological neural networks are used to implement and
explore theories of cognitive processes in an explicit way, offering
tions about novel behaviors. Neurocomputational models employ a
network of simple processing units to simulate the learner situated
interest, and can have important effects across representational de-
from the interaction between learner and environment. Thus, neu-
rocomputational models are well suited to implementing and testing
developmental theories.
In the current work we employed autoencoder networks: ar-
tificial neural networks in which the input and the output are the
same (Cottrell& Fleming, 1990; Hinton & Salakhutdinov, 2006; see
from infant category learning tasks (Capelier-Mourguy,Twomey, &
Westermann, 2016; French, Mareschal, Mermillod, & Quinn, 2004;
Westermann& Mareschal,2004,2012,2014).Autoencoders imple-
nalrepresentationuntil thetwo match.Atthispointtheinfantlooks
morenovel astimulus,thelonger fixation timewill be. Similarly,au-
toencoder models receivean external stimulus on their input layer,
and aim to reproduce this input on the output layer via a hidden layer.
weighted connections to the hidden layer. Inputs to each hidden layer
unit are summed and this value passed through a typically sigmoid
activationfunction.The values on the hidden units are then passed
throughthe weightedconnectionsto the outputlayer.Again,inputs
to each output node are summed and passed through the activation
function, generating the model’s output representation. Learning is
discrepancybetweentheinput and outputrepresentations.Because
multiple iterations of weight adaptation are required to match the
model’s input and output, erroracts as an index of infants’ looking
times(Mareschal&French,2000) or,morebroadly,thequalityofan
internal representation.
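
The forward pass described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the layer sizes (four input, three hidden, four output units), the absence of bias terms, the weight initialization range, and the example stimulus are all assumptions made for the sketch.

```python
import math
import random


def sigmoid(x):
    """Logistic activation function."""
    return 1.0 / (1.0 + math.exp(-x))


def forward(stimulus, w_in_hid, w_hid_out):
    """Propagate a stimulus through input -> hidden -> output layers."""
    hidden = [sigmoid(sum(w * x for w, x in zip(row, stimulus)))
              for row in w_in_hid]
    output = [sigmoid(sum(w * h for w, h in zip(row, hidden)))
              for row in w_hid_out]
    return hidden, output


def sum_squared_error(stimulus, output):
    """Discrepancy between input and output, used as a looking-time proxy."""
    return sum((i - o) ** 2 for i, o in zip(stimulus, output))


def init_weights(n_rows, n_cols, rng, scale=0.5):
    """Small random weights; the range is an assumption for this sketch."""
    return [[rng.uniform(-scale, scale) for _ in range(n_cols)]
            for _ in range(n_rows)]


rng = random.Random(0)
w_in_hid = init_weights(3, 4, rng)   # one row of weights per hidden unit
w_hid_out = init_weights(4, 3, rng)  # one row of weights per output unit

stimulus = [0.2, 0.8, 0.5, 0.1]      # hypothetical four-feature input
hidden, output = forward(stimulus, w_in_hid, w_hid_out)
error = sum_squared_error(stimulus, output)
print(f"SSE before any learning: {error:.3f}")
```

In this sketch, error is high for an untrained network and would fall as the weights adapt, which is the sense in which it can stand in for looking time.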
Self-supervised autoencoder models are trained with the well-
known generalizeddelta rule (Rumelhart, Hinton, & Williams, 1986)
withthe specialcasethat inputand targetarethesame.Theweight
update rule of these models is:

$$\Delta w = \eta\,(i - o)\,o(1 - o)$$
where Δw is the change of a weight after presentation of a stimulus.
The first term, (i − o), describes the difference between the
o(1 − o), is the derivative of the sigmoid activation function. This
term is minimal for output values near 0 or 1 and maximal for o =
0.5. Because (i − o) represents the discrepancy between the model’s
input and its representation, and because learning in the model
consists of reducing this discrepancy, the size of o(1 − o) determines
the amount the model can learn from a particular stimulus by constraining
the size of the discrepancy to be reduced. In this sense,
o(1 − o) reflects the plasticity of the learner, modulating its adaptation
to the external environment. Finally, η represents the model’s
learning rate. The amount of adaptation is thus a function both of
the environment and the internal state of the learner.
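
To make the roles of the three terms concrete, the following sketch evaluates the adaptation term for a single output unit. It follows only the terms named in the text, namely η, (i − o) and o(1 − o). The learning rate of 0.1 and the example values are assumptions chosen for illustration.

```python
def plasticity(o):
    """Derivative of the sigmoid, expressed via the unit's output: o(1 - o)."""
    return o * (1.0 - o)


def weight_change(i, o, eta=0.1):
    """Adaptation term eta * (i - o) * o(1 - o) for one output unit.

    eta=0.1 is a placeholder learning rate, not a value from the paper.
    """
    return eta * (i - o) * plasticity(o)


# Hold the input at 1.0 and vary the output to see how plasticity
# modulates the effect of the same kind of discrepancy.
for o in (0.01, 0.25, 0.5, 0.75, 0.99):
    print(f"o={o:.2f}  plasticity={plasticity(o):.4f}  "
          f"change={weight_change(1.0, o):.5f}")
```

Running the loop shows that a large discrepancy alone does not guarantee a large weight change: when the output is close to 0 or 1 the plasticity term shrinks the update.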
adaptation (learning) is proportional to (i − o)o(1 − o); that is, learning
is greatest when (i − o)o(1 − o) is maximal. If curiosity is a drive
to maximize learning, (i − o)o(1 − o) offers a mechanism for stimulus
selection to maximize learning: a curious model should attempt
to maximize its learning by choosing stimuli for which (i − o)o(1 −
o) is greatest. Below, in Experiment 2 we test this possibility in a
model, and compare it against three alternative methods of stimulus
1.5 | A test case: infant categorization
Goodnow,&Austin,1972). Consequently,thedevelopmentofthis
powerful skill has generated a great deal of interest, and a large
body of research now demonstrates that infant categorization
is flexible and affected by both existing knowledge and in-the-
momentfeaturesof theenvironment(forareview,seeGershkoff-
Stowe& Rakison,2005).Categorization thereforelendsitself well
totesting the curiositymechanism specified above.InExperiment
1 we present a model that captures infants’ behavior in a recent
categorization task in which the learning environment was artifi-
ciallymanipulated(thus examiningdifferentlearningenvironments
in a controlled laboratory study in which infants do not select in-
ated in the curiosity mechanism against three alternative mecha-
nisms, and demonstrate that learning history and learning plasticity
(i.e., the learner’s internal state) as well as in-the-moment input (i.e.,
the learning environment) are all necessary for maximal learning.
Taken together, these simulations offer an explicit and parsimonious
mechanism for curiosity-driven learning, providing new insight
into existing empirical findings, and generating novel, testable pre-
variations in perceptual features came from an influential series
of familiarization/novelty preference studies by Barbara Younger
fantsarefamiliarizedwithaseriesof relatedstimuli—forexample,an
infant might see eight images of different cats, for 10 seconds each.
Then,infantsare presented with two new images side-by-side, one
of which is a novel member of the just- seen category, and one of
preference,if infantslookfor longeratthe out-of-categorystimulus
than the within-category stimulus the experimenter concludes that
out-of-categoryitem.In thisexample,longerlookingatthedog than
whichexcludedthenovel dogexemplar (andindeed, theydo; Quinn
et al., 1993)
Younger(1985) exploredwhetherinfants couldtrackcovariation
of stimulus features and form a category based on this environmen-
talstructure. Ten-month-old infantswere shown aseriesofpictures
of novel animals (see Figure1) that incorporated four features (ear
separation, neck length, leg length and tail width) that could vary
systematicallyinsizebetween discretevalues of1 and 5.At test,all
children saw two simultaneously presented stimuli: one peripheral (a
newexemplarwithextremefeaturevalues)and onecategory-central
(anewexemplarwith the centralvalue for each feature dimension).
Infants’increased looking times to the peripheral stimulus indicated
that they had learned a category that included the category- central
stimulus. This study was one of the first to demonstrate the now
Lesh & Oakes,2007; Quinn etal., 1993; Rakison, 2004; Rakison &
extensionofthisstudywhichtoourknowledgehasnotyetbeen cap-
tured in a computational model. Matherand Plunkett (2011; hence-
waspresentedduring familiarizationwouldaffectinfants’ categoriza-
Younger (1985, E1).Although all infants saw the same stimuli, M&P
manipulated the order in which stimuli were presented during the fa-
Attest, allinfantssawtwosimultaneously presentednovelstimuli,in
line with Younger (1985): one category-central and one peripheral.
M&P found thatinfants in the maximum distance condition showed
an above- chance preference for the peripheral stimulus, while infants
inthe minimum distancecondition showednopreference.Thus, only
egory space”, then infants in the maximum distance condition would
traverse greater distances during familiarization than infants in the
minimum distance condition, leading to better learning. However, it is
not clear from these empirical data how infants adjusted their repre-
lateM&P’stask. Closelyfollowingthe originalexperimental design,we
trainedourmodelwithstimulus setsinwhichpresentationordermax-
imizedand minimizedsuccessiveperceptualdistances.Toenablemore
fine- grained analyses we tested additional conditions with intermediate
perceptual distances as well as randomly presented sequences (the
usual case in familiarization/novelty preference studies with infants).
LikeM&Pwethen tested themodelon new peripheraland category-
centralstimuli. Basedontheir results,we expectedthe model toform
the strongest category after training with maximum distance stimuli,
then intermediate/random distance, and finally minimum distance.
2.1 | Model architecture
Weused an autoencoderarchitectureconsisting of fourinputunits,
threehiddenunits,andfouroutputunits(Figure2).Each input unit
corresponded to one of the four features of the training stimuli (i.e.,
Hidden and output units used a sigmoidal activation function and
2.2 | Stimuli
features neck length, leg length, ear separation, and tail width. Individual
stimuliwerebased on the stimulus dimensionsprovidedinYounger
(1985,E1, Broad; seeFigure1). For eachfeature,these values were
normalizedto lie between0and 1. Eachstimulus(that is, inputori)
therefore consisted of a four- element vector in which each element
represented the value for one of the four features. Model inputs were
eachsequence we calculated the mean Euclidean distance (ED) be-
tweensuccessive stimuli. This resultedina single overallperceptual
distance value for each sequence.
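
The distance measure can be illustrated with a short sketch that enumerates every presentation order of eight stimuli, computes the mean Euclidean distance between successive stimuli for each order, and sorts the orders by that value. The stimulus values below are random placeholders, not the Younger (1985) dimensions, and the slicing of the sorted list into 24-set conditions reflects one reading of the text rather than a confirmed procedure.

```python
import itertools
import math
import random


def euclidean(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Placeholder stimuli: eight four-feature vectors with values in [0, 1].
rng = random.Random(1)
stimuli = [[rng.random() for _ in range(4)] for _ in range(8)]

# Precompute all pairwise distances once.
dist = [[euclidean(a, b) for b in stimuli] for a in stimuli]


def mean_successive_distance(order):
    """Mean ED between each stimulus and the next one in a presentation order."""
    steps = [dist[a][b] for a, b in zip(order, order[1:])]
    return sum(steps) / len(steps)


# All 8! presentation orders, sorted from smallest to largest mean ED.
orders = sorted(itertools.permutations(range(8)), key=mean_successive_distance)

min_sets = orders[:24]            # smallest mean successive distance
max_sets = orders[-24:]           # largest mean successive distance
med_sets = orders[20148:20172]    # sets 20,149-20,172 (1-indexed) of the sorted list

print(len(orders), "possible orders")
print("min:", round(mean_successive_distance(min_sets[0]), 3),
      "max:", round(mean_successive_distance(max_sets[-1]), 3))
```

Precomputing the pairwise distance table keeps the enumeration cheap, since each of the possible orders then needs only seven table lookups.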
We created orders for the following four conditions based on
• Maximumdistance(max;cf.M&Pmaximumdistance):24setswith
• Minimum distance (min;cf.M&Pminimumdistance): 24setswith
Medium distance (med): 24 sets with an intermediate mean ED,
specifically sets 20,149–20,172 when sets are sorted in order of
stimuli presented in random order
oftwocategory-peripheralstimuli(newexemplarswith extremefea-
turevalues)and one category-centralstimulus (anewexemplarwith
FIGURE1 StimuliusedinYounger(1985)andthecurrent
&Plunkett(2011)withpermission FIGURE2 Model architecture
these test stimuli was part of the training set.
2.3 | Procedure
Duringtraining, each stimulus was presented foramaximum of 20
set after each sweep (with no weight updating) and recorded sum
squared error (SSE)asaproxyforlookingtime(Mareschal&French,
2000; Westermann & Mareschal, 2012, 2014). Order of presenta-
tion of training stimuli varied by condition (see Stimuli). Following
eral, one central), presented sequentially for a single sweep with no
weight updates, and again recorded SSE. There were 24 separate
models in each condition, reflecting the 24 participants in each con-
2.4 | Results and discussion
2.4.1 | Training trials
Duringfamiliarization infantsinM&P demonstratedasignificant de-
creasein looking fromthefirst to the finalthree-trialblock. For the
maxandmin conditions we submitted SSE during thefirstandfinal
three-trialblockstoa2(block:first,last;within-subjects)×2 (condi-
maineffectofblock(F(1,46)=97.35,p < .0001, η2
G = .46) confirmed
thatoverallSSE decreasedfrom thefirstblock(M= 0.57,SD = 0.11)
tothefinal block (M = 0.54, SD = 0.11). A main effect of condition
(F(1, 46) = 2079.12, p < .0001, η2
G = .96) revealed that there was less
erroroverallinthemaxcondition(M=0.45,SD = 0.03) than in the min
condition (M = 0.66, SD=0.03).Finally,therewasasignificantblock-
by- condition interaction (F(1, 46) = 4.40, p = .041, η2
G = .03), which
arosefromagreaterdecreasein SSEinthe maxcondition(mean de-
crease=0.045)than in the min condition (mean decrease = 0.030).
Thus,as with the infants in M&P, “looking” in the model decreased
over training.
2.4.2 | Test trials
InM&P,increased lookingtotheperipheralstimuli attestwastaken
proxyforlookingtime,wecollapsed ouranalysesacross thetwope-
ripheralstimuli(Mather&Plunkett, 2011),and calculatedproportion
sum tests against chance confirmed that in all conditions the model
formed a category (all Vs = 300, all ps<.001). However, a Kruskal-
tion) differed between conditions (H(3) = 80.13, p < .001). Post- hoc
Wilcoxon tests (all Ws two-tailed and Bonferroni-corrected) con-
= 0.99) than in the min condition (Mdn = 0.76; W=576,p < .0001, r =
−1.53),themedcondition(Mdn = 0.79; W=576,p < .0001, r=−1.53)
or the random condition (Mdn = 0.83; W=575,p < .0001, r=−1.51).
Allotherbetween-conditiondifferenceswerealsosignificant(allps <
formation in M&P’s minimum distance condition, the authors argue
that these infants were in fact learning a category; since distances
were smaller, these infants traversed less of the category space than
resentations were therefore not sufficiently robust to be detected at
data, likely accounting for our detection of differences where M&P
found null effects.
Overall, our results support M&P’s distance-based account.
We maketheir theoretical category space explicit by implementing
stimuli as feature vectors, which can be interpreted as locations in
Euclidean space.The greater overall Euclideandistances in the max
condition thereforeforce the model to “travel” furtherfrom trial to
therefore greater adaptation, resulting in stronger category learning
overall.The model therefore explains how manipulation ofstimulus
order during training can lead to observed differences in learning at
In Experiment 1 (as in M&P) the orderof stimulus presenta-
tion was fixedin each condition to control the mean successive
ED.This approach created an artificially structured environment
in which the model learned best from the inputs with the most
tational data indicate that both infants and the model learn dif-
ferently in differently structured environments—even when those
differences may seem minor, such as the order in which stimuli
FIGURE3 ProportionSSEtoperipheralstimulusattestin
***p < .001
*** ***
all between-condition differences ***
 7 of 13
areexperienced. However,Experiment 1 reflectedartificially op-
timized ratherthan curiosity-based learning. An important ques-
tion for research on curiosity- based learning is how a model that
selects its own experiences structures its environmentand how
learning in this self- generated environment compares with learn-
ing in the artificially optimized environment in Experiment 1.
Thus,inExperiment2we allowedthe modelto choosethe order
in which it learned from stimuli based both on environmental and
vation in which curiosity is triggered when a learner notices a dis-
crepancy between the environment and their representation (e.g.,
Loewenstein, 1994), the model scans the environment and then
selects the stimulus that maximizes a given function.This learn-
ing is analogous to an infant looking at and processingan array
ofobjects beforechoosing one tolearnfrom.Wecompared the
curiosity- based learning discussed above with three alternative
or plasticity at each learning step.
possible mechanisms for stimulus selection.
3.1 | Model architecture and stimuli
Model architecture and parameters and stimuli were identical to
those used in Experiment 1. Stimulus selection proceeded without
replacement;thus,asinExperiment1 the model saw exactly eight
3.2 | Procedure
The procedure used in Experiment 2 was identical to that used in
Experiment 1, with the exception that stimulus order was deter-
mined by the model based on the following four methods of stimulus
3.2.1 | Curiosity
In the curiosityconditionwetestedourformalizationofinfantcurios-
itybased on the delta rule.Specifically,before presentation of each
stimulus, the model calculated (i − o)o(1 − o) for all possible stimuli
where i = input values and o=outputvalues.Forexample,afterpres-
entation of the first stimulus, the model calculated (i − o)o(1 − o) for
each of the remaining seven stimuli, resulting in a set of seven poten-
tialcuriosity values.Thenext stimuluschosenas input tothe model
was that for which the absolute value of this curiosity function was
ing a novelty detection mechanism rather than the novelty reduction
process of learning.
3.2.2 | Objective complexity maximization
M&Pused Euclidean distance as a measure of inter-stimulus novelty
andshowed that maximizing noveltyobjectivelypresentin the learn-
ing environment led to better learning than minimizing this novelty.
However, M&P selected the presentation orders in advance of the
experiment so that the max condition maximized mean ED between
stimuli across the sequence as a whole. However, our model aimed
toprovidean account of in-the-moment information selection. Thus,
in the objective complexity maximization condition, at each step the
modelchosethe stimulusthatwasmaximallydistant(byED)from the
current stimulus. Complexity is therefore specifically implemented as
EDhere. Inthis conditionthe firststimuluswaschosenrandomly and
3.2.3 | Subjective novelty maximization
In the subjective novelty maximization condition the model selected
stimulibymaximizingi − o, leading to the selection of a stimulus that
was maximally different from its representation in the model. This
mechanismmaximized novelty relative to the model’s learning history.
Subjective novelty maximization therefore reflects prediction-error-
based computational reinforcement learning systems (for a review,
seeBotvinick etal., 2009; see also Ribas-Fernandes etal., 2011), in
3.2.4 | Plasticity maximization
Choosing stimuli based on o(1 − o) minimizes the in-the-moment effect
of the environment (i) on the model’s learning by omitting (i − o). Put
differently, this mechanism maximizes the model’s plasticity. Thus, in
the plasticity maximization condition the model selected stimuli about
which it was most ready to learn (disregarding how much it would
actually be able to learn from that stimulus).
In all conditions the test phase was exactlyas in Experiment 1,
comparingnetwork errortocentral and peripheralstimuliasa mea-
sure of strength of category learning.
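
The four selection mechanisms can be compared side by side in a small sketch. Each one scores every remaining stimulus and the highest-scoring stimulus is presented next. Several details here are assumptions rather than the authors' method: how per-unit values are combined into one score (summed absolute values are used below), the `respond` function standing in for the trained network, and the toy stimuli.

```python
import math


def curiosity(i, o):
    """Sum over units of |(i - o) * o * (1 - o)| (aggregation is assumed)."""
    return sum(abs((a - b) * b * (1.0 - b)) for a, b in zip(i, o))


def subjective_novelty(i, o):
    """Sum over units of |i - o|: discrepancy from the model's representation."""
    return sum(abs(a - b) for a, b in zip(i, o))


def plasticity(i, o):
    """Sum over units of o * (1 - o); the input i is deliberately ignored."""
    return sum(b * (1.0 - b) for b in o)


def objective_complexity(candidate, current):
    """Euclidean distance from the stimulus currently being attended."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(candidate, current)))


def select_next(remaining, respond, score):
    """Return the remaining stimulus with the highest score for this model state."""
    return max(remaining, key=lambda s: score(s, respond(s)))


def respond(stimulus):
    """Hypothetical stand-in for the network's output to a stimulus."""
    return [0.5, 0.5, 0.5, 0.5]


remaining = [
    [0.1, 0.9, 0.5, 0.5],
    [0.5, 0.5, 0.5, 0.5],
    [0.0, 1.0, 0.0, 1.0],
]
current = [0.5, 0.5, 0.5, 0.5]

chosen = select_next(remaining, respond, curiosity)
farthest = max(remaining, key=lambda s: objective_complexity(s, current))
print("curiosity picks:", chosen)
print("objective complexity picks:", farthest)
```

Because the stand-in model returns the same output for every stimulus, plasticity cannot discriminate between candidates here. With a real network the output differs per stimulus, so the mechanisms can diverge.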
3.3 | Results and discussion
Proportion of total SSE for peripheral test stimuli is depicted in
the model formed a category in all conditions (all ps < .001). Active
learning therefore led to category formation irrespective of the basis
on which the model selected stimuli. A Kruskal-Wallis test revealed,
however, that SSE differed between conditions. In the following section
we discuss the differences between the four stimulus selection
best in the curiosity condition. First, the model learned a more robust
category in the curiosity condition (Mdn = 0.97) than in the objective
complexitymaximizationcondition(Mdn = 0.91; W=495,p < .001, r =
−0.92).Thisresulthighlights theroleofthelearnerin thelearning pro-
cess: when the model selected stimuli based solely on objective, envi-
alsooutperformedthe subjectivenoveltymaximization condition(Mdn
= 0.77; W=575,p < .001, r=−1.51).Here,althoughthemodel’slearned
the difference between its representation (o) and the environment (i)
were greatest in- the- moment, the longer- term effect of learning history,
demonstrates that the additional plasticity provided by the o(1−o) term
tent to which the model could adapt to its learning environment, reduc-
ing its ability to select stimuli that would lead to optimum information
aloneis notsufficientto maximizelearning: the modelalsoperformed
dition (Mdn=0.75,W=575,p < .001, r=−1.51).Sincethislattermech-
anism ignores the in- the- moment effect of the environment this result
suggests that while focusing solely on the environment is not the best
strategy for active learning, ignoring how much can actually be learned
fromastimulusis notoptimaleither.Finally,inline withExperiment 1
jectivenoveltyandplasticitymaximizationconditions(respectively,W =
564,p < .0001, r=−1.37; W= 56,p < .0001, r= −1.36),further high-
lighting the importance of environmental input; however, we found no
and plasticity maximization conditions (W = 318, p= .55, r = −0.12).
Overall,then, ourformalization ofcuriositymaximizedlearning viathe
dynamic interaction of plasticity, learning history, and in- the- moment
environmental input.
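The pairwise W statistics are easier to read with their possible range in mind. The sketch below assumes that W is the rank-sum statistic in its Mann-Whitney form (the number of cross-condition pairs in which the first condition scores higher, with ties counted as one half) and that each condition comprised 24 runs. Neither assumption is confirmed in this section, and the scores below are made up.

```python
def mann_whitney_u(x, y):
    """Count pairs (a, b), a from x and b from y, with a > b; ties count 0.5."""
    u = 0.0
    for a in x:
        for b in y:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u

# Made-up scores: complete separation between two groups of 24 runs.
high = [0.90 + 0.001 * k for k in range(24)]
low = [0.70 + 0.001 * k for k in range(24)]
u_max = mann_whitney_u(high, low)   # every one of the 24 * 24 pairs is a "win"
u_min = mann_whitney_u(low, high)   # no pair is a "win"
```

Under these assumptions the statistic ranges from 0 to 24 × 24 = 576, with 288 expected when the conditions do not differ. A value of 575 would then indicate almost complete separation, while 318 is close to the no-difference midpoint.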
Next, wewere interested in the level of complexityof the se-
quences that maximized learning in the curiosity condition. Inthe
contextof Experiment 1 and M&P,we might expect that the curi-
ousmodelhad maximizedtheseenvironmentaldistances.However,
support learning (Kidd etal., 2012, 2014; Kinney & Kagan, 1976;
learningin some cases (Bulfetal.,2011; Son, Smith, & Goldstone,
2008).Tohelp makesense of theseconflictingresults, all ofwhich
come from experimentswith predetermined stimulus presentation
orders,we analyzedthe stimulus sequencesgeneratedbythecuri-
ous model. Overall, the model generated four different sequences
out of the totalpossible 40,320, depicted in Figure5. On the one
hand, these sequences are very similar; recall that the model selected
stimuli without replacement, reducing the degrees of freedom as
trainingproceeded.Ontheotherhand, theyarenot identical.Their
differences stem from the stochasticity provided to the model by the
humandata,the model data exhibit individual differencesunderly-
generated only four different sequences over 24 runs, this result also
predicts that systematicity in infants’ curiosity- based learning should
be relatively robust.
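To make selection without replacement concrete, here is a deliberately simplified sketch. The learner is a stand-in (a bank of independent logistic units with bias terms only), not the autoencoder used in the simulations, and the stimuli, learning rate, number of training sweeps, and seed are all hypothetical.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def curiosity(i, o):
    """Sum over units of |(i - o) o (1 - o)| (summing over units is an assumption)."""
    return sum(abs((a - b) * b * (1 - b)) for a, b in zip(i, o))

def curious_order(stimuli, seed, lr=0.5, sweeps=20):
    """Order in which a toy learner picks stimuli, each exactly once."""
    rng = random.Random(seed)
    # Random initial state stands in for random weight initialization.
    bias = [rng.uniform(-0.5, 0.5) for _ in stimuli[0]]
    remaining = list(range(len(stimuli)))
    order = []
    while remaining:
        out = [sigmoid(b) for b in bias]
        # Choose the remaining stimulus with the highest curiosity score.
        k = max(remaining, key=lambda idx: curiosity(stimuli[idx], out))
        remaining.remove(k)  # without replacement
        order.append(k)
        # Train on the chosen stimulus with a delta-rule update.
        for _ in range(sweeps):
            out = [sigmoid(b) for b in bias]
            bias = [b + lr * (t - o) * o * (1 - o)
                    for b, t, o in zip(bias, stimuli[k], out)]
    return order

# Hypothetical four-feature stimuli: four peripheral and four more central items.
stimuli = [
    [0.1, 0.9, 0.1, 0.9], [0.9, 0.1, 0.9, 0.1],
    [0.9, 0.9, 0.1, 0.1], [0.1, 0.1, 0.9, 0.9],
    [0.3, 0.7, 0.3, 0.7], [0.3, 0.3, 0.7, 0.7],
    [0.7, 0.7, 0.3, 0.3], [0.7, 0.3, 0.7, 0.3],
]
order = curious_order(stimuli, seed=0)
```

Because each choice removes a stimulus from the pool, the set of possible continuations shrinks on every trial. The random initial state is the only source of variation between runs in this sketch.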
Toobtain an index ofthe level ofcomplexity ofthegenerated
orderswe ranked the entire set of 40,320 permutations bymean
overall ED, generating281 unique values. Table1 provides these
inthe curiositycondition.The curiousmodel generatedsequences
of intermediate objective complexity. However, these sequences
werenot of averagecomplexity(i.e., from ranks around140/281)
butwere towards the highendofthe range. Toexplorethis find-
ineach ofthe foursequences andrankedthese accordingto their
complexity(i.e., a rank of 1 would mean that the model has cho-
senthemaximallydifferent next stimulus from the set ofremain-
ingstimuli).These individual inter-stimulusdistancesareprovided
in Table2. Interestingly, the model did not generate intermediate
ingthemean overallED masks amoreinteresting behavior: in all
sequences,the model firstmaximizedED (1/7) (cf.M&P). Inthree
out of the foursequences the model then minimized the second
ED(6/6),thenchosean intermediateED(3/5)andmaximized EDs
thereafter.Therefore, when measuredin terms of objective com-
plexity,overall intermediate complexity arose froma combination
ofmaximallycomplex, minimallycomplex andmoderatelycomplex
optimalintermediacybeshifted towardsthe morecomplex endof
thescale?Figure6plotsthe curiosityfunctionforvaluesofi and o
FIGURE4 ProportionSSEtoperipheralstimulusattestin
***p < .001
Proportion SSE to
peripheral stimulus
*** ***
Curiosity Objective
 9 of 13
between 0 and 1 and illustrates that (i − o)o(1 − o) is minimal when
(i − o)iszero,andmaximalwhen(i−o)isaround0.7.Thus,learning
is greatest when both plasticity and subjective novelty are interme-
diate, but shifted towards the higher end of the spectrum.
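The location of this maximum can be checked numerically. The sketch below evaluates (i − o)o(1 − o) for a single unit on a grid of i and o values between 0 and 1; the grid resolution and variable names are choices made here for illustration.

```python
def curiosity(i, o):
    """Single-unit curiosity term: subjective novelty (i - o) times plasticity o(1 - o)."""
    return (i - o) * o * (1 - o)

# Evaluate on a 101 x 101 grid over [0, 1] x [0, 1].
grid = [k / 100 for k in range(101)]
best_value, best_i, best_o = max(
    (curiosity(i, o), i, o) for i in grid for o in grid
)
best_gap = best_i - best_o  # the (i - o) at which the function peaks
```

Analytically, for i = 1 the function reduces to o(1 − o)², which peaks at o = 1/3 with value 4/27. The corresponding (i − o) of 2/3 agrees with the value of around 0.7 read from the plot, and the function is zero whenever i = o.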
This striking novelty-maximization–novelty-minimization behavior
emerges because curiosity-driven learning maximizes subjective—not
modelis initializedrandomlywithoutpriorknowledge abouttheto-be-
experiencedstimuli.Atthis stage,thestimulus most similartothis ran-
domrepresentationin thecontextofthe to-be-learnedcategorywould
maximizes learning by choosing a category-peripheral stimulus that
is maximally different from its initial, random representation. Next, it
othercategoryperipheral stimulus.Now,thetwo mostperipheral cate-
gory stimuli, having just been encoded, are the most familiar to the model
possible between these two representations; that is, a category-central
stimulus—and this is what the model chooses. Thus, notwithstanding
thenoise inherentinthe initializationofthe model,which accountsfor
itschoiceofdifferentspecificorders,broadlythemodel exploreswitha
“startfromtheoutside andmovein”strategyfrom theextremes tothe
prototype. Notethat while the model predicts that infants will exhibit
thesamepattern ofexplorationthisisbasedon theassumptionofnoa
learnedrepresentationsby10 months.Whether infantswill exhibitthe
tasks involving truly free exploration—are exciting empirical questions
which we are currently addressing.
toa previouslyunseenprototypical exemplar,weassumethatit has
learned a category with the prototypical exemplar at its center. In
as vectors, can be thought of as locations in representational space.
Category learning is therefore a process of moving from location to
locationwithin this space. Fromthis perspective, theorderin which
thecurious model choosesstimulimaximizes the numberof timesit
traverses the central location in this space, resulting in strong encod-
ingofthis arearelativetoweakencoding ofperipheralstimuli.More
generally, the curiosity mechanism makes the intriguing prediction
forfutureworkthat infants engagedincuriosity-drivenlearning will
FIGURE5 Stimulusorderschosenby
curious model
Trial 12345678
1515 5151 5511 1155 2424 2244 4422 4242
1515 5151 5511 1155 4242 2424 4422 2244
1515 5151 2244 2424 5511 1155 4422 4242
1155 5511 4422 4242 5151 1515 2244 2424
TABLE1 RankmeanEuclideandistanceschoseninthecuriosity
Rank mean ED Frequency/24
34/281 5
41/281 18
50/281 1
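The rank mean ED values in Table 1 come from ranking every possible presentation order by its mean inter-stimulus distance. A sketch of that procedure follows. The stimulus vectors are hypothetical stand-ins, so the number of unique values will not match the 281 reported, and the rounding used to merge equal means and the convention that rank 1 is the largest mean ED are assumptions.

```python
import itertools
import math

def mean_successive_ed(sequence):
    """Mean Euclidean distance between consecutive stimuli in a sequence."""
    dists = [math.dist(a, b) for a, b in zip(sequence, sequence[1:])]
    return sum(dists) / len(dists)

def rank_table(stimuli, ndigits=6):
    """Map each distinct mean ED (rounded) to its rank; rank 1 = largest mean ED."""
    means = {
        round(mean_successive_ed(p), ndigits)
        for p in itertools.permutations(stimuli)
    }
    ordered = sorted(means, reverse=True)
    return {m: r for r, m in enumerate(ordered, start=1)}

# Hypothetical four-feature stimuli (not the stimuli used in the simulations).
stimuli = [
    [0.1, 0.9, 0.1, 0.9], [0.9, 0.1, 0.9, 0.1],
    [0.9, 0.9, 0.1, 0.1], [0.1, 0.1, 0.9, 0.9],
    [0.3, 0.7, 0.3, 0.7], [0.3, 0.3, 0.7, 0.7],
    [0.7, 0.7, 0.3, 0.3], [0.7, 0.3, 0.7, 0.3],
]
table = rank_table(stimuli)                      # enumerates all 8! = 40,320 orders
example_rank = table[round(mean_successive_ed(stimuli), 6)]
n_unique = len(table)
```

A sequence's position is then reported as `example_rank` out of `n_unique`, in the same form as the entries in Table 1.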
In the current work we used a neurocomputational model to first
categorization, and then to offer an explicit account of curiosity-
driven learning in human infants. In Experiment 1 we captured
empiricaldatapresented by Mather and Plunkett (2011), in which
10-month-old infants formed a robust category when familiarized
but not in sequences which minimized it. In Experiment 2, we al-
lowedthemodeltotakeanactive role in its own learning by let-
ting it select its own stimuli, comparing four different mechanisms
for stimulus selection. Here, curiosity- based learning depended
critically on the interaction between learning history, plasticity and
the learning environment, allowing the model to choose stimuli for
opmental trajectory.
4.1 | Novelty is in the eye of the beholder
Our goal here was to develop a mechanistic theory of infants’ intrinsically
motivated, or curiosity-based, visual exploration. We selected
the autoencoder model and its learning mechanism based on their
roots in psychological theory and their established success in capturing
infants’ behavior in empirical tasks. Importantly, the proposed
curiosity mechanism is theoretically compatible with classical optimal
1994; Vygotsky, 1980). According to these theories, learning is op-
timal in environments of intermediate novelty. Typically, these ap-
proaches have interpreted this intermediacy as information that is
neither too similar nor too different from what the learner has previ-
offers a new perspective: what constitutes optimal novelty changes
asthechildlearns.Thus, what is initially too novel to be useful be-
eltyasmodulatedbyitsplasticity.Theoptimal learningenvironment
is therefore related to subjective novelty, not objective complexity.
Critically, this insight may explain the conflicts in the extant litera-
intermediate novelty: the relationship between subjective novelty and
objective complexity is nonlinear. That is, different levels of objec-
tivecomplexity couldprovide anenvironmentof maximalsubjective
novelty, depending on the infant’s learning history. Developing robust
methodsoftappingsubjective noveltyininfantlookingtimetasks,in
particular individual differences, is therefore critical to understanding
decisions about what aspect of the environment to learn from, learning
can be maximal. Given recent work showing that infants can
explicitly structure their learning environment by asking their caregivers
for help (Goupil et al., 2016), this suggests that infants may
also implicitly optimize their own learning (for an early empirical
test of this prediction, see Twomey, Malem, & Westermann, 2016).
Second, in line with looking time studies showing that infants select
information systematically (Kidd et al., 2012, 2014), the model
chose stimuli of intermediate objective complexity. However, analyses
of the sequences chosen by the model predict that rather than
seeking out intermediate complexity at each learning event, infants
may switch systematically between more and less objectively com-
our account goes further than classical theories in which curiosity is
viewed as either a novelty-seeking or a novelty-minimizing behavior
(e.g., Loewenstein, 1994). Rather, our model predicts that infants’
visual exploration should exhibit both novelty-seeking and novelty-minimizing
components when novelty is viewed objectively, unifying
these theories in a single mechanism.

TABLE 2 Euclidean distances (ED) between successive stimuli for sequences
chosen in the curiosity condition of

Trial    Order A         Order B         Order C         Order D
number   (chosen × 1)    (chosen × 5)    (chosen × 11)   (chosen × 7)
         ED      Rank    ED      Rank    ED      Rank    ED      Rank
1        –       –       –       –       –       –       –       –
2        1.5885  1/7     1.5885  1/7     1.5885  1/7     1.5885  1/7
3        1.0974  3/6     1.0974  3/6     0.3971  6/6     0.3971  6/6
4        1.5885  1/5     1.5885  1/5     0.7942  3/5     0.7942  3/5
5        0.8717  3/4     0.904   2/4     0.904   1/4     0.904   1/4
6        0.5487  3/3     0.7942  1/3     1.5885  1/3     1.5885  1/3
7        0.7942  1/2     0.5742  1/2     1.1914  1/2     1.1914  1/2
8        0.5487  –       0.7942  –       0.7942  –       0.7942  –

FIGURE 6 Plot of the curiosity function, (i − o)o(1 − o)
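The per-trial ranks reported in Table 2 (for example 1/7 or 6/6) express how the chosen stimulus compares with every stimulus still available at that point. The sketch below shows one way to compute such ranks. The one-dimensional toy sequence is hypothetical, and giving tied distances the same rank is an assumption, since the tie-handling rule is not stated in this section.

```python
import math

def step_ranks(sequence):
    """For each transition return (ED, rank, n_remaining).
    Rank 1 means the chosen stimulus was the most distant of those remaining."""
    results = []
    for t in range(1, len(sequence)):
        prev, chosen = sequence[t - 1], sequence[t]
        remaining = sequence[t:]  # stimuli not yet seen, including the chosen one
        # Distinct distances from the previous stimulus, largest first.
        dists = sorted({round(math.dist(prev, s), 6) for s in remaining},
                       reverse=True)
        ed = round(math.dist(prev, chosen), 6)
        results.append((ed, dists.index(ed) + 1, len(remaining)))
    return results

# Hypothetical sequence: jump to the far extreme, then to its nearest neighbour.
toy_sequence = [(0.0,), (10.0,), (9.0,), (1.0,)]
ranks = step_ranks(toy_sequence)
```

In this toy sequence the first move is ranked 1 of 3 (maximal distance) and the second 2 of 2 (minimal distance), mirroring the maximize-then-minimize pattern described for the curious model.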
4.2 | A new approach to computational curiosity in
visual exploration
Thiswork contributes tocomputationalresearchin intrinsic motiva-
ingmodel based onin-the-moment,local decision-making without a
separate, top- down system for monitoring learning progress and/or
reward as generated by a discrete, engineered module that calculates
areward value usingtask-specificcomputations. Our modeldeparts
from this approach, showing that domain- general mechanisms can
produce the motivation to learn, performing a similar function to re-
wardwithout requiringa separatemodule; thatis,in ourmodel, “re-
ward” is part of the algorithm itself. Overall, then the current work
tion, and offers a broader account of the cognitive mechanisms that
may drive curiosity: learning that integrates a search for subjective
novelty modulated by the learner’s plasticity. Here, intrinsically motivated
information selection emerges from within the model by ex-
Overall, this neurocomputational model provides the first formal
account of curiosity-based learning in human infants, integrating subjective
novelty and intrinsic motivation mechanisms in a single model.
which infants select information to construct their own optimal learning
environment, and it provides a parsimonious mechanism by which this
it is possible that another one of the many potential mechanisms for
intrinsically motivated learning may take over later in development,
particularly once metacognition is established and language begins in
the current implementation of curiosity not only provides novel insight
tic theory of early intrinsically motivated visual learning.
This work was supported by the ESRC International Centre for
Language and Communicative Development (LuCiD), an ESRC
Future Research Leaders fellowship to KT and a British Academy/
Leverhulme Trust Senior Research Fellowship to GW. The support
of the Economic and Social Research Council (ES/L008955/1; ES/
N01703X/1)is gratefully acknowledged. Data and scripts are avail-
able on request from the authors. Portions of these data were pre-
sented at the 2015 5th International Conference on Development
andLearning and on EpigeneticRobotics,Providence, Rhode Island,
Katherine E. Twomey
Gert Westermann
Althaus, N., & Mareschal, D. (2013). Modeling cross-modal interac-
tions in early word learning. IEEE Transactions on Autonomous Mental
Development, 5, 288–297.
(2014). Intrinsic motivations and open- ended development in animals,
humans,androbots:Anoverview.Frontiers in Psychology, 5,985.
Baranes, A., & Oudeyer, P.-Y. (2013). Active learning of inverse models
with intrinsically motivated goal exploration in robots. Robotics and
Autonomous Systems, 61, 49–73.
Begus,K., Gliga, T.,& Southgate,V.(2014). Infants learnwhat theywant
to learn: Responding to infant pointing leads to superior learning. PLoS
ONE, 9, e108817.
Berlyne,D.E.(1960).Conflict, arousal, and curiosity.NewYork:McGraw-Hill.
rizationofobjectsinearlyinfancy.Child Development, 81, 884–897.
haviorand its neural foundations: A reinforcementlearning perspec-
tive. Cognition, 113, 262–280.
Bruner,J.D.,Goodnow,J.J., &Austin, G.A. (1972). Categories and cogni-
tion. In J.P. Spradley(Ed.), Culture and cognition (pp. 168–190). New
newborn infant. Cognition, 121, 127–132.
Capelier-Mourguy,A.,Twomey,K.E.,& Westermann,G.(2016,August). A
neurocomputational model of the effect of learned labels on infants’ object
fromnetworksandbabies.Philosophical Transactions of the Royal Society
of London Series B- Biological Sciences, 358,1205–1214.
Cottrell,G.W.,&Fleming,M. (1990).Facerecognitionusing unsupervised
feature extraction. In Proceedings of the International Neural Network
iar patterns relative to novel ones. Science, 146, 668–670.
Fantz, R.L., Ordy, J.M., & Udelf, M.S. (1962). Maturation of pattern vi-
sionin infants duringthe first sixmonths.Journal of Comparative and
Physiological Psychology, 55, 907–917.
Ferguson, K.T., Kulkofsky, S., Cashon, C.H., & Casasola, M. (2009). The
development ofspecialized processing of own-race faces in infancy.
Infancy, 14, 263–284.
Frank,M., Leitner,J.,Stollenga, M., Förster,A.,& Schmidhuber,J. (2014).
Curiosity driven reinforcement learning for motion planning on human-
oids. Frontiers in Neurorobotics, 7,25.
French, R.M., Mareschal, D., Mermillod, M., & Quinn, P.C. (2004).
The role of bottom-up processing in perceptual categorization by
3-to4-month-oldinfants:Simulationsanddata.Journal of Experimental
Psychology: General, 133, 382–397.
Gershkoff-Stowe,L., & Rakison, D.H. (2005). Building object categories in
developmental time.Mahwah,NJ:PsychologyPress.
Gliga, T., Volein, A., & Csibra, G. (2010).Verbal labels modulate percep-
tual object processing in 1- year- old children. Journal of Cognitive
Neuroscience, 22, 2781–2789.
names) for infant categorization: A neurocomputational approach.
Cognitive Science, 33, 709–738.
Gottlieb,J., Oudeyer,P.-Y.,Lopes,M.,&Baranes, A. (2013). Information-
seeking, curiosity, and attention: Computational and neural mecha-
nisms. Trends in Cognitive Sciences, 17,585–593.
whentheyknowtheydon’tknow.Proceedings of the National Academy
of Sciences, USA, 113, 3492–3496.
Hebb, D. (1949). The organization of behavior: A neuropsychological theory.
datawithneuralnetworks.Science, 313,504–507.
Horst,J.S.,Oakes,L.M.,& Madole,K.L.(2005). Whatdoesit looklikeand
Child Development, 76, 614–631.
Hurley, K.B., Kovack-Lesh, K.A., & Oakes, L.M. (2010). The influence of
pets on infants’ processing of cat and dog images. Infant Behavior and
Development, 33, 619–628.
Hurley,K.B., &Oakes,L.M. (2015). Experienceanddistribution of atten-
tion: Petexposure and infants’ scanning of animal images. Journal of
Cognition and Development, 16, 11–30.
Kagan,J.(1972).Motivesanddevelopment.Journal of Personality and Social
Psychology, 22,51–66.
osity. Neuron, 88, 449–460.
infants allocate attention to visual sequences that are neither too sim-
plenortoocomplex.PLoS ONE, 7, e36399.
Kidd,C., Piantadosi,S.T.,&Aslin,R.N. (2014).TheGoldilockseffectinin-
fant auditory attention. Child Development, 85,1795–1804.
Kinney,D.K.,&Kagan,J.(1976). Infantattentionto auditorydiscrepancy.
Child Development, 47,155–164.
Kovack-Lesh, K.A., McMurray,B., & Oakes, L.M. (2014). Four-month-old
ence and attentional strategy. Developmental Psychology, 50, 402–413.
Kovack-Lesh, K.A., & Oakes, L.M. (2007). Holdyour horses: How expo-
sure to different items influences infant categorization. Journal of
Experimental Child Psychology, 98, 69–93.
Lefort,M.,&Gepperth,A.(2015).Active learning of local predictable represen-
tations with artificial curiosity.Paperpresentedatthe5thInternational
Providence, RI.
pretation. Psychological Bulletin, 116,75–98.
Robust active binocular vision through intrinsically motivated learning.
Frontiers in Neurorobotics, 7, 20.
Mareschal, D.,& French, R. (2000). Mechanisms of categorization in in-
fancy. Infancy, 1,59–76.
opmental psychology. IEEE Transactions on Evolutionary Computation,
for autonomous mobile robots. Robotics and Autonomous Systems, 51,
Mather, E. (2013). Novelty, attention, and challenges fordevelopmental
psychology. Frontiers in Psychology, 4, 491.
Mather,E., & Plunkett,K.(2011).Sameitems, different order: Effectsof
temporalvariabilityoninfantcategorization.Cognition, 119, 438–447.
ment. Developmental Science, 6, 413–429.
Murakami,M., Kroger,B., Birkholz, P.,& Triesch,J.(2015). Seeing [u] aids
vocal learning: Babbling and imitation of vowels using a 3D vocal tract
model, reinforcement learning, and reservoir computing. Paper presented
Oakes,L.M., Kovack-Lesh,K.A.,& Horst,J.S. (2009). Twoarebetter than
one: Comparison influences infants’ visual recognition memory. Journal
of Experimental Child Psychology, 104, 124–131.
of computational approaches. Frontiers in Neurorobotics, 1, 6.
for autonomous mental development. IEEE Transactions on Evolutionary
Computation, 11(2),265–286.
Oudeyer, P.-Y., & Smith, L.B. (2016). Howevolution may work through
curiosity- driven developmental process. Topics in Cognitive Science, 8,
Piaget, J. (1952). The origins of intelligence in children (Vol.8). New York:
connectionist net. Connection Science, 4, 293–312.
tations of perceptually similar natural categories by 3- month- old and
4- month- old infants. Perception, 22,463–475.
dynamic featuresin a category context. Journal of Experimental Child
Psychology, 89, 1–30.
Rakison,D.H., & Butterworth, G.E. (1998).Infants’use of objectpartsin
earlycategorization.Developmental Psychology, 34, 49–62.
In A.H. Black & W.F.Prokasy (Eds.), Classical Conditioning II: Current
Research and Theory(pp.64–99).NewYork:Appleton-Century-Crofts.
Y.,& Botvinick, M.M. (2011). A neuralsignature of hierarchical rein-
forcement learning. Neuron, 71, 370–379.
allel distributed processing approach. Behavioral and Brain Sciences, 31,
tionsbyback-propagatingerrors.Nature, 323,533–536.
Schlesinger,M. (2013). Investigatingthe origins ofintrinsicmotivation in
human infants. In G. Baldassare& M. Mirolli (Eds.), Intrinsically moti-
vated learning in natural and artificial systems (pp. 367–392). Berlin:
Schlesinger,M.,&Amso,D. (2013).Imagefree-viewingasintrinsically-
image samples in infants and adults. Frontiers in Psychology, 4, 802.
infant robots. Cognitive Processing, 16,S100–S100.
Sokolov, E.N. (1963). Perception and the conditioned reflex. New York:
Son,J.Y.,Smith,L.B., & Goldstone, R.L. (2008). Simplicityandgeneraliza-
tion: Short-cutting abstraction in children’s object categorizations.
Cognition, 108, 626–638.
Thelen,E.,&Smith,L.B.(1994).A dynamic systems approach to the develop-
ment of cognition and action.Cambridge,MA:MITPress.
Thomas, M., & Karmiloff-Smith,A. (2003). Connectionist models of de-
velopment, developmental disorders, and individual differences.
In R.J. Sternberg,J. Lautrey & T.Lubart (Eds.), Models of intelligence:
International perspectives (pp. 133–150). Washington, DC: American
Twomey, K.E., Malem, B., & Westermann, G. (2016, May). Infants’ information
seeking in a category learning task. In K.E. Twomey
(chair), Understanding infants’ curiosity-based learning: Empirical and
computational approaches. Symposium presented at the XX Biennial
exemplars facilitate word learning. Infant and Child Development, 23,
Twomey, K.E., & Westermann, G. (2017). Labels shape pre-speech infants’
object representations. Infancy,
Vygotsky, L.S. (1980). Mind in society: The development of higher psychological
processes. Cambridge, MA: Harvard University Press.
Westermann, G., & Mareschal, D. (2004). From parts to wholes:
Mechanisms of development in infant visual object processing.
Infancy, 5, 131–151.
Westermann, G., & Mareschal, D. (2012). Mechanisms of developmental
change in infant categorization. Cognitive Development, 27, 367–382.
Westermann, G., & Mareschal, D. (2014). From perceptual to language-mediated
categorization. Philosophical Transactions of the Royal Society
B: Biological Sciences, 369, 20120391.
Younger, B.A. (1985). The segregation of items into categories by ten-month-old
infants. Child Development, 56, 1574–1583.
Younger, B.A., & Cohen, L.B. (1983). Infant perception of correlations
among attributes. Child Development, 54, 858–867.
Younger, B.A., & Cohen, L.B. (1986). Developmental change in infants’
perception of correlations among attributes. Child Development, 57,
How to cite this article: Twomey KE, Westermann G.
Curiosity-based learning in infants: a neurocomputational
approach. Dev Sci. 2017;e12629.
... 19 That is, the novelty of a stimulus may engage intrinsic motivation mechanisms that, in turn, drive organisms to invest cognitive resources in learning under conditions of uncertainty. 20 Preferences for novelty have been identified in numerous diverse species 21 and have been linked with both basic survival processes like locomotor adaptation 22 and high-level constructs like curiosity [23][24][25][26] . Understanding the determinants of infants' familiarity and novelty preferences, therefore, is important not only for methodological reasons but also for clarifying mechanisms of cognitive development. ...
... Notably, the current design has reasonable power to detect a three-way interaction between the main effects (>80% power for d = 0.2 or larger). Overall, the power analysis indicates that the current design has sufficient power to detect each of the three main predictions of the Hunter and Ames model.26 ...
Full-text available
Much of our basic understanding of cognitive and social processes in infancy relies on measures of looking time, and specifically on infants’ visual preference for a novel or familiar stimulus. However, despite being the foundation of many behavioral tasks in infant research, the determinants of infants’ visual preferences are poorly understood, and differences in the expression of preferences can be difficult to interpret. In this large-scale study, we test predictions from the Hunter and Ames model of infants' visual preferences. We investigate the effects of three factors predicted by this model to determine infants’ preference for novel versus familiar stimuli: age, stimulus familiarity, and stimulus complexity. Drawing from a large and diverse sample of infant participants (N = XX), this study will provide crucial empirical evidence for a robust and generalizable model of infant visual preferences, leading to a more solid theoretical foundation for understanding the mechanisms that underlie infants’ responses in common behavioral paradigms. Moreover, our findings will guide future studies that rely on infants' visual preferences to measure cognitive and social processes.
... We might start by describing essential propertiesthe "engineering specifications" of childhood learning. Early childhood learning is incredibly interactive (Fantz 1964;Gopnik et al. 1999;Begus et al. 2014;Goupil et al. 2016;Twomey and Westermann 2018). Children play, grabbing and manipulating objects, learning about the properties and affordances of their worlds. ...
Full-text available
As “scientists in the crib,” children learn through curiosity, tirelessly seeking novelty and information as they interact—really, play—with both physical objects and the people around them. This flexible capacity to learn about the world through intrinsically motivated interaction continues throughout life. How would we engineer an artificial, autonomous agent that learns in this way – one that flexibly interacts with its environment, and others within it, in order to learn as humans do? In this chapter, I will first motivate this question by describing important advances in artificial intelligence in the last decade, noting ways in which artificial learning within these methods are and are not like human learning. I will then give an overview of recent results in artificial intelligence aimed at replicating curiosity-driven interactive learning. I will then close by speculating on how AI that learns in this fashion could be used as fine-grained computational models of human learning.
... Whereas the study of curiosity in adults has a relatively long tradition, only recently has research begun to address the role of curiosity and active exploration in infants' knowledge acquisition (Kidd & Hayden, 2015;Oudeyer & Smith, 2016;Poli, Serino, Mars, & Hunnius, 2020;Smith, Jayaraman, Clerkin, & Yu, 2018;Twomey & Westermann, 2018). This early work suggests that infants are curious learners who actively navigate and structure their own learning, allocating their attention to the resources that allow them to maximize information gain to learn rapidly (Poli et al., 2020). ...
Full-text available
Recent research with adults indicates that curiosity induced by uncertainty enhances learning and memory outcomes and that the resolution of curiosity has a special role in curiosity-driven learning. However, the role of curiosity-based learning in early development is unclear. Here we presented 8-month-old infants with a novel looking time procedure to explore (a) whether uncertainty-induced curiosity enhances learning of incidental information and (b) whether uncertainty-induced curiosity leads infants to seek uncertainty resolution over novelty. In Experiment 1, infants saw blurred images to induce curiosity (Curiosity sequence) or a clear image (Non-curiosity sequence) followed by presentation of incidental objects. Despite looking equally to the incidental objects in both sequences, in a subsequent object recognition phase infants looked longer to incidental objects presented in the Non-curiosity condition compared with the Curiosity condition, indicating that curiosity induced by blurred pictures enhanced the processing of the incidental object, leading to a novelty preference for the incidental object shown in the Non-Curiosity condition. In Experiment 2, a blurred picture of a novel toy was first presented, followed by its corresponding clear picture paired with a clear picture of a new novel toy side by side. Infants showed no preference for either image, providing no evidence for a drive to resolve uncertainty. Overall, the current experiments suggest that curiosity has a broad attention-enhancing effect in infancy. Taking into account existing studies with older children and adults, we propose a developmental change in the function of curiosity, from this attentional enhancement to more goal-directed information seeking in older children and adults.
... Furthermore, it has been suggested that active learning is driven by a goal to maximize learning progress by interacting with the environment in a novel manner 196,197 . Supporting this line of thought, computational modelling approaches that compared presenting stimuli in a fixed order or allowing the model to choose its own input showed that maximal learning happens when the model can maximize stimulus novelty relative to its internal states 198 . This work emphasized the importance of the interaction between the structure of the environment and the previously acquired knowledge of the learner. ...
The desire to reduce the dependence on curated, labeled datasets and to leverage the vast quantities of unlabeled data has triggered renewed interest in unsupervised (or self-supervised) learning algorithms. Despite improved performance due to approaches such as the identification of disentangled latent representations, contrastive learning and clustering optimizations, unsupervised machine learning still falls short of its hypothesized potential as a breakthrough paradigm enabling generally intelligent systems. Inspiration from cognitive (neuro)science has been based mostly on adult learners with access to labels and a vast amount of prior knowledge. To push unsupervised machine learning forward, we argue that developmental science of infant cognition might hold the key to unlocking the next generation of unsupervised learning approaches. We identify three crucial factors enabling infants’ quality and speed of learning: (1) babies’ information processing is guided and constrained; (2) babies are learning from diverse, multimodal inputs; and (3) babies’ input is shaped by development and active learning. We assess the extent to which these insights from infant learning have already been exploited in machine learning, examine how closely these implementations resemble the core insights, and propose how further adoption of these factors can give rise to previously unseen performance levels in unsupervised learning. Unsupervised machine learning algorithms reduce the dependence on curated, labeled datasets that are characteristic of supervised machine learning. The authors argue that the developmental science of infant cognition could inform the design of unsupervised machine learning approaches.
How does cognition develop in infants, children and adolescents? This handbook presents a cutting-edge overview of the field of cognitive development, spanning basic methodology, key domain-based findings and applications. Part One covers the neurobiological constraints and laws of brain development, while Part Two covers the fundamentals of cognitive development from birth to adulthood: object, number, categorization, reasoning, decision-making and socioemotional cognition. The final Part Three covers educational and school-learning domains, including numeracy, literacy, scientific reasoning skills, working memory and executive skills, metacognition, curiosity-driven active learning and more. Featuring chapters written by the world's leading scholars in experimental and developmental psychology, as well as in basic neurobiology, cognitive neuroscience, computational modelling and developmental robotics, this collection is the most comprehensive reference work to date on cognitive development of the twenty-first century. It will be a vital resource for scholars and graduate students in developmental psychology, neuroeducation and the cognitive sciences.
Children start to communicate and use language in social interactions from a very young age. This allows them to experiment with their developing linguistic knowledge and receive valuable feedback from their – often more knowledgeable – interlocutors. While research in language acquisition has focused a great deal on children's ability to learn from the linguistic input or social cues, little work, in comparison, has investigated the nature and role of Communicative Feedback, a process that results from children and caregivers trying to coordinate mutual understanding. In this work, we draw on insights from theories of communicative coordination to formalize a mechanism for language acquisition: We argue that children can improve their linguistic knowledge in conversation by leveraging explicit or implicit signals of communication success or failure. This new formalization provides a common framework for several lines of research in child development that have been pursued separately. Further, it points towards several gaps in the literature that, we believe, should be addressed in future research in order to achieve a more complete understanding of language acquisition within and through social interaction.
Early object name learning is often conceptualized as a problem of mapping heard names to referents. However, infants do not hear object names as discrete events but rather in extended interactions organized around goal-directed actions on objects. The present study examined the statistical structure of the nonlinguistic events that surround parent naming of objects. Parents and 12-month-old infants were left alone in a room for 10 minutes with 32 objects available for exploration. Parent and infant handling of objects and parent naming of objects were coded. The four measured statistics were from measures used in the study of coherent discourse: (i) a frequency distribution in which actions were frequently directed to a few objects and more rarely to other objects; (ii) repeated returns to the high-frequency objects over the 10-minute play period; (iii) clustered repetitions and continuity of actions on objects; and (iv) structured networks of transitions among objects in play that connected all the played-with objects. Parent naming was infrequent but related to the statistics of object-directed actions. The implications of the discourse-like stream of actions are discussed in terms of learning mechanisms that could support rapid learning of object names from relatively few name-object co-occurrences. © 2022 The Author(s). Published with license by Taylor & Francis Group, LLC.
Although curiosity has huge implications for human creativity and learning, its evolutionary roots and function in animals remain poorly understood. Modern humans, who lack natural predators, thrive with curiosity, but our ancestors faced more hazardous environments that would not necessarily favor individual curiosity. Instead, being curious may have undergone selection in interaction with sociality. Our closest living relatives, the great apes (henceforth apes), evolved under conditions more like those faced by human ancestors and as such can help us understand the functions of curiosity and its expression in non‐human species. In this study, we defined curiosity as a combination of behavioral traits such as neophilia, exploration diversity, and prolonged interest in exploring novelty, and compared it under similar captive environments across four ape species (N = 101): Pan troglodytes, Pan paniscus, Pongo abelii, and Pongo pygmaeus. Results revealed that curiosity followed a linear gradient across the four species in accordance with their sociality. We propose the social curiosity hypothesis to explain the observed pattern: individuals in highly social species, like bonobos and chimpanzees, are regularly accompanied by conspecifics and are thereby accustomed to an abundance of social cues, leading to inhibited curiosity when alone, compared to more solitary orangutans. As such, our study implies that ape curiosity evolved interlinked with sociality. Further, a subset of the sample (N = 46) enabled us to examine whether curiosity benefits problem‐solving skills, but our data did not support such a link. Social influences on intrinsic curiosity in great apes: The ape species in our sample (chimpanzees, bonobos, Sumatran and Bornean orangutans) differ in curiosity but not in problem‐solving skills. The less sociable a species of great ape is in the wild, the more curious the individual when tested alone.
Ape curiosity may have evolved interlinked with a species' social system.
Humans constantly search for and use information to solve a wide range of problems related to survival, social interactions, and learning. While it is clear that curiosity and the drive for knowledge occupy a central role in defining what being human means to ourselves, where does this desire to know the unknown come from? What is its purpose? And how does it operate? These are some of the core questions this book seeks to answer by showcasing new and exciting research on human information-seeking. The volume brings together perspectives from leading researchers at the cutting edge of the cognitive sciences, working on human brains and behavior within psychology, computer science, and neuroscience. These vital connections between disciplines will continue to lead to further breakthroughs in our understanding of human cognition.
Infants rapidly learn both linguistic and nonlinguistic representations of their environment, and begin to link these from around six months. While there is an increasing body of evidence for the effect of labels heard in-task on infants’ online processing, whether infants’ learned linguistic representations shape learned nonlinguistic representations is unclear. In the current study 10-month-old infants were trained over the course of a week with two 3D objects, one labeled and one unlabeled. Infants then took part in a looking time task in which 2D images of the objects were presented individually in a silent familiarization phase, followed by a preferential looking trial. During the critical familiarization phase, infants looked for longer at the previously labeled stimulus than the unlabeled stimulus, suggesting that learning a label for an object had shaped infants’ representations as indexed by looking times. We interpret these results in terms of label activation and novelty response accounts, and discuss implications for our understanding of early representational development.
Conference Paper
In this article, we present some preliminary work on integrating an artificial curiosity mechanism in PROPRE, a generic and modular neural architecture, to obtain online, open-ended and active learning of a sensory-motor space in which large areas can be unlearnable. PROPRE combines the projection of the input motor flow, using a self-organizing map, with the regression of the sensory output flow from this projection representation, using a linear regression. The main feature of PROPRE is the use of a predictability module that provides an interestingness measure for the current motor stimulus depending on a simple evaluation of the sensory prediction quality. This measure modulates the projection learning so as to favor the representations that predict the output better than a local average. In particular, this leads to the learning of local representations where an input/output relationship is defined [1]. In this article, we propose an artificial curiosity mechanism based on the monitoring of learning progress, as proposed in [2], in the neighborhood of each local representation. Thus, PROPRE simultaneously learns interesting representations of the input flow (depending on their capacity to predict the output) and actively explores this input space where the learning progress is highest. We illustrate our architecture on the learning of a direct model of an arm whose hand can only be perceived in a restricted visual space. The modulation of the projection learning leads to better performance, and the use of the curiosity mechanism provides quicker learning and even improves the final performance.
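The learning-progress idea mentioned in this abstract can be illustrated with a minimal sketch. It is not the PROPRE implementation: the function names, the fixed window size, and the three example regions are assumptions made for illustration only. Each region keeps a history of prediction errors, progress is estimated as the drop in mean error between two consecutive windows, and the learner picks the region where error is falling fastest, which steers it away from both mastered and unlearnable regions.

```python
def learning_progress(errors, window=5):
    """Drop in mean prediction error between the previous and the latest window.

    Hypothetical helper: positive values mean the error is decreasing.
    """
    errors = list(errors)
    if len(errors) < 2 * window:
        return 0.0  # not enough samples to estimate progress
    older = errors[-2 * window:-window]
    recent = errors[-window:]
    return sum(older) / window - sum(recent) / window


def select_region(error_histories, window=5):
    """Return the index of the region with the highest learning progress."""
    progress = [learning_progress(h, window) for h in error_histories]
    return max(range(len(progress)), key=progress.__getitem__)


# Three illustrative regions (made-up error traces, not data from any study).
histories = [
    [1.0, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1],  # learnable: error falling
    [0.05] * 10,                                           # already learned: flat, low
    [0.9, 1.1, 0.9, 1.1, 0.9, 1.1, 0.9, 1.1, 0.9, 1.1],   # unlearnable: flat, high
]
chosen = select_region(histories)
```

In this toy setting the first region is selected, because it is the only one where the error is still decreasing; a raw-error criterion would instead be drawn to the unlearnable third region.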
See also conference proceedings paper (Capelier-Mourguy, Twomey & Westermann, 2016)
Uncertainty monitoring is a core property of metacognition, allowing individuals to adapt their decision-making strategies depending on the state of their knowledge. Although it has been argued that other animals share these metacognitive abilities, only humans seem to possess the ability to explicitly communicate their own uncertainty to others. It remains unknown whether this capacity is present early in development, or whether it emerges later with the ability to verbally report one's own mental states. Here, using a nonverbal memory-monitoring paradigm, we show that 20-month-olds can monitor and report their own uncertainty. Infants had to remember the location of a hidden toy before pointing to indicate where they wanted to recover it. In an experimental group, infants were given the possibility to ask for help through nonverbal communication when they had forgotten the toy location. Compared with a control group in which infants had no other option but to decide by themselves, infants given the opportunity to ask for help used this option strategically to improve their performance. Asking for help was used selectively to avoid making errors and to decline difficult choices. These results demonstrate that infants are able to successfully monitor their own uncertainty and share this information with others to fulfill their goals.
Conference Paper
We present a model of imitative vocal learning consisting of two stages. First, the infant is exposed to the ambient language and forms auditory knowledge of the speech items to be acquired. Second, the infant attempts to imitate these speech items and thereby learns to control the articulators for speech production. We model these processes using a recurrent neural network and a realistic vocal tract model. We show that vowel production can be successfully learnt by imitation. Moreover, we find that acquisition of [u] is impaired if visual information is discarded during imitation. This might give sighted infants an advantage over blind infants during vocal learning, which is in agreement with experimental evidence.
It has become clear to researchers in robotics and adaptive behaviour that current approaches are yielding systems with limited autonomy and capacity for self-improvement. To learn autonomously and in a cumulative fashion is one of the hallmarks of intelligence, and we know that higher mammals engage in exploratory activities that are not directed to pursue goals of immediate relevance for survival and reproduction but are instead driven by intrinsic motivations such as curiosity, interest in novel stimuli or surprising events, and interest in learning new behaviours. The adaptive value of such intrinsically motivated activities lies in the fact that they allow the cumulative acquisition of knowledge and skills that can be used later to accomplish fitness-enhancing goals. Intrinsic motivations continue during adulthood, and in humans they underlie lifelong learning, artistic creativity, and scientific discovery, while they are also the basis for processes that strongly affect human well-being, such as the sense of competence, self-determination, and self-esteem. This book has two aims: to present the state of the art in research on intrinsically motivated learning, and to identify the related scientific and technological open challenges and most promising research directions. The book introduces the concept of intrinsic motivation in artificial systems, reviews the relevant literature, offers insights from the neural and behavioural sciences, and presents novel tools for research.
The book is organized into six parts: the chapters in Part I give general overviews on the concept of intrinsic motivations, their function, and possible mechanisms for implementing them; Parts II, III, and IV focus on three classes of intrinsic motivation mechanisms, those based on predictors, on novelty, and on competence; Part V discusses mechanisms that are complementary to intrinsic motivations; and Part VI introduces tools and experimental frameworks for investigating intrinsic motivations. The contributing authors are among the pioneers carrying out fundamental work on this topic, drawn from related disciplines such as artificial intelligence, robotics, artificial life, evolution, machine learning, developmental psychology, cognitive science, and neuroscience. The book will be of value to graduate students and academic researchers in these domains, and to engineers engaged with the design of autonomous, adaptive robots.