CHAPTER 12

Multisensory Perception: From Integration to Remapping

Marc O. Ernst and Massimiliano Di Luca

INTRODUCTION

The brain receives information about the

environment from all the sensory modalities,

including vision, touch, and audition. To

interact efﬁciently with the environment, this

information must eventually converge to form a

reliable and accurate multimodal percept. This

process is often complicated by the existence

of noise at every level of signal processing,

which makes the sensory information derived

from the world unreliable and inaccurate. We

deﬁne reliability as the inverse variance of

the probability distribution that describes the

information a sensory signal contributes to

the perceptual estimation process. In contrast,

accuracy is deﬁned as the probability with

which the sensory signal truly represents the

magnitude of the real-world physical property

that it reﬂects. In other words, it is inversely

related to the probability of a sensory signal

being biased with respect to the world property.

There are several ways in which the nervous

system may minimize the negative consequences

of noise in terms of reliability and accuracy.

Two key strategies are to combine redundant

sensory estimates and to use prior knowledge.

There is behavioral evidence that the human

nervous system employs both of these strategies

to reduce the adverse effects of noise and thus to

improve perceptual estimates.

In this chapter, we elaborate further on how

these strategies may be used by the nervous

system to obtain the best possible estimates

from noisy signals. We ﬁrst describe how

weighted averaging can increase the reliability

of sensory estimates, which is the beneﬁt of

multisensory integration. Then, we point out

that integration can also come at a cost of

introducing inaccuracy in the sensory estimates.

This shows that there is a need to balance

the beneﬁts and costs of integration. This is

done using the Bayesian approach, with a joint

likelihood function representing the reliability

of the sensory estimates (e.g., $\hat{S}_V$ and $\hat{S}_H$, for

visual and haptic sensory estimates) and a

joint prior probability distribution providing

the co-occurrence statistics of sensory signals

$p(S_V, S_H)$, that is, the prior probability of

jointly encountering an ensemble of sensory

signals derived from the world. This framework

naturally leads to a continuum of integration

between fusion and segregation. We further

show how this framework can be used to

model the breakdown of integration by having

the joint prior conditioned on multisensory

discordance (i.e., a separation of the sensory

signals in time, space, or some other measure

of similarity). If the multisensory signals differ
consistently over a period of time, for example because they
are persistently inaccurate, the multisensory estimates will
be recalibrated. The

rate of recalibration can be described using a

Kalman-ﬁlter model, which can also be derived

from the Bayesian approach. We conclude by


proposing how integration and recalibration

can be jointly described under this common

approach.

MULTISENSORY INTEGRATION

For estimating a specific environmental property, such as the size of an
object in the world $S_W$, there are often multiple sources of sensory
information available. For example, an object's size can be estimated by
sight and touch (haptics), $S_V$ and $S_H$. Typical models of sensory
integration assume unbiased (accurate) sensory signals (i.e., $S_V = S_H$)
with normally distributed

noise sources that are independent, a situation

in which sensory integration is beneﬁcial (see

Chapter 1; Landy, Maloney, Johnston, & Young,

1995). For the estimation of an object’s size from

vision and touch, the assumption of independent

noise sources is likely to be true since most of

the neuronal processing for sensory signals, that

is, their transmission from sensory transducers

to the brain, is largely independent. As was

introduced in Chapter 1, Figure 12.1 illustrates

the optimal mechanism of sensory combination

given these assumptions and given that the goal is

to compute a minimum-variance estimate. This

can be considered the standard model of sensory

integration. The likelihood functions represent

two independent estimates of size, the visual

size estimate $\hat{S}_V$ and the haptic size estimate $\hat{S}_H$,
based on sensory measurements ($z_V$, $z_H$) that are corrupted by noise
(with standard deviations $\sigma_V$ and $\sigma_H$). The integrated
multisensory estimate $\hat{S}_{VH}$ is a weighted average of the individual
sensory estimates with weights $w_V$ and $w_H$ that sum up to unity
(Cochran, 1937):

$$\hat{S}_{VH} = w_V \hat{S}_V + w_H \hat{S}_H, \quad \text{where } w_V + w_H = 1. \tag{12.1}$$

[Figure 12.1: plot of probability against size showing the visual, haptic, and combined likelihood functions, with peaks at $\hat{S}_V$, $\hat{S}_H$, and $\hat{S}_{VH}$.]

Figure 12.1 Schematic representation of the likelihood functions of the
individual visual and haptic size estimates $\hat{S}_V$ and $\hat{S}_H$ and
of the combined visual-haptic size estimate $\hat{S}_{VH}$, which is a
weighted average according to Eq. 12.1. The variance associated with the
visual-haptic distribution is less than either of the two individual
estimates (Eq. 12.3). (Adapted from Ernst & Banks, 2002.)

To achieve optimal performance, the chosen weights need to be proportional
to the reliability $r$, which is defined as the inverse of the signal
variance:

$$w_j = \frac{r_j}{\sum_i r_i}, \quad \text{with } r_i = \frac{1}{\sigma_i^2}. \tag{12.2}$$

The indices $i$ and $j$ refer to the sensory modalities ($V$, $H$). The
modality that provides

more reliable information in a given situation

is given a higher weight, and so has a greater

inﬂuence on the ﬁnal percept. In the example

shown in Figure 12.1, visual information about

the size of the object is four times more

reliable than the haptic information. Therefore,

the combined estimate (the weighted sum) is

“closer” to the visual estimate than the haptic one

(in the present example the visual weight is 0.8

according to Eq. 12.2). In another circumstance

where the haptic modality might provide a

more reliable estimate, the situation would be

reversed.

Given this weighting scheme, the beneﬁt of

integration is that the variance of the combined

estimate from vision and touch is less than that

of either of the individual estimates that are

fed into the averaging process. Therefore, the

combined estimate arising from integration of

multiple sources of independent information

shows greater reliability and diminished effects

of noise. Mathematically, this is expressed by
the combined reliability $r$ being the sum of the
individual reliabilities:

$$r = \sum_i r_i. \tag{12.3}$$

226 BEHAVIORAL STUDIES

Given that all estimates are unbiased, this integration scheme can be
considered statistically optimal, since it provides the lowest possible
variance of its combined estimate. Thus, this form of sensory combination
is the best way to reduce uncertainty given the assumptions that all
estimates are accurate and contain Gaussian-distributed, independent noise
(Chapter 1).

Even if the noise distributions of the individual

signals displayed a correlation, averaging of

sensory information would still be advantageous

and the combined estimate would still be more

reliable than each individual estimate alone

(Oruç, Maloney, & Landy, 2003).
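As a minimal sketch of Eqs. 12.1 to 12.3, the following Python snippet computes reliability-based weights and the combined estimate. The function names and the numerical values (size estimates of 10 and 12 units, standard deviations of 1 and 2) are illustrative assumptions, not values from the studies cited here. The standard deviations are chosen so that vision is four times more reliable than touch, as in the example of Figure 12.1.

```python
def reliability(sigma):
    """Reliability is the inverse variance, r = 1 / sigma**2 (Eq. 12.2)."""
    return 1.0 / sigma ** 2


def integrate(estimates, sigmas):
    """Reliability-weighted average of independent, unbiased estimates.

    Returns (combined_estimate, combined_sigma, weights).
    """
    rs = [reliability(s) for s in sigmas]
    total_r = sum(rs)                                           # Eq. 12.3
    weights = [r / total_r for r in rs]                         # Eq. 12.2
    combined = sum(w * e for w, e in zip(weights, estimates))   # Eq. 12.1
    combined_sigma = (1.0 / total_r) ** 0.5
    return combined, combined_sigma, weights


# Hypothetical visual and haptic estimates; sigma_H is twice sigma_V,
# so the visual reliability is four times the haptic reliability.
s_vh, sigma_vh, (w_v, w_h) = integrate([10.0, 12.0], [1.0, 2.0])
print(w_v, w_h, s_vh, sigma_vh)
```

With these assumed numbers the visual weight comes out at 0.8, and the combined standard deviation is smaller than either of the individual ones, which is the benefit of integration described above.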

Several studies have tested this integration scheme empirically (e.g.,
Ghahramani, Wolpert, & Jordan, 1997; van Beers, Sittig, &
Denier van der Gon, 1998, 1999). In 2002,

Ernst and Banks showed that humans integrate

visual and haptic information in such a

statistically optimal fashion. It has further been

demonstrated that this ﬁnding of optimality

also holds across and within other sensory

modalities, for example, vision and audition

(e.g., Alais & Burr, 2004; Hillis, Watt, Landy, &

Banks, 2004; Knill & Saunders, 2003; Landy &

Kojima, 2001). Thus, weighted averaging of

sensory information appears to be a general

strategy employed by the perceptual system to

decrease the detrimental effects of noise.

If redundant sources of sensory information

are absent or if the noises of these sources

are perfectly correlated, averaging different

estimates is not an option to reduce noise.

However, because the world is structured quite

regularly, the nervous system can use prior

knowledge about such statistical regularities

to reduce the uncertainty and ambiguity in

neuronal signals. Prior knowledge can also

be formalized as a probability distribution in

a manner similar to that for sensory signals

corrupted by noise. For example, let us consider

the distribution of velocities for all objects.

While some objects in our environment do move

around occasionally, from a purely statistical

point of view, on average most objects are likely

to remain stationary at most times, that is,

the velocity of an object is most likely to be

zero. Thus, a reasonable probability distribution

describing the velocity of all objects is centered at

zero with some variance (Stocker & Simoncelli,
2006; Weiss, Simoncelli, & Adelson, 2002).

This prior knowledge can be combined with

unreliable sensory evidence in order to minimize

the uncertainty in the ﬁnal velocity estimate. If

all the probability distributions are Gaussian,

using Bayes’ rule it is possible to derive that

the combined posterior estimate (the maximum

a posteriori or MAP estimate) is a weighted

average as well; however, now it is a weighted

average between the prior and the likelihood

function, that is, the sensory evidence:

$$\hat{S}_{\text{MAP}} = w_{\text{likelihood}}\,\hat{S}_{\text{likelihood}} + w_{\text{prior}}\,\hat{S}_{\text{prior}}. \tag{12.4}$$

The reliability of the MAP estimate then is given by:

$$r_{\text{MAP}} = r_{\text{likelihood}} + r_{\text{prior}}. \tag{12.5}$$
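Eqs. 12.4 and 12.5 can be illustrated with a short Python sketch for the Gaussian case. It assumes that the weights are proportional to the reliabilities, as in Eq. 12.2, which holds when both the prior and the likelihood are Gaussian. The function name and all numbers (a measured velocity of 4 units, a zero-centered prior, equal standard deviations of 2) are hypothetical choices for illustration only.

```python
def map_estimate(z, sigma_likelihood, prior_mean, sigma_prior):
    """MAP estimate for a Gaussian likelihood and a Gaussian prior.

    Returns (s_map, r_map), following Eqs. 12.4 and 12.5.
    """
    r_likelihood = 1.0 / sigma_likelihood ** 2
    r_prior = 1.0 / sigma_prior ** 2
    r_map = r_likelihood + r_prior                      # Eq. 12.5
    w_likelihood = r_likelihood / r_map
    w_prior = r_prior / r_map
    s_map = w_likelihood * z + w_prior * prior_mean     # Eq. 12.4
    return s_map, r_map


# Hypothetical velocity measurement combined with a prior centered at zero.
s_map, r_map = map_estimate(z=4.0, sigma_likelihood=2.0,
                            prior_mean=0.0, sigma_prior=2.0)
print(s_map, r_map)
```

Because the prior and the likelihood are equally reliable in this assumed example, the MAP estimate lies halfway between the measurement and the prior mean, and its reliability is twice that of either source.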

The principles of weighted averaging and the

use of prior knowledge can be combined and

placed into a larger mathematical framework

of optimal statistical estimation and decision

theory, known as Bayesian decision theory

(Chapter 1; Mamassian, Landy, & Maloney,

2002). This approach is illustrated in Figure 12.2

in the context of the action-perception loop.

Psychophysical experiments have conﬁrmed that

at least some aspects of human perception and

action that deal with noise and uncertainty can

be described well using this Bayesian framework

(e.g., Adams, Graf, & Ernst, 2004; Kersten,

Mamassian, & Yuille, 2004; Körding & Wolpert,

2004; Stocker & Simoncelli, 2006).

THE COST OF INTEGRATION

While weighted averaging of sensory measurements or use of prior knowledge
has the benefit of reducing noise and uncertainty in perceptual estimates,
it also incurs a potential cost. The cost is the introduction of potential
biases into perception. Biases can occur, for example, when the sensory
estimates ($\hat{S}_V$ and $\hat{S}_H$) as defined by the likelihood
functions and thus sensory signals ($S_V$ and $S_H$) do not accurately
represent the physical stimuli ($S_{W_V}$ and $S_{W_H}$). Accuracy of

the sensory estimates was one of the assumptions made in the previous
section for deriving the optimal integration scheme (Chapter 1). However,
if the estimates are no longer accurate due to external or internal
influences on the signals, the potential cost of biases has to be
considered.¹ Examples of sources of inaccuracies in signals may be muscle
fatigue, variance in grip posture, or wearing gloves. Additionally there
might be glasses that distort the visual image and so affect visual
position estimates, or effects of temperature or humidity that affect sound
propagation and thus affect auditory estimates, to name just a few.
Figure 12.3A illustrates some examples of processes that might affect the
accuracy of visual-haptic size estimates. The top panel shows sensory
signals ($S_V$ and $S_H$) that are accurate with respect to the world
property $S_W$ (so $S_W = S_{W_V} = S_{W_H} = S_V = S_H$) to be estimated
(i.e., the size of the object at a specific position), followed by three
examples of $S_V$ and $S_H$ signals that are inaccurate and contain an
additional bias $B$ (i.e., $S_V = S_{W_V} + B_V$ and
$S_H = S_{W_H} + B_H$). For now we assume that the signals are derived from
the same location, so the visual and haptic sizes to be estimated are
identical: $S_W = S_{W_V} = S_{W_H}$.

¹ Throughout the paper we are only considering additive biases, although
the general scheme can be extended to other forms of biases, for example,
multiplicative biases.

[Figure 12.2: diagram of the action-perception loop. Environment: sensory signals from the environment reach the organism through the senses, and the organism changes the environment through interaction via its effectors. Organism: Sensory Processing (Likelihood Function) and Prior Knowledge (Prior Distribution) are combined by Bayes' Rule into the Posterior Distribution; a Decision Rule with a Gain/Loss Function (Goal) yields the Response (Action).]

Figure 12.2 The action/perception-loop schematically illustrates the
processing of information according to Bayesian decision theory. Multiple
sensory signals are averaged during sensory processing and then combined
with prior knowledge, to derive the most reliable, unbiased estimate
(posterior) that can be used in a task that has a goal as defined by a gain
or loss function. (Adapted from Ernst & Bülthoff, 2004.)

If the sensory signals $S_V$ and $S_H$, and hence the sensory estimates
derived from these signals, $\hat{S}_V$ and $\hat{S}_H$,² are inaccurate,
that is, if they are biased by $B = (B_V, B_H)$ with respect to the world
property $S_W$ or with respect to each other (sensory discrepancy
$D = S_V - S_H = (S_W + B_V) - (S_W + B_H)$), their respective values need
not necessarily agree even when they are derived from the same location
($S_W = S_{W_V} = S_{W_H}$). In such a case, weighted averaging of the
estimates derived from these biased signals will inevitably also bias the
combined estimate. To avoid the cost of biased estimates, the perceptual
system must be able to infer how accurate the signals are. This is a
difficult problem that cannot be determined directly from the sensory
estimates, because these estimates do not carry information about their own
accuracy. Reliability, on the other hand, which is the inverse variance
associated with the estimates, can be directly assessed from sensory
measurements. Furthermore, the mere existence of a discrepancy between
sensory estimates $\hat{D} = \hat{S}_V - \hat{S}_H$ does not reveal whether
some of the estimates are inaccurate, because even when they are accurate,
the presence of noise in the estimation process will cause the respective
peaks of their likelihood functions to disagree slightly (as illustrated in
Fig. 12.1). We will discuss later in the chapter how persistently biased
estimates may be avoided through the process of recalibration.

² Variables with a hat always denote noisy sensory estimates, whereas
variables without a hat represent world signals from which the sensory
estimates are derived.

[Figure 12.3: two panels. (A) Signal discrepancy ($S_W = S_{W_V} = S_{W_H}$): sketches of accurate signals and of signals disturbed by glasses, gloves, and grip posture, with the resulting ($S_V$, $S_H$)-distribution below. (B) Spatial discordance $\Delta x$ ($S_{W_V} \neq S_{W_H}$): visual and haptic (fingertip) signals taken at locations offset by $\Delta x$, with the resulting ($S_V$, $S_H$)-distribution below.]

Figure 12.3 (A) Visual and haptic size signals $S_V$ and $S_H$ measured
near the same location on an object at which the true size is $S_W$. In
this case visual and haptic sizes are identical ($S_{W_V} = S_{W_H}$). The
sensory signals can be corrupted by various disturbances, which affect
their accuracy, such as different grip postures, glasses, or gloves.
(B) Visual and haptic size signals $S_V$ and $S_H$ derived from locations
on an object in close proximity (offset horizontally by $\Delta x$). In
this case visual and haptic sizes may differ slightly
($S_{W_V} \neq S_{W_H}$). Thus, the visual and haptic size signals will
also differ slightly due to variations in the shape of the object. However,
in general there will still be a correlation between the $S_V$ and $S_H$
signals as the object's size varies smoothly. Most probably this
correlation will decrease with increasing $\Delta x$. In both cases, the
lower panel labeled ($S_V$, $S_H$)-distribution provides the co-occurrence
statistics of the signal values $S_V$ and $S_H$ that build the basis for
the prior used for multisensory integration.
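The two points made in this passage can be sketched numerically: additive biases in the signals carry over into the weighted average, and a noisy discrepancy estimate is expected even when both signals are accurate. The snippet below is only an illustration under the chapter's assumptions of additive biases and independent Gaussian noise. The function names and numbers (a visual bias of 1 unit, for example from distorting glasses, and standard deviations of 1 and 2) are hypothetical.

```python
def combined_bias(bias_v, bias_h, sigma_v, sigma_h):
    """Bias of the reliability-weighted average of two additively biased signals."""
    r_v = 1.0 / sigma_v ** 2
    r_h = 1.0 / sigma_h ** 2
    w_v = r_v / (r_v + r_h)
    w_h = 1.0 - w_v
    # The weighted average is linear, so the biases are averaged with the same weights.
    return w_v * bias_v + w_h * bias_h


def discrepancy_sigma(sigma_v, sigma_h):
    """Standard deviation of the discrepancy estimate (visual minus haptic estimate)
    when the two noise sources are independent."""
    return (sigma_v ** 2 + sigma_h ** 2) ** 0.5


# Hypothetical case: only the visual signal is biased.
bias_vh = combined_bias(bias_v=1.0, bias_h=0.0, sigma_v=1.0, sigma_h=2.0)
sigma_d = discrepancy_sigma(sigma_v=1.0, sigma_h=2.0)
print(bias_vh, sigma_d)
```

In this assumed example the more reliable visual signal passes most of its bias on to the combined estimate, while the spread of the discrepancy estimate is larger than the bias itself, so a single observed discrepancy says little about whether a signal is inaccurate.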

A problem regarding potential biases also

exists while using prior knowledge to reduce

perceptual uncertainty. If the prior probability

distribution does not accurately describe the

statistics of the current environment and if

the mean of the prior distribution differs from

the mean of the sensory measurements, it will

introduce a bias in the ﬁnal perceptual estimates.

Evidence for this phenomenon can be found

in several perceptual illusions, for instance, the

one illustrated in Figure 12.4. Both pictures

show footprints in the sand. However, most

people see the left image as an indentation in

the sand, whereas they see the right image as

if it were embossed or raised from the surface.

The reason for this counterintuitive perception

is the inherently ambiguous nature of the image

and the need to make certain prior assumptions

in order to interpret it (Rock, 1983). The prior

assumption we make in this case is that the light

source in the image is placed above the surface (Brewster, 1826;
Mamassian & Goutcher, 2001). The assumption reflects our common world
experience of always having artificial or natural light from above (Dror,
Willsky, & Adelson, 2004). In the illustration, this assumption is only
correct for the left image. The illusion arises when one views the right
image. In the right image the footprints are actually illuminated from
below. Thus, making prior assumptions about light from above, that is,
using an inappropriate prior for the current situation, forces our
perception toward a bias that causes us to see the footprints raised from
the surface.

Figure 12.4 Effect of the light-from-above prior on perception using
ambiguous images. The left and right images show footprints in the sand. In
the left image the light illuminating the scene is actually coming from
above, and the footprint is correctly seen as an indentation. In the right
image, which is the left image presented upside down, the light is coming
from below. Employing the light-from-above prior in this situation causes
the footprint to be seen as embossed or raised from the surface.

To interact successfully with the environment

in order to, say, point to an object, the goal of

the sensorimotor system must be to derive accurate

estimates for the motor actions to be performed.

For example, we might wish to interact with

the environment by touching one of the toes

of the footprints shown in Figure 12.4. Evoking

an inappropriate prior will introduce a bias into

the inferred depth used for pointing. That is, in

the right part of Figure 12.4 we would wrongly

point to the illusory perceived embossed toe

instead of the actual imprinted toe (Hartung,

Schrater, Bülthoff, Kersten, & Franz, 2005).

Therefore, biases such as those discussed earlier

are undesirable and should be avoided. This,

in turn, predicts that multisensory integration

must break down with an increase of conﬂicting

information between the multisensory sources.

For this reason prior knowledge should be

disregarded if it is evident that the sensory

information is derived from an environment

with statistical regularities that conﬂict with

those represented by the prior probability

distributions. There is experimental evidence

to back up both these claims, which will be

discussed next.

As indicated earlier, there are many perceptual illusions that arise
because prior assumptions bias the percept. Another example is the use

of prior knowledge about symmetry or isotropy

in visual slant perception (Palmer, 1985). When

asked for the three-dimensional interpretation of

an ellipse, humans consistently see the ellipse as

a circle slanted in depth. This perceptual effect

is explained using a prior for symmetry which

when evoked interprets the ellipse as a circle.

This may make sense because, considering the

statistics of our world, we are more likely to

encounter circles than ellipses. Therefore, under

these statistical considerations, for the unlikely

event that the ellipse is really an ellipse, this

prior will give rise to a biased percept. Knill

(2007a) showed that we down-weight such prior

knowledge for seeing circles if we are placed

in an environment where ellipses or irregular

shapes occur more frequently. This is consistent

with the idea that we begin to ignore the prior

when there is statistical evidence against the

symmetry assumption. As a consequence, this

strategy saves us from acquiring biases based on

false prior assumptions. Along the same lines,

Adams et al. (2004) showed that the light-from-

above prior (as demonstrated in Fig. 12.4) adapts

when observers are put in an environment where

the light source is placed predominantly to the

left or right, instead of above.

There is also empirical evidence for biases in multisensory perception and
for the breakdown of multisensory integration with large discrepancies
between the sensory estimates. For example, multisensory integration has
been studied experimentally by the deliberate introduction of small
discrepancies between sensory signals such that the perceptual consequences
of integration are evident in a bias resulting from weighted averaging, a
method termed “perturbation analysis” (Young, Landy, & Maloney, 1993).

Some notable demonstrations of multisensory

biases induced by weighted averaging include

shifts in perceived location (Alais & Burr, 2004;

Bertelson & Radeau, 1981; Pick, Warren, & Hay,

1969; Welch & Warren, 1980), perceived rate of

a rhythmic stimulation (Bresciani, Dammeier,
& Ernst, 2006, 2008; Bresciani & Ernst, 2007;
Bresciani et al., 2005; Gebhard & Mowbray, 1959;
Myers, Cotton, & Hilp, 1981; Recanzone, 2003;
Shams, Kamitani, & Shimojo, 2002; Shipley,
1964; Welch, DuttonHurt, & Warren, 1986), or

perceived size (Ernst & Banks, 2002; Helbig &

Ernst, 2007). With larger experimentally induced

discrepancies between the perceptual estimates,

however, the integration and weighted averaging

process breaks down (Knill, 2007b). Integration

breaks down even more rapidly if there is
additional evidence that the sources of information
do not originate from the same object

or event. For example, Gepshtein, Burge, Banks,

and Ernst (2005) showed that visual and haptic

size integration breaks down rapidly if the visual

and haptic information do not come from the

same location. That is, location information

is used in addition to determine whether to

integrate the size estimates. Several studies have

shown this breakdown of integration with spatial

discordance in a similar way (Jack & Thurlow,

1973; Jackson, 1953; Warren & Cleaves, 1971;

Witkin, Wapner, & Leventhal, 1952; but see also

Recanzone, 2003). The breakdown also happens

with temporal discrepancies (e.g., Bresciani et al.,

2005; Radeau & Bertelson, 1987; Shams et al.,

2002; van Wassenhove, Grant, & Poeppel, 2007).

This breakdown of integration with increasing

discordance in space and time deﬁnes the spatial

and temporal windows of integration. It is

more generally referred to as robustness of

integration.

We have now identiﬁed two competing goals

of the perceptual-motor system: the ﬁrst goal,

discussed in the previous section, was to achieve

the most reliable estimates possible; the second

goal, discussed in this section, was to avoid

inaccuracy of the estimates, that is, to achieve

the most accurate estimates. To maximize the
gain from integration, these two competing goals
must be balanced as well as possible. For this, the precision
(reliability) and accuracy of the sensory estimates
have to be known to the system. As mentioned

earlier, reliability can in principle be determined

online from analyzing the estimates. However,

there is no direct information in the sensory

signals or estimates that would allow one to

determine their accuracy. In the following we

will therefore concentrate on the question of how

the brain determines whether sensory signals

and estimates are accurate, whether there is

a discrepancy between the sensory estimates,

and so whether to integrate. The same question

arises for the use of prior knowledge as well

and whether it conforms to the statistics of the

present environment. To keep matters simple,

however, from now on we will concentrate on

the ﬁrst question.

BALANCING BENEFITS AND COSTS

Whether to integrate different multisensory

estimates depends on the presence of an actual

difference D between the multiple sensory

signals. The perceptual system, however, does

not have direct access to the sensory signals but

only to the estimates derived from these signals.

Thus, to estimate what constitutes an actual

difference D between the signals is a question

that is itself shrouded in uncertainty because of

the noise in the estimation process. That is, when

we make estimates $\hat{D}$ of sensory discrepancies,

we are unable to do so reliably because of the

noise in such estimates (see Wallace et al., 2004).

For this reason, it is practically impossible to

determine an absolute threshold for whether to

integrate. Every time a discrepancy is detected

between two estimates, the perceptual system

must determine (either implicitly or explicitly)

the reason for such a discrepancy. If the

discrepancy $\hat{D}$ arises from random noise in

the processing of the neuronal signals, the

discrepancy changes randomly from trial to trial.

In this case, by integrating the two estimates,

the perceptual system could average out the

inﬂuence of such noise as shown in the beginning

of this chapter. However, if the discrepancy

in the estimates $\hat{D}$ were due to a systematic

difference D between the signals, then the best

strategy would require the perceptual system to

not integrate the multisensory information. This

may occur, for example, in a scenario where

the sensory signals to be combined show some

inaccuracy (in the form of an additive bias $B$) with respect to the world
(i.e., $S_V = S_{W_V} + B_V$ or $S_H = S_{W_H} + B_H$), or with respect to
one another (i.e., $D = S_V - S_H$). Figure 12.3A illustrates this with a
few examples showing how the sensory signals ($S_V$ and $S_H$) may become
inaccurate with respect to the world property $S_W = (S_{W_V}, S_{W_H})$
to be estimated. As a consequence, determining

the reason for the discrepancies in the sensory

estimates is a credit-assignment problem with

two possibilities: The reason for the discrepancy

could either be a difference between the signals

or a random perturbation as a result of noise,

where both possibilities are uncertain. Since both


possibilities are plausible and have associated

uncertainty, the optimal strategy would be to

use them both and weight each according to its

relative certainty. We call this optimal because it

balances the beneﬁt of multisensory integration

while minimizing the potential costs associated

with it. This intuitive concept forms the basis

of a model that we discuss further in the next

section.

MODELING FUSION, PARTIAL FUSION, AND SEGREGATION

To summarize, no matter how small it may be,

a discrepancy always exists between perceptual

estimates derived from different signals
($\hat{D} = \hat{S}_V - \hat{S}_H$). Such a discrepancy could either be
caused by random noise in the estimates (with standard deviations
$\sigma_V$ and $\sigma_H$), which is unavoidable and always present, or it
could be caused by a systematic difference $D$ in magnitude

between the sensory signals. To make the best

possible use of such discrepant information,

the brain must use different and antithetical

strategies for random noise and systematic

difference. Information should be fused if the

discrepancy was caused by random noise in

the estimates, and it should be segregated if the

discrepancy was caused by an actual difference in

the signals. Interestingly, the very determination

of the source of the discrepancy, random or

systematic, is itself uncertain and difﬁcult to

estimate and so the reason for any discrepancy

can only be determined with uncertainty. Thus,

the best solution to model such a process is to

use a fully probabilistic approach.

While our nervous system is capable of

processing many complex signals and sources

at once, we try to keep matters simple here

by considering a discrepancy between only two

estimates, each of which represents a property
$S_W = (S_{W_V}, S_{W_H})$ specified by sensory signals $S = (S_V, S_H)$.
Thus, it is reasonable to think of the integration process in a 2D space
(Fig. 12.5), although the problem can be extended easily to higher
dimensions. We now continue with the example we used earlier (Eq. 12.1) in
which visual and haptic estimates are combined to determine the size of an
object $S_W = (S_{W_V}, S_{W_H})$ with $S_{W_V} = S_{W_H}$. Let
$S = (S_V, S_H)$ be the sensory signals derived from the world
$S_W = (S_{W_V}, S_{W_H})$, which may be biased ($B = (B_V, B_H)$;
Fig. 12.3) with respect to some world property or with respect to one
another, and let $z = (z_V, z_H)$ be the sensory measurements derived from
$S$. Both the visual and haptic measurements are corrupted by independent
Gaussian noise with variance $\sigma_i^2$, so $z_i = S_i + \varepsilon_i$
with $i$ referring to the individual sensory modality ($V$, $H$).³ With
these assumptions, the joint likelihood function takes the form of a
Gaussian density function:

$$p(z \mid S) = \mathcal{N}(\hat{S}, \Sigma_z) \quad \text{with} \quad \Sigma_z = \begin{pmatrix} \sigma_V^2 & 0 \\ 0 & \sigma_H^2 \end{pmatrix}, \tag{12.6}$$

which is a bivariate normal distribution with mean
$\hat{S} = (\hat{S}_V, \hat{S}_H) = (z_V, z_H)$ (i.e., the
maximum-likelihood estimates of the sensory signals equal the noisy
measurements) and covariance matrix $\Sigma_z$ (left column in Fig. 12.5).
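The joint likelihood of Eq. 12.6 can be evaluated with a few lines of Python. Because the covariance matrix is diagonal, the bivariate density factorizes into the product of two univariate normal densities, which is what this sketch uses. The function names and the measurement values are hypothetical; the standard deviations follow the ratio shown in Figure 12.5, with the visual noise twice as large as the haptic noise.

```python
import math


def normal_pdf(x, mean, sigma):
    """Density of a univariate normal distribution."""
    return math.exp(-0.5 * ((x - mean) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def joint_likelihood(s_v, s_h, z_v, z_h, sigma_v, sigma_h):
    """p(z | S) for independent Gaussian noise (diagonal covariance, Eq. 12.6)."""
    return normal_pdf(z_v, s_v, sigma_v) * normal_pdf(z_h, s_h, sigma_h)


# Hypothetical measurements on one trial.
z_v, z_h = 10.0, 12.0
sigma_v, sigma_h = 2.0, 1.0

# The likelihood peaks where the candidate signals equal the measurements.
peak = joint_likelihood(z_v, z_h, z_v, z_h, sigma_v, sigma_h)
off_peak = joint_likelihood(z_v + 1.0, z_h + 1.0, z_v, z_h, sigma_v, sigma_h)
print(peak, off_peak)
```

Evaluating the function over a grid of candidate signal pairs would trace out the elliptical likelihood shown in the left column of Figure 12.5, elongated along the visual axis under these assumed noise levels.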

The likelihood function represents the sen-

sory measurements on a given trial. The goal

of this task will be the estimation of a property

of the world, such as S

W

, while taking into

account both sensory imprecision (due to

random noise) and inaccuracy (additive bias).

In the rest of this chapter, we will develop a

Bayesian model of this process that proceeds

in two steps. In the ﬁrst stage, discussed in

this section and the next, we describe how

the observer can use Bayes’ rule to calculate

a posterior distribution of the sensory signals

given the noisy measurements, P(S

V

, S

H

|z

V

, z

H

),

and MAP estimates of those sensory signals,

ˆ

S

MAP

, that take into account prior knowledge of

the correlations between the signals, p(S

V

, S

H

).

In subsequent sections, we describe how the

observer can use prior knowledge of the likely

inaccuracy in each modality (B

V

, B

H

) along with

current estimates of the discrepancy between

³ As previously, we assume the visual and haptic estimates are normally distributed and statistically independent. Oruç et al. (2003) and Ernst (2005) analyze how such a system behaves when the estimates are not independent and how this may give rise to negative weights.


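[Figure 12.5 (image not reproduced). Columns: Likelihood, Prior, Posterior; axes: $S^W_V$ and $S^W_H$; rows run from the flat prior (top) to the impulse prior with $\sigma_x^2 = 0$ (bottom). The MLE and MAP estimates and the discrepancies $\hat{D}$ and $\hat{D}^{MAP}$ are marked in the panels.]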

Figure 12.5 The combination of visual and haptic measurements with different prior distributions. (Left column) Likelihood functions resulting from noise with standard deviation $\sigma_V$ twice as large as $\sigma_H$; x indicates the maximum-likelihood estimate (MLE) of the sensory signals $\hat{S} = (\hat{S}_V, \hat{S}_H)$. (Middle column) Prior distributions with variance $\sigma_m^2 \to \infty$, but different variances $\sigma_x^2$. Top: flat prior, $\sigma_x^2 \to \infty$; middle: intermediate prior, $0 < \sigma_x^2 < \infty$; bottom: impulse prior, $\sigma_x^2 = 0$. (Right column) Posterior distributions, which are the normalized product of the likelihood and prior distributions. A dot indicates the maximum a posteriori (MAP) estimate $\hat{S}^{MAP} = (\hat{S}^{MAP}_V, \hat{S}^{MAP}_H)$. Arrows correspond to bias in the MAP estimate relative to the MLE. The orientation of the arrows indicates the weighting of the $\hat{S}_V$ and $\hat{S}_H$ estimates. The length of the arrow indicates the degree of fusion. (Adapted with permission from Ernst, 2007. Copyright ARVO.)

sensory signals ($\hat{D}^{MAP} = \hat{S}^{MAP}_V - \hat{S}^{MAP}_H$) after integration occurred to solve iteratively the credit-assignment problem: What portion of $\hat{D}^{MAP}$ should be attributed to the bias $B_i$ or the world property $S^W_i$ of each modality? The solution of this problem will allow the observer to remap each modality, as a means of providing the best possible (low bias and low uncertainty) estimate of $S^W$.

To begin, we assume that the system has acquired a priori knowledge about the probability of jointly encountering a combination of sensory signals, encoded in the prior $p(S_V, S_H)$. Some examples of visual and haptic signals to size ($S_V$, $S_H$) that might be encountered in conjunction when trying to estimate the world property $S^W$ are provided in Figure 12.3. The lower row in Figure 12.3 shows what such a distribution of jointly encountered signals might look like. Figure 12.3A shows cases where the signals are derived from the same location, for which we can assume that $S^W_V = S^W_H$. All these examples show signals with varying accuracy ($B_i = S^W_i - S_i$). The point here is that the variance in the joint distribution, and hence the variance of the prior learned from these signals, is affected by the variability in accuracy of the two signals. Figure 12.3B illustrates a similar example of co-occurrence of visual and haptic signals, but here these signals are derived from slightly disparate locations $\Delta x$, for which in general $S^W_V \neq S^W_H$. We return to this example in a later section of this chapter when we discuss the link between integration and remapping.


Assuming for now that all the joint distributions are Gaussian, a prior that fulfills what we have discussed thus far can be defined as:

$$p(S) = p(S_V, S_H) = N(n, \Pi) \quad \text{with} \quad \Pi = R^T \begin{pmatrix} \sigma_m^2 \to \infty & 0 \\ 0 & \sigma_x^2 \end{pmatrix} R, \tag{12.7}$$

which is a bivariate normal distribution with mean $n = (0, 0)$ and covariance matrix $\Pi$. $\sigma_m^2$ and $\sigma_x^2$ are the variances of the prior along its principal axes, and $R$ is an orthogonal matrix that rotates the coordinate system by 45° so that the prior is aligned with the diagonal where $S_V = S_H$ (Fig. 12.5, middle column). We choose the variance along the positive diagonal to be $\sigma_m^2 \to \infty$, which indicates that the probability of jointly encountering two signals ($S_V$, $S_H$) is independent of their mean value.⁴ The second variance, $\sigma_x^2$, indicates the spread of the joint distribution, which represents the a priori distribution of possible discrepancies between the signals. Therefore, the probability that the source of any detected discrepancy $\hat{D}$ is not random noise but an actual difference between the signals $D = S_V - S_H$ is a function of the variance (i.e., $\sigma_x^2$) of this prior.

The diagonal with $S_V = S_H$ represents the mapping between the signals since it provides the functional relationship between the two. We can therefore also refer to $\sigma_x^2$ as the mapping uncertainty. Furthermore, this distribution also provides a measure of redundancy between the two signals; the smaller the variance $\sigma_x^2$, the more redundant the signals are with respect to one another.
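To make the structure of this coupling prior concrete, the covariance of Eq. 12.7 can be written out explicitly. The sketch below writes the prior covariance as $\Pi$ and assumes a particular sign convention for the 45° rotation $R$ (its rows are the unit vectors along and across the diagonal); both are notational assumptions made for illustration, not details given in the text.

```latex
% Assumed convention: the rows of R are the principal axes of the prior.
R = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix},
\qquad
\Pi = R^T \begin{pmatrix} \sigma_m^2 & 0 \\ 0 & \sigma_x^2 \end{pmatrix} R
    = \frac{1}{2} \begin{pmatrix}
        \sigma_m^2 + \sigma_x^2 & \sigma_m^2 - \sigma_x^2 \\
        \sigma_m^2 - \sigma_x^2 & \sigma_m^2 + \sigma_x^2
      \end{pmatrix}.

% Inverse covariance (precision) and its limit for a prior that is flat along the diagonal:
\Pi^{-1} = R^T \begin{pmatrix} 1/\sigma_m^2 & 0 \\ 0 & 1/\sigma_x^2 \end{pmatrix} R
\;\xrightarrow{\;\sigma_m^2 \to \infty\;}\;
\frac{1}{2\sigma_x^2} \begin{pmatrix} 1 & -1 \\ -1 & 1 \end{pmatrix},
\qquad
-\ln p(S_V, S_H) = \frac{(S_V - S_H)^2}{4\sigma_x^2} + \text{const}.
```

Under these assumptions the prior constrains only the discrepancy $D = S_V - S_H$, which it treats as zero-mean Gaussian with variance $2\sigma_x^2$, and it is indifferent to the mean value of the two signals.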

Figure 12.5 illustrates three examples of the model described earlier for prior distributions with different $\sigma_x^2$ (middle column) ranging from very large (top row) to near zero (bottom row). A prior probability with $\sigma_x^2 \to \infty$ corresponds to a state in which any possible combination of $S_V$ and $S_H$ signals has roughly equal a priori probability of occurrence. Such a prior is often referred to as a “flat prior.” In this extreme case of $\sigma_x^2 \to \infty$, there is no mapping between the sensory signals or estimates derived from them and thus the discrepancy between the estimates is ill defined. Theoretically, however, one might argue that the accuracy of the signals with respect to this ill-defined mapping approaches zero. This has also been referred to as signals that are invalid (with respect to the property defined by the mapping). Such a situation is an example of signals $S_V$ and $S_H$ that do not carry redundant information. Thus, as an example we could take any set of nonrelated signals, such as the luminance and the stiffness of an object, which are highly unlikely to carry any redundant information (Ernst, 2007) and can co-occur in any possible combination.

⁴ Thus, $n$ could have any value with $S_V = S_H$. We arbitrarily choose $n = (0, 0)$.

A prior probability with $\sigma_x^2 = 0$, on the other hand, corresponds to a state in which signals occur only for the condition $S_V = S_H$. Such a prior relates to signals that are always perfectly accurate (with respect to the property of interest). In this situation the prior probability of encountering an actual difference $D$ between the signals is zero. Thus, in this situation the sensory signals are completely redundant. While such a scenario would be purely theoretical because there is always some variance present, indirect empirical evidence that humans use very tight priors was provided by Hillis, Ernst, Banks, and Landy (2002), who found close to mandatory fusion of disparity and texture estimates to slant (see later discussion).

An intermediate value of $\sigma_x^2$ corresponds to a state in which the probability distribution indicates some uncertainty with respect to the possible co-occurrence of signal values $S_V$ and $S_H$. Such a prior relates to signals that display some inaccuracy with respect to the mapping, and thus there exists a nonzero probability of encountering various differences $D$ between the signals. The signals in this situation are thus only partially redundant (with respect to one another). Since this prior refers to the probability of co-occurrence of certain signals, that is, it represents the prior probability of jointly encountering an ensemble of sensory estimates, in earlier work this prior $p(S_V, S_H)$ has also been referred to as the “coupling prior” (Bresciani et al., 2006; Ernst, 2005, 2007). Realistically, all cases of multisensory integration, such as size estimation from vision and touch, fall into this category (Ernst, 2005; Hillis et al., 2002). This is because there is always some probability that the signals are inaccurate due to external or internal factors, such as muscle fatigue, optical distortion, or other environmental or bodily influences (Fig. 12.3A).

Using Bayes’ rule (see Chapter 1), the joint likelihood function obtained from the sensory signals is combined with prior knowledge about the co-occurrence statistics of these signals. This gives rise to a final estimate of the sensory signals $\hat{S}^{MAP} = (\hat{S}^{MAP}_V, \hat{S}^{MAP}_H)$ based on the posterior distribution $p(S_V, S_H \mid z_V, z_H) \propto p(z_V, z_H \mid S_V, S_H)\, p(S_V, S_H)$, which balances the benefit of reduced variance with the cost of a potential bias in the estimate (Fig. 12.5, right column). Note that this step does not yet provide an estimate of the world property $S^W = (S^W_V, S^W_H)$ or the biases $B = (B_V, B_H)$. How we estimate $S^W$ and $B$ will be discussed in the later section, “From Integration to Remapping.” However, from the MAP estimate of the sensory signals we can derive the best estimate of the current discrepancy $D$ between the signals, which is $\hat{D}^{MAP} = \hat{S}^{MAP}_V - \hat{S}^{MAP}_H$.
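For Gaussian likelihood and prior, this MAP estimate has a closed form, which a few lines of code can illustrate. The following is a minimal sketch, assuming a zero prior mean, a prior that is flat along the diagonal ($\sigma_m^2 \to \infty$), and a 45° rotation whose second axis is $(S_H - S_V)/\sqrt{2}$; the function name `map_estimate` and the numbers in the usage loop are hypothetical and chosen only for illustration.

```python
import math

def map_estimate(z_v, z_h, var_v, var_h, var_x):
    """MAP estimate (S_V, S_H) for a Gaussian likelihood and coupling prior.

    Assumes prior mean (0, 0), a prior that is flat along the diagonal
    (sigma_m^2 -> infinity), and var_x as the prior variance across it.
    """
    if math.isinf(var_x):
        return z_v, z_h                      # flat prior: posterior = likelihood
    if var_x == 0:
        # Impulse prior: both estimates collapse onto the weighted average.
        fused = (z_v / var_v + z_h / var_h) / (1.0 / var_v + 1.0 / var_h)
        return fused, fused
    # Prior precision in the limit sigma_m^2 -> infinity:
    # (1 / (2 * var_x)) * [[1, -1], [-1, 1]]
    c = 1.0 / (2.0 * var_x)
    # Posterior precision = likelihood precision + prior precision.
    a11 = 1.0 / var_v + c
    a22 = 1.0 / var_h + c
    a12 = -c
    # Right-hand side: likelihood precision applied to the measurements.
    b1, b2 = z_v / var_v, z_h / var_h
    # Solve the 2x2 linear system for the posterior mode.
    det = a11 * a22 - a12 * a12
    s_v = (a22 * b1 - a12 * b2) / det
    s_h = (a11 * b2 - a12 * b1) / det
    return s_v, s_h

# Illustrative values: sigma_V twice sigma_H, measured discrepancy of 2.
for var_x in (math.inf, 2.5, 0.0):
    s_v, s_h = map_estimate(12.0, 10.0, 4.0, 1.0, var_x)
    print(var_x, round(s_v, 3), round(s_h, 3), round(s_v - s_h, 3))
```

With these illustrative numbers the measured discrepancy of 2 is left untouched by the flat prior, halved by the intermediate prior, and removed entirely by the impulse prior.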

The posterior estimate $\hat{S}^{MAP}$ is shifted with respect to the maximum-likelihood estimate $\hat{S}$. This shift is highlighted by the arrow in Figure 12.5, right column. The length of the arrow indicates the strength of the integration, whereas the direction of the arrow indicates the weighting of the sensory estimates. In the following we will more closely investigate this shift (captured by the two parameters of the arrow) for the three values of $\sigma_x^2$.

If the prior is flat ($\sigma_x^2 \to \infty$; Fig. 12.5, top row), the posterior becomes identical to the likelihood function, which implies that the multisensory estimates are not integrated but kept independent, that is, they are segregated (no shift). Since the signals are independent, any form of integration in this case would only introduce a bias into the final estimates. Given this situation, there can also be no benefit from integration in the form of reduced variance because the signals do not carry redundant information.

In contrast, a prior with $\sigma_x^2 = 0$ gives rise to a posterior that results in complete fusion (Fig. 12.5, bottom row). As can be observed from the figure, such an impulse prior denotes the existence of only those signals for which $S_V = S_H$. Thus, in the case of fusion, the maximum a posteriori (MAP) estimate $\hat{S}^{MAP}$ falls on the diagonal $S_V = S_H$ on which the prior $p(S_V, S_H)$ is concentrated. The direction $\alpha$ in which the estimate is shifted is solely determined by $\sigma_V^2$ and $\sigma_H^2$ of the likelihood function (Bresciani et al., 2006):

$$\alpha = \arctan \frac{\sigma_H^2}{\sigma_V^2}. \tag{12.8}$$

In this particular case, the MAP estimate maximally benefits from fusion by acquiring the smallest possible variance in the combined estimate. The prior with $\sigma_x^2 = 0$ applies to a situation with entirely accurate and perfectly redundant signals. Thus, whatever detected discrepancy exists must be a consequence of measurement noise. This case where $\sigma_x^2 = 0$ is identical to the previously discussed standard model of cue integration, which also assumed unbiased (accurate) signals and estimates derived from the same world property (see section on “Multisensory Integration” in this chapter and Chapter 1).
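The link between complete fusion, the standard reliability-weighted average, and the shift direction of Eq. 12.8 can be checked numerically. The sketch below is illustrative only: the function name and the example values are hypothetical, and it assumes that $\alpha$ is the acute angle of the shift vector measured from the $S_V$ axis.

```python
import math

def complete_fusion(z_v, z_h, var_v, var_h):
    """Fused estimate, its variance, and the shift direction alpha (Eq. 12.8)."""
    w_v = var_h / (var_v + var_h)    # weight of vision, proportional to 1/var_v
    w_h = var_v / (var_v + var_h)    # weight of touch, proportional to 1/var_h
    fused = w_v * z_v + w_h * z_h
    fused_var = var_v * var_h / (var_v + var_h)
    alpha = math.atan(var_h / var_v)
    return fused, fused_var, alpha

# Illustrative values: sigma_V twice sigma_H, as in Fig. 12.5.
z_v, z_h, var_v, var_h = 12.0, 10.0, 4.0, 1.0
fused, fused_var, alpha = complete_fusion(z_v, z_h, var_v, var_h)

# Shift of the MAP estimate away from the measurements (z_v, z_h).
shift_v, shift_h = fused - z_v, fused - z_h
# Acute angle of the shift vector, measured from the S_V axis (assumed convention).
shift_angle = math.atan(abs(shift_h) / abs(shift_v))

print(fused, fused_var, math.degrees(alpha), math.degrees(shift_angle))
```

The less reliable visual estimate is moved further than the haptic one, so the arrow lies closer to the $S_V$ axis, and the fused variance is smaller than either single-cue variance.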

For cases where $0 < \sigma_x^2 < \infty$, the MAP estimate $\hat{S}^{MAP} = (\hat{S}^{MAP}_V, \hat{S}^{MAP}_H)$ is situated between the maximum-likelihood estimates $(\hat{S}_V, \hat{S}_H)$ and the diagonal (Fig. 12.5, middle row). In other words, the result here lies between the “no fusion” case and the “complete fusion” case, and thus we refer to it as “partial fusion.” The strength of integration is indicated by the length $L$ of the arrow, which has been normalized to the size of the conflict and can be described as a weighting function between the likelihood and the prior in the direction of $\alpha$ (the direction of bias $\alpha$ can be determined from Eq. 12.8):

$$L = \frac{\sigma^2_{\text{likelihood}}(\alpha)}{\sigma^2_{\text{likelihood}}(\alpha) + \sigma^2_{\text{prior}}(\alpha)}. \tag{12.9}$$

Any measured discrepancy $\hat{D} = \hat{S}_V - \hat{S}_H$ is the result of both measurement noise ($\sigma_V$ and $\sigma_H$) and an actual discrepancy ($D = S_V - S_H$) due to bias $B$ (inaccuracy in $S_V$ and/or $S_H$, assuming $S^W_V = S^W_H$). Combining the likelihood with the prior, resulting in this weighting function (Eq. 12.9), provides the best balance between reliability and accuracy of the estimates of the sensory signals (in the MAP sense). The overall variance of the final estimate resulting from partial integration of the sensory signals lies in between that resulting from pure segregation and complete integration (Fig. 12.5, right column). It must be noted, however, that the final estimate can only profit from the integration process to the extent to which the signals are redundant. Thus, this weighting scheme constitutes the best balance between the costs of introducing a bias in the estimates and the benefits of reducing their variances. The remaining difference in the MAP estimates $\hat{D}^{MAP} = \hat{S}^{MAP}_V - \hat{S}^{MAP}_H$ corresponds to the best current estimate of the actual discrepancy $D$ between the sensory signals. The predictions of this model, both regarding bias and variance, have been confirmed by an experimental study of the perceived quantity of visual and haptic events (Bresciani et al., 2006).
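One way to read Eq. 12.9 is as a shrinkage factor on the measured discrepancy. The sketch below makes that reading explicit under two assumptions that go beyond the text: both variances are taken along the discrepancy axis $D = S_V - S_H$, and $\sigma_x^2$ is the prior variance along the unit-length axis perpendicular to the diagonal, so that $D$ has prior variance $2\sigma_x^2$. The function names are hypothetical.

```python
def fusion_strength(var_v, var_h, var_x):
    """Normalized arrow length L: 0 = segregation, 1 = complete fusion.

    Assumption: variances are compared along the discrepancy axis
    D = S_V - S_H, where the measured discrepancy has variance
    var_v + var_h and the prior on D has variance 2 * var_x.
    """
    noise = var_v + var_h
    prior = 2.0 * var_x
    return noise / (noise + prior)

def map_discrepancy(d_hat, var_v, var_h, var_x):
    """Discrepancy that remains between the MAP estimates after partial fusion."""
    return (1.0 - fusion_strength(var_v, var_h, var_x)) * d_hat

# Illustrative values: flat, intermediate, and impulse priors.
for var_x in (float("inf"), 2.5, 0.0):
    print(var_x, fusion_strength(4.0, 1.0, var_x), map_discrepancy(2.0, 4.0, 1.0, var_x))
```

Under these assumptions a tight prior (small `var_x`) attributes most of the measured discrepancy to noise and fuses strongly, whereas a broad prior leaves most of the discrepancy in place as an estimate of a real difference between the signals.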

This theoretical framework can also explain how we can learn to integrate over an artificially enforced, statistical relationship between two arbitrary signals (Ernst, 2007). In this study, participants were trained by presenting previously unrelated aspects of a stimulus, for example, the luminance and the stiffness of an object, in correlation for some time. Participants learned this correlation; they began to exhibit integration of the two aspects of the stimulus which were previously unrelated. This was interpreted as the learning of a new prior probability that certain combinations of the two stimulus aspects (luminance and stiffness) are likely to co-occur. Once such a relationship is learned, the newly acquired prior knowledge can be used to integrate the estimates, and therefore observers can benefit from a reduction in estimation noise. Thus, during the experiment, the participants switched their behavior from treating the estimates as completely independent to a more intermediate perception of the estimates exhibiting “partial fusion.”

BREAKDOWN OF INTEGRATION

It is important to note that in the model described in the previous section, the extent of the discrepancy between the maximum-likelihood estimates $\hat{D} = \hat{S}_V - \hat{S}_H$ does not influence the integration process (i.e., whether estimates are integrated or segregated). The weighting between the estimates, that is, the weighting between the likelihood and the prior, as well as the direction of shift $\alpha$, are all independent of the extent of the discrepancy given the assumptions of this model. Thus, this model so far does not capture the breakdown of integration. This is because the shape of the prior and the shape of the likelihood are both assumed to be Gaussian. The problem arises at larger conflicts between signals where, in order to behave robustly, integration should break down.

Roach, Heron, and McGraw (2006) suggested relaxing the Gaussian assumption to account for this possibility. In particular, they introduced “heavy tails” to the Gaussian distribution of the prior. This transforms the prior in a very sensible way: Close to the diagonal the prior by and large keeps its Gaussian shape with a reasonable variance. Far from the diagonal the prior does not approach zero probability as a Gaussian would, but maintains a nonzero probability. In essence Roach et al. (2006) suggest a linear combination of a flat coupling prior that is used for modeling segregation (Fig. 12.5, upper row) and a coupling prior that is used for modeling partial fusion or fusion. As a result, the system continues to behave as it did without the long tails when the discrepancies are reasonably small, since the central Gaussian part of the prior plays the dominant role. For larger discrepancies, however, this prior ensures that the process converges toward segregation, because of the increased influence of the flat part of the prior.
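The qualitative behavior of such a heavy-tailed coupling prior can be sketched in one dimension, along the discrepancy axis. The code below is a simplified illustration rather than the implementation of Roach et al. (2006): it assumes a prior on $D$ that mixes a zero-mean Gaussian core with a flat component of constant density (an approximation to a very wide uniform distribution), and it reports the posterior mean rather than the MAP estimate. The function names, mixture weight, and densities are hypothetical.

```python
import math

def normal_pdf(x, var):
    """Density of a zero-mean Gaussian with variance var, evaluated at x."""
    return math.exp(-0.5 * x * x / var) / math.sqrt(2.0 * math.pi * var)

def robust_discrepancy(d_hat, noise_var, prior_var, w_gauss=0.9, flat_density=0.01):
    """Posterior-mean discrepancy under a Gaussian-plus-flat coupling prior.

    d_hat      measured discrepancy between the two estimates
    noise_var  variance of d_hat due to measurement noise
    prior_var  variance of the Gaussian core of the prior on D
    """
    # Evidence for the measured discrepancy under each prior component.
    ev_gauss = w_gauss * normal_pdf(d_hat, noise_var + prior_var)
    ev_flat = (1.0 - w_gauss) * flat_density
    # Posterior responsibility of the Gaussian core.
    r = ev_gauss / (ev_gauss + ev_flat)
    # Gaussian core: partial fusion shrinks the discrepancy toward zero.
    shrunk = d_hat * prior_var / (prior_var + noise_var)
    # Flat tails: the measured discrepancy is left unchanged (segregation).
    return r * shrunk + (1.0 - r) * d_hat

for d_hat in (0.5, 2.0, 4.0, 10.0):
    print(d_hat, round(robust_discrepancy(d_hat, 1.0, 0.25), 4))
```

For small conflicts the Gaussian core dominates and the discrepancy is strongly reduced; for large conflicts the flat component takes over and the estimate returns to the measured discrepancy, which is the breakdown of integration described above.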

This model can further be extended to orthogonal dimensions to include, for example, spatial and temporal discordance as well.⁵

⁵ We call a conflict along the dimension to be estimated (e.g., size) discrepancy, whereas when we refer to a conflict in an orthogonal dimension (e.g., space or time) we call this discordance.


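[Figure 12.6 (image not reproduced). Columns: Likelihood, Prior, Posterior; axes: $S^W_V$ and $S^W_H$; five rows for spatial discordance $\Delta x = 0, \pm 5, \pm 10$ (temporal discordance $\Delta t = 0, \pm 100, \pm 200$).]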

Figure 12.6 Schematic illustration demonstrating robust estimation, that is, the breakdown of integration. The coupling prior is assumed to be of Gaussian shape with heavy tails (Roach et al., 2006). The variance of the Gaussian increases with increasing spatial or temporal discordance between the two signals, reflecting a lower correlation between the signals (Fig. 12.3B). Thus, with small discrepancies between the $S_V$ and $S_H$ signals, the weight of the prior decreases with temporal asynchrony and spatial disparity and so the effect of integration disappears. With large spatial inconsistencies or temporal asynchronies the two signals can then be perceived independently of one another as the correlation tends to disappear and the coupling prior becomes flat. (Adapted with permission from Ernst, 2007. Copyright ARVO.)

The conceptual basis of the model is illustrated in Figure 12.3B. We assume that under most circumstances, objects and the environment tend to change in their properties over space and time in a smooth, continuous way rather than a discontinuous and chaotic manner. Thus, despite the spatial or temporal discordance, generally there will still be a correlation between the multisensory signals. This correlation implies that despite some spatial and temporal discordance there is still redundancy in the multisensory signals. This redundancy should be used by the brain to improve its estimates. An example of a distribution of spatially discordant $S_V$ and $S_H$ size signals is indicated in the lower panel of Figure 12.3B. With increasing spatial discordance $\Delta x$, this correlation becomes weaker and weaker until finally the co-occurrence statistics of signals derived from vision and touch will result in a flat distribution. This change in the co-occurrence statistics with increasing spatial or temporal discordance is illustrated in Figure 12.6. The left column shows a likelihood function, which is identical for all five situations depicted since the sensory measurements are assumed to be identical in all cases. The effect of spatial and temporal discordance is reflected in the prior. For $\Delta x = 0$ ($\Delta t = 0$) such a prior would resemble a central Gaussian with intermediate variance (analogous to Fig. 12.5, middle row), which also has heavy tails to account for the integration breakdown with increasing disparity in the size estimates (the flat tails are indicated by the gray background in the prior). With increasing spatial or temporal discordance ($\Delta x \neq 0$ or $\Delta t \neq 0$), the variance of the central part of the prior increases. This is because the prior probability of encountering combinations of $S_V$ and $S_H$ signals for which the discrepancy $D = S_V - S_H$ is large increases with the discordance in space or time ($\Delta x$ or $\Delta t$). As a consequence, as the discordance in space or time increases, the influence that the Gaussian part of the prior exerts on the likelihood function decreases. This process is represented by the arrows in Figure 12.6 (right column). This phenomenon corresponds to a breakdown of the integration process across space and time, which upon experimentation manifests itself as the spatial and temporal windows of integration.

The exact shape of the prior distribution reflects the co-occurrence statistics of the sensory signal values $S_V$ and $S_H$. This in turn determines the point at which the integration falloff occurs and therefore also determines the dimensions of the temporal or spatial window of integration. It is likely that all these priors have flat tails, because even at large discrepancies there will always be some remaining probability of encountering outliers in the co-occurrence statistics. The tails enable the independent treatment of signals at large inconsistencies. In principle, it should be possible to reconstruct the observers’ embodiment of such a prior from experiments that measure the spatial and temporal integration windows. This could be achieved, for example, by extending the methods introduced by Stocker and Simoncelli (2006) to this two-dimensional estimation problem.

Recently, a few other approaches have been proposed to model this elusive aspect of the robustness or breakdown of integration. Some of these methods have also been described in this book (Chapters 2 and 13). We will discuss two of the more prominent proposals in this direction, both of which closely resemble the proposal presented here. The first proposes that the likelihood function is a mixture of Gaussians (Knill, 2007b) to explain the breakdown of integration, whereas the second approach formalizes the concept of causal inference to achieve the same purpose (Körding et al., 2007). Both approaches model the transition from fusion to segregation successfully; however, they both relate to special cases and specific scenarios for which they might be considered optimal.

The mixture-of-Gaussians approach by Knill (2007b) refers to a specific scenario in which a texture signal to slant is modeled by a likelihood function, which is composed of a central Gaussian with heavy tails. This proposal resembles what we have discussed earlier in this chapter, except that the heavy tails are added to the probability distributions of one of the sensory estimates and not to a coupling prior. The primary argument in this theory is that in order for texture to be a useful signal, we must make some prior assumptions about the isotropy of the texture that, in statistical considerations, could possibly fail in some cases. This argument provides a suitable justification for the use of heavy tails. The argument, however, is specific to the texture signal and can therefore not be easily extended to other within- or cross-modal sensory signals.

The second proposal attempts to formalize the concept of causal inference to model why integration breaks down with highly discrepant information (Körding et al., 2007). The proposal has the same intuitive basis that we have been referring to from time to time, that is, segregation at large discrepancies, integration when there is no apparent discrepancy. This model, however, concentrates on the causal attribution aspects of combining different signals. Two signals could either have one common cause, if they are generated by the same object/event, or they may have different causes when generated by different objects/events. In the former case, the signals should be integrated, and in the latter case they should be kept apart. The model takes into account a prior probability $p_{\text{common}}$ of whether a common source or separate sources exist for a given set of multisensory signals. $p_{\text{common}} = 1$ corresponds to perfect knowledge that there is a common source and thus complete fusion. As discussed previously, complete fusion can be described by a coupling prior corresponding to an impulse prior with $\sigma_x^2 = 0$. $p_{\text{common}} = 0$ corresponds to complete knowledge that there are two independent sources and thus complete segregation. Complete segregation was previously described by a flat coupling prior with $\sigma_x^2 \to \infty$. Whenever two sensory signals are detected, in general there will be some probability $p_{\text{common}}$ of a common cause and some probability $1 - p_{\text{common}}$ of independent causes. This probability depends on many factors such as, for example, temporal delays, visual experience, context, and many more (Körding et al., 2007), so it is not easy to predict. In any case, however, it will lead to a weighted combination of the two priors for complete fusion and segregation, and will thus in essence be analogous to a coupling prior, which has the form of an impulse prior with flat, heavy tails (Körding et al., 2007, supplement). In this sense, the causal-inference model is a special case of the model described earlier. It does not allow for variance in the prior describing the common cause (i.e., the impulse prior), because just like the standard model of integration (see Chapter 1), the causal-inference model is based on the assumption that all sensory signals with a common cause are perfectly correlated and accurate (i.e., the sensory estimates are assumed to be unbiased). Because it does not consider a weaker correlation between the co-occurring signals (i.e., the situation illustrated in Fig. 12.3B) and because it does not take into account the (in)accuracy of the signals (i.e., the situation in Fig. 12.3A), the causal-inference model does not optimally balance the benefits and costs of multisensory integration, that is, reduced variance and potential biases, respectively.
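How the evidence for a common cause falls off with discrepancy can be sketched with a small Gaussian generative model. The code below is a hedged illustration in the spirit of causal inference, not a reproduction of the model of Körding et al. (2007): it assumes that each source is drawn from a Gaussian prior with mean `mu_p` and variance `var_p`, that a common cause produces two noisy copies of one source, and that separate causes produce independent sources. All names and example values are hypothetical.

```python
import math

def p_common_posterior(z_v, z_h, var_v, var_h, var_p, p_common, mu_p=0.0):
    """Posterior probability that two measurements share one cause."""
    a, b = z_v - mu_p, z_h - mu_p
    # One cause: both measurements are noisy copies of the same source,
    # so they are correlated through the shared source variance var_p.
    det = var_v * var_h + var_v * var_p + var_h * var_p
    quad = (var_p * (a - b) ** 2 + var_h * a * a + var_v * b * b) / det
    like_one = math.exp(-0.5 * quad) / (2.0 * math.pi * math.sqrt(det))
    # Two causes: independent sources, each drawn from the same prior.
    tot_v, tot_h = var_v + var_p, var_h + var_p
    like_two = (math.exp(-0.5 * (a * a / tot_v + b * b / tot_h))
                / (2.0 * math.pi * math.sqrt(tot_v * tot_h)))
    # Bayes' rule over the two causal structures.
    num = p_common * like_one
    return num / (num + (1.0 - p_common) * like_two)

# Illustrative values: small versus large discrepancy between measurements.
print(p_common_posterior(1.0, 0.0, 4.0, 1.0, 100.0, 0.5))
print(p_common_posterior(10.0, -10.0, 4.0, 1.0, 100.0, 0.5))
```

With a small discrepancy the common-cause hypothesis is favored and the signals would be fused; with a large discrepancy its posterior probability collapses and the signals would be kept apart. Setting `p_common` to 1 or 0 recovers the complete-fusion and complete-segregation limits described above.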

REMAPPING

As discussed earlier, multisensory integration breaks down with increasing discrepancy between the estimates. However, if the discrepancy is systematic and persists over several measurements, we adapt to such a discrepancy, and doing so brings the conflicting sensory maps (or sensorimotor maps) back into correspondence. This process of adaptation is therefore also referred to as remapping or recalibration. In this section, we review optimal linear models of remapping in the context of a visuomotor task. In the next section, we apply this model to the problem of combining visual and haptic size signals while simultaneously determining the best remapping of each.

There are many examples of such sensory and sensorimotor adaptation processes (e.g., Adams, Banks, & van Ee, 2001; Bedford, 1993; Frissen, Vroomen, & de Gelder, 2003; Pick et al., 1969; Welch, 1978; Welch & Warren, 1980, 1986). The most classic examples of this phenomenon are the experiments on prism shifts first studied by Hermann von Helmholtz (1867). In these experiments observers were asked to point to a visual target in “open loop” using fast pointing movements. Here, the use of open loop refers to the absence of online feedback to control the movement. The visual feedback can only be procured at the end of the pointing movement upon observing the location where the finger landed. Let the estimated location of the feedback signal be $\hat{S}_F$ and the estimated target location be $\hat{S}_L$. After each trial of such open-loop pointing, an error in the pointing response can be detected that corresponds to the difference between the feedback- and target-position estimates: $\hat{D} = \hat{S}_F - \hat{S}_L$. It is this error that adaptation seeks to minimize.

A typical visuomotor adaptation experiment consists of three phases. In the first, a baseline phase, the accuracy of pointing performance is assessed (Fig. 12.7, trial < 60). Once the baseline is established, observers receive spectacles fitted with prisms that shift the visual world by some constant amount (e.g., 10°). Once observers wear the prism-fitted spectacles, they exhibit an initial error in their pointing response, which is equivalent to the extent of the prism shift. After only a few pointing movements, however, observers begin to correct for the error induced by the prism and eventually “adapt” to this change (Fig. 12.7, 60 ≤ trial ≤ 110). After adaptation has been achieved, the removal of these prism glasses results in recalibration back to baseline (Fig. 12.7, trial > 110).

An interesting aspect of this phenomenon is the rate at which people adapt to these changes. This rate varies enormously depending on the experimental condition. For instance, the rate of adaptation strongly depends on the nature of the conflicting signals provided to the observer. In visuomotor tasks, like pointing to targets, usually the first few trials after wearing prism-spectacles are sufficient for reaching an almost constant minimization of the error, that is, reaching an asymptote for the newly introduced change. In contrast, adaptation purely within the visual domain, for instance, for texture and binocular disparity signals, has been known to take up to several days until adaptation saturates and a constant minimization of the error has been achieved (Adams et al., 2001). Four


[Figure 12.7 (image not reproduced). Four panels plotting mapping change ($D_t$ and $\hat{D}_t$) against trial number (ticks at 10, 60, 110, 160). Columns: mapping uncertainty parameter ($\sigma_x$), low to high; rows: feedback uncertainty parameter ($\sigma_z$), low to high.]

Figure 12.7 Kalman-filter responses to step changes. The dashed black lines in each panel represent the relationship between the position of the reach endpoint and the position of the visual feedback. This relationship is the visuomotor mapping. As in our experiments, there are three phases: prestep (trials 1–60), step (61–110), and poststep (111–160). A first step change in the mapping occurs at the end of the prestep phase; the initial mapping is then restored after the step phase. The blue curves represent the visuomotor mapping estimates $\hat{D}^{MAP}_{t+}$ over time. The upper and lower rows show models of the estimates when the measurement uncertainty $\sigma_z^2$ is small and large, respectively. An increase in $\sigma_z^2$ causes a decrease in adaptation rate. The left and right columns show responses when the mapping uncertainty $\sigma_x^2$ is small and large, respectively. An increase in $\sigma_x^2$ causes an increase in adaptation rate; the effect is larger when $\sigma_z^2$ is large. (Adapted with permission from Burge et al., 2008. Copyright ARVO.)

examples of adaptation profiles with different rate parameters are provided in Figure 12.7.

Even though adaptation has been actively researched for over 100 years, the search for a computational framework for it only began in recent times with models that tried to describe the process underlying remapping (e.g., Baddeley, Ingram, & Miall, 2003; Burge, Ernst, & Banks, 2008; Ghahramani et al., 1997).

In 2008, we investigated how the statistics

of the environment and the system together

inﬂuence the rate of adaptation in visuomotor

tasks (Burge et al., 2008). The problem can be

formulated in a manner almost analogous to that

faced in integration. When a conﬂict

ˆ

D

t

=

ˆ

S

F,t

−

ˆ

S

L,t

is detected on a given trial t , which in this case

would be the difference between the estimated

feedback and target positions, the perceptual

system must ask itself, what is the source of this

conﬂict.

6

Upon consideration, we ﬁnd that the

6

In the earlier example, we would deﬁne

ˆ

D

t

=

ˆ

S

V ,t

−

ˆ

S

H,t

.

answer is twofold: The conﬂict could be caused

by an actual discrepancy between the sensory

(or the sensorimotor) maps D

t

. Alternatively,

it could merely be due to measurement noise

σ

2

z

when acquiring the sensory estimates

ˆ

D

t

.If

this latter is indeed the case and the discrepancy

is caused solely by measurement noise, there

would be a new random discrepancy from

trial to trial, which would best be ignored

by the system. In other words, the system

should not attempt to adapt to this randomly

ﬂuctuating change in discrepancy caused by

measurement noise because to do so would

actually make things worse. In sharp contrast,

if the discrepancy instead arose due to an actual

mismatch in the sensorimotor maps $D_t$, it would

cause a systematic and sustained discrepancy

over trials. Because the occurrence of this

discrepancy is persistent and systematic, it would

be appropriate for the system to adapt to it.

As discussed for integration, in remapping the estimates of both types of error, random and systematic, also contain uncertainty. That is, on a given trial the

system can only determine the discrepancy with

some uncertainty. The measure of uncertainty

for random errors is the variance $\sigma_z^2$ of the measurement $z$. As noted in the previous sections on

integration, detecting a systematic error presents

more challenge for the system. This is because

such an error cannot be determined from one

trial observation alone. We must accumulate

prior knowledge about the error signal over

several observations and use this information to

successfully identify a systematic error. Those

data, however, also contain uncertainty: the uncertainty $\sigma_x^2$ associated with the mapping.

Since it is likely that visuomotor tasks contain

both systematic and random errors, the nervous

system must be able to weight the error estimates

ﬂexibly based on their relative uncertainties

to solve this credit-assignment problem and

to create an optimal estimate of the current

mapping. We now turn to a computational

framework that formalizes these arguments.

Let us consider that the purpose of the

system is primarily to obtain the best possible

estimate of the visuomotor mapping in order to

remain accurate. The best estimate of the current systematic discrepancy on a given trial, $\hat{D}^{\mathrm{MAP}}_{t^+}$ (the MAP estimate derived from the posterior), is a weighted average of the conflict currently measured, $\hat{D}_t = \hat{S}_{F,t} - \hat{S}_{L,t}$ (the MLE estimate), and the prediction based on past history, $\hat{D}_{t^-}$ (derived from the prior):

$$\hat{D}^{\mathrm{MAP}}_{t^+} = w_x \hat{D}_{t^-} + w_z \hat{D}_t = \hat{D}_{t^-} + K\,(\hat{D}_t - \hat{D}_{t^-}). \qquad (12.10)$$

The value K is a proportion of the error signal

by which the visuomotor mapping is adjusted.

In the framework we propose further, we refer to this proportion as the Kalman gain. The “+” on the index indicates that this conflict estimate is used in the next trial to update the mapping; the “−” on the index indicates

that this prior information is derived from

previous trials, whereas no modiﬁer on the index

indicates that it is the measurement derived

on the current trial. In an optimal scenario,

the weights would be inversely proportional to

the relative uncertainties associated with error

estimates based on measurements and prior

knowledge:

$$w_x = \frac{\sigma_z^2}{\sigma_z^2 + \sigma_x^2} \quad \text{and} \quad w_z = \frac{\sigma_x^2}{\sigma_z^2 + \sigma_x^2}. \qquad (12.11)$$

From Eqs. 12.10 and 12.11, we obtain

$$K = \frac{\sigma_x^2}{\sigma_z^2 + \sigma_x^2}. \qquad (12.12)$$

Since $\hat{D}^{\mathrm{MAP}}_{t^+}$ is the optimal current estimate of the

systematic error that determines the discrepancy,

recalibration in any given trial should occur

based on this combined estimate.
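Eqs. 12.10 to 12.12 can be checked numerically with a minimal sketch. The function names and the numbers below are hypothetical and chosen only for illustration; the code simply evaluates the weighted average and confirms that it equals the error-correction form with $K = w_z$.

```python
def kalman_gain(var_z, var_x):
    """Eq. 12.12: K = var_x / (var_z + var_x)."""
    return var_x / (var_z + var_x)


def map_discrepancy(d_measured, d_prior, var_z, var_x):
    """Eqs. 12.10 and 12.11: combine the measured conflict with the prior prediction."""
    w_x = var_z / (var_z + var_x)   # weight on the prediction from past trials
    w_z = var_x / (var_z + var_x)   # weight on the conflict measured on this trial
    weighted = w_x * d_prior + w_z * d_measured
    # Equivalent error-correction form using the Kalman gain K = w_z.
    k = kalman_gain(var_z, var_x)
    corrected = d_prior + k * (d_measured - d_prior)
    assert abs(weighted - corrected) < 1e-12
    return weighted


# Hypothetical example: measured conflict of 10 units, prior prediction of 0.
reliable = map_discrepancy(10.0, 0.0, var_z=1.0, var_x=4.0)  # low measurement noise
noisy = map_discrepancy(10.0, 0.0, var_z=4.0, var_x=1.0)     # high measurement noise
print(reliable, noisy)
```

With low measurement noise most of the measured conflict is attributed to the mapping; with high measurement noise most of it is discounted.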

Adaptation is an iterative process where every

trial t results in an updated combined estimate

of the current error signal, which is used for

updating the prior in the next step, thereby

enabling the efﬁcient tracking of the changes

that occur in the mapping. Many experiments

show that the brain can adapt under quite

complex conditions. For the sake of simplicity,

however, here we consider a linear system,

which has achieved steady state. Under these

assumptions, and following our arguments for

Bayesian optimality, the Kalman ﬁlter presents

an optimal solution to these modeling efforts

(for the derivation, refer to Burge et al., 2008).

In doing so, we treat the performance of a

visuomotor task as a control system in which

the error signal is adjusted by the proportion

K , which represents the Kalman gain of such a

system.

Figure 12.7 shows the response of a Kalman-

ﬁlter model to step changes in the mapping.

Such a step change is analogous to introducing a

prism and later removing it. As the ﬁlter adjusts

the visuomotor mapping, the error between

target and reach position decreases exponentially

with time. In other words, human subjects compensate for the error on a trial-by-trial basis, approaching exponentially a constant asymptote at which they have minimized their error. Therefore, we use the exponent $\lambda$ to express the adaptation rate, which is a function of $K$:

$$\lambda = -\log(1 - K). \qquad (12.13)$$

From this equation, we find that the model predicts faster adaptation rates for higher gains and slower adaptation rates for lower gains.

The measurement uncertainty $\sigma_z^2$ and the mapping uncertainty $\sigma_x^2$ affect the Kalman gain and thus the adaptation rate in contrasting ways (Eq. 12.12). These opposing effects are illustrated in Figure 12.7. With an increase in measurement uncertainty $\sigma_z^2$, the adaptation rate slows down, whereas with an increase in mapping uncertainty $\sigma_x^2$, adaptation becomes faster.
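The relation between the two uncertainties, the gain, and the adaptation rate can be illustrated with a short sketch. It assumes a fixed gain and noise-free measurements, so that only the systematic part of the error is visible; the function names and variance values are hypothetical and are not taken from Burge et al. (2008).

```python
import math


def adaptation_rate(var_z, var_x):
    """Adaptation rate lambda = -log(1 - K) (Eq. 12.13), with K from Eq. 12.12."""
    k = var_x / (var_z + var_x)
    return -math.log(1.0 - k)


def step_response(step, k, n_trials):
    """Remaining error after each trial following a step change in the mapping.

    Assumes noise-free measurements, so only the systematic error is tracked.
    """
    estimate = 0.0
    errors = []
    for _ in range(n_trials):
        estimate += k * (step - estimate)   # error-correction form of Eq. 12.10
        errors.append(step - estimate)
    return errors


# Hypothetical variances, arranged like the rows and columns of Figure 12.7.
rates = {(var_z, var_x): adaptation_rate(var_z, var_x)
         for var_z in (1.0, 4.0) for var_x in (1.0, 4.0)}
for (var_z, var_x), lam in sorted(rates.items()):
    print(f"var_z={var_z}, var_x={var_x}: lambda={lam:.3f}")

# With K = 0.5 the remaining error halves on every trial,
# that is, it follows step * exp(-lambda * t).
errors = step_response(step=10.0, k=0.5, n_trials=5)
print(errors)
```

Under these assumptions the rate falls as `var_z` grows and rises as `var_x` grows, in line with the qualitative pattern described for Figure 12.7.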

These predictions have been tested empirically by systematically varying the measurement noise using various blur conditions on the visual feedback signals, thus making them less reliable (Burge et al., 2008). They found that

observers did indeed adapt more slowly with an

increase in the blur of the feedback stimuli. When

they introduced a perturbation into the mapping

on a trial-by-trial basis instead of blurring the

feedback signal, however, they found that a

random but statistically stationary error in the

feedback did not elicit any change in the rate of

adaptation. That trial-by-trial variation did not

affect the rate of recalibration suggests that the

measurement noise may be estimated online in

any given trial, but not over trials.

In a second experiment Burge et al. (2008)

perturbed the mapping from trial to trial with

time-correlated noise in a random-walk fashion.

To put it simply, in each trial a new random

variable drawn from a Gaussian distribution

was added to the previous mapping. If correctly

learned, this manipulation affects the mapping uncertainty as the mapping is constantly changing in a time-correlated fashion. Consistent with

the predictions of the optimal adaptor, the

results showed an increased adaptation rate for

an increase in the variance of the random-walk

distribution. In conclusion, it seems that to a

ﬁrst approximation (e.g., assuming stationary

statistics) the Kalman-ﬁlter model is a good

predictor of human adaptation performance.
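Why a random-walk perturbation should raise the adaptation rate can be sketched with the standard scalar Kalman-filter variance recursion. This is an illustration under stated assumptions, not the analysis of Burge et al. (2008): the mapping is assumed to drift as a Gaussian random walk with variance `var_walk` per trial, and `steady_state_gain` is a hypothetical helper name.

```python
def steady_state_gain(var_walk, var_z, n_iter=200):
    """Iterate the scalar Kalman variance recursion for a random-walk mapping."""
    var_post = 1.0   # arbitrary starting uncertainty about the mapping
    k = 0.0
    for _ in range(n_iter):
        var_prior = var_post + var_walk        # uncertainty grows between trials
        k = var_prior / (var_prior + var_z)    # same form as Eq. 12.12
        var_post = (1.0 - k) * var_prior       # uncertainty shrinks after measuring
    return k


# Hypothetical random-walk variances with the measurement variance fixed at 1.
gains = [steady_state_gain(q, 1.0) for q in (0.01, 0.1, 1.0)]
print(gains)
```

In this sketch a larger random-walk variance leads to a larger steady-state gain and hence, via Eq. 12.13, to a faster adaptation rate.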

FROM INTEGRATION TO REMAPPING

In this section we apply the Bayesian (Kalman-

ﬁlter) model of remapping to the visual-haptic

size-estimation task and combine it with the

partial-integration model from earlier in this

chapter. This is illustrated in Figure 12.8.

We assume there is a sequence of trials at times $t$ in which the observer has sensory estimates $\hat{S}_{V,t}$ and $\hat{S}_{H,t}$, and tries to estimate $S_W = (S_{W_V}, S_{W_H})$. For simplicity, we assume the perceptual situation to be constant throughout the trials, so that $S_W$, $S_i$, and $B_i$ are all independent of $t$. Furthermore, for now we assume that $S_{W_V} = S_{W_H}$, which implies that we are estimating the identical world property by vision and touch. The initial situation for estimating $S_W$ is that there may exist an unknown additive bias $B = (B_V, B_H)$ in the visual and haptic signals $S = (S_V, S_H) = (S_{W_V} + B_V, S_{W_H} + B_H)$, leading to a discrepancy $D = S_V - S_H$ between the sensory signals. At first, we do not know these biases, so at time step $t = 0$, before any measurement is performed, the initial bias estimate is $\hat{B}_0 = 0$, the initial prediction for the discrepancy is $\hat{D}_{t^- = 0} = 0$, and the initial coupling prior $p_0(S_V, S_H) = p(S_{V,0}, S_{H,0})$ is unbiased, that is, it is centered on the diagonal $S_V = S_H$.

For every time step $t = 1, 2, 3, \ldots$, the observer begins by deriving the maximum-likelihood estimate $\hat{S}_t = (\hat{S}_{V,t}, \hat{S}_{H,t})$ of the current signals $S_t = (S_{V,t}, S_{H,t})$. These MLE estimates contain a discrepancy $\hat{D}_t = \hat{S}_{V,t} - \hat{S}_{H,t}$. In the leftmost column of Figure 12.8 this is indicated by the red Gaussian blobs being off the diagonal (equivalent to Fig. 12.5, left column). (Footnote 7: For illustrative purposes we assume that at every trial the same noise value $\varepsilon$ is added to the measurement $z_{i,t} = S_{i,t} + \varepsilon_{i,t}$, so the likelihood function is identical in each row of Figure 12.8.) The variance of the likelihood function, $\sigma_z^2$, indicates the measurement uncertainty. Next, to solve the credit-assignment problem of whether this discrepancy $\hat{D}_t$ is caused by noise $\sigma_z^2$ or an actual difference $D$ between the signals, the Bayesian integration scheme is applied, combining the maximum-likelihood estimate with prior knowledge about the joint distribution of $S_{V,t}$ and $S_{H,t}$, that is, the mapping between the signals. The column labeled “prior” in Figure 12.8 shows an example of an intermediate “coupling prior” with variance $\sigma_x^2$. This variance

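[Figure 12.8 panels. Column titles: likelihood, prior, posterior, bias estimate. Row labels: $t = 1$, $t = 2$, $t = 3$, $t = \ldots$. Axis labels: $S_{W_V}$ and $S_{W_H}$; $B_V$ and $B_H$. Annotations: MLE $\hat{S}_t$; MAP $\hat{S}_t$; $\hat{B}_{t-1}$; $\hat{B}_t$; $\hat{D}_t = \hat{S}_{V,t} - \hat{S}_{H,t}$; $\hat{S}_{W,t} = (\hat{S}_t - \hat{B}_t)$; $p(S_{V,t}, S_{H,t})$; $p(B_{V,t}, B_{H,t})$.]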

Figure 12.8 Illustration of the link between integration and remapping of visual and haptic size estimates. The leftmost column illustrates the maximum-likelihood estimates $\hat{S}_t = (\hat{S}_{V,t}, \hat{S}_{H,t})$ indicated by a dot with the corresponding measurement noise $\sigma_z^2$ indicated by the red Gaussian blob. The column labeled “prior” gives a coupling prior $p(S_{V,t}, S_{H,t}) = p_0(S_{V,t} - \hat{B}^{\mathrm{MAP}}_{V,t-1}, S_{H,t} - \hat{B}^{\mathrm{MAP}}_{H,t-1})$ with corresponding mapping uncertainty $\sigma_x^2$ indicated by the red shaded area. The column labeled “posterior” shows the maximum a posteriori (MAP) estimate, indicated by the ◦, together with its variance. The MAP estimate is the result of the Bayes’ product between likelihood and prior. The amount of integration and the weighting of the signals are given by the length and the orientation of the red arrow, respectively, just as in Figure 12.5. The estimate of discrepancy resulting from the MAP estimate is given by $\hat{D}^{\mathrm{MAP}}_{t^+} = \hat{S}^{\mathrm{MAP}}_{V,t} - \hat{S}^{\mathrm{MAP}}_{H,t}$. To determine the part of $\hat{D}^{\mathrm{MAP}}_{t^+}$ that can be attributed to a visual or haptic bias is again an ambiguous problem. This new credit-assignment problem is solved in the rightmost column labeled “bias estimate.” Here the ambiguous $\hat{D}^{\mathrm{MAP}}_{t^+}$ estimate is represented by the diagonal line. Additionally, there is prior information $p(B_V, B_H)$ about potential biases occurring in the visual and haptic modality, which is indicated by the blue Gaussian blob. The discrepancy estimate combined with the bias prior according to Bayes’ rule results in the current bias estimate $\hat{B}^{\mathrm{MAP}}_t$. This resulting bias estimate is used for shifting the coupling prior in the next time step. The estimate $\hat{B}^{\mathrm{MAP}}_t$ is indicated by x and the blue arrow. The size estimate of the object is the combination of the MAP and the bias estimate according to $\hat{S}_{W,t} = \hat{S}^{\mathrm{MAP}}_t - \hat{B}^{\mathrm{MAP}}_t = (\hat{S}^{\mathrm{MAP}}_{V,t} - \hat{B}^{\mathrm{MAP}}_{V,t}, \hat{S}^{\mathrm{MAP}}_{H,t} - \hat{B}^{\mathrm{MAP}}_{H,t})$. This is indicated by the sum of the red and blue arrow in the “posterior” column. Each row provides a new time step in the remapping process. Repeating the same estimation over several trials $t$, the bias estimate $\hat{B}^{\mathrm{MAP}}_t$, as indicated by the blue arrow, is exponentially increasing so that in the end the system reaches the calibrated steady state.

corresponds to the mapping uncertainty.

Applying Bayes’ rule $p(S_{V,t}, S_{H,t} \mid z_{V,t}, z_{H,t}) \propto p(z_{V,t}, z_{H,t} \mid S_{V,t}, S_{H,t})\, p(S_{V,t}, S_{H,t})$ results in the optimal current estimates of the sensory signals $\hat{S}^{\mathrm{MAP}}_t = (\hat{S}^{\mathrm{MAP}}_{V,t}, \hat{S}^{\mathrm{MAP}}_{H,t})$, thereby maximally reducing the variance in the sensory estimates while at the same time providing the best possible estimate of the current discrepancy $\hat{D}^{\mathrm{MAP}}_{t^+} = \hat{S}^{\mathrm{MAP}}_{V,t} - \hat{S}^{\mathrm{MAP}}_{H,t}$ at time step $t$. Thus, the MAP estimate of the discrepancy $\hat{D}^{\mathrm{MAP}}_{t^+}$ is smaller than $\hat{D}_t$ to the extent that the two sensory signals are coupled. The result of combining likelihood with prior knowledge using Bayes’ rule is illustrated in Figure 12.8 in the column labeled “posterior.” The result of integration corresponds to Eq. 12.10: $\hat{D}^{\mathrm{MAP}}_{t^+} = w_x \hat{D}_{t^-} + w_z \hat{D}_t = \hat{D}_{t^-} + K\,(\hat{D}_t - \hat{D}_{t^-})$. This integration process, illustrated by the red distributions and the red arrow, is identical to what was shown in Figure 12.5. The MAP estimate $\hat{S}^{\mathrm{MAP}}_t = (\hat{S}^{\mathrm{MAP}}_{V,t}, \hat{S}^{\mathrm{MAP}}_{H,t})$ at each time step is the best current estimate of the size signals available. The best current discrepancy estimate between the size signals corresponds to $\hat{D}^{\mathrm{MAP}}_{t^+} = \hat{S}^{\mathrm{MAP}}_{V,t} - \hat{S}^{\mathrm{MAP}}_{H,t}$.
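The integration step can be made concrete with a small sketch under explicit assumptions that go beyond the text: the likelihood is taken to be a product of independent Gaussians with variances `var_v` and `var_h`, and the coupling prior is taken to be Gaussian in the difference $S_V - S_H$ (mean `d_prior`, variance `var_x`) and flat along the diagonal. Under these assumptions the measured discrepancy has variance $\sigma_z^2 = \sigma_V^2 + \sigma_H^2$. The function name is hypothetical.

```python
def integrate_with_coupling_prior(s_v, s_h, var_v, var_h, var_x, d_prior=0.0):
    """MAP estimate of (S_V, S_H) from a Gaussian likelihood and a coupling prior."""
    # Posterior precision matrix [[a, b], [b, d]] = likelihood precision + prior precision.
    a = 1.0 / var_v + 1.0 / var_x
    b = -1.0 / var_x
    d = 1.0 / var_h + 1.0 / var_x
    # Right-hand side: precision-weighted contributions of likelihood and prior.
    r_v = s_v / var_v + d_prior / var_x
    r_h = s_h / var_h - d_prior / var_x
    # Solve the 2x2 linear system for the posterior mean (equal to the MAP here).
    det = a * d - b * b
    map_v = (d * r_v - b * r_h) / det
    map_h = (a * r_h - b * r_v) / det
    return map_v, map_h


# Hypothetical measurements: vision reports 12, touch reports 10.
map_v, map_h = integrate_with_coupling_prior(12.0, 10.0, var_v=1.0, var_h=1.0, var_x=2.0)
gain = 2.0 / (2.0 + 1.0 + 1.0)           # Eq. 12.12 with var_z = var_v + var_h
print(map_v, map_h, map_v - map_h, gain * (12.0 - 10.0))
```

In this example the discrepancy between the MAP estimates is the measured discrepancy scaled by $K$, which is the sense in which the MAP discrepancy is smaller than $\hat{D}_t$ to the extent that the signals are coupled.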

Note, up to now we have no estimate of bias $\hat{B}_t$ and no estimate of visual and haptic object size $S_W = (S_{W_V}, S_{W_H})$. What we do have is the discrepancy estimate $\hat{D}^{\mathrm{MAP}}_{t^+}$, but to what extent the visual and haptic biases contribute to this discrepancy

$$\hat{D}^{\mathrm{MAP}}_{t^+} = (S_{W_V} + \hat{B}^{\mathrm{MAP}}_{V,t}) - (S_{W_H} + \hat{B}^{\mathrm{MAP}}_{H,t}) = \hat{B}^{\mathrm{MAP}}_{V,t} - \hat{B}^{\mathrm{MAP}}_{H,t}, \qquad (12.14)$$

given that we are assuming $S_{W_V} = S_{W_H}$, is still unknown. This ambiguity in the discrepancy estimate after integration is indicated by the blue diagonal line in the rightmost column of Figure 12.8. It illustrates that there are infinitely many combinations of visual and haptic biases that are consistent with $\hat{D}^{\mathrm{MAP}}_{t^+}$. For now, we assume that we know for sure that the visual and haptic sizes are identical ($S_{W_V} = S_{W_H}$), so the discrepancy

estimate given by the blue line contains no noise,

that is, is not blurry. The attribution of visual

and haptic bias to the discrepancy estimate is a

second credit-assignment problem, and in order

to solve it we need additional prior knowledge.

In the following we will discuss how to

best resolve this new credit-assignment problem.

Ghahramani and colleagues (1997) proposed that the discrepancy in the sensory estimates should be resolved in proportion to their variances ($\sigma_V^2$, $\sigma_H^2$), that is, more credit should

be given to a signal with higher variance.

However, since the variance of an estimate does

not necessarily determine the probability of it

containing a bias (i.e., its contribution to the

discrepancy), this might lead to a suboptimal

strategy. A better way to resolve the credit

assignment problem resulting from the “bias

ambiguity” may be to use prior knowledge

about the probability of the signals being biased, $p(B_V, B_H)$. We call this the “bias prior.” We

need to use prior knowledge because there is

no direct information in the sensory signals

about whether they are accurate or biased.

For example, if the estimates derived from the

haptic modality have often been biased in the

past, it is more likely that the haptic modality

provides the biased signal also in the current

situation. This prior knowledge encoding the

probability of a bias in a sensory signal is

indicated by the blue Gaussian blob in the

rightmost column of Figure 12.8. The variance of

this prior distribution determines the probability

of the signal to be biased. In the example of

Figure 12.8, the visual signal is less likely to be

biased than the haptic signal. Consequently, in

the absence of any other evidence as to what

may have caused the discrepancy, the ambiguity

in the discrepancy estimate will be resolved

once again using Bayes’ rule. This time we use

Bayes’ rule to combine the discrepancy estimate $\hat{D}^{\mathrm{MAP}}_{t^+}$ with this bias prior $p(B_V, B_H)$. This will result in the current best bias estimate $\hat{B}^{\mathrm{MAP}}_t = (\hat{B}^{\mathrm{MAP}}_{V,t}, \hat{B}^{\mathrm{MAP}}_{H,t})$ indicated by the blue arrow in the rightmost column of Figure 12.8. The proportion $\hat{B}^{\mathrm{MAP}}_{V,t} / \hat{B}^{\mathrm{MAP}}_{H,t}$ and thus the direction of the blue arrow are solely dependent on the variance of the bias prior $p(B_V, B_H)$. Now

that we have a bias estimate we also have an

estimate of the visual and haptic size of the

object, which was our objective from the start

of this chapter. The visual and haptic sizes are

given by $\hat{S}_{W,t} = \hat{S}^{\mathrm{MAP}}_t - \hat{B}^{\mathrm{MAP}}_t = (\hat{S}^{\mathrm{MAP}}_{V,t} - \hat{B}^{\mathrm{MAP}}_{V,t}, \hat{S}^{\mathrm{MAP}}_{H,t} - \hat{B}^{\mathrm{MAP}}_{H,t})$ as indicated in Figure 12.8 by the

sum of the red and blue arrows in the column

labeled “posterior.”
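How a bias prior resolves the ambiguity of Eq. 12.14 can be sketched as follows. The sketch assumes, beyond what the text states, that the bias prior is a zero-mean Gaussian with independent components of variance `var_bias_v` and `var_bias_h`, and that the discrepancy estimate is noise-free (a sharp diagonal line). The most probable point on that line then splits the discrepancy in proportion to the prior variances. Names and numbers are hypothetical.

```python
def split_discrepancy(d_map, var_bias_v, var_bias_h):
    """Most probable (B_V, B_H) with B_V - B_H = d_map under a zero-mean Gaussian prior."""
    total = var_bias_v + var_bias_h
    b_v = d_map * var_bias_v / total     # share attributed to vision
    b_h = -d_map * var_bias_h / total    # share attributed to touch (opposite sign)
    return b_v, b_h


# Hypothetical MAP signal estimates and a bias prior in which touch is
# three times as likely (in variance terms) to be biased as vision.
s_map_v, s_map_h = 11.5, 10.5
b_v, b_h = split_discrepancy(s_map_v - s_map_h, var_bias_v=1.0, var_bias_h=3.0)

# Size estimate: MAP signal estimate minus the bias estimate, per modality.
s_w_v = s_map_v - b_v
s_w_h = s_map_h - b_h
print(b_v, b_h, s_w_v, s_w_h)
```

In this sketch the ratio of the two bias estimates depends only on the prior variances, and after subtracting the biases both modalities report the same object size, as required by the assumption $S_{W_V} = S_{W_H}$.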

With this our estimation problem is solved

and at the end of the time step we have the

best current estimate of the sensory signals $\hat{S}^{\mathrm{MAP}}_t$, the sizes of the objects $\hat{S}_{W,t}$, and the biases $\hat{B}^{\mathrm{MAP}}_t$. However, to achieve even more accurate

estimates in the future, we have to recalibrate

our system based on these bias estimates.

The iterative recalibration process is

described next. Each row in Figure 12.8 denotes

a new time step $t - 1$. After integration at time step $t - 1$ the perceptual system is left with a bias estimate $\hat{B}^{\mathrm{MAP}}_{t-1} = (\hat{B}^{\mathrm{MAP}}_{V,t-1}, \hat{B}^{\mathrm{MAP}}_{H,t-1})$. It is this bias estimate that is used during recalibration (remapping) to change the mapping at time $t$ defined by the coupling prior. Thus, the coupling prior at time $t$ will be shifted to be consistent with the current estimate of the bias, so that $p(S_{V,t}, S_{H,t}) = p_0(S_{V,t} - \hat{B}^{\mathrm{MAP}}_{V,t-1}, S_{H,t} - \hat{B}^{\mathrm{MAP}}_{H,t-1})$,

indicated by the blue arrow in the “prior”

column of Figure 12.8.

8
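One possible reading of this iterative update is sketched below. It assumes a constant measured discrepancy (as in the illustrative setting of Figure 12.8), a fixed gain $K$, and a zero-mean Gaussian bias prior with independent variances `var_bias_v` and `var_bias_h`; all names and numbers are hypothetical.

```python
def remap(d_measured, k, var_bias_v, var_bias_h, n_trials):
    """Track the bias estimates over trials for a constant measured discrepancy."""
    share_v = var_bias_v / (var_bias_v + var_bias_h)
    share_h = var_bias_h / (var_bias_v + var_bias_h)
    b_v, b_h = 0.0, 0.0                  # initial bias estimate is zero
    history = []
    for _ in range(n_trials):
        d_prior = b_v - b_h              # discrepancy predicted by the shifted coupling prior
        d_map = d_prior + k * (d_measured - d_prior)    # Eq. 12.10
        b_v, b_h = share_v * d_map, -share_h * d_map    # split according to the bias prior
        history.append((b_v, b_h))
    return history


# Hypothetical setting: discrepancy of 2 units, gain 0.5, touch more likely biased.
history = remap(d_measured=2.0, k=0.5, var_bias_v=1.0, var_bias_h=3.0, n_trials=20)
for t, (b_v, b_h) in enumerate(history[:5], start=1):
    print(t, round(b_v, 4), round(b_h, 4))
```

In this sketch the ratio of the two bias estimates stays fixed while their magnitude grows toward an asymptote, which corresponds to a blue arrow of constant direction and increasing length.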

This iterative updating process corresponds

to the Kalman-ﬁlter approach to remapping that

we discussed in the last section. As can be seen

from Figure 12.8, while the direction of the

blue arrow stays constant, the length of the blue

arrow continuously increases with every time

step, providing an increasingly accurate estimate

of the bias $\hat{B}^{\mathrm{MAP}}_t$ and the world property $\hat{S}_{W,t}$. Thereby, it is the discrepancy estimate $\hat{D}^{\mathrm{MAP}}_{t^+}$ that

determines the extent to which one must adapt

at each time step, and this in turn determines

the rate of adaptation. This is consistent with the

exponential adaptation response discussed in the

previous section (Fig. 12.7 and Eq. 12.13). After

several time steps the system eventually reaches

steady state. This steady state, however, can only

be reached if the bias B is constant over several

trials, such as for example when wearing glasses

(Fig. 12.3A, third row). In contrast, if the bias B

is constantly changing