Eye movements during reading, scene perception, visual search, and while looking at
print advertisements.
Keith Rayner and Monica S. Castelhano
Department of Psychology
University of Massachusetts, Amherst
Correspondence to:
Keith Rayner
Department of Psychology
University of Massachusetts
Amherst, MA 01003, USA
413-545-2175 (phone); 413-545-0996 (FAX)
rayner@psych.umass.edu
Running Head: Eye movements when looking at ads
Abstract
In this chapter, we review research on eye movements in reading, scene
perception, and visual search. Understanding eye movements in these three tasks is quite
relevant for understanding eye movements when viewers look at print advertisements.
Research on issues such as the size of the perceptual span in each task and how decisions
are made about when and where to move the eyes in each of the three tasks is discussed.
Research on eye movements while looking at ads is also reviewed, and some general
comments regarding characteristics of eye movements when looking at ads are provided.
We argue that the goal of the viewer has an important influence on the nature of eye
movements when looking at ads.
Where do people look in print advertisements? This question has recently
generated a fair amount of research activity to determine the factors that influence which
aspects of an ad are salient in capturing a viewer’s attention (Goldberg, 1999; Pieters,
Rosbergen, & Wedel, 1999; Pieters & Warlop, 1999; Pieters & Wedel, 2007; Radach,
Lemmer, Vorstius, Heller, & Radach, 2003; Rayner, Miller, & Rotello, 2006; Rayner,
Rotello, Stewart, Keir, & Duffy, 2001; Wedel & Pieters, 2000). Given that eye
movement research has been so successful in illuminating how cognitive processes are
influenced on-line in various information processing tasks like reading, scene perception
and visual search (Rayner, 1978, 1998), such interest is not at all surprising. More
recently, there have also been attempts to provide models of eye movement control in
scanning advertisements (Liechty, Pieters, & Wedel, 2003; Reichle & Nelson, 2003).
Research on eye movements during reading, scene perception, and visual search
is obviously quite relevant for understanding how people look at advertisements. Let us
be very clear at the outset that our overview of reading will be more complete than our
overview of scene perception or visual search. The reason for this is quite obvious. We
know more about the nature of eye movements in reading than in the other two tasks.
And, the reason for this is also quite apparent. In reading, there is a well-defined task for
the viewer: people generally read to understand or comprehend the text. This involves a
sequence of eye movements that typically move from left-to-right across the page and
then down the page. Of course, the task can be varied somewhat so that, for example,
readers are asked to skim the text, and this will result in different eye movement
characteristics. Yet, the vast bulk of the research on eye movements during reading has
utilized comprehension as the goal of the reader. On the other hand, in scene perception,
the nature of the task is inherently much vaguer. Viewers may be asked to look at a
scene to remember it, but the sequence in which they examine the scene may be highly
idiosyncratic and variable. In visual search, there are many different types of search tasks
(search for a letter, search for a colored object, search for a person in a large group
picture, search for Waldo in a “Where’s Waldo” children’s book, and so on), and viewers
can use idiosyncratic strategies in dealing with the task. Despite these differences, some
information on the nature of eye movements in each task is available. In this chapter, we
will review some of the main findings concerning eye movements in these tasks. Then
we will move to a brief review of eye movements when looking at print advertisements
(see also the chapters by Pieters & Wedel, and by Chandon, Hutchinson, Bradlow, &
Young in this volume).
Basic Characteristics of Eye Movements
When we read or look at a scene or search for a target in a visual array, we move
our eyes every 250-350 ms. Eye movements serve the function of moving the fovea (the
high resolution part of the retina encompassing 2 degrees in the center of vision) on to
that part of the visual array that we want to process in detail. Because of acuity
limitations in the retina, eye movements are necessary for processing the details of the
array. Our ability to discriminate fine detail drops off markedly outside of the fovea in the
parafovea (extending out to about 5 degrees on either side of fixation) and in the
periphery (everything beyond the parafovea). During the actual eye movement (or
saccade), vision is suppressed¹ and new information is acquired only during the fixation
(the period of time when the eyes remain still for about 250-350 ms). Although we have
the impression that we can process the entire visual array in a single fixation, and
although we can rapidly obtain the gist of a scene from a single fixation, in reality we
would be unable to fully process the information outside of foveal vision if we were
unable to move our eyes (Rayner, 1978, 1998).
It is often assumed that we can move our attention so as to attend to one object
while the eyes are fixated on another object. While it is indeed the case that in very
simple tasks (Posner, 1980) attention and eye location can be separated, in tasks like
reading, scene perception, and visual search, covert attention and overt attention (the
exact eye location) are pretty tightly linked. To be sure, when looking at a complicated
scene, we can dissociate covert and overt attention. But, it generally takes either a certain
amount of almost conscious effort to do so (as when we hold fixation and move our
attention elsewhere) or it is a natural consequence of programming eye movements. That
is, there is considerable evidence that attention typically precedes an eye movement to the
intended target of the saccade (Deubel & Schneider, 1996; Hoffman & Subramaniam,
1995; Kowler, Anderson, Dosher, & Blaser, 1995; Rayner, McConkie, & Ehrlich 1978).
An important point about eye movements is that they are more or less ballistic
movements. Once initiated, it is difficult (though not impossible) to change their
trajectory. Furthermore, since they are motor movements, it takes time to plan and
execute a saccade. In simple reaction time experiments, where there is no necessity of
cognitive processing of the fixated material and participants merely need to monitor when
a simple fixation target moves from one location to another (and move their eyes accordingly),
it takes on the order of 175 ms to move the eyes under the best of circumstances (Becker
& Jürgens, 1979; McPeek, Skavenski, & Nakayama, 2000; Rayner, Slowiaczek, Clifton,
& Bertera, 1983).
Insert Table 1 about here
Table 1 shows some summary information regarding mean fixation durations and
saccade lengths in reading, scene perception, and visual search. From this table, it is
evident that the nature of the task influences the average amount of time spent on each
fixation and the average distance the eyes move. Furthermore, it is very important to
note that while the values presented in Table 1 are quite representative of the different
tasks, they span a range of averages, and for each task there is considerable
variability in both fixation durations and saccade lengths. To
illustrate this, Figure 1 shows the frequency distributions of fixation durations in the three
tasks. Here, it is very evident that there is a lot of variability in fixation time measures;
although not illustrated here, the same point holds for saccade size measures.
Insert Figure 1 about here
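For concreteness, the sketch below (our illustration, not taken from any of the studies reviewed) shows how such summary statistics are computed once an eye tracker's raw samples have been parsed into fixations; the record format is hypothetical.

```python
# A minimal sketch of summarizing fixation durations and saccade lengths
# from parsed eye-tracking output. The record format is hypothetical;
# real trackers emit similar fixation events after saccade parsing.
from statistics import mean, stdev

# Each fixation: (onset_ms, offset_ms, x, y) in screen pixels.
fixations = [(0, 230, 100, 300), (265, 480, 180, 302), (515, 820, 260, 305)]

durations = [off - on for on, off, _, _ in fixations]

# Saccade amplitude between successive fixations, in pixels; converting
# to degrees of visual angle requires viewing distance and screen size.
amplitudes = [((x2 - x1) ** 2 + (y2 - y1) ** 2) ** 0.5
              for (_, _, x1, y1), (_, _, x2, y2) in zip(fixations, fixations[1:])]

print(f"mean fixation duration: {mean(durations):.0f} ms (SD {stdev(durations):.0f})")
print(f"mean saccade amplitude: {mean(amplitudes):.1f} px")
# Pooling durations over many trials and plotting a histogram yields
# positively skewed distributions like those in Figure 1.
```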
At one time, the relatively long latency (or reaction time) of the eyes, combined
with the large variability in the fixation time measures, led researchers to
believe that the eyes and the mind were not tightly linked during information processing
tasks like reading, scene perception, and visual search. Basically, the argument was that
if the eye movement latency was so long and if the fixation times were so variable, how
could cognitive factors influence fixation times from fixation to fixation? Actually, an
underlying assumption was that everything proceeded in a serial fashion and that
cognitive processes could not influence anything very late in a fixation, if at all.
However, a great deal of recent research has established a fairly tight link between the
eye and the mind, and furthermore it is now clear that saccades can be programmed in
parallel (Becker & Jürgens, 1979) and that information processing continues in parallel
with saccade programming.
With this preamble (and basic information) out of the way, let’s now turn to a
brief overview of eye movements in each of the three tasks. We’ll begin with reading
(which will receive the most attention since there is more research on eye movements in
this task than the other two), and then move to scene perception and visual search.
Eye Movements in Reading
As noted above, the average fixation duration in reading is about 225-250 ms and
the average saccade size is 8-9 character spaces. Typically, character spaces in reading
are used rather than visual angle because it has been demonstrated that character spaces
drive the eyes more than visual angle. That is, if the size of the print is held constant and
the viewing distance varied (so that there are either more or fewer characters per degree
of visual angle), how far the eyes move is determined by character spaces and not visual
angle (Morrison & Rayner, 1981). The other important characteristic of eye movements
during reading is that about 10-15% of the time readers move their eyes back in the text
to read previously read material. These regressions, as they are called, are somewhat
variable depending on the difficulty of the text. Indeed, fixation duration and saccade
size are both modulated by text difficulty: as the text becomes more difficult, fixation
durations increase, saccade size decreases, and regressions increase. So, it is very clear
that global properties of the text influence eye movements. The three main global
measures mentioned here are also influenced by the type of reading material and the
reader’s goals in reading (Rayner & Pollatsek, 1989).
Likewise, there are also very clear local effects on fixation time on a word (see
below). In these studies, rather than using global measures like average fixation duration,
more precise processing measures are examined for fixated target words. These
measures include: first fixation duration (the duration of the first fixation on a word),
single fixation duration (those cases where only a single fixation is made on a word), and
gaze duration (the sum of all fixations on a word prior to moving to another word). If it
were the case that readers fixated (1) each word and (2) each word only once, then
average fixation duration on a word would be a useful measure. But, the reality is that
many words are skipped during reading (i.e., don’t receive a direct eye fixation) and
some words are fixated more than once. There is good reason to believe that the words
that are skipped were processed on the fixation prior to the skip, and likewise there is
good reason to think that words are refixated (before moving on in the text) in order to
fully process their meaning. The solution to this possible conundrum is to utilize the
three measures just described which provide a reasonable estimate of how long it takes to
process each word (Rayner, 1998).
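These measures are straightforward to compute from a parsed fixation sequence. The sketch below is our own illustration of the standard definitions (the data format is hypothetical): first-pass fixations on the target are collected until the eyes leave it, and a skip is registered if the eyes pass the target without landing on it.

```python
# Sketch: computing first fixation duration, single fixation duration,
# and gaze duration for a target word. Data format is hypothetical.

# Each fixation: (word_index, duration_ms), in temporal order.
fixations = [(3, 210), (4, 240), (4, 180), (6, 250), (5, 300)]

def first_pass_measures(fixations, target):
    """Return (first fixation duration, single fixation duration or None,
    gaze duration) for `target`, or None if the word was skipped in
    first pass (later regressions back to it are excluded)."""
    first_pass = []
    for word, dur in fixations:
        if word == target:
            first_pass.append(dur)
        elif first_pass:          # eyes left the word: first pass is over
            break
        elif word > target:       # eyes passed the word without fixating it
            return None
    if not first_pass:
        return None
    ffd = first_pass[0]
    sfd = first_pass[0] if len(first_pass) == 1 else None
    gaze = sum(first_pass)
    return ffd, sfd, gaze

print(first_pass_measures(fixations, 4))  # (240, None, 420)
```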
The Perceptual Span. A very important issue with respect to reading has to do
with the size of the perceptual span (also called the region of effective vision or the
functional field of view) during a fixation in reading. Each time the eyes pause (for 200-
250 ms) how much information is the reader able to process and use during that fixation?
We often have the impression that we can clearly see the entire line of text, even the
entire page of text. But, this is an illusion as experiments utilizing a gaze-contingent
moving window paradigm (see Figure 2) introduced by McConkie and Rayner (1975;
Rayner & Bertera, 1979) have clearly demonstrated.
In these experiments, the rationale is to vary how much information is available to
a reader and then determine how large the window of normal text has to be before readers
read normally. Conversely, how small can the window be before there is disruption to
reading? Thus, in the experiments, within the window area text is normally displayed,
but outside of the window, the letters are replaced (with other letters or with X’s or a
homogenous masking pattern). A great deal of research using this paradigm has
demonstrated that readers of English obtain useful information from a region extending
3-4 character spaces to the left of fixation to about 14-15 character spaces to the right of
fixation². Indeed, if readers have the fixated word and the word to the right of fixation
available on a fixation (and all other letters are replaced with visually similar letters),
they are not aware that the words outside of the window are not normal, and their reading
speed only decreases by about 10%. If two words to the right of fixation are available
within the window, there is no slowdown in reading. Furthermore, readers do not utilize
information from the words on the line below the currently fixated line (Rayner, 1998).
Finally, in moving mask experiments (Rayner & Bertera, 1979; Rayner, Inhoff, Morrison,
Slowiaczek, & Bertera, 1981) when a mask moves with the eyes on each fixation
covering the letters in the center of vision (see Figure 2), it is very clear that reading is
very difficult if not impossible when the central foveal region is masked (and only letters
in parafoveal vision are available for reading).
Insert Figure 2 about here
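The display logic of the moving window technique is simple to sketch. The following is our own simplified illustration, not code from the original studies: letters outside the window are replaced while spaces are preserved (one common masking condition), and a real implementation would redraw the display on every saccade using the eye tracker's output.

```python
# Sketch of the display logic in a gaze-contingent moving window study.
# Letters inside the window around the fixated character are shown
# normally; letters outside are replaced with X's (spaces preserved).
def moving_window(text: str, fixation_index: int, left: int, right: int) -> str:
    out = []
    for i, ch in enumerate(text):
        inside = fixation_index - left <= i <= fixation_index + right
        out.append(ch if inside or ch == " " else "X")
    return "".join(out)

text = "the quick brown fox jumps over the lazy dog"
# Fixation on the 'b' of "brown", with 4 characters visible to the left
# and 14 to the right (the approximate span for readers of English):
print(moving_window(text, text.index("brown"), left=4, right=14))
# -> XXX XXick brown fox jumps XXXX XXX XXXX XXX
```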
A great deal of other research using another type of gaze-contingent display
change paradigm (see Figure 2), called the boundary paradigm (Rayner, 1975), has also
revealed that when readers have a valid preview of the word to the right of fixation, they
spend less time fixating that word (following a saccade to it) than when they don’t have a
valid preview (i.e., another word or nonword or random string of letters initially occupied
the target word location). The size of this preview benefit is typically on the order of 30-
50 ms. Interestingly, research using this technique has revealed that readers don’t
combine a literal representation of the visual information across saccades, but rather
abstract (and phonological) information is combined across eye fixations in reading
(McConkie & Zola, 1979; Rayner, McConkie, & Zola, 1980).
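The control logic of the boundary paradigm can be sketched in the same spirit; the words, coordinates, and preview string below are hypothetical.

```python
# Sketch of a boundary-paradigm trial: until the eyes cross an invisible
# boundary placed just before the target word, a preview string occupies
# the target location; the change to the real word is completed during
# the saccade, so readers rarely notice it.
def render(words, target_pos, preview, gaze_x, boundary_x):
    shown = list(words)
    if gaze_x < boundary_x:            # eyes have not yet crossed the boundary
        shown[target_pos] = preview    # e.g., a nonword or a different word
    return " ".join(shown)

words = ["the", "old", "captain", "steered", "the", "boat"]
boundary_x = 15                        # in characters, just before "steered"
print(render(words, 3, "sbeezed", gaze_x=10, boundary_x=boundary_x))
# -> the old captain sbeezed the boat   (preview, before the crossing)
print(render(words, 3, "sbeezed", gaze_x=17, boundary_x=boundary_x))
# -> the old captain steered the boat   (target, after the crossing)
```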
Linguistic Influences on Fixation Time. Over the past few years, it has become
very clear that the ease or difficulty associated with processing the fixated word strongly
influences how long the eyes remain in place. How long the eyes remain in place is
influenced by a host of linguistic variables such as the frequency of the fixated word
(Inhoff & Rayner, 1986; Rayner & Duffy, 1986), how predictable the fixated word is
(Ehrlich & Rayner, 1981; Rayner & Well, 1996), how many meanings the fixated word
has (Duffy, Morris, & Rayner, 1988; Sereno, O’Donnell, & Rayner, 2006), when the
meaning of the word was acquired (Juhasz & Rayner, 2003, 2006), semantic relations
between the word and prior words (Carroll & Slowiaczek, 1986; Morris, 1994), how
familiar the word is (Williams & Morris, 2004), and so on (see Rayner, 1998 for review).
Perhaps the most compelling evidence that cognitive processing of the fixated
word is driving the eyes through the text comes from experiments in which the fixated
word either disappears or is masked after 50-60 ms (Ishida & Ikeda, 1989; Liversedge,
Rayner, White, Vergilino-Perez, Findlay, & Kentridge, 2004; Rayner et al., 1981;
Rayner, Liversedge, White, & Vergilino-Perez, 2003; Rayner, Liversedge, & White,
2006). Basically, these studies show that if readers are allowed to see the fixated word
for 60 ms before it disappears, they read quite normally. Interestingly, if the word to the
right of fixation also disappears or is masked, then reading is disrupted (Rayner et al.,
2006); this quite strongly demonstrates that the word to the right of fixation is very
important in reading. More critically for our present purposes, when the fixated word
disappears after 60 ms, how long the eyes remain in place is determined by the frequency
of the word that disappeared: if it is a low frequency word, the eyes remain in place
longer (Rayner et al., 2003, 2006). Thus, even though the word is no longer there, how
long the eyes remain in place is determined by that word's frequency. This is very
compelling evidence that the cognitive processing associated with a fixated word is the
engine driving the eyes through the text.
To summarize the foregoing overview, it is now clear that readers acquire
information from a limited region during a fixation (extending to about 14-15 character
spaces to the right of fixation). Information used for word identification is obtained from
an even smaller region (extending to about 7-8 character spaces to the right of fixation).
Furthermore, the word to the right of fixation is important and readers obtain preview
benefit from that word. On some fixations, readers can process the meaning of the
fixated word and the word to the right of fixation. In such cases, they will typically skip
over the word to the right of fixation. Finally, the ease or difficulty associated with
processing the fixated word strongly influences how long readers look at that word.
Models of Eye Movements in Reading. Given the vast amount of information
that has been learned about eye movements during reading in the last 25-30 years, a
number of models of eye movements in reading have recently appeared. The E-Z Reader
model (Pollatsek, Reichle, & Rayner, 2006; Rayner, Ashby, Pollatsek, & Reichle, 2004;
Rayner, Reichle, & Pollatsek, 1998; Reichle, Pollatsek, Fisher, & Rayner, 1998; Reichle,
Pollatsek, & Rayner, 2006; Reichle, Rayner, & Pollatsek, 2003) is typically regarded as
the most influential of these models. Given space limitations, other models will not be
discussed here³. Basically, the E-Z Reader model accounts for all of the data
and results discussed above, and it also does a good job of predicting how long readers
will look at words, which words they will skip, and which words will be refixated. It can
account for global aspects of eye movements in reading, while also dealing with more
local processing characteristics; the competitor models also can account for similar
amounts of data. The models share many similarities, though they differ in some precise
details and in how they explain certain effects. As a computational
model, E-Z Reader has the virtue of being highly transparent,
so it makes very clear predictions and when it can’t account for certain data, it is very
clear why it can’t (thus enabling one to change parameter values in the model). The
model has also enabled us to account for data patterns that in the past may have been
difficult to explain. The model isn’t perfect and has many limitations. For example,
higher order processes due to sentence parsing and discourse variables do not currently
have an influence within the model. It basically assumes that lexical processing is
driving the eyes through the text, but we believe that this isn’t an unreasonable
assumption⁴.
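To convey the flavor of the model's core assumption without reproducing its published machinery, here is a toy sketch (ours; the functional form and all constants are invented for illustration) in which the time to complete lexical processing falls with the log of a word's frequency and with its predictability:

```python
# Toy illustration of the idea that lexical processing time (which, in
# E-Z Reader, gates saccade programming to the next word) decreases with
# log word frequency and with contextual predictability. The constants
# below are invented; they are not the published model parameters.
import math
import random

def lexical_time(freq_per_million: float, predictability: float) -> float:
    base, f_coef, p_coef = 250.0, 15.0, 80.0   # illustrative constants (ms)
    return base - f_coef * math.log(freq_per_million) - p_coef * predictability

def simulated_fixation(freq: float, pred: float) -> float:
    # Observed fixation durations are noisy even for a single word type.
    return max(80.0, random.gauss(lexical_time(freq, pred), 25.0))

# A frequent, predictable word vs. a rare, unpredictable one:
print(round(lexical_time(1000, 0.8)))   # ~82 ms
print(round(lexical_time(2, 0.0)))      # ~240 ms
```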
The main, and concluding, point from the foregoing is that great advances have
been made in understanding eye movements in reading (and inferring the mental
processes associated with reading) via careful experimentation and via the
implementation of computational models that nicely simulate eye movements during
reading. In the next two sections, eye movements during scene perception and visual
search will be discussed. Although there hasn’t been as much research on these areas as
on reading, it is still the case that some clear conclusions emerge from the work that has
been done.
Eye Movements and Scene Perception
Figure 3 shows the eye movements of a viewer on a scene. As is very evident in
this figure, viewers don’t fixate on every part of the scene. This is largely because
information can be obtained over a wider region in scene perception than reading.
However, it is clear that the important aspects of the scene are typically fixated (and
generally looked at for longer periods than less important parts of the scene). In Figure 3,
the fixations are on the informative parts of the scene, and viewers do not fixate on the
sky or the road in front of the houses. As we noted at the outset, the average fixation duration in
scene perception tends to be longer than that in reading, and likewise the average saccade
size tends to be longer. In this section, a brief summary of where people tend to look in
scenes will be provided, as well as information regarding the perceptual span region for
scenes and the nature of eye movement control when looking at scenes.
Insert Figure 3 about here
Getting the Gist of a Scene. One very important general finding with respect to
scene perception is that viewers get the gist of a scene very early in the process of
looking, sometimes even from a single brief exposure that is so quick that it would be
impossible to move the eyes (De Graef, 2005). In fact, in a recent study, Castelhano and
Henderson (2006b) showed that with exposures lasting as little as 40 ms, participants
were able to extract enough information to get the gist of the scene. It has typically been
argued that the gist of the scene is obtained in the first fixation, and that the remaining
fixations on a scene are used to fill in the details.
Where do Viewers Look in Scenes? Since the pioneering work of Buswell
(1935) and Yarbus (1967), it has been widely recognized that viewers' eyes are drawn to
important aspects of the visual scene and that their goals in looking at the scene very
much influence their eye movements. Quite a bit of early research demonstrated that the
eyes are quickly drawn to informative areas in a scene (Antes, 1974; Mackworth &
Morandi, 1967) and that the eyes quickly move to an object that is out-of-place in a scene
(Friedman, 1979; Loftus & Mackworth, 1978). On the other hand, the out-of-place
objects in these scenes tended to differ from the appropriate objects on a number of
dimensions (Rayner & Pollatsek, 1992). For example, an octopus in a farm scene is not
only semantically out-of-place, but it also tends to have more rounded features than the
objects typically in a farm scene. So, these early studies confounded visual saliency and
semantic saliency. More recent experiments in which appropriate featural information
was well-controlled raise questions about the earlier findings, and suggest that the eyes
are not invariably and immediately drawn to out-of-place objects (De Graef, Christiaens,
& d’Ydewalle, 1990; Henderson, Weeks, & Hollingworth, 1999).
But, it is certainly the case that the eyes do get quickly to the important parts of a
scene. In a recent study, the influence that context has on the placement of eye
movements in search of certain objects within pseudo-realistic scenes was investigated
(Neider & Zelinsky, 2006). Viewers were asked to look for target objects that are
typically constrained to certain parts of the scene (i.e., jeep on the ground, blimp in the
sky). When a target was present, fixations were largely limited to the area one would
expect to find the target object (i.e., ground or sky), whereas when the target was absent,
viewers were less inclined to restrict their search to these areas. They also found that when the
target was in the expected area, search times were on average 19% faster. From these
results, they concluded not only that viewers focus their fixations on areas of a scene
that most likely contain the target in order to improve search times, but also that the visual system
is flexible in applying these restrictions: viewers very quickly adopt a “look
everywhere” strategy when the first strategy proves unfruitful. Thus, it seems that search
strategies in scenes are guided by the scene context, but not with strict adherence.
It is also clear that the saliency of different parts of the scene influences what part
of the scene is fixated (Parkhurst & Niebur, 2003; Mannan, Ruddock, & Wooding, 1995,
1996). Saliency is typically defined in terms of low-level components of the scene (such
as contrast, color, intensity, brightness, spatial frequency, etc.). Indeed, there are now a
fair number of computational models (Baddeley & Tatler, 2006; Itti & Koch, 2000, 2001;
Parkhurst, Law, & Niebur, 2002) that use the concept of a saliency map to model eye
fixation locations in scenes. In this approach, bottom-up properties in a scene (the
saliency map) make explicit the locations of the most visually prominent regions of the
scene. The models are basically used to derive predictions about the distribution of
fixations on a given scene.
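As a toy illustration of this approach (far simpler than the models cited, and our own sketch rather than any published implementation), one can score each location by its local intensity contrast and predict that fixations favor the peaks of the resulting map:

```python
# Toy saliency map: score each pixel by its absolute difference from the
# local mean intensity, then predict that fixations land at the peaks.
# A deliberately simplified stand-in for models like Itti & Koch's.
import numpy as np

def local_contrast_saliency(image: np.ndarray, radius: int = 2) -> np.ndarray:
    h, w = image.shape
    saliency = np.zeros((h, w))
    for y in range(h):
        for x in range(w):
            y0, y1 = max(0, y - radius), min(h, y + radius + 1)
            x0, x1 = max(0, x - radius), min(w, x + radius + 1)
            saliency[y, x] = abs(image[y, x] - image[y0:y1, x0:x1].mean())
    return saliency

rng = np.random.default_rng(0)
scene = 0.5 + 0.05 * rng.standard_normal((24, 32))  # flat, slightly noisy field
scene[10:14, 20:24] = 1.0                           # one bright, high-contrast object
smap = local_contrast_saliency(scene)
print(np.unravel_index(smap.argmax(), smap.shape))  # peak lies on the object's edge
```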
While these models can account for some of the variability in where viewers
fixate in a scene, they are limited in that the assumption is that fixation locations are
driven primarily by bottom-up factors and it is clear that higher level factors also come
into play in determining where to look next in a scene (Henderson & Castelhano, 2005;
Henderson & Ferreira, 2004). A model that includes more in the way of top-down and
cognitive strategies has recently been presented by Torralba, Oliva, Castelhano, and
Henderson (2006). Indeed, while there has been a considerable amount of research to
localize where viewers move their eyes while looking at scenes, there has been precious
little in the way of attempting to determine what controls when the eyes move. This is in
contrast with reading where the issues of where to move the eyes and when to move the
eyes have both received considerable attention. One recent study attempting to correct
this imbalance investigated the effect of repeated exposure to a scene and its effect on
fixation durations (Hidalgo-Sotelo, Oliva & Torralba, 2005). Observers were asked to
search for a target and respond whether it was present in a scene while their eye
movements were tracked. Unbeknownst to them, there were certain scene-target
combinations that repeated throughout the experiment twenty times. As expected, these
repeated searches showed a large decrease in response time. Interestingly though, the
number of fixations did not decrease as much as the average fixation duration prior to
fixating the target object. Furthermore, the results showed that the proportion of target
objects that were fixated before a response was made did not change with increased
repetitions (85%). And although the average gaze durations on the target fell from 450
ms during the first exposure to 310 ms in the twentieth, it seems that observers chose to
verify the target object before making a response. These results showed that with
repeated exposure, the reduced response time is primarily due to a decrease in the
average duration of fixations during the search and in the time to verify the target object.
Thus, it seems that in this study repetition made it easier to identify fixated regions as
targets or non-targets, but did not reduce the number of fixations made.
Another difference between scenes and reading is the question of what
information is used from memory. We know that memory for the information read plays
a large role in integrating information from the current fixation with what has already
been read and directing subsequent fixations (such as deciding whether to regress and
reread a certain section). In scenes, the role that memory plays in directing fixations is
not as clear. Many of the models using saliency as the primary driving force of eye
movements do not consider how information gathered initially may influence the placing
of subsequent fixations. In a recent study, Castelhano and Henderson (2006a)
investigated whether this initial representation of a scene can be used to guide subsequent
eye movements on a real-world scene. Observers were shown a very short preview of the
search scene and then were asked to find the target object using a moving window, thus
eliminating any immediately available visual information. A preview of the search scene
itself elicited the most efficient searches when compared to a meaningless control (the
preview yielded fewer fixations and the shortest saccade path to the target). When a
preview of another scene within the same semantic category was shown (thereby
providing general semantic information without the same visual details), results revealed
no improvement in search. These results suggest that the initial representation used to
improve search efficiency was not based on general semantics, but rather on something
more specific. When a reduced scale of the search scene was shown as the preview,
search efficiency measures were as high as when the full-scale preview was shown.
Taken together, these results suggest that the initial scene representation is based on
abstract, visual information that is useful across changes in spatial scales. Thus, the
information used to guide eye movements in scenes appears to have two sources: the
saliency of the scene and the information in memory about that scene and scene type.
The Perceptual Span. How much information do viewers obtain in a scene? As
noted at the outset of this section, it is clear that information is acquired over a wider
range of the visual field when looking at a scene than is the case for reading. Henderson,
McClure, Pierce, and Schrock (1997) used a moving mask procedure (to cover the part
the scene around the fixation point) and found that although the presence of a foveal
mask influenced looking time, it did not have nearly as serious effects for object
identification as a foveal mask has for reading.
Nelson and Loftus (1980) examined how close to fixation an object had to be for
it to be recognized as having been in the scene. They found that objects located within
about 2.6 degrees from fixation were generally recognized, but recognition depended to
some extent on the characteristics of the object. They also suggested that qualitatively
different information is acquired from the region 1.5 degrees around fixation than from
regions further away (see also Nodine, Carmody, & Herman, 1979). While a study by
Parker (1978) is often taken to suggest (see Henderson & Ferreira, 2004 for discussion)
that the functional field of view for specific objects in a scene is quite large (with a radius
of at least 10 degrees around fixation resulting in a perceptual span of up to 20 degrees),
other more recent studies using better controlled stimuli and more natural images
(Henderson & Hollingworth, 1999; Henderson, Williams, Castelhano, & Falk, 2003)
suggest that the functional field of view extends about 4 degrees away from fixation.
An early study using the moving window technique by Saida and Ikeda (1979)
suggested that the functional field of view is quite large, and can consist of about half of
the total scene regardless of the absolute size of the scene (at least for scenes that are up
to 14.4 degrees by 18.8 degrees). In this study and other studies using the moving
window paradigm (van Diepen & d’Ydewalle, 2003; van Diepen, Wampers, &
d’Ydewalle, 1998), scene information within the window area around the fixation
point is presented normally, but the information outside of the window is degraded in
some systematic way. Saida and Ikeda (1979) found a serious deterioration in
recognition of a scene when the window was limited to a small area (about 3.3 degrees ×
3.3 degrees) on each fixation. Performance gradually improved as the window size
became larger, as noted, up to about 50% of the entire scene. Saida and Ikeda noted that
there was considerable overlap of information across fixations.
It should be clear from the studies we have reviewed that the question of how large
the perceptual span is in scene perception hasn't been answered as
conclusively as it has in reading. Nevertheless, it does appear that viewers typically gain
useful information from a fairly wide region of the scene, which also probably varies as a
function of the scene and the task of the viewer. For instance, the ease with which an
object is identified has been linked to its orientation (Boutsen, Lamberts, & Verfaillie,
1998), frequency within a scene context (Hollingworth & Henderson, 1998), and how
well camouflaged it is (De Graef et al., 1990). As has been shown in reading
(Henderson & Ferreira, 1990), it is likely that the ease of identifying a fixated object has
an effect on the extent of processing in the periphery.
Preview Benefit. Just as in reading, viewers obtain preview benefit from objects
that they have not yet fixated (Henderson, 1992; Henderson, Pollatsek, & Rayner, 1987,
1989; Pollatsek, Rayner, & Collins, 1984; Pollatsek, Rayner, & Henderson, 1990) and the
amount of the preview benefit is on the order of 100 ms (so it is larger than in reading).
Interestingly, viewers are rather immune to changes in the scene. In a series of
experiments by McConkie and Grimes (McConkie, 1991; Grimes & McConkie, 1995;
Grimes, 1996), observers were asked to view scenes with the task of memorizing what
they saw. They were also informed that changes could be made to the image while they
were examining it, and they were instructed to press a button if they detected those
changes. While observers viewed the scenes, changes were made during a saccade. As
discussed earlier, during saccades vision is suppressed, meaning that these changes would
not have been visible as they were occurring. Remarkably, observers were unaware of
most changes, which included the appearance and disappearance of large objects and the
changing of colors, all of which were happening while the scene was being viewed.
Although later studies found that any disruption served to induce an inability to detect
changes, such as inserting a blank screen between two changing images (Rensink,
Clark, & O’Regan, 1997), movie cuts (Levin & Simons, 1997), or the simultaneous onset
of patches covering portions of the scene (O’Regan, Rensink, & Clark, 1999), these
experiments highlighted the relation between what is viewed during the initial
exploration of a scene and then what is remembered about that scene. Further studies
have shown that this lack of awareness does not mean that there is no recollection of any
visual details, but rather that the likelihood of remembering visual information is highly
dependent on the processing of that information (Hollingworth & Henderson, 2002;
Hollingworth, 2003). This means that knowing something about the processes that go on
during a fixation on a scene is extremely important if one wants to predict how well
visual information being viewed is stored.
When do Viewers Move their Eyes when Looking at Scenes? With the
assumption that attention precedes an eye movement to a new location within a scene
(Henderson, 1992; van Diepen & d’Ydewalle, 2003), it follows that the eyes will move
once information at the center of vision has been processed and a new fixation location
has been chosen. In a recent study, van Diepen and d’Ydewalle (2003) investigated
when this shift in attention (from the center of fixation to the periphery) took place in the
course of a fixation. They had observers view scenes whose information at the center of
fixation was masked during the initial part of fixations (from 20-90 ms). In another case,
the periphery was masked at the beginning of each fixation (for 10-85 ms). As expected
based on the assumptions made above, they found that when the center of fixation was
masked initially, fixation durations increased with longer mask durations (61% increase).
When the periphery was masked, they found a slight increase in fixation durations, but
not as much as with a central mask (15% increase). Interestingly, they found that the
average distance of saccades decreased and the number of fixations increased with longer
mask durations in the periphery. They surmised that with the longer peripheral masking
durations the visual system does not wait for the unmasking of peripheral information,
but instead chooses information that is immediately available. These results suggest that
the extraction of foveal information occurs very rapidly, and attention is directed
to the periphery almost immediately thereafter (70-120 ms)
to choose a viable saccade target. Although the general timing of the switch between
central and peripheral information processing is now being investigated, the variability of
information across scenes makes it more difficult to come up with a specific time frame
as has been done in reading.
Eye Movements and Visual Search
Visual search is a research area that has received considerable effort over the past
40 years. Unfortunately, the vast majority of this research has been done in the absence
of considering eye movements (Findlay & Gilchrist, 1998). That is, eye movements have
typically not been monitored in this research area and it has often been assumed that eye
movements are not particularly important in understanding search. However, this attitude
seems to be largely changing as there are now many experiments reported each year on
visual search utilizing eye movements to understand the process. Many of these studies
deal with very low-level aspects of search and often focus on using the search task to
uncover properties of the saccadic eye movement system (see Findlay, 2004; Findlay &
Gilchrist, 2003).
In this chapter, we’ll focus primarily on research that has some implications for
how viewers search through arrays to find specific targets (as is often the case when
looking at ads). As we noted at the outset, fixation durations in search tend to be highly
variable. Some studies report average fixation times as short as 180 ms while others
report averages on the order of 275 ms. This wide variability is undoubtedly due to the
fact that how difficult the search array is (or how dense or cluttered it is) and the exact
nature of the search task strongly influence how long viewers pause on average.
Typically, saccade size is a bit larger than in reading (though saccades can be quite short
with very dense arrays) and a bit shorter than in scene perception.
The Search Array Matters. Perhaps the most obvious thing about visual search is
that the search array makes a big difference in how easy it is to find a target. When the
array is very dense (with many objects and distractors) or cluttered, search is more costly
than when the array is simple or less dense and eye movements typically reflect this fact
(Bertera & Rayner, 2000; Greene & Rayner, 2001a, 2001b). The number of fixations and
fixation duration both increase as the array becomes more complicated, and the average
saccade size decreases (Vlaskamp & Hooge, 2006). Additionally, the configuration of
the search array has an effect on the pattern of eye movements. In an array of objects
arranged in an arc, fixations tend to fall in-between objects, progressively getting closer
to the area where viewers think the target is located (Zelinsky, 2005; Zelinsky, Rao,
Hayhoe, & Ballard, 1997). On the other hand, in randomly placed arrays, other factors
such as color of the items and shape similarity to the target object influence the
placement of fixations (Williams, Henderson, & Zacks, 2005).
Does Visual Search Have a Memory? This question has provoked a considerable
amount of research. Horowitz and Wolfe (1998) initially proposed that visual search
doesn’t have a good memory and that the same item will be re-sampled during the search
process. However, they made this assertion based on reaction time functions, and eye
movement data are ideal for addressing the issue (since one can examine how frequently
the eyes return to a previously sampled part of the array). And, eye movement
experiments (Beck, Peterson, Boot, Vomela, & Kramer, 2006; Beck, Peterson, &
Vomela, 2006; Peterson, Kramer, Wang, Irwin, & McCarley, 2001) make it quite clear
that viewers generally do not return to previously searched items.
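The relevant analysis is easy to state: given the sequence of items fixated on a trial, count how often the eyes return to an already-inspected item. The sketch below is our own illustration with hypothetical data; real analyses typically collapse consecutive fixations on the same item into a single gaze first.

```python
# Sketch: quantifying refixations during search. A low revisit rate is
# what the eye movement studies cited above report. Consecutive
# fixations on the same item should be collapsed before applying this.
def revisit_rate(fixated_items):
    visited, revisits = set(), 0
    for item in fixated_items:
        if item in visited:
            revisits += 1
        visited.add(item)
    return revisits / max(1, len(fixated_items))

# Hypothetical trial over items labeled by ID; item 7 is refixated once.
print(round(revisit_rate([3, 9, 1, 7, 12, 7, 4]), 2))  # 0.14
```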
The Perceptual Span. Rayner and Fisher (1987a, 1987b) used the moving
window technique as viewers searched through horizontally arranged letter strings for a
specified target letter. They found that the size of the perceptual span varied as a
function of the difficulty of the distractor letters; when the distractor letters were visually
similar to the target letter, the size of the perceptual span was smaller than when the
distractor letters were distinctly different from the target letter. They suggested that there
were two qualitatively different regions within the span: a decision region (where
information about the presence or absence of a target is available) and a preview region
(where some letter information is available but information about the absence of a
target is not).
Bertera and Rayner (2000) had viewers search through a randomly arranged array
of letters and digits for the presence of a target letter. They used both the moving
window and moving mask techniques. They varied the size of the array (so that it was 13
degrees by 10 degrees, 6 degrees by 6 degrees, or 5 degrees by 3.5 degrees), but the
number of items was held constant (so in the smaller arrays, the information was more
densely packed). The moving mask had a deleterious effect on search time and accuracy,
and the larger the mask, the longer the search time. In the moving window condition,
search performance reached asymptote when the window was 5 degrees (all letters/digits
falling within 2.5 degrees of the fixation point were visible with such a window size,
while all other letters were masked).
Where and When to Move the Eyes. While there have been considerable efforts
undertaken to determine the factors involved in deciding where and when to move the
eyes (Greene, 2006; Greene & Rayner, 2001a, 2001b; Hooge & Erkelens, 1996, 1998;
Jacobs, 1986; Vaughan, 1982), a clear answer to the issue has not emerged. Some have
concluded that fixation durations in search are the result of both preprogrammed saccades
and fixations that are influenced by the fixated information (Vaughan, 1982). Others
have suggested that the completion of foveal analysis is not necessarily the trigger for an
eye movement (Hooge & Erkelens, 1996, 1998) while others have suggested that it is
(Greene & Rayner, 2001b). Rayner (1995) suggested that the trigger to move the eyes in
a search task is something like: is the target present in the decision area of the perceptual
span? If it is not, a new saccade is programmed to move the eyes to a location that has
not been examined. As with reading (and presumably scene perception), attention would
move to the region targeted for the next saccade.
Finally, the decision about where to fixate next and when to move the eyes is
undoubtedly strongly influenced by the characteristics of the specific search task and the
density of the visual array. In a recent study, van Zoest, Donk, and Theeuwes (2004)
investigated what type of information had more influence over the placement of fixations:
goal-driven information (i.e., target knowledge) or distractor saliency. They found that
when fixations were made quickly, subjects tended to fixate the target and distractor
equally; however, for longer fixation latencies, the target was fixated more often. They
concluded that the longer observers took to choose a location and execute a saccade, the
more likely the saccade was to be influenced by goal-driven control. Thus it seems that the
parallels between visual search arrays and scenes are greater than with reading, in that
visual saliency plays a greater role in directing fixations. Also, search for targets within
visual search displays and scenes varies along dimensions that do not vary in
reading. For instance, with respect to search tasks, there are many different types of
targets that people may be asked to search for. Searching for a certain product on a
grocery store shelf, searching for a particular person in a large group picture, or searching for a
word in a dictionary may well yield very different strategies than skimming text for a
word (and hence influence eye movements in different ways). Although the task is
generally much better defined in visual search than in scene perception, it cannot be as
well specified as in reading.
General Comments on Eye Movements
In the preceding sections, we have reviewed research on eye movements in three
tasks that are very much related to what happens when viewers look at print
advertisements. Although there are obviously many differences between reading, scene
perception, and visual search, there are some general principles that we suspect hold
across the three tasks (and are relevant for considering eye movements when looking at
ads). First, how much information is processed on any fixation (the perceptual span or
functional field of view) varies as a function of the task. The perceptual span is
obviously smaller in reading than in scene perception and visual search. Thus, for
example, fixations in scene perception tend to be longer and saccades are longer because
more information is being processed on a fixation. Second, the difficulty of the stimulus
influences eye movements: in reading, when the text becomes more difficult, eye
fixations get longer and saccades get shorter; likewise in scene perception and visual
search, when the array is more difficult (crowded, cluttered, dense), fixations get longer
and saccades get shorter. Third, the difficulty of the specific task (reading for
comprehension versus reading for gist, searching for a person in a scene versus looking at
the scene for a memory test, and so on) clearly influences eye movements across the three
tasks. Finally, in all three tasks there is some evidence (Najemnik & Geisler, 2005;
Rayner, 1998) that viewers integrate information poorly across fixations and that what is
most critical is that there is efficient processing of information on each fixation.
Eye Movements and Advertisements
There has been considerably less research on eye movements when looking at ads
than on reading, scene perception, and visual search.
Obviously, however, what is known about eye movements in these
other tasks has some relevance to looking at ads since there is often a reading component,
a scene perception component, and a search component to the task of looking at an ad.
While there was some research on eye movements while viewers examined print
advertisements prior to the late 1990s (see Radach et al., 2003, for a summary), it tended
to be rather descriptive and non-diagnostic. More recent research has focused on
attempts to analytically determine how (1) aspects of the ad and (2) the goal of the viewer
interact to influence looking behavior and the amount of attention devoted to different
parts of the ad. For example, Rayner et al. (2001) asked American participants to
imagine that they had just moved to the United Kingdom and that they needed to either
buy a new car (the car condition) or skin care products (the skin care condition). Both
groups of participants saw the same set of 24 ads; participants in the car group saw 8
critical car ads, but they also saw 8 critical skin care ads and 8 filler ads (consisting of a
variety of ad types) while participants in the skin care group also saw the same 8 car ads,
the same 8 skin care ads, and the same 8 filler ads. Obviously, the two different types of
ads should have differing amounts of relevance to the viewers. Indeed, viewers in the car
condition spent much more time looking at car ads than at skin care ads, while viewers in
the skin care condition spent much more time looking at skin care ads than car ads.
In a follow-up experiment, Rayner et al. (2006) used the same set of ads, but this
time participants were asked to rate the ads in terms of (1) how effective each ad was or
(2) how much they liked the ad. Interestingly, the pattern of looking times was very
different in this experiment in comparison to the earlier Rayner et al. (2001) study.
Indeed, when asked to rate the ads for effectiveness or likeability, viewers tended to
spend much more time looking at the picture part of the ad in comparison to the text. In
contrast, viewers in the Rayner et al. (2001) study spent much more time reading the text
portion of the ad, particularly if the ad was relevant for their goal. Thus, viewers in the
car condition spent a lot of time reading the text in the car ads (but not in the skin care
ads), while those in the skin care condition spent a lot of time reading the text in the skin
care ads (but not in the car ads). As seen in Table 2, the amount of time that viewers
devoted to the picture or text part of the ad varied rather dramatically as a function of
their goals. When the goal was to think about actually buying a product, they spent more
time reading; when the goal was to rate the ad, they spent much more time looking at the
picture (for further evidence of the importance of the viewer’s goals, see Pieters &
Wedel, 2007).
Insert Table 2 about here
Clearly, advertisements differ in many ways, yet from our perspective there
appear to be some underlying principles with respect to how viewers inspect them. First,
when viewers look at an ad with the expectation that they might want to buy a product,
they often quickly move their eyes to the text in the ad (Rayner et al., 2001), especially
the large text (typically called the headline). Second, viewers spend more time on
implicit ads in which the pictures and text are not directly related to the product than they
spend on explicit ads (Radach et al., 2003). Third, although brand names tend to take up
little space in an ad, they receive more eye fixations per unit of surface than text or
pictures (Wedel & Pieters, 2000). Fourth, viewers tend to spend more time looking at the
text portion than at the picture portion of the ad, especially when the amount of space
taken up is taken into account (Rayner et al., 2001; Wedel & Pieters, 2000). Fifth,
viewers typically do not alternate fixations between the text and the picture part of the ad
(Rayner et al., 2001, 2006). That is, given that the eyes are in either the text or picture
part of the ad, the probability that the next fixation is also in that part of the ad is fairly
high (about .75, Rayner et al., 2006). Rayner et al. (2001) found that viewers tended to
read the headline or large print, then the smaller print, and then they looked at the picture
(although some viewers did an initial cursory scan of the picture). However, Radach et
al. (2003) found that their viewers looked back and forth between different elements
(often scanning back and forth between the headline, the text, and the picture). Radach
et al. (2003) argued that the differences lie in the fact that the tasks they used were more
demanding than those used by Rayner et al. (2001). This brings us to the sixth important
point: it is very clear that the goal of the viewer very much influences the pattern of eye
movements and how much time viewers spend on different parts of the ad (Pieters &
Wedel, 2007; Rayner et al., 2006). As noted above (see Table 2), where people look
(and how soon they look at the text or the picture part of the ad) varies rather
dramatically as a function of the goals of the viewer (Rayner et al., 2006).
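Statistics like the .75 within-region transition probability mentioned above are computed from scan paths in which each fixation is labeled by the region of the ad it fell in. The sketch below is our own illustration with hypothetical data, not the analysis code from the studies cited:

```python
# Sketch: estimating region-to-region transition probabilities from a
# sequence of fixations labeled by ad region ("text" vs. "picture").
from collections import Counter

def transition_probs(regions):
    pair_counts = Counter(zip(regions, regions[1:]))
    from_counts = Counter(regions[:-1])
    return {pair: n / from_counts[pair[0]] for pair, n in pair_counts.items()}

# Hypothetical scan path over one ad:
scan = ["text", "text", "text", "picture", "picture", "text", "text"]
for (src, dst), p in sorted(transition_probs(scan).items()):
    print(f"{src} -> {dst}: {p:.2f}")
# picture -> picture: 0.50
# picture -> text: 0.50
# text -> picture: 0.25
# text -> text: 0.75
```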
Summary
In this chapter, we have reviewed the basic findings concerning eye movements
when (1) reading, (2) looking at a scene, (3) searching through a visual array, and (4)
looking at ads. Although there is no question that the tasks differ considerably, and that
eye movements also differ considerably as a function of the task, it is the case that eye
movements can be very informative about what exactly viewers do in each type of task.
Each of these points has been discussed in the preceding sections. We didn’t discuss how
people look at ads on web pages (or eye movements on web pages in general) since such
research is in its infancy. But, we do suspect that many of the findings we have outlined
above with more traditional tasks will carry over to that situation. It will also be
interesting to see how well the findings we have described hold up when viewers look at
dynamically changing scenes (as virtually all of the work that we described has dealt with
static scenes). Finally, our expectation is that eye movements will continue to play a
valuable role for those interested in how ads are processed and how effective they are for
consumers.
References
Antes, J.R. (1974). The time course of picture viewing. Journal of Experimental
Psychology, 103, 62-70.
Baddeley, R. J., & Tatler, B. W. (2006). High frequency edges (but not contrast) predict
where we fixate: a Bayesian system identification analysis. Vision Research, 46,
2824-2833.
Beck, M.R., Peterson, M.S., Boot, W.R., Vomela, M., & Kramer, A.F. (2006). Explicit
memory for rejected distractors during visual search. Visual Cognition, 14, 150-
174.
Beck, M.R., Peterson, M.S., & Vomela, M. (2006). Memory for where, but not what, is
used during visual search. Journal of Experimental Psychology: Human
Perception and Performance, 32, 235-250.
Becker, W., & Jürgens, R. (1979). Analysis of the saccadic system by means of double
step stimuli. Vision Research, 19, 967-983.
Bertera, J. H., & Rayner, K. (2000). Eye movements and the span of effective stimulus in
visual search. Perception & Psychophysics, 62, 576-585.
Boutsen, L., Lamberts, K. & Verfaillie, K. (1998) Recognition times of different views of
56 depth-rotated objects: A note concerning Verfaillie and Boutsen (1995).
Perception & Psychophysics, 60, 900-907.
Buswell, G. T. (1935). How people look at pictures. Chicago: University of Chicago
Press.
Carroll, P.J., & Slowiaczek, M.L. (1986). Constraints on semantic priming in reading: A
fixation time analysis. Memory & Cognition, 14, 509-522.
Castelhano, M. S., & Henderson, J. M. (2006a). Initial scene representations facilitate eye
movement guidance in visual search. Journal of Experimental Psychology:
Human Perception and Performance, in press.
Castelhano, M. S., & Henderson, J. M. (2006b). The influence of color and structure on
perception of scene gist. Manuscript under review.
De Graef, P. (2005). Semantic effects on object selection in real-world scene perception.
In G. Underwood (ed), Cognitive processes in eye guidance. Oxford: Oxford
University Press.
De Graef, P., Christiaens, D., & d'Ydewalle, G. (1990). Perceptual effects of scene
context on object identification. Psychological Research, 52, 317-329.
Deubel, H., & Schneider, W.X. (1996). Saccade target selection and object recognition:
Evidence for a common attentional mechanism. Vision Research, 36, 1827-1837.
Duffy, S. A., Morris, R. K., & Rayner, K. (1988). Lexical ambiguity and fixation times in
reading. Journal of Memory and Language, 27, 429-446.
Ehrlich, S. E, & Rayner, K. (1981). Contextual effects on word perception and eye
movements during reading. Journal of Verbal Learning and Verbal Behavior, 20,
641-655.
Findlay, J. M. (2004). Eye scanning and visual search. In J. M. Henderson, and F.Ferreira
(Eds.), The interface of language, vision, and action: Eye movements and the
visual world (pp 135-160). New York: Psychology Press.
Findlay, J. M., & Gilchrist, I. D. (1998). Eye guidance and visual search. In G.
Underwood (Ed.), Eye guidance in reading and scene perception (pp. 295-312).
Oxford, England: Elsevier.
Findlay, J. M., & Gilchrist, I. D. (2003). Active vision. The Psychology of looking and
seeing. Oxford: Oxford University Press.
Friedman, A. (1979). Framing pictures: The role of knowledge in automatized encoding
and memory for gist. Journal of Experimental Psychology: General, 108, 316-
355.
Goldberg, J.H. (1999). Visual search of food nutrition labels. Human Factors, 41, 425-
437.
Greene, H. (2006). The control of fixation duration in visual search. Perception, 35,
303-315.
Greene, H., & Rayner, K. (2001a). Eye movements and familiarity effects in visual
search. Vision Research, 41, 3763-3773.
Greene, H., & Rayner, K. (2001b). Eye-movement control in direction-coded visual
search. Perception, 29, 363-372.
Grimes, J. (1996). On the failure to detect changes in scenes across saccades. In K. Akins
(Ed.), Vancouver studies in cognitive science: Vol. 5. Perception (pp. 89–110).
New York: Oxford University Press.
Grimes, J., & McConkie, G. (1995). On the insensitivity of the human visual system to image changes made during saccades. In K. Akins (Ed.), Problems in perception. Oxford, UK: Oxford University Press.
Henderson, J. M. (1992). Identifying objects across saccades: Effects of extrafoveal
preview and flanker object context. Journal of Experimental Psychology:
Learning, Memory, and Cognition, 18, 521-530.
Henderson, J. M., & Castelhano, M. S. (2005). Eye movements and visual memory for scenes. In G. Underwood (Ed.), Cognitive processes in eye guidance (pp. 213-235). Oxford: Oxford University Press.
Henderson, J. M., & Ferreira, F. (1990). Effects of foveal processing difficulty on the perceptual span in reading: Implications for attention and eye movement control. Journal of Experimental Psychology: Learning, Memory, and Cognition, 16, 417-429.
Henderson, J. M., & Ferreira, F. (2004). Scene perception for psycholinguists. In J. M. Henderson & F. Ferreira (Eds.), The interface of language, vision, and action: Eye movements and the visual world (pp. 1-58). New York: Psychology Press.
Henderson, J. M., & Hollingworth, A. (1999). High-level scene perception. Annual
Review of Psychology, 50, 243–271.
Henderson, J. M., McClure, K. K., Pierce, S., & Schrock, G. (1997). Object identification without foveal vision: Evidence from an artificial scotoma paradigm. Perception & Psychophysics, 59, 323-346.
Henderson, J. M., Pollatsek, A., & Rayner, K. (1987). The effects of foveal priming and
extrafoveal preview on object identification. Journal of Experimental Psychology:
Human Perception and Performance, 13, 449-463.
Henderson, J.M., Pollatsek, A., & Rayner, K. (1989). Covert visual attention and
extrafoveal information use during object identification. Perception &
Psychophysics, 45, 196-208.
Henderson, J. M., Weeks, P. A. Jr., & Hollingworth, A. (1999). Effects of semantic
consistency on eye movements during scene viewing. Journal of Experimental
Psychology: Human Perception and Performance, 25, 210-228.
Henderson, J. M., Williams, C. C., Castelhano, M. S., & Falk, R. J. (2003). Eye
movements and picture processing during recognition. Perception &
Psychophysics, 65, 725-734.
Hidalgo-Sotelo, B., Oliva, A., & Torralba, A. (2005). Human learning of contextual priors for object search: Where does the time go? In Proceedings of the 3rd Workshop on Attention and Performance at the International Conference on Computer Vision and Pattern Recognition (CVPR). San Diego, CA.
Hoffman, J. E., & Subramaniam, B. (1995). The role of visual attention in saccadic eye movements. Perception & Psychophysics, 57, 787-795.
Hooge, I. T. C., & Erkelens, C. J. (1996). Control of fixation duration during a simple
search task. Perception & Psychophysics, 58, 969-976.
Hooge, I. T. C., & Erkelens, C. J. (1998). Adjustment of fixation duration during visual search. Vision Research, 38, 1295-1302.
Horowitz, T. S., & Wolfe, J. M. (1998). Visual search has no memory. Nature, 394, 575-577.
Inhoff, A. W., & Rayner, K. (1986). Parafoveal word processing during eye fixations in
reading: Effects of word frequency. Perception & Psychophysics, 40, 431-439.
Irwin, D. E. (2004). Fixation location and fixation duration as indices of cognitive processing. In J. M. Henderson & F. Ferreira (Eds.), The interface of language, vision, and action: Eye movements and the visual world (pp. 105-134). New York: Psychology Press.
Ishida, T., & Ikeda, M. (1989). Temporal properties of information extraction in reading studied by a text-mask replacement technique. Journal of the Optical Society of America A: Optics and Image Science, 6, 1624-1632.
Itti, L., & Koch, C. (2000). A saliency-based search mechanism for overt and covert shifts of visual attention. Vision Research, 40, 1489-1506.
Itti, L., & Koch, C. (2001). Computational modeling of visual attention. Nature Reviews:
Neuroscience, 2, 194-203.
Jacobs, A.M. (1986). Eye movement control in visual search: How direct is visual span
control? Perception & Psychophysics, 39, 47-58.
Juhasz, B. J., & Rayner, K. (2003). Investigating the effects of a set of intercorrelated variables on eye fixation durations in reading. Journal of Experimental Psychology: Learning, Memory, and Cognition, 29, 1312-1318.
Juhasz, B. J., & Rayner, K. (2006). The role of age-of-acquisition and word frequency in reading: Evidence from eye fixation durations. Visual Cognition, 13, 846-863.
Kowler, E., Anderson, E., Dosher, B., & Blaser, E. (1995). The role of attention in
programming saccades. Vision Research, 35, 1897-1916.
Levin, D. T., & Simons, D. J. (1997). Failure to detect changes to attended objects in motion pictures. Psychonomic Bulletin & Review, 4, 501-506.
Liechty, J., Pieters, F. G. M., & Wedel, M. (2003). Global and local covert visual attention: Evidence from a Bayesian hidden Markov model. Psychometrika, 68, 519-541.
Liversedge, S.P., Rayner, K., White, S.J., Vergilino-Perez, D., Findlay, J.M., &
Kentridge, R.W. (2004). Eye movements while reading disappearing text: Is there
a gap effect in reading? Vision Research, 44, 1013-1024.
Loftus, G. R., & Mackworth, N. H. (1978). Cognitive determinants of fixation location
during picture viewing. Journal of Experimental Psychology: Human Perception
and Performance, 4, 565-572.
Mackworth, N. H., & Morandi, A. J. (1967). The gaze selects informative details within
pictures. Perception & Psychophysics, 2, 547-552.
Mannan, S. K., Ruddock, K. H., & Wooding, D. S. (1995). Automatic control of saccadic eye movements made in visual inspection of briefly presented 2-D images. Spatial Vision, 9, 363-386.
Mannan, S. K., Ruddock, K. H., & Wooding, D. S. (1996). The relationship between the locations of spatial features and those of fixations made during visual examination of briefly presented images. Spatial Vision, 10, 165-188.
McConkie, G. W. (1991). Perceiving a stable visual world. In Proceedings of the Sixth European Conference on Eye Movements (pp. 5-7). Leuven, Belgium: Laboratory of Experimental Psychology.
McConkie, G. W., & Rayner, K. (1975). The span of the effective stimulus during a fixation in reading. Perception & Psychophysics, 17, 578-586.
McConkie, G. W., & Zola, D. (1979). Is visual information integrated across successive
fixations in reading? Perception & Psychophysics, 25, 221-224.
McPeek, R. M., Skavenski, A. A., & Nakayama, K. (2000). Concurrent processing of
saccades in visual search. Vision Research, 40, 2499–2516.
Morris, R. K. (1994). Lexical and message-level sentence context effects on fixation
times in reading. Journal of Experimental Psychology: Learning, Memory, and
Cognition, 20, 92-103.
Morrison, R. E., & Rayner, K. (1981). Saccade size in reading depends upon character
spaces and not visual angle. Perception & Psychophysics, 30, 395-396.
Najemnik, J., & Geisler, W. S. (2005). Optimal eye movement strategies in visual search. Nature, 434, 387-391.
Neider, M. B., & Zelinsky, G. J. (2006). Scene context guides eye movements during search. Vision Research, 46, 614-621.
Nelson, W.W., & Loftus, G.R. (1980). The functional visual field during picture viewing.
Journal of Experimental Psychology: Human Learning and Memory, 6, 391-399.
Nodine, C. F., Carmody, D. P., & Herman, E. (1979). Eye movements during visual search for artistically embedded targets. Bulletin of the Psychonomic Society, 13, 371-374.
O'Regan, J. K., Rensink, R. A., & Clark, J. J. (1999). Change blindness as a result of 'mudsplashes'. Nature, 398, 34.
Parker, R. E. (1978). Picture processing during recognition. Journal of Experimental
Psychology: Human Perception and Performance, 4, 284-293.
Parkhurst, D. J., & Niebur, E. (2003). Scene content selected by active vision. Spatial
Vision, 16, 125–154.
Parkhurst, D., Law, K., & Niebur, E. (2002). Modeling the role of salience in the
allocation of overt visual attention. Vision Research, 42, 107-123.
Peterson, M. S., Kramer, A. F., Wang, R. F., Irwin, D. E., & McCarley, J. S. (2001).
Visual search has memory. Psychological Science, 12, 287-292.
Pieters, F. G. M., & Warlop, L. (1999). Visual attention during brand choice: The impact of time pressure and task motivation. International Journal of Research in Marketing, 16, 1-16.
Pieters, F. G. M., Rosbergen, E., & Wedel, M. (1999). Visual attention to repeated print advertising: A test of scanpath theory. Journal of Marketing Research, 36, 424-438.
Pieters, R., & Wedel, M. (2007). Goal control of attention to advertising: The Yarbus
implication. Journal of Consumer Research, in press.
Pollatsek, A., Rayner, K., & Collins, W. E. (1984). Integrating pictorial information across eye movements. Journal of Experimental Psychology: General, 113, 426-442.
Pollatsek, A., Rayner, K., & Henderson, J. M. (1990). Role of spatial location in
integration of pictorial information across saccades. Journal of Experimental
Psychology: Human Perception and Performance, 16, 199-210.
Pollatsek, A., Reichle, E. D., & Rayner, K. (2006). Tests of the E-Z Reader model:
Exploring the interface between cognition and eye-movement control. Cognitive
Psychology, 52, 1-52.
Posner, M. I. (1980). Orienting of attention. Quarterly Journal of Experimental
Psychology, 32, 3-25.
Radach, R., Lemmer, S., Vorstius, C., Heller, D., & Radach, K. (2003). Eye movements
in the processing of print advertisements. In J. Hyönä, R. Radach & H.
Deubel (Eds), The Mind's Eyes: Cognitive and Applied Aspects of Eye Movement
Research (pp. 609-632). Amsterdam: Elsevier Science Publishers.
Rayner, K. (1975). The perceptual span and peripheral cues in reading. Cognitive
Psychology, 7, 65-81.
Rayner, K. (1978). Eye movements in reading and information processing. Psychological
Bulletin, 85, 618-660.
Rayner, K. (1995). Eye movements and cognitive processes in reading, visual search, and
scene perception. In J. M. Findlay, R. Walker, & R.W. Kentridge (Eds.), Eye
movement research: Mechanisms, processes and applications (pp. 3-22).
Amsterdam: North Holland.
Rayner, K. (1998). Eye movements in reading and information processing: 20 years of research. Psychological Bulletin, 124, 372-422.
Rayner, K., Ashby, J., Pollatsek, A., & Reichle, E.D. (2004). The effects of frequency
and predictability on eye fixations in reading: Implications for the E-Z Reader
model. Journal of Experimental Psychology: Human Perception and
Performance, 30, 720-732.
Rayner, K., & Bertera, J. H. (1979). Reading without a fovea. Science, 206, 468-469.
Rayner, K., & Duffy, S.A. (1986). Lexical complexity and fixation times in reading:
Effects of word frequency, verb complexity, and lexical ambiguity. Memory &
Cognition, 14, 191-201.
Rayner, K., McConkie, G.W., & Ehrlich, S.F. (1978). Eye movements and integrating
information across saccades. Journal of Experimental Psychology: Human
Perception and Performance, 4, 529-544.
Rayner, K., & Fisher, D. L. (1987a). Eye movements and the perceptual span during visual search. In J. K. O'Regan & A. Lévy-Schoen (Eds.), Eye movements: From physiology to cognition (pp. 293-302). Amsterdam: North Holland.
Rayner, K., & Fisher, D. L. (1987b). Letter processing during eye fixations in visual
search. Perception & Psychophysics, 42, 87-100.
Rayner, K., Inhoff, A. W., Morrison, R. E., Slowiaczek, M. L., & Bertera, J. H. (1981). Masking of foveal and parafoveal vision during eye fixations in reading. Journal of Experimental Psychology: Human Perception and Performance, 7, 167-179.
Rayner, K., Liversedge, S. P., & White, S. J. (2006). Eye movements when reading disappearing text: The importance of the word to the right of fixation. Vision Research, 46, 310-323.
Rayner, K., Liversedge, S.P., White, S.J., & Vergilino-Perez, D. (2003). Reading
disappearing text: Cognitive control of eye movements. Psychological Science,
14, 385-389.
Rayner, K., McConkie, G. W., & Zola, D. (1980). Integrating information across eye
movements. Cognitive Psychology, 12, 206-226.
Rayner, K., Miller, B., & Rotello, C.M. (2006). Eye movements when looking at print
advertisements: The goal of the viewer matters. Manuscript under review.
Rayner, K., & Pollatsek, A. (1989). The psychology of reading. Englewood Cliffs, NJ:
Prentice Hall.
Rayner, K., & Pollatsek, A. (1992). Eye movements and scene perception. Canadian
Journal of Psychology, 46, 342-376.
Rayner, K., Reichle, E. D., & Pollatsek, A. (1998). Eye movement control in reading: An
overview and model. In G. Underwood (Ed.), Eye guidance in reading and scene
perception (pp. 243-268). Oxford, England: Elsevier.
Rayner, K., Rotello, C., Stewart, A., Keir, J., & Duffy, S. (2001). Integrating text and pictorial information: Eye movements when looking at print advertisements. Journal of Experimental Psychology: Applied, 7, 219-226.
Rayner, K., Slowiaczek, M.L., Clifton, C., & Bertera, J.H. (1983). Latency of sequential
eye movements: Implications for reading. Journal of Experimental Psychology:
Human Perception and Performance, 9, 912-922.
Rayner, K., Warren, T., Juhasz, B.J., & Liversedge, S.P. (2004). The effect of
plausibility on eye movements in reading. Journal of Experimental Psychology:
Learning, Memory, and Cognition, 30, 1290-1301.
Rayner, K., & Well, A. D. (1996). Effects of contextual constraint on eye movements in
reading: A further examination. Psychonomic Bulletin & Review, 3, 504-509.
Reichle, E. D. & Nelson, J. R. (2003). Local vs. global attention: Are two states
necessary? Comment on Liechty et al., 2003. Psychometrika, 68, 543-549.
Reichle, E. D., Pollatsek, A., Fisher, D. L., & Rayner, K. (1998). Toward a model of eye
movement control in reading. Psychological Review, 105, 125-157.
Reichle, E. D., Pollatsek, A. & Rayner, K. (2006). E-Z Reader: A cognitive-control,
serial-attention model of eye-movement behavior during reading. Cognitive
Systems Research, 7, 4-22.
Reichle, E. D., Rayner, K., & Pollatsek, A. (2003). The E-Z Reader model of eye movement control in reading: Comparison to other models. Behavioral and Brain Sciences, 26, 507-526.
Rensink, R. A., O'Regan, J. K., & Clark, J. J. (1997). To see or not to see: The need for attention to perceive changes in scenes. Psychological Science, 8, 368-373.
Saida, S., & Ikeda, M. (1979). Useful field size for pattern perception. Perception &
Psychophysics, 25, 119-125.
Sereno, S. C., O'Donnell, P. J., & Rayner, K. (2006). Eye movements and lexical ambiguity resolution: Investigating the subordinate bias effect. Journal of Experimental Psychology: Human Perception and Performance, 32, 335-350.
Torralba, A., Oliva, A., Castelhano, M.S., & Henderson, J.M. (2006). Contextual
guidance of attention in natural scenes. Psychological Review, in press.
van Diepen, P. M. J., & d'Ydewalle, G. (2003). Early peripheral and foveal processing in fixations during scene perception. Visual Cognition, 10, 79-100.
van Diepen, P. M. J., Wampers, M., & d'Ydewalle, G. (1998). Functional division of the visual field: Moving masks and moving windows. In G. Underwood (Ed.), Eye guidance in reading and scene perception (pp. 337-356). Oxford, England: Elsevier.
van Zoest, W., Donk, M., & Theeuwes, J. (2004). The role of bottom-up control in saccadic eye movements. Journal of Experimental Psychology: Human Perception and Performance, 30, 746-759.
Vaughan, J. (1982). Control of fixation duration in visual search and memory search:
Another look. Journal of Experimental Psychology: Human Perception and
Performance, 8, 709-723.
Vlaskamp, B.N.S., & Hooge, I.T.C. (2006). Crowding degrades saccadic search
performance. Vision Research, 46, 417-425.
Wedel, M., & Pieters, F. G. M. (2000). Eye fixations on advertisements and memory for brands: A model and findings. Marketing Science, 19, 297-312.
Williams, C. C., Henderson, J.M., & Zacks, R. T. (2005). Incidental visual memory for
targets and distractors in visual search. Perception & Psychophysics, 67, 816-827.
Williams, R.S., & Morris, R.K. (2004). Eye movements, word familiarity, and
vocabulary acquisition. European Journal of Cognitive Psychology, 16, 312-339.
Yarbus, A. (1967). Eye movements and vision. New York: Plenum Press.
Zelinsky, G. (2005). Specifying the components of attention in a visual search task. In L. Itti, G. Rees, & J. Tsotsos (Eds.), Neurobiology of attention (pp. 395-400). Elsevier.
Zelinsky, G. J., Rao, R. P. N., Hayhoe, M. M., & Ballard, D. H. (1997). Eye movements reveal the spatiotemporal dynamics of visual search. Psychological Science, 8, 448-453.
Acknowledgments
Preparation of this chapter was supported by a grant from the Microsoft Corporation and by Grant HD26765 from the National Institutes of Health. Correspondence should be addressed to Keith Rayner, Department of Psychology, University of Massachusetts, Amherst, MA 01003, USA.
Footnotes
1. Although vision is suppressed during saccades, mental processing continues for most cognitive tasks (see Irwin, 2004, for a review of when cognition is also suppressed during saccades).
2. The nature of the writing system also very much influences the size of the perceptual
span, but this is beyond the scope of the present chapter (see Rayner, 1998 for a review).
3. For a comprehensive overview of these models, see the 2006 special issue of Cognitive Systems Research (Vol. 7).
4. Our primary argument is that lexical processing drives the eyes through the text and
higher order processes primarily serve to intervene when something doesn’t compute (see
Rayner, Warren, Juhasz, & Liversedge, 2004).
Table 1. Eye movement characteristics in reading, scene perception, and visual search.

Task                 Mean Fixation Duration (ms)    Mean Saccade Size (degrees)
Silent reading       225-250                        2 (8-9 letter spaces)
Oral reading         275-325                        1.5 (6-7 letter spaces)
Scene perception     260-330                        4
Visual search        180-275                        3
Table 2. Mean viewing time (in seconds) and number of fixations on the text and picture parts of ads as a function of task. Values in parentheses give the percentage of viewing time spent on the text or picture (for Viewing Time) and the percentage of fixations falling on the text or picture (for Number of Fixations).

                          Viewing Time                Number of Fixations
                          Text         Picture        Text         Picture
Rayner et al. (2006)      3.64 (39%)   5.72 (61%)     14.7 (39%)   22.7 (61%)
Rayner et al. (2001)
  Intended                5.61 (73%)   2.12 (27%)     25.2 (72%)   9.8 (28%)
  Non-intended            3.60 (71%)   1.50 (29%)     16.4 (70%)   6.9 (30%)

Note: In the Rayner et al. (2001) study, "intended" refers to ads for products that viewers were instructed to consider purchasing, whereas "non-intended" refers to the other ads they viewed.
Figure Captions
Figure 1. Fixation duration frequency distributions for reading, scene perception, and visual search. The data are from the same 24 observers engaged in the three different tasks. No lower cutoff on fixation duration was applied to these distributions, but an upper cutoff of 1000 ms was used.
Figure 2. Examples of a moving window (with a 13-character window), a moving mask (with a 7-character mask), and the boundary paradigm. When the reader's eye movement crosses an invisible boundary location (the letter n), the preview word house changes to the target word print. The asterisk represents the location of the eyes in each example.
Figure 3. Examples of where viewers look in scenes. The top portion of the
figure shows where one viewer fixates in the scene (the dots represent fixation points and
the lines represent the sequence). The bottom portion shows where a number of different
viewers fixate (with the dots representing fixation locations across a number of viewers).
Figure 1
Figure 2
Normal Line:
Where do people look in print advertisements and
Moving Window Paradigm (13-character window):
Where xx xeople look inxxxxxxxxxxxxxxxxxxxxxxxxx
*
Where xx xxxxxxxxxok in print axxxxxxxxxxxxxxxxx
*
Moving Mask Paradigm (7-character mask):
Where do people lxxxxxxxprint advertisements and
*
Where do people look in xxxxxxxdvertisements and
*
Boundary Paradigm:
Where do people look in house advertisements and
*
Where do people look in print advertisements and
*
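The gaze-contingent manipulations illustrated above amount to simple string-replacement rules tied to the current point of fixation. The following Python sketch mimics the three paradigms for a single line of text; it is purely illustrative, and the function names, the symmetric window, and the treatment of spaces are our own simplifying assumptions (actual experiments implement the display changes in software synchronized to the eye tracker).

    # Illustrative sketch (not experimental code) of the three gaze-contingent
    # paradigms shown in Figure 2. Positions are character indices in the line.

    def moving_window(text: str, fixation: int, window: int = 13) -> str:
        """Moving window: letters outside a window centered on fixation are
        replaced with 'x'; spaces are preserved in this simplified variant."""
        half = window // 2
        return "".join(
            ch if abs(i - fixation) <= half or not ch.isalpha() else "x"
            for i, ch in enumerate(text)
        )

    def moving_mask(text: str, fixation: int, mask: int = 7) -> str:
        """Moving mask: the complementary case; the characters centered on
        fixation are masked and the rest of the line is left intact."""
        half = mask // 2
        return "".join(
            "x" if abs(i - fixation) <= half and ch != " " else ch
            for i, ch in enumerate(text)
        )

    def boundary_paradigm(fixation: int, boundary: int,
                          preview: str = "house", target: str = "print") -> str:
        """Boundary paradigm: the preview word is displayed until the eyes
        cross the invisible boundary, at which point the target replaces it."""
        word = preview if fixation < boundary else target
        return f"Where do people look in {word} advertisements and"

    line = "Where do people look in print advertisements and"
    print(moving_window(line, fixation=17))             # fixating the word "look"
    print(moving_mask(line, fixation=17))
    print(boundary_paradigm(fixation=20, boundary=23))  # preview still shown
    print(boundary_paradigm(fixation=25, boundary=23))  # boundary crossed

The crucial engineering constraint in all three paradigms is that the display change must be completed while vision is suppressed during the saccade, which is why such studies require high-sampling-rate eye trackers and fast display updates.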
Figure 3