A Reconsideration of Cognitive Load Theory
Wolfgang Schnotz & Christian Kürschner
Published online: 19 September 2007
© Springer Science + Business Media, LLC 2007
Abstract Cognitive load theory has been very influential in educational psychology during
the last decade in providing guidelines for instructional design. Whereas numerous
empirical studies have used it as a theoretical framework, a closer analysis reveals some
fundamental conceptual problems within the theory. Various generalizations of empirical
findings become questionable because the theory allows different and contradicting
possibilities to explain some empirical results. The article investigates these theoretical
problems by analyzing the conceptual distinctions between different kinds of cognitive
load. It emphasizes that reduction of cognitive load can sometimes impair learning rather
than enhancing it. Cognitive load theory is reconsidered both from the perspective of
Vygotski’s concept of the zone of proximal development and from the perspective of
research on implicit learning. Task performance and learning are considered as related, but
nevertheless fundamentally different processes. Conclusions are drawn for the further
development of the theory as well as for empirical research and instructional practice.
Keywords Working memory · Intrinsic load · Extraneous load · Germane load ·
Zone of proximal development · Implicit learning
Learning and instruction have undergone important changes in recent years. New
technologies enable the construction of learning environments that allow presenting
information electronically by different representational formats in flexible ways. Although
the technical aspect is fundamental for the functioning of these environments, it is by itself
not very interesting from a psychological or instructional science perspective. Instead, the
important aspects refer to the representational formats and to the perceptual as well as
higher cognitive processes that occur when learners interact with these learning
environments (Ainsworth and Van Labeke 2004; Mayer 2001, 2005; Schnotz 2001, 2005).
Educ Psychol Rev (2007) 19:469–508
W. Schnotz · C. Kürschner
Faculty of Psychology, Department of General and Educational Psychology,
University of Koblenz-Landau, Landau, Germany
W. Schnotz (*)
Department of General and Educational Psychology, Thomas-Nast-Str. 44,
76829 Landau (Pfalz), Germany
Questions of central importance are: What goes on in the mind of the learner when spoken or
written texts with or without static or animated pictures or graphs are presented to him or her?
How can the displayed information be adapted to the limitations of the cognitive system?
The necessity of adapting instruction to the constraints of the learner’s cognitive system
has been the main concern of cognitive load theory, which has been developed by John
Sweller and his colleagues and which has become increasingly influential in instructional
psychology (cf. Paas et al. 2003a, 2004; Paas and Van Gog 2006; Sweller 1999, 2003,
2005; Sweller and Chandler 1994; Sweller et al. 1998). The fundamental claim of this
theory is that without knowledge about the human cognitive architecture the effectiveness
of instructional design is likely to be random. More specifically, cognitive load theory
argues that many traditional instructional techniques do not adequately take into account the
limitations of the human cognitive architecture, as they unnecessarily overload the learner’s
working memory, the central “bottleneck” of his/her cognitive system. Accordingly,
cognitive load theory tries to integrate knowledge about the structure and functioning of the
human cognitive system with principles of instructional design.
Numerous empirical studies have demonstrated that traditional instruction can and should
be re-designed according to principles of cognitive load theory, and that this re-design results
in better learning. However, there are also numerous conceptual problems related to cognitive
load theory, which sometimes make interpretation of empirical findings difficult. Although
the concept of cognitive load has been frequently described in general terms and although
definitions have been provided for different kinds of cognitive load, a closer look reveals that
the exact nature of these different kinds of load is not sufficiently clear yet. Further
clarification is needed regarding the relations between different kinds of cognitive load and
whether they can and how they should be manipulated to enhance learning. Other open
questions refer to the role of working memory in the process of learning. Although working
memory is a key concept in cognitive load theory, it is not sufficiently clear to what extent
working memory is in fact required for learning. Finally, further clarification is needed
as to whether and in which way different kinds of cognitive load constrain each other, how
they relate to the process of learning and, last but not least, how they can be measured.
The following article aims at contributing some clarification to these issues. First, we
will outline the history of cognitive load theory and describe (second) the basic assumptions
of the theory. Third, we will analyze the relation between the two basic types of cognitive
load that were originally distinguished in cognitive load theory. Fourth and based on this
analysis, we will investigate how the concept of cognitive load is related to another
instructional and developmental psychological concept, the concept of the individual’s zone
of proximal development. Fifth, we will analyze how working memory contributes to
learning, taking into account also the possibilities of implicit (unconscious) learning. Sixth,
we will suggest re-defining a specific kind of cognitive load, which is assumed to
especially enhance learning, and will analyze its underlying constraints. Seventh, we will
deal with some problems of measuring cognitive load. Finally, we will summarize our
results and suggest further perspectives.
History of Cognitive Load Theory
Learning to solve problems
The development of cognitive load theory started in the late 1970s with a focus on students’
learning to solve problems (Sweller 1976). According to a widely held belief, students learn
problem solving simply by solving problems, which is mirrored for example in the field of
teaching mathematics by extensive practice in solving problems in algebra or geometry.
Various studies have shown, however, that problem solving is exceptionally demanding in
terms of working memory capacity. In the absence of more specific prior knowledge,
problem solvers have to search for a solution through means-ends analysis. This requires the
individual to hold the current problem state, the goal state, any sub-goal states, the relation
between these states, and the possible operators associated with these states continuously in
working memory. Accordingly, this technique turned out to result in little learning (Sweller
1980; Sweller et al. 1982). Instead of having students search for a specific problem goal,
it is also possible to present them with so-called goal-free problems. When solving goal-free
problems, students are simply asked to calculate the value of as many variables as they can.
In contrast to means-ends analysis, this strategy requires nothing more than considering
each problem state encountered and finding any operator that can be applied to this state. It
turned out that students learn considerably better to solve transfer problems from goal-free
problems than from traditional problem solving (Sweller and Levine 1982).
Further research demonstrated the effectiveness of other alternatives to the traditional
problem solving as a method of teaching and learning. Sweller and Cooper (1985)
investigated the use of worked-out examples (i.e., examples that provide a solution to a
problem) as a substitute for conventional problem solving in learning algebra. They found
that learners who studied worked-out examples, which focused their attention on problem
states and the associated operators, enhanced their ability to solve new algebra problems
more than learners who were required to solve the equivalent problems on their own
(Cooper and Sweller 1987).
Intrinsic and extraneous load
By the end of the 1980s, the concept of cognitive load was introduced to explain these
kinds of results (Sweller 1988; Sweller et al. 1990). Cognitive load referred to any demands
on working memory storage and processing of information. Within cognitive load, a
distinction was made between load that is caused by the intrinsic nature of the learning task
(intrinsic load), and cognitive load that is caused by the format of instruction (extraneous
load) rather than by the intrinsic characteristics of the learning task. Problem solving
through means-ends analysis was assumed to place a heavy extraneous cognitive load on
working memory, which interferes with learning. Goal-free problems and worked-out
examples were considered as an effective way to reduce extraneous cognitive load by
eliminating mental means-ends search processes (Sweller and Chandler 1991, 1994).
Besides problem solving, research within the framework of cognitive load theory focused
on knowledge acquisition from multiple sources of information. There were two effects that
attracted the researchers’ special attention: the split-attention effect and the modality effect
(cf. Yeung et al. 1997).
The split-attention effect occurs when the learner’s attention must be split between
multiple sources of visual information that have to be integrated for comprehension,
because the individual sources cannot be understood in isolation. For example, a geometric
diagram may be unintelligible for a learner without associated verbal explanations. The two
sources must be mentally integrated before comprehension can take place. Mental
integration imposes a considerable cognitive load on working memory. This load is
unnecessarily high when the two information sources are spatially separated rather than
spatially integrated. Sweller et al. (1990) found that the extraneous cognitive load of
separated sources of information can be considerably reduced by integrating the two
sources of information as far as possible (cf. Chandler and Sweller 1992; Bobis et al. 1993).
Split-attention could also be demonstrated in learning to operate a technical device such as
a computer. Chandler and Sweller (1996) found that a self-contained computer program
manual that physically integrated disparate information and which did not require the use of
the computer hardware was considerably better for comprehension and learning than an
instructional format that required continual interaction with the computer. Similar findings
regarding learning under the conditions of split-attention were reported by Mayer and his
co-workers, who have called these findings the spatial contiguity effect (Mayer 1997,
The modality effect also refers to a situation when multiple sources of information have
to be integrated for comprehension and when using only visual sources of information
would require learners to split their attention. In the case of the modality effect, extraneous
cognitive load is reduced not by spatially integrating different sources of visual information
as far as possible. Instead, it is reduced by presenting verbal material in auditory rather than
in visual form, as a spoken instead of a written text. Mousavi et al. (1995) showed that a
visually presented diagram combined with an auditorily presented text resulted in better
learning than the diagram combined with a visually presented text, if the learning task was
demanding. Similar findings were reported by Tindall-Ford et al. (1997). The authors
explained their results based on the assumption that working memory consists of an
auditory subsystem and a visual subsystem, which corresponds more or less to Baddeley’s
(1986) view of a phonological loop and a visuo-spatial sketchpad as subsystems of working
memory. Accordingly, effective working memory capacity was increased by including both
visual and auditory working memory instead of only visual working memory into cognitive
processing. The increase of effective working memory due to the use of two modalities and
the avoidance of split-attention considerably decreased extraneous cognitive load. Similar
findings were reported by Mayer and his co-workers, who have called this the modality
effect under the condition of temporal contiguity (Mayer 1997, 2001).
Introduction of germane load
Until the second half of the 1990s, research on cognitive load theory focused almost
exclusively on instructional designs intended to decrease extraneous cognitive load.
Because the intrinsic load refers to the inherent nature of the learning task and therefore was
assumed to be fixed, the only cognitive load that seemed to be manipulable was extraneous
load. This is also mirrored in publications about the reduction of cognitive load, which
concentrate entirely on extraneous load (Mayer and Moreno 2003).
Paas and van Merriënboer (1994) studied the effects of high and low variability of
problem situations on learning. They found that students confronted with variable sets of
problems were better able to categorize statistic word problems after learning and showed
better transfer, although the variability had increased cognitive load. The authors interpreted
these results as the effect of a further kind of cognitive load, which would have a positive
effect on learning. In other words: Besides the “necessary” intrinsic load and the “bad”
extraneous load, cognitive load theory introduced a third, a “good” cognitive load, which
was called germane load (Sweller et al. 1998). The assumption was that germane load is due
to the development of cognitive schemata, which requires extra working memory capacity. As
a consequence of introducing the concept of germane load, cognitive load theory
recommended that instructional design should decrease extraneous load, but could become
even more effective if it increases germane load, provided that the total cognitive load stays
within limits and does not overburden the learner’s working memory (Sweller 2005).
Redundancy and expertise reversal
Further research on cognitive load theory aimed at differentiating the picture of cognitive
load effects. It was demonstrated, for example, that integrating multiple sources of
information as far as possible with the intention to avoid split attention does not always
result in better learning. In fact, a split-attention effect only occurs when different sources of
information are unintelligible in isolation and therefore need to be mentally integrated.
Sometimes, however, multiple sources of information can also be understood in isolation.
Chandler and Sweller (1991) found, for example, that adding a text to a diagram, which
only described the content of the diagram, did not improve learning. Learning was in this
case enhanced by the elimination of the textual material rather than by integrating the text
and the diagram as far as possible. When a second source of information merely reiterates
the information of the first source in a different form, it can be considered as redundant.
Cognitive load theory therefore refers to the beneficial effect of removing redundant
information as the redundancy effect (Sweller and Chandler 1994). The redundancy effect
occurs when students, who were not presented with redundant information, perform better
after learning than students, who were presented with redundant information. Although the
latter can fully understand the subject matter based on one source of information, they
unnecessarily process and integrate further sources of information without an additional
benefit for comprehension.
Recent cognitive load theory research has emphasized the necessity that the instructional
format has to be matched to the learner’s expertise. It was demonstrated, for example, that
the split-attention effect, the modality effect and the worked-examples effect do not occur
under all conditions. For example, Kalyuga et al. (1998) found that when learners’ expertise
increases, physically integrating multiple sources of information (as a means to minimize
split of attention) first lost its advantage and then became disadvantageous in comparison to
a physically separated presentation. The authors argue that a further source of information
that is essential for the comprehension of novices can become redundant for more expert
learners. Similar results were reported by Kalyuga et al. (2003).
The same kind of effect was found for the use of multiple modalities. Kalyuga et al.
(2000) showed that novices performed better after learning from a diagram plus an auditory
text than after learning from a diagram plus a visual text. Performance was worst when
students had learned only from a diagram. When students had acquired more prior
knowledge, however, the advantage of the diagram combined with an auditory text
disappeared. After a further increase of prior knowledge, students learned best when they
were presented only a diagram. Similar findings were reported by Leahy et al. (2003).
Kalyuga et al. (2001a) compared learning from worked-out examples with learning from
a less guided exploratory-based environment, which allowed participants to explore the
same material on their own. Tasks had different levels of difficulty. Whereas only minimal
differences could be observed for easy tasks, novices clearly benefited most from worked-
out examples with complex tasks. However, when learners became more experienced, the
advantage of the worked-out examples disappeared and the exploratory group performed
better than the worked examples group. Similarly, Kalyuga et al. (2001b) found that
worked-out examples were more beneficial for learning than traditional problem solving.
However, when learners achieved higher expertise, traditional problem solving turned out
to be the better alternative for learning. The dependencies of instructional effects on the
learners’ level of expertise are referred to as the expertise-reversal effect by cognitive load
theory (Kalyuga et al. 2003). An expertise-reversal effect occurs when an instructional
format that is beneficial for novices compared to other formats loses its advantage with
increasing expertise of the learners and finally becomes disadvantageous for individuals
with higher expertise.
The evolutionary perspective
In recent years, Sweller has aimed at embedding cognitive load theory into a broader
theoretical framework of evolutionary theory. Referring to recent work in the fields of
biology, psychology and anthropology (cf. Geary 2005), he assumes that structures and
processes of human cognition are closely analogous to structures and processes associated
with evolution by natural selection. For example, the human long-term memory is seen as
analogous to the genetic code in biology. Similar to the genetic code, which has
developed as a biological adaptation to a species’ environment, long-term memory has
developed as a cognitive adaptation to an individual’s environment. The necessarily small
steps of genetic modifications in order to promote biological evolution are considered as
analogous to the limited capacity of human working memory, which allows only small
modifications in long-term memory (Sweller 2003, 2004).
The evolutionary reframing of cognitive load theory has led to a clearer definition of
the theory’s scope. The assumption is that there exist biologically predetermined
dispositions to acquire specific knowledge. For example, we learn easily to discriminate
individual faces without being aware of any learning. The corresponding knowledge is
therefore called biologically primary knowledge. Other kinds of knowledge cannot be
learned as easily without effort and consciousness. They have to be explicitly taught, such
as, for example, reading, writing, and mathematics. This culturally mediated knowledge is
referred to as biologically secondary knowledge (Geary 2007). Cognitive load theory
claims only validity for the acquisition of biologically secondary knowledge, because this is
where working memory is needed (Sweller 2003; J. Sweller and S. Sweller 2006).
Basic Assumptions of Cognitive Load Theory
We have already introduced above some fundamental concepts of cognitive load theory. In
the following, we will analyze the basic assumptions of the theory more systematically. The
theory’s fundamental claim is that in order to be effective, instruction has to be adapted to
the structure and functioning of the learners’ cognitive architecture. Accordingly, we will
first describe the assumption of cognitive load theory about human memory as well as the
nature of the cognitive units stored and processed in memory. We will then describe the
theory’s assumptions about the different kinds of cognitive load imposed by different kinds
of processing under different kinds of conditions, and we will present the assumptions
about the interplay between these kinds of loads. Finally, we will describe the assumptions
of cognitive load theory about the process of understanding, the process of learning, and the
resulting instructional consequences.
Multiple memory stores. Cognitive load theory assumes that the human cognitive
architecture consists of multiple memory stores, including a very limited working memory
and an extensive long-term memory. Working memory is limited in capacity and in duration
when dealing with novel information. It can combine, contrast or manipulate no more than
four information elements at one time (cf. Miller 1956), and without rehearsal, information
in working memory is lost within about 20 s (L. Peterson and M. Peterson 1959). Working
memory limitations disappear when dealing with information from long-term memory,
where information is organized into higher order units called cognitive schemata (cf.
Ericsson and Kintsch 1995). However, because instruction generally provides novel
information, the limitations of working memory make it difficult for the learner to
assimilate multiple information elements simultaneously (Sweller 2005). In its new,
evolution-oriented version, cognitive load theory assumes that the working memory
limitations are not accidental, but are an essential concomitant of human cognitive
architecture, because a somewhat smaller working memory is likely to be more efficient
than a larger one.
Cognitive load theory also assumes with reference to the work of Baddeley (1986) that
working memory includes different channels for visual and auditory information.
Accordingly, effective memory may be increased by presenting material in an auditory
and visual mode rather than in an only visual mode (Mousavi et al. 1995). Contrary to
Baddeley’s theory of working memory, cognitive load theory does not assume a
domain-unspecific central executive. Instead, the cognitive schemata stored in long-term memory
are assumed to function as a central executive, because schemata indicate what should be
done, when it should be done and how it should be done (Sweller 2005).
Cognitive schemata. Cognitive load theory assumes that information is organized in
long-term memory in the form of cognitive schemata. Schemata are cognitive constructions that
help to reduce the cognitive burden on working memory, because they allow categorizing
multiple elements of information as a single element. Accordingly, schemata provide the
elements of knowledge, which interact in working memory (Sweller and Chandler 1994;
Sweller et al. 1998). If the number of interacting elements (i.e. elements to be processed
simultaneously) exceeds working memory capacity, some elements must be combined into
(higher order) schemata, before the material can be understood (Marcus et al. 1996).
Cognitive load and mental effort. Each cognitive process that requires conscious control
puts a cognitive load on working memory. Cognitive load is seen as a construct
representing the working memory resources required to learn a particular material (Sweller
and Chandler 1994) or to perform a particular task (Sweller et al. 1998). The amount of
working memory resources that is actually allocated by the learner to the process of
learning or to task performance is called the mental effort. A learner’s performance on a
task depends on the cognitive load of the task and the learner’s mental effort invested into
it.
Footnote: Sweller (2005) assumes that a smaller working memory is more efficient than a larger one. The
argument is that in a working memory including four elements, finding the correct sequence of operations by
trial and error is still possible, because the number of permutations is only 24 (4!). In a working memory
including 10 elements, by contrast, the number of permutations would be more than 3 million (10!), which
would no longer allow finding the correct sequence of operations. We doubt, however, whether the analogy
between biological evolution and cognition really holds at this point. If a task requires n operations to be
performed in the right sequence, the learner has to find the right sequence out of the n! possibilities anyway
by trial and error, regardless of the size of his/her working memory. A larger working memory would then
allow longer sequences of operations to be stored and memorized and, thus, be more advantageous for learning
than a smaller working memory. A larger working memory also allows anticipating longer sequences of
operations. To our knowledge, human working memory is a relatively recent development of
evolution, and there are no indications that any human or nonhuman species was disadvantaged due to a too
large working memory.
Footnote: It should be noted that these definitions of cognitive load are not fully equivalent. Learning a
particular material (Sweller and Chandler 1994) means the acquisition of knowledge or skills that corresponds
to a change in long-term memory. Performing a particular task (Sweller et al. 1998) means finding a solution
by manipulating an external or internal situation regardless of whether or not a change in long-term memory
takes place.
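The figures in the permutation argument attributed to Sweller (2005) above are easy to verify. The following minimal Python sketch computes the number of possible orderings of n elements. The function name `orderings` is our own illustrative choice and is not part of the theory.

```python
import math


def orderings(n_elements: int) -> int:
    """Return the number of distinct sequences of n elements (n!)."""
    return math.factorial(n_elements)


# Compare the two working memory sizes discussed in the text.
for n in (4, 10):
    print(f"{n} elements: {orderings(n)} possible sequences")
```

The sketch only reproduces the arithmetic. It does not decide between the two interpretations of what a larger working memory would imply for learning.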
Intrinsic load. The cognitive load imposed by the intrinsic aspects of a task is called the
intrinsic load. In other words: Intrinsic load is due to the natural complexity of the
information that must be processed. It is determined entirely by element-interactivity, that
is, by the number of cognitive elements that have to be held simultaneously in working
memory (Sweller and Chandler 1994). For example, if an individual has to learn the syntax
of a language, the elements of the analysis can be the words of a sentence. The learner has
to analyze how each word of a sentence is related to its other words. In this case, each
element must be analyzed in conjunction with several other elements, and element
interactivity is high. High element interactivity imposes a high intrinsic cognitive load on
working memory. By contrast, if an individual has to learn long lists of vocabulary, a
huge number of elements must be assimilated. Nevertheless, element interactivity is low,
because the elements do not have to be held simultaneously in working memory. Element
interactivity cannot be determined merely by analyzing the tasks or the learning material,
because a large number of interacting elements for one learner may be only a single element
for another learner with more expertise. Expertise determines what counts as an element.
Accordingly, intrinsic cognitive load can be determined only with reference to a particular
level of expertise. This implies that for a specific learner’s expertise the intrinsic load of a
specific task cannot be altered (Sweller et al. 1998; Sweller 2005).
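The claim that expertise determines what counts as an element can be illustrated with a small Python sketch. It is a toy model under our own assumptions, not a measurement procedure from cognitive load theory: items are plain strings, a schema is a tuple of items that a learner treats as a single unit, and the names `count_elements`, `novice_schemata` and `expert_schemata` are hypothetical.

```python
def count_elements(items, schemata):
    """Count the units a learner must handle once known schemata absorb their members.

    Each schema whose members are all present collapses into one element;
    every item not covered by a schema still counts as its own element.
    """
    remaining = set(items)
    elements = 0
    for schema in schemata:
        members = set(schema)
        if members <= remaining:
            remaining -= members
            elements += 1
    return elements + len(remaining)


sentence = ["the", "dog", "chased", "a", "cat"]
novice_schemata = []                             # no phrases known yet
expert_schemata = [("the", "dog"), ("a", "cat")]  # two phrases treated as units

print(count_elements(sentence, novice_schemata))  # five separate words
print(count_elements(sentence, expert_schemata))  # two phrases plus one word
```

The same material thus yields a different element count for each learner, which is why intrinsic load can only be stated relative to a given level of expertise.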
Extraneous load. Whereas intrinsic load is caused by the task-intrinsic aspects of learning,
extraneous load is caused entirely by the format of the instruction (Sweller 2005; Sweller
et al. 1998). More specifically, extraneous load is an unnecessary load caused by the design
and organization of the learning material (Kalyuga et al. 1998). Extraneous load requires
an extra effort due to an inappropriate instructional format. Cognitive load theory has
defined extraneous load in different ways, which are not fully equivalent. One definition
considers extraneous load as resulting from an unnecessarily high degree of element
interactivity in working memory due to the instructional format. For example, if a diagram
is presented with integrated explanatory text, it is very hard to ignore the text even if the
learner does not need the text for understanding. The learner is forced to simultaneously
assimilate multiple elements of information, which imposes a heavy extraneous load on
working memory. Another definition considers extraneous load as resulting from irrelevant
cognitive activities. Activities are seen as irrelevant if they are not directed to schema
acquisition and schema automation (Sweller and Chandler 1994; Sweller 2005). The two
definitions are not equivalent, because cognitive activities that are irrelevant for learning do
not necessarily include high element interactivity. In this respect, the second definition is broader
than the first one. Regardless of the specific way of defining extraneous load, the theory
assumes that extraneous load interferes with learning and, thus, should be reduced as far as
possible by eliminating irrelevant cognitive activities (Leung et al. 1997; Sweller et al.
Germane load. When learners are engaged in conscious cognitive processing that is
directed to the construction of schemata, cognitive load is increased. However, this load is
germane because it assists schema construction (Sweller et al. 1998). Germane load is the
load caused by effortful learning resulting in schema construction and schema automation.
According to cognitive load theory, germane load should be increased as far as possible.
Instructional design that results in unused working memory capacity because of low
intrinsic load and low extraneous load may be further improved by encouraging learners to
engage in conscious cognitive processing that is directly relevant to the construction of
cognitive schemata (Sweller 2005).
Additivity of cognitive load. Cognitive load theory assumes that intrinsic load, extraneous
load and germane load are additive. The total load is the sum of the three kinds of
cognitive load. If intrinsic load is low, increases of germane load may be possible even if
extraneous load is high. This is why instructional design is not very important when
dealing with simple material that can be easily understood. However, if intrinsic load is
high, adding a heavy extraneous load may exceed the learner’s working memory capacity
or interfere with learning, because no capacity is left for germane load. Accordingly,
cognitive load effects can only be demonstrated using material with high element
interactivity. This is called the element-interactivity assumption (Sweller 2005).
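The additivity assumption can be written out as a short Python sketch. The numeric load values, the single `capacity` figure and the names `Load` and `room_for_germane` are assumptions made purely for illustration. Cognitive load theory as described here does not specify units or a procedure for obtaining such numbers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Load:
    """The three load types in arbitrary, hypothetical units."""
    intrinsic: float
    extraneous: float
    germane: float = 0.0

    @property
    def total(self) -> float:
        # Additivity assumption: total load is the plain sum of the three kinds.
        return self.intrinsic + self.extraneous + self.germane


def room_for_germane(intrinsic: float, extraneous: float, capacity: float) -> float:
    """Capacity left for germane load after intrinsic and extraneous load."""
    return max(0.0, capacity - intrinsic - extraneous)


# Simple material: capacity remains even with a poor instructional format.
print(room_for_germane(intrinsic=2.0, extraneous=5.0, capacity=10.0))
# Complex material: the same format leaves nothing for germane load.
print(room_for_germane(intrinsic=7.0, extraneous=5.0, capacity=10.0))
```

Under these assumed numbers, the same extraneous load is harmless for the simple material and exhausts capacity for the complex material, which mirrors the element-interactivity assumption.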
Learning. Learning is an increase in expertise due to an alteration in long-term memory. If
nothing has been altered in long-term memory, nothing has been learned. The major
mechanisms of learning are schema acquisition and schema automation. Schema
acquisition changes what individuals treat as an element. Thus, learning reduces cognitive
load. If a schema is acquired, the set of former elements that were integrated into the
schema can now be treated as a single element. In this way, schema acquisition reduces the
number of interacting elements in working memory. In former versions of cognitive load
theory, acquisition of schemata was assumed to contribute to the intrinsic load, because the
corresponding elements have to interact for a meaningful assimilation (Sweller and
Chandler 1994), whereas in the more recent versions of the theory, schema acquisition
contributes to the germane load (Sweller et al. 1998; Sweller 2005). Information can be
processed either consciously or automatically (Schneider and Shiffrin 1977; Shiffrin and
Schneider 1977). Once a schema has been acquired, further practice can permit it to be
processed automatically. This process is called schema automation. It allows cognitive
processes to occur without conscious control and, thus, allows providing working memory
reserves for other kinds of processes (Sweller 2005).
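The claim that schema acquisition reduces the number of interacting elements can be sketched as follows. The encoding of elements as strings and of schemata as sets is a hypothetical simplification chosen for illustration, not part of the theory itself.

```python
def interacting_elements(elements, schemata):
    """Count the units a learner must hold in working memory.

    elements: set of element names required by the task.
    schemata: list of sets; each set groups elements the learner has
              already integrated into one schema (hypothetical encoding).
    """
    remaining = set(elements)
    units = 0
    for schema in schemata:
        # A schema is treated as a single element only if it is non-empty
        # and all of its parts are still unabsorbed elements of the task.
        if schema and schema <= remaining:
            remaining -= schema
            units += 1
    return units + len(remaining)


task = {"a", "b", "c", "d", "e", "f"}
novice = interacting_elements(task, schemata=[])
advanced = interacting_elements(task, schemata=[{"a", "b", "c"}, {"d", "e"}])
print(novice, advanced)
```

In this toy example the novice has to handle six separate elements, whereas the learner with two schemata has to handle only three units for the same task.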
Understanding According to cognitive load theory, understanding occurs when all relevant
elements of information are processed simultaneously in working memory (Marcus et al.
1996). Material is too hard to understand if it consists of too many interacting elements that
cannot be held simultaneously in working memory (Sweller et al. 1998). Understanding
cannot occur until schema construction and automation have progressed to the point where
working memory can hold and process all essential elements. Cognitive load theory further
assumes that understanding also requires changes in long-term memory, in addition to processing
in working memory. Sweller (2005) argues that without changes in long-term memory,
nothing has been understood. In this respect, understanding is not clearly distinguished from
learning in cognitive load theory.
Instructional consequences Cognitive load theory assumes that without appropriate prior
knowledge, instructional guidance can provide a substitute for missing schemata that allows
learners to develop their own schemata (Sweller 2005). In the case of instruction, the
external information provided by others serves as an external executive for the learner’s
working memory, which directs further cognitive processing. Of course, other people’s
knowledge can only take over this function if it is made available in a suitable form.
According to cognitive load theory, many commonly used instructional techniques result in an
unnecessarily high extraneous load that interferes with learning (Sweller and Chandler
1994; Sweller 2003, 2005). Applications of the theory therefore aim at reducing extraneous
cognitive load caused by inappropriate instructional design as well as enhancing schema
construction or schema automation. Practical consequences include, for example, avoidance
of split-attention and unnecessary redundancy, the use of goal-free problems and worked
examples or the use of different modalities as described above.
A Closer Look at Intrinsic and Extraneous Load
Why intrinsic load is not necessarily fixed
Cognitive load theory aims at adapting instruction to the cognitive architecture and the
expertise of the learner. An instruction given by a teacher requires activities on the side of
the learner: The learner is presented a learning task as well as instructional help to solve the
task. Learning tasks can be selected from a huge variety of possible forms, depending on
the educational objectives. Learning tasks can require the student to comprehend a subject
matter by reading a text or a diagram, to solve a mathematical equation, to prove a theorem,
and so forth. Tasks are usually described according to the following format: The individual
is expected to show a specific kind of performance at a specific level of quality under specific
conditions as, for example, with or without help (Mager 1975). As has been mentioned
above, the intrinsic load is the necessary load of a specific task under specific conditions.
The lower the learner’s expertise, the higher the complexity of the task, and the less help is
available, the higher the required element-interactivity and, thus, intrinsic load on working
memory will be.
Learning tasks are derived from educational objectives and include therefore a normative
component. Accordingly, what counts as intrinsic cognitive load depends also on
educational objectives. Consider as an example 10th graders who should understand
abstract legal subject matters such as the laws regarding stock companies. These students
would need a highly readable text in order to understand, whereas a text written in lawyers’
terminology would be incomprehensible for them. If the 10th graders understood the
highly readable text sufficiently well, the cognitive load of comprehension would be
considered an intrinsic load. If they had to read a text written in lawyers’ terminology, they
would have a hard time understanding it, and the cognitive load of comprehension would
include extraneous load in addition to the intrinsic load, because the text difficulty would be
unnecessarily high. If the same text is presented to students of law, they are expected to
understand the text regardless of whether it would have been possible to simplify it. They will
have to read such documents later in their professional life and, thus, being able to cope
with them is part of the learning task. Accordingly, there is no extraneous load associated with
such texts for students of law, and the cognitive load of comprehension consists only of
intrinsic load. In other words: The distinction between intrinsic and extraneous load
depends (among other things) on the educational objectives.
Individuals learn from the learning tasks presented to them. They acquire task-
appropriate schemata and become less and less dependent on instructional help. Learning
increases expertise, and increasing expertise reduces intrinsic load. Whereas intrinsic load
varies with expertise, the intrinsic load of a specific learning task with regard to a specific
level of expertise cannot be changed, because at this level the intrinsic nature of the task
requires a specific amount of element-interactivity. This has led to an assertion frequently
made by cognitive load theorists, namely, that intrinsic load is fixed. Intrinsic load is indeed
fixed for a specific learning task at a specific level of expertise. However, intrinsic load is
not fixed in general. We will argue that instruction can and has to manipulate intrinsic load,
because learning tasks have to be carefully matched to the learners’ expertise.
Alignment of task complexity, instructional help, and expertise
In order to be effective, instruction has to be carefully matched to the individual’s learning
prerequisites. Accordingly, complexity of learning tasks and instructional help have to be
well aligned with the learners’ expertise. Tasks must not be too difficult, because otherwise
their intrinsic load (i.e. required element-interactivity) would overburden the learner’s
working memory. Tasks must also not be too easy, because otherwise the learner’s working
memory would be under-challenged by too low an intrinsic load.
Figure 1 shows, as a line graph, an idealized version of possible alignments and misalignments between
the learner’s expertise and task difficulty. The abscissa of the graph represents
expertise, whereas the ordinate represents task difficulty. The points on or close to the
diagonal function line represent instructional variants, in which expertise and task difficulty
are well aligned. Points relatively distant from the diagonal function line represent
instructional variants, which are characterized by a misalignment of expertise and task
difficulty. For the purpose of illustration, Fig. 1 displays two (arbitrarily
chosen) levels of expertise, a low level L1 and a high level L2, and two (arbitrarily
chosen) levels of task difficulty, an easy task T1 and a difficult task T2. The
different levels of task difficulty can be interpreted both in terms of task complexity and in
terms of instructional help. In the first case, a task is easy due to low complexity (T1) or it
is difficult due to high complexity (T2). In the second case, a task is easy because it is
presented with sufficient help (T1), or it is difficult because it is presented without help
(T2). Of course, both sources of difficulty can also combine in various forms. It should also
be emphasized that expertise, task complexity, instructional help and hence task difficulty
are continuous variables and that we have displayed only two levels just for simplicity.
As long as a learner has low expertise (L1), presenting him/her an easy task (T1) means
that expertise and difficulty are well aligned. This is represented in Fig. 1 by the fact that
the combination L1–T1 is located on the diagonal function line. Presenting the same learner
(L1) the difficult task (T2) would overburden his/her working memory. This is represented in the
figure by the fact that the combination L1–T2 is far above the diagonal function line. Learning
implies an increase of expertise, which is represented in Fig. 1 as a shift of the individual’s
position from L1 to L2. For a learner with high expertise (L2), a task at level T1 would be too
easy and under-challenge his/her capacities. This is represented by the fact that the combination
L2–T1 is far below the diagonal function line. Learners with high expertise (L2) need a more
difficult task (T2) for an adequate alignment between expertise and difficulty. This is represented
in the figure by the fact that the combination L2–T2 is located on the diagonal function line.
Fig. 1 Relations between intrinsic load (ICL) as a result of adequate alignment of learning task difficulty with learner’s expertise and additional extraneous load (+ECL) as a result of misalignment of task difficulty with expertise. Learning increases expertise
When the learner’s expertise and the difficulty of the learning task are well aligned (L1–
T1 and L2–T2), the learner has to deal only with intrinsic load. If expertise and difficulty
are not aligned, an additional extraneous load is generated which draws on the learner’s
resources. Misalignment between expertise and task difficulty exists in two variants. One
variant of misalignment is visualized by the area above the diagonal function line in Fig. 1.
It represents instruction, where the task difficulty exceeds expertise (L1–T2). In this case,
the learner will most likely be overloaded by too high element-interactivity. The other
variant of misalignment is visualized by the area beneath the diagonal function line in the
figure. It represents instruction, where the expertise exceeds task difficulty (L2–T1). In this
case, the learner wastes time and energy with processing unneeded help or solving too
simple learning tasks, which do not challenge his/her cognitive capacities and have no (or
very limited) benefit for learning.
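The alignment schema described above can be expressed as a simple classification. This is only a sketch: the common 0 to 1 scale for expertise and difficulty, the numeric values chosen for L1, L2, T1 and T2, and the `tolerance` that defines "close to the diagonal" are all assumptions introduced here for illustration.

```python
def classify_alignment(expertise, difficulty, tolerance=0.1):
    """Locate a combination of expertise and task difficulty relative to the diagonal.

    Both arguments are assumed to lie on a common 0..1 scale (hypothetical);
    `tolerance` is an assumed width for "on or close to the diagonal".
    """
    gap = difficulty - expertise
    if gap > tolerance:
        return "overload"          # above the diagonal, e.g. L1-T2
    if gap < -tolerance:
        return "under-challenge"   # below the diagonal, e.g. L2-T1
    return "aligned"               # on or near the diagonal: intrinsic load only


L1, L2 = 0.2, 0.8   # low and high expertise (arbitrary values)
T1, T2 = 0.2, 0.8   # easy and difficult task (arbitrary values)

combinations = {"L1-T1": (L1, T1), "L1-T2": (L1, T2),
                "L2-T1": (L2, T1), "L2-T2": (L2, T2)}
for name, (expertise, difficulty) in combinations.items():
    print(name, classify_alignment(expertise, difficulty))
```

Because expertise and difficulty are continuous variables, the same function can be applied to any intermediate combination, not only to the four corner cases.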
The schema displayed in Fig. 1 seems to fit many studies on cognitive load theory. For
illustrative purposes, we will briefly report a few sample studies and try to map them onto
the schema. For example, Kalyuga et al. (2001a) compared learning with worked-out
examples and learning under conditions where participants could explore the same material
on their own. Worked-out examples provide more help, which makes tasks easier (T1),
whereas free exploration provides less help, which makes tasks more difficult (T2). Novices
(L1) benefited more from the worked examples (T1) than from the free exploration (T2),
whereas more experienced learners (L2) benefited more from free exploration (T2) than
from the worked examples (T1). Cooper et al. (2001) compared learning through real
worked examples, which might be easier (T1), with learning through imagining worked
examples, which might be more difficult (T2). Again, novices (L1) benefited more from the
real worked examples (T1) than from imagining worked examples (T2), whereas more
experienced learners (L2) benefited more from imagining worked examples (T2) than from
real worked examples (T1). According to cognitive load theory, the combination L1–T2
and the combination L2–T1 imposed an extraneous load on the learners’ resources, which
had negative effects on learning.
Various studies on cognitive load theory have dealt with the split-attention effect,
which occurs when learners have to split their attention among multiple sources of
information, which cannot be understood in isolation, but need to be mentally integrated.
Geometry instruction in textbooks, for example, frequently keeps a diagram and the
corresponding verbal statements unnecessarily separated, although for a beginner in
geometry the diagram as well as the verbal statements are unintelligible until they have been
mentally integrated. A segregated format imposes an unnecessary extraneous load on the
learner, because finding relations among the elements in the diagram and the text is more
difficult. Mental integration can be enhanced by an integrated instructional format, in which
the different sources of information are presented as closely together as possible. Mayer
(2001) has called this the spatial contiguity principle. Research on cognitive load has shown
repeatedly that multiple visual representations presented by an integrated format of
instruction (T1) are more beneficial for the learning of novices (L1) than a segregated
format (T2), whereas a segregated format (T2) is better for more advanced learners (L2)
than an integrated format (T1) (Chandler and Sweller 1991; Kalyuga et al. 1998, 2000;
Yeung et al. 1997).
Similar results were found in studies on the modality effect. As mentioned above, the
modality effect occurs when visually presented pictures are combined with text presented in
auditory rather than in visual form in order to avoid split of visual attention. Various studies
in instructional psychology have shown that for learners with low prior knowledge (L1) the
negative consequences of split-attention can be ameliorated by presenting verbal statements
in an auditory form (T1) rather than in a visual form (T2) (cf. Mayer 2001; Moreno and
Mayer 1999a, 1999b). However, this so-called modality effect disappeared or even became
negative when learners had higher expertise (L2) (Kalyuga 2000; Kalyuga et al. 2000;
Mousavi et al. 1995).
The redundancy effect and the expertise reversal effect can also be easily mapped onto the
schema of Fig. 1. The redundancy effect occurs when multiple sources of information are
intelligible in isolation (T2) for sufficiently advanced learners (L2), but are presented in an
integrated format (T1) that results in unnecessary integration of information without an
additional benefit for understanding. Leung et al. (1997), for example, found that
supplementing a mathematical equation that is intelligible for the learner with extensive
verbal information did not assist learning. The expertise reversal effect occurs when an
instructional format (T1) that is beneficial for novices (L1) becomes disadvantageous for
individuals with higher expertise (L2) (Kalyuga et al. 2003). All these findings can be
interpreted in terms of the schema presented in Fig. 1, namely, that both the combination
L1–T2 and the combination L2–T1 impose extraneous load on the individual’s resources
and therefore impair learning.
Kinds of extraneous load
As mentioned above, misalignment between expertise and task difficulty exists in two
variants: In one variant, task difficulty exceeds expertise (L1–T2), in the other variant,
expertise exceeds task difficulty (L2–T1). Although both variants represent inappropriate
forms of instruction, which are assumed to result in extraneous load, a closer analysis
reveals that different kinds of misalignment are associated with different kinds of
extraneous load. More specifically, extraneous load can be due to (exaggerated)
interactivity between relevant information, maintenance of relevant information, interac-
tivity between irrelevant information, and simply due to waste of time and effort.
Interactivity between relevant information Task difficulty can exceed expertise (L1–T2),
when the learning task is too complex or the learner does not receive sufficient instructional
help. In this case, the required element-interactivity can easily exceed his/her working
memory capacity. Due to a novice learner’s limited expertise, he/she does not possess
sufficient cognitive schemata and, thus, has to keep too many cognitive elements
simultaneously in working memory. In this case, extraneous load is due to an exaggerated
interactivity between relevant information.
Maintenance of relevant information Task difficulty can also exceed expertise (L1–T2)
because the learner needs to process and integrate multiple sources of information that are
unnecessarily presented in a segregated rather than an integrated format. In this case,
learners have to split their attention between these multiple sources, which imposes a heavy
extraneous load on their working memory. However, this extraneous load is not due to
unnecessarily high element interactivity, because a segregated format does not require more
elements to interact than an integrated format. Instead, the load results from the limited
temporal duration of information in working memory: Learners have to invest additional
effort to keep information in working memory while their attention is shifting from one
source of information to another. In this case, extraneous load is due to the need to
maintain cognitive elements in working memory rather than to too high element-interactivity.
Besides cases in which task difficulty exceeds expertise, there also exist cases in which
expertise exceeds task difficulty (L2–T1). This situation is frequently associated with the
redundancy effect. For example, if an advanced learner receives a diagram that is perfectly
intelligible and receives also an accompanying text that describes the diagram without
providing additional information, then the text is redundant for the learner. According to
cognitive load theory, redundancy overloads working memory and therefore creates
extraneous load. We will argue in the following, however, that this is only correct in some
cases, whereas in other cases working memory capacity is not overloaded by redundancy.
Interactivity between irrelevant information If a diagram is presented with a text in an
integrated format, it is very hard to ignore the text. The same is true if the text is presented
in auditory form and cannot be turned off, skipped or otherwise ignored by the learner
(Kalyuga 2000; Kalyuga et al. 2000). In this case, the learner is forced to consider the
diagram as well as the text and to relate them to each other. If the diagram is perfectly
intelligible, the text provides only superfluous information. Processing the diagram and the
text would in this case result in an unneeded increase of element-interactivity that draws on
working memory capacity.
Waste of time and effort Extraneous load can also be caused without high element
interactivity. Let us assume that an advanced learner is studying a diagram that is perfectly
intelligible for him/her. If the learner nevertheless afterwards reads an accompanying text,
which explains the diagram and which is presented in a segregated format (thus, with no
need to process diagram and text simultaneously), then he/she is processing unnecessary
information, which only wastes time and effort without an added value for learning. As
processing unnecessary information is irrelevant for learning, the corresponding cognitive
load counts as extraneous (Kalyuga et al. 1998). However, reading the unnecessary
information from the text after studying the diagram does not necessarily require high
element interactivity that takes away working memory capacity from studying the diagram
or even exceeds the learner’s working memory capacity. In this case, the extraneous load is
simply due to a waste of time and effort even with low element-interactivity.
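The four kinds of extraneous load distinguished above can be summarized in a small lookup table. The key names and the boolean flag for raised element-interactivity are a hypothetical encoding of the cases discussed in the text, introduced here only to make the taxonomy explicit.

```python
# (misalignment, cause) -> (kind of extraneous load, driven by raised element-interactivity?)
# All keys are hypothetical labels for the cases described in the text.
KINDS = {
    ("L1-T2", "too_complex_or_too_little_help"):
        ("interactivity between relevant information", True),
    ("L1-T2", "segregated_sources"):
        ("maintenance of relevant information", False),
    ("L2-T1", "unavoidable_redundant_text"):
        ("interactivity between irrelevant information", True),
    ("L2-T1", "separately_read_redundant_text"):
        ("waste of time and effort", False),
}


def describe(misalignment, cause):
    """Return a readable description of the kind of extraneous load."""
    kind, raised_interactivity = KINDS[(misalignment, cause)]
    level = "raised" if raised_interactivity else "not necessarily high"
    return f"{kind} (element-interactivity: {level})"


for key in KINDS:
    print(key, "->", describe(*key))
```

The table makes the central point visible: only two of the four kinds are tied to an increase in element-interactivity, so extraneous load cannot be equated with high element-interactivity.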
There is a fundamental difference between extraneous load due to high element-
interactivity and extraneous load due to waste of time and energy by processing of
unneeded instructional help, which is also relevant for understanding the expertise reversal
effect. As mentioned above, this effect occurs when an instructional format that is
beneficial for novices becomes disadvantageous for individuals with higher expertise. Let
us refer again to the example of a diagram combined with an explanatory text. If novices
benefit from this kind of instruction more than from seeing only the diagram, they are
obviously able to successfully process and integrate the pictorial and the verbal information.
This implies that the comprehension task does not exceed their working memory capacity.
As mentioned above, learning leads to the acquisition of new schemata and reduces
element-interactivity of a given task. Regarding the expertise reversal effect, it follows that
if the combination of diagram and text is advantageous for a novice, but becomes
disadvantageous for an advanced learner, the disadvantage cannot be caused by an
increased element-interactivity. Instead, processing of (formerly helpful) information that is
now unneeded help is extraneous load only due to a waste of time and effort with moderate
or even low element-interactivity.
Why reduction of cognitive load is not always helpful for learning
We have interpreted above the misalignment between expertise and instruction, when the
level of expertise exceeds task difficulty (L2–T1), as a cause of an extraneous load due to
interactivity between irrelevant information or due to waste of time and effort. However,
this specific kind of misalignment can also be interpreted from an alternative perspective,
which focuses on the intrinsic load. We have visualized the two different perspectives in
Fig. 2. The figure has the same basic structure as Fig. 1. The extraneous load perspective
mentioned above is visualized in Fig. 2 by the vertical arrow going down from L2–T2 to
L2–T1. This arrow can be read as follows: For a learner at expertise level L2, appropriate
instruction would include learning tasks at difficulty level T2. When instructional help is
provided, the task difficulty may be reduced to level T1. However, a learner at expertise
level L2 no longer needs this instructional support. Processing unneeded help causes
only interactivity of irrelevant information or a waste of time and effort without additional
benefit for learning, which imposes an extraneous load on the learner. According to this
interpretation, a shift from the well-aligned instructional combination L2–T2 down to the
misaligned combination L2–T1 is associated with an increase of extraneous load.
However, the misaligned instructional combination L2–T1 can also be considered from
an intrinsic load perspective. Remember that intrinsic load is due to the natural complexity
of the learning task, which is determined by its element-interactivity, that is, by the number
of cognitive elements that have to be held simultaneously in working memory. Remember
also that learning leads to the acquisition of schemata, which in turn reduces element-
interactivity and, thus, intrinsic load. Accordingly, the intrinsic load of a task at difficulty
level T1 is lower at expertise level L2 than at level L1. This reduction of intrinsic load is
visualized in Fig. 2 by the horizontal arrow going from L1–T1 on the left to L2–T1 on
the right. The arrow can be read as follows: For a learner at expertise level L1,
appropriate instruction would include learning tasks at difficulty level T1. After expertise
level has increased from L1 to L2 as a result of learning, difficulty level T1 is too low to
enhance further learning. Tasks at this level are too easy, because they do not challenge the
learner any more.
Fig. 2 Different interpretations of a misalignment of learning task difficulty (T1) with learner’s expertise (L2) resulting from unneeded help or too low task complexity: Compared to alignment L2–T2, the misalignment L2–T1 can be viewed as adding unnecessary extraneous load (+ECL). Compared to alignment L1–T1, the misalignment L2–T1 can also be viewed as decreasing intrinsic load (−ICL) to a too low level
Accordingly, one can argue that learners at expertise level L2 who are dealing with
learning tasks at level T1 suffer from a too low intrinsic load rather than from an extraneous
load. The presented learning tasks are too easy for these individuals, and learning is
impeded not by an additional extraneous load but by a too low intrinsic load, which does
not challenge the capacities of the learner.
There are various studies that indicate that learning can be impeded not only by too
high extraneous load, but also by too low intrinsic load, when the learning tasks do not
challenge the learner, either because they are too simple or because too much help is
provided. McNamara et al. (1996) found that low-knowledge readers benefited more from a
highly coherent text, which included much information explicitly and therefore reduced the
need for inferences, than from a less coherent text, which did not include as much explicit
information and therefore required more inferences. On the contrary, high-knowledge
readers showed deeper text comprehension after reading the less coherent and therefore
more difficult text than after reading the easier, more coherent text. It seemed that the more
coherent text included information that could be easily inferred by high-knowledge readers
and which therefore could have been omitted. In other words: The high-coherence text
made all details and interrelations explicit and provided therefore too much help for
coherence formation, when readers had higher expertise. The text was too easy for these
learners and did not challenge the readers’ capacities, because the intrinsic load of
comprehension was too low.
Similar results were reported by Kalyuga et al. (1998) and by Yeung et al. (1997), who
found that additions to textual material which increased coherence benefited low-
knowledge readers, but impeded high-knowledge readers. The authors used the traditional
view of cognitive load theory for interpreting their findings, namely, that the additional
information of the more coherent text was redundant for high-knowledge readers and, thus,
imposed an extraneous load on working memory. The authors’ view seems to be supported
by the higher mental effort ratings for the redundancy condition than for the non-
redundancy condition. However, it should be noted that reports of high mental effort are not
necessarily due to high element-interactivity in working memory. Strenuous processing of a
long-winded text that finally turns out not to be fruitful for understanding may also be
experienced as effortful, even if element-interactivity during reading is not high. It should
also be noted that low-knowledge readers were in fact able to understand the high-
coherence material, which implies that the required element-interactivity did not exceed
their working memory capacity. Because learning reduces element-interactivity, high-
knowledge readers cannot experience higher element-interactivity when reading high-
coherence material than low-knowledge readers, who can read and understand the same text
successfully. Given the conceptual problems regarding the assumption of extraneous load
due to unneeded help, we are more inclined to assume that learning was impeded in these
studies because tasks were too easy rather than assuming an additional (extraneous)
cognitive load. The learning tasks did not challenge the learner’s capacities and large parts
of his/her cognitive capacities remained unused, which affected learning negatively.
Schnotz and Rasch (2005) investigated how animated or static pictures influence
students’ understanding of time and date phenomena on the earth. One kind of learning
task required students to mentally represent the earth’s rotation around its axis. It was
assumed that observing an animation displaying the earth’s rotation would require less
effort than performing a corresponding mental simulation based on a static picture (see
Cooper et al. (2001) for a similar argument). Students performed better with the static than
with the animated pictures. Low expertise participants were more impaired by the
animation than high expertise participants. Because the low expertise participants invested
less learning time, whereas the high expertise participants invested more learning time into
learning from animation compared to learning from static pictures, it was concluded that the
animation had not imposed an extra cognitive load on low expertise learners. Instead, the
authors interpreted the negative effect of animation as a result of a too low intrinsic load:
Students who saw the animation did not have to perform the mental simulation on their
own, although they would have been able to do so. Instead, they could passively follow the earth’s
rotation displayed on a screen. The animation had facilitated the task of envisioning the
earth’s rotation, but this facilitation had made the task too easy, because the learners’
capabilities were not sufficiently challenged. Instead of assuming an additional extraneous
load that would interfere with learning, the authors considered a too low intrinsic load as a
possible reason for insufficient learning.
According to our analysis, reduction of cognitive load is not always helpful for learning.
Learning can be enhanced not only by reduction of extraneous load, but also by adapting the
intrinsic load to the learner’s level of expertise. This adaptation can require either a decrease
of intrinsic load due to too high learning task difficulty or an increase of intrinsic load due
to too low task difficulty.
To summarize: Individuals learn from the learning tasks presented to them. In order to
stimulate effective learning, instruction has to align learning task difficulty with the
learner’s level of expertise. Because intrinsic load is due to the natural complexity of
learning tasks and because learning tasks are chosen according to educational objectives,
the intrinsic load is also dependent on these objectives. Furthermore, because intrinsic load
is determined by element-interactivity and because element-interactivity is influenced by
expertise, the intrinsic load is also dependent on expertise. It follows that whether a
cognitive load is intrinsic or extraneous depends on the educational objectives as well as the
learner’s expertise.
When the level of learning task difficulty is well aligned with the learner’s level of
expertise, instruction is appropriate and the learner has only to deal with the inherent
intrinsic load of the learning task. In case of a misalignment between task difficulty and
expertise, the instruction imposes an extraneous cognitive load on the learner’s resources.
Whereas traditional cognitive load theory assumes that extraneous load is created by an
unnecessarily increased element-interactivity, our analysis reveals that there are different
kinds of extraneous cognitive load, which can influence learning negatively in different
ways. Not all of them are necessarily associated with high element-interactivity. Extraneous
load can be due to an exaggerated interactivity between relevant information. It can also be
due to the need of maintaining cognitive elements in working memory. Furthermore,
extraneous load can be caused by enforced interactivity of superfluous information in
working memory and it can simply be due to a waste of the learner’s time and effort even
when element-interactivity is low. Given the different variants of extraneous
cognitive load, distinct instructional measures should be expected to be needed in order to avoid
the different kinds of extraneous load.
Intrinsic load cannot be changed for a specific learning task with regard to a specific
level of expertise. In this respect, intrinsic load is fixed. However, intrinsic load is not fixed in
general. In fact, instruction has to manipulate both extraneous and intrinsic load. Whereas
extraneous load should be reduced by instructional design, intrinsic load should be adapted
to the expertise level of the learner. This means that intrinsic load should sometimes be
increased rather than reduced. According to the classical view of cognitive load theory, the
efficiency of instruction depends on the extent to which an extraneous cognitive load on the
learner’s working memory is minimized. This implies that instructional design should
reduce extraneous load as far as possible. Our previous analysis, however, suggests that
instead of focusing only on the reduction of extraneous load, instructional design should
sometimes also increase intrinsic load in order to create an adequate alignment of learner
expertise and learning task difficulty. As we will see in the following, the idea of adapting
the intrinsic load to the learner’s expertise is in line with another influential concept
in educational and developmental psychology, Vygotski’s (1963) zone of proximal
development.
Cognitive Load and the Zone of Proximal Development
We have visualized the idea of alignment between learning task difficulty and learners’
expertise in Figs. 1 and 2 by a graph, in which a diagonal function line represented
adequate alignment of expertise and difficulty. This provided a rough picture of
instructional landscapes, but it also oversimplified things because the relation between
expertise and difficulty is subtler. A more differentiated picture of the relation is given in
Fig. 3. In this figure, the non-shaded middle area between the two diagonal lines represents
cases of adequate instructional alignment between the learners’ expertise and task difficulty.
The shaded area above the middle area represents cases of too high task difficulty, whereas
the shaded area beneath the middle area represents cases of too low task difficulty.
Fig. 3 Enabling function and facilitating functions of reducing task difficulty by instructional help provided
to a learner at expertise level Li. Both functions have positive effects on learning as long as the reduction of
difficulty remains within the zone of proximal development (ZPD). If task difficulty is shifted below the
ZPD, the facilitation has negative effects on learning
Enabling and facilitating task performance
Figure 3 shows for a fictitious expertise level Li two different levels of task difficulty
labeled as Tmax_Li(H−) and as Tmax_Li(H+). Tmax_Li(H−) refers to the level of the most
difficult task a learner at expertise level Li can handle without help, whereas Tmax_Li(H+)
refers to the level of the most difficult task a learner at expertise level Li can handle with
help. For a learner at level Li, even the best possible help would not allow him/her to cope with
tasks at a difficulty level beyond Tmax_Li(H+). For the same learner, providing unnecessary
help or presenting simple tasks that decrease task difficulty below Tmax_Li(H−) would waste
the learner’s time and energy and under-challenge his/her cognitive capacities.
If instructional help reduces the difficulty of a task that would otherwise be impossible,
then the instructional help has an enabling function. Enabling means that due to a
reduction of cognitive load, a process becomes possible which would otherwise have remained
impossible. The enabling function of instructional help is represented in Fig. 3 by a small
vertical arrow indicating a transition from the upper shaded area into the non-shaded middle
area, where task difficulty is adequately aligned with expertise. If instructional help reduces
the difficulty of tasks that could otherwise be solved only with high mental effort, then the
help has a facilitating function. Facilitation means that due to a reduction of cognitive load,
processes that are already possible, but which still require high mental effort, become
possible with less effort.
Facilitation exists in two variants with different effects on learning. When the facilitation
stays within limits so that the task difficulty is still aligned with the learners’ expertise
(non-shaded middle area in Fig. 3), then the facilitation has instructionally positive effects.
Accordingly, we call this an instructionally positive facilitation. This is represented in Fig. 3
by a vertical arrow from the upper down to the lower diagonal line. When the facilitation
reduces the task difficulty to an extent that is no longer aligned with the learner’s expertise
(lower shaded area of Fig. 3), then the facilitation has instructionally negative effects.
Accordingly, we call this an instructionally negative facilitation. This is represented in
Fig. 3 by a vertical arrow within the lower shaded area. In this case, the learner wastes time
and energy on processing unneeded help without benefit for learning.
The two different levels of task difficulty indicated in Fig. 3 as Tmax_Li(H−) and as
Tmax_Li(H+) correspond exactly to what has been introduced by Vygotski (1963) as the
zone of proximal development (ZPD). According to Vygotski, the zone of proximal
development is defined as the range between a lower limit and an upper limit of task
difficulty. The lower limit of the ZPD is defined as the most difficult task the learner can
perform successfully without help, whereas the upper limit of the ZPD is defined as the
most difficult task the learner can perform successfully with help. In the following, we will
further elaborate this concept from a cognitive load perspective in order to better distinguish
between different functions of instructional manipulations.
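The distinctions drawn above can be made concrete in a minimal sketch. It assumes, purely for illustration, that task difficulty can be expressed on a single numeric scale. The names ZPD and classify_help, the returned labels and the example values are hypothetical and not part of the original analysis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ZPD:
    """Zone of proximal development on a hypothetical numeric difficulty scale."""
    lower: float  # Tmax_Li(H-): most difficult task manageable without help
    upper: float  # Tmax_Li(H+): most difficult task manageable with help


def classify_help(zpd: ZPD, before: float, after: float) -> str:
    """Classify help that lowers task difficulty from `before` to `after`."""
    if after > before:
        raise ValueError("help is assumed to reduce difficulty, not raise it")
    if after == before:
        return "no reduction"
    if after > zpd.upper:
        return "still too difficult"    # task stays above the ZPD
    if after < zpd.lower:
        return "negative facilitation"  # task is shifted below the ZPD
    if before > zpd.upper:
        return "enabling"               # impossible task moved into the ZPD
    return "positive facilitation"      # task was and remains inside the ZPD
```

For a learner whose zone spans difficulties 3 to 7 on this assumed scale, classify_help(ZPD(3, 7), 9, 6) returns "enabling", whereas classify_help(ZPD(3, 7), 4, 2) returns "negative facilitation".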
Expertise, task difficulty and intrinsic load
We have so far conceptualized the relationship between learner’s expertise and task
difficulty based on Cartesian diagrams, which include one axis for expertise and the other
axis for task difficulty (Figs. 1, 2, and 3). However, it is possible to conceptualize the
relationship also in another format, in which the same axis is used for both variables as
commonly used in item response theory. Within this theory the parameters of individuals
(i.e. the level of expertise) and the parameters of items (i.e. the difficulty level of tasks) are
considered as different points on the same variable (Sijtsma 2004). Figure 4 uses (in its
upper part) this format to display the relation between expertise, learning task difficulty and
learning task performance. The whole figure shows the assumed relation between different
levels of learners’ expertise, learners’ performance on a specific hypothetical task X and the
intrinsic load imposed by this task on working memory. For the sake of simplicity, we will
assume for a moment that there is no extraneous load.
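How expertise and difficulty can share one axis may be illustrated with the one-parameter logistic (Rasch) model, a common model in item response theory. The article does not commit to a specific model, so the Rasch form, the function name p_success and the logit scale are assumptions made for this sketch only.

```python
import math


def p_success(expertise: float, difficulty: float) -> float:
    """Rasch model: probability that a person solves an item.

    Person parameter (expertise) and item parameter (difficulty) are points
    on the same scale; only their difference matters.
    """
    return 1.0 / (1.0 + math.exp(-(expertise - difficulty)))
```

Under this assumed model, a learner whose expertise equals the task difficulty succeeds with probability 0.5, and the probability rises as expertise moves to the right of the task on the common axis.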
The upper part of Fig. 4 shows how the learner’s expertise (represented on the abscissa
of the figure) determines the likelihood of his/her successful performance (represented on
the ordinate of the figure) on task X. Learning as an increase of competence would be
represented in this figure as a movement on the abscissa from left to right. Within the area
of low expertise on the left-hand side, the likelihood of successful performance remains at 0%
up to the expertise level L1. Between the expertise level L1 and the expertise level L2, the
likelihood of successful performance increases from 0% to 100%. Beyond the expertise
level L2, the likelihood of successful performance remains at 100%. In other words: If the
learner’s expertise level is below L1, then the task is too difficult for the learner. If the
learner’s expertise level becomes higher than L1, then the task becomes increasingly
easy for the learner. At the expertise level L2 and beyond, the task is so easy that
performance is likely to be perfect.
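The course of the performance curve described for the upper part of Fig. 4 can be sketched as a piecewise function. The text fixes only the two plateaus (0% up to L1, 100% from L2 onwards). The linear rise between L1 and L2, the function name and the numeric expertise scale are assumptions of this sketch.

```python
def performance_likelihood(expertise: float, l1: float, l2: float) -> float:
    """Likelihood (0.0 to 1.0) of successful performance on task X."""
    if l2 <= l1:
        raise ValueError("L2 must lie above L1")
    if expertise <= l1:
        return 0.0  # task is too difficult for the learner
    if expertise >= l2:
        return 1.0  # task is so easy that performance is likely perfect
    # Assumed linear increase between the two expertise levels.
    return (expertise - l1) / (l2 - l1)
```

Learning, understood as an increase of competence, corresponds to calling this function with growing values of expertise while l1 and l2 stay fixed for the task.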
The lower part of Fig. 4 shows how the intrinsic cognitive load created by task X varies
with the learner’s level of expertise. Up to L1, the cognitive load (CL) of task X exceeds the
learner’s working memory capacity (WMC). The learner is therefore unable to perform the
task successfully. Between L1 and L2, the cognitive load of the task is lower than
the learner’s working memory capacity. The learner is therefore able to perform the task,
and there is free capacity of working memory left which can be used for germane cognitive
load activities (GCL). At L2, the intrinsic cognitive load of task X drops down to zero,
Fig. 4 Task performance (top) and cognitive load (bottom) of a hypothetical task X for learners with
different levels of expertise
because task performance becomes automated and does not need working memory capacity
any more. The available working memory capacity can therefore be used for other
Adaptation of learning task difficulty to the zone of proximal development
Whereas Fig. 4 shows the learner’s performance on one hypothetical task X and the
cognitive load in relation to the learner’s expertise, Fig. 5 shows the same dependencies for
two different hypothetical tasks A and B. Task A is easier, whereas task B is more difficult.
Both tasks can be performed without additional help or with additional help. If a task has to
be performed without help, this is indicated in Fig. 5 by the symbol ‘(H−)’. If help is
available during task performance, this is indicated by the symbol ‘(H+)’. For a student who
has reached learning state L3, task A would be very easy when help is provided (A(H+)).
The likelihood of successful performance would be 100% under this condition. Even
without help, the student’s performance would be relatively good (A(H−)). For the same
student, task B would be very difficult if no help is provided (B(H−)). The likelihood of
successful performance would be 0% under this condition. With help, however, the
likelihood would increase considerably and the student would have a real chance to perform
the task successfully (B(H+)).
For a student at learning state L3 as shown in Fig. 5, the most difficult task he/she can
perform successfully without help would be task A: The curve A(H−) shows the
performance characteristics of the most difficult task that the learner can perform (with
Fig. 5 Task performance (top) and cognitive load (bottom) of an easy task A which can be solved by a
learner at expertise level L3 without help as well as task performance (top) and cognitive load (bottom) of a
difficult task B which can be solved by a learner at expertise level L3 only with help. The range of difficulty
between the two tasks is known as the zone of proximal development (ZPD)
reasonable quality) without help, whereas a higher task difficulty would decrease
performance tremendously if no help is provided. For the same student, the most difficult
task he/she can perform successfully with help would be task B: The curve B(H+) shows the
performance characteristics of the most difficult task that the learner can perform (with
reasonable quality) with optimal help. Further increase of task difficulty would decrease
performance tremendously even with optimal help. Remember that the lower limit of the
ZPD is defined as the most difficult task the learner can perform successfully without help,
whereas the upper limit of the ZPD is defined as the most difficult task that the learner can
perform successfully with the best possible help. Thus, the shaded area between the easier
task A performed without help (A(H−)) and the more difficult task B performed with help
(B(H+)) represents the zone of proximal development (ZPD) for the student at expertise level
L3.
Any instruction that aims at promoting learning should include learning tasks within the
limits of the ZPD. If the task difficulty were higher than the ZPD, the learner’s cognitive
capacity would be overwhelmed, because the cognitive load would exceed the learner’s
working memory capacity. If the task difficulty were lower than the ZPD, the learner would
be under-challenged and a great deal of the available cognitive capacities would remain
unused for the learning process. Accordingly, the difficulty of learning tasks has to be
adapted to the learner’s zone of proximal development by, for example, choosing other
learning tasks or defining other task performance conditions.
The instructional consequences derived from the concept of the zone of proximal
development correspond exactly to the results of our previous analysis. Accordingly,
instructional design should manipulate not only extraneous load. It should also manipulate
intrinsic load in order to align the task requirements with the learner’s level of expertise or
his/her zone of proximal development, respectively. In fact, various studies on cognitive
load theory have also manipulated intrinsic load. Kalyuga et al. (2001a), for example, used
different levels of task difficulty in an experiment, where participants had to solve simple
tasks with a very limited problem space or complex tasks with a larger problem space. In
this way, the authors manipulated intrinsic load, because the different tasks represent
different levels of element-interactivity. Cooper et al. (2001) compared a less demanding
task (studying worked examples) with a more demanding task (imagining worked
examples) and found that the former were more beneficial for novices, whereas the latter
were more beneficial for advanced learners. The two kinds of task were differently
demanding and, accordingly, imposed also different amounts of intrinsic load.
Other studies manipulated intrinsic load by instructional help. For instance, when
learning with worked-out examples (i.e. examples that provide a ready-made solution to a
problem) is compared to conventional problem solving, learners are confronted with
different tasks requiring different element-interactivity. Whereas learning with worked-out
examples requires only focusing on presented problem states and operators bridging the
differences between them, conventional problem solving requires the learner to keep also
the goal of the task in mind and requires planning in addition to focusing on problem states
and operators. Thus, following the ready-made solution in a worked-out example imposes a
lower intrinsic load on the learner’s working memory than finding the solution on one’s
own (cf. Cooper and Sweller 1987; Renkl 1999, 2002; Sweller and Chandler 1991, 1994).
The lower intrinsic load of worked-out examples was beneficial for novices, whereas
advanced learners profited more from the higher intrinsic load of traditional problem
solving (Kalyuga et al. 2001b).
Paas and van Merriënboer (1994) studied learning from worked-out examples of
arithmetic problems, which were embedded into contexts of high and low variability. When
context variability was high, learning results were better than when variability was low.
Paas and van Merriënboer attributed the better learning results to an additional load caused
by the context variability and considered this load to be germane. However, one can also
argue that the worked-out examples required both understanding of the embedding context
and understanding of the arithmetic problem. Thus, processing the information about the
embedding context is part of the task and therefore part of the intrinsic load. If a learner has
to solve problems embedded into variable contexts, he/she has to perform repeatedly new
comprehension processes and has to map arithmetic problems repeatedly onto different
contexts. This is more demanding than if the embedding context remained constant. It
follows that the problems with high context variability had a higher intrinsic (not germane)
load than the problems with low context variability.
The findings of Schnotz and Rasch (2005) on learning about time phenomena related to
the rotation of the earth from animated or static pictures mentioned above can also be
interpreted within this framework. Students performed better with static than with animated
pictures. The animation had facilitated the envisioning of the earth’s rotation, because
learners could passively follow a visual display. However, the facilitation had made the
learning task too easy: It had shifted the task difficulty out of the learners’ zone of proximal
development and, thus, had a negative effect on learning.
It should be noted that although a decrease of cognitive load due to facilitation of task
performance increases free working memory capacity, this free capacity is not necessarily
used for cognitive activities that enhance learning. Learners often do not engage in higher
order cognitive processing that would lead them to a deeper comprehension, so that part of
their working memory capacity is left unused. In these cases, learners need to be stimulated
to invest their available working memory capacity into additional cognitive activities in
order to enhance learning (Winne and Hadwin 1998).
How Does Working Memory Contribute to Learning?
Task performance versus learning
As mentioned above, learning is a process that leads to changes in long-term memory. If
nothing has been altered in long-term memory, nothing has been learned. In order to make
students learn, teachers require them to solve learning tasks. Learning tasks can have many
forms. They can require students to comprehend a text, to solve an algebraic equation, to
estimate household expenditures, to prove a theorem, to write an essay, and many others.
Without performing any activity, no learning can occur. This is why learning is said to be an
active process. In this respect, task performance and learning are closely correlated. Despite this
correlation, however, they are fundamentally different processes, because they operate on
We will confine ourselves here to cognitive learning and, thus, consider only cognitive
tasks. Performing a cognitive task means transforming a mental representation from a less
favorable state to a more favorable (goal) state by cognitive operations in working memory.
The kind of transformation can be very different, depending on the subject matter and the
instructional objective. For example, if the learning task is “Solve equation ‘5x − 10 = 0’
for x”, then the student has to transform the more complex equation “5x − 10 = 0” (the
given state) into the simple equation “x = 2” (the goal state). If the learning task is to
understand Newton’s Second Law (the principle of action), then the student has to
transform his/her initial mental state of not having understood into a mental state which
represents that a force acting on a mass equals the mass multiplied by its acceleration. In all
cases, the content of working memory is transformed from one state into another state
through the application of operations. Although these activities cannot guarantee the
intended changes in long-term memory, they can trigger such changes with some
probability. If they do, learning takes place. Contrary to task performance, learning is a
process that transforms the content of long-term memory. The transformation leads from a
state of lower expertise to a state of higher expertise. These changes can be triggered by
cognitive processing in working memory, but this influence is a relatively indirect one:
Students perform learning tasks by operations in working memory, and as a by-product,
changes in long-term memory occur. There are no cognitive operations that could
deliberately and directly change the content of long-term memory.
To summarize: Cognitive task performance operates on mental structures in working
memory, whereas learning operates on mental structures in long-term memory. Accordingly,
learning does not take place in working memory. What does take place in working memory
is information processing as part of the learning task performance (such as, for example,
comprehending texts, solving equations, or proving theorems), which triggers, with some
likelihood, changes in long-term memory.
The fundamental difference between task performance and learning becomes especially
obvious when we focus on comprehension as a special case of task performance.
Comprehension and learning are not clearly distinguished in everyday life. Cognitive load
theory also does not make a clear distinction, because learning and comprehension are
defined similarly. Learning is considered as a process that leads to changes in long-term
memory; if nothing has been altered in long-term memory, nothing has been learned.
Understanding is assumed to occur when all information relevant for a task is processed
simultaneously in working memory and when a change in long-term memory occurs too,
because (according to cognitive load theory) if nothing has been altered in long-term
memory, nothing has been understood (Sweller 2005). It is not surprising that
comprehension and learning are often not clearly distinguished, because under everyday
conditions, good comprehension usually results in good learning. Nevertheless, we consider
comprehension and learning as fundamentally different, because under specific
circumstances, comprehension can also occur without learning.
An example has been reported by Milner et al. (1968). H.M., a patient with severe
epilepsy, underwent radical brain surgery, in which parts of the hippocampus were
removed to reduce his seizures. H.M. could talk intelligently about things that took place
in the moment, and he could read and understand texts without problems. However, he
could not remember new experiences even a few minutes after they had happened, and
he could read and understand the same text again and again without remembering
anything of what he had read. Although this is an extreme clinical case, it demonstrates
that understanding and learning are different, and that the former can occur without the
latter. H.M. was able to understand, but he was unable to learn the content of what he
read. In other words: He could create declarative knowledge structures in working
memory, but he could not transform these declarative knowledge structures into long-term
memory.
We therefore suggest clearly distinguishing between comprehension and learning.
Comprehension can be considered as the process of constructing a mental representation in
working memory, regardless of whether a change in long-term memory takes place or not,
whereas learning is a process that leads to changes in long-term memory (cf. Kintsch 1998).
Although under normal conditions, comprehension and learning are highly correlated, both
processes are nevertheless fundamentally different.
The distinction between task performance (including comprehension) and learning has
implications for understanding the differences between the various kinds of cognitive load.
Remember that intrinsic load is the unavoidable load required by performing a learning task
at a given level of expertise. The concept of intrinsic load is therefore related to
performance: The amount of intrinsic load results from the number of cognitive elements
that have to be held in working memory simultaneously to do the necessary steps for task
performance. Extraneous load occurs if the learning task has to be performed under
unfavorable conditions, because coping with these conditions requires some extra working
memory capacity or extra effort. The concept of extraneous load is therefore also related to
performance: The amount of extraneous load depends on the extra working memory capacity
or effort required to cope with the unfavorable conditions. Whereas both intrinsic and
extraneous load are performance-related concepts (cf. Yeung et al. 1997), germane load is
related to learning: It refers to the working memory capacity required for schema construction
and schema automation that result in changes in long-term memory (Sweller et al. 1998).
These considerations have important implications for instructional design. When task
performance and learning are fundamentally different processes and when intrinsic and
extraneous load are performance-oriented, whereas germane load is learning-oriented, it
follows that easy task performance does not necessarily result in easy learning and that
performance aids are not necessarily learning aids. Performance aids reduce intrinsic or
extraneous load. If they reduce the load of tasks to an extent that allows successful
performance that would otherwise be impossible, they have an enabling function. If they
reduce the load of tasks that would otherwise require very high mental effort, they have a
facilitating function. Both the enabling function and the facilitating function are beneficial
for learning as long as the cognitive load is still in the zone of proximal development.
Further facilitation can make processing unnecessarily easy and prevent students from
learning-relevant processing. This is demonstrated, for example, by the expertise reversal
effect, when performance aids (such as worked-out examples) turn out to be disadvantageous
for individuals with higher expertise, or when animations prevent learners from
running their own mental simulations (cf. Kalyuga et al. 1998, 2003; Schnotz and Rasch
2005). Aids are then beneficial for task performance, but not for learning. In other words:
Making a task easier does not necessarily result in better learning.
Remember that cognitive load theory claims validity only for the acquisition of biologically
secondary knowledge, because this is where working memory is needed (Sweller 2003;
J. Sweller and S. Sweller 2006). The theory assumes that this kind of learning occurs
through schema construction and automation, and it assumes that these processes impose a
germane load on working memory. Finally, the theory assumes that the different kinds of
load are additive: The total load is the sum of intrinsic, extraneous and germane load
(Sweller et al. 1998; Sweller 2005). These assumptions have important implications: If
learning requires schema construction and automation and if these processes impose a germane
load on working memory, then germane load is a requirement for learning. In other words:
There should be no learning without germane load. Imagine an individual solving a task, which
is so difficult that it requires all available working memory capacity. In this case, all working
memory resources would be occupied by intrinsic load, and, due to the additivity of the different
kinds of load, there would be no working memory capacity left for germane load. It follows that
no learning should occur in this case. However, we will see that this is highly questionable also
in case of biologically secondary (i.e. culturally mediated) knowledge.
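The additivity argument just outlined can be written down as a small sketch. It assumes that working memory capacity and the different loads can be expressed in the same arbitrary units; the function names and example values are hypothetical. The sketch reproduces the prediction that follows from the theory's assumptions, not an established empirical result.

```python
def capacity_for_germane_load(wmc: float, intrinsic: float,
                              extraneous: float = 0.0) -> float:
    """Working memory capacity left over once intrinsic and extraneous load are met."""
    if min(wmc, intrinsic, extraneous) < 0:
        raise ValueError("capacity and loads must be non-negative")
    # Additivity: total load = intrinsic + extraneous + germane <= capacity.
    return max(0.0, wmc - intrinsic - extraneous)


def learning_predicted(wmc: float, intrinsic: float,
                       extraneous: float = 0.0) -> bool:
    """Prediction under the assumption that learning requires germane load."""
    return capacity_for_germane_load(wmc, intrinsic, extraneous) > 0
```

With a capacity of 7 units and an intrinsic load of 7 units, capacity_for_germane_load(7, 7) is 0.0 and learning_predicted(7, 7) is False, which is exactly the case whose plausibility is questioned here.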
Students acquire different kinds of knowledge. Many theories of learning and cognition
make a distinction between declarative and procedural knowledge (Eysenck and Keane
2000). Whereas declarative knowledge is related to semantic and episodic memory,
procedural knowledge refers to the ability to perform skilled actions. Slusarz and Sun
(2001) mention that the distinction between declarative and procedural knowledge maps
roughly onto the distinction between explicit and implicit knowledge. Procedural
knowledge is generally inaccessible and thus implicit, while declarative knowledge is
generally accessible and thus explicit. Top-down oriented theories of learning emphasize
that learners first acquire a great deal of explicit declarative knowledge and then, through
practice, turn this knowledge into a procedural form. Skill acquisition according to the
ACT* model of Anderson (1983) provides an example for this view. Bottom-up oriented
theories of learning emphasize that acquisition of knowledge can take place also in the
opposite direction. Sun et al. (2001), for example, argue that individuals can learn to
perform complex skills without first obtaining a large amount of explicit knowledge. Their
skill learning model includes top-down as well as bottom-up processes, and it integrates
symbolic knowledge representations on top levels with parallel distributed processing
representations on bottom levels.
When learning can take place also via bottom-up processes, the question arises to what
extent it requires conscious awareness of what is going to be learned. This question is
usually discussed in terms of implicit learning. Implicit learning is contrasted with explicit
learning (Baddeley 1997; Perrig 1996; Stadler and Frensch 1998). Learning is explicit if the
learner intentionally acquires a specific set of target knowledge and if he/she is aware of
and able to verbalize what has been learned (Frensch 1998; Kirkhart 2001). In contrast,
learning is implicit if the learner is not able to verbalize what he/she has learned or if
he/she is not even aware of what he/she has learned (Lewicki et al. 1992). The fact that he/she
has learned something becomes obvious only when task performance includes application
of what he/she has learned, although the learner is unable to explain or verbalize it. Implicit
learning has been investigated under different paradigms including learning of grammar,
learning to control complex systems, learning of temporal patterns (sequences),
and learning of concepts (Baddeley 1997; Perrig 1996).
Artificial grammar learning. Different studies examined the acquisition of artificial
grammar (Reber 1989). In a typical research setting a group of participants learned
sequences of letters generated by an artificial grammar (but without any explanation of the
grammar itself), while another group of subjects learned random sequences. Both groups
were then shown further sequences which were either generated by the grammar or which
were random sequences, and they had to decide for each sequence whether it corresponds to
the grammar or not. The group that had seen the grammatical sequences in the learning
phase yielded better results than the control group, but was not able to report any
knowledge of the rules of the grammar. The dissociation between performance and verbal
report is the finding that prompted Reber to describe learning as implicit (Cleeremans et al.
1998). Reber assumes that knowledge about the grammar (i.e. a representation of its
abstract rules) is acquired without conscious reflection about regularities in the stimulus
material and, thus, by implicit learning.
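The kind of stimulus material used in this paradigm can be illustrated with a minimal sketch of a finite-state grammar. The grammar below is invented for illustration and is not the grammar used by Reber; the letters, states and function names are all hypothetical.

```python
import random

# Hypothetical finite-state grammar: state -> list of (letter, next state).
# A next state of None marks the end of a sequence.
GRAMMAR = {
    0: [("T", 1), ("P", 2)],
    1: [("S", 1), ("X", 3)],
    2: [("V", 2), ("X", 3)],
    3: [("S", None), ("V", None)],
}
ALPHABET = "TPSXV"


def generate_grammatical(rng: random.Random) -> str:
    """Walk through the grammar, choosing a random transition in each state."""
    state, letters = 0, []
    while state is not None:
        letter, state = rng.choice(GRAMMAR[state])
        letters.append(letter)
    return "".join(letters)


def generate_random(rng: random.Random, length: int) -> str:
    """Random letter sequence, as shown to the control group."""
    return "".join(rng.choice(ALPHABET) for _ in range(length))


def is_grammatical(sequence: str) -> bool:
    """Check whether a sequence could have been produced by the grammar."""
    state = 0
    for letter in sequence:
        if state is None:
            return False  # letters continue after the grammar has ended
        transitions = dict(GRAMMAR[state])
        if letter not in transitions:
            return False
        state = transitions[letter]
    return state is None
```

In such a setting, participants in the learning phase see only the output of a generator like generate_grammatical. The classification that is_grammatical performs explicitly is the judgment they later make without being able to state the rules.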
Dynamic system control learning. In a study by Berry and Broadbent (1984), participants
had to control a complex dynamic system: a sugar-production factory had to be managed to
maintain a specific level of sugar output. Learners performed progressively better but could
not explain the principles they followed in performing the task. Most participants did not
acquire explicit declarative knowledge about the rules governing the sugar production, but
nevertheless performed well (Eysenck and Keane 2000). That is: Most participants learned
to perform the task effectively and demonstrated an ability to control the system, but could
not explain how they controlled the system, nor could they report the principles underlying
Sequence learning. One might critically argue that the participants in the Reber study or the
Berry and Broadbent study may have been consciously aware of the relevant rules or
regularities, but merely had difficulty verbalizing this knowledge. The studies concerning
sequence learning, however, are much less prone to this criticism (Eysenck and Keane
2000; Lewicki et al. 1988). In this learning paradigm people have to react to specific items
presented on a screen. These studies measured whether reaction times differed
between groups in which items were presented in random sequence or in a non-obvious
pattern. Although reaction times in these kinds of experiments became faster, the learners
were quite often unable to describe the rule underlying their behaviour.
Concept learning Implicit learning is also discussed with regard to learning of categorical
or conceptual knowledge (Perrig 1996). Humans seem to use the frequency of features and
the covariation of features in the acquisition of knowledge. Lewicki (Lewicki 1985, 1986;
Lewicki et al. 1992) demonstrated in the field of social cognition that a single experience
with a person who has specific attributes influences the individual’s attitude and behaviour
towards other persons with similar or dissimilar attributes. For example, after a negative
experience with an instructor, subjects more often sought contact with an instructor
who was not similar to the person involved in the negative experience, without any
awareness of this behavioral tendency (Lewicki 1985; Perrig 1996). Lewicki’s studies
demonstrate that the acquisition of categorical knowledge can take place even though persons
are not able to articulate the co-variation within their experiences.
The reported studies indicate that humans are able to acquire knowledge without
consciousness of the knowledge content. They can grasp co-variations without becoming
aware of them (and thus are not able to describe them). From an evolutionary perspective,
this is not surprising: It is very likely that primitive neural systems developed before the
emergence of conscious functioning and that these systems could learn from experiences,
even when these experiences could not be processed consciously within anything like
working memory. The corresponding learning mechanisms might be precursors of the
acquisition devices for biologically primary knowledge (cf. Reber et al. 1991). Obviously,
knowledge can be acquired by implicit learning without representing what is learned in
working memory, which means that learners can profit from previous experiences even
when they are not aware of it. This is not only true for biologically primary, but also for
biologically secondary knowledge as demonstrated by the previous examples: Learning an
artificial grammar, controlling an artificial dynamic system and learning an artificial
sequence of operations are instances of cultural knowledge, which cannot be considered as
biologically primary. As Wenger (1998) has put it: Learning is everywhere; it is as natural
as breathing and eating; individuals are ‘condemned’ to learn.
The examples of implicit learning suggest that the abstraction or construction of
schemata (Bartlett 1932; Schank and Abelson 1977) are not necessarily conscious processes
that require working memory capacity. Accordingly, learning can also occur without
involvement of working memory and, thus, without germane load. It follows that, contrary
to the implicit assumption of cognitive load theory, germane load should not be considered
as a requirement for learning even in the acquisition of culturally mediated, biologically
secondary knowledge. Of course, this raises the question of how germane load can be
A Closer Look at Germane Load
As mentioned above, until the mid-1990s research on cognitive load focused
primarily on ways of reducing extraneous load. When the concept of germane load was
introduced, things became more complicated, because learning could now also be improved
by an increase of cognitive load, provided that this load is germane (van Merriënboer
1997). The new version of the theory was more prone to circularity through post hoc
categorizations of cognitive load: If reduction of cognitive load leads to better learning,
then the load could be said to have been extraneous; if it impairs learning, then the load
could be said to have been germane. Of course, such post hoc categorizations do not have
any explanatory value. In order to explain the results of instructional manipulations on
learning, the different kinds of cognitive load have to be defined independently of the
Defining germane cognitive load
Cognitive load theory describes germane load as the working memory capacity required for
processes of schema abstraction and schema automation, which in turn lead to changes in
long-term memory. As we have seen above, germane load is not a necessary requirement
for learning, because working memory is not the only way of inducing changes in long-
term memory. So, the question arises what does and what does not count a priori as
Three points can be made to further specify germane cognitive load:
First, germane load requires working memory capacity; otherwise, it would not be a
cognitive load. Second, germane load is beneficial for learning; otherwise, the load would
not be germane. Third, according to the distinction between performance and learning, it is
the cognitive processing in working memory, not the learning as a change in long-term
memory, that creates (germane) cognitive load. These points suggest the following tentative
definition: Germane load is cognitive load due to cognitive activities in working memory
that aim at intentional learning and that go beyond simple task performance. If these
activities did not take place in working memory, they would not cause a cognitive load.
If they did not aim at learning, they would not be germane. If they did not go beyond
task performance, they would simply be part of the intrinsic load. The following cognitive
activities would qualify for this characterization of germane cognitive load:
–conscious application of learning strategies (i.e. strategies, which are not automated
–conscious search for patterns in the learning material in order to deliberately abstract
cognitive schemata (i.e. mindful abstraction) and create semantic macrostructures,
–restructuring of problem representations in order to solve a task more easily (i.e. by
–meta-cognitive processes that monitor cognition and learning.
All these kinds of activities explicitly aim at promoting learning. They require additional
working memory capacity beyond the requirements of the task performance itself and can
therefore be considered as an additional cognitive load component. In this respect, they correspond
to our definition of germane cognitive load. Although germane load can promote learning,
it is no longer a prerequisite of any kind of learning. Instead, germane load is the cognitive
load of some specific cognitive activities that are performed in addition to the ordinary
performance of a learning task and which aim at the further improvement of learning.
Learning can occur also without germane load, but germane load can further enhance
Constraints on germane cognitive load
Instructional design influenced by cognitive load theory usually follows the idea that
germane load should be as high as possible (cf. Paas et al. 2004). This raises the question of
what is really possible under the specific conditions at hand. Germane load seems to be
constrained in multiple ways. More specifically, we assume that it is constrained by
working memory capacity, by the nature of the task (thus, by its intrinsic load), and by the
Like any other kind of cognitive load, germane load is constrained by the available
working memory capacity. If the intrinsic load of a task is very high and needs most of the
learner’s working memory capacity, there is not much capacity left for any germane load
even in the absence of extraneous load. The germane load seems to be constrained also by
the nature of the learning task and, thus, by the intrinsic load. Recall that we
defined germane load above as the portion of working memory capacity occupied by
cognitive processes such as using specific learning strategies, schema abstraction, cognitive
restructuring, or meta-cognitive monitoring, which exceed the task requirements and are
therefore additional cognitive processes intentionally aiming at improving learning. These
additional cognitive processes are of course closely related to the processing of the task. In
other words: Task performance serves as an information base or source of experience for
these additional cognitive processes. It seems therefore plausible to assume that the
germane load is constrained by the intrinsic load. Whereas it is possible to solve very
difficult tasks (high intrinsic load) without deep meta-cognitive reflection (low germane
load), it is not possible to reflect deeply (high germane load) about a very easy task (low
intrinsic load). In this sense, we assume an asymmetric relation between germane load and
intrinsic load: The intrinsic load can exceed the germane load, but the germane load cannot
exceed the intrinsic load. It follows that a high amount of free working memory capacity
due to a low intrinsic load is not beneficial for learning, because the free capacity can be
used for germane load only to a limited extent.
Finally, germane load is constrained by motivational factors, that is, by the
learner’s willingness to use his/her available mental resources for additional strategic
cognitive processing to enhance learning. Students do not automatically invest all their
available cognitive capacity (i.e. the capacity not used for intrinsic or extraneous load) in
extra learning activities. Instead, they decide whether or not to engage and how
many resources to invest. Germane cognitive load therefore also depends on general
learning orientations and on affective and motivational aspects of learning. For example,
learners who follow a deep approach to learning will more likely adopt a higher germane
load than learners who follow a surface approach (Entwistle and Ramsden 1983;
Marton and Saljö 1984). Similarly, learners with high interest in the learning content will
more likely adopt a higher germane load than learners with low interest (Renninger et al.
1992). Germane load is therefore an aspect of the learner’s self-regulation (Winne and
Hadwin 1998). Consequently, it is not sufficient to provide learning environments that allow
learners to have cognitive resources available for germane load. It is also necessary to ensure,
as far as possible, that learners engage in this kind of processing, that is, that they invest their
available working memory resources in the corresponding learning activities.
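The three constraints discussed in this section can be illustrated with a small sketch. It is not a model proposed by cognitive load theory: the function name, the parameter names, the use of arbitrary load units and, in particular, the choice to treat willingness as a multiplicative factor between 0 and 1 are assumptions made only for illustration.

```python
def germane_load_ceiling(capacity, intrinsic, extraneous=0.0, willingness=1.0):
    """Illustrative upper bound on germane load, in arbitrary load units.

    All parameter names and the multiplicative role of `willingness`
    are assumptions made for this sketch.
    """
    if not 0.0 <= willingness <= 1.0:
        raise ValueError("willingness must lie between 0 and 1")
    # Constraint 1: only capacity not used by intrinsic or extraneous load is free.
    free_capacity = max(capacity - intrinsic - extraneous, 0.0)
    # Constraint 2: germane load cannot exceed the intrinsic load of the task.
    task_bound = min(free_capacity, intrinsic)
    # Constraint 3: the learner decides how much of the remainder to invest.
    return willingness * task_bound


if __name__ == "__main__":
    for label, intrinsic in [("easy task", 1.0), ("medium task", 4.0), ("hard task", 9.0)]:
        print(label, germane_load_ceiling(capacity=10.0, intrinsic=intrinsic))
```

With an assumed capacity of 10 units and no extraneous load, the ceiling is 1.0 for the easy task, 4.0 for the medium task and 1.0 for the hard task: under these assumptions, both very easy and very difficult tasks leave little room for germane load.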
How is germane load related to learning?
We have argued above that germane load is constrained by intrinsic load, because the task
performance that causes intrinsic load provides the information base for the additional
cognitive processes that cause germane load. The close relationship between the additional
strategic (germane load) processes and the processes of task performance (intrinsic load)
also has consequences for the relation between germane load and learning, because it
implies that germane load cannot predict learning on its own, but only in combination with
Let us imagine on the one hand that a task is very difficult for a learner and that it
therefore places a heavy intrinsic cognitive load on his/her working memory. In this case,
there is not much capacity left for germane load even if extraneous load is zero. As
there is not much working memory capacity left for additional cognitive activities that
enhance learning, the germane load is limited and the beneficial effect of germane load
on learning will also be limited. Now let us imagine on the other hand that a task is
very easy for a learner and that it therefore places only a low intrinsic load on his/her
working memory. In this case, there is much capacity left for germane load in working
memory (provided that extraneous load is low or zero). However, even if the learner is
willing to invest all his/her free working memory capacity in additional cognitive
processing to enhance learning, the beneficial effect will nevertheless be limited,
because there is not much to learn from performing such an easy task. If this is correct,
then the beneficial effect of germane load activities on learning depends not only on the
amount of germane load, but also on the task requirements, that is, on the intrinsic
load. We therefore assume that it is the combination of germane load and intrinsic load,
rather than the germane load alone, that predicts how much learning will improve
as a result of germane load activities.
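The argument of this paragraph can be stated compactly. The notation is not part of cognitive load theory: C (working memory capacity), I (intrinsic load), E (extraneous load), G (germane load), the learning benefit \Delta L_G and the function f are symbols assumed here purely for illustration.

```latex
% Illustrative notation only; all symbols and the function f are assumptions.
\[
  0 \le G \le C - I - E, \qquad \Delta L_{G} = f(G, I)
\]
% Very difficult task: I is close to C, so G, and with it \Delta L_G, stays small.
% Very easy task: G may be large, but f(G, I) stays small because I is small.
```

Nothing is claimed about the shape of f beyond these two limiting cases; the point is only that the benefit depends on both arguments and not on G alone.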
As we have seen already, learning can also take place without germane load, just on the
basis of task performance, even when the learner is not aware that learning is taking place.
We therefore assume that the combination of germane load and intrinsic load is not the only
factor that results in learning. Instead, there might also be other factors that
jointly affect learning. For example, it could be sensible to assume a basic
component of learning that is not affected by germane load, but simply results from task
performance and, thus, is affected only by intrinsic load. The overall learning could then be
a conjoint result of this basic component and an additional component that derives from the
combination of germane load and intrinsic load. There is further research needed to clarify
Measuring Cognitive Load
Many sciences treat the quantitative parameters of their constructs as variables and try
to measure them. This is especially important in empirical field research, when found
cases have to be categorized according to measurement results. In experimental
them a posteriori. Most studies on cognitive load were experiments. Cognitive load was
frequently manipulated by the experimental treatment rather than measured directly.
Although experimental manipulation can be an alternative to direct measurement, the
possibility of measuring cognitive load is nevertheless highly desirable in this research
tradition as well (Paas et al. 2003b). There are three major methods of cognitive load
measurement: subjective ratings, physiological measures and performance-based
measures (Eggemeier 1988).
Subjective ratings This method is based on the assumption that individuals are able to
inspect their own cognitive processes and to report the experienced difficulty as well as the
amount of mental effort they have invested. Although it is possible that an individual
considers a task as (too) difficult and therefore does not invest much effort, the experienced
difficulty and the invested effort are likewise used as indicators for cognitive load.
Experienced difficulty and mental effort can be measured with subjective rating scales. For
example, Paas and van Merriënboer (1994) used a modified version of a rating scale from
Bratfisch et al. (1972) for measuring perceived difficulty. Subjects had to report the amount
of effort invested on a nine-point scale (ranging from 1 = ‘very easy’ to 9 = ‘very difficult’).
Other studies have used the NASA-TLX, a general scale for the assessment of task loads,
for the measurement of cognitive load (Hart and Staveland 1988).
The advantage of subjective ratings is that they are simple and easily applicable also in a
natural setting, which increases the ecological validity of the results. Subjective
introspective data are often considered as questionable. However, they are nevertheless
data that need interpretation (cf. Brewer and Nakamura 1984). Accordingly, if carefully
used, this method can reveal valuable data. The disadvantage of subjective ratings is the
instability of the individual’s framework of reference. The framework can change in the
course of learning due to adaptation processes or as a response to motivational and
emotional changes, which decreases reliability.
Physiological measures This method assumes that changes in the cognitive functioning are
reflected in changes of physiological states such as galvanic skin response (GSR), pupillary
dilation or heart rate variability. In the galvanic skin response methodology, either a current is
passed through the body and the skin resistance is measured (active GSR), or the current
generated by the body itself is measured (passive GSR). The advantage of the GSR method is that it
provides a relatively simple way of examining the function of the sympathetic
autonomic nervous system and that it does not depend on the introspective skills of the
individual. The disadvantage of the method is that it cannot be used in natural settings,
which reduces its ecological validity. The GSR can also be elicited by emotional arousal or
by any other stimulus capable of an arousal effect. Because the same indicator is used for
different constructs, its validity is naturally limited. Finally, amplitudes tend to habituate
and vary depending on the experimental conditions. Accordingly, the framework of
reference for data interpretation can also change in the course of learning.
The use of pupillary dilation for measuring cognitive load is based on the assumption that the
diameter of the pupil increases with increasing load (Minassian et al. 2004; Van Gerven
et al. 2004). The advantages and disadvantages of this method are similar to those of the
GSR. On the one hand, the measure does not depend on introspective skills. On the other hand,
individual baselines vary and adaptation to other factors makes interpretation difficult. Data
registration requires high-tech equipment and cannot be used in natural settings. The use of
heart rate variability for measuring cognitive load is based on the assumption that
controlled processing is related to a specific cardiovascular state that manifests itself in the
heart-rate variability power spectrum band (Mulder 1992). Cognitive effort is supposed to
be directly related to controlled processing, which in turn causes a change in the power
spectrum. Paas et al. (1994) found, however, that this method was not more useful than
Performance-based measures This method is based on the assumption that working
memory capacity is limited, but can be flexibly allocated to current requirements. When
two tasks require the same resources in parallel, these resources have to be split up between
both tasks. Accordingly, fewer resources are available for each task compared to a situation
where only one task has to be performed. This kind of performance-based measure is
therefore called the dual task methodology. Participants are required to work on a primary
learning task and simultaneously perform a secondary task, which is usually a simple
reaction task. Both tasks are assumed to use the same resources in working memory. The
more resources are required by the primary task, the more the performance of the secondary
task will be reduced. Accordingly, decrease in performance of the secondary task can be
used as an indicator for the cognitive load imposed by the primary task (Brüncken et al.
2003). Alternatively, one can assume that operating on a secondary task requires
interruption of the primary task: If the cognitive load of the primary task increases, the
reaction times within the secondary task will increase, because interrupting the primary task
requires more intermediate results to be stored in working memory, before the secondary
task can be operated on (Renkl et al. 2003). The advantage of this method is that it provides
an objective measure, which is not influenced by the individual’s introspective skills. The
disadvantage of the method is that it requires an experimental setting, which is usually not
applicable under natural conditions and therefore has lower ecological validity. Furthermore,
increased load is not necessarily mirrored in lower secondary task performance. One
cannot exclude the possibility that the individual tries to keep the secondary task
performance constant at the expense of the primary task performance.
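As a minimal sketch of the dual-task logic described above, the slowing of secondary-task reaction times can be computed by comparing a single-task baseline with the dual-task condition. The function name, the use of mean reaction times in milliseconds and the example values are assumptions made for illustration; they do not reproduce the procedure or data of any of the cited studies.

```python
from statistics import mean


def secondary_task_cost(single_task_rts_ms, dual_task_rts_ms):
    """Mean slowing of secondary-task reaction times under dual-task conditions.

    A larger positive value is read as a higher cognitive load imposed by
    the primary task. Names and units are assumptions made for this sketch.
    """
    if not single_task_rts_ms or not dual_task_rts_ms:
        raise ValueError("both conditions need at least one reaction time")
    return mean(dual_task_rts_ms) - mean(single_task_rts_ms)


if __name__ == "__main__":
    # Made-up reaction times, for illustration only.
    baseline = [300, 310, 290]   # secondary task performed alone
    with_primary = [420, 400, 410]  # secondary task during the learning task
    print(secondary_task_cost(baseline, with_primary))
```

In line with the limitation noted above, a cost near zero does not by itself show that the primary task imposed little load, because the learner may have kept secondary-task performance constant at the expense of the primary task.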
Limitations of cognitive load measurement
Subjective ratings, physiological methods or performance-based methods generally aim at
measuring the total load experienced by a learner. That is, they do not distinguish between
intrinsic load, extraneous load and germane load. It is obvious that such a distinction cannot
be made on the basis of physiological methods or performance-based methods. Regarding
the use of subjective ratings, one could think of developing questionnaire items that would
allow a distinction between the different kinds of cognitive load. A few attempts are being
made in this direction. However, we doubt whether learners will really be able to
clearly distinguish different kinds of cognitive load by introspection, especially given the
conceptual difficulties of discriminating different kinds of load as described above.
In our view, there will be no reliable and valid methods of measuring
distinct kinds of cognitive load in the near or even the more distant future.
This should not be considered as a negative statement about cognitive load theory.
Cognitive load theory is basically a conceptual framework for the analysis of instructional
processes based on knowledge about the human cognitive architecture. As a framework, it should be
fruitful for empirical research and for research-based practice. A framework does not require
that each theoretical construct has its own measurement procedure. Other theoretical
frameworks, such as schema theory or production systems, have also been very fruitful
without offering an empirical measurement procedure for each specific construct. In
our view, the possibility of directly measuring different kinds of cognitive load is not a
decisive issue for the role of the theory as a conceptual framework.
Summary and Further Perspectives
Cognitive load theory has so far made an important contribution to the field of learning and
instruction. It has stimulated numerous empirical studies about the relation of working
memory and learning, it has stimulated deeper reflection about what is going on in the mind
of the learner during the process of teaching, and it has changed the views of practitioners
such as teachers and media designers about instruction and learning. However, our analysis
has also shown weaknesses that suggest further conceptual clarification and modifications
of some basic assumptions of the theory.
Traditional cognitive load theory and alternative views
The main differences between the traditional view of cognitive load theory and the
modified view suggested by our previous analysis are summarized in Table 1. The
traditional version of the theory assumes that intrinsic load is fixed and cannot be
manipulated, whereas extraneous load and germane load are variable. Regarding
instructional design, the traditional version suggests minimizing extraneous load as far as
possible and maximizing germane load as far as possible. Contrary to the traditional view,
we consider intrinsic load as fixed only for a specific task at a specific level of expertise.
Intrinsic load can and should be manipulated in instructional design by selecting adequate
learning tasks that fit the learner’s expertise. Intrinsic load can be too high, when
tasks are too difficult, and it can be too low, when tasks are too easy. Whether a cognitive
load is intrinsic or extraneous depends on the learner’s expertise and the educational
The traditional version of cognitive load theory considers extraneous load as a result of
unnecessarily increased element-interactivity due to an inadequate instructional format. Our
own analysis suggests that there are different kinds of extraneous load caused by different
kinds of misalignment between learning task difficulty and learners’ expertise. Extraneous load can
(a) be due to an unnecessarily high interactivity of relevant information, which is at (or
beyond) the limits of working memory. It can (b) be caused by unnecessary efforts to
maintain relevant information in working memory (without increasing element-interactivity).
Extraneous load can (c) be due to enforced interactivity of irrelevant information, and it can
(d) consist simply in the waste of time and effort on solving tasks that are too easy or using unneeded
instructional help even when element-interactivity is low. We agree that extraneous load
should be minimized. However, when advanced learners become disadvantaged by an
instructional format that used to be advantageous for novice learners (and therefore did not
overwhelm their processing capacities), we do not consider them as disadvantaged due to
an increased element-interactivity. It is more likely that their learning is impeded by tasks
that are too easy (due to too low task complexity or too much instructional help) and therefore do not
challenge the capacities of the learner.
Accordingly, reduction of cognitive load is not always helpful for learning. Instead of
focusing only on the reduction of extraneous load, instructional design should sometimes
also increase intrinsic load in order to create adequate alignment of task difficulty with the
learner’s expertise. Alignment of task difficulty with the learner’s expertise is equivalent to
adapting instruction to the learner’s zone of proximal development (ZPD). Cognitive load
can be reduced by instructional help. The reduction has positive effects on learning if the
help has an enabling function, that is, if it makes possible a learning task performance that
would otherwise have remained impossible. The reduction also has positive effects on learning if
the help has a facilitating function, provided that the facilitation remains within the ZPD. If
the facilitation shifts the task difficulty beneath the ZPD, then the facilitation has negative effects
on learning. In this case, the learner is unchallenged. He/she wastes time and energy on tasks that
are too easy and on processing unneeded help without benefit for learning. Thus, making learning tasks
easier does not necessarily result in better learning.
Table 1 Main Differences Between the Traditional Version and a Modified Version of Cognitive Load Theory

Intrinsic Load
Traditional Version:
–Fixed.
Modified Version:
–Fixed for a given learning task, expertise and educational objective, but variable for

Extraneous Load
Traditional Version:
–Due to unnecessarily increased element
Modified Version:
–Due to interactivity of relevant information at limits of working memory.
–Due to maintaining relevant information (without increased element-interactivity).
–Due to interactivity of irrelevant
–Due to waste of time and effort (without increased element interactivity).

Germane Load
Traditional Version:
–Due to schema construction and automation.
–Constrained by working memory capacity.
Modified Version:
–Due to processes in working memory aiming at intentional learning going beyond simple task performance.
–Constrained by working memory capacity,
–Constrained by intrinsic load,
–Constrained by motivation.

Learning
Traditional Version:
–Requires working memory capacity: schema construction and automation impose
–Can be impeded by unnecessary increase of
–Affected by germane load.
Modified Version:
–Does not necessarily require working memory capacity: schema construction and automation can occur without
–Can be impeded by unnecessary increase of
–Can be impeded by unnecessary mental effort (without increased element-
–Can be impeded by too low intrinsic load.
–Affected by germane load combined with

Traditional Version:
–Reduce extraneous load as far as possible.
–Increase germane load as far as possible.
Modified Version:
–Reduce extraneous load as far as possible.
–Adapt intrinsic load to the learner’s
–Reduce intrinsic load, if task difficulty is too high (enable or facilitate task performance within the learner’s ZPD)
–Increase intrinsic load, if task difficulty is
–Adapt germane load to the intrinsic load.

Traditional cognitive load theory assumes that learning requires working memory
capacity, because schema construction and schema automation impose cognitive (germane)
load on working memory. Our analysis suggests, however, that schema construction and
schema automation do not necessarily need extra working memory capacity. Thus, learning
can also occur without germane load. Furthermore, some examples of germane cognitive
load in the literature can also be reinterpreted as instances of intrinsic load. In order to
distinguish germane load more clearly from intrinsic load and in order to avoid circularity,
we suggest defining cognitive load as germane if it is due to cognitive activities in
working memory that aim at intentional learning and that go beyond simple task
performance. Conscious application of learning strategies, search for patterns in the
learning material, restructuring of problem representations, or metacognitive processes are
possible examples. Accordingly, germane load is not a requirement for any kind of learning.
Learning can also occur without germane load, but germane load can further enhance
Whereas traditional cognitive load theory suggests that germane load should be as high
as possible, we consider germane load as subject to multiple constraints, which have to be
taken into account in instructional design. Germane load is not only constrained by the
available working memory capacity, it is also constrained by the nature of the learning task
(i.e. by its intrinsic load), and it is constrained by the learner’s willingness to invest his/her
available working memory resources in specific learning-oriented activities. Thus,
germane cognitive load cannot be increased arbitrarily within the limits of
available working memory capacity. Instead, germane load should be balanced within the
available working memory capacity with the intrinsic load of the learning task, which in
turn has to be adapted to the learner’s zone of proximal development.
Further research perspectives
Besides the conceptual suggestions described above, further research is needed that
investigates more closely the relation between different kinds of cognitive load and
different kinds of learning. Of course, special care has to be taken to characterize cognitive
load concepts independently of learning. Otherwise, if the different kinds of cognitive
load were defined according to their effects on learning, they would have no explanatory value due
to their inherent circularity. Research should not only focus on cognitive learning, but also
take into account perceptual learning and behavioral learning (motor learning), which play
an important role even in school learning, although this role might be less obvious than
that of cognitive learning. Research should also discriminate between results of cognitive
load manipulations on declarative learning and those on procedural learning. Furthermore,
distinctions should be made between different kinds of explicit learning such as intentional
learning (i.e. learning with the intention to learn and with awareness of what has been
learned) or incidental learning (i.e. learning without the intention to learn, but with
awareness of what has been learned) on the one hand, and implicit learning (i.e. learning
without the intention to learn and even without awareness of what has been learned) on the
other hand. Based on previous research, it can be expected that implicit learning plays a
much more important role in perceptual learning and in behavioral learning (i.e. motor
learning) than in cognitive learning and that it plays a more important role in procedural
learning than in learning declarative knowledge (Howard and Howard 1992; Mulligan
1998; Roediger 1990; Srinivas and Roediger 1990).
It might also be worth investigating what is the optimum intrinsic load of learning tasks
for these different kinds of learning. One could expect, for example, that in case of explicit
intentional learning, a medium level of intrinsic load would be adequate, which still leaves
sufficient working memory capacity for germane load. In the case of procedural learning on
the perceptual or behavioral level, however, which might include much implicit learning
(i.e. learning without conscious awareness of what is learned), there would be no need and
no basis for conscious reflection and, thus, no need for germane load during task
performance. Accordingly, the optimal level of intrinsic load might be much higher than in
the case of explicit intentional learning.
Contrary to a widespread view in the field of instructional design, cognitive load theory
does not, even in its traditional form, suggest reducing cognitive load as far as
possible. Instead, it suggests that extraneous load should be reduced, whereas germane load
should be increased. Contrary to the traditional view of cognitive load theory, however, our previous
analysis has shown that not only the germane load but also the intrinsic load can
mistakenly be reduced too much, because this can lower the possibilities for learning instead
of increasing them.
The requirements of learning tasks should be adapted to the learner’s zone of proximal
development. This zone in turn depends on the learner’s level of expertise. Whether a
specific requirement is intrinsic load or extraneous load depends on the alignment between
task requirements and the learner’s capabilities: If both are well aligned, the corresponding
load is intrinsic; if they are misaligned, an extraneous load is created. If learning tasks are
too difficult, the learner is unable to perform these tasks successfully, because the
requirements exceed the learner’s working memory capacity. In this case, successful
performance is impossible, and there is no learning. If learning tasks are too easy, they do
not challenge the learner’s capabilities, because task performance is automated to a large
extent and therefore requires very little working memory capacity (if any). In this case,
performance is successful, but there is also little learning (except for further automation).
Facilitation of learning tasks can be helpful for learners under specific conditions
(cf. Wallen et al. 2005). However, facilitation can also be harmful for learning. Its
negative effects occur when learners who would be able to
perform cognitive processes on their own nevertheless make use of external support that
they do not really need. The unneeded external support can keep them from carrying out
learning-relevant cognitive processes by themselves. In this case, the facilitation makes
performance easier, but it does not improve learning.
As our analysis has emphasized, simple rules of thumb regarding the reduction of cognitive
load are inadequate for instructional design. The different kinds of cognitive load are subject to
multiple constraints, which have to be well balanced in teaching and learning. Instead of
applying simple rules of thumb, we need a better understanding of how people learn under
instructional guidance. We need to know under which conditions specific instructional
manipulations are effective, and why they are effective under these conditions. In other words:
We need further theory-driven empirical research on teaching and learning. Cognitive load
theory has made an important contribution to this field of research. Nevertheless, our understanding of
the role of working memory in learning and instruction still seems to be at its beginning.
Acknowledgment We are grateful to John Sweller for various intensive discussions about fundamental
issues of cognitive load theory. We also want to thank four anonymous reviewers for their helpful comments
on a previous version of this article.
References
Ainsworth, S., & Van Labeke, N. (2004). Multiple forms of dynamic representation. Learning and
Instruction, 14(3), 241–255.
Anderson, J. R. (1983). The architecture of cognition. Cambridge, MA: Harvard University Press.
Baddeley, A. D. (1986). Working memory. Oxford: Clarendon Press.
Baddeley, A. D. (1997). Human memory. Theory and practice. Hove: Lawrence Erlbaum Associates
Bartlett, F. C. (1932). Remembering: A study in experimental and social psychology. London: Cambridge
Berry, D. C., & Broadbent, D. E. (1984). On the relationship between task performance and associated
verbalizable knowledge. The Quarterly Journal of Experimental Psychology, 36, 209–231.
Bobis, J., Sweller, J., & Cooper, M. (1993). Cognitive load effects in a primary-school geometry task.
Learning and Instruction, 3, 1–21.
Bratfisch, O., Borg, G., & Dornic, S. (1972). Perceived item-difficulty in three tests of intellectual
performance capacity. Report No. 29. Stockholm: Institute of Applied Psychology.
Brewer, W. F., & Nakamura, G. V. (1984). The nature and functions of schemas. In R. S. Wyer & T. K. Srull
(Eds.), Handbook of social cognition, Vol. 1 (pp. 119–160). Hillsdale, N.J.: Erlbaum.
Brünken, R., Plass, J. L., & Leutner, D. (2003). Direct measurement of cognitive load in multimedia
learning. Educational Psychologist, 38, 53–61.
Chandler, P., & Sweller, J. (1991). Cognitive load theory and the format of instruction. Cognition and
Instruction, 8(4), 293–332.
Chandler, P., & Sweller, J. (1992). The split-attention effect as a factor in the design of instruction. British
Journal of Educational Psychology, 62, 233–246.
Chandler, P., & Sweller, J. (1996). Cognitive load while learning to use a computer program. Applied
Cognitive Psychology, 10(2), 151–170.
Cleeremans, A., Destrebecqz, A., & Boyer, M. (1998). Implicit learning: News from the front. Trends in
Cognitive Sciences, 2(10), 406–416.
Cooper, G., & Sweller, J. (1987). The effects of schema acquisition and rule automation on mathematical
problem-solving transfer. Journal of Educational Psychology, 79, 347–362.
Cooper, G., Tindall-Ford, S., Chandler, P., & Sweller, J. (2001). Learning by imagining. Journal of
Experimental Psychology: Applied, 7, 68–82.
Eggemeier, F. T. (1988). Properties of workload assessment techniques. In P. A. Hancock & N. Meshkati
(Eds.), Human and mental workload (pp. 41–62). Amsterdam: North-Holland, Elsevier.
Entwistle, N. J., & Ramsden, P. (1983). Understanding student learning. London: Croom Helm.
Ericsson, K. A., & Kintsch, W. (1995). Long-term working memory. Psychological Review, 102, 211–245.
Eysenck, M. W., & Keane, M. T. (2000). Cognitive psychology. Hove: Psychology Press.
Frensch, P. A. (1998). One concept, multiple meanings. In M. A. Stadler & P. A. Frensch (Eds.), Handbook
of implicit learning. Thousand Oaks, CA: Sage.
Geary, D. (2005). The origin of mind. Washington: American Psychological Association.
Geary, D. (2007). Educating the evolved mind: Conceptual foundations for an evolutionary educational
psychology. In J. S. Carlson & J. R. Levin (Eds.), Psychological perspectives on contemporary
educational issues. Greenwich, CT: Information Age Publishing.
Hart, S. G., & Staveland, L. E. (1988). Development of NASA-TLX (Task Load Index): Results of empirical
and theoretical research. In P. A. Hancock & N. Meshkati (Eds.), Human Mental Workload (pp. 139–
183). Amsterdam: Elsevier Science Publishers B. V. (North Holland).
Howard, D. V., & Howard, J. H. (1992). Adult age differences in the rate of learning serial patterns: Evidence
from direct and indirect tests. Psychology & Aging, 7, 232–241.
Kalyuga, S. (2000). When using sound with a text or picture is not beneficial for learning. Australian Journal
of Educational Technology, 16(2), 161–172.
Kalyuga, S., Ayres, P., Chandler, P., & Sweller, J. (2003). The expertise reversal effect. Educational
Kalyuga, S., Chandler, P., & Sweller, J. (1998). Levels of expertise and instructional design. Human Factors,
Kalyuga, S., Chandler, P., & Sweller, J. (2000). Incorporating learner experience into the design of
multimedia instruction. Journal of Educational Psychology, 92(1), 126–136.
Kalyuga, S., Chandler, P., & Sweller, J. (2001). Learner experience and efficiency of instructional guidance.
Educational Psychology, 21, 5–23.
Kalyuga, S., Chandler, P., Tuovinen, J., & Sweller, J. (2001). When problem solving is superior to studying
worked examples. Journal of Educational Psychology, 93, 579–588.
Kintsch, W. (1998). Comprehension: A paradigm for cognition. New York: Cambridge University Press.
Kirkhart (2001). The nature of declarative and nondeclarative knowledge for implicit and explicit learning.
Journal of General Psychology, 128(4), 447–461.
Leahy, W., Chandler, P., & Sweller, J. (2003). When auditory presentations should and should not be a
component of multimedia instruction. Applied Cognitive Psychology, 17, 401–418.
Leung, M., Low, R., & Sweller, J. (1997). Learning from equations or words. Instructional Science, 25, 37–
Lewicki, P. (1985). Nonconscious biasing effects of single instances on subsequent judgments. Journal of
Personality and Social Psychology, 48, 563–574.
Lewicki, P. (1986). Nonconscious social information processing. New York: Academic Press.
Lewicki, P., Hill, T., & Bizot, E. (1988). Acquisition of procedural knowledge about a pattern of stimuli that
cannot be articulated. Cognitive Psychology, 20, 24–37.
Lewicki, P., Hill, T., & Czyzewska, M. (1992). Nonconscious acquisition of information. American
Psychologist, 47(6), 796–801.
Mager, R. F. (1975). Preparing instructional objectives. Palo Alto, CA: Fearon.
Marcus, N., Cooper, M., & Sweller, J. (1996). Understanding instructions. Journal of Educational
Marton, F., & Säljö, R. (1984). Approaches to learning. In F. Marton, D. Hounsell, & D. Entwistle (Eds.),
The experience of learning (pp. 39–58). Edinburgh: Scottish Academic Press.
Mayer, R. E. (1997). Multimedia learning: Are we asking the right questions? Educational Psychologist, 32,
Mayer, R. E. (2001). Multimedia learning. New York: Cambridge University Press.
Mayer, R. E. (2005). The Cambridge handbook of multimedia learning. Cambridge: Cambridge University
Mayer, R. E., & Moreno, R. (2003). Nine ways to reduce cognitive load in multimedia learning. Educational
Psychologist, 38(1), 43–52.
McNamara, D. S., Kintsch, E., Songer, N. B., & Kintsch, W. (1996). Are good texts always better? Text
coherence, background knowledge, and levels of understanding in learning from text. Cognition and
Miller, G. A. (1956). The magical number seven, plus or minus two: Some limits on our capacity for
processing information. Psychological Review, 63, 81–97.
Milner, B., Corkin, S., & Teuber, H. L. (1968). Further analysis of the hippocampal amnestic syndrome:
14-year follow-up study of H.M. Neuropsychologia, 6, 215–234.
Minassian, A., Granholm, E., Verney, S., & Perry, W. (2004). Pupillary dilation to simple vs. complex tasks
and its relationship to thought disturbance in schizophrenia patients. International Journal of
Moreno, R., & Mayer, R. E. (1999a). Cognitive principles of multimedia learning: The role of modality and
contiguity. Journal of Educational Psychology, 91(2), 358–368.
Moreno, R., & Mayer, R. E. (1999b). Visual presentations in multimedia learning: Conditions that overload
visual working memory. In D. P. Huijsmans & A. W. M. Smeulders (Eds.), Lecture notes in computer
science: Visual information and information systems (pp. 793–800). Berlin: Springer.
Mousavi, S. Y., Low, R., & Sweller, J. (1995). Reducing cognitive load by mixing auditory and visual
presentation modes. Journal of Educational Psychology, 87, 319–334.
Mulder, L. J. M. (1992). Measurement and analysis methods of heart rate and respiration for use in applied
environments. Biological Psychology, 34, 205–236.
Mulligan, N. W. (1998). The role of attention during encoding in implicit and explicit memory. Journal of
Experimental Psychology: Learning, Memory, and Cognition, 24, 27–47.
Paas, F., Renkl, A., & Sweller, J. (2003). Cognitive load theory and instructional design: Recent
developments. Educational Psychologist, 38, 1–4.
Paas, F., Renkl, A., & Sweller, J. (2004). Cognitive load theory: Instructional implications of the interaction
between information structures and cognitive architecture. Instructional Science, 32, 1–8.
Paas, F., Tuovinen, J. E., Tabbers, H., & Van Gerven, P. W. M. (2003). Cognitive load measurement as a
means to advance cognitive load theory. Educational Psychologist, 38, 63–71.
Paas, F., & Van Gog, T. (2006). Optimising worked example instruction: Different ways to increase germane
cognitive load. Learning and Instruction, 16, 87–91.
Paas, F., & Van Merriënboer, J. J. G. (1994). Variability of worked examples and transfer of geometrical
problem-solving skills: A cognitive-load approach. Journal of Educational Psychology, 86, 122–133.
Paas, F., van Merriënboer, J. J. G., & Adam, J. J. (1994). Measurement of cognitive load in instructional
research. Perceptual and Motor Skills, 79, 419–430.
Perrig, W. J. (1996). Implizites Lernen. In J. Hoffmann & W. Kintsch (Eds.), Enzyklopädie der Psychologie
(pp. 203–234). Göttingen: Hogrefe.
Peterson, L., & Peterson, M. (1959). Short-term retention of individual verbal items. Journal of Experimental
Psychology, 58, 193–198.
Reber, S. (1989). Implicit learning and tacit knowledge. Journal of Experimental Psychology: General, 118,
Reber, S., Walkenfeld, F. F., & Hernstadt, R. (1991). Implicit and explicit learning: Individual differences and
IQ. Journal of Experimental Psychology: Learning, Memory, and Cognition, 17, 888–896.
Renkl, A. (1999). Learning mathematics from worked-out examples. Analyzing and fostering self-
explanations. European Journal of Psychology of Education, 14, 477–488.
Renkl, A. (2002). Learning from worked-out examples: Instructional explanations supplement self-
explanations. Learning and Instruction, 12, 529–556.
Renkl, A., Gruber, H., Weber, S., Lerche, T., & Schweizer, K. (2003). Cognitive Load beim Lernen aus
Lösungsbeispielen [Cognitive load during learning from worked-out examples]. Zeitschrift für
Pädagogische Psychologie, 17, 93–101.
Renninger, A., Hidi, S., & Krapp, A. (1992). The role of interest in learning and development. Mahwah, NJ:
Roediger, H. L. (1990). Implicit memory: Retention without remembering. American Psychologist, 45,
Schank, R. C., & Abelson, R. P. (1977). Scripts, plans, goals and understanding. Hillsdale, NJ: Erlbaum.
Schneider, W., & Shiffrin, R. M. (1977). Controlled and automatic human information processing: I.
Detection, search, and attention. Psychological Review, 84, 1–66.
Schnotz, W. (2001). Sign systems, technologies, and the acquisition of knowledge. In J. F. Rouet,
J. Levonen, & A. Biardeau (Eds.), Multimedia learning: Cognitive and instructional issues (pp. 9–29).
Schnotz, W. (2005). An integrated model of multimedia learning. In R. E. Mayer (Ed.), The Cambridge
handbook of multimedia learning (pp. 49–69). New York: Cambridge University Press.
Schnotz, W., & Rasch, T. (2005). Enabling, facilitating, and inhibiting effects of animations in multimedia
learning: Why reduction of cognitive load can have negative results on learning. Educational
Technology: Research and Development, 53(3), 47–58.
Shiffrin, R. M., & Schneider, W. (1977). Controlled and automatic human information processing: II.
Perceptual learning, automatic attending, and a general theory. Psychological Review, 84, 127–190.
Sijtsma, K. (2004). Item response theory. In M. S. Lewis-Beck, A. Bryman, & T. Futing Liao (Eds.), The
SAGE encyclopedia of social science research methods, Vol. 2 (pp. 529–533). Thousand Oaks, CA:
Slusarz, P., & Sun, R. (2001). The interaction of explicit and implicit learning: An integrated model. In
Proceedings of the 23rd cognitive science society conference (pp. 952–957). Mahwah, NJ: Lawrence
Srinivas, K., & Roediger, H. L. (1990). Classifying implicit memory tests: Category association and
anagram solution. Journal of Memory & Language, 29, 389–412.
Stadler, M. A., & Frensch, P. A. (1998). Handbook of implicit learning. Thousand Oaks, CA: Sage
Sun, R., Merrill, E., & Peterson, T. (2001). From implicit skills to explicit knowledge: A bottom-up model of
skill learning. Cognitive Science, 25(2), 203–244.
Sweller, J. (1976). The effect of task complexity and sequence on rule learning and problem solving. British
Journal of Psychology, 67, 553–558.
Sweller, J. (1980). Hypothesis salience, task difficulty, and sequential effects on problem solving. American
Journal of Psychology, 93, 135–145.
Sweller, J. (1988). Cognitive load during problem solving: Effects on learning. Cognitive Science, 12, 257–
Sweller, J. (1999). Instructional design in technical areas. Melbourne: ACER Press.
Sweller, J. (2003). Evolution of human cognitive architecture. In B. Ross (Ed.), The psychology of learning
and motivation, Vol. 43 (pp. 215–266). San Diego: Academic Press.
Sweller, J. (2004). Instructional design consequences of an analogy between evolution by natural selection
and human cognitive architecture. Instructional Science, 32, 9–31.
Sweller, J. (2005). Implications of cognitive load theory for multimedia learning. In R. E. Mayer (Ed.), The
Cambridge handbook of multimedia learning (pp. 19–30). New York: Cambridge University Press.
Sweller, J., & Chandler, P. (1991). Evidence for cognitive load theory. Cognition and Instruction, 8, 351–362.
Sweller, J., & Chandler, P. (1994). Why some material is difficult to learn. Cognition and Instruction, 12(3),
Sweller, J., Chandler, P., Tierney, P., & Cooper, M. (1990). Cognitive load as a factor in the structuring of
technical material. Journal of Experimental Psychology: General, 119, 176–192.
Sweller, J., & Cooper, G. A. (1985). The use of worked examples as a substitute for problem solving in
learning algebra. Cognition and Instruction, 2, 59–89.
Sweller, J., & Levine, M. (1982). Effects of goal specificity on means–ends analysis and learning. Journal of
Experimental Psychology: Learning, Memory, and Cognition, 8, 463–474.
Sweller, J., Mawer, R. F., & Howe, W. (1982). Consequences of history-cued and means–end strategies in
problem solving. American Journal of Psychology, 95, 455–483.
Sweller, J., & Sweller, S. (2006). Natural information processing systems. Evolutionary Psychology, 4, 434–
Sweller, J., van Merriënboer, J. J. G., & Paas, F. G. W. C. (1998). Cognitive architecture and instructional
design. Educational Psychology Review, 10(3), 251–296.
Tindall-Ford, S., Chandler, P., & Sweller, J. (1997). When two sensory modes are better than one. Journal of
Experimental Psychology: Applied, 3, 257–287.
Van Gerven, P. W. M., Paas, F., van Merriënboer, J. J. G., & Schmidt, H. G. (2004). Memory load and the
cognitive pupillary response in aging. Psychophysiology, 41, 167–174.
Van Merriënboer, J. J. G. (1997). Training complex cognitive skills. Englewood Cliffs, NJ: Educational
Vygotski, L. S. (1963). Learning and mental development at school age. In B. Simon & J. Simon (Eds.),
Educational psychology in the U.S.S.R. (pp. 21–34). London: Routledge & Kegan Paul.
Wallen, E., Plass, J. L., & Brünken, R. (2005). The function of annotations in the comprehension of scientific
texts: Cognitive load effects and the impact of verbal ability. Educational Technology Research and
Wenger, E. (1998). Communities of practice: Learning, meaning, and identity. Cambridge: Cambridge
Winne, P. H., & Hadwin, A. F. (1998). Studying as self-regulated learning. In D. J. Hacker, J. Dunlosky, &
A. C. Graesser (Eds.), Metacognition in educational theory and practice (pp. 277–304). Mahwah, NJ:
Lawrence Erlbaum Associates.
Yeung, A. S., Jin, P., & Sweller, J. (1997). Cognitive load and learner expertise: Split-attention and
redundancy effects in reading with explanatory notes. Contemporary Educational Psychology, 23, 1–21.