The Modality Effect of Cognitive Load Theory
Juan C. Castro-Alonso¹ (✉) and John Sweller²
¹ Center for Advanced Research in Education, Universidad de Chile, Santiago, Chile
jccastro@ciae.uchile.cl
² School of Education, University of New South Wales, Sydney, Australia
j.sweller@unsw.edu.au
Abstract. The modality effect, which has been investigated by cognitive load
theory, predicts that learning from visualizations supplemented with written text
should be less effective than learning from the same visualizations supplemented
with comparable spoken text. An explanation of the effect assumes a degree of
separation between the processing of visuospatial and auditory information. Due
to this separability, learning only from visuospatial information (visualizations
and visual text) is more likely to overload visuospatial processing, as compared
to learning from visuospatial and auditory information (visualizations and
auditory text), in which both the visuospatial and the auditory processors share
the load of the learning material. The aims of this review chapter are to:
(a) describe the modality effect, (b) provide supporting evidence using computer
multimedia about STEM topics, and (c) describe studies indicating the separability
of visuospatial and auditory processing. We finish by suggesting future
directions for research on the modality effect.
Keywords: Modality effect or modality principle · Cognitive load theory ·
STEM education · Multicomponent working memory · Multimedia learning
1 Introduction
For over 30 years, cognitive load theory [1] and the related cognitive theory of
multimedia learning [2] have generated several educational principles based on randomized
experiments with control of variables. Under these strict methodological conditions,
many instructional principles or effects have been investigated. We focus on one of
these principles, the modality effect or modality principle [e.g., 3], which guides the
design of multimedia instructional resources that combine visualizations (e.g., animations,
videos, photos, diagrams) and texts (e.g., on-screen, printed, narrated). The
modality effect occurs when multimedia that depicts visualizations associated with
written text is less effective for learning than multimedia that depicts the same
visualizations associated with comparable narrated text [4].
© Springer Nature Switzerland AG 2020
W. Karwowski et al. (Eds.): AHFE 2019, AISC 963, pp. 75–84, 2020.
https://doi.org/10.1007/978-3-030-20135-7_7
Cognitive load theory explains the modality effect by building on Baddeley and
Hitch’s multicomponent model of working memory [see 5]. In this model, working
memory includes two systems with limited capacity: (a) the phonological loop,
managing the processing of auditory information, and (b) the visuospatial sketch pad,
dealing with visual and spatial information. The multicomponent model indicates that
auditory and visuospatial information to a substantial degree tend to be processed
separately in these limited systems [cf. 6]. Due to this separability, learning only from
visuospatial information (images supplemented with written texts) is more likely to
overload the limited visuospatial sketch pad, as compared to learning from visuospatial
and auditory information (images supplemented with spoken text), in which both the
visuospatial sketch pad and the phonological loop share the cognitive load of the
learning material.
Hence, the modality effect is produced when an overloaded system does not have
enough capacity to deal with learning, as compared to two less overloaded systems. For
example, Fig. 1 shows two multimedia formats to teach the shape of bacteria. The
format on the left (Fig. 1a) supplements the images with spoken text, whereas the
design on the right (Fig. 1b) supplements the images with written text, so it is more
likely to overload the visuospatial processor. (Note that cognitive load theory usually
investigates more complex educational materials, likely to overload the visuospatial or
auditory systems [see 7], rather than the simple example we are providing).
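The load-sharing argument above can be illustrated with a toy calculation. The following is only a sketch and not part of the formal theory: the capacity value, the load units, and all names (`Format`, `channel_loads`, `overloaded`) are assumptions introduced here, and the model deliberately ignores any processing of text beyond the channel through which it first arrives.

```python
from dataclasses import dataclass

# Hypothetical capacity of each processor, in arbitrary load units (an assumption).
CAPACITY = 10.0


@dataclass
class Format:
    name: str
    picture_load: float   # load imposed by the visualization
    text_load: float      # load imposed by the accompanying text
    text_is_spoken: bool  # True for narration, False for on-screen text


def channel_loads(fmt):
    """Assign the picture and text loads to the two processors."""
    visuospatial = fmt.picture_load
    auditory = 0.0
    if fmt.text_is_spoken:
        auditory += fmt.text_load      # narration goes to the auditory processor
    else:
        visuospatial += fmt.text_load  # written text competes with the picture
    return {"visuospatial": visuospatial, "auditory": auditory}


def overloaded(fmt, capacity=CAPACITY):
    """Names of the processors whose load exceeds the assumed capacity."""
    return [name for name, load in channel_loads(fmt).items() if load > capacity]


# Same picture and same text, differing only in text modality.
narrated = Format("narrated", picture_load=7.0, text_load=5.0, text_is_spoken=True)
written = Format("written", picture_load=7.0, text_load=5.0, text_is_spoken=False)

print(overloaded(narrated))  # []
print(overloaded(written))   # ['visuospatial']
```

With these illustrative numbers, neither processor is overloaded in the narrated format, whereas the written format pushes the visuospatial processor past its assumed limit.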
From the diverse materials and topics where the modality effect has been investigated,
in the next section we provide examples of supporting results on computer
multimedia dealing with STEM (science, technology, engineering, and mathematics)
topics.
Fig. 1. Example of formats that (a) use both the visuospatial sketch pad and the phonological loop
(with the shapes spoken rather than written), and (b) use solely the visuospatial sketch pad (at the
risk of overloading it).
2 STEM Multimedia Evidence for the Modality Effect
In a meta-analysis of the modality effect [8], in which 43 effect sizes and more than
1,900 participants were investigated, there was an overall effect size of d = 0.72.
According to benchmarks for behavioral sciences [9], this size corresponds to a medium
to large effect. Notably, when the meta-analysis [8] compared different instructional
disciplines, a larger effect size of d = 1.20 was reported for the science domains.
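To make the effect sizes above concrete, the following minimal sketch computes a standardized mean difference (Cohen's d) with a pooled standard deviation and labels it with the conventional benchmarks of 0.2, 0.5, and 0.8. It is an illustration only: the meta-analysis [8] may have used a different estimator or corrections, and the function names `cohens_d` and `cohen_label` are assumptions introduced here.

```python
import math
import statistics


def cohens_d(treatment, control):
    """Standardized mean difference using the pooled sample standard deviation."""
    n1, n2 = len(treatment), len(control)
    var1 = statistics.variance(treatment)  # sample variance (n - 1 denominator)
    var2 = statistics.variance(control)
    pooled_sd = math.sqrt(((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2))
    return (statistics.mean(treatment) - statistics.mean(control)) / pooled_sd


def cohen_label(d):
    """Label an effect size using the conventional 0.2 / 0.5 / 0.8 thresholds."""
    size = abs(d)
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "medium"
    if size >= 0.2:
        return "small"
    return "negligible"


print(cohen_label(0.72))  # medium
print(cohen_label(1.20))  # large
```

Under this threshold scheme, d = 0.72 exceeds the medium benchmark (0.5) but falls short of the large one (0.8), which matches its description as a medium to large effect, whereas d = 1.20 is clearly above the large benchmark.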
Additional evidence for the modality effect using STEM multimedia has included a
variety of participants and learning content. For example, Experiment 1 reported in [10]
compared the instructional effectiveness of visual vs. auditory explanations added to
meteorology multimedia. In the study, where university students had to learn lightning
formation, it was observed that those given auditory texts outperformed those shown
the texts on-screen. Also, in [11], Experiment 1 assessed first-year apprentices learning
to read a fusion diagram for soldering. Results showed that a group receiving the
animated visuals supplemented with auditory explanations showed higher test scores
and lower self-ratings of cognitive load than a group given animated visuals supplemented
with written text explanations. In an experiment where university participants
studied lightning formation from different animations [12, Experiment 1], results
showed that animations with narrations obtained significantly higher retention and
transfer scores than animations with on-screen texts.
Concerning biological sciences, in [13] the authors investigated university students
learning about fish movements. For both static and animated displays, it was observed
that narrated multimedia outperformed written multimedia in tests of retention and
transfer. Also, Experiment 2 in [14] tested university participants who studied
static computer pictures and texts describing the structure and function of an
enzyme. Randomly, half of the sample studied from the pictures supplemented with
narrations and the other half received supplementary written texts. Results showed
higher recall scores for the group studying with images and narrations. A similar
direction of effects for the modality effect, although non-significant, was observed for
comprehension and transfer tests. Lastly, [15] investigated the capacity of university
students to learn a health science first-aid procedure using two animation formats. Half
of the participants were randomly assigned to animations and written texts (subtitles
below the depiction), and the other half watched the animations with narrated texts.
Results of the behavioral performance test showed that the narrated versions outperformed
the written text formats.
Thus, there are diverse studies using STEM multimedia learning supporting the
modality effect. As described above, the effect assumes that working memory can be at
least partially separated into different processors. Evidence for this separability can help
predict the effectiveness of the modality effect under different learning conditions.
Next, we describe two areas that have supported the separation of processing between
visuospatial and auditory information.
3 Working Memory Separability that Allows the Modality
Effect
As reviewed in [16], there are at least two research areas that show evidence for the
separability between the visuospatial and auditory subcomponents of working memory,
which will be termed here: (a) selective interference, and (b) modality organization.
Concerning selective interference, this research shows selective impairment in processing
visuospatial information when receiving additional visuospatial information
(but not when receiving extra auditory information), and selective impairment in
processing auditory information when receiving additional auditory information (but
not when receiving extra visuospatial information).
In a classic example of selective interference [17], male university students spoke
letters aloud while memorizing other letters. The information to be memorized was
presented either in an auditory or visual modality. It was observed that visually shown
letters were remembered for a longer period because, in contrast to auditorily presented
letters, they were not disrupted by the auditory process of speaking aloud. In an example
from multimedia research [18], Experiment 1 tested female education undergraduates
learning about the cardiovascular system from a multimedia module. While learning
the contents, students were required to respond as rapidly as possible to a change in
color of an on-screen element. Showing selective interference, students were slower in
responding to the visual change when the multimedia included on-screen text, as
compared to narrated multimedia. In other words, students were more impaired in the
color visuospatial tasks by the visual text than by the auditory text of the multimedia.
Similar multimedia are shown in Fig. 2, which depicts information based on the design
in [18]. In these replicas, the on-screen text is expected to produce a slower student
response when the star at the top-right changes color from purple to green (Fig. 2a),
compared to the response when there is a narration (Fig. 2b).
Fig. 2. With on-screen text (a), the response to the star changing color should be slower than with
auditory text (b).
As another example, in two dual-task experiments investigating visuospatial processing
[19], undergraduates were asked to generate and rotate mental images while
executing other simultaneous tasks. Results showed that the simultaneous task of
speaking a word (auditory task) did not interfere with the visuospatial tasks. In contrast,
the simultaneous task of localizing the source of a sound (spatial task) was deleterious
to processing the mental images.
The investigations by Robinson and colleagues are also part of the selective
interference research. The authors observed more interference between visuospatial and
verbal processing when the verbal information was presented in visuospatial configurations
[cf. 6]. For example, in [20] they reported four experiments with university
students learning zoology categories from different verbal displays, including less
visuospatial (written paragraphs) and more visuospatial configurations (graphic organizers
and concept maps). To produce processing interference, the students were also
given verbal and visual working memory tasks. As predicted, test scores on zoology
information memorized from graphic organizers or concept maps (high in visuospatial
information) were lower when attempting a visual working memory task, as compared
to a verbal working memory task. In contrast, these visuospatial interferences were not
observed when the information was memorized from paragraphs, which relied less on
visuospatial organization than graphic organizers and concept maps. In a follow-up
with stricter controls [21], the findings were replicated.
To exemplify these experiments, Fig. 3 provides different visuospatial configurations
for texts about two penguin species. The facts comparing the species are given
either as paragraphs (Fig. 3a), as a graphic organizer (Fig. 3b), or as a concept map
(Fig. 3c). It can be predicted that paragraphs will be less affected by visuospatial
interference than the graphic organizer or the concept map. In short, these investigations
of selective interference support the hypothesis that visuospatial information
(including texts presented in visuospatial configurations, such as concept maps) tends
to be processed separately from verbal information (including texts presented as written
paragraphs).
Fig. 3. Comparable texts configured as (a) paragraphs, (b) graphic organizer, and (c) concept map.
In addition to selective interference investigations, another area that supports the
separability of processing visuospatial and auditory information is modality organization.
This research shows that, when retrieving elements from memory, the modality
of the memorized items is pervasive and is even stronger than semantic or other types
of grouping categories. As a classic example, in [22] the author employed an original
strategy to present sets of four simultaneous words to university students. Two words
were shown visually, and the two remaining were presented auditorily (one word per
ear). Students had to report the four words from memory. The results showed that
participants reported the words in blocks of the same modality (i.e., auditory &
auditory, visual & visual), even though the four stimuli were simultaneous. In a follow-up
study, it was observed that this modality order was stronger than an associative or
semantic order between pairs of words of different modalities. For example, when a
pair such as girl (auditory) & boy (visual) was presented as stimuli, the association
girl–boy was weaker than the modality of each word. The words were always reported
in modality-determined blocks, even if this order broke the associations between word
pairs. Figure 4 shows examples of stimuli and responses when the modality of the
stimuli was congruent with the semantic association of the word pairs (Fig. 4a, girl and
boy in auditory modality) or when it was non-congruent (Fig. 4b, boy and day in visual
modality).
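The idea of reporting words in modality-determined blocks can be expressed as a simple check on the order of a recall report. The sketch below is illustrative only: the function name `is_blocked_by` and the data layout are assumptions, and the fourth word, night, is a hypothetical stimulus added to complete the non-congruent example (girl, boy, and day come from the text above).

```python
def is_blocked_by(report, label_of):
    """Return True if items that share a label are reported contiguously."""
    seen = set()
    previous = None
    for item in report:
        label = label_of[item]
        if label != previous:
            if label in seen:
                return False  # this label reappears after being interrupted
            seen.add(label)
            previous = label
    return True


# Non-congruent case: each semantic pair is split across the two modalities.
modality = {"girl": "auditory", "night": "auditory", "boy": "visual", "day": "visual"}
pair = {"girl": "girl-boy", "boy": "girl-boy", "day": "day-night", "night": "day-night"}

report = ["girl", "night", "boy", "day"]  # a modality-ordered report

print(is_blocked_by(report, modality))  # True
print(is_blocked_by(report, pair))      # False
```

A report that is blocked by modality but not by semantic pair corresponds to the pattern described above, in which the modality order overrides the association between the words.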
Fig. 4. The modality organization shown in stimuli and response time, when the modality of the
stimuli is either (a) congruent or (b) non-congruent with the semantic association.
Another study related to modality organization [23] investigated the response times
of undergraduates receiving repeated and novel words, presented either visually or
auditorily. As expected, results showed a faster response time when repeated words
were presented (same word in the study and test phases), compared to novel words.
Notably, the effect was larger when the same modality was used in the study and test
phases, compared to cross-modality results. These findings indicate that processing the
modality of a word occurs earlier than processing its meaning. Analogous findings with
numerical stimuli were reported in [24], regarding an experiment with university students.
The students received a series of interspersed auditory and visual one-digit
numbers. It was observed that participants who reported the digits grouped according to
modality outperformed those grouping the digits according to their presentation order
and mixing the modalities. In other words, it was more efficient to memorize the two
modalities in parallel rather than intermixing them.
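The two report strategies compared in this experiment can be sketched as two orderings of the same interleaved stream. This is a hypothetical illustration: the digits, the function names, and the choice to report the auditory digits first are assumptions, not details taken from [24].

```python
def presentation_order(stream):
    """Report the digits in the order presented, mixing the modalities."""
    return [digit for _, digit in stream]


def modality_grouped_order(stream):
    """Report all digits of one modality, then all digits of the other.

    Reporting the auditory digits first is an arbitrary choice in this sketch.
    """
    auditory = [digit for modality, digit in stream if modality == "auditory"]
    visual = [digit for modality, digit in stream if modality == "visual"]
    return auditory + visual


# Illustrative interleaved stream of (modality, digit) pairs.
stream = [("auditory", 3), ("visual", 8), ("auditory", 5), ("visual", 1)]

print(presentation_order(stream))      # [3, 8, 5, 1]
print(modality_grouped_order(stream))  # [3, 5, 8, 1]
```

Both orderings contain the same digits; they differ only in whether the report follows the temporal sequence or keeps each modality together, which is the contrast examined in the experiment.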
In conclusion, modality organization research supports the view that items presented
visuospatially or auditorily tend to be remembered attached to the modality in which they
were presented. Thus, both modality organization and selective interference investigations
support the suggestion that visuospatial processing in working memory is at
least to some extent independent from auditory processing. As described next, future
directions for research in these areas of investigation may inform cognitive load theory
and the modality effect.
4 Future Directions for Research
A possible direction for further study of the modality effect concerns different designs
of visualizations and texts. For example, under visualization conditions that demand
more working memory, such as transient animations [25], the modality effect should be
larger than under less demanding conditions. The design of the auditory text can also
be important, as voice narrations can be more effective than machine narrations [26].
In addition, selective interference and modality organization research may inform
more fine-grained analyses to investigate the modality effect. For example, selective
interference can include the time factor, as giving a space or lapse of time can be effective
to allow working memory to replenish resources and avoid interference [cf. 27].
Also, there are current investigations relating the modality effect to other effects of
cognitive load theory or the cognitive theory of multimedia learning [see 4]. For
example, links have been established with the redundancy effect [see 28], the expertise
reversal effect [see 29], the transient information effect [see 30], and the signaling
principle [see 31].
For example, if the narration merely reiterates information in the visualization, this
will likely produce a negative redundancy effect [28] instead of a positive modality
effect. Also, related to the redundancy [28] and expertise reversal effects [7, 29], if an
on-screen text contains information already known by the students, then it is more
appropriate to discard it rather than to present it as narration. Similarly, boundary
conditions for the modality effect entail the integration of findings from the modality
effect (supporting narrated texts) and the transient information effect (discouraging
transient narrated texts) [e.g., 32].
Regarding the signaling principle [31], sometimes a short on-screen text that signals
important visual information can be an effective learning asset. For example, in [15],
short labels placed at relevant areas of the screen had positive effects on learning.
Future research should investigate the most appropriate length, format, and placement
to produce these positive signaling effects of texts, while avoiding a cognitive overload
of the visuospatial processor.
Lastly, there are known factors that influence multimedia learning, which may also
affect the modality effect, such as gender [33] and visuospatial processing or spatial
ability [34].
5 Conclusion
The modality effect has been investigated by cognitive load theory and the cognitive
theory of multimedia learning. The effect has shown that learning from visualizations
supplemented with written text is less effective than learning from the same visualizations
supplemented with comparable narrations. These findings assume that there is
a degree of separation in the working memory processing of visuospatial and auditory
information, and that written text will tend to overload the visuospatial processor more
than auditory text, because written text will be processed simultaneously with the
visual learning elements. As auditory text shifts the load from the visuospatial
processor onto the auditory processor, it leaves more capacity for the visuospatial
processor to deal with the visual learning elements. The modality effect has been
supported by diverse educational multimedia depicting STEM concepts. Similarly, the
foundation of the effect, that is the separability between visuospatial and auditory
processing, has also been systematically investigated. We described two research areas
showing this separability, namely, selective interference and modality organization.
Future directions for research, including boundary conditions, will provide additional
support for the modality effect and its relationship with other findings of cognitive load
theory and the derived cognitive theory of multimedia learning.
Acknowledgments. Support from PIA–CONICYT Basal Funds for Centers of Excellence
Project FB0003, and CONICYT Fondecyt 11180255, is gratefully acknowledged. We thank
Ignacio Jarabran for helping with the illustrations.
References
1. Sweller, J., Ayres, P., Kalyuga, S.: Cognitive Load Theory. Springer, New York (2011)
2. Mayer, R.E. (ed.): The Cambridge Handbook of Multimedia Learning, 2nd edn. Cambridge
University Press, New York (2014)
3. Mousavi, S.Y., Low, R., Sweller, J.: Reducing cognitive load by mixing auditory and visual
presentation modes. J. Educ. Psychol. 87(2), 319–334 (1995)
4. Low, R., Sweller, J.: The modality principle in multimedia learning. In: Mayer, R.E. (ed.)
The Cambridge Handbook of Multimedia Learning, 2nd edn., pp. 227–246. Cambridge
University Press, New York (2014)
5. Baddeley, A.: Working memory. Science 255(5044), 556–559 (1992)
6. Clark, J., Paivio, A.: Dual coding theory and education. Educ. Psychol. Rev. 3(3), 149–210
(1991)
7. Chen, O., Kalyuga, S., Sweller, J.: The expertise reversal effect is a variant of the more
general element interactivity effect. Educ. Psychol. Rev. 29(2), 393–405 (2017)
8. Ginns, P.: Meta-analysis of the modality effect. Learn. Instr. 15(4), 313–331 (2005)
9. Cohen, J.: Statistical Power Analysis for the Behavioral Sciences. Erlbaum, Hillsdale (1988)
10. Schmidt-Weigand, F., Kohnert, A., Glowalla, U.: A closer look at split visual attention in
system- and self-paced instruction in multimedia learning. Learn. Instr. 20(2), 100–110
(2010)
11. Kalyuga, S., Chandler, P., Sweller, J.: Managing split-attention and redundancy in
multimedia instruction. Appl. Cogn. Psychol. 13(4), 351–371 (1999)
12. Moreno, R., Mayer, R.E.: Cognitive principles of multimedia learning: the role of modality
and contiguity. J. Educ. Psychol. 91(2), 358–368 (1999)
13. Kühl, T., Scheiter, K., Gerjets, P., Edelmann, J.: The influence of text modality on learning
with static and dynamic visualizations. Comput. Hum. Behav. 27(1), 29–35 (2011)
14. Seufert, T., Schütze, M., Brünken, R.: Memory characteristics and modality in multimedia
learning: an aptitude-treatment-interaction study. Learn. Instr. 19(1), 28–42 (2009)
15. de Koning, B.B., van Hooijdonk, C.M.J., Lagerwerf, L.: Verbal redundancy in a procedural
animation: on-screen labels improve retention but not behavioral performance. Comput.
Educ. 107, 45–53 (2017)
16. Penney, C.G.: Modality effects and the structure of short-term verbal memory. Mem. Cogn.
17(4), 398–422 (1989)
17. Kroll, N.E.A., Parks, T., Parkinson, S.R., Bieber, S.L., Johnson, A.L.: Short-term memory
while shadowing: recall of visually and of aurally presented letters. J. Exp. Psychol. 85(2),
220–224 (1970)
18. Brünken, R., Steinbacher, S., Plass, J.L., Leutner, D.: Assessment of cognitive load in
multimedia learning using dual-task methodology. Exp. Psychol. 49(2), 109–119 (2002)
19. Bruyer, R., Scailquin, J.-C.: The visuospatial sketchpad for mental images: testing the
multicomponent model of working memory. Acta Psychol. 98(1), 17–36 (1998)
20. Robinson, D.H., Katayama, A.D., Fan, A.-C.: Evidence for conjoint retention of information
encoded from spatial adjunct displays. Contemp. Educ. Psychol. 21(3), 221–239 (1996)
21. Robinson, D.H., Robinson, S.L., Katayama, A.D.: When words are represented in memory
like pictures: evidence for spatial encoding of study materials. Contemp. Educ. Psychol.
24(1), 38–54 (1999)
22. Murdock Jr., B.B.: Four-channel effects in short-term memory. Psychon. Sci. 24(4),
197–198 (1971)
23. Kirsner, K., Smith, M.C.: Modality effects in word identification. Mem. Cogn. 2(4), 637–640
(1974)
24. Penney, C.G.: Order of report in bisensory verbal short-term memory. Can. J. Psychol.
34(2), 190–195 (1980)
25. Castro-Alonso, J.C., Ayres, P., Wong, M., Paas, F.: Learning symbols from permanent and
transient visual presentations: don’t overplay the hand. Comput. Educ. 116, 1–13 (2018)
26. Mayer, R.E., DaPra, C.S.: An embodiment effect in computer-based learning with animated
pedagogical agents. J. Exp. Psychol. Appl. 18(3), 239–252 (2012)
27. Chen, O., Castro-Alonso, J.C., Paas, F., Sweller, J.: Extending cognitive load theory to
incorporate working memory resource depletion: evidence from the spacing effect. Educ.
Psychol. Rev. 30(2), 483–501 (2018)
28. Kalyuga, S., Sweller, J.: The redundancy principle in multimedia learning. In: Mayer, R.E.
(ed.) The Cambridge Handbook of Multimedia Learning, 2nd edn., pp. 247–262. Cambridge
University Press, New York (2014)
29. Kalyuga, S., Ayres, P., Chandler, P., Sweller, J.: The expertise reversal effect. Educ. Psychol.
38(1), 23–31 (2003)
30. Ayres, P., Paas, F.: Making instructional animations more effective: a cognitive load
approach. Appl. Cogn. Psychol. 21(6), 695–700 (2007)
31. van Gog, T.: The signaling (or cueing) principle in multimedia learning. In: Mayer, R.E.
(ed.) The Cambridge Handbook of Multimedia Learning, 2nd edn., pp. 263–278. Cambridge
University Press, New York (2014)
32. Leahy, W., Sweller, J.: Cognitive load theory, modality of presentation and the transient
information effect. Appl. Cogn. Psychol. 25(6), 943–951 (2011)
33. Castro-Alonso, J.C., Wong, M., Adesope, O.O., Ayres, P., Paas, F.: Gender imbalance in
instructional dynamic versus static visualizations: a meta-analysis. Educ. Psychol. Rev.
(2019). Advance Online Publication
34. Castro-Alonso, J.C., Uttal, D.H.: Spatial ability for university biology education. In: Nazir,
S., Teperi, A.-M., Polak-Sopińska, A. (eds.) Advances in Human Factors in Training,
Education, and Learning Sciences: Proceedings of the AHFE 2018 International Conference
on Human Factors in Training, Education, and Learning Sciences, pp. 283–291. Springer
(2019)