Chapter

Principles for reducing extraneous processing in multimedia learning: Coherence, signaling, redundancy, spatial contiguity, and temporal contiguity principles



Extraneous overload occurs when essential cognitive processing (required to understand the essential material in a multimedia message) and extraneous cognitive processing (required to process extraneous material or to overcome confusing layout in a multimedia message) exceed the learner’s cognitive capacity. Five multimedia design methods intended to minimize extraneous overload are based on the coherence, signaling, redundancy, spatial contiguity, and temporal contiguity principles. The coherence principle is that people learn more deeply from a multimedia message when extraneous material is excluded rather than included. This principle was supported in 23 out of 23 experimental tests, yielding a median effect size of 0.86. The signaling principle is that people learn more deeply from a multimedia message when cues are added that highlight the organization of the essential material. This principle was supported in 24 out of 28 experimental tests, yielding a median effect size of 0.41. The redundancy principle is that people learn more deeply from graphics and narration than from graphics, narration, and on-screen text. This principle was supported in 16 out of 16 experimental tests, yielding a median effect size of 0.86. The spatial contiguity principle is that people learn more deeply from a multimedia message when corresponding words and pictures are presented near rather than far from each other on the page or screen. This principle was supported in 22 out of 22 experimental tests, yielding a median effect size of 1.10. The temporal contiguity principle is that people learn more deeply from a multimedia message when corresponding animation and narration are presented simultaneously rather than successively. This principle was supported in 9 out of 9 experimental tests, yielding a median effect size of 1.22.
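The support counts reported in the abstract can be tabulated as proportions. The following is a minimal sketch, not part of the chapter: the figures are copied from the abstract, while the names `PRINCIPLES`, `support_rate`, and `overall_support` are hypothetical and introduced only for illustration.

```python
# Figures copied from the abstract:
# principle -> (tests supporting it, tests run, median effect size).
PRINCIPLES = {
    "coherence": (23, 23, 0.86),
    "signaling": (24, 28, 0.41),
    "redundancy": (16, 16, 0.86),
    "spatial contiguity": (22, 22, 1.10),
    "temporal contiguity": (9, 9, 1.22),
}


def support_rate(supported: int, total: int) -> float:
    """Return the fraction of experimental tests that supported a principle."""
    return supported / total


def overall_support(principles: dict) -> tuple:
    """Sum supporting tests and total tests across all principles."""
    supported = sum(s for s, _, _ in principles.values())
    total = sum(t for _, t, _ in principles.values())
    return supported, total


if __name__ == "__main__":
    for name, (s, t, d) in PRINCIPLES.items():
        print(f"{name}: {s}/{t} tests ({support_rate(s, t):.0%}), median d = {d}")
    s_all, t_all = overall_support(PRINCIPLES)
    print(f"all five principles: {s_all}/{t_all} tests")
```

Pooling the counts this way is only a descriptive convenience; it ignores differences between the studies and is not an analysis the chapter itself reports.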
List of Contributors
1. Introduction to Multimedia Learning
Part I. Theoretical Foundations
2. Implications of Cognitive Load Theory for Multimedia Learning
3. Cognitive Theory of Multimedia Learning
4. Integrated Model of Text and Picture Comprehension
5. The Four-Component Instructional Design Model: Multimedia Principles in Environments for Complex Learning
Part II. Basic Principles of Multimedia Learning
6. Ten Common but Questionable Principles of Multimedia
7. The Multimedia Principle
8. The Split-Attention Principle in Multimedia Learning
9. The Modality Principle in Multimedia Learning
978-1-107-03520-1 - The Cambridge Handbook of Multimedia Learning: Second Edition
Edited by Richard E. Mayer
Table of Contents
More information
10. The Redundancy Principle in Multimedia Learning
11. The Signaling (or Cueing) Principle in Multimedia Learning
12. Principles for Reducing Extraneous Processing in Multimedia Learning: Coherence, Signaling, Redundancy, Spatial Contiguity, and Temporal Contiguity Principles
13. Principles for Managing Essential Processing in Multimedia Learning: Segmenting, Pre-training, and Modality Principles
14. Principles Based on Social Cues in Multimedia Learning: Personalization, Voice, Image, and Embodiment Principles
Part III. Advanced Principles of Multimedia Learning
15. The Guided Discovery Learning Principle in Multimedia Learning
16. The Worked Examples Principle in Multimedia Learning
17. The Self-Explanation Principle in Multimedia Learning
18. The Generative Drawing Principle in Multimedia Learning
19. The Feedback Principle in Multimedia Learning
20. The Multiple Representation Principle in Multimedia
21. The Learner Control Principle in Multimedia Learning
22. Animation Principles in Multimedia Learning
23. The Collaboration Principle in Multimedia Learning
24. The Expertise Reversal Principle in Multimedia Learning
25. The Individual Differences in Working Memory Capacity Principle in Multimedia Learning
Part IV. Multimedia Learning of Cognitive Processes
26. Multimedia Learning of Cognitive Processes
27. Multimedia Learning of Metacognitive Strategies
28. Multimedia Learning and the Development of Mental Models
Part V. Multimedia Learning in Advanced Computer-Based Contexts
29. Multimedia Learning with Intelligent Tutoring Systems
30. Multimedia Learning with Simulations and Microworlds
31. Multimedia Learning with Computer Games
32. Multimedia Learning with Video
33. Multimedia Learning from Multiple Documents
34. Multimedia Learning in e-Courses
Author Index
Subject Index
... Second, each of the channels has its own limited capacity (Baddeley, 1992), and third, learning from multimedia material is effective when learners actively engage in selection, organization, and integration of the material. However, these processes are hampered when extraneous cognitive load binds too many cognitive resources (Mayer and Fiorella, 2014). Thus, the reduction of extraneous cognitive load is necessary due to the limited capacity of the cognitive system (Sweller, 1991, 1992; Paas, 1992; Paas and Van Merriënboer, 1994). ...
... In this regard, it is important to consider prior knowledge as it relates to the intrinsic load the material poses on a learner. Learners with higher prior knowledge may profit less from signaling than those with lower prior knowledge because their intrinsic cognitive load is lower and they might be less in danger of being overloaded (Van Merrienboer and Sweller, 2005; Mayer and Fiorella, 2014; Richter et al., 2016, 2018; Alpizar et al., 2020). For those learners, the threshold for positive effects of signaling is supposedly higher compared to learners who are already heavily challenged by the high intrinsic load. ...
... One typical approach to explain effects of multimedia-design principles is a reduction of extraneous cognitive load. Technically speaking, signaling is supposed to reduce extraneous cognitive load by reducing the processed information to the gist (Mayer and Fiorella, 2014; Schneider et al., 2018; Alpizar et al., 2020). However, adding information that is not directly related to the content, even if it is a signal, adds extraneous cognitive load (Sweller, 1988; Paas, 1992). ...
Classroom videos are a viable means to implement evidence-informed reasoning in teacher education in order to establish an evidence-informed teaching practice. Although learning with videos relieves pre-service teachers from acting in parallel and might reduce complexity, the material still poses higher cognitive load than written text vignettes or other traditionally used static material. In particular, the information they deliver is transient and can, therefore, easily be missed. Signaling can guide learners’ attention to central aspects of a video, thereby reducing cognitive load and enhancing learning outcomes. In the current project, pre-service teachers acquired scientific knowledge about learning strategies and their promotion in a computer-based learning environment. We explored the effect of different arrangements of signaling in classroom video-examples on conceptual knowledge and the reasoning-component of professional vision. Therefore, we conducted a set of two studies with 100 student teachers including two signal arrangements in order to investigate how signaling can help learning to reason about classroom videos. In addition, we varied if participants received information on the use of signals in advance (informed) or not (uninformed). We measured conceptual knowledge by asking participants what they knew about self-regulation strategies. Additionally, we assessed reasoning by asking participants to notice sequences in a video where teachers induced learning strategies, and to reason in what respect the observed behavior was useful to induce the strategy. Uninformed signaling did not affect the acquisition of conceptual knowledge and reasoning. Informed signaling led to significantly better conceptual knowledge than uninformed signaling. It is argued that the signal-induced extraneous load exceeded the load reduction due to the signal’s selection advantage in the uninformed conditions. 
In a third, exploratory study, nine participants were interviewed on the perception of different signals and indicated that spotlight and zoom-in signals foster processing of classroom videos.
... In the context of multimedia learning, research has identified several design principles for reducing this type of cognitive load. The most relevant to our research are briefly presented as follows [25]: ...
... According to the signaling principle, it is important to highlight those pieces of information that are particularly significant for learning. In the case of texts, this highlighting is usually done by formatting the respective text passages differently, while in images or videos, symbols such as circles or arrows are usually used to attract the attention of the learners [25]. ...
... While one piece of information has to be kept ready in the working memory, the related information may still have to be searched for and processed. To avoid this effect, the temporal and spatial contiguity principle is implemented, requiring that related information be closely connected in time and space [25]. ...
Chemical phenomena are only observable on a macroscopic level, whereas they are explained by entities on a non-visible level. Students often demonstrate limited ability to link these different levels. Augmented reality (AR) offers the possibility to increase contiguity by embedding virtual models into hands-on experiments. Therefore, this paper presents a pre- and post-test study investigating how learning and cognitive load are influenced by AR during hands-on experiments. Three comparison groups (AR, animation and filmstrip), with a total of N = 104 German secondary school students, conducted and explained two hands-on experiments. Whereas the AR group was allowed to use an AR app showing virtual models of the processes on the submicroscopic level during the experiments, the two other groups were provided with the same dynamic or static models after experimenting. Results indicate no significant learning gain for the AR group in contrast to the two other groups. The perceived intrinsic cognitive load was higher for the AR group in both experiments as well as the extraneous load in the second experiment. It can be concluded that AR could not unleash its theoretically derived potential in the present study.
... However, the authors note that there are potentially more benefits if hands are included, as the instructional efficiency is improved, particularly for more complex tasks where the cognitive load may be higher. Mayer and Fiorella (2014) support this theory by claiming that the presence of hands may produce signaling or cueing, an effective technique to reduce cognitive load and to improve learning from visualizations. The signaling principle states that people learn more deeply from a multimedia message where added cues highlight the organization of the essential material. ...
... Numerous studies show, however, that a combination of text and pictures supports learning and deepens understanding and problem-solving processes (J. M. Clark & Paivio, 1991; Mayer & Fiorella, 2014; Wittrock, 2010), independently of the cognitive style of individuals (Massa & Mayer, 2006). ...
... The theory according to which people learn better from words and pictures than from words alone was first introduced by Mayer (2001) and is known as the multimedia effect. However, simply combining text and pictures does not seem to always improve learning, since such a complex process is dependent on various aspects including the form of visualization, the type of learning task, the number of referential connections between text and pictures, and the personal characteristics of the learner (Mayer & Fiorella, 2014; Schnotz & Bannert, 2003). Thus, the learning performance differs with respect to individual differences like prior knowledge (Kalyuga, 2007), spatial ability (Hegarty, 2012; Höffler & Leutner, 2011), or cognitive style (Höffler et al., 2010). ...
To remain competitive in the ongoing industrial revolution (i.e., Industry 4.0), manufacturing sectors must ensure high flexibility at the production level, a need best addressed by skilled human workforce. As traditional training becomes increasingly inefficient, finding a better way of training novice workers becomes a critical requirement. Literature suggests that augmented reality (AR), an emerging technology proposed by Industry 4.0, can potentially address this concern. The benefits of AR-based knowledge sharing tools have been demonstrated in a variety of domains, including industry, from manufacturing to validation and maintenance. However, despite the progress of AR in recent years, no significant industrial breakthrough can be noted. We found that most AR systems are elaborated and evaluated under controlled settings, without the implication of the eventual end users. Guided by literature recommendations, we conducted a long-term case study in a manual assembly factory, to identify needs and expectations that an AR training system should meet, to optimally address the considered industrial sector. Further, we conducted an in-depth analysis on information representation and conveyance in AR, with respect to cognitive implications and content authoring efforts. We explored as well human-computer interaction paradigms to identify principles and design guidelines for elaborating an AR tool dedicated to the shop floor context. We found that the visual representation of the assembly expertise in AR can rely on spatially registered low-cost visual assets (i.e., text, photo, video, and predefined auxiliary data), while a human-centered design should be adopted during the elaboration of the AR system, prioritizing usability and usefulness rather than performance. We defined a formalized visual representation (i.e., 2W1H principle) of assembly operations in AR, that considers authoring concerns and supports training performance.
We proposed an HMD-only immersive authoring that allows one to capture his assembly expertise in-situ, during the assembly itself. The authoring is a one-step process, does not rely on existing data or external services and does not require AR or technical expertise, pre- or post-processing of data. During training, the assembly information is conveyed via AR by following the 2W1H principle, designed to guide novice workers in a natural, non-intrusive manner, minimizing user input and UI clutter, and aiming to optimize comprehension and learning. We evaluated our proposal by conducting several experiments. The first, conducted on a real-world assembly workstation, confirmed the hypothesis that spatially registered low-cost visual assets can effectively convey manual assembly expertise to novice workers via AR in an industrial setup. The findings of the second experiment supported the assumption that the worthiness of authoring CAD-based AR instructions in similar industrial context is questionable. A final experiment proved that the proposed AR system, including both authoring and training procedures, can be used effectively by novices in a matter of minutes. The overall reported feedback demonstrated the usability and efficiency of the proposed AR training approach, indicating that a similar system implementation could be successfully adopted in shop floor environments. Future work should validate the reported experimental findings in large-scale industrial evaluations and propose reliable “intelligent” modules (e.g., assembly validation and feedback) to better assist novice workers during training and optimize the authoring procedure as well.
... Video creators violate this so-called modality-principle, for example, when the visual element includes complete sentences, and the explanation encompasses reading these sentences. Third, according to the temporal contiguity principle (Mayer & Fiorella, 2014), it is important to present visual and spoken textual information at the same time, rather than first talking about a visual and then showing the visual (or the other way around). Fourth, video creators should refrain from using visuals with irrelevant (but possibly interesting) details, as they might distract the learner from the important content (Mayer & Fiorella, 2014). ...
... Second, it is not possible to identify more subtle qualitative differences between videos that meet the same criteria. For instance, the video quality differs most likely not only because the video does (or does not) use visualizations but because of the kind of visualization used and how it is connected to the (verbally explained) content (Mayer & Fiorella, 2014). ...
More and more teachers create video explanations for their instruction. Whether or not they are effective for learning depends on the videos’ instructional quality. Reliable measures to assess the quality of video explanations, however, are still rare, especially for videos created by (preservice) teachers. We developed such a measure in a two-step process: First, the categories were theoretically derived. Second, a coding manual was developed and used with 36 videos, which were created by preservice teachers during a university seminar. The resulting framework, which can be used as a coding manual for future research, consists of twelve criteria in five different categories: video content, learner orientation, representation and design, language, and process structure. With this framework, we contribute a reliable measure to evaluate the quality of existing videos. In practice, teachers can also use this measure as a guideline when creating or choosing video explanations for the classroom.
... HMD FPV capture addresses the hands-free requirement and ensures the alignment of trainee's viewpoint with the one in the video, which otherwise would make following it more difficult [53]. Media contextualization aims to address the spatial contiguity principle [54], lower physical and mental effort during the training session. ...
... The proposed information conveyance technique aims to address, in addition to information correctness and completeness, human considerations like cognitive load and facility to use. As identified by Mayer et al. [54], AR can reduce cognitive load by supplementing human labor with the right information at the right moment and in the right mode without disturbing user's focus. Sahu et al. [55] stated that seamlessly displaying AR information is essential to minimize the cognitive load of workers and not display redundant augmentation media during the assembly procedure [56]. ...
... The information conveyance approach in ATOFIS aims to speed up instruction reading (H5) and facilitate comprehension (i.e., less assembly errors) (H6) during training by presenting the right information at the right time and in the right mode [54] (see Section 3.5). Unlike ATOFIS, Guides presents the information (i.e., text description and image/video) in a panel that is not spatially registered with the assembly location, not necessarily in the FoV of the trainee and that can visually interfere with the assembly. ...
Conference Paper
This paper reports on a user study to comparatively evaluate two AR training systems designed for step-by-step manual operations: ATOFIS - recently proposed in the literature, and Microsoft Dynamics 365 Guides (hereinafter Guides) - one of the most relevant state-of-the-art commercial solutions. The user study (N=16) was conducted in two stages - i.e., training and authoring, on a partial replica of a real-world assembly workstation. During training, the participant learns a sequence of manual operations by performing two assembly cycles, guided by each of the two AR training systems. During authoring, the participant creates the two sets of AR work instructions used in the next training session, one set with each of the two authoring systems. We bound the authoring and training procedures during the experiment to comparatively assess the AR systems overall, and address at the same time an evaluation gap observed in the literature. The experimental results demonstrated advantages of the authoring approach proposed by ATOFIS (i.e., low-cost, formalized, in-situ, immersive and on-the-fly), proved the usability and effectiveness of the AR instructions authored with ATOFIS and validated a set of hypotheses formulated by the authors of the system. ATOFIS authoring was 1.72x faster and unanimously preferred by the participants; ATOFIS training reported zero assembly errors and was 13% faster than Guides. ATOFIS reported excellent system usability (i.e., SUS) and mental workload (i.e., NASA-TLX) scores for both authoring and training, outperforming Guides on all dimensions.
... For example, Albus et al. (2021) investigated whether the multimedia signalling principle of directing the learner's attention by using textual annotations to relevant information might be effective in VR. They found that, similarly to 2D classical multimedia settings, signalling improved VR learners' learning outcomes, but in contrast to the 2D multimedia settings, there was no effect on comprehension and transfer (Alpizar et al., 2020; Mayer & Fiorella, 2014). In another study by Makransky, Terkildsen, and Mayer (2019), the authors examined the redundancy principle, which states that people learn better when the same information is not presented in more than one format. ...
Background: Virtual reality (VR) is considered a promising approach to support learning. An instructional design is essential to optimize cognitive processes. Studies show that VR has unique instructional and pedagogical requirements.
Objectives: To evaluate the effectiveness and applicability of the modality principle, which was previously validated in 2D classic multimedia, for learning with VR. The modality principle states that multimedia information presented as spoken narration is superior to on‐screen text.
Methods: A prospective experimental study with two compared conditions of instruction: VR‐based learning guided by on‐screen text (n = 34) versus spoken narration (n = 28). Students' cognitive learning experiences were captured by eye‐tracking and electrodermal activity (EDA). In addition, students' knowledge was evaluated using a pre–post knowledge test.
Results and Conclusions: Overall, there was no significant difference in knowledge retention between the participants who learned with on‐screen text compared to spoken narration. However, results from the eye‐tracking analysis showed that students who learned with the on‐screen text devoted longer visual attention toward important learning activity areas of interest, suggesting a better ability to discern between relevant and irrelevant information. Conversely, students who learned with the spoken narration expressed significantly more EDA peak responses, proposing a higher cognitive load.
Implications: This study outlines that while learning with VR was effective, the modality principle might not apply to learning with VR. Moreover, the analysis of the learning process suggests even an inverse effect, favouring the provision of instructional scaffolds as on‐screen text. Future research should evaluate this effect on long‐term knowledge retention.
How to effectively learn from video lectures has attracted much attention from researchers and educators. Many attempts have been made to apply design principles in creating video lectures that will maximize learning. Relatively little attention has been paid to the learning strategies students use when watching video lectures. Our studies found that learners learn better when an instructor uses pointing gestures and continuous gaze guidance; they also learn better when the instructor uses direct gaze with a happy facial expression. Furthermore, learners learn better when they are not exposed to others’ messages while viewing video lectures and when they explain the content to themselves and to a peer after viewing short video lectures. The main findings suggest that whether learners can effectively learn from video lectures depends on both the design of the video lectures and their learning strategies. Our findings are discussed in terms of potential application in courses using video lectures.
Keywords: EEG oscillations; Eye-tracking technology; Learning strategies; Video lectures
Separating the wheat from the chaff: Selecting explanatory videos for literature teaching with a quality rubric. Podcasts, webquests, wikis, weblogs, and above all learning and explanatory videos are now widespread as digital learning aids. Following the school closures during the Covid-19 pandemic, they are now also increasingly used in formal educational contexts. In a forsa survey of around 1,000 teachers in April 2020, nearly every second secondary-level teacher reported having used explanatory videos during the school closures (cf. forsa, 2020). Given the widely varying quality of learning videos, this article presents an empirically validated, multidimensional criteria rubric. For quality analysis, the rubric combines subject-specific and subject-didactic criteria for teaching genre-typological knowledge about the short story with media-didactic findings on the quality of explanatory videos. By working with the rubric, (prospective) teachers are to be sensitized to the different quality levels of explanatory videos and thereby strengthened in their role as gatekeepers in an increasingly digitalized learning world. Before the central dimensions of the quality rubric and two associated exemplary analyses are presented in detail, the article gives an overview of learning videos as an object of learning and then discusses the need for quality criteria for them.
New technologies have made the world highly visual, making visual literacy an important and relevant 21st Century skill. Educators and students are often required to produce visual communication products such as infographics, but they often lack the confidence and proficiency in visual design skills to create higher-quality infographics. We conducted a case study to examine how graphic and instructional designers perform the visual design process (i.e., a series of actions performed in composing an infographic, such as creating layout and alignment) when applying visual design principles. The goal was to investigate the visual design processes and identify differences in the strategies used to develop higher versus lower quality infographics rated across 18 design criteria. We identified the design actions and computed the transitional probability between the observed actions to construct a process model for composing effective infographics. Results reveal that high-rated infographics were developed using a more systematic approach, starting by creating a well-planned structure (e.g., setting margins and columns) followed by setting spatial zones to map out a visual hierarchy prior to working on fonts, colors, and graphic elements, and using a consistent application of visual rules. These target processes were encapsulated into a five-stage Infographic Visual Design Model.
German-language teacher training is often segmented into three distinct phases: First, future teachers begin their training as students at university where they study predominantly academic disciplines (usually 4–5 years). Here, they complete some initial practical school training. Second, they undergo practical teacher training as student teachers at a school (two years after university degree) before they finally begin their professional careers as teachers. In this last phase, teachers periodically take advanced training courses.
Do individual differences in working memory capacity (WMC) affect student learning within multimedia instructional environments? High and low WMC students, as measured by the operation span (OSPAN) task, engaged in a multimedia tutorial addressing lightning formation or car brake use. The results of two experiments indicated that students with high WMC recalled and transferred more information than students with low WMC after engaging in a multimedia tutorial. In addition, the multimedia principles of coherence (Exp 1) and signaling (Exp 2) were also assessed for validation. Each of the experiments failed to validate the previous multimedia learning principles. These results are consistent with a general individual differences WMC effect, but inconsistent with previous findings regarding the coherence and signaling effects.
In three studies, eye movements of participants were recorded while they viewed a single-slide multimedia presentation about how car brakes work. Some of the participants saw an integrated presentation in which each segment of words was presented near its corresponding area of the diagram (integrated group, Experiments 1 and 3) or an integrated presentation that also included additional labels identifying each part (integrated-with-labels group, Experiment 2), whereas others saw a separated presentation in which the words were presented as a paragraph below the diagrams (separated group, Experiments 1 and 2) or as a legend below the diagrams (legend group, Experiment 3). On measures of cognitive processing during learning, the integrated groups made significantly more eye-movements from text to diagram and vice versa (integrative transitions; d = 1.65 in Experiment 1, d = 0.85 in Experiment 2, and d = 1.44 in Experiment 3) and significantly more eye-movements from the text to the corresponding part of the diagram (corresponding transitions; d = 2.02 in Experiment 1 and d = 1.35 in Experiment 3) than the separated groups. On measures of learning outcome, the integrated groups significantly outperformed the separated groups on transfer test score in Experiment 1 (d = .80) and Experiment 2 (d = .73) but not in Experiment 3 (d = .35). Spatial contiguity encourages more attempts to integrate words and pictures and enables more successful integration of words and pictures during learning, which can result in meaningful learning outcomes.
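The d values in the abstract above are standardized mean differences between groups. As a rough illustration of how such a value can be computed, here is a minimal sketch using the common pooled-standard-deviation form of Cohen's d. The abstract does not state which exact formula the authors used, so the formula choice is an assumption, and the function name `cohens_d` and the sample scores are hypothetical.

```python
import math
import statistics


def cohens_d(group_a: list, group_b: list) -> float:
    """Cohen's d with a pooled standard deviation (assumed formula).

    d = (mean_a - mean_b) / s_pooled, where
    s_pooled = sqrt(((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2))
    and var_a, var_b are sample variances.
    """
    n_a, n_b = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)  # sample variance (n - 1 denominator)
    var_b = statistics.variance(group_b)
    pooled_sd = math.sqrt(((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2))
    return (statistics.mean(group_a) - statistics.mean(group_b)) / pooled_sd


if __name__ == "__main__":
    # Hypothetical transfer-test scores, invented purely for illustration.
    integrated = [2, 4, 6]
    separated = [1, 3, 5]
    print(cohens_d(integrated, separated))
```

A positive result means the first group scored higher on average; the magnitude expresses that difference in units of the pooled standard deviation.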