Students viewed a computer animation depicting the process of lightning. In Experiment 1, they concurrently viewed on-screen text presented near the animation or far from the animation, or concurrently listened to a narration. In Experiment 2, they concurrently viewed on-screen text or listened to a narration, viewed on-screen text following or preceding the animation, or listened to a narration following or preceding the animation. Learning was measured by retention, transfer, and matching tests. Experiment 1 revealed a spatial-contiguity effect in which students learned better when visual and verbal materials were physically close. Both experiments revealed a modality effect in which students learned better when verbal input was presented auditorily as speech rather than visually as text. The results support 2 cognitive principles of multimedia learning.

Technological advances have made possible the combination and coordination of verbal presentation modes (such as narration and on-screen text) with nonverbal presentation modes (such as graphics, video, animations, and environmental sounds) in just one device (the computer). These advances include multimedia environments, where students can be introduced to causal models of complex systems by the use of computer-generated animations (Park & Hopkins, 1993). However, despite its power to facilitate learning, multimedia has been developed on the basis of its technological capacity, and rarely is it used according to research-based principles (Kozma, 1991; Mayer, in press; Moore, Burton, & Myers, 1996). Instructional design of multimedia is still mostly based on the intuitive beliefs of designers rather than on empirical evidence (Park & Hannafin, 1994). The purpose of the present study is to contribute to multimedia learning theory by clarifying and testing two cognitive principles: the contiguity principle and the modality principle.