The Use of Eye Tracking as a Research and Instructional Tool in Multimedia Learning

Katharina Scheiter (Leibniz-Institut für Wissensmedien, Germany) and Alexander Eitel (Leibniz-Institut für Wissensmedien, Germany)
DOI: 10.4018/978-1-5225-3822-6.ch035


The present chapter summarizes the state of the art of using eye tracking in research on multimedia learning. It first provides an overview of the various eye tracking parameters that have been used in this field before describing the functions of eye tracking as a research tool. As a research tool, eye tracking serves to test and refine assumptions regarding the process of learning with multimedia, to explain the origin of individual differences in learning outcomes, to give insight into the effects of instructional interventions at a process level, and to enrich other forms of assessment. More recently, eye tracking has also been used to develop materials aimed at supporting multimedia learning. Thus, it also serves as an instructional tool when it is used to design adaptive instruction or to model cognitive processes relevant to multimedia learning. The chapter concludes with a description of some of the challenges in using eye tracking in multimedia research.
Chapter Preview


Learning from multimedia has become one of the major areas of research in learning and instruction, especially since digital technology is increasingly used in education and allows for not only static instructional materials but also dynamic formats. Although many people think of multimedia as involving the use of advanced digital technology, its definition in research is far more traditional. Here, the term multimedia refers to any combination of words (spoken or written) and pictures (e.g., photographs, diagrams, videos, animation). More generally speaking, multimedia environments are multirepresentational systems (Ainsworth, 1999) in which different representational formats are combined so as to best convey an instructional message.

The case for multimedia is made by comparing it to monomedia formats, mostly to text-only instruction. There is abundant empirical evidence showing that multimedia yields better learning than text alone (Mayer, 2014a). This multimedia effect depends on a variety of conditions. In particular, pictures that serve to augment the text should be relevant to the learning objective rather than serve only decorative purposes. Moreover, the materials should be designed so that they enable and facilitate the cognitive processes that theories in this research area have identified as essential for effective multimedia learning. Accordingly, the majority of current multimedia research focuses on identifying possible boundary conditions for the multimedia effect and the cognitive processes underlying multimedia learning. This is where eye tracking comes into play (cf. Scheiter & Van Gog, 2009; Van Gog & Scheiter, 2010). For instance, as will be discussed later in more detail, eye tracking research has shown that successful learning from multimedia requires learners to pay sufficient attention to pictures rather than predominantly reading the text, and to make connections between both representations (e.g., Hannus & Hyönä, 1999; Hegarty & Just, 1993; cf. Renkl & Scheiter, in press, for a review on further boundary conditions).

According to cognitive theories of multimedia learning such as the Cognitive Theory of Multimedia Learning by Mayer (CTML; Mayer, 2014b) or the Integrated Model of Text and Picture Comprehension by Schnotz (ITPC; Schnotz, 2014), a set of distinct cognitive processes contributes to the construction of a mental model and thus to learning from multimedia. Both theories agree that learners first need to identify and attend to relevant information contained in both the text and the picture and select it for further processing. Once the relevant information is selected, it needs to be organized in memory into coherent, mode-specific mental representations. Finally, information from both representational formats is related to each other with the help of prior knowledge and integrated into one coherent mental model. This mental model, composed of verbal and pictorial information, is seen as the major reason why learning from text and pictures is more effective than learning from text only.

The objective of the present chapter is to illustrate how eye tracking can serve as a valuable tool to study, and potentially also to support, the aforementioned cognitive processes of selection, organization, and integration while learning from multimedia. In the following, we will first discuss which eye tracking parameters are typically used in multimedia studies and how they relate to the theoretical assumptions outlined above. Second, we will review studies in which eye tracking was used as a research tool to study cognitive processes underlying learning with multimedia. In the third section, we turn our attention to more advanced uses of eye tracking for designing instruction. The fourth section deals with the various methodological challenges of using eye tracking in multimedia research, and we conclude with some suggestions for future research.
