The Multimodal Learning Analytics Pipeline
Daniele Di Mitri1, Jan Schneider2, Marcus Specht1,3, Hendrik Drachsler1,2
1 Open University of The Netherlands, The Netherlands, daniele.dimitri@ou.nl
2 German Institute for International Educational Research, Germany, schneider.jan@dipf.de
3 Delft University of Technology, The Netherlands
ABSTRACT
We introduce the Multimodal Learning Analytics (MMLA) Pipeline, a generic approach for collecting and exploiting multimodal data to support learning activities across physical and digital spaces. The MMLA Pipeline supports researchers in setting up multimodal experiments, reducing the setup and configuration time required for collecting meaningful datasets. Using the MMLA Pipeline, researchers can select a set of custom sensors to track different modalities, including behavioural cues or affective states. Researchers can thus quickly obtain multimodal sessions consisting of synchronised sensor data and video recordings. They can analyse and annotate the recorded sessions and train machine learning algorithms to classify or predict the patterns under investigation.
I. INTRODUCTION
Learning researchers are increasingly employing multimodal data and multi-sensor interfaces in a variety of learning activities. Two main factors drive this interest. The first is the emergence and wide diffusion of new seamless data-capturing technologies, such as smartphones, wearable sensors and Internet of Things (IoT) devices. Research shows that some of these technologies can be employed in formal and non-formal learning settings [1]. The second is that learning activities, in both the academic and vocational education sectors, are becoming increasingly blended, taking place across digital platforms as well as physical, co-located settings such as group activities or individual exercises. The related literature features several empirical and theoretical studies that fall under the umbrella term Multimodal Learning Analytics (MMLA) [2]. The MMLA field primarily addresses learning scenarios beyond the learner seated in front of a laptop.
II. PROBLEM STATEMENT
Most studies in MMLA or contiguous fields that work with multimodal data rely on tailor-made solutions rather than standardised approaches for data collection, synchronisation, annotation and analysis. This is due to the lack of established technological and methodological practices, which leaves the field of MMLA fragmented and unwelcoming to newcomers. We consider this a significant drawback for the field of learning analytics, and we aim to address it with this research.
III. PROPOSED SOLUTION
With the Multimodal Learning Analytics Pipeline, we aim to address the lack of tools and support for MMLA researchers. The MMLA Pipeline provides an approach for collecting and exploiting multimodal data to support learning activities across physical and digital spaces. It supports researchers in setting up multimodal experiments, reducing the setup and configuration time required for collecting meaningful datasets. The multimodal data collected can support researchers in designing more accurate student modelling, learning analytics and intelligent tutoring. Using the MMLA Pipeline, researchers can select a set of custom sensors to track different modalities, including behavioural cues or affective states. Researchers can thus quickly obtain multimodal sessions consisting of synchronised sensor data and video recordings. They can analyse and annotate the recorded sessions and train machine learning algorithms to classify or predict the patterns under investigation.
A comprehensive overview of the MMLA Pipeline is given in Figure 1. The MMLA Pipeline is a cycle consisting of five steps, each addressing one of the five main MMLA challenges [3]; an illustrative code sketch of these steps is given after the list.
(1) data collection: the techniques used for capturing, aggregating and synchronising data from multiple modalities and sensor streams;
(2) data storing: the approach used for organising multimodal data, which come in multiple formats and large sizes, so that they can be stored and retrieved later;
(3) data annotation: how to give meaning to portions of multimodal recordings and collect human interpretations through expert or self-reports;
[Fig. 1. Graphical representation of the MMLA Pipeline. The figure depicts the five steps (1. data collection, 2. data storing, 3. data annotation, 4. data processing, 5. data exploitation), their inputs (physiological and motoric sensor data, third-party sensors or APIs, expert reports) and outputs (predictions, patterns, historical reports), the research and production phases, and exploitation channels such as dashboards and Intelligent Tutors providing corrective feedback.]
(4) data processing: the approach for cleaning, aligning and integrating the raw multimodal data, extracting relevant features and transforming them into a new representation suitable for exploitation;
(5) data exploitation: the approach for ultimately supporting the learner during the learning process with the predictions and insights obtained from the multimodal data.
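Purely as an illustration of how these five steps could map onto software components, the following minimal C# sketch models each step as an interface. The type and member names are hypothetical and do not correspond to the actual MMLA Pipeline implementation.

// Illustrative sketch only: these interfaces and names are hypothetical and not part
// of the actual MMLA Pipeline implementation.
using System;
using System.Collections.Generic;

// A single time-stamped observation coming from one sensor stream.
public record SensorFrame(string SensorId, DateTime Timestamp,
                          IDictionary<string, double> Attributes);

// (1) Data collection: capture and synchronise frames from multiple sensor streams.
public interface IDataCollector
{
    IEnumerable<SensorFrame> Collect(TimeSpan duration);
}

// (2) Data storing: persist a recorded session and retrieve it later.
public interface ISessionStore
{
    void Save(string sessionId, IEnumerable<SensorFrame> frames);
    IEnumerable<SensorFrame> Load(string sessionId);
}

// (3) Data annotation: attach human interpretations to time intervals of a session.
public record Annotation(DateTime Start, DateTime End, string Label);

public interface IAnnotator
{
    IEnumerable<Annotation> Annotate(string sessionId);
}

// (4) Data processing: clean, align and extract features from the raw frames.
public interface IFeatureExtractor
{
    double[] Extract(IEnumerable<SensorFrame> frames);
}

// (5) Data exploitation: turn model output into feedback for the learner.
public interface IFeedbackProvider
{
    void GiveFeedback(string learnerId, string message);
}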
The MMLA Pipeline offers a bird's-eye view of the lifecycle of multimodal data that are collected from, and used to support, the learner. We conceive the MMLA Pipeline in two phases, the 'research' phase and the 'production' phase. The first includes several expert-driven operations, such as sensor selection, annotation, model training and parameter tuning. These configurations are then used in the 'production' phase, in which the MMLA Pipeline serves as the backbone multimodal data infrastructure for collecting learning data and using them to improve the learning activities.
In real-life learning activities, multimodal data can support the learner in different ways; we call these the exploitation strategies. For example, an Intelligent Tutor using the MMLA Pipeline can prompt instantaneous feedback, nudging the learner towards the desired behaviour. Alternatively, the learner data can be used for retrospective feedback in the form of an analytics dashboard.
IV. CURRENT PROTOTYPES
At the current stage, we have developed two main prototypes as implementations of the MMLA Pipeline, presented in two recent studies: A) the Multimodal Learning Hub [4] and B) the Visual Inspection Tool [5].
A. Multimodal Learning Hub
The Multimodal Learning Hub (LearningHub) is a system that focuses on the data collection and data storing of multimodal learning experiences [4]. It builds on the concept of a Meaningful Learning Task (MLT) and uses a custom data format (the MLT session file) for data storage and exchange. At the current stage of development, the LearningHub uses a set of specifications that tailor it to learning activities. Several libraries compatible with the LearningHub have been developed to work with commercial devices and sensors. The LearningHub focuses on short, meaningful learning activities (~10 minutes) and uses a distributed client-server architecture, with a master node controlling and receiving updates from multiple data-provider applications. It also handles video and audio recordings, whose primary purpose is to support the human annotation process. The expected output of the LearningHub is one or more MLT-JSON session files including 1) one to n multimodal, time-synchronised sensor recordings and 2) a video/audio file providing evidence for retrospective annotation. The LearningHub is open source and developed in C#.
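The exact MLT-JSON schema is not reproduced here; the following C# model is a hypothetical sketch, with assumed type and property names, of the kind of information such a session file carries according to the description above.

// Hypothetical data model for illustration only; the actual MLT-JSON schema used by
// the LearningHub may differ. It reflects the description above: a session bundles
// one to n time-synchronised sensor recordings plus a video/audio file.
using System;
using System.Collections.Generic;
using System.Text.Json;

public record MltSession(
    string SessionId,
    DateTime RecordingStart,             // common reference point for synchronisation
    List<SensorRecording> Recordings,    // one entry per data-provider application
    string VideoFile);                   // evidence for retrospective annotation

public record SensorRecording(string ApplicationId, List<SensorFrame> Frames);

public record SensorFrame(
    double Timestamp,                    // seconds relative to RecordingStart
    Dictionary<string, double> Values);  // e.g. accelerometer axes, heart rate

public static class MltSerializer
{
    // Serialise a session, e.g. for writing an MLT session file to disk.
    public static string ToJson(MltSession session) =>
        JsonSerializer.Serialize(session, new JsonSerializerOptions { WriteIndented = true });
}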
B. Visual Inspection Tool
The Visual Inspection Tool (VIT) allows the manual and semi-automatic annotation of psychomotor learning tasks captured with a set of sensors. The VIT enables the user to 1) triangulate multimodal data with video recordings; 2) segment the multimodal data into time intervals and add annotations to these intervals; and 3) download the annotated dataset and use the annotations as labels for machine learning predictions. The annotations created with the VIT are saved in the MLT-JSON data format, like the other sensor files. The annotations are treated as an additional sensor application, where each frame is a time interval with a relative startTime and stopTime instead of a single timestamp. Using the standard MLT-JSON data format, the user of the VIT can either define custom annotation schemes or load existing annotation files.
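Continuing the hypothetical model sketched above, an annotation recording could be represented as follows; the field names are assumptions, and only the interval-based structure (startTime/stopTime per frame) is taken from the description above.

// Illustrative sketch, not the actual VIT implementation: annotations are stored like
// any other sensor recording, except that each frame covers a time interval
// (startTime/stopTime) rather than a single timestamp.
using System.Collections.Generic;

public record AnnotationFrame(
    double StartTime,                     // interval start, relative to the session start
    double StopTime,                      // interval end
    Dictionary<string, string> Labels);   // e.g. { "posture": "correct" } -- hypothetical scheme

public record AnnotationRecording(
    string ApplicationId,                 // the VIT acts as just another "sensor" application
    string AnnotationScheme,              // custom or pre-loaded annotation scheme
    List<AnnotationFrame> Frames);

Interval labels downloaded in this form can then serve directly as target classes for supervised learning.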
V. FUTURE WORKS
The two prototypes described are a first step towards implementing part of the MMLA Pipeline, proposing a solution to the first three challenges of data collection, storing and annotation. Although still prototypical, the tools described are available under open-source licences and were created with extensibility in mind. As future work, we want to focus on data processing and exploitation, improving the current feedback mechanisms to produce real-time feedback based on both expert-defined and machine-learned rules. We plan to extend the LearningHub with a Runtime Feedback System, which would allow the expert to set the type of feedback message, the sensor device to send the messages to, the learner to address with the feedback and the conditions under which the feedback should be prompted.
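Since the Runtime Feedback System is still future work, its interface is not yet defined; the sketch below is only an assumption of how such an expert-defined feedback rule could be configured, with hypothetical names throughout.

// Hypothetical sketch of an expert-defined feedback rule for the planned Runtime
// Feedback System; the names and structure are assumptions, not an existing API.
using System;

public record FeedbackRule(
    string MessageType,              // e.g. "haptic", "audio", "on-screen text"
    string TargetDevice,             // which sensor/actuator device receives the message
    string LearnerId,                // which learner the feedback addresses
    Func<double, bool> Condition);   // when to fire, e.g. a sensor value exceeds a threshold

public static class ExampleRules
{
    // Example: nudge learner "L01" through a wristband when the heart rate exceeds 120 bpm.
    public static readonly FeedbackRule HighHeartRate =
        new("haptic", "wristband", "L01", heartRate => heartRate > 120);
}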
REFERENCES
[1] Worsley, M. (2018). Multimodal learning analytics' past, present, and potential futures. In CEUR Workshop Proceedings (Vol. 2163, pp. 1–16). Aachen, Germany: CEUR Workshop Proceedings. Retrieved from http://crossmmla.org/wp-content/uploads/2018/02/CrossMMLA2018_paper_8.pdf
[2] Schneider, J., Börner, D., van Rosmalen, P., & Specht, M. (2015). Augmenting the Senses: A Review on Sensor-Based Learning Support. Sensors, 15(2), 4097–4133. http://doi.org/10.3390/s150204097
[3] Di Mitri, D., Schneider, J., Specht, M., & Drachsler, H. (2018). The Big Five: Addressing Recurrent Multimodal Learning Data Challenges. In R. Martinez-Maldonado (Ed.), Proceedings of the Second Multimodal Learning Analytics Across (Physical and Digital) Spaces (CrossMMLA) (p. 6). Aachen: CEUR Workshop Proceedings. Retrieved from http://ceur-ws.org/Vol-2163/paper6.pdf
[4] Schneider, J., Di Mitri, D., Limbu, B., & Drachsler, H. (2018). Multimodal Learning Hub: A Tool for Capturing Customizable Multimodal Learning Experiences. In Lecture Notes in Computer Science (Vol. 11082 LNCS, pp. 45–58). Cham, Switzerland: Springer. http://doi.org/10.1007/978-3-319-98572-5_4
[5] Di Mitri, D., Schneider, J., Klemke, R., Specht, M., & Drachsler, H. (2019). Read Between the Lines: An Annotation Tool for Multimodal Data for Learning. In Proceedings of the 9th International Conference on Learning Analytics & Knowledge - LAK19 (pp. 51–60). New York, NY, USA: ACM. http://doi.org/10.1145/3303772.3303776
Di Mitri, D., Schneider, J., Specht, M., & Drachsler, H. (2018). The Big Five: Addressing Recurrent Multimodal Learning Data Challenges. In Martinez-Maldonado Roberto (Ed.), Proceedings of the Second Multimodal Learning Analytics Across (Physical and Digital) Spaces (CrossMMLA) (p. 6). Aachen: CEUR Workshop Proceedings. Retrieved from http://ceur-ws.org/Vol-2163/paper6.pdf