Conference Paper

Abstract

This paper introduces the Visual Inspection Tool (VIT), which supports researchers in the annotation of multimodal data as well as in its processing and exploitation for learning purposes. While most existing Multimodal Learning Analytics (MMLA) solutions are tailor-made for specific learning tasks and sensors, the VIT addresses data annotation flexibly, for different types of learning tasks that can be captured with a customisable set of sensors. The VIT supports MMLA researchers in 1) triangulating multimodal data with video recordings; 2) segmenting the multimodal data into time intervals and adding annotations to those intervals; and 3) downloading the annotated dataset and using it for multimodal data analysis. The VIT is a crucial component that was so far missing from the available tools for MMLA research. By filling this gap we also identified an integrated workflow that characterises current MMLA research. We call this workflow the Multimodal Learning Analytics Pipeline: a toolkit for orchestrating the use and application of various MMLA tools.
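For illustration, here is a minimal sketch of the annotation idea described in the abstract: sensor samples are aligned with a video timeline, labels are attached to time intervals, and the labelled data are exported for analysis. The field names and file layout are illustrative assumptions, not the actual MLT-JSON schema used by the VIT.

```python
# Minimal sketch of VIT-style annotation: align sensor samples with a timeline,
# attach labels to time intervals, and export the result for analysis.
# The field names ("start", "end", "label") are hypothetical, not the MLT-JSON schema.
import pandas as pd

# Sensor stream: one row per sample, timestamps in seconds from session start.
sensor = pd.DataFrame({
    "t": [0.0, 0.5, 1.0, 1.5, 2.0, 2.5],
    "acc_x": [0.1, 0.9, 1.2, 0.2, 0.1, 1.1],
})

# Annotated intervals produced while watching the synchronised video recording.
intervals = [
    {"start": 0.0, "end": 1.4, "label": "correct_stroke"},
    {"start": 1.5, "end": 2.6, "label": "mistake"},
]

def label_for(t):
    # Triangulation + segmentation: label each sample by its enclosing interval.
    for iv in intervals:
        if iv["start"] <= t <= iv["end"]:
            return iv["label"]
    return None

sensor["label"] = sensor["t"].map(label_for)
sensor.to_csv("annotated_session.csv", index=False)  # downloadable training set
```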


... Therefore, CIMLA should be able to access the datasets from Google Drive and the LMS server. Once the heterogeneous datasets are accessed, their file formats need to be unified into a single file format (REQ2) [Di Mitri, 2020; Di Mitri et al., 2019a]. For example, students' response data files are stored in XLSX format, whereas log data files are stored in CSV format. ...
... Another requirement emerges from the need to incorporate structured and organized CI, following CDM4MMLA, into the data processing pipeline (REQ3) so that context-aware data processing can take place [Shankar et al., 2019]. Further, multimodal educational data stored in heterogeneous datasets should be processed using a standard data processing pipeline (REQ4) [Shankar et al., 2020; Di Mitri et al., 2019a; Shankar et al., 2019a, 2019b; Hassan et al., 2021]. Finally, the output data file generated after processing multimodal educational data should be exported in a set of file formats (REQ5) [Di Mitri, 2020; Di Mitri et al., 2019a; Shankar et al., 2019]. For example, the output data file from the processing of students' responses and LMS logs would be in CSV format by default because the input data files were in CSV. ...
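As a concrete reading of REQ2 and REQ5, the sketch below unifies a hypothetical XLSX responses file and a CSV log file and exports the result in two formats; the file and column names are invented for illustration, not CIMLA's actual implementation.

```python
# Sketch of REQ2/REQ5: read heterogeneous inputs (XLSX responses, CSV logs),
# unify them into one frame, and export in a chosen set of formats.
# File and column names are hypothetical; reading .xlsx needs the openpyxl engine.
import pandas as pd

responses = pd.read_excel("student_responses.xlsx")   # XLSX input
logs = pd.read_csv("lms_logs.csv")                    # CSV input

# REQ2: unify on a shared key such as a student identifier (assumed column name).
unified = responses.merge(logs, on="student_id", how="outer")

# REQ5: export the processed output in several formats.
unified.to_csv("unified.csv", index=False)
unified.to_json("unified.json", orient="records")
```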
Article
Full-text available
Multimodal Learning Analytics (MMLA) solutions aim to provide a more holistic picture of a learning situation by processing multimodal educational data. Considering the contextual information of a learning situation is known to help in providing more relevant outputs to educational stakeholders. However, most MMLA solutions are still in the prototyping phase and must deal with the different dimensions of an authentic MMLA situation, which involves multiple cross-disciplinary stakeholders such as teachers, researchers, and developers. One of the reasons for remaining in the prototyping phase of the development lifecycle is the set of challenges that software developers face at different levels when developing context-aware MMLA solutions. In this paper, we identify the requirements and propose a data infrastructure called CIMLA. It includes different data processing components following a standard data processing pipeline and considers contextual information following a defined data structure. It has been evaluated in three authentic MMLA scenarios involving different cross-disciplinary stakeholders, following the Software Architecture Analysis Method. Its fitness was analyzed in each of the three scenarios, and developers were interviewed to assess whether it meets functional and non-functional requirements. Results showed that CIMLA supports modularity in developing context-aware MMLA solutions, and each of its modules can be reused, with required modifications, in the development of other solutions. In the future, the current involvement of a developer in customizing the configuration file to consider contextual information can be investigated.
... For the arm and pocket, the right side was initially selected because the participant was right-handed. Sets of data were collected for these four positions and compared visually by the research team using the Visual Inspection Tool (VIT) [30] (https://github.com/dimstudio/visual-inspection-tool, last accessed on 25 March 2021), an existing annotation toolkit, by observing the data plots of the time series to explore the similarity of patterns when the participant performs the strokes. ...
... We used existing components for the Kinect (Kinect Reader), which was also connected to the LearningHub for data collection. Additionally, we used the Visual Inspection Tool (VIT) [30] for data annotation and Sharpflow [17] for data analysis. These components above enabled the implementation of the MMLA Pipeline. ...
... The annotations were synchronised with the recorded sessions using the VIT [30], a web application prototype that allows manual and semi-automatic data annotation of MLT-JSON session files generated by the LearningHub. The VIT allows for specifying custom time-based intervals and assigned properties in those intervals. ...
Article
Full-text available
Beginner table-tennis players require constant real-time feedback while learning the fundamental techniques. However, due to various constraints, such as the mentor's inability to be around all the time and the expensive sensors and equipment needed for sports training, beginners are unable to get the immediate real-time feedback they need during training. Sensors have been widely used to train beginners and novices in various skills, including psychomotor skills. Sensors enable the collection of multimodal data which can be utilised with machine learning to classify training mistakes, give feedback, and further improve learning outcomes. In this paper, we introduce the Table Tennis Tutor (T3), a multi-sensor system consisting of a smartphone device with its built-in sensors for collecting motion data and a Microsoft Kinect for tracking body position. We focused on forehand stroke mistake detection. We collected a dataset recording an experienced table tennis player performing 260 short forehand strokes (correct) and mimicking 250 long forehand strokes (mistake). We analysed and annotated the multimodal data for training a recurrent neural network that classifies correct and incorrect strokes. To investigate the accuracy level of the aforementioned sensors, three combinations were validated in this study: smartphone sensors only, the Kinect only, and both devices combined. The results of the study show that the smartphone sensors alone perform worse than the Kinect, but achieve similar accuracy, with better precision, when combined with the Kinect. To further strengthen T3's potential for training, an expert interview session was held virtually with a table tennis coach to investigate the coach's perception of having a real-time feedback system to assist beginners during training sessions. The outcome of the interview shows positive expectations and provided further input that can benefit future implementations of the T3.
... positioning and physiological markers), system logs and human logs. Making sense of data captured from multiple channels during group situations has been one of the goals of multimodal learning analytics (MMLA) [3] and teamwork science [46], but although there exist some MMLA interfaces tailored to researchers or experts [23,66], there is to our knowledge no work investigating how to provide teamwork insights to students and teachers. ...
... In sum, the few existing MMLA interfaces are complex and exploratory, and are mostly targeted at data analysts (see review in [23]). To our knowledge, this is the first attempt at providing explanatory guidance to teachers and students to gain insights on team activity from multimodal data. ...
... EDA peaks are automatically detected and then a researcher runs a script to plot them on the timeline). The MMLA community continues to develop pipelines to automate the transformation of multimodal data ( [17,23,79] reviewed in [83]) that should benefit our work. Moreover, the challenge of programmatically annotating and enhancing charts includes not only content selection and generation [39,62,75], but also automated layout of visual elements [9,45,87]. ...
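The EDA-peak step mentioned in the snippet can be sketched in a few lines; the sampling rate, prominence threshold, and synthetic signal below are illustrative assumptions, not the cited authors' actual script.

```python
# Sketch of the described step: detect EDA peaks automatically and plot them
# on a timeline. Thresholds and the synthetic signal are assumptions.
import numpy as np
import matplotlib.pyplot as plt
from scipy.signal import find_peaks

fs = 4  # typical wristband EDA sampling rate in Hz (assumption)
t = np.arange(0, 60, 1 / fs)
# Synthetic skin-conductance trace: baseline with one response around t = 20 s.
eda = 0.05 * np.random.randn(t.size) + np.interp(t, [0, 20, 21, 60], [2, 2, 3, 2.2])

peaks, _ = find_peaks(eda, prominence=0.2)  # candidate skin-conductance responses

plt.plot(t, eda, label="EDA (µS)")
plt.plot(t[peaks], eda[peaks], "rx", label="detected peaks")
plt.xlabel("time (s)")
plt.legend()
plt.show()
```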
... Traditional classroom observations require human inference and are highly contextual; human-mediated labelling is often used in MMLA to relate raw data to more abstract constructs [23], [24]. Observation data integration with LA can happen for triangulation purposes [25], for observing technology-enhanced learning [26], and for inferring meaningful learning interaction data through annotations of direct observations [27]; video annotation to triangulate multimodal datasets, extract learning context, and segment data into time intervals has also been suggested [24]. Computer-assisted observation can help the process of observation by enforcing specific coding schemes, preventing missing data, and speeding up the process [28], and can enhance the validity and reliability of data [29]. ...
Article
Full-text available
Educational processes take place in physical and digital places. To analyse educational processes, Learning Analytics (LA) enable data collection from the digital learning context. At the same time, to gain more insights, the LA data can be complemented with data coming from physical spaces, enabling Multimodal Learning Analytics (MMLA). To interpret these data, theoretical grounding or contextual information is needed. Learning designs (LDs) can be used for contextualisation; however, in authentic scenarios the availability of machine-readable LDs is scarce. We argue that Classroom Observations (COs), traditionally used to understand educational processes taking place in physical space, can provide the missing context and complement the data from co-located classrooms. This paper reports on a co-design case study from an authentic scenario that used CO to make sense of the digital traces. In this paper we posit that the development of MMLA approaches can benefit from co-design methodologies; through the involvement of the end-users (project managers) in the loop, we illustrate how these data sources can be systematically integrated and analysed to better understand the use of digital resources. Results indicate that CO can drive sense-making of LA data where predefined LD is not available. Furthermore, CO can support layered contextualisation depending on research design, rigour and systematic documentation/data collection efforts. Also, co-designing the MMLA solution with the end-users proved to be a useful approach.
... Hence, such existing LA data models may not be able to integrate the metadata of heterogeneous datasets. Moreover, they do not need to pass through the same data processing pipeline as in the case of the multimodal data processing involved in MMLA [16], [70]. Nevertheless, there are several contextualized LA data models adopted from the field of Learning Design (LD) which clearly represent design-phase CI [39]; however, to the best of our knowledge, none of the existing LA data models fully represents the enactment-phase CI. ...
... They are ordered and arranged as per the sequence adopted from a data processing pipeline to conceptually process raw data into sense-making information [71]. We adopted the DRS's entities (Preparation, Organization, Fusion, Analysis, and Visualization) from [65], [6], [2], [41], [71], [16]. Each of these entities represents a data processing activity of the data processing pipeline (e.g., the M-DVC), covering the journey from raw data to outputs for sense-making information (e.g., visual, textual, etc.). ...
Chapter
When MultiModal Learning Analytics (MMLA) are applied in authentic educational scenarios, multiple stakeholders (such as teachers, researchers and developers) often communicate to specify the requirements of the envisioned MMLA solution. Later on, developers instantiate the software solution for the MMLA data processing needed, as per the stakeholders' specification, to fit the concrete setting of implementation (e.g., a set of classrooms with a certain technological setup). Current MMLA development practice, however, is relatively young and there is still a dearth of standardized practices at different phases of the development lifecycle. Such standardization may lead to interoperability among solutions that the current ad-hoc and tailor-made solutions lack. This chapter presents the Contextualized Data Model for MultiModal Learning Analytics (CDM4MMLA), a data model to represent, organize, and structure contextualized MMLA process specifications, to be later used by MMLA solutions. To illustrate the model's expressivity and flexibility, the CDM4MMLA has been applied to three authentic MMLA scenarios. While not yet a definitive and universal proposal, this kind of common computer-interpretable model can not only help with specification reusability (e.g., if the underlying processing technologies change in the future), but also serve as a sort of 'lingua franca' within the MMLA research and development community to more consistently specify its processes and accumulate knowledge.
... Multimodal Learning Analytics (MMLA) involves complex technical issues in collecting, merging, and analyzing different types of learning data from heterogeneous data sources. Di Mitri et al. [4] proposed a Multimodal Learning Analytics Pipeline (MMLAP), which provides a generic approach to collecting and analyzing multimodal data to support learning activities in physical and digital spaces. The pipeline is structured in five steps, namely: (1) collection of data, (2) storage of data, (3) data labeling, (4) data processing and (5) data application. ...
... The data pipeline, which is based on the Multimodal Learning Analytics Pipeline by Di Mitri et al. [4], starts with the aggregation and processing of multimodal sensor data from learners. Thus, the first step is to collect multimodal sensor data from learners (1), by means of body-mounted sensors to measure various physiological parameters, video feeds to detect a skeletal map of the learner's movements, or learning progress from a Learning Management System (LMS). ...
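The five steps named in these snippets can be summarised as a plain skeleton; every function below is a placeholder with invented names, a sketch of the pipeline's shape rather than the actual MMLA Pipeline code.

```python
# Skeleton of the five pipeline steps (collect, store, label, process, apply).
# All names and bodies are illustrative placeholders, not the real implementation.
def collect_data():                   # (1) sensors, video feeds, LMS logs
    return {"acc": [], "video": "session.mp4"}

def store_data(raw):                  # (2) persist the session in one place
    return "session.json"

def label_data(raw):                  # (3) human annotation of time intervals (e.g., in the VIT)
    return [{"start": 0.0, "end": 1.0, "label": "correct"}]

def process_data(raw, annotations):   # (4) feature extraction + model training
    return "trained-model"

def apply_model(model):               # (5) feedback to the learner or teacher
    print("feedback based on", model)

raw = collect_data()
store_data(raw)
apply_model(process_data(raw, label_data(raw)))
```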
Conference Paper
Full-text available
Modern cloud-based big data engineering approaches like machine learning and blockchain enable the collection of learner data from numerous sources of different modalities (like video feeds, sensor data, etc.), allowing multimodal learning analytics (MMLA) and reflection on the learning process. In particular, complex psycho-motor skills like dancing or operating a complex machine profit from MMLA. However, instructors, learners, and other institutional stakeholders may have issues with the traceability and the transparency of machine learning processes applied to learning data on the one side, and with privacy, data protection and security on the other side. We propose an approach for the acquisition, storage, processing and presentation of multimodal learning analytics data using machine learning and blockchain as services to reach explainable artificial intelligence (AI) and certified traceability of learning data processing. Moreover, we facilitate end-user involvement in the whole development cycle by extending established open-source software DevOps processes with participative design and community-oriented monitoring of MMLA processes. The MILKI-PSY cloud (MPC) architecture extends existing MMLA approaches and the Kubernetes-based automation of learning analytics infrastructure deployment from a number of research projects. The MPC will facilitate further research and development in this field.
... To answer RQ2, we applied the MMLA Pipeline including the LearningHub [26] for collecting data and the Visual Inspection Tool [27] for manual annotation and visual understanding of the patterns. ...
... We used the Visual Inspection Tool [27] to visually inspect and annotate the multimodal recordings. The process of annotation consisted of first identifying the time interval for each attempt to select "yes" or "no" based on the shown card. ...
Chapter
The link between the body and mind has fascinated philosophers and scientists for ages. The increasing availability of sensor technologies has enabled the possibility to explore this link even deeper, providing some evidence of the influence that certain physiological measurements, such as galvanic skin response, can have on the performance of different learning activities. In this paper, we explore the link between learners' performance of cognitive tasks and their physiological state with the use of Multimodal Learning Analytics (MMLA). We used MMLA tools and techniques to collect, annotate, and analyse physiological data from 16 participants wearing an Empatica E4 wristband while engaging in task-switching cognitive exercises. The collected data include temperature, blood volume pulse, heart rate variability, galvanic skin response, and a screen recording from each participant while performing the exercises. To examine the link with cognitive performance, we applied a preliminary qualitative analysis to the galvanic skin response data and tested different Artificial Intelligence techniques to differentiate between productive and unproductive performance.
... For this purpose, models are developed, trained, and evaluated using machine learning methods (Aulck et al. 2019). The available data play an important role in the quality of the model (Schneider et al. 2019). These models can be used in early warning systems to identify students at risk earlier and provide more targeted advice. ...
... One example of this is the early warning system FragSte (Berens et al., 2019; Schneider et al., 2019). FragSte only uses data that every university in Germany collects anyway. ...
Conference Paper
Full-text available
Network analysis simulations were used to guide decision-makers while configuring instructional spaces on our campus during COVID-19. Course enrollment data were utilized to estimate metrics of student-to-student contact under various instruction mode scenarios. Campus administrators developed recommendations based on these metrics; examples of learning analytics implementation are provided.
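The described simulation approach (contact metrics derived from course enrollments) can be sketched as follows; the enrollment data and the choice of metrics are invented for illustration, not the campus study's actual pipeline.

```python
# Sketch: build a student contact network from course enrollments and compute
# simple contact metrics to compare instruction-mode scenarios. Data invented.
import networkx as nx
from itertools import combinations

enrollments = {                 # course -> enrolled students (hypothetical)
    "MATH101": ["ana", "bob", "eve"],
    "BIO201": ["bob", "eve", "kim"],
}

G = nx.Graph()
for course, students in enrollments.items():
    # Every pair of co-enrolled students is a potential in-person contact.
    G.add_edges_from(combinations(students, 2))

# Example metrics one might report per scenario (e.g., in-person vs. hybrid).
print("mean degree:", sum(d for _, d in G.degree()) / G.number_of_nodes())
print("largest connected component:", len(max(nx.connected_components(G), key=len)))
```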
... MMLA aims to find meaningful information and patterns in the heterogeneous datasets gathered from educational scenarios, which can support stakeholders (e.g., teachers and students) in evidence-based decision making during teaching and learning practices [13]. However, multiple system design decisions need to be made to go from the heterogeneous, raw multimodal evidence of learning to meaningful stakeholder support, from selecting which data processing activities are relevant [14], [15], to sequencing them [16] and structuring the particular steps involved in each of those activities [17]. Aside from making these decisions and systematically specifying them, prior MMLA research highlights the need to include contextual information of the learning scenario in the analysis, to aid in the correct processing and interpretation of the analyses [18], [19]. ...
... 1) Challenges in decision-making: Most software developers do not have experience in processing multimodal evidence of learning. They need to decide which data processing activities and steps are going to be involved in the analysis process [15]. Moreover, for MMLA development they need to reformulate their usual design decisions, which are based on their past experience [14]. ...
Article
Multimodal Learning Analytics (MMLA) systems, understood as those that exploit multimodal evidence of learning to better model a learning situation, have not yet spread widely in educational practice. Their inherent technical complexity, and the lack of educational stakeholder involvement in their design, are among the hypothesized reasons for the slow uptake of this emergent field. To aid in the process of stakeholder communication and systematization leading to the specification of MMLA systems, this paper proposes a Multimodal Data Value Chain (M-DVC). This conceptual tool, derived from both the field of Big Data and the needs of MMLA scenarios, has been evaluated in terms of its usefulness for stakeholders, in three authentic case studies of MMLA systems currently under development. The results of our mixed-methods evaluation highlight the usefulness of the M-DVC to elicit unspoken assumptions or unclear data processing steps in the initial stages of development. The evaluation also revealed limitations of the M-DVC in terms of the technical terminology employed, and the need for more detailed contextual information to be included. These limitations also prompt potential improvements for the M-DVC, on the path towards clearer specification and communication within the multi-disciplinary teams needed to build educationally-meaningful MMLA solutions.
... Linear transcripts often sacrifice the accurate representation of temporal nuances across different modalities (e.g., the duration of a verbal utterance compared to a gesture) and the temporal entanglements between multimodal events for the sake of readability. To address this issue, some researchers leverage human annotation tools designed to support transcribing multimodal interactions, such as ChronoViz, VIT, or V-note (Di Mitri et al., 2019). These multimodal annotation tools enable researchers to visualize multimodal events, temporally interleave multiple data streams, and add customized annotations. ...
Article
Full-text available
In various technology-enhanced learning (TEL) environments, knowledge co-creation progresses through multimodal interactions that integrate verbal and nonverbal modalities, such as speech and gestures. This study investigated two distinct analytical approaches for analyzing multimodal interactions—triangulating and interleaving—by applying them to collaborative learning processes during an online embodied mathematics intervention. The findings demonstrate that the interleaving approach captures the temporal dynamics and nuanced interplay between multimodal events, providing deeper insights into how shared meaning-making evolves over time. In contrast, the triangulating approach effectively identifies cumulative interaction patterns but does not account for their temporal structure. Specifically, the interleaving approach, employing epistemic network analysis, revealed statistically significant differences in discourse patterns between learners with larger and smaller variances in upper body movements during the co-design activity. These findings underscore the complementary value of the interleaving approach in analyzing multimodal interactions and offer practical implications for advancing understanding of collaborative learning processes in TEL environments.
... The tool is developed in .NET and only works on Windows systems. It has been extended with a visualisation tool, the VIT, for annotation purposes to assist researchers [6]. The VIT is reported to be scalable and to support different types of sensors. ...
... The tool is developed in .NET and only works on Windows systems. It has been extended with a visualisation tool, the VIT, for annotation purposes to assist researchers [9]. The VIT is reported to be scalable and to support different types of sensors. ...
Preprint
Full-text available
The Multimodal Learning Analytics (MMLA) research community has grown significantly in the past few years. Researchers in this field have harnessed diverse data collection devices such as eye-trackers, motion sensors, and microphones to capture rich multimodal data about learning. This data, when analyzed, has proven highly valuable for understanding learning processes across a variety of educational settings. Notwithstanding this progress, a ubiquitous use of MMLA in education is still limited by challenges such as technological complexity and high costs. In this paper, we introduce CoTrack, a MMLA system for capturing the multimodality of a group's interaction in terms of audio, video, and writing logs in online and co-located collaborative learning settings. The system offers a user-friendly interface, designed to cater to the needs of teachers and students without specialized technical expertise. Our usability evaluation with 2 researchers, 2 teachers and 24 students has yielded promising results regarding the system's ease of use. Furthermore, this paper offers design guidelines for the development of more user-friendly MMLA systems. These guidelines have significant implications for the broader aim of making MMLA tools accessible to a wider audience, particularly non-expert MMLA users.
... Capturing and examining self-regulated learning during game-based learning: SRL operations during learning with a GBLE can be difficult to capture using unimodal traditional methods such as clickstream data and self-reports. Multimodal data afford researchers the opportunity to use multiple data streams to reveal learners' internal SRL processes, including the use of strategies, as they learn with GBLEs (Azevedo et al., 2018; Alonso-Fernández et al., 2019; Di Mitri et al., 2019; Sharma and Giannakos, 2020; Giannakos et al., 2022). Multimodal data include both subjective (e.g., self-report measures) and objective (e.g., log files, eye tracking) data channels that can capture physiological, verbal, behavioral, and contextual data during learning to reveal how learners interact with information, what SRL strategies learners may deploy, and why learners enact certain behaviors (Järvelä et al., 2019, 2021; Molenaar et al., 2023). ...
Article
Full-text available
Introduction: Self-regulated learning (SRL), or learners' ability to monitor and change their own cognitive, affective, metacognitive, and motivational processes, encompasses several operations that should be deployed during learning, including Searching, Monitoring, Assembling, Rehearsing, and Translating (SMART). Scaffolds are needed within GBLEs to both increase learning outcomes and promote the accurate and efficient use of SRL SMART operations. This study aims to examine how restricted agency (i.e., control over one's actions) can be used to scaffold learners' SMART operations as they learn about microbiology with Crystal Island, a game-based learning environment. Methods: Undergraduate students (N = 94) were randomly assigned to one of two conditions: (1) Full Agency, where participants were able to make their own decisions about which actions they could take; and (2) Partial Agency, where participants were required to follow a pre-defined path that dictated the order in which buildings were visited, restricting one's control. As participants played Crystal Island, participants' multimodal data (i.e., log files, eye tracking) were collected to identify instances where participants deployed SMART operations. Results: Results from this study support restricted agency as a successful scaffold of both learning outcomes and SRL SMART operations, where learners who were scaffolded demonstrated more efficient and accurate use of SMART operations. Discussion: This study provides implications for future scaffolds to better support SRL SMART operations during learning and discusses directions for future studies scaffolding SRL during game-based learning.
... For example, the model does not provide an out-of-the-box solution that is easy to implement. Developing a multimodal learning solution continues to be time-consuming and tedious, and it remains difficult to get enough accurately annotated recordings to train machine learning models capable of making useful predictions using multimodal data, despite following pragmatic approaches [10] and using customizable tools to collect [11] and annotate [12] multimodal data. In this paper, we present an extension of the preliminary study [13] where we tested a completely different approach that might help to quickly and simply distinguish expertise levels based on sensor data. ...
Article
Full-text available
This paper presents an extension to the approach described in [13], which was designed to help distinguish expert and novice performance easily by observing the sensor data, without having to understand or apply models to the sensor signal. The method consisted of plotting the sensor data and identifying irregularities in novice data and regularities in expert data. In this paper, we solidify the thesis that, with the help of sensors, expert performances are smoother, contain fewer irregularities, and have more consistently uniform patterns than novice performances. We do so by using the extended methodology on the same data set from the previous five cases in [13], namely running, bachata dance, salsa dance, tennis swings, and football penalty kicks, pointing out this assertion.
... The motion capture is based on the Python solution of Google MediaPipe. It continuously outputs an annotated stream of a detected hand and the coordinates of its joints according to Figure 2. The Visual Inspection Tool (VIT) [2] was used to define the required finger positions for each letter. ...
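As a rough sketch of the hand-joint extraction the snippet describes, MediaPipe's Python API can be used as follows; the input frame is a hypothetical file, and the letter-matching logic defined with the VIT is omitted.

```python
# Minimal sketch of hand-joint extraction with MediaPipe's Python API.
# "frame.jpg" is a hypothetical input; a real prototype would loop over webcam frames.
import cv2
import mediapipe as mp

hands = mp.solutions.hands.Hands(static_image_mode=True, max_num_hands=1)

image = cv2.imread("frame.jpg")                        # hypothetical input frame
results = hands.process(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))

if results.multi_hand_landmarks:
    # 21 landmarks per detected hand, with normalised x/y/z joint coordinates.
    for lm in results.multi_hand_landmarks[0].landmark:
        print(lm.x, lm.y, lm.z)
```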
Chapter
Various studies show that multimodal interaction technologies, especially motion capture in educational environments, can significantly improve and support educational purposes such as language learning. In this paper, we introduce a prototype that implements finger tracking and teaches the user different letters of the German fingerspelling alphabet. Since most options for tracking a user's movements rely on hardware that is not commonly available, a particular focus is laid on the opportunities of new technologies based on computer vision. These achieve accurate tracking with consumer webcams. In this study, the motion capture is based on Google MediaPipe. An evaluation based on user feedback shows that the prototype's capabilities meet the functional requirements. This study corroborates the thesis that new technologies, such as motion-tracking-based software, can make language learning more accessible. Keywords: multimodality, motion capture, language learning, computer vision
... The model, however, does not provide an out-of-the-box solution that is easy to implement, there are recurrent challenges that appear whenever someone wants to develop a multimodal learning solution [7]. Moreover, even by following pragmatic approaches [8] and using customizable tools to collect [9] and annotate multimodal data [10], it is time-consuming, tedious, and difficult to get enough accurate annotated recordings to train machine learning models capable of making useful predictions using multimodal data. ...
Conference Paper
This paper presents an approach that helps distinguish expert and novice performance easily by observing the sensor data, without having to understand or apply models to the sensor signal. The method consists of plotting the sensor data and identifying irregularities. We corroborate, with the help of sensors, that expert performances are smoother, contain fewer irregularities, and have more consistently uniform patterns than novice performances. In this paper, we present six different cases pointing out this assertion, namely bachata and salsa dances, tennis swings, football penalty kicks, badminton, and running.
... 'Multimodal data' is another category that has recently gained attention. Twelve of the analyzed tools provide analytics based on multimodal data, including audio (Griol and Callejas, 2018; Mota et al., 2018), video (Dabisias et al., 2015; Ogata and Mouri, 2015), writing (Hui et al., 2016), geo-location (Fulantelli et al., 2013), and biometric data (Di Mitri et al., 2019; Tamura et al., 2019). The main reason for this increase in the usage of 'Assessment data', 'User profile data', and 'Multimodal data' is to go beyond statistics-based LA and provide more effective LA to the users by correlating various data types. ...
Preprint
Full-text available
Open Learning Analytics (OLA) is an emerging research area that aims at improving learning efficiency and effectiveness in lifelong learning environments. OLA employs multiple methods to draw value from a wide range of educational data coming from various learning environments and contexts in order to gain insight into the learning processes of different stakeholders. As the research field is still relatively young, only a few technical platforms are available and a common understanding of requirements is lacking. This paper provides a systematic literature review of tools available in the learning analytics literature from 2011-2019 with an eye on their support for openness. 137 tools from nine academic databases are collected to form the base for this review. The analysis of selected tools is performed based on four dimensions, namely 'Data, Environments, Context (What?)', 'Stakeholders (Who?)', 'Objectives (Why?)', and 'Methods (How?)'. Moreover, five well-known OLA frameworks available in the community are systematically compared. The review concludes by eliciting the main requirements for an effective OLA platform and by identifying key challenges and future lines of work in this emerging field.
... Moreover, the anticipated end of the pandemic will likely bring a decrease in online teaching and learning. As the possibilities to collect and analyse multimodal data are limited (Di Mitri et al., 2019), there are challenges in terms of the sustainability of such analyses. Therefore, further research should explore how to make the best use of the proposed approach by enabling automated LA and visualization in dashboards, to provide practical value. ...
Article
Full-text available
To ensure the validity of an assessment programme, it is essential to align it with the intended learning outcomes (LO). We present a model for ensuring assessment validity which supports this constructive alignment and uses learning analytics (LA). The model is based on LA that include a comparison between ideal LO weights (expressing the prioritization of LOs), actual assessment weights (maximum assessment points per LO), and student assessment results (actually obtained assessment points per LO), as well as clustering and trace data analysis. These analytics are part of a continuous improvement cycle, including strategic planning and learning design (LD) supported by LO prioritization, and monitoring and evaluation supported by LA. To illustrate and test the model, we conducted a study on the example of a graduate-level higher education course in applied mathematics, by analysing student assessment results and activity in a learning management system. The study showed that the analyses provided valuable insights with practical implications for the development of sound LD, tailored educational interventions, databases of assessment tasks, recommendation systems, and self-regulated learning. Future research should investigate the possibilities for automation of such LA, to enable full exploitation of their potential and use in everyday teaching and learning.

Practitioner notes

What is already known about this topic:
To develop sound, student-centred learning design (LD), it is essential to ensure that assessment is constructively aligned with the intended learning outcomes (LO). This constructive alignment is crucial for ensuring the validity of an assessment program. Learning analytics (LA) can provide insights that help develop valid assessment programs.

What this paper adds:
As not all LOs are equally important, assessment programs should reflect the prioritization of LOs, which can be determined by using various multi-criteria decision-making (MCDM) methods. This article presents and illustrates, based on an empirical case, a model of continuous improvement of LD, which uses LA to compare how LOs are reflected in (actual) students' results, in an (actual) assessment program, and in the (ideal) prioritization of LOs based on MCDM. The study presents how clustering of students based on their assessment results can be used in LA to provide insights for educational interventions better targeted to students' needs.

Implications for practice and/or policy:
The proposed LA can provide important insights for the development (or improvement) of LD in line with the intended course LOs, but also study program LOs (if course and study program LOs are properly aligned). The LA can also contribute to the development of databases of assessment tasks aligned with course LOs, with ensured validity, supporting sharing and reusing, as well as to the development of tailored educational interventions (e.g., based on clustering). The proposed LA can also contribute to the development of recommendation systems, with recommendations for the improvement of LD for teachers or learning suggestions for students, as well as to students' meta-cognition and self-regulated learning.
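The core comparison in this model (ideal LO weights vs. actual assessment weights vs. weights implied by student results) reduces to simple arithmetic; the sketch below uses invented numbers and column names purely for illustration.

```python
# Sketch of the weight comparison the model describes: ideal LO weights vs.
# actual assessment weights vs. weights implied by student results. Data invented.
import pandas as pd

df = pd.DataFrame({
    "LO": ["LO1", "LO2", "LO3"],
    "ideal_weight": [0.5, 0.3, 0.2],   # prioritization of LOs (e.g., via MCDM)
    "max_points": [40, 40, 20],        # actual assessment weights per LO
    "mean_obtained": [30, 22, 18],     # student assessment results per LO
})

df["actual_weight"] = df["max_points"] / df["max_points"].sum()
df["result_weight"] = df["mean_obtained"] / df["mean_obtained"].sum()
df["misalignment"] = (df["actual_weight"] - df["ideal_weight"]).abs()
print(df)  # large misalignment flags LOs whose assessment weight needs revision
```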
... Usually, different data modalities require specialized applications to support the annotation process. Several offline and online tools have been proposed for data annotation such as Mova [8], Microsoft PSI [9] or the Visual Inspection Tool [10]. ...
Article
Full-text available
Psychomotor learning develops our bodies in organized patterns with the help of environmental signals. With modern sensor arrays, we can acquire multimodal data to compare activities with stored reference models of body motions. To do this on a large scale in an efficient way, we need cloud-based infrastructures for the storage, processing and visualization of psychomotor learning analytics data. In this paper, we propose a conceptual sensor stream processing pipeline for this purpose. The solution is based on the so-called MLOps approach, a variation of the successful DevOps model for open source software engineering for large-scale machine learning solutions based on standard components. This processing pipeline will facilitate the multimodal analysis of many training scenarios collaboratively.
... The annotation can be carried out by one researcher retrospectively using the Visual Inspection Tool (VIT) (Di Mitri et al., 2019a). In the VIT, the researcher can load the MLT Session files one by one to triangulate the video recording with the sensor data. ...
Article
Full-text available
This paper describes the CPR Tutor, a real-time multimodal feedback system for cardiopulmonary resuscitation (CPR) training. The CPR Tutor detects training mistakes using recurrent neural networks. It automatically recognises and assesses the quality of the chest compressions according to five CPR performance indicators, detecting training mistakes in real time by analysing a multimodal data stream consisting of kinematic and electromyographic data. Based on this assessment, the CPR Tutor provides audio feedback to correct the most critical mistakes and improve the CPR performance. The mistake detection models of the CPR Tutor were trained using a dataset from 10 experts. We then tested the validity of the CPR Tutor and the impact of its feedback functionality in a user study involving 10 additional participants. The CPR Tutor pushes forward the current state of the art of real-time multimodal tutors by providing: (1) an architecture design, (2) a methodological approach for delivering real-time feedback using multimodal data and (3) a field study on real-time feedback for CPR training. This paper details the results of a field study by quantitatively measuring the impact of the CPR Tutor feedback on the performance indicators and qualitatively analysing the participants' questionnaire answers.
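A hedged sketch of the kind of recurrent classifier described here: windows of fused kinematic and electromyographic samples in, per-indicator scores out. Layer sizes, window length, and channel count are illustrative assumptions, not the CPR Tutor's actual model.

```python
# Sketch of a recurrent time-series classifier over fused kinematic + EMG
# windows, one output score per performance indicator. Dimensions are assumptions.
import numpy as np
import tensorflow as tf

WIN, CHANNELS, INDICATORS = 50, 12, 5   # samples per compression window, fused channels

model = tf.keras.Sequential([
    tf.keras.layers.LSTM(32, input_shape=(WIN, CHANNELS)),
    tf.keras.layers.Dense(INDICATORS, activation="sigmoid"),  # one score per indicator
])
model.compile(optimizer="adam", loss="binary_crossentropy")

# Dummy training data standing in for annotated compression windows.
X = np.random.randn(100, WIN, CHANNELS).astype("float32")
y = np.random.randint(0, 2, (100, INDICATORS)).astype("float32")
model.fit(X, y, epochs=1, verbose=0)
```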
... Among the tools published to date, Barz et al. [9] and Di Mitri et al. [10] developed multi-sensor annotation tools with video reference (recorded during data collection) to annotate data from different on-body sensors. Similarly, Diete et al. [11] developed an image-supported, semi-supervised annotation tool that enables researchers to annotate wrist-based grabbing actions lasting a few seconds, using images as ground truth to guide the annotation. ...
Conference Paper
Full-text available
Human activity recognition using wearable accelerometers can enable in-situ detection of physical activities to support novel human-computer interfaces. Many of the machine-learning-based activity recognition algorithms require multi-person, multi-day, carefully annotated training data with precisely marked start and end times of the activities of interest. To date, there is a dearth of usable tools that enable researchers to conveniently visualize and annotate multiple days of high-sampling-rate raw accelerometer data. Thus, we developed Signaligner Pro, an interactive tool to enable researchers to conveniently explore and annotate multi-day high-sampling rate raw accelerometer data. The tool visualizes high-sampling-rate raw data and time-stamped annotations generated by existing activity recognition algorithms and human annotators; the annotations can then be directly modified by the researchers to create their own, improved, annotated datasets. In this paper, we describe the tool's features and implementation that facilitate convenient exploration and annotation of multi-day data and demonstrate its use in generating activity annotations.
... (3) Visual Inspection Tool (VIT). A web-based tool developed in JavaScript and HTML5, which allows the visual inspection and annotation of multimodal datasets encoded in the MLT-JSON data format [3]. The VIT was modified to deal with geo-location data. ...
Chapter
Full-text available
In this paper we introduce MOBIUS, a smartphone-based system for remote tracking of citizens' movements. By collecting smartphone sensor data such as accelerometer and gyroscope readings, along with self-report data, the MOBIUS system allows classifying the users' mode of transportation. With the MOBIUS app the users can also activate GPS tracking to visualise their journeys and travelling speed on a map. The MOBIUS app is an example of a tracing app which can provide more insights into how people move around in an urban area. In this paper, we introduce the motivation, the architectural design and the development of the MOBIUS app. To further test its validity, we ran a user study collecting data from multiple users. The collected data were used to train a deep convolutional neural network architecture which classifies the transportation modes with a mean accuracy of 89%.
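In the spirit of the convolutional classifier the abstract mentions, here is a hedged sketch of a 1D CNN over accelerometer and gyroscope windows; the architecture, window length, and mode labels are assumptions, not MOBIUS's actual network.

```python
# Sketch of a convolutional classifier over accelerometer/gyroscope windows.
# Window length, channel layout, and the four mode labels are assumptions.
import numpy as np
import tensorflow as tf

WIN, CHANNELS, MODES = 128, 6, 4   # acc x/y/z + gyro x/y/z; e.g., walk/bike/car/train

model = tf.keras.Sequential([
    tf.keras.layers.Conv1D(16, 5, activation="relu", input_shape=(WIN, CHANNELS)),
    tf.keras.layers.GlobalMaxPooling1D(),
    tf.keras.layers.Dense(MODES, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

# Dummy window standing in for one segment of smartphone sensor data.
X = np.random.randn(1, WIN, CHANNELS).astype("float32")
print(model.predict(X, verbose=0))  # probabilities over the four modes
```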
... In line with this strategy we are currently specifically exploring additional use cases in the psycho-motor domain, where GaDeP and the WEKIT framework complement each other. One of our current targets is to combine GaDeP and the WEKIT framework on the methodological level and to explore the integration of Gamifire with a framework for multimodal interaction [34][35][36][37]. ...
Article
Full-text available
Gamification aims at addressing problems in various fields, such as the high dropout rates, the lack of engagement, isolation, or the lack of personalisation faced by Massive Open Online Courses (MOOC). Even though gamification is widely applied, not only in MOOCs, only few cases are meaningfully designed and empirically tested. The Gamification Design Process (GaDeP) aims to cover this gap. This article first briefly introduces GaDeP, presents the concept of meaningful gamification, and derives how it motivates the need for the Gamifire platform (as a scalable and platform-independent reference infrastructure for MOOCs). Secondly, it defines the requirements for platform-independent gamification and describes the development of the Gamifire infrastructure. Thirdly, we describe how Gamifire was successfully applied in four different cases. Finally, the applicability of GaDeP beyond MOOCs is presented by reporting on a case study where GaDeP has been successfully applied by four student research and development projects. From both the Gamifire cases and the GaDeP cases, we derive the key contribution of this article: insights into the strengths and weaknesses of the Gamifire infrastructure as well as lessons learned about the applicability and limitations of the GaDeP framework. The paper ends by detailing our future work and planned development activities.
... The fusion of knowledge about the learning domain, the learning activity, the physical activity, and the environmental context hopefully creates a comprehensive picture of the learning context. A comprehensible temporal dashboard visualization of the learning contexts with the corresponding ESM annotations can provide insight and reflection (e.g., [5]). The visual link between subjective assessment of one's performance and objective recognition of the learning context could allow learners to recognize patterns and adapt their learning context. ...
... The annotation can be carried out by an expert retrospectively using the Visual Inspection Tool (VIT) [6]. In the VIT, the expert can load the MLT Session files one by one to triangulate the video recording with the sensor data. ...
Chapter
We developed the CPR Tutor, a real-time multimodal feedback system for cardiopulmonary resuscitation (CPR) training. The CPR Tutor detects mistakes using recurrent neural networks for real-time time-series classification. From a multimodal data stream consisting of kinematic and electromyographic data, the CPR Tutor system automatically detects the chest compressions, which are then classified and assessed according to five performance indicators. Based on this assessment, the CPR Tutor provides audio feedback to correct the most critical mistakes and improve the CPR performance. To test the validity of the CPR Tutor, we first collected the data corpus from 10 experts used for model training. Hence, to test the impact of the feedback functionality, we ran a user study involving 10 participants. The CPR Tutor pushes forward the current state of the art of real-time multimodal tutors by providing: 1) an architecture design, 2) a methodological approach to design multimodal feedback and 3) a field study on real-time feedback for CPR training.
... However, capturing and analysing multimodal data in learning contexts, along with their facets is not a straightforward process and mostly requires both input tools and analytical tools that are sensitive to both the variability between and the complexity of different data modes (Di Mitri et al. 2019;Flewitt et al. 2009). Thus, it is necessary to use various technologies and tools to gather dedicated multimodal data and then analyse it in sophisticated ways, to better understand the complexity of learning in all of its nuances and intricacies. ...
Article
Full-text available
This systematic review on data modalities synthesises research findings in terms of how to optimally use and combine such modalities when investigating cognitive, motivational, and emotional learning processes. ERIC, WoS, and ScienceDirect databases were searched with specific keywords and inclusion criteria for research on data modalities, resulting in 207 relevant publications. We provide findings in terms of target journal, country, subject, participant characteristics, educational level, foci, type of data modality, research method, type of learning, learning setting, and modalities used to study the different foci. In total, 18 data modalities were classified. For the 207 multimodal publications, 721 occurrences of modalities were observed. The most popular modality was interview, followed by survey and observation. The least common modalities were heart rate variability, facial expression recognition, and screen recording. Of the 207 publications, 98 focused exclusively on the cognitive aspects of learning, followed by 27 publications that only focused on motivation, while only five publications exclusively focused on emotional aspects. Only 10 publications focused on a combination of cognitive, motivational, and emotional aspects of learning. Our results plead for the increased use of objective measures, highlight the need for triangulation of objective and subjective data, and call for more research on combining various aspects of learning. Further, rather than researching cognitive, motivational, and emotional aspects of learning separately, we encourage scholars to tap into multiple learning processes with multimodal data to derive a more comprehensive view of the phenomenon of learning.
... Data-driven understanding of collaborative learning has advanced from the analyses of unimodal primitives, such as keystrokes and clickstreams, to the analyses of data streams obtained from the multiple sensors [39]. The main motivation has been that the use of multiple sensors and resources allows for holistic understanding of collaborative processes [29,41,53]. ...
... Providing such feedback, especially formative feedback, requires further research on both technology and methodology to be able to compare streaming data with experts' recorded data in physical time and space. [9] and [1] have been making significant efforts towards achieving this feat. Their work so far has involved synchronized multimodal data collection and the annotation of such data, which are crucial steps for being able to provide real-time feedback with sensor data. ...
Chapter
Full-text available
Body-worn sensors can be used to capture, analyze, and replay human performance for training purposes. The key challenge for any such approach is to establish validity, i.e., that the captured expert experience is actually suitable for training. In this paper, to evaluate this, we apply a questionnaire-based expert assessment and a complementary trainee knowledge assessment to study the approach adopted and the models generated with the WEKIT solution, a hardware and software application that complements Augmented Reality glasses with wearable sensor-actuator experience. This solution was developed using the ID4AR framework, which was also developed within the WEKIT project. The ID4AR framework is a domain-agnostic framework which can be used to design augmented reality and sensor-based applications for training. The study presented triangulates validity across three independent test-beds in the professional domains of aircraft maintenance, medical imaging, and astronaut training, with 61 experts completing the expert survey and 337 students completing the trainee knowledge test. Results show that the captured expert models were positively received in all three domains, and the identified level of acceptance suggests that the solution is capable of capturing models for training purposes at large.
Conference Paper
Full-text available
Nowadays, learning activities have become more interactive and collaborative than ever before. However, it remains unclear what makes the group perform differently in such a learning context. With the empowerment of multimodal data (MMD), we conducted a field study involving 12 groups of children who collaborated during two-day-long classroom activities. This paper reports on a quantitative analysis and temporal explanation concerning the relation between children’s performance and their group-level MMD measurements during a collaborative coding session in a design thinking activity. We computed each group’s performance based on the created artefacts and compared the groups with better performance than the others. The results demonstrate that high-performing groups show more joint engagement, joint visual attention, and joint emotional intensity of delight, while low-performing groups show significantly more joint emotional intensity of frustration. In addition, the evolution over the four temporal phases showed different patterns between high and low-performing groups. Finally, this paper discusses design and theoretical implications for educators, researchers and practitioners.
Chapter
Many recent studies have highlighted the importance of feedback for the quality of learning. It empowers students to take ownership of their learning, guides institutions in making informed decisions, ensures continuous improvement, fosters engagement and motivation, facilitates open communication, and enables personalized learning experiences. However, despite its relevance, the use of feedback processes in everyday teaching often becomes unsustainable, due to the number of students and the timing of the courses. On the other hand, the expansion of ubiquitous learning in digital environments has led to an exponential growth of significant data for tracking learning. Although the use of these data can be beneficial, tools and technologies are needed for automated data collection and analysis. In this direction, significant support can be provided by technologies incorporating Artificial Intelligence (AI), which encompass a wide collection of different technologies and algorithms. Notably, Learning Analytics (LA) and Educational Data Mining (EDM) can be useful in developing a student-focused strategy. The systematic use of AI techniques and algorithms could enable new scenarios for educators, profiling and predicting learning outcomes and supporting the creation of sustainable patterns of assessment. However, even though several studies have aimed at integrating EDM and LA techniques in online learning environments, only a few of them have focused on applying them to real-world physical learning environments to support teachers in providing timely and quality feedback based on minimally invasive measurements. The present paper presents an approach aimed at addressing the feedback problem in real university classes, laying the groundwork for the development of an intelligent system that can inform and support the university teacher in delivering personalized feedback to a large group of students.
Chapter
The need to innovate teaching-learning practices to enhance students' learning outcomes and promote today's transversal skills often clashes with the reiteration of standardized and outdated teaching and assessment methods. Recent developments in the assessment field have highlighted the need to shift the focus of assessment from the product (or the outcome) to the learning process itself, moving from an assessment of learning and for learning to an assessment as learning, in which the student actively participates in the process. This perspective moves toward learning-oriented assessment practices and involves the integration of three key elements: tasks appropriate to the approach, development of assessment competence, and student involvement in feedback processes. These practices thus support self-regulated learning by leading students to take an active role, monitoring their progress through self-assessment, reflecting on the effectiveness of their learning approaches, and considering their mistakes as an opportunity to learn and improve. The purpose of the present study is to investigate whether the change in assessment modes affects students' ability to self-regulate their own learning path.
Chapter
Full-text available
This chapter describes the insights derived from the design and development of the Multimodal Tutor, a system that uses artificial intelligence to provide digital feedback and support psychomotor skill acquisition. In this chapter, we discuss the insights gained from eight studies: (1) an exploratory study combining physiological data and learning performance (Learning Pulse); (2) a literature survey on multimodal data for learning and a conceptual model (the Multimodal Learning Analytics Model); (3) an analysis of the technical challenges of Multimodal Learning Analytics (the Big Five Challenges); (4) a technological framework for using multimodal data for learning (the Multimodal Pipeline); (5) a data collection and storing system for multimodal data (the Learning Hub); (6) a data annotation tool for multimodal data (the Visual Inspection Tool); (7) a case study in Cardiopulmonary Resuscitation training (CPR Tutor) consisting of a feasibility study for detecting CPR mistakes; and (8) a real-time feedback study.
Chapter
While digital education technologies have improved to make educational resources more available, the modes of interaction they implement remain largely unnatural for the learner. Modern sensor-enabled computer systems allow extending human-computer interfaces for multimodal communication. Advances in Artificial Intelligence allow interpreting the data collected from multimodal and multi-sensor devices. These insights can be used to support deliberate practice with personalised feedback and adaptation through Multimodal Learning Experiences (MLX). This chapter elaborates on the approaches, architectures, and methodologies in five different use cases that use multimodal learning analytics applications for deliberate practice.
Article
Full-text available
New educational models such as smart learning environments make use of digital and context-aware devices to facilitate the learning process. In this new educational scenario, a huge quantity of multimodal student data from a variety of sources can be captured, fused, and analyzed. This offers researchers and educators a unique opportunity to discover new knowledge, better understand the learning process, and intervene if necessary. However, data fusion approaches and techniques must be applied correctly in order to combine the various sources of multimodal learning analytics (MLA). These sources or modalities in MLA include audio, video, electrodermal activity data, eye-tracking, user logs, and click-stream data, but also learning artifacts and more natural human signals such as gestures, gaze, speech, or writing. This survey introduces data fusion in learning analytics (LA) and educational data mining (EDM) and how data fusion techniques have been applied in smart learning. It shows the current state of the art by reviewing the main publications, the main types of fused educational data, and the data fusion approaches and techniques used in EDM/LA, as well as the main open problems, trends, and challenges in this specific research area.
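To make the fusion step concrete, here is a minimal sketch of one family of techniques such surveys cover: feature-level fusion by timestamp alignment. The streams, column names, and 30-second tolerance below are invented for the example and are not taken from the survey itself.

```python
# Minimal sketch of feature-level (early) fusion: align two modality
# streams on their timestamps, then keep them in one feature table.
import pandas as pd

# Two toy modality streams; in practice these would come from sensor exports.
eda = pd.DataFrame({
    "timestamp": pd.to_datetime(["12:00:01", "12:00:05", "12:00:09"]),
    "skin_conductance": [0.41, 0.52, 0.47],
})
logs = pd.DataFrame({
    "timestamp": pd.to_datetime(["12:00:00", "12:00:06"]),
    "clicks_per_min": [3, 7],
})

# Align each physiological sample with the most recent clickstream
# aggregate; the tolerance avoids pairing samples too far apart in time.
fused = pd.merge_asof(
    eda.sort_values("timestamp"),
    logs.sort_values("timestamp"),
    on="timestamp",
    direction="backward",
    tolerance=pd.Timedelta("30s"),
)
print(fused)
```

The fused table can then feed any downstream model; later-stage (decision-level) fusion would instead combine per-modality model outputs.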
Article
Full-text available
Multimodal learning analytics (MMLA) has increasingly been a topic of discussion within the learning analytics community. The Society of Learning Analytics Research is home to the CrossMMLA Special Interest Group and regularly hosts workshops on MMLA during the Learning Analytics Summer Institute (LASI). In this paper, we articulate a set of 12 commitments that we believe are critical for creating effective MMLA innovations. Moreover, as MMLA grows in use, it is important to articulate a set of core commitments that can help guide both MMLA researchers and the broader learning analytics community. The commitments that we describe are deeply rooted in the origins of MMLA and also reflect the ways that MMLA has evolved over the past 10 years. We organize the 12 commitments in terms of (i) data collection, (ii) analysis and inference, and (iii) feedback and data dissemination and argue why these commitments are important for conducting ethical, high-quality MMLA research. Furthermore, in using the language of commitments, we emphasize opportunities for MMLA research to align with established qualitative research methodologies and important concerns from critical studies.
Article
With the wide expansion of distributed learning environments, the way we learn has become more diverse than ever. This poses an opportunity to incorporate different data sources of learning traces that can offer broader insights into learner behavior and the intricacies of the learning process. We argue that combining analytics across different e-learning systems can measure the effectiveness of learning designs and maximize learning opportunities in distributed learning settings. As a step towards this goal, in this study we considered how to broaden the context of a single learning environment into a learning ecosystem that integrates three separate e-learning systems. We present a cross-platform architecture that captures, integrates, and stores learning-related data from the learning ecosystem. To prove the feasibility and benefit of the cross-platform architecture, we used regression and classification techniques to generate interpretable models with analytics that are relevant for instructors and learners in understanding learning behavior and making sense of the effect of the instructional method on learning performance. The results show that combining data across multiple e-learning systems improves classification accuracy by a factor of 5 compared to data from a single learning system. Our work highlights the value of cross-platform analytics and presents a springboard for the creation of new cross-system data-driven research practices.
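As a hedged sketch of the idea rather than the paper's actual architecture, the snippet below joins per-student features from three hypothetical systems on a shared identifier and compares a single-system model with a cross-platform one; all names and the synthetic data are assumptions.

```python
# Sketch: integrate per-student features from three e-learning systems,
# then compare a single-system classifier against a cross-platform one.
import numpy as np
import pandas as pd
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
n = 120  # students (synthetic)

lms = pd.DataFrame({"student_id": range(n), "logins": rng.poisson(20, n),
                    "forum_posts": rng.poisson(5, n)})
quiz = pd.DataFrame({"student_id": range(n), "avg_score": rng.uniform(0, 1, n)})
video = pd.DataFrame({"student_id": range(n),
                      "minutes_watched": rng.uniform(0, 300, n)})

data = lms.merge(quiz, on="student_id").merge(video, on="student_id")
# Synthetic pass/fail label that depends on ecosystem-wide behaviour.
y = (data["avg_score"] + data["minutes_watched"] / 300 > 1.0).astype(int)

X_single = data[["logins", "forum_posts"]]   # one system only
X_all = data.drop(columns=["student_id"])    # the whole ecosystem

# Shallow tree: an interpretable model, in line with the study's goal of
# analytics that instructors and learners can make sense of.
clf = DecisionTreeClassifier(max_depth=3, random_state=0)
print("single system:  ", cross_val_score(clf, X_single, y, cv=5).mean())
print("cross-platform: ", cross_val_score(clf, X_all, y, cv=5).mean())
```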
Article
Full-text available
Most research on learning technology uses clickstreams and questionnaires as its primary source of quantitative data. This study presents the outcomes of a systematic literature review of empirical evidence on the capabilities of multimodal data (MMD) for human learning. This paper provides an overview of what and how MMD have been used to inform learning and in what contexts. A search resulted in 42 papers that were included in the analysis. The results of the review depict the capabilities of MMD for learning and the ongoing advances and implications that emerge from the employment of MMD to capture and improve learning. In particular, we identified the six main objectives (i.e., behavioral trajectories, learning outcome, learning-task performance, teacher support, engagement and student feedback) that MMLA research has been focusing on. We also summarize the implications derived from the reviewed articles and frame them within six thematic areas. Finally, this review stresses that future research should consider developing a framework that would enable MMD capacities to be aligned with the research and learning design (LD). These MMD capacities could also be utilized in furthering theory and practice. Our findings set a baseline to support the adoption and democratization of MMD within future learning technology research and development.
Practitioner Notes
What is already known about this topic: Capturing and measuring learners' engagement and behavior using MMD has been explored in recent years and exhibits great potential. There are documented challenges and opportunities associated with capturing, processing, analyzing and interpreting MMD to support human learning. MMD can provide insights into predicting learning engagement and performance as well as into supporting the process.
What this paper adds: A systematic literature review (SLR) of empirical evidence on MMD for human learning. A summary of the insights MMD can give us about learning outcomes and processes. An identification of the challenges and opportunities of MMD to support human learning.
Implications for practice and/or policy: Learning analytics researchers will be able to use the SLR as a guide for future research. Learning analytics practitioners will be able to use the SLR as a summary of the current state of the field.
Thesis
Full-text available
Teaching and learning processes take place in blended learning settings. To create a holistic picture of the educational context and analyse these processes for different purposes, different data sources and collection methods come into play. Learning interaction analysis has been an important part of Technology-enhanced Learning (TEL) research; data collection and analysis can happen through traditional or modern methods, gathering insights from physical and digital spaces. Technological advancements brought the need for the analysis of digital interactions (Learning Analytics, LA), covering only one part of the educational process. To respond to the problem of the so-called street-light effect and one-dimensional data sources, the field of Multimodal Learning Analytics (MMLA) emerged in recent years, combining different data sources from traditional or modern collection techniques across physical and digital spaces: sensors, EEG devices, etc. At the same time, to guide the data collection process or to analyse digital traces and data collected through automated means, contextual information is needed, such as learning design (LD) with teacher intentions, actors, roles, media use and other information. Traditional data collection methods, especially qualitative methods, can respond to this need as they often contain highly contextual information. Traditional classroom observation methods are relevant and useful sources to include in the analysis for different purposes: to gain evidence from the physical space, triangulate findings, contextualise data analysis and support sensemaking of digital traces. On the other hand, human-mediated classroom observation also benefits from automated observations (MMLA data), which can enrich the data, speed up the observation process or gather evidence on indicators unobservable to the human eye. Aligning traditional (human-labelled) and modern (automated) classroom observations is therefore beneficial for educational research and practice. Previous research indicates that the fields of LD and LA have a synergetic relationship, where LD contextualises data analysis and LA informs LD. At the same time, connecting these three factors (human-mediated observations, automated observations, and the contextualisation of data analysis with LD) is not a trivial task, and special attention needs to be given to the specificities, meaning, affordances, constraints and quality of the data sources. To provide a holistic picture of teaching and learning processes, this research has connected two research paradigms and focused on the development of conceptual and technological tools that link different sources of data and their contextualisation, through the development of the Framework for Contextualised Multimodal Observations. The Framework was developed through a research-based design methodology and is implemented in Observata, a classroom observation app that produces LA-compliant data, with or without specific context and LD. The Framework is accompanied by three contributions: the model and protocol for the MMLA observational process, the Model for Contextualised MMLA Observations, and the Context-aware MMLA Taxonomy.
Chapter
Open learning analytics (OLA) is an emerging research area that aims at improving learning efficiency and effectiveness in lifelong learning environments. OLA employs multiple methods to draw value from a wide range of educational data coming from various learning environments and contexts in order to gain insight into the learning processes of different stakeholders. As the research field is still relatively young, only a few technical platforms are available, and a common understanding of requirements is lacking. This paper provides a systematic literature review of tools available in the learning analytics literature from 2011 to 2019 with an eye on their support for openness. One hundred thirty-seven tools from nine academic databases were collected to form the basis for this review. The analysis of the selected tools is performed along four dimensions, namely, "Data, Environments, Context (What?)," "Stakeholders (Who?)," "Objectives (Why?)," and "Methods (How?)." Moreover, five well-known OLA frameworks available in the community are systematically compared. The review concludes by eliciting the main requirements for an effective OLA platform and by identifying key challenges and future lines of work in this emerging field.
Conference Paper
Full-text available
Analysis of learning interactions can happen for different purposes. As educational practices increasingly take place in hybrid settings, data from both spaces are needed. At the same time, to analyse and make sense of machine aggregated data afforded by Technology-Enhanced Learning (TEL) environments, contextual information is needed. We posit that human labelled (classroom observations) and automated observations (multimodal learning data) can enrich each other. Researchers have suggested learning design (LD) for contextualisation, the availability of which is often limited in authentic settings. This paper proposes a Context-aware MMLA Taxonomy, where we categorize systematic documentation and data collection within different research designs and scenarios, paying special attention to authentic classroom contexts. Finally, we discuss further research directions and challenges.
Chapter
In learning situations that do not occur exclusively online, the analysis of multimodal evidence can help multiple stakeholders to better understand the learning process and the environment where it occurs. However, Multimodal Learning Analytics (MMLA) solutions are often not directly applicable outside the specific data gathering setup and conditions they were developed for. This paper focuses specifically on authentic situations where MMLA solutions are used by multiple stakeholders (e.g., teachers and researchers). In this paper, we propose an architecture to process multimodal evidence of learning taking into account the situation’s contextual information. Our adapter-based architecture supports the preparation, organisation, and fusion of multimodal evidence, and is designed to be reusable in different learning situations. Moreover, to structure and organise such contextual information, a data model is proposed. Finally, to evaluate the architecture and the data model, we apply them to four authentic learning situations where multimodal learning data was collected collaboratively by teachers and researchers.
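To make the adapter idea concrete, here is a minimal, hypothetical sketch: each source-specific adapter converts a raw export into one shared record format, so downstream preparation and fusion never deal with source-specific layouts. The `Observation` record and `SourceAdapter` interface are illustrative stand-ins, not the paper's actual data model.

```python
# Adapter-based ingestion sketch: one adapter per data source, one shared
# record type for everything downstream. All names are hypothetical.
from abc import ABC, abstractmethod
from dataclasses import dataclass

@dataclass
class Observation:
    timestamp: float      # seconds since session start
    actor: str            # e.g. a student or group id
    modality: str         # e.g. "audio", "log", "observation"
    value: dict           # source-specific payload, already cleaned

class SourceAdapter(ABC):
    @abstractmethod
    def read(self, path: str) -> list[Observation]:
        """Convert one raw export file into shared-format observations."""

class CsvLogAdapter(SourceAdapter):
    def read(self, path: str) -> list[Observation]:
        import csv
        with open(path, newline="") as f:
            return [
                Observation(float(r["t"]), r["student"], "log",
                            {"event": r["event"]})
                for r in csv.DictReader(f)
            ]

# The fusion step then only consumes lists of Observation, whatever the
# source; adding a new sensor means writing one new adapter.
```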
Chapter
Full-text available
This paper seeks to contribute to the emerging field of Quantitative Ethnography (QE) by demonstrating its utility in solving a complex challenge in Learning Analytics: the provision of timely feedback to collocated teams and their coaches. We define two requirements that extend the QE concept in order to operationalise it in such a design process, namely, the use of co-design methodologies and the availability of an automated analytics workflow to close the feedback loop. We introduce the Multimodal Matrix as a data modelling approach that can integrate theoretical concepts about teamwork with contextual insights about specific work practices, enabling the analyst to map between higher-order codes and low-level sensor data, with the option to add the results of manually performed analyses. This is implemented in software as a workflow for rapid data modelling, analysis and interactive visualisation, demonstrated in the context of nursing teamwork simulations. We propose that this exemplifies how a QE methodology can underpin collocated activity analytics, at scale, with in-principle applications to embodied, collocated activities beyond our case study.
Chapter
Full-text available
Collaboration is an important 21st century skill; it can take place in a remote or co-located setting. Co-located collaboration (CC) is a very complex process which involves subtle human interactions that can be described with multimodal indicators (MI) like gaze, speech and social skills. In this paper, we first give an overview of related work that has identified indicators during CC. Then, we look into state-of-the-art studies on feedback during CC which also make use of MI. Finally, we describe a Wizard of Oz (WOz) study in which we designed a privacy-preserving research prototype with the aim of facilitating real-time collaboration in-the-wild during three co-located group PhD meetings (of 3-7 members). Here, human observers stationed in another room act as a substitute for sensors to track different speech-based cues (like speaking time and turn taking); this drives a real-time visualization dashboard on a public shared display. With this research prototype, we want to pave the way for design-based research that tracks other multimodal indicators of CC by extending this prototype design using both humans and sensors.
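The two speech-based cues driving such a dashboard can be computed from diarised (speaker, start, end) segments along the following lines; the segment list below is fabricated, and the observers' actual coding scheme may differ.

```python
# Speaking time per member and turn-taking count from speech segments.
# In the study these segments came from human observers standing in for
# sensors; here the list is made up for illustration.
from collections import Counter

segments = [("A", 0.0, 4.2), ("B", 4.5, 9.1), ("A", 9.3, 12.0),
            ("C", 12.4, 15.0), ("A", 15.2, 20.5)]

speaking_time = Counter()
turns = 0
prev_speaker = None
for speaker, start, end in segments:
    speaking_time[speaker] += end - start
    if speaker != prev_speaker:
        turns += 1          # a turn starts whenever the floor changes hands
    prev_speaker = speaker

print(dict(speaking_time))  # seconds of speech per member
print(turns)                # turn-taking count for the meeting
```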
Article
Full-text available
Multimodality in learning analytics and learning science is under the spotlight. The landscape of sensors and wearable trackers that can be used for learning support is evolving rapidly, as are data collection and analysis methods. Multimodal data can now be collected and processed in real time at an unprecedented scale. With sensors, it is possible to capture observable events of the learning process such as the learner's behaviour and the learning context. The learning process, however, also consists of latent attributes, such as the learner's cognitions or emotions. These attributes are unobservable to sensors and need to be elicited by human-driven interpretations. We conducted a literature survey of experiments using multimodal data to frame the young research field of multimodal learning analytics. The survey explored the multimodal data used in related studies (the input space) and the learning theories selected (the hypothesis space). The survey led to the formulation of the Multimodal Learning Analytics Model, whose main objectives are (O1) mapping the use of multimodal data to enhance the feedback in a learning context; (O2) showing how to combine machine learning with multimodal data; and (O3) aligning the terminology used in the fields of machine learning and learning science.
Article
Full-text available
In recent years, the focus of healthcare and wellness technologies has shown a significant shift towards personal vital signs devices. The technology has evolved from smartphone-based wellness applications to fitness bands and smartwatches. The novelty of these devices is the accumulation of activity data as their users go about their daily life routine. However, these implementations are device-specific and lack the ability to incorporate multimodal data sources. The data accumulated through their usage does not offer rich contextual information adequate for providing a holistic view of a user's lifelog. As a result, decisions and recommendations based on these data are single-dimensional. In this paper, we present our Data Curation Framework (DCF), which is device-independent and accumulates a user's sensory data from multimodal data sources in real time. DCF curates the context of this accumulated data over the user's lifelog and provides rule-based anomaly detection over this context-rich lifelog in real time. To provide computation and persistence over the large volume of sensory data, DCF utilizes the distributed and ubiquitous environment of the cloud platform. DCF has been evaluated for its performance, correctness, ability to detect complex anomalies, and management support for a large volume of sensory data.
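Rule-based anomaly detection of the kind DCF describes can be sketched as predicates evaluated over each context-enriched reading; the rules, thresholds, and field names below are invented for illustration and are not DCF's actual rule set.

```python
# Minimal sketch of rule-based anomaly detection over a context-rich
# lifelog: each rule is a named predicate over the latest reading plus
# its context. Thresholds and fields are illustrative assumptions.
anomaly_rules = [
    ("high_hr_at_rest",
     lambda r: r["heart_rate"] > 120 and r["context"] == "sitting"),
    ("no_movement_while_awake",
     lambda r: r["steps_last_hour"] == 0 and r["context"] == "sitting"),
]

def check(reading: dict) -> list[str]:
    """Return the names of all rules the reading violates."""
    return [name for name, rule in anomaly_rules if rule(reading)]

reading = {"heart_rate": 131, "steps_last_hour": 0, "context": "sitting"}
print(check(reading))   # -> ['high_hr_at_rest', 'no_movement_while_awake']
```

In a streaming deployment, `check` would run on every incoming frame, with the context field supplied by the curation layer rather than by the device itself.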
Article
Full-text available
"Teaching analytics" is the application of learning analytics techniques to understand teaching and learning processes, and eventually enable supportive interventions. However, in the case of (often, half-improvised) teaching in face-to-face classrooms, such interventions would require first an understanding of what the teacher actually did, as the starting point for teacher reflection and inquiry. Currently, such teacher enactment characterization requires costly manual coding by researchers. This paper presents a case study exploring the potential of machine learning techniques to automatically extract teaching actions during classroom enactment, from five data sources collected using wearable sensors (eye-tracking, EEG, accelerometer, audio and video). Our results highlight the feasibility of this approach, with high levels of accuracy in determining the social plane of interaction (90%, k=0.8). The reliable detection of concrete teaching activity (e.g., explanation vs. questioning) accurately still remains challenging (67%, k=0.56), a fact that will prompt further research on multimodal features and models for teaching activity extraction, as well as the collection of a larger multimodal dataset to improve the accuracy and generalizability of these methods.
Article
Full-text available
How does AIED today compare to 25 years ago? This paper addresses this evolution by identifying six trends. The trends are ongoing and will influence learning technologies going forward. First, the physicality of interactions and the physical space of the learner became genuine components of digital education. The frontier between the digital and the physical has faded out. Similarly, the opposition between individual and social views on cognition has been subsumed by integrated learning scenarios, which means that AIED pays more attention today to social interactions than it did at its outset. Another trend is the processing of learners' behavioural particles, which do not carry very many semantics when considered individually, but are predictive of knowledge states when large data sets are processed with machine learning methods. The development of probabilistic models and the integration of crowdsourcing methods has produced another trend: the design of learning environments has become less deterministic than before. The notion of learning environment evolved from a rather closed box to an open ecosystem in which multiple components are distributed over multiple platforms and where multiple stakeholders interact. Among these stakeholders, it is important to notice that teachers play a more important role than before: they interact not only at the design phase (authoring) but also in the runtime phase (orchestration). These trends are not specific to AIED; they depict the evolution of learning technologies as a whole.
Conference Paper
Full-text available
The Presentation Trainer is a multimodal tool designed to support the practice of public speaking skills by giving the user real-time feedback about different aspects of her nonverbal communication. It tracks the user's voice and body to interpret her current performance. Based on this performance, the Presentation Trainer selects the type of intervention that will be presented as feedback to the user. This feedback mechanism has been designed taking into consideration the results of previous studies that show how difficult it is for learners to perceive and correctly interpret real-time feedback while practicing their speeches. In this paper we present the user experience evaluation of participants who used the Presentation Trainer to practice for an elevator pitch, showing that the feedback provided by the Presentation Trainer has a significant influence on learning.
Article
Full-text available
This longitudinal study explores the effects of tracking and monitoring time devoted to learn with a mobile tool, on self-regulated learning. Graduate students (n = 36) from three different online courses used their own mobile devices to track how much time they devoted to learn over a period of four months. Repeated measures of the Online Self-Regulated Learning Questionnaire and Validity and Reliability of Time Management Questionnaire were taken along the course. Our findings reveal positive effects of tracking time on time management skills. Variations in the channel, content and timing of the mobile notifications to foster reflective practice are investigated, and time-logging patterns are described. These results not only provide evidence of the benefits of recording learning time, but also suggest relevant cues on how mobile notifications should be designed and prompted towards self-regulated learning of students in online courses.
Article
Full-text available
In various disciplines, information about the same phenomenon can be acquired from different types of detectors, at different conditions, in multiple experiments or subjects, among others. We use the term “modality” for each such acquisition framework. Due to the rich characteristics of natural phenomena, it is rare that a single modality provides complete knowledge of the phenomenon of interest. The increasing availability of several modalities reporting on the same system introduces new degrees of freedom, which raise questions beyond those related to exploiting each modality separately. As we argue, many of these questions, or “challenges,” are common to multiple domains. This paper deals with two key issues: “why we need data fusion” and “how we perform it.” The first issue is motivated by numerous examples in science and technology, followed by a mathematical framework that showcases some of the benefits that data fusion provides. In order to address the second issue, “diversity” is introduced as a key concept, and a number of data-driven solutions based on matrix and tensor decompositions are discussed, emphasizing how they account for diversity across the data sets. The aim of this paper is to provide the reader, regardless of his or her community of origin, with a taste of the vastness of the field, the prospects, and the opportunities that it holds.
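As one concrete, data-driven instance of the matrix-decomposition perspective the paper discusses, the sketch below applies canonical correlation analysis to two synthetic "modalities" that observe the same latent signal; it illustrates the fusion theme in general and is not a specific method taken from the paper.

```python
# Canonical correlation analysis as a simple data-fusion example: find
# paired low-dimensional projections of two modalities measured on the
# same samples. Data is synthetic, with a shared 2-D latent signal.
import numpy as np
from sklearn.cross_decomposition import CCA

rng = np.random.default_rng(1)
shared = rng.normal(size=(200, 2))              # latent signal both "see"
X = shared @ rng.normal(size=(2, 5)) + 0.1 * rng.normal(size=(200, 5))
Y = shared @ rng.normal(size=(2, 4)) + 0.1 * rng.normal(size=(200, 4))

cca = CCA(n_components=2).fit(X, Y)
Xc, Yc = cca.transform(X, Y)
# The first canonical pair should be highly correlated, reflecting the
# shared structure the two modalities report on (the "diversity" gain).
print(np.corrcoef(Xc[:, 0], Yc[:, 0])[0, 1])
```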
Article
Full-text available
In recent years, sensor components have been extending classical computer-based support systems in a variety of application domains (sports, health, etc.). In this article we review the use of sensors for the application domain of learning. For this we analyzed 82 sensor-based prototypes, exploring their learning support. To study this learning support we classified the prototypes according to Bloom's taxonomy of learning domains and explored how they can be used to assist in the implementation of formative assessment, paying special attention to their use as feedback tools. The analysis leads to current research foci and gaps in the development of sensor-based learning support systems and concludes with a research agenda based on the findings.
Conference Paper
Full-text available
This presentation was delivered as a lead briefing to kick off a panel titled "Round Table Discussion on Intelligent Tutoring Systems (ITSs)". The focus of this presentation was to convey essential design elements and future directions for the research and development of the Generalized Intelligent Framework for Tutoring (GIFT). The round table also included presentations by leading experts in the field of intelligent instruction: Dr. Arthur Graesser, Dr. Vincent Aleven, Dr. Alan Lesgold, and Mr. Doug Lenat. An abstract of the panel session follows: An emphasis on self-development in the military community has highlighted the need for adaptive computer-based tutoring systems (CBTS) to support point-of-need training in environments where human instructors are unavailable. Adaptive CBTS aim to select instructional strategies to meet the specific learning needs of individuals or teams. Instead of one-size-fits-all instructional delivery, they aim to assess trainee cognitive and/or affective states and use this information to tailor instructional decisions. Some of the underlying components required to accomplish this include a learner model, a repertoire of instructional strategies and a methodology for selecting the best strategy based on the current state of the learner model. This session will begin with a talk reviewing adaptive tutoring principles and a description and demonstration of a modular CBTS framework: the Generalized Intelligent Framework for Tutoring (GIFT), developed at the Army Research Laboratory - Human Research & Engineering Directorate. GIFT allows researchers to manipulate the CBTS components in order to empirically test the effect of different assessment and instructional strategies on learning outcomes. Following the review and demonstration, roundtable members, who are experts in intelligent tutoring and artificial intelligence, will discuss how their past or current work relates to GIFT, barriers to the adoption of general tutoring frameworks such as GIFT, and solutions for overcoming those barriers, both technological and organizational.
Technical Report
Full-text available
An emphasis on self-regulated learning in the military community (U.S. Army Training & Doctrine Command, 2011) has highlighted a need for point-of-need training in environments where human tutors are either unavailable or impractical. Computer-Based Tutoring Systems (CBTS) have been shown to be as effective as expert human tutors (VanLehn, 2011) in one-to-one tutoring in well-defined domains (e.g., mathematics or physics) and significantly better than traditional classroom training environments. CBTS have demonstrated significant promise, but fifty years of research have been unsuccessful in making CBTS ubiquitous in military training or the tool of choice in our educational system. Why? The availability and use of CBTS have been constrained by their high development costs, their limited reuse, a lack of standards, and their inadequate adaptability to the needs of learners (Picard, 2006). Their application to military domains is further hampered by the complex and often ill-defined environments in which our military operates today. CBTS are often built as domain-specific, unique, one-of-a-kind, largely domain-dependent solutions focused on a single pedagogical strategy (e.g., model tracing or constraint-based approaches) when complex learning domains may require novel or hybrid approaches. The authors posit that a modular CBTS framework and standards could enhance reuse, support authoring and optimization of CBTS strategies for learning, and lower the cost and skillset needed for users to adopt CBTS solutions for military training and education. This paper considers the design and development of a modular CBTS framework called the Generalized Intelligent Framework for Tutoring (GIFT).
Article
Full-text available
New high-frequency data collection technologies and machine learning analysis techniques could offer new insights into learning, especially in tasks in which students have ample space to generate unique, personalized artifacts, such as a computer program, a robot, or a solution to an engineering challenge. To date most of the work on learning analytics and educational data mining has focused on online courses or cognitive tutors, in which the tasks are more structured and the entirety of interaction happens in front of a computer. In this paper, I argue that multimodal learning analytics could offer new insights into students' learning trajectories, and present several examples of this work and its educational application.
Conference Paper
Full-text available
Multimodal Learning Analytics is a field that studies how to process learning data from dissimilar sources in order to automatically find useful information for giving feedback on the learning process. This work processes the video, audio and pen stroke information included in the Math Data Corpus, a set of multimodal resources provided to the participants of the Second International Workshop on Multimodal Learning Analytics. The result of this processing is a set of simple features that can discriminate between experts and non-experts in groups of students solving mathematical problems. The main finding is that several of those simple features, namely the percentage of time that the students use the calculator, the speed at which the student writes or draws, and the percentage of time that the student mentions numbers or mathematical terms, are good discriminators between expert and non-expert students. Precision levels of 63% are obtained for individual problems and up to 80% when full sessions (aggregations of 16 problems) are analyzed. While the results are specific to the recorded settings, the methodology used to obtain and analyze the features could be used to create discrimination models for other contexts.
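The discrimination step lends itself to a simple sketch: a linear classifier over the three features named in the abstract. The feature values below are fabricated stand-ins for the Math Data Corpus, included only so the example runs.

```python
# Expert vs non-expert discrimination from three simple features:
# % time on calculator, writing/drawing speed, % time mentioning math
# terms. All numbers are invented placeholders.
import numpy as np
from sklearn.linear_model import LogisticRegression

X = np.array([[0.30, 1.8, 0.25], [0.05, 0.9, 0.08],
              [0.28, 1.6, 0.22], [0.07, 1.0, 0.05],
              [0.33, 2.0, 0.30], [0.04, 0.8, 0.06]])
y = np.array([1, 0, 1, 0, 1, 0])   # 1 = expert, 0 = non-expert

model = LogisticRegression().fit(X, y)
print(model.predict([[0.25, 1.7, 0.20]]))   # classify a new student
```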
Article
Full-text available
Signals from peripheral physiology (e.g., ECG, EMG, and GSR) in conjunction with machine learning techniques can be used for the automatic detection of affective states. The affect detector can be user-independent, where it is expected to generalize to novel users, or user-dependent, where it is tailored to a specific user. Previous studies have reported some success in detecting affect from physiological signals, but much of the work has focused on induced affect or acted expressions instead of contextually constrained spontaneous expressions of affect. This study addresses these issues by developing and evaluating user-independent and user-dependent physiology-based detectors of nonbasic affective states (e.g., boredom, confusion, curiosity) that were trained and validated on naturalistic data collected during interactions between 27 students and AutoTutor, an intelligent tutoring system with conversational dialogues. There is also no consensus on which techniques (i.e., feature selection or classification methods) work best for this type of data. Therefore, this study also evaluates the efficacy of affect detection using a host of feature selection and classification techniques on three physiological signals (ECG, EMG, and GSR) and their combinations. Two feature selection methods and nine classifiers were applied to the problem of recognizing eight affective states (boredom, confusion, curiosity, delight, flow/engagement, surprise, and neutral). The results indicated that the user-independent modeling approach was not feasible; however, a mean kappa score of 0.25 was obtained for user-dependent models that discriminated among the most frequent emotions. The results also indicated that k-nearest neighbor and Linear Bayes Normal Classifier (LBNC) classifiers yielded the best affect detection rates. Single-channel ECG, EMG, and GSR and three-channel multimodal models were generally more diagnostic than two-channel models.
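A sketch of the user-dependent strategy follows: one detector per student, evaluated within that student's own data, with kappa averaged across students. The kNN choice echoes the abstract's finding, while the features and data are synthetic placeholders, so the printed score carries no meaning.

```python
# User-dependent affect detection sketch: train and validate a separate
# kNN model inside each student's physiology windows, then average kappa.
import numpy as np
from sklearn.metrics import cohen_kappa_score
from sklearn.model_selection import cross_val_predict
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(2)
kappas = []
for student in range(27):                 # 27 students, as in the study
    X = rng.normal(size=(60, 9))          # ECG+EMG+GSR features per window
    y = rng.integers(0, 4, size=60)       # a few frequent emotions
    pred = cross_val_predict(KNeighborsClassifier(5), X, y, cv=5)
    kappas.append(cohen_kappa_score(y, pred))

print(np.mean(kappas))   # the study reports a mean kappa of about 0.25
```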
Conference Paper
Full-text available
In this technical demo we present repoVizz (http://repovizz.upf.edu), an integrated online system capable of structural formatting and remote storage, browsing, exchange, annotation, and visualization of synchronous multimodal, time-aligned data. Motivated by a growing need for data-driven collaborative research, repoVizz aims to resolve commonly encountered difficulties in sharing or browsing large collections of multimodal data. At its current state, repoVizz is designed to hold time-aligned streams of heterogeneous data: audio, video, motion capture, physiological signals, extracted descriptors, annotations, et cetera. Most popular formats for audio and video are supported, while Broadcast WAVE or CSV formats are adopted for streams other than audio or video (e.g., motion capture or physiological signals). The data itself is structured via customized XML files, allowing the user to (re-)organize multimodal data in any hierarchical manner, as the XML structure only holds metadata and pointers to data files. Datasets are stored in an online database, allowing the user to interact with the data remotely through a powerful HTML5 visual interface accessible from any standard web browser; this feature can be considered a key aspect of repoVizz, since data can be explored, annotated, or visualized from any location or device. Data exchange and upload/download is made easy and secure via a number of data conversion tools and a user/permission management system.
Conference Paper
Full-text available
Previous studies have shown that the success of interpersonal interaction depends not only on the contents we communicate explicitly, but also on the social signals that are conveyed implicitly. In this paper, we present NovA (NOnVerbal behavior Analyzer), a system that analyzes and facilitates the interpretation of social signals conveyed by gestures, facial expressions and others automatically, as a basis for computer-enhanced social coaching. NovA records data of human interactions, automatically detects relevant behavioral cues as a measurement for the quality of an interaction, and creates descriptive statistics for the recorded data. This enables us to give a user online generated feedback on strengths and weaknesses concerning his social behavior, as well as elaborate tools for offline analysis and annotation.
Conference Paper
Full-text available
Automatic detection and interpretation of social signals carried by voice, gestures, mimics, etc. will play a key role for next-generation interfaces, as it paves the way towards a more intuitive and natural human-computer interaction. The paper at hand introduces Social Signal Interpretation (SSI), a framework for the real-time recognition of social signals. SSI supports a large range of sensor devices, filter and feature algorithms, as well as machine learning and pattern recognition tools. It encourages developers to add new components using SSI's C++ API, but also addresses front-end users by offering an XML interface to build pipelines with a text editor. SSI is freely available under GPL at http://openssi.net.
Conference Paper
Full-text available
Learners experience a variety of emotions during learning sessions with Intelligent Tutoring Systems (ITS). The research community is building systems that are aware of these experiences, generally represented as a category or as a point in a low-dimensional space. State-of-the-art systems detect these affective states from multimodal data, in naturalistic scenarios. This paper provides evidence of how the choice of representation affects the quality of the detection system. We present a user-independent model for detecting learners' affective states from video and physiological signals using both the categorical and dimensional representations. Machine learning techniques are used for selecting the best subset of features and classifying the various degrees of emotions for both representations. We provide evidence that dimensional representation, particularly using valence, produces higher accuracy.
Article
Full-text available
With the increase in available educational data, it is expected that Learning Analytics will become a powerful means to inform and support learners, teachers and their institutions in better understanding and predicting personal learning needs and performance. However, the processes and requirements behind the beneficial application of Learning and Knowledge Analytics, as well as the consequences for learning and teaching, are still far from being understood. In this paper, we explore the key dimensions of Learning Analytics (LA), the critical problem zones, and some potential dangers to the beneficial exploitation of educational data. We propose and discuss a generic design framework that can act as a useful guide for setting up Learning Analytics services in support of educational practice and learner guidance, in quality assurance, curriculum development, and in improving teacher effectiveness and efficiency. Furthermore, the article discusses soft barriers and limitations of Learning Analytics. We identify the required skills and competences that make meaningful use of Learning Analytics data possible, in order to overcome gaps in interpretation literacy among educational stakeholders. We also discuss privacy and ethical issues and suggest ways in which these issues can be addressed through policy guidelines and best practice examples.
Conference Paper
Full-text available
Tabletops have the potential to provide new ways to support collaborative learning generally and, more specifically, to aid people in learning to collaborate more effectively. To achieve this potential, we need to understand how to design tabletop environments that capture relevant information about collaboration processes and make it available in a form that is useful for learners, their teachers and facilitators. This paper draws upon research in computer-supported collaborative learning to establish a set of principles for the design of a tabletop learning system. We then show how these have been used to design our Collaid (Collaborative Learning Aid) environment. Key features of this system are: capture of multimodal data about collaboration in a tabletop activity using a microphone array and a depth sensor; integration of these data with other parts of the learning system; transformation of the data into visualisations depicting the processes that occurred during the collaboration at the table; and sequence mining of the interaction logs. The main contributions of this paper are our design guidelines for building the Collaid environment and the demonstration of its use in a collaborative concept mapping learning tool applying data mining and visualisations of collaboration.
Conference Paper
Full-text available
This paper describes the use of sensors in intelligent tutors to detect students' affective states and to embed emotional support. Using four sensors in two classroom experiments, the tutor dynamically collected data streams of physiological activity and students' self-reports of emotions. Evidence indicates that state-based, fluctuating student emotions are related to larger, longer-term affective variables such as self-concept in mathematics. Students produced self-reports of emotions, and models were created to automatically infer these emotions from the physiological data from the sensors. Summaries of student physiological activity, in particular data streams from facial detection software, helped to predict more than 60% of the variance in students' emotional states, which is much better than predicting emotions from other contextual variables of the tutor when these sensors are absent. This research also provides evidence that by modifying the "context" of the tutoring system we may well be able to optimize students' emotion reports and in turn improve math attitudes.
Conference Paper
We focus on data collection designs for the automated analysis of teacher-student interactions in live classrooms, with the goal of identifying instructional activities (e.g., lecturing, discussion) and assessing the quality of dialogic instruction (e.g., analysis of questions). Our designs were motivated by multiple technical requirements and constraints. Most importantly, teachers could be individually mic'd, but their audio needed to be of excellent quality for automatic speech recognition (ASR) and spoken utterance segmentation. Individual students could not be mic'd, but classroom audio quality only needed to be sufficient to detect student spoken utterances. Visual information could only be recorded if students could not be identified. Design 1 used an omnidirectional laptop microphone to record both teacher and classroom audio and was quickly deemed unsuitable. In Designs 2 and 3, teachers wore a wireless Samson AirLine 77 vocal headset system, which is a unidirectional microphone with a cardioid pickup pattern. In Design 2, classroom audio was recorded with dual first-generation Microsoft Kinects placed at the front corners of the class. Design 3 used a Crown PZM-30D pressure zone microphone mounted on the blackboard to record classroom audio. Designs 2 and 3 were tested by recording audio in 38 live middle school classrooms from six U.S. schools while trained human coders simultaneously performed live coding of classroom discourse. Qualitative and quantitative analyses revealed that Design 3 was suitable for three of our core tasks: (1) ASR on teacher speech (word recognition rate of 66% and word overlap rate of 69% using the Google Speech ASR engine); (2) teacher utterance segmentation (F-measure of 97%); and (3) student utterance segmentation (F-measure of 66%). Ideas to incorporate video and skeletal tracking with dual second-generation Kinects to produce Design 4 are discussed.
Chapter
Studies in Learning Analytics provide concrete examples of how the analysis of direct interactions with learning management systems can be used to optimize and understand the learning process. Learning, however, does not necessarily only occur when the learner is directly interacting with such systems. With the use of sensors, it is possible to collect data from learners and their environment ubiquitously, therefore expanding the use cases of Learning Analytics. For this reason, we developed the Multimodal Learning Hub (MLH), a system designed to enhance learning in ubiquitous learning scenarios, by collecting and integrating multimodal data from customizable configurations of ubiquitous data providers. In this paper, we describe the MLH and report on the results of tests where we explored its reliability to integrate multimodal data.
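As a toy illustration of the hub pattern (not the MLH's actual API), the sketch below lets data providers push timestamped frames into one session recording that can then be exported as a single time-ordered multimodal stream.

```python
# Hub sketch: providers push frames, the hub timestamps and buffers them,
# and a session can be exported in time order. Interfaces are hypothetical.
import time

class LearningHub:
    def __init__(self):
        self.frames = []          # (t, provider, payload) triples

    def push(self, provider: str, payload: dict) -> None:
        """Record one update from a registered data provider."""
        self.frames.append((time.time(), provider, payload))

    def export(self) -> list:
        """Return the session as one time-ordered multimodal recording."""
        return sorted(self.frames, key=lambda frame: frame[0])

hub = LearningHub()
hub.push("kinect", {"joint": "wrist", "xyz": (0.1, 0.4, 0.9)})
hub.push("myo", {"emg": [12, 40, 33, 8, 5, 19, 22, 11]})
print(hub.export())
```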
Article
Experts are imperative for training apprentices, but learning from experts is difficult. Experts often struggle to explicate and/or verbalize their knowledge, or simply overlook important details due to the internalization of their skills, which may make it more difficult for apprentices to learn from them. In addition, the shortage of experts to support apprentices in one-to-one settings during training limits the development of apprentices. In this review, we investigate how augmented reality and sensor technology can be used to capture expert performance in such a way that the captured performance can be used to train apprentices without increasing the workload on experts. To this end, we have analysed 78 studies that have implemented augmented reality and sensor technology for training purposes. We explored how sensors have been used to capture expert performance with the intention of supporting apprentice training. Furthermore, we classified the instructional methods used by the studies according to the 4C/ID framework to understand how augmented reality and sensor technology have been used to support training. The results of this review show that augmented reality and sensor technology have the potential to capture expert performance for training purposes. The results also outline a methodological approach to how sensors and augmented reality learning environments can be designed for training using recorded expert performance.
Chapter
This paper describes the design of an intelligent Multimodal Tutor for training people to perform cardiopulmonary resuscitation using patient manikins (CPR tutor). The tutor uses a multi-sensor setup for tracking the CPR execution and generating personalised feedback, including unobtrusive vibrations and retrospective summaries. This study is the main experiment of a PhD project focusing on multimodal data support for investigating practice-based learning scenarios, such as psychomotor skills training in the classroom or at the workplace. For the CPR tutor the multimodal data considered consist of trainee’s body position (with Microsoft Kinect), electromyogram (with Myo armband) and compression rates data derived from the manikin. The CPR tutor uses a new technological framework, the Multimodal Pipeline, which motivates a set of technical approaches used for the data collection, storage, processing, annotation and exploitation of multimodal data. This paper aims at opening up the motivation, the planning and expected evaluations of this experiment to further feedback and considerations by the scientific community.
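By way of example, one feedback rule of the kind such a tutor could apply can be derived from the compression signal alone; the synthetic waveform and the 100-120 compressions-per-minute target band (a commonly taught guideline) are illustrative assumptions, not the CPR tutor's actual detection logic.

```python
# Estimate the compression rate from chest-depth peaks and flag it when
# it leaves the target band. Signal and thresholds are illustrative.
import numpy as np
from scipy.signal import find_peaks

fs = 50                                    # Hz, assumed sensor sampling rate
t = np.arange(0, 10, 1 / fs)
depth = np.sin(2 * np.pi * 1.8 * t)        # ~108 compressions per minute

peaks, _ = find_peaks(depth, distance=fs // 4)   # one peak per compression
rate_per_min = len(peaks) / t[-1] * 60
print(round(rate_per_min))

if not 100 <= rate_per_min <= 120:
    print("feedback: adjust your compression rate")
```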
Article
Low arousal states (especially boredom) have been shown to be more deleterious to learning than high arousal states, though the latter have received much more attention (e.g., test anxiety, confusion, and frustration). Aiming at profiling arousal in the classroom (how active students are) and examining how activation levels relate to achievement, we studied sympathetic arousal during two runs of an elective advanced physics course in a real classroom setting, including the course exam. Participants were high school students ( N = 24) who were randomly selected from the course population. Arousal was indexed from electrodermal activity, measured unobtrusively via the Empatica E4 wristband. Low arousal was the level with the highest incidence (60% of the lesson on average) and longest persistence, lasting on average three times longer than medium arousal and two times longer than high arousal level occurrences. During the course exam, arousal was positively and highly correlated ( r = .66) with achievement as measured by the students' grades. Implications for a need to focus more on addressing low arousal states in learning are discussed, together with potential applications for biofeedback, teacher intervention, and instructional design.
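The incidence and persistence measures described can be sketched as run-length statistics over a discretised arousal signal; the synthetic trace and tertile thresholds below are assumptions for illustration, not the study's actual processing.

```python
# Bin an electrodermal signal into low/medium/high arousal levels, then
# compute each level's incidence (share of samples) and persistence
# (mean run length of consecutive samples at the same level).
import numpy as np

rng = np.random.default_rng(3)
eda = np.abs(np.cumsum(rng.normal(0, 0.05, 3000)))  # synthetic EDA trace
low, high = np.quantile(eda, [0.33, 0.66])          # tertile cut points
levels = np.digitize(eda, [low, high])              # 0=low, 1=medium, 2=high

incidence = [(levels == k).mean() for k in range(3)]

runs = {0: [], 1: [], 2: []}
start = 0
for i in range(1, len(levels) + 1):
    if i == len(levels) or levels[i] != levels[start]:
        runs[levels[start]].append(i - start)       # close the current run
        start = i
persistence = {k: np.mean(v) for k, v in runs.items() if v}

print(incidence, persistence)
```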
Article
This paper presents three multimodal learning analytic approaches from a hands-on learning activity. We use video, audio, gesture and bio-physiology data from a two-condition study (N = 20), to identify correlations between the multimodal data, experimental condition, and two learning outcomes: design quality and learning. The three approaches incorporate: 1) human-annotated coding of video data, 2) automated coding of gesture, audio and bio-physiological data and, 3) concatenated human-annotated and automatically annotated data. Within each analysis we employ the same machine learning and sequence mining techniques. Ultimately we find that each approach provides different affordances depending on the similarity metric and the dependent variable. For example, the analysis based on human-annotated data found strong correlations among multimodal behaviors, experimental condition, success and learning, when we relaxed constraints on temporal similarity. The second approach performed well when comparing students’ multimodal behaviors as a time series, but was less effective using the temporally relaxed similarity metric. The take-away is that there are several strategies for doing multimodal learning analytics, and that many of these approaches can provide a meaningful glimpse into a complex data set, glimpses that may be difficult to identify using traditional approaches.
Conference Paper
Affect detection is an important component of computerized learning environments that adapt the interface and materials to students' affect. This paper proposes a plan for developing and testing multimodal affect detectors that generalize across differences in data that are likely to occur in practical applications (e.g., time, demographic variables). Facial features and interaction log features are considered as modalities for affect detection in this scenario, each with their own advantages. Results are presented for completed work evaluating the accuracy of individual modality face- and interaction- based detectors, accuracy and availability of a multimodal combination of these modalities, and initial steps toward generalization of face-based detectors. Additional data collection needed for cross-culture generalization testing is also completed. Challenges and possible solutions for proposed cross-cultural generalization testing of multimodal detectors are also discussed.
Conference Paper
Should we judge the quality of a class by the grades the students and teacher get at the end of the semester, or by how the group collaborated during the semester towards acquiring new knowledge? Until recently, the latter approach was largely inaccessible due to the complexity and time needed to evaluate every class. With the development of new technologies in different branches of video processing, gaze tracking and audio analysis, we are getting the opportunity to go further with our analysis and avoid the problem substitution into which we were previously forced. We present our efforts to record student-student and student-teacher interactions within a classroom eco-system. For this purpose, we developed a multi-camera system for observing teacher actions and student reactions throughout the class. We complemented the data with a mobile eye-tracker worn by the teacher, quantitative questionnaire data collection, as well as in-depth interviews with students about their impressions of the classes they took and of our intervention. The seven-part experiment was conducted during the autumn semester of 2013, in two classes with over 60 participants. We present the conclusions we reached about the experiment format, visualize the preliminary results of our processing, and discuss other options we are considering for further experiments. We aim to explore further possibilities for analysing classroom life in order to create an environment more responsive to the needs of the students.
Conference Paper
The recent emergence of several low-cost, high-resolution, multimodal sensors has greatly facilitated the ability of researchers to capture a wealth of data across a variety of contexts. Over the past few years, this multimodal technology has begun to receive greater attention within the learning community. Specifically, the Multimodal Learning Analytics community has been capitalizing on new sensor technology, as well as the expansion of tools for supporting computational analysis, in order to better understand and improve student learning in complex learning environments. However, even as the data collection and analysis tools have greatly eased the process, there remain a number of considerations and challenges in framing research in such a way that it lends itself to the development of learning theory. Moreover, there are a multitude of approaches that can be used for integrating multimodal data, and each approach has different assumptions and implications. In this paper, I describe three different types of multimodal analyses and discuss how decisions about data integration and fusion have a significant impact on how the research relates to learning theories.
Conference Paper
Detecting learning-centered affective states is difficult, yet crucial for adapting most effectively to users. Within tutoring in particular, the combined context of student task actions and tutorial dialogue shapes the student's affective experience. As we move toward detecting affect, we may also supplement the task and dialogue streams with rich sensor data. In studies of introductory computer programming tutoring, human tutors communicated with students through text-based interfaces. Manual and automated approaches were leveraged to annotate dialogue, task actions, facial movements, postural positions, and hand-to-face gestures. Prior investigations in this line of doctoral research identified associations between nonverbal behavior and learning-centered affect, such as engagement and frustration. Additionally, preliminary work used hidden Markov models to analyze sequences of affective tutorial interaction. Further work will address the sequential nature of the multimodal data. This line of research is expected to improve automated understanding of learning-centered affect, with particular insights into how affect unfolds from moment to moment during tutoring. This may result in systems that treat student affect not as transient states, but instead as interconnected links in a student's path toward learning.