Article

Predicting Student Performance from LMS Data: A Comparison of 17 Blended Courses Using Moodle LMS


Abstract

With the adoption of Learning Management Systems (LMSs) in educational institutions, large volumes of data describing students’ online behavior have become available. Many researchers have used these data to predict student performance. This has produced a rather diverse set of findings, possibly related to the diversity in courses and in the predictor variables extracted from the LMS, which makes it hard to draw general conclusions about the mechanisms underlying student performance. We first provide an overview of the theoretical arguments used in learning analytics research and the typical predictors used in recent studies. We then analyze 17 blended courses with 4,989 students at a single institution using Moodle LMS, in which we predict student performance from LMS predictor variables as used in the literature and from in-between assessment grades, using both multi-level and standard regressions. Our analyses show that the results of predictive modeling, even though the data were collected within a single institution, vary strongly across courses. Thus, the portability of the prediction models across courses is low. In addition, we show that for the purpose of early intervention, or when in-between assessment grades are taken into account, LMS data are of little (additional) value. We outline the implications of our findings and emphasize the need for more specific theoretical argumentation and for additional data sources beyond LMS data alone.
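The paper's course-by-course modeling can be illustrated with a small sketch. This is not the authors' code: the data and the choice of "number of online sessions" as the single LMS predictor are invented. Fitting a separate ordinary-least-squares regression per course shows how coefficients can differ in size and even sign across courses, which is what low portability of prediction models looks like in practice.

```python
# Illustrative sketch (invented data): per-course OLS regression of
# final grade on one LMS predictor, showing coefficients vary by course.

def ols_fit(xs, ys):
    """Return (intercept, slope) of a simple OLS regression."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return my - slope * mx, slope

# Hypothetical per-course data: (number of online sessions, final grade)
courses = {
    "course_A": ([5, 12, 20, 30, 41], [4.0, 5.5, 6.0, 7.5, 8.0]),
    "course_B": ([8, 15, 22, 35, 50], [7.0, 6.5, 6.8, 6.2, 6.6]),
}

slopes = {}
for name, (sessions, grades) in courses.items():
    intercept, slope = ols_fit(sessions, grades)
    slopes[name] = slope
    print(f"{name}: grade = {intercept:.2f} + {slope:.3f} * sessions")
```

A model trained on course_A (positive slope) would mispredict on course_B (negative slope), which is why a single pooled model, or a multi-level model with course-specific effects, is needed when courses differ this much.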


... A considerable volume of research in the field of learning analytics is exploratory and has been aimed at developing predictive models of student academic success, often related to the prediction of students' grades and retention (Conijn et al. 2017;Chen et al. 2020;Dawson et al. 2014;Siemens, Dawson, and Lynch 2014). The majority of such models have been based on the examination of learner trace data retrieved from learning management systems (LMSs). ...
... Based on a literature review of predictor variables derived from LMS log data in fully online and blended courses (Conijn et al. 2017), the most commonly used variables include the total number of events, number of online sessions, total time spent online, number of content page views, number of discussion messages viewed and posted, and number of assessment items completed. Another recent review (Alyahyan and Düştegör 2020) identified variables such as the number of logins, number of discussion forum entries, and number/total time of materials viewed as frequently used variables for predicting student academic success. ...
... Another recent review (Alyahyan and Düştegör 2020) identified variables such as the number of logins, number of discussion forum entries, and number/total time of materials viewed as frequently used variables for predicting student academic success. However, given that most predictive modeling studies were set in different learning contexts and varied notably in the selection of variables (features) for predictive models, it is hard to draw conclusions about the best or most stable predictors of student performance (Conijn et al. 2017). Likewise, results from such studies have low potential for generalizability (Andres et al. 2018) and replicability (Andres et al. 2017). ...
Article
Full-text available
Predictors of student academic success do not always replicate well across different learning designs, subject areas, or educational institutions. This suggests that the characteristics of a particular discipline and learning design have to be carefully considered when creating predictive models in order to scale up learning analytics. This study aimed to examine if and to what extent frequently used predictors of study success are portable across a homogeneous set of courses. The research was conducted in an integrated blended problem-based curriculum with trace data (n = 2,385 students) from 50 different course offerings across four academic years. We applied the statistical method of single-paper meta-analysis to combine correlations of several indicators with students' success. Total activity and the forum indicators exhibited the highest prediction intervals, where the former represented proxies of overall engagement with online tasks, and the latter of online collaborative learning activities. Indicators of lecture reading (frequency of lecture views) showed statistically non-significant prediction intervals and, therefore, are less likely to be portable across course offerings. The findings show moderate amounts of variability both within iterations of the same course and across courses. The results suggest that the use of the meta-analytic statistical method for the examination of study-success indicators across courses with similar learning design and subject area can offer valuable quantitative means for identifying predictors that replicate reasonably well and, consequently, can be reliably ported in the future.
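The core of the meta-analytic approach described above can be sketched in a few lines. This is a simplified stand-in, not the study's analysis: the per-course correlations are invented, and only fixed-effect pooling via Fisher's z transform is shown, whereas the study reports random-effects prediction intervals.

```python
import math

# Hedged sketch: pooling per-course correlations between an LMS
# indicator and course grade with Fisher's z transform and
# inverse-variance weights. Correlations and sample sizes are invented.

def fisher_z(r):
    return 0.5 * math.log((1 + r) / (1 - r))

def inv_fisher_z(z):
    return math.tanh(z)

# (correlation, number of students) for hypothetical course offerings
studies = [(0.30, 120), (0.18, 85), (0.42, 200), (0.25, 60)]

weights = [n - 3 for _, n in studies]            # var(z) = 1/(n - 3)
zs = [fisher_z(r) for r, _ in studies]
pooled_z = sum(w * z for w, z in zip(weights, zs)) / sum(weights)
pooled_r = inv_fisher_z(pooled_z)
print(f"pooled correlation = {pooled_r:.3f}")
```

A random-effects version would additionally estimate between-course variance (e.g., via DerSimonian-Laird) and widen the interval accordingly, which is what makes prediction intervals informative about portability.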
... Recent research in data mining addresses the limitations of such rule-set-based models by advocating automated learning methods. Regression methods are employed in [11,12,21]. In [11], student engagement is included as a data attribute to build a regression-based predictor of undergraduate GPA. A. Pardo et al. [12] combine university students' self-regulated learning indicators and engagement with online learning logs to predict students' academic performance. ...
... In [11], student engagement is included as a data attribute to build a regression-based predictor of undergraduate GPA. A. Pardo et al. [12] combine university students' self-regulated learning indicators and engagement with online learning logs to predict students' academic performance. R. Conijn et al. [21] utilize data collected from the Moodle learning management system to predict and compare student performance across 17 blended courses using multiple linear regression. S. Kotsiantis et al. [22] utilize the Naive Bayes algorithm to predict students' performance (fail or pass) in the final examination of university-level distance learning, using students' demographic characteristics and marks on a few written assignments. ...
... Besides, although most neural-network-based student performance prediction models achieve relatively good performance, they only treat the model as an end-to-end learning process without investigating the relationships within the data, causing the models to lack interpretability. ...
Article
Full-text available
Senior high school education (SHSE) forms a connecting link between the preceding junior high school education and the following college education. Through SHSE, a student not only completes K-12 education but also lays a foundation for subsequent higher education. A student's grade in SHSE plays a critical role in college application and admission. Therefore, using students' grades as an indicator is a reasonable way to guide and assure the effectiveness of SHSE. However, due to the complexity and nonlinearity of the grade prediction problem, it is hard to predict grades accurately. In this paper, a novel grade prediction model designed to handle this complexity and nonlinearity is proposed to accurately predict the grades of senior high school students. To deal with the complexity, a graph structure is employed to represent the students' grades in all subjects. To handle the nonlinearity, a multi-layer perceptron (MLP) is used to learn (or fit) the inner relations among subject grades. The proposed grade prediction model based on a graph neural network is tested on a dataset from Ningbo Xiaoshi High School. The results show that the proposed method performs well in predicting senior high school students' grades.
... Many studies have been conducted to predict students' performance on the basis of expectations of the abovementioned benefits. Previous studies have used self-efficacy (Yu et al., 2020a), demographics (El Aissaoui et al., 2020;Yağci & Çevik, 2019), and students' online behaviors (Conijn et al., 2017;Lemay & Doleck, 2020) to predict students' performance. Furthermore, given the potential value of finding hidden patterns between features, studies that predict students' performance using various machine learning algorithms have been actively conducted, and the results of these studies exhibited high prediction rates (Cen et al., 2016;Chaturvedi & Ezeife, 2017;Grivokostopoulou et al., 2015;Haridas et al., 2020). ...
... Fourth, the features used to predict student performance often lack a background theory. Studies that attempt to improve educational quality by using big data from a learning management system (LMS), for example, frequently lack a clear background theory to support the research (Clow, 2013; Conijn et al., 2017). However, if a study has theoretical grounds, these can guide the choice of which features to extract from the raw LMS log data and the interpretation of the results (Conijn et al., 2017). ...
... Studies that attempt to improve educational quality by using big data from a learning management system (LMS), for example, frequently lack a clear background theory to support the research (Clow, 2013; Conijn et al., 2017). However, if a study has theoretical grounds, these can guide the choice of which features to extract from the raw LMS log data and the interpretation of the results (Conijn et al., 2017). Several studies extract features using a theoretical background. ...
Article
Full-text available
Predicting students’ performance in advance could help assist the learning process; if “at-risk” students can be identified early on, educators can provide them with the necessary educational support. Despite this potential advantage, the technology for predicting students’ performance has not been widely used in education due to practical limitations. We propose a practical method to predict students’ performance in the educational environment using machine learning and explainable artificial intelligence (XAI) techniques. We conducted qualitative research to ascertain the perspectives of educational stakeholders. Twelve people, including educators, parents of K-12 students, and policymakers, participated in a focus group interview. The initial practical features were chosen based on the participants’ responses. Then, a final version of the practical features was selected through correlation analysis. In addition, to verify whether at-risk students could be distinguished using the selected features, we experimented with various machine learning algorithms: Logistic Regression, Decision Tree, Random Forest, Multi-Layer Perceptron, Support Vector Machine, XGBoost, LightGBM, VTC, and STC. As a result of the experiment, Logistic Regression showed the best overall performance. Finally, information intended to help each student was visually provided using the XAI technique.
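Of the algorithms the study compares, logistic regression performed best overall, and its mechanics are simple enough to sketch directly. The example below is not the study's pipeline: the two features (attendance rate, quiz average) and all data points are invented for illustration, and the model is trained with plain gradient descent rather than a library solver.

```python
import math

# Minimal sketch: a logistic-regression classifier flagging "at-risk"
# students from two invented features, trained by gradient descent.

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# features: (attendance in [0,1], quiz average in [0,1]); label 1 = at risk
data = [
    ((0.95, 0.90), 0), ((0.85, 0.80), 0), ((0.90, 0.75), 0),
    ((0.40, 0.35), 1), ((0.30, 0.50), 1), ((0.55, 0.30), 1),
]

w = [0.0, 0.0]
b = 0.0
lr = 0.5
for _ in range(2000):                       # plain batch of SGD passes
    for (x1, x2), y in data:
        p = sigmoid(w[0] * x1 + w[1] * x2 + b)
        err = p - y                         # gradient of log-loss
        w[0] -= lr * err * x1
        w[1] -= lr * err * x2
        b -= lr * err

def predict(x1, x2):
    """Probability that a student with these features is at risk."""
    return sigmoid(w[0] * x1 + w[1] * x2 + b)

print(f"risk(low engagement)  = {predict(0.35, 0.40):.2f}")
print(f"risk(high engagement) = {predict(0.92, 0.85):.2f}")
```

Because the learned weights are per-feature coefficients, logistic regression also lends itself to the kind of per-student explanation the XAI step in the study aims for: each feature's contribution to the logit can be shown directly.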
... By drawing on the aforementioned features, previous studies have been able to predict final course performance (Agudo-Peregrina et al., 2014; Conijn, Snijders, Kleingeld, & Matzat, 2017; Tomasevic et al., 2020) or identify students who are about to drop out due to learning struggles (Hasan et al., 2020; Huang et al., 2020). Overall, the picture that has emerged from utilizing such data is very promising, indicating that student performance can be modelled using such features to meet the needs of instructors, students, and administrators. ...
... Earlier research had indicated that forum discussions are correlated with learning performance (Kim, Park, Yoon & Jo, 2016;Papamitsiou & Economides, 2014). Hence, it was only natural that researchers explored the power of forum discussions as features for predicting learning (Conijn et al., 2017;Kim et al., 2016;Romero & Ventura, 2010;Wang, Kraut & Levine, 2015). ...
... Even though students produce large volumes of text in E-learning systems, previous studies have not methodically explored text as a feature. The exception to this rule involves studies that have focused on forum discussions for predicting student performance (Conijn et al., 2017;Kim et al., 2016;Romero et al., 2013;Wang et al., 2015). However, most studies tend to extract contextual and behavioural features of forum posts, for example the number of posts or the number of words per post. ...
Article
Full-text available
The digital trails that students leave behind on e-learning environments have attracted considerable attention in the past decade. Typically, some of these traces involve the production of different kinds of texts. While students routinely produce a bulk of texts in online learning settings, the potential of such linguistic features has not been systematically explored. This paper introduces a novel approach that involves using student-generated texts for predicting performance after viewing short video lectures. Forty-two undergraduates viewed six video lectures and were asked to write short summaries for each one. Five combinations of features that were extracted from these summaries were used to train eight machine learning classifiers. The findings indicated that the raw text feature set achieved higher average classification accuracy in two video lectures, while the combined feature set whose dimensionality had been reduced resulted in higher classification accuracy in two other video lectures. The findings also indicated that the Gradient Boost, AdaBoost and Random Forest classifiers achieved high average performance in half of the video lectures. The study findings suggest that student-produced texts are a very promising source of features for predicting student performance when learning from short video lectures.
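Turning student-written summaries into the "raw text feature set" the abstract mentions typically starts with a bag-of-words weighting such as TF-IDF. The sketch below is illustrative only (the summaries are invented, and real pipelines would add tokenization, lowercasing, and stop-word handling), but it shows the representation that the classifiers would consume.

```python
import math
from collections import Counter

# Illustrative sketch: TF-IDF features from short student summaries.
# Summaries are invented; a real pipeline would tokenize more carefully.

summaries = [
    "the lecture explains gradient descent and learning rates",
    "video about neural networks and gradient updates",
    "summary of sorting algorithms and complexity",
]

docs = [s.split() for s in summaries]
df = Counter(term for doc in docs for term in set(doc))   # document frequency
n_docs = len(docs)

def tfidf(doc):
    """Map each term to term-frequency * inverse-document-frequency."""
    tf = Counter(doc)
    return {t: (tf[t] / len(doc)) * math.log(n_docs / df[t]) for t in tf}

vec = tfidf(docs[0])
# "gradient" appears in two summaries, so it is down-weighted relative
# to a term unique to this summary, such as "rates"
print(sorted(vec, key=vec.get, reverse=True)[:3])
```

Terms shared by all documents (here "and") get an IDF of zero and drop out, which is exactly why such weighting helps separate summaries by content rather than by function words.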
... For instance, Romero and Ventura [6] suggest that students use an LMS to personalise their learning, for example by reviewing specific material or engaging in relevant discussions as they prepare for their exams. Meanwhile, teachers rely on an LMS to deliver their course content and manage teaching resources in a relatively simple and uniform manner [7], without worrying about pace, place, or space constraints. Irrespective of how an LMS is used, user interaction with the system generates significant and detailed digital footprints that can be mined using LA tools. ...
... Every activity performed in Moodle is captured in a database or system log, which can then be analysed to examine underlying student learning behaviours via LA approaches. A deeper investigation may be conducted if any indicators pertaining to at-risk students are identified [7]. A modelling process translates these indicators (extracted from training data) into predictive insights, which can be used on new data (or test data) to gauge student online behaviours. ...
... LMS data and in-between assessment grades were used in another study [7] to predict student performance. Multiple linear regression was used to induce predictive models at the end of the course and evaluate the efficacy of the available features within LMS. ...
Article
Full-text available
Poor academic performance of students is a concern in the educational sector, especially if it leads to students being unable to meet minimum course requirements. However, with timely prediction of students’ performance, educators can detect at-risk students, thereby enabling early interventions to support these students in overcoming their learning difficulties. The majority of studies, though, have taken the approach of developing individual prediction models that each target a single course. These models are tailored to specific attributes of each course amongst a very diverse set of possibilities. While this approach can yield accurate models in some instances, the strategy has limitations. In many cases, overfitting can take place when course data are small or when new courses are devised. Additionally, maintaining a large suite of per-course models is a significant overhead. This issue can be tackled by developing a generic, course-agnostic predictive model that captures more abstract patterns and is able to operate across all courses, irrespective of their differences. This study demonstrates how a generic predictive model can be developed that identifies at-risk students across a wide variety of courses. Experiments were conducted using a range of algorithms, with the generic model achieving effective accuracy. The findings showed that the CatBoost algorithm performed best on our dataset across the F-measure, ROC (receiver operating characteristic) curve, and AUC scores; it is therefore an excellent candidate algorithm for this domain, given its ability to seamlessly handle categorical and missing data, which are frequent features of educational datasets.
... When starting LA, it is also necessary to draw the boundaries ("what purpose," "for whom," "what data," and "how to analyze") and to reveal objectives due to the broad scope of LA (Chatti et al., 2012). In this context, it is remarkable that by focusing on academic success, most researchers predict performance with LMS data (Conijn et al., 2017;Iglesias-Pradas et al., 2015;Mwalumbwe & Mtebe, 2017;Saqr et al., 2017;Strang, 2016;Zacharis, 2015), compare various techniques to increase the predictive power (Cui et al., 2020;Hung et al., 2019;Miranda & Vegliante, 2019;You, 2016), and predict using individual characteristics and LMS data (Ramirez-Arellano et al., 2019;Strang, 2017). However, there is still no consensus on designing interventions to increase learning outcomes. ...
... Naturally, there have been many studies investigating the prediction of academic success with LMS data. In some studies (Mwalumbwe & Mtebe, 2017; Saqr et al., 2017; Zacharis, 2015), the classification power of LMS data for academic achievement was considerable, while in other studies (Conijn et al., 2017; Iglesias-Pradas et al., 2015; Strang, 2016) LMS data contributed only partially. For example, Saqr et al. (2017) found that engagement parameters showed significant positive correlations with student performance, especially those reflecting motivation and self-regulation. ...
... impacted students' performance. Among the studies where LMS data contributed only partially, Conijn et al. (2017) revealed that the accuracy of the prediction models differed mainly between courses, with the models explaining between 8% and 37% of the variance in the final grade. For early intervention, or once in-between assessment grades were available, the LMS data proved to be of little value. ...
Article
Full-text available
This study aims to determine indicators that affect students' final performance in an online learning environment using predictive learning analytics in an ICT course in the Turkish context. The study takes place within a large state university in an online computer literacy course (14 weeks in one semester) delivered to freshman students (n = 1209). The researcher gathered data from Moodle engagement analytics (time spent in the course, number of clicks, exam, content, discussion), assessment grades (a pre-test for prior knowledge, final grade), and various scales (technical skills, the "motivation and attitude" dimensions of readiness, and self-regulated learning skills). Data analysis used multiple regression and classification. Multiple regression showed that prior knowledge and technical skills predict final performance in the context of the course (ICT 101). At best, the Decision Tree algorithm correctly classified 67.8% of the high final performers based on learners' characteristics and Moodle engagement analytics. A high level of total system interactions among learners with low prior knowledge increases their probability of high performance (from 40.4% to 60.2%). This study discusses the course structure and learning design, appropriate actions to improve performance, and suggestions for future research based on the findings.
... To make sense of this and to employ it effectively, data analytics in the context of education has become more common in the past decade. The use of so-called 'learning analytics' often aims to predict a student's course performance based on interaction data (Conijn et al., 2016;De Medio et al., 2020). However, the extent to which data-driven predictions to date are robust seems to vary. ...
... However, the extent to which data-driven predictions to date are robust seems to vary. For example, in the context of Learning Management Systems, using student data (e.g., interaction times, clicks) to predict course performance shows strong differences across different courses (Conijn et al., 2016). Moreover, although such techniques provide insight to the system owners and managers, they often do not help students with education-related problems, such as deciding what course to follow next. ...
... The scope of educational recommender systems can vary strongly (Rivera et al., 2018), both in terms of what algorithmic approaches are used and what areas of education are covered. With regard to the former, it seems that collaborative filtering (CF) and hybrid approaches that involve a CF component are the most popular (Rivera et al., 2018), arguably because Learning Management Systems (LMSs) generate a lot of interaction data from which student-related parameters can be distilled (Conijn et al., 2016; Hasan et al., 2016). ...
Article
Full-text available
A challenge for many young adults is to find the right institution at which to pursue higher education. Global university rankings are a commonly used but inefficient tool, as they do not consider a person's preferences and needs. For example, some people pursue prestige in their higher education, while others prefer proximity. This paper develops and evaluates a university recommender system, eliciting user preferences as ratings to build predictive models and to generate personalized university ranking lists. In Study 1, we performed an offline evaluation on a rating dataset to determine which recommender approaches had the highest predictive value. In Study 2, we selected three algorithms to produce different university recommendation lists in our online tool, asking our users to compare and evaluate them in terms of different metrics (Accuracy, Diversity, Perceived Personalization, Satisfaction, and Novelty). We show that an SVD algorithm scores high on accuracy and perceived personalization, while a KNN algorithm scores better on novelty. We also report findings on preferred university features.
... Although researchers have conducted in-depth research on the predictive features, due to different research scenarios and research data, the results of these studies are not consistent. Recently, Conijn et al. extracted 23 common predictive features from the log data of 17 courses, and compared the effect of each feature on the prediction of learning results for different courses [19]. They found that, in addition to the mid-term test score being significantly related to the final result in all courses, other features are only significantly related to the final result in some courses; moreover, the correlation between a specific feature and the final result differs across courses [19]. ...
... Recently, Conijn et al. extracted 23 common predictive features from the log data of 17 courses, and compared the effect of each feature on the prediction of learning results for different courses [19]. They found that, in addition to the mid-term test score being significantly related to the final result in all courses, other features are only significantly related to the final result in some courses; moreover, the correlation between a specific feature and the final result differs across courses [19]. This illustrates that it is difficult to find a set of general predictive features, and thus we should select appropriate predictive features for specific situations. ...
... According to the predicted variable, different prediction algorithms are employed to develop the prediction model. When the variable is the student's final grade, the most used prediction algorithm is multivariate linear regression (MLR) [19]. When the variable is whether the student passes the course, the prediction algorithms are usually LR, DT, NB, KNN, SVM, Multilayer Perceptron (MLP), RF, and so on [8,10,11,15,[20][21][22]. ...
Article
Full-text available
Online learning has developed rapidly, but its success rate is very low. Hence, it is of great significance to construct a model for predicting learning results that can quickly and accurately identify students at risk of failing their course. In order to mine the dynamic features of learning behaviors and use them to improve the accuracy of detecting at-risk students, we propose a long short-term memory (LSTM) network based approach to identify at-risk students. To validate the performance of this approach, we first extracted the behavior data of one course from a public dataset and generated two types of datasets: aggregated datasets and sequential datasets. After that, we used eight classic machine learning methods to train predicting models on these datasets and explored whether the models trained on sequential datasets are more accurate than those trained on aggregated datasets. The results show that the models trained on sequential datasets are more accurate when naïve Bayes, Classification and Regression Tree, Random Forest (RF), Iterative Dichotomiser 3, and Multilayer Perceptron are used. Finally, we used the LSTM to train predicting models on sequential datasets and compared them with the best models trained by RF. The results show that the models trained by the LSTM are more accurate, which proves the effectiveness of the proposed approach to a certain extent.
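What distinguishes the LSTM from the aggregated-feature models is that it consumes the behavior sequence step by step through gated updates. The pure-Python sketch below shows one LSTM cell's forward pass; it is illustrative only (the study would use a full deep-learning framework with trained weights), with random weights and inputs standing in for per-week activity counts.

```python
import math
import random

# Minimal sketch of an LSTM cell's forward pass over a short sequence.
# Weights are random and untrained; inputs stand in for weekly activity.

random.seed(1)
n_in, n_hid = 3, 4

def mat(r, c):
    return [[random.gauss(0, 0.1) for _ in range(c)] for _ in range(r)]

def matvec(W, v):
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def sigmoid(z):
    return 1 / (1 + math.exp(-z))

# one weight matrix per gate, acting on [input; previous hidden state]
Wf, Wi, Wo, Wc = (mat(n_hid, n_in + n_hid) for _ in range(4))

def lstm_step(x, h_prev, c_prev):
    z = x + h_prev                                   # concatenation
    f = [sigmoid(a) for a in matvec(Wf, z)]          # forget gate
    i = [sigmoid(a) for a in matvec(Wi, z)]          # input gate
    o = [sigmoid(a) for a in matvec(Wo, z)]          # output gate
    g = [math.tanh(a) for a in matvec(Wc, z)]        # candidate state
    c = [ft * ct + it * gt for ft, ct, it, gt in zip(f, c_prev, i, g)]
    h = [ot * math.tanh(ct) for ot, ct in zip(o, c)]
    return h, c

h, c = [0.0] * n_hid, [0.0] * n_hid
for x in ([1.0, 0.0, 2.0], [0.0, 3.0, 1.0]):         # two time steps
    h, c = lstm_step(x, h, c)
print(h)
```

The cell state c carries information across time steps under the forget gate's control, which is how the model can exploit the ordering of behaviors that aggregated counts discard.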
... Despite significant disagreements among experts, the terms metaverse, virtual reality (VR), mixed reality (MR), and cross reality (XR) are often used to refer to the same thing [86][87][88]. All of these phrases refer to circumstances in which a person has a virtual reality experience enabled by advancing technology [92][93][94]. Historically, VR has been used more often than XR (see Fig. 11). All immersive learning environments (ILEs) have one characteristic in common: they provide context for the technical, narrative, and challenging aspects [98][99][100]. ...
... The expanding effects of XR cannot be selected arbitrarily, since the brain often accepts what the eye believes to be real [58,88]. For instance, the employment of XR in a given scenario needs to be grounded in declarative knowledge [23,92,97]. When expenses, environmental sensitivity, and social awareness are considered, XR may not be the optimal option in many modern situations [64,67,89]. ...
... VR for Medical Education (adapted from [73, 84, 91–97]). ...
Article
Full-text available
Numerous studies have shown the efficacy of incorporating extended reality (XR) into higher education. In light of the Covid-19 pandemic and the need to improve the virtual presence of remote students and workers, university authorities prioritize the integration of different XR alternatives into education. To choose the most appropriate virtual reality (VR) platforms and apps, instructional designers (IDs) must possess analytical, design, implementation, and evaluation skills. This article shows how unique identifiers may be used to assist in the creation of high-level resources. On the one hand, we offer recommendations for well-established design models, design approaches, and methodologies; on the other hand, we suggest templates for research and funding leadership, as well as for evaluating XR for optimum usability in higher education institutions' applications.
... Data mining techniques and ANNs have also been proposed in [11] to predict instructor performance using questionnaires and scores as inputs. On the other hand, students' performance has also been predicted using data mining techniques [12], with a detailed analysis provided for virtual learning environments with online teaching [13]. The analysis in [14] also aims at determining academic performance by considering self-regulated learning indicators and engagement with online events. ...
... The analysis in [14] also aims at determining academic performance by considering self-regulated learning indicators and engagement with online events. In [12]-[15], the focus is placed on student or instructor performance, but the predictions do not provide clear and specific information on how the course should be designed. Although the information from those analyses can be used by the teacher, the correlation between educational actions and the acquisition of competences is absent. ...
Article
Full-text available
Although technical competences are fundamental in engineering degrees, industry is also requesting the promotion of transversal capabilities. Consequently, the map of target competences may vary over time, area, and location. In this context, the design of an undergraduate course is not a trivial task if the promotion of several competences is desired. When such a design is performed manually by the teacher using his or her previous experience, the perspective of the students and the information from previous scores are usually disregarded. Furthermore, determining the optimal times for the different activities becomes complex when a multi-objective problem that aims at balancing technical and soft skills must be satisfied. This paper proposes the use of a predictive tool to assist in the design of the course. On the one hand, the predictive algorithm automatically determines the duration of the different activities to fit a specific map of competences. Moreover, the predictive tool also offers valuable information about the perspective of the students and the influence of previous scores using objective indices. The proposal is assessed in a course on Electrical Machines at the University of Malaga (Spain), confirming the capability of the proposed predictive tool to provide valuable insight into the subject and to automatically determine the duration of different methodological tools.
... One of the main threads of research in learning analytics has been focused on predicting student success. Predicting students who may fail or underachieve may pave the way for the provision of appropriate support and proactive intervention (Conijn et al., 2017;Gašević et al., 2016;Ifenthaler & Yau, 2020). Three main themes of this research can be observed: 1) studies performed in limited settings (e.g., a single course); 2) studies performed in multiple courses, and 3) studies replicating other findings of similar studies (Ifenthaler & Yau, 2020;Li et al., 2017). ...
... The second type (multiple courses) is becoming increasingly common in learning analytics. Results from large-scale studies have reported noticeable variability in indicators of student success, as well as in the precision or portability of predictive models (Conijn et al., 2017;Gašević et al., 2016). This variability was reported across different institutions as well as within the same institution. ...
Article
Full-text available
There has been extensive research using centrality measures in educational settings. One of the most common lines of such research has tested network centrality measures as indicators of success. The increasing interest in centrality measures has been kindled by the proliferation of learning analytics. Previous works have been dominated by single-course case studies that have yielded inconclusive results regarding the consistency and suitability of centrality measures as indicators of academic achievement. Therefore, large-scale studies are needed to overcome the multiple limitations of existing research (limited datasets, selective and reporting bias, as well as limited statistical power). This study aims to empirically test and verify the role of centrality measures as indicators of success in collaborative learning. For this purpose, we attempted to reproduce the most commonly used centrality measures in the literature in all the courses of an institution over five years of education. The study included a large dataset (n=3,277) consisting of 69 course offerings, with similar pedagogical underpinnings, using meta-analysis as a method to pool the results of different courses. Our results show that degree and eigenvector centrality measures can be a consistent indicator of performance in collaborative settings. Betweenness and closeness centralities yielded uncertain predictive intervals and were less likely to replicate. Our results have shown moderate levels of heterogeneity, indicating some diversity of the results comparable to single laboratory replication studies.
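The two measures the study found most consistent, degree and eigenvector centrality, are straightforward to compute. The sketch below uses an invented five-student reply network (not the study's data): degree counts a student's direct interaction partners, while eigenvector centrality, obtained here by power iteration, also rewards being connected to well-connected peers.

```python
# Illustrative sketch (invented data): degree and eigenvector centrality
# for a small discussion-forum interaction network.

adj = {  # undirected reply network among five hypothetical students
    "s1": {"s2", "s3", "s4"},
    "s2": {"s1", "s3"},
    "s3": {"s1", "s2", "s4", "s5"},
    "s4": {"s1", "s3"},
    "s5": {"s3"},
}

degree = {n: len(nb) for n, nb in adj.items()}

# eigenvector centrality via power iteration on the adjacency structure
x = {n: 1.0 for n in adj}
for _ in range(100):
    x_new = {n: sum(x[m] for m in adj[n]) for n in adj}
    norm = max(x_new.values())
    x = {n: v / norm for n, v in x_new.items()}

top = max(x, key=x.get)
print(f"most central student: {top}")
```

In a success-prediction setting, these per-student centrality scores would then be correlated with grades course by course, and pooled with the meta-analytic method the abstract describes.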
... One of the main threads of research in learning analytics has been focused on predicting students' success. Predicting students who may fail or underachieve may pave the way for the provision of appropriate support and proactive intervention (Conijn et al., 2017; Gašević et al., 2016; Ifenthaler & Yau, 2020). Three main themes of research can be observed: 1) studies performed in limited settings (e.g., a single course); 2) studies performed in multiple courses; and 3) studies replicating the findings of similar studies (Ifenthaler & Yau, 2020; Li et al., 2017). ...
... The second type (multiple courses) is becoming increasingly common in learning analytics. Results from large-scale studies have reported noticeable variability in indicators of students' success, as well as in the precision or portability of predictive models (Conijn et al., 2017; Gašević et al., 2016). This variability was reported across different institutions as well as within the same institution. ...
Preprint
Full-text available
There is extensive research using centrality measures in educational settings. One of the most common lines of such research has tested network centrality measures as indicators of success. The increasing interest in centrality measures has been kindled by the proliferation of learning analytics. Previous works have been dominated by single-course case studies that have yielded inconclusive results regarding the consistency and suitability of centrality measures as indicators of academic achievement. Therefore, large-scale studies are needed to overcome the multiple limitations of existing research (limited datasets, selective and reporting bias, as well as limited statistical power). This study aims to empirically test and verify the role of centrality measures as indicators of success in collaborative learning. For this purpose, we attempted to reproduce the most commonly used centrality measures in the literature in all the courses of an institution over five years of education. The study included a large dataset (n=3,277) consisting of 69 course offerings, with similar pedagogical underpinnings, using meta-analysis as a method to pool the results of different courses. Our results show that degree and eigenvector centrality measures can be consistent indicators of performance in collaborative settings. Betweenness and closeness centralities yielded uncertain predictive intervals and were less likely to replicate. Our results have shown moderate levels of heterogeneity, indicating some diversity of the results comparable to single laboratory replication studies.
Notes for Practice (research paper):
• Degree and eigenvector centrality measures can be consistent indicators of performance in settings in which the course design emphasizes collaboration.
• The correlation between degree and eigenvector centrality measures and academic achievement was reproducible regardless of the number of students, number of interactions, year of study, or course subject.
• Closeness and betweenness centralities showed an inconsistent correlation with performance.
• Although our context was homogenous, there was moderate heterogeneity in the pooled effect sizes, indicating the diversity of CSCL as a medium.
... Cross-validation has also been shown to be a valuable tool to enable LA models to generalize to independent educational records (Alexandro 2018). However, a small body of research found that cross-validation produces models with lower mean AUC values compared to models omitting this step (Kumar and Singh 2017; Lisitsyna and Oreshin 2019), while others reported poor prediction results when studying retention in different academic subjects (Campbell et al. 2007; Conijn et al. 2016). When focusing on implementing educational interventions to foster student success, however, Conijn et al. (2016) suggested using LMS data in conjunction with other academic and personal data records to provide greater insight into the student's academic and personal background. ...
... However, a small body of research found that cross-validation produces models with lower mean AUC values compared to models omitting this step (Kumar and Singh 2017; Lisitsyna and Oreshin 2019), while others reported poor prediction results when studying retention in different academic subjects (Campbell et al. 2007; Conijn et al. 2016). When focusing on implementing educational interventions to foster student success, however, Conijn et al. (2016) suggested using LMS data in conjunction with other academic and personal data records to provide greater insight into the student's academic and personal background. ...
Thesis
Monte Carlo simulation studies are used to examine how eight factors impact predictions of a binary target outcome in data science pipelines: (1) the choice of four data mining methods (DMMs) [Logistic Regression (LR), Elastic Net Regression (GLMNET), Random Forest (RF), Extreme Gradient Boosting (XGBoost)], (2) the choice of three filter preprocessing feature selection techniques [Correlation Attribute Evaluation (CAE), Fisher's Scoring Algorithm (FSA), Information Gain Attribute Evaluation (IG)], (3) number of training observations, (4) number of features, (5) error of measurement, (6) class imbalance magnitude, (7) missing data pattern, and (8) feature selection cutoff. The findings are consistent with literature about which data properties and algorithms perform best. Measurement error negatively impacted pipeline performance across all factors, DMMs, and feature selection techniques. A larger number of training observations ameliorated the decrease in predictive efficacy resulting from measurement error, different class imbalance magnitudes, missing data patterns, and feature selection cutoffs. GLMNET significantly outperformed all other DMMs, while CAE and FSA enhanced the performance of LR and GLMNET. A consensus ranking methodology integrating feature selection with cross-validation is presented. As an application, the data pipeline was used to forecast the performance of 3,225 students enrolled in a collegiate biology course using a corpus of 57 university and course-specific features at four time points (pre-course, weeks 3, 6, and 9). Borda's method applied during cross-validation identified collegiate academic attributes and performance on concept inventory assessments as the primary features impacting student success. Performance variability of the pipeline was generally consistent with the results of the simulation studies. GLMNET exhibited the highest predictive efficacy with the least amount of variability in the area under the curve (AUC) metric.
However, increasing the number of training observations did not always significantly enhance pipeline performance. The benefits of developing interpretable data pipelines are also discussed.
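A minimal sketch of the pipeline pattern the thesis describes, assuming scikit-learn: an ANOVA filter stands in for the CAE/FSA/IG filters, elastic-net logistic regression stands in for GLMNET, and the synthetic data is a stand-in for the student-feature corpus. Placing the filter inside the pipeline keeps feature selection within each cross-validation fold.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline

# Synthetic stand-in for the 57-feature student corpus.
X, y = make_classification(n_samples=500, n_features=57,
                           n_informative=8, random_state=0)

# Filter feature selection + elastic-net learner, both refit per fold
# so the selection step is cross-validated along with the model.
pipe = Pipeline([
    ("select", SelectKBest(f_classif, k=10)),
    ("clf", LogisticRegression(penalty="elasticnet", solver="saga",
                               l1_ratio=0.5, C=1.0, max_iter=5000)),
])
scores = cross_val_score(pipe, X, y, cv=5, scoring="roc_auc")
print(f"mean AUC: {scores.mean():.3f}")
```

Fitting the selector outside the cross-validation loop would leak information from the held-out folds and inflate the AUC, which is one motivation for integrating selection with cross-validation as the thesis does.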
... Their results show that models predicting performance using generic features of engagement lack sufficient explanatory power, whereas adding course-specific features of engagement significantly improves prediction. Other LA studies similarly provided evidence of the relevance of such context-related factors observed externally, such as course design (Conijn, Snijders, Kleingeld, & Matzat, 2016; Marras, Vignoud, & Käser, 2021), as well as factors internal to the learners, such as learner states (Jovanović, Saqr, Joksimović, & Gašević, 2021). The information about the context can differ, ranging from socially shared spaces to internal learner states, to learner similarity in developmental processes, or navigational paths that reflect the journey of an individual within a learning resource. ...
... For example, is context anything that remains unrepresented in trace data (e.g., Conijn et al., 2016; Marras et al., 2021)? Or is context a latent variable that impacts upon a student's outcomes (e.g. ...
Article
The ability to develop new skills and competencies is a central concept of lifelong learning. Research to date has largely focused on the processes and support individuals require to engage in upskilling, re-learning or training. However, there has been limited attention examining the types of support that are necessary to assist a learner’s transition from “old” workplace contexts to “new”. Professionals often undergo significant restructuring of their knowledge, skills, and identities as they transition between career roles, industries, and sectors. Domains such as learning analytics (LA) have the potential to support learners as they use the analysis of fine-grained data collected from education technologies. However, we argue that to support transitions throughout lifelong learning, LA needs fundamentally new analytical and methodological approaches. To enable insights, research needs to capture and explain variability, dynamics, and causal interactions between different levels of individual development, at varying time scales. Scholarly conceptions of the context in which transitions occur are also required. Our interdisciplinary argument builds on the synthesis of literature about transitions in the range of disciplinary and thematic domains such as conceptual change, shifts between educational systems, and changing roles during life course. We highlight specific areas in research designs and current analytical methods that hinder insight into transformational changes during transitions. The paper concludes with starting points and frameworks that can advance research in this area.
... The annual citations of the other five articles are not less than 3 (Agudo- Peregrina et al., 2014;Conijn et al., 2016;Rienties & Toetenel, 2016;Xing et al., 2016;You, 2016). Specifically, Agudo-Peregrina et al. (2014) explored the relations between interactions and academic performance in a virtual learning environment (VLE)-based course using multiple linear regression methods. ...
... Such results confirmed the importance of SRL in online education and suggested that meaningful learning behaviors were advantageous for course achievement prediction. Conijn et al. (2016) compared the portability of learner performance prediction models using correlation analysis and hierarchical regression analysis based on LMS data from 17 blended courses. The results showed that there is no comprehensive set of variables for learner performance prediction. ...
Article
Full-text available
Learning analytics (LA) has become an increasingly active field focusing on leveraging learning process data to understand and improve teaching and learning. With the explosive growth in the number of studies concerning LA, it is significant to investigate its research status and trends, particularly the thematic structure. Based on 3900 LA articles published during the past decade, this study explores answers to questions such as “what research topics were the LA community interested in?” and “how did such research topics evolve?” by adopting structural topic modeling and bibliometrics. Major publication sources, countries/regions, institutions, and scientific collaborations were examined and visualized. Based on the analyses, we present suggestions for future LA research and discussions about important topics in the field. It is worth highlighting LA combining various innovative technologies (e.g., visual dashboards, neural networks, multimodal technologies, and open learner models) to support classroom orchestration, personalized recommendation/feedback, self-regulated learning in flipped classrooms, interaction in game-based and social learning. This work is useful in providing an overview of LA research, revealing the trends in LA practices, and suggesting future research directions.
... To achieve the study objectives, a structured questionnaire was used, which contains: 1. Questionnaire Background: the questionnaire was built based on studies found in the literature, such as Mahdi (2014), Gabriel et al. (2020), Nistor et al. (2019), Conijn et al. (2017), Natarajan et al. (2018), and Ahmad et al. (2020). 2. Questionnaire Description: an online survey questionnaire was developed to obtain responses from undergraduate students at Al-Aqsa University. ...
... Various previous studies have shown the effectiveness of Moodle in education, where it has achieved acceptance and use as a technological product among most researchers, based on the opinions of teachers and students in higher education (Conijn et al., 2017; Nistor, Stanciu, Lerche, & Kiel, 2019; De Medio et al., 2020; Mahdi & Hammad, 2020). In this context of educational research, the technology acceptance literature presents a rich collection of models and theories for explaining the adoption of information technology innovations. ...
Conference Paper
Full-text available
International Journal of Youth Economy- Volume 5 - Issue 2 (2021)
... Various previous studies have shown the effectiveness of Moodle in education, where it has achieved acceptance and use as a technological product among most researchers, based on the opinions of teachers and students in higher education (Conijn et al., 2017; Nistor, Stanciu, Lerche, & Kiel, 2019; De Medio et al., 2020; Mahdi & Hammad, 2020). In this context of educational research, the technology acceptance literature presents a rich collection of models and theories for explaining the adoption of information technology innovations. ...
... To achieve the study objectives, a structured questionnaire was used, which contains: 1. Questionnaire Background: the questionnaire was built based on studies found in the literature, such as Mahdi (2014), Gabriel et al. (2020), Nistor et al. (2019), Conijn et al. (2017), Natarajan et al. (2018), and Ahmad et al. (2020). 2. Questionnaire Description: an online survey questionnaire was developed to obtain responses from undergraduate students at Al-Aqsa University. ...
Article
Full-text available
After COVID-19, Al-Aqsa University in Palestine has been widely using online learning to manage university education. To encourage the increased adoption of online learning among students, this study investigated the factors that influenced existing Al-Aqsa University students' continuance intention to use online learning. The Technology Acceptance Model (TAM) was used as the theoretical basis. A model using structural equation modelling was developed for Al-Aqsa University students' continuance intention to use online learning. Survey-based data were collected at Al-Aqsa University from 265 students. This study found that all five constructs, namely perceived usefulness (β = 0.171), perceived ease of use (β = 0.779), perceived irreplaceability (β = 0.168), perceived credibility (β = 0.688), and compatibility (β = 0.192), had a positive influence on Al-Aqsa University students' continuance intention to use online learning.
... They found that the naïve Bayes classifier model and an ensemble model using a sequence of models had the best results among the seven modeling methods tested. Conijn et al. [11] analyzed the relevant literature and summarized predictor variables for performance prediction. Based on log data from the Moodle platform, they extracted predictive variables such as page viewing, resource viewing, quiz, assignment, wiki, and forum discussion to predict final exam grades using a multiple linear regression model. ...
... However, the max time interval of quizzes not submitted (NQT) shows a certain correlation with academic performance (r > 0.3), indicating that the learning inactivity measured by this feature can also affect the learner's course learning. In general, the results show that it may be useful to compare students not only with their peers, but also with their own behavior in other courses [11]. According to the course information statistics, 1011 learners took the two courses C4 and C10 at the same time, making it possible to explore whether the same behaviors relate differently to academic performance in the two courses. ...
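The log-to-predictor workflow described in these snippets (counting LMS activity types per student, then regressing grades on those counts) can be sketched as follows. The log rows, event names, and grades below are hypothetical, and with so few students the fit is purely illustrative.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# Hypothetical Moodle log extract: one row per click event.
logs = pd.DataFrame({
    "student": ["a", "a", "a", "b", "b", "c", "c", "c", "c"],
    "event":   ["course_view", "quiz", "forum", "course_view",
                "quiz", "course_view", "quiz", "forum", "wiki"],
})
grades = pd.Series({"a": 7.5, "b": 5.0, "c": 8.0}, name="grade")

# Predictor variables of the kind used in the literature:
# counts of each activity type per student.
features = pd.crosstab(logs["student"], logs["event"])
X = features.reindex(columns=["course_view", "quiz", "forum", "wiki"],
                     fill_value=0)

# Multiple linear regression of final grade on activity counts.
model = LinearRegression().fit(X, grades.loc[X.index])
print(dict(zip(X.columns, model.coef_.round(2))))
```

In a real analysis each course yields its own coefficient vector; comparing those vectors across courses is precisely the portability question raised by Conijn et al.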
Article
Full-text available
In recent years, massive open online courses (MOOCs) have received widespread attention owing to their flexibility and free access, which has attracted millions of online learners to participate in courses. With the wide application of MOOCs in educational institutions, a large amount of learners' log data exist in the MOOCs platform, and this lays a solid data foundation for exploring learners' online learning behaviors. Using data mining techniques to process these log data and then analyze the relationship between learner behavior and academic performance has become a hot topic of research. Firstly, this paper summarizes the commonly used predictive models in the relevant research fields. Based on the behavior log data of learners participating in 12 courses in MOOCs, an entropy-based indicator quantifying behavior change trends is proposed, which explores the relationships between behavior change trends and learners' academic performance. Next, we build a set of behavioral features, which further analyze the relationships between behaviors and academic performance. The results demonstrate that entropy has a certain correlation with the corresponding behavior, which can effectively represent the change trends of behavior. Finally, to verify the effectiveness and importance of the predictive features, we choose four benchmark models to predict learners' academic performance and compare them with the previous relevant research results. The results show that the proposed feature selection-based model can effectively identify the key features and obtain good prediction performance. Furthermore, our prediction results are better than the related studies in the performance prediction based on the same Xuetang MOOC platform, which demonstrates that the combination of the selected learner-related features (behavioral features + behavior entropy) can lead to a much better prediction performance.
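The entropy-based indicator this abstract proposes can be illustrated with Shannon entropy over a student's distribution of event types. The function and the two log sequences below are a hypothetical sketch, not the paper's exact indicator or data: higher entropy means behavior spread evenly over many activity types, lower entropy means behavior concentrated on a few.

```python
import math
from collections import Counter

def behavior_entropy(events):
    """Shannon entropy (bits) of a student's event-type distribution."""
    counts = Counter(events)
    n = len(events)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

# Illustrative click-log sequences (event types), not real MOOC data.
focused = ["video"] * 9 + ["quiz"]
varied = ["video", "quiz", "forum", "wiki", "video",
          "quiz", "forum", "wiki", "video", "quiz"]

print(round(behavior_entropy(focused), 3))  # low entropy
print(round(behavior_entropy(varied), 3))   # higher entropy
```

Computed per time window, changes in this quantity give a simple trend signal of the kind the paper pairs with behavioral features for performance prediction.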
... The generalization of student performance models covering blended courses was researched using two attributes: students' study habits and social interactions (Gitinabard et al., 2019). Another study was conducted to test the portability of models based on LMS variables across various courses (Conijn et al., 2016), as shown in Table 1. In the traditional educational system, studies were conducted on students of different universities to understand the generalization of the learner model to different populations (Nghe et al., 2007). ...
... Formative assessment data combined with past summative scores are better predictors of student performance. Romero et al. (2013) used students' online discussion forum activities (discussion board logs) to predict student performance; early predictions had lower accuracy, whereas clustering and association rules achieved better prediction accuracy. Conijn et al. (2016) implemented a weekly prediction of a student's success rate in a course, the student's performance in that course, and the student's final grade in that course. ...
Article
Full-text available
In recent times, Educational Data Mining and Learning Analytics have been abundantly used to model decision-making to improve teaching/learning ecosystems. However, the adaptation of student models in different domains/courses needs a balance between generalization and context specificity to reduce the redundancy in creating domain-specific models. This paper explores the predictive power and generalization of a feature (the context-bound cognitive skill score) in estimating the likelihood of success or failure of a student in a traditional higher education course, so that the appropriate intervention is provided to help the students. To identify the students at risk in different courses, we applied classification algorithms to context-bound cognitive skill scores of a student to estimate the chances of success or failure, especially failure. The context-bound cognitive skill scores were aggregated based on the learning objective of a course to generate meaningful visual feedback to teachers and students so that they can understand why some students are predicted to be at risk. Evaluation of the generated model shows that this feature is applicable in a range of courses, and it mitigates the effort in engineering features/models for each domain. We submit that, overall, context-bound cognitive skill scores prove to be effective in flagging student performance when accurate metrics related to learning activities and social behaviors of the students are unavailable.
... Scopus was used for the citation count unless the article was only available in WOS; then, the WOS citation count per article was used. The 155 journal articles reviewed in this study have a combined citation count of 608 with the most cited (71 times) being a review article comparing 17 blended courses using Moodle LMS (Conijn et al., 2017). Total citation counts of the articles by published year were 95 in 2015, 92 in 2016, 270 in 2017, 83 in 2018, 50 in 2019, and 21 in 2020. ...
... Of the top 10 cited articles (listed in Table 3), five articles were published in 2017, accounting for 198 citations of the total 270 for that year, with the remaining 72 citations across 24 papers. Of the top 10 authors, four are attributed to the top-cited paper (Conijn et al., 2017). All the top 10 cited authors have articles in the top 10 cited list (Table 4). ...
Article
Full-text available
Background The Moodle Learning Management System (LMS) is widely used in online teaching and learning, especially in STEM education. However, educational research on using Moodle is scattered throughout the literature. Therefore, this review aims to summarise this research to assist three sets of stakeholders—educators, researchers, and software developers. It identifies: (a) how and where Moodle has been adopted; (b) what the concerns, trends, and gaps are to lead future research and software development; and (c) innovative and effective methods for improving online teaching and learning. The review used the 4-step PRISMA-P process to identify 155 suitable journal articles from 104 journals in 55 countries published from January 2015 to June 2021. The database search was conducted with Scopus and Web of Science. Insights into the educational use of Moodle were determined through bibliometric analysis with Vosviewer outputs and thematic analysis. Results This review shows that Moodle is mainly used within University STEM disciplines and effectively improves student performance, satisfaction, and engagement. Moodle is increasingly being used as a platform for adaptive and collaborative learning and used to improve online assessments. The use of Moodle is developing rapidly to address academic integrity, ethics, and security issues to enhance speed and navigation, and incorporate artificial intelligence. Conclusion More qualitative research is required on the use of Moodle, particularly investigating educators’ perspectives. Further research is also needed on the use of Moodle in non-STEM and non-tertiary disciplines. Further studies need to incorporate educational theories when designing courses using the Moodle platform.
... The researchers used two machine learning models to be trained on three coherent clusters of students who were grouped based on the similarity of specific education-related factors and metrics in order to predict the time to degree completion and student enrollment in the offered educational programs. Moreover, the authors in [18] obtained students' educational data from the e-learning system at Eindhoven University of Technology (TU/e), the Netherlands, for 2014/2015. This dataset contains 4,989 students from 17 courses. ...
Article
Full-text available
Recently, the COVID-19 pandemic has triggered different behaviors in education, especially during the lockdown, to contain the virus outbreak in the world. As a result, educational institutions worldwide are currently using online learning platforms to maintain their education presence. This research paper introduces and examines a dataset, E-LearningDJUST, that represents a sample of the students' study progress during the pandemic at Jordan University of Science and Technology (JUST). The dataset depicts a sample of the university's students as it includes 9,246 students from 11 faculties taking four courses in the spring 2020, summer 2020, and fall 2021 semesters. To the best of our knowledge, it is the first collected dataset that reflects students' study progress within a Jordanian institute using e-learning system records. One of this work's key findings is a high correlation between e-learning events and the final grades out of 100. Thus, the E-LearningDJUST dataset has been experimented with using two robust machine learning models (Random Forest and XGBoost) and one simple deep learning model (a feed-forward neural network) to predict students' performance. Using RMSE as the primary evaluation criterion, the RMSE values range between 7 and 17. Among the other main findings, applying feature selection with the random forest leads to better prediction results for all courses, as the RMSE difference ranges between 0 and 0.20. Finally, a comparison study examined students' grades before and after the coronavirus pandemic to understand how it impacted their grades. A higher success rate was observed during the pandemic compared to before, which is expected because the exams were online. However, the proportion of students with high marks remained similar to that of pre-pandemic courses.
... In this study, students reported that the gamified online discussion environment was fun and interesting compared to the traditional online discussion environment. Conijn et al. (2017) conducted a study on "Predicting student performance from LMS data: A comparison of 17 blended courses using Moodle LMS". That study analyzed 17 blended courses attended by 4,989 students at a single institution using the Moodle LMS. ...
Article
Full-text available
This study was used to determine the feasibility of the instrument before its use in further research. This study uses a quasi-experimental research design. To calculate the validation results from three validators who are experts in their fields, the arithmetic average was used; the validated instruments are: (1) the validation of the questions scored 87.51%; (2) the validation of the attitude assessment scored 90.27%; (3) the validation of the student skills assessment scored 92.71%; (4) the validation of the assessment of the Learning Implementation Plan (RPP) scored 92.16%; (5) the validation of the syllabus assessment scored 86.95%; (6) the validation of the Moodle-with-browser-exam assessment scored 87.25%; (7) the validation of the student learning module assessment scored 84.71%; (8) the validity of the items was declared valid; (9) the reliability of the items was declared reliable; (10) the discriminating power of the items met very good criteria; and (11) of the 40 items tested for difficulty level, 35% (14 questions) were in the medium category and 65% (26 questions) in the easy category. From these calculation results, the Moodle application instrument with a browser exam to measure online learning outcomes is declared feasible and can be used for further research.
... In [8], the research analyzed 17 blended courses with 4,989 students on the Moodle LMS, using logistic models (pass/fail) and standard regression (final grade) to predict student performance over the first 10 weeks. ...
... Table-based syllabus environments are most popular for problem-based teaching and training [35] and problem-solving situations [36], which aim to capture students' collaborative study actions and physical interactions [37]; teaching and training resources, student-teacher computational practices, and interaction results are compiled in list and table form [38]. The learning management system is a popular method used as a learning resource by educational organizations, and it is widely adopted to support teaching-learning activities [39], to gather students' online behaviors [40], and to predict academic performance and outcomes [41]. These software applications provide a browser-based environment for assessing course contents against objectives, allowing students to upload and download course contents and submit continuous assessments and assignments through weblinks that are dynamically added to or removed from the web applications based on resource integration and availability [42]. ...
... Additionally, it is interesting to note that some learners affirm that they invest more time and effort in their online assignments; although it could be assumed that such students would achieve more academically (Cerezo et al., 2016; Conijn et al., 2017; Joksimović et al., 2015; Motz et al., 2019), a study carried out by Motz et al. (2021) found the opposite. Students who invested more effort in their tasks earned lower grades and felt less successful than when doing their schoolwork in normal circumstances. ...
Article
Full-text available
The Coronavirus disease (COVID-19) pandemic changed education conditions worldwide, forcing all the parties involved to adapt to a new system. This study aimed to collect information related to the effects of teaching English online on English as a Foreign Language (EFL) students' achievement. Data were collected from EFL teachers and students enrolled in three different Ecuadorian universities (Technical University of Ambato, Higher Polytechnic School of Chimborazo, and University of Cuenca) at five different levels: A1, A2, B1, B1+, and B2. This preliminary paper reports the results of 480 students regarding four major sections: pedagogical practice and assessment, learning outcomes, affective factors, and students' perceptions of the advantages and disadvantages of online learning during the COVID-19 pandemic, considering Justin Shewell's hierarchy of online learning needs. An online survey questionnaire with 17 questions and a 5-point Likert scale was applied. The Cronbach's alpha test showed reliability levels of 0.84 and 0.73. The Kolmogorov-Smirnov statistic, the Kendall's tau-b test, and Levene's test for homogeneity of variances were performed with the SPSS statistical program. The results made evident that online learning affected academic achievement in EFL students during the COVID-19 pandemic, which was confirmed after analyzing four main areas: pedagogical practices and assessment, learning outcomes, affective factors, and students' perceptions of the advantages and disadvantages of online learning. The importance of online learning was highlighted since it has been understood as a tool to face the emergency produced by the COVID-19 pandemic.
... To measure the portability of the prediction models, ordinary least squares (OLS) regressions were used (Conijn et al., 2017); these regressions were run to determine the effects of the predictors. ...
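A small numpy sketch of this portability check: fit OLS on one synthetic "course" and score the same coefficients on another. The data and coefficients below are invented for illustration, not drawn from any of the studies cited here.

```python
import numpy as np

rng = np.random.default_rng(42)

def fit_ols(X, y):
    # Add an intercept column and solve ordinary least squares.
    Xb = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(Xb, y, rcond=None)
    return beta

def r2(X, y, beta):
    # Simple variance-ratio R^2 of the model (X, y, beta).
    Xb = np.column_stack([np.ones(len(X)), X])
    resid = y - Xb @ beta
    return 1 - resid.var() / y.var()

# Two synthetic "courses" whose invented LMS predictors relate to
# grades with different coefficients, mimicking low portability.
X_a = rng.normal(size=(200, 3))
y_a = X_a @ [0.6, 0.3, 0.1] + rng.normal(0, 1, 200)
X_b = rng.normal(size=(200, 3))
y_b = X_b @ [0.1, 0.5, -0.2] + rng.normal(0, 1, 200)

# A model fitted on course A explains much less variance on course B.
beta_a = fit_ols(X_a, y_a)
print(f"in-course R2:    {r2(X_a, y_a, beta_a):.2f}")
print(f"cross-course R2: {r2(X_b, y_b, beta_a):.2f}")
```

The gap between the in-course and cross-course R² values is the kind of evidence behind the low-portability finding reported for prediction models across courses.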
Article
Full-text available
Weather prediction is one of the challenging issues around the world. It is necessary for determining the effective use of water resources and for forecasting weather-related disasters. Emerging machine learning techniques are coupled with large weather datasets to forecast the weather. Rainfall depends on many weather attributes, and a dataset may contain both relevant and irrelevant attributes. In this paper, two supervised learning algorithms are proposed to forecast the weather. In the first method, selected features are fed into a multiple linear regression model for training; prediction is then performed with a good accuracy of 82%. In the second method, to reduce the error rate of the deep learning algorithm, the cyclical features are encoded before applying the algorithm; tuning the hyperparameters of the n-hidden-layer networks then improved the performance of the model to a good accuracy of 92.32%.
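The cyclical-feature encoding mentioned in this abstract is commonly done by projecting each cyclical value onto the unit circle with sine and cosine, so that period boundaries (December/January, 359°/0°) end up adjacent. The function and values below are an illustrative sketch of that standard trick, not the paper's exact preprocessing.

```python
import math

def encode_cyclical(value, period):
    """Map a cyclical feature (e.g., month 1-12, wind direction 0-360)
    onto the unit circle so that period boundaries are adjacent."""
    angle = 2 * math.pi * value / period
    return math.sin(angle), math.cos(angle)

# December (12) and January (1) end up close together on the circle,
# unlike their raw integer codes (12 vs. 1).
dec = encode_cyclical(12, 12)
jan = encode_cyclical(1, 12)
jun = encode_cyclical(6, 12)

def dist(a, b):
    return math.dist(a, b)

print(dist(dec, jan) < dist(dec, jun))  # True
```

Without this encoding, a network treats December and January as maximally far apart, which is one source of the error rate the abstract says the encoding reduces.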
... There may be predictors that are used, and whose values are computed, only in those models where their presence is relevant (related to activities before the time of prediction) (Đambić, Mirjana and Daniel, 2016; Erkan, 2012), or cross-course predictors kept in each model at each time of prediction (Conijn, Snijders, Kleingeld and Matzat, 2016; Kondo, Okubo and Hatanaka, 2017). The goodness of these cross-course predictor models may be adversely affected by the small number of relevant predictors at a given time instant. Predictors that relate to activities due later in the course hold little information (are irrelevant) at a specific time instant of prediction and have values of zero in many cases. ...
Article
Full-text available
The relation between an educational target and a set of predictors related to the learners and their learning activities in a given learning context can be investigated by predictive Machine Learning (ML) modelling. For courses with many students and with predictors that reflect the specialties of the courses well, the predictive power of "classical" ML models generally meets expectations. At the same time, even large universities have several courses where the small number of students does not allow the use of "classical" ML models, although the need to forecast student performance also arises for these courses. In this study, based on research on the Applied Statistics course with the participation of 56 full-time students at the University of Dunaújváros, we present various model-building techniques that can be used to increase the predictive power of models. We systematically present different model-building techniques, progressing from less effective techniques to more developed ones. These developed techniques are applicable even to small or mid-sized university courses, producing monotonically increasing performance metrics over time. The conditions and limits of their applicability are also discussed.
... Researchers focus on the LMS recorded behavioural data for prediction, based on the assumption that records in the LMS can represent certain behaviours or traits of the user. These behaviours or traits are associated with their academic performance (Conijn et al., 2016;Dominguez et al., 2016;Shruthi and Chaitra, 2016;Adejo and Connolly, 2018;Helal et al., 2018;Sandoval et al., 2018;Akçapınar et al., 2019;Liao et al., 2019;Sukhbaatar et al., 2019;Mubarak et al., 2020b;Waheed et al., 2020). Different studies are concerned with different issues. ...
Article
Full-text available
Anomalies in education affect the personal careers of students and universities' retention rates. Understanding the laws behind educational anomalies promotes the development of individual students and improves the overall quality of education. However, the inaccessibility of educational data hinders the development of the field. Previous research in this field used questionnaires, which are time-consuming, costly, and hardly applicable to large-scale student cohorts. With the popularity of educational management systems and the rise of online education during the prevalence of COVID-19, a large amount of educational data is available online and offline, providing an unprecedented opportunity to explore educational anomalies from a data-driven perspective. As an emerging field, educational anomaly analytics rapidly attracts scholars from a variety of fields, including education, psychology, sociology, and computer science. This paper intends to provide a comprehensive review of data-driven analytics of educational anomalies from a methodological standpoint. We focus on the following five types of research that received the most attention: course failure prediction, dropout prediction, mental health problem detection, prediction of difficulty in graduation, and prediction of difficulty in employment. Then, we discuss the challenges of current related research. This study aims to provide references for educational policymaking while promoting the development of educational anomaly analytics as a growing field.
... To engage students in a learning environment, teachers are using more IT tools to encourage students in a classroom, such as polling tools, Google documents, and Kahoot (Zacharis, 2015). Existing studies of academic performance have omitted Learning Management System (LMS) data that are incomplete (Conijn et al., 2017; Macfadyen & Dawson, 2010). Some of the overlapping and complementary parts among the three theories can be captured using computer-based learning. ...
Article
Full-text available
Educators in higher education institutes often use statistical results obtained from their online Learning Management System (LMS) dataset, which has limitations, to evaluate student academic performance. This study differs from the current body of literature by including an additional dataset that advances the knowledge about factors affecting student academic performance. The key aims of this study are fourfold. First, is to fill the educational literature gap by applying machine learning techniques in educational data mining, making use of the Internet usage behaviour log files and LMS data. Second, LMS data and Internet usage log files were analysed with machine learning techniques for predicting at-risk-of-failure students, with greater explanation added by combining student demographic data. Third, the demographic features help to explain the prediction in understandable terms for educators. Fourth, the study used a range of Internet usage data, which were categorized according to type of usage data and type of web browsing data to increase prediction accuracy.
... The principles for the quality of students have not been studied so far, and the perceptions of all stakeholders, namely industry, faculty, and students, as well as the requirements for the quality of faculty, were also completely ignored. As the higher education system is undergoing a colossal change, with the privatization and globalization of education, this study will aid the development of the system by bringing in a socially relevant tool [30] and suggestions to the policymakers, which will enhance the quality of higher education institutions in India. ...
Article
Full-text available
EDM and LA are two fields that study how to use data to achieve better academic learning and enhance students’ overall performance. Both areas are concerned with a broad range of issues such as curriculum strategies, coaching, the mental well-being of students, learning motivation, and academic achievement. The COVID-19 pandemic highly disrupted the higher education sector and shifted the old chalk-and-talk teaching-learning model to an online learning format. This meant that the structure and nature of teaching, learning, assessment, and feedback methodologies also changed. With advances in technology, timely and effective feedback is provided by teachers to achieve greater learning. Through these studies, it is noted that negative feedback discourages the effort and achievement of learners, so it should be carefully crafted and delivered. In this work, a new methodology is proposed based on an improved FCN (fully connected network). The key objective of the proposed method is to assess the quality of students in higher education (HE). The proposed methodology is composed of different phases. The first phase is data acquisition, in which the data are gathered from various sources for training and testing of the proposed method. The second phase is data orientation, in which the information is arranged in a specific file format. After that, the data are cleaned, and preprocessing methods are applied. In the fourth phase, a machine learning-based model is developed to predict students’ academic performance. The fully connected neural network is enhanced with LA to better assess student quality in higher education. The proposed work is evaluated on the OULAD database, which was gathered from the students of the Open University. The proposed methodology attained an accuracy of 84%, significantly higher than that of the conventional ANN model. The proposed methodology’s recall, F1-score, and precision rates are 0.88, 0.91, and 0.93, respectively.
... Overall, these contradictory assumptions on the usefulness of broad log data indicators go along with inconsistent findings on the association between those indicators of general online activity and learning outcomes: some studies reported no association (Broadbent, 2016), negative correlations (e.g., Ransdell & Gaillard-Kenney, 2009;Strang, 2016), or positive correlations (e.g., Liu & Feng, 2011;McCuaig & Baldwin, 2012;Saqr et al., 2017). Other studies that examined various online courses simultaneously obtained mixed results across different courses (e.g., Conijn et al., 2017;Gašević et al., 2016), indicating that online courses might be too heterogeneous to draw a general conclusion about the link between general online activity and learning outcomes. ...
Article
Full-text available
Analyzing log data from digital learning environments provides information about online learning. However, it remains unclear how this information can be transferred to psychologically meaningful variables or how it is linked to learning outcomes. The present study summarizes findings on correlations between general online activity and learning outcomes in university settings. The course format, instructions to engage in online discussions, requirements, operationalization of general online activity, and publication year are considered moderators. A multi-source search provided 41 studies (N = 28,986) reporting 69 independent samples and 104 effect sizes. The three-level random-effects meta-analysis identified a pooled effect of r = .25, p = .003, 95% CI [.09, .41], indicating that students who are more active online have better grades. Despite high heterogeneity, Q(103) = 3,960.04, p < .001, moderator analyses showed no statistically significant effect. We discuss further potential influencing factors in online courses and highlight the potential of learning analytics.
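Pooling correlations across studies, as in this meta-analysis, is typically done on the Fisher z scale. The sketch below uses a simple fixed-effect weighting by n − 3 for transparency; the cited study actually fits a three-level random-effects model, and the study correlations and sample sizes here are invented:

```python
import math

def pool_correlations(rs, ns):
    """Fixed-effect pooling of correlations via Fisher's z transform,
    weighting each study by n - 3 (a simplification of the cited
    study's three-level random-effects model)."""
    zs = [0.5 * math.log((1 + r) / (1 - r)) for r in rs]  # Fisher z
    weights = [n - 3 for n in ns]
    z_bar = sum(w * z for w, z in zip(weights, zs)) / sum(weights)
    # back-transform the weighted mean z to a correlation
    return (math.exp(2 * z_bar) - 1) / (math.exp(2 * z_bar) + 1)

# Hypothetical study results: three correlations with their sample sizes
r_pooled = pool_correlations([0.10, 0.25, 0.40], [50, 100, 200])
```

Because the weights grow with sample size, the pooled estimate sits closer to the correlations reported by the larger studies.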
... Conijn et al. [20] analyzed 17 blended courses with 4989 students using Moodle LMS. Their objective was to predict students' final grades from LMS predictor variables and from in-between assessment grades, using logistic (pass/fail) and standard regression (final grade) models. ...
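The logistic (pass/fail) side of the modeling described above can be sketched as a toy single-predictor logistic regression fitted by plain gradient descent. The login counts, pass labels, and fitting details are illustrative assumptions, not the paper's actual pipeline:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, lr=0.1, epochs=2000):
    """Single-predictor logistic regression by batch gradient descent.
    ys are 1 (pass) / 0 (fail); xs is a hypothetical LMS activity count."""
    b0, b1 = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        g0 = sum(sigmoid(b0 + b1 * x) - y for x, y in zip(xs, ys)) / n
        g1 = sum((sigmoid(b0 + b1 * x) - y) * x for x, y in zip(xs, ys)) / n
        b0 -= lr * g0
        b1 -= lr * g1
    return b0, b1

# Hypothetical data: students with more logins tend to pass.
logins = [1, 2, 3, 8, 9, 10]
passed = [0, 0, 0, 1, 1, 1]
b0, b1 = fit_logistic(logins, passed)
p_low = sigmoid(b0 + b1 * 2)   # predicted pass probability at 2 logins
p_high = sigmoid(b0 + b1 * 9)  # predicted pass probability at 9 logins
```

The fitted model outputs a pass probability rather than a grade, which is why the paper pairs it with a standard regression for the final-grade outcome.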
Article
Full-text available
Educational data mining is a process that aims at discovering patterns that provide insight into teaching and learning processes. This work uses Machine Learning techniques to create a student performance prediction model, using academic data and records from a Learning Management System, that correlates with success or failure in completing the course. Six algorithms were employed, with models trained at three different stages of their two-year course completion. We tested the models with records of 394 students from 3 courses. Random Forest provided the best results, with an F1 score of 84.47% in our experiments, followed by Decision Tree, which obtained similar results in the first subjects. We also employ clustering techniques and find different behavior groups with a strong correlation to performance. This work contributes to predicting students at risk of dropping out, offers insight into understanding student behavior, and provides a support mechanism for academic managers to take corrective and preventive actions on this problem.
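The F1 score this abstract reports is the harmonic mean of precision and recall for the positive class. A minimal sketch with toy at-risk labels (the labels below are invented, not the study's data):

```python
def f1_score(y_true, y_pred, positive=1):
    """F1 = harmonic mean of precision and recall for the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Toy example: 1 = "at risk of dropping out"
truth = [1, 1, 1, 0, 0, 0]
preds = [1, 1, 0, 1, 0, 0]
score = f1_score(truth, preds)  # precision 2/3, recall 2/3 -> F1 = 2/3
```

Unlike plain accuracy, F1 is informative when at-risk students are a minority class, which is the usual situation in dropout prediction.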
... As with other LMS tools, Moodle collects a large amount of data including all the interactions of registered participants (students, teachers, and editors, among others). All these data can offer insight into the online behavior of students, improving both learning and teaching [3]. ...
Article
Full-text available
An inherent requirement of teaching using online learning platforms is that the teacher must analyze student activity and performance in relation to course learning objectives. Therefore, all e-learning environments implement a module to collect such information. Nevertheless, these raw data must be processed to perform e-learning analysis and to help teachers arrive at relevant decisions for the teaching process. In this paper, UBUMonitor is presented, an open-source desktop application that downloads Moodle (Modular Object-Oriented Dynamic Learning Environment) platform data, so that student activity and performance can be monitored. The application organizes and summarizes these data in various customizable charts for visual analysis. The general features and uses of UBUMonitor are described, as are some approaches to e-teaching improvements, through real case studies. These include the analysis of accesses per e-learning object, statistical analysis of grading e-activities, detection of e-learning object configuration errors, checking of teacher activity, and comparisons between online and blended learning profiles. As an open-source application, UBUMonitor was institutionally adopted as an official tool and validated with several groups of teachers at the Teacher Training Institute of the University of Burgos.
... Conijn et al. furthered the analysis of the portability of prediction models and the accuracy of timely prediction [11]. ...
Preprint
Full-text available
Learning management systems (LMSs) have become essential in higher education and play an important role in helping educational institutions to promote student success. Traditionally, LMSs have been used by postsecondary institutions in administration, reporting, and delivery of educational content. In this paper, we present an additional use of LMS by using its data logs to perform data-analytics and identify academically at-risk students. The data-driven insights would allow educational institutions and educators to develop and implement pedagogical interventions targeting academically at-risk students. We used anonymized data logs created by Brightspace LMS during fall 2019, spring 2020, and fall 2020 semesters at our college. Supervised machine learning algorithms were used to predict the final course performance of students, and several algorithms were found to perform well with accuracy above 90%. SHAP value method was used to assess the relative importance of features used in the predictive models. Unsupervised learning was also used to group students into different clusters based on the similarities in their interaction/involvement with LMS. In both of supervised and unsupervised learning, we identified two most-important features (Number_Of_Assignment_Submissions and Content_Completed). More importantly, our study lays a foundation and provides a framework for developing a real-time data analytics metric that may be incorporated into a LMS.
Conference Paper
Full-text available
Commonly, time limits are a necessary part of every exam, and they may introduce an unintended influence such as speededness on the item and ability parameter estimates as they have not been accounted for while modeling the latent ability. The changepoint analysis (CPA) method can be used to obtain more accurate parameters by detecting speeded examinees, the location of change-point, removing speeded responses, and reestimating parameters. In the current study, several examinees were detected as speeded across five sections of a 250-item exam, and two main patterns were observed. In addition, speededness was further investigated using response times (RTs) per item and two patterns were observed for examinees with a decrease in performance after the estimated changepoint. Recommendations for practitioners, limitations, and future research were discussed in the conclusion section.
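The changepoint analysis (CPA) idea in this abstract can be illustrated with a generic mean-shift detector that picks the split minimizing total within-segment squared error. This is a simplification: the cited study works with IRT-based person-fit statistics, and the response times below are invented:

```python
def best_changepoint(xs):
    """Return the index k (1 <= k < len(xs)) splitting xs into two
    segments whose total within-segment squared error is minimal --
    a generic mean-shift sketch of changepoint analysis, not the
    IRT-based procedure of the cited study."""
    def sse(seg):
        m = sum(seg) / len(seg)
        return sum((x - m) ** 2 for x in seg)
    return min(range(1, len(xs)), key=lambda k: sse(xs[:k]) + sse(xs[k:]))

# Hypothetical per-item response times (seconds): the sharp drop suggests
# the examinee began rushing (speededness) from the sixth item onward.
times = [40, 42, 38, 41, 39, 12, 10, 9, 11, 8]
k = best_changepoint(times)
```

Once the changepoint is located, responses after it can be flagged or removed before the ability parameters are re-estimated, as the abstract describes.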
Article
The continuous increase in tuition fees in higher education in many countries requires university authorities to justify what students receive in return. One of the key factors in student recruitment is value for money and quality learning experiences, including hands-on industry training that can guarantee immediate employment for graduates. This article describes the redesign of the curriculum of a cloud computing undergraduate module in collaboration with the Amazon Web Services (AWS) Academy. Industry-based practical hands-on labs were incorporated into this module for engineering students to improve their practical knowledge and skills related to the Internet of Things. Through an innovative approach, this practitioner research introduces industry best practices and hands-on labs in cloud computing. In this approach, academic theories in cloud computing were combined with their applications through industry attachment. This enables students to gain both the theoretical and the practical knowledge and skills needed for careers in the field of cloud computing. The study finds that students tend to be more engaged and learn better when theoretical knowledge and understanding are combined with real-world applications through attachment with industry.
Article
In this project, we propose automatic student performance and assessment generation models for e-learning using machine learning algorithms. Our proposed model determines students' performance from their behavior. Behavioral data such as study material searches, video access times, submission dates, assignment marks, and question-asking behavior are tracked in a database. In e-learning, assessment generation is a very important and time-consuming activity for teachers. To address this, we propose an automatic assessment generation model based on the Formal Concept Analysis algorithm, which is used to extract knowledge from question-answer pairs. A Learning Management System (LMS) is application software that plays a vital role in educational technology. Such software can be designed to augment and facilitate instructional activities, including the registration and management of education courses, analyzing skill gaps, reporting, and the delivery of electronic courses.
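Formal Concept Analysis, named in the abstract above, derives all (extent, intent) pairs from an object-attribute table. A naive enumeration works for tiny contexts; the question-topic context below is a made-up illustration, not the project's question bank:

```python
from itertools import combinations

def formal_concepts(context):
    """Enumerate the formal concepts of a small context, given as
    {object: set(attributes)}. A concept is a pair (extent, intent)
    where extent' = intent and intent' = extent. Naive enumeration,
    suitable only for tiny contexts."""
    objects = set(context)
    attributes = set().union(*context.values())

    def extent(attrs):  # objects having every attribute in attrs
        return {o for o in objects if attrs <= context[o]}

    def intent(objs):   # attributes shared by every object in objs
        if not objs:
            return set(attributes)
        return set.intersection(*(context[o] for o in objs))

    concepts = set()
    for r in range(len(attributes) + 1):
        for combo in combinations(sorted(attributes), r):
            e = extent(set(combo))
            concepts.add((frozenset(e), frozenset(intent(e))))
    return concepts

# Hypothetical question bank: which questions cover which topics
context = {
    "q1": {"loops", "lists"},
    "q2": {"loops"},
    "q3": {"lists", "recursion"},
}
concepts = formal_concepts(context)
```

Each concept groups the questions that share exactly the same set of topics, which is the kind of structure an assessment generator can draw on when selecting questions.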
Article
Finding students at high risk of poor academic performance as early as possible plays an important role in improving education quality. To do so, most existing studies have used traditional machine learning algorithms to predict students’ achievement based on their behavior data, from which behavior features are extracted manually based on expert experience and knowledge. However, owing to an increase in the varieties and overall volume of behavioral data, it has become more and more challenging to identify high-quality handcrafted features. In this paper, we propose an end-to-end deep learning model that automatically extracts features from students’ multi-source heterogeneous behavior data to predict academic performance. The key innovation of this model is that it uses long short-term memory networks to capture inherent time-series features for each type of behavior, and it uses two-dimensional convolutional networks to extract correlation features among different behaviors. We conducted experiments with four types of daily behavior data from students of a university in Beijing. The experimental results demonstrate that the proposed deep model outperforms several machine learning algorithms.
Article
Purpose Student attrition in tertiary educational institutes may play a significant role in achieving core values leading towards the strategic mission and financial well-being. Analysis of data generated from student interaction with learning management systems (LMSs) in blended learning (BL) environments may assist with the identification of students at risk of failing, but to what extent this may be possible is unknown, and existing studies have not addressed the issue at a significant scale. Design/methodology/approach This study develops a new approach harnessing machine learning (ML) models applied to a publicly available dataset relevant to student attrition, to identify potential students at risk. The dataset consists of the data generated by the interaction of students with the LMS in their BL environment. Findings Identifying students at risk through an innovative approach will promote timely intervention in the learning process, for example to improve student academic progress. To evaluate the performance of the proposed approach, its accuracy is compared with that of other representative ML methods. Originality/value The best-performing ML algorithm, random forest with 85% accuracy, is selected to support educators in implementing various pedagogical practices to improve students' learning.
Chapter
The article is devoted to an important current issue: the analysis of a student's digital footprint. The authors present their experience in researching a large amount of data collected within the Moodle learning platform during their hybrid course of continuing vocational education. The difficulties encountered in processing such data are considered, and a tool for overcoming them is proposed: the SAP Predictive Analytics Desktop software, available to universities for free within the framework of the SAP University Alliance global academic initiative. The results of using this tool to build models are shown, answering questions such as the probability of a student successfully completing the course, the influence of various factors on the level of motivation during training, and the criteria for selecting students for the course. The mechanism of clustering trainees using SAP Predictive Analytics is also considered, which can be used to select leaders in the educational process for further promotion (admission to internships, participation in projects, etc.). The article may be useful for university lecturers developing and conducting hybrid and online courses, as well as for any researchers interested in the analysis of the educational process.
Conference Paper
Fearing technological obsolescence, higher educational institutions are nowadays competing to deploy the most advanced technologies in their teaching activities. With the exponential growth of technology, this becomes more challenging and resource-demanding; hence, selecting a sustainable and effective technological solution that best supports the teaching, learning, sociological, and pedagogical aspects within the institution becomes crucial. This paper investigates the effectiveness of tablet technology for in-class material delivery from the perspective of the students, as well as two other learning tools provided by this technology: digital in-class written notes and video-recorded sessions. The study is conducted at the Australian College of Kuwait and involves three faculty members, four courses, and a total of 100 students from the Electrical Engineering department. The tablet device is a 2-in-1 stylus-enabled reversible laptop that can be used by the instructors for teaching, research, and other related activities. A quantitative methodology, through the use of a questionnaire, is then used to evaluate the intention-to-use, satisfaction, and effectiveness factors for each of the teaching and learning tools provided by tablets. The questionnaire results are finally analyzed with SPSS 25.0 software to draw a conclusion on each of these factors and their relationship with each of the analyzed teaching and learning tools.
Article
Audience response systems such as clickers are gaining much attention for the early identification of at-risk students, as quality education, student success rates, and retention are major areas of concern, as evidenced in the COVID scenario. Use of this active learning strategy across classrooms of varying strength has been found to be effective in retaining students' attention, retention, and learning power. However, implementing clickers in large classrooms incurs overhead costs on the instructor's part. As a result, educational researchers are experimenting with various lightweight alternatives. This paper discusses one such alternative: lightweight formative assessments for blended learning environments. It discusses their implementation and effectiveness in the early identification of at-risk students. The study validates the use of lightweight assessments in three core, pedagogically different courses in large computer science engineering classrooms. It uses a voting ensemble classifier for effective predictions. With the use of lightweight assessments in the early identification of at-risk students, an accuracy range of 87%–94.7% has been achieved, along with high ROC-AUC values. The study also proposes a generalized pedagogical architecture for fitting these lightweight assessments into the course curriculum of pedagogically different courses. Given these constructive outcomes, lightweight assessments appear promising for the efficient handling of scaling technical classrooms.
Chapter
This paper takes social network text as its research object and introduces an SVM model into an emotion analysis and recognition system to effectively improve the accuracy of emotion classification of network text, capture the development patterns of network public opinion, effectively study and judge college students' social network information interaction behavior, and intervene in high-risk network information interaction behavior in real time. The aim is to guide college students in the correct use of network information resources and to better grasp their ideological and behavioral state and the patterns of network information interaction. The goal is to establish a fast and efficient crisis early-warning and feedback system so that psychological crisis prevention achieves early detection, early prevention, and early treatment, cultivating students' psychological capital, strengthening psychological quality education, and effectively improving students' mental health.
Article
Over the past decade, the field of education has seen stark changes in the way that data are collected and leveraged to support high-stakes decision-making. Utilizing big data as a meaningful lens to inform teaching and learning can increase academic success. Data-driven research has been conducted to understand student learning performance, such as predicting at-risk students at an early stage and recommending tailored interventions to support services. However, few studies in veterinary education have adopted Learning Analytics. This article examines the adoption of Learning Analytics by using the retrospective data from the first-year professional Doctor of Veterinary Medicine program. The article gives detailed examples of predicting six courses from week 0 (i.e., before the classes started) to week 14 in the semester of Spring 2018. The weekly models for each course showed the change of prediction results as well as the comparison between the prediction results and students’ actual performance. From the prediction models, at-risk students were successfully identified at the early stage, which would help inform instructors to pay more attention to them at this point.
Article
Learning outcomes can be predicted with machine learning algorithms that assess students’ online behavior data. However, there have been few generalized predictive models for a large number of blended courses in different disciplines and in different cohorts. In this study, we examined learning outcomes in terms of learning data in all of the blended courses offered at a Chinese university and proposed a new classification method of blended courses, in which students were primarily clustered on the basis of their online learning behaviors in blended courses using the expectation–maximization algorithm. Then, the blended courses were classified on the basis of the cluster of students who were present in the course and had the highest proportion. The advantage of this method is that the criteria used for classification of the blended courses are clearly defined on the basis of students' online behavior data, so it can easily be used by machine learning systems to algorithmically classify blended courses based on log data collected from a learning management system. Drawing on the classification of the blended courses, we also proposed and validated a general model using the random forest algorithm to predict learning outcomes based on students’ online behaviors in blended courses with different disciplines and different cohorts. The findings of this study indicated that after blended courses were classified on the basis of students’ online behavior, prediction accuracy in each category increased. The overall accuracies for Course I (380 courses out of 661 after screening), L (14 courses out of 661 after screening), A (237 courses out of 661 after screening), V (8 courses out of 661 after screening), and H (22 courses out of 661 after screening) were 38.2%, 48.4%, 42.3%, 42.4%, and 74.7%, respectively. 
According to these results, it was found that a prerequisite for the accurate prediction of students' learning outcomes in a blended course was that most students should be highly engaged in a variety of online learning activities rather than being focused on only one type of activity, such as only watching online videos or submitting online assignments. The prediction model achieved accuracies of 80.6%, 85.3%, 63%, 54.8%, and 14.3% for grades A, B, C, D, and F in Course H, respectively. The results demonstrated the potential of the proposed model for accurately predicting learning outcomes in blended courses. Finally, we found that there was no single online learning behavior that had a dominant effect on the prediction of students' final grades.
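The expectation–maximization clustering used in the study above can be illustrated with a toy one-dimensional, two-component Gaussian mixture. The weekly activity counts are invented, and the real study clusters multivariate behavior data:

```python
import math

def em_gmm_1d(data, iters=50):
    """Two-component 1-D Gaussian mixture fit by expectation-maximization.
    Returns the two component means. A toy sketch of the clustering idea,
    not the multivariate setup of the cited study."""
    mu = [min(data), max(data)]          # crude initialization
    var = [1.0, 1.0]
    pi = [0.5, 0.5]
    for _ in range(iters):
        # E-step: responsibility of each component for each point
        resp = []
        for x in data:
            dens = [pi[k] / math.sqrt(2 * math.pi * var[k])
                    * math.exp(-(x - mu[k]) ** 2 / (2 * var[k]))
                    for k in (0, 1)]
            total = sum(dens)
            resp.append([d / total for d in dens])
        # M-step: re-estimate weights, means, and variances
        for k in (0, 1):
            nk = sum(r[k] for r in resp)
            pi[k] = nk / len(data)
            mu[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var[k] = sum(r[k] * (x - mu[k]) ** 2
                         for r, x in zip(resp, data)) / nk
            var[k] = max(var[k], 1e-6)   # avoid collapse to zero variance
    return mu

# Hypothetical weekly activity counts: a low- and a high-activity group
counts = [1.0, 1.5, 2.0, 1.2, 9.0, 10.0, 11.0, 9.5]
means = em_gmm_1d(counts)
```

With two well-separated behavior groups the algorithm recovers one mean per group; the cited study then labels each course by the dominant student cluster it contains.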
Article
When addressing analysis and prediction problems in a specific domain based on big data processing, the following problems often arise: only relationships between features in the domain itself are considered, and existing methods are not effective for training models on small sample data sets. The traditional approach usually obtains the relationships between single-domain features. Analysis and forecasting in the problem domain alone can quickly achieve good accuracy, but due to the limitations of the analysis domain, it becomes increasingly difficult to further improve the prediction accuracy. This paper proposes a novel data analysis approach compatible with small sample sets called multidomain data depth analysis (MODE). In contrast to traditional approaches, MODE emphasizes multidomain data and considers the relationships among feature domains in the original data. The features in each domain are orthogonally extracted, and feature dimensions are expanded in accordance with the characteristics of small data sets. A better prediction model can be obtained by using the expanded and strengthened features. We apply this approach to real big data from the field of sociology to predict annual income based on census data in experiments. The experimental results show that MODE offers a better prediction effect based on small multidomain samples.
Article
Full-text available
Traditional methods of predicting academic risk sometimes have limitations for timely identification; learning analytics, on the other hand, offers certain advantages. The objective of this study was to analyze the characteristics of predictive models based on learning analytics in higher education. A systematic review of the Web of Science, Scopus, and ERIC databases was conducted using the keywords "learning analytics" and "prediction". Twelve studies that met the inclusion criteria were selected. The results indicated that 100% of the studies sought to predict academic performance, including analytics, sociodemographic, and sociocognitive variables as predictors. The most widely used learning management system was Moodle, in blended learning and online courses. The studies were conducted mainly in Europe, with samples of up to 500 participants from engineering and technology programs. The most frequent type of analysis was regression, in R and SPSS software. Most achieved a large prediction model (R2 > .30). It is concluded that the current construction of models for predicting university dropout has important limitations.
Preprint
Full-text available
Procrastination, the irrational delay of tasks, is a common occurrence in online learning. Potential negative consequences include a higher risk of drop-out, increased stress, and reduced mood. Due to the rise of learning management systems and learning analytics, indicators of such behavior can be detected, enabling predictions of future procrastination and other dilatory behavior. However, research focusing on such predictions is scarce. Moreover, studies involving different types of predictors and comparisons between the predictive performance of various methods are virtually non-existent. In this study, we aim to fill these research gaps by analyzing the performance of multiple machine learning algorithms when predicting the delayed or timely submission of online assignments in a higher education setting with two categories of predictors: subjective, questionnaire-based variables and objective, log-data-based indicators extracted from a learning management system. The results show that models with objective predictors consistently outperform models with subjective predictors, and a combination of both variable types performs slightly better. For each of these three options, a different approach prevailed (Gradient Boosting Machines for the subjective, Bayesian multilevel models for the objective, and Random Forest for the combined predictors). We conclude that careful attention should be paid to the selection of predictors and algorithms before implementing such models in learning management systems.
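The comparison this abstract describes, several algorithms run against subjective, objective, and combined predictor sets, can be sketched with cross-validation. Everything below is illustrative: the feature names are invented, the data is synthetic, and plain scikit-learn ensembles stand in for the paper's Bayesian multilevel models.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
n = 300

# Invented stand-ins: questionnaire scales vs. per-student log aggregates
subjective = rng.normal(size=(n, 3))
objective = rng.normal(size=(n, 4))

# Synthetic target: late (1) vs. timely (0) submission, driven mostly by
# the objective indicators and only weakly by one subjective scale
late = (objective.sum(axis=1) + 0.3 * subjective[:, 0]
        + rng.normal(size=n) > 0).astype(int)

scores = {}
for name, X, model in [
    ("subjective", subjective, GradientBoostingClassifier(random_state=0)),
    ("objective", objective, RandomForestClassifier(random_state=0)),
    ("combined", np.hstack([subjective, objective]),
     RandomForestClassifier(random_state=0)),
]:
    scores[name] = cross_val_score(model, X, late, cv=5).mean()
```

On real LMS data the predictors would be questionnaire scales and per-student log aggregates; the loop structure comparing predictor sets and algorithms stays the same.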
Article
Full-text available
This study examined the extent to which instructional conditions influence the prediction of academic success in nine undergraduate courses offered in a blended learning model (n = 4134). The study illustrates the differences in predictive power and significant predictors between course-specific models and generalized predictive models. The results suggest that it is imperative for learning analytics research to account for the diverse ways technology is adopted and applied in course-specific contexts. The differences in technology use, especially those related to whether and how learners use the learning management system, require consideration before the log-data can be merged to create a generalized model for predicting academic success. A lack of attention to instructional conditions can lead to an over- or underestimation of the effects of LMS features on students' academic success. These findings have broader implications for institutions seeking generalized and portable models for identifying students at risk of academic failure.
Article
Full-text available
Contemporary literature on online and distance education almost unequivocally argues for the importance of interactions in online learning settings. Nevertheless, the relationship between different types of interactions and learning outcomes is rather complex. Analyzing 204 offerings of 29 courses over a period of six years, this study aimed at expanding the current understanding of the nature of this relationship. Specifically, using trace data about interactions and multilevel linear mixed modeling techniques, the study examined whether the frequency and duration of student-student, student-instructor, student-system, and student-content interactions had an effect on learning outcomes, measured as final course grades. The findings show that the time spent on student-system interactions had a consistent and positive effect on the learning outcome, while the quantity of student-content interactions was negatively associated with the final course grades. The study also showed the importance of the educational level and the context of individual courses for the interaction types supported. Our findings further confirmed the potential of the use of trace data and learning analytics for studying learning and teaching in online settings. However, further research should account for various qualitative aspects of the interactions used while learning, different pedagogical/media features, as well as the course design and delivery conditions in order to better explain the association between interaction types and learning achievement. Finally, the results might imply the need for the development of institutional and program-level strategies for learning and teaching that would promote effective pedagogical approaches to designing and guiding interactions in online and distance learning settings.
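Multilevel linear mixed modeling of the kind described above, with grades nested within course offerings, can be sketched with statsmodels. The variable names and simulated effects below are assumptions for illustration only; they are not the study's data or coefficients.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(7)
n_courses, per_course = 12, 25
course = np.repeat(np.arange(n_courses), per_course)

# Invented interaction measures: hours on the system, count of content views
time_system = rng.exponential(scale=5.0, size=course.size)
n_content = rng.poisson(lam=20, size=course.size)

# Simulated grades: positive effect of system time, negative effect of
# content-view counts, plus a random intercept per course
course_effect = rng.normal(scale=2.0, size=n_courses)[course]
grade = (60 + 1.5 * time_system - 0.5 * n_content
         + course_effect + rng.normal(scale=5.0, size=course.size))

df = pd.DataFrame({"grade": grade, "time_system": time_system,
                   "n_content": n_content, "course": course})

# Random-intercept model: observations nested within course offerings
result = smf.mixedlm("grade ~ time_system + n_content",
                     df, groups=df["course"]).fit()
```

The grouping term is what distinguishes this from pooled regression: course-level variance is modeled explicitly instead of being absorbed into the residuals, which is why such models suit data spanning many course offerings.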
Conference Paper
Full-text available
While substantial progress has been made in terms of predictive modeling in the Learning Analytics and Knowledge (LAK) community, one element that is often ignored is the role of learning design. Learning design establishes the objectives and pedagogical plans which can be evaluated against the outcomes captured through learning analytics. However, no empirical study is available linking learning designs of a substantial number of courses with usage of Learning Management Systems (LMS) and learning performance. Using cluster- and correlation analyses, in this study we compared how 87 modules were designed, and how this impacted (static and dynamic) LMS behavior and learning performance. Our findings indicate that academics seem to design modules with an "invisible" blueprint in their mind. Our cluster analyses yielded four distinctive learning design patterns: constructivist, assessment-driven, balanced-variety and social constructivist modules. More importantly, learning design activities strongly influenced how students were engaging online. Finally, learning design activities seem to have an impact on learning performance, in particular when modules rely on assimilative activities. Our findings indicate that learning analytics researchers need to be aware of the impact of learning design on LMS data over time, and on subsequent academic performance.
Conference Paper
Full-text available
All forms of learning take time. There is a large body of research suggesting that the amount of time spent on learning can improve the quality of learning, as represented by academic performance. The widespread adoption of learning technologies such as learning management systems (LMSs) has resulted in large amounts of data about student learning being readily accessible to educational researchers. One common use of this data is to measure the time that students have spent on different learning tasks (i.e., time-on-task). Given that LMSs typically only capture the times when students executed various actions, time-on-task measures are estimated from the recorded trace data. LMS trace data has been extensively used in many studies in the field of learning analytics, yet the problem of time-on-task estimation is rarely described in detail, and the consequences that it entails are not fully examined. This paper presents the results of a study that examined the effects of different time-on-task estimation methods on the results of commonly adopted analytical models. The primary goal of this paper is to raise awareness of the issue of accuracy and appropriateness surrounding time estimation within the broader learning analytics community, and to initiate a debate about the challenges of this process. Furthermore, the paper provides an overview of time-on-task estimation methods in educational and related research fields.
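A minimal sketch of the core problem this paper raises: LMS logs record only click timestamps, so time-on-task must be estimated from gaps between consecutive events, and the heuristic chosen for long gaps (here, capping them at a threshold, one of several options discussed in this literature) directly changes the measure that later feeds analytical models. The cap value and log format below are assumptions, not prescriptions from the paper.

```python
from datetime import datetime

def time_on_task(events, cap_minutes=30):
    """Estimate time-on-task (minutes) from a list of ISO click timestamps.

    The gap between consecutive events counts as study time only up to
    cap_minutes; longer gaps are assumed to be off-task and contribute the
    cap instead (one of several heuristics discussed in the literature).
    """
    times = sorted(datetime.fromisoformat(t) for t in events)
    total = 0.0
    for prev, curr in zip(times, times[1:]):
        gap = (curr - prev).total_seconds() / 60.0
        total += min(gap, cap_minutes)
    return total

log = ["2024-03-01T10:00:00", "2024-03-01T10:10:00",
       "2024-03-01T12:10:00", "2024-03-01T12:15:00"]
# Gaps of 10, 120 and 5 minutes: the 120-minute gap is capped at 30,
# so the estimate is 10 + 30 + 5 = 45 minutes
```

Raising or lowering the cap, or discarding long gaps outright, yields different totals for the same log, which is exactly the sensitivity the paper examines.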
Article
Full-text available
Learning analytics seek to enhance the learning process through systematic measurements of learning-related data and to provide informative feedback to learners and educators. Track data from Learning Management Systems (LMSs) constitute a main data source for learning analytics. This empirical contribution provides an application of Buckingham Shum and Deakin Crick's theoretical framework of dispositional learning analytics: an infrastructure that combines learning dispositions data with data extracted from computer-assisted, formative assessments and LMSs. In a large introductory quantitative methods module based on principles of blended learning, combining face-to-face Problem-Based Learning sessions with e-tutorials, 922 students were enrolled. We investigated the predictive power of learning dispositions, outcomes of continuous formative assessments and other system-generated data in modelling student performance and their potential to generate informative feedback. Using a dynamic, longitudinal perspective, computer-assisted formative assessments seem to be the best predictor for detecting underperforming students and academic performance, while basic LMS data did not substantially predict learning. If timely feedback is crucial, both use-intensity-related track data from e-tutorial systems and learning dispositions are valuable sources for feedback generation.
Article
Full-text available
The Open Academic Analytics Initiative (OAAI) is a collaborative, multi-year grant program aimed at researching issues related to the scaling up of learning analytics technologies and solutions across all of higher education. The paper describes the goals and objectives of the OAAI, depicts the process and challenges of collecting, organizing and mining student data to predict academic risk, and reports results on the predictive performance of those models, their portability across pilot programs at partner institutions, and the results of interventions on at-risk students.
Conference Paper
Full-text available
The aim of this study is to suggest more meaningful components for learning analytics in order to help learners improve their learning achievement continuously through an educational technology approach. Multiple linear regression analysis was conducted to determine which factors influence students' academic achievement. 84 undergraduate students at a women's university in South Korea participated in this study. The six-predictor model was able to account for 33.5% of the variance in final grade, F(6, 77) = 6.457, p < .001, R2 = .335. Total studying time in the LMS, interaction with peers, regularity of learning intervals in the LMS, and number of downloads were determined to be significant factors for students' academic achievement in an online learning environment. These four controllable variables not only predict learning outcomes significantly but can also be changed if learners put more effort into improving their academic performance. The results provide a rationale for supporting students' time management efforts.
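A regression of the kind reported above can be reproduced in miniature with ordinary least squares. The synthetic data and predictor names below merely echo the study's setup (LMS time, peer interaction, regularity, downloads); the fitted numbers have no relation to the reported F or R2.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 84  # same sample size as the study; the data itself is synthetic

# Invented stand-ins for predictors such as total LMS study time,
# peer interaction, regularity of login intervals, and downloads
X = rng.normal(size=(n, 4))
y = X @ np.array([0.5, 0.3, 0.4, 0.2]) + rng.normal(size=n)

X1 = np.column_stack([np.ones(n), X])          # add intercept column
beta, *_ = np.linalg.lstsq(X1, y, rcond=None)  # ordinary least squares
resid = y - X1 @ beta
r2 = 1.0 - (resid @ resid) / ((y - y.mean()) @ (y - y.mean()))
```

R2 here is computed exactly as in a standard regression report: one minus the ratio of residual to total sum of squares.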
Article
Full-text available
This article considers the developing field of learning analytics and argues that to move from small-scale practice to broad scale applicability, there is a need to establish a contextual framework that helps teachers interpret the information that analytics provides. The article presents learning design as a form of documentation of pedagogical intent that can provide the context for making sense of diverse sets of analytic data. We investigate one example of learning design to explore how broad categories of analytics—which we call checkpoint and process analytics—can inform the interpretation of outcomes from a learning design and facilitate pedagogical action.
Article
Full-text available
Growing interest in data and analytics in education, teaching, and learning raises the priority for increased, high-quality research into the models, methods, technologies, and impact of analytics. Two research communities -- Educational Data Mining (EDM) and Learning Analytics and Knowledge (LAK) -- have developed separately to address this need. This paper argues for increased and formal communication and collaboration between these communities in order to share research, methods, and tools for data mining and analysis in the service of developing both LAK and EDM fields.
Article
Full-text available
Theoretical and empirical evidence in the learning sciences substantiates the view that deep engagement in learning is a function of a complex combination of learners' identities, dispositions, values, attitudes and skills. When these are fragile, learners struggle to achieve their potential in conventional assessments, and critically, are not prepared for the novelty and complexity of the challenges they will meet in the workplace, and the many other spheres of life which require personal qualities such as resilience, critical thinking and collaboration skills. To date, the learning analytics research and development communities have not addressed how these complex concepts can be modelled and analysed, and how more traditional social science data analysis can support and be enhanced by learning analytics. We report progress in the design and implementation of learning analytics based on a research validated multidimensional construct termed "learning power". We describe, for the first time, a learning analytics infrastructure for gathering data at scale, managing stakeholder permissions, the range of analytics that it supports from real time summaries to exploratory research, and a particular visual analytic which has been shown to have demonstrable impact on learners. We conclude by summarising the ongoing research and development programme and identifying the challenges of integrating traditional social science research, with learning analytics and modelling.
Article
Full-text available
In this article we examine educational assessment in the 21st century. Digital learning environments emphasize learning in action. In such environments, assessments need to focus on performance in context rather than on tests of decontextualized and isolated skills and knowledge. Digital learning environments also provide the potential to assess performance in context, because digital tools make it possible to record rich streams of data about learning in progress. But what assessment methods will use this data to measure mastery of complex problem solving, the kind of thinking in action that takes place in digital learning environments? Here we argue that one way to address this challenge is through evidence-centered design, a framework for developing assessments by systematically linking models of understanding, observable actions, and evaluation rubrics to provide evidence of learning. We examine how evidence-centered design can address the challenge of assessment in new media learning environments by presenting one specific theory-based approach to digital learning, known as epistemic games (http://epistemicgames.org/eg/), and describing a method, epistemic network analysis (ENA), to assess learner performance based on this theory. We use the theory and its related assessment method to illustrate the concept of a digital learning system, a system composed of a theory of learning and its accompanying method of assessment, linked into an evidence-based, digital intervention. We argue that whatever tools of learning and assessment digital environments use, they need to be integrated into a coherent digital learning system linking learning and assessment through evidence-centered design.
Article
Full-text available
Since the beginning of the century, feedback interventions (FIs) produced negative--but largely ignored--effects on performance. A meta-analysis (607 effect sizes; 23,663 observations) suggests that FIs improved performance on average (d = .41) but that over one-third of the FIs decreased performance. This finding cannot be explained by sampling error, feedback sign, or existing theories. The authors proposed a preliminary FI theory (FIT) and tested it with moderator analyses. The central assumption of FIT is that FIs change the locus of attention among 3 general and hierarchically organized levels of control: task learning, task motivation, and meta-task (including self-related) processes. The results suggest that FI effectiveness decreases as attention moves up the hierarchy closer to the self and away from the task. These findings are further moderated by task characteristics that are still poorly understood.
Article
Full-text available
Networked learning is much more ambitious than previous approaches to using technology in education. It is, therefore, more difficult to evaluate the effectiveness and efficiency of networked learning activities. Evaluation of learners' interactions in networked learning environments is a resource- and expertise-demanding task. Educators participating in networked learning communities have very little support from integrated tools for evaluating students' learning activity flow and identifying learners' online browsing behaviour and interactions. As a consequence, educators are in need of non-intrusive and automatic ways to become informed about learners' progress in order to better follow their learning process and appraise the online course effectiveness. They also need specialized tools for gathering and analysing data for evaluating the learning effectiveness of networked learning instructional models. The aim of this paper is to present a conceptual framework and an innovative tool based on it, called AnalyticsTool, which allow teachers and evaluators to easily track learners' online behaviour, make judgements about learners' activity flow and gain better insight into the knowledge constructed and skills acquired in a networked learning environment. The innovation of the proposed tool is that it interoperates with the Moodle learning management system and guides the educator in performing the interaction analysis of collaborative learning scenarios designed following specific learning strategies such as TPS, Jigsaw, Pyramid, etc.
Article
Full-text available
Enterprise-wide learning management systems are integral to university learning and teaching environments. Griffith University and the University of Western Sydney (UWS) are predominantly face-to-face, multi-campus teaching institutions with similarly sized student bodies and academic communities. Both Griffith and UWS utilise a single enterprise-wide e-learning system, although the systems are different. This paper describes a benchmarking activity between the two universities to determine the level and quality of the uptake of the e-learning system. A framework was developed as a product of the partnership and applied to a representative sample of e-learning sites. The results of the benchmarking exercise showed that there are parallel trends between the two institutions in how the LMS is being used, albeit with distinct differences in specific areas.
Article
Full-text available
We propose that the design and implementation of effective Social Learning Analytics present significant challenges and opportunities for both research and enterprise, in three important respects. The first is the challenge of implementing analytics that have pedagogical and ethical integrity, in a context where power and control over data is now of primary importance. The second challenge is that the educational landscape is extraordinarily turbulent at present, in no small part due to technological drivers. Online social learning is emerging as a significant phenomenon for a variety of reasons, which we review, in order to motivate the concept of social learning, and ways of conceiving social learning environments as distinct from other social platforms. This sets the context for the third challenge, namely, to understand different types of Social Learning Analytic, each of which has specific technical and pedagogical challenges. We propose an initial taxonomy of five types. We conclude by considering potential futures for Social Learning Analytics, if the drivers and trends reviewed continue, and the prospect of solutions to some of the concerns that institution-centric learning analytics may provoke.
Article
Full-text available
This is a study of an online, Web-based learning environment developed for an introductory business information systems course. The development of the environment was guided by a design principle that emphasizes choices regarding hypertextuality, centralization in client-server systems, interactivity, multimedia, and synchronicity. The environment included a several-hundred-page textbook and individual and group online learning aids and implements. Following the development period and a pilot class, the system was used by students in three large classes of 50-100 students each. We examined student grades, attitudes, and usage logs and their intercorrelations. Our findings indicate that online, Web-based learning environments are not just feasible: employing such a system to complement lectures yields measurable enhancements. We propose a focus on logged, machine-collected usage statistics. Such statistics allow a measurement of actual reading behavior and linearity in the learning process. Reading amount is highly correlated with student achievement. Asynchronous conferencing tools enhance instructor-student and student-to-student interaction. This interaction is, itself, a correlate of success in the course. Furthermore, more mature student groups seem to make better use of the online environment, and use it less linearly.
Conference Paper
Full-text available
The trend to greater adoption of online learning in higher education institutions means an increased opportunity for instructors and administrators to monitor student activity and interaction with the course content and peers. This paper demonstrates how the analysis of data captured from various IT systems can be used to inform decision-making processes for university management and administration. It does so by providing details of a large research project designed to identify the range of applications for LMS-derived data for informing strategic decision makers and teaching staff. The visualisation of online student engagement/effort is shown to afford instructors early opportunities for providing additional student learning assistance and intervention, when and where it is required. The capacity to establish early indicators of 'at-risk' students provides timely opportunities for instructors to re-direct or add resources to facilitate progression towards optimal patterns of learning behaviour. The project findings provide new insights into student learning that complement the existing array of evaluative methodologies, including formal evaluations of teaching. Thus the project provides a platform for further investigation into new suites of diagnostic tools that can, in turn, provide new opportunities to inform continuous, sustained improvement of pedagogical practice.
Article
Full-text available
Academic failure among first-year university students has long fuelled a large number of debates. Many educational psychologists have tried to understand and then explain it. Many statisticians have tried to foresee it. Our research aims to classify, as early in the academic year as possible, students into three groups: the 'low-risk' students, who have a high probability of succeeding, the 'medium-risk' students, who may succeed thanks to the measures taken by the university, and the 'high-risk' students, who have a high probability of failing (or dropping out). This article describes our methodology and provides the most significant variables correlated to academic success among all the questions asked to 533 first-year university students during the month of November of academic year 2003-04. Finally, it presents the results of the application of discriminant analysis, neural networks, random forests and decision trees aimed at predicting those students' academic success.
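Of the four methods listed above (discriminant analysis, neural networks, random forests, decision trees), a decision tree is the most interpretable for sorting students into low-, medium-, and high-risk groups. A hedged sketch on synthetic survey-style features, with all variable names and thresholds invented here:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(42)
n = 533  # sample size from the study; the data below is synthetic

# Invented stand-ins for questionnaire variables (e.g. prior grades,
# attendance, weekly study hours)
X = rng.normal(size=(n, 3))

# Synthetic labels: 0 = high-risk, 1 = medium-risk, 2 = low-risk,
# derived from a simple composite score for illustration only
risk = np.digitize(X.sum(axis=1), bins=[-1.0, 1.0])

clf = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, risk)
train_acc = clf.score(X, risk)
```

A shallow tree like this yields human-readable rules ("if attendance below X and study hours below Y, then high-risk"), which is why trees often accompany the less interpretable methods in early-warning studies.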
Article
Full-text available
The beneficial effects of learners interacting in online programmes have been widely reported. Indeed, online discussion is argued to promote student-centred learning. It is therefore reasonable to suggest that the benefits of online discussion should translate into improved student performance. The current study examined the frequency of online interaction of 122 undergraduates and compared this with their grades at the end of the year. The findings revealed that greater online interaction did not lead to significantly higher performance for students achieving passing grades; however, students who failed in their courses tended to interact less frequently. Other factors that may be salient in online interactions are discussed.
Article
The rapid uptake of campus-wide Learning Management Systems (LMS) is changing the character of the on-campus learning experience. The trend towards LMS as an adjunct to traditional learning modes has been the subject of little research beyond technical analyses of alternative software systems. Drawing on Australian experience, this paper presents a broad, critical examination of the potential impact of these online systems on teaching and learning in universities. It discusses in particular the possible effects of LMS on teaching practices, on student engagement, on the nature of academic work and on the control over academic knowledge.
Article
In this chapter, the reader is taken through a macro level view of learning management systems, with a particular emphasis on systems offered by commercial vendors. Included is a consideration of the growth of learning management systems during the past decade, the common features and tools contained within these systems, and a look at the advantages and disadvantages that learning management systems provide to institutions. In addition, the reader is presented with specific resources and options for evaluating, selecting and deploying learning management systems. A section highlighting the possible advantages and disadvantages of selecting a commercial versus an open source system is followed by a series of brief profiles of the leading vendors of commercial and open source learning management systems.
Article
Student engagement has become synonymous with the measurement of teaching and learning quality at universities. The almost global adoption of learning management systems as a technical solution to e-learning within universities and their ability to record and track user behaviour provides the academy with an unprecedented opportunity to harness captured data relating to student engagement. This is an exploratory study that aims to show how data from learning management systems can be used as an indicator of student engagement and how patterns in the data have changed with CQUniversity's recent adoption of Moodle as its single learning management system.
Article
This study sought to identify significant behavioral indicators of learning using learning management system (LMS) data regarding online course achievement. Because self-regulated learning is critical to success in online learning, measures reflecting self-regulated learning were included to examine the relationship between LMS data measures and course achievement. Data were collected from 530 college students who took an online course. The results demonstrated that students' regular study, late submissions of assignments, number of sessions (the frequency of course logins), and proof of reading the course information packets significantly predicted their course achievement. These findings verify the importance of self-regulated learning and reveal the advantages of using measures related to meaningful learning behaviors rather than simple frequency measures. Furthermore, the measures collected in the middle of the course significantly predicted course achievement, and the findings support the potential for early prediction using learning performance data. Several implications of these findings are discussed.
Article
This paper outlines how teachers can use the learning management system (LMS) to identify at-risk students in the first week of a course. Data are from nine second-year campus-based business courses that use a blend of face-to-face and online learning strategies. Students who used the LMS in the first week of the course were more likely to pass. For the rest of the course the pattern of usage is largely similar for students who pass and those who do not. This paper identifies how an LMS can identify at-risk students in the first week of the course and provides some strategies to motivate these students. © 2012 John Milne, Lynn M Jeffrey, Gordon Suddaby & Andrew Higgins.
Article
Discussion forums are an essential part of online courses in tertiary education. Activities in discussion forums help learners to share and gain knowledge from each other. In fully online courses, discussion forums are often the only medium of interaction. However, merely setting up discussion forums does not ensure that learners interact with each other actively, and investigation into the type of participation is required to ensure quality participation. This paper provides a general overview of how fully online students participate in discussion forums and the correlation between their activity online and achievement in terms of grades. The main benefit of this research is that it provides a benchmark for the trend of participation expected of fully online introductory information technology and programming students. Investigating the participation and the factors behind online behaviour can provide guidelines for the continual development of online learning systems. The results of the data analysis reveal that a high number of students are not accessing or posting in the discussion board. Results also show that there is a correlation between students' activity in online forums and the grades they achieve. Discussion of the findings of the data analysis and the lessons learned from this research is presented in this paper.
Article
This issue is a call for researchers and practitioners to reflect on progress to date and understand the criticality of theory: how it facilitates interpretation of findings, but also how it can restrict and confine our thinking through the assumptions many theoretical models bring. As education paradigms further shift and juxtapose informal and formal learning settings, there is a need to re-visit any underlying theoretical assumptions.
Article
Blended learning (BL) is recognized as one of the major trends in higher education today. To identify how BL has actually been adopted, this study employed a data-driven approach instead of model-driven methods. The Latent Class Analysis method, a clustering approach from educational data mining, was employed to extract common activity features of 612 courses in a large private university located in South Korea, using online behavior data tracked from the Learning Management System and the institution's course database. Four unique subtypes were identified. Approximately 50% of the courses manifested inactive utilization of the LMS or an immature stage of blended learning implementation, labeled Type I. Other subtypes included Type C - Communication or Collaboration (24.3%), Type D - Delivery or Discussion (18.0%), and Type S - Sharing or Submission (7.2%). We discuss the implications of BL based on data-driven decisions to provide strategic institutional initiatives.
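Latent Class Analysis proper works on categorical indicators; as a loose, continuous analogue (not the study's actual method), a Gaussian mixture can illustrate how courses cluster into usage subtypes from LMS activity features. The four synthetic profiles below only echo the idea of the four reported types.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(3)
n_courses = 612  # number of courses in the study; the data is synthetic

# Four invented activity profiles per course: [posts, submissions, downloads]
centers = np.array([[1, 1, 2], [8, 2, 3], [2, 2, 9], [2, 8, 4]], dtype=float)
true_type = rng.integers(0, 4, size=n_courses)
X = centers[true_type] + rng.normal(scale=0.5, size=(n_courses, 3))

# Fit a four-component mixture and assign each course to a subtype
gmm = GaussianMixture(n_components=4, random_state=0).fit(X)
assigned = gmm.predict(X)
```

As with LCA, the number of components is a modeling choice; in practice it would be selected by comparing fit criteria such as BIC across candidate values.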
Article
We propose that the design and implementation of effective Social Learning Analytics (SLA) present significant challenges and opportunities for both research and enterprise, in three important respects. The first is that the learning landscape is extraordinarily turbulent at present, in no small part due to technological drivers. Online social learning is emerging as a significant phenomenon for a variety of reasons, which we review, in order to motivate the concept of social learning. The second challenge is to identify different types of SLA and their associated technologies and uses. We discuss five categories of analytic in relation to online social learning; these analytics are either inherently social or can be socialised. This sets the scene for a third challenge, that of implementing analytics that have pedagogical and ethical integrity in a context where power and control over data are now of primary importance. We consider some of the concerns that learning analytics provoke, and suggest that Social Learning Analytics may provide ways forward. We conclude by revisiting the drivers and trends, and consider future scenarios that we may see unfold as SLA tools and services mature. © International Forum of Educational Technology & Society (IFETS).