Conference Paper

Generating Actionable Predictive Models of Academic Performance

Authors: Pardo, Mirriahi, Martinez-Maldonado, Jovanovic, Dawson and Gašević

Abstract

The pervasive collection of data has opened the possibility for educational institutions to use analytics methods to improve the quality of the student experience. However, the adoption of these methods faces multiple challenges, particularly at the course level where instructors and students would derive the most benefit from the use of analytics and predictive models. The challenge lies in the knowledge gap between how the data is captured, processed and used to derive models of student behavior, and the subsequent interpretation and the decision to deploy pedagogical actions and interventions by instructors. Simply put, the provision of learning analytics alone has not necessarily led to changing teaching practices. In order to support pedagogical change and aid interpretation, this paper proposes a model that can enable instructors to readily identify subpopulations of students to provide specific support actions. The approach was applied to a first-year course with a large number of students. The resulting model classifies students according to their predicted exam scores, based on indicators directly derived from the learning design.
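The model family described here is recursive partitioning (a decision tree) over indicators derived from the learning design. As a rough, illustrative sketch of that idea, not the authors' actual pipeline, the following snippet fits a shallow tree on synthetic indicators (the feature names and data are invented) and prints the resulting rules:

```python
# Illustrative sketch: classify students into exam-score bands from
# learning-design-aligned indicators. Features and data are synthetic.
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(0)
n = 300
X = np.column_stack([
    rng.integers(0, 20, n),   # video_views: lecture videos watched
    rng.integers(0, 15, n),   # exercises_done: formative exercises attempted
    rng.integers(0, 10, n),   # forum_posts: discussion contributions
])
# Hypothetical exam-score bands: 0 = at risk, 1 = middle, 2 = high
y = np.digitize(X[:, 0] + 2 * X[:, 1] + rng.normal(0, 5, n), [15, 35])

# A shallow depth cap keeps each leaf large enough to read as a student
# subpopulation an instructor could target with one support action.
tree = DecisionTreeClassifier(max_depth=3, min_samples_leaf=20, random_state=0)
tree.fit(X, y)
print(export_text(tree, feature_names=["video_views", "exercises_done",
                                       "forum_posts"]))
```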


... Several studies report the application of data analysis techniques to predict academic performance in specific institutions and scenarios. However, these applications often consider a reduced scope or a very specific setting [Pardo et al. 2016; Okubo et al. 2017; Almeida et al. 2018]. For example, Yadav and Pal (2012) conducted a comparative study limited to testing Decision Tree algorithms; Romero et al. (2013) compared data analysis techniques (Linear Regression, Decision Trees, Naïve Bayes) based on forum posts; Jayaprakash et al. (2014) compared techniques (Logistic Regression, Support Vector Machines, Decision Trees and Naïve Bayes) for the early identification of students at risk of dropping out; and Naif et al. (2017) applied different techniques (Logistic Regression, Support Vector Machines, Decision Trees and Bayesian methods) to students' socioeconomic data. ...
... The data used to predict academic performance are frequently related to indices of participation and engagement in virtual learning environments, and consist of data collected automatically from the activities performed, for example, the number of videos played, forum posts, total time in the environment, among others [Romero et al. 2013; Jayaprakash et al. 2014; Gašević et al. 2015; Pardo et al. 2016; Almeda et al. 2018]. The data used in each study depend on the type of information that can be extracted, the possibility of accessing the data, and the extensions and modules used in each context studied. ...
... Formulation 1 (F1) groups models whose expected output is a direct prediction of the final grade the student will obtain on a given continuous scale, for which regression models are used [Almeda et al. 2018; Pardo et al. 2016; Gašević et al. 2016]. The output of F1 is defined on a continuous scale with valid grades between 0 and 1, with -1 assigned to students without a grade. ...
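The F1 target encoding described in this fragment is simple enough to pin down in code. A minimal sketch, assuming grades arrive on a 0-100 scale (an assumption; the fragment only fixes the output range):

```python
# Sketch of the F1 target encoding: final grades mapped to a continuous
# [0, 1] scale, with -1.0 marking students who have no grade.
def encode_f1_target(raw_grade, max_grade=100.0):
    """Return grade scaled to [0, 1], or -1.0 when the grade is missing."""
    if raw_grade is None:
        return -1.0
    return max(0.0, min(1.0, raw_grade / max_grade))

print([encode_f1_target(g) for g in (85, 0, None, 100)])
# -> [0.85, 0.0, -1.0, 1.0]
```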
Conference Paper
Full-text available
Predicting student academic performance is one of the main research topics in Learning Analytics, for which different techniques have been applied. In order to facilitate the choice of a technique, this study presents a comparative analysis among regression and classification techniques, considering different application scenarios. We used data from MITx/HarvardX containing logs of activities and participation of 15 groups of 12 MOOCs. Results obtained from the performance evaluation metrics suggest the choice of Decision Trees as a technique to build models for regression and a choice between Decision Trees and Support Vector Machines to build models for classification.
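A hedged sketch of such a comparison, using synthetic stand-ins for the MITx/HarvardX activity logs (the study's exact features and evaluation metrics may differ):

```python
# Score a decision tree against an SVM with cross-validation on a
# synthetic pass/fail task; features are stand-ins for activity logs.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC

rng = np.random.default_rng(1)
X = rng.normal(size=(500, 4))                  # e.g., activity-count logs
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)  # synthetic pass/fail label

for name, model in [("decision tree", DecisionTreeClassifier(max_depth=4)),
                    ("SVM", SVC(kernel="rbf"))]:
    scores = cross_val_score(model, X, y, cv=5, scoring="f1")
    print(f"{name}: F1 = {scores.mean():.3f} +/- {scores.std():.3f}")
```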
... According to Learning Analytics (LA) research reviews, the prediction of student performance was identified as the most popular topic in fully online or blended learning [4], [5], [6]. Many researchers have invested their efforts in the development of early detection of struggling students for timely interventions [7], [8], [9], [10]. ...
... Early warning systems aim to identify at-risk students accurately and as soon as possible. However, most existing research approaches did not obtain satisfactory accuracy rates [26], [27], [28], [29], or did not identify students early enough for an academic intervention [7], [8], [10]. This study included an intensive literature review to identify the aforementioned research gaps. ...
... Of those listed, only nine articles in higher education settings perform early warning prediction. However, even those studies have the following deficiencies: (1) small sample size, such as Kim, Park, Yoon and Jo [28], whose study analyzed 105 students for predictive modeling; (2) low prediction accuracy [26], [27], [29], where the prediction accuracies were close to random guessing and showed signs of overfitting; (3) late prediction timing [7], [8], [10], in which the studies could not identify at-risk students until the 10th or 12th week of a 16-week semester; and (4) a specific course [9], [32], in which studies analyzed programming behaviors for the predictive modeling. ...
Article
Performance prediction is a leading topic in learning analytics research due to its potential to impact all tiers of education. This study proposes a novel predictive modeling method to address the research gaps in existing performance prediction research. The gaps addressed include: the lack of existing research focus on performance prediction rather than identifying key performance factors; the lack of common predictors identified for both K-12 and higher education environments; and the misplaced focus on absolute engagement levels rather than relative engagement levels. Two datasets, one from higher education and the other from a K-12 online school with 13,368 students in more than 300 courses, were applied using the predictive modeling technique. The results showed the newly suggested approach had higher overall accuracy and sensitivity rates than the traditional approach. In addition, two generalizable predictors were identified from instruction-intensive and discussion-intensive courses.
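The relative-engagement idea, ranking a student's engagement against peers in the same course rather than using raw counts, can be illustrated briefly. Column names here are hypothetical:

```python
# Percentile-rank engagement within each course so the predictor is
# comparable across courses of very different raw intensity.
import pandas as pd

logs = pd.DataFrame({
    "course_id": ["A", "A", "A", "B", "B", "B"],
    "student_id": [1, 2, 3, 4, 5, 6],
    "weekly_logins": [2, 10, 6, 40, 55, 70],
})
# Course B generates far more raw activity than course A, but the
# within-course percentile puts both on the same scale.
logs["relative_engagement"] = (
    logs.groupby("course_id")["weekly_logins"].rank(pct=True)
)
print(logs)
```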
Chapter
This study proposes an analytic approach which combines two predictive models (the predictive model of successful students and the predictive model of at-risk students) to enhance prediction performance for use under the constraints of limited data collection. A case study was conducted to examine the effects of the model combination approach. Eight variables were collected from a data warehouse and the Learning Management System. The best model was selected based on the lowest misclassification rate in the validation dataset. The confusion matrix compares the model’s performance with the following parameters: accuracy, misclassification, and sensitivity. The results show the new combination approach can capture more at-risk students than the singular predictive model, and is only suitable for the ensemble predictive algorithms.
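A loose sketch of the two-model combination idea, with synthetic data and random forests as placeholders for the unspecified ensemble algorithms; a student is flagged when either model's view marks them at risk:

```python
# One model tuned to recognise at-risk students, one to recognise
# successful students; flag a student when either view says "at risk".
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, recall_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(2)
X = rng.normal(size=(600, 8))    # stand-in for the eight collected variables
y = (X[:, 0] - X[:, 1] + rng.normal(0, 1, 600) < -0.5).astype(int)  # 1 = at risk

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
at_risk_model = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)
success_model = RandomForestClassifier(random_state=1).fit(X_tr, 1 - y_tr)

# At risk if the at-risk model says so OR the success model declines to
# call the student successful.
flag = ((at_risk_model.predict(X_te) == 1) |
        (success_model.predict(X_te) == 0)).astype(int)
print("accuracy:", accuracy_score(y_te, flag))
print("misclassification:", 1 - accuracy_score(y_te, flag))
print("sensitivity (at-risk recall):", recall_score(y_te, flag))
```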
... In a similar vein, [76] described how the use of analytics can be framed in a pedagogical model, where students viewed the analytics as a guideline for sensemaking that can empower them to regulate their learning process. For LA prediction models, it was indicated that transparency about why and how certain predictions are made is essential for teachers and students to understand how best to act upon the predictions [50]. Also, [26] showed how an LA recommendation can make more sense when the rationale behind it is transparent to the learner. ...
... Categories identified in the review and their supporting references: FAT in LA ethical frameworks [33], [54], [70], [72]; FAT in a personal code of ethics [38]; institutional transparency [22], [31], [56], [69]; transparency and data [22], [23], [66]; implications of transparency in LA, namely transparency for understanding, sensemaking, and reflection [1], [5], [26], [37], [43], [46], [47], [49], [50], [61], [76]; transparency for acceptance and adoption [10], [13], [66], [73]; transparency to build trust [9], [41], [64]; transparency and the option to opt out [55], [56], [72]; transparency to support LA co-design [18], [59]; transparent LA tools [7], [62]; transparent LA research [29]; institutional accountability [33]; algorithmic accountability [2], [30], [35]; accountable learning [32], [44], [53]; and fair LA outcomes: ...
Conference Paper
Full-text available
The scientific community is currently engaged in global efforts towards a movement that promotes positive human values in the ways we formulate and apply Artificial Intelligence (AI) solutions. As the use of intelligent algorithms and analytics are becoming more involved in how decisions are made in public and private life, the societal values of Fairness, Accountability and Transparency (FAT) and the multidimensional value of human Well-being are being discussed in the context of addressing potential negative and positive impacts of AI. This research paper reviews these four values and their implications in algorithms and investigates their empirical existence in the interdisciplinary field of Learning Analytics (LA). We present and highlight results of a literature review that was conducted across all the editions of the Learning Analytics & Knowledge (LAK) ACM conference proceedings. The findings provide different insights on how these societal and human values are being considered in LA research, tools, applications and ethical frameworks.
... Different from GAMs, decision trees aim to learn a set of decision rules, organized in a hierarchical tree-like manner, to determine the value of the dependent variable. For instance, when predicting students' competence in collaborative problem solving, Cukurova, Zhou, Spikol, and Landolfi (2020) and Pardo et al. (2016) built decision trees to provide a set of useful decision rules capturing factors that were essential to students' performance. Due to their hierarchical structure, decision trees have been widely regarded as a useful technique to enable XAIED. ...
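One reason trees enable this kind of explainability is that each leaf is itself a rule. A small illustration (synthetic data, not from either cited study) of treating a fitted tree's leaves as ready-made student subpopulations:

```python
# Each leaf of a fitted tree corresponds to one decision rule; `apply`
# maps every student to the leaf they satisfy, grouping them by rule.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(3)
X = rng.normal(size=(200, 3))          # synthetic behaviour features
y = (X[:, 0] > 0).astype(int)          # toy outcome label
tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)

leaves = tree.apply(X)                 # leaf index per student
for leaf in np.unique(leaves):
    idx = np.where(leaves == leaf)[0]
    print(f"leaf {leaf}: {idx.size} students share this decision rule")
```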
... For explainable AI in education this means that the emphasis should not only be on explaining the inner workings of an algorithm and how certain results are computed, but that there should be a purposeful design consideration of an AI system that can guide the user to take a certain action (Winne, 2021). As shown in the current paper and the literature, certain classes of AIED systems seek to provide actionable explanations to promote learning, such as the provision of formative feedback (Knight, 2020), triggering reflection and metacognitive monitoring, explainable recommendation of learning resources to engage with (Barria-Pineda et al., 2021; Palominos et al., 2019), revision of course content (Ali, Hatala, Gašević, & Jovanović, 2012), assessment of algorithmic bias and fairness (Baker & Hawn, 2021; Kizilcec & Lee, in press), optimal use of instructors' time to review student work (Darvishi et al., 2021), provision of support for those who need it the most (Khosravi, Shabaninejad, et al., 2021), or, more generally, taking a wide range of pedagogical actions (Pardo et al., 2016). ...
Article
Full-text available
There are emerging concerns about the Fairness, Accountability, Transparency, and Ethics (FATE) of educational interventions supported by the use of Artificial Intelligence (AI) algorithms. One of the emerging methods for increasing trust in AI systems is to use eXplainable AI (XAI), which promotes the use of methods that produce transparent explanations and reasons for decisions AI systems make. Considering the existing literature on XAI, this paper argues that XAI in education has commonalities with the broader use of AI but also has distinctive needs. Accordingly, we first present a framework, referred to as XAI-ED, that considers six key aspects in relation to explainability for studying, designing and developing educational AI tools. These key aspects focus on the stakeholders, benefits, approaches for presenting explanations, widely used classes of AI models, human-centred designs of the AI interfaces and potential pitfalls of providing explanations within education. We then present four comprehensive case studies that illustrate the application of XAI-ED in four different educational AI tools. The paper concludes by discussing opportunities, challenges and future research needs for the effective incorporation of XAI in education.
... A knowledge gap has been identified between the creation of models for student performance prediction and the interpretation of those predictions for an actionable decision-making process [18]. To support this pedagogical change, a model based on recursive partitioning and automatic feature selection was developed for robust classification with high interpretability. ...
... For this pedagogical change, a model based on recursive partitioning and automatic selection of features for robust classification was developed with high interpretability. The strength of the model was the transparent characterisation of student subgroups based on relevant features for easy translation into actionable processes [18]. ...
Article
Full-text available
Major issues currently restricting the use of learning analytics are the lack of interpretability and adaptability of the machine learning models used in this domain. Interpretability makes it easy for stakeholders to understand how these models work, and adaptability makes it easy to use the same model for multiple cohorts and courses in educational institutions. Recently, some models in learning analytics have been constructed with interpretability in mind, but their interpretability is not quantified, and adaptability is not specifically considered in this domain. This paper presents a new framework based on hybrid statistical fuzzy theory to overcome these limitations. It also provides explainability in the form of rules describing the reasoning behind a particular output. The paper also discusses the system's evaluation on a benchmark dataset, showing promising results. The measure of explainability, the fuzzy index, shows that the model is highly interpretable. The system achieves more than 82% recall in both the classification and the context adaptation stages.
... Many data mining techniques have proven useful for analysing and interpreting educational data, detecting patterns and trends [19], [20], [1], and predicting student outcomes [21]. There are also numerous studies in which learning analytics have been used to create predictive models [22], [4], to predict course success, and to discover students at risk [23], [24], [25]. However, the use of predictive models to inform interventions derived from data collected inside and outside the classroom is relatively unexplored [26]. ...
... Tree models can provide teachers with a visual aid to understand student performance [22]. Thus, Figure 4 shows the analysis including the evaluation tests and values normalized to 1. ...
... In the requirements analysis phase we will conduct workshops and interviews to include stakeholders' views. As pointed out in Martinez-Maldonado et al. (2016), a challenge of this first stage is the identification of "possible new and radical features that can be offered by the data to address stakeholder needs, but where the stakeholders may not realize this". For example, many students are unaware, and might not even believe, that dropout from a degree can be predicted with high accuracy at the end of the first semester of study, as shown in various works (Berens et al. 2019, Wagner et al. 2020). ...
... In contrast, some educators and researchers take the view that why the model is accurate is both more interesting and more relevant than raw performance (Pardo et al., 2016). For example, when predicting learning gains, an intricate, black-box model gives little to no actionable insight into how an educational process could be altered or improved without a series of complex simulations and mathematical examination (Chakraborty et al., 2018). ...
Conference Paper
Full-text available
Network analysis simulations were used to guide decision-makers while configuring instructional spaces on our campus during COVID-19. Course enrollment data were utilized to estimate metrics of student-to-student contact under various instruction mode scenarios. Campus administrators developed recommendations based on these metrics; examples of learning analytics implementation are provided.
... Both studies in this category proposed models for course-level student performance prediction in a higher education setting. Both studies used numerical input, where one utilized e-learning analytics [47] and the other [48] used pre-course performance data for prediction. ...
... The thickness of lines is proportional to the number of times a primary paper used each approach. Decision tree algorithms: [39], [46]; J48 [45], [35]; JRip [35]; Random Forest [40], [44], [49], [39]; unspecified [47]. Deep learning: LSTM [48]. Other machine learning algorithms: SVM [49], [39]; RBF [43]; logistic regression [39]; Naïve Bayes [39]. Rule learning algorithms: CN2 rule inducer [37]; classification association rule mining [38]. Genetic-based algorithms: [41], [36], [42]. ...
Article
Full-text available
Successful prediction of student performance has significant impact to many stakeholders, including students, teachers and educational institutes. In this domain, it is equally important to have accurate and explainable predictions, where accuracy refers to the correctness of the predicted value, and explainability refers to the understandability of the prediction made. In this systematic review, we investigate explainable models of student performance prediction from 2015 to 2020. We analyze and synthesize primary studies, and group them based on nine dimensions. Our analysis revealed the need for more studies on explainable student performance prediction models, where both accuracy and explainability are properly quantified and evaluated.
... For example, Duval (2011) clarified that by collecting data about user behavior, LA can be useful for providing recommendations about learning resources and activities. In addition, the literature showed that mining students' online social interaction is important; at this point the source reproduces a reference table listing several dozen studies on this topic (including Pardo et al., 2016 and Pardo et al., 2017), a second group on dropout and retention (e.g., Cambruzzi et al., 2015; Lykourentzou et al., 2009; Márquez-Vera et al., 2016), and the heading of a third group, CSBA decision modeling (data-driven decision-making). ...
... There were also studies that applied other techniques, such as sequential pattern mining, correlation mining, causal data mining, and outlier detection, for similar purposes. Many studies have also been dedicated to regulating the complexity of the representation (Holzhüter et al., 2013; Pejić and Molcer, 2016) and providing pedagogical support to students (Little et al., 2011; Mazza and Dimitrova, 2004; Pardo et al., 2016) using techniques such as classification, clustering, association rule mining, and visual data mining. Techniques such as classification, clustering, association rule mining, text mining, visual data mining, and statistics have been frequently used to analyze student learning and interaction in different collaborative activities (Ji et al., 2016; McCuaig and Baldwin, 2012; Paiva et al., 2016), as well as to provide learning opportunities that incorporate students' prior knowledge of content (Aher and Lobo, 2013; Chen et al., 2008; Chrysostomou et al., 2009). ...
Article
The potential influence of data mining analytics on students' learning processes and outcomes has been realized in higher education. Hence, a comprehensive review of educational data mining (EDM) and learning analytics (LA) in higher education was conducted. This review covered the most relevant studies related to four main dimensions: computer-supported learning analytics (CSLA), computer-supported predictive analytics (CSPA), computer-supported behavioral analytics (CSBA), and computer-supported visualization analytics (CSVA) from 2000 to 2017. The relevant EDM and LA techniques were identified and compared across these dimensions. Based on the results of 402 studies, it was found that specific EDM and LA techniques could offer the best means of solving certain learning problems. Applying EDM and LA in higher education can be useful in developing a student-focused strategy and providing the required tools that institutions will be able to use for the purposes of continuous improvement.
... All these studies use data from ITS environments, but similar studies on traditional LMS environments can also be found in the literature: for example, predicting performance in Moodle course activities using a collaborative multi-regression model (Elbadrawy et al., 2015), or performance in midterm and final exams using partitioning trees (Pardo, Mirriahi, Martinez-Maldonado, Jovanovic, Dawson & Gašević, 2016). The environments of ITS, LMS and MOOC platforms can be different. ...
... We also use it in Section 6.1 because we expect a linear relationship between the selected variables and students' learning gains. Other authors use different methods such as Bayesian knowledge tracing (Guo & Wu, 2015), 1-NN (Koutina & Kermanidis, 2011), neural networks using radial basis functions (Calvo-Flores et al., 2006), hidden Markov models (Balakrishnan & Coetzee, 2013), support vector machines (Kloft et al., 2014), partitioning trees (Pardo et al., 2016) or C4.5 (Hu et al., 2014), among many others. Another interesting approach is to ensemble different prediction methods to achieve more robust results (Pardos et al., 2010; Essa & Ayad, 2012). ...
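The ensembling idea mentioned at the end of this fragment can be sketched with scikit-learn's `VotingRegressor`, averaging three of the method families listed above; data and model choices are illustrative only:

```python
# Average several distinct regressors for a more robust grade prediction.
import numpy as np
from sklearn.ensemble import VotingRegressor
from sklearn.linear_model import LinearRegression
from sklearn.svm import SVR
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(4)
X = rng.normal(size=(400, 5))                        # synthetic features
y = X @ np.array([1.0, 0.5, 0.0, -0.3, 0.2]) + rng.normal(0, 0.5, 400)

ensemble = VotingRegressor([
    ("linear", LinearRegression()),
    ("svm", SVR()),
    ("tree", DecisionTreeRegressor(max_depth=4)),
])
ensemble.fit(X, y)
print("first five predictions:", ensemble.predict(X[:5]).round(2))
```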
Thesis
Full-text available
The 'big data' scene has brought new improvement opportunities to most products and services, including education. Web-based learning has become very widespread over the last decade, which, in conjunction with the MOOC phenomenon, has enabled the collection of large and rich data samples regarding the interaction of students with these educational online environments. We have detected different areas in the literature that still need improvement and more research studies, particularly in the context of MOOCs and SPOCs, where we focus our data analysis on the platforms Khan Academy, Open edX and Coursera. More specifically, we work towards learning analytics visualization dashboards, carrying out an evaluation of these visual analytics tools. Additionally, we delve into the activity and behavior of students with regular and optional activities, badges, and academically dishonest conduct online. The analysis of student activity and behavior is divided, first, into exploratory analysis providing descriptive and inferential statistics, like correlations and group comparisons, as well as numerous visualizations that facilitate conveying understandable information. Second, we apply clustering analysis to find different profiles of students for different purposes, e.g., to analyze potential adaptation of learning experiences and pedagogical implications. Third, we also provide three machine learning models, two of them to predict learning outcomes (learning gains and certificate accomplishment) and one to classify submissions as illicit or not. We also use these models to discuss the importance of variables. Finally, we discuss our results in terms of the motivation of students, student profiling, instructional design, potential actuators and the evaluation of visual analytics dashboards, providing different recommendations to improve future educational experiments.
... Knowles [20] predicted student dropout from secondary educational institutions. Kotsiantis et al. [21] predicted pass rates using binary classification, while [27] classified students using decision tree and naïve Bayes classifiers. Saa [31] predicted the grades of each student who registered for the course during that semester. ...
Article
Full-text available
Traditional face-to-face education has shifted to online education to prevent large gatherings and crowds from spreading the COVID-19 virus. Several online platforms, such as Zoom, GoToMeeting, Microsoft Teams, and WebEx, approximate traditional teaching and promote online education. Online classes are particularly beneficial for hospitalized students, massive open online courses (MOOCs), and lifelong learners. This paper uses a deep learning model to predict student performance in an online environment. Student interaction with the online environment is vital to predicting student performance; this prediction helps identify at-risk students, whom teachers can then motivate. We used student interaction features such as click sums, studied course credits to understand students' behaviour, and forecast their final scores using hybrid deep learning models. The proposed hybrid model predicts student performance with an accuracy of 98.80%. The results show that the proposed deep learning model effectively predicts student performance in an online environment.
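The abstract does not specify the hybrid architecture, so the following Keras sketch shows only one plausible shape: Conv1D feature extraction feeding an LSTM over weekly click sums. The window length, feature count and labels are all assumptions, not the paper's setup:

```python
# One plausible "hybrid" shape: local clickstream patterns via Conv1D,
# temporal dynamics via LSTM, a sigmoid head for pass/fail.
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

n_students, n_weeks, n_features = 256, 30, 4   # e.g., weekly click sums
X = np.random.rand(n_students, n_weeks, n_features).astype("float32")
y = np.random.randint(0, 2, n_students)        # 1 = passes the course

model = keras.Sequential([
    keras.Input(shape=(n_weeks, n_features)),
    layers.Conv1D(16, kernel_size=3, activation="relu"),  # local patterns
    layers.LSTM(32),                                      # temporal dynamics
    layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])
model.fit(X, y, epochs=3, batch_size=32, verbose=0)
```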
... Decision trees are often seen as more interpretable than other machine learning models because they use a clear, systematic approach to decision-making that is easy for humans to understand. Pardo et al. [37] use recursive partitioning that can enable instructors to readily identify subpopulations of students to provide specific support actions, and it provides actionable insights. For the prediction of students' efficiency in collaborative problem solving, [38] built decision trees. ...
... Only 7 of the 42 selected articles cite the use of a specific algorithm (16.67%). The authors of [115, 119] report using decision trees. [134] mentions performing classification, but does not specify the algorithm used. ...
Thesis
This work is part of an effort to implement a learning analytics process in a context where document production is carried out through a model-driven engineering approach. We are mainly interested in the possibilities that could emerge if the same approach is used to carry out such an implementation. Our research problem concerns the identification of these possibilities, in particular by ensuring that the proposed metamodel allows learning indicators to be enriched with the semantics and structure of the pedagogical documents consulted by learners, as well as an upstream definition of the relevant indicators. To design the metamodel in question, we first conducted an exploratory study with learners to understand their needs and their reception of enriched indicators. Second, we carried out a systematic literature review of the interaction indicators existing in the field of learning analytics, in order to identify the elements that could be abstracted to build a metamodel representing them. The challenge was to design a metamodel in which the elements necessary for abstracting this domain are present without being needlessly complex, making it possible to model both learning indicators based on descriptive analysis and those producing a prediction or a diagnosis. We then carried out a proof of concept and an evaluation of this metamodel with modellers.
... In a CS1 course, Porter et al. (2014) and a subsequent study by Liao et al. (2016) used answers from classroom response questions at the start of the term and showed that the predictive models built could predict end-of-course performance, thus enabling timely intervention for at-risk students at an early stage. Using data from a first-year engineering course, Pardo et al. (2016) suggested that decision tree models built from log data could be used to give personalized feedback. Log data from LMS have been used to build predictive linear regression and logistic regression models in various non-computing courses within the same university, and it was shown that the models generated were course-specific (Conijn et al., 2016). ...
Article
Full-text available
Some novice learners of computer programming are at risk of doing badly in their first programming course. In this pilot study, we develop a logistic regression model to predict at-risk students in our introductory programming course. The model is developed using students' high school mathematics grades, features calculated from log data, and scores from a programming quiz. The model suggests that students who have a lower mathematics grade, who submit their homework assignments late, and who have lower scores in the programming quiz are more likely to be at risk. We discuss some implications of this result for our teaching and learning strategies in our course.
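A minimal sketch of the kind of model the abstract describes, with synthetic data whose generating rule mirrors the reported signs (lower mathematics grade, more late submissions, and lower quiz scores raise risk):

```python
# Logistic regression over math grade, late submissions, and quiz score.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(5)
n = 200
math_grade = rng.uniform(1, 10, n)          # hypothetical 1-10 scale
late_submissions = rng.integers(0, 8, n)
quiz_score = rng.uniform(0, 100, n)
# Toy rule: risk grows as math and quiz drop and late submissions pile up.
at_risk = (0.5 * (10 - math_grade) + 0.4 * late_submissions
           + 0.05 * (100 - quiz_score) + rng.normal(0, 1, n) > 6).astype(int)

X = np.column_stack([math_grade, late_submissions, quiz_score])
clf = LogisticRegression(max_iter=1000).fit(X, at_risk)
print(dict(zip(["math_grade", "late_submissions", "quiz_score"],
               clf.coef_[0].round(2))))
```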
... One of the greatest challenges for Massive Open Online Course (MOOC) learners is to be able to self-direct and self-regulate their learning process and adjust their strategies according to the particular context in order to achieve their learning objectives [35]. In the past years, and due to the massive amount of data collected from MOOC platforms, several researchers in the Learning Analytics (LA) community have focused on the analysis of learners' trace data to unveil their learning strategies and propose new classifications accordingly [25, 28]. Several methods and techniques have been applied to analyse these trace data, such as unsupervised machine learning techniques, sequence mining algorithms, transition graphs or hidden Markov models [18, 25]. ...
Chapter
The study of learners' behaviour in Massive Open Online Courses (MOOCs) is a topic of great interest for the Learning Analytics (LA) research community. In the past years, there has been a special focus on the analysis of students' learning strategies, as these have been associated with successful academic achievement. Different methods and techniques, such as temporal analysis and process mining (PM), have been applied for analysing learners' trace data and categorising them according to their actual behaviour in a particular learning context. However, prior research in Learning Sciences and Psychology has observed that results from studies conducted in one context do not necessarily transfer or generalise to others. In this sense, there is an increasing interest in the LA community in replicating and adapting studies across contexts. This paper serves to continue this trend of reproducibility and builds upon a previous study which proposed and evaluated a PM methodology for classifying learners according to seven different behavioural patterns in three asynchronous MOOCs of Coursera. In the present study, the same methodology was applied to a synchronous MOOC on edX with N = 50,776 learners. As a result, twelve different behavioural patterns were detected. We then discuss what decisions other researchers should make to adapt this methodology and how these decisions can affect the analysis of trace data. Finally, the results obtained from applying the methodology contribute insights to the study of learning strategies, providing evidence about the importance of the learning context in MOOCs. Keywords: Learning analytics; Learning behaviour; Learning strategies; Process mining; Massive open online courses.
... The purpose of learning performance prediction is to predict and understand the academic performance of students along their learning trajectory, help teachers comprehensively understand students' academic situation, and implement targeted intervention plans based on the predicted results to improve the learning experience [5]. However, using this method also faces multiple challenges, especially how to obtain, process and use data to build learning behavior models [6]. It follows that predicting learners' academic performance has become a key issue in the field of learning analytics. ...
Article
Full-text available
Learning analytics provides a new opportunity for the development of online education and has received extensive attention from scholars worldwide. How to use data and models to predict learners' academic success or failure and give teaching feedback in a timely manner is a core problem in the field of learning analytics. At present, many scholars use key learning behaviors to improve the prediction effect by exploring the implicit relationship between learning behavior data and grades. At the same time, it is very important to explore the association between behavior categories and prediction effects in learning behavior classification. This paper proposes a self-adaptive feature fusion strategy based on learning behavior classification, aiming to mine an effective e-learning behavior feature space and further improve the performance of the learning performance prediction model. First, a behavior classification model (E-learning Behavior Classification Model, EBC Model) based on interaction objects and the learning process is constructed; second, the feature space is preliminarily reduced by the entropy weight method and variance filtering; finally, a learning performance predictor is built by combining the EBC Model with a self-adaptive feature fusion strategy. The experiments use the Open University Learning Analytics Dataset (OULAD). The experimental analysis yields an effective feature space: the basic interaction behavior (BI) and knowledge interaction behavior (KI) categories have the strongest correlation with learning performance. It also shows that the proposed self-adaptive feature fusion strategy can effectively improve the performance of the learning performance predictor, with accuracy (ACC), F1-score (F1) and kappa (K) reaching 98.44%, 0.9893 and 0.9600, respectively. This study constructs e-learning performance predictors and mines the effective feature space from a new perspective, providing auxiliary references for online learners and managers.
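The two screening steps named in the abstract, variance filtering and the entropy weight method, are standard enough to sketch. The data below are synthetic stand-ins for OULAD-style behaviour counts:

```python
# Step 1: drop near-constant behaviour features with a variance filter.
# Step 2: score the survivors with the entropy weight method.
import numpy as np
from sklearn.feature_selection import VarianceThreshold

rng = np.random.default_rng(6)
X = rng.integers(0, 50, size=(300, 6)).astype(float)
X[:, 5] = 3.0                       # a near-constant feature to be filtered

X_var = VarianceThreshold(threshold=0.1).fit_transform(X)

def entropy_weights(M):
    """Entropy weight method: lower-entropy (more informative) columns
    receive higher weights. Assumes non-negative data."""
    P = M / M.sum(axis=0, keepdims=True)
    with np.errstate(divide="ignore", invalid="ignore"):
        logP = np.where(P > 0, np.log(P), 0.0)
    e = -(P * logP).sum(axis=0) / np.log(M.shape[0])
    return (1 - e) / (1 - e).sum()

print("feature weights:", entropy_weights(X_var).round(3))
```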
... We need ways to ensure that our LA has a strong link to reality and makes grounded and well-founded claims about what a decision maker, student, or educator could change to improve the outcomes modelled. How can we work towards ensuring that the models generated by LA are actionable [39], with results that can be mapped to definable educational constructs? As such, this problem is highly related to our problem of Theory described above. ...
... Improved students' learning outcomes. Mardikyan and Badur (2011) investigate the factors associated with the assessment of instructors' teaching performance using two data analysis techniques. ...
Article
Full-text available
Learning analytics is a form of data analysis that allows teachers, lecturers, educational experts, and administrators of online learning to look for students' online traces and information associated with the learning processes. The fundamental goal of learning analytics in online classrooms and computer-supported instruction is to enhance the learning experience and the entire learning process. This review aims at reviewing some of the benefits available through using learning analytics in higher education institutions (HEIs) for students, teaching staff and management. The search for relevant literature was conducted by searching online databases which include Web of Science, SCOPUS, Science Direct, IEEE, Emerald, Springer, ERIC and the Association for Computing Machinery (ACM). The analysis of the literature obtained from the online databases revealed that learning analytics provides a series of benefits to the students, teaching staff and management of higher education institutions. The benefits include prediction and identification of target courses, curriculum development and improvement, improved students' learning outcomes, improved instructors' performance and monitoring of students' dropout and retention. It is recommended that higher education institutions adopt the use of learning analytics in their online teaching and learning.
... The term actionable insight has been increasingly used in several learning analytics research papers (Drachsler et al., 2015;Gašević et al., 2017;Pardo et al., 2016;Sergis & Sampson, 2017); however, it was only recently formally defined as "data that allows a corrective procedure, or feedback loop, to be established for a set of actions" (Jørnø & Gynther, 2018). This concept foregrounds the importance of designing for specific actors, with particular tasks and levels of expertise, rather than a vague "user." ...
Article
Full-text available
Using data to generate a deeper understanding of collaborative learning is not new, but automatically analyzing log data has enabled new means of identifying key indicators of effective collaboration and teamwork that can be used to predict outcomes and personalize feedback. Collaboration analytics is emerging as a new term to refer to computational methods for identifying salient aspects of collaboration from multiple group data sources for learners, educators, or other stakeholders to gain and act upon insights. Yet, it remains unclear how collaboration analytics go beyond previous work focused on modelling group interactions for the purpose of adapting instruction. This paper provides a conceptual model of collaboration analytics to help researchers and designers identify the opportunities enabled by such innovations to advance knowledge in, and provide enhanced support for, collaborative learning and teamwork. We argue that mapping from low-level data to higher-order constructs that are educationally meaningful, and that can be understood by educators and learners, is essential to assessing the validity of collaboration analytics. Through four cases, the paper illustrates the critical role of theory, task design, and human factors in the design of interfaces that inform actionable insights for improving collaboration and group learning.
... Assessment objectives and their references: performance [8], [35]; grading [11], [36], [37], [38]; achievement [39], [40], [41], [42]; reflection [3]; acquisition [43]; participation [44]; meta-cognitive [45]; engagement [46], [26]; knowledge domain [47]; self-regulated learning [48], [49]; prediction of success [7], [25], [27], [19]; procrastination [28]. Table III shows the objective of the assessment and the parameter attributes that support it. Early prediction of student success: [24] access time, action, information, access_counting, status_participation; [7] standard data (demographics, values, time of activities in online classes, video viewing activities) and behavioral alignment data from the mining process; [25] course_view, assign_view, assign_submit_update, resource_view, forum_view, previous prerequisite assessment, exam score; [27] click-stream behavior detected temporally so that it functions as a behavioral intervention. Performance: [8] quiz grades, assignment grades, project grades, participation scores, midterm scores, and online class attendance; [35] assignments, quizzes, midterms, final grades, video viewing, reading material, online class attendance. Engagement: [26] level of student involvement in each material, frequency of access, profile. Self-regulated learning: [48] time, id, action. ...
... Late prediction timing (Pardo et al., 2016; Waddington et al., 2016; Casey and Azcona, 2017; Costa et al., 2017; Hung et al., 2017). ...
Article
Purpose: For studies in educational data mining or learning analytics, the prediction of students' performance or early warning is one of the most popular research topics. However, research gaps indicate a paucity of research using machine learning and deep learning (DL) models in predictive analytics that includes both behaviors and text analysis.

Design/methodology/approach: This study combined behavioral data and discussion board content to construct early warning models with machine learning and DL algorithms. In total, 680 course sections, 12,869 students and 14,951,368 logs were collected from a K-12 virtual school in the USA. Three rounds of experiments were conducted to demonstrate the effectiveness of the proposed approach.

Findings: The DL model performed better than the machine learning models and was able to capture 51% of at-risk students in the eighth week with 86.8% overall accuracy. The combination of behavioral and textual data further improved the model's performance in both recall and accuracy rates. The total word count is a more general indicator than the textual content feature. When text was imported into a linguistic function word analysis tool, successful students showed more words in the 'analytic' dimension and at-risk students more in the 'authentic' dimension. The balanced threshold was 0.315, which can capture up to 59% of at-risk students.

Originality/value: The results of this exploratory study indicate that the use of student behaviors and text in a DL approach may improve the predictive power of identifying at-risk learners early enough in the learning process to allow for interventions that can change the course of their trajectory.
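The threshold-balancing step the abstract mentions can be sketched as a simple sweep over the decision threshold on predicted risk; the study's 0.315 value is specific to its own model and appears below only as one grid point:

```python
# Sweep the decision threshold and watch recall/precision trade off.
import numpy as np
from sklearn.metrics import precision_score, recall_score

rng = np.random.default_rng(7)
y_true = rng.binomial(1, 0.2, 1000)                    # 20% at risk (toy)
y_prob = np.clip(0.6 * y_true + rng.normal(0.2, 0.2, 1000), 0, 1)

for t in (0.2, 0.315, 0.5):
    y_pred = (y_prob >= t).astype(int)
    print(f"threshold {t}: recall={recall_score(y_true, y_pred):.2f} "
          f"precision={precision_score(y_true, y_pred):.2f}")
```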
... Evaluation metrics are indicators aiming to evaluate a model's performance. Based on Table I, most studies adopted indicators that evaluate overall performance, such as accuracy, RMSE, MAE, and AIC [45], [53], [54]. However, since the goal is to identify potentially at-risk students, indicators like recall, F-measure, and ROC are more appropriate. ...
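The fragment's point about metric choice is easy to demonstrate: on an imbalanced cohort, accuracy can look flattering while recall exposes the misses. The numbers below are made up:

```python
# With few at-risk students, accuracy hides most of the misses.
from sklearn.metrics import (accuracy_score, recall_score, f1_score,
                             roc_auc_score)

y_true  = [0]*90 + [1]*10                  # 10% of students are at risk
y_pred  = [0]*90 + [1]*3 + [0]*7           # model finds only 3 of the 10
y_score = [0.1]*90 + [0.8]*3 + [0.4]*7     # predicted risk probabilities

print("accuracy:", accuracy_score(y_true, y_pred))   # 0.93, looks fine
print("recall:  ", recall_score(y_true, y_pred))     # 0.30, the real story
print("F1:      ", f1_score(y_true, y_pred))
print("ROC AUC: ", roc_auc_score(y_true, y_score))
```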
Article
This study proposes two innovative approaches, the 1-channel learning image recognition (1-CLIR) and the 3-channel learning image recognition (3-CLIR), to convert students' course involvement into images for early warning predictive analysis. Multiple experiments with 5,235 students and 576 absolute / 1,728 relative input variables were conducted to verify their effectiveness. The results indicate both methods can capture significantly more at-risk students (the highest average recall rate equals 77.26%) in the middle of the semester than the following machine learning algorithms: Support Vector Machine (SVM), Random Forest (RF), and Deep Neural Network (DNN). In addition, the innovative approaches allow the identification of minor subtypes of at-risk students and provide visual insights for personalized interventions. Implications and future directions are also discussed in the article.
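A loosely hedged sketch of the image-conversion idea: arrange each student's weekly involvement variables into a normalised 2-D grid that an image model can read. The real 1-CLIR/3-CLIR encodings are the paper's own; this only conveys the shape of the idea:

```python
# Turn a (weeks x variables) involvement matrix into a [0, 1] "image".
import numpy as np

def involvement_image(weekly_counts):
    """weekly_counts: (weeks, variables) array -> normalised single-channel
    image, one column per involvement variable."""
    m = np.asarray(weekly_counts, dtype=float)
    span = m.max(axis=0) - m.min(axis=0)
    span[span == 0] = 1.0                     # avoid divide-by-zero
    return (m - m.min(axis=0)) / span

rng = np.random.default_rng(8)
img = involvement_image(rng.integers(0, 30, size=(16, 12)))
print(img.shape, img.min(), img.max())        # (16, 12) 0.0 1.0
```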
... That is, to propose new indicators that represent how learners adhere to the designed paths of the course, such as activity sequences extracted from coarse-grained data. This idea is built upon previous studies, which investigated the relationships between interaction sequences and learning outcomes using methods such as transition graphs, process mining, sequential pattern analysis, and Markov models Pardo et al., 2016). Therefore, and based on prior work, this subsection tackles the following sub research question: Sub-RQ 2-4: Which indicators of SRL obtained from self-reported questionnaires and activity sequence extracted from trace data can predict course success in self-paced MOOCs? ...
Thesis
Full-text available
Massive Open Online Courses (MOOCs) have become a source of digital content anytime and anywhere. MOOCs offer quality content to millions of learners around the world, providing new opportunities for learning. However, only a fraction of those who initiate a MOOC complete it, leaving thousands of committed students without achieving their goals. Recent research suggests that one of the reasons why students find it difficult to complete a MOOC is that they have problems planning, executing, and monitoring their learning process autonomously; that is, they do not effectively self-regulate their learning (SRL). In this thesis, we explore the possibilities that Learning Analytics (LA) offers to investigate the learning strategies that students use when self-regulating their learning in online environments such as MOOCs. Particularly, the main objective of this research is to develop instruments and methods for measuring students' SRL strategies (cognitive, metacognitive and resource management) in MOOCs, and to analyze their relationship with students' learning outcomes. As a methodological approach, this thesis uses mixed methods as a baseline for organizing and planning the research, combining trace data with self-reported data to better understand SRL in MOOCs. The main contribution of the thesis is threefold. First, it proposes an instrument to measure learners' SRL profiles in MOOCs. This instrument was validated with an exploratory and confirmatory factor analysis with 4,627 responses collected in three MOOCs. Second, it presents a methodology based on data mining and process mining techniques to extract learners' SRL patterns in MOOCs. The methodology was applied in three self-paced Coursera MOOCs with data from 3,458 learners, where six patterns of interaction were identified. This methodology was then adapted and applied in a replication effort to analyze a synchronous edX MOOC with data from 50,776 learners, where twelve patterns of interaction were identified. The third contribution is a set of empirical studies that show the relationship between SRL strategies and academic performance, using data from six self-paced MOOCs in Coursera and two synchronous MOOCs in Open edX. These empirical studies led us to identify self-reported learner variables (i.e., gender, prior knowledge and occupation) and self-reported SRL strategies (i.e., goal setting, strategic planning) as the most relevant to predict academic performance.
... Many early warning studies adopted indicators for evaluating overall performance, such as accuracy, Root Mean Square Error (RMSE), Mean Absolute Error (MAE), and the Akaike information criterion (AIC) [34], [35], [36]. However, since the goal is to identify potentially at-risk students, indicators like recall, F-measure, and ROC (Receiver Operating Characteristic) are more appropriate. ...
Article
Full-text available
The rapid development of learning technologies has enabled the online learning paradigm to gain great popularity in both higher education and K-12, making the prediction of student performance one of the most popular research topics in education. However, traditional prediction algorithms are designed for balanced datasets, while educational datasets are typically highly imbalanced, which makes it more difficult to accurately identify at-risk students. To solve this dilemma, this study proposes an integrated framework (LVAEPre) based on a latent variational autoencoder (LVAE) with a deep neural network (DNN) to alleviate the imbalanced distribution of educational datasets and further to provide early warning of at-risk students. Specifically, with the characteristics of educational data in mind, LVAE aims to learn the latent distribution of at-risk students and to generate at-risk samples for the purpose of obtaining a balanced dataset. The DNN performs the final performance prediction. Extensive experiments based on the collected K-12 dataset show that LVAEPre can effectively handle imbalanced educational datasets and provide much better and more stable prediction results than baseline methods in terms of accuracy and F1.5 score. The comparison of t-SNE visualization results further confirms the advantage of LVAE in dealing with the imbalance issue in educational datasets. Finally, through the identification of the significant predictors of LVAEPre in the experimental dataset, some suggestions for designing pedagogical interventions are put forward.
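The paper's LVAE learns the latent distribution of at-risk students and samples from it to rebalance the dataset. As a much simpler stand-in for that rebalancing step (explicitly not the paper's method), the sketch below oversamples the minority class with small Gaussian perturbations:

```python
# Poor man's generative oversampling: jitter existing minority samples
# until the at-risk pool matches the majority class in size.
import numpy as np

def noisy_oversample(X_min, n_new, scale=0.05, seed=0):
    """Draw n_new synthetic minority rows as jittered copies."""
    rng = np.random.default_rng(seed)
    base = X_min[rng.integers(0, len(X_min), n_new)]
    return base + rng.normal(0, scale, base.shape)

rng = np.random.default_rng(9)
X_major = rng.normal(0, 1, (950, 5))     # on-track students
X_minor = rng.normal(2, 1, (50, 5))      # at-risk students (5%)
X_synth = noisy_oversample(X_minor, 900)
print("balanced at-risk pool:", len(X_minor) + len(X_synth))   # 950
```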
... This is even more the case when such models are encapsulated within artificial intelligence or analytics solutions that use them as 'black boxes' for automatic or semi-automatic decision making. This is a major concern of the Big Data era (Müller et al., 2016), which is highly relevant to the educational domain as well (Pardo et al., 2016; O'Neil, 2017). ...
Article
Full-text available
The rich data that Massive Open Online Course (MOOC) platforms collect on the behavior of millions of users provide a unique opportunity to study human learning and to develop data-driven methods that can address the needs of individual learners. This type of research falls into the emerging field of learning analytics. However, learning analytics research tends to ignore the issue of the reliability of results that are based on MOOC data, which is typically noisy and generated by a largely anonymous crowd of learners. This paper provides evidence that learning analytics in MOOCs can be significantly biased by users who abuse the anonymity and open nature of MOOCs, for example by setting up multiple accounts, due to their number and aberrant behavior. We identify these users, denoted fake learners, using dedicated algorithms. The methodology for measuring the bias caused by fake learners' activity combines the ideas of replication research and sensitivity analysis. We replicate two highly cited learning analytics studies with and without fake learners' data and compare the results. While in one study the results were relatively stable against fake learners, in the other, removing the fake learners' data significantly changed the results. These findings raise concerns regarding the reliability of learning analytics in MOOCs and highlight the need to develop more robust, generalizable and verifiable research methods.
... The source proposes an initial vocabulary/taxonomy for performance: assignment grade; course retention/dropout; course grade; course grade range (e.g., A-C, D-F); course pass/fail; exam grade; GPA; graduation; program retention/dropout; unspecified performance; not applicable; other. It then tabulates the dozens of primary studies falling under each measure (exam/post-test grade or score, course grade or score, unspecified or vague performance, course retention/dropout, GPA or GPA range including CGPA and SGPA, course grade range such as A-B/C-F or pass/fail, and knowledge gain). A reporting checklist item follows: define the factors used for prediction, and describe them in such detail that a reader who is not familiar with your particular context understands them. ...
Conference Paper
The ability to predict student performance in a course or program creates opportunities to improve educational outcomes. With effective performance prediction approaches, instructors can allocate resources and instruction more accurately. Research in this area seeks to identify features that can be used to make predictions, to identify algorithms that can improve predictions, and to quantify aspects of student performance. Moreover, research in predicting student performance seeks to determine interrelated features and to identify the underlying reasons why certain features work better than others. This working group report presents a systematic literature review of work in the area of predicting student performance. Our analysis shows a clearly increasing amount of research in this area, as well as an increasing variety of techniques used. At the same time, the review uncovered a number of issues with research quality that drives a need for the community to provide more detailed reporting of methods and results and to increase efforts to validate and replicate work.
... From an LA perspective, such a reframing ought to be conspicuous, but it seems to go unnoticed. For instance, Pardo, Mirriahi et al. (2016) write: ...
Article
The possibilities of Learning Analytics as a tool for empowering teachers and educators have created steep interest in how to provide so-called actionable insights. However, the literature offers little in the way of defining or discussing what the term "actionable insight" means. This selective literature review looks into the use of the term in current literature. The review points to a dominant perspective that assumes a rational actor, where actionable insights are treated as insights mined from data and subsequently acted upon. It also finds evidence of other perspectives and discusses the need for clarification in order to establish a more precise and fruitful use of the term.
... That is, to propose new indicators that represent how learners adhere to the designed paths of the course, such as activity sequences extracted from coarse-grained data. This idea is built upon previous studies, which investigated the relationships between interaction sequences and learning outcomes using methods such as transition graphs, process mining, sequential pattern analysis, and Markov models [14,19,23]. ...
Chapter
In the past years, predictive models in Massive Open Online Courses (MOOCs) have focused on forecasting learners' success through their grades. Predicting these grades is useful to identify problems that might lead to dropout. However, most models in prior work predict categorical and continuous variables using low-level data. This paper extends current predictive models in the literature by considering coarse-grained variables related to Self-Regulated Learning (SRL), that is, using learners' self-reported SRL strategies and MOOC activity sequence patterns as predictors. Linear and logistic regression modelling were used as a first prediction approach, with data collected from N = 2,035 learners who took a self-paced MOOC in Coursera. We identified two groups of learners: (1) Comprehensive, who follow the course path designed by the teacher; and (2) Targeting, who seek the information required to pass assessments. For both types of learners, we found a group of variables to be the most predictive: (1) the self-reported SRL strategies 'goal setting', 'strategic planning', 'elaboration' and 'help seeking'; (2) the activity sequence patterns 'only assessment', 'complete a video-lecture and try an assessment', 'explore the content' and 'try an assessment followed by a video-lecture'; and (3) learners' prior experience, together with their self-reported interest in course assessments and the number of active days and time spent on the platform. These results show how to predict more accurately when students will reach a certain status by taking into consideration not only low-level data but also complex data such as their SRL strategies. Keywords: Self-regulated learning; Prediction; Massive Open Online Courses; Sequence patterns; Achievement; Success
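As a rough illustration of the modelling approach this chapter reports (linear regression for grades, logistic regression for pass/fail, with SRL and sequence-pattern predictors), the following sketch fits both models on synthetic data; all feature names are hypothetical stand-ins:

# Sketch: linear regression for grades and logistic regression for
# pass/fail from self-reported SRL scores and activity-sequence counts.
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

rng = np.random.default_rng(0)
n = 200
X = np.column_stack([
    rng.uniform(1, 5, n),   # self-reported "goal setting" (Likert-style)
    rng.uniform(1, 5, n),   # self-reported "help seeking"
    rng.poisson(3, n),      # count of "video-lecture then assessment" sequences
    rng.poisson(5, n),      # active days on the platform
])
grade = X @ np.array([0.1, 0.05, 0.08, 0.04]) + rng.normal(0, 0.3, n)
passed = (grade > np.median(grade)).astype(int)

print(LinearRegression().fit(X, grade).coef_)            # grade model weights
print(LogisticRegression(max_iter=1000).fit(X, passed).coef_)  # pass/fail weights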
... For example, at the beginning of the course, one interest may be forecasting the midterm results, whereas in the last week the aim of prediction is to know the final exam result (or the final grade). In this case, as the course evolves and more information becomes available, results can improve in the final weeks [19]. This was also corroborated by Okubo et al. [20], who predicted grades using neural networks, and improved the accuracy from 50% in the first week to 100% in the tenth. ...
Article
This paper surveys the state of the art on prediction in MOOCs through a Systematic Literature Review (SLR). The main objectives are: (1) to identify the characteristics of the MOOCs used for prediction, (2) to describe the prediction outcomes, (3) to classify the prediction features, (4) to determine the techniques used to predict the variables, and (5) to identify the metrics used to evaluate the predictive models. Results show there is strong interest in predicting dropouts in MOOCs. A variety of predictive models are used, though regression and Support Vector Machines stand out. There is also wide variety in the choice of prediction features, but clickstream data about platform use stands out. Future research should focus on developing and applying predictive models that can be used in more heterogeneous contexts (in terms of platforms, thematic areas, and course durations), on predicting new outcomes and making connections among them (e.g., predicting learners' expectancies), on enhancing the predictive power of current models by improving algorithms or adding novel higher-order features (e.g., efficiency, constancy, etc.).
Conference Paper
Since computing education began, we have sought to learn why students struggle in computer science and how to identify these at-risk students as early as possible. Due to the increasing availability of instrumented coding tools in introductory CS courses, the amount of direct observational data of student working patterns has increased significantly in the past decade, leading to a flurry of attempts to identify at-risk students using data mining techniques on code artifacts. The goal of this work is to produce a systematic literature review to describe the breadth of work being done on the identification of at-risk students in computing courses. In addition to the review itself, which will summarize key areas of work being completed in the field, we will present a taxonomy (based on data sources, methods, and contexts) to classify work in the area.
... Another area of interest is the prediction of scores in MOOCs. Research in this field is also important to identify learners who have trouble understanding the course contents and to provide frequent and effective feedback (Pardo et al. 2016). Furthermore, anticipating grades can be more useful than anticipating dropouts in some cases since, e.g., there can be users who are interested in the course but do not manage to acquire enough skills to pass. ...
Article
The learning process in a MOOC (Massive Open Online Course) can be improved by knowing learners' grades on different assignments in advance. This would be very useful for detecting problems with enough time to take corrective measures. In this work, the aim is to analyse how different course scores can be predicted, what elements or variables affect the predictions, and how far in advance and in which way scores can be anticipated. To do that, data from a MOOC about Java programming have been used. Results show that the choice of indicators matters more than the choice of algorithms, and that forum-related variables, unlike previous scores, do not add power to predict grades. Furthermore, the type of task can change the results. Regarding anticipation, it was possible to use data from previous topics, albeit with worse performance, although the values were better than those obtained in the first seven days of the current topic.
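The finding that prior scores dominate forum-related variables can be illustrated with a small feature-set comparison; the sketch below uses synthetic data and assumed column roles, not the Java MOOC dataset:

# Sketch: compare predictive value of prior scores vs. forum activity,
# echoing the finding that forum variables add little beyond past scores.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
n = 300
prev_scores = rng.uniform(0, 10, (n, 2))      # scores on earlier topics
forum = rng.poisson(2, (n, 2))                # posts written, threads read
y = prev_scores.mean(axis=1) + rng.normal(0, 1, n)

for name, X in [("prev scores", prev_scores),
                ("forum only", forum),
                ("both", np.hstack([prev_scores, forum]))]:
    r2 = cross_val_score(RandomForestRegressor(n_estimators=100, random_state=0),
                         X, y, cv=5, scoring="r2").mean()
    print(f"{name:12s} mean CV R^2 = {r2:.3f}")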
... Over the years, an ongoing stream of work in LA has focused on student-facing tools that are used directly in a class context (e.g. [4,7,20,28,33,38,51,52]). It seems possible to identify two broad motivations behind these solutions: are they teaching students curriculum content, or are they trying to help them learn how to learn more effectively? ...
Conference Paper
Full-text available
Learning Analytics (LA) sits at the confluence of many contributing disciplines, which brings the risk of hidden assumptions inherited from those fields. Here, we consider a hidden assumption derived from computer science, namely, that improving computational accuracy in classification is always a worthy goal. We demonstrate that this assumption is unlikely to hold in some important educational contexts, and argue that embracing computational "imperfection" can improve outcomes for those scenarios. Specifically, we show that learner-facing approaches aimed at "learning how to learn" require more holistic validation strategies. We consider what information must be provided in order to reasonably evaluate algorithmic tools in LA, to facilitate transparency and realistic performance comparisons.
... Research studies in the field of LA have highlighted the potential of LA student dashboards in enhancing students' motivation and engagement (Verbert et al., 2013;Wise, Zhao, & Hausknecht, 2014), as well as improving learning behaviours and academic performance (Arnold & Pistilli, 2012). However, some have suggested that LA dashboards have resulted in more frequent but not higher quality feedback (Pardo et al., 2016;Tanes, Arnold, King, & Remnet, 2011) and can even be detrimental for learning (Corrin & de Barba, 2014). These contradictory illustrations of the effects of LA dashboards on learning emphasize the need for the purposeful and empirically-informed design of LA dashboards. ...
Conference Paper
Full-text available
Although learning analytics (LA) dashboard visualizations are increasingly being used to provide feedback to students, the literature on the effectiveness of LA dashboards has been inconclusive. To address this, an LA student dashboard visualizing students' latest data against their own data from previous weeks (i.e., self-referenced data) was designed, informed by Fredrickson's (2004) broaden-and-build theory, as well as studies highlighting personal best goals (Martin & Elliot, 2016) and the negative effects of peer comparisons (Corrin & de Barba, 2014). The self-referenced LA student dashboard was implemented and evaluated in a Singapore secondary school as part of a larger study, WiREAD. This paper reports on the quantitative impact of the WiREAD self-referenced LA dashboard visualizations on 15-year-old students' critical reading fluency, cognitive reading engagement, and English language (EL) self-efficacy, as well as students' qualitative feedback on the usefulness and shortcomings of the LA dashboard.
... The target prediction and learning environment can vary from one study to another. For example, we can find traditional settings such as high school education [1], but lately many studies use data from different types of Virtual Learning Environments (VLEs), such as Intelligent Tutoring Systems (ITSs) [3,12] and more traditional LMS environments [7,16]. Furthermore, these studies target different learning outcomes, such as graduation from high school [1], performance in course activities [7], learning gains [18] or end-of-course assessment scores [3]. ...
Conference Paper
Full-text available
The emergence of MOOCs (Massive Open Online Courses) makes available large amounts of data about students' interaction with online educational platforms. This allows for the possibility of making predictions about students' future learning outcomes based on these interactions. The prediction of certificate accomplishment can enable the early detection of students at risk, in order to perform interventions before it is too late. This study applies different machine learning techniques to predict which students are going to get a certificate during different timeframes. The purpose is to analyze how the quality metrics change when the models have more data available. Of the four machine learning techniques applied, we ultimately choose a boosted-trees model, which provides stable predictions over the weeks with good quality metrics. We determine the variables that are most important for the prediction and how they change during the weeks of the course.
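A minimal sketch of the week-by-week evaluation scheme this abstract describes, using a boosted-trees classifier on synthetic weekly activity counts (the features and course length are assumptions):

# Sketch: week-by-week certificate prediction with boosted trees,
# re-evaluating quality metrics as more weekly data accumulates.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(2)
n, weeks = 400, 6
weekly_activity = rng.poisson(4, (n, weeks))    # events per student per week
certificate = (weekly_activity.sum(axis=1) + rng.normal(0, 4, n) > 24).astype(int)

for w in range(1, weeks + 1):
    X = weekly_activity[:, :w]                  # only data up to week w
    auc = cross_val_score(GradientBoostingClassifier(random_state=0),
                          X, certificate, cv=5, scoring="roc_auc").mean()
    print(f"week {w}: mean CV AUC = {auc:.3f}")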
... It is worth noting that the benefit of accurate predictive models of attrition or disengagement, as studied through MOOC data, can also apply to face-to-face instruction, by making predictions available to teachers so that they can provide timely feedback or take any other suitable action [14], [23]. ...
Conference Paper
Full-text available
There are a number of similarities and differences between FutureLearn MOOCs and those offered by other platforms, such as edX. In this research we compare the results of applying machine learning algorithms to predict course attrition in two case studies, using datasets from a selected FutureLearn MOOC and an edX MOOC of comparable structure and themes. For each course, we computed a number of attributes in a pre-processing stage from the available raw data. Following this, we applied several machine learning algorithms to the pre-processed data to predict attrition levels for each course. The analysis suggests that the attribute selection varies in each scenario, which also impacts the behaviour of the prediction algorithms.
... The first comprises the previous two LAK hackathons, the 2015 LAK Workshop "Visual Aspects of Learning Analytics" [2], and the 2016 LAK Workshop "Data Literacy for Learning Analytics" [9]. We will set the scene for the workshop using recent research on actionable analytics [6], student feedback [4], and embedding learning analytics in pedagogic practice [5]. Finally, we will introduce Jisc's student app (https://analytics.jiscinvolve.org/wp/category/student-app/), which is being piloted with students across the UK after extensive consultation and design activities. (Apereo Foundation: https://www.apereo.org/content/about) ...
Conference Paper
The hackathon is intended to be a practical, hands-on workshop involving participants from academia and commercial organizations with both technical and practitioner expertise. It will consider the outstanding challenge of producing visualizations that are effective for the intended audience: informing action, unlikely to be misinterpreted, and embodying contextual appropriateness. It will surface particular issues as workshop challenges and explore responses to these challenges as visualizations resting upon interoperability standards and API-oriented open architectures.
Article
Full-text available
Predicting students' successful completion of academic programs, and the features that influence their performance, can have a significant effect on improving completion and graduation rates and on reducing attrition. It therefore matters to identify students who are at risk, and the courses where improvements in content, delivery mode, pedagogy, and assessment activities can improve students' learning experience and completion rates. In this work, we have developed a prediction and explanatory model using the adaptive neuro-fuzzy inference system (ANFIS) methodology to predict the grade point average (GPA), at graduation time, of students enrolled in the information technology program at Ajman University. The approach uses students' grades in introductory and fundamental IT courses and high school grade point average (HSGPA) as predictors. Sensitivity analysis was performed on the model to quantify the relative significance of each predictor in explaining variations in graduation GPA. Our findings indicate HSGPA is the most influential factor in predicting graduation GPA, with data structures, operating systems, and software engineering coming in close second. On the explanatory side, we found that discrete mathematics was the most influential course causing variations in graduation GPA, followed by software engineering, information security, and HSGPA. When we ran the model on the testing data, 77% of the predicted values fell within one root mean square error (0.29) of the actual GPA, which has a maximum of four. We have also shown that the ANFIS approach has better predictive accuracy than commonly used techniques such as multilinear regression. We recommend that IT programs at other institutions conduct comparable studies and shed further light on our findings.
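The headline metric here, the share of predictions within one RMSE of the true GPA, is easy to reproduce on any regression model; the sketch below computes it for the multilinear-regression baseline on synthetic data (the ANFIS model itself is not reproduced, and all predictors are illustrative):

# Sketch: evaluate a GPA regression by RMSE and by the fraction of
# predictions falling within one RMSE of the actual GPA.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(3)
n = 150
hsgpa = rng.uniform(60, 100, n)                 # high-school average
core = rng.uniform(0, 4, (n, 3))                # grades in three core courses
X = np.column_stack([hsgpa, core])
gpa = np.clip(0.02 * hsgpa + 0.3 * core.mean(axis=1) + rng.normal(0, 0.3, n), 0, 4)

X_tr, X_te, y_tr, y_te = train_test_split(X, gpa, random_state=0)
pred = LinearRegression().fit(X_tr, y_tr).predict(X_te)
rmse = np.sqrt(np.mean((pred - y_te) ** 2))
within = np.mean(np.abs(pred - y_te) <= rmse)   # share within one RMSE
print(f"RMSE = {rmse:.2f}; {within:.0%} of predictions within one RMSE")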
Chapter
Currently, learning early-warning systems mainly use two methods, student classification and performance regression, both of which have shortcomings. The granularity of student classification is not fine enough, while performance regression gives an absolute score that cannot directly show a student's position in the class. To overcome these shortcomings, we focus on a rare learning early-warning method: ranking prediction. We propose a dual-student performance comparison model (DSPCM) to judge the ranking relationship between a pair of students. We build the model using data that include class quiz scores and online behavior counts, and find that these two sets of features improve the Spearman correlation coefficient of the ranking prediction by 0.2986 and 0.0713, respectively. We also compare the proposed process with the method of first using a regression model to predict scores and then ranking students; the Spearman correlation coefficient of the former is 0.1125 higher than that of the latter. This reflects the advantage of the DSPCM in ranking prediction. Keywords: Learning early warning; Student ranking prediction; Class quiz score; Online behavior time
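A generic sketch of pairwise ranking prediction in the spirit of the DSPCM, though not the authors' model: a classifier judges each pair from feature differences, predicted wins are tallied into a ranking, and Spearman correlation scores it against the true ordering. Data are synthetic:

# Sketch: pairwise student comparison -> aggregated ranking -> Spearman.
import numpy as np
from itertools import combinations
from scipy.stats import spearmanr
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(4)
n = 40
feats = np.column_stack([rng.uniform(0, 10, n),   # class quiz scores
                         rng.poisson(20, n)])     # online behavior counts
score = feats[:, 0] + 0.1 * feats[:, 1] + rng.normal(0, 1, n)

pairs = list(combinations(range(n), 2))
X = np.array([feats[i] - feats[j] for i, j in pairs])    # pairwise differences
y = np.array([int(score[i] > score[j]) for i, j in pairs])

clf = LogisticRegression().fit(X, y)
wins = np.zeros(n)
for (i, j), p in zip(pairs, clf.predict(X)):
    wins[i if p == 1 else j] += 1             # tally predicted pairwise wins

rho, _ = spearmanr(wins, score)
print(f"Spearman correlation of predicted ranking: {rho:.3f}")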
Article
Full-text available
Lay Description. What is already known about this topic? Learning design (LD) is the pedagogic process used in teaching and learning that leads to the creation and sequencing of learning activities and the environment in which they occur. Learning analytics (LA) is the measurement, collection, analysis and reporting of data about learners and their contexts, for purposes of understanding and optimizing learning and the environments in which it occurs. There are multiple studies on the alignment of LA and LD, but research shows that there is still room for improvement. What does this paper add? We aim to achieve better alignment between LD and LA by proposing a framework that connects LA indicators with the activity outcomes from the LD. Our review demonstrates how learning events/objectives and learning activities are associated with LA indicators, and how an indicator is formed from (several) LA metrics. The article also aims to assist the LA research community in identifying commonly used concepts and terminologies: what to measure, and how to measure it. Implications for practice and/or policy: this article can help course designers, teachers, students, and educational researchers get a better understanding of the application of LA, and can further help LA researchers connect their research with LD.
Article
Full-text available
Predicting academic performance is a key element in education, allowing teachers to design preventive didactic actions. Various computational disciplines are involved in this predictive process, with learning analytics, machine learning, educational data mining, artificial neural networks, and fuzzy theories being the most influential. A systematic review of the scientific literature (2010-March 2020) indexed in Scopus, IEEE Xplore, ACM Digital Library and Springer is presented, with the aim of evaluating how academic performance prediction has behaved in two scenarios: (1) online and blended study modalities; and (2) technological support for the face-to-face modality. The article concludes by determining the trends between the disciplines of educational technologies and the variables of academic performance. Keywords: learning, education, higher education, bibliographic study, educational evaluation, educational informatics, school performance, educational technology.
Chapter
Predicting student performance in computing majors, and the factors affecting students' success, can have a substantial effect on improving academic performance and on-time graduation, with all the financial benefits that come with that. There is a limited amount of time an academic advisor can allocate to each student to identify problem areas in the curriculum, take appropriate actions, and advise the student based on informed judgement. Thus, there is a need to predict early in the program which students are at risk. In this work, we have built a prediction model based on particle swarm optimization to estimate the final graduation grade point average (GPA) of students enrolled in the information technology program at Ajman University. The input predictors were students' final GPA scores in core courses and their high school average grade. Based on records of the 74 students who have graduated from the program so far, we found that the most influential predictor of graduation GPA is the high school grade average. Our results showed that the Data Structures and Discrete Mathematics courses play no role in the prediction of GPA, while networking and security courses make the most significant contribution. Forty per cent of predicted values fall within 0.25 of the real GPA, which has a maximum upper bound of four. However, the accuracy of the model improved significantly when applied to a much larger publicly available dataset, with 88% of GPA scores falling within 0.25 of the actual GPA.
Article
With the wide expansion of distributed learning environments, the way we learn has become more diverse than ever. This poses an opportunity to incorporate different data sources of learning traces that can offer broader insights into learner behavior and the intricacies of the learning process. We argue that combining analytics across different e-learning systems can measure the effectiveness of learning designs and maximize learning opportunities in distributed learning settings. As a step towards this goal, in this study we considered how to broaden the context of a single learning environment into a learning ecosystem that integrates three separate e-learning systems. We present a cross-platform architecture that captures, integrates, and stores learning-related data from the learning ecosystem. To prove the feasibility and benefit of the cross-platform architecture, we used regression and classification techniques to generate interpretable models with analytics that are relevant for instructors and learners in understanding learning behavior and making sense of the instructional method's effect on learning performance. The results show that combining data across multiple e-learning systems improves classification accuracy over data from a single learning system by a factor of 5. Our work highlights the value of cross-platform analytics and presents a springboard for the creation of new cross-system data-driven research practices.
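A minimal sketch of the cross-platform idea: join per-learner features from several systems on a shared index and compare single-source against combined classification accuracy. System names, columns, and the pass rule are invented for illustration:

# Sketch: merge learner features from three e-learning systems and
# compare single-source vs. combined classification accuracy.
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(5)
n = 250
lms = pd.DataFrame({"logins": rng.poisson(10, n)})
videos = pd.DataFrame({"minutes_watched": rng.uniform(0, 300, n)})
quizzes = pd.DataFrame({"avg_quiz": rng.uniform(0, 1, n)})
combined = pd.concat([lms, videos, quizzes], axis=1)   # join on learner index
passed = ((combined["avg_quiz"] + combined["logins"] / 40
           + combined["minutes_watched"] / 600) > 1.0).astype(int)

for name, X in [("LMS only", lms), ("combined", combined)]:
    acc = cross_val_score(RandomForestClassifier(random_state=0),
                          X, passed, cv=5).mean()
    print(f"{name:9s} mean CV accuracy = {acc:.3f}")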
Conference Paper
Full-text available
This paper describes the first stages in the development of a design method for digital trainings using the collaborative authoring tool "ELIOT". Based on the theory of instrumental conflict (Marquet, 2005), this method highlights the necessity of designing digital trainings with optimal harmonization for users/learners in didactic, pedagogical and technical terms. Through the implementation of an artificial intelligence, we will collect data from users' experiences to analyse their performance. The results of this analysis will be given to the trainer/designer in order to improve future trainings through predictive learning models, increasing cognitive skills and enabling measurement of the efficiency of a digital training.
Article
Developing tools to support students and learning in a traditional or online setting is a significant task in today's educational environment. The initial steps towards enabling such technologies using machine learning techniques focused on predicting students' performance in terms of achieved grades. However, these approaches do not perform as well in predicting poor-performing students. The objective of our work is two-fold. First, in order to overcome this limitation, we explore whether poorly performing students can be predicted more accurately by formulating the problem as binary classification, based on data available before the start of the semester. Second, in order to gain insight into which factors can lead to poor performance, we engineered a number of human-interpretable features that quantify these factors. These features were derived from the grades of students at the University of Minnesota, a public undergraduate institution. Based on these features, we perform a study to identify different student groups of interest and, at the same time, identify the features' importance. As the resulting models provide different subsets of correct predictions, their combination can boost overall performance.
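The binary-classification framing with interpretable pre-semester features might look like the following sketch; the three features are hypothetical stand-ins for the grade-derived ones in the article:

# Sketch: poor performance as binary classification from pre-semester
# data, with inspectable feature weights.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

rng = np.random.default_rng(6)
n = 500
X = np.column_stack([
    rng.uniform(0, 4, n),    # prior-term GPA
    rng.poisson(15, n),      # credits attempted
    rng.uniform(0, 1, n),    # share of prerequisite courses completed
])
poor = (X[:, 0] + X[:, 2] + rng.normal(0, 0.5, n) < 2.0).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, poor, random_state=0)
clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print(classification_report(y_te, clf.predict(X_te)))
print("feature weights:", clf.coef_.round(2))   # interpretable coefficients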
Article
Full-text available
Finding a solution to the problem of student retention is an often-required task across Higher Education. Most often, managers and academics alike rely on intuition and experience to identify the potential risk students and factors. This paper examines the literature surrounding current methods and measures in use in Learning Analytics. We find that while tools are available, they do not focus on the earliest possible identification of struggling students. Our work defines a new descriptive statistic for student attendance and applies modern machine learning tools and techniques to create a predictive model. We demonstrate how students can be identified as early as week 3 (of the Fall semester) with approximately 97% accuracy. Furthermore, we situate this result within an appropriate pedagogical context to support its use as part of a more comprehensive student support mechanism.
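A toy version of the early-identification setup: summarize attendance over the first three weeks into a single rate and classify at-risk students from it. The paper's actual attendance statistic and its reported 97% model are not reproduced here:

# Sketch: week-3 early-warning classification from a simple cumulative
# attendance rate on synthetic data.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(8)
n = 300
att = rng.binomial(1, rng.uniform(0.3, 1.0, n)[:, None], (n, 3))  # weeks 1-3
rate = att.mean(axis=1)                        # cumulative attendance rate
at_risk = (rate + rng.normal(0, 0.15, n) < 0.5).astype(int)

acc = cross_val_score(RandomForestClassifier(random_state=0),
                      rate.reshape(-1, 1), at_risk, cv=5).mean()
print(f"week-3 identification accuracy: {acc:.3f}")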
Article
Full-text available
This study examined the extent to which instructional conditions influence the prediction of academic success in nine undergraduate courses offered in a blended learning model (n = 4134). The study illustrates the differences in predictive power and significant predictors between course-specific models and generalized predictive models. The results suggest that it is imperative for learning analytics research to account for the diverse ways technology is adopted and applied in course-specific contexts. The differences in technology use, especially those related to whether and how learners use the learning management system, require consideration before the log-data can be merged to create a generalized model for predicting academic success. A lack of attention to instructional conditions can lead to an over- or underestimation of the effects of LMS features on students' academic success. These findings have broader implications for institutions seeking generalized and portable models for identifying students at risk of academic failure.
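The contrast between course-specific and generalized models can be made concrete with a sketch in which each synthetic course has a different relation between LMS use and success, so pooling hurts:

# Sketch: per-course models vs. one generalized model pooled over
# courses whose LMS-use/success relations differ (even in sign).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(7)
scores = {}
X_all, y_all = [], []
for course, slope in [("A", 1.5), ("B", -0.5), ("C", 0.8)]:
    X = rng.uniform(0, 10, (150, 1))           # LMS activity level
    y = (slope * X[:, 0] + rng.normal(0, 2, 150) > slope * 5).astype(int)
    scores[course] = cross_val_score(LogisticRegression(), X, y, cv=5).mean()
    X_all.append(X)
    y_all.append(y)

pooled = cross_val_score(LogisticRegression(),
                         np.vstack(X_all), np.concatenate(y_all), cv=5).mean()
print("per-course accuracy:", {c: round(a, 3) for c, a in scores.items()})
print("generalized model accuracy:", round(pooled, 3))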
Conference Paper
Full-text available
Inquiry skills are an important part of science education standards. There has been particular interest in verifying that these skills can transfer across domains and instructional contexts [4,15,16]. In this paper, we study transfer of inquiry skills, and the effects of prior practice of inquiry skills, using data from over 2000 middle school students using an open-ended immersive virtual environment called Virtual Performance Assessments (VPAs) that aims to assess science inquiry skills in multiple virtual scenarios. To this end, we assessed and compared student performance and behavior within VPA between two groups: novice students who had not used VPA previously, and experienced students who had previously completed a different VPA scenario. Our findings suggest that previous experience in a different scenario prepared students to transfer inquiry skills to a new one, leading these experienced students to be more successful at identifying a correct final conclusion to a scientific question, and at designing causal explanations about these conclusions, compared to novice students. On the other hand, a positive effect of novelty was found for motivation. To better understand these results, we examine the differences in student patterns of behavior over time, between novice and experienced students.
Article
Full-text available
This paper describes a study that looked at the effects of different technology-use profiles on educational experience within communities of inquiry, and how they are related to students' levels of cognitive presence in asynchronous online discussions. Through clustering of students (N=81) in a graduate distance education engineering course, we identified six different profiles: 1) task-focused users, 2) content-focused no-users, 3) no-users, 4) highly intensive users, 5) content-focused intensive users, and 6) socially-focused intensive users. The identified profiles significantly differ in terms of their use of the learning platform and their levels of cognitive presence, with large effect sizes of 0.54 and 0.19 multivariate η2, respectively. Given that several profiles are associated with higher levels of cognitive presence, our results suggest multiple ways for students to be successful within communities of inquiry. Our results also emphasize a need for different instructional support and pedagogical interventions for different technology-use profiles.
Article
Full-text available
The analysis of data collected from the interaction of users with educational and information technology has attracted much attention as a promising approach for advancing our understanding of the learning process. This promise motivated the emergence of the new research field, learning analytics, and its closely related discipline, educational data mining. This paper first introduces the field of learning analytics and outlines the lessons learned from well-known case studies in the research literature. The paper then identifies the critical topics that require immediate research attention for learning analytics to make a sustainable impact on the research and practice of learning and teaching. The paper concludes by discussing a growing set of issues that if unaddressed, could impede the future maturation of the field. The paper stresses that learning analytics are about learning. As such, the computational aspects of learning analytics must be well integrated within the existing educational research.
Article
Full-text available
The Open Academic Analytics Initiative (OAAI) is a collaborative, multi‐year grant program aimed at researching issues related to the scaling up of learning analytics technologies and solutions across all of higher education. The paper describes the goals and objectives of the OAAI, depicts the process and challenges of collecting, organizing and mining student data to predict academic risk, and reports results on the predictive performance of those models, their portability across pilot programs at partner institutions, and the results of interventions on at-risk students.
Article
Full-text available
Significance The President’s Council of Advisors on Science and Technology has called for a 33% increase in the number of science, technology, engineering, and mathematics (STEM) bachelor’s degrees completed per year and recommended adoption of empirically validated teaching practices as critical to achieving that goal. The studies analyzed here document that active learning leads to increases in examination performance that would raise average grades by a half a letter, and that failure rates under traditional lecturing increase by 55% over the rates observed under active learning. The analysis supports theory claiming that calls to increase the number of students receiving STEM degrees could be answered, at least in part, by abandoning traditional lecturing in favor of active learning.
Conference Paper
Full-text available
This paper provides an evaluation of the current state of the field of learning analytics through analysis of articles and citations occurring in the LAK conferences and identified special issue journals. The emerging field of learning analytics is at the intersection of numerous academic disciplines, and therefore draws on a diversity of methodologies, theories and underpinning scientific assumptions. Through citation analysis and structured mapping we aimed to identify the emergence of trends and disciplinary hierarchies that are influencing the development of the field to date. The results suggest that there is some fragmentation in the major disciplines (computer science and education) regarding conference and journal representation. The analyses also indicate that the commonly cited papers are of a more conceptual nature than empirical research reflecting the need for authors to define the learning analytics space. An evaluation of the current state of learning analytics provides numerous benefits for the development of the field, such as a guide for under-represented areas of research and to identify the disciplines that may require more strategic and targeted support and funding opportunities.
Article
Full-text available
In this paper, an early intervention solution for collegiate faculty called Course Signals is discussed. Course Signals was developed to allow instructors the opportunity to employ the power of learner analytics to provide real-time feedback to a student. Course Signals relies not only on grades to predict students' performance, but also demographic characteristics, past academic history, and students' effort as measured by interaction with Blackboard Vista, Purdue's learning management system. The outcome is delivered to the students via a personalized email from the faculty member to each student, as well as a specific color on a stoplight -- traffic signal -- to indicate how each student is doing. The system itself is explained in detail, along with retention and performance outcomes realized since its implementation. In addition, faculty and student perceptions will be shared.
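The student-facing output described here, a stoplight color derived from a risk prediction, reduces to a simple thresholding step; the cut-offs below are arbitrary illustrations, not Purdue's algorithm:

# Sketch: map a predicted risk probability to a traffic-signal color,
# in the spirit of Course Signals.
def signal_color(risk: float) -> str:
    """Return a stoplight color for a risk probability in [0, 1]."""
    if risk < 0.33:
        return "green"    # on track
    if risk < 0.66:
        return "yellow"   # potential problems
    return "red"          # high risk: prioritize outreach

for r in (0.1, 0.5, 0.9):
    print(f"risk={r:.1f} -> {signal_color(r)}")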
Article
Full-text available
Predictive analytics techniques applied to a broad swath of student data can aid in timely intervention strategies to help prevent students from failing a course. This paper discusses a predictive analytic model that was created for the University of Phoenix. The purpose of the model is to identify students who are in danger of failing the course in which they are currently enrolled. Within the model's architecture, data from the learning management system (LMS), financial aid system, and student system are combined to calculate a likelihood of any given student failing the current course. The output can be used to prioritize students for intervention and referral to additional resources. The paper includes a discussion of the predictor and statistical tests used, validation procedures, and plans for implementation.
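A minimal sketch of the described architecture: join LMS, financial-aid, and student-system records on a student ID, fit a failure-likelihood model, and rank students for intervention. All tables, fields, and the tiny sample are invented:

# Sketch: combine three institutional data sources and prioritize
# students by predicted likelihood of failing the current course.
import pandas as pd
from sklearn.linear_model import LogisticRegression

lms = pd.DataFrame({"sid": [1, 2, 3, 4], "logins": [30, 2, 15, 8]})
aid = pd.DataFrame({"sid": [1, 2, 3, 4], "aid_flag": [0, 1, 0, 1]})
sis = pd.DataFrame({"sid": [1, 2, 3, 4], "gpa": [3.5, 1.8, 2.9, 2.2],
                    "failed": [0, 1, 0, 1]})

df = lms.merge(aid, on="sid").merge(sis, on="sid")   # join on student ID
X, y = df[["logins", "aid_flag", "gpa"]], df["failed"]
model = LogisticRegression().fit(X, y)    # in practice: train on past terms
df["p_fail"] = model.predict_proba(X)[:, 1]
print(df.sort_values("p_fail", ascending=False)[["sid", "p_fail"]])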
Article
Full-text available
Self-regulated learning (SRL) is a pivot upon which students’ achievement turns. We explain how feedback is inherent in and a prime determiner of processes that constitute SRL, and review areas of research that elaborate contemporary models of how feedback functions in learning. Specifically, we begin by synthesizing a model of self-regulation based on contemporary educational and psychological literatures. Then we use that model as a structure for analyzing the cognitive processes involved in self-regulation, and for interpreting and integrating findings from disparate research traditions. We propose an elaborated model of SRL that can embrace these research findings and that spotlights the cognitive operation of monitoring as the hub of self-regulated cognitive engagement. The model is then used to reexamine (a) recent research on how feedback affects cognitive engagement with tasks and (b) the relation among forms of engagement and achievement. We conclude with a proposal that research on feedback and research on self-regulated learning should be tightly coupled, and that the facets of our model should be explicitly addressed in future research in both areas.
Article
Full-text available
Examines student performance indicators in online distance learning courses offered on the Internet at a mid-sized private college in the USA. A sample of 74 undergraduate and 147 graduate business students in ten courses were selected for statistical analysis of their grade performance and the relationship with various indicators. The research results include findings that gender and age are related differently for undergraduate and graduate students to performance in distance learning courses, and that undergraduate grades, age, work experience, and discussion board grades are significantly related to overall course performance. However, standardized test scores (SATs, GMATs) and organization position level are not related to the performance in distance learning courses. Makes recommendations for further qualitative and empirical research on distance learning student performance in online computer-mediated courses and programs.
Article
Educational data mining is the area of scientific inquiry centered around the development of methods for making discoveries within the unique kinds of data that come from educational settings, and using those methods to better understand students and the settings in which they learn. The recent advent of public educational data repositories has made it feasible for researchers to investigate a wide variety of scientific questions using data mining. In this article, five categories of educational data mining methods are discussed, as well as the key applications for which educational data mining methods have been used.
Article
The quality of science, technology, engineering, and mathematics (STEM) education in the United States has long been an area of national concern, but that concern has not resulted in improvement. Recently, there has been a growing sense that an opportunity for progress at the higher education level lies in the extensive research on different teaching methods that have been carried out during the last few decades. Most of this research has been on “active learning methods” and the comparison with the standard lecture method in which students are primarily listening and taking notes. As the number of research studies has grown, it has become increasingly clear to researchers that active learning methods achieve better educational outcomes. The possibilities for improving postsecondary STEM education through more extensive use of these research-based teaching methods were reflected in two important recent reports (1, 2). However, the size and consistency of the benefits of active learning remained unclear. In PNAS, Freeman et al. (3) provide a much more extensive quantitative analysis of the research on active learning in college and university STEM courses than previously existed. It was a massive effort involving the tracking and analyzing of 642 papers spanning many fields and publication venues and a very careful analysis of 225 papers that met their standards for the meta-analysis. The results that emerge from this meta-analysis have important implications for the future of STEM teaching and STEM education research.
Conference Paper
This article addresses a relatively unexplored area in the emerging field of learning analytics, the design of learning analytics interventions. A learning analytics intervention is defined as the surrounding frame of activity through which analytic tools, data, and reports are taken up and used. It is a soft technology that involves the orchestration of the human process of engaging with the analytics as part of the larger teaching and learning activity. This paper first makes the case for the overall importance of intervention design, situating it within the larger landscape of the learning analytics field, and then considers the specific issues of intervention design for student use of learning analytics. Four principles of pedagogical learning analytics intervention design that can be used by teachers and course developers to support the productive use of learning analytics by students are introduced: Integration, Agency, Reference Frame and Dialogue. In addition three core processes in which to engage students are described: Grounding, Goal-Setting and Reflection. These principles and processes are united in a preliminary model of pedagogical learning analytics intervention design for students, presented as a starting point for further inquiry.
Article
Feedback is one of the most powerful influences on learning and achievement, but this impact can be either positive or negative. Its power is frequently mentioned in articles about learning and teaching, but surprisingly few recent studies have systematically investigated its meaning. This article provides a conceptual analysis of feedback and reviews the evidence related to its impact on learning and achievement. This evidence shows that although feedback is among the major influences, the type of feedback and the way it is given can be differentially effective. A model of feedback is then proposed that identifies the particular properties and circumstances that make it effective, and some typically thorny issues are discussed, including the timing of feedback and the effects of positive and negative feedback. Finally, this analysis is used to suggest ways in which feedback can be used to enhance its effectiveness in classrooms.
Article
Recently, learning analytics (LA) has drawn the attention of academics, researchers, and administrators. This interest is motivated by the need to better understand teaching, learning, “intelligent content,” and personalization and adaptation. While still in the early stages of research and implementation, several organizations (Society for Learning Analytics Research and the International Educational Data Mining Society) have formed to foster a research community around the role of data analytics in education. This article considers the research fields that have contributed technologies and methodologies to the development of learning analytics, analytics models, the importance of increasing analytics capabilities in organizations, and models for deploying analytics in educational settings. The challenges facing LA as a field are also reviewed, particularly regarding the need to increase the scope of data capture so that the complexity of the learning process can be more accurately reflected in analysis. Privacy and data ownership will become increasingly important for all participants in analytics projects. The current legal system is immature in relation to privacy and ethics concerns in analytics. The article concludes by arguing that LA has sufficiently developed, through conferences, journals, summer institutes, and research labs, to be considered an emerging research field.
Article
Learning analytics is the analysis of electronic learning data which allows teachers, course designers and administrators of virtual learning environments to search for unobserved patterns and underlying information in learning processes. The main aim of learning analytics is to improve learning outcomes and the overall learning process in electronic learning virtual classrooms and computer-supported education. The most basic unit of learning data in virtual learning environments for learning analytics is the interaction, but there is no consensus yet on which interactions are relevant for effective learning. Drawing upon extant literature, this research defines three system-independent classifications of interactions and evaluates the relation of their components with academic performance across two different learning modalities: virtual learning environment (VLE) supported face-to-face (F2F) and online learning. In order to do so, we performed an empirical study with data from six online and two VLE-supported F2F courses. Data extraction and analysis required the development of an ad hoc tool based on the proposed interaction classification. The main finding from this research is that, for each classification, there is a relation between some type of interactions and academic performance in online courses, whereas this relation is non-significant in the case of VLE-supported F2F courses. Implications for theory and practice are discussed next.
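The per-modality analysis this abstract reports can be illustrated by correlating interaction counts with grades separately for each modality; the sketch below fabricates data in which the relation holds only online, mirroring the stated finding:

# Sketch: relate interaction counts to grades per learning modality.
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(9)
for modality, strength in [("online", 0.8), ("VLE-supported F2F", 0.0)]:
    interactions = rng.poisson(20, 100)            # e.g., learner-content events
    grade = strength * (interactions / 20) + rng.normal(0, 1, 100)
    r, p = pearsonr(interactions, grade)
    print(f"{modality}: r = {r:.2f} (p = {p:.3f})")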
Article
Much evaluation of teaching focuses on what teachers do in class. This article focuses on the evaluation of assessment arrangements and the way they affect student learning out of class. It is assumed that assessment has an overwhelming influence on what, how and how much students study. The article proposes a set of 'conditions under which assessment supports learning' and justifies these with reference to theory, empirical evidence and practical experience. These conditions are offered as a framework for teachers to review the effectiveness of their own assessment practice.
Article
Educational data mining is an emerging discipline, concerned with developing methods for exploring the unique types of data that come from the educational context. This work is a survey of the specific application of data mining in learning management systems and a case study tutorial with the Moodle system. Our objective is to introduce it both theoretically and practically to all users interested in this new research area, and in particular to online instructors and e-learning administrators. We describe the full process for mining e-learning data step by step as well as how to apply the main data mining techniques used, such as statistics, visualization, classification, clustering and association rule mining of Moodle data. We have used free data mining tools so that any user can immediately begin to apply data mining without having to purchase a commercial tool or program a specific personalized tool.
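A compressed version of the step-by-step Moodle mining process the survey walks through: load an activity-log export, aggregate interactions per student, and cluster. The two-column log layout is an assumed export format, not Moodle's fixed schema:

# Sketch: aggregate per-student action counts from an activity log and
# cluster students by usage pattern.
import pandas as pd
from sklearn.cluster import KMeans

logs = pd.DataFrame({                      # stand-in for a Moodle log export
    "user":   [1, 1, 2, 2, 2, 3, 3, 4],
    "action": ["view", "post", "view", "view", "quiz", "quiz", "post", "view"],
})
features = pd.crosstab(logs["user"], logs["action"])   # counts per action type
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(features)
print(features.assign(cluster=labels))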
Article
We study the incidence (rate of occurrence), persistence (rate of reoccurrence immediately after occurrence), and impact (effect on behavior) of students’ cognitive–affective states during their use of three different computer-based learning environments. Students’ cognitive–affective states are studied using different populations (Philippines, USA), different methods (quantitative field observation, self-report), and different types of learning environments (dialogue tutor, problem-solving game, and problem-solving-based Intelligent Tutoring System). By varying the studies along these multiple factors, we can have greater confidence that findings which generalize across studies are robust. The incidence, persistence, and impact of boredom, frustration, confusion, engaged concentration, delight, and surprise were compared. We found that boredom was very persistent across learning environments and was associated with poorer learning and problem behaviors, such as gaming the system. Despite prior hypotheses to the contrary, frustration was less persistent, less associated with poorer learning, and did not appear to be an antecedent to gaming the system. Confusion and engaged concentration were the most common states within all three learning environments. Experiences of delight and surprise were rare. These findings suggest that significant effort should be put into detecting and responding to boredom and confusion, with a particular emphasis on developing pedagogical interventions to disrupt the “vicious cycles” which occur when a student becomes bored and remains bored for long periods of time.
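The persistence measure used here (the rate of reoccurrence immediately after occurrence) has a direct implementation; the sketch below computes it from a hand-made sequence of coded observations:

# Sketch: persistence of each affective state = P(state at t+1 | state at t),
# estimated from consecutive pairs in a coded observation sequence.
from collections import Counter

obs = ["bored", "bored", "confused", "bored", "engaged",
       "engaged", "bored", "bored", "frustrated", "engaged"]

counts, repeats = Counter(), Counter()
for a, b in zip(obs, obs[1:]):
    counts[a] += 1            # state occurred and has a next observation
    if a == b:
        repeats[a] += 1       # state reoccurred immediately

for state in counts:
    print(f"{state}: persistence = {repeats[state] / counts[state]:.2f}")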