Article

Improving predictive power through deep learning analysis of K-12 online student behaviors and discussion board content


Abstract

Purpose: In educational data mining and learning analytics, predicting student performance and providing early warning are among the most popular research topics. However, few studies have used machine learning and deep learning (DL) models in predictive analytics that include both behaviors and text analysis.

Design/methodology/approach: This study combined behavioral data and discussion board content to construct early warning models with machine learning and DL algorithms. In total, 680 course sections, 12,869 students and 14,951,368 logs were collected from a K-12 virtual school in the USA. Three rounds of experiments were conducted to demonstrate the effectiveness of the proposed approach.

Findings: The DL model performed better than the machine learning models and was able to capture 51% of at-risk students in the eighth week with 86.8% overall accuracy. Combining behavioral and textual data further improved the model's performance in both recall and accuracy. The total word count is a more general indicator than the textual content feature. When the text was run through a linguistic function-word analysis tool, successful students used more words in the analytic category and at-risk students used more words in the authentic category. The balanced threshold was 0.315, which can capture up to 59% of at-risk students.

Originality/value: The results of this exploratory study indicate that the use of student behaviors and text in a DL approach may improve the predictive power of identifying at-risk learners early enough in the learning process to allow for interventions that can change the course of their trajectory.
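The trade-off behind a decision threshold such as the reported 0.315 can be illustrated with a minimal sketch. It assumes a model that outputs an at-risk probability per student; the function name, the toy labels and the toy scores below are hypothetical and are not taken from the study.

```python
from typing import Sequence, Tuple


def recall_and_accuracy(
    y_true: Sequence[int], scores: Sequence[float], threshold: float
) -> Tuple[float, float]:
    """Flag a student as at-risk (1) when the score is >= threshold,
    then return (recall on the at-risk class, overall accuracy)."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(y_true, preds) if y == 1 and p == 1)
    fn = sum(1 for y, p in zip(y_true, preds) if y == 1 and p == 0)
    correct = sum(1 for y, p in zip(y_true, preds) if y == p)
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = correct / len(y_true)
    return recall, accuracy


# Hypothetical toy data: 1 = at-risk, 0 = successful.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.4, 0.2, 0.6, 0.3, 0.1, 0.05, 0.2]

default_recall, default_accuracy = recall_and_accuracy(y_true, scores, 0.5)
lowered_recall, lowered_accuracy = recall_and_accuracy(y_true, scores, 0.315)
```

On this toy data, lowering the threshold from 0.5 to 0.315 flags more of the at-risk students. In general a lower threshold raises recall and may cost accuracy, which is why a balanced value has to be searched for.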


... In this study, student performance was evaluated by supervised ML models. Studies that had similar data have used random forest [42][43][44], gradient boosting [41], support vector machines (SVM) [45][46][47], elastic net [12], naive Bayes [15,34,48], logistic regression [12,41,49], decision tree [42,50,51], and ANN [52][53][54] algorithms. Therefore, the preprocessed data were run on the previously mentioned models, and the best performance with the default parameters was obtained by the RF, SVM and LR models; thus, these models were selected for further tuning and analysis. ...
... The classification that was conducted in this research is based on information about students' personal lifestyles, parent information, and scores in each discipline (lesson). The type of data features used in this study was selected based on studies on student performance evaluation using ML and the data features it had used [15,24,39,40,42,52]. These features include all aspects of students' home life, school life, how they use their time, interests, work, etc. ...
Article
Full-text available
The objective of this research is to develop a machine learning (ML)-based system that evaluates the performance of high school students during the semester and identifies the most significant factors affecting student performance. It also examines how model performance is affected when the models are run on data that include only the most important features. Classifiers employed for the system include random forest (RF), support vector machines (SVM), logistic regression (LR) and artificial neural network (ANN) techniques. Moreover, the Boruta algorithm was used to calculate the importance of features. The dataset includes behavioral information, individual information and the scores of students that were collected from teachers and a one-by-one survey through an online questionnaire. As a result, the effective features of the database were identified, and the least important features were eliminated from the dataset. The accuracy of the ANN, which was the best on the original dataset, decreased on the reduced dataset. In contrast, the performance of the SVM improved, giving it the highest accuracy among the models at 0.78. Moreover, the LR and RF models could provide the same performance on the reduced dataset. The results showed that ML models are influential for evaluating students, and stakeholders can use the identified effective factors to improve education.
... Other inputs are used in most studies in the common forms such as "personal data, background data, interaction data, including access data, use of resources (learning resources), and performance" [65], [98], [102]. Input features in DL-based papers consist of images [57], [103]-[105], textual data [106], [107] and essays [67]. ...
Article
Full-text available
Education is a vital part of the development of society and it is changing over time in terms of methods, content, concepts, and models. Recently, it has become increasingly common to draw on the potential of Artificial Intelligence (AI) in addressing educational issues. This research presents the current state of the art of the integration of AI in K-12 education. Specifically, the different parts of education in which AI was employed, along with the related AI categories, are discussed according to different K-12 grades and courses. Additionally, technologies and environments that contributed to employing AI in education are discussed. To this end, a systematic literature review was conducted on articles and conference papers published between 2011 and 2021 in the Web of Science and Scopus databases. The initial search yielded 2075 documents, of which 210 were identified for further investigation based on the inclusion criteria. AI applications were categorized into Student performance, Teaching, Selection, and Behavior tasks, and Other. Machine Learning (ML) and Intelligent Tutoring System (ITS) were the most common approaches among AI categories. Furthermore, high school-related applications were more frequent and STEM courses were substantially targeted by AI. In conclusion, AI was found to have a remarkable impact on education. The current study reveals information about the potential offered by AI in K-12 education, which aids researchers in implementing AI-based education systems. As for future work, other databases such as the ACM library and Google Scholar can be investigated as well. Furthermore, the 95 papers that were excluded because their full texts were inaccessible can be explored. Finally, the papers can also be investigated in terms of pedagogical approaches or development tools.
Article
Recently, the ceaseless rise in the global average temperature has led to extreme climates in which natural disasters, such as droughts, hurricanes, earthquakes and floods, are becoming increasingly serious. Recent research has found that social media typically reflects disasters earlier than official communication channels. This study proposes collecting information on flood disasters that occur in a city during typhoons and heavy rains from the plain text messages posted on social media, by means of a term frequency (TF) and sliding window approach. The dataset analysed here contains a total of 292 articles and 12,484 tweets. This research determines how to establish a warning mechanism, with an added notification time for flooding disasters, and it shows how to provide relevant disaster relief personnel with references. This article contributes by combining social media data with emergency management information cloud (EMIC) data, especially in the context of having a mechanism for warning about flooding disasters. According to the experimental results, a sliding window of 90 min and a sliding gap of 10 min obtained the best F-measure value (F = 0.315). The event studied was Typhoon Megi (September 2016), which caused major flooding in Tainan. For the Typhoon Megi event, the flood disaster location database had 161 streets available for matching. Based on the experimental results, it is possible to obtain a high-precision (90% or higher) accuracy rate from real-time tweet data by exploiting a social media dataset.
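A term frequency count over a sliding time window can be sketched as follows. This is only an illustration under stated assumptions: the "sliding gap" is read here as the step between consecutive window starts, and the function name, the keyword matching by whitespace tokens and the toy messages are all hypothetical rather than the method of the cited article.

```python
from datetime import datetime, timedelta
from typing import List, Sequence, Tuple


def sliding_window_counts(
    messages: Sequence[Tuple[datetime, str]],
    keyword: str,
    start: datetime,
    end: datetime,
    window: timedelta = timedelta(minutes=90),
    step: timedelta = timedelta(minutes=10),
) -> List[Tuple[datetime, int]]:
    """Count occurrences of `keyword` in every window [t, t + window),
    moving the window start forward by `step` each time."""
    results = []
    t = start
    while t + window <= end:
        count = sum(
            text.lower().split().count(keyword)
            for ts, text in messages
            if t <= ts < t + window
        )
        results.append((t, count))
        t += step
    return results


# Hypothetical toy messages (timestamp, text).
day = datetime(2016, 9, 27)
messages = [
    (day + timedelta(minutes=5), "flood on main street"),
    (day + timedelta(minutes=60), "flood flood warning"),
    (day + timedelta(minutes=150), "sunny again"),
]
counts = sliding_window_counts(messages, "flood", day, day + timedelta(hours=3))
```

A window whose count exceeds some chosen level could then trigger a warning; how that level is chosen is outside this sketch.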
Article
Full-text available
The rapid development of learning technologies has enabled the online learning paradigm to gain great popularity in both higher education and K-12, making the prediction of student performance one of the most popular research topics in education. However, traditional prediction algorithms were originally designed for balanced datasets, while educational datasets are typically highly imbalanced, which makes it more difficult to accurately identify at-risk students. To address this problem, this study proposes an integrated framework (LVAEPre) based on a latent variational autoencoder (LVAE) with a deep neural network (DNN) to alleviate the imbalanced distribution of educational datasets and to provide early warning of at-risk students. Specifically, with the characteristics of educational data in mind, the LVAE mainly aims to learn the latent distribution of at-risk students and to generate at-risk samples for the purpose of obtaining a balanced dataset. The DNN performs the final performance prediction. Extensive experiments based on the collected K-12 dataset show that LVAEPre can effectively handle the imbalanced educational dataset and provide much better and more stable prediction results than baseline methods in terms of accuracy and F1.5 score. The comparison of t-SNE visualization results further confirms the advantage of the LVAE in dealing with the imbalance issue in educational datasets. Finally, through the identification of the significant predictors of LVAEPre in the experimental dataset, some suggestions for designing pedagogical interventions are put forward.
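Assuming the "F1.5 score" above refers to the standard F-beta measure with beta = 1.5 (an assumption, since the abstract does not define it), the score weights recall more heavily than precision, which suits at-risk detection on imbalanced data. A minimal sketch with hypothetical precision and recall values:

```python
def f_beta(precision: float, recall: float, beta: float = 1.5) -> float:
    """Standard F-beta: (1 + beta^2) * P * R / (beta^2 * P + R).
    beta > 1 weights recall more heavily than precision."""
    denominator = beta**2 * precision + recall
    if denominator == 0:
        return 0.0
    return (1 + beta**2) * precision * recall / denominator


# Hypothetical values, not results from the cited study.
f15 = f_beta(precision=0.5, recall=0.8, beta=1.5)
f1 = f_beta(precision=0.5, recall=0.8, beta=1.0)
```

With recall higher than precision, as in this toy case, the beta = 1.5 score comes out above the plain F1 score.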
Article
Full-text available
Automatic multimedia learning resources recommendation has become an increasingly relevant problem: it allows students to discover new learning resources that match their tastes, and enables the e-learning system to target the learning resources to the right students. In this paper, we propose a content-based recommendation algorithm based on convolutional neural network (CNN). The CNN can be used to predict the latent factors from the text information of the multimedia resources. To train the CNN, its input and output should first be solved. For its input, the language model is used. For its output, we propose the latent factor model, which is regularized by L1-norm. Furthermore, the split Bregman iteration method is introduced to solve the model. The major novelty of the proposed recommendation algorithm is that the text information is used directly to make the content-based recommendation without tagging. Experimental results on public databases in terms of quantitative assessment show significant improvements over conventional methods. In addition, the split Bregman iteration method which is introduced to solve the model can greatly improve the training efficiency.
Article
Full-text available
Knowledge of tree species composition in a forest is an important topic in forest management. Accurate tree species maps allow for much more detailed and in-depth analysis of biophysical forest variables. The paper presents a comparison of three classification algorithms: support vector machines (SVM), random forest (RF) and artificial neural networks (ANN) for tree species classification using airborne hyperspectral data from the Airborne Prism EXperiment sensor. The aim of this paper is to evaluate the three nonparametric classification algorithms (SVM, RF and ANN) in an attempt to classify the five most common tree species of the Szklarska Poręba area: spruce (Picea alba L. Karst), larch (Larix decidua Mill.), alder (Alnus Mill), beech (Fagus sylvatica L.) and birch (Betula pendula Roth). To avoid human introduced biases a 0.632 bootstrap procedure was used during evaluation of each compared classifier. Of all compared classification results, ANN achieved the highest median overall classification accuracy (77%) followed by SVM with 68% and RF with 62%. Analysis of the stability of results concluded that RF and SVM had the lowest variance of overall accuracy and kappa coefficient (12 percentage points) while ANN had 15 percentage points variance in results.
Article
Full-text available
Apart from being able to support the bulk of student activity in suitable disciplines such as computer programming, Web-based educational systems have the potential to yield valuable insights into student behavior. Through the use of educational analytics, we can dispense with preconceptions of how students consume and reuse course material. In this paper, we examine the speed at which students employ concepts which they are being taught during a semester. To show the wider utility of this data, we present a basic classification system for early detection of poor performers and show how it can be improved by including data on when students use a concept for the first time. Using our improved classifier, we can achieve an accuracy of 85% in predicting poor performers prior to the completion of the course.
Article
Full-text available
A key notion conveyed by those who advocate for the use of data to enhance instruction is an awareness that learning analytics has the potential to improve instruction and learning but is not currently reaching that potential. Gibbons (2014) suggested that a lack of learning facilitated by current technology-enabled instructional systems may be due in part to the natural tendency of many designers to focus on the surface layers of instruction (i.e., the content and control layers), while failing to adequately design the internal (less visible) aspects (e.g., the data management layer). In this paper, we outline phases in the design process related to the data management layer that should be considered when integrating learning analytics into a technology-enabled learning system.
Article
Full-text available
Early Warning Systems (EWSs) aggregate multiple sources of data to provide timely information to stakeholders about students in need of academic support. There is an increasing need to incorporate relevant data about student behaviors into the algorithms underlying EWSs to improve predictors of students’ success or failure. Many EWSs currently incorporate counts of course resource use, although these measures provide no information about which resources students are using. We use seven years of data from seven core STEM courses at a large university to investigate the associations between students’ use of categorized course resources (e.g., lecture or exam preparation resources) and their final course grade. Using logistic regression, we find that students who use exam preparation resources to a greater degree than their peers are more likely to receive a final grade of B or higher. In contrast, students who use more lecture-related resources than their peers are less likely to receive a final grade of B or higher. We discuss the implications of our results for developers deciding how to incorporate categories of course resource usage data into EWSs, for academic advisors using this information with students, and for instructors deciding which resources to include on their LMS site.
Article
Full-text available
It is important to study and analyse educational data especially students’ performance. Educational Data Mining (EDM) is the field of study concerned with mining educational data to find out interesting patterns and knowledge in educational organizations. This study is equally concerned with this subject, specifically, the students’ performance. This study explores multiple factors theoretically assumed to affect students’ performance in higher education, and finds a qualitative model which best classifies and predicts the students’ performance based on related personal and social factors.
Conference Paper
Full-text available
Completion rates for massive open online classes (MOOCs) are notoriously low. Identifying student patterns related to course completion may help to develop interventions that can improve retention and learning outcomes in MOOCs. Previous research predicting MOOC completion has focused on click-stream data, student demographics, and natural language processing (NLP) analyses. However, most of these analyses have not taken full advantage of the multiple types of data available. This study combines click-stream data and NLP approaches to examine if students' on-line activity and the language they produce in the online discussion forum is predictive of successful class completion. We study this analysis in the context of a subsample of 320 students who completed at least one graded assignment and produced at least 50 words in discussion forums, in a MOOC on educational data mining. The findings indicate that a mix of click-stream data and NLP indices can predict with substantial accuracy (78%) whether students complete the MOOC. This predictive power suggests that student interaction data and language data within a MOOC can help us both to understand student retention in MOOCs and to develop automated signals of student success.
Article
Full-text available
The purpose of this study was to identify the relationship between the psychological variables and online behavioral patterns of students, collected through a Learning Management System (LMS). A structural equation model was tested in which Time and Study Environment Management (TSEM), one of the sub-constructs of the MSLQ, influences a set of time-related online log variables: login frequency, login regularity, and total login time. Data were collected from 188 college students in a Korean university. Employing structural equation modeling, a hypothesized model was tested for measuring the model fit. The results supported the criterion validity of the online log variables as estimates of students' time management. The structural model including TSEM, online variable, and final score with a moderate fit indicated that learners' time-related online behavior mediates their psychological functions and their learning outcome. Based on the results, the final discussion includes recommendations for further study and the meaningfulness of the findings for the expansion of the Learning Analytics for Performance and Action (LAPA) model.
Article
Full-text available
This mixed-method study focuses on online learning analytics, a research area of importance. Several important student attributes and their online activities are examined to identify what seems to work best to predict higher grades. The purpose is to explore the relationships between student grade and key learning engagement factors using a large sample from an online undergraduate business course at an accredited American university (n = 228). Recent studies have discounted the ability to predict student learning outcomes from big data analytics but a few significant indicators have been found by some researchers. Current studies tend to use quantitative factors in learning analytics to forecast outcomes. This study extends that work by testing the common quantitative predictors of learning outcome, but qualitative data is also examined to triangulate the evidence. Pre and post testing of information technology understanding is done at the beginning of the course. First quantitative data is collected, and depending on the hypothesis test results, qualitative data is collected and analyzed with text analytics to uncover patterns. Moodle engagement analytics indicators are tested as predictors in the model. Data is also taken from the Moodle system logs. Qualitative data is collected from student reflection essays. The result was a significant General Linear Model with four online interaction predictors that captured 77.5% of grade variance in an undergraduate business course.
Conference Paper
Full-text available
Increasing college participation rates and a more diverse student population are posing a challenge for colleges in helping all learners achieve their potential. This paper reports on a study to investigate the usefulness of data mining techniques in the analysis of factors deemed to be significant to academic performance in first year of college. Measures used include data typically available to colleges at the start of first year such as age, gender and prior academic performance. The study also explores the usefulness of additional psychometric measures that can be assessed early in semester one, specifically, measures of personality, motivation and learning strategies. A variety of data mining models are compared to assess the relative accuracy of each.
Article
Full-text available
Contemporary literature on online and distance education almost unequivocally argues for the importance of interactions in online learning settings. Nevertheless, the relationship between different types of interactions and learning outcomes is rather complex. Analyzing 204 offerings of 29 courses, over the period of six years, this study aimed at expanding the current understanding of the nature of this relationship. Specifically, with the use of trace data about interactions and utilizing the multilevel linear mixed modeling techniques, the study examined whether frequency and duration of student-student, student-instructor, student-system, and student-content interactions had an effect on learning outcomes, measured as final course grades. The findings show that the time spent on student-system interactions had a consistent and positive effect on the learning outcome, while the quantity of student-content interactions was negatively associated with the final course grades. The study also showed the importance of the educational level and the context of individual courses for the interaction types supported. Our findings further confirmed the potential of the use of trace data and learning analytics for studying learning and teaching in online settings. However, further research should account for various qualitative aspects of the interactions used while learning, different pedagogical/media features, as well as for the course design and delivery conditions in order to better explain the association between interaction types and the learning achievement. Finally, the results might imply the need for the development of the institutional and program-level strategies for learning and teaching that would promote effective pedagogical approaches to designing and guiding interactions in online and distance learning settings.
Conference Paper
Full-text available
Predicting the success or failure of a student in a course or program is a problem that has recently been addressed using data mining techniques. In this paper we evaluate some of the most popular classification and regression algorithms on this problem. We address two problems: prediction of approval/failure and prediction of grade. The former is tackled as a classification task while the latter as a regression task. Separate models are trained for each course. The experiments were carried out using administrative data from the University of Porto, concerning approximately 700 courses. The algorithms with best results overall in classification were decision trees and SVM while in regression they were SVM, Random Forest, and AdaBoost.R2. However, in the classification setting, the algorithms are finding useful patterns, while, in regression, the models obtained are not able to beat a simple baseline.
Article
Full-text available
Deep learning allows computational models that are composed of multiple processing layers to learn representations of data with multiple levels of abstraction. These methods have dramatically improved the state-of-the-art in speech recognition, visual object recognition, object detection and many other domains such as drug discovery and genomics. Deep learning discovers intricate structure in large data sets by using the backpropagation algorithm to indicate how a machine should change its internal parameters that are used to compute the representation in each layer from the representation in the previous layer. Deep convolutional nets have brought about breakthroughs in processing images, video, speech and audio, whereas recurrent nets have shone light on sequential data such as text and speech.
Conference Paper
Full-text available
There is an emerging trend in higher education for the adoption of massive open online courses (MOOCs). However, despite this interest in learning at scale, there has been limited work investigating the impact MOOCs can play on student learning. In this study, we adopt a novel approach, using language and discourse as a tool to explore its association with two established measures of learning: traditional academic performance and social centrality. We demonstrate how characteristics of language diagnostically reveal the performance and social position of learners as they interact in a MOOC. We use Coh-Metrix, a theoretically grounded, computational linguistic modeling tool, to explore students’ forum postings across five potent discourse dimensions. Using a Social Network Analysis (SNA) methodology, we determine learners’ social centrality. Linear mixed-effect modeling is used for all other analyses to control for individual learner and text characteristics. The results indicate that learners performed significantly better when they engaged in more expository style discourse, with surface and deep level cohesive integration, abstract language, and simple syntactic structures. However, measures of social centrality revealed a different picture. Learners garnered a more significant and central position in their social network when they engaged with more narrative style discourse with less overlap between words and ideas, simpler syntactic structures and abstract words. Implications for further research and practice are discussed regarding the misalignment between these two learning-related outcomes.
Article
Full-text available
Hostility and chronic stress are known risk factors for heart disease, but they are costly to assess on a large scale. We used language expressed on Twitter to characterize community-level psychological correlates of age-adjusted mortality from atherosclerotic heart disease (AHD). Language patterns reflecting negative social relationships, disengagement, and negative emotions (especially anger) emerged as risk factors; positive emotions and psychological engagement emerged as protective factors. Most correlations remained significant after controlling for income and education. A cross-sectional regression model based only on Twitter language predicted AHD mortality significantly better than did a model that combined 10 common demographic, socioeconomic, and health risk factors, including smoking, diabetes, hypertension, and obesity. Capturing community psychological characteristics through social media is feasible, and these characteristics are strong markers of cardiovascular mortality at the community level. © The Author(s) 2015.
Article
Full-text available
The smallest and most commonly used words in English are pronouns, articles, and other function words. Almost invisible to the reader or writer, function words can reveal ways people think and approach topics. A computerized text analysis of over 50,000 college admissions essays from more than 25,000 entering students found a coherent dimension of language use based on eight standard function word categories. The dimension, which reflected the degree students used categorical versus dynamic language, was analyzed to track college grades over students' four years of college. Higher grades were associated with greater article and preposition use, indicating categorical language (i.e., references to complexly organized objects and concepts). Lower grades were associated with greater use of auxiliary verbs, pronouns, adverbs, conjunctions, and negations, indicating more dynamic language (i.e., personal narratives). The links between the categorical-dynamic index (CDI) and academic performance hint at the cognitive styles rewarded by higher education institutions.
Article
Full-text available
This paper aims to provide the reader with a comprehensive background for understanding current knowledge on Learning Analytics (LA) and Educational Data Mining (EDM) and its impact on adaptive learning. It constitutes an overview of empirical evidence behind key objectives of the potential adoption of LA/EDM in generic educational strategic planning. We examined the literature on experimental case studies conducted in the domain during the past six years (2008-2013). Search terms identified 209 mature pieces of research work, but inclusion criteria limited the key studies to 40. We analyzed the research questions, methodology and findings of these published papers and categorized them accordingly. We used non-statistical methods to evaluate and interpret findings of the collected studies. The results have highlighted four distinct major directions of the LA/EDM empirical research. We discuss the added value that has emerged from LA/EDM research and highlight the significance of further implications. Finally, we set out our thoughts on possible uncharted key questions to investigate from both pedagogical and technical perspectives.
Article
Full-text available
As higher education diversifies its delivery modes, our ability to use the predictive and analytical power of educational data mining (EDM) to understand students' learning experiences is a critical step forward. The adoption of EDM by higher education as an analytical and decision making tool is offering new opportunities to exploit the untapped data generated by various student information systems (SIS) and learning management systems (LMS). This paper describes a hybrid approach which uses EDM and regression analysis to analyse live video streaming (LVS) students' online learning behaviours and their performance in their courses. Students' participation and login frequency, as well as the number of chat messages and questions that they submit to their instructors, were analysed, along with students' final grades. Results of the study show a considerable variability in students' questions and chat messages. Unlike previous studies, this study suggests no correlation between students' number of questions / chat messages / login times and students' success. However, our case study reveals that combining EDM with traditional statistical analysis provides a strong and coherent analytical framework capable of enabling a deeper and richer understanding of students' learning behaviours and experiences.
Article
Full-text available
Happiness and other emotions spread between people in direct contact, but it is unclear whether massive online social networks also contribute to this spread. Here, we elaborate a novel method for measuring the contagion of emotional expression. With data from millions of Facebook users, we show that rainfall directly influences the emotional content of their status messages, and it also affects the status messages of friends in other cities who are not experiencing rainfall. For every one person affected directly, rainfall alters the emotional expression of about one to two other people, suggesting that online social networks may magnify the intensity of global emotional synchrony.
Article
Full-text available
Applying data mining (DM) in education is an emerging interdisciplinary research field also known as educational data mining (EDM). It is concerned with developing methods for exploring the unique types of data that come from educational environments. Its goal is to better understand how students learn and identify the settings in which they learn to improve educational outcomes and to gain insights into and explain educational phenomena. Educational information systems can store a huge amount of potential data from multiple sources coming in different formats and at different granularity levels. Each particular educational problem has a specific objective with special characteristics that require a different treatment of the mining problem. The issues mean that traditional DM techniques cannot be applied directly to these types of data and problems. As a consequence, the knowledge discovery process has to be adapted and some specific DM techniques are needed. This paper introduces and reviews key milestones and the current state of affairs in the field of EDM, together with specific applications, tools, and future insights. © 2012 Wiley Periodicals, Inc.
Article
Using a sample of 363 participants, we tested whether differences in the use of linguistic categories in written self-introductions at the start of the semester predicted final course performance at the end of the semester. The results supported this possibility: Course performance could indeed be predicted by relative word usage in particular linguistic categories—predominantly by the use of punctuation (commas and quotes), word simplicity, first-person singular pronouns, present tense, details concerning home and social life, and words pertaining to eating, drinking, and sex. Our interpretation of the findings emphasizes the egocentric “narrowed focus” of low-performing students and therefore stands in contrast to a previous interpretation that characterized these students as being “dynamic thinkers.”
Article
Purpose Online learning is well known for its flexibility of learning anytime and anywhere. However, how behavioral patterns tied to learning anytime and anywhere influence learning outcomes is still unknown. Design/methodology/approach This study proposed concepts of time and location entropy to depict students’ spatial-temporal patterns. A total of 5,221 students with 1,797,677 logs, including 485 on-the-job students and 4,736 full-time students, were analyzed to depict their spatial-temporal learning patterns, including the relationships between identified patterns and students’ learning performance. Findings Analysis results indicate on-the-job students took more advantage of anytime, anywhere learning than full-time students. Students with a higher tendency for learning anytime and a lower level of learning anywhere were more likely to have better outcomes. Gender did not show consistent findings on students’ spatial-temporal patterns, but partial findings could be supported by evidence in neural science or by cultural and geographical differences. Research limitations/implications A more accurate approach for categorizing position and location might be considered. Some findings need more studies for further validation. Finally, future research can consider connections between other well-known performance predictors (such as financial situation, motivation, personality and major) and the type of learning patterns. Practical implications The findings gained from this study can help improve the understanding of students’ learning behavioral patterns and help design and implement better online education programs. Originality/value This study proposed concepts of time and location entropy to identify successful spatial-temporal patterns of on-the-job and full-time students.
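The abstract above does not give the formula behind its time and location entropy. The following is a minimal sketch that assumes plain Shannon entropy over the empirical distribution of log events; the function name `shannon_entropy` and the sample hour-of-day logs are hypothetical and are not taken from the study.

```python
import math
from collections import Counter


def shannon_entropy(events):
    """Shannon entropy, in bits, of the empirical distribution of `events`.

    `events` is any iterable of hashable labels, e.g. the hour of day of
    each logged action (time) or a coarse location label (location).
    """
    counts = Counter(events)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical logs: hour of day at which each action was recorded.
fixed_schedule = [20, 20, 20, 20]   # always studies at the same hour
anytime_learner = [8, 12, 16, 20]   # activity spread evenly over four hours

entropy_fixed = shannon_entropy(fixed_schedule)
entropy_anytime = shannon_entropy(anytime_learner)
```

Under this assumption, a student whose activity is concentrated in one time slot has entropy near zero, and a student whose activity is spread evenly across many slots has high entropy. The same function could be applied to location labels in place of hours.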
Article
As an emerging field of research, learning analytics (LA) offers practitioners and researchers information about educational data that is helpful for supporting decisions in management of teaching and learning. While often combined with educational data mining (EDM), crucial distinctions exist for LA that mandate a separate review. This study aims to conduct a systematic meta-review of LA for mining key information that could assist in describing new and helpful directions to this field of inquiry. Within 901 LA articles analyzed, eight reviews were identified and synthesised to identify and determine consistencies and gaps. Results show that LA is at the stage of early majority and has attracted great research efforts from other fields. The majority of LA publications were focused on proposing LA concepts or frameworks and conducting proof-of-concept analysis rather than conducting actual data analysis. Collecting small datasets for LA research is predominant, especially in the K-12 field. Finally, four major LA research topics, including prediction of performance, decision support for teachers and learners, detection of behavioural patterns & learner modelling and dropout prediction, were identified and discussed deeply. The future research of LA is also outlined for the purpose of better understanding and optimising learning as well as learning contexts.
Article
Performance prediction is a leading topic in learning analytics research due to its potential to impact all tiers of education. This study proposes a novel predictive modeling method to address the research gaps in existing performance prediction research. The gaps addressed include: the lack of existing research focus on performance prediction rather than identifying key performance factors; the lack of common predictors identified for both K-12 and higher education environments; and the misplaced focus on absolute engagement levels rather than relative engagement levels. Two datasets, one from higher education and the other from a K-12 online school with 13 368 students in more than 300 courses, were applied using the predictive modeling technique. The results showed the newly suggested approach had higher overall accuracy and sensitivity rates than the traditional approach. In addition, two generalizable predictors were identified from instruction-intensive and discussion-intensive courses.
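The abstract above contrasts absolute with relative engagement levels but does not say how relative engagement is computed. As an illustration only, the sketch below assumes a z-score of each student's activity count within their own course section; the function name `relative_engagement` and the click counts are hypothetical.

```python
from statistics import mean, pstdev


def relative_engagement(counts):
    """Z-score each student's activity count against their course section.

    `counts` is a list of raw activity counts (e.g. clicks) for all students
    in one section. Returns a list of z-scores in the same order. If every
    student has the same count, all z-scores are 0.0.
    """
    mu = mean(counts)
    sigma = pstdev(counts)
    if sigma == 0:
        return [0.0 for _ in counts]
    return [(c - mu) / sigma for c in counts]


# Two hypothetical sections with very different absolute activity levels.
low_traffic_section = [10, 20, 30]
high_traffic_section = [100, 200, 300]

z_low = relative_engagement(low_traffic_section)
z_high = relative_engagement(high_traffic_section)
```

With this normalization, the most active student in each section gets the same relative score even though the raw counts differ by a factor of ten, which is one way a predictor could be made comparable across courses.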
Conference Paper
Exponential growth in information has made it totally unimaginable to manually find a relevant product quickly, entailing the need for a mechanical recommendation system which would remember the users and recommend the most suitable items. Most of the approaches for such machinery have been to first find similarity in users or in items, and then exploit these similarities to recommend the products. These methods produce better results when demographic information about users and items is given to them. In this paper, we propose a deep neural network model which does not require any information be given to it other than the rating triples. We created spurious user profiles and item characteristics by using separate learner weights at the bottommost layer. The weights in the upper layers took this information, created by the weights at the bottommost layer, to produce a real-valued rating. Our model produced an RMSE of 4.1824 on the Jester 4-million data set, and this shows our deep network is comparable to the state-of-the-art models.
Conference Paper
We show how the novel use of a semantic representation based on Osgood’s semantic differential scales can lead to effective features in predicting short- and long-term learning in students using a vocabulary learning system. Previous studies in students’ intermediate knowledge states during vocabulary acquisition did not provide much information on which semantic knowledge students gained during word learning practice. Moreover, these studies relied on human ratings to evaluate the students’ responses. To solve this problem, we propose a semantic representation for words based on Osgood’s semantic decomposition of vocabulary [16]. To demonstrate our method can effectively represent students’ knowledge in vocabulary acquisition, we build models for predicting the student’s short-term vocabulary acquisition and long-term retention. We compare the effectiveness of our Osgood-based semantic representation to that provided by Word2Vec neural word embedding [13], and find that prediction models using features based on Osgood scale-based scores (OSG) perform better than the baseline and are comparable in accuracy to those using Word2Vec score-based models (W2V). By using more interpretable Osgood-based scales, our study results can help with better understanding of students’ ongoing learning states and designing personalized learning systems that can address an individual’s weak points in vocabulary acquisition.
Conference Paper
This study takes a novel approach toward understanding success in a math course by examining the linguistic features and affect of students' language production within a blended (with both on-line and traditional face to face instruction) undergraduate course (n=158) on discrete mathematics. Three linear effects models were compared: (a) a baseline linear model including non-linguistic fixed effects, (b) a model including only linguistic factors, (c) a model including both linguistic and non-linguistic effects. The best model (c) explained 16% of the variance of final course scores, revealing significant effects for one non-linguistic feature (days on the system) and two linguistic features (Number of dependents per prepositional object nominal and Sentence linking connectives). One non-linguistic factor (Is a peer tutor) and two linguistic variables (Words related to self and Words related to tool use) demonstrated marginal significance. The findings indicate that language proficiency is strongly linked to math performance such that more complex syntactic structures and fewer explicit cohesion devices equate to higher course performance. The linguistic model also indicated that less self-centered students and students using words related to tool use were more successful. In addition, the results indicate that students that are more active in on-line discussion forums are more likely to be successful.
Article
Educational data mining (EDM) is a rapidly growing research area, and the outputs obtained from EDM shed light on educators’ and education planners’ efforts to make efficient decisions concerning educational strategies. However, a lack of work still exists on using EDM methods for international assessment studies such as the International Association for the Evaluation of Educational Achievement’s Trends in International Mathematics and Science Study (IEA’s TIMSS). This study aims to fill the gap in the current literature on the latest-released TIMSS 2011 data by applying a decision tree, a Bayesian network, a logistic regression, and neural networks. The best performing algorithm in classification based on several performance measures has been found for eighth-grade Turkish students’ mathematics data. During the construction of models, 11 student-based factors have been taken into account. The results show that logistic regression outperforms other algorithms in terms of measuring classification performance. The factor of student confidence has also been found as the most effective factor on eighth-grade students’ mathematics achievement.
Article
The field of education technology is embracing a use of learning analytics to improve student experiences of learning. Along with exponential growth in this area is an increasing concern of the interpretability of the analytics from the student experience and what they can tell us about learning. This study offers a way to address some of the concerns of collecting and interpreting learning analytics to improve student learning by combining observational and self-report data. The results present two models for predicting student academic performance which suggest that a combination of both observational and self-report data explains a significantly higher variation in student outcomes. The results offer a way into discussing the quality of interpretations of learning analytics and their usefulness for helping to improve the student experience of learning and also suggest a pathway for future research into this area.
Conference Paper
The collaborative learning processes of students in online learning environments can be supported by providing learning analytics-based visualisations that foster awareness and reflection about an individual's as well as the team's behaviour and their learning and collaboration processes. For this empirical study we implemented an activity widget into the online learning environment of a live five-months Master course and investigated the predictive power of the widget indicators towards the students' grades and compared the results to those from an exploratory study with data collected in previous runs of the same course where the widget had not been in use. Together with information gathered from a quantitative as well as a qualitative evaluation of the activity widget during the course, the findings of this current study show that there are indeed predictive relations between the widget indicators and the grades, especially those regarding responsiveness, and indicate that some of the observed differences in the last run could be attributed to the implemented activity widget.
Article
Data on high student failure rates in introductory programming courses have alarmed many educators, raising a number of important questions regarding prediction. In this paper, we present a comparative study on the effectiveness of educational data mining techniques for the early prediction of students likely to fail in introductory programming courses. Although several works have analyzed these techniques to identify students' academic failures, our study differs from existing ones as follows: (i) we investigate the effectiveness of such techniques to identify students likely to fail at an early enough stage for action to be taken to reduce the failure rate; (ii) we analyse the impact of data preprocessing and algorithm fine-tuning tasks on the effectiveness of the mentioned techniques. In our study we evaluated the effectiveness of four prediction techniques on two different and independent data sources on introductory programming courses available from a Brazilian Public University: one comes from distance education and the other from on-campus. The results showed that the techniques analyzed in our study can identify students likely to fail early, that the effectiveness of some of these techniques is improved after applying data preprocessing and/or algorithm fine-tuning, and that the support vector machine technique outperforms the other ones in a statistically significant way.
Article
In collaborative learning environments, students work together on assignments in virtual teams and depend on each other’s contribution to achieve their learning objectives. The online learning environment, however, may not only facilitate but also hamper group communication, coordination and collaboration. Group awareness widgets that visualise information about the different group members based on information collected from the individuals can foster awareness and reflection processes within the group. In this paper, we present a formative data study about the predictive power of several indicators of an awareness widget based on automatically logged user data from an online learning environment. In order to test whether the information visualised by the widget is in line with the study outcomes, we instantiated the widget indicators with data from four previous runs of the European Virtual Seminar on Sustainable Development (EVS). We analysed whether the tutor gradings in these previous years correlated with the students’ scores calculated for the widget indicators. Furthermore, we tested the predictive power of the widget indicators at various points in time with respect to the final grades of the students. The results of our analysis show that the grades and widget indicator scores are significantly and positively correlated, which provides a useful empirical basis for the development of guidelines for students and tutors on how to interpret the widget’s visualisations in live runs.
Article
In this letter, a new deep learning framework for spectral–spatial classification of hyperspectral images is presented. The proposed framework serves as an engine for merging the spatial and spectral features via suitable deep learning architectures: stacked autoencoders (SAEs) and deep convolutional neural networks (DCNNs) followed by a logistic regression (LR) classifier. In this framework, SAEs aim to extract useful high-level features from the one-dimensional features, which suits the dimension reduction of spectral features, while DCNNs can learn rich features from the training data automatically and have achieved state-of-the-art performance in many image classification databases. Though DCNNs have shown robustness to distortion, they only extract features of the same scale and hence are insufficient to tolerate large-scale variance of objects. As a result, spatial pyramid pooling (SPP) is introduced into hyperspectral image classification for the first time by pooling the spatial feature maps of the top convolutional layers into a fixed-length feature. Experimental results with widely used hyperspectral data indicate that classifiers built in this deep learning-based framework provide competitive performance.
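The spatial pyramid pooling step mentioned in the abstract above can be illustrated with a small sketch. It assumes a single-channel feature map stored as a list of rows, max pooling, and pyramid levels of 1×1 and 2×2 bins; the function name `spp_max` and these choices are illustrative and are not the configuration used in the letter.

```python
import math


def spp_max(feature_map, levels=(1, 2)):
    """Max-pool a 2-D feature map into a fixed-length vector.

    For each pyramid level n, the map is divided into an n x n grid of bins
    (bin edges use floor/ceil, so neighbouring bins may overlap by one row
    or column) and the maximum of each bin is kept. The output length is
    sum(n * n for n in levels), independent of the input height and width.
    """
    h = len(feature_map)
    w = len(feature_map[0])
    pooled = []
    for n in levels:
        for i in range(n):
            r0 = math.floor(i * h / n)
            r1 = math.ceil((i + 1) * h / n)
            for j in range(n):
                c0 = math.floor(j * w / n)
                c1 = math.ceil((j + 1) * w / n)
                pooled.append(
                    max(feature_map[r][c]
                        for r in range(r0, r1)
                        for c in range(c0, c1))
                )
    return pooled


# Two hypothetical feature maps of different sizes.
small_map = [[1, 2, 3],
             [4, 5, 6],
             [7, 8, 9]]
wide_map = [[1, 0, 2, 0],
            [0, 3, 0, 4]]

vec_small = spp_max(small_map)
vec_wide = spp_max(wide_map)
```

Both inputs yield a vector of length 5 (one value from the 1×1 level plus four from the 2×2 level), which is the property that lets a fixed-size classifier follow convolutional layers applied to inputs of varying size.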
Article
An enduring issue in higher education is student retention to successful graduation. National statistics indicate that most higher education institutions have four-year degree completion rates around 50 percent, or just half of their student populations. While there are prediction models which illuminate what factors assist with college student success, interventions that support course selections on a semester-to-semester basis have yet to be deeply understood. To further this goal, we develop a system to predict students' grades in the courses they will enroll in during the next enrollment term by learning patterns from historical transcript data coupled with additional information about students, courses and the instructors teaching them. We explore a variety of classic and state-of-the-art techniques which have proven effective for recommendation tasks in the e-commerce domain. In our experiments, Factorization Machines (FM), Random Forests (RF), and the Personalized Multi-Linear Regression model achieve the lowest prediction error. Application of a novel feature selection technique is key to the predictive success and interpretability of the FM. By comparing feature importance across populations and across models, we uncover strong connections between instructor characteristics and student performance. We also discover key differences between transfer and non-transfer students. Ultimately we find that a hybrid FM-RF method can be used to accurately predict grades for both new and returning students taking both new and existing courses. Application of these techniques holds promise for student degree planning, instructor interventions, and personalized advising, all of which could improve retention and academic performance.
Conference Paper
In this paper we discuss the results of a study of students' academic performance in first year general education courses. Using data from 566 students who received intensive academic advising as part of their enrollment in the institution's pre-major/general education program, we investigate individual student, organizational, and disciplinary factors that might predict a student's potential classification in an Early Warning System as well as factors that predict improvement and decline in their academic performance. Disciplinary course type (based on Biglan's [7] typology) was significantly related to a student's likelihood to enter below average performance classifications. Students were the most likely to enter a classification in fields like the natural sciences, mathematics, and engineering in comparison to humanities courses. We attribute these disparities in academic performance to disciplinary norms around teaching and assessment. In particular, the timing of assessments played a major role in students' ability to exit a classification. Implications for the design of Early Warning analytics systems as well as academic course planning in higher education are offered.
Article
Although asynchronous online discussion (AOD) is increasingly used as a main activity for blended learning, many students find it difficult to engage in discussions and report low achievement. Early prediction and timely intervention can help potential low achievers get back on track as early as possible. This study presented a data mining process to construct proxy variables that reflect theoretical and empirical evidence and measured the accuracy of a prediction model that incorporated all of the variables for validation. For the empirical study, data were obtained from 105 university students who were enrolled in two blended learning courses that used AOD as their main activity. The results indicated the high accuracy of the prediction model as well as the possibility of early detection and timely interventions. In addition, we examined participants' learning behaviors in the two courses using the proxy variables and provided suggestions for practice. The implications of this study for education data mining and learning analytics are discussed.
Conference Paper
As courses become bigger, move online, and are deployed to the general public at low cost (e.g. through Massive Open Online Courses, MOOCs), new methods of predicting student achievement are needed to support the learning process. This paper presents a novel method for converting educational log data into features suitable for building predictive models of student success. Unlike cognitive modelling or content analysis approaches, these models are built from interactions between learners and resources, an approach that requires no input from instructional or domain experts and can be applied across courses or learning environments.
Article
Massive open online courses (MOOCs) continue to appear across the higher education landscape, originating from many institutions in the USA and around the world. MOOCs typically have low completion rates, at least when compared with traditional courses, as this course delivery model is very different from traditional, fee-based models, such as college courses. This research examined MOOC student demographic data, intended behaviours and course interactions to better understand variables that are indicative of MOOC completion. The results lead to ideas regarding how these variables can be used to support MOOC students through the application of learning analytics tools and systems.
Conference Paper
The pervasive collection of data has opened the possibility for educational institutions to use analytics methods to improve the quality of the student experience. However, the adoption of these methods faces multiple challenges particularly at the course level where instructors and students would derive the most benefit from the use of analytics and predictive models. The challenge lies in the knowledge gap between how the data is captured, processed and used to derive models of student behavior, and the subsequent interpretation and the decision to deploy pedagogical actions and interventions by instructors. Simply put, the provision of learning analytics alone has not necessarily led to changing teaching practices. In order to support pedagogical change and aid interpretation, this paper proposes a model that can enable instructors to readily identify subpopulations of students to provide specific support actions. The approach was applied to a first year course with a large number of students. The resulting model classifies students according to their predicted exam scores, based on indicators directly derived from the learning design.
Article
This study sought to identify significant behavioral indicators of learning using learning management system (LMS) data regarding online course achievement. Because self-regulated learning is critical to success in online learning, measures reflecting self-regulated learning were included to examine the relationship between LMS data measures and course achievement. Data were collected from 530 college students who took an online course. The results demonstrated that students' regular study, late submissions of assignments, number of sessions (the frequency of course logins), and proof of reading the course information packets significantly predicted their course achievement. These findings verify the importance of self-regulated learning and reveal the advantages of using measures related to meaningful learning behaviors rather than simple frequency measures. Furthermore, the measures collected in the middle of the course significantly predicted course achievement, and the findings support the potential for early prediction using learning performance data. Several implications of these findings are discussed.
Article
The purpose of this study is to identify at-risk online students earlier, more often, and with greater accuracy using time-series clustering. The case study showed that the proposed approach could generate models with higher accuracy and feasibility than traditional frequency aggregation approaches. The best performing model can start to capture at-risk students from week 10. In addition, the four phases in students' learning process revealed a holiday effect and illustrate at-risk students' behaviors before and after a long holiday break. The findings also enable online instructors to develop corresponding instructional interventions via course design or student-teacher communications.
Article
This review pursues a twofold goal: the first is to preserve and enhance the chronicle of recent advances in educational data mining (EDM); the second is to organize, analyze, and discuss the content of the review based on the outcomes produced by a data mining (DM) approach. Thus, as a result of the selection and analysis of 240 EDM works, an EDM work profile was compiled to describe 222 EDM approaches and 18 tools. A profile of the EDM works was organized as a raw data base, which was transformed into an ad-hoc data base suitable to be mined. As a result of the execution of statistical and clustering processes, a set of educational functionalities was found, a realistic pattern of EDM approaches was discovered, and two patterns of value-instances to depict EDM approaches based on descriptive and predictive models were identified. One key finding is that most of the EDM approaches are grounded on a basic set composed of three kinds each of educational systems, disciplines, tasks, methods, and algorithms. The review concludes with a snapshot of the surveyed EDM works, and provides an analysis of the EDM strengths, weaknesses, opportunities, and threats, whose factors represent, in a sense, future work to be fulfilled.
Article
In recent years, deep neural networks (including recurrent ones) have won numerous contests in pattern recognition and machine learning. This historical survey compactly summarises relevant work, much of it from the previous millennium. Shallow and deep learners are distinguished by the depth of their credit assignment paths, which are chains of possibly learnable, causal links between actions and effects. I review deep supervised learning (also recapitulating the history of backpropagation), unsupervised learning, reinforcement learning & evolutionary computation, and indirect search for short programs encoding deep and large networks.
Article
This paper shows how web usage mining can be applied in e-learning systems in order to predict the marks that university students will obtain in the final exam of a course. We have also developed a specific Moodle mining tool oriented for the use of not only experts in data mining but also of newcomers like instructors and courseware authors. The performance of different data mining techniques for classifying students are compared, starting with the student's usage data in several Cordoba University Moodle courses in engineering. Several well-known classification methods have been used, such as statistical methods, decision trees, rule and fuzzy rule induction methods, and neural networks. We have carried out several experiments using all available and filtered data to try to obtain more accuracy. Discretization and rebalance pre-processing techniques have also been used on the original numerical data to test again if better classifier models can be obtained. Finally, we show examples of some of the models discovered and explain that a classifier model appropriate for an educational environment has to be both accurate and comprehensible in order for instructors and course administrators to be able to use it for decision making. © 2010 Wiley Periodicals, Inc. Comput Appl Eng Educ; Published online in Wiley InterScience (www.interscience.wiley.com); DOI 10.1002/cae.20456