Figure 2 - uploaded by Ryan Baker
A learning curve of student performance over time in a Cognitive Tutor (from [39]) 

Source publication
Chapter
Data mining methods have in recent years enabled the development of more sophisticated student models which represent and detect a broader range of student behaviors than was previously possible. This chapter summarizes key data mining methods that have supported student modeling efforts, discussing also the specific constructs that have been model...

Contexts in source publication

Context 1
... The range of clustering methods used in educational data mining approximately corresponds to the types of prediction methods used in data mining more broadly, including algorithms such as k-means [34] and Expectation Maximization (EM)-Based Clustering [19], and model frameworks such as Gaussian Mixture Models [48]. Clustering has been used to develop student models for several types of educational software, including intelligent tutoring systems. In particular, fine-grained models of student behavior at the action-by-action level are clustered in terms of features of the student actions. For instance, Amershi & Conati used clustering on student behavior within an exploratory learning environment, discovering that certain types of reflective behavior and strategic advancement through the learning task were associated with better learning [4]. In addition, Beal and her colleagues applied clustering to study the categories of behavior within an intelligent tutoring system [16]. Other prominent research has investigated how clustering methods can assist in content recommendation within e-learning [55, 62]. Clustering is generally most useful when relatively little is known about the categories of interest in the data set, such as in types of learning environment not previously studied with educational data mining methods [e.g. 4] or for new types of learner-computer interaction, or where the categories of interest are unstable, as in content recommendation [e.g. 55, 62]. The use of clustering in domains where a considerable amount is already known brings some risk of discovering phenomena that are already known. As work in other areas of EDM goes forward, an increasing amount is known about student behavior across learning environments. One potential future use of clustering, in this situation, would be to use clustering as a second stage in the process of modeling student behavior in a learning system.
First, existing detectors could be used to classify known categories of behavior. Then, data points not classified as belonging to any of those known behavior categories could be clustered, in order to search for unknown behaviors. Expectation Maximization (EM)-Based Clustering [19] is likely to be a method of particularly high potential for this, as it can explicitly incorporate already known categories into an initial starting point for clustering. One key recent trend facilitating the use of educational data mining methods to improve student models is the advance in methods for distilling data for human judgment. In many cases, human beings can make inferences about data, when it is presented appropriately, that are beyond the immediate scope of fully automated data mining methods. The information visualization methods most commonly used within EDM are often different from those most often used for other information visualization problems [cf. 36, 38], owing to the specific structure often present in intelligent tutor data, and the meaning embedded within that structure. For instance, data is meaningfully organized in terms of the structure of the learning material (skills, problems, units, lessons) and the structure of learning settings (students, teachers, collaborative pairs, classes, schools). Data is distilled for human judgment in educational data mining for two key purposes: classification and identification. One key area of development of data distillations supporting classification is the text replay methodology [12]. An example of a text replay is shown in Figure 1. In this case, sub-sections of a data set are displayed in text format, and labeled by human coders. These labels are then generally used as the basis for the development of a predictor. Text replays are significantly faster than competing methods for labeling, such as quantitative field observations or video coding [12, 13], and achieve good inter-rater reliability [12, 14].
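Inter-rater reliability for categorical labels, such as those produced by coders reading text replays, is commonly summarized with Cohen's kappa, which corrects raw agreement for agreement expected by chance. The excerpt does not say which statistic the cited studies used, so the following is only a minimal sketch of that one common choice; the function name `cohens_kappa` and the ten example labels are hypothetical and are not data from the source.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two coders who labeled the same clips."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two equal-length, non-empty label lists")
    n = len(labels_a)
    # Observed agreement: fraction of clips where both coders chose the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement, from each coder's marginal label frequencies.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both coders used one identical label for every clip
    return (observed - expected) / (1.0 - expected)


# Hypothetical labels for ten text replay clips (illustration only).
coder_1 = ["gaming", "not", "not", "gaming", "not",
           "not", "not", "gaming", "not", "not"]
coder_2 = ["gaming", "not", "not", "not", "not",
           "not", "not", "gaming", "not", "gaming"]
kappa = cohens_kappa(coder_1, coder_2)
print(round(kappa, 3))
```

Here the coders agree on 8 of 10 clips, but because both label most clips "not", much of that agreement is expected by chance, and kappa is correspondingly lower than the raw 0.8.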
Text replays have been used to support the development of prediction models of gaming the system in multiple learning environments [12, 14], and to develop models of scientific reasoning skill in inquiry learning environments [45, 51]. An alternate approach, displaying a re-constructed replay of a student’s screen, has also been used to label student data for use in classification [cf. 29]; however, this approach has become less common, as it is significantly slower than text replays [cf. 12], while not giving more information about student behavior or expression outside the system, unlike methods such as quantitative field observation and video methods. Identification of learning patterns and learner individual differences from visualizations is a key method for exploring educational data sets. For instance, Hershkovitz and Nachmias’s learnograms provide a rich representation of student behavior over time [36]. Within the domain of student models, a key use of identification with distilled and visualized data is in inference from learning curves, as shown in Figure 2. A great deal can be inferred from learning curves about the character of learning in a domain [26, 39], as well as about the quality of the domain model. Classic learning curves display the number of opportunities to practice a skill on the X axis, and display performance (such as percent correct or time taken to respond) on the Y axis. A curve with a smooth downward progression that is steep at first and gentler later indicates that successful learning is occurring. A flatter curve, as in Figure 2, indicates that learning is occurring, but with significant difficulty. A sudden spike upwards, by contrast, indicates that more than one knowledge component is included in the model [cf. 26]. A flat high curve indicates poor learning of the skill, and a flat low curve indicates that the skill did not need instruction in the first place. An upwards curve indicates the difficulty is increasing too fast. 
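The learning curves described above can be computed directly from tutor logs. The sketch below is illustrative only: it assumes a hypothetical log layout of `(student, skill, correct)` tuples in chronological order, and it plots error rate (rather than percent correct) on the Y axis so that a downward curve corresponds to learning. The helper `upward_spikes` and its `jump` threshold are assumptions introduced here as one crude way to flag the sudden upward spikes that the text associates with more than one knowledge component.

```python
from collections import defaultdict


def learning_curve(attempts):
    """Return [(opportunity, error_rate)] aggregated over students.

    `attempts` is an iterable of (student, skill, correct) tuples in
    chronological order; this layout is an assumption for illustration.
    """
    seen = defaultdict(int)    # (student, skill) -> opportunities so far
    errors = defaultdict(int)  # opportunity number -> number of errors
    totals = defaultdict(int)  # opportunity number -> number of attempts
    for student, skill, correct in attempts:
        seen[(student, skill)] += 1
        opportunity = seen[(student, skill)]
        totals[opportunity] += 1
        if not correct:
            errors[opportunity] += 1
    return [(k, errors[k] / totals[k]) for k in sorted(totals)]


def upward_spikes(curve, jump=0.2):
    """Opportunities where error rate rises by more than `jump` (hypothetical threshold)."""
    return [k for (_, prev), (k, rate) in zip(curve, curve[1:]) if rate - prev > jump]


# Hypothetical log for two students practicing one skill.
log = [
    ("s1", "add", False), ("s2", "add", False),
    ("s1", "add", False), ("s2", "add", True),
    ("s1", "add", True), ("s2", "add", True),
]
curve = learning_curve(log)
print(curve)
print(upward_spikes(curve))
```

A steadily falling `curve` with no flagged spikes matches the "successful learning" shape; a flagged opportunity would be a prompt to inspect the domain model by hand, not a conclusion in itself.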
Hence, learning curves are a powerful tool to support quick inference about the character of learning in an educational system, leading to their recent incorporation into tools used by education researchers outside of the educational data mining community [e.g. 39]. An alternate method for developing student models is knowledge engineering [32, 54]. Knowledge engineering approaches develop models that can engage in problem-solving, reasoning, or decision making, making the same decisions that a human expert would; they can do so simply by replicating the decision-making results or by attempting to develop a cognitive model that reasons in the same fashion that a human expert would. As a method, knowledge engineering relies upon human researchers studying the construct of interest, and directly developing (engineering) the model of the construct of interest. The mapping between features of the data set and the construct of interest is directly made by the engineer. As such, knowledge engineering can be contrasted with classification or regression, which use labels generated through expert decision-making but develop the mapping between the features of the data set and the construct of interest through an automated process. Knowledge engineering is frequently used to develop domain models, as discussed in another chapter in this volume. Within student modeling, knowledge engineering has been a prominent method for modeling sophisticated student behaviors within intelligent tutoring systems, with a focus on gaming the system and help-seeking behaviors. For instance, Beal and her colleagues used knowledge engineering to model gaming the system [16]. Shih and his colleagues used knowledge engineering to develop a mathematical model that could detect self-explanation and appropriate use of bottom-out hints [52]. Buckley and his colleagues used knowledge engineering to assess students’ level of systematicity during problem-solving in interactive simulations [20].
Within student modeling, knowledge engineering is frequently used to develop models of sophisticated student behavior in a hybrid fashion, where knowledge engineering is used to develop the functional form of a mathematical model, and then automated parameter-fitting is used to find (or refine) values for the parameters of that model. For instance, Aleven and his colleagues developed a model of a range of student help-seeking behaviors in Cognitive Tutors [2, 3], using knowledge engineering to develop the functional form of a mathematical model, and then automated parameter-fitting to find values for the parameters of that model. Several of the components of Aleven et al.’s model predicted differences in student learning. In another example, Beck presented a model of hasty guessing [17] (called disengagement in the original paper, but renamed hasty guessing in later work) in an intelligent tutor for reading, developed using knowledge engineering to develop an item-response theory model, and then using automated parameter-fitting to find values for the parameters of that model. Beck’s model successfully predicted differences in student post-test scores. Johns & Woolf (2006) used a similar combination of knowledge engineering and parameter fitting to model gaming the system [37]. In addition, educational data mining research often also involves some degree of knowledge engineering during the process of generating the data set features to use within classification or regression. During this step of the data mining process, researchers often attempt to infer what types of features an expert coder would use, although this trend is diminishing as features are increasingly re-used in creating new data mining models, either from the same research group, or across research groups [cf. 8, 11, 21, 57]. As can be seen, knowledge engineering and educational data mining have both been used to model gaming the system.
Aside from this overlap, the two approaches have been used to model different phenomena, with knowledge engineering methods emphasized in modeling help-seeking while educational data mining methods have been emphasized in modeling affect, self-efficacy, and off-task behavior. It is worth noting that the domains emphasized in educational data mining are often cases where recognition is fairly easy for humans (e.g. it is feasible to tell that a student is bored by looking at him/her [e.g. 31]), but where it is difficult to analyze exactly how those decisions are made in terms of features of data available in the log files. In these cases, an automated ...
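The hybrid approach described in the context above (knowledge engineering fixes the functional form, then automated fitting finds the parameter values) can be illustrated with a deliberately simplified stand-in. This is not Aleven et al.'s or Beck's model: the rule "a response faster than some threshold is a hasty guess", the names `is_hasty` and `fit_threshold`, the candidate grid, and the labeled examples are all hypothetical.

```python
def is_hasty(response_time, threshold):
    """Hand-engineered functional form: fast responses count as hasty guesses."""
    return response_time < threshold


def fit_threshold(examples, candidates):
    """Automated step: pick the candidate threshold with the highest accuracy.

    `examples` is a list of (response_time_seconds, labeled_hasty) pairs.
    Ties are broken in favor of the earliest candidate.
    """
    def accuracy(threshold):
        hits = sum(is_hasty(t, threshold) == label for t, label in examples)
        return hits / len(examples)

    return max(candidates, key=accuracy)


# Hypothetical labeled actions: (response time in seconds, coder says "hasty").
examples = [(0.8, True), (1.2, True), (1.9, True),
            (2.6, False), (3.4, False), (5.0, False)]
best = fit_threshold(examples, candidates=[1.0, 1.5, 2.0, 3.0, 4.0])
print(best)
```

The division of labor is the point of the sketch: a person decides that response time matters and in which direction, while the search over `candidates` only tunes the one free parameter of that hand-built form.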

Citations

... The main reason why we chose this family of algorithms was that they allow addressing a problem in such a way that all the solution options are analysed, and they are very easy to interpret (Rokach & Maimon, 2005). Moreover, previous related research reports good performance using these techniques (Chung & Lee, 2019; Meedech et al., 2016), and they handle heterogeneous data (qualitative and quantitative) very well and control over-fitting through pruning processes (Baker, 2010). The preparation of the dataset and the model training process were performed through four specific steps, and then an additional analysis through association rules was carried out. ...
Chapter
The surprise arrival of the COVID-19 pandemic produced an accelerated transition in all educational institutions, forcing them to take advantage of digital technologies and the Internet to ensure that their operation could keep going. In this document, a study of various scientific articles, reports, publications, and existing documentation on the digital transformation processes launched in the different Latin American universities was carried out, presenting the methodological proposals promoted toward the new modalities of remote education, the reinvention of administrative processes, and the support provided to the university community to reduce the digital divide. An online survey was designed to know the advances in the digital transformation (DT) of 20 universities in Latin America. Outcomes of the online survey supply insights in four key DT objectives: teaching and learning, student support, research, and administration. Also, a case study of the implementation and monitoring of the DT model at the Technological University of Panama and its projections was considered.
... On the other hand, educational data mining utilizes methods of relationship mining, discovery with models, classification and clustering. The technique has laid its foundation on automated methods for discovery using educational data [12]. Figure 1: Distinction between educational data mining and learning analytics. Two broad approaches of educational data mining are the classification [13] of student model. ...
Conference Paper
Over the last two decades or so, educational data mining has evolved as an emerging discipline to analyze the type of data that comes from academics. Several research studies have been carried out in Intelligent Tutoring Systems (ITS), Difficulty Factor Assessments, Latent Knowledge Estimation, Knowledge Inferences, Recommender Systems and Social Network Analysis. Gathering evidence of learning from educational setups has laid the foundation of learning analytics and educational data mining. Bayesian Knowledge Tracing (BKT), Q-Metrics, Performance Factor Analysis and Latent Knowledge Estimation methods are useful for the study of students’ success. Other methods like matrix factorization and knowledge components are suited for analyzing students’ knowledge and performance. On the other hand, knowledge engineering and clustering are useful to develop student models for educational software. The current scope of research areas and methods utilized in educational data mining and learning analytics is discussed in this paper.
... The human evaluation of student behavior establishes when the behavior occurred (which serves as the predicted variable). For coding of wheel-spinning behavior in this study, play visualization based on text replays were adopted for their efficiency and accuracy [33]. Text replays, based on recorded log file data, are a text-based representation of student action during a given period of time. ...
Conference Paper
Games in service of learning are uniquely positioned to offer immersive, interactive educational experiences. Well-designed games build challenge through a series of well-ordered problems or activities, in which perseverance is key for working through in-game failure and increasing game difficulty. Indeed, persistence through challenges during learning is beneficial not just in games but in other contexts as well, with grit and perseverance positively associated with academic performance and learning outcomes. However, recent studies suggest that not all persistence is positive, suggesting that many students end up "wheel-spinning", spending considerable time on a topic without achieving mastery. Thus, it is vital to differentiate productive and unproductive persistence in order to understand emergent student progress, particularly in the context of learning games and personalized learning systems, in which individual pathways differ greatly based on student needs. Leveraging Educational Data Mining methods, this study builds a detector of wheel-spinning behavior (differentiated from productive persistence) in an adaptive, game-based learning system. With the ability to predict unproductive persistence early, this detection model can be used to intelligently adapt to students needing further support in-system, as well as informing in-person intervention in a classroom setting-thus supporting a personalized, engaging learning experience in both formal and informal learning environments.
... As an example, clustering methods including sequential pattern mining [160] and association rule discovery [161] are used in RomanTutor [162] to extract problem space and support tutoring services [163]. Interested readers are referred to read [164] for more details. ...
... Educational Data Mining methods come from different literature sources including data mining, machine learning, and information visualization. A point of view, proposed by Baker [6], classifies the work in EDM as follows: ...
Preprint
This paper provides interested beginners with an updated and detailed introduction to the field of Intelligent Tutoring Systems (ITS). ITSs are computer programs that use artificial intelligence techniques to enhance and personalize automation in teaching. This paper is a literature review that provides the following: First, a review of the history of ITS along with a discussion on the interface between human learning and computer tutors and how effective ITSs are in contemporary education. Second, the traditional architectural components of an ITS and their functions are discussed along with approaches taken by various ITSs. Finally, recent innovative ideas in ITS systems are presented. This paper concludes with some of the author's views regarding future work in the field of intelligent tutoring systems.
... Detectors of affect and emotion have been used to drive automated adaptation to differences in student affect, significantly reducing students' frustration and anxiety and increasing the incidence of positive emotion" [16]. III. RESEARCH METHODOLOGY. All graduating students across the seven (7) colleges are required to answer an interview survey. ...
... We propose the use of terminology based on the "level" keyword, as illustrated in Fig. 4. This terminology has less overloaded meaning than the terminology using the keyword "stratified" and the "student-level cross-validation" notion has already been used in many papers, for example by Baker (2010), Koedinger et al. (2012b), and Baker et al. (2012). ...
Article
The core of student modeling research is about capturing the complex learning processes into an abstract mathematical model. The student modeling research, however, also involves important methodological aspects. Some of these aspects may seem like technical details not worth significant attention. However, the details matter. We discuss three important methodological issues in student modeling: the impact of data collection, the splitting of data into a training set and a test set, and the details concerning averaging in the computation of predictive accuracy metrics. We explicitly identify decisions involved in these steps, illustrate how these decisions can influence results of experiments, and discuss consequences for future research in student modeling.
... Educational Data Mining (EDM) focuses on processing such data to provide information about aspects and elements of the learning process, for instance, student-system interaction, the performance of learning environments, and the learning process itself. This emerging field exploits statistical, machine-learning and data mining algorithms [1], and it is defined as an emerging discipline concerned with the development of methods for exploring the unique types of data that come from educational settings, and using those methods to understand students and the settings in which they learn [2]. ...
... It contains several types of installation procedures. This feature is highly distinctive for this cluster since no other cluster contains it.2 3 This cluster is characterized by words such as structure, TS30, isolator, and electric pole. ...
... Data is mined and the algorithms are applied to predict the results [7]. Ryan S. J. d. Baker (n.d.) focuses on the data mining methods which support student modeling efforts and are modeled with the use of EDM that credit for details of student behavior and performance [8]. Dr. Rachel Rubin's (2014) study provides a high survey response rate of selective higher education institutions (82%, n = 63) combined with process-oriented interviews [9]. ...
Conference Paper
Every academic year the institution welcomes its students from different locations and provides its valuable resources for every student to attain successful graduation. At present, the institution maintains the details of students manually. It becomes a tedious task to analyze those records and fetch any information in a short time. Data mining computational methodology helps to discover patterns in large data sets using artificial intelligence, machine learning, statistics, and database systems. Education Data Mining addresses these sensitive issues using a significant technique of data mining for analysis of admission. In this research paper, the analysis of admission is done location-wise, and comparison is done based on year-wise admission. The total admission rate for the current academic year and the frequency of student admission across the state are calculated. The result of the analyzed data is visualized and reported for organizational decision making.