Preliminary version, Draft – Original published in: Khalil, M., Ebner, M. (2016) Clustering patterns of engagement in Massive Open Online Courses (MOOCs): the use of learning analytics to reveal student categories. Journal of Computing in Higher Education. pp. 1-19. http://dx.doi.org/10.1007/s12528-016-9126-9
Get final version here: http://rdcu.be/lZLx
Clustering Patterns of Engagement in Massive Open Online Courses (MOOCs): The Use of
Learning Analytics to Reveal Student Categories
Mohammad Khalil and Martin Ebner
Educational Technology, Graz University of Technology, Graz, Austria
mohammad.khalil@tugraz.at
Abstract
Massive Open Online Courses (MOOCs) are remote courses that excel in the heterogeneity and quantity of their students. Owing to this massiveness, the large datasets generated by MOOC platforms require advanced tools and techniques to reveal hidden patterns for the purpose of enhancing learning and educational behaviors. This publication offers a research study on using clustering as one of these techniques to portray learners' engagement in MOOCs. It utilizes Learning Analytics techniques to investigate a MOOC offered by Graz University of Technology for undergraduates. The authors mainly seek to classify students into appropriate categories based on their level of engagement. The resulting clusters are then compared with a classical scheme (Cryer's scheme of Elton) for further examination. The final results of this study show that extrinsic factors are not enough to make students highly committed to the MOOC; adding intrinsic factors is recommended to improve future MOOCs.
Keywords
Massive Open Online Courses (MOOCs); Learning Analytics; Clustering; Engagement; Scheme; Patterns; Activity
Biography
Mohammad Khalil is a Ph.D. student at Graz University of Technology. His research focuses strongly on Learning
Analytics in MOOCs.
Martin Ebner is a non-tenured associate professor and heads the Department of Educational Technology at Graz University of Technology.
Introduction
In recent years, Technology Enhanced Learning (TEL) has advanced considerably in distance learning. Newly constructed online classes that encompass various learning objects are called Massive Open Online Courses (MOOCs) (McAuley et al., 2010). The four words carry a specific meaning: Massive ('M') indicates a far larger number of enrollees than in regular classes; Open ('O') refers to free accessibility and openness to anyone; Online ('O') stands for courses held on the borderless Internet; and Courses ('C') represents structured learning material, mostly embodied as filmed lectures, interactive social media channels, and articles.
The first version of MOOCs was developed by George Siemens and Stephen Downes (Hollands & Tirthali, 2014). It adopted the connectivism theory, which is based on networking information over social channels. Since then, such courses have been referred to as cMOOCs. Thereafter, other versions of MOOCs became available. The xMOOCs, or extended MOOCs, are offered in the form of a classical instructional way of learning. They consider content and assessment as essential elements of the teaching and learning process (Ferguson & Clow, 2015). xMOOCs follow theories that are based on guided learning and classical information transmission (Rodriguez, 2012), where hundreds of learners are hosted simultaneously (Carson & Schmidt, 2012). One of the most prominent and successful xMOOC activities was carried out by Sebastian Thrun in 2011. His group launched an online course called "Introduction to Artificial Intelligence", which attracted over 160,000 participants from all over the world (Yuan & Powell, 2013).
MOOCs have the potential to scale education in disparate areas. Their benefits crystallize in improving educational outcomes, extending accessibility, and reducing costs. In addition, Ebner and his colleagues addressed the advantages that MOOCs can contribute to the field of Open Educational Resources (OER) as well as to lifelong learning experiences in TEL contexts (Ebner et al., 2014).
Despite their advantages, MOOCs suffer from students who register and afterward disengage. This has been cited in several scientific research studies and is usually denoted as a "dropout" or "attrition" issue (Meyer, 2012; Jordan, 2013). Various investigations have been performed to identify the reasons behind the low completion rates (Khalil & Ebner, 2014; Lackner et al., 2015; Khalil & Ebner, 2016).
Furthermore, the lack of interaction between learners and instructor(s), and the controversy about MOOCs' pedagogical approach, obstruct the positive advancement of MOOCs. Recent research publications additionally discussed the patterns of engagement and the debate on categorizing students in MOOCs (Kizilcec et al., 2013; Ferguson & Clow, 2015; Khalil & Ebner, 2015a; Alario-Hoyos et al., 2016).
Since MOOCs produce a large quantity of data generated by students who interact in an online environment, they fit well with what is called Learning Analytics. Knox (2014) discussed the high promises of Learning Analytics when applied to MOOC datasets in order to overcome the previously listed issues. The need for Learning Analytics emerged to optimize learning and students' commitment in distance education applications (Khalil & Ebner, 2015b).
This publication is an extended version of the work of Khalil, Kastl and Ebner (2016). It carries out a deep analysis of empirical data from a 10-week MOOC offered on the Austrian MOOC platform iMooX (http://www.imoox.at). We employ k-means clustering to discover interesting characteristics of the participants. The objectives of this research study are to portray the engagement and behavior of learners on MOOC platforms and to support decisions on following up with students in order to increase retention and improve interventions for specific subpopulations. The paper also compares our results with the pedagogical theory of Elton (1996), which was used to motivate students in classes. Finally, we believe this work contributes additional value by easing the grouping of MOOC participants.
Literature Review
This section reviews related work on clustering activities and engagement in MOOCs. Our search in different academic libraries returned only few quantitative or qualitative studies of learner engagement and clustering in MOOCs. Nevertheless, we list below five related publications which strongly pertain to our research study.
One of the prominent research studies from the Learning Analytics and Knowledge conference (LAK) is the work of Kizilcec, Piech, and Schneider (2013). A recent survey study by Khalil and Ebner (2016a) found that this article received the highest number of citations in the conference proceedings between 2013 and 2015.
The authors analyzed different subpopulations in MOOCs. Their results showed four types of students: I) Completers, students who completed the courses; II) Auditors, who did assessments infrequently but were more interested in watching videos; III) Disengaging students, who dropped out after being engaged in the first third of the class; and IV) Sampling students, who watched videos for only the first two weeks. Their results influenced other researchers such as Ferguson et al. (2015), who replicated the same clustering methodology. Their research analyzed five MOOCs from the FutureLearn (http://www.futurelearn.com) MOOC platform. The authors concluded that the clustering of subpopulations of one MOOC is not always applicable to other MOOCs. For example, they clustered long MOOCs into seven groups, whilst shorter MOOCs were clustered into four groups. This conclusion indicates that different approaches may suit different MOOCs when it comes to cluster analysis.
Other types of clustering were used to examine assignments and lecture views in MOOCs. Anderson et al. (2014) clustered students' engagement with these two factors into five subpopulations based on quantitative investigations: I) Viewers, who watch lectures but hand in few assignments; II) Solvers, who hand in assignments but view few lecture videos; III) All-rounders, who balance between the two; IV) Collectors, who download lectures but hand in few assignments; and finally, V) Bystanders, registrants who never show up again.
Khalil and Ebner (2016b) classified four categories of students based on their activity in attending quizzes, watching videos and engaging in discussion forums: I) Registrants, students who just enroll in a course and never show up; II) Active learners, students who perform some type of activity, such as watching a single video or attending one quiz; III) Completers, students who successfully finish all quizzes but do not ask for certificates; and finally, IV) Certified students, the Completers who ask for the certificate letter.
Kovanović et al. (2016) employed k-means clustering on 28 MOOCs from the Coursera platform and derived five clusters: I) Enrollees, students who are not active; II) Low Engagement, students with very low activity; III) Videos, students who primarily watch videos; IV) Videos & Quizzes, students who engage with videos and do quizzes; and V) Social, students who participate actively in discussion forums.
Research Questions
MOOCs offer a fruitful and diverse environment that includes different types of students. Based on the collected
demographic information of the examined MOOC, this research study mainly explores two types of MOOC
enrollees: students from the university and students from outside the university, usually anybody from the general public. The primary research questions are:
RQ1: How does each type of student engage with the MOOC platform? This question will be answered through the analysis of students' behavior on the MOOC platform. We examine the main learning objects of each group and classify the correlated attitudes.
RQ2: Are there similarities between the MOOC clustering outcomes and those of classical face-to-face classes? We follow the Cryer's scheme of Elton framework (see Elton, 1996; Herzberg, 1968) to answer this question. The scheme might be debatable and open to criticism, but it is derived from the famous fundamental principle of Herzberg's (1968) theory, which was employed to motivate employees at work. Therefore, we think it is quite engaging to see whether an older framework like Elton's might fit the new generation of distance learning.
Research Methodology
Data collection and parsing were performed using the iMooX Learning Analytics Prototype (iLAP), a tool that provides researchers and administrators with filtered tracked data (Khalil & Ebner, 2016b). The large amount of generated records enables us to analyze and classify learners. By tracing the footprints they leave behind, the tool stores learner actions. It fetches low-level data from the different available MOOC indicators; videos watched, file downloads, reading in forums, posting in forums, quiz results, and logins are examples of the obtained information. The dataset analyzed in this study derives from a MOOC offered in the summer semester of 2015 by Graz University of Technology. The collected dataset was then parsed to refine duplicated and unstructured data formats. Data analysis was carried out using the R software. The additional R package NbClust (Charrad et al., 2014) was used for implementing the k-means clustering algorithm.
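As a rough illustration of this setup, the following R sketch shows how a parsed iLAP export could be loaded for analysis; the file name and column structure are illustrative assumptions, not the actual iLAP schema.

```r
# Minimal sketch (assumption): the parsed iLAP export is a CSV with one row per
# student; the file name and column names are placeholders for illustration.
gadi <- read.csv("gadi_indicators.csv", stringsAsFactors = FALSE)

# Inspect the available indicators, e.g. logins, forum reads/posts,
# videos watched, quiz attempts, and the student type (university/external).
str(gadi)
```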
We followed the content analysis methodology, in which units of analysis (the MOOC indicators) are measured and benchmarked based on qualitative decisions (choosing the k partitions in clustering, comprehensive decisions, the survey, etc.) (Neuendorf, 2002). These decisions are founded on sustained observations on a weekly basis and an examination of surveys at the end of the course. Thus, we answer the research questions according to the outcomes of the algorithm, the survey, and the observations.
MOOC Platform and Course Overview
The MOOC-Platform
iMooX is an Austrian MOOC platform founded through the cooperation of Graz University of Technology and the University of Graz. The offered courses vary in topic between social science, engineering and technology. Moreover, they align with the lifelong learning and OER tracks, in which materials are provided under a Creative Commons license. The target groups range from school children to high school diploma and university degree holders in the German-speaking countries. iMooX offers certificates and badges at no cost to successful students who fulfill the course requirements (Wüster & Ebner, 2016).
Course Overview and Demographic Analysis
The analyzed course is entitled "Social Aspects of Information Technology" and is abbreviated in this article as GADI (Ebner & Maurer, 2008). We selected this course because it is mandatory for university students and at the same time was also open to the local and international general public. The university students came from different majors such as Information and Computer Engineering (Bachelor, 6th semester), Computer Science (Bachelor, 2nd semester), Software Development and Business Management (Bachelor, 6th semester) as well as Teacher Education in Computer Science (Bachelor, 2nd semester).
The course lasts 10 weeks. Every week includes video lectures, discussion forums, readings and a multiple choice quiz. The GADI contents are mainly formed of interviews with experts, with 21 video lectures of about 17 minutes average duration. The evaluation system followed the self-assessment principle, in which each quiz could be repeated up to five times. The system is programmed to record the highest grade; however, the student should score at least 75% in every single trial in order to pass the course. The teaching staff predefined a workload of 3 hours per week. Students of Graz University of Technology gain 2.5 ECTS (European Credit Transfer and Accumulation System) points towards their degree if they successfully complete the MOOC; however, they still have to complete an additional practical assignment.
The GADI MOOC certification state is depicted in Figure 1. The number of students from the university was quite unexpected compared with any similar course previously offered in the university halls. The reason is clear:
the 2015 course was offered on the iMooX platform for the first time. There were 459 matriculated undergraduates and 379 external students. Because this MOOC is obligatory for passing the university class, the completion ratio was quite high. The overall certification rate (students who gained a certificate) of this MOOC was 49%; specifically, the certification ratio was 79.96% for the undergraduates and 11.35% for the external students.
Candidates who successfully completed all the quizzes were asked to fill in an evaluation form at the end of the course. The questions covered satisfaction factors and demographic information. Figure 2 reports general information about the certified students (N = 410): the x-axis depicts the student type and the y-axis records their age. The demographic analysis reveals that the majority of certified university students (N = 367, female = 40, male = 327) were men; the average age was 23.1 years (σ = 2.94). On the other hand, we see a large difference in age and gender among the certified external students: the sample showed N = 43 students (female = 20, male = 23), with an average age of 46.95 years (σ = 10.88). The overview of the population further shows that 60% of the certified external students (N = 26) held a bachelor's, master's or Ph.D. degree.
Figure 1. The total number of students (N = 838) in the GADI MOOC, grouped by status: certified or non-certified
This result meets the conclusion of Guo and Reinecke's (2014) demographic research study on a larger sample from the edX platform, which found that most students who earned a certificate held a postgraduate degree.
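A quick arithmetic check reproduces the reported certification ratios from the counts given above (367 of 459 undergraduates, 43 of 379 external students, 410 of 838 overall):

```r
# Consistency check of the certification ratios reported in the text.
round(367 / 459 * 100, 2)   # 79.96 % for the undergraduates
round( 43 / 379 * 100, 2)   # 11.35 % for the external students
round(410 / 838 * 100, 2)   # 48.93 %, reported as roughly 49 % overall
```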
In Figure 3, we place the certified students on a map. N = 375 participants filled in valid city/country information; N = 337 students were from Austria, while N = 38 came from other German-speaking countries such as Germany and Switzerland. It was quite interesting for the course instructor to notice that many enrolled students came from cities other than Graz (the university's hometown).
Figure 2. Certified students (N = 410) of the GADI MOOC, grouped by gender and study level
Survey Analysis
Students who fully completed the MOOC had to answer an evaluation form in order to obtain the certificate. Before proceeding to the cluster analysis, we collected valuable questions that could give us an overview of the certified population. The survey distinguished between external students (N = 43) and valid university student inputs (N = 364). A summary of the survey results is presented in Table 1. The psychometrics of this survey were based on a Likert scale (1 = strongly agree to 5 = strongly disagree). One interesting observation is identified in question 4, in which participants in both cases disagreed that the discussion forums engaged them positively with the course topic. Furthermore, university students were more pessimistic towards the desire to learn than the external students, whereas both groups agreed that they were satisfied with the weekly quizzes.
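The per-group summary statistics reported in Table 1 below could be produced with a short R sketch of this kind; the data frame and item names (survey, group, q1-q5) are illustrative assumptions, not the actual export format.

```r
# Minimal sketch (assumptions): 'survey' holds one row per certified respondent,
# a 'group' column ("university"/"external") and Likert items q1-q5 coded
# 1 (strongly agree) to 5 (strongly disagree). Names are placeholders.
summarise_item <- function(x) sprintf("%.2f ± %.2f", mean(x, na.rm = TRUE),
                                      sd(x, na.rm = TRUE))

# Mean ± SD per item and per group, as reported in Table 1.
aggregate(cbind(q1, q2, q3, q4, q5) ~ group, data = survey, FUN = summarise_item)
```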
Table 1. Survey description of the certified students from the GADI MOOC

Question | External Students (N=43), Mean ± SD(1)
1. Desire to learn throughout the course* | 2.14 ± 0.96
2. Being disciplined in the course* | 2.23 ± 1.06
3. Weekly quizzes satisfaction* | 2.20 ± 0.94
4. Discussion forum actively engaged you with the course topic* | 3.41 ± 1.19
5. Browsed materials outside the MOOC platform** | 2.07 ± 0.96

(1) SD = Standard Deviation
* 1. Strongly agree … 5. Strongly disagree
** 1. Does not need … 5. More than 3 hours a week

Figure 3. Certified students who filled out city information (N = 375), distributed on the map of Austria: from the university's hometown (N = 219), other cities within Austria (N = 118), outside Austria (N = 38)
Clustering and Case Studies
Clustering means classifying a dataset, represented by several parameters, into a number of groups, categories or clusters. Estimating the number of clusters has never been an easy task and is considered a complicated procedure even for experts (Jain & Dubes, 1988). This section focuses on the experiment we carried out to cluster the GADI MOOC students. As conceded before, there was a considerable gap between university and external students in this course, and both have particular purposes. We believe there is a fair portion of university students who attended the course only to pass it and transfer the allocated ECTS to their degree profile. Besides, there are other curious behaviors in the dataset that we wish to portray. In order to answer the first research question, the prospect of using k-means clustering was promising.
The certification ratio of the university students, as shown in Figure 1, was much higher than Jordan's findings (Jordan, 2013). Whether the university students enrolled in the MOOC to learn or were only looking for the grade remains debatable; the survey in Table 1 shows an average score of 2.92 regarding the desire to learn. However, we were more interested in investigating the Learning Analytics data. Accordingly, the clustering was done independently for both groups, as we believe it is not compelling to combine both groups and then classify a single batch of students.
One of the main purposes of this study is to assign each participant in the MOOC to a relevant group that shares a common learning style. Each group should be sufficiently distinct to prevent overlaps, and the cluster elements should fit as tightly as possible to the defined group parameters. For our experiment, we used the k-means clustering algorithm with the Euclidean distance as the dissimilarity measure. This method was selected for its performance in reducing the variability inside one cluster while maximizing the variability between clusters (Peeples, 2011). In order to begin clustering, we labeled the variables to be referenced in the algorithm. The expected results are clusters of activities and learning objects that distinguish the MOOC participants.
iLAP mines various MOOC indicators such as login frequency, discussion forum activity, video watching, quiz attempts, etc. Due to the relations between such indicators, we excluded the highly correlated ones so that their impact would not affect the grouping in the cluster classification; for instance, the correlation between the login frequency indicator and the forum reading indicator was r = 0.807 (p < 0.01). As a result, our final selection of MOOC variables for the clustering algorithm was as follows (a short R sketch after the list illustrates this selection step):
1. Reading Frequency: This indicates the number of times a user clicked on posts in the forum.
2. Writing Frequency: This variable determines the number of written posts in the discussion forum.
3. Videos Watched: This variable contains the total number of videos a user clicked.
4. Quiz Attempts: This variable calculates the sum of assessment attempts across all weeks.
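A minimal sketch of this selection step, assuming the indicator data frame gadi loaded in the methodology section and illustrative column names:

```r
# Minimal sketch (assumptions): 'gadi' is the indicator data frame; column
# names are illustrative placeholders. Indicators that correlate strongly with
# a retained one are dropped, e.g. login frequency vs. forum reading
# (reported as r = 0.807, p < 0.01).
cor.test(gadi$logins, gadi$forum_reads)

# The four retained clustering variables:
features <- gadi[, c("forum_reads",     # 1. reading frequency
                     "forum_posts",     # 2. writing frequency
                     "videos_watched",  # 3. videos watched
                     "quiz_attempts")]  # 4. quiz attempts
# scale(features) would be a common optional step for Euclidean k-means;
# the paper does not state whether the indicators were standardized.
```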
Case 1: University students (Undergraduates)
In this case, k was assigned an a priori value between 3 and 6, as we did not want more than 6 groups. In order to pick an optimal k, we used the NbClust package to validate the value of k. This package relies on over 30 indices to propose the best clustering scheme (Charrad et al., 2014). To do so, we used a scree plot to visualize the sequential cluster levels on the x-axis and the within-groups sum of squares on the y-axis, as shown in Figure 4a. The optimal cluster solution can be identified where the reduction in the sum of squares slows dramatically (Peeples, 2011). The vertical dashed red line marks the critical point (k = 4) where the difference in the sum of squares becomes less apparent, creating an "elbow" or "bend" at cluster 4.
Afterward, we applied the k-means clustering algorithm with k = 4 for the university students. The outcome in Figure 5 depicts the generated clusters for the first case of the GADI MOOC participants. The visual interpretation of a clustering usually follows either hierarchical or partitioning methods; we used the partitioning method since it makes it easier to display each cluster in a two-dimensional plot rather than a dendrogram. The x-axis and the y-axis show the first two principal components. These components are calculated algorithmically based on the largest possible variance of the used variables, so that the plot reveals as much of the variability in the data as possible (Pison, Struyf & Rousseeuw, 1999).
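A hedged sketch of this step: running k-means with k = 4 and drawing a two-dimensional principal-component view with clusplot() from the cluster package, which is one common way to produce a plot like Figure 5 (the paper does not name the plotting routine actually used).

```r
# Minimal sketch (assumptions): k = 4 for the university-student case; seed and
# nstart are illustrative choices.
library(cluster)

set.seed(2015)
km <- kmeans(features, centers = 4, nstart = 25)

clusplot(features, km$cluster, color = TRUE, shade = TRUE, lines = 0,
         main = "Case 1: university students, k-means clusters (k = 4)")
# clusplot() reports the share of variability explained by the two components
# (67.76 % in the paper's case 1 plot).
```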
Figure 4. (a) Left: the optimal number of clusters for the GADI MOOC undergraduates. (b) Right: the optimal number of clusters for the GADI MOOC external students
Figure 5 shows a scatter plot of the four clusters. Two of the groups, the blue and the green, overlap. The first two principal components together account for 67.76% of the variability, which means that the plot reveals nearly 70% of the information in the data [1]. This value would be higher in circumstances without substantial overlap; nevertheless, it still serves our main goal of categorizing the learners in the undergraduate GADI MOOC.
In the following, we present the formed clusters, with a rough description of each group's way of engagement:
Cluster (1), "Dropout", the pink oval, contains 95 students (20.69%). This group showed low activity across the four MOOC variables; only 10 students were certified, and the attrition rate was high.
Cluster (2), "Perfect Students", the blue oval, contains 154 students (33.55%). Most of the participants in this group completed the course successfully (certification ratio of 96.10%). This group was highly engaged in reading the discussion forum and accessing the video lectures.
[1] Explanation: http://stats.stackexchange.com/questions/141280/understanding-cluster-plot-and-component-variability (last accessed 11 May 2016).

Figure 5. Case 1: university students' main clusters (N = 459, k = 4)
Cluster (3), "Gaming the system", the green oval, has 206 participants (44.88%) and a certification ratio of 94.36%. Students in this group shared the same learning style as Cluster (2); however, it was remarkable that video watching was quite low, while the level of engagement in quiz attempts was exceptionally higher than in the other clusters.
Cluster (4), "Social", is the smallest group, with only four participants (<1%), depicted as the red oval. We noticed that the students in this cluster were the only ones who wrote in the discussion forum. The share of certified students in this cluster is 50%.
Case 2: External Students
We replicated the same methodology for this case. The range of k was again set between 3 and 6, and the NbClust package was used to validate the k value. Figure 4b shows the suggested number of clusters: the vertical dashed red line marks the critical point (k = 3) where the difference in the sum of squares becomes less apparent, creating an "elbow" at cluster 3.
Applying the k-means clustering algorithm with k = 3 to the external students' sub-dataset yields the clustering results depicted in Figure 6.
Figure 6. Case 2: external students' main clusters (N = 379, k = 3)
The first two principal components account for 88.89% of the variability, which indicates a fair clustering validation.
In the following, we give a rough description of the clusters and list their characteristics:
Cluster (1), "Gaming the system", the blue oval, holds 42 students (11.08%), with a certification ratio of 76.19%. Social activity, specifically reading in the forums, was moderate compared with the other clusters, while the level of engagement in quiz attempts was exceptionally higher than in the other clusters.
Cluster (2), "Perfect Students", represented by the red oval, has only 8 students (2.11%). The certification rate in this group was 100%. Participants showed the highest number of written contributions and the highest reading frequency in the forum, as well as active engagement in watching the video lectures.
Cluster (3), "Dropout", the pink oval, includes all the other participants (86.80%). The group's completion rate was below 1%.
Cluster Analysis
From the previous clustering results, we studied the quantitative statistical values of each variable in every generated cluster. The next step was to compare these values within the same group. As a result, we derived a scale of "low", "moderate" and "high" that describes the engagement level of each variable in every class; the scale was determined based on the mean value of the activities. Table 2 describes the differences between the MOOC indicators of each cluster. The table shows that the scaling values of the variables vary between groups, and the learner clusters are clearly distinguishable when observing the statistical dimensions. The "Dropout" cluster has a low level of engagement across all variables in both cases, while eminent values for the variables of the "Perfect Students" cluster are recorded in case 1 and case 2. Note that watching the video lectures of the MOOC differs between the two cases in the "Gaming the system" cluster. We believe that the external students were more interested in learning than the university students; this is obvious when comparing the cluster sizes (44.88% versus 11.08%). Additionally, the university students were obliged to obtain a certificate from the MOOC in order to pass the final university course, which clearly made "Gaming the system" the largest cluster in case 1.
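A sketch of how the per-cluster summaries behind Table 2 could be derived, assuming the unscaled indicators in features and the cluster assignment km$cluster from the previous step:

```r
# Minimal sketch (assumptions): 'features' holds the raw indicators and
# 'km$cluster' the k-means assignment for one case.
cluster_means <- aggregate(features, by = list(cluster = km$cluster), FUN = mean)
cluster_sds   <- aggregate(features, by = list(cluster = km$cluster), FUN = sd)
table(km$cluster)   # cluster sizes
cluster_means
# A low/moderate/high label per indicator can then be assigned by comparing
# each cluster mean against the means of the other clusters.
```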
The "Social" cluster appears to be a distinctive group in the clustering. It is the smallest class among the university students and could not be detected in the external students group. A reasonable explanation for this lies in the pessimistic views recorded in the survey in Table 1. A design strategy could be employed to encourage university students to engage in the discussion forums. It seems that the high activity in the discussion forums guided the "Perfect Students" from the public towards passing the course successfully. However, this is not always the case; for instance, a recent study found that being a top social contributor does not guarantee passing a MOOC, although it might improve performance (Alario-Hoyos et al., 2016).
Table 2. Characteristics and comparison between engagement levels of clusters

Case: University students (Undergraduates)

Cluster | Reading Freq. Mean ± SD (Scale(1)) | Writing Freq. Mean ± SD (Scale) | Video Watches Mean ± SD (Scale) | Quiz Attempts Mean ± SD (Scale) | Cluster Size (Percentage) | Certification ratio
Dropout | 6.25 ± 6.38 (L) | 0.01 ± 0.10 (L) | 2.44 ± 3.42 (L) | 2.76 ± 3.86 (L) | 95 (20.69%) | 10.53%
Perfect Students | 42.23 ± 23.23 (H) | 0.03 ± 0.19 (L) | 20.76 ± 6.01 (H) | 20.56 ± 3.84 (H) | 154 (33.55%) | 96.10%
Gaming the system | 23.99 ± 11.19 (M) | 0.00 ± 0.07 (L) | 5.77 ± 4.01 (L) | 19.64 ± 3.84 (H) | 206 (44.88%) | 94.36%
Social | 62.00 ± 53.68 (H) | 4.00 ± 1.41 (H) | 3.25 ± 4.72 (L) | 8.50 ± 9.61 (M) | 4 (<1%) | 50%

Case: External students

Dropout | 6.03 ± 10.97 (L) | 0.23 ± 0.98 (L) | 1.24 ± 2.52 (L) | 0.68 ± 2.09 (L) | 329 (86.80%) | <1%
Perfect Students | 198.63 ± 63.05 (H) | 16.13 ± 9.42 (H) | 24.75 ± 6.34 (H) | 21.50 ± 3.82 (H) | 8 (2.11%) | 100%
Gaming the system | 51.76 ± 43.22 (M) | 0.71 ± 1.88 (L) | 18.10 ± 8.36 (M) | 19.33 ± 6.06 (H) | 42 (11.08%) | 76.19%

(1) Scale: L = Low, M = Moderate, H = High
Discussion
To answer the second research question, the previous cluster analysis makes it possible to portray students' engagement in the MOOC. Our next challenge was to check whether a clustering scheme from traditional face-to-face classes matches ours. The results directed us to a study by Elton (1996), which is based on Herzberg's (1968) theory for motivating employees at work. The concept of motivating a certain category of people matches the main goals of this research study. Elton's clustering proposal, called the Cryer's scheme, is shown in Figure 7. It is a two-dimensional diagram expressing student commitment in a class.
The x-axis represents the intrinsic factors, including, but not limited to, achievement and the subject itself, while the y-axis records examination preparation, which is named the extrinsic factor.
The bottom left of the Cryer's scheme describes students who are neither interested in the course subject nor achieve positive results. This class corresponds to the rough description of students who disengaged, the "Dropout" profile, which shares the common pattern of being inactive across all MOOC variables. The certification rate in this profile is low.
The class at the top left describes learners who have low commitment to the intrinsic factors. They are named "playing the system", a term for students who are committed only to doing specific tasks, such as completing an assignment. The analysis of the collected dataset reveals many students who watched the learning videos with various skips, and interesting observations were recorded when some students started a quiz without watching the video lecture material, as was also reported in Khalil and Ebner (2015a). Such students meet the defined criteria of the "Gaming the system" cluster, and the majority of this group obtained the certificate at the end of the course.
Figure 7. Cryer's scheme of Elton, based on levels of learner commitment in classes (Elton, 1996)

The class at the bottom right is defined as the socially motivated category. This group of students shows sympathy towards the course but fails because of poor exam preparation or lack of time. In our cluster analysis, this group was identified in the undergraduate case as "Social". Yet, it was difficult to detect within the external participants' group.
This is explained by the 100% completion rate of the "Perfect Students" group. We strongly believe that a student from the "Perfect Students" class would be reassigned to this category if he or she did not complete the course while retaining high activity in the forums, and vice versa. This category is characterized by holding the students who are active in the MOOC discussion forums. Participants of this group experience a conflict between their high intrinsic motivation to learn and the extrinsic motivation: they may find themselves interested in the topic, but their commitment to finishing the course faces obstacles, even though they may be involved in the forums or watch videos from time to time.
The last class holds the students whose commitment is high; the "Perfect Students" reside in this class. The data show that these students are satisfied both intrinsically and extrinsically: they watch videos, discuss, take quizzes multiple times for a better performance, and their certification rate is nearly spotless.
Results Interpretation
The ideas presented in this article can be extended to other, similar courses. The cluster analysis relies mainly on quantitative data and incorporates the qualitative results from the survey and the observations. While the Cryer's scheme is just a conceptual framework that could be applied to distance learning platforms such as MOOC environments, we find that it fits our MOOC dataset fairly well: stereotypes from traditional massive face-to-face classes may also occur in online courses. In Figure 8, we show the examined MOOC dataset applied to the Cryer's scheme.
It must be stated that this scheme does not only include the specific profiles shown; learners can also be distributed unequally among the four profiles. For instance, while the "Social" cluster does not appear in the external students case, such students could be represented somewhere near the socially motivated profile under certain conditions. Figure 8 shows that a large proportion of the MOOC students lie on the left side of the Cryer's scheme, which represents either the "Dropout" or the "Gaming the system" cluster.
Several interventions can be adopted accordingly: (a) It is advisable to consider moving students from the "Dropout" class to the "Social" or "Playing the system" class; the transition to the socially motivated profile is feasible by concentrating on the improvement factors that increase intrinsic motivation, while the transition to "Playing the system" is attainable by focusing on the extrinsic factors. (b) Extrinsic motivation alone is not enough to reach the high-commitment cluster; therefore, praising students or recognizing their efforts, for instance by providing badges (Wüster & Ebner, 2016), might not transfer students from "Gaming the system" to "Perfect Students". (c) Moving students from "Social" to "Perfect Students" is attainable; however, neither an optimal MOOC nor a great didactical system will achieve this without an initiative step by the learners themselves. (d) One of the steady lessons of this study is that paying a lot of attention to extrinsic factors
Figure 8. GADI MOOC data applied on the Cryer's scheme
such as shortening the MOOC, grades, certificates or badges is not in itself enough to make students progress towards the "excellent" cluster, i.e., to make them "Perfect Students". Improving the intrinsic factors instead, for example by investing in better instructional design and strong didactical approaches, should move them towards being genuinely motivated in the learning process.
Conclusion
This research study examined learners' engagement in a mandatory MOOC offered on the iMooX platform. We studied the patterns of the involved students and separated the main research into two use cases: university students (undergraduates) and external students (the public). Within our research study, we performed a cluster analysis that characterized the MOOC participants, whether they took the course on a voluntary basis or not. Furthermore, we found that the clusters can be mapped onto the Cryer's scheme of Elton (1996). We also realized that the population in the GADI MOOC behaves similarly to a mass education scene in a large lecture hall. The experiment of this study leads to the assumption that tomorrow's instructors have to think about increasing the intrinsic motivation of those students who are only "playing the system". Therefore, we strongly recommend researching how MOOCs can be made more engaging and creating new didactical concepts to increase motivational factors, mainly the intrinsic ones.
References
Alario-Hoyos, C., Muñoz-Merino, P. J., Pérez-Sanagustín, M., Delgado Kloos, C., & Parada, G. (2016). Who are the
top contributors in a MOOC? Relating participants' performance and contributions. Journal of Computer Assisted
Learning. doi: 10.1111/jcal.12127
Anderson, A., Huttenlocher, D., Kleinberg, J., & Leskovec, J. (2014). Engaging with massive online courses.
In Proceedings of the 23rd international conference on World wide web (pp. 687-698). ACM.
Carson, S., & Schmidt, J. (2012). The massive open online professor. Academic Matters: Journal of Higher Education. Available from http://www.academicmatters.ca/2012/05/the-massive-open-online-professor/ [last accessed 02 May 2016].
Charrad, M., Ghazzali, N., Boiteau, V., & Niknafs, A. (2013). NbClust: An examination of indices for determining the number of clusters (NbClust package). Available from https://cran.r-project.org/web/packages/NbClust/NbClust.pdf [last accessed 12 May 2016].
Ebner, M., & Maurer, H. (2008). Can microblogs and weblogs change traditional scientific writing? In Proceedings of E-Learn 2008, Las Vegas, USA (pp. 768-776).
Ebner, M., Kopp, M., Wittke, A., & Schön, S. (2014). Das O in MOOCs – über die Bedeutung freier Bildungsressourcen in frei zugänglichen Online-Kursen. HMD Praxis der Wirtschaftsinformatik, 52(1), 68-80. Springer, December 2014.
Elton, L. (1996). Strategies to enhance student motivation: a conceptual analysis. Studies in Higher Education, 21(1),
57-68.
Ferguson, R., & Clow, D. (2015). Examining engagement: analysing learner subpopulations in massive open online
courses (MOOCs). In Proceedings of the Fifth International Conference on Learning Analytics And Knowledge (pp.
51-58). ACM.
Ferguson, R., Clow, D., Beale, R., Cooper, A. J., Morris, N., Bayne, S., & Woodgate, A. (2015). Moving through MOOCs: Pedagogy, learning design and patterns of engagement. In Klobučar, T., & Conole, G. (Eds.), Design for Teaching and Learning in a Networked World, Lecture Notes in Computer Science (pp. 70-84). Springer International Publishing Switzerland.
Guo, P. J., & Reinecke, K. (2014). Demographic differences in how students navigate through MOOCs.
In Proceedings of the first ACM conference on Learning@ scale conference (pp. 21-30). ACM.
Herzberg, F. (1968). One more time: How do you motivate employees? Harvard Business Review, 46(1), 53-62.
Hollands, F. M., & Tirthali, D. (2014). MOOCs: Expectations and reality. Center for Benefit-Cost Studies of
Education, Teachers College, Columbia University.
Jain, A.K., & Dubes, R.C. (1988). Algorithms for Clustering Data. Englewood Cliffs NJ: Prentice Hall.
Jordan, K. (2013). MOOC completion rates: The data. Available from http://www.katyjordan.com/MOOCproject.html [last accessed 03 May 2016].
Khalil, H., & Ebner, M. (2014). MOOCs completion rates and possible methods to improve retention - a literature review. In World Conference on Educational Multimedia, Hypermedia and Telecommunications (Vol. 2014, No. 1, pp. 1305-1313).
Khalil, M., & Ebner, M. (2015a). A STEM MOOC for school children - What does Learning Analytics tell us? In Proceedings of the 2015 International Conference on Interactive Collaborative Learning, Florence, Italy (pp. 1217-1221). IEEE.
Khalil, M., & Ebner, M. (2015b). Learning Analytics: Principles and Constraints. In Proceedings of World
Conference on Educational Multimedia, Hypermedia and Telecommunications, Canada (pp. 1789-1799).
Khalil, M. & Ebner, M. (2016a). “What is Learning Analytics about? A Survey of Different Methods Used in 2013-
2015”. In Proceedings of Smart Learning Conference, Dubai, UAE, 7-9 March, 2016 (pp. 294-304). Dubai: HBMSU
Publishing House.
Khalil, M., & Ebner, M. (2016b). What Massive Open Online Course (MOOC) Stakeholders Can Learn from
Learning Analytics?. In M. J. Spector, B. B. Lockee, & M. D. Childress (Eds.), Learning, Design, and Technology.
Springer International Publishing (in press).
Khalil, M., Kastl, C., & Ebner, M. (2016). Portraying MOOCs learners: A clustering experience using Learning Analytics. In Khalil, M., Ebner, M., Kopp, M., Lorenz, A., & Kalz, M. (Eds.), Proceedings of the European Stakeholder Summit on experiences and best practices in and around MOOCs (EMOOCS 2016), Graz, Austria, pp. 265-278.
Kizilcec, R. F., Piech, C., & Schneider, E. (2013). Deconstructing disengagement: analyzing learner subpopulations
in massive open online courses. In Proceedings of the third international conference on learning analytics and
knowledge (pp. 170-179). ACM.
Kovanović, V., Joksimović, S., Gašević, D., Owers, J., Scott, A. M., & Woodgate, A. (2016). Profiling MOOC
Course Returners: How Does Student Behavior Change Between Two Course Enrollments?. In Proceedings of the
Third (2016) ACM Conference on Learning@ Scale (pp. 269-272). ACM.
Lackner, E., Ebner, M., & Khalil, M. (2015). MOOCs as granular systems: design patterns to foster participant
activity. eLearning Papers, 42, 28-37.
McAuley, A., Stewart, B., Siemens, G., & Cormier, D. (2010). The MOOC model for digital practice: Massive open online courses - digital ways of knowing and learning. Available at: http://www.elearnspace.org/Articles/MOOC_Final.pdf
Meyer, R. (2012). What it’s like to teach a MOOC (and what the heck’s a MOOC?), Retrieved October 2015,
available at: http://tinyurl.com/cdfvvqy
Neuendorf, K. A. (2002). The content analysis guidebook. Vol. 300. Thousand Oaks, CA: Sage Publications.
Peeples, M. A. (2011). R Script for K-Means Cluster Analysis. Available from:
http://www.mattpeeples.net/kmeans.html. (last accessed 13.05.2016).
Pison, G., Struyf, A., & Rousseeuw, P. J. (1999). Displaying a clustering with CLUSPLOT. Computational statistics
& data analysis, 30(4), 381-392.
Rodriguez, C. O. (2012). MOOCs and the AI-Stanford Like Courses: Two Successful and Distinct Course Formats
for Massive Open Online Courses. European Journal of Open, Distance and E-Learning.
Yuan, L., & Powell, S. (2013). MOOCs and open education: Implications for higher education. JISC CETIS.
Wüster, M., & Ebner, M. (2016). How to integrate and automatically issue Open Badges in MOOC platforms. In Khalil, M., Ebner, M., Kopp, M., Lorenz, A., & Kalz, M. (Eds.), Proceedings of the European Stakeholder Summit on experiences and best practices in and around MOOCs (EMOOCS 2016), Graz, Austria, pp. 279-286.
... Collecting students' behavioral data through school management systems, online education platforms, questionnaire surveys and interviews, intelligent devices and applications, etc., we can understand students' attitudes, points of interest, difficulties, etc., which can help educators to better formulate teaching strategies and improve educational effects [3][4][5]. Among them, intelligent analysis can mine the interrelationships between different attributes of students' behavioral data through association rules, such as the correlation between the frequency of students' participation in extracurricular activities and the level of academic performance [6][7][8]; it can also be used to classify students into different types through clustering analysis, such as diligent, lazy, potential, etc., so that we can provide targeted educational measures for different types of students and select the most suitable employees for enterprises [9][10][11]; predictive analytics can also be used to predict students' academic performance, probability of further education, chances of employment, etc [12][13][14][15]. Second, pattern recognition and intelligent analysis of student behavior data are conducive to the realization of personalized education. ...
Article
Full-text available
This study attempts to mine students’ learning behavioral patterns and provide teachers with suggested interventions. This study takes the online behavioral data of large-scale learners as the research object, mines and extracts their learning behavioral features, then divides the extracted behavioral features into 11 specific behavioral indicators, and uses the K-Means algorithm to perform cluster analysis on the learning behavior of learners, and then compares the different learning groups in terms of learning motivation, time investment, learning effectiveness, and learning interactions in four directions, respectively. The lagged sequence analysis was used to explore the differences in the learning behavior sequences of different learning groups. The differences in the total time spent on the task and the number of replies are significant among the four categories of learners: general, negative, interactive, and active. The “average learners” had the best learning outcomes with a score center greater than 88, and the “negative learners” performed poorly in all four learning behaviors. Teachers can improve students’ learning motivation, interactive behavior, and time investment to enhance their performance on different learning behaviors while ensuring that the frequency of students’ behaviors is evenly distributed. The intelligent analysis method presented in this paper provides a reliable and reasonable basis for teachers’ teaching interventions.
... It is a mixed mode learning/ adaptive learning because of its ability to integrate face-to-face instruction with computer-mediated instruction and mobile assisted learning (Famorca and Elivera, 2020). If adopted fittingly, blended learning can transform a higher education institution into a more accommodating, open, and responsive institution (Oakley, 2016); can swiftly address challenges and respond to opportunities in improving educational outcomes, extending accessibility, and reducing costs (Khalil and Ebner, 2017); can strengthen the communicative language teaching approach in a language classroom (Rajeswaran, 2019); found to be enhancing English learning process, developing language skills, and improving the English learning environment (Albiladi & Alshareef, 2019); maximising authentic input in order to support learners' output and skills development to achieve an optimal level (Marsh, 2012;Zhang and Zhu, 2018); in some aspects, it is multiple delivery media combined together and designed to complement each other to promote learning and application of learned behaviour (Singh, 2003); offers flexible, selfpaced-personalised learning for the learners to enjoy. ...
Article
Full-text available
The English language classrooms have seen changes in teaching learning theories, approaches and methods. There was a shift from Communicative Language Teaching (CLT) of the eighties to Task-Based Language Teaching (TBLT) in the nineties. It was considered a shift from one pedagogy to another, but technology mediation in education in general is considered an external force that shapes our pedagogic principles and approaches. The Information and Communication Technology (ICT) which started serving educational sector by 1995, has evolved so tremendously that education without it is impossible. In the globalised context of English teaching and learning, the language classes operate on many theories and approaches with no specific focus on any one of them, because various theories and approaches seamlessly merge in a language classroom with technology as a catalyst, which, while strengthening the teaching learning process ensures learning outcome. This is a boon to higher education portals where digital natives are stakeholders. The ubiquitous presence of technology mediation in language learning makes it the orthodoxy of the era. This article studies the pedagogical perspectives of technology mediation in the language classrooms.
... Analyzing learning behavior and implementing interventions are effective ways to improve learning performance [78]. Educational technologies such as learning analytics and adaptive learning systems can provide learners with personalized learning experiences and support based on learning behavior data [79]. ...
Article
Full-text available
Massive Open Online Courses (MOOCs) have gained widespread adoption across diverse educational domains and play a crucial role in advancing educational equality. Nevertheless, skepticism surrounds the effectiveness of MOOCs due to their notably low completion rates. To explore the outcomes of MOOC adoption in higher education and improve its application efficiency, this study compares MOOCs with traditional courses in terms of mean score and pass rate. The study examines the factors influencing MOOC performance within the context of higher education using Partial Least Squares Structural Equation Modeling (PLS-SEM). It analyzed MOOC learning data from one college over a period of six years; a total of 4,282 Chinese college students participated. The factor of learning environment was proposed for the first time, and it was shown to have a significant impact on learning behavior and MOOC performance in higher education. The results reveal that 1) MOOCs have a lower pass rate than traditional courses (both compulsory and elective); 2) MOOCs have a lower mean score than elective courses only; 3) no significant difference was found between MOOCs and compulsory courses in terms of mean score; and 4) learning behavior, learning motivation, perceived value, learning environment, previous experience and self-regulation have significant, positive influences on MOOC performance in higher education. The study provides valuable insights: college administrators should pay attention to students’ learning environment, learning motivation and other factors while actively introducing MOOCs.
... In the domain of online education, understanding learner behaviour and learning patterns is crucial for enhancing the effectiveness of digital learning platforms. Thus far, measurement of the success of learning platforms has been limited primarily to completion rates and perceived benefits (Poellhuber et al., 2019), and subsequent research in learner analytics has focused on categorising learners using aspects of their platform behaviour, such as user clicks (Ho et al., 2015; Khalil & Ebner, 2017), in the context of tangible skills such as programming. The current study seeks to extend this body of research by using two parameters of online learners (number of attempts and course engagement time) to understand learner behaviour in a massive open online course (available on framerspace.com) ...
Article
Understanding learning patterns in online learning platforms is an important aspect of digital learning. Our study presents preliminary findings on learners' interactions (N=29,826) with course content on an online interactive learning platform for the course "social emotional learning for teachers", using course engagement time and quiz attempts as two parameters of learner engagement. Our findings suggest that question difficulty and learner characteristics (motivation to engage) both influence attempt patterns. The results of such an analysis can provide data-driven strategies to improve content engagement, teaching methods, and platform features, enhancing online learning experiences.
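As a rough illustration of how the two parameters above could be summarized, the Python sketch below derives a simple question-difficulty proxy and per-learner engagement totals from a hypothetical attempts log; the file name and column names are assumptions, not the study's actual data schema.

# Minimal sketch (assumed columns: learner_id, question_id, attempt_no, correct,
# engagement_minutes); not the study's code.
import pandas as pd

attempts = pd.read_csv("quiz_attempts.csv")

# Difficulty proxy: share of first attempts answered correctly, per question
difficulty = (
    attempts[attempts["attempt_no"] == 1]
    .groupby("question_id")["correct"].mean()
    .rename("first_attempt_success")
)

# Per-learner engagement: total attempts and total engagement time
learners = attempts.groupby("learner_id").agg(
    total_attempts=("attempt_no", "count"),
    engagement_minutes=("engagement_minutes", "sum"),
)

# Do harder questions (lower first-attempt success) attract more attempts overall?
n_attempts = attempts.groupby("question_id")["attempt_no"].count().rename("n_attempts")
print(pd.concat([difficulty, n_attempts], axis=1).corr())
print(learners.describe())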
... Furthermore, these videos have shown promise in predicting student performance and improving academic activity [22,34]. By analyzing students' interactions with video content, such as viewing patterns and engagement metrics, educators can gain valuable insights into student learning behaviors [26]. These insights help tailor instructional strategies to meet individual learning needs, enhancing the personalization of education [9]. ...
Article
Full-text available
Due to its inherent flexibility and adaptability, e-learning has become a versatile and effective educational option, with video-based learning (VBL) enabling asynchronous engagement. Despite the wealth of video resources, evaluating their effectiveness remains challenging. This study employs a comprehensive approach, analyzing student behavior and video metadata to assess the efficacy of educational videos. Metrics such as average playback speed, video and course completion rates, and video metadata were examined using the XuetangX dataset MOOCCubeX. Results indicate that viewing patterns are more influenced by students’ choices than by video content. Well-conceptualized videos increase viewing time, with 17% of students engaging with videos longer than 30 minutes if they cover focused topics. While shorter videos generally have higher completion rates, many remain untouched. Playback-speed analysis shows a preference for standard speed, with positive correlations between speed and both completion percentage and viewing time. Courses under 5 hours are likely to be watched for less than an hour, reflecting brief viewing durations. Metadata analysis reveals that generic titles may deter engagement, underscoring the importance of precise tagging and content relevance for enhancing student interaction.
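A hedged sketch of the kind of metric computation the abstract describes is given below; it assumes a hypothetical watch-event log (columns student_id, video_id, watched_seconds, video_length_seconds, playback_speed) rather than the actual MOOCCubeX format.

# Minimal sketch of per-video engagement metrics from an assumed event log;
# not the study's pipeline and not the real MOOCCubeX schema.
import pandas as pd

events = pd.read_csv("video_events.csv")

# Cap completion at 100% in case of re-watches within a single viewing record
per_view = events.assign(
    completion=lambda d: (d["watched_seconds"] / d["video_length_seconds"]).clip(upper=1.0)
)

video_metrics = per_view.groupby("video_id").agg(
    avg_playback_speed=("playback_speed", "mean"),
    avg_completion=("completion", "mean"),
    avg_viewing_minutes=("watched_seconds", lambda s: s.mean() / 60),
    unique_viewers=("student_id", "nunique"),
)

# Correlations between playback speed, completion and viewing time, as examined above
print(video_metrics[["avg_playback_speed", "avg_completion", "avg_viewing_minutes"]].corr())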
Article
Purpose The review aims to synthesize previous studies to present an overview of the techniques commonly used in learning analytics, as well as identify possible knowledge gaps in the extant studies and provide insights on future directions for learning analytics techniques moving forward. Design/methodology/approach This paper provides a systematic review of learning analytics techniques. A total of 63 articles were included in the final review and 3 main themes emerged based on our research questions. These themes include (A) individual learning, (B) collaborative learning and (C) game-based learning. The first theme is related to the application of learning analytics techniques in the context of individual student learning, while the second and third themes focus on the application of learning analytics techniques in the context of collaborative learning and game-based learning research, respectively. The paper summarizes key findings, identifies possible gaps for future research and provides recommendations for future research. Findings The commonly used techniques include classification, content analysis, social network analysis and taxonomic mapping. Multimodal learning analytics, which uses data from multiple sources to understand learners’ behavior and experience, is also growing. The review of learning analytics research highlights several knowledge gaps, including methodological issues, adaptability of techniques, ethical, risk and privacy concerns and precise terminologies for methodological decisions. The choice of learning analytics techniques should be guided by research questions and data nature. Originality/value This work meets the originality requirement.
Article
Full-text available
The intense rise of machine learning in recent years, bolstered by post-COVID-19 digitalization, left some of us pondering the transparency practices of projects funded by the European Union. This study focuses on European Union research clusters and trends in the ecosystem of higher education institutions (HEIs). A manually curated dataset of bibliometric data from 2020 to 2024 was analyzed in steps, from traditional bibliometric indicators to natural language processing and collaboration networks. Centrality metrics, including degree, betweenness, closeness, and eigenvector centrality, together with a three-way intersection of community detection algorithms, were computed to quantify the influence and connectivity of institutions across communities in the collaborative research networks. In the EU context, results indicate that institutions such as Universidad Politecnica de Madrid, the University of Cordoba, and Maastricht University frequently occupy central positions, echoing their role as local or regional hubs. At the global level, prominent North American and UK-based universities (e.g., University of Pennsylvania, Columbia University, Imperial College London) also remain influential, testifying to their enduring role in transcontinental research. Clustering outputs further confirmed that biomedical and engineering-oriented lines of inquiry often dominate these networks. While multiple mid-ranked institutions appear at the periphery, the data strongly imply that large-scale initiatives gravitate toward well-established players. Although the recognized centers provide specialized expertise and resources, smaller universities typically rely on a limited number of niche alliances.
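The centrality metrics listed in the abstract are standard graph measures; the NetworkX sketch below illustrates them on a tiny, made-up co-participation edge list, with a single community-detection algorithm (greedy modularity) standing in for the paper's three-way intersection of algorithms.

# Minimal sketch (toy edge list, single community algorithm); not the paper's analysis.
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

# Hypothetical co-participation edges between institutions
edges = [
    ("Universidad Politecnica de Madrid", "University of Cordoba"),
    ("Universidad Politecnica de Madrid", "Maastricht University"),
    ("Maastricht University", "Imperial College London"),
    ("Imperial College London", "Columbia University"),
]
G = nx.Graph(edges)

# The four centrality measures named in the abstract
centrality = {
    "degree": nx.degree_centrality(G),
    "betweenness": nx.betweenness_centrality(G),
    "closeness": nx.closeness_centrality(G),
    "eigenvector": nx.eigenvector_centrality(G, max_iter=1000),
}
communities = greedy_modularity_communities(G)

for name, scores in centrality.items():
    print(f"most central by {name}:", max(scores, key=scores.get))
print("communities:", [sorted(c) for c in communities])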
Article
The unrestrained use of indicators to measure the performance of online courses potentially hinders the understanding of the very phenomena they claim to measure. Focusing on MOOCs, we illustrate three types of work that make it possible to take a fresh look at these metrics through analyses of interaction traces. The first consists of questioning the terminology used: one must identify the phenomenon actually being referred to when speaking of the number of students, of registrants, or of certificates issued. For example, registrations often come in “bursts” and correspond to overlapping courses, and the comparison with enrolment in a university course shows its limits. Research can also aim to propose new categories of learners, which then constitute alternative indicators. The use of the certification rate indeed downplays the weight of non-certified learners, who may develop a considerable amount of activity that is rendered invisible by the choice of certain indicators. Finally, the researcher can take up traditional indicators again, but segment a MOOC’s audience so as to apply the metrics in a relevant way. The criteria used to carry out this segmentation work must then be justified.
Article
Full-text available
Many MOOC initiatives continue to report high attrition rates among distance education students. This study investigates why students dropped out of or failed their MOOCs. It also provides strategies that can be implemented to increase the retention rate as well as overall student satisfaction. Through a study of the literature, data analysis and personal observations, the most significant factors causing the high attrition rates of MOOCs are identified: lack of time, lack of learner motivation, feelings of isolation, lack of interactivity in MOOCs, insufficient background and skills, and, finally, hidden costs. As a result, several strategies are identified to increase the online retention rate and allow more online students to graduate.
Article
Full-text available
The area of Learning Analytics has developed enormously since the first International Conference on Learning Analytics and Knowledge (LAK) in 2011. It is a field that combines different disciplines, such as computer science, statistics, psychology and pedagogy, to achieve its intended objectives. Its main goals lie in creating appropriate interventions in learning and its environment, and ultimately in optimization for the stakeholders of the learning domain. Because the field is maturing and is now adopted in diverse educational settings, we believe there is a pressing need to list its research methods and specify its objectives and dilemmas. This paper surveys publications from the Learning Analytics and Knowledge conference from 2013 to 2015 and lists the significant research areas in this sphere. We examine the methods employed and classify them into seven categories, with a brief description of each. Furthermore, we show the most cited method categories using Google Scholar. Finally, the authors raise the challenges and constraints that affect the field’s ethical approach through this meta-analysis. It is believed that this paper will help researchers identify the common methods used in Learning Analytics and assist in forecasting future research work that takes into account the privacy and ethical issues of this strongly emerging field.
Chapter
Full-text available
Massive open online courses (MOOCs) are the road that led to a revolution and a new era of learning environments. Educational institutions have come under pressure to adopt new models that ensure openness in the distribution of their education. Nonetheless, there is still debate about the pedagogical approach and the proper delivery of information to students. On the other hand, the use of Learning Analytics makes powerful tools available that mainly aim to enhance learning and improve learners’ performance. This chapter presents the development phases of a Learning Analytics prototype and the experiment of integrating it into a MOOC platform called iMooX. It explores how MOOC stakeholders may benefit from Learning Analytics, reports an exploratory analysis of some of the offered courses, and demonstrates use cases as a typical evaluation of this prototype in order to discover hidden patterns, inform future decisions, and optimize learning with applicable and convenient interventions.
Poster
Full-text available
Massive Open Online Courses represent fertile ground for examining student behavior. Due to their openness, MOOCs attract a diverse body of students who are, for the most part, unknown to the course instructors. However, a certain number of students enroll in the same course multiple times, and the records of their previous learning activities might provide useful information to course organizers before the start of the course. In this study, we examined how student behavior changes between subsequent course offerings. We identified profiles of returning students as well as interesting changes in their behavior between two enrollments in the same course. The results and their implications are further discussed.
Conference Paper
Full-text available
The area of Learning Analytics has developed enormously since the first International Conference on Learning Analytics and Knowledge (LAK) in 2011. It is a field that combines different disciplines, such as computer science, statistics, psychology and pedagogy, to achieve its intended objectives. Its main goals lie in creating appropriate interventions in learning and its environment, and ultimately in optimization for the stakeholders of the learning domain (Khalil & Ebner, 2015b). Because the field is maturing and is now adopted in diverse educational settings, we believe there is a pressing need to list its research methods and specify its objectives and dilemmas. This paper surveys publications from the Learning Analytics and Knowledge conference from 2013 to 2015 and lists the significant research areas in this sphere. We examine the methods employed and classify them into seven categories, with a brief description of each. Furthermore, we show the most cited method categories using Google Scholar. Finally, the authors raise the challenges and constraints that affect the field’s ethical approach through this meta-analysis. It is believed that this paper will help researchers identify the common methods used in Learning Analytics and assist in forecasting future research work that takes into account the privacy and ethical issues of this strongly emerging field.
Chapter
Full-text available
Though MOOC platforms offer quite good online learning opportunities, the skills and knowledge gained through them are not recognized appropriately. They also fail to maintain learners’ initial motivation to complete the course. Mozilla’s Open Badges, which are digital artifacts with embedded metadata, could help to solve these problems. An Open Badge contains, besides its visual component, data that allows its receipt to be verified in a trustworthy manner. In addition, badges of different granularity can not only certify successful course completion but also help to steer learners’ learning processes through formative feedback during the course. Therefore, a web application was developed that enabled iMooX to issue Open Badges for formative feedback as well as summative evaluation. A course about Open Educational Resources served as a prototype evaluation, which confirmed the application’s aptitude for use in other courses as well.
Conference Paper
Full-text available
Massive Open Online Courses are remote courses that excel in their students’ heterogeneity and quantity. Because of this massiveness, the large datasets generated by MOOC platforms require advanced tools to reveal hidden patterns for enhancing learning and educational environments. This paper offers a study on using one of these tools, clustering, to portray learners’ engagement in MOOCs. The research analyses a mandatory university MOOC, also open to the public, in order to classify students into appropriate profiles based on their engagement. We compared the clustering results across MOOC variables and, finally, evaluated our results against a student motivation scheme from the eighties to examine the contrast between classical classes and MOOC classes. Our research shows that MOOC participants strongly follow Cryer’s scheme of Elton (1996).
Conference Paper
Massive open online courses (MOOCs) are now being used across the world to provide millions of learners with access to education. Many learners complete these courses successfully, or to their own satisfaction, but the high numbers who do not finish remain a subject of concern for platform providers and educators. In 2013, a team from Stanford University analysed engagement patterns on three MOOCs run on the Coursera platform. They found four distinct patterns of engagement that emerged from MOOCs based on videos and assessments. However, not all platforms take this approach to learning design. Courses on the FutureLearn platform are underpinned by a social-constructivist pedagogy, which includes discussion as an important element. In this paper, we analyse engagement patterns on four FutureLearn MOOCs and find that only two of the previously identified clusters apply in this case. Instead, we see seven distinct patterns of engagement: Samplers, Strong Starters, Returners, Mid-way Dropouts, Nearly There, Late Completers and Keen Completers. This suggests that patterns of engagement in these massive learning environments are influenced by decisions about pedagogy. We also make some observations about approaches to clustering in this context.