Academic Editors: Rui Chibante, Kleopatra Nikolopoulou and Filipe T. Moreira
Received: 14 February 2025
Revised: 31 March 2025
Accepted: 16 April 2025
Published: 19 April 2025
Citation: Córdova-Esparza, D.-M.; Terven, J.; Romero-González, J.-A.; Córdova-Esparza, K.-E.; López-Martínez, R.-E.; García-Ramírez, T.; Chaparro-Sánchez, R. Predicting and Preventing School Dropout with Business Intelligence: Insights from a Systematic Review. Information 2025, 16, 326. https://doi.org/10.3390/info16040326
Copyright: © 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Systematic Review
Predicting and Preventing School Dropout with Business Intelligence: Insights from a Systematic Review
Diana-Margarita Córdova-Esparza 1, Juan Terven 2, Julio-Alejandro Romero-González 1, Karen-Edith Córdova-Esparza 3,*, Rocio-Edith López-Martínez 1, Teresa García-Ramírez 1 and Ricardo Chaparro-Sánchez 1
1 Facultad de Informática, Universidad Autónoma de Querétaro, Av. de las Ciencias S/N, Queretaro 76230, Mexico; diana.cordova@uaq.mx (D.-M.C.-E.); julio.romero@uaq.mx (J.-A.R.-G.); rocio.edith.lopez@uaq.mx (R.-E.L.-M.); teregar@uaq.mx (T.G.-R.); rchapa@uaq.mx (R.C.-S.)
2 CICATA-Unidad Querétaro, Instituto Politécnico Nacional, Cerro Blanco 141, Col. Colinas del Cimatario, Queretaro 76090, Mexico; jrtervens@ipn.mx
3 Facultad de Filosofía, Universidad Autónoma de Querétaro, Av. 16 de Septiembre No. 57, Centro Histórico, Queretaro 76000, Mexico
* Correspondence: karen.cordova@uaq.mx
Abstract: School dropout in higher education remains a significant global challenge with profound socioeconomic consequences. To address this complex issue, educational institutions increasingly rely on business intelligence (BI) and related predictive analytics, such as machine learning and data mining techniques. This systematic review critically examines the application of BI and predictive analytics for analyzing and preventing student dropout, synthesizing evidence from 230 studies published globally between 1996 and 2025. We collected literature from the Google Scholar and Scopus databases using a comprehensive search strategy, incorporating keywords such as “business intelligence”, “machine learning”, and “big data”. The results highlight a wide range of predictive tools and methodologies, notably data visualization platforms (e.g., Power BI) and algorithms like decision trees, Random Forest, and logistic regression, demonstrating effectiveness in identifying dropout patterns and at-risk students. Common predictive variables included personal, socioeconomic, academic, institutional, and engagement-related factors, reflecting dropout’s multifaceted nature. Critical challenges identified include data privacy regulations (e.g., GDPR and FERPA), limited data integration capabilities, interpretability of advanced models, ethical considerations, and educators’ capacity to leverage BI effectively. Despite these challenges, BI applications significantly enhance institutions’ ability to predict dropout accurately and implement timely, targeted interventions. This review emphasizes the need for ongoing research on integrating ethical AI-driven analytics and scaling BI solutions across diverse educational contexts to reduce dropout rates effectively and sustainably.
Keywords: business intelligence; school dropout; machine learning
1. Introduction
In today’s information-driven society, educational institutions face the challenge of efficiently transforming large volumes of data into actionable insights to address critical problems, such as school dropout. School dropout, particularly in higher education, is a complex phenomenon with severe implications at the individual, social, and economic levels. Factors contributing to dropout range from socioeconomic circumstances and academic
performance to institutional policies and emotional support [1,2]. Despite considerable efforts to mitigate dropout, many institutions still struggle due to inadequate predictive tools, delayed interventions, and fragmented data management practices.
In this context, business intelligence (BI) has emerged as a powerful approach to support data-driven decision-making, offering institutions strategic tools for analyzing data to identify patterns, trends, and risks proactively [3]. Originally defined as “concepts and methods to improve business decisions through fact-based support systems” [3], BI now encompasses a wide range of technologies and methodologies, including data warehousing, online analytical processing (OLAP), predictive analytics, and data visualization. These capabilities enable organizations not only to understand historical data but also to anticipate future trends and issues, thus facilitating informed and timely decisions [4,5].
Educational institutions adopting BI tools can significantly improve their ability to
predict and prevent dropout by leveraging predictive models and comprehensive data
analyses. Specifically, BI allows for the integration of diverse data sources—from academic
records and socio-demographic indicators to real-time student engagement data—into
cohesive analytical frameworks that help identify at-risk students early and precisely. This
integration supports the development of targeted interventions and strategic resource
allocation to effectively reduce dropout rates.
Nevertheless, despite growing interest and promising results reported in various
studies, the application of BI tools for dropout prediction in education remains fragmented.
Many institutions struggle to identify best practices, appropriate methodologies, and robust
predictive models suitable for their contexts. Furthermore, the existing literature is often
dispersed, lacking comprehensive syntheses or clear guidance on effectively implementing
BI approaches for dropout prevention.
Addressing these gaps, this systematic review aims to consolidate and critically analyze the existing literature on the application of BI and related data-driven approaches (specifically machine learning and big data analytics) in predicting and preventing school dropout in higher education. The study seeks to clarify how these tools have been implemented, identify the most effective strategies, and highlight ongoing challenges and opportunities for future research. By providing a clearer and more gradual transition from general BI concepts to specific applications in dropout analysis, this review contributes to enhancing institutional capacity to reduce dropout rates and improve educational outcomes globally.
Business Intelligence in the Educational Sector
Business intelligence (BI) provides educational institutions with critical tools and methodologies for the continuous analysis of institutional data, allowing for the proactive identification of trends, opportunities, and risks [6]. By effectively transforming large volumes of diverse data into actionable insights, BI supports better decision-making in various educational operations.
The key benefits of BI for educational institutions include creating a continuous cycle of improvement in which data are analyzed to inform decision-making, subsequently generating new data for ongoing analysis [3]. BI also enables comprehensive, historical views of institutional performance through robust metrics such as Key Performance Indicators (KPIs) and Key Goal Indicators (KGIs). Additionally, BI tools provide detailed yet intuitive access to updated data, facilitating swift and effective decision-making.
Within educational contexts, implementing BI technologies, such as data warehousing, online analytical processing (OLAP), visual analytics, predictive analytics, dashboards, and data integration through Extract–Transform–Load (ETL) processes, significantly improves institutional responsiveness and efficiency [3,7]. Specifically, as illustrated in Figure 1, these technologies empower institutions to:
• Analyze Student Data: Evaluate academic performance, attendance, and progression metrics to proactively identify students at risk, enabling targeted instructional and support interventions.
• Monitor Institutional Performance: Track strategic indicators, including graduation rates, student satisfaction, and employment outcomes, to pinpoint strengths and weaknesses and develop evidence-based improvement strategies.
• Personalize Learning Experiences: Tailor educational content and approaches based on students’ learning patterns, preferences, and challenges, thereby enhancing educational effectiveness and student engagement.
• Optimize Resource Utilization: Analyze the allocation and effectiveness of educational resources such as staff, materials, and technology to maximize educational impact and institutional efficiency.
• Evaluate Educational Programs: Assess program performance comprehensively, identifying successful initiatives and areas needing improvement, thus guiding informed decisions about curricular and operational adjustments.
This systematic review specifically explores how educational institutions leverage BI
and related predictive analytics tools, such as machine learning and big data, to address
school dropout. By examining global case studies, the review identifies successful strategies,
highlights implementation challenges, and outlines opportunities for future research and
practice aimed at reducing dropout rates and improving educational outcomes.
[Figure 1: diagram titled “Applications of BI in the educational context”; recoverable node labels: resource optimization, personalization of learning.]
Figure 1. Business intelligence applications in the educational environment.
2. Materials and Methods
This study was conducted using a systematic literature review approach, following the guidelines proposed by the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) framework [8]. The PRISMA methodology was selected as it ensures transparency, replicability, and rigor in the identification, selection, and analysis of the relevant literature. This systematic approach allows us to comprehensively address our research questions regarding the use of business intelligence and machine learning methods for predicting and preventing school dropout in higher education.
To develop this study, we followed the four stages proposed by Grijalva et al. [9] shown in Figure 2.
[Figure 2: flow diagram of the four stages: Planning (identification of study field and sources), Selection (conducting the search), Extraction (management and cleaning of results), and Execution (discussion and conclusions of findings).]
Figure 2. Stages followed to conduct the systematic review based on [9].
The following are detailed descriptions of each stage.
2.1. Planning
In this stage, we defined the following research questions to guide the review.
• Has business intelligence allowed for the prediction and prevention of school dropouts?
• What are the methodologies and tools for business intelligence that educational institutions have used in their management processes?
• What have been the main contributions of business intelligence to the issue of school dropout?
Information Retrieval
To perform the analysis, we selected two academic databases: Google Scholar and
Scopus. Google Scholar was chosen due to its extensive indexing of a wide range of
academic outputs, including journal articles, conference proceedings, preprints, theses,
dissertations, and patents. This broad scope ensured comprehensive coverage of diverse
and emerging literature related to business intelligence, machine learning, and predictive
analytics in education. Scopus was selected as a complementary source because of its
specialized focus on peer-reviewed literature, providing high-quality academic articles and
ensuring methodological rigor in our systematic review process.
The search criteria included scientific articles, conference proceedings, book chapters,
and theses containing the keywords presented in Table 1. These databases together offered
the most balanced and thorough representation of relevant scholarly work, aligning well
with the objectives and scope of this review.
Table 1. Search strings used for retrieving articles and theses from Google Scholar and Scopus databases.
| Academic Database | Keywords |
|---|---|
| Google Scholar | ((“school dropout”) AND (“higher education”) AND (“business intelligence” OR “artificial intelligence” OR “machine learning”)):AB |
| Scopus | ALL ( ( ( “school dropout” ) AND ( “higher education” ) AND ( “business intelligence” OR “artificial intelligence” OR “machine learning” ) ) ) |
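The search strings in Table 1 share one Boolean core. As a minimal sketch of how that core could be assembled from keyword lists for reuse or later extension, the Python below rebuilds it; the helper names (`or_group`, `build_core`) are hypothetical, the `ALL ( … )` wrapper and `:AB` suffix are copied from Table 1 rather than from any database documentation, and straight quotes stand in for the typographic quotes of the table.

```python
# Terms that must all appear, and alternatives of which at least one must appear
# (both taken from Table 1).
REQUIRED = ["school dropout", "higher education"]
ALTERNATIVES = ["business intelligence", "artificial intelligence", "machine learning"]


def or_group(terms):
    """Quote each term and join the terms with OR inside one pair of parentheses."""
    return "(" + " OR ".join(f'"{t}"' for t in terms) + ")"


def build_core(required, alternatives):
    """AND together every required term and the OR-group of alternatives."""
    parts = [f'("{t}")' for t in required] + [or_group(alternatives)]
    return "(" + " AND ".join(parts) + ")"


core = build_core(REQUIRED, ALTERNATIVES)
scopus_query = f"ALL ( {core} )"   # wrapper as written in Table 1
scholar_query = f"{core}:AB"       # suffix as written in Table 1

print(scopus_query)
print(scholar_query)
```

Keeping the keyword lists separate from the string assembly makes it easier to document exactly which terms were searched when a review is updated.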
2.2. Data Selection
Articles were selected for this systematic review based on strict inclusion and exclusion
criteria shown in Table 2. If a study met the following criteria, it was included: (1) Business
intelligence-related tools were used for dropout prediction; and (2) the data were from
real-world case studies. The excluded articles were those that did not have a clear BI
implementation framework, were solely theoretical in focus, or were not published in
English or Spanish. This approach helped prevent the review from including studies with
little or no practical relevance to dropout prevention.
Table 2. Inclusion and exclusion criteria for article selection.
| Inclusion Criteria | Exclusion Criteria |
|---|---|
| Articles published in English or Spanish | Articles published in languages other than English or Spanish |
| Studies explicitly using Business Intelligence (BI), machine learning, or data mining tools for dropout prediction and prevention | Studies without clear use or implementation details of BI, machine learning, or data mining tools |
| Empirical studies based on real-world educational data | Purely theoretical studies or conceptual papers without empirical validation |
| Peer-reviewed journal articles, conference proceedings, book chapters, and validated academic theses | Non-peer-reviewed sources, short communications, editorials, abstracts, or opinion pieces |
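The criteria in Table 2 can be read as four independent checks applied to each candidate record. The following Python is only an illustrative sketch of that reading: the `Record` fields, the source-type labels, and the example records are hypothetical, and in practice each flag would come from a human reviewer's judgment rather than from metadata alone.

```python
from dataclasses import dataclass

ALLOWED_LANGUAGES = {"English", "Spanish"}
ALLOWED_SOURCES = {"journal article", "conference proceedings", "book chapter", "thesis"}


@dataclass
class Record:
    title: str
    language: str
    uses_bi_ml_dm: bool   # BI, machine learning, or data mining tool clearly used
    empirical: bool       # based on real-world educational data
    source_type: str


def exclusion_reasons(record):
    """Return the list of Table 2 exclusion criteria that the record triggers."""
    reasons = []
    if record.language not in ALLOWED_LANGUAGES:
        reasons.append("language")
    if not record.uses_bi_ml_dm:
        reasons.append("no BI/ML/data mining tool")
    if not record.empirical:
        reasons.append("not empirical")
    if record.source_type not in ALLOWED_SOURCES:
        reasons.append("source type")
    return reasons


def is_included(record):
    """A record is included only when no exclusion criterion applies."""
    return not exclusion_reasons(record)


# Hypothetical records, for illustration only.
kept = Record("Dropout prediction with decision trees", "Spanish", True, True, "journal article")
dropped = Record("An opinion on dropout", "English", False, False, "editorial")
print(is_included(kept), exclusion_reasons(dropped))
```

Recording the reasons, not just the decision, makes it possible to report how many records each criterion removed.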
2.3. Information Extraction
The methodology used for this systematic review is based on the PRISMA Statement [8] (shown in Figure 3), which aims to identify and analyze business intelligence methodologies that have been used as strategies to analyze school dropout.
[Figure 3: PRISMA flow diagram, “Identification of studies via database and academic search engines”. Identification: records identified from Scopus (n = 625) and Google Scholar (n = 966), total n = 1591; articles after removing duplicates (n = 939). Screening: records excluded after reading the abstract and the title (n = 407); records screened (n = 532). Eligibility: revising the inclusion and exclusion criteria (n = 302); records assessed for eligibility (n = 230). Inclusion: studies included in review (n = 230).]
Figure 3. Process of selecting scientific articles for review.
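The counts reported in Figure 3 can be checked with simple arithmetic. The sketch below reproduces the flow under one assumption that the figure does not state explicitly: that the 302 records attached to "revising the inclusion and exclusion criteria" are the records excluded at that step. Variable names are illustrative.

```python
# Counts as printed in Figure 3.
identified = {"Scopus": 625, "Google Scholar": 966}
after_duplicates = 939
excluded_title_abstract = 407
excluded_by_criteria = 302  # assumption: interpreted as exclusions at the eligibility step

total_identified = sum(identified.values())
duplicates_removed = total_identified - after_duplicates
remaining_after_screening = after_duplicates - excluded_title_abstract
included = remaining_after_screening - excluded_by_criteria

print(total_identified, duplicates_removed, remaining_after_screening, included)
```

Under this reading the figures are internally consistent: 1591 records identified, 652 duplicates removed, 532 records left after title and abstract screening, and 230 studies included.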
3. Results
This section integrates and summarizes the main findings from the literature on
business intelligence and data-driven approaches to predict and prevent student dropout
in higher education. The discussion is organized according to the distribution of research
by region (Section 3.1), publication trends over time (Section 3.2), data sources used in
the analyzed studies (Section 3.3), the machine learning methods employed (Section 3.4),
software tools and platforms utilized (Section 3.5), the general types of research approaches
(Section 3.6), the main application objectives of these studies (Section 3.7), and the reported
challenges (Section 3.8). Each subsection integrates insights from all the references we
identified, highlighting how each study contributes to understanding the school dropout
phenomenon and how business intelligence can address it.
3.1. Global Distribution of Research
A large portion of research on student dropout prediction and prevention using business intelligence originates from countries such as the United States, the United Kingdom, Spain, and Brazil, as well as other Latin American countries (e.g., Mexico, Colombia, Chile) [10], reflecting a growing global interest in these techniques. For example, multiple studies conducted in the United States explore how analytics can identify at-risk students [11–24],
while in Latin America, Brazil has produced a large body of work on machine learning models for dropout prediction [25–38], and further studies [39–50] reflect widespread investigation across the region. The focus on Latin America is further emphasized by customized solutions such as the ADHE Dashboard [51], developed in Colombia to help academic program administrators better visualize the dropout phenomenon, or the multi-department study in Peru [52], which employed machine learning and survival analysis to predict dropout times in various engineering faculties. Additional examples include recent Ecuador-based investigations [53,54], where machine learning aids in the identification of at-risk students in STEM institutions. Other Latin American countries with documented efforts include Chile, where logistic regression and other machine learning models have been tested using public datasets [10]. Increasingly, higher education institutions in Mexico are using predictive analytics to identify at-risk students in large-scale data from multiple groups [50,55–61].
There is also a notable concentration of publications from Asia, such as those from Saudi Arabia [62–67], India [68–74], China [75–77], Taiwan [78], Malaysia [79], Thailand [80], and Japan [81], in conjunction with a wider collection of works that demonstrate a growing interest in Asian contexts [82–89]. For example, Patel et al. [73] focus on a newly created Gujarat dataset (EduDropX) to analyze dropout based on demographic and cultural variables, achieving highly accurate regression-based predictions. In contrast, the work presented in [74] analyzes Kaggle-based data in India to compare random forest and naive Bayes for a balanced dropout classification. Tsai et al. [78] present a case study in Taiwan exploring statistical and deep learning for precision education, while [81] leverages LMS log data in Japan for early detection of at-risk students. Meanwhile, Villegas et al. [54] offer insight into improving student retention in higher education settings through a sustainable approach, reflecting how institutions around the world seek more holistic frameworks for retention, resource optimization, and ethical AI use.
Research in African contexts, though less frequent, is present with Ghana and Tanzania as prime examples [90–92], along with Nigeria [93]. Likewise, references from European contexts abound, such as Germany [94–99], Finland [100], Spain [101–104], Portugal [105–107], and Italy [108–110], each discussing methods to track and forecast student dropout. The Netherlands is represented by [111], which documents a field experiment on early warning systems and the effect of improving individual student counseling. Other publications reflect cross-continental collaborations and meta-analyses [8,112–116]. Africa-based or Africa-related studies also include references from Uganda [117] and data-mining applications in Ghana [118].
An early examination of academic support and monitoring in Latin American contexts also emerges in [119], highlighting how new technologies (artificial intelligence, smart monitors) can significantly improve student retention but require careful integration within local cultures. Furthermore, some universities have begun focusing on designing actionable methods that target specific course pathways, as outlined in [120]. Overall, the scope of publications underscores that the worldwide spread of learning analytics and data-driven insights is not limited to a single region, but rather is a universal pursuit to reduce student dropout rates.
The global distribution of research highlighted above is visually represented in Figure 4, which provides an overview of the geographic concentration of publications focusing on student dropout prediction and prevention through business intelligence and data-driven methods. The map illustrates the number of studies per country, with color intensity corresponding to publication frequency. Countries such as the United States, Brazil, Spain, and the United Kingdom show particularly high research activity, reflecting robust academic and institutional interest. Additionally, notable concentrations in regions
like Latin America, Asia, and Europe underscore the worldwide adoption and adaptation
of these predictive analytics techniques in diverse educational contexts.
Figure 4. Global distribution of research papers focused on using business intelligence and data-driven techniques for predicting student dropout in higher education. Color intensity indicates the number of publications per country.
3.2. Temporal Distribution of Research
Figure 5 highlights how interest in dropout prediction has expanded considerably since earlier works of the late 1990s, such as those by Witten [121], Shin [122], and Dietterich [123]. Although earlier approaches were more exploratory, the proliferation of computing power, big data, and machine learning frameworks has led to more sophisticated predictive models. Classic works in the early 2000s demonstrated the feasibility of machine learning for dropout or at-risk detection [124–127], while mid-2010s studies investigated large-scale analytics in more depth [128–130]. From 2019 onward, the growth in publications has accelerated markedly, reflecting a significant increase in the adoption of advanced data-driven solutions and the urgency to reduce dropout rates [105,108,115,131–135].
Recent studies (2020–2025) exhibit a wide methodological range, including multi-phase predictive modeling [107] and novel data pipelines [136], while also addressing interpretability challenges [120,137]. Works such as [138] explore recommender systems to guide students toward suitable courses and prevent attrition; similarly, Alturki & Alturki [65] emphasize how the increasing adoption of educational data mining from around 2015 onward promoted new lines of inquiry focused on predicting academic achievement and identifying at-risk students. The need for very early detection has propelled investigations into multi-semester data usage [72,139], while deep learning or neural approaches have gained attention in 2020 publications [84,95]. For example, Tsai et al. [78] show how the combination of big data with deep neural networks can yield 77% precision for identifying dropouts in Taiwan. Similarly, Paul [140] describes how student participation factors are essential for reliable predictions using supervised and unsupervised methods.
Furthermore, Villegas et al. [54] detail recent explorations of integrating AI-based retention strategies into broader institutional sustainability and quality frameworks. The works in [44,141,142] represent intermediate phases in which educational institutions have increasingly sought predictive analytics to curb dropout, although with fewer data integration capabilities compared to today. In the very recent period (2020–2025), many works [12,48,57,82,86,143,144] explore or refine deep learning approaches, advanced clustering, or creative feature engineering. Studies such as [145] propose probabilistic machine learning pipelines that quantify uncertainty, helping instructors with structured interventions throughout the semester. Meanwhile, the works from [66,146] demonstrate the value of early-stage predictions in online or hybrid contexts, allowing for timely intervention to keep students engaged.
Figure 5. Temporal distribution of research papers published on predicting student dropout in higher
education using business intelligence and machine learning techniques (1996–2025). The number of
publications demonstrates rapid growth in interest, particularly from 2019 onward.
Several 2023–2025 publications reflect the growing interest in specialized approaches. For example, Mosia [145] uses probabilistic logistic regression in multiple stages to identify at-risk students at different times, while [89] focuses on MOOC dropout modeling with XGBoost. Suaprae et al. [80] discuss an intelligent consulting system equipped with cognitive technology for student counseling, and [93] refines predictive modeling in Nigeria by applying improved data pre-processing in multi-year records. Furthermore, Demartini et al. [147] show how AI-driven dashboards can catalyze acceptance and usage of learning analytics in primary and secondary education, although the insights apply equally to higher education. Awedh and Mueen [67] present a new hybrid approach (LR-KNN) and point to the future need to bridge interpretability with advanced analytics. These contributions highlight how the literature continues to evolve, pushing toward more adaptive, context-driven, and multi-source methodologies.
3.3. Data Sources in Dropout Prediction
The literature confirms that an accurate prediction of dropout is often based on a combination of institutional and external data. Studies repeatedly highlight internal academic records (grades, assessments, GPA) and demographic information as fundamental characteristics [9,46,148–151]. Some authors, such as [64,90,152–154], incorporate data extracted from virtual learning environments (VLEs), learning management systems (LMSs) [81,146],
or MOOC platforms [89]. Others emphasize less common input features; for example, the clustering of digital traces from e-learning platforms [46], the combination of continuous and categorical data through specialized clustering-then-classification methodologies [133], or the exploration of textual variables with advanced NLP [155,156].
Longitudinal data from multi-semester course performance have proven critical in many contexts, including recent work such as [139], where the course completion data for each semester improve predictive accuracy. Similarly, Martins et al. [107] use academic, socio-demographic, and macroeconomic information collected at different phases of a student’s first year to detect dropout risk as early as possible. Additional works incorporate structured questionnaires to measure intangible factors associated with dropout, such as CPQ-based approaches in [103], while [157] emphasizes the importance of thorough data cleaning, labeling, and transformation steps in building robust predictive pipelines. Jaiswal et al. [158] highlight how mining large educational datasets can unmask hidden relationships that predict dropout, particularly in contexts where institutional data accumulate every year.
In many Latin American institutions, official government or institutional datasets are central for analyses [10,51,52], combining socio-demographic, socio-economic, and academic indicators to identify when a student becomes at risk of leaving. Elsewhere, Vidal et al. [53] show how such data can reveal motivational or attributional patterns, especially relevant in STEM contexts. In the United States, Jain [22] merges high school and freshman college data for early warning systems, while [23] leverages Canvas engagement logs. Works such as [79] in Malaysia or [93] in Nigeria illustrate how data from multiple enrollment groups combined with pre-processing in Weka yield accurate predictive models of student attrition. Demartini et al. [147] adapt analytics to a broader educational domain, analyzing data streams from secondary schools and exploring how AI-based dashboards benefit teachers. Meanwhile, Guarda [61] discusses how identifying the personal and institutional reasons for dropout requires cross-sectional and time series data for robust classification.
Socio-economic variables remain consistently identified as crucial predictors of dropout. Multiple references [42,96,127,159–162] and more recent works [62,67,117] stress that parental support, financial conditions, and scholarship data can carry strong predictive power. De Jesus [163] uses fuzzy logic to prescribe specific interventions based on a set of risk factors that include family environment, personal motivation, and socioeconomic background. Sani et al. [93] and Adnan et al. [146] confirm that thorough pre-processing and feature selection further augment the value of such data, thus enhancing predictive performance. Finally, specialized data warehousing frameworks that unify multiple data streams appear in [49,164,165], ensuring that advanced machine learning models receive high-fidelity inputs. While data availability expands, the works from [157,166] remind us of the critical need for systematic data cleaning and transformation to combat noisy or incomplete institutional records.
Figure 6 summarizes the range of data sources used in dropout prediction studies. It
shows that internal databases, academic records, and socio-demographic data are the most
frequently utilized sources in the literature employing business intelligence and machine
learning. This emphasizes the critical role of comprehensive data integration in developing
accurate predictive models.
Figure 6. Data sources utilized in studies predicting student dropout in higher education using business intelligence and machine learning techniques. Internal institutional databases, socio-demographic data, and academic performance data are the most frequently employed data sources across the reviewed literature.
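To make the idea of integrating several data sources concrete, the sketch below joins three per-student sources on a shared identifier, which is the kind of step an ETL process would perform before modeling. It is a minimal illustration and not a pipeline from any reviewed study: the field names (`gpa`, `credits_passed`, `has_scholarship`, `lms_logins`), the sample values, and the choice to treat a student absent from the LMS export as having zero logins are all assumptions.

```python
def integrate_sources(academic, demographic, lms_logins):
    """Left-join three per-student sources on the student id.

    Academic records define who is in the table. A missing demographic
    entry is kept as None so the gap stays visible for later cleaning;
    a student absent from the LMS export is assumed to have 0 logins.
    """
    rows = []
    for student_id, record in sorted(academic.items()):
        demo = demographic.get(student_id, {})
        rows.append({
            "student_id": student_id,
            "gpa": record["gpa"],
            "credits_passed": record["credits_passed"],
            "has_scholarship": demo.get("has_scholarship"),
            "lms_logins": lms_logins.get(student_id, 0),
        })
    return rows


# Hypothetical sample data, for illustration only.
academic = {
    "s1": {"gpa": 3.4, "credits_passed": 30},
    "s2": {"gpa": 1.9, "credits_passed": 12},
}
demographic = {"s1": {"has_scholarship": True}}
lms_logins = {"s1": 42, "s2": 3}

table = integrate_sources(academic, demographic, lms_logins)
for row in table:
    print(row)
```

Keeping missing values explicit, rather than silently dropping students, matters here because the reviewed studies repeatedly point to incomplete institutional records as a data quality problem.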
3.4. Machine Learning Methods Used
Figure 7 illustrates the algorithms most commonly employed in the prediction of student dropout. Decision trees remain quite popular [24,35,68,71,112,159,167–173], mainly due to their interpretability and ease of implementation. Random Forest and Gradient Boosting Machines (e.g., XGBoost, LightGBM, CatBoost) have gained traction for their robust performance [46,47,82,89,107,109,118,131,133,143,174–183], including in MOOC dropout research. Logistic regression remains a consistent baseline for interpretability [68,69,72,79,87,96,145,152,162,184–186], and some authors integrate stepwise or probabilistic refinements [145].
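As an illustration of why logistic regression serves as an interpretable baseline, the following sketch fits one by plain batch gradient descent using only the Python standard library. It is not taken from any reviewed study: the two features (scaled GPA and attendance), the eight toy students, and the learning rate and epoch count are assumptions chosen for readability, and a real analysis would normally use an established library with proper validation.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(weights, bias, x):
    """Estimated dropout probability for one feature vector."""
    return sigmoid(bias + sum(w * v for w, v in zip(weights, x)))


def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Batch gradient descent on the mean log-loss."""
    n, d = len(X), len(X[0])
    weights, bias = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = predict_proba(weights, bias, xi) - yi
            for j, v in enumerate(xi):
                grad_w[j] += err * v
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


# Toy data (hypothetical): [GPA scaled to 0..1, attendance rate], label 1 = dropped out.
X = [[0.90, 0.95], [0.80, 0.90], [0.85, 0.80], [0.70, 0.85],
     [0.40, 0.50], [0.30, 0.60], [0.35, 0.30], [0.20, 0.40]]
y = [0, 0, 0, 0, 1, 1, 1, 1]

weights, bias = fit_logistic(X, y)
p_engaged = predict_proba(weights, bias, [0.90, 0.90])
p_at_risk = predict_proba(weights, bias, [0.25, 0.35])
print(weights, bias, p_engaged, p_at_risk)
```

The sign and size of each fitted weight can be read directly as the direction and strength of that feature's association with dropout, which is the property that makes this model a common reference point for more complex ones.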
Ensemble methods and hybrid approaches often emerge as top performers, exemplified in [43,66,114,132,150,175,181,187–189]. Deep learning frameworks (feedforward networks, CNNs, LSTMs) have gained traction [71,78,84,87,95,152,154,160,190], and approaches like [67] show potential for hybrid classification (LR-KNN) or fuzzy logic-based systems [44,153,163]. Adnan et al. [146] propose a staged approach to identify at-risk students at 0%, 20%, 40%, 60%, 80%, and 100% of the course timeline, utilizing random forest models to capture changing levels of engagement. Meanwhile, Sani et al. [93] apply Random Forest and decision trees with improved data pre-processing, surpassing 79% accuracy in student attrition forecasting.
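A staged approach of the kind described above depends on building features only from the data observed up to each checkpoint. The sketch below shows that step alone, assuming weekly LMS click counts as the engagement signal; the feature names and sample data are hypothetical, and the per-checkpoint classifier (a random forest in the cited work) is deliberately left out.

```python
# Checkpoints as percentages of the course timeline.
CHECKPOINTS_PCT = [0, 20, 40, 60, 80, 100]


def features_at(weekly_clicks, pct):
    """Summarize only the weeks observed up to `pct` percent of the course."""
    n_weeks = len(weekly_clicks) * pct // 100
    seen = weekly_clicks[:n_weeks]
    return {
        "weeks_observed": n_weeks,
        "total_clicks": sum(seen),
        "inactive_weeks": sum(1 for c in seen if c == 0),
    }


def staged_features(weekly_clicks):
    """One feature snapshot per checkpoint, keyed by percentage."""
    return {pct: features_at(weekly_clicks, pct) for pct in CHECKPOINTS_PCT}


# Hypothetical ten-week course for one student.
clicks = [12, 8, 0, 5, 0, 0, 3, 0, 0, 0]
snapshots = staged_features(clicks)
for pct, feats in snapshots.items():
    print(pct, feats)
```

Each snapshot would feed a separate model trained for that checkpoint, so that a prediction made at 20% of the course never uses information from later weeks.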
Less common techniques such as Fuzzy-ARTMAP networks [44], Bayesian classifiers [23,56,191,192], and cluster-then-classify strategies [46,133] complement the more traditional approaches, especially when institutions seek to handle unstructured or noisy data. Kondo et al. [81] demonstrate how straightforward logistic regression and decision trees benefit from LMS log data for early warnings, while [80] deploys an intelligent consulting
Information 2025,16, 326 12 of 30
system integrated with cognitive technology. An increasing emphasis on explainability also
appears, with XAI methods (e.g., LIME, SHAP) helping to clarify model decisions [
131
,
137
].
As these tools evolve, a shared trend is to balance predictive performance with transparency
for institutional acceptance [120,170].
Figure 7. Frequency of machine learning methods employed in predicting student dropout in
higher education.
3.5. Software Tools and Platforms
Regarding specific software tools, the works analyzed indicate frequent usage of open-source data mining platforms such as Weka [79,93,193–195], KNIME [196], Orange, R, and Python-based libraries (scikit-learn, TensorFlow, PyTorch) [36,143,154,160,197,198]. Commercial platforms like IBM SPSS Modeler and SAS Enterprise Miner appear in large institutional or multi-campus projects [199–201], while Power BI or Tableau are commonly cited for data visualization and reporting to decision-makers [202–204]. In some cases, specialized data warehousing or star schema designs facilitate data extraction and transformation [164,205–207].
LMS and MOOC platforms also serve as crucial data sources. The works of [81,89] use online log data to train predictive models and implement early warning systems. Novel or hybrid frameworks appear in references such as [157], which offers an end-to-end pipeline with data augmentation, labeling, and feature engineering. In [147], a cloud-based AI dashboard is implemented to enhance teacher acceptance of learning analytics, especially in primary and secondary schools, but with clear parallels to higher education usage. Suaprae et al. [80] emphasize integrating machine learning into intelligent consulting systems, demonstrating how advanced analytics can be wrapped into user-friendly interfaces for counselors. In general, researchers stress solutions that are accessible and flexible enough to accommodate diverse institutional data, ensuring robust and scalable dropout prevention initiatives.
The diverse ecosystem of software tools and platforms employed in the research on dropout prediction discussed above is visually summarized in Figure 8. This waffle plot illustrates the proportional usage of various tools across reviewed studies. Open-source data mining platforms such as Weka, KNIME, R, and Python-based libraries are prominently featured, along with commercial solutions like IBM SPSS Modeler, SAS Enterprise Miner, Power BI, and Tableau. The visualization highlights the widespread adoption of versatile software tools, emphasizing both the flexibility and accessibility required by institutions implementing effective dropout prediction and prevention systems.
Figure 8. Waffle plot illustrating the proportion of software tools used in studies predicting student
dropout in higher education.
3.6. Types of Research Approaches
Figure 9 shows how most studies employ quantitative empirical methods that feature some variant of supervised machine learning on institutional data. Many of these are cross-sectional analyses, although a growing number of researchers adopt longitudinal approaches to capture changes in student performance over time [42,79,102,128,203,208,209]. Intervention-based pilot implementations of early warning systems are increasingly common [34,66,131,210–213]. Here, data dashboards are tested in real courses or programs, with interventions triggered once high-risk profiles emerge [81,146]. The work of De Jesus et al. [163] proposes a fuzzy logic-based prescriptive analytics system that both predicts dropout risk and suggests targeted interventions. Similarly, [80] integrates predictive analytics into an intelligent counseling system, guided by cognitive technology to decrease mid-exits.
Studies such as [113] test specific interventions through randomized or quasi-experimental designs, while meta-analyses such as [112,114,116] evaluate multiple predictive modeling approaches. Other authors pursue design-based research [58,135,136,214], focusing on iterative improvements in early warning dashboards. Works such as [52,93] employ survival analysis or multi-year classification to unravel dropout trajectories. Some efforts combine numeric data with interviews or surveys to validate the root causes of dropout behavior [21,61,191,215].
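As a concrete illustration of the survival-analysis framing, the Kaplan-Meier estimator gives the share of students still enrolled after each term, treating students who graduate or who are still enrolled at the end of the observation window as censored. The sketch below is a generic textbook estimator written in plain Python, not the method of [52] or [93]; the function name and the toy cohort are hypothetical.

```python
def kaplan_meier(durations, dropped_out):
    """Kaplan-Meier estimate of the probability of remaining enrolled.

    durations[i]   -- number of terms student i was observed
    dropped_out[i] -- True if the student dropped out at that term,
                      False if the observation was censored (e.g. the
                      student graduated or the study period ended)
    Returns a list of (term, survival_probability) pairs, one per term
    at which at least one dropout occurred.
    """
    if len(durations) != len(dropped_out):
        raise ValueError("durations and dropped_out must have equal length")

    survival = 1.0
    curve = []
    for t in sorted(set(durations)):
        # Students still under observation just before term t.
        at_risk = sum(1 for d in durations if d >= t)
        # Dropouts recorded exactly at term t.
        events = sum(1 for d, e in zip(durations, dropped_out) if d == t and e)
        if events:
            survival *= 1.0 - events / at_risk
            curve.append((t, survival))
    return curve


# Toy cohort of six students (hypothetical data).
terms = [1, 2, 2, 3, 4, 4]
dropout = [True, True, False, True, False, False]
curve = kaplan_meier(terms, dropout)
```

Censored students still count in the at-risk set until they leave observation, which is what distinguishes this estimate from a plain dropout rate computed per term.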
New lines of research incorporate novel analytics and AI. For example, Demartini et al. [147] address K–12 education, but highlight a multidimensional approach to bridging analytics and stakeholder acceptance, relevant to higher education. Vidal et al. [53] employ neural networks and a robust set of motivational variables in Ecuador, demonstrating how psychosocial factors can affect the success of STEM courses. In [67], the authors propose a layered approach that integrates pre-processing, clustering (canopy + Gaussian flow optimizer), and a hybrid LR-KNN model to identify at-risk students at King Abdulaziz University. References like [157] detail data pre-processing pipelines (CRISP-DM, star schemas) that serve as the backbone for predictive tasks. Finally, large-scale or multi-institutional explorations, such as [104,190], underscore the importance of shared data standards to refine predictions across varying contexts.
Figure 9. Treemap representing the distribution of research types among studies on predicting
student dropout in higher education using business intelligence and machine learning techniques.
3.7. Application Objectives of the Studies
Whereas early work on dropout analysis often aimed simply to identify at-risk students, recent publications indicate broader objectives of data usage. Figure 10 illustrates how early detection and intervention design remain the most common goals [46,83,148,181,185,216]. Many studies propose frameworks for timely or even real-time interventions, focusing on academic counseling, customized feedback, or improved course design [24,63,80,132,142,153,163,217–219]. Another line of research focuses on building recommendation systems or advising platforms (e.g., personalized learning pathways [133,216] or data-driven academic counseling tools [93,108,220]).
In addition, a portion of the literature contemplates how predictive insights can guide broader institutional policy. For example, works in [159,221,222] detail using analytics to reorient curriculum design and resource allocation. Several authors highlight the use of AI-based predictive models to reduce inequality and strengthen equity in education, especially for disadvantaged groups [53,62,82,117,223]. Another dimension lies in designing advanced dashboards or holistic monitoring solutions that unify multiple data streams into a coherent framework [132,147,206,224–226], supporting administrators in real-time decision-making. Integrating the concept of “digital portraits” from social media and university records has also been explored [227] to provide a 360-degree overview of students’ academic and extracurricular lives.
Several works address how performance analytics can be leveraged by institutional leaders to formulate macro-level policies. Martins et al. [107] study whether students are likely to drop out, delay, or complete on time, while [66] focuses on improving retention rates for students at risk by combining LSTM with other machine learning classifiers. Sharma and Yadav [72] help first-year engineering students through a streamlined predictor set for timely interventions, while Adnan et al. [146] explicitly tailor random forest models to identify at-risk students at multiple junctures across a course. MOOC-based contexts also arise: Patel and Amin [89] show how advanced gradient boosting promotes retention through early detection, facilitating more personalized online learning. The integration of institutional sustainability goals in works such as [54] reveals a growing acknowledgment that dropout reduction aligns with resource optimization, mental health considerations, and institutional longevity.
In general, these specialized objectives converge on the broad aim to reduce dropout, improve academic performance, and optimize routes in a data-driven, equitable manner. Local contexts, from Taiwan [78] to Nigeria [93] to Ecuador [53], demonstrate that region-specific features (cultural, socioeconomic, infrastructural) must be integrated into models to maximize impact. Consequently, the typical workflow is shifting from predictive to prescriptive, as exemplified by [163], which merges fuzzy logic with interventions to address the multifaceted challenges of student attrition.
Figure 10. Distribution of application objectives among studies predicting student dropout in higher
education using business intelligence and machine learning techniques.
3.8. Challenges Reported
Several recurring challenges in the use of BI and machine learning for predicting
dropout appear in the literature, such as data privacy, data quality, interpretability, resource
limitations, and ethical concerns, as depicted in Figure 11.
Figure 11. Challenges reported in studies focused on predicting student dropout in higher education
using business intelligence and machine learning techniques.
3.8.1. Data Privacy and Security
Data privacy remains a critical concern, especially regarding sensitive student information. Studies such as [12,228] emphasize risks associated with bias and privacy violations arising from inappropriate use or sharing of student data. Transparent data governance frameworks are recommended by [20,185] to mitigate these risks. Cross-platform and cross-institutional data sharing raise additional privacy complexities, especially in contexts involving emerging AI-driven analytics [83,156,190]. Maintaining adherence to robust ethical standards and institutional policies, as highlighted by [54], is essential for responsible and secure data use.
3.8.2. Data Quality and Integration
Integrating data from diverse sources often poses significant challenges due to inconsistencies, missing values, or variations in data formats [155,166]. The literature advocates for adopting robust data management practices, including structured pipelines, the CRISP-DM methodology, star schema designs, and advanced data warehousing approaches [49,51,164,205,229,230]. Additionally, longitudinal data present difficulties in maintaining consistency over extended periods, requiring special attention to shifting definitions and evolving institutional standards [72,79,139]. Work such as [73] emphasizes the need to generate region-specific datasets, while [74,157] emphasize systematic strategies to handle class imbalances, data cleaning, and labeling to improve model robustness.
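One common way to handle class imbalance, assumed here for illustration rather than taken from [74,157], is to weight each class inversely to its frequency, so that the minority (dropout) class contributes as much to the training loss as the majority class. The sketch below uses the heuristic n_samples / (n_classes × class_count), which matches the "balanced" class-weight option documented by scikit-learn; the function name and the toy labels are hypothetical.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Toy labels: 8 students who stayed (0) and 2 who dropped out (1).
labels = [0] * 8 + [1] * 2
weights = balanced_class_weights(labels)
# Per-sample weights that could be passed to a learner that accepts them.
sample_weights = [weights[y] for y in labels]
```

With these weights, each class carries the same total weight in the sample, so a model is not rewarded for simply predicting that nobody drops out. Resampling techniques are an alternative when a learner does not accept weights.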
3.8.3. Model Interpretability
The increasing sophistication of machine learning models, such as deep neural networks and ensemble methods, often results in reduced transparency, commonly referred to as the black box problem [95,160,170,185]. This lack of interpretability can undermine stakeholder trust and limit practical adoption. To address these issues, researchers emphasize the development of explainable AI (XAI) frameworks such as LIME or SHAP, which provide insights into model decisions [131,133,231,232]. Additionally, probabilistic approaches and Bayesian models are recommended to integrate uncertainty explicitly, thus enhancing interpretability without sacrificing predictive performance [23,145]. Studies such as [120,137] further highlight the importance of obtaining causal insights and actionable interventions from predictive analytics.
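To make the idea behind attribution methods such as SHAP concrete, the sketch below computes exact Shapley values for a single prediction by averaging each feature's marginal contribution over all feature orderings. This is a brute-force illustration that is only practical for a handful of features, and it is not the SHAP library's algorithm; the toy risk model, its coefficients, and the feature names are hypothetical.

```python
from itertools import permutations
from math import factorial


def shapley_values(predict, instance, baseline):
    """Exact Shapley values of `predict` at `instance` relative to `baseline`.

    Features that are "absent" take their baseline value. The cost grows
    factorially with the number of features, so this is for illustration only.
    """
    names = list(instance)
    phi = {name: 0.0 for name in names}
    for order in permutations(names):
        current = dict(baseline)
        previous = predict(current)
        for name in order:
            # Switch one feature from its baseline to its observed value.
            current[name] = instance[name]
            value = predict(current)
            phi[name] += value - previous
            previous = value
    n_orders = factorial(len(names))
    return {name: total / n_orders for name, total in phi.items()}


def risk_score(x):
    """Hypothetical dropout-risk score with an interaction term."""
    score = 0.5 - 0.1 * x["gpa"] + 0.02 * x["absences"]
    # Interaction: absences weigh more for students without a scholarship.
    score += 0.01 * x["absences"] * (1 - x["scholarship"])
    return score


student = {"gpa": 2.0, "absences": 10, "scholarship": 0}
average = {"gpa": 3.0, "absences": 4, "scholarship": 1}
phi = shapley_values(risk_score, student, average)
```

The attributions in `phi` sum to the difference between the student's score and the baseline score, which is the property that lets an advisor read them as a breakdown of why one student is scored above or below a reference student.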
3.8.4. Resource Limitations and Expertise
A recurring barrier to effective implementation of BI in higher education is the lack of technical expertise and the necessary infrastructure [114,167,191]. Institutions often face constraints related to limited financial resources, insufficient technological infrastructure, and staff training gaps [166,233]. These limitations are particularly pronounced in developing countries where unstable internet connectivity, inadequate hardware, and minimal technical support restrict the deployment of sophisticated predictive systems [91–93]. Addressing these challenges requires comprehensive capacity-building initiatives, targeted training programs, and the creation of user-friendly analytical tools to ensure successful adoption and sustained implementation [80,119].
3.8.5. Ethical Concerns and Bias
The risk of perpetuating existing biases through predictive analytics is widely noted, raising significant ethical concerns [20,83,108,143,185]. There is a critical need for fairness-aware modeling practices to ensure that dropout prediction tools do not disproportionately disadvantage marginalized or vulnerable groups [12,14,234]. Researchers advocate for interpretive frameworks that explicitly account for demographic, socioeconomic, and cultural contexts to prevent unintended discrimination [98,117]. Moreover, careful validation practices and inclusive model design are essential to mitigate inadvertent biases, especially in diverse educational settings such as MOOCs or digitally mediated learning environments [67,89].
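A simple fairness check in the spirit of the validation practices described above is to compare error rates across student groups. The sketch below computes, for each group, the share of students flagged as at risk and the false negative rate (actual dropouts the model misses). The group labels, the toy data, and the function name are hypothetical, and which metric is appropriate depends on the institutional context.

```python
from collections import defaultdict


def group_rates(y_true, y_pred, groups):
    """Per-group flag rate and false negative rate for binary predictions.

    y_true -- 1 if the student actually dropped out, else 0
    y_pred -- 1 if the model flagged the student as at risk, else 0
    groups -- group label for each student (hypothetical attribute)
    """
    stats = defaultdict(lambda: {"n": 0, "flagged": 0, "positives": 0, "missed": 0})
    for truth, pred, group in zip(y_true, y_pred, groups):
        s = stats[group]
        s["n"] += 1
        s["flagged"] += pred
        if truth == 1:
            s["positives"] += 1
            s["missed"] += 1 - pred
    report = {}
    for group, s in stats.items():
        report[group] = {
            "flag_rate": s["flagged"] / s["n"],
            # None when the group has no actual dropouts to miss.
            "false_negative_rate": (
                s["missed"] / s["positives"] if s["positives"] else None
            ),
        }
    return report


# Toy predictions for two groups of four students each (hypothetical data).
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 1, 0, 1, 1, 0, 0, 0]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]
report = group_rates(y_true, y_pred, groups)
fnr_gap = abs(report["A"]["false_negative_rate"] - report["B"]["false_negative_rate"])
```

In this toy data the model misses half of the actual dropouts in group B and none in group A, the kind of gap that would call for further validation before deployment.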
4. Discussion
The results of this systematic analysis highlight the rapidly expanding global interest
in using business intelligence (BI) and data-driven approaches to predict and prevent
student dropout in higher education. Taken together, these findings emphasize how
diverse regions, institutional contexts, and research methods converge on a common goal:
to reduce attrition rates by leveraging advanced analytics. Below, we discuss key insights
and their implications for educational practice, policy, and research.
4.1. Global and Temporal Trends
The geographical scope of the reviewed research demonstrates that dropout prediction
efforts are widespread in North America, Europe, Latin America, and Asia, with a growing
presence in Africa. This global engagement highlights the universal challenge of student
attrition, while also revealing significant context-specific nuances. Nations such as the
United States, Brazil, Spain, and the United Kingdom stand out for their high volume of
studies, although work from other regions continues to grow. At the same time, the marked
increase in publications from 2019 onward suggests that the urgent need to improve student
success, combined with greater data availability and computational capacity, has fueled
more robust and sophisticated modeling techniques.
Despite varying resource constraints across regions, the common denominator is a
shared objective: transitioning from traditional, often manual tracking systems to proactive,
data-driven strategies that inform timely interventions. This widespread interest points to
the potential for cross-regional knowledge exchange, especially as more mature educational
systems refine their approaches and emerging systems adopt best practices. Moreover, the
methodological evolution observed over the last decade—from relatively simple classifica-
tion models to deep learning and explainable AI—reflects heightened demand for better
predictive power and interpretability.
4.2. Data Sources and Methods
A variety of data sources have been employed to predict dropout, with the most
prominent being institutional academic records (e.g., grades, GPA) and socio-demographic
variables (e.g., socioeconomic status, parental support). Additionally, learning manage-
ment system (LMS) logs, MOOC platform data, and longitudinal performance records are
increasingly integrated to capture student engagement and changes in risk over time. This
combination of data highlights the multifactorial nature of dropout, in which academic,
social, psychological, and economic factors intersect to influence students’ decisions to
abandon their studies.
From a methodological point of view, machine learning approaches show significant
heterogeneity. Decision trees and logistic regression are consistently used because of their
interpretability, while ensemble methods (e.g., Random Forest, XGBoost) and deep learning
models have gained popularity for their ability to handle complex, high-dimensional data.
However, the shift toward more advanced models often comes at the cost of reduced
transparency, leading to a growing emphasis on explainable artificial intelligence (XAI)
methods. Such approaches aim to balance predictive accuracy with stakeholder inter-
pretability, particularly as institutional acceptance and ethical use hinge on clear, justifiable
decision-making processes.
4.3. Research Approaches and Application Objectives
Quantitative empirical analyses predominate, often featuring supervised learning to
classify or predict at-risk students. Yet, a notable subset of studies incorporates intervention-
based designs, in which predictive models guide real-time or near-real-time actions. These
interventions—ranging from personalized counseling to curricular adjustments—are often
delivered through user-friendly dashboards or integrated advising platforms. By embed-
ding predictive results into day-to-day academic operations, these studies demonstrate the
feasibility of moving beyond retrospective analytics and toward proactive student support.
The reviewed research also reveals an expansion of objectives. While early detection
of at-risk students remains central, many institutions now aim to use dropout risk data
to inform broader policy decisions. For example, predictive insights help drive course
redesign, equitable resource distribution, and even institutional strategies to improve
mental health and social engagement. Some universities align these efforts with long-term
sustainability and inclusion goals, recognizing that dropout prevention intersects with
ethics, student well-being, and institutional reputation.
4.4. Challenges and Limitations
Educational institutions often face technical challenges due to educators’ limited
familiarity with BI tools and predictive analytics. To effectively overcome these barriers,
institutions should invest in targeted professional development initiatives, including hands-
on workshops, certifications in data analytics, and training programs tailored specifically
to educational contexts. Collaborative efforts with industry experts or partnerships with
institutions experienced in data-driven practices can also facilitate knowledge transfer
and skill enhancement. Additionally, implementing user-friendly, intuitive BI dashboards
designed explicitly for educators can significantly reduce barriers to effective use, ultimately
improving the practical impact of BI tools on dropout prevention strategies.
Data privacy remains a significant challenge in implementing BI and predictive an-
alytics in education, particularly given stringent legal frameworks such as the General
Data Protection Regulation (GDPR) in the European Union and the Family Educational
Rights and Privacy Act (FERPA) in the United States. GDPR sets strict standards for
collecting, processing, and managing personal data, emphasizing transparency, consent,
and accountability, while FERPA specifically protects the privacy of student educational
records, limiting how institutions can use and disclose data. These regulations require
educational institutions to establish robust data governance practices and infrastructure to
ensure compliance, which poses considerable resource challenges, especially for smaller or
under-resourced institutions.
Data quality and integration also pose substantial difficulties. Heterogeneous data
systems, missing values, and inconsistent recording practices can hinder model accuracy,
underscoring the need for standardized data warehousing solutions and sophisticated
pre-processing workflows.
Another limitation involves the interpretability of advanced predictive methods. Al-
though sophisticated algorithms, including deep learning and ensemble models, offer
high predictive performance, their inherent complexity often limits transparency. This
so-called “black-box” issue can undermine institutional stakeholders’ trust, necessitating
greater emphasis on explainable AI methodologies that provide clarity and confidence in
predictive outputs.
Additionally, successful BI implementation hinges upon adequate training and skill
development among educators and administrators. Even the most effective predictive
analytics systems require skilled users who can interpret and translate data insights into
actionable interventions. Without sufficient training and ongoing professional development,
educational institutions may struggle to realize the full potential of BI technologies.
Finally, ethical concerns remain significant. Predictive models trained on historical
data risk reinforcing existing biases, inadvertently disadvantaging vulnerable student
populations. Ensuring fairness and equity requires rigorous validation, ongoing monitoring,
and transparent modeling practices, highlighting the ethical responsibilities of institutions
employing predictive analytics.
4.5. Future Directions
Several avenues for further work emerge from these findings. First, efforts to integrate
multi-modal data—encompassing academic records, LMS logs, and psychosocial indica-
tors—should continue, as richer datasets promise more robust risk assessments. Second,
bridging the gap between prediction and intervention calls for interdisciplinary research,
uniting data scientists, educators, and policy experts to design actionable strategies. Third,
scaling up cross-institutional collaborations can accelerate learning from diverse contexts,
particularly if researchers embrace open science principles that encourage data and method-
ology sharing. Lastly, developing and validating fairness-aware machine learning models
is critical to ensuring that dropout prevention strategies support, rather than stigmatize,
underrepresented or marginalized student groups.
5. Conclusions
The collective evidence from these studies points to the growing sophistication and
relevance of business intelligence and machine learning methods in addressing student
dropout in higher education. Institutions worldwide are leveraging increasingly varied
data sources and advanced analytical tools, supported by user-centered dashboards and
interventions that target students most in need. However, persistent challenges, such
as data quality, privacy, interpretability, and ethical risk, underscore that successful im-
plementation requires robust governance and community involvement. In the future,
collaborative and context-specific research will play a key role in unlocking the full po-
tential of data-driven dropout prevention, ultimately contributing to more inclusive and
effective educational systems.
Author Contributions: Conceptualization, D.-M.C.-E.; methodology, D.-M.C.-E., J.T., J.-A.R.-G. and
K.-E.C.-E.; formal analysis, D.-M.C.-E., J.T., J.-A.R.-G. and K.-E.C.-E.; investigation, D.-M.C.-E., J.T. and
K.-E.C.-E.; resources, D.-M.C.-E., J.T., J.-A.R.-G., K.-E.C.-E., R.-E.L.-M. and R.C.-S.; writing—original
draft preparation, D.-M.C.-E., J.T., J.-A.R.-G. and K.-E.C.-E.; writing—review and editing, D.-M.C.-E.,
J.-A.R.-G., J.T., K.-E.C.-E., R.-E.L.-M., T.G.-R. and R.C.-S.; supervision, D.-M.C.-E., K.-E.C.-E., T.G.-R.
and J.T.; project administration, D.-M.C.-E., J.T. and K.-E.C.-E. All authors have read and agreed to the
published version of the manuscript.
Funding: This research received no external funding.
Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.
Data Availability Statement: No new data were created or analyzed in this study.
Acknowledgments: We thank the Secretaría de Ciencia, Humanidades, Tecnología e Innovación
(SECIHTI) for its support through the National System of Researchers (SNII).
Conflicts of Interest: The authors declare no conflicts of interest.
References
1. Urbina-Nájera, A.B.; Camino-Hampshire, J.C.; Barbosa, R.C. Deserción escolar universitaria: Patrones para prevenirla aplicando minería de datos educativa. RELIEVE—Rev. Electrón. Investig. Eval. Educativa 2020, 26. [CrossRef]
2. Rochin Berumen, F.L. Deserción escolar en la educación superior en México: Revisión de literatura. RIDE—Rev. Iberoam. Para Investig. Desarro. Educ. 2021, 11, 1–11. [CrossRef]
3. Curto Díaz, J. Introduccion al Business Intelligence; Editorial UOC: Barcelona, Spain, 2017.
4. Ayala, J.; Ortiz, J.; Guevara, C.; Maya, E. Herramientas de Business Intelligence (BI) modernas, basadas en memoria y con lógica asociativa. Revistapuce 2018, 106. [CrossRef]
5. Ouriniche, N.; Benabbou, Z.; Abbar, H. Global Performance Management Using the Sustainability Balanced Scorecard and Business Intelligence—A Case Study. Proc. Eng. 2022, 4, 313–326. [CrossRef]
6. Cano, J.L. Business Intelligence: Competir con Información; Fundación Banesto: Madrid, Spain, 2007.
7. Pascal, G.; Servetto, D.; Mirasson, U.L.; Luna, Y. Aplicación de Business Intelligence para la toma de decisiones en Instituciones Universitarias. Implementación de Boletines Estadísticos en la Universidad Nacional de Lomas de Zamora (UNLZ). Rev. Electrón. Sobre Tecnol. Educ. Soc. 2017, 4, 7.
8. Moher, D.; Liberati, A.; Tetzlaff, J.; Altman, D.G.; Group, P. Preferred reporting items for systematic reviews and meta-analyses: The PRISMA statement. Int. J. Surg. 2010, 8, 336–341. [CrossRef]
9. Grijalva, P.K.; Cornejo, G.E.; Gómez, R.R.; Real, K.P.; Fernández, A. Herramientas colaborativas para revisiones sistemáticas. Rev. Espac. 2019, 40, 9.
10. Uldall, J.S.; Rojas, C.G. An Application of Machine Learning in Public Policy: Early Warning Prediction of School Dropout in the Chilean Public Education System. Multidiscip. Bus. Rev. 2022, 15, 20–35. [CrossRef]
11. Christie, S.T.; Jarratt, D.C.; Olson, L.A.; Taijala, T.T. Machine-Learned School Dropout Early Warning at Scale. In Proceedings of the Educational Data Mining, Montreal, QC, Canada, 2–5 July 2019.
12. Bird, K. Predictive Analytics in Higher Education: The Promises and Challenges of Using Machine Learning to Improve Student Success. AIR Prof. File 2023, 2023, 11–18. [CrossRef]
13. Ryan, L.; Snow, N. Supporting Student Success with Intuitive, Approachable Data Visualization. In Supporting the Success of Adult and Online Students Proven Practices in Higher Education; CreateSpace Independent Publishing: San Bernardino, CA, USA, 2016.
14. Pinkus, L. Using Early-Warning Data to Improve Graduation Rates: Closing Cracks in the Education System; Alliance for Excellent Education: Washington, DC, USA, 2009.
15. Aulck, L.; Velagapudi, N.; Blumenstock, J.E.; West, J.D. Predicting Student Dropout in Higher Education. arXiv 2016, arXiv:1606.06364.
16. Kang, K.; Wang, S. Analyze and Predict Student Dropout from Online Programs. In Proceedings of the Conference on Educational Data Mining, DeKalb, IL, USA, 23–25 March 2018; pp. 6–12. [CrossRef]
17. Essa, A.; Ayad, H. Improving student success using predictive models and data visualisations. Res. Learn. Technol. 2012, 20, 58–70. [CrossRef]
18. Matz, S.; Peters, H.; Stachl, C. Using machine learning to predict student retention from socio-demographic characteristics and app-based engagement metrics. Dent. Sci. Rep. 2023, 13, 5705. [CrossRef]
19. Yu, R.; Li, Q.; Fischer, C.; Doroudi, S.; Xu, D. Towards Accurate and Fair Prediction of College Success: Evaluating Different Sources of Student Data. In Proceedings of the Educational Data Mining Conference, Virtual Event, 10–13 July 2020; pp. 292–301.
20. Yu, R.; Lee, H.; Kizilcec, R.F. Should College Dropout Prediction Models Include Protected Attributes. In Proceedings of the Learning at Scale Conference, Virtual, 22–25 June 2021; pp. 91–100. [CrossRef]
21. Zhao, Y.; Otteson, A. AI-Driven Strategies for Reducing Student Withdrawal—A Study of EMU Student Stopout. arXiv 2024, arXiv:2408.02598. [CrossRef]
22. Jain, H. Predicting college dropout likelihood based on high school and college data: A machine learning approach. In Proceedings of the International Conference on Science & Technology, Bali, Indonesia, 18–19 July 2024. [CrossRef]
23. Mínguez-Martínez, A.L.; Sood, K.; Mahto, R. Early Detection of At-Risk Students Using Machine Learning. arXiv 2024, arXiv:2412.09483.
24. Bukralia, R.; Deokar, A.V.; Sarnikar, S. Using Academic Analytics to Predict Dropout Risk in E-Learning Courses. In Reshaping Society Through Analytics, Collaboration, and Decision Support; Iyer, L.S., Power, D.J., Eds.; Annals of Information Systems; Springer: Cham, Switzerland, 2015; Volume 18, pp. 67–93. [CrossRef]
25. Cambruzzi, W.; Rigo, S.; Barbosa, J.L.V. Dropout Prediction and Reduction in Distance Education Courses with the Learning Analytics Multitrail Approach. J. Univers. Comput. Sci. 2015, 21, 23–47. [CrossRef]
26. da S. Freitas, F.A.; Vasconcelos, F.F.X.; Peixoto, S.A.; Hassan, M.M.; Dewan, M.A.A.; de Albuquerque, V.H.C.; Filho, P.P.R. IoT System for School Dropout Prediction Using Machine Learning Techniques Based on Socioeconomic Data. Electronics 2020, 9, 1613. [CrossRef]
27. Barthès, J.P.A. An explainable machine learning approach for student dropout prediction. Expert Syst. Appl. 2023, 233, 120933. [CrossRef]
28. Flores, V.; Heras, S.; Julián, V. Comparison of Predictive Models with Balanced Classes for the Forecast of Student Dropout in Higher Education. In Practical Applications of Agents and Multi-Agent Systems; Springer: Cham, Switzerland, 2021; pp. 139–152. [CrossRef]
29. Gonzalez-Nucamendi, A.; Noguez, J.; Neri, L.; Robledo-Rella, V.; García-Castelán, R.M.G.; Escobar-Castillejos, D. Learning Analytics to Determine Profile Dimensions of Students Associated with their Academic Performance. Appl. Sci. 2022, 12, 10560. [CrossRef]
30. Melo, E.C.; de Souza, F.S.H. Improving the prediction of school dropout with the support of the semi-supervised learning approach. iSys 2023, 16, 10:1–10:26. [CrossRef]
31. Dávila, G.; Haro, J.; González, A.; Ruiz-Vivanco, O.; Guamán, D. Student Dropout Prediction in High Education, Using Machine Learning and Deep Learning Models: Case of Ecuadorian University. In Proceedings of the 2023 International Conference on Computational Science and Computational Intelligence (CSCI), Las Vegas, NV, USA, 13–15 December 2023. [CrossRef]
32. Fernandes, T.D.S.; Ramos, G.N. Generating and Understanding Predictive Models for Student Attrition in Public Higher Education. In Proceedings of the IEEE Frontiers in Education Conference, College Station, TX, USA, 18–21 October 2023; pp. 1–5. [CrossRef]
33. Kalegele, K. Enabling Proactive Management of School Dropouts Using Neural Network. J. Softw. Eng. Appl. 2020, 13, 245–257. [CrossRef]
34. Baneres, D.; Rodríguez-González, M.E.; Guerrero-Roldán, A.E. A Real Time Predictive Model for Identifying Course Dropout in Online Higher Education. IEEE Trans. Learn. Technol. 2023, 16, 484–499. [CrossRef]
35. Sayed, M. Student Progression and Dropout Rates Using Convolutional Neural Network: A Case Study of the Arab Open University. J. Adv. Comput. Intell. Intell. Inform. 2024, 28, 668–678. [CrossRef]
36. do Carmo Nicoletti, M.; de Oliveira, O.L. A Machine Learning-Based Computational System Proposal Aiming at Higher Education Dropout Prediction. High. Educ. Stud. 2020, 10, 12–24. [CrossRef]
37.
Kim, S.; Yoo, E.; Kim, S. Why Do Students Drop Out? University Dropout Prediction and Associated Factor Analysis Using
Machine Learning Techniques. arXiv 2023, arXiv:2310.10987. [CrossRef]
38.
Silva, J.; Matos, L.F.A.; Mosquera, C.M.; Mercado, C.V.; González, R.B.; Llinas, N.O.; Lezama, O.B.P. Prediction of academic
dropout in university students using data mining: Engineering case. In Advances in Intelligent Systems and Computing; Springer:
Singapore, 2020; pp. 495–500. [CrossRef]
39.
Manrique, R.; Nunes, B.P.; Marino, O.; Casanova, M.A.; Nurmikko-Fuller, T. An Analysis of Student Representation, Representative
Features and Classification Algorithms to Predict Degree Dropout. In Proceedings of the Learning Analytics and Knowledge
Conference, Tempe, AZ, USA, 4–8 March 2019; pp. 401–410. [CrossRef]
40.
Beltrame, W.A.R.; Gonçalves, O.L. Socioeconomic Data Mining and Student Dropout: Analyzing a Higher Education Course in
Brazil. Int. J. Innov. Educ. Res. 2020,8, 505–518. [CrossRef]
41.
Böttcher, A.; Thurner, V.; Hafner, T. Applying Data Analysis to Identify Early Indicators for Potential Risk of Dropout in CS
Students. In Proceedings of the Global Engineering Education Conference, Porto, Portugal, 27–30 April 2020; pp. 827–836.
[CrossRef]
42.
Hegde, V.; Prageeth, P.P. Higher education student dropout prediction and analysis through educational data mining. In
Proceedings of the International Conference on Information and Communication Systems (ICISC), Coimbatore, India, 19–20
January 2018. [CrossRef]
43.
Aguirre, C.E.; Pérez, J.C. Predictive data analysis techniques applied to dropping out of university studies. In Proceedings of the
Conference on Latin American Computing, Loja, Ecuador, 19–23 October 2020; pp. 512–521. [CrossRef]
44.
Martinho, V.R.C.; Nunes, C.; Minussi, C.R. Prediction of School Dropout Risk Group Using Neural Network Fuzzy ARTMAP. In
Proceedings of the 2013 Federated Conference on Computer Science and Information Systems, Krakow, Poland, 8–11 September
2013; pp. 111–114.
45.
Palacios, C.A.; Reyes-Suarez, J.A.; Bearzotti, L.; Leiva, V.; Marchant, C. Knowledge Discovery for Higher Education Student
Retention Based on Data Mining: Machine Learning Algorithms and Case Study in Chile. Entropy 2021,23, 485. [CrossRef]
46.
Pecuchova, J.; Drlík, M. Enhancing the Early Student Dropout Prediction Model Through Clustering Analysis of Students’ Digital
Traces. IEEE Access 2024,12, 159336–159367. [CrossRef]
47.
Poellhuber, L.V.; Poellhuber, B.; Desmarais, M.C.; Léger, C.; Roy, N.; Vu, M.M.C. Cluster-Based Performance of Student Dropout
Prediction as a Solution for Large Scale Models in a Moodle LMS. In Proceedings of the International Conference on Learning
Analytics and Knowledge, Arlington, TX, USA, 13–17 March 2023. [CrossRef]
48.
Kok, C.L.; Ho, C.K.; Chen, L.; Koh, Y.Y.; Tian, B. A Novel Predictive Modeling for Student Attrition Utilizing Machine Learning
and Sustainable Big Data Analytics. Appl. Sci. 2024,14, 9633. [CrossRef]
49.
Orong, M.Y.; Caroro, R.A.; Durias, G.D.; Cabrera, J.A.; Lonzon, H.A.; Ricalde, G.T. A Predictive Analytics Approach in
Determining the Predictors of Student Attrition in the Higher Education Institutions in the Philippines. In Proceedings of the
International Conference on Software Engineering and Information Management, Sydney, NSW, Australia, 12–15 January 2020;
pp. 222–225. [CrossRef]
50.
Pérez, P.N.M.; C, J.R.A.; Zamora, A.R.R. Predictive Model Design applying Data Mining to identify causes of Dropout in
University Students. Int. J. Sci. Technol. Soc. 2019,7, 11–39.
51.
Patino-Rodriguez, C.E. ADHE: A Tool to Characterize Higher Education Dropout Phenomenon. Rev. Fac. Ing.-Univ. Antioq.
2023, 111, 64–75. [CrossRef]
52.
Gutierrez-Pachas, D.A.; Zanabria, G.G.; Cuadros-Vargas, E.; Chavez, G.C.; Gomez-Nieto, E. Supporting Decision-Making Process
on Higher Education Dropout by Analyzing Academic, Socioeconomic, and Equity Factors through Machine Learning and Survival
Analysis Methods in the Latin American Context. Educ. Sci. 2023,13, 154. [CrossRef]
53.
Vidal, J.; Gilar-Corbi, R.; Pozo-Rico, T.; Castejón, J.L.; Sánchez-Almeida, T. Predictors of University Attrition: Looking for an
Equitable and Sustainable Higher Education. Sustainability 2022,14, 10994. [CrossRef]
54.
Villegas-Ch, W.; Govea, J.; Revelo-Tapia, S. Improving Student Retention in Institutions of Higher Education through Machine
Learning: A Sustainable Approach. Sustainability 2023,15, 14512. [CrossRef]
55.
Gonzalez-Nucamendi, A.; Noguez, J.; Neri, L.; Robledo-Rella, V.; García-Castelán, R.M.G. Predictive analytics study to determine
undergraduate students at risk of dropout. Front. Educ. 2023,8, 1244686. [CrossRef]
56.
Núñez, A.; del Carmen Santiago Díaz, M.; Vázquez, A.C.Z.; Marcial, J.P.; Linarès, G. Early Detection of Students at High Risk of
Academic Failure using Artificial Intelligence. Int. J. Comb. Optim. Probl. Inform. 2024,15, 155–160. [CrossRef]
57.
Su, Z.; Liu, Y.; Zhang, X. Preventing Dropouts and Promoting Student Success: The Role of Predictive Analytics. In Proceedings
of the 2nd International Conference on Computer Application Technology (CCAT 2023), Guiyang, China, 15–17 September 2023;
pp. 282–286. [CrossRef]
58.
Heredia-Jiménez, V.; Jimenez, A.; Ortiz-Rojas, M.; Marín, J.I.; Moreno-Marcos, P.M.; Muñoz-Merino, P.J.; Kloos, C.D. An Early
Warning Dropout Model in Higher Education Degree Programs: A Case Study in Ecuador. In Proceedings of the Workshop on
Adoption, Adaptation and Pilots of Learning Analytics in Under-represented Regions (LAUR 2020), online, 14–15 September
2020; Volume 2704 of CEUR Workshop Proceedings; pp. 58–67.
59.
Núñez-Naranjo, A.F.; Ayala-Chauvin, M.; Riba-Sanmartí, G. Prediction of University Dropout Using Machine Learning. In
Proceedings of the International Conference on Information Technology and Systems (ICITS 2021), Universidad Estatal Península
de Santa Elena, Libertad, Ecuador, 4–6 February 2021; Advances in Intelligent Systems and Computing; Springer: Cham,
Switzerland, 2021; Volume 1330, pp. 396–406. [CrossRef]
60.
Santacoloma, G.D. Predictive Model to Identify College Students with High Dropout Rates. Rev. Electron. Investig. Educ. 2023,
25, 1–10. [CrossRef]
61. Guarda, T. Higher Education Students Dropout Prediction. In Proceedings of the 8th International Conference on Data Mining
and Big Data (DMBD 2023), Sanya, China, 9–12 December 2023; Communications in Computer and Information Science; Springer:
Singapore, 2023; Volume 328, pp. 121–128. [CrossRef]
62.
Khan, M.A.; Khojah, M.; Vivek. Artificial Intelligence and Big Data: The Advent of New Pedagogy in the Adaptive E-Learning
System in the Higher Educational Institutions of Saudi Arabia. Educ. Res. Int. 2022,2022, 1263555. [CrossRef]
63.
Khan, I.M.; Khan, I.M.; Ahmad, A.R.; Jabeur, N.; Mahdi, M.N. An Artificial Intelligence Approach to Monitor Student Performance
and Devise Preventive Measures. Smart Learn. Environ. 2021,8, 17. [CrossRef]
64. Alfahid, A. Algorithmic Prediction of Students On-Time Graduation from the University. TEM J. 2024,13, 692–698. [CrossRef]
65.
Alturki, S.; Alturki, N.M. Using Educational Data Mining to Predict Students’ Academic Performance for Applying Early
Interventions. J. Inf. Technol. Educ. Innov. Pract. 2021,20, 121–137. [CrossRef]
66.
Brdesee, H.; Alsaggaf, W.; Aljohani, N.R.; Hassan, S.U. Predictive Model Using a Machine Learning Approach for Enhancing the
Retention Rate of Students At-Risk. Int. J. Semant. Web Inf. Syst. 2022,18, 1–21. [CrossRef]
67.
Awedh, M.; Mueen, A. Early Identification of Vulnerable Students with Machine Learning Algorithms. WSEAS Trans. Inf. Sci.
Appl. 2025,22, 166–188. [CrossRef]
68.
Priya V, G.; Eliyas, S.; Kumar M, S. Detecting and Predicting Learner’s Dropout Using KNN Algorithm. In Proceedings of the
2024 OPJU International Technology Conference (OTCON) on Smart Computing for Innovation and Advancement in Industry
4.0, Raigarh, India, 5–7 June 2024; pp. 1–6. [CrossRef]
69.
Deb, S.; Sammy, M.S.R.; Tusher, A.N.; Sakib, M.R.S.; Hasan, M.H.; Aunik, A.I. Predicting Student Dropout: A Machine Learning
Approach. In Proceedings of the 15th International Conference on Computing, Communication and Networking Technologies
(ICCCNT 2024), IIT Mandi (Kamand), Himachal Pradesh, India, 24–28 June 2024; pp. 1–7. [CrossRef]
70.
Revathy, M.; Kamalakkannan, S. Collaborative learning for improving intellectual skills of dropout students using datamining
techniques. In Proceedings of the International Conference on Artificial Intelligence, Coimbatore, India, 25–27 March 2021.
[CrossRef]
71.
Gupta, K.; Gupta, K.; Dwivedi, P.; Chaudhry, M. Binary Classification of Students’ Dropout Behaviour in Universities using
Machine Learning Algorithms. In Proceedings of the Conference, New Delhi, India, 28 February–1 March 2024; pp. 709–714.
[CrossRef]
72.
Sharma, M.; Yadav, M.L. Predicting Students’ Drop-Out Rate Using Machine Learning Models: A Comparative Study. In
Proceedings of the International Conference on Intelligent Computing, Instrumentation and Control Technologies (ICICICT),
Kannur, India, 11–12 August 2022; pp. 1166–1171. [CrossRef]
73.
Patel, D.D.; Savaj, K.; Malani, P.; Patel, J.; Trivedi, H. Unlocking Enigmatic Pathways: Empowering Student Dropout Analysis
with Machine Learning and Energizing Holistic Investigation. In Proceedings of the 2024 IEEE 9th International Conference for
Convergence in Technology (I2CT), Pune, India, 5–7 April 2024. [CrossRef]
74.
Dongre, P.G.G. Predicting Student Dropout Rates in Higher Education: A Comparative Study of Machine Learning Algorithms.
Indian Sci. J. Res. Eng. Manag. 2024. [CrossRef]
75.
Cheng, J.; Yang, Z.Q.; Cao, J.; Yang, Y.; Poon, K.C.F.; Lai, D. Modeling Behavior Change for Multi-model At-Risk Students Early
Prediction. In Proceedings of the 2024 International Symposium on Educational Technology (ISET), Macau, Macao, 29 July–1
August 2024; pp. 54–58. [CrossRef]
76.
Cheng, Y.H. Improving Students’ Academic Performance with AI and Semantic Technologies. arXiv 2022, arXiv:2206.03213.
[CrossRef]
77.
Bagunaid, W.; Chilamkurti, N.; Shahraki, A.S.; Bamashmos, S. Visual Data and Pattern Analysis for Smart Education: A Robust
DRL-Based Early Warning System for Student Performance Prediction. Future Internet 2024,16, 206. [CrossRef]
78.
Tsai, S.C.; Chen, C.H.; Shiao, Y.T.; Ciou, J.S.; Wu, T.N. Precision education with statistical learning and deep learning: A case
study in Taiwan. Int. J. Educ. Technol. High. Educ. 2020,17, 12. [CrossRef]
79.
Yaacob, W.F.W.; Sobri, N.M.; Nasir, S.A.M.; Norshahidi, N.D.; Husin, W.Z.W. Predicting Student Drop-Out in Higher Institution
Using Data Mining Techniques. J. Phys. Conf. Ser. 2020,1496, 012005. [CrossRef]
80. Suaprae, P.; Nilsook, P.; Wannapiroon, P. System Framework of Intelligent Consulting Systems with Intellectual Technology. In
Proceedings of the 9th International Conference on Computer and Communications Management (ICCCM ’21), Singapore, 16–18
July 2021; pp. 31–36. [CrossRef]
81.
Kondo, N.; Okubo, M.; Hatanaka, T. Early Detection of At-Risk Students Using Machine Learning Based on LMS Log Data. In
Proceedings of the International Conference on Advanced Applied Informatics, Hamamatsu, Japan, 9–13 July 2017; pp. 198–201.
[CrossRef]
82.
Sulak, S.; Koklu, N. Predicting Student Dropout Using Machine Learning Algorithms. Intell. Methods Eng. Sci. 2025,3, 91–98.
[CrossRef]
83.
Shin, S. Datafication of Education and Machine Learning Techniques in Education Research: A Critical Review. Gyoyug Yeon’Gu
2024,2, 215–240. [CrossRef]
84.
Rosdiana, R.; Sunandar, E.; Purnama, A.; Arribathi, A.H.; Yusuf, D.A.; Daeli, O.P.M. Strategies and Consequences of AI-Enhanced
Predictive Models for Early Identification of Students at Risk. In Proceedings of the 2024 3rd International Conference on Creative
Communication and Innovative Technology (ICCIT), Tangerang, Indonesia, 7–8 August 2024; pp. 1–6. [CrossRef]
85.
Sridevi, K.; Ranjani, A.P.; Ahmad, S.S. Recognizing Students At-Danger with Early Intervention Using Machine Learning
Techniques. J. Adv. Zool. 2024,45, 481. [CrossRef]
86.
Prajwal, P.; Sahana, L.R.; Kanchana, V. Forecasting Student Attrition Using Machine Learning. In Proceedings of the 4th Asian
Conference on Innovation in Technology (ASIANCON 2024), Pimpri Chinchwad College of Engineering and Research (PCCOER),
Pune, India, 23–25 August 2024; pp. 1–7. [CrossRef]
87.
Prasanth, A.; Alqahtani, H. Predictive Modeling of Student Behavior for Early Dropout Detection in Universities using Machine
Learning Techniques. In Proceedings of the International Conference on Emerging Technologies and Applications, Bahrain,
Bahrain, 25–27 October 2023; pp. 1–5. [CrossRef]
88.
Al-Tameemi, G.; Xue, J.; Ajit, S.; Kanakis, T.; Hadi, I. Predictive Learning Analytics in Higher Education: Factors, Methods
and Challenges. In Proceedings of the International Conference on Advances in Computing and Communication Engineering,
Las Vegas, NV, USA, 22–24 June 2020; pp. 1–9. [CrossRef]
89.
Patel, K.; Amin, K. Predictive modeling of dropout in MOOCs using machine learning techniques. Sci. Temper 2024,15, 2199–2206.
[CrossRef]
90.
Tahiru, F. Predicting At-Risk Students in a Higher Educational Institution in Ghana for Early Intervention Using Machine
Learning. Ph.D. Thesis, Durban University of Technology, Durban, South Africa, 2023. [CrossRef]
91.
Mduma, N.; Kalegele, K.; Machuve, D. Machine learning approach for reducing students dropout rates. Int. J. Adv. Comput. Res.
2019,9, 156–169. [CrossRef]
92.
Mnyawami, Y.N.; Maziku, H.; Mushi, J.C. Enhanced Model for Predicting Student Dropouts in Developing Countries Using
Automated Machine Learning Approach: A Case of Tanzanian’s Secondary Schools. Appl. Artif. Intell. 2022,36, 2071406.
[CrossRef]
93.
Sani, G.; Oladipo, F.; Ogbuju, E.; Agbo, F.J. Development of a Predictive Model of Student Attrition Rate. J. Appl. Artif. Intell.
2022,3, 1–12. [CrossRef]
94.
Fauszt, T.; Erdélyi, K.; Dobák, D.; Bognár, L.; Kovács, E. Design of a Machine Learning Model to Predict Student Attrition. Int. J.
Emerg. Technol. Learn. 2023,18, 184. [CrossRef]
95.
Nagy, M. Interpretable Dropout Prediction: Towards XAI-Based Personalized Intervention. Int. J. Artif. Intell. Educ. 2023,34,
274–300. [CrossRef]
96.
Berens, J.; Schneider, K.; Görtz, S.; Oster, S.; Burghoff, J. Early Detection of Students at Risk—Predicting Student Dropouts Using
Administrative Student Data from German Universities and Machine Learning Methods. In Proceedings of the Educational Data
Mining Conference, Montréal, QC, Canada, 2–5 July 2019; Volume 11, pp. 1–41. [CrossRef]
97.
Glandorf, D.; Lee, H.R.; Orona, G.A.; Pumptow, M.; Yu, R.; Fischer, C. Temporal and Between-Group Variability in College
Dropout Prediction. In Proceedings of the 14th Learning Analytics and Knowledge Conference, Kyoto, Japan, 18–22 March 2024.
[CrossRef]
98.
Hammoodi, M.S.; Al-Azawei, A.H.S. Using Socio-Demographic Information in Predicting Students’ Degree Completion based
on a Dynamic Model. Int. J. Intell. Eng. Syst. 2022,15, 107–115. [CrossRef]
99.
Perchinunno, P.; Bilancia, M.; Vitale, D. A Statistical Analysis of Factors Affecting Higher Education Dropouts. Soc. Indic. Res.
2021,156, 341–362. [CrossRef]
100.
Vaarma, M.; Li, H. Predicting student dropouts with machine learning: An empirical study in Finnish higher education. Technol.
Soc. 2024,76, 102474. [CrossRef]
101.
Ortiz-Lozano, J.M.; Aparicio-Chueca, P.; Triadó-Ivern, X.; Arroyo-Barrigüete, J.L. Early dropout predictors in social sciences and
management degree students. Stud. High. Educ. 2023,49, 1303–1316. [CrossRef]
102.
Delogu, M.; Lagravinese, R.; Paolini, D.; Resce, G. Predicting dropout from higher education: Evidence from Italy. Econ. Model.
2024,130, 106583. [CrossRef]
103.
Simón, E.J.L.; Puerta, J.G. Prediction of early dropout in higher education using the SCPQ. Cogent Psychol. 2022,9, 2123588.
[CrossRef]
104.
Fernández-García, A.J.; Preciado, J.C.; Melchor, F.; Rodríguez-Echeverría, R.; Conejero, J.M.; Sánchez-Figueroa, F. A Real-Life
Machine Learning Experience for Predicting University Dropout at Different Stages Using Academic Data. IEEE Access 2021,
9, 133076–133090. [CrossRef]
105.
Blanquet, L.; Grilo, J.; Strecht, P.; Camanho, A. Curbing Dropout: Predictive Analytics at the University of Porto. In Proceedings
of the 23rd Conference of the Portuguese Association for Information Systems (CAPSI 2023), Porto, Portugal, 19–21 October 2023;
p. 14. Available online: https://aisel.aisnet.org/capsi2023/14 (accessed on 15 April 2025).
106.
Realinho, V.; Machado, J.A.D.; Baptista, L.M.T.; Martins, M. Predicting Student Dropout and Academic Success. Data 2022,7, 146.
[CrossRef]
107.
Martins, M.V.; Baptista, L.; Machado, J.; Realinho, V. Multi-class phased prediction of academic performance and dropout in
higher education. Appl. Sci. 2023,13, 4702. [CrossRef]
108. Bassetti, E.; Conti, A.; Panizzi, E.; Tolomei, G. ISIDE: Proactively Assist University Students at Risk of Dropout. In Proceedings
of the 2022 IEEE International Conference on Big Data (BigData 2022), Osaka, Japan, 17–20 December 2022; pp. 1776–1783.
[CrossRef]
109.
Segura, M.; Mello, J.; Hernández, A. Machine Learning Prediction of University Student Dropout: Does Preference Play a Key
Role? Mathematics 2022,10, 3359. [CrossRef]
110.
Agrusti, F.; Bonavolonta, G.; Mezzini, M. Use of Artificial Intelligence to Predict University Dropout: A Quantitative Research. In
Proceedings of the EDEN Conference Proceedings, Virtual, 21–24 June 2020; pp. 245–254. [CrossRef]
111.
Plak, S.; Cornelisz, I.; Meeter, M.; van Klaveren, C. Early warning systems for more effective student counselling in higher
education: Evidence from a Dutch field experiment. High. Educ. Q. 2021,76, 131–152. [CrossRef]
112.
Roda-Segarra, J.; de-la Peña, C.; Mengual-Andrés, S. Effectiveness of Artificial Intelligence Models for Predicting School Dropout:
A Meta-Analysis. REMIE Multidiscip. J. Educ. Res. 2024,14, 317–340. [CrossRef]
113. Eegdeman, I.; Cornelisz, I.; Meeter, M.; van Klaveren, C. Identifying false positives when targeting students at risk of dropping
out. Educ. Econ. 2023,31, 313–325. [CrossRef]
114.
Shafiq, D.A.; Marjani, M.; Habeeb, R.A.A.; Asirvatham, D. Student Retention Using Educational Data Mining and Predictive
Analytics: A Systematic Literature Review. IEEE Access 2022,10, 72480–72503. [CrossRef]
115.
Sonderlund, A.L.; Hughes, E.; Smith, J.R. The efficacy of learning analytics interventions in higher education: A systematic
review. Br. J. Educ. Technol. 2019,50, 2594–2618. [CrossRef]
116.
Ennibras, F.; Aoula, E.S.; Bouihi, B. AI in Preventing Dropout in Distance Higher Education: A Systematic Literature Review. In
Proceedings of the 2024 4th International Conference on Innovative Research in Applied Science, Engineering and Technology
(IRASET), Fez, Morocco, 16–17 May 2024. [CrossRef]
117.
Mukooyo, H.; Kasse, J.P. Towards Sustainable Education: A Machine Learning Model for Early Student Dropout Prediction in
Higher Education Institutions. Uganda High. Educ. Rev. J. 2024,11, 57–68. [CrossRef]
118.
Dake, D.K.; Buabeng-Andoh, C. Using Machine Learning Techniques to Predict Learner Drop-out Rate in Higher Educational
Institutions. Mob. Inf. Syst. 2022,2022, 2670562. [CrossRef]
119.
Aracelli, G.S.; Bertha, S.N.; Angela, P.C. Monitoring and Academic Accompaniment of University Students Using Emerging
Technologies. Int. J. Relig. 2024,5, 2926–2937. [CrossRef]
120.
Olivares-Rodríguez, C.; Moreno-Marcos, P.M.; Scheihing, E.; Muñoz-Merino, P.J.; Kloos, C.D. An Actionable Learning Path-based
Model to Predict and Describe Academic Dropout. Rev. Ing. Investig. 2024,44, e109389. [CrossRef]
121. Witten, I.H.; Frank, E.; Hall, M. Data Mining: Practical Machine Learning Tools and Techniques; Morgan Kaufmann Publishers: San
Francisco, CA, USA, 1999.
122.
Shin, N.; Kim, J. An Exploration of Learner Progress and Drop-Out in Korea National Open University. Distance Educ. 1999,
20, 81–95. [CrossRef]
123. Dietterich, T.G. Machine learning. ACM Comput. Surv. 1996,28, 3-es. [CrossRef]
124.
Kotsiantis, S.B.; Pierrakeas, C.J.; Pintelas, P.E. Preventing Student Dropout in Distance Learning Using Machine Learning
Techniques. In Proceedings of the 7th International Conference on Knowledge-Based and Intelligent Information and Engineering
Systems (KES 2003), Oxford, UK, 3–5 September 2003; Lecture Notes in Computer Science; Springer: Berlin/Heidelberg, Germany,
2003; Volume 2774, pp. 267–274. [CrossRef]
125.
Lodge, J.M. What if Student Attrition Was Treated Like an Illness? An Epidemiological Model for Learning Analytics. In
Proceedings of the ASCILITE 2011: Changing Demands, Changing Directions, Hobart, TAS, Australia, 4–7 December 2011;
pp. 822–825.
126.
Kamal, M.S.; Chowdhury, L.; Nimmy, S.F. New Dropout Prediction for Intelligent System. Int. J. Comput. Appl. 2012,42, 26–31.
[CrossRef]
127. Parker, A. A Study of Variables that Predict Dropout from Distance Education. J. Educ. Technol. 1999,1, 1–10.
128.
Niemi, D.; Gitin, E. Using Big Data to Predict Student Dropouts: Technology Affordances for Research. In Proceedings of the
Cognition and Exploratory Learning in Digital Age, Madrid, Spain, 19–21 October 2012; Volume 2012.
129.
Bukralia, R.; Deokar, A.V.; Sarnikar, S.; Hawkes, M. Using Machine Learning Techniques in Student Dropout Prediction. In Cases
on Institutional Research Systems; IGI Global: Hershey, PA, USA, 2012; pp. 117–131. [CrossRef]
130.
Hall, O.P. Detecting students at risk using machine learning: Applications to business education. Int. J. Soc. Media Interact. Learn.
Environ. 2022,6, 267. [CrossRef]
131.
Nguyen, T.H.; Le, P.; Nguyen, T.T.T.; Su, A.K. A multivariate analysis of the early dropout using classical machine learning and
local interpretable model-agnostic explanations. CTU J. Innov. Sustain. Dev. 2024,16, 98–106. [CrossRef]
132.
Lee, S.; Chung, J.Y. The Machine Learning-Based Dropout Early Warning System for Improving the Performance of Dropout
Prediction. Appl. Sci. 2019,9, 3093. [CrossRef]
133.
Fields, J.; Chovanec, K.; Madiraju, P. Integrating categorical and continuous data in a cluster-then-classify methodology for
predicting undergraduate student success. In Proceedings of the IEEE International Conference on Big Data, Washington, DC,
USA, 15–18 December 2024; pp. 8118–8126. [CrossRef]
134.
Huang, Q.; Nathawitharana, N.; Ong, K.L.; Keller, S.; Alahakoon, D. Mind the Gap: From Analytics to Action in Student
Retention. In Advances in Business Strategy and Competitive Advantage: Applying Business Intelligence Initiatives in Healthcare and
Organizational Settings; Nam, K., Ed.; IGI Global: Hershey, PA, USA, 2019; pp. 218–236. [CrossRef]
135.
Gong, L.; Liu, Y. Design and Application of Intervention Model based on Learning Analytics under Blended Learning Environment.
In Proceedings of the 2019 7th International Conference on Information and Education Technology, Aizu-Wakamatsu,
Japan, 29–31 March 2019; pp. 225–229. [CrossRef]
136.
Heredia-Jiménez, V.; Jiménez-Macías, A.; Ortiz-Rojas, M. Using Design-Based Research for an Academic Dropout and Retention
Dashboard. In Proceedings of the International Conference on eDemocracy & eGovernment, Quito, Ecuador, 3–5 April 2023;
pp. 1–5. [CrossRef]
137.
Cohausz, L. When probabilities are not enough-A framework for causal explanations of student success models. J. Educ. Data
Min. 2022,14, 52–75.
138.
Karypis, G. Improving Higher Education: Learning Analytics & Recommender Systems Research. In Proceedings of the Eleventh
ACM Conference on Recommender Systems (RecSys ’17), Villa Erba, Cernobbio, Italy, 27–31 August 2017; p. 2. [CrossRef]
139.
Kiss, V.; Maldonado, E.; Segall, M. The use of semester course data for machine learning prediction of college dropout rates. J.
High. Educ. Theory Pract. 2022,22, 64–74.
140.
Paul, D.B. Machine Learning Model for Student Drop-Out Prediction Based on Student Engagement. In Proceedings of the 6th
International Conference “New Technologies, Development and Applications” (NT 2023), Sarajevo, Bosnia and Herzegovina, 22–24 June
2023; Lecture Notes in Networks and Systems; Springer: Cham, Switzerland, 2023; Volume 687, pp. 486–496. [CrossRef]
141.
Gkontzis, A.F.; Kotsiantis, S.; Panagiotakopoulos, C.T.; Verykios, V.S. A predictive analytics framework as a countermeasure for
attrition of students. Interact. Learn. Environ. 2019,30, 1028–1043. [CrossRef]
142.
Gray, C.C.; Perkins, D. Utilizing Early Engagement and Machine Learning to Predict Student Outcomes. Comput. Educ. 2019,
131, 22–32. [CrossRef]
143.
Cam, H.N.T.; Sarlan, A.; Arshad, N.I. A hybrid model integrating recurrent neural networks and the semi-supervised support
vector machine for identification of early student dropout risk. PeerJ 2024,10, e2572. [CrossRef]
144.
Wagner, K.; Merceron, A.; Sauer, P.; Pinkwart, N. Which Approach Best Predicts Dropouts in Higher Education? In Proceedings
of the 15th International Conference on Computer Supported Education (CSEDU ’23), Prague, Czech Republic, 21–23 April 2023;
pp. 15–26. [CrossRef]
145.
Mosia, M.S. Identifying At-Risk Students for Early Intervention—A Probabilistic Machine Learning Approach. Appl. Sci. 2023,
13, 3869. [CrossRef]
146.
Adnan, M.; Habib, A.; Ashraf, J.; Mussadiq, S.; Raza, A.A.; Abid, M.; Bashir, M.; Khan, S.U. Predicting at-Risk Students at
Different Percentages of Course Length for Early Intervention Using Machine Learning Models. IEEE Access 2021,9, 7519–7539.
[CrossRef]
147.
Demartini, C.G.; Sciascia, L.; Bosso, A.; Manuri, F. Artificial Intelligence Bringing Improvements to Adaptive Learning in
Education: A Case Study. Sustainability 2024,16, 1347. [CrossRef]
148.
López-García, L.; Lino-Ramírez, C.; Zamudio-Rodríguez, V.M.; Valle-Hernández, J.D. Predictive model for the analysis of
academic performance and preventing student dropout using machine learning techniques. Rev. Educ. Tec. 2022,6, 1–5.
[CrossRef]
149.
Alalawi, K.; Athauda, R.; Chiong, R. An Extended Learning Analytics Framework Integrating Machine Learning and Pedagogical
Approaches for Student Performance Prediction and Intervention. Int. J. Artif. Intell. Educ. 2024. [CrossRef]
150.
Balcıoğlu, Y.S.; Artar, M. Predicting academic performance of students with machine learning. Inf. Dev. 2023, 39, 390–398.
[CrossRef]
151.
Guanin-Fajardo, J.H.; Guaña-Moya, J.; Casillas, J. Predicting Academic Success of College Students Using Machine Learning
Techniques. Data 2024,9, 60. [CrossRef]
152.
Ujkani, B.; Minkovska, D.; Hinov, N. Course Success Prediction and Early Identification of At-Risk Students Using Explainable
Artificial Intelligence. Electronics 2024,13, 4157. [CrossRef]
153.
Embarak, O.; Hawarna, S.A. Automated AI-driven System for Early Detection of At-risk Students. Procedia Comput. Sci. 2023,
231, 151–160. [CrossRef]
154.
Vijaya, P.; Rajendran, R.; Kumar, B.; Mani, J. Early Prediction of At-Risk Students in Higher Education Institutions Using Adaptive
Dwarf Mongoose Optimization Enabled Deep Learning. In Computational Sciences and Sustainable Technologies: First International
Conference, ICCSST 2023, Bangalore, India, 8–9 May 2023, Revised Selected Papers; Immanuel, A., Chandra, J., Aurelia, S., Mani, J.,
Padmanabha, V., Eds.; Communications in Computer and Information Science; Springer: Cham, Switzerland, 2024; Volume 1973,
pp. 15–29. [CrossRef]
155.
Alblawi, A.S. Big Data and Learning Analytics in Higher Education: Demystifying Variety, Acquisition, Storage, NLP and
Analytics. arXiv 2018, arXiv:1801.06052.
156.
Durrani, U.; Akpınar, M.; Togher, M.; Malik, A.; Đorđević, M.; Aoudi, S. Assessing the Effectiveness of Large Language Models in
Predicting Student Dropout Rates. In Proceedings of the 2024 International Conference on Artificial Intelligence, Metaverse and
Cybersecurity (ICAMAC), Dubai, United Arab Emirates, 25–26 October 2024; pp. 1–6. [CrossRef]
157.
Masood, S.W.; Begum, S.A. Data Collection and Pre-processing for Machine Learning-Based Student Dropout Prediction. In
Big Data, Machine Learning, and Applications; Dutta Borah, M., Laiphrakpam, D.S., Auluck, N., Balas, V.E., Eds.; Lecture Notes in
Electrical Engineering; Springer Nature Singapore: Singapore, 2024; Volume 1053, pp. 355–367. [CrossRef]
158.
Jaiswal, G.; Sharma, A.; Yadav, S. Analytical Approach for Predicting Dropouts in Higher Education. Int. J. Inf. Commun. Technol.
Educ. 2019,15, 89–102. [CrossRef]
159.
Kande, J. Predictive Modeling for Early Detection of High School Dropouts Using Machine Learning Techniques. J. Data Anal.
Artif. Intell. 2024,3, 99–105. [CrossRef]
160.
Goran, R.; Jovanovic, L.; Bačanin, N.; Stanković, M.S.; Šimić, V.; Antonijević, M.; Živković, M. Identifying and understanding
student dropouts using metaheuristic optimized classifiers and explainable artificial intelligence techniques. IEEE Access 2024,12,
122377–122400. [CrossRef]
161.
Oqaidi, K.; Aouhassi, S.; Mansouri, K. Towards a Students’ Dropout Prediction Model in Higher Education Institutions Using
Machine Learning Algorithms. Int. J. Emerg. Technol. Learn. (IJET) 2022,17, 103–117. [CrossRef]
162.
Keane, M.; Gray, G. An Investigation into the pre-enrolment characteristics of students to identify factors predictive of academic
performance within first year computing and engineering programmes of study in a Higher Educational Institution. All Irel. J.
High. Educ. 2019,11, 1–20.
163.
de Jesus, C.G.; Ledda, M.K.C. Intervention Support Program for Students at Risk of Dropping Out Using Fuzzy Logic-Based
Prescriptive Analytics. In Proceedings of the International Colloquium on Signal Processing and Its Applications, Langkawi,
Malaysia, 5–6 March 2021; pp. 144–149. [CrossRef]
164.
Ismaili, B.; Besimi, A. A Data Warehousing Framework for Predictive Analytics in Higher Education: A Focus on Student at-Risk
Identification. SEEU Rev. 2024,19, 43–57. [CrossRef]
165.
Sosa, D.; Torres, H.; Naranjo, M.; Matango, R. Modelo de Inteligencia de Negocios para el análisis de la deserción estudiantil en
las Instituciones de Educación Superior. Mem. Sucre Rev. 2022,2, 71–86.
166.
Maldonado, S.; Miranda, J.; Olaya, D.; Vásquez, J.; Verbeke, W. Redefining Profit Metrics for boosting Student Retention in Higher
Education. Decis. Support Syst. 2021,143, 113493. [CrossRef]
167.
Seidel, E.; Kutieleh, S. Using predictive analytics to target and improve first year student attrition. Aust. J. Educ. 2017,61, 200–218.
[CrossRef]
168.
Karabacak, E.S.; Yaslan, Y. Comparison of Machine Learning Methods for Early Detection of Student Dropouts. In Proceedings of
the 8th International Conference on Computer Science and Engineering (UBMK 2023), Mehmet Akif Ersoy University, Burdur,
Turkey, 13–15 September 2023; pp. 376–381. [CrossRef]
169.
Delen, D. Predicting Student Attrition with Data Mining Methods. J. Coll. Stud. Retent. Res. Theory Pract. 2011,13, 17–35.
[CrossRef]
170.
Padmasiri, P.; Kasthuriarachchi, S. Interpretable Prediction of Student Dropout Using Explainable AI Models. In Proceedings of
the 2024 International Research Conference on Smart Computing and Systems Engineering (SCSE), Colombo, Sri Lanka, 4 April
2024. [CrossRef]
171.
Ghosh, P.; Charit, A.; Banerjee, H.; Bandhu, D.; Ghosh, A.; Pal, A.; Goto, T.; Sen, S. A Machine Learning Based Automated Model
for Managing Student Dropout. In Proceedings of the 2024 IEEE/ACIS 22nd International Conference on Software Engineering
Research, Management and Applications (SERA), Honolulu, HI, USA, 30 May–1 June 2024; pp. 261–266. [CrossRef]
172.
Santana, M.A.; Costa, E.; dos Santos Neto, B.F.; Silva, I.C.L.; Rego, J.B.A. A predictive model for identifying students with dropout
profiles in online courses. In Proceedings of the EDM (Workshops), Madrid, Spain, 26–29 June 2015.
173.
Jiménez-Gutiérrez, A.L.; Mota-Hernandez, C.; Mezura-Montes, E.; Alvarado-Corona, R. Application of the performance of
machine learning techniques as support in the prediction of school dropout. Sci. Rep. 2024, 14, 3957. [CrossRef]
174.
Noviandy, T.R.; Zahriah, Z.; Yandri, E.; Jalil, Z.; Yusuf, M.; Yusof, N.; Lala, A.; Idroes, R. Machine Learning for Early Detection of
Dropout Risks and Academic Excellence: A Stacked Classifier Approach. J. Educ. Manag. Learn. 2024, 2, 28–34. [CrossRef]
175.
Kowarsch, D. Elevating Student Success: Harnessing Machine Learning to Enhance University Completion Rates. In Proceedings
of Intelligent Human Systems Integration (IHSI 2025): Integrating People and Intelligent Systems, New York, NY, USA, 26–30 July 2025;
AHFE Open Access: New York, NY, USA, 2025. [CrossRef]
176.
Dwivedi, D.N.; Mahanty, G.; Khashouf, S. Predictive Analytics for Reducing University Dropout Rates. In Contemporary
Approaches to Activity Theory: Interdisciplinary Perspectives on Human Behavior; Hansson, T., Ed.; Advances in Human and Social
Aspects of Technology; IGI Global: Hershey, PA, USA, 2024; pp. 186–202. [CrossRef]
177.
Shiao, Y.T.; Chen, C.H.; Wu, K.F.; Chen, B.L.; Chou, Y.H.; Wu, T.N. Reducing dropout rate through a deep learning model for
sustainable education: Long-term tracking of learning outcomes of an undergraduate cohort from 2018 to 2021. Smart Learn.
Environ. 2023, 10, 55. [CrossRef]
178.
Zhang, L.; Rangwala, H. Early Identification of At-Risk Students Using Iterative Logistic Regression. In Proceedings of the 19th
International Conference on Artificial Intelligence in Education (AIED 2018), London, UK, 27–30 June 2018; Lecture Notes in
Computer Science; Springer: Cham, Switzerland, 2018; Volume 10947, pp. 613–626. [CrossRef]
179.
Lee, S.; Jun, Y.K. Student Dropout Prediction for University with High Precision and Recall. Appl. Sci. 2023, 13, 6275. [CrossRef]
180.
Bonifro, F.D.; Gabbrielli, M.; Lisanti, G.; Zingaro, S.P. Student Dropout Prediction. In Artificial Intelligence in Education; Springer:
Cham, Switzerland, 2020; Volume 12163, pp. 129–140. [CrossRef]
181.
Baneres, D.; Rodríguez, M.E.; Guerrero-Roldán, A.E.; Karadeniz, A. An Early Warning System to Detect At-Risk Students in
Online Higher Education. Appl. Sci. 2020, 10, 4427. [CrossRef]
182.
Lainjo, B. Mitigating Academic Institution Dropout Rates with Predictive Analytics Algorithms. Int. J. Educ. Teach. Soc. Sci. 2023,
3, 29–49. [CrossRef]
183.
Liu, K.; Tatinati, S.; Khong, A.W.H. A Weighted Feature Extraction Technique Based on Temporal Accumulation of Learner
Behavior Features for Early Prediction of Dropouts. In Proceedings of the IEEE International Conference on Teaching, Assessment,
and Learning for Engineering, Takamatsu, Japan, 8–11 December 2020; pp. 295–302. [CrossRef]
184.
Esquivel, J.A. Towards Predicting Student’s Dropout in Higher Education Using Supervised Machine Learning Techniques. In
Proceedings of the International Conference on Industrial Engineering and Operations Management, Lisbon, Portugal, 18–20
July 2023. [CrossRef]
185.
Albugami, S.; Almagrabi, H.; Wali, A. From Data to Decision: Machine Learning and Explainable AI in Student Dropout
Prediction. J. e-Learn. High. Educ. 2024, 2024, 246301. [CrossRef]
186.
Jiménez, A.F.F.; Lee, V.S.S.; Belmonte, I.I.; González, E.F.P. Application of Data Science and Machine Learning in the Prediction of
College Dropout: A Data-Driven Predictive Approach. In Proceedings of the 12th International Conference on Software Process
Improvement (CIMPS 2023), Cuernavaca, Mexico, 18–20 October 2023; pp. 234–243. [CrossRef]
187.
Sabbir, W.; Abdullah-Al-Kafi, M.; Afridi, A.S.; Rahman, M.S.; Karmakar, M. Improving Predictive Analytics for Student Dropout:
A Comprehensive Analysis and Model Evaluation. In Proceedings of the Conference, New Delhi, India, 28 February–1 March
2024; pp. 951–956. [CrossRef]
188.
Cardona, T.; Cudney, E.A.; Hoerl, R.; Snyder, J. Data Mining and Machine Learning Retention Models in Higher Education. J.
Coll. Stud. Retent. Res. Theory Pract. 2023, 25, 51–75. [CrossRef]
189.
Embarak, O. Apply Machine Learning Algorithms to Predict At-Risk Students to Admission Period. In Proceedings of the 2020
7th International Conference on Information Technology Trends (ITT 2020), Virtual, Abu Dhabi, United Arab Emirates, 25–26
November 2020; pp. 190–195. [CrossRef]
190.
Porras, J.M.; Lara, J.A.; Romero, C.; Ventura, S. A Case-Study Comparison of Machine Learning Approaches for Predicting
Student’s Dropout from Multiple Online Educational Entities. Algorithms 2023, 16, 554. [CrossRef]
191.
Manthon, A.; Piriyasurawong, P. The Causal Factors Affecting the Management of Predictive Student Relationship using Business
Intelligence Concept for the Retention of Undergraduate Students. High. Educ. Stud. 2024, 15, 106. [CrossRef]
192.
Ujkani, B.; Minkovska, D.; Stoyanova, L. Application of Logistic Regression Technique for Predicting Student Dropout. In
Proceedings of the XXXI International Scientific Conference Electronics (ET 2022), Sozopol, Bulgaria, 13–15 September 2022;
pp. 1–4. [CrossRef]
193. Patil, O.S.; Dhere, P.M. Predicting Dropout Students Using Data Mining Techniques. Int. J. Res. 2015, 2, 369–375.
194.
Perez, B.; Castellanos, C.; Correal, D. Applying Data Mining Techniques to Predict Student Dropout: A Case Study. In Proceedings
of the Conference on Latin American Computing, Medellin, Colombia, 16–18 May 2018; pp. 1–6. [CrossRef]
195.
Hafzan, M.Y.N.N.; Safaai, D.; Asiah, M.; Saberi, M.M.; Syuhaida, S.S. Review on Predictive Modelling Techniques for Identifying
Students at Risk in University Environment. MATEC Web Conf. 2019, 255, 03002. [CrossRef]
196.
Lourens, A.; Bleazard, D. Applying predictive analytics in identifying students at risk: A case study. S. Afr. J. High. Educ. 2016,
30, 129–142. [CrossRef]
197.
Shynarbek, N.; Saparzhanov, Y.; Saduakassova, A.; Orynbassar, A.; Sagyndyk, N. Forecasting Dropout In University Based On
Students’ Background Profile Data Through Automated Machine Learning Approach. In Proceedings of the IEEE International
Conference on Smart Information Systems and Technologies (SIST), Nur-Sultan, Kazakhstan, 28–30 April 2022; pp. 1–5. [CrossRef]
198.
Osemwegie, E.E.; Amadin, F.; Uduehi, O.M. Student dropout prediction using machine learning. Fudma J. Sci. 2023, 7, 347–353.
[CrossRef]
199.
Duan, Y.; Cao, G.; Ong, V.K.; Woolley, M. Intelligent Student Engagement Management: Applying Business Intelligence in Higher
Education. In Proceedings of the International Conference on Information and Social Science (ISS 2013), Nagoya, Japan, 24–26
September 2013.
200. Davis, M.C.; Burgher, K.E. Predictive Analytics for Student Retention: Group vs. Individual Behavior. Coll. Univ. 2013, 88, 63.
201.
Wang, X.; Schneider, H.; Walsh, K.R. A Predictive Analytics Approach to Building a Decision Support System for Improving
Graduation Rates at a Four-Year College. J. Organ. End User Comput. 2020, 32, 43–62. [CrossRef]
202.
Alejandro Becerra, J.P.; Aguirre Ochoa, M.; Romero Torres, M.E.; Estrella Ríos, L. La Inteligencia de negocios con Power
Pivot usado en el Instituto Superior Tecnológico Huaquillas: Business intelligence with Power Pivot used at Instituto Superior
Tecnológico Huaquillas. Cumbres 2021, 7, 71–81. [CrossRef]
203.
Almonteros, J.R.; Matias, J.B.; Pitao, J.V.S. Forecasting Students’ Success To Graduate Using Predictive Analytics. Int. J. Comput.
Digit. Syst. 2024, 15, 697–711. [CrossRef] [PubMed]
204.
Sequeira, N.; Reis, A.; Branco, F.; Alves, P. Roadmap for Implementing Business Intelligence Systems in Higher Education
Institutions: Exploratory Work. In Proceedings of the 20th International Conference on Smart Business Technologies (ICSBT
2023), Rome, Italy, 11–13 July 2023; pp. 162–169. [CrossRef]
205.
Menolli, A.; Horita, F.; Dias, J.J.L.; Coelho, R. BI–based methodology for analyzing higher education: A case study of dropout
phenomenon in information systems courses. In Proceedings of the XVI Brazilian Symposium on Information Systems, São
Bernardo do Campo, Brazil, 3–6 November 2020; pp. 1–8.
206.
Sánchez Arevalo, M.L.; Abuchar Porras, A.; Gutiérrez Herrera, J.D.; Ferro Escobar, R. Proposal for the Implementation of a
Business Intelligence Tool to Detect Cases of Student Desertion at the Francisco José de Caldas District University. In Proceedings
of the AETA 2019–Recent Advances in Electrical Engineering and Related Sciences: Theory and Application (AETA 2019),
Bogotá, Colombia, 6–8 November 2019; Lecture Notes in Electrical Engineering; Springer: Cham, Switzerland, 2021; Volume 685,
pp. 482–489. [CrossRef]
207.
Villegas-Ch, W.; Palacios-Pacheco, X.; Luján-Mora, S. A business intelligence framework for analyzing educational data.
Sustainability 2020, 12, 5745. [CrossRef]
208.
Psyridou, M.; Prezja, F.; Torppa, M.; Lerkkanen, M.; Poikkeus, A.; Vasalampi, K. Machine learning predicts upper secondary
education dropout as early as the end of primary school. Sci. Rep. 2024, 14, 12956. [CrossRef]
209.
Ferdousee, Z.; Haque, M.R. Integrating Feature Selection and Extraction with Tuned Multilayer Perceptrons for Predicting Student
Dropout Risk in Higher Education. In Proceedings of the 2024 2nd International Conference on Information and Communication
Technology (ICICT), Dhaka, Bangladesh, 21–22 October 2024; pp. 11–15. [CrossRef]
210.
Weagant, R.; Zhao, Z.; Bradley, A.; Collins, C. AdVizor: Using Visual Explanations to Guide Data-Driven Student Advising. In
Proceedings of the 2024 IEEE VIS Workshop on Visualization Education, Literacy, and Activities (EduVIS), St. Pete Beach, FL,
USA, 13–14 October 2024; pp. 21–29. [CrossRef]
211.
Bokgoshi, L.; Sixhaxa, K.; Jadhav, A.; Nyamane, S.; Ajoodha, R. Enhancing Timely Graduations: An Explainable AI Approach to
Predict Academic Risks in South African Students. In Proceedings of the IEEE International Conference on Electrical, Computer,
and Energy Technologies (ICECET), Cape Town, South Africa, 16–17 November 2023; pp. 1–7. [CrossRef]
212.
Baneres, D.; Rodríguez-González, M.E.; Guerrero-Roldán, A.E.; Cortadas, P. An early warning system to identify and intervene
online dropout learners. Int. J. Educ. Technol. High. Educ. 2023, 20, 3. [CrossRef]
213.
Kostopoulos, G.; Kotsiantis, S.; Ragos, O.; Grapsa, T.N. Early dropout prediction in distance higher education using active
learning. In Proceedings of the International Conference on Information, Intelligence, Systems and Applications, Larnaca, Cyprus,
27–30 August 2017; pp. 1–6. [CrossRef]
214.
Drzyzga, G.; Harder, T. A Three Level Design Study Approach to Develop a Student-Centered Learner Dashboard. In Proceedings
of the 7th International Conference on Computer-Human Interaction Research and Applications (CHIRA 2023), Communications
in Computer and Information Science, Rome, Italy, 16–17 November 2023; Springer: Cham, Switzerland, 2023; Volume 1996,
pp. 262–281. [CrossRef]
215.
Fuentes, N.B.; Feliscuzo, L.S.; Romana, C.L.C.S. Enhancing Student Retention in Higher Education: A Fuzzy Logic Approach
to Prescriptive Analytics. In Proceedings of the 2024 IEEE 7th International Conference on Big Data and Artificial Intelligence
(BDAI), Beijing, China, 5–7 July 2024; pp. 41–48. [CrossRef]
216. Bousalem, Z.; Qazdar, A.; Guabassi, I.E.; Haj, A. A Recommendation System Based on Early Academic Performance Prediction
and Student Classification: Utilizing Artificial Intelligence and Mobile-Based Application. Int. J. Interact. Mob. Technol. 2024,
18, 169–189. [CrossRef]
217.
Malatji, M.M.; Mohlomi, R.; Kirui, G.K.; Mndebele, S.; Ekoru, J.E.; Madahana, M.C. Student Retention Model via Machine
Learning and Predictive analysis. In Proceedings of the 2024 IEEE World AI IoT Congress (AIIoT), Seattle, WA, USA, 29–31
May 2024. [CrossRef]
218.
Herodotou, C.; Naydenova, G.; Boroowa, A.; Gilmour, A.; Rienties, B. How Can Predictive Learning Analytics and Motivational
Interventions Increase Student Retention and Enhance Administrative Support in Distance Education. J. Learn. Anal. 2020,
7, 72–83. [CrossRef]
219.
Bilquise, G.; Shaalan, K. AI-based Academic Advising Framework: A Knowledge Management Perspective. Int. J. Adv. Comput.
Sci. Appl. 2022, 13, 193–203. [CrossRef]
220.
Böttcher, A.; Thurner, V.; Hafner, T.; Hertle, J. A Data Science-based Approach for Identifying Counseling Needs in first-year
Students. In Proceedings of the Global Engineering Education Conference, Vienna, Austria, 21–23 April 2021; pp. 420–429.
[CrossRef]
221.
Mun, J.; Jo, M. Applying machine learning-based models to prevent University student dropouts. Gyo’yug Pyeong’ga Yeon’gu
2023, 36, 289–313. [CrossRef]
222.
Bouihi, B.; Bousselham, A.; Aoula, E.S.; Ennibras, F.; Deraoui, A. Prediction of Higher Education Student Dropout based on
Regularized Regression Models. Eng. Technol. Appl. Sci. Res. 2024, 14, 17811–17815. [CrossRef]
223.
Gretchen Gaye, C.E.; Ramírez, G.M.; Jaime Díaz, F.M. Towards Educational Sustainability: An AI System for Identifying and
Preventing Student Dropout. Rev. Iberoam. Tecnol. Aprendiz. 2024, 19, 100–110. [CrossRef]
224.
Saparkhojayev, N.; Tashkenbayev, Y.; Zholayeva, L. Analyzing Big Data Using Machine Learning to Identify Factors Affecting
College Students Dropout Risk. Univ. Pap. 2024, 3, 300–304. [CrossRef]
225.
Adimulam, T. Integrating predictive analytics and machine learning within SAP ERP HCM ECC6: Enhancing strategic human capital
management through intelligent data insight. World J. Adv. Res. Rev. 2019, 3, 072–080. [CrossRef]
226.
Alcolea, J.J.; Ortigosa, A.; Carro, R.M.; Blanco, O.J. Best Practices in Dropout Prediction: Experience-Based Recommendations for
Institutional Implementation. In Early Warning Systems and Targeted Interventions for Student Success in Online Courses; Advances
in Educational Technologies and Instructional Design (AETID); IGI Global: Hershey, PA, USA, 2020; pp. 301–323. [CrossRef]
227.
Oreshin, S.A.; Filchenkov, A.; Kozlova, D.; Petrusha, P.; Lisitsyna, L.S.; Panfilov, A.; Glukhov, I.; Krasheninnikov, E.; Buraya,
K.I. The Use of Students’ Digital Portraits in Creating Smart Higher Education: A Case Study of the AI Benefits in Analyzing
Educational and Social Media Data. In Smart Education and e-Learning 2020; Springer: Singapore, 2020; pp. 233–243. [CrossRef]
228.
Silva, L.H.N.D.; Chounta, I.A.; Rodríguez-Triana, M.J.; Roa, E.R.; Gramberg, A.; Valk, A. Toward an Institutional Analytics
Agenda for Addressing Student Dropout in Higher Education: An Academic Stakeholders’ Perspective. J. Learn. Anal. 2022,
9, 179–201. [CrossRef]
229.
Albadri, A.H.M. Data Virtualization Design Model for Near Real Time Decision Making in Business Intelligence Environment.
Ph.D. Thesis, Universiti Utara Malaysia, Sintok, Malaysia, 2017.
230.
Purnomo, R.; Priatna, W.; Putra, T.D. Implementasi Big Data Analytical Untuk Perguruan Tinggi Menggunakan Machine
Learning. J. Inform. Inf. Secur. 2021, 2, 77–88. [CrossRef]
231.
Qin, A.; Boicu, M. EduBoost: An Interpretable Grey-Box Model Approach to Identify and Prevent Student Failure and Dropout.
In Proceedings of the Frontiers in Education Conference, College Station, TX, USA, 18–21 October 2023; pp. 1–7. [CrossRef]
232.
Delen, D.; Davazdahemami, B.; Dezfouli, E.R. Predicting and Mitigating Freshmen Student Attrition: A Local-Explainable
Machine Learning Framework. Inf. Syst. Front. 2023, 26, 641–662. [CrossRef]
233.
Leo, J. Early-Warning Dropout Visualization Tool for Secondary Schools: Using Machine Learning, QR Code, GIS and Mobile
Application Techniques. Int. J. Adv. Comput. Sci. Appl. 2022, 13, 655–663. [CrossRef]
234.
Simeunović, V.; Milić, S.; Ratković-Obradović, S. Educational Data Mining in Higher Education: Building a Predictive Model for
Retaining University Graduates as Master’s Students. J. Coll. Stud. Retent. Res. Theory Pract. 2024, 26, Advance online publication.
[CrossRef]
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual
author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to
people or property resulting from any ideas, methods, instructions or products referred to in the content.