Towards reproducible systematic reviews
in Open, Distance, and Digital Education (ODDE)
– an umbrella mapping review
Olaf Zawacki-Richter, Berrin Cefa, John Y. Bai
Abstract
More and more systematic reviews (SRs) are being published in the educational sciences. This umbrella
mapping review examines 576 SRs published between 2018 and 2022 in the field of open, distance, and
digital education (ODDE) to investigate publication and authorship patterns and to evaluate the quality of
these SRs. A quality index score was calculated for each included study based on the PRISMA reporting
items for SRs (including elements such as the search strategy, eligibility criteria, protocol registration,
study quality appraisal, interrater reliability, etc.). Almost as many SRs were published in 2022 as in the
previous four years, and the most rigorous SRs come from the field of medical education. However, the
results show that there is room for improvement in SRs published in ODDE. A content analysis that
explored the thematic scope of SRs showed that the majority of SRs addressed topics related to learning
design, AI in education, and the effectiveness of online learning and teaching interventions. Research
during this time period was strongly influenced by the experiences with online learning during the
COVID-19 pandemic. The results of this umbrella review should help to improve the quality of SRs
towards reproducible reviews in ODDE.
Keywords
umbrella review, systematic review, digital education, distance education, ODDE
Context and Implications Box:
• Rationale for this study
The field of Open, Distance, and Digital Education (ODDE) is in transition and an umbrella mapping
review is warranted given the dynamic growth of systematic reviews in this literature.
• Why the new findings matter:
The quality of many published systematic reviews limits the reproducibility and validity of the presented
research evidence; the results of the present quality appraisal call for more rigorous systematic reviews in
ODDE.
• Implications for ….
Researchers: Conducting systematic reviews is a fruitful exercise for individual researchers and research
institutions to gain a solid overview of a given topic. However, researchers must be trained in the
appropriate methodology for the results to be reproducible and valid.
Practitioners and policy-makers: Systematic reviews can be a valuable source to inform practice and
policy-making. However, attention must be paid to the quality of the reviews; if the method is not carried
out accurately, the results must be interpreted with caution.
Journal editors: As gatekeepers responsible for ensuring journal quality, editors should invite experts in
systematic review methodology to join the editorial team and handle the peer-review process of
submissions. It must be ensured that only methodologically sound systematic reviews are published.
1 Introduction
Literature reviews are traditional ways of gathering knowledge from the accumulated pool of scientific
information. As Sutton et al. (2019) discuss in detail, the role of reviews evolved from a simple random
search that summarized information to a crucially important tool in informing decision-making about
institutional and organizational practices (Gough et al., 2017) and framing the new research in the context
of what has already been discovered. An evidence-based synthesis of knowledge requires a systematic
approach based on a rigorous methodology (Buntins et al., 2023). Systematic reviews (SR) represent a
structured and explicit strategy for examining published work on a given topic. A SR entails the
systematic identification, evaluation, and aggregation of all relevant research, utilizing specified
procedures to minimize bias and ensure consistency and scientific rigor (Higgins et al., 2022).
SRs were first adopted in health science, medicine, and psychology, and date back to the origins of the
Cochrane Collaboration in the 1970s (Bennett, 2005). In the 1990s, with the emergence of the evidence-
based medicine movement in Canada, there was a need to achieve "the conscientious, explicit and judicious
use of current evidence in making decisions about the care of individual patients” (Sackett et al., 1996, p.
71). The need for making evidence-based decisions is not limited to medicine but extends to any field that
requires an overview of an ever-increasing mountain of data and information (Petticrew & Roberts,
2008). The synthesis of evidence can offer practitioners (such as educators) and policy-makers important
insights into the effectiveness of a specific intervention, approach, or method. SRs are particularly useful
in situations where resources such as time and money are limited, and there is an urgent need to
determine which approach to select (or fund, in the case of policy-makers) and apply in educational
settings.
Furthermore, conducting a systematic review is a fruitful exercise for any research institution or individual
researcher to develop a research profile. Especially for early career researchers and doctoral candidates,
SRs offer the opportunity to gain a solid overview of a given topic, to develop their own research topics
and agendas, and to provide a strong rationale for new research questions. However, the popular use of
the term “systematic” does not always correspond with what a scientific approach to the SR methodology
requires. As the units of analysis in SRs are compilations of published work, the process requires
transparency in the search strategy, data collection, extraction, and the whole decision-making process of
the team of reviewers; this ensures a reduction in potential bias from the researchers conducting the
synthesis. To achieve this rigor in a reproducible and replicable body of synthesis, the elements of the SR
process need to be transparently reported (Booth, 2016).
The major feature of a SR that follows an explicit, transparent review protocol is its reproducibility, which
"refers to the ability of a researcher to duplicate the results of a prior study using the same materials and
procedures as were used by the original investigator" (Bollen et al., 2015, p. 3). However, studies that
employ poor-quality methodology are not reproducible (Gessler & Siemer, 2020; Sayre & Riegelman,
2018). Concerns about the reproducibility of research results are not unique to the educational sciences
(cf. Makel & Plucker, 2014). In a large-scale assessment of the reproducibility of 100 psychology studies,
researchers reported replicating only 39% of the original results successfully (Open Science Collaboration,
2015). Sayre and Riegelman (2018) even speak of a reproducibility crisis across several disciplines. They
distinguish between reproducibility, defined as using the same procedures and data to confirm the results
of a previous experiment or analysis, and replicability, which goes further by gathering new data to
confirm the results of an earlier study. We follow these definitions in the understanding that
reproducibility is a prerequisite for replicability to confirm or update findings from a previous systematic
review.
Evidence syntheses in the field of Open, Distance, and Digital Education (ODDE). More and more systematic
reviews are published in education, but many review articles that claim to be “systematic” do not meet
the criteria and requirements of a reproducible and transparent systematic review. In a tertiary mapping
review of 446 evidence syntheses in the field of educational technology, Buntins et al. (2023) found that
only 44% reported the complete search string, 62% outlined the inclusion/exclusion criteria, 37%
provided the data extraction coding scheme, and only 26% of systematic reviews performed a quality
assessment. Similarly, Bond et al. (2024) conducted a meta-systematic review of 66 evidence syntheses
focused on Artificial Intelligence in Education (AIEd); they found that 31.8% of studies searched only
one or two databases, 51.5% did not mention inter-rater reliability or explain how screening and coding
decisions were made among review teams, only 24.2% disclosed their precise data extraction coding
scheme, 45.5 % conducted no quality assessment, and 34.8% did not reflect on the limitations of their
review.
The lack of transparency and reproducibility in the area of educational technology (Buntins et al., 2023)
and AIEd research (Bond et al., 2024) also underscores the need for a review of systematic reviews in the
broader domain of Open, Distance, and Digital Education (ODDE).
Terminology in transition. Within the process of digital transformation, the field of distance and online
education is in transition. The traditional boundaries between dedicated distance teaching institutions and
conventional residential institutions are blurring. This convergence is also reflected in a current discussion
about terminology (Nichols, 2023, 2024; Zawacki-Richter et al., 2024). The COVID-19 pandemic has
boosted the use of digital media and tools for online learning in mainstream education. However, Nichols
(2023) criticizes that everything is now labelled as online: “It is unfortunate, though, that this shift became popularly
known as a move to ‘online’ and even ‘distance’ education” (p. 142). The term ODDE serves as an
umbrella term for the various modes and formats of learning and teaching in the digital context. In 2023,
a handbook of ODDE was published in Springer’s Major Reference Series (Zawacki-Richter & Jung,
2023), Athabasca University in Canada has offered a Master of ODDE program since 2023, and a new
international Journal of Open, Distance, and Digital Education (JODDE) was launched in 2024.
In light of this, we use the term ODDE as defined by Zawacki-Richter and Jung (2023) in the Handbook
of ODDE:
We conceptualize ODDE as an overarching term to refer to all kinds of learning and teaching
processes in which knowledge and skill base of educational technology, digital media, and tools
are used to present and deliver content, as well as facilitate and support communication,
interaction, collaboration, assessment, and evaluation. Thus, ODDE is not monolithic in form. It
includes various practices, from technology-enhanced education, to flipped learning and blended
learning to fully online education. (p. 4)
Aim and research questions. This study utilizes an umbrella review (Biondi-Zoccai, 2016; Gessler & Siemer,
2020) to explore the quality of SRs in the field of ODDE. Grant and Booth (2009) define an umbrella
review as a review “compiling evidence from multiple reviews into one accessible and usable document.
[It] focuses on a broad condition or problem for which there are competing interventions and highlights
reviews that address these interventions and their results” (p. 95). This approach can be combined with a
mapping review, which seeks to map out literature on a topic and identify gaps and priority areas to
commission further primary and/or secondary research (p. 97).
Therefore, unlike SRs that primarily focus on synthesizing results, this review is an umbrella mapping
review that aims to provide an overview of the SR landscape in ODDE. Specifically, this paper addresses
the following research questions in four areas:
RQ 1 Publication and authorship patterns: How has the number of SRs developed over the last
five years and in which journals are they published? How many authors collaborate on a
systematic review, and what is the country of origin in terms of their institutional affiliation?
RQ 2 Evaluation of the overall rigor of SR: What is the quality of the published reviews? Do they
follow the reporting standards for SRs? What kind of document types (journal articles,
conference proceedings, book chapters, etc.) and how many literature databases are included?
How many studies are included in each stage of the systematic review process? What is the
relationship between the quality of the reviews and the quality, ranking, and impact of the
journals in which they are published?
RQ 3 Method of quality appraisals: Which quality appraisal tools or methods are most frequently
employed in SRs to assess the quality of included studies?
RQ 4 Content analysis: What are the major topics in ODDE that are covered in the SRs? In
which educational context were the SRs conducted (higher education, K-12, adult, continuing
education, etc.)?
This umbrella mapping review of SRs aims to identify best practices in conducting rigorous and
reproducible SRs, evaluate the quality of published SRs in ODDE, and explore major topics covered in
the reviews to identify gaps and priority areas in ODDE research. In this way, the presented umbrella
review offers guidance for research institutions and both established and younger scientists in the
development of future research agendas. We also hope to contribute to the professionalization and
quality development of SRs in education in general and in ODDE in particular.
2 Method
The purpose of a systematic review is to answer specific questions based on an explicit, systematic, and
replicable search strategy, with inclusion and exclusion criteria to identify appropriate studies (Gough et
al., 2017; Zawacki-Richter et al., 2020). Data are then coded and extracted from the included studies to
synthesize findings and identify gaps or contradictions.
Umbrella reviews are quite similar in procedure to SRs (Aromataris et al., 2014). Therefore, the best
starting point for conducting an umbrella review is to follow the guidance for the steps in the systematic
process (Gough et al., 2017); that is, 1) formulating the review question and protocols, 2) defining
inclusion and exclusion criteria, 3) forming the search strategy and framing information sources, 4)
screening the articles based on the inclusion/exclusion criteria, 5) presenting the results of the search
strategy in a flowchart, 6) extracting relevant descriptive data from included studies, 7) appraising the
quality of included studies, and 8) synthesizing evidence collected. This systematic and transparent
process aids in making a systematic review reproducible.
The protocol for this review (including a RIS-file with bibliographic information of all 576 articles) was
registered and published on the Open Science Framework (OSF) platform (https://osf.io/) and is
available via the following link: https://doi.org/10.17605/OSF.IO/MVKX2.
Search strategy. The authors formulated a broad search string (see Table 1) for this umbrella review to
retrieve SRs published in ODDE. The eligibility criteria (Table 2) for this umbrella review identify five
areas for inclusion and exclusion: publication year, publication language, education level, methodology of
the publications, and publication type.
The articles were retrieved from three international databases: Education Source, Scopus, and Web of
Science. While there are concerns about peer-review processes within the scientific community (e.g.,
Smith, 2006), articles in this review were limited to those published in peer-reviewed journals due to their
general trustworthiness in academia and their relatively rigorous review processes (Nicholas et al., 2015).
Due to the enormous proliferation in the publication of SRs, the search was limited to the years between
2018 and 2022, covering the last five years when this study was conducted. Given that English is our
common working language, we limited our search to publications in English. The studies we targeted are
those in the field of ODDE that declare, through their titles or abstracts, that their methodological
approach is a SR. We included articles that report on the ODDE field across all learning stages.
Table 1
Search string
Topic         Search terms
Context       (distan* OR online OR open OR technology-enhanc* OR digital) W/3 (educat* OR learn* OR teach*)
              AND
Review type   systematic W/2 review
Search procedure. The initial search on the above-mentioned databases covering titles, abstracts, and
keywords was undertaken in November 2022 and updated in January 2023 to include all records of the
year 2022. We initially identified 4,449 records (see Figure 1) and imported all of the articles into a
reference management software (Zotero). Next, 1,359 duplicates were removed. A simple Python script
(see Appendix C) was developed to identify and remove the studies that were initially gathered despite not
using the term “systematic” in either their title or abstract. A total of 474 papers were removed by the
script and then manually checked to ensure no papers were falsely rejected.
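A minimal sketch of such a keyword filter is shown below; it is not the script in Appendix C, and the file name and field names (title, abstract) are assumptions made for illustration only.

```python
import csv

def lacks_systematic(record: dict) -> bool:
    # Flag records whose title and abstract both lack the word "systematic".
    text = (record.get("title", "") + " " + record.get("abstract", "")).lower()
    return "systematic" not in text

# Hypothetical CSV export of the deduplicated reference library with title/abstract columns.
with open("records.csv", newline="", encoding="utf-8") as f:
    records = list(csv.DictReader(f))

flagged = [r for r in records if lacks_systematic(r)]
kept = [r for r in records if not lacks_systematic(r)]
print(f"{len(flagged)} records flagged for removal, {len(kept)} kept for screening")
```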
The three authors of this study screened the titles and abstracts of the remaining articles (N = 2,616) on
Rayyan.ai, based on the inclusion and exclusion criteria in Table 2. As the initial selection process on the
titles and abstracts requires sensitivity rather than specificity (i.e., to include rather than exclude; Zawacki-
Richter et al., 2020), we included the papers unless they were clearly irrelevant to the intended research
questions so as to re-evaluate them in the full-text screening process.
Table 2
Inclusion and exclusion criteria
Criteria           Inclusion                                            Exclusion
Publication year   2018 – 2022                                          before 2018
Language           English                                              Not in English
Education level    Any level in ODDE, including K-12, HE, LLL, TVET     Not ODDE, informal, non-formal
Methodology        SRs*                                                 Not-SRs
Publication type   Peer-reviewed academic journal articles indexed      Not journal articles (e.g., books,
                   in Scopus, WoS, and Education Source                 editorials, notes)
* Papers that claim to conduct a systematic review in the title or abstract were all included.
Inter-rater reliability. Inter-rater reliability between the three coders was assessed using Fleiss' kappa (κ;
Fleiss, 1971); this coefficient indicates the degree of consistency among three or more raters based on the
number of codings in the coding scheme (Fleiss, 1981; Neumann, 2007, p. 326). Kappa values of .40 to
.60 are characterized as moderate, .60 to .80 as good, and over .80 as very good agreement (Landis &
Koch, 1977). A total of 60 articles across three runs of 20 were blindly screened, coded, compared, and
reconciled to reach a common understanding of the inclusion and exclusion criteria. Coding consistency
for inclusion or exclusion of the 60 articles between the three raters was κ = .77. Therefore, inter-rater
reliability can be considered good for the coding of inclusion and exclusion criteria.
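For illustration, Fleiss' kappa for this kind of include/exclude coding can be computed with the statsmodels package; the decision matrix below is invented for the example and is not our screening data.

```python
import numpy as np
from statsmodels.stats.inter_rater import aggregate_raters, fleiss_kappa

# Illustrative decisions: rows = articles, columns = raters; 1 = include, 0 = exclude.
decisions = np.array([
    [1, 1, 1],
    [0, 0, 1],
    [1, 1, 0],
    [0, 0, 0],
    [1, 1, 1],
])

# aggregate_raters converts per-rater decisions into per-article category counts,
# which is the input format expected by fleiss_kappa.
counts, _ = aggregate_raters(decisions)
print(f"Fleiss' kappa = {fleiss_kappa(counts, method='fleiss'):.2f}")
```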
Coding and data extraction. We conducted the full-text screening and coding on SPSS 23 with numeric codes
(Appendix A). The main coding scheme included fundamental aspects of the systematic review
methodology as well as demographic descriptors such as the country of the authors. The reviewed articles
were coded for: a) descriptors of studies according to the country of the first author, journal names, and
number of collaborators; b) descriptors of the scope of the reviews; and c) the reporting items of SRs.
Descriptive data analysis was carried out with the statistics software R using the tidyr and tidyverse packages
(Wickham & Grolemund, 2016).
SR Quality Index Score (QIS). To assess and quantify the methodological quality of the SRs included in the
corpus, we developed a simple index that is based on the main quality dimensions of the measurement
tool for the ‘assessment of multiple systematic reviews’ (AMSTAR; Shea et al., 2007). The Quality Index
Score (QIS) combines the reporting items into an index ranging between 0 and 100. As we emphasize the
reproducibility of the SR, we have, in contrast to the procedure used for AMSTAR, weighted individual
items higher if they are essential to ensuring the reproducibility of a SR; in particular, the reporting of the
search string, the inclusion and exclusion criteria, a full documentation of the filtering process in a
PRISMA flow chart, and a discussion of potential limitations or biases.
of a quality appraisal, the reporting of interrater reliability in the coding decisions of the SR team, and the
publication of the review protocol. The review protocol was given less weight, as the registration and
publication of the review protocol is still unusual in educational science and reproducibility is ensured as
long as the important steps of the SR are described in the respective article.
These SR reporting items are treated as dichotomous variables (1 = yes, 0 = no), and the index is
calculated as follows: QIS = 10*(2*string + 2*criteria + 2*prisma chart + 1.5*limitations + quality
appraisal + interrater reliability + 0.5 protocol).
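For illustration, the QIS of a single review can be computed from the seven dichotomous reporting items as follows (a minimal sketch; the function name is ours and not part of the registered protocol).

```python
def quality_index_score(string, criteria, prisma_chart, limitations,
                        quality_appraisal, interrater_reliability, protocol):
    """Quality Index Score (0-100); each argument is a dichotomous item (1 = yes, 0 = no)."""
    return 10 * (2 * string + 2 * criteria + 2 * prisma_chart + 1.5 * limitations
                 + quality_appraisal + interrater_reliability + 0.5 * protocol)

# Example: a review reporting everything except interrater reliability and a protocol.
print(quality_index_score(1, 1, 1, 1, 1, 0, 0))  # 85.0
```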
Computer-assisted content analysis. We used content analysis to examine the conceptual structure and most
frequently occurring themes within the title and abstracts of the included articles (see Fisk et al., 2012;
Krippendorff, 2013, for discussions of computer-aided content analysis). For this study, we used the
software LeximancerTM (https://www.leximancer.com) to generate a concept map. The software locates
core concepts within textual data (conceptual analysis) and identifies how these concepts relate to each
other (relational analysis) based on the frequency with which words co-occur in the text. LeximancerTM
then generates a concept map that clusters similar concepts that co-occur in close proximity. Thematic
regions are formed depending on the connectedness of concepts and are named by the most prominent
concept in that thematic region.
For this analysis, we used the abstracts and titles of the 576 research articles published between 2018 and
2022 from three literature databases (see Figure 1). Titles and abstracts are considered appropriate for
such content analysis since they are "lexically dense and focus on the core issues presented in articles"
(Cretchley et al., 2010, p. 319). The generated map illustrates the semantic relation of the themes and key
concepts within and across the SRs in our corpus.
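The sketch below illustrates the general co-word idea underlying such concept maps, namely counting how often terms co-occur within the same abstract; it is a simplified illustration with made-up abstracts, not LeximancerTM's proprietary algorithm.

```python
import re
from collections import Counter
from itertools import combinations

abstracts = [
    "systematic review of online learning during the covid pandemic",
    "artificial intelligence chatbots for online language learning",
    "effectiveness of flipped learning in medical education",
]
stopwords = {"of", "the", "in", "for", "during", "a", "and"}

# Count how often each pair of terms co-occurs within the same abstract.
pair_counts = Counter()
for text in abstracts:
    terms = sorted({w for w in re.findall(r"[a-z]+", text.lower()) if w not in stopwords})
    pair_counts.update(combinations(terms, 2))

# The most frequent pairs would form the edges of a simple concept map.
print(pair_counts.most_common(5))
```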
Figure 1
PRISMA diagram
Note. Slightly modified after Brunton and Thomas (2012, p. 86) and Moher et al. (2009, p. 8). The diagram
summarizes the identification of studies via databases: records identified: Scopus (n = 1,480), WoS (n = 1,128),
EduSource (n = 1,841), total N = 4,449; records screened (n = 2,616); records excluded based on titles and
abstracts (n = 1,830), with reasons (multiple codes possible): not ODDE (n = 1,381), not SR (n = 229), before
2017 (n = 153), non-formal (n = 52), duplicate (n = 30), wrong publication type (n = 17), not in English (n = 2);
records assessed for eligibility (n = 786); studies finally included (n = 576).
Limitations. Whilst this umbrella review was undertaken as rigorously as possible, each review is limited by
its search strategy. The three educational research databases chosen are large and international in scope;
however, by applying the criteria of SRs published in peer-reviewed journals only in English, reviews
published on ODDE in other languages were not included in this meta-review. This also applies to
reviews in book chapters or grey literature or those articles not published in journals. We acknowledge
that it is becoming a gold standard to have at least four literature databases included in the search for a
systematic review. Future research should therefore consider using a larger number of databases,
publication types, and publication languages to widen the scope of the review. However, serious
consideration would then need to be given to project resources and the manageability of the review.
For the analysis of the country-wise distribution of articles, only the country of the principal investigator
(i.e., the first author) was taken into consideration. As a result, the geographical distribution of the entire
author network is not covered.
We furthermore note that the computer-assisted content analysis with the text-mining tool LeximancerTM
does not go into much detail and depth. However, the primary aim of this umbrella mapping review
was not to synthesize the evidence and results of the various applications and interventions in ODDE;
rather, the content analysis maps out the broader SR landscape of ODDE. Research topics identified here
can be the starting point for subsequent umbrella reviews, which focus on synthesizing and reporting key
findings within a certain research area.
In addition, the application of co-word analysis for the mapping of a research discipline has been the subject
of debate. Leydesdorff (1997) concluded that: "The fluidity of epistemic networks in which nodes and
links change positions may destabilize any knowledge representation on the basis of co-occurrences of
words" (p. 426). The prevailing opinion in the literature is that the co-occurrence of words provides
"useful information for a narrative inquiry on a subject" (Liesch et al., 2011, p. 24) and bibliometric maps
such as the concept maps produced by LeximancerTM based on co-word analysis help to visualize
complex masses of data in less time and "also accomplish data reduction while retaining essential
information" (van Raan & Tijssen, 1993, p. 175). However, in light of Leydesdorff's concerns, we also
provide qualitative examples for the main emerging research topics or "concept paths" to illustrate the
connections between key terms in the concept map.
3 Results and discussion
3.1 RQ 1: Publication and authorship patterns
Published SRs per year. Figure 2 shows a noticeable increase in the papers published from 2018 onwards.
The annual count of included SRs in ODDE grew steadily, from 35 in 2018 to 74 in 2019, 87 in 2020, and
129 in 2021, before nearly doubling to 251 in 2022.
Figure 2
Number of included articles per year (N = 576)
Journals. The 576 SRs included in the corpus were published across 260 different journals. The greatest
number of articles was published in Computers & Education and Education & Information Technologies (n = 27
each), followed by Sustainability (n = 19), and by Interactive Learning Environments and the Journal of Medical
Internet Research (n = 14 each). Table 3 lists the 24 journals that published at least five SRs in ODDE from
2018 to 2022.
Table 3
Number of included articles by journal (N = 576)
Rank   Journal                                                              n
1      Computers & Education                                               27
       Education & Information Technologies                                27
2      Sustainability (Switzerland)                                        19
3      Interactive Learning Environments                                   14
       Journal of Medical Internet Research                                14
4      Education Sciences                                                  12
5      Nurse Education Today                                               11
6      British Journal of Educational Technology                           10
       Journal of Computer Assisted Learning                               10
7      Australasian Journal of Educational Technology                       9
       International Journal of Emerging Technologies in Learning           9
8      BMC Medical Education                                                8
       Int. Journal of Educational Technology in Higher Education           8
9      IEEE Access                                                          7
       Journal of Research on Technology in Education                       7
10     Applied Sciences (Switzerland)                                       6
       Computer Assisted Language Learning                                  6
       Turkish Online Journal of Distance Education                         6
11     Educational Research Review                                          5
       Frontiers in Psychology                                              5
       International Review of Research in Open & Distributed Learning      5
       Medical Education                                                    5
       Nurse Education in Practice                                          5
       Technology, Knowledge and Learning                                   5
       236 other journals                                                 336
       Total                                                              576
Countries. For the analysis of the country-wise distribution of articles, the first author’s country was
considered (n = 70 countries). Table 4 shows 33 countries that contributed at least four SRs, and it reveals
that 25% of all articles come from only three countries: China, USA, and Spain.
Table 4
Distribution of SR articles by country and cumulative percentages (N = 576)
Rank   Country          n    Cum %
1      China           58     10.1
2      USA             49     18.6
3      Spain           37     25.0
4      Malaysia        35     31.1
5      UK              32     36.6
6      Australia       31     42.0
7      Iran            19     45.3
       Turkey          19     48.6
8      Canada          18     51.7
       Germany         18     54.9
       Singapore       18     58.0
9      India           17     60.9
       Taiwan          17     63.9
10     Saudi Arabia    14     66.3
11     Indonesia       12     68.4
12     Brazil          11     70.3
13     Colombia        10     72.0
       Portugal        10     73.8
14     Netherlands      9     75.3
       New Zealand      9     76.9
15     Finland          7     78.1
       Norway           7     79.3
       Pakistan         7     80.6
16     Belgium          6     81.6
       South Korea      6     82.6
       South Africa     6     83.7
       UAE              6     84.7
17     France           5     85.6
       Oman             5     86.5
18     Mexico           4     87.2
       Cyprus           4     87.8
       Greece           4     88.5
       Thailand         4     89.2
       other           62    100.0
Collaboration. SRs are time- and labor-intensive undertakings. Only 10 of the 576 (1.7%) reviews were
carried out by a single author. The SR research teams of the included reviews consisted of three to four
collaborators on average (M = 3.6, SD = 2.2). This finding is in line with examples of systematic reviews
from the field of education described in Zawacki-Richter et al. (2020), in which the number of team
members for SRs usually ranged between two and five.
3.2 RQ 2: Quality of SRs in ODDE
Our second research question deals with the quality of SRs in ODDE, especially with regard to whether
the elements are in place that make a SR reproducible. In the first step, we analyzed how many databases
were used in the literature search and which document types were included in the SRs.
Databases and document types. The number of databases used to retrieve studies in various document types
ranged from one to 35, with Tavares (2022) using 35 databases in a SR on the impact of game-based
learning on the learning experience of nursing undergraduate students. The median number of databases
queried was four.
Of the 576 included articles, the vast majority of SRs included journal articles (477; 82.8%), followed by
conference proceedings (108; 18.8%), book chapters (32; 5.6%), books (18; 3.1%), reports (16; 2.8%),
dissertations (14, 2.4%), theses (11; 1.9%), and other (12; 2.1%). Twenty-nine (5.0%) SRs did not specify
the document type of included studies, and 352 (61.1%) SRs included only journal articles. Twenty-three
(4.0%) covered four document types, and 55 (9.5%) included three or more document types.
Filtering and number of included studies. The number of studies at each stage of the reviews varied widely
(see Figure 3). The total N found after removing duplicates ranged from 23 to 43,315 (median = 546).
The filtration process in each step of the SR can greatly reduce the number of studies. For example,
Borah et al. (2017) reported a final average yield rate below 3% for 195 SRs on medical interventions. In
the present umbrella review, the median percentage of finally included studies in the 576 included SRs
was 6.5%, with a total N included ranging from zero (Stracke, 2019) to 1,986 (Bozkurt, 2022).
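As a worked example, the yield rate of a single SR is the share of finally included studies among the records remaining after duplicate removal; the numbers below are illustrative and do not reproduce values from our corpus.

```python
import statistics

screened = [546, 1200, 23, 43315]   # records after duplicate removal (illustrative)
included = [35, 60, 5, 1986]        # finally included studies (illustrative)

# Yield rate per review in percent, and the median across reviews.
yield_rates = [100 * inc / scr for inc, scr in zip(included, screened)]
print([round(r, 1) for r in yield_rates])
print("median yield rate:", round(statistics.median(yield_rates), 1), "%")
```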
Reporting items for SRs. According to the Institute of Medicine (US) Committee on Standards for Systematic
Reviews of Comparative Effectiveness Research (Eden et al., 2011), the major reporting guideline for SRs
and meta-analysis is PRISMA (Preferred Reporting Items for SRs and Meta-Analysis; Moher et al., 2009).
The following analysis is therefore also based on the reporting items of this standard. According to
PRISMA, the following information must be reported so that the reviews are comprehensible and
reproducible: the search strategy documented in a protocol with information about data sources/data
basis, the full Boolean search string, eligibility criteria for study selection, documentation of the filtering
process in a PRISMA chart, coding and interrater reliability, quality appraisal and risk of bias within
studies, and a discussion of limitations at outcome and review level. Table 5 shows the extent to which
these items were reported in the 576 SRs included here.
Figure 3
The number of papers at each stage of processing
Note. Grey lines show individual SRs, and overlaid boxplots show the median, upper- and lower-quartiles,
and whiskers extending 1.5 times the interquartile range. The y-axis is presented in a log scale to display
the positively skewed distributions. Following Borah et al.’s (2017, Figure 1) example, this figure omits
studies with counts that deviated more than 2.5 SDs of the group average; in addition, Stracke’s (2019)
study was omitted as it had zero final includes.
It must be noted that many published SRs did not adhere to PRISMA or any other reporting guidelines.
Specifically, approximately one-third of these reviews did not include information on the search string,
omitted a discussion of their limitations, or lacked a PRISMA flow chart. Furthermore,
over 80% of the reviews did not address the issue of interrater reliability. Almost three-quarters of the
reviews made no statements regarding the quality of the included studies. Less than 10% published a
review protocol that documented the SR process in detail. This was not surprising as protocol registration
is uncommon and relatively new in the field of education, although it is important for the reproducibility
of a SR (Pieper & Rombey, 2022).
Table 5
Items reported according to PRISMA in the SRs (N = 576)
Reporting item              Yes      %
Search string               376   65.3
Eligibility criteria        524   91.0
PRISMA chart                380   66.0
Interrater reliability      111   19.3
Quality appraisal           149   25.9
Discussion of limitations   358   62.2
Review protocol              44    7.6
The results presented in Table 5 are generally consistent with Buntins et al.’s (2023) study looking at 446
evidence syntheses in educational technology research. The proportions of reviews that reported the complete
search string (44%) and the inclusion/exclusion criteria (62%) were even lower in that study. About the same
proportion (26%) carried out a quality appraisal.
SR Quality Index Score (QIS). To quantify the quality of the 576 SRs in ODDE, we developed a simple
index that combines the reporting items into a Quality Index Score (QIS). The QIS ranges between 0 and
100, reporting items are treated as dichotomous variables (1 = yes, 0 = no), and items that are essential to
ensure the reproducibility of a SR are weighted more; that is, QIS = 10*(2*string + 2*criteria + 2*prisma
chart + 1.5*limitations + quality appraisal + interrater reliability + 0.5 protocol). The application of this
formula reveals the following distribution of the QIS (Figure 4 and Table 6).
Figure 4
Bar chart of QIS distribution (N = 576)
Table 6
Distribution of the QIS (N = 576)
QIS      n      %    Cum. %
0       33    5.7      5.7
5        0    0.0      5.7
10       0    0.0      5.7
15       4    0.7      6.4
20      41    7.1     13.5
25       0    0.0     13.7
30       6    1.0     14.6
35      29    5.0     19.6
40      52    9.0     28.6
45       6    1.0     29.7
50      14    2.4     32.1
55      62   10.8     42.9
60      48    8.3     51.2
65      29    5.0     56.3
70      23    4.0     60.2
75      95   16.5     76.7
80       6    1.0     77.8
85      81   14.1     91.8
90      18    3.1     95.0
95      18    3.1     98.1
100     11    1.9    100.0
Surprisingly, 33 reviews published under the label of a SR did not follow the SR process at all. Seventy-
eight studies (13.5 %) had a very low QIS of 20 or less, the median QIS was 60, the first and third quartiles
were 40 and 75 (see Figure 5), and only 47 studies (8.2 %) reached a QIS of 90 or higher.
These results are in line with Bond et al.’s (2024) study, which assessed 66 systematic reviews in the field
of artificial intelligence in higher education against the AMSTAR quality criteria. The reviews included in
that meta-review reached an average quality assessment score of 6.57 (out of 10).
Figure 5
Boxplot of QIS distribution (N = 576)
In light of these findings, it is pertinent to ask how so many reviews are published as “systematic” despite
obviously not following the steps in a SR process. Therefore, in the next section, we take a closer look at
how the quality of SRs is related to the quality or ranking of the journals in which they are published.
Only 11 (1.9%) SRs received a maximum QIS score of 100 (see Appendix B). Nine of them were
published in medical or health science journals, one in the top-ranked review journal Review of Educational
Research, and only one in a specialized educational technology journal, the Australasian Journal of Educational
Technology.
SR and Journal Quality. For convenience, we use the SCImago Journal Rank (SJR)1 as an indicator of a
journal’s impact. It is a measure based on the Scopus database that accounts for the number of citations
received by a journal and the prestige of the citing journals during the previous three years.
Journals with a lower SCImago ranking tend to publish SRs with lower quality. Table 7 shows a cross-
tabulation of the four SCImago quartiles with the mean QIS and impact scores for each quartile of
journal impact.
1 https://www.scimagojr.com/journalrank.php (accessed June 8, 2023)
Table 7
Mean QIS and journal impact measures (n = 49 journals)
SJR Quartile   Mean QIS   Mean SJR score   Mean H-Index    n
Q1             65.9       1.43             27              28
Q2             58.3       0.59             36              12
Q3             47.0       0.29             22               7
Q4             59.0       0.17             20               2
Eliminating the fourth quartile with only two journals reveals the distribution shown in Figure 6. The
association between QIS scores and SJR quartile across the first to third quartiles is statistically significant,
χ2 = 65.7, df = 66, p < .05.
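A minimal sketch of how such an association can be tested is given below, assuming a cross-tabulation of QIS values by SJR quartile; the data frame and its column names are invented for illustration and do not reproduce the original analysis.

```python
import pandas as pd
from scipy.stats import chi2_contingency

# Illustrative data: one row per journal with its mean QIS and its SJR quartile.
df = pd.DataFrame({
    "qis":          [85, 60, 40, 75, 20, 55, 90, 35, 65, 50],
    "sjr_quartile": ["Q1", "Q1", "Q3", "Q1", "Q3", "Q2", "Q1", "Q3", "Q2", "Q2"],
})

# Chi-square test of independence on the QIS-by-quartile contingency table.
table = pd.crosstab(df["qis"], df["sjr_quartile"])
chi2, p, dof, _ = chi2_contingency(table)
print(f"chi2 = {chi2:.1f}, df = {dof}, p = {p:.3f}")
```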
Figure 6
Distribution of QIS scores by SJR quartiles 1 to 3 (n = 47 journals)
3.3 RQ 3: Methods of quality appraisal in SRs
As shown in Table 5, only around a quarter (n = 149; 25.9 %) of the SRs in ODDE carried out a
comprehensive critical quality appraisal (Petticrew & Roberts, 2005) of the included studies. Most of the
576 SRs (n = 352; 61.1 %) included journal articles as the sole document type and thus relied on quality
assurance by the editors and the peer review process. However, Newman and Gough (2020) remind us
that "it is evident that using simple criteria, such as ‘published in a peer-reviewed journal’ as a sole
indicator of quality, is not likely to be an adequate basis for considering the quality and relevance of a
study for a particular systematic review" (p. 13). Some SRs have also used simple bibliometric measures to
ensure the quality of the included studies. For example, Talib et al. (2021) selected only papers that
belonged to the first or second quartile in the SCImago Journal Rank (SJR), and Law and Heintz (2021)
used Google Scholar Metrics and the SJR h-index to select journals in educational technology.
The 149 studies that did conduct a quality appraisal used a range of frameworks, tools, and checklists
(Table 8). Seven reviews applied two quality-appraisal tools; for example, He et al. (2022) combined the
Cochrane RoB with MERSQI in a review comparing synchronous distance education versus traditional
education for health science students. Forty-eight tools were used only once, of which 31 were developed
or customized by the authors; these are summarized here under "other". The following section briefly
describes the tools that were used more than five times.
Table 8
Frequency of study quality-appraisal tools applied
Study quality appraisal tool                                                                    n
Cochrane's Risk of Bias Tool (Cochrane RoB)                                                    30
Medical Education Research Study Quality Instrument (MERSQI)                                   18
Mixed Methods Appraisal Tool (MMAT)                                                            12
Joanna Briggs Institute Critical Appraisal Checklist (JBI)                                      9
Guidelines for Systematic Lit. Reviews in Software Engineering (Kitchenham & Charters, 2007)    9
Grading of Recommendations Assessment, Development and Evaluation (GRADE)                       7
Critical Appraisal Skills Program (CASP) Checklists                                             6
QualSyst Checklists (Alberta Heritage Foundation for Medical Research)                          5
A MeaSurement Tool to Assess Systematic Reviews (AMSTAR)                                        2
Appraisal Tool for Cross-Sectional Studies (AXIS)                                               2
Methodological Index for Non-Randomized Studies (MINORS)                                        2
Newcastle Ottawa Scale (NOS)                                                                    2
Oxford Centre for Evidence-Based Medicine (OCEBM) Levels of Evidence                            2
Physiotherapy Evidence Database (PEDro) Scale                                                   2
other                                                                                          48
Total                                                                                         156
Cochrane's Risk of Bias Tool (RoB; n = 30). Version 2 of the risk of bias tool for randomized trials was
published in the Cochrane Handbook for Systematic Reviews of Interventions in 2008 and updated in 2011
(Higgins et al., 2011). This tool addresses a set of biases focusing on different aspects of design, conduct,
and reporting. Each domain is judged based on a series of questions that signal ‘low’ or ‘high risk’ of bias.
For example, Min et al. (2022) applied the tool in a SR of the effectiveness of serious games in nurse
education to assess the methodological quality of three randomized controlled trials (RCTs).
Medical Education Research Study Quality Instrument (MERSQI; n = 18). This instrument was introduced in
2007 to appraise the quality of studies in the field of medical education (Reed et al., 2007). For example,
MERSQI was used in reviews to appraise the quality of studies about virtual reality in nursing education
(Choi et al., 2022) and social media in undergraduate medical education (Guckian et al., 2021). Al Asmri
and colleagues (2023) presented a modified version of the weighted MERSQI scoring system with items
related to study design, sampling, study setting, type of data, validity, data analysis, and outcomes.
Mixed Methods Appraisal Tool (MMAT; n = 12). This tool assesses the methodological quality of qualitative
research, randomized controlled trials, non-randomized studies, quantitative descriptive studies, and
mixed methods studies, with detailed explanations provided in the user guide (Hong et al., 2018) to
evaluate the methodological quality criteria. An example using the MMAT tool is Butler-Henderson and
Crawford’s (2020) study, which synthesized 36 articles about online examinations and used MMAT to
distinguish between low-, medium-, and high-quality studies.
Joanna Briggs Institute Critical Appraisal Checklist (JBI; n = 9). JBI (2017) is a checklist for SRs and research
synthesis with 11 items that include criteria that address the possibility of bias in the design, conduct, and
analysis of included studies. For example, the JBI checklist was used in SRs of digital storytelling (Moreau
et al., 2018) and serious games (Thangavelu et al., 2022) in health education.
Guidelines for Systematic Literature Reviews in Software Engineering (n = 9). The guidelines developed by
Kitchenham and Charters (2007) provide an overview of how to conduct a SR, including a study quality
assessment instrument (pp. 20-29). Quality checklists regarding the design, conduct, analysis, and
conclusions are provided for quantitative and qualitative studies. For example, Garcia et al. (2020) used
these checklists to review the effects of game-based learning in undergraduate software engineering
courses.
Grading of Recommendations Assessment, Development and Evaluation (GRADE; n = 7). The Cochrane GRADE
Handbook (Schünemann et al., 2013) describes the process of rating the quality of studies to develop
evidence-based healthcare recommendations. The quality of evidence is rated for each outcome across
studies resulting in an assessment in one of four quality of evidence grades (high, moderate, low, or very
low). For example, Kyaw et al. (2019) assessed the quality of evidence of 12 RCTs in a SR of the
effectiveness of digital education on communication skills among medical students.
Critical Appraisal Skills Program (CASP) Checklists (n = 6). The development of the CASP checklists began
in the 1990s to help healthcare decision-makers understand scientific evidence (Singh, 2013). Checklists
are available for SRs in general and, among others, for RCTs, cohort studies, and qualitative studies. For
example, Vanzella and colleagues (2022) used CASP for a SR of qualitative studies evaluating virtual
education in cardiac rehabilitation.
The seven most frequently used tools described above account for over half (n = 91; 58.3%) of the
quality appraisals in the studies included here, of which 76 (83.5%) were conducted in the context of
medical and health sciences in higher education settings. Only 15 (16.5%) SRs that applied one or more
of the most frequently used quality-appraisal tools came from dedicated ODDE fields (see Zawacki-
Richter & Bozkurt, 2023), such as distance education, educational technology, online learning, and
computer science from K-12 to higher education; of these 15, five were published in the leading journal
Computers & Education.
3.4 RQ 4: Content analysis
Following the 3M-Framework (Zawacki-Richter, 2009; Zawacki-Richter & Bozkurt, 2023), the reviews in
this umbrella review were coded along the lines of macro-level research (ODDE systems, theory,
methods, and global perspectives), meso-level research (management and organization of ODDE
institutions), and micro-level research (teaching and learning in ODDE settings).
As shown in Table 9, most SRs dealt with issues related to teaching and learning (77.6%) and educational
management on the institutional level (15.4%). Only 17 reviews (2.7%) focused on topics on the system
level; for example, Gama and colleagues (2022) reviewed the benefits and challenges of online learning in
Malawi’s higher education system; King et al. (2018) looked at the potentials of Massive Open Online
Courses (MOOCs) and Open Educational Resources (OER) in the Global South; and Ramírez-Montoya
(2022) reviewed dimensions of open education in Latin America in the light of UNESCO’s (2019)
recommendations on OER. Over half (56.9%) of the included reviews focused on ODDE in higher
education settings, followed by 23.7% which focused on K-12. Adult and continuing education and
Technical and Vocational Education and Training (TVET) are areas of education that received little
attention in the ODDE literature (see Table 10).
Table 9
Research and education level (N = 576)
Research level                          n       %
System level (macro)                   17     2.7
Institution level (meso)               98    15.4
Teaching and learning level (micro)   495    77.6
Not specified                          28     4.4
Total                                 638   100.0
Note. Each SR could be coded for more than one research level.
Table 10
Education level (N = 576)
Education level                     n       %
Higher education                  405    56.9
K-12                              169    23.7
Adult and continuing education     52     7.3
TVET                                8     1.1
Not specified                      78    11.0
Total                             712   100.0
Note. Each SR could be coded for more than one education level.
An overall analysis with the text-mining tool LeximancerTM was run with titles and abstracts of all 576
articles. The concept map depicted in Figure 7 shows five main thematic areas: education and the
COVID experience, learning design, effectiveness, artificial intelligence in education (AIEd), and flipped
learning.
The content areas covered in the publications are described below along the lines of these major thematic
regions. The articles were also coded according to their thematic research focus (see Table 11).
Figure 7
Concept map based on 576 titles and abstracts
Education and the COVID Experience. Given the chosen time period of 2018 to 2022, it is no surprise that
many reviews (n = 53) dealt with the experiences of online learning and teaching during the COVID-19
pandemic (see concept path e-learning-distance-covid-pandemic-classroom) and the challenges faced by
educational institutions and instructors (see concept path challenges-institutions-resources-instructors).
Several SRs investigated the students’ experiences during emergency remote teaching (del Socorro Torres-
Caceres et al., 2022; Nasution et al., 2022; Ozdamli & Karagozlu, 2022), their satisfaction and motivation
(Aznam et al., 2022; Ranadewa et al., 2021) in different subject areas; for example, STEM (Alangari, 2022)
or health science (Mutalib et al., 2022; Pires, 2022). Aljedaani et al. (2022) contributed by reviewing 34
articles on the challenges of online learning for deaf and hearing-impaired students. Wen et al. (2021)
reviewed 19 studies on the design and implementation of home-based learning in K-12, and emphasized
pedagogical strategies that adopted parental involvement, clear communication between teachers and
parents, peer learning, and synchronous video conferencing.
Regarding teachers’ and faculty members’ experiences, Na and Jung’s (2021) SR synthesised eight studies
on the challenges faced by university instructors in online teaching during the pandemic. They identified
challenges including managing online classes, using the learning platform, and interacting with and
engaging students. They also matched the challenges with identified causes (e.g., lack of skills for online
teaching, lack of support, infrastructure issues) to make recommendations for professional development
and the design of online courses. Li and Yu (2022) synthesized 21 studies dealing with instructors’
changing roles in online teaching, their satisfaction, and their digital literacy. Numerous studies found that
teachers’ satisfaction levels dropped with the advent of the pandemic and that they quickly experienced
emotional exhaustion. Similarly, Nang et al. (2022) reviewed 52 studies about school teachers’ stress
factors and coping mechanisms during online teaching.
Table 11
Research topics (N = 576 SRs) by count, percentage, and cumulative percentage.
Topic                                                       n      %    Cum %
Online Learning during COVID-19 Pandemic                   53    9.0      9.0
AI in education                                            46    7.8     16.8
Game-based learning, gamification                          38    6.4     23.2
Virtual Reality/Augmented Reality                          31    5.3     28.5
Blended Learning                                           27    4.6     33.1
Computer-assisted language learning (CALL)                 22    3.7     36.8
Flipped Learning                                           22    3.7     40.5
(Digital) competencies of learners and teachers            21    3.6     44.1
Open Education, OER, MOOCs                                 21    3.6     47.6
Assessment                                                 15    2.5     50.2
Learning analytics                                         15    2.5     52.7
EdTech systems and new technologies                        13    2.2     54.9
Technology acceptance                                      12    2.0     56.9
Inclusion, diversity and special needs                     10    1.7     58.6
Mobile learning                                             9    1.5     60.2
Digital transformation of educational institutions          7    1.2     61.4
Social media                                                7    1.2     62.5
Student engagement                                          7    1.2     63.7
Communication and collaboration in online environments      6    1.0     64.7
Professional development                                    5    0.8     65.6
Digital storytelling                                        4    0.7     66.3
Online doctoral education                                   2    0.3     66.6
other                                                     198   33.6    100.0
Total                                                     590
From an institutional perspective, Paposa and Paposa (2022) reviewed factors influencing service quality
and learners’ satisfaction in online classrooms. Crompton et al. (2022) carried out a SR (N = 57) of
support provided for K-12 teachers teaching at a distance due to COVID-19 but also during emergencies
in general, including biological, human-caused, and natural disasters. Support for teachers’
mental health and wellness in difficult times emerged as an important topic linked with digital pedagogical
practice and reaching isolated teachers with remote training.
Effectiveness of online learning interventions. In light of the experiences with online learning during the
pandemic, several SRs, particularly in the field of medicine and health sciences, focused on the
effectiveness of online learning interventions. In Figure 7, effectiveness forms a thematic region of its own in
the concept map (see concept path effects-pubmed-risk-satisfaction-interventions-medical/health). For example,
Wilcha (2020) synthesized 31 studies on the advantages and disadvantages of virtual medical teaching for
medical students during the COVID-19 pandemic. Student engagement and well-being were found to be
negatively affected while peer mentoring was helpful for learning and providing psychological support.
Rahayuwati and colleagues (2021) conducted a SR of 22 studies on the effectiveness of “tele-education”
(e-learning, virtual, and digital learning) published between 2002 and 2020. They found a positive effect of
various types of tele-education on academic performance and concluded that the application of online
learning benefitted students, particularly during the COVID-19 pandemic.
Flipped learning. The flipped classroom concept stands out and is connected with the thematic area of
effectiveness via health-intervention-medical. The flipped (or inverted) classroom approach (Bergmann & Sams,
2012) allows students to learn and review course materials (often video lectures) before attending a class,
where they can apply and strengthen their knowledge during interactive sessions. Twenty-two SRs in the
corpus dealt with the topic of flipped learning (see Table 12), of which seven were situated in the field of
medical and health science education (e.g., Evans et al., 2019; Park et al., 2021), followed by four reviews
of flipped learning in mathematics and statistics (e.g., Farmus et al., 2020; Fung et al., 2021) and three in
language learning (e.g., Baltaci, 2022). Three other SRs (Gianoni-Capenakas et al., 2019; Özbay & Çınar,
2021; Xu et al., 2019) investigated the effectiveness of flipped learning interventions in nursing and dental
education.
Learning design. Apart from flipped learning, many more SRs were conducted on different learning design
approaches and methods using digital media and tools; for example, game-based learning (n = 38; e.g.,
Udeozor et al., 2022, on digital games in engineering education), learning with virtual and augmented
reality (n = 31; e.g., Moro et al., 2021, on virtual and augmented reality enhancements for physiology and
anatomy learning), learning analytics (n = 15; e.g., Mavroudi, 2018, on using learning analytics for the
design of personalized learning opportunities), assessment (n = 15; e.g., Muzaffar, 2021, on online exam
solutions), digital storytelling (n = 4; e.g., Wu & Chen, 2021, on digital storytelling in K-12 education), and
different delivery modes, such as blended learning (n = 27; e.g., Atmacasoy & Aksu, 2018, on blended
learning in teacher education), OER and MOOCs (n = 21; e.g., Axe et al., 2020, on student experiences of
open educational practices), or mobile learning (n = 9; e.g., Calderón-Garrido et al., 2022, on the use of
mobile phones in the classroom).
In the thematic region of learning design, teachers played a prominent role in facilitating online learning
for cognitive skills and competence development; see concept path learning-teachers-role-cognitive-competence,
which is linked via communication with effectiveness. An important function of teachers was also to support
collaborative online learning processes and provide feedback (see teachers-role-collaborative-feedback). For
example, Fehrman and Watson (2020) reviewed 35 studies on tools and strategies to facilitate
asynchronous online discussions in online higher education. Penn and Brown (2022) investigated
feedback for student learning in higher education via screencast; the majority of the 15 included studies
revealed positive effects of screencast feedback being more personal, supportive, and easier to understand
than plain-text feedback.
Twenty-two SRs dealt with issues related to computer-assisted language learning (CALL, see language-
learning). For example, Vorobel’s (2022) review focused on the methodology of 197 empirical studies on
distance language teaching published between 2011 and 2020. Bahari (2020) reviewed 97 articles
published from 2012 to 2020 about the theoretical and pedagogical affordances and challenges of
correcting second language learners via computer-mediated feedback in blended and distance learning.
Huang and colleagues (2022) synthesized reviews on the use of chatbots for language learning.
Artificial intelligence in education (AIEd). The introduction of AI methods and tools in ODDE is a dynamic
and growing field of educational technology application and research (not least since the release of
ChatGPT in November 2022 and other generative AI applications; see Zawacki-Richter et al., 2024).
Forty-six SRs of AIEd were included in the corpus, forming a thematic region at the intersection of
learning design. This number is comparable with the 66 SRs from 2018 up to 2023 discussed in a recent
meta-review of AIEd by Bond et al. (2024). The reviews in this area explore the challenges and
opportunities that AI applications might afford for teaching and learning (see concept path challenges-
systems-opportunities-media-adoption-AI). For example, Wollny et al. (2021) conducted a SR with 74 studies
exploring the pedagogical roles of chatbots, the use of chatbots for tutoring, and their potential to
personalize education.
The majority of AIEd research was conducted in higher education settings (see also Bond et al., 2024;
Zawacki-Richter et al., 2019), but there were also more and more applications on the K-12 level (e.g., Yue
et al., 2022; Zafari et al., 2022). While many studies on AIEd emphasize the very positive potential of AI,
Crompton et al. (2022) also take a critical look at the challenges associated with the use of AI in K-12; in
the review of 169 studies, they identified challenges that included AI tools and devices being a distraction,
students’ and teachers’ lack of understanding of AI technology and methods, privacy concerns, and
biases.
4 Conclusion and further research perspectives
Our umbrella mapping review confirms that more and more SRs in the field of ODDE are being
published in a variety of journals; almost twice as many were published in 2022 as in 2021. The classic journals in ODDE
and educational technology are well represented, with Computers & Education in first place, along with the British
Journal of Educational Technology and the International Review of Research in Open & Distributed Learning. As the methodology
of SRs comes from the health sciences and medicine, it is not surprising that many of the included studies appeared in
journals from this field (e.g., Nurse Education Today, Medical Education, or Nurse Education in Practice).
Regarding the methodological quality of the SRs, it was not surprising that the most rigorous reviews come from authors with a background in health sciences and medicine, where the methodology has been used for many years. In contrast, many review authors from the field of ODDE are gaining their first experience with the method. However, there is clear room for improvement here, as has already been shown for the field of educational technology (Buntins et al., 2023) and the research area of AIEd (Bond et al., 2024). Our evaluation shows that some key steps and information about the SR process were not reported or documented in the ODDE SRs (e.g., the search string, explicit inclusion and exclusion criteria, or information about database settings); these omissions make it impossible to replicate or update the results. Reviews in which such essential information is missing should not be published. It was also notable that journals with a higher reputation and impact are more selective and publish higher-quality SRs.
Only 25.9% of the SRs carried out a critical quality appraisal of the included studies. The quality
appraisal tools most frequently used in this corpus come again from the health sciences and medicine.
The development of dedicated tools for assessing the methodological quality of educational research
would be desirable.
The thematic content analysis shows that most of the SRs included (over three-quarters) deal with
interventions on the micro-level of teaching and learning with digital media and tools. Particularly with
regard to the digital transformation of educational institutions, more SRs at the macro- and meso-levels
would be desirable to inform evidence-based organizational development.
The 576 included SRs in this umbrella mapping review cover a wide range of different topics in ODDE.
In the time period analyzed, it was to be expected that many reviews dealt with the experiences of the
COVID-19 pandemic. Other leading topics were AIEd, game-based learning, virtual and augmented
reality, blended learning, computer-assisted language learning, flipped learning, digital competencies of learners and teachers, OER, and MOOCs; together, these account for almost half of the evidence syntheses included. These subject areas offer a critical mass of studies which, based on this mapping,
would warrant in-depth synthesis of the findings in follow-up umbrella reviews of each specific area.
The primary studies that are synthesized in an SR can help to inform evidence-based practice and interventions in the digital transformation of education. However, we need to ensure that SRs are reliable and reproducible if they are to inform evidence-based practice and policy. Although the application of the SR method is still relatively new in the field of ODDE, our results suggest that work
still needs to be done to improve the quality and rigor of published SRs. On the one hand, researchers
need appropriate training in the methodology. On the other hand, editors who serve as gatekeepers for
journal quality must guarantee that only methodologically sound systematic reviews are published.
Appointing experts in evidence synthesis to the editorial team to handle the peer-review process of
submissions should help to achieve this aim. We hope this umbrella mapping review draws attention to
these issues and serves as a basis for further methodological improvement of SRs in education,
particularly to guide the dynamic developments in the field of ODDE as it undergoes profound changes
in the digital transformation.
References
Abdull Mutalib, A. A., Md. Akim, A., & Jaafar, M. H. (2022). A systematic review of health sciences
students’ online learning during the covid-19 pandemic. BMC Medical Education, 22(1).
https://doi.org/10.1186/s12909-022-03579-1
Abu Talib, M., Bettayeb, A. M., & Omer, R. I. (2021). Analytical study on the impact of technology in
higher education during the age of covid-19: Systematic literature review. Education and Information
Technologies, 26(6), 6719–6746. https://doi.org/10.1007/s10639-021-10507-1
Al Asmri, M., Haque, M. S., & Parle, J. (2023). A Modified Medical Education Research Study Quality
Instrument (MMERSQI) developed by Delphi consensus. BMC Medical Education, 23(1), 63.
https://doi.org/10.1186/s12909-023-04033-6
Alangari, T. S. (2022). Online stem education during covid-19 period: A systematic review of perceptions
in higher education. Eurasia Journal of Mathematics, Science and Technology Education, 18(5).
https://doi.org/10.29333/ejmste/11986
Aljedaani, W., Krasniqi, R., Aljedaani, S., Mkaouer, M. W., Ludi, S., & Al-Raddah, K. (2022). If online
learning works for you, what about deaf students? Emerging challenges of online learning for deaf
and hearing-impaired students during covid-19: A literature review. Universal Access in the Information
Society. https://doi.org/10.1007/s10209-022-00897-5
Aromataris, E., & Pearson, A. (2014). The Systematic Review: An Overview. AJN, American Journal of
Nursing, 114(3), 53–58. https://doi.org/10.1097/01.NAJ.0000444496.24228.2c
Atmacasoy, A., & Aksu, M. (2018). Blended learning at pre-service teacher education in turkey: A
systematic review. Education and Information Technologies, 23(6), 2399–2422.
https://doi.org/10.1007/s10639-018-9723-5
Axe, J., Childs, E., DeVries, I., & Webster, K. (2020). Student experiences of open educational practices:
A systematic literature review. Journal of E-Learning and Knowledge Society, 16(4), 67–75.
https://doi.org/10.20368/1971-8829/1135340
Aznam, N., Perdana, R., Jumadi, J., Nurcahyo, H., & Wiyatmo, Y. (2022). Motivation and satisfaction in
online learning during covid-19 pandemic: A systematic review. International Journal of Evaluation and
Research in Education, 11(2), 753–762. https://doi.org/10.11591/ijere.v11i2.21961
Bahari, A. (2021). Computer-mediated feedback for l2 learners: Challenges versus affordances. Journal of
Computer Assisted Learning, 37(1), 24–38. https://doi.org/10.1111/jcal.12481
Baltaci, H. S. (2022). A snapshot of flipped instruction in english language teaching in turkiye: A
systematic review. Turkish Online Journal of Distance Education (TOJDE), 23(4), 257–270.
Bennett, J., Lubben, F., Hogarth, S., & Campbell, B. (2005). Systematic reviews of research in science
education: rigour or rigidity? International Journal of Science Education, 27(4), 387–406.
https://doi.org/10.1080/0950069042000323719
Bergmann, J., & Sams, A. (2012). Flip your classroom: Reach every student in every class every day. International
Society for Technology in Education.
Biondi-Zoccai, G. (Ed.). (2016). Umbrella Reviews: Evidence Synthesis with Overviews of Reviews and Meta-
Epidemiologic Studies. Springer International Publishing. https://doi.org/10.1007/978-3-319-25655-9
Bond, M., Khosravi, H., De Laat, M., Bergdahl, N., Negrea, V., Oxley, E., Pham, P., Chong, S. W., &
Siemens, G. (2024). A meta systematic review of artificial intelligence in higher education: A call
for increased ethics, collaboration, and rigour. International Journal of Educational Technology in Higher
Education, 21(1), 4. https://doi.org/10.1186/s41239-023-00436-z
Borah, R., Brown, A. W., Capers, P. L., & Kaiser, K. A. (2017). Analysis of the time and workers needed
to conduct systematic reviews of medical interventions using data from the PROSPERO registry.
BMJ Open, 7(2), e012545. https://doi.org/10.1136/bmjopen-2016-012545
Booth, A. (2016). Searching for qualitative research for inclusion in systematic reviews: A structured
methodological review. Systematic Reviews, 5(1), 74. https://doi.org/10.1186/s13643-016-0249-x
Bozkurt, A. (2022). A retro perspective on blended/hybrid learning: Systematic review, mapping and
visualization of the scholarly landscape. Journal of Interactive Media in Education, 2022(1).
https://doi.org/10.5334/jime.751
Buntins, K., Bedenlier, S., Marín, V., Händel, M., & Bond, M. (2023). Methodische Ansätze zu
Evidenzsynthesen in der Bildungstechnologie: Eine tertiäre Übersichtsarbeit. MedienPädagogik:
Zeitschrift für Theorie und Praxis der Medienbildung, 54, 167–191.
https://doi.org/10.21240/mpaed/54/2023.12.20.X
Butler-Henderson, K., & Crawford, J. (2020). A systematic review of online examinations: A pedagogical
innovation for scalable authentication and integrity. Computers and Education, 159.
https://doi.org/10.1016/j.compedu.2020.104024
Calderón-Garrido, D., Javier Ramos-Pardo, F., & Suárez-Guerrero, C. (2022). The use of mobile phones
in classrooms: A systematic review. International Journal of Emerging Technologies in Learning, 17(6),
194–210.
Choi, J., Thompson, C. E., Choi, J., Waddill, C. B., & Choi, S. (2022). Effectiveness of immersive virtual
reality in nursing education: Systematic review. Nurse Educator, 47(3), E57–E61.
Cooper, C., Booth, A., Varley-Campbell, J., Britten, N., & Garside, R. (2018). Defining the process to
literature searching in systematic reviews: A literature review of guidance and supporting studies.
BMC Medical Research Methodology, 18(1), 85. https://doi.org/10.1186/s12874-018-0545-3
Cretchley, J., Rooney, D., & Gallois, C. (2010). Mapping a 40-Year history with Leximancer: Themes and
concepts in the Journal of Cross-Cultural Psychology. Journal of Cross-Cultural Psychology, 41(3), 318–328.
https://doi.org/10.1177/0022022110366105
Crompton, H., Burke, D., Jordan, K., & Wilson, S. (2022). Support provided for k-12 teachers teaching
remotely with technology during emergencies: A systematic review. Journal of Research on Technology in
Education, 54(3), 473–489.
Crompton, H., Jones, M. V., & Burke, D. (2022). Affordances and challenges of artificial intelligence in k-
12 education: A systematic review. Journal of Research on Technology in Education.
https://doi.org/10.1080/15391523.2022.2121344
del Socorro Torres-Caceres, F., Méndez-Vergaray, J., Rivera-Arellano, E. G., Ledesma-Cuadros, M. J.,
Huayta-Franco, Y. J., & Flores, E. (2022). Virtual education during covid-19 in higher education: A
systematic review. Tuning Journal for Higher Education, 9(2), 189–215.
https://doi.org/10.18543/tjhe.2217
Eden, J., Levit, L. A., Berg, A. O., & Morton, S. C. (Eds.). (2011). Finding what works in health care: Standards
for systematic reviews. National Academies Press.
Evans, L., Vanden Bosch, M. L., Harrington, S., Schoofs, N., & Coviak, C. (2019). Flipping the classroom
in health care higher education: A systematic review. Nurse Educator, 44(2), 74–78.
Farmus, L., Cribbie, R. A., & Rotondi, M. A. (2020). The flipped classroom in introductory statistics:
Early evidence from a systematic review and meta-analysis. Journal of Statistics Education, 28(3), 316–325.
Fehrman, S., & Watson, S. L. (2021). A systematic review of asynchronous online discussions in online
higher education. American Journal of Distance Education, 35(3), 200–213.
https://doi.org/10.1080/08923647.2020.1858705
Fung, C.-H., Besser, M., & Poon, K.-K. (2021). Systematic literature review of flipped classroom in
mathematics. Eurasia Journal of Mathematics, Science and Technology Education, 17(6), Jan-17.
https://doi.org/10.29333/ejmste/10900
Gama, L. C., Chipeta, G. T., & Chawinga, W. D. (2022). Electronic learning benefits and challenges in
malawi’s higher education: A literature review. Education and Information Technologies, 27(8), 11201–11218.
https://doi.org/10.1007/s10639-022-11060-1
Garcia, I., Pacheco, C., Méndez, F., & Calvo-Manzano, J. A. (2020). The effects of game-based learning in
the acquisition of “soft skills” on undergraduate software engineering courses: A systematic
literature review. Computer Applications in Engineering Education, 28(5), 1327–1354.
https://doi.org/10.1002/cae.22304
Gough, D., Oliver, S., & Thomas, J. (2017). An introduction to systematic reviews (2nd edition). SAGE.
Gianoni-Capenakas, S., Lagravere, M., Pacheco-Pereira, C., & Yacyshyn, J. (2019). Effectiveness and
perceptions of flipped learning model in dental education: A systematic review. Journal of Dental
Education, 83(8), 935–945. https://doi.org/10.21815/JDE.019.109
Grant, M. J., & Booth, A. (2009). A typology of reviews: An analysis of 14 review types and associated methodologies. Health Information & Libraries Journal, 26(2), 91–108. https://doi.org/10.1111/j.1471-1842.2009.00848.x
Guckian, J., Utukuri, M., Asif, A., Burton, O., Adeyoju, J., Oumeziane, A., Chu, T., & Rees, E. L. (2021).
Social media in undergraduate medical education: A systematic review. Medical Education, 55(11),
1227–1241. https://doi.org/10.1111/medu.14567
He, L., Yang, N., Xu, L., Ping, F., Li, W., Sun, Q., Li, Y., Zhu, H., & Zhang, H. (2021). Synchronous
distance education vs traditional education for health science students: A systematic review and
meta-analysis. Medical Education, 55(3), 293–308.
Higgins, J. P. T., Thomas, J., Chandler, J., Cumpston, M., Li, T., Page, M. J., & Welch, V. A. (2022). Cochrane Handbook for Systematic Reviews of Interventions (Version 6.3). Cochrane.
Higgins, J. P. T., Altman, D. G., Gotzsche, P. C., Juni, P., Moher, D., Oxman, A. D., Savovic, J., Schulz,
K. F., Weeks, L., Sterne, J. A. C., Cochrane Bias Methods Group, & Cochrane Statistical Methods
Group. (2011). The Cochrane Collaboration’s tool for assessing risk of bias in randomised trials.
BMJ, 343, d5928. https://doi.org/10.1136/bmj.d5928
Hodges, C., Moore, S., Lockee, B., Trust, T., & Bond, A. (2020). The difference between Emergency
Remote Teaching and Online Learning. EDUCAUSE Review, March 27, 2020.
https://er.educause.edu/articles/2020/3/the-difference-between-emergency-remote-teaching-and-
online-learning
Höfler, M., & Vasylyeva, T. (2023). Studienbewertung in systematischen Reviews der Bildungsforschung – Planungsschritte und Kriterien zur Prüfung der internen Validität von Interventionsstudien. Zeitschrift für Erziehungswissenschaft, 26(4), 1029–1051. https://doi.org/10.1007/s11618-023-01160-0
Hong, Q. N., Pluye, P., Fabregues, S., Bartelett, G., Boardman, F., Cargo, M., Dagenais, P., Gagnon, P.-
M., Griffith, F., Nicolau, B., O’Cathain, A., Rousseau, M.-C., & Vedel, I. (2018). Mixed Methods
Appraisal Tool (MMAT). McGill Department of Medicine.
http://mixedmethodsappraisaltoolpublic.pbworks.com/w/file/fetch/127916259/MMAT_2018_c
riteria-manual_2018-08-01_ENG.pdf
Huang, W., Hew, K. F., & Fryer, L. K. (2022). Chatbots for language learning – Are they really useful? A systematic review of chatbot-supported language learning. Journal of Computer Assisted Learning, 38(1), 237–257.
JBI. (2017). Checklist for Systematic Reviews and Research Syntheses (p. 7). Joanna Briggs Institute.
https://jbi.global/sites/default/files/2019-05/JBI_Critical_Appraisal-
Checklist_for_Systematic_Reviews2017_0.pdf
King, M., Pegrum, M., & Forsey, M. (2018). Moocs and oer in the global south: Problems and potential.
International Review of Research in Open and Distance Learning, 19(5), Feb-20.
https://doi.org/10.19173/irrodl.v19i5.3742
Kitchenham, B., & Charters, S. (2007). Guidelines for Performing Systematic Literature Reviews in Software
Engineering (p. 65). University of Keele, University of Durham.
https://citeseerx.ist.psu.edu/doc/10.1.1.117.471
Kyaw, B. M., Posadzki, P., Paddock, S., Car, J., Campbell, J., & Tudor Car, L. (2019). Effectiveness of
digital education on communication skills among medical students: Systematic review and meta-
analysis by the digital health education collaboration. Journal of Medical Internet Research, 21(8).
https://doi.org/10.2196/12967
Landis, J. R., & Koch, G. G. (1977). The Measurement of Observer Agreement for categorical data.
Biometrics, 33(1), 159–174. https://doi.org/10.2307/2529310
Law, E. L.-C., & Heintz, M. (2021). Augmented reality applications for k-12 education: A systematic
review from the usability and user experience perspective. International Journal of Child-Computer
Interaction, 30. https://doi.org/10.1016/j.ijcci.2021.100321
Leydesdorff, L. (1997). Why words and co-words cannot map the development of the sciences. Journal of
the American Society for Information Science, 48(5), 418–427.
Li, M., & Yu, Z. (2022). Teachers’ satisfaction, role, and digital literacy during the covid-19 pandemic.
Sustainability (Switzerland), 14(3). https://doi.org/10.3390/su14031121
Liesch, P. W., Håkanson, L., McGaughey, S. L., Middleton, S., & Cretchley, J. (2011). The evolution of
the international business field: A scientometric investigation of articles published in its premier
journal. Scientometrics, 88(1), 17–42. https://doi.org/10.1007/s11192-011-0372-3
Makel, M. C., & Plucker, J. A. (2014). Facts Are More Important Than Novelty: Replication in the
Education Sciences. Educational Researcher, 43(6), 304–316.
https://doi.org/10.3102/0013189X14545513
Mavroudi, A., Giannakos, M., & Krogstie, J. (2018). Supporting adaptive learning pathways through the
use of learning analytics: Developments, challenges and future opportunities. Interactive Learning
Environments, 26(2), 206–220. https://doi.org/10.1080/10494820.2017.1292531
Min, A., Min, H., & Kim, S. (2022). Effectiveness of serious games in nurse education: A systematic
review. Nurse Education Today, 108.
Moro, C., Birt, J., Stromberga, Z., Phelps, C., Clark, J., Glasziou, P., & Scott, A. M. (2021). Virtual and
augmented reality enhancements to medical and science student physiology and anatomy test
performance: A systematic review and meta-analysis. Anatomical Sciences Education, 14(3), 368–376.
https://doi.org/10.1002/ase.2049
Muzaffar, A. W., Tahir, M., Anwar, M. W., Chaudry, Q., Mir, S. R., & Rasheed, Y. (2021). A systematic
review of online exams solutions in e-learning: Techniques, tools, and global adoption. IEEE
Access, 9, 32689–32712. https://doi.org/10.1109/ACCESS.2021.3060192
Na, S., & Jung, H. (2021). Exploring university instructors’ challenges in online teaching and design
opportunities during the covid-19 pandemic: A systematic review. International Journal of Learning,
Teaching and Educational Research, 20(9), 308–327. https://doi.org/10.26803/ijlter.20.9.18
Nang, A. F. M., Maat, S. M., & Mahmud, M. S. (2022). Teacher technostress and coping mechanisms
during covid-19 pandemic: A systematic review. Pegem Egitim ve Ogretim Dergisi, 12(2), 200–212.
https://doi.org/10.47750/pegegog.12.02.20
Nasution, A. K. P., Nasution, M. K., Batubara, M. H., & Munandar, I. (2022). Learning during covid-19
pandemic: A systematic literature review. International Journal of Evaluation and Research in Education,
11(2), 639–648. https://doi.org/10.11591/ijere.v11i2.21917
Nicholas, D., Watkinson, A., Jamali, H. R., Herman, E., Tenopir, C., Volentine, R., Allard, S., & Levine,
K. (2015). Peer review: Still king in the digital age. Learned Publishing, 28(1), 15–21.
https://doi.org/10.1087/20150104
Nichols, M. (2023). Commentary: What, exactly, is ‘online’ education? Journal of Learning for Development,
10(2), 142–148. https://doi.org/10.56059/jl4d.v10i2.1054
Nichols, M. (2024). What’s in a name? Wrestling with ‘ODDE’. Journal of Open, Distance, and Digital
Education, 1(1), 1–16. https://doi.org/10.25619/FD6DCH73
Open Science Collaboration. (2015). Estimating the reproducibility of psychological science. Science,
349(6251), Article aac4716. https://doi.org/10.1126/science.aac4716
Özbay, Ö., & Çınar, S. (2021). Effectiveness of flipped classroom teaching models in nursing education:
A systematic review. Nurse Education Today, 102.
Ozdamli, F., & Karagozlu, D. (2022). Online education during the pandemic: A systematic literature
review. International Journal of Emerging Technologies in Learning, 17(16), 167–193.
https://doi.org/10.3991/ijet.v17i16.32287
Park, J. H., Han, W. S., Kim, J., & Lee, H. (2021). Strategies for flipped learning in the health professions
education in south korea and their effects: A systematic review. Education Sciences, 11(1), 09-Sep.
Penn, S., & Brown, N. (2022). Is Screencast Feedback Better Than Text Feedback for Student Learning
in Higher Education? A Systematic Review. Ubiquitous Learning: An International Journal, 15(2), 1–18.
Petticrew, M., & Roberts, H. (2006). Systematic reviews in the social sciences: A practical guide. Blackwell Pub.
Pieper, D., & Rombey, T. (2022). Where to prospectively register a systematic review. Systematic Reviews,
11(1), 8. https://doi.org/10.1186/s13643-021-01877-1
Pires, C. (2022). Perceptions of pharmacy students on the e-learning strategies adopted during the covid-
19 pandemic: A systematic review. Pharmacy, 10(1). https://doi.org/10.3390/pharmacy10010031
Rahayuwati, L., Pramukti, I., & Susanti, R. D. (2021). The effectiveness of tele-education for health field
university students as a learning method during a covid-19 pandemic: A systematic review. Open
Access Macedonian Journal of Medical Sciences, 9, 159–163. https://doi.org/10.3889/oamjms.2021.7350
Ramírez-Montoya, M. S. (2022). Analysis of open education in latin america in the framework of unesco’s
new recommendations. Revista Interuniversitaria de Formacion Del Profesorado, 98(36), 93–112.
https://doi.org/10.47553/rifop.v98i36.2.94059
Ranadewa, D. U. N., Gregory, T. Y., Boralugoda, D. N., Silva, J. A. H. T., & Jayasuriya, N. A. (2021).
Learners’ satisfaction and commitment towards online learning during covid-19: A concept paper.
Vision. https://doi.org/10.1177/09722629211056705
Reed, D. A., Cook, D. A., Beckman, T. J., Levine, R. B., Kern, D. E., & Wright, S. M. (2007). Association
between funding and quality of published medical education research. JAMA, 298(9), 1002–1009.
Sayre, F., & Riegelman, A. (2018). The Reproducibility Crisis and Academic Libraries. College & Research
Libraries, 79(1), 2–9. https://doi.org/10.5860/crl.79.1.2
Sackett, D. L., Rosenberg, W. M. C., Gray, J. A. M., Haynes, R. B., & Richardson, W. S. (1996). Evidence
based medicine: what it is and what it isn’t. BMJ, 312(7023), 71.
https://doi.org/10.1136/bmj.312.7023.71
Schünemann, H., Brozek, J., Guyatt, G., & Oxman, A. (Eds.). (2013). Grading of Recommendations
Assessment, Development and Evaluation (GRADE) Handbook. Cochrane.
https://training.cochrane.org/resource/grade-handbook
Shea, B. J., Grimshaw, J. M., Wells, G. A., Boers, M., Andersson, N., Hamel, C., Porter, A. C., Tugwell,
P., Moher, D., & Bouter, L. M. (2007). Development of AMSTAR: A measurement tool to assess
the methodological quality of systematic reviews. BMC Medical Research Methodology, 7(1), 10.
https://doi.org/10.1186/1471-2288-7-10
Simonson, M., Schlosser, C., & Orellana, A. (2011). Distance education research: A review of the
literature. Journal of Computing in Higher Education, 23, Article 23. https://doi.org/10.1007/s12528-
011-9045-8
Singh, J. (2013). Critical appraisal skills programme. Journal of Pharmacology and Pharmacotherapeutics, 4(1), 76–77.
https://doi.org/10.4103/0976-500X.107697
Smith, R. (2006). Peer review: A flawed process at the heart of science and journals. Journal of the Royal Society of Medicine, 99(4), 178–182. https://doi.org/10.1258/jrsm.99.4.178
Stracke, C. M. (2019). Quality frameworks and learning design for open education. International Review of
Research in Open and Distance Learning, 20(2), 180–203. https://doi.org/10.19173/irrodl.v20i2.4213
Sutton, A., Clowes, M., Preston, L., & Booth, A. (2019). Meeting the review family: Exploring review
types and associated information retrieval requirements. Health Information & Libraries Journal, 36(3),
202–222. https://doi.org/10.1111/hir.12276
Tavares, N. (2022). The use and impact of game-based learning on the learning experience and knowledge
retention of nursing undergraduate students: A systematic literature review. Nurse Education Today,
117, 105484. https://doi.org/10.1016/j.nedt.2022.105484
Udeozor, C., Toyoda, R., Russo Abegão, F., & Glassey, J. (2022). Digital games in engineering education:
Systematic review and future trends. European Journal of Engineering Education.
https://doi.org/10.1080/03043797.2022.2093168
UNESCO. (2019). Recommendation on Open Educational Resources (OER).
https://unesdoc.unesco.org/ark:/48223/pf0000373755/PDF/373755eng.pdf.multi.page=3
Van Raan, A. F. J., & Tijssen, R. J. W. (1993). The neural net of neural network research: An exercise in
bibliometric mapping. Scientometrics, 26(1), Article 1. https://doi.org/10.1007/BF02016799
Vanzella, L. M., Oh, P., Pakosh, M., & Ghisi, G. L. D. M. (2022). Barriers and facilitators to virtual
education in cardiac rehabilitation: A systematic review of qualitative studies. European Journal of
Cardiovascular Nursing, 21(5), 414–429. https://doi.org/10.1093/eurjcn/zvab114
Vorobel, O. (2022). A systematic review of research on distance language teaching (2011–2020): Focus on
methodology. System, 105. https://doi.org/10.1016/j.system.2022.102753
Wen, Y., Gwendoline, C. L. Q., & Lau, S. Y. (2021). ICT-supported home-based learning in k-12: A
systematic review of research and implementation. TechTrends, 65(3), 371–378.
https://doi.org/10.1007/s11528-020-00570-9
Wilcha, R.-J. (2020). Effectiveness of virtual medical teaching during the covid-19 crisis: Systematic
review. JMIR Medical Education, 6(2). https://doi.org/10.2196/20963
Wollny, S., Schneider, J., Di Mitri, D., Weidlich, J., Rittberger, M., & Drachsler, H. (2021). Are we there
yet? - A systematic literature review on chatbots in education. Frontiers in Artificial Intelligence, 4.
https://doi.org/10.3389/frai.2021.654924
Wu, J., & Chen, D. (2020). A systematic review of educational digital storytelling. Computers & Education,
147. https://doi.org/10.1016/j.compedu.2019.103786
Yue, M., Jong, M. S.-Y., & Dai, Y. (2022). Pedagogical Design of K-12 Artificial Intelligence Education: A
Systematic Review. Sustainability (Switzerland), 14(23). https://doi.org/10.3390/su142315620
Zafari, M., Bazargani, J. S., Sadeghi-Niaraki, A., & Choi, S.-M. (2022). Artificial intelligence applications in
k-12 education: A systematic literature review. IEEE Access, 10, 61905–61921.
https://doi.org/10.1109/ACCESS.2022.3179356
Zawacki-Richter, O. (2009). Research Areas in Distance Education: A Delphi Study. The International
Review of Research in Open and Distributed Learning, 10(3). https://doi.org/10.19173/irrodl.v10i3.674
Zawacki-Richter, O., Bai, J. Y. H., Lee, K., Slagter Van Tryon, P. J., & Prinsloo, P. (2024). New advances
in artificial intelligence applications in higher education? International Journal of Educational Technology
in Higher Education, 21(1). https://doi.org/10.1186/s41239-024-00464-3
Zawacki-Richter, O., & Bozkurt, A. (2023). Research Trends in Open, Distance, and Digital Education.
In O. Zawacki-Richter & I. Jung (Eds.), Handbook of Open, Distance and Digital Education (pp. 199–220).
Springer Nature Singapore. https://doi.org/10.1007/978-981-19-2080-6_12
Zawacki-Richter, O., Kerres, M., Bedenlier, S., Bond, M., & Buntins, K. (Eds.). (2020). Systematic reviews in
educational research: Methodology, perspectives and application. Springer.
http://link.springer.com/10.1007/978-3-658-27602-7
Zawacki-Richter, O., & Jung, I. (Eds.). (2023). Handbook of Open, Distance, and Digital Education. Springer.
https://link.springer.com/referencework/10.1007/978-981-19-0351-9
Zawacki-Richter, O., Marín, V. I., Bond, M., & Gouverneur, F. (2019). Systematic review of research on
artificial intelligence applications in higher education – where are the educators? International Journal of
Educational Technology in Higher Education, 16, 1-27. https://doi.org/10.1186/s41239-019-0171-0
Zawacki-Richter, O., Xiao, J., Slagter Van Tryon, P. J., Lim, D., Conrad, D., Kerres, M., Lee, K., &
Prinsloo, P. (2024). Editorial: Inaugurating the Journal of Open, Distance, and Digital Education
(JODDE). Journal of Open, Distance, and Digital Education, 1(1).
https://doi.org/10.25619/OZR2CE72
Appendix A
Codes for data extraction based on full-texts
Code | Scope
inclusion | yes/no; if no, reason
year | year of the publication
number of authors | integer
country of the first author | country name
journal | journal name
time span of the review | range and total number of years
language included in the SR | language name(s) only
number of databases used | integer
review type | configurative, aggregative, both, mapping
document types included in the SR | articles, conference papers, books, book chapters, dissertations, theses, other
yield | filter rate from title and abstract screening after removal of duplicates, and full-text screening to final inclusion
report of search string | yes/no
report of PRISMA | yes/no
report of criteria | yes/no
report of interrater reliability (IRR) | yes/no
report of quality appraisal | yes/no
report of quality standard | yes/no; quality appraisal tool noted
report of limitations | yes/no
report of education level covered | K-12, higher education, adult/continuing education, or TVET
report of level of study (3M)* | macro, meso, micro
topic of the review | coder comment
* Following the 3M-Framework (Zawacki-Richter, 2009; Zawacki-Richter & Bozkurt, 2023), which is also used in the Handbook of Open, Distance, and Digital Education (Zawacki-Richter & Jung, 2023) to organize the body of knowledge in ODDE, the reviews included in this umbrella review are analyzed and synthesized along the lines of macro-level research (ODDE systems, theory, methods, and global perspectives), meso-level research (management and organization of ODDE institutions), and micro-level research (teaching and learning in ODDE settings).
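To illustrate how the yes/no reporting codes listed above can be aggregated into a review-level score, the following minimal Python sketch computes a simple quality index per included SR. The column names, file name, and equal weighting of items are illustrative assumptions only and do not reproduce the exact QIS weighting applied in this study.

import pandas as pd

# Illustrative sketch: column names, file name, and equal weighting are assumptions,
# not the exact QIS scheme used in this umbrella review.
report_items = [
    "report of search string", "report of PRISMA", "report of criteria",
    "report of interrater reliability (IRR)", "report of quality appraisal",
    "report of quality standard", "report of limitations",
]

def quality_index(row):
    # Count the reporting items coded "yes" and rescale the sum to a 0-100 index.
    yes_count = sum(str(row[item]).strip().lower() == "yes" for item in report_items)
    return 100 * yes_count / len(report_items)

coded = pd.read_csv("coded_reviews.csv")  # hypothetical coding sheet, one row per included SR
coded["QIS"] = coded.apply(quality_index, axis=1)
print(coded["QIS"].describe())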
Appendix B
Eleven SRs with a QIS of 100
Authors/Year | Topic | Journal
Arqub et al. (2022) | Technology-enhanced learning in orthodontics education | European Journal of Dental Education
Du et al. (2022) | Blended vs. traditional learning in nursing education | Nurse Education in Practice
Gao et al. (2022) | Acceptance of online learning in medical education | Journal of Xiangya Medicine
Grafton-C. et al. (2022) | Online clinical work-based learning | Medical Teacher
Nowell et al. (2022) | Online education to develop students' remote caring skills and practices | Medical Education Online
Law & Heintz (2021) | Augmented reality applications for K-12 education | International Journal of Child-Computer Interaction
Noetel et al. (2021) | Video-based learning in higher education | Review of Educational Research
Xu et al. (2021) | Psychological interventions of virtual gamification | Journal of Affective Disorders
Youhasan et al. (2021) | Flipped classroom in undergraduate nursing education | BMC Nursing
Adams et al. (2019) | Online learning for university students on the autism spectrum | Australasian Journal of Educational Technology
Liaw et al. (2018) | Virtual worlds in healthcare education | Nurse Education Today
Appendix C
Python Code
# author: John Bai
# date: 21.11.22; cleaned, commented, and checked on 12.03.24
import csv
import pandas as pd

# Input file
file_path = "C:\\John\\Work\\Side projects\\Umbrella review\\"
file_name = "Umbrella Review UTF-8.csv"

# Columns to check (0-based positions of the title and abstract fields)
title = 5
abstract = 11
search_term = "systematic"

# Output lists
includes = []
excludes = []

# Open the csv file, search the title and abstract of each row for the search term,
# and copy the row into the corresponding output list
with open(file_path + file_name, encoding='utf-8', mode='r') as infile:
    reader = csv.reader(infile)
    next(reader)  # skip the header row
    for row in reader:
        search_field = (row[title] + row[abstract]).lower()
        if search_term in search_field:
            includes.append(row)
        else:
            excludes.append(row)

# Save includes list to csv
df = pd.DataFrame(includes)
df.to_csv(file_path + file_name[:-4] + " includes.csv", encoding='utf-8')

# Save excludes list to csv
df = pd.DataFrame(excludes)
df.to_csv(file_path + file_name[:-4] + " excludes.csv", encoding='utf-8')
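For reference, the same title/abstract screening step can also be expressed with vectorised pandas string operations. This is only an equivalent sketch under the same assumptions as the script above (title in column 5, abstract in column 11, UTF-8 input) and is not part of the original workflow.

import pandas as pd

# Same assumptions as the script above: title in column 5, abstract in column 11.
file_path = "C:\\John\\Work\\Side projects\\Umbrella review\\"
file_name = "Umbrella Review UTF-8.csv"
title, abstract, search_term = 5, 11, "systematic"

df = pd.read_csv(file_path + file_name, encoding="utf-8")
# Concatenate title and abstract, lower-case the text, and flag rows containing the search term.
text = (df.iloc[:, title].fillna("") + " " + df.iloc[:, abstract].fillna("")).str.lower()
mask = text.str.contains(search_term, regex=False)

df[mask].to_csv(file_path + file_name[:-4] + " includes.csv", encoding="utf-8", index=False)
df[~mask].to_csv(file_path + file_name[:-4] + " excludes.csv", encoding="utf-8", index=False)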