How to do a Structured Literature Review in computer science

Anders Kofod-Petersen
Version 0.2
October 8, 2014
Contents
1 Introduction
2 Structure of a systematic literature review
3 Performing a structured literature review
3.1 Planning the review
3.2 Conducting the review
References
1 Introduction
Doing a systematic literature review is a formal way of synthesising the information available
from primary studies relevant to a set of research questions. Systematic literature reviews have
traditionally been widespread primarily in medicine (e.g. the well known Cochrane reviews [1]).
Unfortunately, they have been used to a much lesser extent in computer science (for an example
of how to do reviews in software engineering see [2]). Systematic literature reviews stand apart
from the unsystematic surveys that are more traditional in computer science by using a strict
methodological framework with a set of well defined steps carried out in accordance with a
predefined protocol.
Using a systematic literature review is in no way a guarantee of finding all relevant literature
in a given area. However, there are several advantages in using it: a systematic literature
review can map out existing solutions before a researcher attempts to tackle an area; it helps
researchers avoid bias in their work; publishing these reviews also benefits the community by
allowing others to avoid duplicating the effort; it allows researchers to identify gaps in
knowledge; and it highlights the areas where additional research is required. If a systematic
literature review is conducted thoroughly, it delivers the advantages described above and
thereby gains scientific value.
This document attempts to give a short introduction to how to conduct a structured
literature review within computer science. The examples used are taken from [3].
2 Structure of a systematic literature review
A systematic review has three main phases: i) planning, ii) conducting and iii) reporting.
Each of these phases is divided into several steps.
The first phase involves planning the review and can be broken down into these five steps:
1. Identification of the need for a review
2. Commissioning a review
3. Specifying the research question(s)
4. Developing a review protocol
5. Evaluating the review protocol
The second phase is the actual review of the literature. It consists of five steps:
1. Identification of research
2. Selection of primary studies
3. Study quality assessment
4. Data extraction and monitoring
5. Data synthesis
The last phase deals with how to disseminate the newly acquired knowledge. It consists
of three steps:
1. Specifying dissemination strategy
2. Formatting the main report
3. Evaluating the report
3 Performing a structured literature review
3.1 Planning the review
For the purpose of this document we assume that a need has already been identified (step
1) and that a review has been commissioned (step 2). This description covers steps 3 and
4 of the planning phase, as step 5 is included in step 4.
Step 3: Specifying the research question(s)
Attempting a literature review is obviously closely coupled with some specific area and/or
problem. Thus, it is assumed that a specific problem (P) is tackled using some specific
constraints, methods and/or approaches (C) to develop a system, application or algorithm
(S). In computer science we would typically like to know what existing solutions are available,
how they compare, what the strength of the evidence is and what implications these solutions
have. Writing these points down gives us the following research questions:
RQ1 What are the existing solutions to P?
RQ2 How do the different solutions found by addressing RQ1 compare to each other with
respect to C?
RQ3 What is the strength of the evidence in support of the different solutions?
RQ4 What implications will these findings have when creating S?
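The four questions are templates over P, C and S, so they can be instantiated mechanically once those three are fixed. The following is a minimal sketch of that substitution; the function name and the example values are assumptions made for illustration (the values are loosely based on the title of [3] and are not quoted from it).

```python
from string import Template

# Question templates following RQ1 to RQ4 above; $P, $C and $S are placeholders.
RQ_TEMPLATES = [
    "What are the existing solutions to $P?",
    "How do the different solutions found by addressing RQ1 compare "
    "to each other with respect to $C?",
    "What is the strength of the evidence in support of the different solutions?",
    "What implications will these findings have when creating $S?",
]


def research_questions(problem, constraints, system):
    """Return the four research questions with P, C and S filled in."""
    values = {"P": problem, "C": constraints, "S": system}
    return [
        f"RQ{i} " + Template(text).substitute(values)
        for i, text in enumerate(RQ_TEMPLATES, start=1)
    ]


# Hypothetical example values, chosen only for illustration.
for question in research_questions(
    problem="the cold-start user problem",
    constraints="Bayesian methods",
    system="a recommender system for tourists",
):
    print(question)
```

Writing the questions down in this concrete form early helps, since the search terms and the inclusion criteria in the later steps are derived from the same P, C and S.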
Step 4: Developing a review protocol
The review protocol is very important as it defines exactly how each step is to be carried out,
which makes the work reproducible. It can be beneficial to create an initial protocol and to
review the upcoming step whenever a step is concluded. Doing this iteratively then covers the
fifth step, evaluating the review protocol. Appendix ?? gives an example of a complete review protocol.
3.2 Conducting the review
With the developed protocol in hand it is now possible to conduct the review. This phase
contains five steps: identifying research in the literature, selecting the primary studies
deemed relevant, evaluating the corpus with respect to the chosen quality parameters, extracting
the relevant data, and synthesising the data.
Step 1: Identification of research
The goal of this step is to retrieve all the literature relevant to the defined research questions.
To do this a search strategy must be defined. This strategy should specify which sources are to
be searched and how to search them. The list of sources to be searched will traditionally contain
the relevant on-line digital libraries as well as a set of journals and conferences relevant to the
area (see footnote 1). The following list contains the most obvious general computer science
archives: ACM Digital Library, IEEE Xplore, ISI Web of Knowledge, ScienceDirect, CiteSeer,
SpringerLink and Wiley InterScience. For domain specific sources it is worth contacting experts
in the domain, who will normally know which are the best.
Once the list is complete the specific search terms can be defined as well as the procedure
for searching the sources.
The search strings are formed by arranging key terms into groups. Each group contains
terms that are either synonyms, different forms of the same word, or terms that have similar
or related semantic meaning within the domain. Table 1 exemplifies this approach. The terms
chosen should be closely related to the first research question (what are the existing solutions
to P?).
Footnote 1: Obviously off-line sources should be searched as well. However, at least in computer
science most relevant sources are on-line.
Table 1: Search terms
         Group 1     Group 2     Group 3     Group 4
Term 1   Synonym 1   Synonym 2   Synonym 3   Synonym 4
Term 2   Synonym 1   Synonym 2   Synonym 3   Synonym 4
Term 3   -           Synonym 2   Synonym 3   -
Term 4   -           -           Synonym 3   -
Term 5   -           -           Synonym 3   -
Each of the groups (four in this case) can be designed to retrieve a different set of the
relevant literature. The primary goal is to find the literature that lies in the intersection
of these sets (see Figure 1).
[Figure 1: Relevant studies. Labels in the figure: Group 1, Group 2, Group 3, Group 4, Target studies.]
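The idea that the target studies are the intersection of the per-group result sets can be illustrated with plain set operations. This is only a sketch: the paper identifiers and the contents of each set are made up for the example.

```python
# Hypothetical result sets: identifiers of the papers retrieved by each group alone.
results_by_group = {
    "Group 1": {"p1", "p2", "p3", "p4"},
    "Group 2": {"p2", "p3", "p4", "p5"},
    "Group 3": {"p3", "p4", "p6"},
    "Group 4": {"p1", "p3", "p4", "p7"},
}


def target_studies(results):
    """Return the papers that appear in every group's result set."""
    if not results:
        return set()
    return set.intersection(*results.values())


print(sorted(target_studies(results_by_group)))  # ['p3', 'p4']
```

A paper found by only some of the groups (such as p1 or p2 here) falls outside the intersection and is therefore not a target study under this strategy.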
Implementing this search strategy can be achieved by applying the Boolean operators AND (∧)
and OR (∨), where the OR operator is used within the groups and the AND operator between the
groups. Following the example in Table 1, the following search string captures the structure:
([G1, T1] ∨ [G1, T2]) ∧ ([G2, T1] ∨ [G2, T2] ∨ [G2, T3]) ∧ ([G3, T1] ∨ [G3, T2] ∨
[G3, T3] ∨ [G3, T4] ∨ [G3, T5]) ∧ ([G4, T1] ∨ [G4, T2])
The set of papers constructed by applying this search strategy is now ready to go through
the selection process.
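Building the search string from the term groups is mechanical: OR within a group, AND between groups. The sketch below assumes a generic syntax with the keywords AND and OR and double-quoted phrases; the exact query syntax differs between digital libraries, so the output would need adapting per source. The term names are placeholders that only mirror the shape of Table 1.

```python
def build_search_string(groups):
    """OR the terms within each group, then AND the groups together."""
    clauses = []
    for terms in groups:
        if not terms:
            continue  # an empty group would otherwise produce "()"
        quoted = [f'"{term}"' for term in terms]
        clauses.append("(" + " OR ".join(quoted) + ")")
    return " AND ".join(clauses)


# Placeholder terms: 2, 3, 5 and 2 terms in groups 1 to 4, as in Table 1.
groups = [
    ["G1 T1", "G1 T2"],
    ["G2 T1", "G2 T2", "G2 T3"],
    ["G3 T1", "G3 T2", "G3 T3", "G3 T4", "G3 T5"],
    ["G4 T1", "G4 T2"],
]
print(build_search_string(groups))
```

Keeping the groups as data rather than as a hand-written string makes it easier to record the exact search in the protocol and to regenerate it when a term is added or dropped.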
Step 2: Selection of primary studies
Applying the search described above will most likely return far more articles than is
manageable. The relevant articles should now be selected. The protocol should describe
exactly which criteria are to be applied in this selection process. However, some points can
be regarded as general and used as removal criteria:
1. Duplicates (keep the highest ranking source),
2. The same study published in different sources (keep the highest ranking source),
3. Studies published before a certain date (or even after).
Applying this selection leaves us with a set of relevant studies that can now be filtered
with respect to quality.
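The general removal criteria can be sketched as a small filter. Everything concrete here is an assumption for illustration: the record fields, the ranking of source types, and the use of a normalised title to recognise the same study. In practice the same study may appear under different titles, which still requires manual judgement.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    title: str
    source: str
    year: int


# Hypothetical source ranking: a lower number means a higher ranking source.
SOURCE_RANK = {"journal": 0, "conference": 1, "workshop": 2, "technical report": 3}


def normalise(title):
    """Crude key for spotting the same study: lower-case, alphanumerics only."""
    return "".join(ch for ch in title.lower() if ch.isalnum())


def apply_removal_criteria(records, earliest_year, latest_year):
    """Drop out-of-range years, then keep the highest ranking copy of each study."""
    best = {}
    for record in records:
        if not earliest_year <= record.year <= latest_year:
            continue
        key = normalise(record.title)
        rank = SOURCE_RANK.get(record.source, len(SOURCE_RANK))
        if key not in best or rank < best[key][0]:
            best[key] = (rank, record)
    return [record for _, record in best.values()]
```

For example, a workshop paper and a later journal version with the same title collapse to the journal version, and anything outside the chosen year range is dropped.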
Step 3: Study quality assessment
The purpose of this step is to filter away studies that are not thematically relevant to the area
chosen. The protocol should define exactly which inclusion (IC) and quality criteria (QC) are
employed (Table 2 gives some examples of criteria).
Table 2: Inclusion and quality criteria
Criterion ID   Criterion
IC 1           The study’s main concern is P
IC 2           The study is a primary study presenting empirical results
IC 4           The study focuses on C
IC 5           The study describes an S
QC 1           There is a clear statement of the aim of the research
QC 2           The study is put into context of other studies and research
The criteria can be divided into primary, secondary and quality screening criteria. In the
example described in Table 2, IC 1 and 2 would be the primary criteria, IC 4 and 5 the
secondary criteria, and QC 1 and 2 the quality criteria. The criteria can now be applied in a
three-stage process:
1. Abstract inclusion criteria screening,
2. Full text inclusion criteria screening,
3. Full text quality screening.
Each step should be thoroughly documented as part of the final protocol. Once the set
of studies has gone through this process it is (most likely) further reduced and can now go
through the next step of detailed quality assessment.
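The three-stage screening can be sketched as a pipeline that also records why each study was rejected, which supports the documentation requirement. The sketch assumes that reviewers have already recorded a yes/no judgement per criterion, that the primary criteria are applied at the abstract stage and the secondary criteria at the full text stage, and that the criterion identifiers are the ones listed in Table 2. The data layout is hypothetical.

```python
# Stage name and the criteria applied at that stage (identifiers as in Table 2).
STAGES = [
    ("abstract inclusion screening", ["IC 1", "IC 2"]),
    ("full text inclusion screening", ["IC 4", "IC 5"]),
    ("full text quality screening", ["QC 1", "QC 2"]),
]


def screen(studies, stages=STAGES):
    """Apply the stages in order; return the surviving studies and a rejection log."""
    remaining, log = list(studies), []
    for stage_name, criteria in stages:
        kept = []
        for study in remaining:
            # A criterion without a recorded judgement counts as not satisfied.
            failed = [c for c in criteria if not study["judgements"].get(c, False)]
            if failed:
                log.append((study["id"], stage_name, failed))
            else:
                kept.append(study)
        remaining = kept
    return remaining, log
```

Each log entry holds the study identifier, the stage at which it was rejected and the criteria it failed, so the numbers reported for each stage can be reconstructed later.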
The final quality assessment is done to answer the third research question (What is the
strength of the evidence in support of the different solutions?). To do this, further quality
criteria supplementing QC 1 and 2 in Table 2 should be developed. Examples of this could
be (QC 1 and 2 are repeated as questions for completeness):
QC 1 Is there a clear statement of the aim of the research?
QC 2 Is the study put into context of other studies and research?
QC 3 Are system or algorithmic design decisions justified?
QC 4 Is the test data set reproducible?
QC 5 Is the study algorithm reproducible?
QC 6 Is the experimental procedure thoroughly explained and reproducible?
QC 7 Is it clearly stated in the study which other algorithms the study’s algorithm(s) have
been compared with?
QC 8 Are the performance metrics used in the study explained and justified?
QC 9 Are the test results thoroughly analysed?
QC 10 Does the test evidence support the findings presented?
Each of the studies under consideration should be classified according to these 10 quality
criteria. The protocol should clearly specify the granularity of the score, e.g. yes (1 point),
partly (1/2 point) or no (0 points). The protocol should further specify the threshold for
studies to be accepted and whether it is acceptable to have, e.g., zero points in certain criteria.
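The scoring scheme can be sketched as follows, using the yes / partly / no granularity given as an example above. The threshold value and the choice of criteria that must not score zero are decisions for the protocol; the values used in the usage example are hypothetical.

```python
# Points per answer, following the example granularity in the text.
SCORES = {"yes": 1.0, "partly": 0.5, "no": 0.0}
CRITERIA = [f"QC {i}" for i in range(1, 11)]


def quality_score(answers):
    """Sum the points over all ten criteria; every criterion must be answered."""
    missing = [c for c in CRITERIA if c not in answers]
    if missing:
        raise ValueError(f"unanswered criteria: {missing}")
    return sum(SCORES[answers[c]] for c in CRITERIA)


def is_accepted(answers, threshold, must_not_be_zero=()):
    """Accept if the total reaches the threshold and no mandatory criterion scores zero."""
    total = quality_score(answers)
    if any(SCORES[answers[c]] == 0 for c in must_not_be_zero):
        return False
    return total >= threshold


# Usage with hypothetical protocol values: threshold 7, QC 10 must not score zero.
answers = {c: "yes" for c in CRITERIA}
answers["QC 4"] = "partly"
answers["QC 5"] = "no"
print(quality_score(answers))                                        # 8.5
print(is_accepted(answers, threshold=7, must_not_be_zero=("QC 10",)))  # True
```

Recording the answer per criterion rather than only the total keeps the assessment auditable and allows the threshold to be revisited without re-reading the studies.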
Once all the studies have been classified and a suitable set of studies has been selected,
the data from each study can be extracted.
Step 4: Data extraction and monitoring
Step 5: Data synthesis
References
[1] J. P. T. Higgins and S. Green, editors. Cochrane Handbook for Systematic Reviews of
Interventions Version 5.1.0 [updated March 2011]. The Cochrane Collaboration, 2011.
Available from www.cochrane-handbook.org.
[2] B. A. Kitchenham. Guidelines for performing systematic literature reviews in software
engineering version 2.3. Technical Report EBSE-2007-01, Keele University and University
of Durham, 2007.
[3] T. N. Lillegraven and A. C. Wolden. Design of a Bayesian recommender system for
tourists presenting a solution to the cold-start user problem. Master’s thesis, Department
of Computer and Information Science, NTNU, 2010.