The exponential growth of research and enormity of the body of knowledge that has been accumulated in applied linguistics make the need for quality and reliable synthesis of the available research more pressing than ever. Traditional reviews seek to critique existing research, provide an overview of the research, and/or contextualize a new study. Research syntheses aim at reaching conclusions by means of aggregating the totality of the empirical research that has been carried out on a certain topic. In this chapter, we discuss the procedures and best practices of each of the two approaches and conclude by making a comparison between the two approaches and proposing ways to integrate them.
Traditional Literature Review andResearch
Shaofeng Li andHongWang
A literature review is a retrospective account of previous research on a certain topic, and it may achieve various purposes. For researchers, a literature review may serve to contextualize and inform further research. Specifically, prior to carrying out a new study, the researcher needs to find a niche by identifying what has been done on the topic under investigation and to mine through existing methodology with a view to developing instruments and materials that can best answer his/her own research questions. Also, through a literature review, a researcher may draw on existing evidence to verify a theory or build a new theory. For practitioners and policy makers, the conclusions reached in a literature review based on aggregated research findings may serve as a basis for decision-making in terms of how to meet the needs of different stakeholders. Such integration of research and practice is called evidence-based practice (Bronson & Davis, 2012), which is of particular importance in a heavily practice-oriented discipline such as applied linguistics.
Because of the importance of literature review, there has been a call to treat it as a research method in its own right (Cooper, 2016). In fact, in the literature on literature review, a distinction has been made between traditional literature reviews, such as the type that appears in the literature review section of a journal article, and more systematic approaches to reviewing previous research, such as meta-analysis, which is conducted following a set of well-defined procedures and protocols. Although there has been much discussion about the differences between traditional reviews and systematic reviews, systematic comparisons have been rare, and the terminology relating to the two approaches and to literature review as a genre has been ambiguous and confusing.
To clarify any potential ambiguities, and for the purposes of this chapter, the term literature review is used to refer to the genre as a whole as well as to traditional literature reviews. Research synthesis is reserved for systematic reviews such as meta-analysis. Proponents of research synthesis have generally been negative about traditional reviews, criticizing their unscientific nature. However, we argue that both traditional reviews and research syntheses have merits, that they serve different purposes, and that they do not have to be mutually exclusive. The following sections discuss the procedures and best practices of each of the two approaches and conclude by making a comparison between the two approaches and proposing ways to integrate them.
Traditional Literature Review
Although anyone getting an academic degree or pursuing an academic career may have to write a literature review at some point, there has been surprisingly little information on how to do it. A quick look at research methods textbooks in applied linguistics shows that none of them includes detailed information on how to conduct a literature review. In a study by Zaporozhetz (1987), Ph.D. advisors (N = 33) ranked the literature review section the lowest in terms of the amount of help they provided to their students; they reported spending the most time on the supervision of the methods chapter. The lack of interest in guiding students on how to do a literature review is probably because (1) it is considered an easy and transparent process, not a skill that needs to be trained, and (2) there are myriad ways of writing a literature review, which makes it challenging to provide general guidance. However, it can be argued that (1) doing a literature review is not a naturally acquired skill, and (2) despite the variety of styles and approaches, there are some common principles and procedures one could follow in order to write a successful review.
Jesson, Matheson, and Lacey (2011) defined a traditional literature review as “a written appraisal of what is already known … with no prescribed methodology” (p. 10). This definition suggests that a literature review is not a mere description of previous research; rather, it provides an evaluation of the research. The definition also distinguishes a traditional review from a research synthesis, which is carried out by following a set of well-defined procedures (see Plonsky & Oswald, 2015). Traditional reviews include the introductory sections of the reports of empirical studies as well as freestanding reviews such as those published in Language Teaching. The main purpose of a literature review for an empirical study is to set the stage for a new study. The purposes of freestanding reviews, by contrast, are more diverse, such as providing a state-of-the-art review of the research on a certain instructional treatment, clarifying the myths and central issues of a substantive domain, proposing a new research agenda, and summarizing the methods previous researchers have used to measure a certain construct. Because of the diversity of topics and purposes of freestanding reviews, there are no fixed formats to follow. The focus of this section, consequently, will be on the literature reviews for empirical studies, which, of course, overlap with freestanding reviews in many ways.
Before going into further detail, it is necessary to emphasize that what we are discussing here is how to do, not just how to write, a literature review; writing a literature review is the final step of the entire process of doing a literature review. The purpose of doing a literature review is trifold: (1) to contextualize the study to be conducted, (2) to inform the study design, and (3) to help the researcher interpret the results in the discussion section. Specifically, when contextualizing the current study, the researcher needs to identify what is known about the topic by discussing the related theories, research, and practices. The researcher also needs to identify what is unknown about the topic, explain how the current study is informed by, and deviates from, previous studies, and convince the reader of the significance of the current study. An equally important purpose of doing a literature review is to draw on the methodology of existing research to answer the research questions of the current study. Finally, doing a literature review enables the researcher to refer back to the theories and research expounded in the review section when discussing how the findings of the study may contribute to the existing body of knowledge.
The content of a literature review, or the scholarship to be evaluated, is of three types: conceptual, empirical, and practical. Conceptual knowledge concerns theories, including arguments, statements, claims, and terminology. Empirical knowledge refers to the findings of empirical studies as well as the methodological aspects of the studies. Practical knowledge can be divided into two types. One refers to the knowledge contributed by practitioners, including (1) the findings of action research, such as those reported in articles published in the ELT Journal or in the practitioners’ research section of Language Teaching Research; (2) guidelines and principles for effective practice, such as the information from teacher guides; and (3) opinions, debates, and discussions on public forums such as the Internet. The other type of practical knowledge pertains to policies and instructions formulated by government agencies to guide the practice of the domain in which the research is situated. These three types of knowledge correspond to three aspects of a research topic: theory (conceptual), research (empirical), and practice (practical). Although it is not a must to be all-inclusive, a literature review should minimally include the theories and research relating to the research topic. However, given the applied nature of our field, the review of the literature would appear incomplete if the practical dimension were left out.
The process of doing a literature review can be divided into six stages, which are elaborated in the following sections.
Stage 1: Defining the Problem
When doing a literature review, the first step is to define the research problem or formulate research questions for the study. Although research questions usually appear at the end of a literature review, in practice a researcher must have them at the beginning of the process. The research questions constitute a flagship guiding the literature review as well as other parts of an empirical study such as study design. Therefore, if the researcher is uncertain about where to start during a literature search, or even how to organize the literature review, the best way is to consider what questions the study seeks to answer and then find information on what theorists, researchers, and practitioners have said about the questions. Although the research questions may be fine-tuned as the review process unfolds, they serve as starting points leading the researcher towards the destination and guiding the literature search as well as later stages of the process.
Stage 2: Searching for the Literature
After formulating the research questions, the next step is to search for the literature to be included in the review. The most common search strategy is to use electronic databases, including (1) domain-general databases (e.g., Google Scholar), (2) domain-specific databases in applied linguistics (e.g., LLBA), (3) databases from neighbouring disciplines (e.g., PsycINFO in psychology), and (4) databases for Ph.D./M.A. dissertations and theses (e.g., ProQuest Dissertations & Theses). One emerging powerful source of information is public academic forums such as Academia, which are not only venues for academic communication between researchers but also large repositories of information. Another commonly used strategy is ancestry chasing, that is, mining the reference sections of primary studies (the term primary is used to distinguish the studies synthesized in a literature review from the review itself) and review articles to find relevant items.
Stage 3: Selecting Studies
Although a traditional literature review usually does not report how the studies included in the review are selected, the researcher must make decisions, albeit “behind the scenes,” on which retrieved studies actually go into the review. Unlike a research synthesis, which must be inclusive, a traditional review is selective. Although the selection criteria are idiosyncratic, some general principles should be adhered to. The first principle is that the selected studies must be representative. Representativeness has two dimensions: influential and diverse. Influential studies refer to milestone or seminal studies on the topic under investigation, which are frequently cited and/or are published in prestigious venues such as journals with high impact factors. By diverse, it is meant that the included studies must represent different perspectives, disparate findings, and varied contexts or populations. Critics of traditional reviews argue that the authors may include only studies that support a certain theory, show certain results, or are carried out with a certain methodology. Therefore, it is important to include studies that represent different theories and trends to reduce the likelihood of bias and arbitrariness. The second principle concerns relevance. Retrieved studies may be relevant to the research questions to varying degrees. Given the usually limited space for a literature review, it is important to include the most relevant research. The third consideration is study quality, that is, the reviewed studies should have high internal and external validity.
Stage 4: Reading the Literature
Two principles should be kept in mind when reading the literature. The first is to read carefully and understand thoroughly. A common problem in doing a literature review is piecemeal reading (Booth, Colomb, & Williams, 1995), which may cause incomplete or inaccurate understanding. There is no shortcut to a successful review, and familiarity with the literature is the only key. It is advisable to read key articles several times, read alternative explanations in the case of a difficult or complicated theory, and read all hallmark studies. The second principle is to read actively and critically instead of passively and mechanically. While reading, one should not assume that published research is perfect. Instead, consider whether the study in question was conducted using valid methods, whether the results are due to idiosyncratic rather than principled methods, how the study is similar to, and different from, other studies, whether the interpretations are warranted, and how the study informs one’s own study in terms of what one can draw on, what improvements one wants to make, or whether one will reorient the focus of one’s own study based on what has been learned from this study.
Stage 5: Organizing the Data
Information derived from the retrieved studies constitutes data for the literature review and should be organized in two ways: discretely and synthetically. In a discrete organization, the details about each individual study are recorded in a table or spreadsheet, and this type of information can be labelled study notes. Study notes can be arranged alphabetically according to the authors’ family names, and the notes about a primary study should include a brief summary of the study, followed by detailed information about the methods, results, and interpretations of the results. As to the literature review of a retrieved study, it would be useful to observe what theories or models the author refers to, how he/she defines the constructs, and how he/she summarizes and critiques previous research. However, the author’s comments and interpretations of other studies might be biased and inaccurate, and therefore it is always advisable to read the original articles in case of any
uncertainty. Finally, the study notes table should have a section for any comments the reviewer may have on any aspect of the study that merits further attention.
A synthetic organization involves extracting themes, patterns, or trends that have emerged from individual studies. Synthetic notes can be organized after the study notes about each individual study are in place, but it is easier and more efficient to work on both at the same time. Organizing synthetic notes entails categorizing the information from the primary studies and identifying and reflecting on the commonalities and disparities between them. How the information should be categorized depends on what has emerged from the studies and whether categorizing the studies in a certain way leads to an interesting point or a convincing argument. Categorization can be based on study findings. For example, some studies may have found a certain instructional treatment to be effective, while others may not. It would then be necessary to divide them into two categories and ascertain what characteristics each group of studies shares that lead to their respective and conflicting findings. In a similar vein, categorization can be done methodologically based on learner population, study context, measures of independent and dependent variables, and so on. Furthermore, methodological information collated from different studies can be the target of synthesis if one purpose of the review is to summarize the methodology of the primary studies. Finally, in addition to empirical knowledge, the synthetic notes should include sections for theoretical and practical knowledge so that all information needed for the review converges in one venue.
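As an illustration only, the discrete and synthetic ways of organizing data described in this stage could be sketched in Python as follows. The field names, the category labels, and the three placeholder studies are hypothetical and are not drawn from any actual review; a spreadsheet with the same columns would serve equally well.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class StudyNote:
    """One row of a study-notes table (all field names are illustrative)."""
    family_name: str       # first author's family name, used for sorting
    year: int
    summary: str           # brief summary of the study
    methods: str
    results: str
    interpretation: str
    finding: str           # reviewer-assigned category, e.g. "effective"
    comments: str = ""     # the reviewer's own remarks on the study


def sort_notes(notes):
    """Discrete organization: order notes alphabetically by family name."""
    return sorted(notes, key=lambda n: (n.family_name.lower(), n.year))


def group_by_finding(notes):
    """Synthetic organization: list the studies under each finding category."""
    groups = defaultdict(list)
    for note in sort_notes(notes):
        groups[note.finding].append(f"{note.family_name} ({note.year})")
    return dict(groups)


# Hypothetical placeholder studies, not real publications.
notes = [
    StudyNote("Author C", 2010, "...", "...", "...", "...", "effective"),
    StudyNote("Author A", 2014, "...", "...", "...", "...", "not effective"),
    StudyNote("Author B", 2008, "...", "...", "...", "...", "effective"),
]

ordered = [n.family_name for n in sort_notes(notes)]
groups = group_by_finding(notes)
print(ordered)
print(groups)
```

Grouping by a single label is only one possible categorization; the same records could be regrouped by learner population, study context, or any other methodological feature recorded in the notes.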
Stage 6: Writing Up the Review
Components of a literature review. A literature review typically consists of three components: an introduction, the body of the review, and research questions. The length of the introduction ranges from one or two paragraphs (for a journal article) to a chapter (for a Ph.D. thesis), but the purpose is the same: to give “the reader a sense of what was done and why” (APA, 2001, p. 16). This section of the literature review should provide a succinct overview of the topic, spell out the key issues surrounding the topic, state the significance of the study, identify the gaps in knowledge, explain the aims of the study, and inform the reader of the structure of the literature review. The body of the literature review should contextualize the current study by summarizing the theory, research, and practice of the research topic and justifying the significance of the study. The body of the review leads towards the research questions, and therefore before the research questions are introduced, it is necessary to summarize previous findings and controversies and show the links between previous research and the current study.
Structuring a literature review. There is no fixed format as far as how a literature review should be organized, but given the separation between the three types of knowledge (conceptual, empirical, and practical), the macro structure may consist of three major parts dealing with the theories, research, and practices relating to the focus of the current study. For an empirical study, the bulk of the review should be empirical knowledge, namely, the findings and methods of empirical studies.
There are two ways to present the information contributed by empirical studies: thematic and anthological. Thematic presentation is based on the themes emerging from the primary studies (from the synthetic notes), and the flow of information proceeds through arguments. An argument is “the logical presentation of evidence that leads to and justifies a conclusion” (Machi & McEvoy, 2012, p. 65). An argument has three components: claim, evidence, and warrant (Booth et al., 1995). A claim should be substantive and contestable in order to arouse readers’ interest and contribute to the existing body of knowledge. The evidence cited to support the claim needs to be:
1. accurate, that is, incorrect evidence should not be used
2. precise, namely, be specific, avoid being vague, and hedge or use qualifiers if absolute precision is impossible
3. sufficient, meaning there should be enough evidence for the validity of the claim
4. authoritative, that is, evidence should be robust and influential
5. perspicuous, which means evidence should be clear and easy to understand
The warrant of an argument is the reasoning linking the evidence and the claim; it is about the logical connections between the evidence and the claim, not the evidence or claim per se. Therefore, if there are no logical links between the claim and the evidence, then the argument is not warranted, even though the evidence is sound. In a nutshell, if one elects to present the information about empirical studies thematically, the literature review would be built around different arguments in which claims or themes are reported with supporting evidence from multiple sources or studies.
Anthological presentation means that the review is organized as a collection of individual studies, reporting details on the methods and results of each study, similar to study notes. Although this practice is prevalent in the field of applied linguistics, it is less effective than thematic presentation because the primary objective of a literature review is to critique, synthesize, and show readers what to make of previous findings rather than create an annotated bibliography. This is not to say that we should root out detailed descriptions of individual studies; rather, study details are necessary when they are important for making an argument or when hallmark or seminal studies are reported due to the special place they occupy in a literature review. However, because of the critical nature of traditional reviews, study details, when reported, should be accompanied by comments and critique.
The critical nature of a literature review. As Imel (2011, p. 146) pointed out, a literature review “provides a new perspective on the topic … [and is] more than the sum of its parts.” Key to developing a new perspective is a critical assessment of the findings and methods of the relevant body of research. The following strategies can be utilized to make a literature review a critical piece of writing:
Identify the differences and similarities between primary studies in terms of their findings and methods.
Include all representative findings, not only those that are “cherry-picked” to support your own position. If there are disparities, it is better to explain them than to ignore them. For opposing evidence that is uninterpretable, it is better to state that “certain studies support one conclusion and others support another” (APA, 2001, pp. 16–17), rather than provide an unconvincing explanation.
Challenge existing theories and research. If you disagree with an argument or claim on reasonable grounds, do not hide your position; if there is clear evidence showing the limitations of a previous study, point them out; propose new or alternative interpretations for previous findings. However, do not stigmatize previous research even though you have found loopholes and limitations.
Discuss the significance of previous studies. If you have discovered merits of a view, finding, or method, it is important to make them known to the reader. Thus, being critical entails demonstrating not only the weaknesses of previous research but also its strengths and contributions.
Evaluate and clarify controversies or opposing theoretical positions and discuss how they have influenced the research and practice of the substantive domain.
Propose new directions, methods, or theories, which may complement, but not necessarily contradict or supersede, existing research.
Use linguistic devices to show the relationships between ideas and describe the authors’ stances. For example, cohesive expressions, such as “similarly,” “in contrast,” “however,” “therefore,” and so on, are effective in making evident how information is related. Appropriate reporting verbs should be used to accurately capture the authors’ positions and stances, such as “contend,” “argue,” “state,” “observe,” “assert,” “report,” and so on.
Research Synthesis
Research synthesis grew out of the dissatisfaction with traditional reviews, which are deemed unscientific and subjective. Cooper (2016) argued that like an empirical study, a research synthesis must meet rigorous methodological standards in order for the conclusions to be trustworthy. Despite the different labels for research synthesis, experts seem to converge on the point that a research synthesis is comprehensive in coverage and transparent in reporting, and its purpose is to reach conclusions based on study findings, which may be used to guide practice and policy-making.
In applied linguistics, two types of research synthesis have emerged: methodological and substantive. Methodological synthesis provides a survey of one or more methodological aspects of the primary research with a view to evaluating whether current practices meet certain criteria and what improvements can be made. Plonsky and associates (e.g., Plonsky & Gass, 2011) have published a series of methodological syntheses assessing the study quality of the primary research in second language acquisition. For example, Plonsky (2013) synthesized 606 empirical studies published between 1990 and 2010 in two major journals, Language Learning and Studies in Second Language Acquisition, coding the studies for designs, statistical analyses, and reporting practices. The results showed a number of strengths and flaws, and the author proposed strategies to resolve the identified issues.
Substantive syntheses seek to aggregate the results of primary studies and reach conclusions about whether an instructional treatment is effective or a certain relationship exists or how frequently a certain phenomenon occurs. Depending on the way the data is analysed, substantive syntheses can be further divided into three types: thematic synthesis, vote-counting, and meta-analysis. In a thematic synthesis, study findings are reported as themes and categories. A thematic synthesis may appear similar to a traditional review, but it is not, because it has all the characteristics of a research synthesis. In particular, it seeks to reach conclusions based on the totality of research rather than critique some selected studies. However, freestanding state-of-the-art reviews, which fall into the traditional review paradigm, have the potential of being converted to thematic syntheses if they follow more rigorous procedures, transparently report the review methods, and narrow their foci. An example thematic synthesis is Dixon et al.’s (2012) synthesis of 72 empirical studies on the optimal conditions of second language acquisition. The synthesis sought to answer five research questions from four perspectives based on transparent study selection criteria. The article reported the information about each included study in a 20-page table, and the study findings were described in a narrative style.
e second type of substantivesynthesis is called vote-counting, in which
study ndings are analysed by tallying the number of signicant and nonsig-
nicant p-values reported in primary studies. An alternative, which is based
on similar principles, is to average the p-values generated by the primary
S. Li and H. Wang
studies. An example vote-counting synthesis is Ellis (2002), who aggregated
the results of 11 studies examining the eects of form-focused instruction on
the acquisition of implicit knowledge. Ellis conducted the synthesis by count-
ing the number of studies that reported signicant or nonsignicant eects,
followed by a discussion of the results. Strictly speaking, thematic synthesis,
which is discussed above, is one type of vote-counting because the conclusions
are based on whether primary studies reported significant findings, even though a thematic synthesis does not overtly count the number of significant findings.
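The tallying and averaging procedures described for vote-counting can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the p-values are invented and are not taken from Ellis (2002) or any other study, and the .05 threshold is an assumed convention rather than something the procedure itself prescribes.

```python
def vote_count(p_values, alpha=0.05):
    """Tally how many p-values fall below and at-or-above the threshold."""
    significant = sum(1 for p in p_values if p < alpha)
    return {
        "significant": significant,
        "nonsignificant": len(p_values) - significant,
    }


def mean_p(p_values):
    """The alternative mentioned in the text: average the p-values."""
    return sum(p_values) / len(p_values)


# Hypothetical p-values, one per primary study.
p_values = [0.01, 0.20, 0.03, 0.04, 0.60]

tally = vote_count(p_values)
average_p = mean_p(p_values)
print(tally, round(average_p, 3))
```

Note that every study contributes exactly one vote regardless of its sample size or the magnitude of its effect, which is the property that effect-size-based methods are meant to address.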
e third type of substantivesynthesis is meta-analysis, where eect sizes
extracted from each primary study are aggregated to obtain a mean eect size
as a proxy of the population eect(e.g., Li, 2010). Meta-analysis is a common
type of research synthesis, and in fact, research synthesis has, either implicitly
or explicitly, been equated with meta-analysis by many experts in this eld
(e.g., Cooper, 2016). Since Glass (1976) coined the term “meta-analysis,” it
has become the most preferred method of research synthesis in various elds
such as psychology and medicine. Unlike the dearth of guidelines on how to
conduct a traditional literature review, there has been an abundance of publi-
cations including journal articles, book chapters, and books, which provide
systematic instructions on how to carry out a meta-analysis (see Li, Shintani,
& Ellis, 2012). Given that it is the most favoured and common method of
research synthesis, the focus of this section is on meta-analysis.
Meta-analysis is a statistical procedure aiming to (1) aggregate the quantitative results of a set of primary studies conducted to answer the same research question(s) and (2) identify factors that moderate the effects across studies. Thus, a meta-analysis seeks to obtain a numeric index of the effects of a certain treatment or the strength of a relationship existing in the population; it also investigates whether the variation of the effects or relationship can be explained by systematic substantive and/or methodological features of the included studies. In a meta-analysis, the “participants” are the included studies, and the unit of analysis is the effect size contributed by each primary study or calculated based on available information. The variables for analysis are those investigated by the primary researchers or created by the meta-analyst on the basis of the features of the studies (e.g., participant demographics, research context, treatment length, etc.). The effect size, which takes different forms depending on the nature of the construct or the study design, is the building block of a meta-analysis. Importantly, the effect size is a standardized index that makes it possible to compare the results of different studies.
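To illustrate why a standardized index allows comparison across studies, the sketch below computes one common form of effect size, the standardized mean difference (Cohen's d with a pooled standard deviation). The choice of d is an assumption made for illustration, since the chapter notes that effect sizes take different forms, and the two studies and their descriptive statistics are hypothetical.

```python
import math


def cohens_d(mean_t, sd_t, n_t, mean_c, sd_c, n_c):
    """Standardized mean difference between a treatment and a control group."""
    pooled_var = ((n_t - 1) * sd_t**2 + (n_c - 1) * sd_c**2) / (n_t + n_c - 2)
    return (mean_t - mean_c) / math.sqrt(pooled_var)


# Hypothetical Study A: outcome test scored out of 100.
d_study_a = cohens_d(mean_t=75, sd_t=10, n_t=20, mean_c=70, sd_c=10, n_c=20)

# Hypothetical Study B: outcome test scored out of 20.
d_study_b = cohens_d(mean_t=15, sd_t=2, n_t=20, mean_c=14, sd_c=2, n_c=20)

print(d_study_a, d_study_b)
```

The raw mean differences (5 points versus 1 point) are not comparable because the tests use different scales, but once each difference is divided by the pooled standard deviation, both hypothetical studies yield the same effect size of half a standard deviation.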
Basing the analyses on effect sizes overcomes the limitations of the dichotomizing p-value in null hypothesis significance testing (NHST), which represents the presence or absence of a significant effect but tells nothing about the size of the effect. NHST is sensitive to sample size. For example, a significant p-value may result from a large sample even though the effect is small, and a nonsignificant p-value may be associated with a large effect but a small sample. Sun, Pan, and Wang (2010) reported that in 11% of the articles published in selected journals in educational psychology, there was a discrepancy between effect sizes and the results of NHST: medium to large effect sizes were associated with nonsignificant p-values, while small effect sizes were accompanied by significant p-values. Because the effect size contains information about the magnitude of an effect and it is exempt from the influence of sample size, its utility has gone beyond meta-analysis. For example, many applied linguistics journals have made it a requirement to include effect sizes in manuscripts reporting empirical studies.
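The sensitivity of NHST to sample size can be shown with a rough numerical sketch. It assumes a two-group comparison with equal group sizes, for which the test statistic is approximately d multiplied by the square root of n/2, and it uses a normal approximation for the two-tailed p-value instead of the exact t distribution. The effect sizes and sample sizes are hypothetical.

```python
import math


def approx_two_tailed_p(d, n_per_group):
    """Approximate two-tailed p-value for a standardized mean difference d.

    Assumes two independent groups of equal size and a normal approximation
    to the t distribution, so small-sample values are only indicative.
    """
    z = d * math.sqrt(n_per_group / 2)
    return math.erfc(abs(z) / math.sqrt(2))


# Small effect, very large sample: the p-value is far below .05.
p_small_effect_large_n = approx_two_tailed_p(d=0.2, n_per_group=1000)

# Large effect, very small sample: the p-value stays above .05.
p_large_effect_small_n = approx_two_tailed_p(d=0.8, n_per_group=10)

print(p_small_effect_large_n, p_large_effect_small_n)
```

Under these assumptions the smaller effect is declared significant and the larger one is not, which is the kind of discrepancy between effect sizes and NHST results described above.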
Meta-analysis is considered to be superior to other methods, such as vote-counting, and to a large extent the superiority lies in the usefulness of effect size (Li et al., 2012). First, by aggregating effect sizes across primary studies, meta-analysis provides information about the size of the overall effect, whereas vote-counting can only tell us whether an effect is present. Second, in meta-analysis, effect sizes can be weighted in proportion to sample sizes, while in vote-counting all data points carry the same weight. Third, meta-analysis can provide a precise estimate of an effect or relationship, whereas vote-counting methods can only demonstrate what the majority of the studies show about the construct. Fourth, meta-analysis follows rigorous statistical procedures. The results
are likely more robust and credible and can guide practice and policy-making.
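The second point, weighting effect sizes in proportion to sample sizes, can be sketched as follows. The weighting scheme shown is the simplest one consistent with the description above; many meta-analyses instead weight by inverse variance, which this sketch does not implement. The three effect sizes and sample sizes are hypothetical.

```python
def unweighted_mean(effect_sizes):
    """Every study counts equally, as in vote-counting."""
    return sum(effect_sizes) / len(effect_sizes)


def sample_size_weighted_mean(effect_sizes, sample_sizes):
    """Each effect size is weighted by the size of the sample behind it."""
    weighted_sum = sum(d * n for d, n in zip(effect_sizes, sample_sizes))
    return weighted_sum / sum(sample_sizes)


# Hypothetical studies: two small samples with sizeable effects and
# one large sample with a very small effect.
effect_sizes = [0.8, 0.1, 0.6]
sample_sizes = [20, 200, 30]

plain = unweighted_mean(effect_sizes)
weighted = sample_size_weighted_mean(effect_sizes, sample_sizes)
print(round(plain, 3), round(weighted, 3))
```

With these invented figures the unweighted mean is 0.5, whereas the weighted mean drops to about 0.22 because the large study dominates the estimate.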
How toDo Meta-Analysis?
An easy way to understand meta-analysis is to construe it as an empirical
study, which involves problem specication, data collection, data analysis,
and research reportwriting. In the following, we briey outline ve major
stages involved in conducting a meta-analysis along with some of the issues
and choices encountered at each stage. For a detailed tutorial on conducting
a meta-analysis in applied linguistics, see Li etal. (2012), Ortega (2015), and
Plonsky and Oswald (2015).
Stage 1: Identifying a Topic
As with a primary study, a successful meta-analysis starts with clear, well-defined research questions. However, unlike a primary study where the research questions are based on hypotheses and theories or gaps in previous research, the research questions for a meta-analysis are primarily those examined by the primary studies. Nevertheless, the meta-analyst still has to delineate the research domain based on a thorough understanding of the theories and the research that has been carried out. For instance, a meta-analysis on language aptitude (Li, 2015) requires an unequivocal theoretical and operational definition of aptitude. In educational psychology, aptitude refers to any person trait that affects learning outcomes, including cognitive (e.g., analytic ability) as well as affective variables (motivation, anxiety, etc.). However, in most aptitude research in applied linguistics, aptitude has been investigated as a cognitive construct. Therefore, defining the research topic is no easy matter, and the decisions at this preliminary stage have a direct impact on the scope of the synthesis and how it is implemented at later stages.
Stage 2: Collecting Data
Data collection for a meta-analysis involves searching for and selecting
studies for inclusion in the synthesis. Unlike a traditional review, which
usually does not report how the reviewed studies are identified and sieved, a
meta-analysis must report the details on the strategies, databases, and
keywords used during the search, as well as the criteria applied to
include/exclude studies. One judgement call at this stage concerns whether to
include unpublished studies. Evidence shows that studies that report
statistically significant results are more likely to be published or
submitted for publication (Lipsey & Wilson, 1993). Therefore, excluding
unpublished studies may lead to biased results, a phenomenon called
publication bias. One solution, of course, is to include unpublished studies.
However, there have been objections to this recommendation on the grounds
that unpublished studies are difficult to secure, that they lack internal and
external validity, and so on. Another solution is to explore the extent to
which publication/availability bias is present in the current dataset, such
as by plotting the calculated effect sizes to see whether studies with small
effects are missing, followed by a trim-and-fill analysis to see how the mean
effect size would change if the missing values were added, or by calculating
a fail-safe N statistic to probe whether the obtained results would be easily
nullified with the addition of a small number of studies.
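As a rough illustration of the fail-safe N idea, the sketch below uses Rosenthal's formulation, which is one of several available versions and is assumed here for illustration only. It asks how many unretrieved studies averaging a null result (z = 0) would be needed to pull the combined one-tailed test down to the .05 threshold. The z-scores are hypothetical, and the function names are ours.

```python
import math


def combined_z(z_scores, extra_null_studies=0.0):
    """Stouffer-style combined z after adding studies that average z = 0."""
    return sum(z_scores) / math.sqrt(len(z_scores) + extra_null_studies)


def fail_safe_n(z_scores, z_crit=1.645):
    """Rosenthal-style fail-safe N: the number of null studies needed to
    bring the combined z down to z_crit (1.645 = one-tailed alpha of .05)."""
    k = len(z_scores)
    n_fs = (sum(z_scores) ** 2) / (z_crit ** 2) - k
    return max(0.0, n_fs)


# Hypothetical z-scores from five primary studies (invented for illustration)
z_scores = [2.10, 1.50, 2.80, 0.90, 1.70]

n_fs = fail_safe_n(z_scores)
print(f"Combined z of retrieved studies: {combined_z(z_scores):.2f}")
print(f"Fail-safe N: {n_fs:.1f}")
```

A small fail-safe N relative to the number of retrieved studies suggests that the result could be overturned by a modest number of missing null findings.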
Traditional Literature Review andResearch Synthesis
Stage 3: Coding theData
Data coding involves reading the selected studies, extracting effect sizes
from the studies, and coding independent and moderating variables. As a
start, we would like to point out that most introductory texts ignore the
reading stage and emphasize the technical aspects of meta-analysis. However,
it is important to recognize that meta-analysis is fundamentally a
statistical tool that helps us solve problems, not an end in itself.
Ultimately, the contribution of a meta-analysis lies in the findings and
insights it generates, not the sophisticated nature of the statistical
procedure. Therefore, while every effort should be made to ensure statistical
rigour, careful reading of the study reports is critical to a thorough
understanding of the substantive domain, meaningful coding of the data, and
accurate interpretation of the results. While reading, the meta-analyst
should keep notes about each individual study, recording the main findings
and methodological details, which can be checked at any stage of the
meta-analysis.
While the meta-analyst should read the whole article for each study, effect
size extraction relates mainly to the results section of the study report.
There are three main types of effect sizes: d, r, and OR (odds ratio), which
represent the standardized mean difference between two groups, a correlation,
and the odds of an event occurring in one condition compared with another,
respectively. In a meta-analysis, the effect sizes from the primary studies
serve as the dependent variable, and the independent variables are of two
types: study-generated and synthesis-generated (Cooper, 2016).
Study-generated independent variables are those investigated in primary
studies, and these variables should be recorded intact. Synthesis-generated
independent variables are not directly investigated in primary studies, that
is, they are not manipulated as variables by primary researchers. Rather,
they are created by the meta-analyst a posteriori based on the
characteristics of the primary research. Synthesis-generated variables also
include what Lipsey and Wilson (2001) call study descriptors, which refer to
the methodological aspects of the primary research such as participants’ age,
research setting, and so on.
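The three effect size types mentioned above can be computed from the descriptive statistics typically found in a results section. The following is a minimal sketch using standard textbook formulas (pooled-SD Cohen's d, Pearson's r, and the odds ratio from a 2 x 2 table). All input values are hypothetical, and the function names are ours rather than those of any particular meta-analysis package.

```python
import math


def cohens_d(mean_1, sd_1, n_1, mean_2, sd_2, n_2):
    """Standardized mean difference between two groups, using the pooled SD."""
    pooled_var = ((n_1 - 1) * sd_1 ** 2 + (n_2 - 1) * sd_2 ** 2) / (n_1 + n_2 - 2)
    return (mean_1 - mean_2) / math.sqrt(pooled_var)


def pearson_r(xs, ys):
    """Correlation between two equally long lists of scores."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(ss_x * ss_y)


def odds_ratio(events_1, non_events_1, events_2, non_events_2):
    """Odds of the event in condition 1 relative to the odds in condition 2."""
    return (events_1 / non_events_1) / (events_2 / non_events_2)


# Hypothetical values, invented for illustration
d = cohens_d(mean_1=78, sd_1=10, n_1=25, mean_2=72, sd_2=10, n_2=25)
r = pearson_r([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
or_value = odds_ratio(events_1=30, non_events_1=10, events_2=20, non_events_2=20)

print(f"d = {d:.2f}, r = {r:.2f}, OR = {or_value:.2f}")
```

In practice, primary studies often report only test statistics (t, F, chi-square), in which case conversion formulas are needed; those are beyond this sketch.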
Stage 4: Analysing theData
In most meta-analyses, data analysis follows a two-step procedure:
aggregation of all effect sizes that produces an estimate of the population
effect, followed by moderator analysis exploring whether the variation of
effect sizes is
due to systematic differences between subgroups of studies formed by
treatment type, research setting, and so on. Effect size aggregation must be
theoretically meaningful. For example, in Li’s (2015) meta-analysis on
language aptitude, there were two types of studies based on different
theoretical frameworks and conducted via two distinguishable methodologies.
The effect sizes were aggregated separately because squashing the two study
types together was theoretically unsound, albeit statistically feasible.
In a meta-analysis, several issues need to be attended to for each analysis.
One is to assign different weights to the included studies in proportion to
their sample sizes such that large-sample studies carry more weight because
they provide more accurate estimates of the population effect. The second is
to make sure that one study contributes only one effect size for each
aggregation, to prevent effect size inflation, Type I errors, and the
violation of the “independence of data points” assumption of inferential
statistics. In the event that a study contributes several effect sizes, it is
advisable to either pick one or average them, depending on which choice suits
the research question for that analysis. The third is to calculate a
confidence interval for each mean effect size, which is the range within
which the population effect is estimated to fall. A confidence interval that
does not include zero means the effect size is significant, and a narrow
interval represents a robust effect. Fourth, for each mean effect size, it is
necessary to report a measure of variability (either standard error or
standard deviation) that shows the distribution of the aggregated effect
sizes. Finally, a homogeneity test, called the Qw test (the subscript “w”
stands for “within-group”), should be performed for each group of effect
sizes to assess the distribution of the effect sizes.
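The steps listed above can be sketched in a few lines of code. The sketch rests on several assumptions that are ours, not the chapter's: a fixed-effect model, inverse-variance weights (a common way of letting large-sample studies carry more weight), the usual large-sample variance approximation for d, and a 95% interval based on z = 1.96. The study records are hypothetical, and effect sizes from the same study are assumed to come from the same sample.

```python
import math
from collections import defaultdict


def d_variance(d, n_1, n_2):
    """Approximate sampling variance of a standardized mean difference."""
    return (n_1 + n_2) / (n_1 * n_2) + d ** 2 / (2 * (n_1 + n_2))


def one_effect_per_study(records):
    """Average effect sizes within a study so each study contributes one."""
    by_study = defaultdict(list)
    for study_id, d, n_1, n_2 in records:
        by_study[study_id].append((d, n_1, n_2))
    collapsed = []
    for study_id, rows in by_study.items():
        mean_d = sum(d for d, _, _ in rows) / len(rows)
        _, n_1, n_2 = rows[0]  # assumption: all rows share the same sample
        collapsed.append((study_id, mean_d, n_1, n_2))
    return collapsed


def aggregate(effects, z_crit=1.96):
    """Fixed-effect weighted mean, standard error, 95% CI, and Qw."""
    weights = [1 / d_variance(d, n_1, n_2) for _, d, n_1, n_2 in effects]
    ds = [d for _, d, _, _ in effects]
    total_w = sum(weights)
    mean_d = sum(w * d for w, d in zip(weights, ds)) / total_w
    se = math.sqrt(1 / total_w)
    q_w = sum(w * (d - mean_d) ** 2 for w, d in zip(weights, ds))
    return {
        "k": len(effects),
        "mean_d": mean_d,
        "se": se,
        "ci": (mean_d - z_crit * se, mean_d + z_crit * se),
        "q_w": q_w,          # compare against chi-square with df = k - 1
        "df": len(effects) - 1,
    }


# Hypothetical records: (study id, d, n in group 1, n in group 2)
records = [
    ("A", 0.40, 20, 20),
    ("A", 0.60, 20, 20),   # second effect size from the same study
    ("B", 0.80, 15, 15),
    ("C", 0.30, 60, 60),
]

collapsed = one_effect_per_study(records)
summary = aggregate(collapsed)
print(summary)
```

Averaging is only one of the two options mentioned above; picking a single effect size per study would replace `one_effect_per_study` with a selection rule suited to the research question.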
Moderator analysis, which is alternatively called subgroup analysis, is
often conducted through the Qb test (the subscript “b” stands for
“between-group”), which is similar to one-way ANOVA, with the only difference
being that the Qb test incorporates study weights in the calculation of
coefficients and standard errors. A significant Qb value indicates
significant differences between the subgroups of effect sizes, and post hoc
pairwise Qb tests are in order to locate the source of significance. In
applied linguistics, a common practice is to use the confidence interval as a
test of significance: lack of an overlap between two confidence intervals
indicates that the mean effect sizes are significantly different. However, it
must be pointed out that the opposite is not true, namely, an overlap does
not mean absence of significant differences (see Cumming, 2012, for further
details). Therefore, the confidence
interval is at best a conservative, if not unreliable, test of statistical
significance.
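The point about overlapping confidence intervals can be made concrete with a small sketch. It assumes a fixed-effect model in which Qb is obtained as the total Q minus the sum of the within-group Q values, and it computes the p-value only for the two-group case (df = 1), where the chi-square tail probability has a simple closed form. The subgroup labels and the (effect size, variance) pairs are hypothetical.

```python
import math


def fixed_effect(effects):
    """Return (weighted mean, total weight, Q) for (effect size, variance) pairs."""
    weights = [1 / var for _, var in effects]
    total_w = sum(weights)
    mean = sum(w * es for w, (es, _) in zip(weights, effects)) / total_w
    q = sum(w * (es - mean) ** 2 for w, (es, _) in zip(weights, effects))
    return mean, total_w, q


def q_between(groups):
    """Qb = Q_total - sum of within-group Q values; df = number of groups - 1."""
    all_effects = [pair for effects in groups.values() for pair in effects]
    q_total = fixed_effect(all_effects)[2]
    q_within = sum(fixed_effect(effects)[2] for effects in groups.values())
    return max(0.0, q_total - q_within), len(groups) - 1


def p_value_df1(q):
    """Upper-tail chi-square probability; valid only when df = 1."""
    return math.erfc(math.sqrt(q / 2))


def ci95(effects):
    """95% confidence interval around the weighted mean of one subgroup."""
    mean, total_w, _ = fixed_effect(effects)
    half_width = 1.96 * math.sqrt(1 / total_w)
    return mean - half_width, mean + half_width


# Hypothetical subgroups formed by treatment type (invented for illustration)
groups = {
    "explicit": [(0.70, 0.04), (0.60, 0.04), (0.65, 0.04), (0.65, 0.04)],
    "implicit": [(0.35, 0.04), (0.25, 0.04), (0.30, 0.04), (0.30, 0.04)],
}

q_b, df = q_between(groups)
p = p_value_df1(q_b)
ci_a, ci_b = ci95(groups["explicit"]), ci95(groups["implicit"])
intervals_overlap = ci_a[0] <= ci_b[1] and ci_b[0] <= ci_a[1]

print(f"Qb = {q_b:.3f}, df = {df}, p = {p:.4f}")
print(f"CIs overlap: {intervals_overlap}")
```

With these invented values the two intervals overlap, yet the Qb test is significant at the .05 level, which is exactly the situation in which relying on interval overlap would miss a real subgroup difference.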
Stage 5: Writing UptheResearch Report
Similar to the report for an empirical study, a meta-analytic report includes
the following sections: introduction, methods, results, discussion, and
conclusion. The introduction should contextualize the meta-analysis by:
1. elaborating the relevant theories
2. defining the research problem and scope
3. identifying the central issues and controversies
4. explaining the rationale for examining the variables
The methods section reports information about search strategies, selection
criteria, the coding protocol, and statistical procedures. Transparent
reporting is a defining feature that distinguishes meta-analysis from
traditional literature reviews; transparent reporting of methods and analytic
procedures makes it possible for other researchers to replicate a
meta-analysis to verify the findings. The results section should include a
summary of the methodological aspects of the synthesized studies to provide
an overall picture of how studies in this domain have been conducted,
followed by the results of the meta-analysis. Valentine, Pigott, and
Rothstein (2010) recommended creating a table providing a description of the
characteristics and results of each study including participant information,
the treatment, and the calculated effect sizes. For the meta-analytic
results, each mean effect size should be accompanied by the number of effect
sizes (k), the confidence interval, a measure of dispersion (standard error
or standard deviation), the results of a homogeneity test, and the p-value
for the effect size. Results of a moderator analysis should include the
number of effect sizes for each group/condition, the Qb value, and the
related p-value.
In the discussion section, the meta-analyst may interpret the results
“internally” with reference to the methods of the primary studies and
“externally” with reference to theories and other reviews such as
state-of-the-art traditional reviews and meta-analyses on this and similar
topics. The results can also be discussed in terms of how they can be used to
guide practice and policy-making as well as future research.
Traditional Review andResearch Synthesis:
Comparison andIntegration
Given that traditional reviews and research syntheses have been discussed as
two separate approaches in the literature, there seems to be a need to make a
direct comparison between them. The purpose of the comparison is not only to
distinguish them but also to stimulate thoughts on how to accurately
understand what the two approaches can achieve and how to overcome their
limitations and maximize their strengths. Before comparing the two methods of
review, it is helpful to provide a taxonomy of literature review as a genre,
which, according to Cooper (2016), can be classified on six dimensions:
focus, goal, perspective, coverage, organization, and audience. The focus of
a review refers to the type of information to be included, which may take the
form of theories, empirical findings, research methods, and
policies/practices. The goals to accomplish may include summarizing existing
knowledge, evaluating the validity of certain aspects of the primary
research, or identifying issues central to the field. Perspective refers to
whether a reviewer adopts a neutral position or is predisposed to a certain
standpoint. Coverage concerns whether a review is based on all available
studies or a selected set of studies. In terms of organization, a review can
be structured based on the historical development of the research, the themes
that emerged in the literature, or the methods utilized by subgroups of
studies. The audience of a review may vary between researchers,
practitioners, the general public, and so on. It is noteworthy that the
options for each dimension can be applied jointly when identifying the
characteristics of a review or making decisions on what type of review one
plans to carry out. For example, the focus of a review can be on both theory
and research findings. Similarly, a review can be structured historically,
but the research within a certain period can be synthesized thematically.
While Cooper’s scheme does not directly distinguish traditional reviews and
research syntheses, it helps us understand the differences between the two
approaches, which are distinguishable along the following lines (see
Table 6.1). First, traditional reviews are based on selected studies while
research syntheses are based on all available studies. Certainly, research
syntheses may also be selective, following certain screening criteria, but in
general they tend to be more inclusive than traditional reviews. Second, the
studies included in a traditional review are vetted via the reviewer’s
expertise or authority; in a research synthesis, however, the primary studies
are often assessed based on criteria for study quality. Third, traditional
reviews do not follow protocols or procedures, whereas research synthesis
follows rigorous methodology. Fourth, one important feature of research
synthesis is its transparency in reporting the review procedure or process,
which is absent in traditional reviews. A corollary is that a research
synthesis is replicable, but a traditional review is not. Fifth, traditional
reviews do not often have research questions and often seek to provide
overarching descriptions of previous research; research syntheses, in
contrast, aim to answer clearly defined research questions. The above
arguments and observations seem to suggest that traditional reviews are
subjective and may lead to false claims and that research syntheses are more
objective and generate robust results that can guide practice and
policy-making.
However, it is arbitrary and inaccurate to discount traditional reviews as
valueless, and their utility is justifiable on the following grounds. First,
a traditional review is often part of a journal article, a master’s thesis,
or a Ph.D. dissertation, and the primary purpose is to contextualize a new
study and draw on existing research methods to answer the research questions
of the current study. In this context, it seems less important to include all
available studies and reach conclusions based on the totality of the
research. Second, traditional reviews are flexible and can be used to
synthesize knowledge other than research findings such as theories,
practices, and policies. Third, traditional reviews are often critical. Thus,
if the purpose of a review is to critique certain aspects of previous
research such as the instruments used to measure a certain construct rather
than collate information to show the effectiveness of a certain treatment or
the presence of a certain relationship, then the traditional approach is more
appropriate. Finally, traditional reviews are appropriate when (1) it is
premature to conduct a research synthesis because of the lack of research,
and (2) aggregation of evidence is not meaningful because the primary studies
are carried out using heterogeneous methods.
Focus
  Traditional review: Theory, research, and practice
  Research synthesis: Research
Purpose
  Traditional review: To justify further research; identify central issues and research gaps; discuss state of the art; critique previous research
  Research synthesis: To answer one or more research questions by collating evidence from previous research; resolve controversies; guide practice
Coverage
  Traditional review: Selective: most relevant and representative; no selection
  Research synthesis: Inclusive: all relevant studies; based on systematic search and justified selection criteria
Methodology
  Traditional review: No prescribed methodology
  Research synthesis: With transparent methodology
Structure
  Traditional review: Organized by themes and patterns
  Research synthesis: Analogous to the template of a research report, including an introduction, methods, results, discussion, and conclusion
Style
  Traditional review: Narrative, critical, and interpretive
  Research synthesis: Descriptive, inductive, and
Pros
  Traditional review: Flexible; appropriate when the purpose is to critique rather than aggregate research findings
  Research synthesis: Objective, transparent, and replicable; results may inform practice and policy-making
Cons
  Traditional review: Subjective; based on idiosyncratic methods; not replicable
  Research synthesis: Subject to the quality of available studies; comparing oranges and apples;
So what do we make of the status quo of literature review as a field of
research in relation to the two different review methods? First of all, the
differences between the two methods are suggestive rather than conclusive,
and they stand on a continuum rather than in a dichotomy. For example,
although traditional reviews are more likely to be critical, those adopting a
more systematic approach may also critique certain aspects of the research
based on their aggregated results. Therefore, it is better to consider the
differences in terms of overall emphasis and orientation rather than treat
them as polarized disparities. Second, the differences are based on what has
been observed about two broad types of review. They are a posteriori and
descriptive, not stipulated or prescriptive; in other words, they do not have
to differ the way they are assumed to be different. For example, traditional
reviews have been criticized for being subjective, but there is no reason why
they cannot become more objective by following more rigorous procedures, such
as conducting more thorough literature searches, including more studies that
represent different perspectives, and so on.
Finally, and importantly, we should find ways to integrate traditional
literature reviews and more systematic approaches such as meta-analysis, and
such initiatives have already taken place in our field. For example,
Carpenter (2008) included a meta-analysis of the predictive validity of the
Modern Language Aptitude Test in her Ph.D. dissertation when reviewing the
literature on language aptitude. In this case, a small-scale meta-analysis is
embedded in a traditional literature review to explore an important issue,
which constitutes an assimilative approach where a meta-analysis plays a
supplementary role. Another way is to adopt a more balanced approach where
the two review methods are utilized to synthesize (1) different types of
knowledge or (2) studies conducted with different methods. In the case of
(1), a good example is Xu, Maeda, Lv, and Jinther (2015), who used more
traditional methods to synthesize the theoretical and methodological aspects
and meta-analysis to aggregate the findings of the primary research. In the
case of (2), an example is Li (2017), who conducted a comprehensive review of
the research on teachers’ and learners’ beliefs on corrective feedback. The
author meta-analysed the results of the studies conducted using similar
methods (e.g., studies using a five-point Likert scale) but described the
themes and patterns demonstrated by studies that deviated from the majority,
such as those using a four- or three-point scale and those that used
qualitative methods (e.g., observations, interviews, or diaries). Therefore,
rather than make an a priori decision to carry out a certain type of review,
we may customize our methodology based on the available data, using or
integrating different approaches so as to provide a comprehensive, impartial
view of the research domain.
