Conference Paper

From Replication to Substantiation in Applied Linguistics: A Complexity Theory Perspective



In contemporary methodological thinking, replication increasingly holds an important place across disciplines, including applied linguistics. At the same time, relatively little attention has been paid to replication in the context of complex dynamic systems theory (CDST), perhaps due to uncertainty regarding the epistemology–methodology match between these domains. It is important for CDST scholars to engage with the dialog on replication and the various questions associated with it. In this paper, we explore the place of replication in CDST research and argue that three conditions must be in place for replication research to be effective: results interpretability, theoretical maturity, and terminological precision. We consider whether these conditions hold in the applied linguistics body of work, and then propose a more comprehensive framework centering on what we call substantiation research, of which replication research is only one aspect. This framework offers an alternative classification of what used to be labeled “replication” research, organized by theoretical maturity, function, and methodology, in order to better contextualize successful and failed replication attempts. Using this framework, we then discuss three approaches to dealing with replication from a CDST perspective: moving from a representing to an intervening mindset, from a comprehensive-theory to a mini-theory mindset, and from individual findings to a cumulative mindset.
Ali H. Al-Hoorie, Royal Commission for Jubail and Yanbu, Saudi Arabia
Phil Hiver, Florida State University, USA
Diane Larsen-Freeman, University of Michigan, USA
Wander Lowie, University of Groningen, Netherlands
Presented at AAAL, Pittsburgh, Pennsylvania
Why replication?
Classifications of replication research
Limitations of these classifications
Substantiation framework
Complexity perspective
Why should journals accept (let alone encourage) replication?
Isn’t science about finding new things, not repeating what was previously done?
The problem with this is
Creating an incentive structure that favors innovation over verification
Unreplicable findings will accumulate
Replication crisis, confidence crisis, devaluing science
The replication dilemma
If you find the same results, the journal & reviewers will say: we already know that
If you don’t find the same results, the journal, reviewers & initial authors may blame you
Your expertise
Design deviation
(Clemens, 2017)
We do not want to add to the confusion here with
more definitions! Rather, we intend to present you
with three existing definitions and define a
practical use for each within a systematic,
cumulative approach to replication work. The
intention is to set out firm replication research
series which are interdependent, and in which you
can participate first through close, followed by
approximate, and perhaps then supplemented by
conceptual replications. (p. 72)
Close: Change in one variable
Approximate: Change in two variables
Conceptual: More deviation
A lot of confusion over terminology
Different fields have different conventions
In the field of language learning, some have similarly proposed certain terms
Thus, replication terms not set in stone
How do we interpret failed replications?
If failed replications do not question the initial results in some way, what’s the point?
Does all this apply to open systems?
Unlike lab-controlled settings, many phenomena are too complex to control
Even if they follow deterministic structures (Lorenz, 1963)
Does it make sense to attempt replication, if we already know it will fail (sensitivity to initial conditions)?
Do failed replications question the initial results, or reveal our own lack of knowledge?
Should we call all these replications?
Or are some extensions? (e.g., partial replication)
Is it a conceptual replication, or a completely different study?
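The question of sensitivity to initial conditions raised above can be made concrete with a minimal numerical sketch of the Lorenz (1963) system. Everything here is illustrative: the parameter values are the standard ones from Lorenz’s paper, and the crude Euler integration is only meant to show two nearly identical deterministic trajectories pulling apart.

```python
# Sensitivity to initial conditions in the Lorenz (1963) system:
# two deterministic trajectories whose starting points differ by one
# part in 10^8 end up macroscopically far apart, which is why "exact"
# replication of open, complex systems is practically unattainable.

def lorenz_step(state, dt=0.001, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Advance the Lorenz equations by one (crude) Euler step."""
    x, y, z = state
    return (
        x + dt * sigma * (y - x),
        y + dt * (x * (rho - z) - y),
        z + dt * (x * y - beta * z),
    )

def distance(a, b):
    """Euclidean distance between two states."""
    return sum((p - q) ** 2 for p, q in zip(a, b)) ** 0.5

a = (1.0, 1.0, 1.0)
b = (1.0 + 1e-8, 1.0, 1.0)  # perturbed by one part in 10^8

initial_gap = distance(a, b)
max_gap = 0.0
for _ in range(30_000):  # roughly 30 model time units
    a, b = lorenz_step(a), lorenz_step(b)
    max_gap = max(max_gap, distance(a, b))

print(f"initial gap: {initial_gap:.1e}, max gap: {max_gap:.2f}")
```

With a perturbation far below any realistic measurement precision, the two trajectories become macroscopically different within a few dozen time units, even though the system is fully deterministic.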
Conceptual replications “are in a weaker position for ascribing different findings to the adaptations made to the initial study” (Marsden et al., 2018, p. 366)
“[Conceptual replications] can be a more high-risk undertaking than close or approximate replications in the sense that failure to replicate will leave us with little if anything to say about the original” (Porte & McManus, 2019, p. 94)
“if our study did not come to the same conclusion, can we say we have conceptually not replicated our original? Obviously not” (Porte & McManus, 2019, pp. 93–94)
“when a conceptual replication fails to support a theory, rather than reduce our belief in the theory, we are tempted to explain the failure in terms of methodological problems with the operationalization of the key variables. As such, conceptual replication has been described by critics as solely a mechanism for confirmation bias” (Crandall & Sherman, 2016, p. 97)
“the likelihood of being able to publish a conceptual replication failure in a journal is very low. But here, the failure will likely generate no gossip – there is nothing interesting enough to talk about here” (Pashler & Harris, 2012, p. 533)
Three conditions need to exist:
Result interpretability
Theoretical maturity
Terminological precision
1) Result interpretability
Replication should not be defined in relation to its operational characteristics (how similar to
the initial study: direct, partial, conceptual)
Instead, “a study for which any outcome would be considered diagnostic evidence about a claim from prior research” (Nosek & Errington, 2020, p. 1)
Positive results must support the initial study findings
Negative results must question the initial study findings
2) Theoretical maturity
Theory explains necessary and sufficient conditions to obtain a finding
Therefore, even with some design deviations a replication can still be direct
Constraints-on-Generality statements: “The current publishing model incentivizes authors to make the strongest possible claims of generality; broadly generalizable findings are more likely to be published and more likely to be influential” (Simons et al., 2017, p. 1124)
Failed conceptual replications can still hold evidentiary value
Verification of postulated processes
Testing the underlying theory
3) Terminological precision
Are all types really “replication”?
E.g., partial and conceptual replications are actually:
an extension,
a follow-up, or
a generalizability test
A conceptual replication is “a practical oxymoron” (Freese & Peterson, 2018, p. 302)
Repeat what?
Is it appropriate to use “replication” as the umbrella term?
Al-Hoorie et al. (in press)
For open complex systems, also:
Interactions: the way that components of any treatment/process influence each other
Iterativity: what happens in the next step in the process depends on the preceding step
Interdependence: various nested processes that unfold over many interdependent timescales
Self-organization: attractors emerge spontaneously from components and their interactions
When CDST “replicates” an effect, it does not refer to a simple relationship, but to something that takes place in the form of complex, iterative, time-scaled, situated processes (Hiver & Al-Hoorie, 2020; Hiver et al., in press)
Can this be a direct replication? Or only conceptual extension?
“This paper provides evidence that contextual factors are associated with reproducibility, even after adjusting for other methodological variables reported or hypothesized to impact replication success. Attempting a replication in a different time or place or with a different sample can alter the results of what are otherwise considered ‘direct replications.’ The results suggest that many variables in psychology and other social sciences cannot be fully understood apart from the cultural and historical contexts that define their meanings”
Van Bavel et al. (2016, p. 6457)
Theoretical maturity
Constraints on generality
Encourage exploratory (descriptive) research, not just confirmatory
“Confirmatory research bias”?
Open science and transparency
Expect effect sizes from individual studies to be overestimated
Adopt a meta-analytic mindset to disconfirm results
Not only replication but broader substantiation efforts
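The meta-analytic mindset mentioned above can be sketched in a few lines: rather than trusting any single (likely overestimated) effect size, pool estimates across studies. The five effect sizes and standard errors below are invented for illustration, and the pooling shown is standard inverse-variance (fixed-effect) weighting.

```python
# Inverse-variance (fixed-effect) pooling of effect sizes: each study's
# estimate is weighted by its precision (1 / SE^2). All study values
# here are hypothetical.
import math

studies = [  # (effect size d, standard error)
    (0.45, 0.20),
    (0.10, 0.15),
    (0.30, 0.25),
    (0.05, 0.10),
    (0.25, 0.18),
]

weights = [1 / se ** 2 for _, se in studies]
pooled = sum(w * d for (d, _), w in zip(studies, weights)) / sum(weights)
pooled_se = math.sqrt(1 / sum(weights))
ci_low, ci_high = pooled - 1.96 * pooled_se, pooled + 1.96 * pooled_se

print(f"pooled d = {pooled:.3f}, 95% CI [{ci_low:.3f}, {ci_high:.3f}]")
```

In this toy example the pooled estimate sits well below the largest single-study effect, illustrating how a cumulative view tempers conclusions drawn from individual findings.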
“Many solutions suggested for the [reproducibility] concerns… are decades old (Meehl 1967). The reproducibility crisis presents a sober occasion to revisit them, given our accumulating research experience.”
“Inevitably, psychology will still face a reproducibility problem 20 years from now, even when recommendations such as preregistration, open materials, data, and code are standard (cf. Meehl 1990a).”
Alexander and Moors (2018, pp. 13–14)
Replication is HARD
Especially with open systems
Just like we realized that preregistration is HARD (Nosek et al., 2019)
It would be naïve to think of replication as just doing the same study again
If there is one thing that we have learned from the history of science:
“Anything that is complex, upon closer examination becomes more complex” (Hansen, 2011, p. 119).
Alexander, D. M., & Moors, P. (2018). If we accept that poor replication rates are mainstream. Behavioral and Brain Sciences, 41, e121.
Al-Hoorie, A. H., Hiver, P., Larsen-Freeman, D., & Lowie, W. (in press). From replication to substantiation: A complexity theory perspective. Language Teaching.
Clemens, M. A. (2017). The meaning of failed replications: A review and proposal. Journal of Economic Surveys, 31(1), 326–342.
Crandall, C. S., & Sherman, J. W. (2016). On the scientific superiority of conceptual replications for scientific progress. Journal of Experimental Social Psychology, 66, 93–99.
Freese, J., & Peterson, D. (2018). The emergence of statistical objectivity: Changing ideas of epistemic vice and virtue in science. Sociological Theory, 36(3), 289–313.
Hansen, W. B. (2011). Was Herodotus correct? Prevention Science, 12(2), 118–120.
Hiver, P., & Al-Hoorie, A. H. (2020). Research methods for complexity theory in applied linguistics. Multilingual Matters.
Hiver, P., Al-Hoorie, A. H., & Reid, E. (in press). Complex dynamic systems theory in language learning: A scoping review of 25 years of research. Studies in
Second Language Acquisition.
Lorenz, E. N. (1963). Deterministic nonperiodic flow. Journal of the Atmospheric Sciences, 20(2), 130–141.
Marsden, E., Morgan-Short, K., Thompson, S., & Abugaber, D. (2018). Replication in second language research: Narrative and systematic reviews and recommendations for the field. Language Learning, 68(2), 321–391.
Nosek, B. A., & Errington, T. M. (2020). What is replication? PLOS Biology, 18(3), e3000691.
Nosek, B. A., Beck, E. D., Campbell, L., Flake, J. K., Hardwicke, T. E., Mellor, D. T., van ’t Veer, A. E., & Vazire, S. (2019). Preregistration is hard, and worthwhile. Trends in Cognitive Sciences, 23(10), 815–818.
Pashler, H., & Harris, C. R. (2012). Is the replicability crisis overblown? Three arguments examined. Perspectives on Psychological Science, 7(6), 531–536.
Plesser, H. E. (2018). Reproducibility vs. replicability: A brief history of a confused terminology. Frontiers in Neuroinformatics, 11, Article 76.
Porte, G. K., & McManus, K. (2019). Doing replication research in applied linguistics. Routledge.
Simons, D. J., Shoda, Y., & Lindsay, D. S. (2017). Constraints on Generality (COG): A proposed addition to all empirical papers. Perspectives on Psychological Science, 12(6), 1123–1128.
Van Bavel, J. J., Mende-Siedlecki, P., Brady, W. J., & Reinero, D. A. (2016). Contextual sensitivity in scientific reproducibility. Proceedings of the National Academy of Sciences, 113(23), 6454–6459.
Thank You