What is replication?
Brian A. Nosek*, Timothy M. Errington
1 Center for Open Science, Charlottesville, Virginia, United States of America, 2 University of Virginia, Charlottesville, Virginia, United States of America
Credibility of scientific claims is established with evidence for their replicability using new data. According to common understanding, replication is repeating a study’s procedure and observing whether the prior finding recurs. This definition is intuitive, easy to apply, and incorrect. We propose that replication is a study for which any outcome would be considered diagnostic evidence about a claim from prior research. This definition reduces emphasis on operational characteristics of the study and increases emphasis on the interpretation of possible outcomes. The purpose of replication is to advance theory by confronting existing understanding with new evidence. Ironically, the value of replication may be strongest when existing understanding is weakest. Successful replication provides evidence of generalizability across the conditions that inevitably differ from the original study; unsuccessful replication indicates that the reliability of the finding may be more constrained than recognized previously. Defining replication as a confrontation of current theoretical expectations clarifies its important, exciting, and generative role in scientific progress.
Credibility of scientific claims is established with evidence for their replicability using new data [1]. This is distinct from retesting a claim using the same analyses and same data (usually referred to as reproducibility or computational reproducibility) and using the same data with different analyses (usually referred to as robustness). Recent attempts to systematically replicate published claims indicate surprisingly low success rates. For example, across 6 recent replication efforts of 190 claims in the social and behavioral sciences, 90 (47%) replicated successfully according to each study’s primary success criterion [2]. Likewise, a large-sample review of 18 candidate gene or candidate gene-by-interaction hypotheses for depression found no support for any of them [3], a particularly stunning result considering that more than 1,000 articles have investigated their effects. Replication challenges have spawned initiatives to improve research rigor and transparency such as preregistration and open data, materials, and code [4–6]. Simultaneously, failures-to-replicate have spurred debate about the meaning of replication and its implications for research credibility. Replications are inevitably different from the original studies. How do we decide whether something is a replication? The answer shifts the conception of replication from a boring, uncreative, housekeeping activity to an exciting, generative, vital contributor to research progress.
PLOS Biology | March 27, 2020 1 / 8
Citation: Nosek BA, Errington TM (2020) What is replication? PLoS Biol 18(3): e3000691.
Published: March 27, 2020
Copyright: © 2020 Nosek, Errington. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Funding: This work was supported by grants from Arnold Ventures, John Templeton Foundation, Templeton World Charity Foundation, and Templeton Religion Trust. The funders had no role in the preparation of the manuscript or the decision to publish.
Competing interests: We have read the journal’s policy and the authors of this manuscript have the following competing interests: BAN and TME are employees of the Center for Open Science, a nonprofit technology and culture change organization with a mission to increase openness, integrity, and reproducibility of research.
Provenance: Commissioned; not externally peer reviewed.
Replication reconsidered
According to common understanding, replication is repeating a study’s procedure and observing whether the prior finding recurs [7]. This definition of replication is intuitive, easy to apply, and incorrect.
The problem is this definition’s emphasis on repetition of the technical methods—the procedure, protocol, or manipulated and measured events. Why is that a problem? Imagine an original behavioral study was conducted in the United States in English. What if the replication is to be done in the Philippines with a Tagalog-speaking sample? To be a replication, must the materials be administered in English? With no revisions for the cultural context? If minor changes are allowed, then what counts as minor to still qualify as repeating the procedure? More broadly, it is not possible to recreate an earthquake, a supernova, the Pleistocene, or an election. If replication requires repeating the manipulated or measured events of the study, then it is not possible to conduct replications in observational research or research on past events.
The repetition of the study procedures is an appealing definition of replication because it often corresponds to what researchers do when conducting a replication—i.e., faithfully follow the original methods and procedures as closely as possible. But the reason for doing so is not because repeating procedures defines replication. Replications often repeat procedures because theories are too vague and methods too poorly understood to productively conduct replications and advance theoretical understanding otherwise [8].
Prior commentators have drawn distinctions between types of replication such as “direct” versus “conceptual” replication and argued in favor of valuing one over the other (e.g., [9,10]). By contrast, we argue that distinctions between “direct” and “conceptual” are at least irrelevant and possibly counterproductive for understanding replication and its role in advancing knowledge. Procedural definitions of replication are masks for underdeveloped theoretical expectations, and “conceptual replications” as they are identified in practice often fail to meet the criteria we develop here and deem essential for a test to qualify as a replication.
Replication redux
We propose an alternative definition for replication that is more inclusive of all research and more relevant for the role of replication in advancing knowledge. Replication is a study for which any outcome would be considered diagnostic evidence about a claim from prior research. This definition reduces emphasis on operational characteristics of the study and increases emphasis on the interpretation of possible outcomes.
To be a replication, 2 things must be true: outcomes consistent with a prior claim would increase confidence in the claim, and outcomes inconsistent with a prior claim would decrease confidence in the claim. The symmetry promotes replication as a mechanism for confronting prior claims with new evidence. Therefore, declaring that a study is a replication is a theoretical commitment. Replication provides the opportunity to test whether existing theories, hypotheses, or models are able to predict outcomes that have not yet been observed. Successful replications increase confidence in those models; unsuccessful replications decrease confidence and spur theoretical innovation to improve or discard the model. This does not imply that the magnitude of belief change is symmetrical for “successes” and “failures.” Prior and existing evidence inform the extent to which replication outcomes alter beliefs. However, as a theoretical commitment, replication does imply precommitment to taking all outcomes seriously.
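The two-sided criterion can be sketched as a toy Bayesian update (our illustration, not a formalism from this paper; all likelihood values are invented for the sketch): a study is diagnostic when each possible outcome has different probabilities under “claim is true” versus “claim is false,” so either outcome moves belief.

```python
def posterior(prior, p_outcome_if_true, p_outcome_if_false):
    """Bayes' rule: belief in a claim after observing one study outcome."""
    joint_true = p_outcome_if_true * prior
    return joint_true / (joint_true + p_outcome_if_false * (1 - prior))

prior = 0.5  # starting belief in the claim

# Diagnostic replication: a "success" is much likelier if the claim is true,
# a "failure" much likelier if it is false -- so both outcomes shift belief.
after_success = posterior(prior, 0.8, 0.1)  # rises above 0.5
after_failure = posterior(prior, 0.2, 0.9)  # falls below 0.5

# Non-diagnostic test: a "failure" is about equally likely either way
# (it may merely mark a boundary condition), so belief barely moves.
after_uninformative_failure = posterior(prior, 0.45, 0.55)

print(after_success, after_failure, after_uninformative_failure)
```

On these made-up numbers, the diagnostic outcomes move belief to roughly 0.89 or 0.18, while the non-diagnostic failure leaves it near 0.45; only the first design confronts the claim in both directions.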
Because replication is defined based on theoretical expectations, not everyone will agree that one study is a replication of another. Moreover, it is not always possible to make precommitments to the diagnosticity of a study as a replication, often for the simple reason that study
outcomes are already known. Deciding whether studies are replications after observing the outcomes can leverage post hoc reasoning biases to dismiss “failures” as nonreplications and “successes” as diagnostic tests of the claims, or the reverse if the observer wishes to discredit the claims. This can unproductively retard research progress by dismissing replication counterevidence. Simultaneously, replications can fail to meet their intended diagnostic aims because of error or malfunction in the procedure that is only identifiable after the fact. When there is uncertainty about the status of claims and the quality of methods, there is no easy solution to distinguishing between motivated and principled reasoning about evidence. Science’s most effective solution is to replicate, again.
At its best, science minimizes the impact of ideological commitments and reasoning biases by being an open, social enterprise. To achieve that, researchers should be rewarded for articulating their theories clearly and a priori so that they can be productively confronted with evidence [4,6]. Better theories are those that make it clear how they can be supported and challenged by replication. Repeated replication is often necessary to resolve confidence in a claim, and, invariably, researchers will have plenty to argue about even when replication and precommitment are normative practices.
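“Replicate, again” has a simple quantitative reading under toy Bayesian bookkeeping (again our illustration, with invented likelihoods): each diagnostic replication compounds on the previous ones, so repeated outcomes resolve confidence in either direction even when no single study settles the question.

```python
def update(belief, p_outcome_if_true, p_outcome_if_false):
    # One Bayesian update of belief in a claim given a single study outcome.
    joint_true = p_outcome_if_true * belief
    return joint_true / (joint_true + p_outcome_if_false * (1 - belief))

belief = 0.5
# Three successful replications, each moderately diagnostic (the observed
# outcome is 80% likely if the claim is true, 20% likely if it is false).
for _ in range(3):
    belief = update(belief, 0.8, 0.2)
print(round(belief, 3))  # -> 0.985
```

A matching run of failures (swap the two likelihoods) drives belief symmetrically down toward 0.015, which is why accumulation across replications, rather than any one study, resolves a claim.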
Replication resolved
The purpose of replication is to advance theory by confronting existing understanding with new evidence. Ironically, the value of replication may be strongest when existing understanding is weakest. Theory advances in fits and starts with conceptual leaps, unexpected observations, and a patchwork of evidence. That is okay; it is fuzzy at the frontiers of knowledge. The dialogue between theory and evidence facilitates identification of contours, constraints, and expectations about the phenomena under study. Replicable evidence provides anchors for that iterative process. If evidence is replicable, then theory must eventually account for it, even if only to dismiss it as irrelevant because of invalidity of the methods. For example, the claims that there are more obese people in wealthier countries compared with poorer countries on average and that people in wealthier countries live longer than people in poorer countries on average could both be highly replicable. All theoretical perspectives about the relations between wealth, obesity, and longevity would have to account for those replicable claims.
There is no such thing as exact replication. We cannot reproduce an earthquake, era, or election, but replication is not about repeating historical events. Replication is about identifying the conditions sufficient for assessing prior claims. Replication can occur in observational research when the conditions presumed essential for observing the evidence recur, such as when a new seismic event has the characteristics deemed necessary and sufficient to observe an outcome predicted by a prior theory or when a new method for reassessing a fossil offers an independent test of existing claims about that fossil. Even in experimental research, original and replication studies inevitably differ in some aspects of the sample—or units—from which data are collected, the treatments that are administered, the outcomes that are measured, and the settings in which the studies are conducted [11].
Individual studies do not provide comprehensive or definitive evidence about all conditions for observing evidence about claims. The gaps are filled with theory. A single study examines only a subset of units, treatments, outcomes, and settings. The study was conducted in a particular climate, at particular times of day, at a particular point in history, with a particular measurement method, using particular assessments, with a particular sample. Rarely do researchers limit their inference to precisely those conditions. If they did, scientific claims would be historical claims because those precise conditions will never recur. If a claim is thought to reveal a regularity about the world, then it is inevitably generalizing to situations that have not yet been
observed. The fundamental question is: of the innumerable variations in units, treatments, outcomes, and settings, which ones matter? Time-of-day for data collection may be expected to be irrelevant for a claim about personality and parenting or critical for a claim about circadian rhythms and inhibition.
When theories are too immature to make clear predictions, repetition of original procedures becomes very useful. Using the same procedures is an interim solution for not having clear theoretical specification of what is needed to produce evidence about a claim. And, using the same procedures reduces uncertainty about what qualifies as evidence “consistent with” earlier claims. Replication is not about the procedures per se, but using similar procedures reduces uncertainty in the universe of possible units, treatments, outcomes, and settings that could be important for the claim.
Because there is no exact replication, every replication test assesses generalizability to the new study’s unique conditions. However, not every generalizability test is a replication. Fig 1’s left panel illustrates a discovery and conditions around it to which it is potentially generalizable. The generalizability space is large because of theoretical immaturity; there are many conditions in which the claim might be supported, but failures would not discredit the original claim. Fig 1’s right panel illustrates a maturing understanding of the claim. The generalizability space has shrunk because some tests identified boundary conditions (gray tests), and the replicability space has increased because successful replications and generalizations (colored tests) have improved theoretical specification for when replicability is expected.
Successful replication provides evidence of generalizability across the conditions that inevitably differ from the original study; unsuccessful replication indicates that the reliability of the finding may be more constrained than recognized previously. Repeatedly testing replicability and generalizability across units, treatments, outcomes, and settings facilitates improvement in theoretical specificity and future prediction.
Theoretical maturation is illustrated in Fig 2. A progressive research program (the left path) succeeds in replicating findings across conditions presumed to be irrelevant and also matures the theoretical account to more clearly distinguish conditions for which the phenomenon is expected to be observed or not observed. This is illustrated by a shrinking generalizability space in which the theory does not make clear predictions. A degenerative research program (the right path) persistently fails to replicate the findings and progressively narrows the universe of conditions to which the claim could apply. This is illustrated by shrinking generalizability and replicability space because the theory must be constrained to ever narrowing conditions [12].
This exposes an inevitable ambiguity in failures-to-replicate. Was the original evidence a false positive or the replication a false negative, or does the replication identify a boundary condition of the claim? We can never know for certain that earlier evidence was a false positive. It is always possible that it was “real,” and we cannot identify or recreate the conditions necessary to replicate successfully. But that does not mean that all claims are true or that science cannot be self-correcting. Accumulating failures-to-replicate could result in a much narrower but more precise set of circumstances in which evidence for the claim is replicable, or it may result in failure to ever establish conditions for replicability and relegate the claim to irrelevance.
The ambiguity between disconfirming an original claim or identifying a boundary condition also means that understanding whether or not a study is a replication can change due to accumulation of knowledge.

Fig 1. There is a universe of distinct units, treatments, outcomes, and settings, and only a subset of those qualify as replications—a study for which any outcome would be considered diagnostic evidence about a prior claim. For underspecified theories, there is a larger space for which the claim may or may not be supported—the theory does not provide clear expectations. These are generalizability tests. Testing replicability is a subset of testing generalizability. As theory specification improves (moving from left panel to right panel), usually interactively with repeated testing, the generalizability and replicability space converge. Failures-to-replicate or generalize shrink the space (dotted circle shows original plausible space). Successful replications and generalizations expand the replicability space—i.e., broadening and strengthening commitments to replicability across units, treatments, outcomes, and settings.

For example, the famous experiment by Otto Loewi (1936 Nobel Prize in Physiology or Medicine) showed that the inhibitory factor “vagusstoff,” subsequently determined to be acetylcholine, was released from the vagus nerve of frogs, suggesting that neurotransmission was a chemical process. Much later, after his and others’ failures-to-replicate his original claim, a crucial theoretical insight identified that the time of year at which
Loewi performed his experiment was critical to its success [13]. The original study was performed with so-called winter frogs. The replication attempts performed with summer frogs failed because of seasonal sensitivity of the frog heart to the unrecognized acetylcholine, making the effects of vagal stimulation far more difficult to demonstrate. With subsequent tests
providing supporting evidence, the understanding of the claim improved.

Fig 2. A discovery provides initial evidence that has a plausible range of generalizability (light blue) and little theoretical specificity for testing replicability (dark blue). With progressive success (left path) theoretical expectations mature, clarifying when replicability is expected. Also, boundary conditions become clearer, reducing the potential generalizability space. A complete theoretical account eliminates generalizability space because the theoretical expectations are so clear and precise that all tests are replication tests. With repeated failures (right path) the generalizability and replicability space both shrink, eventually to a theory so weak that it makes no commitments to replicability.

What had been perceived as replications no longer were, because new evidence demonstrated that they were not studying the same thing. The theoretical understanding evolved, and subsequent replications supported the revised claims. That is not a problem; that is progress.
Replication is rare
The term “conceptual replication” has been applied to studies that use different methods to test the same question as a prior study. This is a useful research activity for advancing understanding, but many studies with this label are not replications by our definition. Recall that “to be a replication, 2 things must be true: outcomes consistent with a prior claim would increase confidence in the claim, and outcomes inconsistent with a prior claim would decrease confidence in the claim.” Many “conceptual replications” meet the first criterion and fail the second. That is, they are not designed such that a failure to replicate would revise confidence in the original claim. Instead, “conceptual replications” are often generalizability tests. Failures are interpreted, at most, as identifying boundary conditions. A self-assessment of whether one is testing replicability or generalizability is answering: would an outcome inconsistent with prior findings cause me to lose confidence in the theoretical claims? If no, then it is a generalizability test.
Designing a replication with a different methodology requires understanding of the theory and methods so that any outcome is considered diagnostic evidence about the prior claim. In practice, this means that replication is often limited to relatively close adherence to original methods for topics in which theory and methodology are immature—a circumstance commonly called “direct” or “close” replication—because the similarity of methods serves as a stand-in for theoretical and measurement precision. In fact, conducting a replication of a prior claim with a different methodology can be considered a milestone for theoretical and methodological maturity.
Replication is characterized as the boring, rote, clean-up work of science. This misperception makes funders reluctant to fund it, journals reluctant to publish it, and institutions reluctant to reward it. The disincentives for replication are a likely contributor to existing challenges of credibility and replicability of published claims [14].
Defining replication as a confrontation of current theoretical expectations clarifies its important, exciting, and generative role in scientific progress. Single studies, whether they pursue novel ends or confront existing expectations, never definitively confirm or disconfirm theories. Theories make predictions; replications test those predictions. Outcomes from replications are fodder for refining, altering, or extending theory to generate new predictions. Replication is a central part of the iterative maturing cycle of description, prediction, and explanation. A shift in attitude that includes replication in funding, publication, and career opportunities will accelerate research progress.
Acknowledgments
We thank Alex Holcombe, Laura Scherer, Leonhard Held, and Don van Ravenzwaaij for comments on earlier versions of this paper, and we thank Anne Chestnut for graphic design.
References
1. Schmidt S. Shall we really do it again? The powerful concept of replication is neglected in the social sciences. Rev Gen Psychol. 2009; 13(2): 90–100.
2. Camerer CF, Dreber A, Holzmeister F, Ho T-H, Huber J, Johannesson M, et al. Evaluating replicability of social science experiments in Nature and Science between 2010 and 2015. Nat Hum Behav. 2018; 2: 637–644. PMID: 31346273
3. Border R, Johnson EC, Evans LM, Smolen A, Berley N, Sullivan PF, et al. No support for historical candidate gene or candidate gene-by-interaction hypotheses for major depression across multiple large samples. Am J Psychiatry. 2019; 176(5): 376–387. PMID: 30845820
4. Munafò MR, Nosek BA, Bishop DVM, Button KS, Chambers CD, Percie du Sert N, et al. A manifesto for reproducible science. Nat Hum Behav. 2017; 1: 0021.
5. Nosek BA, Alter G, Banks GC, Borsboom D, Bowman SD, Breckler SJ, et al. Promoting an open research culture. Science. 2015; 348(6242): 1422–1425. PMID: 26113702
6. Nosek BA, Ebersole CR, DeHaven A, Mellor DM. The preregistration revolution. Proc Natl Acad Sci U S A. 2018; 115(11): 2600–2606. PMID: 29531091
7. Jeffreys H. Scientific Inference. 3rd ed. Cambridge: Cambridge University Press; 1973.
8. Muthukrishna M, Henrich J. A problem in theory. Nat Hum Behav. 2019; 3: 221–229. PMID: 30953018
9. Crandall CS, Sherman JW. On the scientific superiority of conceptual replications for scientific progress. J Exp Soc Psychol. 2016; 66: 93–99.
10. Stroebe W, Strack F. The alleged crisis and illusion of exact replication. Perspect Psychol Sci. 2014; 9(1): 59–71. PMID: 26173241
11. Shadish WR, Cook TD, Campbell DT. Experimental and Quasi-Experimental Designs for Generalized Causal Inference. 2nd ed. Boston: Houghton Mifflin; 2002.
12. Lakatos I. Falsification and the methodology of scientific research programmes. In: Harding SG, editor. Can Theories be Refuted? Synthese Library. Dordrecht: Springer; 1976. p. 205–259.
13. Bain WA. A method of demonstrating the humoral transmission of the effects of cardiac vagus stimulation in the frog. Q J Exp Physiol. 1932; 22(3): 269–274.
14. Nosek BA, Spies JR, Motyl M. Scientific utopia: II. Restructuring incentives and practices to promote truth over publishability. Perspect Psychol Sci. 2012; 7(6): 615–631. PMID: 26168121
... In the last decade or so, the concept of replication, which provides further diagnostic evidence for claims (Nosek and Errington, 2020), has gained attention in psychology due to large replication projects with low replication rates (Open Science Collaboration, 2015) and high heterogeneity in replication effect size estimates compared to the original studies (Klein et al., 2014). However, it has also expanded to other fields such as social science (Camerer et al., 2018), economics (Camerer et al., 2016), and cancer biology (Errington et al., 2021), whereby similar large replication projects suggested a crisis of confidence in research findings (Pashler and Wagenmakers, 2012). ...
... However, it has also expanded to other fields such as social science (Camerer et al., 2018), economics (Camerer et al., 2016), and cancer biology (Errington et al., 2021), whereby similar large replication projects suggested a crisis of confidence in research findings (Pashler and Wagenmakers, 2012). This "replication crisis" led to discussions around the replicability, reproducibility (retesting a claim using the same data and comparable analyses as opposed to replication which uses new data; Nosek and Errington, 2020), transparency of research practices, and reliability of findings, which then helped inspire the open science movement (Munafò et al., 2017). ...
... Those in favour of replication studies believe they can increase (or decrease) confidence in research findings, update boundaries on findings i.e., the external validity (Nosek and Errington, 2020), identify type I errors, and control for sampling error (Schmidt, 2009). The response to the "replication crisis" was met with mixed reaction. ...
Full-text available
There are formal calls for increased reproducibility and replicability in sports and exercise science, yet there is minimal information on the overall knowledge of these concepts at a field-wide level. Therefore, we conducted a survey on the attitudes and perceptions of sports and exercise science researchers towards reproducibility and replicability. Descriptive statistics (e.g., proportion of responses), and thematic analysis, were utilized to characterize the responses. Of the 511 respondents, 42% (n = 217) believe there is a significant crisis of reproducibility or replicability in sports and exercise science while 36% (n = 182) believe there is a slight crisis. 3% (n = 15) of respondents believe there is no crisis while 19% (n = 95) did not know. Four themes were generated in the thematic analysis: the research and publishing culture, educational barriers to research integrity, research responsibility to ensure reproducibility and replicability, and current practices facilitating reproducibility and replicability. Researchers believe that engaging in open science can be detrimental to career opportunities due to lack of incentives. They also feel journals are a barrier to reproducible and replicable research due to high publication charges and a focus on novelty. Statistical expertise was identified as a key factor for improving reproducibility and replicability in the future, particularly, a better understanding of study design and different statistical techniques. Statistical education should be prioritised for early career researchers which could positively affect publication and peer review. Researchers must accept responsibility for reproducibility and replicability with thorough project design, appropriate planning of analyses, and transparent reporting practices.
... Conducting a replication study gives students a chance to acquire experience about how to design and manage a study, collect and analyze data, and report findings in a clear and concise manner. By promoting the evaluation of the methodology and results of the original study and considering whether the findings are supported by the data, replication projects also allow the development of critical thinking skills, together with an acute understanding of the importance of replicability in science [7,8]. ...
... Do not expect students to just know why replication is important. Many might, but even for those students the value of replication for science bears repeating [8,12]. Conveying the importance of replication can take many forms, but it often centers on 2 core aspects: the importance of replication within the scientific process, and its pedagogical value as part of students' curriculum. ...
Full-text available
Conducting a replication study is a valuable way for undergraduate students to learn about the scientific process and gain research experience. By promoting the evaluation of existing studies to confirm their reliability, replications play a unique, though often underappreciated, role in the scientific enterprise. Involving students early in this process can help make replication mainstream among the new generation of scientists. Beyond their benefit to science, replications also provide an invaluable learning ground for students, from encouraging the development of critical thinking to emphasizing the importance of details and honing research skills. In this piece, we outline 10 simple rules for designing and conducting undergraduate replication projects, from conceptualization to implementation and dissemination. We hope that these guidelines can help educators provide students with a meaningful and constructive pedagogical experience, without compromising the scientific value of the replication project, therefore ensuring robust, valuable contributions to our understanding of the world.
... It will also damage the credibility of Chinese clinical research in the international community. Knowing but not doing it will lead to research waste [41][42][43]. Some studies suggest that one of the barriers to the implementation of reporting guidelines in Chinese medical journals is the low level of awareness of reporting guidelines among stakeholders such as journal editors [44,45]. ...
Full-text available
Background Reporting quality is a critical issue in health sciences. Adopting reporting guidelines has been shown to be an effective way of enhancing the reporting quality and transparency of clinical research. In 2012, we found that only 7 (7/1221, 0.6%) journals adopted the Consolidated Standards of Reporting Trials (CONSORT) statement in China. The aim of this study was to assess the implementation status of CONSORT and other reporting guidelines for clinical studies in China. Methods A cross-sectional bibliometric study was conducted. Eight medical databases were systematically searched, and 1039 medical journals published in mainland China, Hong Kong, Macau, and Taiwan were included. The basic characteristics, including subject, language, publication place, journal-indexed databases, and journal impact factors, were extracted. The endorsement of reporting guidelines was assessed by a modified 5-level evaluation tool, namely i) positive active, ii) positive weak, iii) passive moderate, iv) passive weak, and v) none. Results Among included journals, 24.1% endorsed CONSORT, and 0.8% endorsed CONSORT extensions. For STROBE (STrengthening the Reporting of Observational Studies in Epidemiology), PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses), STARD (An Updated List of Essential Items for Reporting Diagnostic Accuracy Studies), and CARE (CAse REport guidelines), the endorsement proportions were 17.2, 16.6, 16.4, and 14.8%, respectively. The endorsement proportions for SPIRIT (Standard Protocol Items: Recommendations for Interventional Trials), TRIPOD (Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis), AGREE (Appraisal of Guidelines, Research, and Evaluation), and RIGHT (Reporting Items for Practice Guidelines in Healthcare) were below 0.7%. Conclusions Our results showed that the implementation of reporting guidelines was low.
We suggest the following initiatives: i) enhancing the level of journal endorsement of reporting guidelines; ii) strengthening collaboration among authors, reviewers, editors, and other stakeholders; iii) providing training courses for stakeholders; iv) establishing bases for a reporting guidelines network in China; v) adopting the endorsement of reporting guidelines in the policies of the China Periodicals Association (CPA); and vi) promoting Chinese medical journals into the international evaluation system and publishing them in English.
Replicability is widely regarded as one of the defining features of science and its pursuit is one of the main postulates of meta-research, a discipline emerging in response to the replicability crisis. At the same time, replicability is typically treated with caution by philosophers of science. In this paper, we reassess the value of replicability from an epistemic perspective. We defend the orthodox view, according to which replications are always epistemically useful, against the more prudent view that claims that it is useful in very limited circumstances. Additionally, we argue that we can learn more about the original experiment and the limits of the discovered effect from replications at different levels. We hold that replicability is a crucial feature of experimental results and scientists should continue to strive to secure it.
What factors affect listeners’ perception of the emotions conveyed by music? Ali and Peynircioğlu conducted a series of experiments in which listeners rated the emotions conveyed by the melodies and lyrics of songs. Here, we present a pre-registered replication and extension of their study with newly adapted stimuli, including several covariates using the Goldsmiths Musical Sophistication Index (Gold-MSI). Using a within-subjects design, we asked participants (n = 104) to rate the emotions they perceived to be conveyed by unfamiliar happy, sad, calm, and angry songs, with and without lyrics, to model the extent to which each factor contributed to participants’ ratings. The results we obtained in our replication contradicted those of the original study for several variables. The results of our extension, revealing a significant effect of the emotional engagement subscale of the Gold-MSI, indicate that emotion perception can and should be divorced from aspects of musical training. Taken together, the findings of our replication and extension highlight the value of replicating frequently cited studies in the music psychology literature.
Is false belief understanding stable from infancy to childhood? We don’t know yet. -A commentary on Sodian’s commentary.
Survey researchers take great care to measure respondents’ answers in an unbiased way; but, how successful are we as a field at remedying unintended and intended biases in our research? The validity of inferences drawn from studies has been found to be improved by the implementation of preregistration practices. Despite this, only 3 of the 83 published articles in POQ and IJPOR in 2020 feature explicitly stated preregistered hypotheses or analyses. This manuscript aims to show survey methodologists how preregistration and replication (where possible) are in service to the broader mission of survey methodology. To that end, we present a practical example of how unknown biases in analysis strategies without preregistration or replication inflate type I errors. In an initial data collection, our analysis showed that the visual layout of battery-type questions significantly decreased data quality. But after committing to replicating and preregistering the hypotheses and analysis plans, none of the results replicated successfully, despite keeping the procedure, sample provider, and analyses identical. This manuscript illustrates how preregistration and replication practices might, in the long term, likely help unburden the academic literature from follow-up publications relying on type I errors.
The work environment has drastically changed in the last 10 years, necessitating a new look at which soft skills are most relevant in today’s workplace. Because of COVID-19, organizations had to rapidly adjust where and how they work. According to the Pew Research Center, 71% of adults, who can perform their work responsibilities from home, are now working remotely. Then, the workplace shifted again during the “Great Resignation” where an all-time record of 24 million employees left their jobs between April and September 2021. This shift is ever more important as research in the last decade indicates that soft skills are being valued more compared to hard skills during the hiring process. The current study replicated Robles’s (2012) study of soft skills to find which soft skills are most relevant to a thriving work environment in 2022. Results indicate that soft skills emphasizing employee initiative and including others in processes are most relevant today, including Adaptable, Agency, Conscientious, Contextual Awareness, Create Clarity, Curiosity, Engage the Mess, Genuine Care, Integrity, Partnership, Play, Positive Energy, Social Skills, and Suppress the Noise.
The coronavirus disease 2019 outbreak has the potential to trigger declines in individual mental health, potentially in the form of depressive symptoms. However, few studies have explored factors protective of mental health during the ongoing pandemic. For the sustainable development of individual health, this study was conducted during the pandemic and examines the relationship between gratitude and symptoms of depression, as well as the moderating effect of psychological capital. Latent variable structural equation modeling was used to analyze depressive symptoms and protective factors in 3123 college students. This study measures gratitude, depressive symptoms, and psychological capital, using the Gratitude Scale, Patient Health Questionnaire‐9, and the Positive Psychological Capital Questionnaire, respectively. Gratitude was negatively related to depressive symptoms, with psychological capital as a moderator of the relationship. Specifically, psychological capital had a powerful protective effect against depressive symptoms. Students with high psychological capital had lower depressive symptoms than those with low psychological capital, regardless of their level of gratitude. In students with low psychological capital, gratitude had a protective effect against depressive symptoms. These findings suggest that psychological capital is a powerful protective factor against depressive symptoms during a pandemic and that improving psychological capital could enhance mental health.
Being able to replicate scientific findings is crucial for scientific progress. We replicate 21 systematically selected experimental studies in the social sciences published in Nature and Science between 2010 and 2015. The replications follow analysis plans reviewed by the original authors and pre-registered prior to the replications. The replications are high powered, with sample sizes on average about five times higher than in the original studies. We find a significant effect in the same direction as the original study for 13 (62%) studies, and the effect size of the replications is on average about 50% of the original effect size. Replicability varies between 12 (57%) and 14 (67%) studies for complementary replicability indicators. Consistent with these results, the estimated true-positive rate is 67% in a Bayesian analysis. The relative effect size of true positives is estimated to be 71%, suggesting that both false positives and inflated effect sizes of true positives contribute to imperfect reproducibility. Furthermore, we find that peer beliefs of replicability are strongly related to replicability, suggesting that the research community could predict which results would replicate and that failures to replicate were not the result of chance alone.
Improving the reliability and efficiency of scientific research will increase the credibility of the published scientific literature and accelerate discovery. Here we argue for the adoption of measures to optimize key elements of the scientific process: methods, reporting and dissemination, reproducibility, evaluation and incentives. There is some evidence from both simulations and empirical studies supporting the likely effectiveness of these measures, but their broad adoption by researchers, institutions, funders and journals will require iterative evaluation and improvement. We discuss the goals of these measures, and how they can be implemented, in the hope that this will facilitate action toward improving the transparency, reproducibility and efficiency of scientific research.
There is considerable current debate about the need for replication in the science of social psychology. Most of the current discussion and approbation is centered on direct or exact replications, the attempt to conduct a study in a manner as close to the original as possible. We focus on the value of conceptual replications, the attempt to test the same theoretical process as an existing study, but that uses methods that vary in some way from the previous study. The tension between the two kinds of replication is a tension of values—exact replications value confidence in operationalizations; their requirement tends to favor the status quo. Conceptual replications value confidence in theory; their use tends to favor rapid progress over ferreting out error. We describe the many ways in which conceptual replications can be superior to direct replications. We further argue that the social system of science is quite robust to these threats and is self-correcting.
Transparency, openness, and reproducibility are readily recognized as vital features of science (1, 2). When asked, most scientists embrace these features as disciplinary norms and values (3). Therefore, one might expect that these valued features would be routine in daily practice. Yet, a growing body of evidence suggests that this is not the case (4–6).
There has been increasing criticism of the way psychologists conduct and analyze studies. These critiques as well as failures to replicate several high-profile studies have been used as justification to proclaim a "replication crisis" in psychology. Psychologists are encouraged to conduct more "exact" replications of published studies to assess the reproducibility of psychological research. This article argues that the alleged "crisis of replicability" is primarily due to an epistemological misunderstanding that emphasizes the phenomenon instead of its underlying mechanisms. As a consequence, a replicated phenomenon may not serve as a rigorous test of a theoretical hypothesis because identical operationalizations of variables in studies conducted at different times and with different subject populations might test different theoretical constructs. Therefore, we propose that for meaningful replications, attempts at reinstating the original circumstances are not sufficient. Instead, replicators must ascertain that conditions are realized that reflect the theoretical variable(s) manipulated (and/or measured) in the original study.
Objective: Interest in candidate gene and candidate gene-by-environment interaction hypotheses regarding major depressive disorder remains strong despite controversy surrounding the validity of previous findings. In response to this controversy, the present investigation empirically identified 18 candidate genes for depression that have been studied 10 or more times and examined evidence for their relevance to depression phenotypes. Methods: Utilizing data from large population-based and case-control samples (Ns ranging from 62,138 to 443,264 across subsamples), the authors conducted a series of preregistered analyses examining candidate gene polymorphism main effects, polymorphism-by-environment interactions, and gene-level effects across a number of operational definitions of depression (e.g., lifetime diagnosis, current severity, episode recurrence) and environmental moderators (e.g., sexual or physical abuse during childhood, socioeconomic adversity). Results: No clear evidence was found for any candidate gene polymorphism associations with depression phenotypes or any polymorphism-by-environment moderator effects. As a set, depression candidate genes were no more associated with depression phenotypes than noncandidate genes. The authors demonstrate that phenotypic measurement error is unlikely to account for these null findings. Conclusions: The study results do not support previous depression candidate gene findings, in which large genetic effects are frequently reported in samples orders of magnitude smaller than those examined here. Instead, the results suggest that early hypotheses about depression candidate genes were incorrect and that the large number of associations reported in the depression candidate gene literature are likely to be false positives.
The replication crisis facing the psychological sciences is widely regarded as rooted in methodological or statistical shortcomings. We argue that a large part of the problem is the lack of a cumulative theoretical framework or frameworks. Without an overarching theoretical framework that generates hypotheses across diverse domains, empirical programs spawn and grow from personal intuitions and culturally biased folk theories. By providing ways to develop clear predictions, including through the use of formal modelling, theoretical frameworks set expectations that determine whether a new finding is confirmatory, nicely integrating with existing lines of research, or surprising, and therefore requiring further replication and scrutiny. Such frameworks also prioritize certain research foci, motivate the use of diverse empirical approaches and, often, provide a natural means to integrate across the sciences. Thus, overarching theoretical frameworks pave the way toward a more general theory of human behaviour. We illustrate one such theoretical framework: dual inheritance theory.
Progress in science relies in part on generating hypotheses with existing observations and testing hypotheses with new observations. This distinction between postdiction and prediction is appreciated conceptually but is not respected in practice. Mistaking the generation of postdictions for the testing of predictions reduces the credibility of research findings. However, ordinary biases in human reasoning, such as hindsight bias, make it hard to avoid this mistake. An effective solution is to define the research questions and analysis plan before observing the research outcomes, a process called preregistration. Preregistration distinguishes analyses and outcomes that result from predictions from those that result from postdictions. A variety of practical strategies are available to make the best possible use of preregistration in circumstances that fall short of the ideal application, such as when the data are preexisting. Services are now available for preregistration across all disciplines, facilitating a rapid increase in the practice. Widespread adoption of preregistration will increase the distinctiveness between hypothesis generation and hypothesis testing and will improve the credibility of research findings.