Article

Implications of the Credibility Revolution for Productivity, Creativity, and Progress

Author: Simine Vazire

Abstract

The credibility revolution (sometimes referred to as the “replicability crisis”) in psychology has brought about many changes in the standards by which psychological science is evaluated. These changes include (a) greater emphasis on transparency and openness, (b) a move toward preregistration of research, (c) more direct-replication studies, and (d) higher standards for the quality and quantity of evidence needed to make strong scientific claims. What are the implications of these changes for productivity, creativity, and progress in psychological science? These questions can and should be studied empirically, and I present my predictions here. The productivity of individual researchers is likely to decline, although some changes (e.g., greater collaboration, data sharing) may mitigate this effect. The effects of these changes on creativity are likely to be mixed: Researchers will be less likely to pursue risky questions; more likely to use a broad range of methods, designs, and populations; and less free to define their own best practices and standards of evidence. Finally, the rate of scientific progress—the most important shared goal of scientists—is likely to increase as a result of these changes, although one’s subjective experience of making progress will likely become rarer.


... Open Science is the practice of science in such a way that others can collaborate and contribute, where research data, lab notes and other research processes are freely available, under terms that enable reuse, redistribution and reproduction of the research and its underlying data and methods. In a nutshell, Open Science is transparent and accessible knowledge that is shared and developed through collaborative networks (Vicente-Sáez & Martínez-Fuentes, 2018). Open Science is an umbrella term used to refer to the concepts of openness, transparency, rigour, reproducibility, replicability, and accumulation of knowledge, all of which are considered fundamental features of the scientific endeavour. ...
... Open Science is an umbrella term used to refer to the concepts of openness, transparency, rigour, reproducibility, replicability, and accumulation of knowledge, all of which are considered fundamental features of the scientific endeavour. In recent years, psychological researchers have begun to adopt reforms to make their work better align with these principles and to address the current "credibility revolution" (Vazire, 2018). ...
Article
Full-text available
The paper analysed 2,199 papers published in Open Science research during the period 2001-2021. Open Science is a set of practices that increase the transparency and accessibility of scientific research. The study investigates the exponential growth rate, document types, languages, productive countries, productive institutions, top journals, prolific authors, keywords used, and most cited papers. Scientometric methods were employed for the investigation, and the Web of Science database was used for data retrieval. Results showed an upward trend in the growth and development of publications on Open Science: 2021 showed the highest growth rate, journal articles were the dominant document type, the University of Oxford (UK) produced the most publications, the USA was the most productive country, the Journal of Neurochemistry published the most articles, Krumholz HM of Yale University was the most prolific author, and the most cited papers were published recently. The study is useful to LIS and Open Science researchers, faculty members and students.
... The reliability and trustworthiness of research results are in question [1][2][3]. This is true especially with respect to their reproducibility (defined in this paper as obtaining the same or similar results when rerunning analyses from previous studies using the original design, data and code; cf. ...
... [4]). Ensuring that the results of studies can be independently confirmed is expected to reduce waste [2,5] and lead to more reliable outcomes that better inform evidence-based decisions [6,7]. Furthermore, studies that can be independently confirmed may increase public trust in the scientific enterprise [8,9]. ...
Article
Full-text available
Various open science practices have been proposed to improve the reproducibility and replicability of scientific research, but not all of these practices are backed by evidence that they are indeed effective. Therefore, we conducted a scoping review of the literature on interventions to improve reproducibility. We systematically searched Medline, Embase, Web of Science, PsycINFO, Scopus and ERIC on 18 August 2023. Any study empirically evaluating the effectiveness of interventions aimed at improving the reproducibility or replicability of scientific methods and findings was included. We summarized the retrieved evidence narratively and in evidence gap maps. Of the 105 distinct studies we included, 15 directly measured the effect of an intervention on reproducibility or replicability, while the remainder addressed a proxy outcome that might be expected to increase reproducibility or replicability, such as data sharing, methods transparency or pre-registration. Thirty studies were non-comparative and 27 were comparative but cross-sectional observational designs, precluding any causal inference. Despite studies investigating a range of interventions and addressing various outcomes, our findings indicate that, in general, the evidence base for interventions to improve the reproducibility of research remains remarkably limited in many respects.
... The growing complexity of knowledge production requires social science to reconsider its collaborative and management approaches in order to accelerate and advance (Hofman et al. 2021; King 1995; Vazire 2018). The dominant conventional knowledge infrastructures, based on institutionalized, top-down decision-making processes, tend to reproduce certain methods of knowledge production. ...
... However, open-source cooperation still rarely appears in social science (Beck et al. 2022; Firebaugh 2007; Franzoni and Sauermann 2014; Friesike et al. 2014; Gerring et al. 2020; Vazire 2018). When searching for "open source" in the Web of Science portal archives, we find a continuously rising trend in the number of publications. ...
Article
Full-text available
With the growing complexity of knowledge production, social science must accelerate and open up to maintain explanatory power and responsiveness. This goal requires redesigning the front end of the research to build an open and expandable knowledge infrastructure that stimulates broad collaborations, enables breaking down inertia and path dependencies of conventional approaches, and boosts discovery and innovation. This article discusses the coordinated open-source model as a promising organizational scheme that can supplement conventional research infrastructure in certain areas. The model offers flexibility, decentralization, and community-based development and aligns with open science ideas, such as reproducibility and transparency. Similar solutions have been successfully applied in natural science, but social science needs to catch up. I present the model’s design and consider its potential and limitations (e.g., regarding development, sustainability, and coordination). I also discuss open-source applications in various areas, including a case study of an open-source survey harmonization project Comparative Panel File.
... The International Journal of Research in Marketing, Journal of Consumer Research, and Journal of Consumer Psychology encourage preregistration without requiring it. Similar trends have emerged in economics and psychology, where journals have implemented open science practices such as mandatory data sharing, preregistration, and the publication of replication studies (e.g., Ankel-Peters et al. 2024; Miguel 2021; Vazire 2018). ...
... One point of view suggests that the value of reproducibility and transparency in scientific research cannot be overstated (Miguel et al., 2014; Munafò et al., 2017; Wagenmakers et al., 2021). Enhancing transparency through open science practices can restore trust in the scientific system and address the credibility debate head-on (Pennington, 2023; Vazire, 2018). Transparency also requires disclosing nonsignificant results or model estimations that do not meet recommended levels of fit. ...
... Psychology and other behavioral, cognitive, and social sciences have faced a replication crisis, also known as the credibility revolution [1][2][3] (terms in italics are defined in the Glossary in Table 1). This crisis refers to a lack of replication and reliability of various effects found in past studies. ...
... Inconsistent data can lead to wasted resources, inaccurate results, and even compromised [...]. (Footnote 3: There are specific software applications to track and organize projects (e.g., Asana, Trello), each with its pros and cons, but these are not specifically designed for BTS projects. Alternatively, STAPLE is a software project dedicated to BTS projects, created by and for researchers [53].) ...
Preprint
The replication crisis in psychology and related sciences contributed to the adoption of large-scale research initiatives known as Big Team Science (BTS). BTS has made significant advances in addressing issues of replication, statistical power, and diversity through the use of larger samples and more representative cross-cultural data. However, while these collaborations hold great potential, they also introduce unique challenges related to their scale. Drawing on experiences from successful BTS projects, we identified and outlined key strategies for overcoming diversity, volunteering, and capacity challenges. We emphasize the need for the implementation of strong organizational practices and the distribution of responsibility to prevent common pitfalls. More fundamentally, BTS requires a shift in mindset toward prioritizing collaborative effort, diversity, transparency, and inclusivity. Ultimately, we call for reflection on the strengths and limitations of BTS to enhance the quality, generalizability, and impact of research across disciplines.
... This need is in line with recent concerns about the stability of scientific evidence in psychology and other fields (Nelson et al., 2018), and the replicability crisis in psychology (Open Science Collaboration, 2015). This context has fostered the "credibility revolution" movement (Vazire, 2018; Munafò et al., 2017), which emphasizes the need for replicability due to its central role in scientific development (see a recent review by Nosek et al., 2022). Therefore, in this study, we focus on developing a close replication of Wismeijer & van Assen (2013) due to its importance in the field of BDSM and the applicability of its conclusions. ...
... In line with the literature (Vazire, 2018), our first recommendation is to promote further replication studies to gather more robust findings with high-powered and more representative samples. We also encourage more cross-cultural and cohort studies to test cultural differences in these findings (e.g., in non-Western countries). ...
Article
Full-text available
Bondage, Discipline, Sadism, and Masochism (BDSM) is a range of diverse sexual practices. Stigma regarding BDSM is associated with dysfunctional personalities, insecure attachment styles, or damaged well-being. Previous studies have shown contrary evidence to these views. However, the replicability of these findings has not been properly studied. The present research provides a close replication study to test differences in personality, attachment, rejection sensitivity, and well-being between BDSM practitioners and non-practitioners. To overcome limitations in previous studies, this study provides a highly powered sample from a new population (Spanish, N = 1,907), including effect sizes, the presence and impact of LGTBIQA+ individuals, and assessing BDSM roles using an alternative classification. In addition, we explored differences in associations between attachment styles, personality, and well-being in BDSM practitioners. As predicted, BDSM practitioners showed higher levels of secure attachment, conscientiousness, openness, and well-being, as well as lower levels of insecure attachment, rejection sensitivity, neuroticism, and agreeableness, countering the stigma. Gender, sexual orientation, and experience with BDSM showed explanatory potential. Associations between attachment, personality, and well-being were invariant across BDSM practitioners and non-practitioners, and also across BDSM roles. That is, BDSM practitioners share the same psychological structure as non-practitioners while showing more functional profiles. Thus, de-stigmatizing BDSM populations is reinforced and recommended. Limitations and implications for applied and research audiences are discussed.
... Preregistration requires transparency about important study details (e.g., rationale, hypotheses, methods, analysis plan) prior to data collection. The hope is that such transparency increases the quality and credibility of psychological research (Nosek et al., 2019; Vazire, 2018). ...
... This transparency enables research findings to be externally verified (Nosek & Lakens, 2014) and can facilitate replication efforts (Freese, 2007). In general, then, preregistration is supposed to make psychological science better (Vazire, 2018). ...
... Informing such issues based on preregistered intervention studies is an important consideration in light of changing standards concerning what is considered credible evidence in support of a particular claim or finding. In recent years, there has been growing concern about the replicability and validity of many findings produced through psychological science (Vazire, 2018), including with respect to research-based claims concerning happiness and well-being (van Zyl et al., 2024). Preregistering one's predictions, methods, and analysis plans is an important step forward because it provides an improved level of transparency and credibility (Nosek et al., 2018). ...
Article
Full-text available
The present work examined results from preregistered intervention studies to inform the structure of subjective well-being (SWB). In five studies aimed at boosting individuals’ SWB, pre- and post-intervention assessments of life satisfaction (LS), positive affect (PA), and negative affect (NA) were examined as separate components in isolated analyses (Model 1), as a causal system in which PA and NA are inputs to LS (Model 2), and as indicators of a latent SWB factor based on a hierarchical conceptualization (Model 3). In each study, robust associations were found among all three SWB components within and across time (contrary to the separate components model); predictive effects were found among all three SWB components across time, rather than unidirectional effects from PA and NA to LS (contrary to the causal system model). In support of a hierarchical conceptualization, all three components had strong loadings on a latent SWB at pre- and post-intervention; in addition, in four studies the intervention had a significant effect on a latent SWB factor, but no unique (residual) effects on LS, PA, or NA. The present work thus provides valuable new insights based on experimental evidence from preregistered intervention studies in support of a hierarchical structure for SWB.
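The hierarchical conceptualization in Model 3 can be made concrete with a latent-variable specification. Below is a minimal sketch in Python using the semopy package (one of several SEM options; its model syntax follows lavaan conventions), with a hypothetical data file and column names for the three SWB components; it illustrates the modeling idea, not the authors' actual analysis code.

```python
import pandas as pd
from semopy import Model  # SEM library with lavaan-style model syntax

# Hypothetical data file with one column per SWB component score.
df = pd.read_csv("swb_preintervention.csv")  # columns: LS, PA, NA (assumed)

# Model 3 (hierarchical): LS, PA, and NA are indicators of a latent SWB factor.
hierarchical = Model("SWB =~ LS + PA + NA")
hierarchical.fit(df)
print(hierarchical.inspect())  # factor loadings of the three components

# Model 2 (causal system): PA and NA are inputs to LS.
causal = Model("LS ~ PA + NA")
causal.fit(df)
print(causal.inspect())
```

Comparing the fit of such specifications (e.g., via information criteria) is one way to adjudicate between the structures the abstract describes.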
... Replicability reinforces the reliability of results when findings are consistently reproduced and is thus crucial to scientific advancement (Aarts et al., 2015). What has been termed the replication crisis or credibility revolution refers to the strikingly low replication rates of studies in psychological science (Vazire, 2018). Replication rates vary significantly, from as low as 0% (Ebersole et al., 2020) up to 62% (Camerer et al., 2018), posing a substantial threat to the credibility of psychology as a scientific discipline. ...
Thesis
Full-text available
Trust has been shown to reduce conflict, enhance cooperation, and increase risk-taking in organisational settings. However, the role of trust in security contexts, particularly those pertaining to information elicitation, remains largely unexplored. Furthermore, due to the proliferation of online platforms for security-related interviews, accelerated by technological advances and global events such as the recent COVID-19 pandemic, the need to understand the role of trust in online information elicitation contexts is even more pronounced. The same can be said about rapport-building, which seems crucial for improving interview outcomes in face-to-face interviews. To date, little is known about how face-to-face information elicitation techniques translate to virtual environments. The current programme of doctoral research aimed to address these gaps by disentangling and examining the individual and combined effects of interviewer trustworthiness and rapport-building on online interview outcomes. Experiment 1 assessed the relevance of trustworthiness for risk-taking in online settings using a two-player gaming paradigm and a between-subject design. Results revealed that participants playing with the untrustworthy player were less willing to trust them and, in turn, took significantly fewer high-risk decisions during the beginning of the second game than participants playing with the trustworthy player. Building on these findings, experiments 2 and 3 examined the impact of trustworthiness and rapport-building on information sharing, including the disclosure of sensitive or otherwise ‘risky’ information, in online interviews. Experiment 2 adopted a between-subject design in which the interviewer’s trustworthiness and rapport-building efforts were manipulated to evaluate their effects on information sharing during a simulated job interview conducted via a chat platform. The results revealed that trustworthiness indirectly affected total and sensitive information disclosure by fostering trust, while rapport-building did not affect information disclosure. For Experiment 3, a novel manipulation of trustworthiness was developed, and its effect, along with that of rapport-building, was tested during a simulated vetting interview conducted via phone in a between-subject design. In line with findings from Experiment 2, findings indicated that trustworthiness indirectly affected both total and sensitive information disclosure by facilitating trust in the interviewer. While rapport-building increased the total number of details elicited, it did not increase the amount of sensitive information disclosed by participants. Interestingly, exploratory analyses revealed that rapport-building increased participants’ willingness to trust the interviewer from before to after the interview. In Experiment 4, the relative effectiveness of three trustworthiness aspects (ability, integrity, and benevolence) in fostering trust were tested during simulated phone calls between mock handlers and informants. Using a within-subjects design, participants, acting as potential informants, listened to three different audio clips, each representing emphasis on a different aspect of trustworthiness by a handler, and then indicated their willingness to trust and cooperate with each handler. 
There were no significant differences in trust based on the aspect of trustworthiness demonstrated, and findings from the thematic analysis highlighted substantial individual differences, suggesting there is no universal approach to fostering trust. Across Experiments 1, 3, and 4, participants' propensity to trust significantly influenced their willingness to trust interviewers, with those naturally inclined to trust being more likely to lend their trust to the interviewer. In the discussion of the results, the theoretical and practical implications for security contexts are explored and avenues for future research in online information gathering are suggested.
... The 'replication crisis' within psychology is also referred to as a 'credibility revolution' [2], renaissance [3] and opportunity/debate [4]. We use the term 'crisis' consistently through this article in line with Hussey [5], who suggests that crises are 'a call to action [...] an urgency that motivates people to act'. ...
Article
Full-text available
Concerns about the replicability, reproducibility and transparency of research have ushered in a set of practices and behaviours under the umbrella of ‘open research’. To this end, many new initiatives have been developed that represent procedural (i.e. behaviours and sets of commonly used practices in the research process), structural (new norms, rules, infrastructure and incentives), and community-based change (working groups, networks). The objectives of this research were to identify and outline international initiatives that enhance awareness and uptake of open research practices in the discipline of psychology. A systematic mapping review was conducted in three stages: (i) a Web search to identify open research initiatives in psychology; (ii) a literature search to identify related articles; and (iii) a hand search of grey literature. Eligible initiatives were then coded into an overarching theme of procedural, structural or community-based change. A total of 187 initiatives were identified; 30 were procedural (e.g. toolkits, resources, software), 70 structural (e.g. policies, strategies, frameworks) and 87 community-based (e.g. working groups, networks). This review highlights that open research is progressing at pace through various initiatives that share a common goal to reform research culture. We hope that this review promotes their further adoption and facilitates coordinated efforts between individuals, organizations, institutions, publishers and funders.
... Open science, also known as open scholarship, is an umbrella term that encompasses the principle that scientific knowledge should be openly accessible, transparent, rigorous, reproducible, replicable, accumulative, and inclusive in all appropriate contexts, all of which are considered fundamental features of the scientific endeavor. In psychological science, the open science movement was driven by the credibility revolution (Korbmacher et al., 2023; Nosek et al., 2022; Vazire, 2018), triggered by the "replicability crisis" (Baker, 2016; Hu et al., 2016; Open Science Collaboration, 2015). This movement advocates for transparent, credible, reproducible, and accessible science. ...
Preprint
Full-text available
Over the past decade, the open science movement has transformed the research landscape, though its impact has largely been confined to developed countries. Recently, researchers from developing countries have called for a redesign of open science to better align with their unique contexts. However, raising awareness alone is insufficient—practical actions are required to drive meaningful and inclusive change. In this work, we analyze the opportunities offered by the open science movement and explore the macro- and micro-level barriers researchers in developing countries face when engaging with these practices. Drawing on these insights and the experiences of researchers from developing regions, we propose a four-level guide to support their gradual engagement with open science: (1) utilizing open resources to build a solid foundation for rigorous research, (2) adopting low-cost, easily implementable practices, (3) contributing to open science communities through actionable steps, and (4) taking on leadership roles or forming local communities to foster cultural change. We conclude by discussing potential pitfalls of engaging in open science and outline concrete recommendations for future action.
... It is very important to know the effects of the use of IR4.0 technologies on SMEs' industrial value, especially in supply chain value. Industry 4.0 readiness refers to the level at which organisations are implementing IR4.0 technologies (Stentoft et al., 2019) and how ready businesses are to step into digitalisation and implement IR4.0 technologies (Schwab, 2017; Vazire, 2018). Micro, Small and Medium Enterprises (MSMEs) play an important role in the economy of a country, even more so when they have unique requirements and capabilities in today's difficult market (Ghafoorpoor Yazdi et al., 2019). ...
Article
Full-text available
This paper extends previous studies on how Micro, Small and Medium Enterprises (MSMEs) in Brunei Darussalam are adopting Industry 4.0 (IR4.0) technologies. In this study, we examined two categories of IR4.0 adoption, the Modest category and the Moderate category, as mediating effects and how they can influence firm performance. A questionnaire-based survey was conducted to collect data from 201 owners or managers of MSMEs. The mediating effect is examined using Smart Partial Least Squares Structural Equation Modelling (Smart PLS-SEM) to test the hypotheses. The results indicate the Modest category has no mediation effect on firm performance, while the Moderate category has a mediating effect between business financial planning and firm performance, and between cost and firm performance. The results also show that, between the Modest and Moderate categories, the Modest category seems reluctant to expand its potential use of technologies for reasons such as knowledge, skills or financial resources. However, as this research is limited to MSMEs, a comparison study between MSMEs and large businesses in Brunei Darussalam or an international study is suggested. A study comparing Micro and Small Enterprises (MSEs) with Medium Enterprises (MEs) would also help widen insight into this area of study. These findings can benefit business owners, technology experts and policymakers responsible for assisting MSMEs in adopting IR4.0 technologies.
... We also strongly advocate for summary statistics (i.e., a correlation matrix) or open data disclosure that would enable others to partially or fully reproduce the analyses (Mueller & Hancock, 2008; Schumacker & Lomax, 2016). The recent credibility revolution in psychology and the resulting push for open science more generally has led to a widespread recommendation to share raw data for empirical studies, where possible (e.g., Houtkoop et al., 2018; Nosek et al., 2015; Vazire, 2018). Even when there are confidentiality concerns with data sharing, there is no excuse not to share summary statistics in the form of standard deviations and correlations for all variables, which will allow others to reproduce the SEM analyses at least approximately (e.g., if there are missing data). ...
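The point about summary statistics can be illustrated directly: reported standard deviations and a correlation matrix are enough to rebuild the covariance matrix that many SEM estimators take as input. A minimal sketch in Python, with made-up numbers standing in for a paper's reported statistics:

```python
import numpy as np

# Hypothetical reported summary statistics for three observed variables.
sd = np.array([0.80, 1.20, 0.95])           # standard deviations
corr = np.array([[1.00, 0.35, 0.42],
                 [0.35, 1.00, 0.28],
                 [0.42, 0.28, 1.00]])        # correlation matrix

# Covariance = correlation scaled by the outer product of the SDs.
cov = np.outer(sd, sd) * corr
print(cov)  # sufficient input for refitting many SEMs without raw data
```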
Article
Full-text available
Confirmatory bifactor models have become very popular in psychological applications, but they are increasingly criticized for statistical pitfalls such as tendency to overfit, tendency to produce anomalous results, instability of solutions, and underidentification problems. In part to combat this state of affairs, many different reliability and dimensionality measures have been proposed to help researchers evaluate the quality of the obtained bifactor solution. However, in empirical practice, the evaluation of bifactor models is largely based on structural equation model fit indices. Other critical indicators of solution quality, such as patterns of general and group factor loadings, whether all estimates are interpretable, and values of reliability coefficients, are often not taken into account. In addition, in the methodological literature, some confusion exists about the appropriate interpretation and application of some bifactor reliability coefficients. In this article, we accomplish several goals. First, we review reliability coefficients for bifactor models and their correct interpretations, and we provide expectations for their values. Second, to help steer researchers away from structural equation model fit indices and to improve current practice, we provide a checklist for evaluating the statistical fit of bifactor models. Third, we evaluate the state of current practice by examining 96 empirical articles employing confirmatory bifactor models across different areas of psychology.
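To make the reviewed reliability coefficients concrete, the sketch below computes omega total and omega hierarchical for a bifactor solution from its standardized loadings, following the usual variance-partitioning formulas for orthogonal bifactor models. The loadings are invented for illustration; they are not taken from any study in the article.

```python
import numpy as np

# Hypothetical standardized loadings: 6 items, one general factor,
# two orthogonal group factors (items 1-3 and items 4-6).
lam_g = np.array([0.70, 0.65, 0.60, 0.55, 0.50, 0.45])   # general factor
lam_groups = [np.array([0.40, 0.35, 0.30]),               # group factor 1
              np.array([0.45, 0.40, 0.35])]               # group factor 2

# Item residual variances for standardized items.
theta = 1 - lam_g**2 - np.concatenate([g**2 for g in lam_groups])

common = lam_g.sum()**2 + sum(g.sum()**2 for g in lam_groups)
total_var = common + theta.sum()

omega_total = common / total_var        # reliability of total scores
omega_h = lam_g.sum()**2 / total_var    # variance due to the general factor

print(f"omega_total = {omega_total:.3f}, omega_h = {omega_h:.3f}")
```

A large gap between omega total and omega hierarchical signals that the group factors, rather than the general factor, carry much of the reliable variance, which is one of the solution-quality indicators the abstract says is often overlooked.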
... When space constraints limit inclusion in the method section, researchers could provide this information in supplemental materials. Adopting such an open science approach to study engagement can accelerate the development of higher-quality studies (Vazire 2018). ...
Article
Full-text available
Introduction Adolescent psychology is embracing intensive longitudinal methods, such as diaries and experience sampling techniques, to investigate real‐life experiences. However, participants might perceive the repetitive self‐reporting in these data collection techniques as burdensome and demotivating, resulting in decreased compliance rates. In this tutorial paper, we present a user‐centered approach aimed at making participation in experience sampling and daily diary studies a meaningful and fun experience for adolescents. Methods In three major research projects that took place between 2019 and 2023, more than 4,000 Dutch adolescents participated (12–25 years old). To improve the participants' user journey, adolescents were invited to codesign our studies and share their expertise in interviews (n = 459), focus groups (n = 101), design decisions (i.e., A/B tests, n = 107), pilots (n = 163), exit interviews (n = 167), and by answering user experience questionnaires (n = 2,109). Results Across projects, we discovered five different main intrinsic and extrinsic motives to participate in intensive longitudinal studies: (1) rewards, (2) fun and interest, (3) helping science or the greater good, (4) helping the scientist or another person, and (5) gaining self‐insight. We provide concrete examples of how we tailored our study designs to address these specific motives to optimize youth engagement. Conclusions The engagement of adolescents in intensive longitudinal studies can be enhanced by making it a meaningful and enjoyable experience, aligned with their own motives.
... org/collections/9b1e83d1/reproducibility-project-cancer-biology), although it arguably first gained significant attention in the social sciences, particularly in psychology [1][2][3][4]. In response, recent reforms have changed how psychologists conduct their research [5,6] (also see [7]). One example is their adoption of pre-registration and registered reports, which can involve pre-commitment to a particular set of hypotheses and study design, often involving peer review [8][9][10][11]. ...
Article
Full-text available
While psychologists have extensively discussed the notion of a “theory crisis” arising from vague and incorrect hypotheses, there has been no debate about such a crisis in biology. However, biologists have long discussed communication failures between theoreticians and empiricists. We argue such failure is one aspect of a theory crisis because misapplied and misunderstood theories lead to poor hypotheses and research waste. We review its solutions and compare them with methodology-focused solutions proposed for replication crises. We conclude by discussing how promoting inclusion, diversity, equity, and accessibility (IDEA) in theoretical biology could contribute to ameliorating breakdowns in the theory-empirical cycle.
... In addition, and perhaps more importantly, a statement offered via addendum is a poor substitute for cautious interpretation throughout the manuscript. While it may not be feasible or even optimal for all researchers to access and sample under-represented populations in their own work, all researchers can commit to a more careful evaluation of their own study's generalisability [19,88,89]. ...
Article
Full-text available
The field of psychology has rapidly transformed its open science practices in recent years. Yet there has been limited progress in integrating principles of diversity, equity and inclusion. In this Perspective, we raise the spectre of Questionable Generalisability Practices and the issue of MASKing (Making Assumptions based on Skewed Knowledge), calling for more responsible practices in generalising study findings and co-authorship to promote global equity in knowledge production. To drive change, researchers must target all four key components of the research process: design, reporting, generalisation, and evaluation. Additionally, macro-level geopolitical factors must be considered to move towards a robust behavioural science that is truly inclusive, representing the voices and experiences of the majority world (i.e., low-and-middle-income countries).
... To ensure scientific credibility we should be doing and publishing a lot more replications (Nosek et al., 2022; Vazire, 2018; Zwaan et al., 2018). There are many systemic challenges hindering replications: there is a strong bias for novelty in publishing, hiring, promotion, and funding; there are sensitivities around conducting replications, where replicators are perceived as having some kind of an agenda; and there are strong prestige, hindsight, and outcome biases where, for example, replicators are criticized as incompetent when replications fail, or replication work is regarded as having no value and as unsurprising when replications succeed (Chandrashekar and Feldman, 2024), with fierce debates even regarding the very definition of replication success and failure. ...
Article
Full-text available
Commentary on Isager et al. (2021) [https://doi.org/10.31222/osf.io/knjea]. Main arguments:
- Replications are very rare: we just do not do replications
- Replication value is tied to research value
- Replications go beyond replicability
... Friesike & Fecher (2016) highlight that principles such as transparency, reproducibility, and cooperation, collectively known as open science, have gained traction across various fields, including the social sciences. Over the past decade, the social sciences have experienced a credibility revolution, sometimes described as a replicability crisis (Vazire, 2018). Steltenpohl et al. (2023) characterize this shift as part of the open science movement, which has profoundly transformed scientific publishing by enhancing both accessibility and transparency (Albert, 2006). ...
Article
Full-text available
The scholarly publishing landscape is changing fast with the rise of open science practices and increased expectations for transparency and rigour. However, there is a notable gap in understanding how social science researchers are adopting transparency and openness in scholarly publishing (TOSP), given the emergence of open science practices. Therefore, this paper asks: (a) What do social science researchers interpret as "transparency and openness in scholarly publishing"? and (b) How do social science researchers navigate and practise transparency and openness in their scholarly publishing? A cohort of the 100 most productive Malaysian-based social science researchers identified from the Web of Science database was invited to participate via email. The evidence reported here comes from 20 who agreed to be interviewed. The findings reveal that social science researchers conceptualise TOSP through seven key themes: data transparency practices; methodological transparency; embracing open access; readiness for criticism and feedback; a reliable peer review process; research ethics in data management; and articulating research limitations. Additionally, the study emphasises nine TOSP practices that social scientists highlight, including sharing and connecting; publishing in affordable open access journals; authorship and publishing standards; international research collaboration; using open access repositories; adopting preprints; adhering to ethics and integrity; participating in the peer review process; and ensuring research reproducibility. This study underscores the importance of TOSP attributes in fostering transparency and openness, which in turn enhances the credibility and impact of social science research. Aligning with these principles enables researchers to contribute to more reliable and impactful scholarship in an evolving academic landscape.
... The primary aims of pre-registration are to enhance transparency in the research process and to distinguish exploratory analyses from confirmatory analyses. By committing to a specific analysis plan before data collection, pre-registration seeks to prevent QRPs such as p-hacking (manipulating data analysis until nonsignificant results become significant) and HARKing (Hypothesizing After the Results are Known) (Vazire, 2018; Nosek et al., 2019; Hardwicke & Wagenmakers, 2023). ...
Preprint
Full-text available
The replication crisis in the social sciences has revealed systemic issues undermining the credibility of research findings, primarily driven by misaligned incentives that encourage questionable research practices (QRPs). This paper offers a comprehensive and critical review of recent empirical evidence on the effectiveness of Open Science initiatives (such as replication studies, reproducibility efforts, pre-registrations, registered reports, and megastudies) in addressing the root causes of the replication crisis. Building upon and extending prior analyses, we integrate recent theoretical models from economics with empirical findings across various social science disciplines to assess how these practices impact research integrity. Our review demonstrates that while measures like pre-registration and data sharing have advanced transparency, they often fall short in mitigating QRPs due to persistent incentive misalignments. In contrast, registered reports and megastudies show greater promise by fundamentally reshaping the incentive structure, shifting the focus from producing statistically significant results to emphasizing methodological rigor and meaningful research questions. We argue that realigning incentives is crucial for fostering a culture of integrity and offer policy recommendations involving key stakeholders (including authors, journals, editors, reviewers, and institutions) to promote practices that enhance research reliability and credibility across the social sciences.
... Credibility in quantitative research has several dimensions. Vazire (2018) describes a credibility revolution in psychology and relates it to four themes: increasing research reporting transparency, encouraging preregistration of studies, conducting more replication studies, and adopting higher standards of evidence. This credibility revolution has been a response to what is commonly referred to as the replication crisis. ...
Conference Paper
Full-text available
Previous research in technology education has illustrated the need to increase transparency in how research is reported in technology education to improve credibility and enhance the repeatability of published works. Following the identification of a replication crisis in several social scientific fields, it became increasingly clear that analytic decisions made by quantitative researchers can have significant effects on observed results. The fact that making seemingly minor or arbitrary decisions whilst analysing data can impact results so much leads to two problematic scenarios. First, researchers may unknowingly produce non-robust results which are observed only by virtue of the analytic decisions they made, rather than results which reflect reality or a broader population. Second, researchers may search through their data for a sequence of analytic decisions which provide a desirable result and report this without describing the entirety of the analytic process, such that readers are unaware that the result is not robust. In light of these possibilities, this work is presented under the assumption that if either of the above two scenarios is occurring in technology education research, it is the first. By using a multiverse analysis to reanalyse data from a previously published study, the robustness of results pertaining to a specific hypothesis is illustrated. In a multiverse analysis, a researcher can specify a series of choices to be made in an analysis, such as which variables can be included or whether subgroups within the sample could be analysed separately, and then every possible scenario of decisions is examined. This process illustrates which results will be observed from particular analytic paths. Through this example, the multiverse analysis as a method for robustness checking is described and commentary is provided on the value of pre-registering research for increasing transparency within the quantitative technology education research literature.
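The enumeration logic of a multiverse analysis is straightforward to sketch in code: define each analytic decision point, take the Cartesian product of the options, and record the estimate from every resulting "universe". The example below uses simulated data and arbitrary decision points (covariate inclusion and subgroup selection), so it illustrates the method rather than the study reanalysed in the paper.

```python
import itertools
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulated stand-in data; a real analysis would load the published dataset.
rng = np.random.default_rng(1)
df = pd.DataFrame({
    "outcome": rng.normal(size=200),
    "predictor": rng.normal(size=200),
    "covariate": rng.normal(size=200),
    "group": rng.integers(0, 2, size=200),
})

# Decision points: include a covariate or not; analyse all data or one subgroup.
covariate_options = ["", " + covariate"]
subset_options = [None, 0, 1]

rows = []
for covs, sub in itertools.product(covariate_options, subset_options):
    data = df if sub is None else df[df["group"] == sub]
    fit = smf.ols(f"outcome ~ predictor{covs}", data=data).fit()
    rows.append({"covariates": covs.strip(" +") or "none", "subset": sub,
                 "beta": fit.params["predictor"], "p": fit.pvalues["predictor"]})

# One row per universe; robustness is judged by the spread of estimates.
print(pd.DataFrame(rows))
```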
... In response to the replication crisis (or reproducibility crisis), the psychological sciences are in the midst of a credibility revolution, a movement to reform research practices to be more rigorous and transparent (Vazire, 2018). The movement has called for widespread changes across all areas of the scientific ecosystem, including those that promote open scholarship practices (Silverstein et al., 2024). ...
... Similar to Churchill's saying, some scholars also identified the potential for change brought about by such crises (see e.g. Drucker 2016; Sijtsma 2023; Spellman 2015; Vazire 2018). However, what actually happens after the visibility dies down and the scandal is 'forgotten', or at least loses its urgency? ...
Article
Full-text available
Scandals involving cases of research misconduct are often considered to be main drivers for policy initiatives and institutional changes to foster research integrity. These impacts of scandals are usually witnessed during scandals’ peak visibility. In this article we change this perspective by examining the way in which scandals continue impacting academic institutions long after the initial attention has faded. To do so, we empirically study research integrity courses at multiple Danish universities. We combine data from document analysis, participatory observations and interviews. In addition, this article makes a conceptual contribution by introducing the notion of the ‘institutional afterlife’ of a scandal. We use this notion to demonstrate how scandals can affect academic communities and practices long after their initial visibility has faded, by re-entering communities and institutions. We thereby contribute a novel approach to studying scandals and their wider implications in academia.
... Additionally, due to a perverse publication and academic rewarding culture, the psychological research field went through a crisis in which many published claims turned out to be both irreproducible and irreplicable. Ever since, it has been reforming and transforming to improve credibility [11]. Nowadays, more rigorous reporting standards are required, and early transparency about what will be researched (e.g. through pre-registration) is frequently given, or even expected. ...
Preprint
As audio machine learning outcomes are deployed in societally impactful applications, it is important to have a sense of the quality and origins of the data used. Noticing that being explicit about this sense is not trivially rewarded in academic publishing in applied machine learning domains, and neither is it included in typical applied machine learning curricula, we present a study into dataset usage connected to the top-5 cited papers at the International Conference on Acoustics, Speech, and Signal Processing (ICASSP). In this, we conduct thorough depth-first analyses of the origins of used datasets, often leading to searches that had to go beyond what was reported in official papers, and ending in unclear or entangled origins. Especially in the current pull towards larger, and possibly generative, AI models, awareness of the need for accountability on data provenance is increasing. With this, we call on the community to not only focus on engineering larger models, but to create more room and reward for making explicit the foundations on which such models should be built.
... Notably, in the realm of social and cognitive psychology, collaborative endeavors like the Many Labs projects have seen research groups from diverse nations engaging in exact and preregistered replications of original studies or sets of studies (e.g., Klein et al., 2014, 2018; also see Stroebe, 2019). While the adoption of these practices has been associated with enhanced research rigor and more robust scholarly evidence, contributing to what some scholars term a "credibility revolution" (e.g., Korbmacher et al., 2023; Vazire, 2018), their application brings about specific consequences for LGBTIQ+ research. ...
Article
The open science (OS) movement has the potential to fundamentally shape how researchers conduct research and distribute findings. However, the implications for research on lesbian, gay, bisexual, trans, intersex, and queer/questioning (LGBTIQ+) experiences present unique considerations. In this paper, included in the special issue on Reimagining LGBTIQ+ Research , we explore how the OS movement broadens access to and comprehension of LGBTIQ+ experiences while simultaneously imposing limitations on the representation of these identities and raising concerns about risks to LGBTIQ+ researchers and participants. Our research focuses on three facets of the OS movement. First, we examine practices related to open data, which advocates that data should be accessible to other researchers to analyze. Yet, providing access to such data challenges may compromise trust between the research team and study participants. Second, we examine practices related to open replicable research, particularly as it has the potential to both highlight and erase the experiences of groups within the LGBTIQ+ community. Finally, we consider how open access, making scholarly articles free to the public, may help educate a broader audience on the lived experiences of LGBTIQ+ people, but in regions where these identities remain heavily stigmatized and/or criminalized, access may be blocked or individuals could be penalized for retrieving this information.
... Such attempts have identified relatively low replication rates (<60%; Camerer et al., 2016; Klein et al., 2014; Klein et al., 2018; Open Science Collaboration, 2015) with few exceptions (but see Bak-Coleman & Devezer, 2023 for a comment; Soto, 2019). These findings have motivated claims that the psychological sciences are suffering from a 'replication crisis' (Maxwell et al., 2015; Nelson et al., 2018; Schooler, 2014) and are now undergoing a 'credibility revolution' (Korbmacher et al., 2023; Vazire, 2018). Concerns about replicability have therefore grown over the last decade, and have also been echoed in other sciences (e.g., Errington et al., 2021; Nosek & Errington, 2017). ...
Article
Full-text available
In psychological science, replicability—repeating a study with a new sample and achieving consistent results (Parsons et al., 2022)—is critical for affirming the validity of scientific findings. Despite its importance, replication efforts are few and far between in psychological science, with many attempts failing to corroborate past findings. This scarcity, compounded by the difficulty in accessing replication data, jeopardizes the efficient allocation of research resources and impedes scientific advancement. Addressing this crucial gap, we present the Replication Database (https://forrt-replications.shinyapps.io/fred_explorer), a novel platform hosting 1,239 original findings paired with replication findings. The infrastructure of this database allows researchers to submit, access, and engage with replication findings. The database makes replications visible, easily findable via a graphical user interface, and tracks replication rates across various factors, such as publication year or journal. This will facilitate future efforts to evaluate the robustness of psychological research.
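Tracking replication rates by a factor such as publication year reduces to a grouped aggregation once findings are in tabular form. A small sketch of that computation is below; the column names and values are hypothetical, not the database's actual schema (the real data are accessed through the platform linked above).

```python
import pandas as pd

# Hypothetical extract: one row per original finding paired with a replication.
findings = pd.DataFrame({
    "original_year": [2008, 2008, 2012, 2012, 2015, 2015],
    "journal": ["J. A", "J. B", "J. A", "J. B", "J. A", "J. B"],
    "replicated": [True, False, False, True, True, True],
})

# Replication rate by publication year of the original finding.
print(findings.groupby("original_year")["replicated"].mean())

# The same idea generalizes to any factor, e.g., by journal.
print(findings.groupby("journal")["replicated"].mean())
```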
... In addition, precisely because information processing in iVR is closer to daily life in the physical 3D world than the presentation of pictures and videos in 2D (Kisker, Gruber, & Schöne, 2021a; Schöne, Kisker, et al., 2021), similarities and differences between scientific findings obtained in iVR and classical experimental paradigms (i.e., with pictures and videos in 2D) are meaningful. If some psychological findings obtained through classical experimental paradigms do not replicate in iVR, this could mean that these classical findings are less likely to generalize to everyday life—which is crucial to know for the progress of psychology as a science (Vazire, 2018). ...
Article
Full text open access from publisher: https://doi.org/10.1016/j.actpsy.2024.104485
Immersive virtual reality (iVR), that is, digital stereoscopic 360° scenarios usually presented in head-mounted displays, has gained much popularity in medical, educational, and consumer contexts in the last years. Recently, psychological research started to utilize the theoretical and methodological advantages of iVR. Furthermore, understanding cognitive, emotional, and behavioral processes in iVR similar to real-life is a genuinely psychological, currently understudied topic. This article briefly reviews the current application of iVR in psychological research and related disciplines. The review presents empirical evidence for opportunities and strengths (e.g., realism, experimental control, effectiveness of therapeutic and educational interventions) as well as challenges and weaknesses (e.g., differences in experiencing presence, interacting with VR content including avatars, i.e., graphical representation of a person). The main part discusses areas requiring additional basic research, such as cognitive processes, socio-emotional processes during social interactions in iVR, and possible societal implications (e.g., fraud, VR-addiction). For both research and application, iVR offers a contemporary extension of the psychological toolkit, offering new avenues to investigate and enhance core phenomena of psychology such as cognition, affect, motivation, and behavior. Still, it is crucial to exercise caution in its application as excessive and careless use of iVR can pose risks to individuals' mental and physical well-being.
... This was used together with a negative case analysis involving examining inconsistencies among the data gathered at each stage of collection (Given, 2008). In addition, the decision-making related to the analysis, such as the development of themes, was noted to enhance the transparency of the data analysis, which can help increase research credibility (Lincoln & Guba, 1985; Vazire, 2018). ...
Article
Full-text available
Research on teacher strategies to improve learners' willingness to communicate (WTC) is limited. However, encouraging outside-the-class conversations could increase L2 exposure and improve communicative skills. This study investigated the effects of self-assessment of self-recorded conversations on Thai university EFL learners' WTC and self-perceived communicative confidence. It also explored their attitudes towards the self-assessment. A mixed-method research design involving questionnaires, reflective reports, and interviews was employed to gain a comprehensive understanding of the effects and perspectives among 46 first-year university students with low English proficiency. Six students were interviewed to provide deeper insights into the effects of the self-assessment. The results suggested that the self-assessment could improve WTC and self-perceived confidence. It could also raise learners' awareness of their speaking problems, promote learning autonomy, and foster the feeling of being supported by the teacher. Teachers can use the findings to increase learners' WTC and foster learning autonomy, especially in the EFL context.
... The principles of transparency and openness at first seem at odds with many of the principles of academic capitalism and the changes it has brought to research governance. Open research reforms have often been explicitly linked to Mertonian norms, appealing to traditional ethical values for governing research (e.g., Cohoon & Howison, 2021; Vazire, 2018). The previously explored manifestations of academic capitalism provide clear evidence of practices which promoted Mitroff's 'counter-norms': specifically self-interestedness and solitariness. ...
Article
Full-text available
There is a need for more nuanced and theoretically grounded analysis of the socio-political consequences of methodological reforms proposed by the open research movement. This paper contributes by utilising the theory of academic capitalism and considering how open research reforms may interact with the priorities and practices of the capitalist university. Three manifestations of academic capitalism are considered: the development of a highly competitive job market for researchers based on metricized performance, the increase in administration resulting from university systems of compliance, and the reorganization of academic labour along principles of “post-academic science”. The ways in which open research reforms both oppose and align with these manifestations are then considered, to explore the relationships between specific reforms and academic capitalist praxis. Overall, it is concluded that open research advocates must engage more closely with the potential of reforms to negatively affect academic labour conditions, which may bring them into conflict with either university management or those who uphold the traditional principles of an ‘all round’ academic role.
... To ensure scientific credibility we should be doing and publishing a lot more replications (Nosek et al., 2022; Vazire, 2018; Zwaan et al., 2018). There are many systemic challenges hindering replications: a strong bias for novelty in publishing, hiring, promotion, and funding; sensitivities around conducting replications, with replicators perceived as having some kind of agenda; and strong prestige, hindsight, and outcome biases whereby, for example, replicators are criticized as incompetent when replications fail, or replication work is dismissed as unsurprising and of no value when replications succeed (Chandrashekar & Feldman, 2024), with fierce debates even regarding the very definition of replication success and failure. ...
Preprint
Full-text available
The value of replications goes beyond replicability and is associated with the value of the research it replicates: Commentary on Isager et al. (2021)
Article
This research aims to identify the competencies required by librarians to organize and conduct citizen science activities. It seeks to enhance professional development by addressing gaps in field-specific and generic competencies through lifelong learning programs, thereby improving library services and fostering community engagement. A survey was conducted among librarians from all types of libraries in Croatia. An online questionnaire, completed by 172 respondents, collected sociodemographic data, data on experience in conducting citizen science activities and professional development, self-assessments of competencies, competencies needed for conducting citizen science activities, and preferred methods for acquiring these competencies. The study involved librarians who were already familiar with the topic of citizen science. A comprehensive list of competencies was created, covering both professional and generic skills relevant to citizen science. The data were analyzed to evaluate the importance and current acquisition levels of these competencies, as well as preferences for professional development methods. The results indicate a high level of interest among librarians in conducting citizen science activities, with most respondents recognizing the potential of such initiatives to improve library services and contribute to their professional development. While generic competencies such as teamwork, communication, and organizational skills were rated as highly important and well-acquired, field-specific skills like digitization, research data management, and open science concepts were identified as areas requiring further development. Webinars, courses, and workshops emerged as the most preferred formats for professional training. Librarians acknowledge the significant role of citizen science in enriching library services and advancing their professional roles. However, targeted training programs are essential to address competency gaps, particularly in digital literacy and research-related skills. By equipping librarians with these competencies, libraries can play a key role in fostering citizen science initiatives and enhancing community engagement.
Article
Full-text available
Past experimental research shows that prosocial behavior promotes happiness. But do past findings hold up to current standards of consistent, rigorous, and generalizable evidence? In this review, we considered the evidentiary value of past experiments examining the happiness (i.e., subjective well-being; SWB) benefits of prosocial action, such as spending money on others or acts of kindness, in non-clinical samples. Specifically, we examined: (1) how consistent findings are across meta-analyses, (2) the conclusions of pre-registered, well-powered experiments, and (3) whether the SWB benefits of prosociality are detectable beyond WEIRD (Western, Educated, Industrialized, Rich, Democratic) samples. Across the two meta-analyses we found, prosocial behavior led to a small, consistent increase in happiness, yet estimates were based primarily on underpowered and WEIRD samples. We identified a growing number of pre-registered experiments (19/71 conducted to date), in which: (1) roughly half were well-powered; (2) only two recruited non-WEIRD samples, both underpowered and collectively showing mixed results; and (3) most examined prosocial spending (79%) over other prosocial behaviors, with happiness gains observed most consistently in well-powered studies on prosocial spending. Finally, we found that just 19% of all experiments recruited non-WEIRD samples, most of which were underpowered and presented mixed results, with acts of prosocial spending demonstrating the most consistent evidence of happiness benefits. We join other researchers in urging for more well-powered, pre-registered experiments examining various prosocial behaviors, particularly with Global Majority samples, to ensure that our understanding of the SWB benefits of prosociality is firmly grounded in solid and inclusive evidence.
Article
Full-text available
Al-Hoorie et al. (2024) offer compelling arguments for why L2 motivational self system research is currently in a state of validation crisis. Seeking a constructive resolution to the crisis, in this response we argue that two fundamental conditions are needed for the field to emerge stronger: psychological readiness and methodological maturity. For psychological readiness, we call for a reframing of the “crisis” narrative. We highlight the need to value controversy, to normalize failure and (self-)correction, and to resist the allure of novelty. For methodological maturity, we suggest that an argument-based approach to validation can provide a constructive solution to current controversies. We present an integrated framework which can guide systematic validation efforts, and we demonstrate its application using a recent validation study as an example.
Article
Full-text available
Science is integral to society because it can inform individual, government, corporate, and civil society decision-making on issues such as public health, new technologies or climate change. Yet, public distrust and populist sentiment challenge the relationship between science and society. To help researchers analyse the science-society nexus across different geographical and cultural contexts, we undertook a cross-sectional population survey resulting in a dataset of 71,922 participants in 68 countries. The data were collected between November 2022 and August 2023 as part of the global Many Labs study “Trust in Science and Science-Related Populism” (TISP). The questionnaire contained comprehensive measures for individuals’ trust in scientists, science-related populist attitudes, perceptions of the role of science in society, science media use and communication behaviour, attitudes to climate change and support for environmental policies, personality traits, political and religious views and demographic characteristics. Here, we describe the dataset, survey materials and psychometric properties of key variables. We encourage researchers to use this unique dataset for global comparative analyses on public perceptions of science and its role in society and policy-making.
Article
When we use language to communicate, we must choose what to say, what not to say, and how to say it. That is, we must decide how to frame the message. These linguistic choices matter: Framing a discussion one way or another can influence how people think, feel, and act in many important domains, including politics, health, business, journalism, law, and even conversations with loved ones. The ubiquity of framing effects raises several important questions relevant to the public interest: What makes certain messages so potent and others so ineffectual? Do framing effects pose a threat to our autonomy, or are they a rational response to variation in linguistic content? Can we learn to use language more effectively to promote policy reforms or other causes we believe in, or is this an overly idealistic goal? In this article, we address these questions by providing an integrative review of the psychology of framing. We begin with a brief history of the concept of framing and a survey of common framing effects. We then outline the cognitive, social-pragmatic, and emotional mechanisms underlying such effects. This discussion centers on the view that framing is a natural—and unavoidable—feature of human communication. From this perspective, framing effects reflect a sensible response to messages that communicate different information. In the second half of the article, we provide a taxonomy of linguistic framing techniques, describing various ways that the structure or content of a message can be altered to shape people’s mental models of what is being described. Some framing manipulations are subtle, involving a slight shift in grammar or wording. Others are more overt, involving wholesale changes to a message. Finally, we consider factors that moderate the impact of framing, gaps in the current empirical literature, and opportunities for future research. We conclude by offering general recommendations for effective framing and reflecting on the place of framing in society. Linguistic framing is powerful, but its effects are not inevitable—we can always reframe an issue to ourselves or other people.
Article
Social sciences are navigating an unprecedented period of introspection about the credibility and utility of disciplinary practices. Reform initiatives have emphasized the benefits of various transparency and reproducibility-related research practices; however, the adoption of these across music psychology is unknown. To estimate the prevalence, a manual examination of a random sample of 239 articles out of 1,192 articles published in five music psychology journals between 2017 and 2022 was carried out. About half of the articles (112/239) were publicly available, and 39% shared some of the research materials, but only 5% shared raw data and 1% shared analysis scripts. Pre-registrations were not observed in the sample. Most articles (82%) included a funding disclosure statement, but conflict of interest statements were less common (27%). Replication studies were rare (3%). Additional searches for replication studies were conducted beyond the sample. These analyses did not find substantially more replication studies in music psychology. In general, the results suggest that transparency and reproducibility-related research practices were far from routine in music psychology. The findings establish a baseline that can be used to assess future progress toward increasing the credibility and openness of music psychology research.
Chapter
In a time when new research methods are constantly being developed and science is evolving, researchers must continually educate themselves on cutting-edge methods and best practices related to their field. The second of three volumes, this Handbook provides comprehensive and up-to-date coverage of a variety of issues important in developing, designing, and collecting data to produce high-quality research efforts. First, leading scholars from around the world provide an in-depth explanation of various advanced methodological techniques. In the second section, chapters cover important methodological considerations that apply across all types of data collection. In the third section, the chapters cover self-report and behavioral measures and considerations for their use. In the fourth section, various psychological measures are covered. The final section of the handbook covers issues that directly concern qualitative data collection approaches. Throughout the book, examples and real-world research efforts from dozens of different disciplines are discussed.
Article
Objectives As a fundamental ethical principle, honesty plays a pivotal role in trust‐building, social cohesion, and effective governance. This study examined how honesty was valued in Thailand, an upper middle‐income Asian economy steeped in Buddhist values. It also explored the relationship between honesty and socioeconomic characteristics. Methods Using primary data from 1230 Bangkok residents aged 18–75, collected through stratified multi‐stage sampling, an instrument to measure honesty was developed and validated. Honesty was operationalized with three components: truthfulness, respect for ownership, and accountability. Ordinary Least Squares (OLS) and Tobit regression analyses were performed. Results Honesty was valued highly in Thai society. The average scores for accountability and respect for ownership were notably higher than for truthfulness. Socioeconomic characteristics, including gender, age, birth cohort, place of birth, religiosity, and self‐rated economic status, were statistically associated with overall honesty and influenced the three components of honesty differently. The effects of age and birth cohorts were distinct. Younger birth cohorts were associated with a higher level of honesty. However, within each cohort, honesty increased with age. Conclusions This study proposes recommendations aimed at promoting honesty in Thai society and provides ideas for future studies.
Book
Full-text available
This volume, featuring 14 chapters from some of the most forward-thinking scholars in applied linguistics, seeks to equip readers with an in-depth and field-specific understanding of OS principles and practices. As evident in the table of contents, the chapters cover a range of topics related to OS. Some are largely conceptual, seeking to foster an understanding of the rationales for OS as well as the open science ethic; others are much more practical, offering hands-on guidance for OS practices (e.g., preregistration, data sharing) whether at the individual researcher, journal, or programmatic level.
Article
The emergence of large language models (LLMs) has sparked considerable interest in their potential application in psychological research, mainly as a model of the human psyche or as a general text-analysis tool. However, the trend of using LLMs without sufficient attention to their limitations and risks, which we rhetorically refer to as “GPTology”, can be detrimental given the easy access to models such as ChatGPT. Beyond existing general guidelines, we investigate the current limitations, ethical implications, and potential of LLMs specifically for psychological research, and show their concrete impact in various empirical studies. Our results highlight the importance of recognizing global psychological diversity, cautioning against treating LLMs (especially in zero-shot settings) as universal solutions for text analysis, and developing transparent, open methods to address LLMs’ opaque nature for reliable, reproducible, and robust inference from AI-generated data. Acknowledging LLMs’ utility for task automation, such as text annotation, or to expand our understanding of human psychology, we argue for diversifying human samples and expanding psychology’s methodological toolbox to promote an inclusive, generalizable science, countering homogenization, and over-reliance on LLMs.
Article
Full-text available
We wish to express our concern about the role of for-profit scientific publishers in understanding and appropriating what “Open Science” means. This role can be characterised as opportunistic, and it has led to an interpretation that can cause considerable confusion when we identify Open Science with Open Access and Open Access with “paying for publishing”. This simplistic approach to what Open Science entails has led to poor-quality publications, hindering the improvement of researchers' practices and culture. We discuss and clarify issues, identifying “false friends”, misunderstandings, and misleading interpretations of Open Science. A superficial interpretation, sometimes driven by vested interests or simply due to the proliferation of bad practices, leads to unethical behaviour or simple opportunism in the “publish and perish” context where Open Science has developed. We then provide guidance on challenges and potential solutions for all stakeholders to increase rigour and credibility in science, through a genuine researcher perspective of Open Science.
Article
Full-text available
We conducted preregistered replications of 28 classic and contemporary published findings, with protocols that were peer reviewed in advance, to examine variation in effect magnitudes across samples and settings. Each protocol was administered to approximately half of 125 samples that comprised 15,305 participants from 36 countries and territories. Using the conventional criterion of statistical significance (p < .05), we found that 15 (54%) of the replications provided evidence of a statistically significant effect in the same direction as the original finding. With a strict significance criterion (p < .0001), 14 (50%) of the replications still provided such evidence, a reflection of the extremely high-powered design. Seven (25%) of the replications yielded effect sizes larger than the original ones, and 21 (75%) yielded effect sizes smaller than the original ones. The median comparable Cohen’s ds were 0.60 for the original findings and 0.15 for the replications. The effect sizes were small (< 0.20) in 16 of the replications (57%), and 9 effects (32%) were in the direction opposite the direction of the original effect. Across settings, the Q statistic indicated significant heterogeneity in 11 (39%) of the replication effects, and most of those were among the findings with the largest overall effect sizes; only 1 effect that was near zero in the aggregate showed significant heterogeneity according to this measure. Only 1 effect had a tau value greater than .20, an indication of moderate heterogeneity. Eight others had tau values near or slightly above .10, an indication of slight heterogeneity. Moderation tests indicated that very little heterogeneity was attributable to the order in which the tasks were performed or whether the tasks were administered in lab versus online. Exploratory comparisons revealed little heterogeneity between Western, educated, industrialized, rich, and democratic (WEIRD) cultures and less WEIRD cultures (i.e., cultures with relatively high and low WEIRDness scores, respectively). Cumulatively, variability in the observed effect sizes was attributable more to the effect being studied than to the sample or setting in which it was studied.
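For readers unfamiliar with the heterogeneity statistics named in this abstract, here is a minimal sketch of how Cochran's Q and tau are typically computed. The per-lab effect sizes and standard errors below are hypothetical illustrations, not the project's data, and this is not the project's analysis code.

```python
# Sketch: Cochran's Q and a DerSimonian-Laird estimate of tau (the
# between-lab SD of true effects), from hypothetical per-lab Cohen's d
# values and their standard errors.
import numpy as np

def q_and_tau(effects, ses):
    w = 1.0 / np.square(ses)                    # inverse-variance weights
    pooled = np.sum(w * effects) / np.sum(w)    # weighted mean effect
    q = np.sum(w * (effects - pooled) ** 2)     # Cochran's Q
    c = np.sum(w) - np.sum(w ** 2) / np.sum(w)
    tau2 = max(0.0, (q - (len(effects) - 1)) / c)
    return q, np.sqrt(tau2)

d = np.array([0.05, 0.20, 0.12, 0.30, 0.08])    # hypothetical per-lab ds
se = np.array([0.09, 0.10, 0.08, 0.11, 0.09])   # hypothetical per-lab SEs
q, tau = q_and_tau(d, se)
print(f"Q = {q:.2f} (df = {len(d) - 1}), tau = {tau:.2f}")
```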
Article
Full-text available
Dijksterhuis and van Knippenberg (1998) reported that participants primed with an intelligent category (“professor”) subsequently performed 13.1% better on a trivia test than participants primed with an unintelligent category (“soccer hooligans”). Two unpublished replications of this study by the original authors, designed to verify the appropriate testing procedures, observed a smaller difference between conditions (2-3%) as well as a gender difference: men showed the effect (9.3% and 7.6%) but women did not (0.3% and -0.3%). The procedure used in those replications served as the basis for this multi-lab Registered Replication Report (RRR). A total of 40 laboratories collected data for this project, with 23 laboratories meeting all inclusion criteria. Here we report the meta-analytic result of those 23 direct replications (total N = 4,493) of the updated version of the original study, examining the difference between priming with professor and hooligan on a 30-item general knowledge trivia task (a supplementary analysis reports results with all 40 labs, N = 6,454). We observed no overall difference in trivia performance between participants primed with professor and those primed with hooligan (0.14%) and no moderation by gender.
Article
Full-text available
We propose to change the default P-value threshold for statistical significance for claims of new discoveries from 0.05 to 0.005.
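To make the practical cost of this proposal concrete, here is a small sketch (my illustration using an off-the-shelf power calculation, not the authors' code) of how the required sample size grows when alpha drops from .05 to .005 while 80% power is maintained; the effect size d = 0.4 is an arbitrary assumption.

```python
# Sketch: per-group n for a two-sample t-test at 80% power and d = 0.4,
# comparing alpha = .05 with the proposed alpha = .005.
from statsmodels.stats.power import TTestIndPower

solver = TTestIndPower()
for alpha in (0.05, 0.005):
    n = solver.solve_power(effect_size=0.4, alpha=alpha, power=0.8)
    print(f"alpha = {alpha}: n per group ~ {n:.0f}")
# The stricter threshold requires roughly 70% more participants here,
# in line with the increase the proposal itself discusses.
```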
Article
Full-text available
Finkel, Rusbult, Kumashiro, and Hannon (2002, Study 1) demonstrated a causal link between subjective commitment to a relationship and how people responded to hypothetical betrayals of that relationship. Participants primed to think about their commitment to their partner (high commitment) reacted to the betrayals with reduced exit and neglect responses relative to those primed to think about their independence from their partner (low commitment). The priming manipulation did not affect constructive voice and loyalty responses. Although other studies have demonstrated a correlation between subjective commitment and responses to betrayal, this study provides the only experimental evidence that inducing changes to subjective commitment can causally affect forgiveness responses. This Registered Replication Report (RRR) meta-analytically combines the results of 16 new direct replications of the original study, all of which followed a standardized, vetted, and preregistered protocol. The results showed little effect of the priming manipulation on the forgiveness outcome measures, but the priming manipulation also did not affect subjective commitment itself, so the manipulation did not work as it had in the original study. We discuss possible explanations for the discrepancy between the findings from this RRR and the original study.
Article
Full-text available
When consumers of science (readers and reviewers) lack relevant details about the study design, data, and analyses, they cannot adequately evaluate the strength of a scientific study. Lack of transparency is common in science, and is encouraged by journals that place more emphasis on the aesthetic appeal of a manuscript than on the robustness of its scientific claims. In doing this, journals are implicitly encouraging authors to do whatever it takes to obtain eye-catching results. To achieve this, researchers can use common research practices that beautify results at the expense of the robustness of those results (e.g., p-hacking). The problem is not engaging in these practices, but failing to disclose them. A car whose carburetor is duct-taped to the rest of the car might work perfectly fine, but the buyer has a right to know about the duct-taping. Without high levels of transparency in scientific publications, consumers of scientific manuscripts are in a position similar to that of buyers of used cars – they cannot reliably tell the difference between lemons and high-quality findings. This phenomenon – quality uncertainty – has been shown to erode trust in economic markets, such as the used car market. The same problem threatens to erode trust in science. The solution is to increase transparency and give consumers of scientific research the information they need to accurately evaluate research. Transparency would also encourage researchers to be more careful in how they conduct their studies and write up their results. To make this happen, we must tie journals’ reputations to their practices regarding transparency. Reviewers hold a great deal of power to make this happen, by demanding the transparency needed to rigorously evaluate scientific manuscripts. The public expects transparency from science, and appropriately so – we should be held to a higher standard than used car salespeople.
Article
Full-text available
Improving the reliability and efficiency of scientific research will increase the credibility of the published scientific literature and accelerate discovery. Here we argue for the adoption of measures to optimize key elements of the scientific process: methods, reporting and dissemination, reproducibility, evaluation and incentives. There is some evidence from both simulations and empirical studies supporting the likely effectiveness of these measures, but their broad adoption by researchers, institutions, funders and journals will require iterative evaluation and improvement. We discuss the goals of these measures, and how they can be implemented, in the hope that this will facilitate action toward improving the transparency, reproducibility and efficiency of scientific research.
Article
Full-text available
Poor research design and data analysis encourage false-positive findings. Such poor methods persist despite perennial calls for improvement, suggesting that they result from something more than just misunderstanding. The persistence of poor methods results partly from incentives that favor them, leading to the natural selection of bad science. This dynamic requires no conscious strategizing (no deliberate cheating nor loafing) by scientists, only that publication is a principal factor for career advancement. Some normative methods of analysis have almost certainly been selected to further publication instead of discovery. In order to improve the culture of science, a shift must be made away from correcting misunderstandings and towards rewarding understanding. We support this argument with empirical evidence and computational modeling. We first present a 60-year meta-analysis of statistical power in the behavioral sciences and show that power has not improved despite repeated demonstrations of the necessity of increasing power. To demonstrate the logical consequences of structural incentives, we then present a dynamic model of scientific communities in which competing laboratories investigate novel or previously published hypotheses using culturally transmitted research methods. As in the real world, successful labs produce more "progeny", such that their methods are more often copied and their students are more likely to start labs of their own. Selection for high output leads to poorer methods and increasingly high false-discovery rates. We additionally show that replication slows but does not stop the process of methodological deterioration. Improving the quality of research requires change at the institutional level.
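The selection dynamic summarized above can be illustrated with a toy model; the sketch below is my own construction and is far cruder than the paper's simulation. It assumes only that rigor is costly (output falls as effort rises) and that new labs copy successful labs.

```python
# Toy sketch of "natural selection of bad science": labs are copied in
# proportion to their output, output is inversely related to methodological
# effort, so mean effort drifts downward with no deliberate cheating.
import random

random.seed(42)
labs = [random.uniform(0.1, 1.0) for _ in range(100)]   # effort (rigor) per lab

for generation in range(50):
    weights = [1.0 / effort for effort in labs]         # more output when effort is low
    labs = [min(1.0, max(0.05, random.choices(labs, weights=weights)[0]
                         + random.gauss(0, 0.02)))      # copy a parent, with noise
            for _ in range(len(labs))]

print(f"mean effort after selection: {sum(labs) / len(labs):.2f}")  # near the floor
```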
Article
Full-text available
Openness is a core value of scientific practice. The sharing of research materials and data facilitates critique, extension, and application within the scientific community, yet current norms provide few incentives for researchers to share evidence underlying scientific claims. In January 2014, the journal Psychological Science adopted such an incentive by offering “badges” to acknowledge and signal open practices in publications. In this study, we evaluated the effect that two types of badges—Open Data badges and Open Materials badges—have had on reported data and material sharing, as well as on the actual availability, correctness, usability, and completeness of those data and materials, both in Psychological Science and in four comparison journals. We report an increase in reported data sharing of more than an order of magnitude from baseline in Psychological Science, as well as an increase in reported materials sharing, although to a weaker degree. Moreover, we show that reportedly available data and materials were more accessible, correct, usable, and complete when badges were earned. We demonstrate that badges are effective incentives that improve the openness, accessibility, and persistence of data and materials that underlie scientific research.
Article
Full-text available
Language can be viewed as a complex set of cues that shape people’s mental representations of situations. For example, people think of behavior described using imperfective aspect (i.e., what a person was doing) as a dynamic, unfolding sequence of actions, whereas the same behavior described using perfective aspect (i.e., what a person did) is perceived as a completed whole. A recent study found that aspect can also influence how we think about a person’s intentions (Hart & Albarracín, 2011). Participants judged actions described in imperfective as being more intentional (d between 0.67 and 0.77) and they imagined these actions in more detail (d = 0.73). The fact that this finding has implications for legal decision making, coupled with the absence of other direct replication attempts, motivated this registered replication report (RRR). Multiple laboratories carried out 12 direct replication studies, including one MTurk study. A meta-analysis of these studies provides a precise estimate of the size of this effect free from publication bias. This RRR did not find that grammatical aspect affects intentionality (d between 0 and −0.24) or imagery (d = −0.08). We discuss possible explanations for the discrepancy between these results and those of the original study.
Article
Full-text available
The data include measures collected for the two experiments reported in “False-Positive Psychology” [1], in which listening to a randomly assigned song made people feel younger (Study 1) or actually be younger (Study 2). These data are useful because they illustrate how flexibility in data collection, analysis, and reporting of results inflates false-positive rates, and they are well suited for educational purposes.
Article
Full-text available
Crisis of replicability is one term that psychological scientists use for the current introspective phase we are in; I argue instead that we are going through a revolution analogous to a political revolution. Revolution 2.0 is an uprising focused on how we should be doing science now (i.e., in a 2.0 world). The precipitating events of the revolution have already been well-documented: failures to replicate, questionable research practices, fraud, etc. And the fact that none of these events is new to our field has also been well-documented. I suggest four interconnected reasons as to why this time is different: changing technology, changing demographics of researchers, limited resources, and misaligned incentives. I then describe two reasons why the revolution is more likely to catch on this time: technology (as part of the solution) and the fact that these concerns cut across the social and life sciences; that is, we are not alone. Neither side in the revolution has behaved well, and each has characterized the other in extreme terms (although, of course, each has had a few extreme actors). Some suggested reforms are already taking hold (e.g., journals asking for more transparency in methods and analysis decisions; journals publishing replications) but the feared tyrannical requirements have, of course, not taken root (e.g., few journals require open data; there is no ban on exploratory analyses). Still, we have not yet made needed advances in the ways in which we accumulate, connect, and extract conclusions from our aggregated research. However, we are now ready to move forward by adopting incremental changes and by acknowledging the multiplicity of goals within psychological science.
Article
Full-text available
Transparency, openness, and reproducibility are readily recognized as vital features of science (1, 2). When asked, most scientists embrace these features as disciplinary norms and values (3). Therefore, one might expect that these valued features would be routine in daily practice. Yet, a growing body of evidence suggests that this is not the case (4–6).
Article
Full-text available
Although researchers often assume their participants are naive to experimental materials, this is not always the case. We investigated how prior exposure to a task affects subsequent experimental results. Participants in this study completed the same set of 12 experimental tasks at two points in time, first as a part of the Many Labs replication project and again a few days, a week, or a month later. Effect sizes were markedly lower in the second wave than in the first. The reduction was most pronounced when participants were assigned to a different condition in the second wave. We discuss the methodological implications of these findings.
Article
Full-text available
In 2012, the American Political Science Association (APSA) Council adopted new policies guiding data access and research transparency in political science. The policies appear as a revision to APSA's Guide to Professional Ethics in Political Science. The revisions were the product of an extended and broad consultation with a variety of APSA committees and the association's membership.
Article
Full-text available
The authors evaluate the quality of research reported in major journals in social-personality psychology by ranking those journals with respect to their N-pact Factors (NF): the statistical power of the empirical studies they publish to detect typical effect sizes. Power is a particularly important attribute for evaluating research quality because, relative to studies that have low power, studies that have high power are more likely to (a) provide accurate estimates of effects, (b) produce literatures with low false-positive rates, and (c) lead to replicable findings. The authors show that the average sample size in social-personality research is 104 and that the power to detect the typical effect size in the field is approximately 50%. Moreover, they show that there is considerable variation among journals in the sample sizes and power of the studies they publish, with some journals consistently publishing higher-power studies than others. The authors hope that these rankings will be of use to authors who are choosing where to submit their best work, provide hiring and promotion committees with a superior way of quantifying journal quality, and encourage competition among journals to improve their NF rankings.
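As a rough check on the ~50% figure, the power arithmetic can be reproduced in a few lines. This sketch is my own, and it assumes a typical effect of r ≈ .20 (a conventional estimate for social-personality research) tested two-sided at alpha = .05 via the Fisher z approximation.

```python
# Sketch: approximate power of a two-sided test of a correlation r
# with sample size n, using the Fisher z approximation.
import math
from statistics import NormalDist

def power_correlation(r, n, alpha=0.05):
    z_r = math.atanh(r)                              # Fisher z of the effect
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)     # two-sided critical value
    return 1 - NormalDist().cdf(z_crit - z_r * math.sqrt(n - 3))

print(f"power ~ {power_correlation(0.20, 104):.2f}")  # ~ 0.53 at the field-average N
```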
Article
Full-text available
Schnall, Benton, and Harvey (2008) hypothesized that physical cleanliness reduces the severity of moral judgments. In support of this idea, they found that individuals make less severe judgments when they are primed with the concept of cleanliness (Exp. 1) and when they wash their hands after experiencing disgust (Exp. 2). We conducted direct replications of both studies using materials supplied by the original authors. We did not find evidence that physical cleanliness reduced the severity of moral judgments using sample sizes that provided over .99 power to detect the original effect sizes. Our estimates of the overall effect size were much smaller than estimates from Experiment 1 (original d = −0.60, 95% CI [−1.23, 0.04], N = 40; replication d = −0.01, 95% CI [−0.28, 0.26], N = 208) and Experiment 2 (original d = −0.85, 95% CI [−1.47, −0.22], N = 43; replication d = 0.01, 95% CI [−0.34, 0.36], N = 126). These findings suggest that the population effect sizes are probably substantially smaller than the original estimates. Researchers investigating the connections between cleanliness and morality should therefore use large sample sizes to have the necessary power to detect subtle effects.
Article
Full-text available
Bargh and Shalev (2012) hypothesized that people use warm showers and baths to compensate for a lack of social warmth. As support for this idea, they reported results from two studies that found an association between trait loneliness and bathing habits. Given the potential practical and theoretical importance of this association, we conducted nine additional studies on this topic. Using our own bathing or showering measures and the most current version of the UCLA Loneliness Scale (Russell, 1996), we found no evidence for an association between trait loneliness and a composite index of showering or bathing habits in a combined sample of 1,153 participants from four studies. Likewise, the aggregated effect size estimate was not statistically significant using the same measures as the original studies in a combined sample of 1,920 participants from five studies. A local meta-analysis including the original studies yielded an effect size estimate for the composite that included zero in the 95% confidence interval. The current results therefore cast doubt on the idea of a strong connection between trait loneliness and personal bathing habits related to warmth.
Article
Full-text available
Although replication is a central tenet of science, direct replications are rare in psychology. This research tested variation in the replicability of 13 classic and contemporary effects across 36 independent samples totaling 6,344 participants. In the aggregate, 10 effects replicated consistently. One effect – imagined contact reducing prejudice – showed weak support for replicability. And two effects – flag priming influencing conservatism and currency priming influencing system justification – did not replicate. We examined whether conditions such as lab versus online administration or US versus international samples predicted effect magnitudes. By and large they did not. The results of this small sample of effects suggest that replicability is more dependent on the effect itself than on the sample and setting used to investigate the effect.
Article
Full-text available
The current crisis in psychological research involves issues of fraud, replication, publication bias, and false positive results. I argue that this crisis follows the failure of widely adopted solutions to psychology's similar crisis of the 1970s. The untouched root cause is an information-economic one: Too many studies divided by too few publication outlets equals a bottleneck. Articles cannot pass through just by showing theoretical meaning and methodological rigor; their results must appear to support the hypothesis perfectly. Consequently, psychologists must master the art of presenting perfect-looking results just to survive in the profession. This favors aesthetic criteria of presentation in a way that harms science's search for truth. Shallow standards of statistical perfection distort analyses and undermine the accuracy of cumulative data; narrative expectations encourage dishonesty about the relationship between results and hypotheses; criteria of novelty suppress replication attempts. Concerns about truth in research are emerging in other sciences and may eventually descend on our heads in the form of difficult and insensitive regulations. I suggest a more palatable solution: to open the bottleneck, putting structures in place to reward broader forms of information sharing beyond the exquisite art of present-day journal publication.
Article
Full-text available
Recent controversies in psychology have spurred conversations about the nature and quality of psychological research. One topic receiving substantial attention is the role of replication in psychological science. Using the complete publication history of the 100 psychology journals with the highest 5-year impact factors, the current article provides an overview of replications in psychological research since 1900. This investigation revealed that roughly 1.6% of all psychology publications used the term replication in text. A more thorough analysis of 500 randomly selected articles revealed that only 68% of articles using the term replication were actual replications, resulting in an overall replication rate of 1.07%. Contrary to previous findings in other fields, this study found that the majority of replications in psychology journals reported similar findings to their original studies (i.e., they were successful replications). However, replications were significantly less likely to be successful when there was no overlap in authorship between the original and replicating articles. Moreover, despite numerous systemic biases, the rate at which replications are being published has increased in recent decades.
Article
Full-text available
Bargh et al. (2001) reported two experiments in which people were exposed to words related to achievement (e.g., strive, attain) or to neutral words, and then performed a demanding cognitive task. Performance on the task was enhanced after exposure to the achievement related words. Bargh and colleagues concluded that better performance was due to the achievement words having activated a "high-performance goal". Because the paper has been cited well over 1100 times, an attempt to replicate its findings would seem warranted. Two direct replication attempts were performed. Results from the first experiment (n = 98) found no effect of priming, and the means were in the opposite direction from those reported by Bargh and colleagues. The second experiment followed up on the observation by Bargh et al. (2001) that high-performance-goal priming was enhanced by a 5-minute delay between priming and test. Adding such a delay, we still found no evidence for high-performance-goal priming (n = 66). These failures to replicate, along with other recent results, suggest that the literature on goal priming requires some skeptical scrutiny.
Article
Full-text available
The author developed a model that explains and predicts both longitudinal and cross-sectional variation in the output of major and minor creative products. The model first yields a mathematical equation that accounts for the empirical age curves, including contrasts across creative domains in the expected career trajectories. The model is then extended to account for individual differences in career trajectories, such as the longitudinal stability of cross-sectional variation and the differential placement of career landmarks (the ages at first, best, and last contribution). The theory is parsimonious in that it requires only two individual-difference parameters (initial creative potential and age at career onset) and two information-processing parameters (ideation and elaboration rates), plus a single principle (the equal-odds rule), to derive several precise predictions that cannot be generated by any alternative theory.
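For orientation, the model's central equation has a two-exponential form; the rendering below follows my reading of Simonton's published work, with labels added for the symbols named in the abstract.

```latex
% Creative productivity p at career age t:
%   m = initial creative potential, a = ideation rate, b = elaboration rate.
p(t) = \frac{a b m}{b - a}\left(e^{-a t} - e^{-b t}\right)
```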
Article
Full-text available
Cohen (1962) pointed out the importance of statistical power for psychology as a science, but statistical power of studies has not increased, while the number of studies in a single article has increased. It has been overlooked that multiple studies with modest power have a high probability of producing nonsignificant results because power decreases as a function of the number of statistical tests that are being conducted (Maxwell, 2004). The discrepancy between the expected number of significant results and the actual number of significant results in multiple-study articles undermines the credibility of the reported results, and it is likely that questionable research practices have contributed to the reporting of too many significant results (Sterling, 1959). The problem of low power in multiple-study articles is illustrated using Bem's (2011) article on extrasensory perception and Gailliot et al.'s (2007) article on glucose and self-regulation. I conclude with several recommendations that can increase the credibility of scientific evidence in psychological journals. One major recommendation is to pay more attention to the power of studies to produce positive results without the help of questionable research practices and to request that authors justify sample sizes with a priori predictions of effect sizes. It is also important to publish replication studies with nonsignificant results if these studies have high power to replicate a published finding.
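The core point about multiple-study articles is simple arithmetic: the probability that all k independent studies in a package are significant is the product of their individual powers. A quick sketch (my own numbers, assuming 80% power per study):

```python
# Sketch: probability that ALL k independent studies reach significance
# when each has 80% power.
for k in (1, 2, 3, 5, 10):
    print(f"{k} studies, all significant: {0.8 ** k:.2f}")
# 5 studies -> 0.33; 10 studies -> 0.11. Packages of uniformly significant
# studies quickly become improbable without selective reporting.
```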
Article
Full-text available
Existing norms for scientific communication are rooted in anachronistic practices of bygone eras, making them needlessly inefficient. We outline a path that moves away from the existing model of scientific communication to improve the efficiency in meeting the purpose of public science: knowledge accumulation. We call for six changes: (1) full embrace of digital communication, (2) open access to all published research, (3) disentangling publication from evaluation, (4) breaking the "one article, one journal" model with a grading system for evaluation and diversified dissemination outlets, (5) publishing peer review, and (6) allowing open, continuous peer review. We address conceptual and practical barriers to change, and provide examples showing how the suggested practices are being used already. The critical barriers to change are not technical or financial; they are social. While scientists guard the status quo, they also have the power to change it.
Article
Full-text available
In this article, we accomplish two things. First, we show that despite empirical psychologists' nominal endorsement of a low rate of false-positive findings (≤ .05), flexibility in data collection, analysis, and reporting dramatically increases actual false-positive rates. In many cases, a researcher is more likely to falsely find evidence that an effect exists than to correctly find evidence that it does not. We present computer simulations and a pair of actual experiments that demonstrate how unacceptably easy it is to accumulate (and report) statistically significant evidence for a false hypothesis. Second, we suggest a simple, low-cost, and straightforwardly effective disclosure-based solution to this problem. The solution involves six concrete requirements for authors and four guidelines for reviewers, all of which impose a minimal burden on the publication process.
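A minimal simulation in the spirit of the ones reported here shows the mechanism. This is my sketch, not the authors' code, and the assumed "researcher degrees of freedom" are just two correlated dependent variables, their average, and one round of optional stopping.

```python
# Sketch: false-positive rate under a true null when the analyst may test
# two correlated DVs and their average, and may add 10 more participants
# per cell after an interim look.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2011)
COV = [[1.0, 0.5], [0.5, 1.0]]  # two DVs correlated at r = .5

def p_values(a, b):
    """p-values for DV1, DV2, and their average (independent-samples t)."""
    pairs = [(a[:, 0], b[:, 0]), (a[:, 1], b[:, 1]),
             (a.mean(axis=1), b.mean(axis=1))]
    return [stats.ttest_ind(x, y).pvalue for x, y in pairs]

def flexible_experiment(n1=20, n_extra=10):
    a = rng.multivariate_normal([0, 0], COV, size=n1)   # control (null is true)
    b = rng.multivariate_normal([0, 0], COV, size=n1)   # treatment (null is true)
    if min(p_values(a, b)) < .05:                       # interim look
        return True
    a = np.vstack([a, rng.multivariate_normal([0, 0], COV, size=n_extra)])
    b = np.vstack([b, rng.multivariate_normal([0, 0], COV, size=n_extra)])
    return min(p_values(a, b)) < .05                    # final look

sims = 5000
rate = sum(flexible_experiment() for _ in range(sims)) / sims
print(f"false-positive rate with flexibility: {rate:.3f}")  # well above .05
```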
Article
Full-text available
This analysis, based on focus groups and a national survey, assesses scientists' subscription to the Mertonian norms of science and associated counternorms. It also supports extension of these norms to governance (as opposed to administration) as a norm of decision-making, and to quality (as opposed to quantity) as an evaluative norm.
Article
Full-text available
Just over a quarter century ago, Edward Leamer (1983) reflected on the state of empirical work in economics. He urged empirical researchers to “take the con out of econometrics” and memorably observed (p. 37): “Hardly anyone takes data analysis seriously. Or perhaps more accurately, hardly anyone takes anyone else’s data analysis seriously.” Leamer was not alone; Hendry (1980), Sims (1980), and others writing at about the same time were similarly disparaging of empirical practice. Reading these commentaries, we wondered as late-1980s Ph.D. students about the prospects for a satisfying career doing applied work. Perhaps credible empirical work in economics is a pipe dream. Here we address the questions of whether the quality and the credibility of empirical work have increased since Leamer’s pessimistic assessment. Our views are necessarily colored by the areas of applied microeconomics in which we are active, but we look over the fence at other areas as well.
Article
Many philosophers of science and methodologists have argued that the ability to repeat studies and obtain similar results is an essential component of science. A finding is elevated from single observation to scientific evidence when the procedures that were used to obtain it can be reproduced and the finding itself can be replicated. Recent replication attempts show that some high-profile results (most notably in psychology, but in many other disciplines as well) cannot be replicated consistently. These replication attempts have generated a considerable amount of controversy, and the issue of whether direct replications have value has, in particular, proven to be contentious. However, much of this discussion has occurred in published commentaries and social media outlets, resulting in a fragmented discourse. To address the need for an integrative summary, we review various types of replication studies and then discuss the most commonly voiced concerns about direct replication. We provide detailed responses to these concerns and consider different statistical ways to evaluate replications. We conclude there are no theoretical or statistical obstacles to making direct replication a routine aspect of psychological science.
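As one concrete example of the "statistical ways to evaluate replications" mentioned above, a common criterion asks whether the replication effect falls inside a 95% prediction interval built from the original study. The sketch below is my illustration with hypothetical numbers; it assumes effects expressed as Fisher z-transformed correlations.

```python
# Sketch: prediction-interval criterion for replication success.
# An original r = .30 (n = 80) is compared with a replication r = .10 (n = 250).
import math

def fisher_z(r):
    return math.atanh(r)

def prediction_interval(z_orig, n_orig, n_rep, crit=1.96):
    # SE of the difference between two estimates of the same true effect
    se = math.sqrt(1 / (n_orig - 3) + 1 / (n_rep - 3))
    return z_orig - crit * se, z_orig + crit * se

lo, hi = prediction_interval(fisher_z(0.30), 80, 250)
z_rep = fisher_z(0.10)
verdict = "consistent" if lo <= z_rep <= hi else "inconsistent"
print(f"PI = [{lo:.3f}, {hi:.3f}], replication z = {z_rep:.3f} -> {verdict}")
```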
Article
In 2010–2012, a few largely coincidental events led experimental psychologists to realize that their approach to collecting, analyzing, and reporting data made it too easy to publish false-positive findings. This sparked a period of methodological reflection that we review here and call Psychology’s Renaissance. We begin by describing how psychologists’ concerns with publication bias shifted from worrying about file-drawered studies to worrying about p-hacked analyses. We then review the methodological changes that psychologists have proposed and, in some cases, embraced. In describing how the renaissance has unfolded, we attempt to describe different points of view fairly but not neutrally, so as to identify the most promising paths forward. In so doing, we champion disclosure and preregistration, express skepticism about most statistical solutions to publication bias, take positions on the analysis and interpretation of replication failures, and contend that meta-analytical thinking increases the prevalence of false positives. Our general thesis is that the scientific practices of experimental psychologists have improved dramatically.
Article
Psychological scientists draw inferences about populations based on samples—of people, situations, and stimuli—from those populations. Yet, few papers identify their target populations, and even fewer justify how or why the tested samples are representative of broader populations. A cumulative science depends on accurately characterizing the generality of findings, but current publishing standards do not require authors to constrain their inferences, leaving readers to assume the broadest possible generalizations. We propose that the discussion section of all primary research articles specify Constraints on Generality (i.e., a “COG” statement) that identify and justify target populations for the reported findings. Explicitly defining the target populations will help other researchers to sample from the same populations when conducting a direct replication, and it could encourage follow-up studies that test the boundary conditions of the original finding. Universal adoption of COG statements would change publishing incentives to favor a more cumulative science.
Article
Pre-registration of studies before they are conducted has recently become more feasible for researchers, and is encouraged by an increasing number of journals. However, because the practice of pre-registration is relatively new to psychological science, specific guidelines for the content of registrations are still in a formative stage. After giving a brief history of pre-registration in medical and psychological research, we outline two different models that can be applied—reviewed and unreviewed pre-registration—and discuss the advantages of each model to science as a whole and to the individual scientist, as well as some of their drawbacks and limitations. Finally, we present and justify a proposed standard template that can facilitate pre-registration. Researchers can use the template before and during the editorial process to meet article requirements and enhance the robustness of their scholarly efforts.
Article
The university participant pool is a key resource for behavioral research, and data quality is believed to vary over the course of the academic semester. This crowdsourced project examined time of semester variation in 10 known effects, 10 individual differences, and 3 data quality indicators over the course of the academic semester in 20 participant pools (N = 2696) and with an online sample (N = 737). Weak time of semester effects were observed on data quality indicators, participant sex, and a few individual differences—conscientiousness, mood, and stress. However, there was little evidence for time of semester qualifying experimental or correlational effects. The generality of this evidence is unknown because only a subset of the tested effects demonstrated evidence for the original result in the whole sample. Mean characteristics of pool samples change slightly during the semester, but these data suggest that those changes are mostly irrelevant for detecting effects.
Article
Social psychology's current crisis has prompted calls for larger samples and more replications. Building on Sakaluk's (in this issue) distinction between exploration and confirmation, I argue that this shift will increase correctness of findings, but at the expense of exploration and discovery. The likely effects on the field include aversion to risk, increased difficulty in building careers and hence more capricious hiring and promotion policies, loss of interdisciplinary influence, and rising interest in small, weak findings. Winners (who stand to gain from the mooted changes) include researchers with the patience and requisite resources to assemble large samples; incompetent experimenters; destructive iconoclasts; competing subfields of psychology; and lower-ranked journals, insofar as they publish creative work with small samples. The losers are young researchers; writers of literature reviews and textbooks; flamboyant, creative researchers with lesser levels of patience; and researchers at small colleges. My position is that the field has actually done quite well in recent decades, and improvement should be undertaken as further refinement of a successful approach, in contrast to the Cassandrian view that the field's body of knowledge is hopelessly flawed and radical, revolutionary change is needed. I recommend we retain the exploratory research approach alongside the new, large-sample confirmatory work.
Article
Psychological science, as a field, continues to struggle with the challenge of establishing interesting, important, and replicable phenomena. As I often tell my students, “If scientific psychology was easy, everyone would do it.” We can take some comfort in knowing that other sciences, too, face similar challenges (e.g., Begley & Ellis, 2012). But our business is with psychology. In August of this year, Science published a fascinating article by Brian Nosek and 269 coauthors (Open Science Collaboration, 2015). They reported direct replication attempts of 100 experiments published in prestigious psychology journals in 2008, including experiments reported in 39 articles in Psychological Science. Although I expect there is room to critique some of the replications, the article strikes me as a terrific piece of work, and I recommend reading it (and giving it to students). For each experiment, researchers prespecified a benchmark finding. On average, the replications had statistical power of .90+ to detect effects of the sizes obtained in the original studies, but fewer than half of them yielded a statistically significant effect. As Nosek and his coauthors made clear, even ideal replications of ideal studies are expected to fail some of the time (Francis, 2012), and a failure to replicate a previously observed effect can arise from differences between the original and replication studies, and hence does not necessarily indicate flaws in the original study (Maxwell, Lau, & Howard, 2015; Stroebe & Strack, 2014). Still, it seems likely that psychology journals have too often reported spurious effects arising from Type I errors (e.g., Francis, 2014)....
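A quick binomial calculation (my own illustration, not from the editorial) makes that inference concrete: if all 100 original effects were real and each replication truly had .90 power, observing fewer than half significant would be essentially impossible.

```python
from math import comb

def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

# Chance of 49 or fewer significant results out of 100 replications,
# assuming every original effect is true and power is .90 throughout
# (a simplification: the actual replications varied in power and design).
print(binom_cdf(49, 100, 0.90))  # on the order of 1e-24: effectively zero
# So ordinary sampling error alone cannot explain the low replication
# rate; some original effects were likely inflated or spurious.
```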
Article
Within social and personality psychology, the existing “old prototype” of a publishable article is at odds with new expectations for transparent reporting. If researchers anticipate having to report everything while continuing to aim for a research product that includes multiple studies, examines a novel effect, and reports only statistically significant results, this will have negative implications for initial decisions about what research to conduct. First, researchers will be discouraged from collecting additional data because this could potentially mar existing findings. Second, they will be discouraged from pursuing questions for which the answers are unknown, as this would be a waste if the results do not fit old-prototype expectations. These practices undermine what seem to be two universal values within personality and social psychology: truth and interestingness. Suggestions for a “new prototype” that de-emphasizes p-value cutoffs, multiple studies, and novelty are discussed with an eye toward encouraging research decisions that foster true and interesting findings.
Article
Prior research supports the inference that scientific disciplines can be ordered into a hierarchy ranging from the "hard" natural sciences to the "soft" social sciences. This ordering corresponds with such objective criteria as disciplinary consensus, knowledge obsolescence rate, anticipation frequency, theories-to-laws ratio, lecture disfluency, and age at recognition. It is then argued that this hierarchy can be extrapolated to encompass the humanities and arts and interpolated within specific domains to accommodate contrasts in subdomains (e.g., revolutionary versus normal science). This expanded and more finely differentiated hierarchy is then shown to have a partial psychological basis in terms of dispositional traits (e.g., psychopathology) and developmental experiences (e.g., family background). This demonstration then leads to three hypotheses about how a creator's domain-specific impact depends on his or her disposition and development: the domain-progressive, domain-typical, and domain-regressive creator hypotheses. Studies published thus far lend the most support to the domain-regressive creator hypothesis. In particular, major contributors to a domain are more likely to have dispositional traits and developmental experiences most similar to those that prevail in a domain lower in the disciplinary hierarchy. However, some complications to this generalization suggest the need for more research on the proposed hierarchical model.
Article
Three mostly positive developments in academic psychology—the cognitive revolution, the virtual requirement for multiple study reports in our top journals, and the prioritization of mediational evidence in our data—have had the unintended effect of making field research on naturally occurring behavior less suited to publication in the leading outlets of the discipline. Two regrettable consequences have ensued. The first is a reduction in the willingness of researchers, especially those young investigators confronting hiring and promotion issues, to undertake such field work. The second is a reduction in the clarity with which nonacademic audiences (e.g., citizens and legislators) can see the relevance of academic psychology to their lives and self-interest, which has contributed to a concomitant reduction in the availability of federal funds for basic behavioral science. Suggestions are offered for countering this problem.
Article
A study with low statistical power has a reduced chance of detecting a true effect, but it is less well appreciated that low power also reduces the likelihood that a statistically significant result reflects a true effect. Here, we show that the average statistical power of studies in the neurosciences is very low. The consequences of this include overestimates of effect size and low reproducibility of results. There are also ethical dimensions to this problem, as unreliable research is inefficient and wasteful. Improving reproducibility in neuroscience is a key priority and requires attention to well-established but often ignored methodological principles.
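The claim that low power undermines even significant results can be made explicit with the standard positive-predictive-value (PPV) formula. The numbers in this sketch are illustrative assumptions, not estimates from the paper:

```python
# Positive predictive value of a significant finding:
# PPV = power * prior / (power * prior + alpha * (1 - prior))

def ppv(power: float, alpha: float, prior: float) -> float:
    """Probability that a significant result reflects a true effect."""
    true_positives = power * prior
    false_positives = alpha * (1 - prior)
    return true_positives / (true_positives + false_positives)

for power in (0.20, 0.50, 0.80):
    print(f"power={power:.2f} -> PPV={ppv(power, alpha=0.05, prior=0.25):.2f}")
# Assuming a 25% prior that a tested effect is real and alpha = .05,
# PPV falls from ~.84 at 80% power to ~.57 at 20% power: at low power,
# a large share of "significant" results do not reflect true effects.
```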
Article
If science were a game, a dominant rule would probably be to collect results that are statistically significant. Several reviews of the psychological literature have shown that around 96% of papers involving the use of null hypothesis significance testing report significant outcomes for their main results but that the typical studies are insufficiently powerful for such a track record. We explain this paradox by showing that the use of several small underpowered samples often represents a more efficient research strategy (in terms of finding p < .05) than does the use of one larger (more powerful) sample. Publication bias and the most efficient strategy lead to inflated effects and high rates of false positives, especially when researchers also resort to questionable research practices, such as adding participants after intermediate testing. We provide simulations that highlight the severity of such biases in meta-analyses. We consider 13 meta-analyses covering 281 primary studies in various fields of psychology and find indications of biases and/or an excess of significant results in seven. These results highlight the need for sufficiently powerful replications and changes in journal policies.
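The efficiency paradox is easy to verify analytically. Under assumed power values (mine, not the paper's), running several small studies and reporting whichever reaches p < .05 beats one well-powered study at producing significance, while inflating the false-positive rate under the null:

```python
def p_any_significant(power_per_study: float, k: int) -> float:
    """Chance that at least one of k independent studies is significant."""
    return 1 - (1 - power_per_study) ** k

one_big = 0.60  # assumed power of a single large study
small = 0.30    # assumed power of each small study on the same total budget

print(f"one large study:         {one_big:.2f}")
print(f"best of 3 small studies: {p_any_significant(small, 3):.2f}")  # ~0.66
# And if the true effect is zero, the same strategy inflates false positives:
print(f"best of 3 under the null: {p_any_significant(0.05, 3):.2f}")  # ~0.14, not 0.05
```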
Article
Behavioral scientists routinely publish broad claims about human psychology and behavior in the world's top journals based on samples drawn entirely from Western, Educated, Industrialized, Rich, and Democratic (WEIRD) societies. Researchers—often implicitly—assume that either there is little variation across human populations, or that these "standard subjects" are as representative of the species as any other population. Are these assumptions justified? Here, our review of the comparative database from across the behavioral sciences suggests both that there is substantial variability in experimental results across populations and that WEIRD subjects are particularly unusual compared with the rest of the species—frequent outliers. The domains reviewed include visual perception, fairness, cooperation, spatial reasoning, categorization and inferential induction, moral reasoning, reasoning styles, self-concepts and related motivations, and the heritability of IQ. The findings suggest that members of WEIRD societies, including young children, are among the least representative populations one could find for generalizing about humans. Many of these findings involve domains that are associated with fundamental aspects of psychology, motivation, and behavior—hence, there are no obvious a priori grounds for claiming that a particular behavioral phenomenon is universal based on sampling from a single subpopulation. Overall, these empirical patterns suggest that we need to be less cavalier in addressing questions of human nature on the basis of data drawn from this particularly thin, and rather unusual, slice of humanity. We close by proposing ways to structurally re-organize the behavioral sciences to best tackle these challenges.
Article
Since Edward Leamer's memorable 1983 paper, "Let's Take the Con out of Econometrics," empirical microeconomics has experienced a credibility revolution. While Leamer's suggested remedy, sensitivity analysis, has played a role in this, we argue that the primary engine driving improvement has been a focus on the quality of empirical research designs. The advantages of a good research design are perhaps most easily apparent in research using random assignment. We begin with an overview of Leamer's 1983 critique and his proposed remedies. We then turn to the key factors we see contributing to improved empirical work, including the availability of more and better data, along with advances in theoretical econometric understanding, but especially the fact that research design has moved front and center in much of empirical micro. We offer a brief digression into macroeconomics and industrial organization, where progress, by our lights, is less dramatic, although there is work in both fields that we find encouraging. Finally, we discuss the view that the design pendulum has swung too far. Critics of design-driven studies argue that in pursuit of clean and credible research designs, researchers seek good answers instead of good questions. We briefly respond to this concern, which worries us little.
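As a toy illustration of why research design carries such weight (my own construction, with made-up parameters), the simulation below shows a confounder biasing the naive observational contrast while random assignment recovers the true effect:

```python
import random
from statistics import mean

random.seed(1)
N, TRUE_EFFECT = 100_000, 1.0

def outcome(treated: bool, motivation: float) -> float:
    # Outcome depends on treatment and on the confounder (motivation).
    return TRUE_EFFECT * treated + 2 * motivation + random.gauss(0, 1)

# Observational world: motivated people self-select into treatment.
obs_t, obs_c = [], []
for _ in range(N):
    m = random.gauss(0, 1)
    (obs_t if m > 0 else obs_c).append(outcome(m > 0, m))

# Experimental world: a coin flip decides treatment.
exp_t, exp_c = [], []
for _ in range(N):
    m = random.gauss(0, 1)
    t = random.random() < 0.5
    (exp_t if t else exp_c).append(outcome(t, m))

print(f"observational estimate: {mean(obs_t) - mean(obs_c):.2f}")  # ~4.2 (biased)
print(f"randomized estimate:    {mean(exp_t) - mean(exp_c):.2f}")  # ~1.0 (truth)
```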