Book

Corrupt Research: The Case for Reconceptualizing Empirical Management and Social Science

Author: Raymond Hubbard
... (3) As negative effect studies are generally more difficult to publish than positive effect studies (Hubbard 2015, Chambers 2017, Harris 2017), over time there are many more positive studies in the scientific literature than negative studies, so a false positive effect claim can mistakenly become established fact (Nissen et al. 2016). ...
... There is now greater acknowledgement in the published literature of likely causes of false published research (Pocock et al. 2004, Ioannidis 2008, Sarewitz 2012, Hubbard 2015, Chambers 2017, Harris 2017): ...
... Editors today generally favour novel studies that show positive effects. It is known that researchers often put negative results in a "file drawer" and move on to other things (Hubbard 2015, Chambers 2017, Harris 2017). Conventional wisdom is that if one is doing meta-analysis (MA), one should worry about unreported, negative results (Stroup et al. 2000, Ehm 2016). ...
Preprint
It is generally acknowledged that claims from observational studies often fail to replicate. An exploratory study was undertaken to assess the reliability of base studies used in meta-analysis of short-term air quality-myocardial infarction risk and to judge the reliability of statistical evidence from meta-analysis that uses data from observational studies. A highly cited meta-analysis paper examining whether short-term air quality exposure triggers myocardial infarction was evaluated as a case study. The paper considered six air quality components: carbon monoxide, nitrogen dioxide, sulfur dioxide, particulate matter 10 and 2.5 micrometers in diameter (PM10 and PM2.5), and ozone. The number of possible questions and statistical models at issue in each of the 34 base papers used was estimated, and p-value plots for each of the air components were constructed to evaluate the heterogeneity of the p-values drawn from the base papers. Analysis search spaces (numbers of statistical tests possible) in the base papers were large (median 12,288; interquartile range 2,496 to 58,368) in comparison to the actual statistical test results presented. Statistical test results taken from the base papers may therefore not provide unbiased measures of effect for meta-analysis. Shapes of the p-value plots for the six air components were consistent with the possibility of analysis manipulation to obtain small p-values in several base papers. The results suggest heterogeneous, researcher-generated p-values in the meta-analysis rather than unbiased evidence of real effects for air quality. We conclude that this meta-analysis does not provide reliable evidence for an association of air quality components with myocardial infarction risk.
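The p-value plot technique this abstract describes is straightforward to reproduce. Below is a minimal sketch in Python, with invented placeholder p-values rather than figures from the 34 base papers: ranked p-values that hug a roughly 45-degree line are consistent with randomness (no effect), while a bilinear "hockey-stick" shape is the heterogeneity signature the authors flag.

```python
# A minimal sketch of a p-value plot (after Schweder & Spjotvoll, 1982).
# The p-values below are invented placeholders, not data from the base papers.
import numpy as np
import matplotlib.pyplot as plt

p_values = np.array([0.001, 0.004, 0.01, 0.02, 0.03, 0.21,
                     0.35, 0.48, 0.62, 0.77, 0.88, 0.95])  # hypothetical

ranked = np.sort(p_values)            # order the p-values
ranks = np.arange(1, len(ranked) + 1)  # rank 1..k

plt.plot(ranks, ranked, "o")
plt.xlabel("Rank")
plt.ylabel("p-value")
plt.title("p-value plot: near-linear = randomness;\n"
          "hockey-stick bend = mixture of effects or possible p-hacking")
plt.show()
```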
... We sought to explain the relationship between access to health information, perception of health information, and cognitive information processing by testing three experimental hypotheses focusing on NUM. Taking a cognitive-psychological approach to individual-level comprehension allowed us to explain the nature and qualities of MHL (Hubbard, 2016). First, we mapped a main-idea task of pregnancy text and integrated this into an explanatory model representing a system of behaviours, with fuzzy-trace theory as the lens to describe a mother's perception. ...
... Having adequate MHL is particularly important for newcomer mothers because they disproportionately care for family and deal with health barriers such as language, culture, age, and gender associated with migration (Spitzer et al., 2003). The research included a sample of Canadian-born mothers as the comparator, to assist with drawing empirical generalizations across different sub-populations (Hubbard, 2016). As such, the rationale for the study was to explain how sociocultural, linguistic, and demographic factors influence behavior using novel methods to analyze individual-level perception; particularly, in a sample whose primary language is not English and who may prefer prose-based health information over numbers (Makowsky et al., 2022). ...
... Lastly, mothers were incentivized to participate and thus may be biased in their responses. However, a strength of OOM is that it examines the mother's MHL as a system, rather than an aggregate of variables representative of the researchers' abstractions to be calculated and correlated (Hubbard, 2016). ...
Thesis
Research suggests newcomer mothers score lower than Canadian-born mothers on health literacy (HL) and health numeracy (HN) assessments and have difficulty accessing maternal health services. This dissertation explored the nexus between language, language competencies, and the comprehension of health information by English-speaking, South Asian newcomer mothers (SANMs) and English-speaking, Canadian-born mothers, with a focus on learning. First, we conducted a scoping review using a systematic search strategy to identify conceptualizations of maternal health literacy (MHL) and HN according to the empirical research. Second, we employed narrative inquiry and used thematic analysis in conjunction with propositional analysis to explicate the verbalizations of mothers who shared stories about their comprehension of ultrasound examination preparation, health-risk information, and shared decision making.
... brand and when brand haters call for boycotting the brand; the present study does not examine who loves the brand or when brand lovers remain brand loyal. The study of "symmetric variable directional relationships" (SVDRs) is most appropriate in the development and testing of the psychometric properties of distinct measurement scales, but relying on SVDR measurement for developing and testing theories of mind is inappropriate, as Armstrong (1972, 2012), Gigerenzer and Brighton (2009), Hubbard (2015), McClelland (1998), Ziliak and McCloskey (2008), and Woodside (2019) describe. ...
... This becomes particularly clear in the fact that regression analysis focuses on the unique contribution of a variable while holding constant the values of all other variables in the equation" (Fiss, 2007, p. 1181). Relatedly, relying on MRA and additional symmetric measures (e.g., means, standard deviations, regression coefficients, correlations, and standard errors), rather than examining XY plots and the predictive validities of models via testing on fresh samples of respondents, serves to misinform and mislead researchers (Anscombe, 1973; Gigerenzer & Brighton, 2009; Hubbard, 2015; McCloskey, 2002; Meehl, 1978; Soyer & Hogarth, 2012; Ziliak & McCloskey, 2008). "Anscombe's quartet" of different observable data displays for identical symmetric test findings is highly instructive in reaching this conclusion. ...
... In its applications of complexity theory tenets, construction of asymmetric antecedent configurations for indicating single and complex, interval-range outcomes, and use of Boolean algebra operations rather than standard errors and regression analysis, the study here represents a radical departure in theory construction and data analytics from the prior literature on BH and most studies in the consumer research literature; see Anscombe (1973), Armstrong (1970, 2012), Hubbard (2015), McCloskey (2002), Meehl (1978), Trafimow (2014), Trafimow and Marks (2015), Ziliak and McCloskey (2008), and Wasserstein and Lazar (2016). Note: BSIR "brand bad" AND BMI "brand not like me" cases (testing P10/11). Model: bsir_bmi = f(lux, fra, usa, uk, age, edu, inc, marital, gend). INTERMEDIATE SOLUTION: frequency cutoff: 3.00; consistency cutoff: 0.84. Solution coverage: 0.16; solution consistency: 0.73. ...
Article
Full-text available
Applying complexity theory tenets, this cross‐cultural study proposes and empirically examines generalizable, asymmetric, case outcome models of consumers who respond extremely negatively to luxury and mass fashion brands that they judge to be acting socially irresponsibly—and separately, consumers who do not engage in such responses—among separate national samples of consumers in France, the UK, and USA. This study includes constructing a theory of discrete simple and complex antecedent algorithms (i.e., screens) of who perceives specific brands acting socially irresponsibly, who exudes brand hate (BH), and which brand haters call versus do not call for brand boycotts. The study's findings support the following conclusions. Most consumers expressing BH call for boycotting the brand (supported cross‐culturally in the study here). Among consumers who also view the brand to be very unlike themselves, most consumers recognizing a brand to be acting socially irresponsibly hate the brand. Separate tests of propositions for the models' predictive validities across three national samples of consumers support the models' generalizability. The study adds to workable approaches in psychology and marketing for complementing (or replacing) theories framed in terms of symmetric, variable, directional relationships and examined using null hypotheses significance tests.
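The solution consistency and coverage figures quoted in the excerpt above (e.g., consistency cutoff 0.84, solution coverage 0.16) come from standard fuzzy-set QCA arithmetic. A minimal sketch of the two formulas, using invented membership scores rather than the study's data:

```python
import numpy as np

# Hypothetical fuzzy-set membership scores for an antecedent recipe (x)
# and an outcome such as brand hate (y); the values are invented.
x = np.array([0.9, 0.7, 0.6, 0.8, 0.2, 0.4])
y = np.array([0.8, 0.9, 0.5, 0.7, 0.3, 0.6])

# Ragin's set-theoretic formulas for a sufficiency claim (X -> Y):
consistency = np.minimum(x, y).sum() / x.sum()  # how reliably X implies Y
coverage = np.minimum(x, y).sum() / y.sum()     # how much of Y that X accounts for
print(f"consistency = {consistency:.2f}, coverage = {coverage:.2f}")
```

By convention, consistency of roughly 0.80 or higher is read as evidence that the antecedent recipe is sufficient for the outcome, which is why the cutoff quoted above sits just past that threshold.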
... NPS is a hotel administration performance metric in wide use by industry executives because NPS responds well to management's quest for a point estimate standard of measurement rather than relying on null hypothesis statistical tests (NHST) and the reporting of probabilities that observed difference scores differ from zero. Given senior managers' desire for point estimates (i.e., a number) for performance and the abundant criticisms of the use of NHST (Meehl, 1978; Wasserstein & Lazar, 2016; Woodside, 2017) and of the reporting of only statistical significance (e.g., p < 0.05) of directional symmetrical relationships - even though these practices continue to be pervasive in research by most academic scholars (for reviews of criticisms, see Hubbard, 2016; Woodside, 2017; Ziliak & McCloskey, 2008) - the present study focuses on examining point estimates of high and low NPSs for different hotels operating within the same corporate-owned hotel properties. Because of management's preference for outcome point estimates and the fatal flaws in NHST, this study adopts Trafimow and Marks's (2015) recommended ban on reporting NHST and p < 0.05 findings - a ban in place now for the journal Basic and Applied Social Psychology. ...
... The findings of the present study are limited to nine hotels in one industry in one country. As Hubbard (2016) stresses, generalizability of findings is possible only by replication studies in multiple contexts by multiple researchers. Hopefully, other researchers will perform additional studies on the consistency of NPSs across hotels and social media platforms. ...
... Research on NPS contributes to Hubbard's (2016) and others' (Trafimow & Marks, 2015; Wasserstein & Lazar, 2016; Woodside, 2017) call for researchers to move away from null hypothesis significance testing (NHST) and toward providing asymmetric, precise, point estimates. The present study illustrates a response to their call. ...
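The NPS point estimate these excerpts contrast with NHST is a one-line computation. A sketch assuming the standard 0-10 likelihood-to-recommend item; the ratings below are invented, not data from the study's hotels:

```python
def net_promoter_score(ratings):
    """Standard NPS: percent promoters (9-10) minus percent detractors (0-6)."""
    promoters = sum(r >= 9 for r in ratings)
    detractors = sum(r <= 6 for r in ratings)
    return 100 * (promoters - detractors) / len(ratings)

# Invented ratings for one hotel; a point estimate, no significance test.
print(net_promoter_score([10, 9, 9, 8, 7, 6, 10, 3, 9, 8]))  # -> 30.0
```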
... Another important lesson for doctoral students is about best practices for comparing and integrating empirical findings across related empirical studies (Platt, 1964). In our opinion, investigations focusing narrowly on whether replications confirm or reject earlier studies miss the most important contributions that replications can make (Amrhein et al. 2019;Hubbard, 2016). At the best of times, undertaking exact replications is extremely difficult, and often impossible. ...
... Thus, differences between precedents and replicates should be expected, and interpreted carefully (Hubbard, 2016; Schwab et al., 2011; Stroebe & Strack, 2014). ...
... To address this concern, replicators should focus more on effect sizes and the accuracy of effect size estimates, as well as consider applying Bayesian and/or meta-analytic techniques (Bonett, 2012, 2021; Bonett & Price, 2015; Eden, 2002; Gronau et al., 2021; LeBel et al., 2019). These approaches tend to offer more effective ways to compare and integrate findings from precedent and replication studies (Cumming & Finch, 2005; Gelman & Stern, 2006; Hubbard, 2016). ...
Article
Full-text available
In addition to helping advance theory, replication studies offer rich and complementary learning experiences for doctoral students, enabling them to learn general research skills, through the process of striving to imitate good studies. In addition, students gain replication-specific methodological skills and learn about the important roles replications play for making management knowledge trustworthy. We outline best practices for enabling doctoral students and their supervisors to select studies to replicate, execute their replications, and increase the probability of successfully publishing their findings. We also discuss the crucial role of faculty mentors in supporting and guiding replication-based learning of doctoral students. Ultimately, educating doctoral students on how to execute high-quality replication studies helps to answer wider calls for more replication studies in the field of management, an important stepping stone along the journey toward open and responsible research.
... Moreover, this movement is so academically relevant that "second-generation tools" are necessary to participate in the highest-level scientific discussion (Mazzon & Hernandez, 2013, p. 77). Worryingly, this occurs despite many academics not even understanding the meaning of statistical results (Hubbard, 2016; Hubbard & Armstrong, 2006). ...
... At the same time, the statistical approach tends to distance scientific production in marketing from organizational day-to-day practice, since the vast majority of practitioners are unable to understand the language used by academia (Jiang, Yang, & Carlson, 2012); in addition, the use of ever more specific tools and more precise statistical controls reduces the generalizability of academic results (Morgan et al., 2019). Thus, high-level scientific production, based on advanced statistical tools, is rarely translated into language comprehensible to those who practice marketing in the field, a source of concern for many academics (Harley, 2018; Hubbard, 2016; Reibstein, Day, & Wind, 2009; Roberts, Kayande, & Stremersch, 2014), to the point that some authors (e.g., November, 2004) suggest that what marketing academia produces should be solemnly ignored by practitioners. ...
Article
Full-text available
What is marketing for? Practitioners and academics must recognize its possibilities and complexities. The transformation of this academic endeavor into something operational and method-related resulted in a great reduction of its practical relevance. Most people see it as short-term, measurable, and empirically testable. Practitioners are unable to use the developments in their practices, and marketing loses its relevance because it disregards what made it important. The paper deals with the origins and consequences of empiricism in marketing and how this approach affects its ability to impact organizations. It aims at supporting the idea that the positivistic-empiricist treatment must be subjected to a broader context, connected to strategic marketing. We conclude that, to regain its practical relevance, marketing must deal not only with short-term marketing management ideas, but must start from its long-term implications, expressed in the strategic marketing approach. The work suggests that Brazilian professors have what it takes to lead that global movement.
... Nevertheless, for approximately 80 years, methodologists have warned of the use of null-hypothesis statistical tests without any apparent success. Researchers have ignored these warnings, continuing to use them even though their interpretations are often erroneous and indicate that the researchers do not understand what the tests indicate (Cortina & Landis, 2011;Hubbard, 2016;McShane & Gal, 2015;Orlitzky, 2012;Starbuck, 2016). The p-values obtained in these tests do not reflect any information on the rigor and reliability of the findings (Aguinis et al., 2020;Bettis, 2012;Bettis, Ethiraj, et al., 2016;Hubbard, 2016;Schwab et al., 2011;Vogel & Homberg, 2021). ...
Article
Full-text available
Management scholarship should be placed in a unique position to develop relevant scientific knowledge because business and management organizations are deeply involved in most global challenges. However, different critical voices have recently been raised in essays and editorials, and reports have questioned research in the management field, identifying multiple deficiencies that can limit the growth of a relatively young field. Based on an analysis of published criticisms of management research, we would like to shed light on the current state of management research and identify some limitations that should be considered and should guide the growth of this field of knowledge. This work offers guidance on the main problems of the discipline that should be addressed to encourage the transformation of management research to meet both scientific rigor and social relevance. The article ends with a discussion and a call to action for directing research toward the possibility and necessity of reinforcing “responsible research” in the management field. JEL CLASSIFICATION: M00, M10
... Fisher (1925), who proposed p = 0.05 as a conventional rejection criterion for H0, had already offered no more than a convenience justification for its specific value (Hubbard, 2015; Kennedy-Shaffer, 2019): ...
Article
Full-text available
An often-cited convention for discovery-oriented behavioral science research states that the general relative seriousness of the antecedently accepted false positive error rate of α = 0.05 be mirrored by a false negative error rate of β = 0.20. In 1965, Jacob Cohen proposed this convention to decrease a β-error rate typically in vast excess of 0.20. Thereby, we argue, Cohen (unintentionally) contributed to the wide acceptance of strongly uneven error rates in behavioral science. Although Cohen’s convention can appear epistemically reasonable for an individual researcher, the comparatively low probability that published effect size estimates are replicable renders his convention unreasonable for an entire scientific field. Appreciating Cohen’s convention helps to understand why even error rates (α = β) are “non-conventional” in behavioral science today, and why Cohen’s explanatory reason for β = 0.20—that resource restrictions keep from collecting larger samples—can easily be mistaken for the justificatory reason it is not.
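Cohen's convention maps directly onto routine sample-size planning. A sketch using statsmodels, where the "medium" effect size d = 0.5 is an illustrative assumption; it also shows how much larger a sample the even error rates (alpha = beta = 0.05) discussed in the abstract would demand:

```python
from statsmodels.stats.power import TTestIndPower

# Cohen's convention: alpha = 0.05 and power = 1 - beta = 0.80.
# Required n per group for a two-sample t-test at a medium effect (d = 0.5).
n = TTestIndPower().solve_power(effect_size=0.5, alpha=0.05, power=0.80)
print(round(n))  # ~64 per group

# Even error rates (alpha = beta = 0.05, i.e., power = 0.95) demand more:
n_even = TTestIndPower().solve_power(effect_size=0.5, alpha=0.05, power=0.95)
print(round(n_even))  # ~105 per group
```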
... There has been much concern, with assertions ranging from the distant past (e.g., James, 1892/1920) to recently (Iso-Ahola, 2022), that psychology suffers relative to physical sciences, such as physics, in that the latter have stable empirical laws, whereas psychology lacks them. A related concern is that much empirical work in psychology does not replicate well (see Hubbard, 2016, for a review) and without replicability, it is difficult to establish empirical laws. In the case of probabilistic empirical laws, their probabilistic nature renders it impossible that findings will always replicate, but there must nevertheless be some reasonable probability of replication (Fiedler & Trafimow, 2024). ...
Article
Full-text available
There is a trepidation, anxiety, or intuition, which has persisted for more than a century, that psychology theories are less anchored in fundamental laws than physics theories. Rather than attempt to refute the concern, the present work accepts it and tries out candidate explanations. These pertain to empirical laws, parsimony, scope, reductionism, falsifiability, mathematical operations (multiplication vs. addition), internal coherence, ceteris paribus stipulations, and purposeful omission of relevant factors (idealization). The conceptions underlying these explanations are not strictly independent, but they point to different distinctive features that might account for the unequal status of physics and psychological science and to different means of improving contemporary psychology. Although the available evidence for or against these candidate explanations is scarce and relies mainly on a few telling examples, we conclude that the last of our candidate explanations—reliance on idealized universes—works best and leads to the most insights about what psychology might learn from physics and what research strategies might foster the ideal of theory-driven psychological science in the future.
... However, there are many NHST detractors (for reviews, see Hubbard, 2015;McCloskey & Ziliak, 2010;Trafimow, 2019a). For those who believe NHST unsound, there is little reason to perform the procedure (Trafimow, Amrhein, et al., 2018). ...
Article
Full-text available
The standard statistical procedure for researchers comprises a two-step process. Before data collection, researchers perform power analyses, and after data collection, they perform significance tests. Many have proffered arguments that significance tests are unsound, but that issue will not be rehashed here. It is sufficient that even for aficionados, there is the usual disclaimer that null hypothesis significance tests provide extremely limited information, thereby rendering them vulnerable to misuse. There is a much better postdata option that provides a higher grade of useful information. Based on work by Trafimow and his colleagues (for a review, see Trafimow, 2023a), it is possible to estimate probabilities of being better off or worse off, by varying degrees, depending on whether one gets the treatment or not. In turn, if the postdata goal switches from significance testing to a concern with probabilistic advantages or disadvantages, an implication is that the predata goal ought to switch accordingly. The a priori procedure, with its focus on parameter estimation, should replace conventional power analysis as a predata procedure. Therefore, the new two-step procedure should be the a priori procedure predata and estimations of probabilities of being better off, or worse off, to varying degrees, postdata.
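In its simplest case (estimating a single mean under normality), the a priori procedure mentioned above reduces to a closed-form sample-size equation. A sketch, with precision and confidence values chosen purely for illustration:

```python
from math import ceil
from scipy.stats import norm

def app_sample_size(f, c):
    """A priori procedure, simplest case (single mean, normality): n needed
    so the sample mean falls within f population standard deviations of the
    population mean with probability c (after Trafimow and colleagues)."""
    z = norm.ppf((1 + c) / 2)
    return ceil((z / f) ** 2)

# e.g., within 0.2 sigma of the population mean with 95% confidence:
print(app_sample_size(0.2, 0.95))  # -> 97
```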
... Dozens of journals are experimenting with ways to discourage covert analyses. Hundreds of scientists and statisticians have urged journals to stop relying on significance tests, and the American Statistical Association has been cautioning researchers not to use the phrase "statistically significant" and not to interpret p-levels as indicating the strength of evidence (Gigerenzer, 2004;Hubbard, 2015;Kline, 2004;Wasserstein, Schirm & Lazar, 2019;Ziliak & McCloskey, 2008). ...
... Concerning this, a close pre-registered well-powered replication and extension could better assess and credit the findings of the original article (Hubbard, 2016), and aid the test of generalizability of the results (Nosek & Errington, 2020). ...
Preprint
Full-text available
We conducted a pre-registered replication and extensions of Studies 2, 3 and 4 from Hsee and Rottenstreich (2004). They showed that evaluations rely on calculations for neutral affect-poor situations, in comparison to affect-rich domains that rely on feelings. Replicating their Study 2, we found support for the hypothesis that receiving affect-rich music books was less affected by monetary value compared to the affect-poor situation of receiving mere cash (original: η2 = 0.02, 90% CI [0.003, 0.05]; replication: η2 = 0.01, 90% CI [0.003, 0.03]). Failing to replicate their Study 3, we found no support for differences between affect-rich stimuli (panda images) and affect-poor stimuli (dots) (original: η2 = 0.03, 90% CI [0.002, 0.10]; replication: η2 = 0.001, 90% CI [0.00, 0.01]). Lastly, we found no support for their Study 4 testing differences between an affect-rich empathy manipulation of a mugger scenario and the neutral affect-poor scenario description (original: η2 = 0.02, 90% CI [0.01, 0.07]; replication: η2 = 0.001, 90% CI [0.00, 0.01]). Thus, we conclude mixed support for the original article. Extending the replication, we found support for the role of emotion in subjective valuation in Study 2 (η2 = 0.02, 90% CI [0.01, 0.03]), but less so in Study 3 (η2 = 0.001, 90% CI [0.00, 0.01]) and in Study 4 (η2 = 0.00, 90% CI [0.00, 0.004]). Materials, data, and code are available on: https://osf.io/qbp4g/
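For readers relating the η² values above to reported F statistics, the point estimate is recoverable from a standard identity. The numbers below are invented, chosen only to land in the same order of magnitude as the replication's effects:

```python
def eta_squared_from_f(f_stat, df_effect, df_error):
    """Partial eta-squared recovered from an F-test result."""
    return (f_stat * df_effect) / (f_stat * df_effect + df_error)

# Invented example: F(1, 198) = 4.0 -> eta^2 of about 0.02, the order of
# magnitude of the effects reported in the replication above.
print(round(eta_squared_from_f(4.0, 1, 198), 3))  # -> 0.02
```

The 90% (rather than 95%) intervals reported above are conventional for η², which is bounded at zero; in practice they are usually obtained by inverting the noncentral F distribution rather than from this identity.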
... Likewise, OOM does not require assumptions of random sampling, normality, equal population variance, homoscedasticity, or, importantly, that data have quantitative structure (Grice, 2011). In this way, OOM overcomes problems of measurement (e.g., Trendler, 2018) and the limitations of null hypothesis significance testing (e.g., Hubbard, 2015;Lambdin, 2012;Wasserstein et al., 2019). Also, OOM has been positively reviewed and effectively used in a variety of contexts (e.g., Arocha, 2020;Craig & Abramson, 2018;Sauer, 2018;Valentine et al., 2019). ...
Article
Full-text available
Background/Purpose: The purpose of this study was to investigate physical educators' self-reported use, understanding, and confidence with game-based approaches (GBAs) in their K–12 physical education programs. Method: A survey of New York State physical education professionals was conducted that yielded quantitative data on how they used game-based approaches. Data were analyzed using Observation Oriented Modeling (version 5.4.2022), a tool well suited for survey data, especially teachers' reports of game-based lesson sequences. Results: Physical educators reported both awareness of and confidence with various GBAs, spending most physical education game-based lessons teaching invasion games with emphasis on the psychomotor domain. Respondents' typical game lesson sequences did not match model GBA lesson sequences. Conclusions: Findings from this study indicate respondents do not use GBAs for their game-based lessons. Beyond teacher preparation, intentional professional development should address conceptual, pedagogical, cultural, and political obstacles, and may help practitioners become more pedagogically fluent with GBAs.
... Although a few social scientists have suggested otherwise (Amrhein, Trafimow, & Greenland, 2019;Gilbert et al., 2016), many others have proclaimed the lack of quantitative research replicability in the social sciences is a crisis (e.g., Hubbard, 2016;Hunter, 2001;Ioannidis, 2005;Maxwell et al., 2015;Nelson et al., 2022;Pashler & Harris, 2012;Schmidt, 2009;Stroebe & Strack, 2014). Researchers who have tried to assess the percentage of replicable findings by conceptually or directly replicating randomly selected published studies generally side with the latter group (Camerer et al., 2018;Klein et al., 2018;Moosa, 2018;Olsson-Collentine, Wicherts, & van Assen, 2020;Open Science Collaboration, 2015). ...
Preprint
Full-text available
Most proposed remedies to the social and medical sciences' 'replication crisis', like more open and transparent research, more published conceptual and exact/direct/close replication studies, or more demanding manuscript requirements and peer reviews, are problematic because they are beatable and excessively burden researchers. Thus, they are poor candidates for boosting low replication rates. Instead, the net benefits suggest that social and medical scientists should develop artificial intelligence and machine learning algorithms for screening publications, avoid lax scale creation and use practices, and create a professional culture of trust.
... For further discussion see Hubbard, 2016; Tourish, 2019. ...
Article
Full-text available
As applied fields, management, industrial-organizational psychology, and related disciplines seek to make their knowledge relevant to business practitioners. But the current dissemination model is inefficient, leading some to conclude that the gap between academics and practitioners poses one of the most pressing problems in management today. Using insights derived from the life and work of Timothy Baldwin, we offer four takeaways designed to foster collaboration between theoreticians and practitioners: (1) Shrink the mission, (2) don’t internalize the enemy, (3) find your champions, and (4) use the science of persuasion. These strategies are crucial to closing the gap and thus uniting the efforts of researchers and practitioners so as to ensure the practical relevance of research and to strengthen the organizational value proposition.
... Further evidence of a bias against negative results comes from studies that find that the vast majority of results in the scientific literature are positive (Dickersin et al., 1987;Sterling, 1959), particularly in psychology (Fanelli, 2010), despite the common use of underpowered designs (Bakker et al., 2012). It appears that academics perceive studies with positive results as more valuable than studies with negative results, 1 possibly because the dominance of significance testing in many fields (e.g., Hubbard, 2015) leads researchers to equate positive results with significance. ...
Article
Full-text available
Preregistration has gained traction as one of the most promising solutions to improve the replicability of scientific effects. In this project, we compared 193 psychology studies that earned a Preregistration Challenge prize or preregistration badge to 193 related studies that were not preregistered. In contrast to our theoretical expectations and prior research, we did not find that preregistered studies had a lower proportion of positive results (Hypothesis 1), smaller effect sizes (Hypothesis 2), or fewer statistical errors (Hypothesis 3) than non-preregistered studies. Supporting our Hypotheses 4 and 5, we found that preregistered studies more often contained power analyses and typically had larger sample sizes than non-preregistered studies. Finally, concerns about the publishability and impact of preregistered studies seem unwarranted, as preregistered studies did not take longer to publish and scored better on several impact measures. Overall, our data indicate that preregistration has beneficial effects in the realm of statistical power and impact, but we did not find robust evidence that preregistration prevents p-hacking and HARKing (Hypothesizing After the Results are Known).
... For random assignment of participants to conditions, even if the participants have not been randomly selected, it is nevertheless possible to randomly assign these nonrandomly selected participants to conditions. 11 See Hubbard (2016). ... estimating anything, so they cannot be close to true! Moreover, they do not provide information about usefulness, or about the closeness of the model to the truth, or anything else that might matter (e.g., Kirk, 1996). ...
... In addition, the fact that the advice was so well received by business and economics researchers, cited over 1,000 times, and given an accolade by the Journal of Global Scholars of Marketing Science points to one of many problems with significance-testing thinking (Trafimow et al., 2021). There have been many criticisms of significance testing, including several reviews (Hubbard, 2016; Ziliak and McCloskey, 2016; Trafimow, 2019) and 43 articles in the 2019 special issue of The American Statistician, the highly respected journal of the American Statistical Association. Because the disadvantages of significance testing have been covered so extensively, there is no point in rehashing them here. ...
Article
Full-text available
Purpose The purpose of this article is to show the gains that can be made if researchers were to use gain-probability (G-P) diagrams. Design/methodology/approach The authors present relevant mathematical equations, invented examples and real data examples. Findings G-P diagrams provide a more nuanced understanding of the data than typical summary statistics, effect sizes or significance tests. Practical implications Gain-probability diagrams provide a much better basis for making decisions than typical summary statistics, effect sizes or significance tests. Originality/value G-P diagrams provide a completely new way to traverse the distance from data to decision-making implications.
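The article's G-P diagrams are not reproduced in this listing, but the underlying idea, plotting the probability of gains of varying sizes rather than a single summary statistic, can be sketched empirically. This is an illustrative reconstruction under assumed normal data, not the authors' code:

```python
import numpy as np

rng = np.random.default_rng(0)
treatment = rng.normal(0.4, 1.0, 5000)  # invented treatment-group scores
control = rng.normal(0.0, 1.0, 5000)    # invented control-group scores

# Empirical gain-probability curve: P(treated - control >= delta) for a
# range of gain sizes delta, estimated from random pairings of scores.
diffs = rng.choice(treatment, 20000) - rng.choice(control, 20000)
for delta in (0.0, 0.25, 0.5, 1.0):
    print(f"P(gain >= {delta}) ~ {np.mean(diffs >= delta):.2f}")
```

Plotting these probabilities against delta yields the kind of diagram the abstract describes: a decision-maker reads off the chance of being better off by at least a given amount, instead of a single p-value or effect size.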
... Unfortunately, that a change is well-justified does not mean it will happen. Stunt et al. (2021) conducted focus groups with substantive researchers, journal editors, and grant funders to discuss social scientists' reluctance to deviate from null hypothesis signifi cance testing despite its many documented problems (see Hubbard, 2015;McCloskey & Ziliak, 2010;Trafimow, 2019a for reviews). Although many of the substantive researchers thought such a change beneficial for science, they nevertheless indicated being unwilling to change unless journal editors or grant funders changed first. ...
Article
Full-text available
Social science researchers depend on differences in means between experimental and control conditions to draw substantive conclusions. However, an alternative is to use differences in locations. For normal distributions, means and locations are the same, but for skew normal distributions, means and locations are different. If the differences in means and locations are similar, and in the same direction, the resulting substantive story may be similar. However, if the differences in means and locations are dissimilar, especially if they oppose directionally, the resulting substantive story may differ dramatically. We collected 51 data sets from online data repositories to check how often the differences in means versus locations are substantially different or are in different directions. Although the values depend on what one counts, the overall conclusion is that the two types of differences have a larger than trivial chance of disagreeing substantially. We suggest that when researchers report normal statistics (mean and standard deviation), they should report skew normal statistics (location, scale, and shape) too, against the nontrivial chance that the skew normal statistics imply a substantive story in opposition to that implied by the normal statistics.
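The mean-versus-location distinction in this abstract is easy to verify numerically with scipy's skew normal distribution; the shape, location, and scale values below are invented:

```python
from scipy.stats import skewnorm

a, loc, scale = 4.0, 0.0, 1.0  # shape, location, scale (invented values)

# With shape a = 0 the distribution is normal and mean equals location;
# with nonzero skew the two diverge.
print(skewnorm.mean(0.0, loc=loc, scale=scale))  # 0.0: mean == location
print(skewnorm.mean(a, loc=loc, scale=scale))    # ~0.77: mean != location
```

As skew grows, a mean-based comparison and a location-based comparison of two groups can tell different substantive stories, which is the abstract's point.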
... P-hacking is a form of multiple hypothesis testing involving the search for significance during statistical analysis of data (Simmons et al., 2011; Hubbard, 2015; Harris, 2017; Streiner, 2018; Barnett & Wren, 2019; Moss & De Bin, 2021). P-hacking allows researchers to find statistically significant results with their data set even when studying a non-existent effect or association (Simonsohn et al., 2014). ...
Article
Full-text available
Odds ratios or p-values from individual observational studies can be combined to examine a common cause−effect research question in meta-analysis. However, reliability of individual studies used in meta-analysis should not be taken for granted as claimed cause−effect associations may not reproduce. An evaluation was undertaken on meta-analysis of base papers examining gas stove cooking (including nitrogen dioxide, NO2) and childhood asthma and wheeze associations. Numbers of hypotheses tested in 14 of 27 base papers (52%) used in meta-analysis of asthma and wheeze were counted. Test statistics used in the meta-analysis (40 odds ratios with 95% confidence limits) were converted to p-values and presented in p-value plots. The median (interquartile range) of possible numbers of hypotheses tested in the 14 base papers was 15,360 (6,336−49,152). None of the 14 base papers made mention of correcting for multiple testing, nor was any explanation offered if no multiple testing procedure was used. Given large numbers of hypotheses available, statistics drawn from base papers and used for meta-analysis are likely biased. Even so, p-value plots for gas stove−current asthma and gas stove−current wheeze associations show randomness consistent with unproven gas stove harms. The meta-analysis fails to provide reliable evidence for public health policy making on gas stove harms to children in North America. NO2 is not established as a biologically plausible explanation of a causal link with childhood asthma. Biases – multiple testing and p-hacking – cannot be ruled out as explanation for a gas stove−current asthma association claim. Selective reporting is another bias in published literature of gas stove–childhood respiratory health studies.
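The conversion step the abstract describes, recovering a p-value from an odds ratio and its 95% confidence limits, follows from the standard error of log(OR), assuming approximate normality on the log scale (the approach popularized by Altman and Bland). A sketch with an invented base-paper result:

```python
from math import erf, log, sqrt

def p_from_or(or_est, lo, hi):
    """Two-sided p-value recovered from an odds ratio and its 95% CI,
    assuming log(OR) is approximately normal."""
    se = (log(hi) - log(lo)) / (2 * 1.96)          # SE of log(OR) from the CI
    z = abs(log(or_est)) / se                      # z statistic
    return 2 * (1 - 0.5 * (1 + erf(z / sqrt(2))))  # 2 * (1 - Phi(z))

# Invented base-paper result: OR = 1.12 (95% CI 1.01 to 1.24)
print(round(p_from_or(1.12, 1.01, 1.24), 3))  # ~0.03
```

Applying this to each reported odds ratio yields the 40 p-values that were then ranked and displayed in the p-value plots described above.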
... These studies can also assess the strength of a relationship and the conditions that affect the relationship. Findings that have significant consequences should be corroborated to help foster theory development and generalize marketing results (Hubbard, 2015;Iyer & Griffin, 2021;Tsang & Kwan, 1999). ...
Article
This study presents a reliable, valid, and generalizable four‐item unidimensional scale that captures general bandwagon luxury motivation. After a thorough review of the bandwagon luxury literature, the authors developed an initial set of items which were then reviewed by academic experts. The scale was tested in a series of four studies to refine the scale and demonstrate its reliability and validity: Study 1 was conducted with a student sample in the Southeast, Study 2 with a student referral sample of adults in the Midwest, Study 3 with a national Qualtrics panel sample in the United States, and Study 4 with another national Qualtrics panel sample in the United States that included only those who had bought or consumed a luxury product in the past 12 months. Study 4 was done to corroborate the evidence from Study 3 with a sample of luxury consumers. The generalized bandwagon luxury motivation scale is positively related to status consumption motivation, congruity with one's internal self, a preference for visible luxury brands, and conspicuous consumption. It is negatively related to the inconspicuous luxury motivation of being unknown to the masses and independent self‐construal. This research contributes to the literature by developing a generalized scale to measure the luxury bandwagon effect that is not limited to one luxury product domain.
... This may surprise many readers, given how widely such tests are used in published research. 1,2 It is also surprising that the widespread use of null hypothesis statistical tests has persisted for so long, given that the problems in Box 1 have been repeatedly raised in healthcare journals for decades, 8,9 including physiotherapy journals. 10,11 There has been some movement away from null hypothesis statistical tests, but the use of alternative methods of statistical inference has increased slowly over decades, as seen in analyses of healthcare research, including physiotherapy trials. ...
... Researchers have recently started questioning whether relying only on an NHST that uses a single SEM analysis method is safe. For example, Hubbard (2015) calls these tests "corrupt research." Even the journal Basic and Applied Social Psychology (BASP) agreed that NHST does not tell the reader how advantageous the model is for forecasting an outcome (Trafimow, 2014; Trafimow & Earp, 2017). ...
Article
Although the interplay among moral norms (MN), organizational support (OS), psychological ownership (PO), past green behavior, and green practice behavior (GPB) has been investigated separately in the hospitality and tourism literature, such investigations have been analyzed with the assumption of symmetrical perspective. This research provides additional information by applying both the symmetrical and non-symmetrical paradigms with an innovative methodological approach called the integrated generalized structured component analysis with fuzzy set qualitative comparative analysis (fsQCA). A survey with 277 respondents indicates that MN, OS, PO, and past green behavior can collectively and efficiently explain variations in GPB at work. Results from fsQCA identify four different combinations of configurations that can shape employees’ behavior to perform green practices at work. In addition, MN is identified as a core factor and confirmed to be an indispensable condition to the occurrence of GPB. Moreover, this study tests and confirms all core tenets of complexity theory. Also, we address the potential sub-additive bias by relying on the perspective of the factor measurement model.
... The frequentist approach allowed us to implement a method for testing the relationship between the sample characteristics (size and missing points) and the expected group differences. Even considering that the statistical threshold (here p < 0.05) may be rather arbitrary (see 37,38), it is important to note that this was chosen as a controlled, systematic approach to study the behavior of the databases when removing subjects. ... In a further step, we aimed to explore the utility of Bayesian statistics combined with LME modelling. It has been suggested that Bayesian approaches could complement the findings obtained with frequentist analyses, as they provide a more interpretable framework. ...
Article
Full-text available
Linear mixed effects (LME) modelling under both frequentist and Bayesian frameworks can be used to study longitudinal trajectories. We studied the performance of both frameworks on different dataset configurations using hippocampal volumes from longitudinal MRI data across groups—healthy controls (HC), mild cognitive impairment (MCI) and Alzheimer’s disease (AD) patients, including subjects that converted from MCI to AD. We started from a large database of 1250 subjects from the Alzheimer’s disease neuroimaging initiative (ADNI), and we created different reduced datasets simulating real-life situations using a random-removal permutation-based approach. The number of subjects needed to differentiate groups and to detect conversion to AD was 147 and 115 respectively. The Bayesian approach allowed estimating the LME model even with very sparse databases, with a high number of missing points, which was not possible with the frequentist approach. Our results indicate that the frequentist approach is computationally simpler, but it fails in modelling data with a high number of missing values.
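The two modelling routes the abstract compares can be sketched as follows. The data frame, variable names, and effect sizes are synthetic stand-ins rather than the actual ADNI fields, and the Bayesian analogue is indicated via the bambi package as one possible implementation:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic stand-in for longitudinal hippocampal-volume data (invented
# variable names and values, not the ADNI fields).
rng = np.random.default_rng(1)
n_subj, n_visits = 40, 4
df = pd.DataFrame({
    "subject": np.repeat(np.arange(n_subj), n_visits),
    "time": np.tile(np.arange(n_visits), n_subj),
    "group": np.repeat(rng.choice(["HC", "MCI", "AD"], n_subj), n_visits),
})
df["volume"] = 7.0 - 0.1 * df["time"] + rng.normal(0, 0.3, len(df))

# Frequentist LME: random intercept and slope per subject.
fit = smf.mixedlm("volume ~ time * group", df,
                  groups=df["subject"], re_formula="~time").fit()
print(fit.summary())

# A Bayesian analogue (e.g., with the bambi package) keeps the same
# structure but, as the abstract notes, can still yield estimates when
# the data are very sparse:
#   import bambi as bmb
#   bmb.Model("volume ~ time * group + (time | subject)", df).fit()
```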
... The replication crisis is at least in part an ethical one, as researchers have admitted to knowing about, or engaging in, efforts to manipulate data; techniques such as p-hacking 2 are widely practiced, and journal editors, publishers, and funders of research have knowingly continued to employ perverse incentives and suspect standards (e.g., the reliance on null-hypothesis significance testing) for evaluating empirical research, the very research from which "evidence-based decisions" are to be made. 3 Yet, the most prominent critics come from "within house." Speaking to proposals for improving research quality, Arocha observed that "most … take the standard linear input-output approach for granted." ...
Chapter
Full-text available
The present period constitutes a crisis of science, of method. Take for example the Collaborative Institutional Training Initiative (CITI) training that universities often require for both students and faculty as part of their ethics review board protocol. 1 Importantly, this training now covers the replication crisis. Lack of reproducibility of research results, trainees are informed, refers to failures to obtain similar results when replicating an experiment; it is also evident when different researchers analyze the same data yet obtain different results. The training indicates that this problem exists in biomedical, behavioral, computational, and social sciences, and engineering. The replication crisis is at least in part an ethical one, as researchers have admitted to knowing about, or engaging in, efforts to manipulate data; techniques such as p-hacking 2 are widely practiced, and journal editors, publishers, and funders of research have knowingly continued to employ perverse incentives and suspect standards (e.g., the reliance on null-hypothesis significance testing) for evaluating empirical research, the very research from which "evidence-based decisions" are to be made. 3 Yet, the most prominent critics come from "within house." Speaking to proposals for improving research quality, Arocha observed that "most … take the standard linear input-output approach for granted." The problem originates from flawed methodological conceptualizations rooted in positivism, namely empiricism, the thesis that knowledge can only be obtained from sense experience, and operationism, that is, reducing "an unobservable process to the empirical consequences that are used to verify it." 4 Others have gone even farther. Psychological research fails because it focuses on isolated facts rather than the development of a general theory, relies too heavily on quantitative data, and ignores that the same external environment can psychologically be very different and not only the external but the psychological environment must also be controlled. Because of these and other issues, Toomela declared that the last 60 years of mainstream psychology "should simply be forgotten as useless for the development of psychology." 5 In his 1972 volume Social Sciences as Sorcery, Andreski made similar sweeping claims about the social and political sciences, especially their use of quantification as scientific "camouflage," and their narrow conceptions of cause. 6 The replication crisis is, then, evidence of a much deeper and long-standing set of philosophical and methodological problems.
... Arocha (2020) recently described several foundational issues that have impeded psychology's development into a mature and complete science. These issues include methodological shortcomings (see Cohen, 1994;Ioannidis, 2005;Lykken, 1991;Meehl, 1978), an overreliance upon Null Hypothesis Significance Testing (NHST; Acree, 1978;Arocha, 2020;Gigerenzer, 2004;Hubbard, 2016;Lambdin, 2012;Nickerson, 2000;Rozeboom, 1960;Wasserstein & Lazar, 2016), and reliance upon an abandoned philosophy, positivism (Arocha, 2020;Costa & Shimp, 2011;Dougherty, 2013;Grice, 2011;Manicas, 2006). Psychologists have acknowledged and attempted to address some of the methodological and statistical issues. ...
Thesis
In light of documented methodological issues and reproducibility failures, psychologists have sought to improve the scientific credibility of their field. Unfortunately, these efforts have not addressed psychology's problematic foundational philosophy, logical positivism, which has largely been abandoned by modern philosophers. Notably, other older sciences such as chemistry and physics have also replaced logical positivism with a stronger foundation, namely, philosophical realism. This thesis demonstrates how psychologists can overcome their methodological issues and reproducibility failures by likewise embracing a realist philosophy of science that includes Aristotle's four causes (viz., the material, formal, efficient, and final causes). Final causes are particularly important as they explain the purpose or reason for the occurrence of an event in nature. Utilizing Perceptual Control Theory, this thesis provides a general methodology for visually representing such causes in iconic models. Perceptual Control Theory posits that organisms are aware of sensations in the environment and respond to the awareness of these sensations towards some goal (i.e., final cause). In other words, behavior is a result of a goal held by the organism and is not merely produced from the environment. Data from a perceptual control theory task were collected and analyzed to determine the number of individuals whose responses matched the proposed final cause model. Results were highly successful as every individual's set of responses could be traced accurately through the model. Further implications and the importance of these modeling procedures are discussed. Utilizing the modeling technique developed in this thesis, psychologists can begin to rebuild their research upon the foundation of philosophical realism. In doing so, psychologists will be enabled to produce fruitful research which also restores the individual person to the center of investigation, offers inferences to best explanations, improves model testing and theory development, and most importantly, restores teleological explanations to psychological science.
... In this way, QRPs might become normalized, institutionalized, and ingrained (see also Smaldino & McElreath, 2016). Their initial effect of deceiving others soon could give way to self-deception, where researchers would begin to view the research process not as the pursuit of knowledge but as one of gamesmanship; effectively, this would turn scientists seeking truth into lawyers presenting the most persuasive case (see also Davis, 1971; Hubbard, 2016; Tourish, 2019). ...
Chapter
Full-text available
There is increasing concern that the veracity of research findings in a number of scientific disciplines, including psychology, may be compromised by questionable research/reporting practices (QRPs). QRPs, such as hypothesizing after results are known, selectively deleting outliers, and “p-hacking,” bolster findings by giving the appearance of statistical significance, generalizability, and novelty. In truth, studies containing such QRPs do not replicate, do not generalize, and mislead both research and practice. This process of “ugly” initial results metamorphosing into “beautiful” articles through QRPs is known as the chrysalis effect and has the potential to compromise the integrity of the field and the trust practitioners and external funding agencies place in psychology research. This chapter reviews the extant research of the existence and frequency of QRP engagement. We then transition into the antecedents and outcomes of QRPs, with a focus on the system processes that both encourage and facilitate QRP engagement. We then close with a series of steps that might mitigate QRP prevalence in order for research to reflect best scientific practices.
... This may surprise many readers, given how widely such tests are used in published research. 1,2 It is also surprising that the widespread use of null hypothesis statistical tests has persisted for so long, given that the problems in Box 1 have been repeatedly raised in healthcare journals for decades, 8,9 including physiotherapy journals. 10,11 There has been some movement away from null hypothesis statistical tests, but the use of alternative methods of statistical inference has increased slowly over decades, as seen in analyses of healthcare research, including physiotherapy trials. ...
... But the use of statistical significance divides the research community in a range of disciplines, from statistics to social policy, including education. Some consider statistical significance an essential part of impact evaluation, or at least one aspect of a broader picture, while others regard it as a meaningless and misleading concept that should be abolished altogether (Shrout, 1997; Ziliak and McCloskey, 2008; Trafimow and Marks, 2015; Gorard, 2016; Hubbard, 2016; Wasserstein and Lazar, 2016). This chapter outlines some key concepts underpinning notions of uncertainty, and proposes a way forward, which is then adopted in the subsequent chapter that presents estimates of impact, costs and certainty for a range of common education interventions and approaches. The key proposal is that impacts should be reported as effect sizes, and interpreted alongside internal validity and uncertainty when making a decision about a programme. ...
Book
Full-text available
The ISEE Assessment was initiated with the idea of using science and evidence as its founding pillars. However, we soon noticed that the terms evidence and data prompted a slew of questions and clarifications that we did not anticipate. Recognizing the diversity of views and perspectives on what a science- and evidence-based assessment means, a small group of experts was commissioned to provide more clarity and guidance on what evidence means and how data can and should be used in education practice and policymaking. This working group’s focus is on seeking the best way to answer the questions ‘what worked’, ‘what is working best generally’ and ‘will a given intervention work here and now’. A new taxonomy of eight tiers or levels of evidence guides matching available evidence to these questions and assessing the strength of this evidence. The experts in this group provide a deeper understanding of how effect size and consistency of effect sizes influence learning outcomes, and how they can, and cannot, be used in practice and policy guidance. They also illustrate the potential of this modern approach to evidence-based education by discussing the EEF (Education Endowment Foundation) Evidence Database, effectively providing a proof of concept for some of the key ideas put forward as the new norm.
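Since the chapter's proposal turns on reporting impacts as effect sizes, it may help to recall the definition being assumed. The formula below is the textbook standardized mean difference (Cohen's d), not a construct specific to the ISEE Assessment.

```latex
% Standardized mean difference (Cohen's d) for an intervention (T) vs control (C):
\[
  d \;=\; \frac{\bar{x}_T - \bar{x}_C}{s_p},
  \qquad
  s_p \;=\; \sqrt{\frac{(n_T - 1)\,s_T^2 + (n_C - 1)\,s_C^2}{n_T + n_C - 2}} ,
\]
% so an effect of d = 0.3 means the average treated pupil outscores the
% average control pupil by 0.3 pooled standard deviations.
```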
... When researching inter-group differences, this often means 'translating' NHST results. Aside from the methodological limitations of statistical significance tests, their practical interpretation is problematic (e.g., Cohen 1994; Hubbard 2016; Li and Ali 2020; Schmidt 1996; Schmidt and Hunter 1997; Trafimow 2003; Woodside 2016; Ziliak and McCloskey 2016). Arguably, statistical significance tests do not provide all the information marketing practitioners require. ...
Article
Full-text available
Marketing scholars often compare groups that occur naturally (e.g., socioeconomic status) or due to random assignment of participants to study conditions. These scholars often report group means, standard deviations, and effect sizes. Although such summary information can be helpful, it can misinform marketing practitioners’ decisions. To avoid this problem, scholars should also report the probability that one group’s member will score higher than another group’s member, and by various amounts. In this vein, newly conceived gain-probability diagrams can depict relevant, concise, and easy-to-comprehend probabilistic information. These diagrams’ nuanced perspective can contradict traditional significance test and effect size implications.
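The probability the authors propose reporting, that a randomly drawn member of one group outscores a randomly drawn member of the other, is often called the probability of superiority. A minimal sketch under a normality assumption follows, with a distribution-free pairwise estimate for comparison; it does not reproduce the paper's gain-probability diagrams themselves.

```python
# Probability of superiority P(X > Y): the chance a random member of group X
# outscores a random member of group Y. Sketch only; the paper's diagrams go
# further (probabilities of gains by various amounts), not reproduced here.
import numpy as np
from scipy.stats import norm

def p_superiority_normal(mean_x, sd_x, mean_y, sd_y):
    # Closed form under normality: X - Y ~ Normal(mx - my, sqrt(sx^2 + sy^2)).
    return norm.cdf((mean_x - mean_y) / np.hypot(sd_x, sd_y))

def p_superiority_empirical(x, y):
    # Distribution-free estimate: fraction of all (x_i, y_j) pairs with x_i > y_j.
    x, y = np.asarray(x), np.asarray(y)
    return (x[:, None] > y[None, :]).mean()

rng = np.random.default_rng(2)
x = rng.normal(0.5, 1.0, size=200)   # e.g., treatment group scores (made up)
y = rng.normal(0.0, 1.0, size=200)   # e.g., control group scores (made up)
print(p_superiority_normal(0.5, 1.0, 0.0, 1.0))  # ~0.638
print(p_superiority_empirical(x, y))             # close to the value above
```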
... The publication of academic articles by foreign researchers through their affiliation with Ecuadorian universities that hired them became a precious asset. Consequently, higher education became increasingly commodified (Chapman & Lindner, 2016; Hubbard, 2015; Fernández, 2006), driven by policies based on the nationalization of the higher education system, with scientific output the main element to be considered by university authorities and control bodies. As a corollary to this, the evaluation of academic performance was outsourced to publishers with vested interests: where previously high-quality teaching leading to better degrees was the main objective, now the publication of high-impact articles was at a premium. ...
Chapter
Academic migrations have been a constant throughout history. Nowadays, owing to the complexity of the global situation, many teachers and researchers have been expelled from their countries and are unable to develop their professional careers. By contrast, Ecuador attracted human capital from 2008 to 2018 through public policies aimed at changing the productive matrix and improving its academic results. The hiring of teachers and researchers at the national level was constant and resulted in an exchange of knowledge. To this end, the Network of Researchers in Ecuador (@redcientificos) in the higher education system and its interrelationships are analyzed, to show both the fullness of academic possibilities the system intrinsically offered at its beginnings and the subsequent decline in working conditions for its teachers and researchers since 2018, a circumstance that limits the prospects of researchers based in Ecuador and introduces temporality and uncertainty into its processes of academic linkage and production, as well as into its public policies.
... This may surprise many readers, given how widely such tests are used in published research. 1,2 It is also surprising that the widespread use of null hypothesis statistical tests has persisted for so long, given that the problems in Box 1 have been repeatedly raised in healthcare journals for decades, 8,9 including physiotherapy journals. 10,11 There has been some movement away from null hypothesis statistical tests, but the use of alternative methods of statistical inference has increased slowly over decades, as seen in analyses of healthcare research, including physiotherapy trials. ...
Article
Purpose: This study applies complexity theory to propose and empirically examine asymmetric case conditions of antecedents and outcome models of high (low) willingness-to-engage in workplace romance (WEWR). The study focuses on constructing complex antecedent conditions that accurately indicate which employees, under what conditions, are high in WEWR.
Design/methodology/approach: Using an experimental design, 162 employees were each assigned one of nine hypothetical vignettes describing different workplace romance contexts, including three discrete policies regarding workplace romances (strictly forbidden, moderate, or no policy), two motivations for the workplace romance (job vs. love), and two organizational positions of the romance (hierarchical vs. lateral). Participants then reported WEWR responses and provided demographic, behavioral, and psychological work-related information. The study assesses and supports recipes (i.e., algorithms) of case and organizational structure conditions that identify cases high (low) in WEWR accurately and consistently.
Findings: The results clarify which employees are willing vs. unwilling to engage in workplace romances, and when, including the contextualized impact of organizational bans on WEWR. The results are useful for estimating for whom, and in which workplace contexts, specific workplace policies are effective.
Practical implications: In highlighting the role of varying antecedent conditions in predicting WEWR, this research will assist organizations and practitioners in understanding the contexts in which workplace romances are more likely to occur, providing insight into when employees are likely to comply with workplace romance policies.
Originality/value: This paper is the first in the workplace romance literature to examine unique combinations of antecedent conditions on WEWR, adding nuance to the current understanding of the behavior.
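Configurational "recipes" of this kind are commonly screened with a fuzzy-set consistency index (Ragin's criterion from qualitative comparative analysis). The sketch below illustrates that general index with made-up membership scores; it is not this study's actual algorithm or data.

```python
# Fuzzy-set consistency of a configural "recipe": how reliably cases that
# satisfy the antecedent combination X also show the outcome Y.
# General illustration (Ragin-style index), not this study's method.
import numpy as np

def consistency(x, y):
    # x: membership in the recipe (e.g., min of condition memberships)
    # y: membership in the outcome (e.g., high WEWR), both in [0, 1]
    x, y = np.asarray(x, float), np.asarray(y, float)
    return np.minimum(x, y).sum() / x.sum()

# A recipe is the intersection (elementwise min) of its conditions.
no_policy   = np.array([0.9, 0.8, 0.2, 0.7, 0.1])   # hypothetical memberships
love_motive = np.array([0.8, 0.9, 0.3, 0.6, 0.2])
recipe = np.minimum(no_policy, love_motive)          # "no policy AND love motive"
wewr_high = np.array([0.85, 0.9, 0.1, 0.7, 0.15])
print(round(consistency(recipe, wewr_high), 3))      # near 1 => consistent recipe
```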
Chapter
Full-text available
The overall goal of science is to build a valid and reliable body of knowledge about the functioning of the world and how applying that knowledge can change it. As personnel and human resources management researchers, we aim to contribute to the respective bodies of knowledge to provide both employers and employees with a workable foundation to help with those problems they are confronted with. However, what research on research has consistently demonstrated is that the scientific endeavor possesses existential issues including a substantial lack of (a) solid theory, (b) replicability, (c) reproducibility, (d) proper and generalizable samples, (e) sufficient quality control (i.e., peer review), (f) robust and trustworthy statistical results, (g) availability of research, and (h) sufficient practical implications. In this chapter, we first sing a song of sorrow regarding the current state of the social sciences in general and personnel and human resources management specifically. Then, we investigate potential grievances that might have led to it (i.e., questionable research practices, misplaced incentives), only to end with a verse of hope by outlining an avenue for betterment (i.e., open science and policy changes at multiple levels).
Technical Report
Full-text available
I do not imagine that the concept of statistical significance is strange or new to anyone working in statistics, econometric studies, or quantitative techniques generally. In this translated version of the original report "Moving to a World Beyond p < 0.05", which summarizes 43 scientific articles by today's leading mathematical and applied statisticians and was the subject of a scientific meeting organized by the American Statistical Association, general and specialist readers alike will find ideas, a scientific controversy, and a movement of radical reform in the field of statistical inference. Within this framework, I considered it my duty to provide an Arabic translation (with revision and clarification of the technical ideas) of this important report. The translated report will be a foundational building block for reflection and debate about the state of statistical practice in the Arab countries, at both the academic and the official levels. The central idea of the report and its articles concerns rebuilding and correcting the accumulated errors and improper use of statistical inference, especially on the applied, practical side of statistics. Our work consisted essentially of translating the report into Arabic while respecting the technical concepts, together with explanations of some of the terms and ideas appearing in the text of the report. The work was long and tiring, and I do not claim it is perfect (particularly with respect to the linguistic translation). As a reminder, the report as a whole, and all 43 articles, are freely available for download, and I advise readers and specialists to consult them all.
Preprint
In the context of discovery-oriented hypothesis-testing research, behavioral scientists widely accept a convention for false positive (α) and false negative (β) error rates proposed by Jacob Cohen, who deemed the general relative seriousness of the antecedently accepted α = 0.05 to be matched by β = 0.20. Cohen’s convention not only ignores contexts of hypothesis testing in which the β-error is the more serious error; it also implies, for discovery-oriented research, that a statistically significant observed effect is four times more likely to be a mistaken discovery than a statistically significant true observed effect is to be independently replicable. In the long run, Cohen’s convention is thus epistemically harmful to the development of a progressive science of human behavior, making its acceptance crucial in explaining the replication crisis in behavioral science. The balance between α- and β-errors generally ought to be struck using both epistemic and practical considerations. Yet epistemic considerations alone imply that making a genuine contribution to the body of knowledge in behavioral science requires error rates that are not only small but also symmetrical.
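The asymmetry at issue is visible directly in the convention's numbers; the following is our gloss of that arithmetic, not the preprint's own derivation.

```latex
% Cohen's convention and its built-in asymmetry (our gloss of the arithmetic):
\[
  \alpha = 0.05, \qquad \beta = 0.20, \qquad
  \frac{\beta}{\alpha} = \frac{0.20}{0.05} = 4 ,
\]
% i.e., a missed true effect is tolerated at four times the rate of a false
% alarm, while power for an exact replication of a true effect is
% 1 - \beta = 0.80. A symmetric standard would instead set alpha = beta.
```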
Chapter
List of works referred to in Armstrong & Green (2022), The Scientific Method.
Article
The vast majority of empirical hypotheses in psychology, and in the social sciences more generally, are directional, whereas other sciences, such as the physical sciences, feature more point or narrow-interval empirical hypotheses. Characteristics of theories and of auxiliary assumptions play a role in the difference. Given that psychology research features directional predictions so strongly, it is important to question the extent to which they provide convincing tests of the theories they are designed to test. The present work aims to provide a nuanced view that considers the complex interaction between the obviousness of directional predictions, the obviousness of the theory from which they derive, and the quality of the auxiliary assumptions that push towards directional predictions. There is also the related issue of the vulnerability of directional predictions to alternative explanations, and of how to address them.
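The contrast the article trades on can be stated compactly; the notation below is a generic gloss, not the article's own formalism.

```latex
% Directional vs point/narrow-interval empirical hypotheses (generic gloss):
\[
  H_{\text{dir}}:\; \delta > 0
  \qquad\text{vs.}\qquad
  H_{\text{point}}:\; \delta = \delta_0
  \quad\text{or}\quad
  H_{\text{interval}}:\; |\delta - \delta_0| \le \varepsilon ,
\]
% where delta is the effect of interest. A directional claim is satisfied by
% half of the parameter space, so passing such a test corroborates a theory
% far more weakly than landing inside a narrow predicted interval.
```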