# Richard Donald Morey

PhD, Psychology

Cardiff University · School of Psychology

## About

- 99 Publications
- 103,687 Reads
- 15,512 Citations

A 'read' is counted each time someone views a publication summary (such as the title, abstract, and list of authors), clicks on a figure, or views or downloads the full text.

Additional affiliations

- January 2015 – present
- July 2008 – present
- **Rijksuniversiteit Groningen**, July 2008 – December 2014

Education

- July 2005 – July 2008
- August 2004 – July 2008
- August 2002 – July 2004

## Publications

van Doorn et al. (2021) outlined various questions that arise when conducting Bayesian model comparison for mixed effects models. Seven response articles offered their own perspective on the preferred setup for mixed model comparison, on the most appropriate specification of prior distributions, and on the desirability of default recommendations. T...

ANOVA—the workhorse of experimental psychology—seems well understood in that behavioral sciences have agreed-upon contrasts and reporting conventions. Yet, we argue this consensus hides considerable flaws in common ANOVA procedures, and these flaws become especially salient in the within-subject and mixed-model cases. The main thesis is that these...

The Action-sentence Compatibility Effect (ACE) is a well-known demonstration of the role of motor activity in the comprehension of language. Participants are asked to make sensibility judgments on sentences by producing movements toward the body or away from the body. The ACE is the finding that movements are faster when the direction of the moveme...

Roberts (2020, Learning & Behavior, 48 [2], 191–192) discussed research claiming honeybees can do arithmetic. Some readers of this research might regard such claims as unlikely. The present authors used this example as a basis for a debate on the criterion that ought to be used for publication of results or conclusions that could be viewed as unlik...

More than 40 years ago, Paul Meehl (1978) published a seminal critique of the state of theorizing in psychological science. According to Meehl, the quality of theories had diminished in the preceding decades, resulting in statistical methods standing in for theoretical rigor. In this introduction to the special issue Theory in Psychological Science...

Social and behavioural scientists have attempted to speak to the COVID-19 crisis. But is behavioural research on COVID-19 suitable for making policy decisions? We offer a taxonomy that lets our science advance in ‘evidence readiness levels’ to be suitable for policy. We caution practitioners to take extreme care translating our findings to applicat...

The principle of predictive irrelevance states that when two competing models predict a data set equally well, that data set cannot be used to discriminate the models and – for that specific purpose – the data set is evidentially irrelevant. To highlight the ramifications of the principle, we first show how a single binomial observation can be irre...
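
The binomial case can be sketched numerically (models and data chosen here for illustration, not taken from the paper): a single observation predicted equally well by a point-null model and a uniform-prior model yields a Bayes factor of 1, so that observation is evidentially irrelevant for discriminating the two.

```python
from fractions import Fraction

# Marginal probability of a single success (x = 1, n = 1) under two models.

# Point model H0: theta = 1/2  ->  P(x = 1) = 1/2
p_point = Fraction(1, 2)

# Uniform model H1: theta ~ Beta(1, 1)
# P(x = 1) = integral of theta over [0, 1] = 1/2
p_uniform = Fraction(1, 2)

bayes_factor = p_point / p_uniform
print(bayes_factor)  # 1: the single observation cannot discriminate the models
```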

Why is there no consensual way of conducting Bayesian analyses? We present a summary of agreements and disagreements of the authors on several discussion points regarding Bayesian inference. We also provide a thinking guideline to assist researchers in conducting Bayesian inference in the social and behavioural sciences.

Although teaching Bayes’ theorem is popular, the standard approach—targeting posterior distributions of parameters—may be improved. We advocate teaching Bayes’ theorem in a ratio form where the posterior beliefs relative to the prior beliefs equals the conditional probability of data relative to the marginal probability of data. This form leads to...
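
The ratio form can be illustrated numerically (all probabilities below are chosen for illustration): the factor by which beliefs change equals the probability of the data under the hypothesis relative to the marginal probability of the data.

```python
# Ratio form of Bayes' theorem:
#   P(H | D) / P(H) = P(D | H) / P(D)
# i.e., the belief change equals the predictive updating factor.

p_h = 0.2           # prior probability of hypothesis H (illustrative)
p_d_given_h = 0.9   # probability of the data if H is true
p_d_given_not_h = 0.3

# Marginal (average) probability of the data
p_d = p_d_given_h * p_h + p_d_given_not_h * (1 - p_h)

updating_factor = p_d_given_h / p_d   # how much the data favor H
posterior = p_h * updating_factor     # posterior via the ratio form

print(round(updating_factor, 3))  # ~2.143: data ~2x more probable under H
print(round(posterior, 3))        # ~0.429
```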

When data analysts operate within different statistical frameworks (e.g., frequentist versus Bayesian, emphasis on estimation versus emphasis on testing), how does this impact the qualitative conclusions that are drawn for real data? To study this question empirically we selected from the literature two simple scenarios—involving a comparison of tw...

This paper introduces JASP, a free graphical software package for basic statistical procedures such as t tests, ANOVAs, linear regression models, and analyses of contingency tables. JASP is open-source and differentiates itself from existing open-source solutions in two ways. First, JASP provides several innovations in user interface design; specif...

Scientific theories explain phenomena using simplifying assumptions—for instance, that the speed of light does not depend on the direction in which the light is moving, or that the shape of a pea plant’s seeds depends on a small number of alleles randomly obtained from its parents. These simplifying assumptions often take the form of statistical nu...

Background:
Studies have shown similar efficacy of different antidepressants in the treatment of depression.
Method:
Data of phase-2 and -3 clinical-trials for 16 antidepressants (levomilnacipran, desvenlafaxine, duloxetine, venlafaxine, paroxetine, escitalopram, vortioxetine, mirtazapine, venlafaxine XR, sertraline, fluoxetine, citalopram, paro...

Statistical inference has grown to be an indispensable tool for scientists seeking to make sense of data. This chapter describes the three approaches underlying the large majority of statistical inferences drawn in the scientific literature: frequentist, likelihood, and Bayesian approaches. Typical approaches to frequentist...

Bayesian parameter estimation and Bayesian hypothesis testing present attractive alternatives to classical inference using confidence intervals and p values. In part I of this series we outline ten prominent advantages of the Bayesian approach. Many of these advantages translate to concrete opportunities for pragmatic researchers. For instance, Bay...

Bayesian hypothesis testing presents an attractive alternative to p value hypothesis testing. Part I of this series outlined several advantages of Bayesian hypothesis testing, including the ability to quantify evidence and the ability to monitor and update this evidence as data come in, without the need to know the intention with which the data wer...

Although teaching Bayes' theorem is popular, the standard approach---targeting posterior distributions of parameters---may be improved. We advocate teaching Bayes' theorem in a ratio form where the posterior beliefs relative to the prior beliefs equals the conditional probability of data relative to the marginal probability of data. This form leads...

[This corrects the article DOI: 10.1098/rsos.160426.].

This chapter explains why the logic behind p-value significance tests is faulty, leading researchers to mistakenly believe that their results are diagnostic when they are not. It outlines a Bayesian alternative that overcomes the flaws of the p-value procedure, and provides researchers with an honest assessment of the evidence against or in favor o...

We applied three Bayesian methods to reanalyse the preregistered contributions to the Social Psychology special issue 'Replications of Important Results in Social Psychology' (Nosek & Lakens, 2014, Registered reports: a method to increase the credibility of published results. Soc. Psychol. 45, 137–141. doi:10.1027/1864-9335/a000192). First, indivi...

In 1881, Donald MacAlister posed a problem in the Educational Times that remains relevant today. The problem centers on the statistical evidence for the effectiveness of a treatment based on a comparison between two proportions. A brief historical sketch is followed by a discussion of two default Bayesian solutions, one based on a one-sided test be...

The field of psychology, including cognitive science, is vexed by a crisis of confidence. Although the causes and solutions are varied, we focus here on a common logical problem in inference. The default mode of inference is significance testing, which has a free lunch property where researchers need not make detailed assumptions about the alternat...

The analysis of R×C contingency tables usually features a test for independence between row and column counts. Throughout the social sciences, the adequacy of the independence hypothesis is generally evaluated by the outcome of a classical p-value null-hypothesis significance test. Unfortunately, however, the classical p-value comes with a number o...

Evidence suggests that there is a tendency to verbally recode visually-presented information, and that in some cases verbal recoding can boost memory performance. According to multi-component models of working memory, memory performance is increased because task-relevant information is simultaneously maintained in two codes. The possibility of dual...

This article provides a Bayes factor approach to multiway analysis of variance (ANOVA) that allows researchers to state graded evidence for effects or invariances as determined by the data. ANOVA is conceptualized as a hierarchical model where levels are clustered within factors. The development is comprehensive in that it includes Bayes factors fo...

The practical advantages of Bayesian inference are demonstrated here through two concrete examples. In the first example, we wish to learn about a criminal’s IQ: a problem of parameter estimation. In the second example, we wish to quantify and track support in favor of the null hypothesis that Adam Sandler movies are profitable regardless of their...
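
The parameter-estimation example can be sketched as a conjugate normal-normal update (all numbers below are hypothetical, not the paper's): the posterior mean is a precision-weighted compromise between the population prior and the observed test scores.

```python
import math

# Conjugate normal-normal update for an unknown IQ (illustrative values).
# Prior: theta ~ N(100, 15^2), the population IQ distribution.
# Likelihood: each test score ~ N(theta, 10^2), measurement sd assumed known.

prior_mean, prior_sd = 100.0, 15.0
obs_sd = 10.0
scores = [73.0, 67.0, 79.0]

prior_prec = 1.0 / prior_sd**2          # precision = 1 / variance
data_prec = len(scores) / obs_sd**2

post_prec = prior_prec + data_prec
post_mean = (prior_prec * prior_mean
             + data_prec * (sum(scores) / len(scores))) / post_prec
post_sd = math.sqrt(1.0 / post_prec)

# Posterior shrinks the sample mean (73) toward the prior mean (100)
print(round(post_mean, 1), round(post_sd, 1))
```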

The Food and Drug Administration (FDA) uses a p < 0.05 null-hypothesis significance testing framework to evaluate "substantial evidence" for drug efficacy. This framework only allows dichotomous conclusions and does not quantify the strength of evidence supporting efficacy. The efficacy of FDA-approved antidepressants for the treatment of anxiety d...

Bayesian inference has been advocated as an alternative to conventional analysis in psychological science. Bayesians stress that subjectivity is needed for principled inference, and subjectivity by-and-large has not been seen as desirable. This paper provides the broader rationale and context for subjectivity, and in it we show that subjectivity is...

Analysis of variance (ANOVA), the workhorse analysis of experimental designs, consists of F-tests of main effects and interactions. Yet, testing, including traditional ANOVA, has been recently critiqued on a number of theoretical and practical grounds. In light of these critiques, model comparison and model selection serve as an attractive alternat...

Openness is one of the central values of science. Open scientific practices such as sharing data, materials, and analysis scripts alongside published articles have many benefits, including easier replication and extension studies, increased availability of data for theory-building and meta-analysis, and increased possibility of review and collabora...

In recent decades, clinicians offering interventions for mental health problems have been expected to systematically collect data on how clients change over time. Since these data typically contain measurement error, statistical tests have been developed that aim to disentangle true change from random error. These statistical tests can be subdivided into two type...

Miller and Ulrich (in press) critique our claim (Hoekstra, Morey, Rouder, & Wagenmakers, 2014), based on a survey given to researchers and students, of widespread misunderstanding of confidence intervals (CIs). They suggest that survey respondents may have interpreted the statements in the survey that we deemed incorrect in an idiosyncratic, but co...

Interval estimates – estimates of parameters that include an allowance for sampling uncertainty – have long been touted as a key component of statistical analyses. There are several kinds of interval estimates, but the most popular are confidence intervals (CIs): intervals that contain the true parameter value in some known proportion of repeated s...
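
The repeated-sampling property that defines a confidence procedure can be checked by simulation (settings below are arbitrary, and sigma is treated as known so the z interval applies); note that the argument concerns what this long-run property does and does not license, not whether it holds.

```python
import random
import statistics

# Simulate the defining property of a 95% confidence procedure: in repeated
# sampling, the intervals contain the true mean ~95% of the time.

random.seed(1)
true_mu, sigma, n, reps = 50.0, 10.0, 25, 10_000

covered = 0
for _ in range(reps):
    sample = [random.gauss(true_mu, sigma) for _ in range(n)]
    m = statistics.fmean(sample)
    half_width = 1.96 * sigma / n**0.5   # known-sigma z interval
    if m - half_width <= true_mu <= m + half_width:
        covered += 1

coverage = covered / reps
print(coverage)  # close to 0.95 by construction
```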

State-trace methods have recently been advocated for exploring the latent dimensionality of psychological processes. These methods rely on assessing the monotonicity of a set of responses embedded within a state-space. Prince et al. (2012) proposed Bayes factors for state-trace analysis, allowing the assessment of the evidence for monotonicity with...

We compared Bayes factors to normalized maximum likelihood for the simple case of selecting between an order-constrained versus a full binomial model. This comparison revealed two qualitative differences in testing order constraints, regarding data dependence and model preference.
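
The comparison can be sketched for hypothetical data (3 successes in 12 trials; the models and priors below are assumptions for illustration): the Bayes factor for the order constraint uses the encompassing-prior identity, while NML normalizes each model's maximized likelihood by its total over all possible data sets.

```python
from math import comb
from scipy.stats import beta

n, k = 12, 3  # hypothetical data: 3 successes in 12 trials

def lik(j, theta):
    """Binomial likelihood of j successes in n trials."""
    return comb(n, j) * theta**j * (1 - theta)**(n - j)

# Bayes factor: order-constrained model (theta <= 1/2) versus the full model
# (theta uniform on [0, 1]). With an encompassing uniform prior this equals
# the posterior mass of the constraint divided by its prior mass.
post_mass = beta.cdf(0.5, k + 1, n - k + 1)   # P(theta <= 1/2 | data)
bf_restricted_vs_full = post_mass / 0.5

# Normalized maximum likelihood: maximized likelihood of the observed data,
# normalized by the sum of maximized likelihoods over all possible data sets.
def nml(j_obs, restricted):
    mle = lambda j: min(j / n, 0.5) if restricted else j / n
    denom = sum(lik(j, mle(j)) for j in range(n + 1))
    return lik(j_obs, mle(j_obs)) / denom

nml_ratio = nml(k, restricted=True) / nml(k, restricted=False)
print(round(bf_restricted_vs_full, 2), round(nml_ratio, 2))
```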

Interval estimates – estimates of parameters that include an allowance for sampling uncertainty – have long been touted as a key component of statistical analyses. There are several kinds of interval estimates, but the most popular are confidence intervals (CIs): intervals that contain the true parameter value in some known proportion of repeated s...

Color repetitions in a visual scene boost working memory capacity for its elements, a phenomenon known as the color-sharing effect. This may occur because improved perceptual organization reduces information load or because the repetitions capture attention. The implications of these explanations differ drastically for both the theoretical meaning...

Hoijtink, van Kooten, and Hulsker (2016; Why Bayesian psychologists should change the way they use the Bayes factor. Multivariate Behavioral Research, 51, 1–9. doi:10.1080/00273171.2014.969364) present a method for choosing the prior distribution for an...
Nature Commentary about the Study: http://www.nature.com/news/unconscious-thought-not-so-smart-after-all-1.16801
Are difficult decisions best made after a momentary diversion of thought? Previous research addressing this important question has yielded dozens of experiments in which participants were asked to choose the best of several options (e.g...

Interval estimates – estimates of parameters that include an allowance for sampling uncertainty – have long been touted as a key component of statistical analyses. There are several kinds of interval estimates, but the most popular are confidence intervals (CIs): intervals that contain the true parameter value in some known proportion of repeat...

The power fallacy refers to the misconception that what holds on average (across an ensemble of hypothetical experiments) also holds for each case individually. According to the fallacy, high-power experiments always yield more informative data than do low-power experiments. Here we expose the fallacy with concrete examples, demonstrating that a pa...

When researchers are interested in the effect of certain interventions on certain individuals, single-subject studies are often performed. In their most simple form, such single-subject studies require that a subject is measured on relevant criterion variables several times before an intervention and several times during or after the intervention....

A core aspect of science is using data to assess the degree to which they provide evidence for various claims, hypotheses, or theories. Evidence is by definition something that should change the credibility of a claim in a reasonable person's mind. However, common statistics, such as significance tests and confidence intervals, have no interface with...

One of the main challenges facing potential users of Bayes factors as an inferential technique is the difficulty of computing them. We highlight a useful relationship that allows certain order-restricted and sign-restricted Bayes factors, such as one-sided Bayes factor tests, to be computed with ease.
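
The relationship can be illustrated in a conjugate binomial setting (the data and priors below are hypothetical): a sign-restricted Bayes factor equals the unrestricted Bayes factor multiplied by the ratio of posterior to prior mass consistent with the restriction, and this matches direct computation with a truncated prior.

```python
from math import comb
from scipy.stats import beta

n, k = 20, 15  # hypothetical data: 15 successes in 20 trials

# Point null H0: theta = 1/2 versus H1: theta ~ Beta(1, 1)
m0 = comb(n, k) * 0.5**n   # marginal likelihood under H0
m1 = 1 / (n + 1)           # marginal likelihood under the uniform prior
bf10 = m1 / m0             # two-sided Bayes factor

post_right = 1 - beta.cdf(0.5, k + 1, n - k + 1)  # P(theta > 1/2 | data, H1)
prior_right = 0.5                                 # P(theta > 1/2 | H1)

# One-sided Bayes factor via the posterior/prior mass relationship
bf_plus0 = bf10 * post_right / prior_right

# Direct route: prior truncated to (1/2, 1], i.e., density doubled there
m_plus = 2 * m1 * post_right
bf_plus0_direct = m_plus / m0

print(round(bf_plus0, 3), round(bf_plus0_direct, 3))  # the two routes agree
```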

Interval estimates – estimates of parameters that include an allowance for sampling uncertainty – have long been touted as a key component of statistical analyses. There are several kinds of interval estimates, but the most popular are confidence intervals (CIs): intervals that contain the true parameter value in some known proportion of repeated s...

Psychological Science recently announced changes to its publication guidelines (Eich, in press). Among these are many positive changes that will increase the quality of the scientific results published in the journal. One of the changes emphasized by Cumming (in press) is an increased emphasis on estimation, as opposed to hypothesis testing. We arg...

We present a cognitive process model of response choice and response time performance data that has excellent psychometric properties and may be used in a wide variety of contexts. In the model there is an accumulator associated with each response option. These accumulators have bounds, and the first accumulator to reach its bound determines the re...

Null hypothesis significance testing (NHST) is undoubtedly the most common inferential technique used to justify claims in the social sciences. However, even staunch defenders of NHST agree that its outcomes are often misinterpreted. Confidence intervals (CIs) have frequently been proposed as a more useful alternative to NHST, and their use is stro...

As a commonly used measure of selective attention, it is important to understand the factors contributing to interference in the Stroop task. The current research examined distracting stimuli in the auditory and visual modalities to determine whether the use of auditory distractors would create additional interference, beyond what is typically obse...

The statistical consistency test of Ioannidis and Trikalinos (2007) has recently been used by several authors to argue that specific sets of experiments show evidence of publication bias. I argue that the test is unnecessary because publication bias exists almost everywhere as a property of the research process, not of individual studies. Furthermore, for...

Researchers using single-subject designs are typically interested in score differences between intervention phases, such as differences in means or trends. If intervention effects are suspected in data, it is desirable to determine how much evidence the data show for an intervention effect. In Bayesian statistics, Bayes factors quantify the evidenc...

Psi phenomena, such as mental telepathy, precognition, and clairvoyance, have garnered much recent attention. We reassess the evidence for psi effects from Storm, Tressoldi, and Di Risio's (2010) meta-analysis. Our analysis differs from Storm et al.'s in that we rely on Bayes factors, a Bayesian approach for stating the evidence from data for compe...

In this article, we present a Bayes factor solution for inference in multiple regression. Bayes factors are principled measures of the relative evidence from data for various models or positions, including models that embed null hypotheses. In this regard, they may be used to state positive evidence for a lack of an effect, which is not possible in...

Bayes factors have been advocated as superior to p-values for assessing statistical evidence in data. Despite the advantages of Bayes factors and the drawbacks of p-values, inference by p-values is still nearly ubiquitous. One impediment to the adoption of Bayes factors is a lack of practical development, particularly a lack of ready-to-use form...
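
One widely cited ready-to-use form is the JZS Bayes factor for the one-sample t test (Rouder et al., 2009). A minimal numerical-integration sketch, assuming the default Cauchy prior on effect size with scale r = 1:

```python
import math
from scipy.integrate import quad

def jzs_bf10(t, n):
    """JZS Bayes factor for a one-sample t test (Rouder et al., 2009),
    Cauchy prior on effect size with scale r = 1."""
    v = n - 1  # degrees of freedom

    def integrand(g):
        tail = math.exp(-1 / (2 * g))
        if tail == 0.0:
            return 0.0  # underflow region near g = 0 contributes nothing
        return ((1 + n * g) ** -0.5
                * (1 + t**2 / ((1 + n * g) * v)) ** (-(v + 1) / 2)
                * (2 * math.pi) ** -0.5 * g ** -1.5 * tail)

    numerator, _ = quad(integrand, 0, math.inf)
    denominator = (1 + t**2 / v) ** (-(v + 1) / 2)
    return numerator / denominator

# Evidence for the alternative grows with |t| at fixed sample size
print(round(jzs_bf10(2.5, 30), 2))
```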

Gelman and Shalizi (2012) criticize what they call the 'usual story' in Bayesian statistics: that the distribution over hypotheses or models is the sole means of statistical inference, thus excluding model checking and revision, and that inference is inductivist rather than deductivist. They present an alternative hypothetico-deductive approach to...

It is known that visual working memory capacity is limited, but the nature of this limit remains a subject of controversy. Increasingly, two factors are thought to limit visual memory: an object-based limit associated with so-called "slots" models, and an information-based limit associated with resource models. Recently, Barton, Ester, and Awh (200...

One of the most important methodological problems in psychological research is assessing the reasonableness of null models, which typically constrain a parameter to a specific value such as zero. Bayes factor has been recently advocated in the statistical and psychological literature as a principled means of measuring the evidence in data for vario...

Psychological theories are statements of constraint. The role of hypothesis testing in psychology is to test whether specific theoretical constraints hold in data. Bayesian statistics is well suited to the task of finding supporting evidence for constraint, because it allows for comparing evidence for 2 hypotheses against each another. One issue in...

The change detection paradigm has become an important tool for researchers studying working memory. Change detection is especially useful for studying visual working memory, because recall paradigms are difficult to employ in the visual modality. Pashler (Perception & Psychophysics, 44, 369-378, 1988) and Cowan (Behavioral and Brain Sciences, 24, 8...

In recent years, statisticians and psychologists have provided the critique that p-values do not capture the evidence afforded by data and are, consequently, ill suited for analysis in scientific endeavors. The issue is particularly salient in the assessment of the recent evidence provided for ESP by Bem (2011) in the mainstream Journal of Personalit...

Working memory is the memory system that allows for conscious storage and manipulation of information. The capacity of working memory is extremely limited. Measurements of this limit, and what affects it, are critical to understanding working memory. Cowan (2001) and Pashler (1988) suggested applying multinomial tree models to data from change dete...

Prominent roles for general attention resources are posited in many models of working memory, but the manner in which these can be allocated differs between models or is not sufficiently specified. We varied the payoffs for correct responses in two temporally-overlapping recognition tasks, a visual array comparison task and a tone sequence comparis...

Although the measurement of working memory capacity is crucial to understanding working memory and its interaction with other cognitive faculties, there are inconsistencies in the literature on how to measure capacity. We address the measurement in the change detection paradigm, popularized by Luck and Vogel (Nature, 390, 279-281, 1997). Two measur...
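
The two measures under discussion are commonly written as Pashler's k (whole-display testing) and Cowan's k (single-probe testing). A minimal sketch with hypothetical hit and false-alarm rates:

```python
def cowan_k(hits, false_alarms, set_size):
    """Cowan's k for single-probe change detection: k = N * (H - F)."""
    return set_size * (hits - false_alarms)

def pashler_k(hits, false_alarms, set_size):
    """Pashler's k for whole-display change detection:
    k = N * (H - F) / (1 - F)."""
    return set_size * (hits - false_alarms) / (1 - false_alarms)

# Hypothetical hit and false-alarm rates at set size 8
h, f, n = 0.80, 0.20, 8
print(round(cowan_k(h, f, n), 2), round(pashler_k(h, f, n), 2))
```

The two formulas correct for guessing in different ways because the testing procedures afford different guessing strategies, which is why applying the wrong one biases capacity estimates.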

Stroop and Simon tasks are logically similar and are often used to investigate cognitive control and inhibition processes. We compare the distributional properties of Stroop and Simon effects with delta plots and find different although stable patterns. Stroop effects across a variety of conditions are smallest for fast responses and increase as re...