
Nailing Jelly: The Replication Problem Seems to Be Unsurmountable. Two Failed Replications of the Matrix Experiment


Abstract and Figures

We have reported previously on positive effects found in the matrix experiment. This is a setup where a random event generator (REG) drives a display, which participants are instructed to “influence” at will, i.e., in a psychokinesis (PK) setup. The difference of this matrix experiment from standard micro-PK REG experiments was that the deviation from randomness was not measured; instead, a large array of 2025 correlations between the behavior of the participant and the behavior of the REG was tested. This previous experiment was significant, and we devised a consensus protocol, which was deposited before commencement, according to which we conducted two independent replications with the same experimental setup and equipment. In the first experiment 64 participants conducted the experiment in one location under the experimental guidance of KK; in the second experiment 40 participants conducted the experiment in another location under the experimental guidance of HV. The analysis used a non-parametric randomization test with 10,000 iterations. Neither of the two experiments was significant. While in the first experiment a very small but non-significant effect was found, in the second experiment no effect whatsoever was detectable. Sensitivity analyses did not suggest that the effect was in fact there but overlooked by our analysis. We discuss the findings in the context of the larger debate around replicability of parapsychological (PSI) research results and our theoretical model. This model starts from the assumption that such PSI effects are likely effects of a generalized form of entanglement correlations, and a consequence of this model is that such effects must not be used for the transfer of signals. Classical experiments, however, are detectors or extractors of signals in or from systems. This seems to be prohibited. Thus, the replication problem and this failed replication are likely part of the systematic nature of these effects. This makes it unlikely that experimental research alone will be successful in the long run in demonstrating PSI effects. Our conclusion is that the matrix experiment is not a replicable paradigm in PSI research.
H W
Poznan Medical University, Department of Pediatric Gastroenterology, Poznan, Poland;
University Witten/Herdecke, Department of Psychology, Witten, Germany;
Change Health Science Institute, Berlin, Germany, +49-30-46906958
K A. K
Technical University Chemnitz, Department of Psychology, Chemnitz, Germany
P S
Technical University Chemnitz, Department of Psychology, Chemnitz, Germany
H V
Change Health Science Institute, Berlin, Germany
T H
Medical University Regensburg, Department of Psychosomatic Medicine, Regensburg, Germany
Walter von Lucadou
Parapsychology Counselling, Freiburg, Germany
Submitted December 17, 2020; Accepted September 12, 2021; Published December 30, 2021
Creative Commons License CC-BY-NC
Journal of Scientific Exploration, Vol. 35, No. 4, pp. 788–828, 2021 0892-3310/21
Abstract: We have reported previously on positive effects found in the
matrix experiment (Walach et al., 2020). This is a setup where a random
event generator (REG) drives a display, which participants are instructed
to “influence” at will, i.e., in a psychokinesis (PK) setup. The difference of
this matrix experiment from standard micro-PK REG experiments was
that instead of the deviation from randomness, a large array of 2025
correlations between the behavior of the participant and the behavior of
the REG was tested. This previous experiment was significant, and we
devised a consensus protocol, which was deposited before commencement,
according to which we conducted two independent replications
with the same experimental setup and equipment. In the first experiment
64 participants conducted the experiment in one location under the
experimental guidance of KK (power = 0.88), in the second experiment 40
participants conducted the experiment in another location under the
experimental guidance of HV (power = 0.69). The analysis used a
non-parametric randomization test with 10,000 iterations. Neither of the two
experiments was significant. While in the first experiment a very small,
but non-significant effect was found, in the second experiment no effect
was detectable. We discuss the findings in the context of the larger
debate around replicability of parapsychological (PSI) research results and
our theoretical model. The replication problem and this failed
replication are likely part of the systematic nature of such effects. This makes it
unlikely that experimental research alone will be successful in the long
run in demonstrating PSI effects. Our conclusion is that the matrix
experiment in and of itself is not a replicable paradigm in PSI research.
Parapsychologists were probably among the first social science
researchers to understand the necessity to publish all negative findings
and to insist on replications. Replications come in various forms:
Identical replications are rather rare and refer to the replication of the
very same experiment, including all materials, methods, procedures,
and statistical analyses except that either the same research group
or other researchers run a set of different subjects. Mostly, however,
replications also vary some kind of procedure, thereby making
replications conceptual (Schmidt, 2016). We established previously an
experimental paradigm which we hoped would lend itself to replication
(Walach et al., 2020). We report here on two replication experiments
that failed to replicate the previous result.
Parapsychology and Replication
The idea of replicating experiments is, of course, to exclude
accidental findings, false positives, and small statistical fluctuations
being identified as systematic effects. Physicists have long been taken
790 Harald Walach et al.
as examples of doing natural science rigorously, and they demand at
least 5 sigma, i.e., standard deviations of the standard normal curve with
the associated probabilities, as confirmation from various experiments before
they accept an effect as veridical (Abbot & LIGO Scientific Collaboration
and Virgo Collaboration, 2016; Grote, 2018; Horton, 2015). They usually
achieve this by accumulating data through multiple observations or
through multi-lab experiments, as in the LIGO experiments that
detected gravitational waves. In such experiments, the very same
experimental setup, including all analysis pipelines and data-collection
procedures, is standardized and logged. The idea behind this procedure is that
experiments are detectors for stable and local signals that may be weak
but can eventually be separated from noise. Local signals travel at, or
slower than, the speed of light and appear regular, i.e., exhibit lawful
behavior. Either the laws are already known and predict certain
signals that are very weak or very rare, as in the case of gravitational
waves, which were long predicted by the standard model of cosmology;
or researchers surmise that
unknown laws underlie hitherto undetected local signals. This is what
some researchers in the parapsychology research community assume
(Carr, 2015a; May et al., 1996; May et al., 2018; Radin, 2018).
In parapsychology we are looking for effects whose nature we do
not understand, because there is no accepted theory in the first place.
Some theories with testable consequences have been proposed, such
as “decision augmentation theory” (DAT), which supposes that all PSI
effects are precognitive effects, where anomalous cognition of future
events is used to augment decision making (May et al., 1995, 1996,
2000). Apart from the fact that this model has problems explaining
macro-PK, haunting, and poltergeist phenomena, it also has some
empirical evidence against it (Dobyns & Nelson, 1998). Observational
theories use some form of argument from a von Neumann–Wigner
type of interpretation of quantum physics, in which human
consciousness is central in collapsing the wave function (Houtkooper,
2002; Walker, 1975, 1979, 2011 [1974]). In this family of theories, it is
the joint observational effect of those looking at the data that actually
produces the effect. While this might be a theoretical option, it depends
on the interpretation of the measurement process and the acceptance
of a dualistic model, neither of which is currently universally accepted
(Bierman, 2010). Finally, a model very similar to ours is the Consciousness
Induced Restoration of Time Symmetry (CIRTS) model (Bierman, 2010).
This starts from the assumption that most physical theories, with the
exception of thermodynamics and special relativity, are time symmetric,
and that the brain sustaining consciousness might be a system that
restores time symmetry through providing time-negative effects as in
precognition. The CIRTS model assumes that there is a kind of signal
or informational element in PSI effects that is, however, bounded by
physical theory. The model which we favor (see below) is also derived
from physical theory and hence a potential scientific candidate for PSI
modeling, but it assumes more strongly than all other models that signal
coding is strictly prohibited.
Are PSI eects due to causal and local signals, i.e., obeying the
special theory of relativity? Are the signals of a known type, i.e., belonging
to the four types of exchange particles of the four basic known forces
in the universe, for instance are we looking for photons as exchange
particles of the electromagnetic force, or for other particles (Penrose,
2004), or are we looking for completely dierent, yet nevertheless
local causes and signals of a physical nature? Or are we even looking
for completely dierent types of signals that cannot be encompassed
within the standard model of physics and hence would, if discovered
and proven as stable and replicable, entail a widening of our worldview
similar to that produced by the advent of quantum mechanics? Some
physical concepts that use higher dimensional models of space and
time than relativity theory and quantum theory would suggest this
(Carr, 2015a, 2015b; Heim, 1989).
The parapsychological database is jagged so far. While we do
have many extremely intriguing phenomena on a phenomenal level
(Braude, 1986, 2017; Grosso, 2016), strong and well-documented
cases, and highly significant meta-analyses summarizing research
fields or experimental paradigms across researchers, variations, and
time (Cardeña, 2018; May & Marwaha, 2018, 2019a, 2019b), critics are
also correct in pointing out that it is not possible to name one single
parapsychological experiment as foolproof and robust under experimental
replication (Alcock, 2003; Reber & Alcock, 2020). It is also true that we
have a replication crisis in psychology in general, i.e., the inability to
externally replicate experiments that were thought to be proven (Open
Science Collaboration, 2015; Schooler, 2011). This replication crisis
aects all sciences, according to a survey where 90% of polled scientists
say that there is a problem with replicability (Munafò et al., 2017). It
is notorious in medicine despite the fact that medical interventions
are widely used and believed in (Horton, 2015; Ioannidis, 2005). So
why bother about the lack of replicability in parapsychology? Perhaps
parapsychology is even more replicable than standard science, but only
more controversial and hence less accepted (Radin, 2018; Schwartz et
al., 2018)?
Assumptions Behind Replicability, Synchronicity, Generalized
Quantum Theory, and Generalized Entanglement
Replicability, we observed above, makes the implicit assumption
that we are dealing with a) local, causal signals that are b) regular,
following some lawful rule, and c) therefore always available for
experimental control and manipulation. In the final consequence they
would be amenable to human engineering, once the rules and the
lawful behavior are fully discovered. We started our research from the
assumption that the lack of replicability is part of the systematic nature
of parapsychological effects. In other words, we assumed that the effects
of parapsychology might be lawful, but not of a local-causal nature. This
sounds like a contradiction, but it is not. The focus of science has so far
been mainly on local-causal signals, because once they are discovered,
they can be put to use: We have used electricity since we discovered the
nature of the electromagnetic force. We can use the gravitational force,
e.g., by sending satellites into orbit. We even made use of the strong
nuclear force that holds atomic nuclei together, when we started to engineer
nuclear fission. We make use of the knowledge of the weak force in
isotope calculations, Geiger counters, and the like. So, what would be a
lawful, yet not causal-local event?
In the realm of psychology, Carl Gustav Jung and Wolfgang
Pauli, the physicist, discussed exactly such phenomena, i.e., lawful
yet not causal relationships, under the umbrella term “synchronicity”
(Atmanspacher & Primas, 2006; Atmanspacher et al., 1995a; Jung, 1952;
Mansfield, 1995; Peat, 1992). These are material events in the material
world that in their occurrence appear to be without cause, i.e., they
happen “accidentally” or “randomly”, yet they have a correlation
with the psychological state of a person who relates to those events.
Someone calling on the phone while a person is desperately in need
of this contact would be a typical example. This phenomenon has also
given rise to a series of experiments (Sheldrake & Smart, 2003) which,
in our view, demonstrate that this phenomenon actually exists but is
not of a causal nature (Schmidt et al., 2004a), exactly as Jung and Pauli
would have postulated.
Jung and Pauli wanted synchronicity to be seen as a type of
lawful relationship that is complementary to causal relationships. They
codied this in their famous quaternity (Jung in a letter to Pauli on
November 30, 1950, in Meier, 1992, p. 64; Meier, 2001), depicted in
Figure 1. This also implies that synchronistic relationships that are due
to psychic meaning-making or constellation of an archetype, as Jung
called it, and are in a way part of a “deeper” structure of reality than
local causes. For synchronistic, correlational relationships are part of
this primordial level of “indestructible” energy. Similar to this ontic
level of “indestructible energy”, that some physicists call the endo-
physical level of unbroken unity (Atmanspacher et al., 1995b; Primas,
1994a, 1994b), there are also relationships that pertain to this level and
might be put to use (Lucadou, 2019). They might be lawful, but they are
not causal in nature. The causality principle only operates on the level of
the space–time continuum, or on the level of exo-physics, where clear
delineations and determinations can be made, because the original
unity is broken into measured and measurable parts.
Figure 1. The quaternity Jung suggested to Pauli: Local causes are complementary to
correlational relationships or synchronicity in a similar way as the
space–time continuum is complementary to indestructible energy.
In that sense, attempts to uncover a purported causality in this
realm are futile, simply because there are no local causes operative here,
but only formal and final causes, to speak in Aristotelian terminology.
Another way of putting this is that there are likely only correlations
of a lawful but not causal nature. Indeed, there are examples in the
physical world for such a type of relationship as well, namely quantum
entanglement correlations (Atmanspacher et al., 2002; d’Espagnat,
1997; Schrödinger, 1935; Shimony, 1989; Stillfried, 2010). These are
quite lawful, but not causal in the sense that the lawfulness of these
correlations is not mediated by any exchange particles of force or
energy. This is something that Einstein had already observed and this is
the reason why he called them “spooky actions at a distance” (Einstein
et al., 1935). They remained purely hypothetical for a long time, derivable
from the formalism, but no one knew whether they were “real”. This
dispute was settled after John Bell derived his famous inequalities as a
boundary condition for joint probabilities that are mutually exclusive,
obeying locality conditions (Bell, 1987). This inequality gave rise to an
operationalization and an experimental test which finally clarified the
issue (Aspect et al., 1982a; Aspect et al., 1982b). There are indeed non-local
correlations, i.e., lawful, yet not causally mediated regularities in
Physical quantum correlations have been empirically documented
as factual beyond any reasonable doubt (Handsteiner et al., 2017; Ma
et al., 2012; Salart et al., 2008; Stefanov et al., 2002). Meanwhile, they
are the basis for various new fields of research and application, from
quantum computing to quantum encryption. And some think they
are the basis for our neural operations as well (Hameroff & Penrose,
2014). Therefore, it might be reasonable to assume that such lawful,
yet a-causal and non-local relationships could also play a role in the
wider area of human affairs or in macroscopic nature. However, this
would necessitate that either physical entanglement correlations that
are normally only detectable under highly controlled and artificial
conditions can also be preserved to some degree in the macroscopic
environment; or that there is an equivalent to physical entanglement
correlations that are exactly those meaningful correlations Jung spoke
of, but not necessarily of a physical nature. Such correlations might
be, for instance, systemic, i.e., pertaining to the general setup of a
system of dierent physical constituents, and not only strictly physical
in nature (Atmanspacher et al., 2002; Lucadou, 1995, 2015b). This is the
path some of us have chosen, in assuming that there is a generalized
form of entanglement that is operative in various types of systems,
provided they have a certain structure (Filk & Römer, 2011; Walach &
von Stillfried, 2011a, 2011b). We assume that parapsychological eects
are due to such correlations, lawful, yet not causal, regular, yet not local
(Lucadou, 2015b; Walach et al., 2014).
The No-Signal-Transfer (NT) Axiom and the Development of the
Matrix Experiment
A corollary of this assumption is that if such correlations are
mistaken for causal-local regularities and could potentially be used
as such, they will either change channel, i.e., show up in the control
condition, or reverse sign, become significantly weaker,
or appear in different parameters. The reason
is that physical entanglement correlations cannot
be used as causal signals, and this can be formally proven (Lucadou et
al., 2007). We therefore assume that this no-signal-transfer axiom
(NT-axiom) also holds in the generalized case, although here it cannot be
proven and remains an assumption.
This NT-axiom states:
If a system is governed by non-local correlations but is treated as if the
correlations were local causes, and if a signal is extracted from it, or
could be extracted in principle, then those purported signals will break
down in a second experiment, or when so used.
This means that repeated experiments on systems constituted by such
non-local correlations violate the
NT-axiom and are likely to demonstrate a breakdown of such effects,
either by a dwindling of the effect size or by demonstrating paradoxical
effects, such as having the effect show up in the control group, or
changing its sign. It also means that this applies only to replications of
experiments. We will return to this problem in the Discussion.
Indeed, there is a series of empirical hints that testify to the
ubiquity of this phenomenon in parapsychology. A recent example is a
commissioned identical replication of a previously reported experiment
in which mental effort of trained meditators was supposed to affect an
interference pattern in a standard double-slit optical setup (Radin et al.,
2012). The strictly preregistered and controlled study came out negative
(Walleczek & von Stillfried, 2019), although an effect can be seen in
a completely different channel, in the variance. The same was found
in the multisite replication of the PEAR Lab’s micro-PK experiment
(Jahn & Dunne, 1987). Although a case can be made that the PEAR
Lab’s database was largely due to the effect of some gifted subjects,
the consortium replication between Princeton, Freiburg, and Giessen
was predefined as a large replication study of the PEAR Lab procedure
in a protocol, in which Walter von Lucadou also predicted the negative
The replication was negative (Jahn et al., 2000), but secondary
parameters that were not logged in the protocol, variance and non-linearity
parameters, were clearly significant (Pallikari, 2001). Maier and
colleagues had the same experience in a series of PK- and priming
experiments (Dechamps & Maier, 2020; Maier et al., 2014; Maier &
Dechamps, 2018; Maier et al., 2018). This has also been observed in
other datasets from parapsychology (Bierman, 2000).
We started from the assumption that, if this lack of causal stability
is to be expected, there might be a workaround by testing for some kind
of indirect parameter that would prohibit the coding of a causal signal.
Standard experiments that use a control group are, by default, cause
detectors and thereby allow the coding of a signal. For instance, they
yield a result which in a strict replication experiment could be used to
code a signal: Scores above or below the mean of the first experiment
would be the signal to code a 1 or a 0. This, however, would violate the
NT-axiom and hence would cause the effect to vanish or change track,
if our model is correct. Therefore, we sought an experimental setup
that would be as immune as possible to such potential violation.
One such setup was developed by Walter von Lucadou in what
he called the “Matrix Experiment”, meanwhile referred to as the
“Correlation Matrix Method – CMM” (Lucadou, 1974, 1986, 1987a,
1987b, 1991, 2006, 2015a; Lucadou et al., 1987). The idea here is not to
dene a clear outcome parameter which would be prone to violating
the NT-axiom, as it would allow for signal coding, but to use an array
of variables in a correlation matrix. The correlation matrix reflects the
correlation of the interaction of human intentions or human behavior
with a physical system that is otherwise locally decoupled. In our case
the physical system was a micro-psychokinesis (micro-PK) experiment.
A computer displayed a fractal, a Julia set, whose change (growth or
shrinkage) was driven by a random event generator (REG). That means
the behavior of the fractal could not have been influenced by ordinary
means of interaction. However, human participants were instructed
to do so by their intentionality and were asked to move the sampling
process of the random event generator forward by pressing either
one of two keys on the keyboard of the laptop computer that ran the
experiment. These keystrokes represent the psychological or behavioral
variables, while the behavior of the physical system represents
the physical variables. These variables can be correlated across all
experiments and all participants and yield a correlation matrix. If the
correlation matrix contains a signature of the intentional effect or the
entanglement effect of participants with the experiment or physical
system, then we would expect more significant correlations than by
statistical chance expectation or in a control experiment that is run
without a participant present.
Indeed, von Lucadou’s previous experiments were supportive of
this idea and produced more significant correlations than expected
by chance and more than seen in a control matrix. Thus, we set out
to replicate this experimental setup with a larger, well-controlled
experiment. We rebuilt the hardware and software (the random
event generator and the control software) from scratch and enlarged
the matrix into a matrix of 45 psychological and 45 physical variables
(because there were 5 such variables per run and 9 runs made up an
experiment), yielding a matrix of 2025 cells. We created a robust
non-parametric system of statistical evaluation by simulating 10,000 such
experiments and deriving the statistical significance from it. This first
large replication in two labs yielded a significant but fragile effect, as
significance broke down under reasonably improved methods of analysis
(Walach et al., 2020).
We then convened an international consortium of experts to arrive
at a consensus protocol. This protocol followed our original one quite
closely with a few exceptions (see Methods section below) and formed
the basis for future replications. One such replication was conducted
by Karolina Kirmse as part of her master’s thesis under the supervision
of Peter Sedlmeier. Another replication was conducted by Hans Vogt
and Harald Walach. Both replications came out negative. We report on
these replications in this paper and will end with a few ideas about
potential ways forward and why we think it will be a difficult challenge
to prove anomalistic effects using experimental models
(Rabeyron, 2020).
We used a predened protocol that was the result of a consensus
meeting of experts. The studies reported here are in fact replications
of the parent study (Walach et al., 2020). The protocol was dened
and published beforehand on the Open Science Framework platform
( Since it is described there in detail, we will
only summarize the most important elements here. The experiment
is a comparatively strict replication of the parent study, as the same
equipment, the same material, and the same procedures were used
with only a few exceptions that are described below. The criterion for
a successful replication was a significant result as determined by a
statistical randomization test (see below). Since we do not assume a
stable, causal effect, a standard power analysis is not part of the protocol
but can only be provided as a post-hoc analysis.
Material and Participants
We used the same equipment as in the parent experiment. KK was
lent one of the four REGs that were used for the first experiment and
received a copy of the software program that operated the experiment.
This software program was custom written in C following the first code,
which was programmed in BASIC. This program operated the experiment
automatically, prompted the experimenter and the participants for
inputs, and wrote the data into a file.
The rst replication was conducted by KK in Dresden with a broad
group of participants recruited mainly in public spaces, the second by
HV in Witten with a group of students of psychology, gaining course
credits through their participation. The experiment was advertised as an
experiment in extraordinary facilities and was conducted face to face,
one aer the other. Before the experiment started, the experimenter
switched on the computer and the equipment with a lead time of half
an hour to allow for dri and warming up.
The experimenter greeted the participant, briefly explained
the experiment, and handed out a consent form as well as a very
short questionnaire. The questionnaire data were deliberately not
used, as in the previous experiment, but had the function of involving
the participant with the experiment. In Experiment 1, however, this
questionnaire was extended and evaluated for exploratory purposes
(see below). When the participant was ready, the experimenter started
the program and left them alone. The participant could take as long as
necessary. They had the instruction to “influence the movement of the
fractal on the screen” in the indicated direction and knew that they had
to press either of two shift keys on the computer keyboard to move the
sampling process forward. Each time either one of these shift keys was
pressed, the REG was sampled and the result was used to generate a
movement of the fractal displayed on the screen. The sampling process
was filtered by a Markov chain instead of the frequently used XOR filter.
This was done for two reasons. First, Markov-chain filtering makes a
process smoother and more natural looking. Most natural processes, like
the weather, are Markov processes, i.e., they contain one or two lags of
memory. Second, the Markov process preserves some of the physical
properties of the REG. Perceptually, this resulted in the appearance
of a very smooth movement of the fractal. What the participant did
not know was that when both shift keys were pressed, the sampling
process would go on until one or both of the keys were released. The
sampling process was repeated 80 times, since each sub-run consisted
of 80 such trials, and 3 sub-runs with three different instructions
made up one experiment. Each run was associated with a specific
instruction to either grow or shrink the movement of the fractal or
keep it constant. These instructions were conveyed by red arrows on
the screen, and each instruction was repeated 3 times at random. Thus,
a full experiment consisted of 3*3 sub-runs with 80 trials or 720 data
points. In contrast to the parent experiment, each participant conducted
only one experiment.
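The session structure just described can be sketched in a few lines of Python. This is only an illustration and not the authors' C program: the names `INSTRUCTIONS` and `make_schedule` are hypothetical, and realizing "repeated 3 times at random" as a full shuffle of the nine sub-runs is an assumption.

```python
import random

# Hypothetical labels for the three instructions shown by the red arrows.
INSTRUCTIONS = ("grow", "shrink", "constant")
REPEATS_PER_INSTRUCTION = 3
TRIALS_PER_SUBRUN = 80


def make_schedule(rng):
    """Return the nine sub-run instructions in random order.

    Assumption: the order is a plain shuffle of three copies of each
    instruction; the text does not specify the randomization scheme.
    """
    schedule = [ins for ins in INSTRUCTIONS for _ in range(REPEATS_PER_INSTRUCTION)]
    rng.shuffle(schedule)
    return schedule


schedule = make_schedule(random.Random(0))
total_trials = len(schedule) * TRIALS_PER_SUBRUN  # 3 * 3 * 80 data points
```

With these constants the schedule holds nine sub-runs and 720 trials per participant, matching the figures given in the text.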
Outcome Variables
For creating the 45 x 45 correlation matrix, five behavioral-psychological
and five physical variables were generated.
The five behavioral-psychological variables were generated by the
behavior of the participants and defined as follows:
T1: Number of left key presses
T2: Number of right key presses
T3: Number of double key presses
DR: Mean time between key presses, i.e., speed
DV: Mean variance between key presses, i.e., constancy of
The ve physical variables were associated with the behavior of the
random event generator (REG) and derived from the following values:
TR: Number of times the output of the Markov-chain parsing of
the REG yielded “1” during one run, i.e., the physical behavior
of the REG ltered by the Markov chai
DT: The number of steps the fractal display deviated from the
experimental instruction in either direction or from the
central position, i.e., this is the summarized number of steps
the fractal deviated from the goal
deviation of the actual physical output of the Markov chain
from an ideal Markov chain, measured as the deviation of the
theoretical autocorrelation function from the experimental
autocorrelation function of the sub-run calculated over 10 steps
ZT: mean voltage output of the REG at channel 4 out of eight;
this channel was dened a priori as the one where the voltage
would be recorded, because it was the middle channel and
hence least likely to be aected by currency changes due
to physical switching processes; the other channels were
measured but the data not checked and analyzed
ZV: the standard deviation of this voltage output at channel 4, i.e.,
of the variable ZT.
There was continuous voltage applied to the Zener diode which
triggered a current. This randomly changing current was converted
by an analogue–digital converter. Each time a key press was enacted the
converter was sampled. If the number of bits was smaller than the
previous one the outcome was 0, if it was larger the outcome was 1,
and if it was equal a new sampling was initiated.
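The comparison rule for turning converter readings into bits can be illustrated as follows. This is a minimal sketch under stated assumptions: `next_bit` and `read_adc` are hypothetical names, the reading that decides a trial is assumed to become the reference for the next trial, and the subsequent Markov-chain filtering is not modeled.

```python
from typing import Callable, Tuple


def next_bit(read_adc: Callable[[], int], previous: int) -> Tuple[int, int]:
    """Return (bit, reading) for one trial.

    A reading smaller than the previous one yields 0, a larger one yields 1,
    and an equal reading triggers a new sample, as described in the text.
    Assumption: the returned reading serves as `previous` for the next trial.
    """
    while True:
        current = read_adc()
        if current < previous:
            return 0, current
        if current > previous:
            return 1, current
        # Equal readings decide nothing, so the converter is sampled again.
```

Passing the converter as a callable keeps the rule testable without hardware; any function returning an integer reading can stand in for it.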
The 5 variables were calculated for each run per participant. As each
participant had nine runs, this yielded 45 behavioral-psychological and 45
physical variables. These variables were correlated across all participants,
which together produced the 45*45 matrix with 2025 cells. These cells were filled with the Spearman rank correlation coefficients between the respective variables across participants. We counted the number of correlations significant at the predetermined level of p < .1 (one-sided, or .05 two-sided). This level is arbitrary and followed previous practice and our protocol. We also report sensitivity analyses for correlations significant at lower p-values. The idea behind the testing procedure is as follows: In each correlation matrix a certain number of correlations is significant at a given level by chance. For instance, in a matrix of 100 cells, 5 correlations would be expected to be significant at the level p = .05, or 10 at the level p = .1. Similarly, in a matrix of 2025 cells we would expect 202 to 203 correlations to be significant at the level p = .1. Therefore, we counted the number of correlations significant at the level p ≤ .1 and tested (see below) whether this empirically found number of significant correlations was significantly different from a chance finding, using a randomization test, or different from the number of correlations found in a control experiment.
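The counting step can be illustrated with a minimal, standard-library Python sketch. It is not the Matlab script referred to in the Appendix. The function names are hypothetical, and the per-cell p-value uses a Fisher z approximation for Spearman's rho with a two-sided test; both the approximation and the sidedness are assumptions made only for this illustration.

```python
import math
from statistics import NormalDist
from typing import List, Sequence


def ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            result[order[k]] = mean_rank
        start = end + 1
    return result


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rho computed as the Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant variable has no defined correlation
    return sxy / math.sqrt(sxx * syy)


def p_two_sided(rho: float, n: int) -> float:
    """Approximate two-sided p-value via the Fisher z transform (assumption)."""
    if abs(rho) >= 1.0:
        return 0.0
    z = math.atanh(rho) * math.sqrt((n - 3) / 1.06)
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def count_significant(psych, phys, alpha: float = 0.1) -> int:
    """Count cells of the psych-by-phys matrix with p <= alpha.

    `psych` and `phys` are lists of variables; each variable is a list
    holding one value per participant (more than three participants).
    """
    n = len(psych[0])
    return sum(
        1
        for a in psych
        for b in phys
        if p_two_sided(spearman(a, b), n) <= alpha
    )


# Chance expectation for the full matrix: 2025 cells times alpha.
expected_by_chance = 45 * 45 * 0.1
```

With 45 psychological and 45 physical variables, `count_significant` would loop over all 2025 cells; the resulting count is the quantity that is then compared with the chance expectation of about 202.5.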
Control Experiments
Aer each participant had nished his or her experiment, the
experimenter started a control experiment and then le the room.
The control experiment consisted of the physical equipment running
empty. This resulted in the generation and recording of the physical
variables (TR, DT, KR, ZT, ZV as described above) without interference
or interaction from a participant, sampling as many data points as
during a real experiment. The generated array of physical variables was
automatically written into a database, and the psychological variables
of the previous experiment copied into the control database as
corresponding psychological variables. Thus, each real experiment was
matched by a control experiment with the same set of psychological
variables, whereby all potential causal and non-causal effects were
transferred into the control database and correlated with a new set of
independent physical variables.
Special Features of the Two Experiments
Experiment 1, conducted by KK in Dresden, had the following additional features: Instead of performing just one control experiment at the end of each session, a second control experiment was carried out at the beginning. In this way, the scope of the comparison was expanded. Furthermore, the questionnaire used in previous matrix experiments was modified by adding state variables identified as particularly psi-promoting (see Braud, 2002) and discussed by the matrix experiment consortium against the background of Organizational Closure. These variables served as the data basis for an additional analysis in which the psychological variables (keystrokes) were exchanged for the questionnaire data, using the “Phenomenology of Consciousness Inventory” (Pekala, 1995), and for explorative analyses examining which variables, identified as favoring psi effects, influence the number of significant correlations. The questionnaire was implemented in an online format on the computer where the experiment took place. In addition, the questionnaire was administered again after the experiment in order to allow a comparison of the participants’ states before and after the experiment. The experimenter was blinded to the responses during the experiment; the answers were inspected only after the experiment was completed.
Experiment 2, conducted by HV in Witten, used two additional features: A switch was implemented that allowed the system to choose between two types of REGs. One was the custom-made REG, identical to the one used by KK in the first experiment. The second was an off-the-shelf REG called TrueRNG, which can be easily purchased and connected via a USB stick. The idea was to see whether our elaborate sampling process would really be better or whether we might be able to offer a simpler system for wider usage. A coin toss at the beginning of the experiment decided which REG would operate the experiment. The second feature was an assessment of absorption (Glicksohn, 2001; Glicksohn et al., 1992; Watt & Tierney, 2013). It consists of measuring the objective time the experiment takes and then asking participants to estimate the time they took to conduct the experiment. The difference can serve as a measure of absorption, as more deeply absorbed
participants tend to underestimate the time (Sedlmeier et al., 2020).
Ethical clearances were given by the respective ethical boards.
Statistical Analysis and Data Preparation
Data analysis followed the predened protocol and consisted in a
randomization test as specied. Briey, an analysis script was written
in Matlab to reshue the data 10,000 times and to recalculate the
correlation coecients each time (see Appendix). For every permutation
step, the number of signicant correlations in the matrix was counted.
The number of times out of those 10,000 permutations where an
equal or larger number of correlations was found than in the empirical
matrix, divided by 10,000, yields an estimate of the true probability that
the empirical result or a more extreme one could have been found by chance.
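The logic of this randomization test can be sketched in Python as follows. This is an illustration under stated assumptions, not the Matlab script from the Appendix: the permutation scheme (shuffling which participant's physical variables are paired with which participant's psychological variables) is assumed, and `count_fn` is a hypothetical callable that returns the number of significant correlations for a given pairing. The names `z0`, `n_sim`, and the returned p-value mirror the quantities `z0`, `n_sim`, and `p_sim` reported in the tables.

```python
import random
from typing import Callable, List, Sequence


def simulate_counts(
    psych_rows: Sequence,
    phys_rows: Sequence,
    count_fn: Callable[[Sequence, Sequence], int],
    n_iter: int = 10_000,
    seed: int = 0,
) -> List[int]:
    """Recount significant correlations after each reshuffle.

    Each row holds one participant's variables. Shuffling the physical
    rows breaks the pairing between a participant's psychological and
    physical data (an assumed permutation scheme).
    """
    rng = random.Random(seed)
    shuffled = list(phys_rows)  # copy, so the caller's data stay in order
    counts = []
    for _ in range(n_iter):
        rng.shuffle(shuffled)
        counts.append(count_fn(psych_rows, shuffled))
    return counts


def permutation_p_value(z0: int, simulated: Sequence[int]) -> float:
    """Share of simulated matrices with at least as many hits as observed."""
    n_sim = sum(1 for count in simulated if count >= z0)
    return n_sim / len(simulated)
```

Because the null distribution is built from the data themselves, no assumption about the independence of the 2025 cells is needed, which matters here since the cells share variables and are therefore correlated with one another.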
The experiment might be challenged as being open to systematic causal coding: for instance, if someone used a certain strategy such as hammering on the keyboard, or always alternating shift keys, there might be causal correlations between physical and psychological variables. Therefore, we defined a sensitivity analysis: We analyzed only those correlations that are found in the time-forward or upper part of the matrix. As the matrix unfolds 9 * 5 psychological variables in rows and 9 * 5 physical variables in columns, the correlation of the first set of physical variables with the second set of psychological variables is a time-forward correlation of physical variables in the first run with psychological variables in the second run, which should preclude all causality, as causality normally does not run backwards in time.
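One way to picture the time-forward part of the matrix is as the set of cells in which the physical variable stems from an earlier run than the psychological variable. The following Python sketch enumerates those cells. It assumes that the 45 variables on each axis are ordered run by run (five variables of run 1, then five of run 2, and so on) and that only strictly earlier physical runs count; the exact cell set used in the analysis script may differ.

```python
N_RUNS = 9  # runs per participant
N_VARS = 5  # variables per run


def run_of(index: int) -> int:
    """0-based run number of a variable, assuming run-by-run ordering."""
    return index // N_VARS


def time_forward_cells():
    """(psych_index, phys_index) pairs with the physical run before the psychological run."""
    size = N_RUNS * N_VARS
    return [
        (psy, phy)
        for psy in range(size)
        for phy in range(size)
        if run_of(phy) < run_of(psy)
    ]


cells = time_forward_cells()
```

Under these assumptions there are 36 ordered pairs of runs with 25 cells each, so the restricted analysis would cover 900 of the 2025 cells.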
In Experiment 2, the data from the TrueRNG were found not to conform to expected behavior (Appendix Figure 1 and Appendix Figure 2). Closer inspection revealed that this was due to a programming glitch newly introduced when programming the switch between the REGs, which led to a buffer overflow for the data coming from the TrueRNG. We normalized the data, and after normalization they conformed well to chance expectation (Appendix Figure 3 and Appendix Figure 4). The programming mistake was corrected for subsequent usage.
Experiment 1—Dresden Experiment by KK
Sixty-four participants were recruited, 43 females (67%) and 21
males (33%). Due to the layout of the questionnaire, which used the
original one by Walter von Lucadou, age was only available in categories.
The category of 41 to 50 years was the modal one with 17 participants.
Two participants were below 20 years, six were below 30, and 14 were
between 31 and 40. Sixteen participants were between 51 and 60, seven
were between 61 and 70, and one person was between 71 and 80.
The results of the statistical analysis of Experiment 1 can be seen
in Table 1. (Appendix Table 1 presents the data together with the results
of the control matrices.)
Table 1. Result of Statistical Analysis (Permutation Test with 10,000 Iterations) of Experimental Matrix, Full 45*45 Matrix, Experiment 1.
Yellow: Significant Results; Red: Missing Significance
sig_th: theoretical significance level at which the number of significant correlations is counted
z0: number of significant correlations empirically found at respective level
n_sim: number of simulated matrices out of 10,000 with significant correlations at or above the number found empirically
p_sim: actual significance level of observed number of correlations (n_sim/10,000)
z0_part: number of correlations in time-forward (upper) part of the matrix
n_part_sim: number of significant correlations found in 10,000 simulations at respective level in upper part of the matrix
p_part_sim: actual significance level of observed number of correlations (n_part_sim/10,000) in upper part of the matrix
The rst line of Table 1 presents the signicance level at which
the numbers of signicant correlations are counted, the second line
gives the empirically found number of signicant correlations at that
level. The number of simulated matrices with significant correlations at or above the number found empirically out of 10,000 simulations follows in the next line, and the p-level is given by this number divided by 10,000. The red color indicates which of those statistical tests did not reach formal significance, while the yellow color indicates significance. The lower part of Table 1 reports the same for the upper diagonal of the correlation matrix, which contains only time-forward correlations, i.e., the correlation of the physical variables in the first run with the psychological variables of the second run (abbreviated as “part” in Table 1). This contains the causally independent parts of the correlation matrix because they are time-forward.
Only for some of the levels of significance were there more significant correlations than expected by chance (remember that p ≤ .1 was the predefined level), namely for correlations at the levels of p ≤ .02, p ≤ .0005, and p ≤ .0001. The numbers of significant correlations at p ≤ .05 and p ≤ .01 miss formal significance by a small margin. The number of significant correlations at the predefined level of p ≤ .1 is not significant.
While in the original experiment (Walach et al., 2020) we found
significant correlations beyond chance even in the time-forward upper
part of the matrix, overall none could be determined in this case.
We also analyzed smaller matrices (27*45, 18*27) which correspond
to the setup of previous experiments by Walter von Lucadou and can
be considered as replications of the earlier experiments. None of them
showed any consistent and clear-cut results (Appendix Table 2 and
Appendix Table 3).
Experiment 2—Witten Experiment by HV
The experiment conducted by HV in Witten recruited 40 participants,
all of them students at the university and most of them
psychology students who received course credits. Thus, all of them
were between 18 and 30 years old. In that experiment we also measured
time and had participants estimate the time of the experiment. On
average, participants estimated the experiment as 0.4 minutes shorter
than it actually was, which is a sign of modest absorption or closure.
Twelve participants reckoned that the experiment took longer than it
actually took, and thus were likely not very involved. The result of the
statistical analysis is given in Table 2. Graphical representations of the
experimental matrices of Experiments 1 (KK) and 2 (HV), as well as one
of the control experiments (by KK) are presented in Figures 2, 3, and 4.
As can be seen, at none of the evaluated levels of significance do we find more significant correlations than expected by chance, neither in the full matrix nor in the partial one. The same is true for the smaller matrices (27*45, 18*27 matrix; data not shown). Because there was no systematic effect in the first place, further analyses of the efficacy of the two different REGs or the importance of organizational closure, measured as absorption, were no longer useful. The control matrices did not show a significant effect either.
Taken together, neither of the two experiments corroborates our original findings, and the replication must be considered failed.
Table 2. Result of Statistical Analysis (Permutation Test with 10,000 Iterations) of Experimental Matrix, Full 45*45 Matrix, Experiment 2.
Yellow: Significant Results; Red: Missing Significance
sig_th: theoretical significance level at which the number of significant correlations is counted
z0: number of significant correlations empirically found at respective level
n_sim: number of simulated matrices out of 10,000 with significant correlations at or above the number found empirically
p_sim: actual significance level of observed number of correlations (n_sim/10,000)
z0_part: number of correlations in time-forward (upper) part of the matrix
n_part_sim: number of significant correlations found in 10,000 simulations at respective level
Figure 2. Experimental Matrix of Experiment 1 (KK).
Figure 3. Experimental Matrix of Experiment 2 (HV).
Figure 4. Control Matrix of Experiment 1 (KK).
Our hope that we might be able to replicate our earlier positive finding and those reported by Walter von Lucadou (Lucadou, 1986, 1987b, 1991, 2000; Lucadou et al., 1987; Walach et al., 2020) was not borne out. The two experiments reported here were part of a concerted effort to find a replicable experimental model that would circumvent the NT axiom. This axiom prohibits signal coding for anomalous experiments which are supposed to operate on the basis of generalized entanglement correlations. Circumventing the NT axiom was not possible. By the same token, our negative results also preclude an anomalous signal. For had such an anomalous signal been there, we would have been able to see it, as it would have driven one of our variables (TR) that measures the deviation of the REG from randomness, and thus produced a series
of signicant correlations. This result has to be seen against poten-
tial weaknesses and against other results, partially positive and partially
A major weakness of our experiments is that they are comparatively small. So, one could argue that they did not have the necessary power. While our predecessor experiment had 503 participants, these new experiments had only 104 participants together. Using the effect size of our predecessor experiment, approximately r = .38, our smaller experiment had a power of 69% and the larger one a power of 88% to detect the effect. However, precisely because we assume that the effects are of a non-classical, non-signal-like nature, we contend that a classical power discussion is beside the point. A classical power analysis assumes that there is a stable effect that can be detected, given enough resources. We do not think that this is the case. This is the reason why power analysis is not part of our consortium protocol, which only defines recruitment procedures and precludes optional stopping. As we argue below, power is not the decisive issue, as there are various instances of strongly powered and well-prepared replications that were unsuccessful.
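For readers who wish to check the order of magnitude of these power figures, the following Python sketch uses the Fisher z approximation for a single correlation with a two-sided test at alpha = .05. The method behind the quoted values is not stated in the text, so the approximation, the alpha level, and the function name are assumptions made for illustration only.

```python
import math
from statistics import NormalDist


def power_for_correlation(r: float, n: int, alpha: float = 0.05) -> float:
    """Approximate power to detect a correlation r with n participants.

    Uses the Fisher z transform: atanh(r) is roughly normal with
    standard error 1 / sqrt(n - 3).
    """
    nd = NormalDist()
    z_crit = nd.inv_cdf(1.0 - alpha / 2.0)
    shift = math.atanh(r) * math.sqrt(n - 3)
    # Probability of landing beyond either critical value.
    return (1.0 - nd.cdf(z_crit - shift)) + nd.cdf(-z_crit - shift)


power_40 = power_for_correlation(0.38, 40)  # Experiment 2 (HV)
power_64 = power_for_correlation(0.38, 64)  # Experiment 1 (KK)
```

Under these assumptions the two values should land in the vicinity of the 69% and 88% quoted above; small differences would be expected if a different approximation or an exact method was used.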
This lack of success in replication is not a problem of personal
factors, as these experiments were conducted by two independent
groups following the same protocol and using the same equipment.
Rather, it feeds into a stream of similar results: Walleczek and von Stillfried (2019) were unable to replicate the Radin double-slit experiment in a careful replication in which Radin himself was involved, which was conducted according to a predefined protocol, and in which the data were analyzed according to previous standards. Rabeyron was unable to replicate Bem’s retro-priming results (Rabeyron, 2020). Maier and colleagues were unable to replicate earlier results and found higher-level regularities, i.e., an effect that moves in a kind of sinusoidal wave from positivity to negativity and potentially back again (Dechamps & Maier, 2020; Maier & Dechamps, 2018; Maier et al., 2018). The matrix experiment was repeated in a different form by Grote, who could not find clear-cut effects either (Grote, 2015, 2017). A newly designed experiment by Grote, which replicated the general setup of the matrix experiment with new equipment and 200 participants, was unsuccessful. This indicates that power does not seem to be the issue. However, an analysis of correlations of the same physical data of this experiment with different psychological variables, in that case questionnaire data obtained from each participant before the experiment, was marginally significant (p = 0.064) (Grote, 2021). Jolij and Bierman conducted two replications of Bem’s retro-priming paradigm, but found no effect (Jolij & Bierman, 2019). However, when they analyzed the questionnaire data that were collected together with the psi data in a matrix-analytical approach, they found a significant result (p < 0.03) in one experiment, the smaller one with 61 participants, and a borderline significant effect in the second study (p = 0.06) with 222 participants. This is again a clear hint that the decisive question is not about power.
These results have to be seen together with experiments by Ana
Borges in Edinburgh who has conducted three experiments herself with
clearly positive results and one commissioned by another experimenter
with negative results (Ana Borges, personal communication and
unpublished Ph.D. thesis, The University of Edinburgh, Department of
The results of the experiments of Ana Borges can only be properly discussed once they are fully published. Meanwhile, one might suppose that in those experiments we are dealing with an experimenter effect, as the study conducted by a second experimenter who was indifferent to the results was clearly negative, while the studies conducted by Borges herself, who is enthusiastic about this work, were positive. We had
such a setup implicitly in our experiments: KK tended toward hoping to find positive results, while HV was rather indifferent toward the results of the experiment. Was the negative result of HV’s experiment a negative experimenter effect? As the CIRTS theory would suggest, all experiments might in principle be tests of experimenter PK (or the lack of it) (Bierman, 2008).
A strength and, at the same time, a weakness of our experiment was the statistical analysis. The Monte Carlo simulation of potential different matrices produces an empirical distribution against which statistical inferences can be made without any parametric assumptions and is thus a straightforward, non-parametric analysis. It is comparatively stable: If the 10,000 iterations are repeated 30 times, the p-values change maximally by 10^-3 and the numbers of significant simulated matrices by around 30, i.e., instead of 1,069 significant matrices, which translates into p = 0.106, we would have 1,099 significant matrices, which translates into p = 0.109. It also corrects for potential causal biases, as these are destroyed in the permutations.
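The run-to-run variation of such a simulated p-value can be anticipated from the binomial sampling error of a proportion. The short Python sketch below is an illustrative back-of-the-envelope check, not part of the analysis script; the function name is hypothetical, and the example figure of 1,069 out of 10,000 is taken from the paragraph above.

```python
import math


def monte_carlo_se(p: float, n_iter: int) -> float:
    """Standard error of a proportion estimated from n_iter permutations."""
    return math.sqrt(p * (1.0 - p) / n_iter)


n_iter = 10_000
p_hat = 1069 / n_iter                  # share of simulated matrices at or above z0
se_p = monte_carlo_se(p_hat, n_iter)   # spread of the p-value estimate
se_count = se_p * n_iter               # the same spread as a number of matrices
```

For these inputs the standard error is about 0.003, or roughly 31 matrices, which is of the same order as the difference between 1,069 and 1,099 in the example above. Reducing it further would require more iterations, since the error shrinks only with the square root of their number.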
But such an analysis also destroys the intricate network between potential causal and non-local correlations, making the analysis conservative. The type of analysis chosen and defined in the protocol actually uses only the experimental matrix. One could also use difference scores between the experimental and the control matrix and other metrics for the statistical analysis. We have done that for exploratory purposes, but this does not change the result.
An optimal analysis might be able to use some difference metric between the control and the experimental matrix. One might argue that the effect is embedded within the whole experiment and not only within the experimental matrix. Thus, some difference measure between the two matrices might be better able to capture the effect. This is for a subsequent analysis of the data to decide.
In our view, the results seem to suggest a decline effect as observed by Maier and colleagues: Our own first experimental results were the stimulus for further work. They were very positive. The experiments of Ana Borges were immediate successor experiments timewise and were also positive. KK’s experiment was next and had a small, nearly significant effect. HV’s experiment was the last in this series and had a zero effect. This supports a decline effect and contradicts our expectation
that the matrix method might help to mitigate such a decline. A decline effect is a prediction of our model (Lucadou, 2015b; Lucadou et al., 2007; Walach et al., 2014): The NT axiom states that whenever effects due to generalized entanglement correlations are mistaken as causal effects and could be used for signal transmission, the effects go away (decline), or change channel, i.e., become visible in another parameter not tested, or change sign, i.e., become obvious in the control group.
Obviously, the NT axiom (Lucadou et al., 2007) cannot be circumvented as we had hoped. It may take longer before a decline comes into effect. But eventually there is no experimental system that generates its own comparison standard through a control group that can elude it. For no matter how complex the system or how many degrees of freedom, eventually there will always be an option to code a signal. In our case it would have been the number of significant correlations.
We had similar experiences with other experimental models. A
careful pilot study of a DMILS replication, in which we tried to replicate
the originally successful DMILS studies of Schlitz and Braud (Braud
& Schlitz, 1983; Braud & Schlitz, 1991; Schlitz & Braud, 1997), yielded
a strong positive eect of r = .35, which was, however, not tested
statistically as per protocol (Schmidt et al., 2001). A large replication with
sucient participants for detecting a much smaller eect failed utterly
(Schmidt, 2002; Schmidt et al., 2002). We replicated the Grinberg-
Zylberbaum study in which he had claimed that a visual stimulation of
one subject had introduced transferred evoked potentials in the EEG of
a spatially distant, but connected participant (Grinberg-Zylberbaum et
al., 1994). In our study we could not find transferred potentials as such,
but significant deviations from chance expectations (Wackermann et
al., 2003). Harald Walach commissioned two large-scale replications in
the same lab, which were clearly positive, but never published (Claudio
Naranjo, personal communication; he had conducted the studies but
was prohibited from publishing the data by Wackermann, the former
head of the lab). We thought we had a replicable, if complicated
paradigm and conducted another replication which was meant to be
completely foolproof against fraud and artifacts, as it was between
subjects separated by about 800 km. But we could not find the effect
in its original signature. We found an effect in the alpha frequency band
which was significant in three studies. However, the relevance of this
eect remains unclear as it only showed up aer averaging thousands
of trials. Instead, we saw an unexpected anticipatory or precognition
eect (Hinterberger et al., 2008, 2007). The reverse priming study by
Daryl Bem (Bem, 2011) did not prove to be as replicable as hoped either
(Jolij & Bierman, 2019; Rabeyron, 2014; Ritchie et al., 2012).
It seems we have enough controversial data and failed replications.
It is important to note at this point: Failed replications and positive
results in meta-analyses do not contradict each other. It is possible that
in a long series of experiments some very careful negative replications,
although they might be important, either do not (Schmidt et al., 2004b),
or only partially (Bösch et al., 2006) influence the summary result of the
meta-analysis, because many other positive results are published or
because effects that have been negative in the hands of one research
group recover in other labs (Bierman, 2001). This is to be expected
under the NT axiom, since it only applies to strict replications. As soon
as parameters are changed, and they usually are when other groups
replicate an experiment, it is, technically speaking, a new experiment,
even though it might use the same experimental model and will be
analyzed under the same umbrella by meta-analysts. Thus, one way
out of the conundrum would be to conduct replications as conceptual
replications, changing important elements in an experimental paradigm
so as to prevent it from being a direct replication which could be used
for signal coding.
Another thought might be worth considering: If our hypothesis
is correct and generalized entanglement correlations exist and are the
basis for most, if not all PSI phenomena, then we need to consider the
fact that in real life they are normally always embedded in a series of
local-causal correlations which also support and frame them, like water
is supported in a sponge (Lucadou, 2019). In the experimental situation
we are trying to separate the two out, squeezing the sponge, as it were,
and then are surprised to find the structure and the water gone.
Thus, the current situation is an impasse: The directly replicable
paradigm that critics demand seems to be impossible. The fact that so
many studies have been conducted by different groups and in slightly
varying designs allows meta-analysts to draw positive conclusions.
Hence, both skeptics and proponents of PSI are right and wrong at
the same time. The “Dodo bird verdict” which has beset psychotherapy
research is valid here as well: All have won and all must have prizes
(Luborsky et al., 2002; Rosenzweig, 1936). It has been pointed out that
this constitutes a paradox: If PSI is real, as a lot of the data suggest,
then by the same token it cannot be proven experimentally, because
the experimental paradigm presupposes the possibility of partitioning
reality into independent segments, which is exactly what PSI negates
(Rabeyron, 2020).
What this series of replications together with other evidence
shows, is in our view that a causal, signal-theoretical interpretation of
PSI is unlikely. It rather strengthens, even though indirectly, an analysis
and theoretical model that assumes these effects to be instances
of generalized entanglement correlations, or similar processes.
If so, critics will remark: Why is it that entanglement correlations
could be empirically proven in the physical case, but not in such a
generalized case as in parapsychology? The answer to this question is
straightforward: In the physical case we have a very strong formalism
that allows the derivation of expectation values or empirical bounds
that are theoretically dened, such as Bell’s inequalities. This dened
frame is not given in the generalized case because the model is not
strong enough and does not contain enough quantitative terms that
would allow such a derivation. In the physical case, only combinations
of for example polarization angles are measured, and whether they
are correlated or not is not determined by an experimental control
group of dierent or incompatible angles, but by the violation of
Bell’s inequalities, i.e., by the theoretical distribution of two joint
probabilities. This is structurally completely dierent from determining
the control standard by a control experiment. As long as we do not have
an equally strong theoretical framework, we will not be able to provide
a straightforward proof of the facticity of generalized entanglement correlations.
Proponents of remote viewing experiments often lament the inadequacy of experimenting with people who have no special gift for PSI, as is the rule in experiments like ours (May et al., 2018). They liken it to trying to judge musical prowess in an average group of people, some of whom might be musically gifted while the majority won’t be, diluting the end result. Experimenting with gifted people might help avoid this pitfall. However, it was estimated that this will be maximally one or
two in one hundred (May et al., 2018). While this argument is certainly
convincing in part, it conates two distinct points: Working with gied
people is certainly a good idea. But this does not preclude failures,
as the failed replication by Walleczek and von Stillfried (2019) showed.
The remote viewing experiment is not an experiment in the sense the
term is used here. And this might be the reason why remote viewing
experiments cannot violate the NT axiom and hence can produce quite
stable results (Targ, 2019; Targ & Katra, 2000).
In remote viewing there is no control standard that is produced by
the experiment. The control is the expectation of no special information
transferred, which is a generic null-expectation. Therefore, it can be
replicated at will. The NT axiom would only come into effect in the
counterfactual situation, which by definition never exists, if the same
person were to target the very same target twice. But the same remote
viewer will not normally do this, and once a target is described there is
no point in having this repeated. Also, in experimental setups that are
similar, targets and participants are normally changed, thus implicitly
avoiding the NT axiom. Therefore, some free-response remote viewing
or telepathy studies might be able to eschew the NT axiom, but all
studies that produce their own control standard in a control group and
are replicated as an exact replication will have the same problems as
we experienced. Unfortunately, remote viewing and Ganzfeld telepathy
studies belong to a category where a lot of expert knowledge, material,
and facilities are necessary and hence do not lend themselves to
the type of classroom experiment that is set up quickly and easily to
demonstrate telepathy.
Thus, we might have to live with the fact that a definitive
experimental paradigm is very difficult, if not impossible, to have. As
long as a paradigm incorporates enough changes, for instance by way
of conceptual replication, or changing variables, or outcome measures
each time it is conducted, it may eschew the NT axiom. But by the
same token it will also be less convincing to skeptics, who will keep
demanding a strict replication. Thus, skeptics will likely have an easy
life: They won’t be bullied into acceptance by a foolproof experimental
paradigm of PSI, because it simply may not exist. So, is experimenting,
then, unnecessary and a waste of time and resources? Probably not,
because it might teach us about higher order parameters, such as the
recovery time it takes until an effect bounces back, or about the amount
of change necessary to make an experiment conceptually a new one
(Dechamps & Maier, 2020; Maier et al., 2018). Or it might help decide
between theoretical options (Bierman, 2010). Or it might yield a higher
class of models that not only predict when an effect might appear, but
also when it will go away. Experimenting might thus also produce the
parameters necessary to build a fuller model that contains enough
richness to derive a formally more stringent theory.
But we should probably give up the hope that the intellectual fight
about whether anomalous cognition effects or PSI are real can be won
with the brute force of rational argument and experimental evidence
alone. This is very rarely the final arbiter anyway, even for very mundane
questions, where social movements, intellectual fashions, generic
worldviews, and political considerations are often much more important
(Latour, 1999). Perhaps a mixed approach will be best: devising clever
experiments, avoiding the pitfalls of the NT axiom by changing
procedures in replications, not forgetting qualitative real-world studies,
observations of natural occurrence of PSI and analytical arguments
combating the prevailing naturalistic stance that is more of a dogma
than an intellectual necessity (van Fraassen, 2016; Williams & Robinson,
2016). All this together might help open up the community to the
possibility of PSI. Producing a final proof is likely a vain expectation, as
our results show.
Our conclusion is: The matrix experiment is likely not a replicable
experiment. The NT axiom that prohibits signal transfer in systems that
are built on correlations might be operative even in this sophisticated
experimental design. This makes it likely that such effects are not of a
local-causal nature. In addition, artefacts might be operative in this
highly complex study. There might be other regularities involved
which we do not understand as yet, but we can preclude signals with
a high likelihood, else we would have seen their effect. Future studies
should determine if conceptual replications of the matrix experiment
changing important elements and parameters can avoid the NT axiom.
In addition, further research efforts could advance the experimental
setup of the matrix experiment/CMM, transferring it to other psi areas.
816 Harald Walach et al.
This work was funded by Bial Grant 400/14. The study by KK was a
master’s thesis at the Technical University Chemnitz. We thank Nikolaus
von Stillfried for important conceptual and logistic help, especially with
the expert symposium that allowed the formulation of the consensus
protocol. We thank Torkel Falkenberg who helped secure co-funding
for the expert symposium and the meeting venue. We are grateful to
all the experts who participated and who helped formulate this protocol
in various Delphi rounds.
Author Statements: HW developed the original protocol,
secured funding, organized Experiment 2, and wrote the first draft
of the manuscript. He participated in discussing and finalizing the
manuscript. KK and HV conducted the experiments and collected
data. PS supervised Experiment 1. WvL was the senior advisor in this
project, provided his material, and helped with the understanding of
the experimental procedures and discussion of results. TH conducted
the statistical analyses. All authors participated in writing, reviewing,
and approving the final manuscript.
Conflict of Interests: None of the authors has a conflict of interest.
Abbott, B. P., & LIGO Scientific Collaboration and Virgo Collaboration. (2016).
Observation of gravitational waves from a binary black hole merger. Physical
Review Letters, 116(061102). doi:10.1103/PhysRevLett.116.061102
Alcock, J. E. (2003). Give the null hypothesis a chance: Reasons to remain doubtful
about the existence of PSI. Journal of Consciousness Studies, 10(6–7), 29–50.
Aspect, A., Dalibard, J., & Roger, G. (1982a). Experimental test of Bell’s inequalities
using time varying analyzers. Physical Review Letters, 49, 1804–1807.
Aspect, A., Grangier, P., & Roger, G. (1982b). Experimental realization of
Einstein–Podolsky–Rosen–Bohm Gedankenexperiment: A new violation of Bell’s
inequalities. Physical Review Letters, 49, 91–94.
Atmanspacher, H., & Primas, H. (2006). Pauli’s ideas on mind and matter in the
context of contemporary science. Journal of Consciousness Studies, 13(3),
Atmanspacher, H., Primas, H., & Wertenschlag-Birkhäuser, E. (Eds.). (1995a).
Der Pauli-Jung-Dialog und seine Bedeutung für die moderne Wissenschaft.
Atmanspacher, H., Römer, H., & Walach, H. (2002). Weak quantum theory: Com-
plementarity and entanglement in physics and beyond. Foundations of
Physics, 32, 379–406.
Atmanspacher, H., Wiedenmann, G., & Amann, A. (1995b, January/February).
Descartes revisited: The endo-exo distinction and its relevance for the
study of complex systems. Complexity.
Bell, J. S. (1987). Speakable and unspeakable in quantum mechanics. Cambridge Uni-
versity Press.
Bem, D. J. (2011). Feeling the future: Experimental evidence for anomalous retroactive
influences on cognition and affect. Journal of Personality and Social
Psychology, 100, 407–425.
Bierman, D. J. (2000). On the nature of anomalous phenomena: Another reality
between the world of subjective consciousness and the objective world
of physics? In P. van Loocke (Ed.), The physical nature of consciousness (pp.
269–292). John Benjamins.
Bierman, D. J. (2001). On the nature of anomalous phenomena: Another reality
between the world of subjective consciousness and the objective world
of physics? In P. Van Loocke (Ed.), The physical nature of consciousness (pp.
269–292). Benjamins.
Bierman, D. J. (2008). Consciousness induced restoration of time symmetry (CIRTS), a
psychophysical theoretical perspective. Paper presented at the Parapsycho-
logical Association 51st Annual Convention, Winchester, UK.
Bierman, D. J. (2010). Consciousness induced restoration of time symmetry
(CIRTS): A psychophysical theoretical perspective. Journal of Parapsychol-
ogy, 74(2), 273–299.
Bösch, H., Steinkamp, F., & Boller, E. (2006). Examining psychokinesis: The inter-
action of human intention with random number generators—A meta-
analysis. Psychological Bulletin, 132, 497–523.
Braud, W. (2002). Psi-favorable conditions. In V. G. Rammohan (Ed.), New frontiers
of human science: A festschrift for K. Ramakrishna Rao (pp. 95–118). McFar-
Braud, W., & Schlitz, M. (1983). Psychokinetic influence on electrodermal activity.
Journal of Parapsychology, 47, 95–119.
Braud, W., & Schlitz, M. J. (1991). Conscious interactions with remote biological
systems: Anomalous intentionality effects. Subtle Energies, 2, 1–46.
Braude, S. E. (1986). The limits of influence. Psychokinesis and the philosophy of science.
Routledge and Kegan Paul.
Braude, S. E. (2017). The mediumship of Carlos Mirabelli (1889–1951). Journal of
Scientic Exploration, 31, 435–456.
Cardeña, E. (2018). The experimental evidence for parapsychological phenomena:
A review. American Psychologist, 73(5), 663–677. doi:10.1037/amp0000236
Carr, B. J. (2015a). Higher dimensions of space and time and their implications
for psi. In E. May & S. Marwaha (Eds.), Extrasensory perception: Support,
skepticism and science, Vol. 2 (pp. 21–61). Greenwood.
Carr, B. J. (2015b). Hyperspatial models of matter and mind. In E. F. Kelly, A. Crabtree,
& P. Marshall (Eds.), Beyond physicalism: Toward reconciliation of science
and spirituality (pp. 227–273). Rowman & Littlefield.
d’Espagnat, B. (1997). Aiming at describing empirical reality. In R. S. Cohen, M.
Horne, & J. Stachel (Eds.), Potentiality, entanglement, and passion-at-a-
distance (pp. 71–87). Kluwer.
Dechamps, M. C., & Maier, M. A. (2020). How smokers change their world and
how the world responds: Testing the oscillatory nature of micro-psychokinetic
observer effects on addiction-related stimuli. Journal of Scientific
Exploration, 33, 406–434. doi:10.31275/2019/1513
Dobyns, Y. H., & Nelson, R. D. (1998). Empirical evidence against decision augmentation
theory. Journal of Scientific Exploration, 12, 231–257.
Einstein, A., Podolsky, B., & Rosen, N. (1935). Can quantum-mechanical descrip-
tion of reality be considered complete? Physical Review, 47, 777–780.
Filk, T., & Römer, H. (2011). Generalized quantum theory: Overview and latest de-
velopments. Axiomathes, 21, 211–220. doi: 10.1007/s10516-010-9136-6
Glicksohn, J. (2001). Temporal cognition and the phenomenology of time: A mul-
tiplicative function for apparent duration. Consciousness and Cognition,
10, 1–25.
Glicksohn, J., Mourad, B., & Pavell, E. (1992). Imagination, absorption and subjec-
tive time estimation. Imagination, Cognition and Personality, 11(2), 167–176.
Grinberg-Zylberbaum, J., Delaflor, M., Attie, L., & Goswami, A. (1994). The
Einstein-Podolsky-Rosen paradox in the brain: The transferred potential.
Physics Essays, 7, 422–427.
Grosso, M. (2016). The man who could fly: St. Joseph of Copertino and the mystery of
levitation. Rowman & Littlefield.
Grote, H. (2015). A correlation study between human intention and the output
of a binary random event generator. Journal of Scientific Exploration, 29,
Grote, H. (2017). Multiple-analysis correlation study between human psychological
variables and binary random events. Journal of Scientific Exploration,
31, 231–254.
Grote, H. (2018). Gravitationswellen. Geschichte einer Jahrhundertentdeckung [Gravi-
tation waves. History of a once-in-a-century discovery]. Beck.
Grote, H. (2021). Mind-matter entanglement correlations: Blind analysis of a new
correlation matrix experiment. Journal of Scientific Exploration, 35(2), 287–
Hamero, S., & Penrose, R. (2014). Consciousness in the universe: A review of the
‘Orch OR’ theory. Physics in Life Review, 11, 39–78.
Handsteiner, J., Friedman, A. S., Rauch, D., Gallicchio, J., Liu, B., Hosp, H., . . . Zeil-
inger, A. (2017). Cosmic Bell test: Measurement settings from Milky Way
stars. Physical Review Letters, 118(6), 060401-060401-060408. doi:10.1103/
Heim, B. (1989). Elementarstrukturen der Materie. Einheitliche strukturelle Quanten-
feldtheorie der Materie und Gravitation Bd. 1. Resch-Verlag.
Hinterberger, T., Mochty, U., Schmidt, S., Erat, L.-M., & Walach, H. (2008). EEG
Korrelationen zwischen räumlich weit entfernten Paaren [EEG correlations
between spatially distant pairs]. Zeitschrift für Anomalistik, 8, 55–75.
Hinterberger, T., Studer, P., Jäger, M., Haverty-Stacke, C., & Walach, H. (2007).
The slide-show presentiment effect discovered in brain electrical activity.
Journal of the Society for Psychical Research, 71, 148–166.
Horton, R. (2015). Offline: What is medicine’s 5 sigma? Lancet, 385, 1380. doi:10.1016/
Houtkooper, J. M. (2002). Arguing for an observational theory of paranormal phenomena.
Journal of Scientific Exploration, 16, 171–186.
Ioannidis, J. P. A. (2005). Why most published research findings are false. PLoS
Medicine, 2(8), e124.
Jahn, R. G., & Dunne, B. J. (1987). Margins of reality. The role of consciousness in the
physical world. Harcourt Brace Jovanovich.
Jahn, R. G., Dunne, B. J., Bradish, G. J., Dobyns, Y. H., Lettieri, A., Nelson, R. D.,
. . . Walter, B. (2000). Mind/machine interaction consortium: PortREG
replication experiments. Journal of Scientific Exploration, 14, 499–555.
Jolij, J., & Bierman, D. J. (2019). Two attempted retro-priming replications show
theory-relevant anomalous connectivity. Journal of Scientific Exploration,
33(1), 43–60.
Jung, C. G. (1952). Synchronizität als ein Prinzip akausaler Zusammenhänge. In C.
G. Jung & W. Pauli (Eds.), Naturerklärung und psyche (pp. 1–107). Rascher.
Latour, B. (1999). Pandora’s hope: An essay on the reality of science studies. Harvard
University Press.
Luborsky, L., Rosenthal, R., Diguer, L., Andrusyna, T. P., Berman, J. S., Levitt, J. T.,
. . . Krause, E. D. (2002). The Dodo bird verdict is alive and well—Mostly.
Clinical Psychology: Science and Practice, 9, 2–12.
Lucadou, W. v. (1974). Zum parapsychologischen Experiment – Eine methodologische
Skizze. Zeitschrift für Parapsychologie und Grenzgebiete der Psychologie,
16, 57–62.
Lucadou, W. v. (1986). Keine Spur von Psi – Zusammenfassende Darstellung eines
umfangreichen Psychokineseexperiments. Zeitschrift für Parapsychologie
und Grenzgebiete der Psychologie, 29, 169–197.
Lucadou, W. v. (1987a). A multivariate PK experiment. Part I. An approach combin-
ing physical and psychological conditions of the PK process. European
Journal of Parapsychology, 6(4), 305–345.
Lucadou, W. v. (1987b). A multivariate PK experiment. Part III. Is PK a real force?
The results and their interpretation. European Journal of Parapsychology,
6(4), 369-428.
Lucadou, W. v. (1991). Locating Psi-bursts – Correlations between psychological characteristics
of observers and observed quantum physical fluctuations. Paper
presented at the Parapsychological Association 34th Annual Convention,
Proceedings of Presented Papers. Heidelberg.
Lucadou, W. v. (1995). The model of pragmatic information (MPI). European Journal
of Parapsychology, 11, 58–75.
Lucadou, W. v. (2000). Backward causation and the Hausdorff-Dimension of singular
events. Paper presented at the Proceedings of Presented Papers, The
Parapsychological Association 43rd Annual Convention August 17–20,
Lucadou, W. v. (2006). Self-organization of temporal structures—A possible solu-
tion for the intervention problem. In D. P. Sheehan (Ed.), Frontiers of time.
Retrocausation – Experiment and theory (pp. 293–315). American Institute of
Lucadou, W. v. (2015a, July 16–19). The correlation-matrix method (CMM) – A new light
upon the repeatability problem of parapsychology. Paper presented at the
58th Annual Convention of the Parapsychological Association, University
of Greenwich.
Lucadou, W. v. (2015b). The Model of Pragmatic Information (MPI). In E. C. May &
S. Marwaha (Eds.), Extrasensory perception: Support, skepticism, and science:
Vol. 2: Theories and the future of the field (pp. 221–242). Praeger.
Lucadou, W. v. (2019). Homeopathy and the action of meaning: A theoretical approach.
Journal of Scientific Exploration, 33, 213–254. doi:10.31275/2019.1343
Lucadou, W. v., Lay, B., & Kunzmann, H. (1987). A multivariate PK experiment.
Part II. Relationships between psychological variables. European Journal of
Parapsychology, 6(4), 347–368.
Lucadou, W. v., Römer, H., & Walach, H. (2007). Synchronistic phenomena as en-
tanglement correlations in generalized quantum theory. Journal of Con-
sciousness Studies, 14(4), 50–74.
Ma, X.-S., Zotter, S., Kofler, J., Ursin, R., Jennewein, T., Brukner, C., & Zeilinger,
A. (2012). Experimental delayed-choice entanglement swapping. Nature
Physics, online. doi:10.1038/NPHYS2294
Maier, M. A., Büchner, V. L., Kuhbandner, C., Pitsch, M., Fernández-Capo, M., &
Gámiz-Sanfeliu, M. (2014). Feeling the future again: Retroactive avoid-
ance of negative stimuli. Journal of Consciousness Studies, 21(9-10), 121–152.
Maier, M. A., & Dechamps, M. C. (2018). Observer effects on quantum randomness:
Testing micro-psychokinetic effects of smokers on addiction-related
stimuli. Journal of Scientific Exploration, 32, 265–297.
Maier, M. A., Dechamps, M. C., & Pitsch, M. (2018). Intentional observer ef-
fects on quantum randomness: A Bayesian analysis reveals evidence
against micro-psychokinesis. Frontiers in Psychology, 9(379). doi:10.3389/
Manseld, V. (1995). Synchronicity, science, and soul-making. Understanding Jungian
synchronicity through physics, Buddhism, and philosophy. Open Court.
May, E. C., & Marwaha, S. B. (Eds.). (2018). The Star Gate Archives. Reports of the
United States Government sponsored psi program, 1972–1995. Vol 1: Remote
viewing, 1972–1984. McFarland.
May, E. C., & Marwaha, S. B. (Eds.). (2019a). The Star Gate Archives. Reports of the
United States Government-sponsored psi program 1972–1995. Vol. 3: Psycho-
kinesis. McFarland.
May, E. C., & Marwaha, S. B. (Eds.). (2019b). The Star Gate Archives. Reports of the
United States Government sponsored psi program, 1972–1995. Vol 2: Remote
viewing: 1985–1995. McFarland.
May, E. C., Spottiswoode, S. J., Utts, J. M., & James, C. L. (1995). Applications of
Decision Augmentation Theory. Journal of Parapsychology, 59, 221–250.
May, E. C., Spottiswoode, S. J. P., & Faith, L. V. (2000). The correlation of the gradient
of Shannon entropy and anomalous cognition: Toward an AC sensory
system. Journal of Scientific Exploration, 14, 53–72.
May, E. C., Utts, J. M., & Spottiswoode, S. J. P. (1996). Decision augmentation
theory: Applications to the random number generator. Journal of Scientific
Exploration, 9, 453–488.
May, E. C., Utts, J. M., Trask, V. V., Luke, W. W., Frivold, T. J., & Humphrey, B. S.
(2018). Review of the psychoenergetic research conducted at SRI International
(1973–1988). In E. C. May & S. B. Marwaha (Eds.), The Star Gate
Archives. Reports of the United States Government sponsored psi program,
1972-1995. Vol 1: Remote viewing, 1972–1984 (pp. 495-504). McFarland.
Meier, C. A. (Ed.) (1992). Wolfgang Pauli und C. G. Jung. Ein Briefwechsel 1932–1958.
Meier, C. A. (Ed.) (2001). Atom and archetype: The Pauli/Jung letters 1932–1958. Princ-
eton University Press.
Munafò, M. R., Nosek, B. A., Bishop, D. V. M., Button, K. S., Chambers, C. D., Per-
cie du Sert, N., . . . Ioannidis, J. P. A. (2017). A manifesto for reproducible
science. Nature Human Behaviour, 1, 0021. doi:10.1038/s41562-016-0021
Open Science Collaboration. (2015). Estimating the reproducibility of psychologi-
cal science. Science, 349(6251), aac4716. doi:10.1126/science.aac4716
Pallikari, F. (2001). A study of the fractal character in electronic noise process. Cha-
os, Solitons and Fractals, 12, 1499–1507.
Peat, F. D. (1992). Synchronizität: Die verborgene Ordnung [Synchronicity: The hidden
order]. Goldmann.
Pekala, R. J. (1995). A short, unobtrusive hypnotic-assessment procedure for assess-
ing hypnotizability level: I. Development and research. American Journal
of Clinical Hypnosis, 37, 271–283.
Penrose, R. (2004). The Road to reality: A complete guide to the laws of the universe.
Jonathan Cape.
Primas, H. (1994a). Endo- and exo-theories of matter. In H. Atmanspacher & G.
J. Dalenoort (Eds.), Inside versus outside (Vol. 63, pp. 163–193). Springer.
Primas, H. (1994b). Realism and quantum mechanics. In D. Prawitz, B. Skyrms,
& D. Westerstahl (Eds.), Logic, methodology and philosophy of science IX:
Proceedings of the Ninth International Congress of Logic, Methodology and
Philosophy of Science, Uppsala, Sweden, August 7-14, 1991 (pp. 609–631).
Rabeyron, T. (2014). Retro-priming, priming, and double testing: Psi and repli-
cation in a test–retest design. Frontiers in Human Neuroscience, 8(154).
Rabeyron, T. (2020). Why most research findings about psi are false: The replicability
crisis, the psi paradox and the myth of Sisyphus. Frontiers in Psychology,
11(2468). doi:10.3389/fpsyg.2020.562992
Radin, D. (2018). Real magic. Ancient wisdom, modern science, and a guide to the secret
power of the universe. Harmony.
Radin, D., Michel, L., Galdamez, K., Wendland, P., Rickenbach, R., & Delorme, A.
(2012). Consciousness and the double-slit interference pattern: Six ex-
periments. Physics Essays, 25, 157–171.
Reber, A. S., & Alcock, J. E. (2020). Searching for the impossible: Parapsychol-
ogy’s elusive quest. American Psychologist, 75, 391–399. doi:10.1037/
Ritchie, S. J., Wiseman, R., & French, C. C. (2012). Failing the future: Three unsuccessful
attempts to replicate Bem’s ‘retroactive facilitation of recall’ effect.
PLoS One, 7(3), e33423. doi:10.1371/journal.pone.0033423
Rosenzweig, S. (1936). Some implicit common factors in diverse methods in psy-
chotherapy. American Journal of Orthopsychiatry, 6, 412–415.
Salart, D., Baas, A., Branciard, C., Gisin, N., & Zbinden, H. (2008). Testing spooky
actions at a distance. Nature, 454, 861–864
Schlitz, M., & Braud, W. (1997). Distant intentionality and healing: Assessing the
evidence. Alternative Therapies in Health and Medicine, 3, 38–53.
Schmidt, S. (2002). Aussergewöhnliche Kommunikation? Eine kritische Evaluation
des parapsychologischen Standardexperimentes zur direkten mentalen Inter-
aktion. BIS.
Schmidt, S. (2016). Shall we really do it again? The powerful concept of replication
is neglected in the social sciences. In A. E. Kazdin (Ed.), Methodological
issues and strategies in clinical research (pp. 581–596). American Psychologi-
cal Association.
Schmidt, S., Müller, S., & Walach, H. (2004a). Do you know who is on the phone?
Replication of an experiment on telephone telepathy. Paper presented at the
Parapsychological Association 47th Annual Convention, Vienna.
Schmidt, S., Schneider, R., Binder, M., Bürkle, D., & Walach, H. (2001). Investigat-
ing methodological issues in EDA-DMILS: Results from a pilot study.
Journal of Parapsychology, 65, 59–82.
Schmidt, S., Schneider, R., Utts, J., & Walach, H. (2002). Remote intention on elec-
trodermal activity - Two meta-analyses. Parapsychological Association An-
nual Convention.
Schmidt, S., Schneider, R., Utts, J., & Walach, H. (2004b). Remote intention on
electrodermal activity—Two meta-analyses. British Journal of Psychology,
95, 235–247.
Schooler, J. (2011). Unpublished results hide the decline effect. Nature, 470, 437.
Schrödinger, E. (1935). Discussion of probability relations between separated sys-
tems. Proceedings of the Cambridge Philosophical Society, 31, 555–563.
Schwartz, G. E., Woollacott, M., Schwartz, S., Barušs, I., Beauregard, M., Dossey,
L., . . . Tart, C. (2018). The Academy for the Advancement of Postmaterialist
Sciences: Integrating Consciousness into Mainstream Science.
Explore: The Journal of Science and Healing, 14(2), 111–113.
Sedlmeier, P., Winkler, I., & Lukina, A. (2020). How long did the time spent in med-
itation feel? “Attention. Attention. Attention.”. Psychology of Conscious-
ness: Theory, Research, and Practice.
Sheldrake, R., & Smart, P. (2003). Videotaped experiments on telephone telepathy.
Journal of Parapsychology, 67, 187–206.
Shimony, A. (1989). Search for a worldview which can accommodate our knowledge
of microphysics. In J. T. Cushing & E. McMullin (Eds.), Philosophical consequences
of quantum theory: Reflections on Bell’s Theorem (pp. 25–37). University
of Notre Dame Press.
Stefanov, A., Zbinden, H., Gisin, N., & Suarez, A. (2002). Quantum correlations
with spacelike separated beam splitters in motion: Experimental test of
multisimultaneity. Physical Review Letters, 88, 120404.
Stillfried, N. v. (2010). Generalized entanglement: Theoretical and experimental explo-
rations. (PhD), Europa Universität Viadrina, Frankfurt (Oder).
Targ, R. (2019). What do we know about PSI? The first decade of remote-viewing
research and operations at Stanford Research Institute. Journal of Scientific
Exploration, 33, 569–592.
Targ, R., & Katra, J. E. (2000). Remote viewing in a group setting. Journal of Scientific
Exploration, 14, 107–114.
van Fraassen, B. (2016). Naturalism in epistemology. In R. N. Williams & D. N.
Robinson (Eds.), Scientism: The new orthodoxy (pp. 64–95). Bloomsbury.
Wackermann, J., Seiter, C., Keibel, H., & Walach, H. (2003). Correlations between
brain electrical activities of two spatially separated human subjects. Neu-
roscience Letters, 336, 60–64.
Walach, H., Horan, M., Hinterberger, T., & von Lucadou, W. (2020). Evidence for
anomalistic correlations between human behavior and a random event
generator – Result of an independent replication of a micro-PK experi-
ment. Psychology of Consciousness: Theory, Research, and Practice, 7(2),
173–188. doi:10.1037/cns0000199
Walach, H., Lucadou, W. v., & Römer, H. (2014). Parapsychological phenomena as
examples of generalized non-local correlations - A theoretical framework.
Journal of Scientic Exploration, 28, 605–631.
Walach, H., & von Stillfried, N. (2011a). Generalised quantum theory—Basic idea
and general intuition: A background story and overview. Axiomathes, 21,
185-209. doi:10.1007/s10516-010-9145-5
Walach, H., & von Stillfried, N. (2011b). Generalizing quantum theory - Approaches
and applications. Axiomathes, 21 (2)(Special Issue), 185–371.
Walker, E. H. (1975). Foundations of paraphysical and parapsychological phenom-
ena. In L. Oteri (Ed.), Quantum physics and parapsychology (pp. 1–53): Para-
psychology Foundation.
Walker, E. H. (1979). The quantum theory of psi phenomena. Psychoenergetic Sys-
tems, 3, 259–299.
Walker, E. H. (2011, orig. 1974). Consciousness and quantum theory. In E. Mitchell
(Ed.), Psychic explorations: A challenge for science (pp. 544–568). Cosimo.
Walleczek, J., & von Stillfried, N. (2019). False-positive effect in the Radin double-slit
experiment on observer consciousness as determined with the
advanced meta-experimental protocol. Frontiers in Psychology, 10(1891).
Watt, C., & Tierney, I. (2013). A preliminary test of the Model of Pragmatic Informa-
tion using cases of spontaneous anomalous experience. Journal of Con-
sciousness Studies, 20(11), 205–220.
Williams, R. N., & Robinson, D. N. (Eds.). (2016). Scientism: The new orthodoxy.
Appendix Figure 1. Distribution of sampling of True RNG.
Appendix Figure 2. Distribution of sampling of our traditional RNG.
Appendix Figure 3. REG-output of all REGs before normalization.
Appendix Figure 4. REG-output of all REGs after normalization.
Number of Signicant Matrix Elements in the 45 x 45 Experimental Matrix
Compared to the Control Matrices C1 and C2 and to Chance Expectation Depending
on Signicance Level. Experiment 1 by KK, Original Analysis.
Because the data of this analysis were based on KK’s own analytic strategy which is slightly
dierent from that of TH who evaluated the data for this experiment statistically, some numbers
deviate from Table 1.
Statistical Analysis of Experiment 1 – 27*45 Matrix;
Randomization Test with 10,000 Iterations
sig_th: theoretical significance level at which the number of significant correlations is counted
z0: number of significant correlations empirically found at respective level
n_sim: number of simulated matrices out of 10,000 with significant correlations at or above the
number found empirically
p_sim: actual significance level of observed number of correlations (n_sim/10,000)
z0_part: number of correlations in time-forward (upper) part of the matrix
n_part_sim: number of significant correlations found in 10,000 simulations at respective level in
upper part of the matrix
p_part_sim: actual significance level of observed number of correlations (n_part_sim/10,000) in
upper part of the matrix
sig_th            0.1     0.05    0.02    0.01    0.005   0.002   0.001   0.0005  0.0002  0.0001
full z0           163.00  96.00   45.00   21.00   8.00    7.00    3.00    2.00    0.00    0.00
full n_sim        697     439     511     1118    2482    585     990     676     1716    937
full p_sim        0.0697  0.0439  0.0511  0.1118  0.2482  0.0585  0.099   0.0676  0.1716  0.0937
part z0_part      83.00   36.00   13.00   8.00    1.00    1.00    0.00    0.00    0.00    0.00
part n_part_sim   405     1817    2831    1903    5948    2727    3172    1879    838     446
part p_part_sim   0.0405  0.1817  0.2831  0.1903  0.5948  0.2727  0.3172  0.1879  0.0838  0.0446
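As a reading aid, the first column of the 27*45 analysis can be worked through by hand. This is a sketch under two assumptions that are readings of the appendix rather than statements by the authors: that the chance expectation is the nominal level times the number of matrix cells (the quantity called n_soll in the program code), and that all 27 × 45 = 1215 cells enter the count.

```latex
\begin{align*}
n_{\text{soll}} &= \text{sig\_th} \times n_c = 0.1 \times (27 \times 45) = 0.1 \times 1215 = 121.5 \\
p_{\text{sim}}  &= \frac{n_{\text{sim}}}{10{,}000} = \frac{697}{10{,}000} = 0.0697
\end{align*}
```

At this level 163 correlations were nominally significant where about 121.5 would be expected by chance, yet 697 of the 10,000 permuted matrices reached or exceeded that count, so the excess stays above the conventional 0.05 threshold.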
Statistical Analysis of Experiment 1 – 18*27 Matrix;
Randomization Test with 10,000 Iterations
sig_th: theoretical significance level at which the number of significant correlations is counted
z0: number of significant correlations empirically found at respective level
n_sim: number of simulated matrices out of 10,000 with significant correlations at or above the
number found empirically
p_sim: actual significance level of observed number of correlations (n_sim/10,000)
z0_part: number of correlations in time-forward (upper) part of the matrix
n_part_sim: number of significant correlations found in 10,000 simulations at respective level in
upper part of the matrix
p_part_sim: actual significance level of observed number of correlations (n_part_sim/10,000) in
upper part of the matrix
Number of Signicant Matrix Elements in the 45 x 9 Varied Experimental Matrix
with Psychological Variables Obtained by Questionnaire Compared to the Control
Matrices C1 and C2 and to Chance Expectation Depending on Signicance Level;
Experiment 1 by KK, Original Analysis
Note. The number of correlations were calculated between 45 physical variables (TR, DT, KR, ZT,
ZV x 9 runs) and 9 psychological variables (joy, love, anger, grief, fear, arousal, inner dialogue,
direction of attention, absorption), reecting the states of consciousness of the participants
measured with the Phenomenology of Consciousness Inventory (PCI) (Pekala, 1995).
sig_th            0.1     0.05    0.02    0.01    0.005   0.002   0.001   0.0005  0.0002  0.0001
full z0           63.00   33.00   18.00   6.00    3.00    3.00    1.00    1.00    0.00    0.00
full n_sim        1667    1876    1015    2631    2381    722     1071    490     788     427
full p_sim        0.1667  0.1876  0.1015  0.2631  0.2381  0.0722  0.1071  0.049   0.0788  0.0427
part z0_part      30.00   11.00   4.00    1.00    0.00    0.00    0.00    0.00    0.00    0.00
part n_part_sim   1606    3888    3778    4961    5211    2732    1600    870     376     215
part p_part_sim   0.1606  0.3888  0.3778  0.4961  0.5211  0.2732  0.16    0.087   0.0376  0.0215
Program Code for the Permutation Test in Matlab:
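% EPh, CPh, EPs, CPs: data matrices defined elsewhere (not shown in the source)
for n = 1:10000
    % random permutations of the columns of each data matrix
    EPh2 = EPh(:, randperm(size(EPh,2)));
    CPh2 = CPh(:, randperm(size(CPh,2)));
    EPs2 = EPs(:, randperm(size(EPs,2)));
    CPs2 = CPs(:, randperm(size(CPs,2)));   % computed but not used in the source listing
    % calculation of correlation matrix
    [E_rho, E_p] = corr(EPh2', EPs2', 'type', 'Spearman', 'rows', 'all', 'tail', 'both');
    [C_rho, C_p] = corr(CPh2', EPs2', 'type', 'Spearman', 'rows', 'all', 'tail', 'both');
    nc = size(E_p,1)*size(E_p,2);           % number of cells in the matrix
    sig_th = [.1, .05, .02, .01, .005, .002, .001, .0005, .0002, .0001];
    n_soll = sig_th.*nc;                    % chance expectation at each level
    ti = 0;                                 % index into sig_th (assumed; elided in the source)
    for p_th = sig_th
        ti = ti + 1;
        psig = E_p < p_th;                  % assumed threshold step; elided in the source
        n0_exp(ti) = sum(sum(psig));
        psig = C_p < p_th;                  % assumed threshold step; elided in the source
        n0_cont(ti) = sum(sum(psig));
    end
    % the source listing ends here; how the counts of each iteration are stored is not shown
end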
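The listing breaks off before the counts are turned into the p_sim values reported in the tables. A minimal sketch of that last step, in Matlab to match the listing, might look as follows. The simulated counts here are synthetic placeholders drawn under the assumption of independent matrix cells, and the names z_sim and n_iter as well as the fixed seed are hypothetical and not part of the authors' program.

```matlab
% Hypothetical illustration: synthetic null counts stand in for the
% permutation results, which are not available here.
rng(1);                    % fixed seed so the sketch is repeatable
n_iter = 10000;            % iterations, as in the appendix tables
nc     = 27 * 45;          % cells in the 27*45 matrix
sig_th = 0.1;              % one nominal significance level
z0     = 163;              % observed count at this level (appendix table)

% Placeholder for the simulated counts: each cell is "significant" with
% probability sig_th, independently of all others. Permuted real data
% would have correlated cells, so this is only a stand-in.
z_sim = zeros(n_iter, 1);
for n = 1:n_iter
    z_sim(n) = sum(rand(nc, 1) < sig_th);
end

n_sim = sum(z_sim >= z0);  % simulated matrices at or above the observed count
p_sim = n_sim / n_iter;    % empirical significance level
fprintf('observed %d, n_sim = %d, p_sim = %.4f\n', z0, n_sim, p_sim);
```

With independent cells, a count of 163 would be reached far less often than the 697 of 10,000 reported in the table. Permuting the real data keeps the dependencies among matrix cells, which is presumably why a randomization test was used rather than a binomial reference distribution.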
... The field of parapsychology has to a great extent become divided into two camps, with one believing that progress is being made with experimental research (represented by most writers in Cardeña et al. [2015]), and the other believing that some property of psi prevents reliable control of the phenomena. The latter includes ideas such as that psi is intrinsically unrepeatable (Eisenbud, 1992(Eisenbud, /1963, is actively evasive (Beloff, 1994), is radically elusive (Batcheldor, 1994), manifests as a trickster (Hansen, 2001), is constrained to be unrepeatable and useless (Lucadou, 2001;Millar, 2015;Walach et al. 2021;Walach et al., 2014), and is unsustainable (Kennedy, 2003(Kennedy, , 2016a). These are not naïve newcomers to parapsychology or outsiders. ...
Full-text available
David Marks’ previous book about the paranormal (Marks, 2000) and other earlier writings established his reputation as a firm skeptic. He wrote the current book in order to learn about new developments in paranormal research during the past 20 years. The overall conclusion in this book is that Marks now believes that spontaneous paranormal phenomena may occur, but psi is a spontaneous process that cannot be controlled and demonstrated in laboratory experiments. His belief that instances of spontaneous psi may occur is based largely on a personal experience of synchronicity that had layers of meaning for him. The experience is described and evaluated in chapter four. He rates the probability as 75% that the experience had a paranormal component.
... The field of parapsychology has to a great extent become divided into two camps, with one believing that progress is being made with experimental research (represented by most writers in ), and the other believing that some property of psi prevents reliable control of the phenomena. The latter includes ideas such as that psi is intrinsically unrepeatable (Eisenbud, 1992(Eisenbud, /1963, is actively evasive (Beloff, 1994), is radically elusive (Batcheldor, 1994), manifests as a trickster (Hansen, 2001), is constrained to be unrepeatable and useless (Lucadou, 2001;Millar, 2015;Walach et al. 2021;Walach et al., 2014), and is unsustainable (Kennedy, 2003(Kennedy, , 2016a. These are not naïve newcomers to parapsychology or outsiders. ...
Full-text available
The work reported here is a rigorous conceptual replication of the so-called “Correlation-Matrix” experiment by an independent author. The experiment has been built from scratch with new hardware and software, testing 200 participants that have spent about half an hour each trying to ‘influence’ a physical random process visualized for feedback. The analysis software has been conceptualized following a strict blind analysis protocol. Blind analysis is a more rigid form of pre-registered analysis, in which the complete analysis software is written and tested before the data is actually analysed for the effect under study. The unblinding of the analysis, also called ‘opening of the box’ of the experiment described here has been performed live at the PA convention 2019 in Paris. The main result was found to be not statistically significant and fell well within the expected random distribution of possible results. A second experiment, also following a blind analysis protocol, included questionnaires that were correlated with the participants’ performance to ‘influence’ the physical random process (the main psi task). This yielded a probability of p=0.06 to have occurred by chance, under a null hypothesis. A post-hoc analysis of the hit rate for the psi task across all participants, which is mathematically independent from the correlation analysis, yielded a probability of p=0.06 as well, to have occurred by chance. Three unexpected anecdotal incidences that occurred during the execution of the experiment and the testing and actual analysis of the data may add to the canon of oddities and trickster-like effects sometimes reported in parapsychology research.
The replicability crisis in psychology has been influenced by the results of nine experiments conducted by Bem (2011) and presented as supporting the existence of precognition. In this paper, we hope to show how the debate concerning these experiments could be an opportunity to develop original thinking about psychology and replicability. After a few preliminary remarks about psi and scientific epistemology, we examine how psi results lead to a paradox which questions how appropriate the scientific method is to psi research. This paradox highlights a problem in the way experiments are conducted in psi research and its potential consequence on mainstream research in psychology. Two classical experiments - the Ganzfeld protocol and the Bem studies - are then analyzed in order to illustrate this paradox and its consequences. Mainstream research is also addressed in the broader context of the replication crisis, decline effect and questionable research practices. Several perspectives for future research are proposed in conclusion and underline the heuristic value of psi studies for psychology.
Beginning in 1972, three physicists at Stanford Research Institute (now known as SRI International), Harold Puthoff, Edwin May, and Russell Targ, initiated free-response remote viewing experiments with psi-gifted participants. The percipients were asked to describe their mental images with regard to some person or event distant in space and time. Many of our experimental series were statistically significant at four standard deviations from chance expectation, with effect sizes greater than 0.6. From these highly efficient experiments, we concluded that the accuracy and reliability of remote viewing is independent of distance up to 10,000 km, and of time up to several days into the future. Psi ability clearly violates our ordinary ideas of causality, since future events are seen to be the cause or trigger for experiences at an earlier time. We also learned that feedback to the viewer is helpful, but it is not necessary. Remote viewing is a nonanalytic ability; describing a distant shape, form, or location on the planet is easier than guessing a number from 1 to 10. The purpose of this paper is to correct the misconception that psi is weak and unreliable. On the contrary, in our laboratory experiments and classified operational tasks psi was found to be surprisingly reliable and useful.
ABSTRACT. Naturalism is often presented as a methodological assumption, that the best way to find things out is by empirical means. But since no one doubts this, we understand it at once as signaling that there is nothing else to be found out in any case. That links the methodological dictum at once to Naturalism understood as the ontological view that everything there is, is physical or material or within the domain of the natural sciences. That side is not my topic; I will focus on traditional problems of epistemology. There is, as I see it, a definite parting of the ways, where Naturalism and Empiricism develop a completely different conception of that subject. I will trace this development through the past half century or so, from the 1944 School of Naturalism, via Willard van Orman Quine, to such nearer contemporaries as Stephen Leeds, Michael Devitt, and Penelope Maddy.
According to standard quantum theory, the occurrence of a specific outcome during a quantum measurement is completely random (see Bell 1964). However, some authors refer to revised versions of quantum mechanics (e.g., Walker 2000, Penrose & Hameroff 2011, Mensky 2013, Stapp 2017), and propose that the human mind can actually influence the probability of such outcomes. Empirical support for this idea has been provided by micro-psychokinesis (micro-PK) research, which shows a small but significant overall effect (see Bösch, Steinkamp, & Boller 2006). However, attempts to replicate specific findings have often failed (e.g., Jahn et al. 2000), a critique that is not exclusive to psi paradigms. In an attempt to explain these failures, von Lucadou, Römer, and Walach (2007) established a theoretical model predicting unsystematic variations of such an influencing effect across replications, resulting in a decline of a predictable effect in micro-PK data over time. Maier, Dechamps, and Pflitsch (2018) slightly expanded this theory by proposing that the temporal variation of such an effect follows a systematic pattern, which can be tested and used for prediction making. In this research, we generated such a prediction using data from two previous studies that initially demonstrated a strong micro-PK followed by a subsequent decline in the effect over the course of 297 participants (Maier & Dechamps 2018); we then put it to the test with a preregistered additional set of newly collected data from 203 subjects. We compared these results with 10,000 simulated datasets (each set with an N = 203) each comprising random data. Three tests were applied to the experimental data: an area-under-the-curve analysis, a local maximum fit test, and an endpoint fit test. These tests revealed no significant fit of the real data regarding the predicted data pattern. Further analyses explored additional techniques, including an analysis of the highest-reached Bayes Factor (BF) over the course of the experiments, the overall orientation of the BF curve, and its transformation into oscillatory components via a Fourier analysis. All these methods allowed for statistically significant differentiations between experimental data on the one side, and the control group and simulation data on the other. We conclude that the analyses of the temporal development of an effect along these lines constitute a fruitful approach toward testing non-random and volatile time trends within micro-PK data.
Prior work by Radin et al. (2012, 2016) reported the astonishing claim that an anomalous effect on double-slit (DS) light-interference intensity had been measured as a function of quantum-based observer consciousness. Given the radical implications, could there exist an alternative explanation, other than an anomalous consciousness effect, such as artifacts including systematic methodological error (SME)? To address this question, a conceptual replication study involving 10,000 test trials was commissioned to be performed blindly by the same investigator who had reported the original results. The commissioned study performed confirmatory and strictly predictive tests with the advanced meta-experimental protocol (AMP), including systematic negative controls and the concept of the sham-experiment, i.e., counterfactual meta-experimentation. Whereas the replication study was unable to confirm the original results, the AMP was able to identify an unacceptably low true-negative detection rate with the sham-experiment in the absence of test subjects. The false-positive detection rate reached 50%, whereby the false-positive effect, which would be indistinguishable from the predicted true-positive effect, was significant at p = 0.021 (σ = −2.02; N = 1,250 test trials). The false-positive effect size was about 0.01%, which is within an order of magnitude of the claimed consciousness effect (0.001%; Radin et al., 2016). The false-positive effect, which indicates the presence of significant SME in the Radin DS-experiment, suggests that skepticism should replace optimism concerning the radical claim that an anomalous quantum consciousness effect has been observed in a controlled laboratory setting.
The purpose of this paper is a theoretical one. It does not enter the debate of evidence-based medicine (EBM) about the validity of meta-analyses including pooled data from placebo-controlled clinical trials of homeopathy and the result of epidemiological clinical studies about the success of homoeopathic treatments. The paper tries to answer the question why extremely highly diluted substances may be able to produce a medical reaction in a patient even if no single molecule of the substance used may be present in the medicament. This paper describes the Model of Pragmatic Information (MPI) and the Generalized Quantum Theory (GQT) and how they can be applied as a model to describe properties of homoeopathic treatment. From the point of view of the MPI and GQT, homoeopathic treatment and medicaments are "Pseudo-Machines" (PM). The Model of PM (MPM) includes sociological, psychological, physical, and causal as well as non-causal (entanglement) processes which are relevant for the (homoeopathic) treatment. This means that the properties of PM can clearly be distinguished from placebos. In terms of MPM placebos can be considered as a specific form of PM. On the other hand, MPM is able to explain the limitation of double-blind placebo-controlled studies in medicine, Complementary Medicine (CM), and elsewhere. Finally, the paper describes an experimental method (Correlation Matrix Method, CMM) by which the operation of PM can be tested empirically. Furthermore, this method allows distinguishing causal and non-causal processes in medical treatments in general and is not limited to homoeopathy but could serve as a new approach in EBM. Keywords: Evidence based medicine (EBM), Complementary Medicine (CM), Homeopathy, Model of Pragmatic Information (MPI), Generalized Quantum Theory (GQT), Macroscopic Entanglement (ME), NT-axiom, Pseudo-Signals, Pseudo-Machines (PM), Correlation Matrix Method (CMM), Complementarity, Entanglement, Self-organization, Organizational Closure (OC)
We report a theory-relevant post hoc analysis of 2 Dutch retro-priming experiments that were part of a large replication project of the retro-priming experiment by Bem and colleagues. This replication project sought to investigate the role of the experimenter in psi studies. The results of the retro-priming experiments performed by student research groups at the University of Amsterdam (N=61) and the University of Groningen (N=222), however, did not replicate Bem’s earlier findings of an anomalous interference of a future stimulus on response times. We report the results of these two studies here, but the overall results will be reported elsewhere. Both Dutch studies used the exact same software as Bem and colleagues. However, the two studies used different questionnaires. The questionnaires asked for information that in previous research had been associated with success in psi tasks and could help us to deal with individual differences but, above all, could be used as selection criteria for participants in future studies. In the Amsterdam study, there were 14 questions, while in the Groningen study there were 55. A correlation analysis revealed several significant correlations between the psi-effect in the Bem task and questionnaire items. In this paper we focus on the post-hoc research question: Is this global composition of the correlation matrix anomalous, as suggested by Generalized Quantum Theory? Rather than using the subjective number of ‘significant’ correlations as a dependent variable, we introduced 2 objective measures directly representing the correlation values in the cells to characterize the ‘Connectivity’ in the matrix. Our analysis revealed ‘Connectivity’ to be marginally significantly larger (p
Recently, American Psychologist published a review of the evidence for parapsychology that supported the general claims of psi (the umbrella term often used for anomalous or paranormal phenomena). We present an opposing perspective and a broad-based critique of the entire parapsychology enterprise. Our position is straightforward. Claims made by parapsychologists cannot be true. The effects reported can have no ontological status; the data have no existential value. We examine a variety of reasons for this conclusion based on well-understood scientific principles. In the classic English adynaton, "pigs cannot fly." Hence, data that suggest that they can are necessarily flawed and result from weak methodology or improper data analyses or are Type I errors. So it must be with psi effects. What we find particularly intriguing is that, despite the existential impossibility of psi phenomena and the nearly 150 years of efforts during which there has been, literally, no progress, there are still scientists who continue to embrace the pursuit. (PsycINFO Database Record (c) 2019 APA, all rights reserved).