Sjoberg Behav Brain Funct (2017) 13:3
Logical fallacies in animal model research
Espen A. Sjoberg*
Background: Animal models of human behavioural deficits involve conducting experiments on animals with the hope of gaining new knowledge that can be applied to humans. This paper aims to address risks, biases, and fallacies associated with drawing conclusions when conducting experiments on animals, with a focus on animal models of mental illness.
Conclusions: Researchers using animal models are susceptible to a fallacy known as false analogy, where inferences
based on assumptions of similarities between animals and humans can potentially lead to an incorrect conclusion.
There is also a risk of false positive results when evaluating the validity of a putative animal model, particularly if the
experiment is not conducted double-blind. It is further argued that animal model experiments are reconstructions
of human experiments, and not replications per se, because the animals cannot follow instructions. This leads to an
experimental setup that is altered to accommodate the animals, and typically involves a smaller sample size than a
human experiment. Researchers on animal models of human behaviour should increase focus on mechanistic validity
in order to ensure that the underlying causal mechanisms driving the behaviour are the same, as relying on face valid-
ity makes the model susceptible to logical fallacies and a higher risk of Type 1 errors. We discuss measures to reduce
bias and risk of making logical fallacies in animal research, and provide a guideline that researchers can follow to
increase the rigour of their experiments.
Keywords: Argument from analogy, Conﬁrmation bias, Type 1 error, Animal models, Double-down eﬀect, Validity
A logical fallacy is a judgment or argument based on poor logical thinking. It is an error in reasoning, which usually means that either the line of reasoning is flawed, or the objects in the premise of the argument are dissimilar to the objects in the conclusion. Scientists are not immune to logical fallacies and are susceptible to making arguments based on unsound reasoning. For instance, a common fallacy is affirming the consequent. This involves the following line of reasoning: if A is true, then X is observed. We observe X, therefore A must be true. This argument is fallacious because observing X only tells us that there is a possibility that A is true: the rule does not specify that A follows X, even if X always follows A.1 Studies that have explicitly investigated this in a scientist sample found that 25–33% of scientists make the fallacy of affirming the consequent and conclude that X→A is a valid argument [2, 3].
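The invalidity of this inference can be checked mechanically. The short sketch below (not from the original paper) enumerates every truth assignment consistent with the rule "if A then X" and shows that observing X leaves both A-true and A-false worlds open:

```python
from itertools import product

def implies(p, q):
    """Material implication: 'if p then q' is false only when p holds and q does not."""
    return (not p) or q

# All (A, X) assignments consistent with the rule "if A is true, then X is observed".
consistent = [(a, x) for a, x in product([True, False], repeat=2) if implies(a, x)]

# Restrict to worlds where X is actually observed.
worlds_with_x = [(a, x) for a, x in consistent if x]
print(worlds_with_x)  # [(True, True), (False, True)]: X can hold while A is false
```

Because a world with A false and X true survives, X→A is not a valid inference from A→X.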
Making logical fallacies is a human condition, and there
is a large range of fallacies commonly committed [1, 4, 5].
In the present paper, we will focus on a select few that are
of particular relevance to animal model research, espe-
cially in the context of validity and reliability of conclu-
sions drawn from an experiment.
1 If you struggle to follow this line of reasoning, a concrete example makes it easier: If it is wine, then the drink has water in it. Water is in the drink. Therefore, it must be wine. Nowhere does the rule specify that only wine contains water as an ingredient, so simply making this observation does not allow us to conclude that it is wine.
Department of Behavioral Sciences, Oslo and Akershus University College
of Applied Sciences, St. Olavs Plass, P.O. Box 4, 0130 Oslo, Norway
The fallacy of affirming the consequent is connected with a tendency to seek evidence that confirms a hypothesis. Many scientists conduct their experiments under the assumption that their experimental paradigm is a legitimate extension of their hypothesis, and thus their results are used to confirm their beliefs. As an example, imagine a hypothesis that states that patients with bipolar disorder have reduced cognitive processing speed, and we do a reaction time test to measure this. Thus, a fallacious line of reasoning would be: if bipolar patients have reduced cognitive processing speed, then we will observe slower reaction times on a test. We observe a slower reaction time, and therefore bipolar patients have reduced cognitive processing speed. This would be affirming the consequent, because the observed outcome is assumed to be the result of the mechanism outlined in the hypothesis, but we cannot say with certainty that this is true. The results certainly suggest this possibility, and it may in fact be true, but the patients may have exhibited slower reaction times for a variety of reasons. If a significant statistical difference between bipolar patients and controls is found, it may be common to conclude that the results support the cognitive processing speed hypothesis, but in reality this analysis only reveals that the null hypothesis can be rejected, not necessarily why it can be rejected [6, 7]. The manipulation of the independent variable gives us a clue as to the cause of the rejection of the null hypothesis, but this does not mean that the alternative hypothesis is confirmed beyond doubt.
Popper claimed that hypotheses could never be confirmed; only falsified. He claimed that we could not conclude with absolute certainty that a statement is true, but it is possible to conclude that it is not true. The classic example is the white swan hypothesis: even if we have only observed white swans, we cannot confirm with certainty the statement "all swans are white", but if we observe a single black swan then we can reject the statement. Looking for confirmation (searching for white swans) carries the risk of drawing the wrong conclusion, which in this case is reached through induction. However, if we seek evidence that could falsify a hypothesis (searching for black swans), then our observations have the potential to reject our hypothesis. Note that rejecting the null hypothesis in statistical analyses is not necessarily synonymous with falsifying an experimental hypothesis. Null-hypothesis testing is a tool, and when we use statistical analyses we are usually analysing a numerical analogy of our experimental hypothesis.
When a hypothesis withstands multiple tests of falsification, Popper called it corroborated. We could argue that if a hypothesis is corroborated, then its likelihood of being true increases, because it has survived a gauntlet of criticism by science. However, it is important to note that Popper never made any such suggestion, as this would be inductive reasoning: exactly the problem he was trying to avoid! Even if a hypothesis has supporting evidence and has withstood multiple rounds of falsification, Popper maintained that it is not more likely to be true than an alternative hypothesis, and cannot be confirmed with certainty. Instead, he held that a corroborated theory could not be rejected without good reason, such as a stronger alternative theory. Popper may be correct that we cannot confirm a hypothesis with absolute certainty, but in practice it is acceptable to assume that a hypothesis is likely true if it has withstood multiple rounds of falsification, through multiple independent studies using different manipulations (see "Animal model experiments are reconstructions" section). However, in the quest for truth we must always be aware of the possibility, however slight, that the hypothesis is wrong, even if the current evidence makes this seem unlikely.
Confirmation bias is the tendency to seek information that confirms your hypothesis, rather than seeking information that could falsify it. This can influence the results when the experimenter is informed of the hypothesis being tested, and is particularly problematic if the experiment relies on human observations that have room for error. The experimenter's impact on the study is often implicit, and may involve subtly influencing participants or undermining methodological flaws, something also known as experimenter bias.
The tendency to express confirmation bias in science appears to be moderated by the field of study we belong to. Physicists, biologists, psychologists, and mathematicians appear to be somewhat better at avoiding confirmation bias than historians, sociologists, or engineers, although performance varies greatly from study to study [3, 15–18]. In some cases, the tendency to seek confirming evidence can be a result of the philosophy of science behind a discipline. For instance, Sidman's book Tactics of Scientific Research, considered a landmark textbook on research methods in behavior analysis [20–22], actively encourages researchers to look for similarities between their research and others', which is likely to increase confirmation bias.
Confirmation bias has been shown in animal research as well, but this bias is reduced when an experiment is conducted double-blind. Van Wilgenburg and Elgar found that 73% of non-blind studies reported a significant result supporting their hypothesis, while this was the case in only 21% of double-blind studies. An interesting new approach to reducing confirmation bias in animal research is to fully automate the experiment [24, 25]. This involves setting up the equipment and protocols
in advance, so that large portions of an experiment can be run automatically, with minimal interference by the experimenter. Along with double-blinded studies, this is a promising way to reduce confirmation bias in animal research.
It is important to note that the confirmation bias phenomenon occurs as an automatic, unintentional process, and is not necessarily a result of deceptive strategies. As humans, we add labels to phenomena and establish certain beliefs about the world, and confirmation bias is a way to cement these beliefs and reinforce our sense of identity.2 Scientists may therefore be prone to confirmation bias due to a lack of education on the topic, and not necessarily because they are actively seeking to find corroborating evidence.
Argument from analogy and animal models
The issues reported in this paper apply to all of science, and we discuss principles and phenomena that any scientist would hopefully find useful. However, the issues will primarily be discussed in the context of research on animal models, as some of the principles have special applications in this field. In this section, we outline how an animal model is defined, and problems associated with arguing from analogy in animal research.
Defining an animal model
The term "animal model" is not universally defined in the literature. Here, we define an animal model as an animal sufficiently similar to a human target group in its physiology or behaviour, based on a natural, bred, or experimentally induced characteristic in the animal, and whose purpose is to generate knowledge that may be extrapolated to the human target group. In this article, we focus on translational animal models in the context of behavioural testing, which usually involve a specific species or strain, or an animal that has undergone a manipulation prior to testing.
An animal model can of course model another non-human animal, but for the most part its aim is to study human conditions indirectly through animal research. That research is conducted on animals does not necessarily mean that the animal acts as a model for humans. It is only considered an animal model when its function is to represent a target group or condition in humans, e.g. people with depression, autism, or brain injury. The current paper focuses on animal models of mental illness, but animal models as a whole represent a large variety of conditions, and are particularly common in drug trials. See Table 1 for an overview of common animal models of mental illnesses.
2 Thanks to Rachael Wilner for pointing out this argument.
It should also be noted that the term "animal model" refers to an animal model that has been validated at least to some extent, while a model not yet validated is referred to as a "putative animal model". That a model is "validated" does not mean that the strength of this validation cannot be questioned; it merely means that previous research has given the model credibility in one way or another.
In research on animal models, scientists sometimes use an approach called the argument from analogy. This involves making inferences about a property of one group, based on observations from a second group, because both groups have some other property in common. Analogies can be very useful in our daily lives as well as in science: a mathematical measurement, such as "one meter", is essentially an analogy where numbers and quantities act as representations of properties in nature. When applying for a job, a person might argue that she would be a good supervisor because she was also a good basketball coach, as the jobs have the property of leadership in common. Concerning animal models, arguing from analogy usually involves making inferences about humans, based on an earlier observation where it was found that the animals and humans have some property in common. Arguing from analogy is essentially a potentially erroneous judgment based on similarities between entities. However, this does not make the argument invalid by default, because the strength of the argument relies on: (1) how relevant the property we infer is to the property that forms the basis of the analogy; (2) to what degree the two groups are similar; and (3) whether there is any variety in the observations that form the basis of the analogy.
Animal models themselves are analogies, as their existence is based on the assumption that they are similar to a target group in some respect. If the two things we are drawing an analogy between are similar enough that we can reasonably expect them to correlate, an argument from analogy can be strong! However, when we draw the conclusion that two things share a characteristic, because we have established that they already share another, different characteristic, then we are at risk of making the fallacy of false analogy.
The false analogy
A false analogy is essentially an instance in which an argument based on an analogy is incorrect. This can occur when the basis of similarity between objects does not justify the conclusion that the objects are similar in some
other respect. For instance, if Jack and Jill are siblings, and Jack has the property of being clumsy, we might infer that Jill is also clumsy. However, we have no information to assert that Jill is clumsy, and the premise for our argument is based solely on the observation that Jack and Jill have genetic properties in common. We are assuming that clumsiness is hereditary, and therefore this is probably a false analogy. Note that knowledge gained later may indicate that, in fact, clumsiness is hereditary, but until we have obtained that knowledge we are operating under assumptions that can lead to false analogies. Even if clumsiness were hereditary, we could still not say with absolute certainty that Jill is clumsy (unless genetics accounted for 100% of the variance). This new knowledge would mean that our analogy is no longer false, as Jill's clumsiness can probably at least in part be explained by genetics, but we are still arguing from analogy: we cannot know for certain whether Jill is clumsy based solely on observations of Jack.
The false analogy in animal models
With animal models, the false analogy can occur when one group (e.g. an animal) shares some characteristics with another group (e.g. humans), and we assume that the two groups also share other characteristics. For instance, because chimpanzees can follow the gaze of a human, it could be assumed that these non-human primates understand what others perceive, essentially displaying theory of mind [28–30]. However, Povinelli et al. argue that this is a false analogy, because we are drawing conclusions about the inner psychological state of the animal based on behavioural observations. It may appear that the animal is performing a behaviour that requires complex thinking, while in reality it only reminds us of complex thinking, most likely because we are anthropomorphizing the animal's behaviour, particularly through the assumption that the mind of an ape is similar to the mind of a human. A different example would be birds that are able to mimic human speech: the birds are simply repeating sounds, and we are anthropomorphising if we believe the birds actually grasp our concept of language.
Robbins pointed out that homology is not guaranteed between humans and primates, even if both the behavioural paradigm and the experimental result are identical for both species: different processes may have been used by the two species to achieve the same outcome. Since an animal model is based on common properties between the animal and humans, we may assume that new knowledge gained from the animal model is also applicable to humans. In reality, the results are only indicative of evidence in humans.
Arguing from analogy, therefore, involves the risk of applying knowledge gained from the animal to humans, without knowing with certainty whether this application holds. Imagine the following line of reasoning: we find result A in a human experiment, and in an animal model we also find result A, establishing face validity for the animal model. Consequently, we then conduct a different experiment on the animal model, finding result B. If we assume that B also exists in humans, without trying to recreate these results in human experiments, then we are arguing from analogy, potentially drawing a false conclusion.
Illustration: argument from analogy in the SHR model
An illustration of the argument from analogy comes from the SHR (spontaneously hypertensive rat) model of ADHD (Attention-Deficit/Hyperactivity Disorder) [35, 36].

Table 1 A summary of some available animal models of mental illnesses, where the animals themselves act as the model for the target group. The animals are genetically modified, bred for a specific trait, or manipulated in some physiological fashion (e.g. a lesion or drug injection).

  Anxiety: serotonin receptor 1A knockout mice; corticosterone-treated mice
  Attention-Deficit/Hyperactivity Disorder: Spontaneously Hypertensive rat; thyroid receptor β1 transgenic mice
  Autism: valproic acid rat
  Depression: corticosterone-treated rats and mice; chronic mild stress rats and mice
  Obsessive-Compulsive Disorder: quinpirole-treated rats
  Post-Traumatic Stress Disorder: congenital learned helpless rat
  Schizophrenia: ventral hippocampus lesioned rats; methylazoxymethanol acetate-treated rats; developmental vitamin D deficient rats

Compared to controls, usually the Wistar Kyoto rat
(WKY), the SHRs exhibit many of the same behavioural
deﬁcits observed in ADHD patients, such as impulsive
behaviour [37–42], inattention [35, 37], hyperactivity [37,
43], and increased behavioural variability [44–47].
One measure of impulsive behaviour is a test involving delay discounting. In this paradigm, participants are faced with the choice of either a small, immediate reinforcer or a larger, delayed reinforcer. Both ADHD patients and SHRs tend to show a preference for the smaller reinforcer as the delay between response and reinforcer increases for the large reinforcer. Research on delay discounting with ADHD patients suggests that they are delay averse, meaning that impulsivity is defined as making choices that actively seek to reduce trial length (or overall delay) rather than seeking immediacy [48–56], though this is usually achieved by choosing a reinforcer with a short delay.
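Delay discounting is often formalized with a hyperbolic discount function, V = A/(1 + kD), where a larger discount rate k means a delayed reinforcer loses subjective value faster. This formalism is not taken from the present paper; the sketch below only illustrates how a steeper k shifts choice toward the small, immediate reinforcer at shorter delays, while remaining silent about why (delay aversion, immediacy, or something else):

```python
def discounted_value(amount, delay, k):
    """Hyperbolic discounting: subjective value V = A / (1 + k * D)."""
    return amount / (1 + k * delay)

def prefers_large(small, large, delay, k):
    # The small reinforcer is immediate (delay 0); the large one is delayed.
    return discounted_value(large, delay, k) > discounted_value(small, 0, k)

def switch_point(small, large, k, max_delay=200):
    """First delay at which preference flips to the small, immediate reinforcer."""
    return next(d for d in range(max_delay) if not prefers_large(small, large, d, k))

# A steeper discounter (higher k) abandons the large reinforcer much earlier.
print(switch_point(1, 3, k=0.05))  # 40
print(switch_point(1, 3, k=0.5))   # 4
```

Two simulated animals with different k values show the same qualitative preference shift, which is exactly why the behavioural curve alone cannot identify the underlying mechanism.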
There is no direct evidence to suggest that SHRs operate by the same underlying principles as ADHD patients. Studies on delay discounting using SHRs tend to manipulate the delay period between response and reinforcer delivery, but do not compare the results with alternative explanations. This is because the rats cannot be told the details of the procedure (e.g. whether the experiment ends after a specific time or a specific number of responses). Therefore, most authors who have investigated delay discounting usually avoid the term delay aversion. However,
some authors make the argument from analogy, assuming that the rats show a similar effect to ADHD children: Bizot et al. concluded that "…SHR are less prone to wait for a reward than the other two strains, i.e. exhibit a higher impulsivity level… (p. 220)", and Pardey, Homewood, Taylor and Cornish concluded that "…SHRs are more impulsive than the WKY as they are less willing to wait for an expected reinforcer (p. 170)." Even though the evidence shows that the SHRs' preference for the large reinforcer drops with increased delay, we cannot conclude with certainty that this occurs because the SHRs do not want to wait. The experimental setup does not tell us anything conclusive about the animal's motivation, nor its understanding of the environmental conditions. Hayden has argued that the delay discounting task is problematic for measuring impulsivity in animals because it is unlikely that the animals understand the concept of the inter-trial interval. Furthermore, if the SHRs were less willing to wait for a reinforcer, then we may argue that this shows immediacy, and not necessarily delay aversion. In this case, it may instead support the dual pathway model of ADHD, which takes into account both delay aversion and an impulsive drive for immediate reward [56, 61, 62].
Assuming that the rats are delay averse or impulsive is arguing from analogy. The evidence may only suggest that the rats are impulsive, not necessarily why they are impulsive. The results may also not speak to whether the reason for this behaviour is the same in ADHD and SHRs (mechanistic validity; see the "Mechanistic validity" section). If we were to manipulate the magnitude of the large reinforcer, we would also find a change in performance [57, 63]. How do we know that the SHRs are sensitive to temporal delays, and not to other changes in the experimental setup, such as the inter-trial interval, reinforcer magnitude, or the relative long-term value of the reward?
The validity criteria of animal models
Before any further discussion of logical fallacies in animal models, the validity criteria of these models must be addressed. We must also point out that there are two approaches to animal model research: (1) validating a putative animal model, and (2) conducting research on an already validated model.
When asserting the criteria for validating a putative animal model, the paper by Willner is often cited, which claims that the validity of an animal model rests on its face, construct, and predictive validity. This means that the model must appear to show the same symptoms as the human target group (face validity), that the experiment measures what it claims to measure and can be unambiguously interpreted (construct validity), and that it can make predictions about the human population (predictive validity). However, there is no universally accepted standard for which criteria must be met in order for an animal model to be considered valid, and the criteria employed may vary from study to study [66–70].
Based on this, Belzung and Lemoine attempted to broaden Willner's criteria into a larger framework, proposing nine validity criteria that assess the validity of animal models for psychiatric disorders. Tricklebank and Garner have argued that, in addition to the three criteria by Willner, a good animal model must also be evaluated on how it controls for third-variable influences (internal validity), to what degree results can be generalized (external validity), whether measures expected to relate actually do relate (convergent validity), and whether measures expected not to relate actually do not relate (discriminant validity). These authors argue that no known animal model currently fulfils all of these criteria, but we might not expect them to; what is of utmost importance is that we recognize the limitations of an animal model, including its applications. Indeed, it could be argued that a reliable animal model need not tick all the validity boxes as long as it has predictive validity, because in the end its foremost purpose is to make empirical predictions about its human target group. However, be aware that arguing from analogy
reduces the model’s predictive validity, because its pre-
dictive capabilities may be limited to the animal studied.
Behavioural similarity between a putative model and its human target group is not sufficient grounds to validate a model. In other words, face validity is not enough: arguably, mechanistic validity is more important. This is a term that normally refers to the underlying cognitive and biological mechanisms of the behavioural deficits being identical in both animals and humans, though we can extend the definition to include external variables affecting the behaviour, rather than attributing causality only to internal, cognitive events. Whether the observed behaviour is explained in terms of neurological interactions, cognitive processes, or environmental reinforcement depends on the case in question, but the core of the matter is that mechanistic validity refers to the cause of the observed behavioural deficit or symptom. If we can identify the cause of the observed behaviour in an animal model, and in addition establish that this is also the cause of the same behaviour in humans, then we have established mechanistic validity. This validity criterion does not speak to what has triggered the onset of a condition (trigger validity), or what made the organism vulnerable to the condition in the first place (ontopathogenic validity), but rather to what factors are producing the specific symptoms or behaviour. For instance, falling down the stairs might have caused a brain injury (trigger validity), and this injury in turn reduced dopamine transmission in the brain, which led to impulsive behaviour. When an animal model is also impulsive due to reduced dopamine transmission, we have established mechanistic validity (even if the trigger was different).
The validity of models of conditions with limited etiology
Face validity has been argued to be of relatively low importance in an animal model, because it does not speak to why the behaviour occurs [33, 69], i.e. the evidence is only superficial. However, it could be argued that face validity is of higher importance in animal models of ADHD, because the complete etiology underlying the condition is not yet fully known, and therefore an ADHD diagnosis is based entirely on behavioural symptoms.
There is limited knowledge of the pathophysiology of many of the mental illnesses in the Diagnostic and Statistical Manual of Mental Disorders; depression and bipolar disorder are examples of heterogeneous conditions for which animal models have been difficult to establish [75, 76]. When dealing with a heterogeneous mental disorder, it is inherently harder for animal models to mimic the behavioural deficits, particularly a range of different deficits [75, 77–80]. We could argue, therefore, that mechanistic validity in animal models is difficult, if not impossible, to establish from the outset when our knowledge of causality in humans might be limited.
Models can be holistic or reductionist
Animal models can be approached with different applications in mind: a model can aim to be holistic or reductionist. A holistic approach assumes that the model is a good representation of the target group as a whole, including all or most symptoms and behavioural or neurological characteristics. Alternatively, a reductionist approach uses an animal model to mimic specific aspects of a target group, such as only one symptom. This separation may not be apparent, because animal models are usually addressed as if they are holistic; for instance, the valproic acid (VPA) rat model of autism is typically just labelled an "animal model of autism" in the title or text, but experiments typically investigate specific aspects of autism [82–84]. This does not mean that the model is not holistic, but rather that its predictive validity is limited to the aspects of autism investigated so far. Similarly, the SHR is typically labelled an "animal model of ADHD", but it has been suggested that the model is best suited for the combined subtype of ADHD [36, 73], while Wistar Kyoto rats from Charles River Laboratories are more suited for the inattentive subtype. The point of this distinction between holistic and reductionist approaches is to underline that animal models have many uses, and falsifying a model in the context of one symptom does not mean the model has become redundant. As long as the model has predictive validity in one area or another, it can still generate hypotheses and expand our understanding of the target group, even if the model is not a good representation of the target group as a whole. Indeed, an animal model may be treated as holistic until it can be empirically shown that it should in fact be treated as reductionist. However, researchers should take care not to assume that a model is holistic based on just a few observations: this would be arguing from analogy, and bears the risk of making claims about humans that are currently not established empirically. The exact applications and limitations of an animal model should always be clearly defined [33, 86].
Animal model experiments are reconstructions
The terms "replicate" and "reproduce" are often used interchangeably in the literature, but with regard to animal models their distinction is particularly important. Replication involves repeating an experiment using the same methods as the original experiment, while a reproduction involves investigating the same phenomenon using different methods. Replications assure that the
eﬀects are stable, but a reproduction is needed to ensure
that the eﬀect was not due to methodological issues.
We suggest a third term, reconstruction, which has special applications in animal models. A reconstruction involves redesigning an experiment, while maintaining the original hypothesis, in order to accommodate a different species. When an animal experiment aims to investigate a phenomenon previously observed in humans, we have to make certain changes for several reasons. First, the animals are a different species than humans, and have a different physiology and life experience. Second, the animals do not follow verbal instructions and must often (but not always) be trained to respond. Third, the experimental setup must often be amended so that a behaviour equivalent to a human behaviour is measured. A fourth observation is that animal studies tend to use smaller sample sizes than human experiments, which makes them more likely to produce large effect sizes when a significant result is found.
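This inflation of significant effect sizes in small samples (sometimes called the winner's curse) can be demonstrated with a quick simulation. The code below is an illustrative sketch, not an analysis from the paper, and it screens for "significance" with a crude two-standard-error threshold on Cohen's d rather than a real t-test:

```python
import random
import statistics

def cohens_d(a, b):
    """Standardized mean difference with a pooled standard deviation."""
    pooled = ((statistics.stdev(a) ** 2 + statistics.stdev(b) ** 2) / 2) ** 0.5
    return (statistics.mean(a) - statistics.mean(b)) / pooled

def mean_significant_d(n, true_d=0.5, runs=2000, seed=1):
    """Average observed effect size among 'significant' experiments of size n per group."""
    rng = random.Random(seed)
    threshold = 2 * (2 / n) ** 0.5  # rough two-standard-error cutoff for d
    significant = []
    for _ in range(runs):
        treated = [rng.gauss(true_d, 1) for _ in range(n)]
        control = [rng.gauss(0.0, 1) for _ in range(n)]
        d = cohens_d(treated, control)
        if abs(d) > threshold:
            significant.append(d)
    return statistics.mean(significant)

# With n = 8 per group, the average significant effect overshoots the true
# d = 0.5 far more than with n = 64 per group.
print(mean_significant_d(8), mean_significant_d(64))
```

Only experiments that clear the significance threshold get reported, and with a small n that threshold lies well above the true effect, so the published estimate is biased upward.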
An animal model experiment actively attempts to reconstruct the conditions under which an effect was observed in humans, but makes alterations so that we can be relatively certain that an equivalent effect is observed
in the animals (or vice versa, where a human experiment
measures an equivalent eﬀect to what was observed in an
animal study). This calls into question the construct validity of the study: how certain are we that the task accurately reflects the human behaviour we are investigating?
Another problem associated with reconstruction is the standardization fallacy. This refers
to the fact that animal experiments are best repli-
cated if every aspect of the experiment is standard-
ized. However, by increasing experimental control
we lose external validity, meaning that the results are
less likely to apply to other situations. The difficulty is therefore to find a balance between the two,
and finding this balance may depend on the research
question we seek to answer [33, 92]. One approach
is to initially begin with replications, and if these are
successful move on to perform reproductions, and
eventually reconstructions. This is essentially what
van der Staay, Arndt and Nordquist [33] have previously suggested: successful direct replication is followed by extended replication where modifications
are made within the procedure, the animal's environment (e.g. housing or rearing), or their sex. Should the effect persist, then we have systematically
established a higher degree of generalization without
losing internal validity. At the final stage, quasi-repli-
cations are conducted using different species, which
is similar to our concept of reconstructions, and it is
at this stage that the translational value of the findings is evaluated.
The double‑down effect
When we run animal model experiments, we have to use
a control group for comparison. When we are evaluating
a putative model, we are therefore indirectly evaluating
both animal groups for their appropriateness as an ani-
mal model for the phenomenon in question, even if we
hypothesized beforehand that just one group would be
suitable, and this is the double-down eﬀect. If we were to
discover that the control group, rather than the experi-
ment group, shows the predicted characteristic, then
it may be tempting to use hindsight bias to rationalize
that the result was predicted beforehand, something that
should always be avoided! In actuality, this is an occasion
that can be used to map the observable characteristics of
the animals, which is called phenotyping. This may show
that the control group has a property that makes them
a suitable candidate as a new putative model. Follow-up
studies can then formally evaluate whether this putative animal model has validity. This approach is perfectly
acceptable, provided that the initial discovery of the control group's suitability is seen as suggestive rather than conclusive until further studies provide more evidence.
When an animal model has already been validated, the
double-down eﬀect still applies: we are still indirectly
evaluating two animal groups at once, but it is less likely that the control group will display the relevant characteristic, due to the previous validation. Failure to replicate
previous ﬁndings can be interpreted in many ways; it
could be an error in measurement, diﬀerences in experi-
mental manipulations, or that the animal model is simply
not suitable as a model in this speciﬁc paradigm (but still
viable in others). Should we observe that controls express
a phenomenon that was expected of the experimental
group, then we should replicate the study to rule out that
the finding occurred by chance or through some methodological error. This may lead us to suggest the control
group as a putative model, pending further validation.
The double‑down effect and the file drawer problem
Since the purpose of animal models is to conduct research on non-human animals with the aim of advancing knowledge about humans, the animal model and the human condition it mimics must inevitably be similar in some respect. If they were not, then the pursuit of the model would be redundant. Therefore, from the outset, there is likely to be publication bias in favour of data
that shows support for a putative animal model, because
otherwise it has no applications.
The double-down effect of evaluating two animal
groups at once makes animal models particularly susceptible to the file drawer problem. This is a problem where
the literature primarily reﬂects publications that found
signiﬁcant results, while null results are published less
frequently [93, 94]. This aversion to the null creates what
Ferguson and Heene called “undead theories”, which are
theories that survive rejection indeﬁnitely, because null
results that refute them are not published. The origin of this trend is not entirely clear, but it probably arose by treating the presence of a phenomenon
as more interesting than its absence. Once an eﬀect has
been documented, replications may now be published
that support the underlying hypothesis.
The file drawer effect is probably related to the sunk-cost effect: a tendency to continue with a project because of prior investment, rather than switching to a more viable alternative. Thus, if we publish null results, it may
seem that previous publications with signiﬁcant ﬁndings
were wasteful, and we may feel that we are contributing
towards dissent rather than towards ﬁnding solutions. It
may be in the researcher’s interest to ﬁnd evidence sup-
porting the theory in order to justify their invested time,
thus becoming victim of conﬁrmation bias.
Furthermore, if null results are found, they might be
treated with more skepticism than a signiﬁcant result.
This is, of course, a fallacy in itself, as both results should be treated the same: why would a null result be subjected to more scrutiny than a significant result? When the
CERN facility recorded particles travelling faster than the
speed of light, the observation appeared to falsify the theory of relativity. This result was met with skepticism, and it was assumed that it was due to a measurement error (which in the end it turned out to be). Nevertheless, if the result had supported relativity, would the
degree of skepticism have been the same?
In the context of animal studies, the double-down
eﬀect makes it more likely that a signiﬁcant result is
found when comparing two animal groups. Either
group may be a suitable candidate for a putative animal
model, even if only one group was predicted to be suitable beforehand. If any result other than a null result counts as support for an animal model (or a putative model), then multiple viable models will be present in the literature, all of which will be hard to falsify (as falsifying one
model may support another). Indeed, this is currently the
case for animal models, where there are multiple avail-
able models for the same human conditions [80, 99–103].
The file drawer problem is a serious issue in science, and the trend may often be invisible to the naked eye, but meta-analysis offers multiple tools to help detect publication bias in the literature.
Measures to improve animal model research
The main purpose of this paper was to address several risks and fallacies that may occur in animal model research, in order to encourage a rigorous scientific pursuit in this field. We do not intend to discourage researchers from using animal models, but rather hope to increase awareness of the potential risks and fallacies involved. To give an overview of the issues addressed in the paper, we have created a list for researchers to consult when designing animal experiments and interpreting their data.
1. Be aware of your own limitations. Some of the falla-
cies and risks addressed in this paper may be una-
voidable for a variety of reasons. Nevertheless, the
ﬁrst step towards improving one’s research is to be
aware of the existence of these risks. When writing
the discussion section of a report, it may be neces-
sary to point out possible limitations. Even if they are
not explicitly stated, it is still healthy for any scientist
to be aware of them.3
2. Establish predictive and mechanistic validity. If you
are attempting to validate a putative animal model,
ensure that the experiment is as similar as pos-
sible to experiments done on humans. If this is not
possible, explain why in the write-up. If the experi-
ment is novel, and the animal model is already vali-
dated through previous research, then this principle
does not necessarily apply, because the purpose is to
uncover new knowledge that may be translated to
humans. In such instances, a new hypothesis gains
validity in a follow-up experiment on humans.
Remember that there are several criteria available
for validating an animal model, but there is no uni-
versal agreement on which set of criteria should be
followed. However, the two most important crite-
ria are arguably predictive validity and mechanistic
validity, because face validity is prone to logical fal-
lacies. Establishing mechanistic validity ensures that
the mechanisms causing the observed behaviour are
the same in the model and humans, while establish-
ing predictive validity means that knowledge gained
from the model is more likely to apply to humans.
3. Deﬁne an a priori hypothesis and plan the statistical
analysis beforehand. It is crucial to have an a priori
hypothesis prior to conducting the experiment,
otherwise one might be accused of data dredging
and reasoning after-the-fact that the results were
expected [107, 108]. When validating a putative ani-
mal model, this drastically reduces the double-down
eﬀect. If the data do not show the predicted pat-
tern then it is perfectly acceptable to suggest a new
3 The author of this manuscript once gave a conference talk where he suggested the possibility that one of his own research results may have been influenced by confirmation bias. Never assume that only others are prone to bias: even authors of logical fallacy papers may commit fallacies!
hypothesis and/or a putative animal model for further research.
When designing the experiment, keep in mind which
statistical analysis would be appropriate for analysing
the data. If the statistical method is chosen post hoc,
then it may not correspond to the chosen design,
and one might be accused of data dredging, which
involves choosing a statistical procedure that is more
likely to produce significant results. Also, keep
in mind which post hoc tests are planned, and that
the correct one is chosen to reduce familywise error
when there are multiple comparisons to be made. It
is highly recommended that eﬀect sizes are reported
for every statistical test: this will give insight into the
strength of the observed phenomenon, and also allow
a more detailed comparison between studies.
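The familywise-error correction mentioned above can be illustrated with a minimal Holm-Bonferroni step-down procedure (a generic sketch; standard statistical packages provide equivalent routines):

```python
def holm_bonferroni(pvals, alpha=0.05):
    """Holm step-down procedure: controls the familywise error rate
    across multiple comparisons. Returns one reject/retain decision
    per p value, in the original order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    reject = [False] * m
    for rank, i in enumerate(order):
        if pvals[i] <= alpha / (m - rank):  # threshold loosens as tests pass
            reject[i] = True
        else:
            break  # once one ordered test fails, all larger p values fail too
    return reject

# Four planned post hoc comparisons: only the smallest p survives correction here.
print(holm_bonferroni([0.001, 0.04, 0.03, 0.20]))
```

Note that p = 0.03 and p = 0.04, which would both pass an uncorrected 0.05 threshold, are retained as null once the number of comparisons is taken into account.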
4. Do a power analysis. For logistical, practical, or eco-
nomic reasons, animal model research may be forced
to use sample sizes smaller than what is ideal. Nev-
ertheless, one should conduct a power analysis to
ascertain how many animals should be tested before
the experiment starts. When doing multiple com-
parisons, it may be diﬃcult to establish the sample
size because the power analysis may only grant the
sample size of an omnibus analysis (the analysis of
the whole, not its individual parts), and not what is
required to reach significance with post hoc tests. If all the post hoc analyses are of equal interest,
choose the sample size required to achieve power of
0.8 in all comparisons. Alternatively, use a compar-
ison-of-most-interest approach where the sample
size is determined by the power analysis of the post
hoc comparison that is of highest interest. If
a power analysis is not conducted, or not adhered
to, it may be prudent to use a sample size similar to
previously conducted experiments in the literature,
and then do a post hoc power analysis to determine
the power of your study. Once the experiment is completed and the data analysed, one must never increase the sample size, because this will increase the chances of finding a significant result (confirmation bias) [109, 111, 112].
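The relationship between the expected effect size and the required number of animals can be sketched with a rough normal-approximation power calculation (an illustrative shortcut; dedicated software such as G*Power uses exact t distributions and yields slightly larger numbers):

```python
from math import ceil
from statistics import NormalDist

def n_per_group(d, alpha=0.05, power=0.8):
    """Approximate per-group n for a two-sided, two-sample comparison of
    means with standardized effect size d (normal approximation)."""
    z = NormalDist().inv_cdf
    return ceil(2 * ((z(1 - alpha / 2) + z(power)) / d) ** 2)

# Halving the expected effect size roughly quadruples the animals needed:
print(n_per_group(0.8), n_per_group(0.4))
```

This quadratic dependence on the effect size is why an honest a priori estimate of the expected effect matters so much when animal numbers are constrained.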
5. Double-blind the experiment. By doing the experi-
ment double-blind, we severely reduce the risk of
confirmation bias. This means that the experimenter
is blind to the a priori hypothesis of the study, as
well as what group each animal belongs to. How-
ever, in some cases it may be diﬃcult or impossible
to do this. For instance, if the experimental group
has a phenotype that distinguishes them from con-
trols (e.g. white vs. brown rats), then it is diﬃcult to
blind the experimenter. For logistical and monetary
reasons it may also be impractical to have a qualiﬁed
experimenter who is blind to the relevant literature
of the study. Also, avoid analysing data prior to the
experiment’s completion, because if the data are not
in line with your predictions then one might implic-
itly inﬂuence the experiment to get the data needed
(experimenter bias). Be aware that it is nevertheless perfectly acceptable to inspect the data on occasion without statistically analysing it, just to ensure
that the equipment is working as it is supposed to
(or state in advance at what point it is acceptable to
check the data, in case there are circumstances where
you may want to terminate the experiment early).
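The group-blinding part can be implemented in practice by letting a colleague (or a script) assign neutral codes and keeping the code-to-group key sealed until data collection is complete. A hypothetical sketch (the function name and coding scheme are our own, not from any standard tool):

```python
import random

def blind_assignment(animal_ids, seed=2024):
    """Randomly split animals into two equal groups and issue neutral
    codes. The experimenter works from the codes only; the sealed `key`
    linking each code to animal and group is opened after data collection."""
    rng = random.Random(seed)
    ids = list(animal_ids)
    rng.shuffle(ids)
    half = len(ids) // 2
    group_of = {a: "experimental" for a in ids[:half]}
    group_of.update({a: "control" for a in ids[half:]})
    coded = {f"S{i:02d}": a for i, a in enumerate(rng.sample(ids, len(ids)), start=1)}
    key = {code: (animal, group_of[animal]) for code, animal in coded.items()}
    return sorted(key), key  # codes for the experimenter, sealed key for later

codes, key = blind_assignment([f"rat{i}" for i in range(1, 9)])
print(codes)  # the experimenter sees only S01..S08
```

Keeping the key with a person who does not handle the animals preserves the blind even when, as noted above, full blinding to the hypothesis is impractical.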
6. Avoid anthropomorphizing. While we inevitably describe our results in the context of human understanding and language, we must be careful not to attribute human-like qualities to the animals.
Avoid making inferences about the animal’s thoughts,
feelings, inner motivation, or understanding of the
situation. We can report what the animals did, and
what this means in the context of our hypothesis, but
take care not to make assumptions of the inner work-
ings of the animal.
7. Avoid arguing from analogy. No matter how vali-
dated an animal model is, we cannot be certain that
a newly observed eﬀect also applies to humans. If
research on an animal model yields new information
that could give insight into the human target group,
make sure to state that the data are suggestive, not conclusive, pending further validation. Remember
that the strength of an animal model is to generate
new knowledge and hypotheses relevant to the target
group, including the assessment of potentially useful
treatments, but that these new possibilities remain hypothetical at the point of discovery.
8. Attempt to publish, despite a null result. If you pre-
dicted a speciﬁc result based on trends in the litera-
ture, but failed to ﬁnd this result, do not be discour-
aged from publishing the data (especially if you failed
to replicate a result in a series of experiments). This
is particularly important if the experiment had a low
sample size, as null results from such studies are
probably the least likely to be published, thus fuelling
the ﬁle drawer problem. By making the data avail-
able via either an article (for instance through Jour-
nal of Articles in Support of the Null Hypothesis) or a
dataset online, then you are actively contributing to
reduce the ﬁle drawer problem.
9. Replicate, reproduce, and reconstruct. Replicating an experiment in order to establish internal validity and
reliability of an animal model is essential. When rep-
licating experiments multiple times, we reduce the
risk that the original ﬁnding was a chance result. If
previous replications have succeeded, then attempt
to include a new hypothesis, experimental manipu-
lation, or follow-up experiment during the study to
expand our knowledge of the research question. This
process establishes both internal and external valid-
ity. Finally, reconstruct the experiment on humans,
so that the findings may be translated across species.
A note on neurological similarities
The principles discussed in this paper have been
addressed in a behavioural context, but it should be
noted that they also apply to neurological evidence for
animal models, though increasing the validity in this case
can operate somewhat diﬀerently.
When we ﬁnd neurological elements that are the same
in both the animal model and the human target group
(that do not exist in controls), we should be careful about drawing conclusions based on this alone. Just like behavioural
evidence, the links are suggestive and not necessarily con-
clusive. It is risky to assume that the physiological prop-
erties shared between humans and animals operate the
same way. In drug research, over 90% of drugs that show
eﬀectiveness on animal models fail to work on humans,
a problem called attrition. In the context of animal
models of mental illness, Belzung and Lemoine [71] proposed the concept of biomarker validity, which means that
the function of a neurological mechanism is the same
in the animal model and humans, even if the biomarker
responsible for this function may be diﬀerent across the
species. In other words, the two species may have diﬀer-
ent biological markers, but as long as they operate the
same way, and in turn produce the same symptoms, then
this adds validity to the model.
Of course, in reality things are not this simple. Neuro-
logical evidence is usually not based on the presence of
a single component, but rather multiple elements such
as rate of neurotransmitter release, reuptake, polymor-
phism, neural pathways, drug effectiveness, or a combination of factors. The core message is that we must be
aware that ﬁnding similar neurological elements in both
animals and humans does not mean that they operate the
same way. If we make this assumption, we are arguing from analogy.
It should be noted that conﬁrmation bias could also be
a problematic issue in neuroscientific research. Garner illustrates this with a car example: if we believe that
the gas pedal of a car is the cause of car accidents, then
removing the gas pedal from a car will drastically reduce
the accident rate of that car, conﬁrming that indeed the
gas pedal was the cause of car accidents. In neuroscience,
we may knock out a gene or selectively breed strains to
add or remove a genetic component. When the hypoth-
esized behaviour is shown (or not shown), we might
conclude that we have confirmed our hypothesis. The conclusion could be wrong because it is based on correlation, and thus future replications of this result are likely to repeat the same logical error.
Conclusions
In this paper, it has been discussed how animal models can be susceptible to logical fallacies, bias, and a risk of obtaining results that could give a false sense of support for
a putative animal model. Researchers should remember that behavioural results found in an animal model of a human condition do not guarantee that this knowledge is applicable to humans. Replicating, reproducing
and reconstructing results over numerous studies will
drastically reduce the probability that the results are
similar by chance alone, although this does not necessar-
ily shed light on why the behaviour occurs. Researchers
should therefore be encouraged to investigate mecha-
nistic validity, meaning what underlying processes are
causing the behaviour. By simply looking at face validity, we have an increased risk of making errors through logical fallacies.
Animal models can be very useful for investigating the
mechanisms behind a human condition. This new knowledge can help improve our understanding and treatment
of this condition, but the researcher must not assume that
the observed animal behaviour also applies to humans.
Ultimately, animal models only provide solid evidence for
the animal used, and indicative evidence of human behav-
iour. However, this is also the strength of animal models:
indicative evidence may open the door to new ideas about
human behaviour that were not previously considered.
Through reconstructions, it can be established whether or
not the phenomenon exists in humans, and if the model
has mechanistic validity and predictive validity then this
certainly increases the application of the model, as well as
its value for the progress of human health.
Abbreviations
ADHD: Attention-Deficit/Hyperactivity Disorder; CERN: European Organization for Nuclear Research; DSM: Diagnostic and Statistical Manual of Mental Disorders; SHR: spontaneously hypertensive rat; VPA: valproic acid rat; WKY: Wistar Kyoto rat.
Acknowledgements
Rachael Wilner gave valuable insight and feedback throughout multiple versions of the manuscript, especially into improving the language and structure of the paper, as well as clarifying several arguments. A conversation with Øystein Vogt was largely inspirational in terms of writing this article. Magnus H. Blystad gave feedback that substantiated several claims, particularly the neurology section. Espen Borgå Johansen offered critical input on several occasions, which led to some arguments being empirically strengthened. Carsta Simon's feedback improved some of the definitions employed in the article. Other members of the research group Experimental Behavior Analysis: Translational and Conceptual Research, Oslo and Akershus University College, are to be thanked for their contribution and feedback, particularly Per Holth, Rasmi Krippendorf, and Monica Vandbakk.
Competing interests
The author declares that he has no competing interests.
Received: 21 January 2016 Accepted: 1 February 2017
References
1. Salmon M. Introduction to logic and critical thinking. Boston: Wadsworth Cengage Learning; 2013.
2. Barnes B. About science. New York: Basil Blackwell Inc.; 1985.
3. Kern LH, Mirels HL, Hinshaw VG. Scientists’ understanding of
propositional logic: an experimental investigation. Soc Stud Sci.
4. Tversky A, Kahneman D. Extensional versus intuitive reasoning: the con-
junction fallacy in probability judgment. Psychol Rev. 1983;90:293–315.
5. Kahneman D. Thinking, fast and slow. London: Macmillan; 2011.
6. Haller H, Krauss S. Misinterpretations of signiﬁcance: a problem stu-
dents share with their teachers. Methods Psychol Res. 2002;7:1–20.
7. Badenes-Ribera L, Frías-Navarro D, Monterde-i-Bort H, Pascual-Soler
M. Interpretation of the P value: a national survey study in academic
psychologists from Spain. Psicothema. 2015;27:290–5.
8. Popper KR. The logic of scientific discovery. London: Hutchinson;
9. Lewens T. The meaning of science. London: Pelican; 2015.
10. Leahey TH. The mythical revolutions of American psychology. Am
11. Law S. The great philosophers. London: Quercus; 2007.
12. Keuth H. The Philosophy of Karl Popper. Cambridge: Cambridge Univer-
sity Press; 2005.
13. Nickerson RS. Conﬁrmation bias: a ubiquitous phenomenon in many
guises. Rev Gen Psychol. 1998;2:175.
14. Rosenthal R, Fode KL. The eﬀect of experimenter bias on the perfor-
mance of the albino rat. Behav Sci. 1963;8:183–9.
15. Inglis M, Simpson A. Mathematicians and the selection task. In: Pro-
ceedings of the 28th international conference on the psychology of
mathematics education; 2004. p. 89–96.
16. Jackson SL, Griggs RA. Education and the selection task. Bull Psychon
17. Hergovich A, Schott R, Burger C. Biased evaluation of abstracts depend-
ing on topic and conclusion: further evidence of a conﬁrmation bias
within scientiﬁc psychology. Curr Psychol. 2010;29:188–209.
18. Mahoney MJ. Scientist as subject: the psychological imperative. Phila-
delphia: Ballinger; 1976.
19. Sidman M. Tactics of scientiﬁc research. New York: Basic Books; 1960.
20. Moore J. A special section commemorating the 30th anniversary of tac-
tics of scientiﬁc research: evaluating experimental data in psychology
by Murray Sidman. Behav Anal. 1990;13:159.
21. Holth P. A research pioneer’s wisdom: an interview with Dr. Murray Sid-
man. Eur J Behav Anal. 2010;12:181–98.
22. Michael J. Flight from behavior analysis. Behav Anal. 1980;3:1.
23. van Wilgenburg E, Elgar MA. Conﬁrmation bias in studies of nestmate
recognition: a cautionary note for research into the behaviour of ani-
mals. PLoS ONE. 2013;8:e53548.
24. Poddar R, Kawai R, Ölveczky BP. A fully automated high-throughput
training system for rodents. PLoS ONE. 2013;8:e83171.
25. Jiang H, Hanna E, Gatto CL, Page TL, Bhuva B, Broadie K. A fully auto-
mated drosophila olfactory classical conditioning and testing system
for behavioral learning and memory assessment. J Neurosci Methods.
26. Oswald ME, Grosjean S. Conﬁrmation bias. In: Pohl R, editor. Cognitive
illusions: a handbook on fallacies and biases in thinking, judgement
and memory. Hove: Psychology Press; 2004. p. 79.
27. Mill JS. A system of logic. London: John W. Parker; 1843.
28. Premack D, Woodruﬀ G. Does the chimpanzee have a theory of mind?
Behav Brain Sci. 1978;1:515–26.
29. Call J, Tomasello M. Does the chimpanzee have a theory of mind? 30
years later. Trends Cogn Sci. 2008;12:187–92.
30. Gomez J-C. Non-human primate theories of (non-human primate)
minds: some issues concerning the origins of mind-reading. In: Car-
ruthers P, Smith PK, editors. Theories of theories of mind. Cambridge:
Cambridge University Press; 1996. p. 330.
31. Povinelli DJ, Bering JM, Giambrone S. Toward a science of other minds:
escaping the argument by analogy. Cogn Sci. 2000;24:509–41.
32. Dutton D, Williams C. A view from the bridge: subjectivity, embodiment
and animal minds. Anthrozoös. 2004;17:210–24.
33. van der Staay FJ, Arndt SS, Nordquist RE. Evaluation of animal models of
neurobehavioral disorders. Behav Brain Funct. 2009;5:11.
34. Robbins T. Homology in behavioural pharmacology: an approach
to animal models of human cognition. Behav Pharmacol.
35. Sagvolden T. Behavioral validation of the spontaneously hypertensive rat (SHR) as an animal model of attention-deficit/hyperactivity disorder (AD/HD). Neurosci Biobehav Rev. 2000;24:31–9.
36. Sagvolden T, Johansen EB, Wøien G, Walaas SI, Storm-Mathisen J,
Bergersen LH, et al. The spontaneously hypertensive rat model of
ADHD—the importance of selecting the appropriate reference strain.
37. Sagvolden T, Aase H, Zeiner P, Berger D. Altered reinforcement mecha-
nisms in attention-deﬁcit/hyperactivity disorder. Behav Brain Res.
38. Wultz B, Sagvolden T. The hyperactive spontaneously hypertensive rat
learns to sit still, but not to stop bursts of responses with short inter-
response times. Behav Genet. 1992;22:415–33.
39. Malloy-Diniz L, Fuentes D, Leite WB, Correa H, Bechara A. Impulsive
behavior in adults with attention deﬁcit/hyperactivity disorder: char-
acterization of attentional, motor and cognitive impulsiveness. J Int
Neuropsychol Soc. 2007;13:693–8.
40. Evenden JL. The pharmacology of impulsive behaviour in rats IV: the
eﬀects of selective serotonergic agents on a paced ﬁxed consecutive
number schedule. Psychopharmacology. 1998;140:319–30.
41. Fox AT, Hand DJ, Reilly MP. Impulsive choice in a rodent model of atten-
tion-deﬁcit/hyperactivity disorder. Behav Brain Res. 2008;187:146–52.
42. Sonuga-Barke EJ. Psychological heterogeneity in AD/HD—a dual
pathway model of behaviour and cognition. Behav Brain Res.
43. Berger DF, Sagvolden T. Sex diﬀerences in operant discrimination
behaviour in an animal model of attention-deﬁcit hyperactivity disor-
der. Behav Brain Res. 1998;94:73–82.
44. Uebel H, Albrecht B, Asherson P, Börger NA, Butler L, Chen W, et al.
Performance variability, impulsivity errors and the impact of incentives
as gender-independent endophenotypes for ADHD. J Child Psychol
45. Johansen EB, Killeen PR, Sagvolden T. Behavioral variability, elimination
of responses, and delay-of-reinforcement gradients in SHR and WKY rats.
Behav Brain Funct. 2007;3:1.
46. Adriani W, Caprioli A, Granstrem O, Carli M, Laviola G. The spontane-
ously hypertensive-rat as an animal model of ADHD: evidence for
impulsive and non-impulsive subpopulations. Neurosci Biobehav Rev.
47. Scheres A, Oosterlaan J, Sergeant JA. Response execution and inhibi-
tion in children with AD/HD and other disruptive disorders: the role of
behavioural activation. J Child Psychol Psychiatry. 2001;42:347–57.
48. Sonuga-Barke E, Taylor E, Sembi S, Smith J. Hyperactivity and delay
aversion—I. The eﬀect of delay on choice. J Child Psychol Psychiatry.
49. Sonuga-Barke EJ, Williams E, Hall M, Saxton T. Hyperactivity and delay
aversion III: the eﬀect on cognitive style of imposing delay after errors. J
Child Psychol Psychiatry. 1996;37:189–94.
50. Kuntsi J, Oosterlaan J, Stevenson J. Psychological mechanisms in
hyperactivity: I. Response inhibition deficit, working memory impairment, delay aversion, or something else? J Child Psychol Psychiatry.
51. Solanto MV, Abikoﬀ H, Sonuga-Barke E, Schachar R, Logan GD, Wigal T,
et al. The ecological validity of delay aversion and response inhibi-
tion as measures of impulsivity in AD/HD: a supplement to the NIMH
multimodal treatment study of AD/HD. J Abnorm Child Psychol.
Page 12 of 13
Sjoberg Behav Brain Funct (2017) 13:3
52. Dalen L, Sonuga-Barke EJ, Hall M, Remington B. Inhibitory deﬁcits, delay
aversion and preschool AD/HD: implications for the dual pathway
model. Neural Plast. 2004;11:1–11.
53. Bitsakou P, Psychogiou L, Thompson M, Sonuga-Barke EJ. Delay aversion
in attention deﬁcit/hyperactivity disorder: an empirical investigation of
the broader phenotype. Neuropsychologia. 2009;47:446–56.
54. Tripp G, Alsop B. Sensitivity to reward delay in children with atten-
tion deﬁcit hyperactivity disorder (ADHD). J Child Psychol Psychiatry.
55. Marx I, Hübner T, Herpertz SC, Berger C, Reuter E, Kircher T, et al. Cross-
sectional evaluation of cognitive functioning in children, adolescents
and young adults with ADHD. J Neural Transm. 2010;117:403–19.
56. Marco R, Miranda A, Schlotz W, Melia A, Mulligan A, Müller U, et al. Delay
and reward choice in ADHD: an experimental test of the role of delay
aversion. Neuropsychology. 2009;23:367–80.
57. Garcia A, Kirkpatrick K. Impulsive choice behavior in four strains of rats:
evaluation of possible models of attention deﬁcit/hyperactivity disor-
der. Behav Brain Res. 2013;238:10–22.
58. Bizot J-C, Chenault N, Houzé B, Herpin A, David S, Pothion S, et al.
Methylphenidate reduces impulsive behaviour in Juvenile Wistar
rats, but not in adult Wistar, SHR and WKY rats. Psychopharmacology.
59. Pardey MC, Homewood J, Taylor A, Cornish JL. Re-evaluation of an
animal model for ADHD using a free-operant choice task. J Neurosci
60. Hayden BY. Time discounting and time preference in animals: a critical
review. Psychon Bull Rev. 2015;23:1–15.
61. Scheres A, Dijkstra M, Ainslie E, Balkan J, Reynolds B, Sonuga-Barke E,
et al. Temporal and probabilistic discounting of rewards in children and
adolescents: eﬀects of age and ADHD symptoms. Neuropsychologia.
62. Sonuga-Barke EJ, Sergeant JA, Nigg J, Willcutt E. Executive dysfunction
and delay aversion in attention deﬁcit hyperactivity disorder: nosologic
and diagnostic implications. Child Adolesc Psychiatr Clin N Am.
63. Botanas CJ, Lee H, de la Peña JB, de la Peña IJ, Woo T, Kim HJ, et al.
Rearing in an enriched environment attenuated hyperactivity and
inattention in the spontaneously hypertensive rats, an animal model of
attention-deﬁcit hyperactivity disorder. Physiol Behav. 2016;155:30–7.
64. Sjoberg EA, Holth P, Johansen EB. the eﬀect of delay, utility, and magni-
tude on delay discounting in an animal model of attention-deﬁcit/hyper-
activity disorder (ADHD): a systematic review. In: Association of behavior
analysis international 42nd annual convention. Chicago, IL; 2016.
65. Willner P. Validation criteria for animal models of human mental disor-
ders: learned helplessness as a paradigm case. Prog Neuropsychophar-
macol Biol Psychiatry. 1986;10:677–90.
66. Geyer MA, Markou A. Animal models of psychiatric disorders. In: Bloom
FE, Kupfer DJ, editors. Psychopharmacology: the fourth generation of
progress. New York: Raven Press; 1995. p. 787–98.
67. McKinney W. Animal models of depression: an overview. Psychiatr Dev.
68. Koob GF, Heinrichs SC, Britton K. Animal models of anxiety disorders.
In: Schatzberg AF, Nemeroﬀ CB, editors. The American Psychiatric Press
textbook of psychopharmacology. 2nd ed. Washington: American
Psychiatric Press; 1998. p. 133–44.
69. Sarter M, Bruno JP. Animal models in biological psychiatry. In: D’Haenen
H, den Boer JA, Willner P, editors. Biological psychiatry. Chichester:
Wiley; 2002. p. 37–44.
70. Weiss JM, Kilts CD. Animal models of depression and schizophrenia. In:
Schatzberg AF, Nemeroﬀ CB, editors. The American Psychiatric Press
textbook of psychopharmacology. 2nd ed. Washington: American
Psychiatric Press; 1998. p. 89–131.
71. Belzung C, Lemoine M. Criteria of validity for animal models of psychi-
atric disorders: focus on anxiety disorders and depression. Biol Mood
Anxiety Disord. 2011;1(1):9. doi:10.1186/2045-5380-1-9.
72. Tricklebank M, Garner J. The possibilities and limitations of animal
models for psychiatric disorders. Cambridge: Royal Society of Chemistry
(RSC Drug Discovery Series); 2012. p. 534–57.
73. Sagvolden T, Johansen EB. Rat models of ADHD. In: Stanford C, Tannock
R, editors. Behavioral neuroscience of attention-deﬁcit/hyperactivity
disorder and its treatments. Berlin: Springer; 2012. p. 301–15.
74. American Psychiatric Association. Diagnostic and statistical manual of mental
disorders (DSM-5®). Arlington: American Psychiatric Publishing; 2013.
75. Nestler EJ, Hyman SE. Animal models of neuropsychiatric disorders. Nat Neurosci.
76. Gould TD, Einat H. Animal models of bipolar disorder and mood stabi-
lizer eﬃcacy: a critical need for improvement. Neurosci Biobehav Rev.
77. Karatekin C. A comprehensive and developmental theory of ADHD is
tantalizing, but premature. Behav Brain Sci. 2005;28:430–1.
78. Willcutt EG, Doyle AE, Nigg JT, Faraone SV, Pennington BF. Validity of the
executive function theory of attention-deﬁcit/hyperactivity disorder: a
meta-analytic review. Biol Psychiatry. 2005;57:1336–46.
79. Einat H, Manji HK. Cellular plasticity cascades: genes-to-behavior
pathways in animal models of bipolar disorder. Biol Psychiatry.
80. Sontag TA, Tucha O, Walitza S, Lange KW. Animal models of attention
deﬁcit/hyperactivity disorder (ADHD): a critical review. ADHD Atten
Deﬁcit Hyperact Disord. 2010;2:1–20.
81. Schneider T, Przewłocki R. Behavioral alterations in rats prenatally
exposed to valproic acid: animal model of autism. Neuropsychopharmacology.
82. Mehta MV, Gandal MJ, Siegel SJ. mGluR5-antagonist mediated reversal
of elevated stereotyped, repetitive behaviors in the VPA model of
autism. PLoS ONE. 2011;6:e26077.
83. Markram K, Rinaldi T, La Mendola D, Sandi C, Markram H. Abnormal fear
conditioning and amygdala processing in an animal model of autism.
84. Snow WM, Hartle K, Ivanco TL. Altered morphology of motor cortex
neurons in the VPA rat model of autism. Dev Psychobiol. 2008;50:633–9.
85. Sagvolden T, Dasbanerjee T, Zhang-James Y, Middleton F, Faraone S.
Behavioral and genetic evidence for a novel animal model of attention-
deﬁcit/hyperactivity disorder predominantly inattentive subtype. Behav
Brain Funct. 2008;4:b54.
86. van der Staay FJ. Animal models of behavioral dysfunctions: basic
concepts and classiﬁcations, and an evaluation strategy. Brain Res Rev.
87. Gómez O, Juristo N, Vegas S. Replication, reproduction and re-analysis:
three ways for verifying experimental ﬁndings. In: Proceedings of the
1st international workshop on replication in empirical software engi-
neering research (RESER 2010). Cape Town, South Africa; 2010.
88. Cartwright N. Replicability, reproducibility, and robustness: comments
on Harry Collins. Hist Polit Econ. 1991;23:143–55.
89. Slavin R, Smith D. The relationship between sample sizes and eﬀect
sizes in systematic reviews in education. Educ Eval Policy Anal.
90. Würbel H. Behaviour and the standardization fallacy. Nat Genet.
91. Richter SH, Garner JP, Würbel H. Environmental standardization: cure
or cause of poor reproducibility in animal experiments? Nat Methods.
92. van der Staay FJ, Arndt S, Nordquist R. The standardization-generalization
dilemma: a way out. Genes Brain Behav. 2010;9:849–55.
93. Rosenthal R. The ﬁle drawer problem and tolerance for null results.
Psychol Bull. 1979;86:638.
94. Sterling TD. Publication decisions and their possible eﬀects on infer-
ences drawn from tests of signiﬁcance—or vice versa. J Am Stat Assoc.
95. Ferguson CJ, Heene M. A vast graveyard of undead theories: publication
bias and psychological science’s aversion to the null. Perspect Psychol Sci.
96. Arkes HR, Blumer C. The psychology of sunk cost. Organ Behav Hum
Decis Process. 1985;35:124–40.
97. Brumﬁel G. Particles break light-speed limit. Nature. 2011. doi:10.1038/
98. Matson J. Faster-than-light neutrinos? Physics luminaries voice doubts.
Sci Am. 2011. https://www.scientiﬁcamerican.com/article/ftl-neutri-
nos/. Accessed 13 Feb 2017.
99. Davids E, Zhang K, Tarazi FI, Baldessarini RJ. Animal models of attention-
deﬁcit hyperactivity disorder. Brain Res Rev. 2003;42:1–21.
100. Klauck SM, Poustka A. Animal models of autism. Drug Discov Today Dis Models.
101. Arguello PA, Gogos JA. Schizophrenia: modeling a complex psychiatric
disorder. Drug Discov Today Dis Models. 2006;3:319–25.
102. Schmidt MV, Müller MB. Animal models of anxiety. Drug Discov Today
Dis Models. 2006;3:369–74.
103. Deussing JM. Animal models of depression. Drug Discov Today Dis Models.
104. Pautasso M. Worsening ﬁle-drawer problem in the abstracts of natural,
medical and social science databases. Scientometrics. 2010;85:193–202.
105. Rothstein HR, Sutton AJ, Borenstein M. Publication bias in meta-analysis:
prevention, assessment and adjustments. Chichester: Wiley; 2006.
106. Sjoberg EA, D’Souza A, Cole GG. An evolutionary hypothesis concerning
female inhibition abilities: a literature review. In: Norwegian Behavior
Analysis Society conference. Storefjell, Norway; 2016.
107. Smith GD, Ebrahim S. Data dredging, bias, or confounding: they can all
get you into the BMJ and the Friday papers. Br Med J. 2002;325:1437–8.
108. Simmons JP, Nelson LD, Simonsohn U. False-positive psychology:
undisclosed flexibility in data collection and analysis allows presenting
anything as significant. Psychol Sci. 2011;22:1359–66.
109. Sullivan GM, Feinn R. Using eﬀect size—or why the P value is not
enough. J Grad Med Educ. 2012;4:279–82.
110. Brooks GP, Johanson GA. Sample size considerations for multiple
comparison procedures in ANOVA. J Mod Appl Stat Methods.
111. Royall RM. The eﬀect of sample size on the meaning of signiﬁcance
tests. Am Stat. 1986;40:313–5.
112. Nakagawa S, Cuthill IC. Eﬀect size, conﬁdence interval and statistical
signiﬁcance: a practical guide for biologists. Biol Rev. 2007;82:591–605.
113. Garner JP. The signiﬁcance of meaning: why do over 90% of behavioral
neuroscience results fail to translate to humans, and what can we do to
ﬁx it? ILAR J. 2014;55:438–56.
114. Ramboz S, Oosting R, Amara DA, Kung HF, Blier P, Mendelsohn M, et al.
Serotonin receptor 1a knockout: an animal model of anxiety-related
disorder. Proc Natl Acad Sci. 1998;95:14476–81.
115. David DJ, Samuels BA, Rainer Q, Wang J-W, Marsteller D, Mendez I, et al.
Neurogenesis-dependent and-independent eﬀects of ﬂuoxetine in an
animal model of anxiety/depression. Neuron. 2009;62:479–93.
116. Siesser W, Zhao J, Miller L, Cheng SY, McDonald M. Transgenic mice
expressing a human mutant β1 thyroid receptor are hyperactive,
impulsive, and inattentive. Genes Brain Behav. 2006;5:282–97.
117. Gourley SL, Taylor JR. Recapitulation and reversal of a persistent
depression-like syndrome in rodents. Curr Protoc Neurosci. 2009;Chapter
9:Unit 9.32. doi:10.1002/0471142301.ns0932s49.
118. Willner P. Chronic mild stress (CMS) revisited: consistency and
behavioural-neurobiological concordance in the effects of CMS. Neuropsychobiology.
119. Szechtman H, Sulis W, Eilam D. Quinpirole induces compulsive check-
ing behavior in rats: a potential animal model of obsessive-compulsive
disorder (OCD). Behav Neurosci. 1998;112:1475.
120. King JA, Abend S, Edwards E. Genetic predisposition and the development
of posttraumatic stress disorder in an animal model. Biol Psychiatry.
121. Lipska BK, Jaskiw GE, Weinberger DR. Postpubertal emergence of
hyperresponsiveness to stress and to amphetamine after neonatal
excitotoxic hippocampal damage: a potential animal model of schizo-
phrenia. Neuropsychopharmacology. 1993;9:67–75.
122. Lodge DJ, Behrens MM, Grace AA. A loss of parvalbumin-containing
interneurons is associated with diminished oscillatory activity in an
animal model of schizophrenia. J Neurosci. 2009;29:2344–54.
123. Kesby JP, Burne TH, McGrath JJ, Eyles DW. Developmental vitamin D
deficiency alters MK-801-induced hyperlocomotion in the adult rat: an
animal model of schizophrenia. Biol Psychiatry. 2006;60:591–6.