Page 1 of 20
(page number not for citation purposes)
Philosophy, Ethics, and Humanities
in Medicine
Open Access
Are animal models predictive for humans?
Niall Shanks, Ray Greek* and Jean Greek

Wichita State University, Department of History, 1845 N Fairmont, Fiske Hall, Wichita KS 67260, USA, and Americans For Medical Advancement, 2251 Refugio Rd, Goleta, CA 93117, USA
Email: Niall Shanks -; Ray Greek* -; Jean Greek -
* Corresponding author
Abstract

It is one of the central aims of the philosophy of science to elucidate the meanings of scientific terms and also to think critically about their application. The focus of this essay is the scientific term predict and whether there is credible evidence that animal models, especially in toxicology and pathophysiology, can be used to predict human outcomes. Whether animals can be used to predict human response to drugs and other chemicals is apparently a contentious issue. However, when one empirically analyzes animal models using scientific tools, they fall far short of being able to predict human responses. This is not surprising considering what we have learned from fields such as evolutionary and developmental biology, gene regulation and expression, epigenetics, complexity theory, and comparative genomics.
"When I use a word," Humpty Dumpty said in rather a scornful tone, "it means just what I choose it to mean – neither more nor less." "The question is," said Alice, "whether you can make words mean so many different things."

Lewis Carroll, Through the Looking Glass, 1871.
There is a serious scientific controversy concerning the
predictive power of animal models. In this article we will
use the phrase animal model to mean, or the word animal
in the context of, the use of a nonhuman animal, usually
a mammal or vertebrate, to predict human response to
drugs and disease. We enthusiastically acknowledge that
animals can be successfully used in many areas of science,
such as in basic research, as a source for replacement parts
for humans, as bioreactors and so forth. These uses are not
included in our definition or critique as there is no claim
made for their predictive power. This article focuses solely
on using animals/animal models to predict human
responses in light of what the word predict means in science.

Philosophy of science
The meaning of words is very important in all areas of study, but especially in science. Philosophers of science, including Quine, Hempel and others, have argued that words must have meaning in science and that, in fact, these meanings separate science from pseudoscience. Take, for example, the word prediction. A
research method need not be predictive to be used but if
one claims predictive ability for the test or project, then
one means something very specific.
Published: 15 January 2009
Received: 23 July 2008
Accepted: 15 January 2009
Philosophy, Ethics, and Humanities in Medicine 2009, 4:2 doi:10.1186/1747-5341-4-2
© 2009 Shanks et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

This paper addresses the use of the word predict as applied to animal models. It is our position that the meaning of the word has been corrupted and hence the concept behind the word is in danger, as is everything the concept implies. Predict is not so much a word as a concept
and is closely related to hypothesis. Hypothesis can be
defined as a proposed explanation for a phenomenon,
either observed or thought, that needs to be tested for
validity. According to Sarewitz and Pielke:
In modern society, prediction serves two important
goals. First, prediction is a test of scientific understand-
ing, and as such has come to occupy a position of
authority and legitimacy. Scientific hypotheses are tested
by comparing what is expected to occur with what actually
occurs. When expectations coincide with events, it
lends support to the power of scientific understanding
to explain how things work. " [Being] predictive of
unknown facts is essential to the process of empirical
testing of hypotheses, the most distinctive feature of
the scientific enterprise," observes biologist Francisco
Ayala (Ayala, F. 1996. The candle and the darkness Sci-
ence 273:442.)[1] (Emphasis added.)
In the case of animal models what actually occurs is what
happens in humans. If the purpose of the test, be it a test on an animal or in silico, is to predict human response, then the test must be evaluated by how well it conforms
to human response. We again acknowledge that not all
tests and studies involving animals are done with predic-
tion in mind. Nevertheless, those tests promoted as being
predictive must be judged by how well they predict
human response.
Sarewitz and Pielke continue:
Second, prediction is also a potential guide for deci-
sion making. We may seek to know the future in the
belief that such knowledge will stimulate and enable
beneficial action in the present [1].
We will return to decision making.
The philosopher W.V.O. Quine has remarked:
A prediction may turn out true or false, but either way
it is diction: it has to be spoken or, to stretch a point,
written. Etymology and the dictionary agree on this
point. The nearest synonyms "foresight," "foreknowl-
edge," and "precognition" are free of that limitation,
but subject to others. Foreknowledge has to be true,
indeed infallible. Foresight is limited to the visual
when taken etymologically, and is vague otherwise.
"Precognition" connotes clairvoyance ... Prediction is
rooted in a general tendency among higher vertebrates to
expect that similar experiences will have sequels similar to
each other [2, p. 159] ... (Emphasis added.)
Predictions, generated from hypotheses, are not always
correct. But if a modality or test or method is said to be
predictive then it should get the right answer a very high
percentage of the time in the biological sciences and all
the time in the physical sciences. (We will discuss this as
it applies to the biological sciences momentarily.)
If a modality consistently fails to make accurate predic-
tions then the modality cannot be said to be predictive
simply because it occasionally forecasts a correct answer.
The above separates the scientific use of the word predict
from the layperson's use of the word, which more closely
resembles the words forecast, guess, conjecture, project and
so forth. We will return to these points shortly.
Many philosophers of science think a theory (and we add,
a modality) could be confirmed or denied by testing the
predictions it made. Unlike the physical sciences, the bio-
logical sciences, which study complex systems, must rely
on statistics and probability when assessing what the
response to a stimulus will be or in discussing the likeli-
hood a phenomenon will occur. There is an assumption
that has taken on the trappings of a theory or perhaps an
overarching hypothesis in the animal model community
that results from experiments on animals can be directly
applied to humans; that animal models are predictive.
This has resulted in an unquestioned methodological approach: using animals as surrogate humans. Ironically,
this hypothesis has not been questioned as hypotheses
should be questioned in science, hence our calling it an
overarching hypothesis. Whether or not the animal model per se can be used to predict human response can be tested, and if the results have a high enough sensitivity, specificity, and positive and negative predictive value, then the hypothesis that animals can predict human response would be verified. If verified, then one could say that animal models are predictive for humans; if refuted, then one could say animal models are not predictive for humans.
There are two very different ways animals and hypotheses
are used in science. Hypotheses result in predictions that
have to be tested thereby confirming or falsifying the
hypothesis. Let us assume that a scientist is using animals
to perform basic research. At the end of the series of ani-
mal experiments the investigator has, at most, a hypothe-
sis about a likely human response to the same stimulus or
substance, when allowances have been made for differ-
ences in body weight, exposure, and so on. The prediction
that the hypothesis entails must then be tested, and this
will require the gathering of human data. The prediction
may be verified or it may be falsified in the light of such
human data, but the evidential burden here cannot be
evaded from the standpoint of basic scientific methodol-
ogy. Nowhere in this use of animals to generate a hypothesis have animals been assumed predictive. LaFollette and
Shanks have referred to the practice of using animals in
this fashion as heuristic or hypothetical animal models
(HAMs) [3,4].
This is in contrast to the hypothesis that some scientists
start with, namely that animals are predictive for humans.
(See table 1.) By assuming this, these scientists make the
claim that drugs and chemicals that would have harmed
humans have been kept off the market secondary to
results from animal tests. This is disingenuous unless we have an a priori reason to assume animal models are predictive. The hypothesis was, in these cases, never tested. It
would in many cases be unethical to conduct on humans
the sorts of carefully controlled laboratory studies that are
regularly conducted on, say, rodents. However, there are
other, ethical ways to gain human data in the context of
epidemiology (for example retrospective and prospective
epidemiological studies), in vitro research using human
tissue, in silico research, and the recent technological
breakthrough of microdosing [5]. Further, it must never
be forgotten that when industrial chemicals find their way
into the environment, or drugs are marketed to the gen-
eral population, human experiments have already taken
place. Moreover, as Altman [6] has observed, there are
many examples, both ancient and modern, where
researchers, doubting the applicability or relevance of ani-
mal models to the human situation, have experimented
on themselves – a practice that Altman points out contin-
ues to the present (recent Nobel laureate Barry Marshall
being but one example). In any event, at the very least a
track record of success (vis-à-vis positive and negative pre-
dictive values) using specific animal models should be
evident if society is to accept hypotheses from animal test-
ing as predictive for humans.
Therefore, we must emphasize that when discussing ani-
mals as predictive models we are discussing the overarch-
ing hypothesis that animals are predictive, not the use of
animals to generate a hypothesis such as occurs in basic research.

Now is an appropriate time to discuss the concept of proof
and where the burden of proof lies in science. As in law, it
lies on the claimant. The null hypothesis demands that we
assume there is no connection between events until such
causation is proven. Thus, those claiming animal models
are predictive of human responses in the context of bio-
medical research must show that what they are claiming is
true. The burden is not on us to prove that animal models
of, say, carcinogenesis or toxicity, are not predictive. It is
the job of those advocating animal models as predictive to
demonstrate they are. This will require a consideration of
what the evidence actually shows.
While physics deals with simple systems upon which
reductionism can be practiced, biology does not always
have that luxury. There are areas in biology – for example
comparative anatomy – where the use of scaling principles has had genuine applicability. But biology is not
physics, and there are other places – for example in some
branches of predictive toxicology – where the use of such
scaling factors (such as body weight) have been less than useful for reasons we will explore below. Biological
systems are certainly consistent with the laws of physics,
but they have properties consequent upon internal organ-
ization, ultimately rooted in evolutionary history, not
found in physics. This means that even when the same
stimulus is applied, end results can differ markedly. The
response of different humans to the same drug or disease
is a well-known example of this phenomenon [7-13].
There are, however, ways to judge the predictive nature of
tests even in biological complex systems. Values such as
positive predictive value, sensitivity, specificity, and nega-
tive predictive value (we will discuss these values momen-
tarily) can be calculated to confirm or refute hypotheses.
Values from tests seeking to predict a response that
approach what would be expected from random chance
would obviously not fall into the category predictive.
Claims about the predictive nature of animal models
According to Salmon there are at least three reasons for
making predictions:
Table 1: Hypothesis

Animal models in toxicology and disease research:
Assumptions – usual, plus animal models are predictive.
Hypothesis – X leads to Y.
Animal test – in an animal test, X led to Y.
Conclusion – since in animals X did/did not lead to Y, X will/will not lead to Y in humans also.

Correct use of animal models in areas such as basic research:
Assumptions – usual, e.g. there are universal laws.
Hypothesis – none, or X leads to Y.
Animal test – X leads to Y based on results in animals.
Hypothesis – X leads to Y in humans.
Test – apply X in humans and see if it leads to Y, or study populations where X was applied and ascertain the result.
Conclusion – actual results from humans.
1. because we want to know what will happen in the future;
2. to test a theory;
3. an action is required and the best way to choose which action is to predict the future [14].
In the case of carcinogenesis we want to know: (1) what
will happen in the future (will the chemical cause cancer
in humans?); and (3) an action is required (allow the
chemical on the market or not?) and the best way to
choose which action is to be able to predict the future.
Neither (1) nor (3) is subtle. We want a correct answer to
the question, "Is this chemical carcinogenic to humans?"
or to similar questions such as, "What will this drug do to
humans?" and "Is this drug a teratogen?" and "Is this the
receptor used by HIV to enter the human cell?" But guessing correctly or finding correlations are not, as we have seen, the same as predicting the answer. Neither is a high degree of sensitivity alone, as we shall see, the same as prediction.

The following will help the reader gain a feel for the contours of this scientific debate.
Butcher [15], Horrobin [16], Pound et al. [17] and others
[3,4,18-24] have questioned the value of using animals to
predict human response. Regardless, prediction is a prob-
lem. U.S. Secretary of Health and Human Services Mike
Leavitt stated in 2007:
"Currently, nine out of ten experimental drugs fail in clinical studies because we cannot accurately predict how they will behave in people based on laboratory and animal studies" [24].
This is a very damaging statement for those who assert
that animals are predictive. For some, the facts behind this
statement would, without further support, answer the pre-
diction question. But we will continue.
This debate has recently expanded to Philosophy, Ethics,
and Humanities in Medicine. Knight [25] recently questioned the use of chimpanzees in biomedical research, citing, among other reasons, their lack of predictability.
Shanks and Pyles [26] questioned the ability of animals to
predict human response, resulting in Vineis and Melnick
[27] responding that animals can be used to predict
human response to chemicals with reference to carcino-
genesis and that epidemics of cancer could have been pre-
vented if animal data had been used to reduce human
exposure or ban the chemical entirely. This claim, of ani-
mals predicting human response, is not unique [28,29].
Gad wrote in Animal Models in Toxicology 2007:
Biomedical sciences' use of animals as models to help
understand and predict responses in humans, in toxi-
cology and pharmacology in particular, remains both
the major tool for biomedical advances and a source
of significant controversy ...
At the same time, although there are elements of poor
practice that are real, by and large animals have
worked exceptionally well as predictive models for
humans-when properly used ...
Whether serving as a source of isolated organelles,
cells or tissues, a disease model, or as a prediction for
drug or other xenobiotic action or transformation in
man, experiments in animals have provided the neces-
sary building blocks that have permitted the explosive
growth of medical and biological knowledge in the
latter half of the 20th century and into the 21st century.
Animals have been used as models for centuries to pre-
dict what chemicals and environmental factors would
do to humans ... The use of animals as predictors of
potential ill effects has grown since that time [the year
Current testing procedures (or even those at the time
in the United States, where the drug [thalidomide] was
never approved for human use) would have identified
the hazard and prevented this tragedy [29]. (Emphasis added.)
Fomchenko and Holland observe:
GEMs [genetically engineered mice] closely recapitu-
late the human disease and are used to predict human
response to a therapy, treatment or radiation schedule
[30]. (Emphasis added.)
Hau, editor of an influential handbook on animal-based
research, notes:
A third important group of animal models is
employed as predictive models. These models are used
with the aim of discovering and quantifying the
impact of a treatment, whether this is to cure a disease
or to assess toxicity of a chemical compound [31].
Clearly, Hau offers the use of animals as predictive models
just as we are describing.
The prediction claim is also strong when the word predic-
tion is not actually used but is implied or linked to causal-
ity. Fomchenko and Holland continue:
Using in vitro systems and in vivo xenograft brain
tumor modeling provides a quick and efficient way of
testing novel therapeutic agents and targets, knowl-
edge from which can be translated and tested in more
sophisticated GEMs that faithfully recapitulate human
brain tumors and will likely result in high-quality clinical
trials with satisfactory treatment outcomes and reduced
drug toxicities. Additional use of GEMs to establish
causal links between the presence of various genetic
alterations and brain tumor initiation or determining
their necessity for tumor maintenance and/or progres-
sion provide us with a glimpse into other important
aspects of brain tumor biology [30]. (Emphasis
Fomchenko and Holland are here clearly saying what hap-
pens in animals will happen in humans; that animals are
predictive. Akkina is saying the same:
A major advantage with this in vivo system [geneti-
cally modified SCID mice] is that any data you get
from SCID-hu mice is directly applicable to a human
situation [32].
This use of prediction is not confined to the scientific lit-
erature. It is, if anything, even more widespread when scientists are speaking to the nonscientist public.
The above examples could be multiplied without effort.
Due to the ubiquitous nature of comments like the above,
we can safely deduce that many in the scientific commu-
nity use the word predict to mean that what happens in
animal models will translate directly to humans. But is
this a factual interpretation of reality?
Prediction in biological complex systems
What does constitute prediction in biological complex
systems? Many justify the use of animals as predictive
models by stating that animals are predictive but may not be reliably predictive. This seems oxymoronic: reliably predictive would be a tautology, as a method cannot be predictive, in science, if it is not reliably so. However,
we acknowledge that biology is not physics so perhaps
some leniency is needed when discussing prediction in
biological complex systems. How then should we think of
prediction in the context of toxicology, pathophysiology,
and pharmacology? The 2 × 2 table for calculating sensi-
tivity, specificity, positive predictive value and negative
predictive value is how predictability is assessed in these
contexts (see table 2).
In biology many concepts are best evaluated by using sim-
ple statistical methods involving probability. For exam-
ple, in medicine we can use a blood test to determine
whether someone has liver disease. In order to ascertain
how well this test actually determines the health of the
liver we calculate the sensitivity and specificity of the test
along with the positive predictive value (PPV) and nega-
tive predictive value (NPV). The sensitivity of a test is the
probability (measured on a scale from 0.0 to 1.0) of a pos-
itive test among people whose test should be positive –
those who do in fact suffer from liver disease. Specificity is
the probability of a negative test among people whose test
should be negative – those without liver disease. The pos-
itive predictive value of a test is the proportion of people
with positive test results who are actually positive. The
negative predictive value is the proportion of people with
negative test results who are actually negative. This is all
quite straightforward. Very few tests have a sensitivity or
specificity of 1.0 or a PPV and NPV of 1.0 but in order for
a test to be useful given the demanding standards of medical practice (in this case, to tell us if the patient actually has liver disease), it needs to have a PPV and NPV in at least the .95 to 1.0 range.
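These four quantities can be computed mechanically from the counts in a 2 × 2 table. The sketch below is ours and purely illustrative (the function name and the example counts for a hypothetical liver-disease screen are not from the article):

```python
def predictive_values(tp, fp, fn, tn):
    """Compute the four standard diagnostic metrics from 2x2 table counts."""
    sensitivity = tp / (tp + fn)   # fraction of true positives detected
    specificity = tn / (tn + fp)   # fraction of true negatives detected
    ppv = tp / (tp + fp)           # fraction of positive calls that are correct
    npv = tn / (tn + fn)           # fraction of negative calls that are correct
    return sensitivity, specificity, ppv, npv

# Hypothetical screen: 90 true positives, 40 false positives,
# 10 false negatives, 860 true negatives.
sens, spec, ppv, npv = predictive_values(tp=90, fp=40, fn=10, tn=860)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} ppv={ppv:.2f} npv={npv:.2f}")
# sensitivity=0.90 specificity=0.96 ppv=0.69 npv=0.99
```

Note that with these counts the test looks excellent by sensitivity and specificity yet falls well below the .95 PPV threshold described above.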
By definition, when we speak of animals predicting
human response in drug testing and disease research we
are addressing the risks of wrong predictions and how
much risk society is willing to tolerate. Troglitazone
(Rezulin™) is a good example of the margin of error for
medical practice tolerated in society today. Troglitazone
was taken by well over 1 million people, with less than 1% suffering liver failure, yet the drug was withdrawn because of
this side effect [33]. (Interestingly, animal studies failed to
reproduce liver failure from troglitazone [34].) Rofecoxib
(Vioxx™) is another example of the small percentage of
morbidity or mortality tolerated in the practice of medi-
cine vis-à-vis introducing a new drug. Figures vary, and are controversial, but it now appears that less than 1% of people who took rofecoxib experienced a heart attack or stroke as a result, yet it was also withdrawn [35].
Table 2: Statistics used in analysis of prediction.

              Gold Standard +    Gold Standard -
Test +        TP                 FP
Test -        FN                 TN

TP = True positive; FP = False positive; FN = False negative; TN = True negative
Sensitivity = TP/(TP+FN)
Specificity = TN/(FP+TN)
Positive Predictive Value = TP/(TP+FP)
Negative Predictive Value = TN/(FN+TN)

This means that even if a test with a PPV of .99 had assured industry that rofecoxib and troglitazone were safe, the test
would not have been accurate enough for society's stand-
ards. This is an important point. Medical practice does not
tolerate risks (probability of being wrong) acceptable in
some experiments conducted in labs. In basic research we
might proceed with a study based on the outcome being
more likely than not. For basic research this is acceptable.
However, getting the answer wrong in medical practice
has consequences; people die. Societal standards for med-
ical practice today demand very high sensitivity, specifi-
city, PPV and NPV from its tests. We will apply the above
to animal models shortly.
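The demand is harsher than it may first appear. A brief illustrative calculation using Bayes' rule (the event rate and test accuracies below are our assumptions, not figures from the article) shows that when the adverse outcome being predicted is rare, as with the roughly 1% event rates discussed above, even a test with 99% sensitivity and 99% specificity yields a mediocre PPV:

```python
def ppv_from_prevalence(sensitivity, specificity, prevalence):
    """PPV via Bayes' rule: P(event actually occurs | test is positive)."""
    true_pos = sensitivity * prevalence          # events flagged correctly
    false_pos = (1 - specificity) * (1 - prevalence)  # non-events flagged anyway
    return true_pos / (true_pos + false_pos)

# Illustrative: an adverse reaction affecting 0.5% of patients, screened
# by a test with 99% sensitivity and 99% specificity.
ppv = ppv_from_prevalence(0.99, 0.99, 0.005)
print(f"PPV = {ppv:.2f}")  # PPV = 0.33
```

Under these assumptions, two out of every three positive calls would be wrong, which is why rarity of harm makes the prediction problem so unforgiving.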
These standards of prediction, described above, should
not be confused with those of other activities in society
such as gambling in Las Vegas. If we worked out a method
to be correct 51% of the time, we would gladly take that
predictive ability to the blackjack table, the roulette wheel,
and the craps table and leave with a fortune. Sometimes
being correct 51% of the time is great!
In light of the above, it is common to use multiple tests
when attempting to determine a patient's condition or
evaluate a drug. If someone suggests that an animal, say a
mouse, can predict human response to chemicals vis-à-vis
carcinogenesis, he would need to provide data consistent
with that needed for table 2. Perhaps not one animal
alone is capable of predicting human response but when
the same result occurs in two species, say and mouse and
a monkey, then perhaps the results are predictive. Or per-
haps animal data combined with other data translates
into a high predictive value. Again, if this were the case the
person making the claim should be able to provide data
amenable to evaluation by the gold standard laid out in
table 2. To the best of our knowledge no such data exists.
Predicting human response
We will now discuss the actual data that does exist. The
data from testing six drugs on animals was compared with
the data from humans [36]. The animal tests were shown
to have a sensitivity of 0.52 and the positive predictive
value was 0.31. The sensitivity is about what one would expect from a coin toss, and the PPV is less: not what is considered predictive in the scientific sense of the word. Values of this nature are more appropriately referred to as
guesses. Because of data like this, animal modelers will
occasionally use the phrase concordance rate or true positive
concordance rate when judging animal tests. These terms
are not in the normal prediction-relevant lexicon and are
usually used to mean correlation, which has nothing to do
with prediction, as we will see.
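The coin-toss comparison can be made concrete. A "classifier" that flips a fair coin for each compound has an expected sensitivity of 0.5, essentially the 0.52 reported above. The simulation below is ours and purely illustrative:

```python
import random

random.seed(0)  # fixed seed so the run is reproducible

# Simulate a coin-toss "classifier" over many cases that are in fact positive:
# each case is called positive with probability 0.5, so its expected
# sensitivity is 0.5 regardless of the underlying biology.
trials = 100_000
hits = sum(random.random() < 0.5 for _ in range(trials))
print(f"coin-toss sensitivity ≈ {hits / trials:.2f}")  # approximately 0.50
```

A reported sensitivity of 0.52 therefore adds almost nothing over guessing, which is the point of the comparison.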
Two studies from the 1990s revealed that: (1) in only 4 of 24 cases were toxicities found in animal data first [36]; and (2) in only 6 of 114 cases did clinical toxicities have animal correlates [37]. The sensitivity, specificity, PPV and NPV of animal models based on these studies are obviously suboptimal.
A 1994 study of 64 marketed drugs conducted by the Japanese Pharmaceutical Manufacturers Association found that 39 of 91 (43%) clinical toxicities were not forecast from animal studies [38]. (This study, as do many others, counted as a positive prediction any case where any animal correlated with the human response. This is disingenuous, as it is cherry-picking the data.) Without knowing the raw data it is impossible to calculate a true PPV and NPV, but even taken at face value, 43% wrong/57% correct is not predictive.
Figures 1 and 2 illustrate graphically our contention that
animals are not predictive. Both figures chart bioavailabil-
ity data from three species of animals and compare it to
data from humans. (Bioavailability is usually defined as
the fraction of a drug that reaches the systemic circulation
and reflects a number of different variables. Regardless of
the correlation or lack thereof of the variables, the bioa-
vailability of the drug is the final determinant of how
much drug presents to the receptor or active site.) Figure 1
was compiled by Harris from a paper by Grass and Sinko
in Advanced Drug Delivery Reviews. As the reader can see
the bioavailability of various drugs is measured in
humans and three species of animals (representing pri-
mates, rodents and dogs) and the results plotted. Some of
the drugs that showed high levels of bioavailability in
dogs had very low levels in humans and vice-versa. This
was true regardless of drug or species. Some levels did cor-
relate between species but as a whole there was no corre-
lation between what a drug did in humans and what it did
in any given animal species or any combination thereof.
Figure 2 was compiled by Harris from a book section by
Mandagere and Jones in the book Drug Bioavailability: Esti-
mation of Solubility, Permeability, Absorption and Bioavaila-
bility (Methods and Principles in Medicinal Chemistry) and
made the same measurements and reached the same con-
clusions as did Grass and Sinko.
As you can see there is little correlation between animal
and human data. In some cases human bioavailability is
high when bioavailability in dogs is high but in other
cases dogs and humans vary considerably. The patterns exhibited by both are what is frequently referred to as a shotgun pattern, meaning that if one fired a shotgun full of bird shot at a target one would see the same pattern: no precision and no accuracy. The pattern is also referred to
as a scattergram, meaning that the pattern is what one
would expect from random associations.
The above illustrates why eliminating drugs in develop-
ment based on animal tests presents problems. Sankar wrote in The Scientist in 2005:
The typical compound entering a Phase I clinical trial
has been through roughly a decade of rigorous pre-
clinical testing, but still only has an 8% chance of
reaching the market. Some of this high attrition rate is
due to toxicity that shows up only in late-stage clinical
trials, or worse, after a drug is approved. Part of the
problem is that the toxicity is assessed in the later
stages of drug development, after large numbers of
compounds have been screened for activity and solu-
bility, and the best produced in sufficient quantities
for animal studies.
Howard Jacob notes that rats and humans are 90%
identical at the genetic level. However, the majority of
the drugs shown to be safe in animals end up failing
in clinical trials. "There is only 10% predictive power,
since 90% of drugs fail in the human trials" in the tra-
ditional toxicology tests involving rats. Conversely,
some lead compounds may be eliminated due to their toxic-
ity in rats or dogs, but might actually have an acceptable
risk profile in humans [39]. (Emphasis added.)
Again, for some this alone would settle the prediction
question. But we continue.
Figure 1. Human vs animal bioavailability 1. Graph generously provided by James Harris PhD, who presented it at the Center for Business Intelligence conference titled 6th Forum on Predictive ADME/Tox, held in Washington, DC, September 27–29, 2006; adapted from data that appeared in Grass GM, Sinko PJ. Physiologically-based pharmacokinetic simulation modelling. Adv Drug Deliv Rev. 2002 Mar 31;54(3):433–5.

Sensitivity is not the same as prediction. While it is true that all known human carcinogens that have been adequately studied have been shown to be carcinogenic in at
least one animal species [40-42], it is also true that an
irreverent aphorism in biology known as Morton's Law
states: "If rats are experimented upon, they will develop
cancer." Morton's law is similar to Karnofsky's law in ter-
atology, which states that any compound can be tera-
togenic if given to the right species at the right dosage at
the right time in the pregnancy. The point being that it is
very easy to find positive results for carcinogenicity and
teratogenicity; a high sensitivity. Nonetheless, this is
meaningless without also knowing specificity, positive
predictive value, and negative predictive value.
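These four quantities can be made concrete with the standard two-by-two table of test outcomes. The sketch below uses purely hypothetical counts of our own choosing; it shows only that a test can achieve near-perfect sensitivity while its positive predictive value remains poor:

```python
# Minimal sketch of the four standard screening-test metrics.
# The counts below are hypothetical, chosen only to illustrate that
# high sensitivity by itself says nothing about predictive value.

def metrics(tp, fp, fn, tn):
    """Compute the four metrics from a 2x2 table of test outcomes."""
    return {
        "sensitivity": tp / (tp + fn),  # fraction of true positives detected
        "specificity": tn / (tn + fp),  # fraction of true negatives cleared
        "PPV": tp / (tp + fp),          # probability a positive call is correct
        "NPV": tn / (tn + fn),          # probability a negative call is correct
    }

# A screen that flags almost everything: 95 of 100 real positives caught,
# but 720 of 900 negatives flagged as well.
m = metrics(tp=95, fp=720, fn=5, tn=180)
print(f"sensitivity = {m['sensitivity']:.2f}, PPV = {m['PPV']:.2f}")
# Sensitivity is 0.95, yet only about 1 in 9 positive calls is correct.
```

A test that flags nearly everything will rarely miss a true positive, which is precisely why sensitivity alone cannot establish predictive value.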
How well do animal models predict carcinogenesis? Pos-
sible carcinogens are listed in the Integrated Risk Informa-
tion System (IRIS) chemicals database managed by the
Environmental Protection Agency (EPA). According to
Knight et al. [43] as of 1 January 2004, IRIS was unable to
classify the carcinogenic status of 93 out of 160 chemicals
that had been evaluated only by animal tests. The World
Health Organisation also classifies chemicals according to
carcinogenicity via the International Agency for Research
on Cancer (IARC).
Knight et al. wrote in 2006:
For the 128 chemicals with human or animal data also assessed by the IARC, human carcinogenicity classifications were compatible with EPA classifications only for those 17 having at least limited human data (p = 0.5896). For those 111 primarily reliant on animal data, the EPA was much more likely than the IARC to assign carcinogenicity classifications indicative of greater human risk (p < 0.0001) [43].

[Figure 2. Human vs animal bioavailability 2. Graph generously provided by James Harris PhD, who presented it at the Center for Business Intelligence conference titled 6th Forum on Predictive ADME/Tox, held in Washington, DC, September 27–29, 2006, and adapted from data that appeared in Arun K Mandagere and Barry Jones. Prediction of Bioavailability. In: Han van de Waterbeemd, Hans Lennernäs, Per Artursson, and Raimund Mannhold (Eds). Drug Bioavailability: Estimation of Solubility, Permeability, Absorption and Bioavailability (Methods and Principles in Medicinal Chemistry). Wiley-VCH 2003. P444–60.]
This discrepancy is troublesome. Knight et al. discussed a
study in 1993 by Tomatis and Wilbourn [44]. Tomatis
and Wilbourn surveyed the 780 chemical agents or expo-
sure circumstances listed within Volumes 1–55 of the
IARC monograph series [45]. They found that "502
(64.4%) were classified as having definite or limited evi-
dence of animal carcinogenicity, and 104 (13.3%) as def-
inite or probable human carcinogens ... around 398
animal carcinogens were considered not to be definite or
probable human carcinogens."
Knight et al. continue:
... based on these IARC figures, the positive predictivity of the animal bioassay for definite or probable human carcinogens was only around 20.7% (104/502), while the false positive rate was a disturbing 79.3% (398/502).
More recent IARC classifications indicate little movement in the positive predictivity of the animal bioassay for human carcinogens. By January 2004, a decade later, only 105 additional agents had been added to the 1993 figure, yielding a total of 885 agents or exposure circumstances listed in the IARC Monographs [46]. Not surprisingly, the proportion of definite or probable human carcinogens resembled the 1993 figure of 13.3%. By 2004, only 9.9% of these 885 were classified as definite human carcinogens, and only 7.2% as probable human carcinogens, yielding a total of 17.1%.
Haseman [47] published a study in 2000 in which he revealed that 250 (53.1%) of the chemicals in the NTP carcinogenicity database were carcinogenic in at least one sex-species group. He concluded that the actual number posing a significant carcinogenic risk to humans was probably far lower. Approximately half of all chemicals tested on animals and included in the comprehensive Berkeley-based Carcinogenic Potency Database (CPDB) were carcinogenic [48].
Knight et al. conclude:
If a risk-avoidance interpretation is used, in which any
positive result in male or female mice or rats is consid-
ered positive, then nine of the 10 known human car-
cinogens among the hundreds of chemicals tested by
the NTP are positive, but so are an implausible 22% of
all chemicals tested. If a less risk-sensitive interpreta-
tion is used, whereby only chemicals positive in both
mice and rats are considered positive, then only three
of the six known human carcinogens tested in both
species are positive. The former interpretation could
result in the needless denial of potentially useful
chemicals to society, while the latter could result in
widespread human exposure to undetected human
carcinogens [43].
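The trade-off described in this quotation can be worked through directly from the quoted counts (9 of 10 known human carcinogens detected under the any-positive rule, 3 of 6 under the both-species rule, and 22% of all chemicals flagged under the former). A minimal sketch:

```python
# Working the NTP figures quoted above through the two interpretations.
# All counts are taken from the quotation; nothing else is assumed.

def detection_rate(detected, total):
    """Fraction of known human carcinogens a decision rule detects."""
    return detected / total

# Risk-avoidance rule: a positive in any sex-species group counts.
any_positive = detection_rate(9, 10)   # 9 of 10 known human carcinogens
flagged_overall = 0.22                 # but 22% of all chemicals flagged

# Stricter rule: positive only if positive in both mice and rats.
both_species = detection_rate(3, 6)    # 3 of 6 known human carcinogens

print(f"any-positive rule: {any_positive:.0%} detected, "
      f"{flagged_overall:.0%} of all chemicals flagged")
print(f"both-species rule: {both_species:.0%} detected")
# Tightening the rule to reduce false positives halves detection: 90% -> 50%.
```

Neither decision rule escapes the dilemma: sensitivity and specificity trade against each other, which is the point of the passage above.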
At this point in the debate, some will state that animal
models can be useful in science and scientific research and
attempt to conflate the word predict with the word useful.
This is disingenuous for many reasons. First, useful is too
ambiguous to mean anything. Useful to whom? Useful
how? Almost anything can be useful in some sense of the
word. If someone gets paid to engage in fortune telling
then fortune telling is very useful to that person. Whether
it can be used to predict the future is an entirely different
question. We do not deny animal models can be quite use-
ful in certain circumstances but this has nothing to do
with whether they are predictive. Second, this is an example of bait and switch: sell animal models as predictive for humans, then, since they are not predictive, justify their use on the grounds that they are useful. Freeman and St Johnston
illustrate this type of disingenuousness when they state:
Many scientists who work on model organisms,
including both of us, have been known to contrive a
connection to human disease to boost a grant or
paper. It's fair: after all, the parallels are genuine, but
the connection is often rather indirect. More examples
will be discussed in later chapters [49].
Third, predict has a very specific meaning in science; indeed, the concept of prediction is one of the things that separates science from pseudoscience. By conflating useful and predict we diminish the respectability of science in general, putting it more on the level of selling used cars. Finally, we
again acknowledge that studying animals can lead to new
knowledge. This point is not in dispute.
Let us take an in-depth look at one drug and the animal
tests that could have been performed and evaluate what we
would have learned from them. There are many examples
of animal models giving results at extreme variance from
humans and even from each other; thalidomide being but
one, but thalidomide occupies a special place in history so
we will use it. Thalidomide was a sedative prescribed to
pregnant women in the late 1950s and early 1960s. The
children of some of these women were born without
limbs, a condition known as phocomelia. Could the tha-
lidomide tragedy have been predicted and prevented on
the basis of animal experimentation as Gad [29] and others have claimed? Consider the evidence. Schardein, who has studied this tragedy, has observed:
In approximately 10 strains of rats, 15 strains of mice, 11 breeds of rabbits, 2 breeds of dogs, 3 strains of hamsters, 8 species of primates and in other such varied species as cats, armadillos, guinea pigs, swine and ferrets in which thalidomide has been tested, teratogenic effects have been induced only occasionally.
We remind the reader that these results, and those below, were from tests performed after thalidomide's effects had been observed in humans. Schardein also observes:
It is the actual results of teratogenicity testing in pri-
mates which have been most disappointing in consid-
eration of these animals' possible use as a predictive
model. While some nine subhuman primates (all but
the bushbaby) have demonstrated the characteristic
limb defects observed in humans when administered
thalidomide, the results with 83 other agents with
which primates have been tested are less than perfect.
Of the 15 listed putative human teratogens tested in
nonhuman primates, only eight were also teratogenic
in one or more of the various species [51].
Manson and Wise summarized the thalidomide testing as follows:
An unexpected finding was that the mouse and rat
were resistant, the rabbit and hamster variably respon-
sive, and certain strains of primates were sensitive to
thalidomide developmental toxicity. Different strains
of the same species of animals were also found to have
highly variable sensitivity to thalidomide. Factors such
as differences in absorption, distribution, biotransfor-
mation, and placental transfer have been ruled out as
causes of the variability in species and strain sensitivity.
Could the use of animal models have predicted thalidomide's adverse effects? Even if all the animals mentioned
above were studied the answer is no. Different species
showed a wide variety of responses to thalidomide. Once
again, if you bet on enough horses you will probably find
a winner or if you cherry pick the data you will find a win-
ner. In the present case of thalidomide, human effects
were already known so cherry picking is easy. The animal
models for thalidomide discussed above were aimed at
retroactively simulating known human effects. Even then
not many animal models succeeded. If the human effects
were unknown, what would the case have looked like from
the standpoint of prediction? In this case, to pursue the
horse racing analogy, we would have numerous horses to
bet on without any idea which one would win. Certainly
one will win (which is not a given when testing on ani-
mals in hopes of reproducing or guessing human
response), but which one? We cannot know that until
after the fact so how do we judge prospectively which horse
to wager on or which animal model to choose? Which
model species were relevant to the human case in advance
of the gathering of human data? This is by no means a triv-
ial question as evolutionary closeness does not increase
the predictive value of the model. Caldwell points out that
relatively small biological differences between test sub-
jects can lead to very different outcomes:
It has been obvious for some time that there is gener-
ally no evolutionary basis behind the particular drug
metabolizing ability of a particular species. Indeed,
among rodents and primates, zoologically closely
related species exhibit markedly different patterns of
metabolism [53].
The thalidomide case illustrates why the overarching
hypothesis that animals are predictive for humans is
wrong. Again, this overarching hypothesis is in contrast to
using animals as heuristic devices where the hypotheses
drawn from them must be tested.
Even if we retrospectively picked all the animals that
reacted to thalidomide as humans did, we still could not
say these animals predicted human response as their his-
tory of agreeing with human response to other drugs var-
ied considerably. Prediction vis-à-vis drug testing and
disease research implies a track record. Single correct
guesses are not predictions. Nonhuman primates are a
good example of this. They more or less reacted to thalid-
omide as humans did (so we will give them the benefit of
the doubt as corresponding to humans in this case). How-
ever, when tested with other drugs they predicted human
response about as well as a coin toss. Add to all this the
fact that all the animals whose offspring exhibited phoc-
omelia consequent to the administration of thalidomide
did so only after being given doses 25–150 times the
human dose [54-56] and it does not appear that any ani-
mal, group of animals, or the animal model per se could
have been used to predict thalidomide's teratogenicity in
humans. (Ironically, it was the thalidomide tragedy that
ushered in many of the regulatory requirements for using animals in drug testing.)
Thalidomide's controversial history should not interfere
with our analysis, as the history in question does not over-
lap with our issue. The controversy revolves around what
animals were tested, whether pregnant animals were
tested, what the drug company knew and when they knew
it and so forth. This is immaterial, as we are analyzing the
data as if it were available before releasing the drug. We
are giving the animal model the maximum benefit of the
doubt and what we find is that even if all the data availa-
ble today had been available then, the decision to release
the drug or not would not have been informed by animal
tests. Karnofsky's law is relevant here. Any drug is tera-
togenic if given to the right animal at the right time. Given
thalidomide's profile today, physicians would advise
pregnant women not to take the drug, which is what phy-
sicians advise every pregnant woman about almost every
nonlife-saving drug anyway, regardless of the results of
animal tests.
The claim that thalidomide's effects were or could have been predicted by animals is an example of cherry picking the data.
The quantitative/qualitative controversy
We now move on to the quantitative/qualitative contro-
versy. There is a tendency on the part of some researchers
to see all differences between species as mere quantitative
differences – presumably differences that can be compen-
sated for in the context of prediction. Vineis and Melnick write:
However, we disagree with Shanks and Pyles about the
usefulness of animal experiments in predicting human
hazards. Based on the darwinian observation of inter-
species and inter-individual variation in all biological
functions, Shanks and Pyles suggest that animal exper-
iments cannot be used to identify hazards to human
health. We claim that while the activity of enzymes
may vary among individuals and among species, this
does not indicate that critical events in disease proc-
esses occurring after exposure to hazardous agents dif-
fer qualitatively between animal models and
humans.... For the most part, differences in how labo-
ratory animals and humans metabolize environmen-
tal agents, or in the interactions of these agents with
molecular targets (e.g., DNA, enzymes, or nuclear
receptors), are quantitative in nature [27].
This is very much a Newtonian way of thinking and it
ignores the effects of biological evolution and the fact that
animals are complex systems.
Toxicologists have known for a long time that species dif-
ferences may be quantitative or qualitative [53,57]. Con-
sider a model substrate such as phenol. Humans and rats
excrete phenol through two pathways, sulfate conjugation
and glucuronic acid conjugation. There is a quantitative
difference between humans and rats since the ratios of sul-
fate conjugation to glucuronic acid conjugation are differ-
ent in each species. But there are qualitative differences
too. Cats are incapable of glucuronic acid conjugation,
and must excrete phenol via the sulfate pathway. For pigs
the reverse is true, they cannot use the sulfate pathway,
and must rely on glucuronic acid conjugation. (It is worth
noting that there are at least seven metabolic pathways
that are unique to primates – for example the aromatiza-
tion of quinic acid [57].)
One lesson to be drawn from this example is that even if
the same function is achieved by two species (e.g., excre-
tion of phenol), it does not follow that they are doing so
by the exact same underlying causal mechanisms. In the
context of toxicology or pharmacology, these mechanistic
differences can be important in assessing safety as well as
pharmacological utility.
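The phenol example can be put schematically. The encoding below is our own illustrative rendering of the pathway availabilities just described; the species entries follow the text, but this is a toy representation, not a pharmacokinetic model:

```python
# Toy encoding of the phenol-conjugation example in the text.
# Pathway availability follows the passage above; no ratios are modeled.

pathways = {
    "human": {"sulfate", "glucuronide"},  # both pathways available
    "rat":   {"sulfate", "glucuronide"},  # both, but in a different ratio
    "cat":   {"sulfate"},                 # no glucuronic acid conjugation
    "pig":   {"glucuronide"},             # no sulfate conjugation
}

def difference(model, target):
    """Classify the species difference in phenol excretion."""
    if pathways[model] == pathways[target]:
        return "quantitative at most: same pathways, possibly different ratios"
    return "qualitative: different pathways available"

print("rat -> human:", difference("rat", "human"))
print("cat -> human:", difference("cat", "human"))
print("pig -> human:", difference("pig", "human"))
```

The point the sketch captures is that the same observable outcome (phenol is excreted) can mask either a merely quantitative difference or a fully qualitative one in the underlying mechanism.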
Other voices
We are not the only ones concerned about the predictive
power of animal models. The scientific community itself
is not marching in lock step when it comes to the predic-
tive utility of animal models. We will take a moment to
examine what some of these scientists actually say about
the power of animal models to predict human responses.
The following quotes from scientists (and the above quotes from Leavitt and Sankar), of course, prove nothing in the sense of mathematical proof, but they nevertheless provide a window into the thinking of people well versed in the field, and as such a reasonable person should take them seriously. They should give pause to those who
think that the prediction issue is one where there is no rea-
sonable controversy.
R.J. Wall and M. Shani observe:
The vast majority of animals used as models are used
in biomedical preclinical trials. The predictive value of
those animal studies is carefully monitored, thus pro-
viding an ideal dataset for evaluating the efficacy of
animal models. On average, the extrapolated results
from studies using tens of millions of animals fail to
accurately predict human responses ... We conclude
that it is probably safer to use animal models to
develop speculations, rather than using them to
extrapolate [58].
Curry points out:
The failure, in the clinic, of at least fourteen potential
neuroprotective agents expected to aid in recovery
from stroke, after studies in animal models had pre-
dicted that they would be successful, is examined in
relation to principles of extrapolation of data from
animals to humans [59].
The above proves two things. 1. At least some members of
the animal experimentation community do know what
the word predict means. 2. They also know animal models
are not predictive. Their analysis and conclusions, which
revealed the failure of animal models, were neither new
nor surprising. History reveals the same.
Discrepancies between animal-human studies and even
animal-animal studies date back centuries. Percival Pott
showed coal tar was carcinogenic to humans in 1776.
Yamagiwa and Ichikawa showed it was carcinogenic in
some animals in 1915. But even then, rabbits did not
respond as mice did [60]. In 1980 there were roughly sixteen hundred chemicals known to cause cancer in mice and other rodents, but only approximately fifteen known to cause cancer in humans [61]. The Council on Scientific Affairs, publishing in the Journal of the American Medical Association in 1981, stated:
The Council's consultants agree that to identify carcino-
genicity in animal tests does not per se predict either risk or
outcome in human experience ... the Council is con-
cerned about the hundreds of millions of dollars that
are spent each year (both in the public and private sec-
tors) for the carcinogenicity testing of chemical sub-
stances. The concern is particularly grave in view of the
questionable scientific value of the tests when used to
predict human experience [62]. (Emphasis added.)
David Salsburg of Pfizer wrote in 1983 that a report by the
National Cancer Institute that examined 170 chemicals
concluded that lifetime feeding studies using rodents
lacked sensitivity and specificity. He stated:
If we restrict attention to long term feeding studies
with mice or rats, only seven of the 19 human non-
inhalation carcinogens (36.8%) have been shown to
cause cancer. If we consider long term feeding or inha-
lation studies and examine all 26, only 12 (46.2%)
have been shown to cause cancer in rats or mice after
chronic exposure by feeding or inhalation. Thus the
lifetime feeding study in mice and rats appears to have
less than a 50% probability of finding known human
carcinogens. On the basis of probability theory, we would have been better off to toss a coin [63]. (Emphasis added.)
Should we discard every drug that causes cancer in ani-
mals? Acetaminophen, chloramphenicol, and metronida-
zole are known carcinogens in some animal species
[64,65]. Phenobarbital and isoniazid are carcinogens in
rodents [60,66,67]. Does this mean they never should
have been released to the market? Diphenylhydantoin
(phenytoin) is carcinogenic to humans but not rats and
mice [68-70]. Occupational exposure to 2-naphthylamine
appears to cause bladder cancer in humans. Dogs and
monkeys also suffer bladder cancer if exposed to 2-naph-
thylamine orally and mice suffer from hepatomas. It does
not appear to have carcinogenic properties in rats and rab-
bits. These are qualitative differences due to differences in
metabolism of aromatic amines [71]. It also appears that
fewer genetic, epigenetic, or gene expression events are
needed to induce cancer in rodents than are needed to
induce cancer in humans [72-74]. (A good review of spe-
cies differences in relation to carcinogenesis and why they
exist is Anisimov et al. [72].)
Intraspecies differences also exist. Clofibrate, nafenopin,
phenobarbital, and reserpine cause cancer in old but not
young rats [68,75].
Should the above drugs that caused cancer in some animals have been banned? If the null hypothesis is that there is no association between animal carcinogens and human carcinogens strong enough for the animal model to be called predictive, then we see much evidence to support the null hypothesis but very little, if any, to refute it.
The point to be made here is that there are scientists
(rather more than we have space to mention) who ques-
tion the predictive and/or clinical value of animal-based
research and history is on their side. As noted above, the
opinions of scientists prove nothing in and of themselves. Further, some of what we have presented could be dismissed
as anecdotes but this would be a mistake. First, the studies
referenced in the previous section are just that, scientific
studies not anecdotes. Second, the examples presented are
referenced, anecdotes are not (unless they are case reports
and we must remember that thalidomide's downfall
started as a series of case reports). But again we must ask
where the burden of proof lies. We believe the second law
of thermodynamics because there has never been an
example of where it was wrong. Just one such example
would falsify the law. If the animal model community
claims the animal model is predictive, then they must
explain the examples and studies that reveal it was not.
Examples such as case reports count when disproving an
assertion, especially when they are then supported with
studies, but cannot be used by those making the claim as
proof for their overarching hypothesis. That is how sci-
ence works. We did not make the rules. In summary there
is ample evidence to question, if not disprove entirely, the
overarching hypothesis that animal models are predictive
for humans.
To take the argument one step further, we must ask what
conditions ought to be satisfied if animals are to serve as
predictors of human biomedical phenomena. This is a
question concerning theory and comes under the heading
of philosophy of science.
The requirements that need to be satisfied to get genuine
causal predictions (as opposed to mere correlations)
about members of one species on the basis of test results
on members of another species are very difficult to satisfy
(and may at best only be approximated in a few cases).
A model or modality claiming predictability assumes identical causal properties. As researchers Carroll and
Overmier explain in their recent book Animal Research and
Human Health [76], and as LaFollette and Shanks also do
in Brute Science [3], animals in biomedical research are frequently used as causal analogical models (CAMs). If the
heart pumps blood in a chimpanzee, then we reason by
analogy it will pump blood in humans also. If fen-phen is
safe for the hearts of animals we reason by analogy it will
be safe for human hearts as well [77]. Carroll and Over-
mier state:
When the experimenter devises challenges to the ani-
mal and studies a causal chain that, through analogy,
can be seen to parallel the challenges to humans, the
experimenter is using an animal model [76].
These are examples of using animals as CAMs or predic-
tive models according to the traditionally used definition
of the word prediction and as used by us in this article. We
will discuss CAMs more fully in the section on theory.
Animal models in this sense involve causal analogical rea-
soning. First, what is a causal analogical model (CAM)
and how does it involve causal analogical reasoning? The
first condition that must be met in order for a thing to be
considered a CAM is this: "X (the model) is similar to Y
(the object being modelled) in respects {a...e}." If "X has
additional property f, then while f has not been observed
directly in Y, likely Y also has property f [3]." This latter
claim is something that needs to be tested. In the present
case it means the gathering of human data.
This first condition is not enough. For instance, chimpanzees and humans (a) have an immune system, (b) share roughly 99% of their DNA, and (c) contract viruses, etc. HIV reproduces very slowly in chimpanzees, so animal experimenters reasoned by analogy that it would do the same in humans [3]. This turned out to be false.
CAMs must satisfy two further conditions: (1) the com-
mon properties (a, ..., e) must be causal properties which
(2) are causally connected with the property (f) we wish
to project – specifically, (f) should stand as the cause(s) or
effect(s) of the features (a, ..., e) in the model. When ani-
mals are used as causal analogical models the reasoning
process taking us from results in the model to the system
modelled is called causal analogical reasoning [3].
But it is not enough simply to point to similarities to jus-
tify cross-species extrapolation in the context of causal
analogical reasoning. In complex, interactive systems such
as organisms, we need to know whether there are relevant
causal differences, i.e., causal disanalogies (with respect to
mechanisms and pathways) that compromise the useful-
ness of the analogical reasoning. In other words, for a
CAM to be predictive, there should be no causally-rele-
vant disanalogies between the model and the thing being
modeled. For example, there must be no properties {g, h,
i} unique to either the model or the object modelled that
causally interact with the common properties {a...e},
since such properties will likely compromise the predic-
tive utility of the model.
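The three conditions can be summarized in a schematic sketch. This is our own illustrative rendering, not a formal account: the property names are arbitrary labels standing in for {a...e}, f, and the unique properties {g, h, i}, and the chimpanzee/HIV case is encoded only by analogy:

```python
# Schematic rendering of the CAM conditions discussed above.
# Property names are arbitrary labels; this is illustrative only.

def cam_supports_prediction(shared, unique_props, causal_props, interacts_with_shared):
    """Check the three conditions for projecting property 'f' from model to target."""
    cond1 = shared <= causal_props                      # (1) shared properties are causal
    cond2 = "f" in interacts_with_shared                # (2) f is causally connected to {a..e}
    cond3 = not (unique_props & interacts_with_shared)  # (3) no causal disanalogies
    return cond1 and cond2 and cond3

# Rendering of the chimpanzee/HIV case: the shared properties are causal
# and connected to the projected property, but a property unique to one
# species ('g') also interacts causally with the shared set.
print(cam_supports_prediction(
    shared={"a", "b", "c"},
    unique_props={"g"},
    causal_props={"a", "b", "c", "g"},
    interacts_with_shared={"f", "g"},
))  # prediction not licensed: condition (3) fails
```

The sketch makes the asymmetry visible: similarities license the projection only when no unique property interacts causally with the shared set, and it is precisely this last condition that evolved complex systems routinely violate.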
The idea here is an old one. It concerns causal determinism
– a concept that has played a fundamental role in the
development of modern science. Causal determinism
rests on two basic principles: (1) The Principle of Causality,
according to which all events have causes; and (2) The
Principle of Uniformity, according to which, for qualitatively identical systems, all other things being equal, same
cause is always followed by same effect.
These ideas played a role in our earlier discussion of New-
tonian mechanics at the beginning of this essay. In a way,
the whole issue of prediction comes down to the principle
of uniformity. Are the animals used to make predictions
about humans qualitatively identical to humans once
allowances have been made for difference in body weight
or surface area? No reasonable person who understands
evolutionary biology, and who knows, for example, that
rodents and humans have taken very different evolution-
ary trajectories since the lineages leading to modern
humans and rodents, respectively, diverged over seventy
million years ago, will expect qualitative identity. But per-
haps qualitative identity is an ideal that can only be
approximated. Are humans and their animal models suf-
ficiently similar for approximate predictions to be made?
The numerous studies referenced above say no. Why is
this the case?
Vertebrates are evolved complex systems. Such systems
may manifest different responses to the same stimuli due
to: (1) differences with respect to genes/alleles present;
(2) differences with respect to mutations in the same gene
(where one species has an ortholog of a gene found in
another); (3) differences with respect to proteins and pro-
tein activity; (4) differences with respect to gene regula-
tion; (5) differences in gene expression; (6) differences in
protein-protein interactions; (7) differences in genetic
networks; (8) differences with respect to organismal
organization (humans and rats may be intact systems, but
may be differently intact); (9) differences in environmen-
tal exposures; and last but not least; (10) differences with
respect to evolutionary histories. These are some of the
important reasons why members of one species often
respond differently to drugs and toxins, and experience
different diseases. These ten facts alone would be suffi-
Philosophy, Ethics, and Humanities in Medicine 2009, 4:2
Page 14 of 20
(page number not for citation purposes)
cient for some to conclude that animal models cannot be
predictive for human; that transspecies extrapolation is
impossible vis-à-vis drug response and disease research
especially when analyzed in lights of the standards society
today demands. (And the standards not set too high. If
you think they are ask yourself if, had you taken rofecoxib
and been harmed, would you have accepted a .99 PPV as
In biomedicine we do not have the mathematician's lux-
ury of modeling humans and rodents by beginning, "let
humans and rodents be spheres." If only it were that sim-
ple. Instead, what we do have are a lot of theoretical
grounds for questioning the predictive utility of animal
models. But of course, such theoretical reasoning may be
dismissed as being just that. The real question is one of
evidence. We have examined the evidence against and
found it compelling, but we should now examine the evidence cited as supporting the predictive nature of animals.
We will now turn our attention to the famous Olson
study, which many suppose to have settled the matters at
hand firmly on the side of animal models being predictive
for humans.
The Famous Olson Study
The Olson study [78] purports (and has certainly been
cited in this regard) to provide evidence of the vast predic-
tive utility of animal models in assessing human toxicity.
In response to an article by Shanks et al. [79], Conn and Parker quoted the Olson study, stating:
The authors have simply overlooked the classic study (Olson, Harry, et al., 2000. "Concordance of the Toxicity of Pharmaceuticals in Humans and in Animals." Regul Toxicol Pharmacol 32, 56–67) that summarizes
the results from 12 international pharmaceutical com-
panies on the predictivity of animal tests in human
toxicity. While the study is not perfect, the overall con-
clusion from 150 compounds and 221 human toxicity
events was that animal testing has significant predic-
tive power to detect most – but not all – areas of
human toxicity [80]. (Emphasis theirs.)
We encourage the reader to examine the Olson Study in its
entirety. Here we include some important representative
paragraphs from the Olson study and our commentary
will follow. We apologize for the length of the quote but
due to the importance many place on the paper, we
believe a thorough examination is justified.
This report summarizes the results of a multinational
pharmaceutical company survey and the outcome of
an International Life Sciences Institute (ILSI) Work-
shop (April 1999), which served to better understand
concordance of the toxicity of pharmaceuticals observed
in humans with that observed in experimental ani-
mals. The Workshop included representatives from
academia, the multinational pharmaceutical industry,
and international regulatory scientists. The main aim of
this project was to examine the strengths and weaknesses of
animal studies to predict human toxicity (HT). The data-
base was developed from a survey which covered only those
compounds where HTs were identified during clinical
development of new pharmaceuticals, determining
whether animal toxicity studies identified concordant
target organ toxicities in humans ...
The results showed the true positive HT concordance rate
of 71% for rodent and nonrodent species, with nonro-
dents alone being predictive for 63% of HTs and
rodents alone for 43%. The highest incidence of over-
all concordance was seen in hematological, gastroin-
testinal, and cardiovascular HTs, and the least was
seen in cutaneous HT. Where animal models, in one or
more species, identified concordant HT, 94% were first
observed in studies of 1 month or less in duration. These
survey results support the value of in vivo toxicology
studies to predict for many significant HTs associated
with pharmaceuticals and have helped to identify HT
categories that may benefit from improved methods ...
The primary objective was to examine how well toxicities
seen in preclinical animal studies would predict actual
human toxicities for a number of specific target organs
using a database of existing information ...
Although a considerable effort was made to collect data that
would enable a direct comparison of animal and human
toxicity, it was recognized from the outset that the data
could not answer completely the question of how well ani-
mal studies predict overall the responses of humans. To
achieve this would require information on all four boxes in
Fig. 1, and this was not practicable at this stage. The mag-
nitude of the data collection effort that this would
require was considered impractical at this stage. The
present analysis is a first step, in which data have been col-
lected pertaining only to the left column of Fig. 1: true pos-
itives and false negatives. [See figure 3.] By definition,
therefore the database only contains compounds stud-
ied in humans (and not on those that never reached
humans because they were considered too toxic in ani-
mals or were withdrawn for reasons unrelated to tox-
icity). Despite this limitation, it was deemed useful to
proceed in the expectation that any conclusions that
emerged would address some of the key questions and
focus attention on some of the strengths and weak-
nesses of animal studies ...
A working party of clinicians from participating com-
panies developed criteria for "significant" HTs to be
[Figure 3: Olson's figure 1. Figure 4: Olson's figure 3.]
included in the analysis. For inclusion a HT (a) had to
be responsible for termination of development, (b)
had to have resulted in a limitation of the dosage, (c)
had to have required drug level monitoring and per-
haps dose adjustment, or (d) had to have restricted the
target patient population. The HT threshold of severity
could be modulated by the compound's therapeutic
class (e.g., anticancer vs anti-inflammatory drugs). In
this way, the myriad of lesser "side effects" that always
accompany new drug development but are not sufficient to
restrict development were excluded. The judgments of the
contributing industrial clinicians were final as to the
validity of including a compound. The clinical trial
phase when the HT was first detected and whether HT
was considered to be pharmacology-related was
recorded. HTs were categorized by organ system and
detailed symptoms according to standard nomencla-
ture (COSTART, National Technical Information Serv-
ice, 1999) ...
Concordance by one or more species: Overall and by
HT. Overall, the true positive concordance rate (sensitivity)
was 70% for one or more preclinical animal model species
(either in safety pharmacology or in safety toxicology)
showing target organ toxicity in the same organ system as
the HT [Fig. 4].
This study did not attempt to assess the predictability of pre-
clinical experimental data to humans. What it evaluated
was the concordance between adverse findings in clin-
ical data with data which had been generated in exper-
imental animals (preclinical toxicology) [78].
(Emphasis added.)
The Olson Study, as noted above, has been employed by
researchers to justify claims about the predictive utility of
animal models. However we think there is much less here
than meets the eye. Here's why:
1. The study was primarily conducted and published by
the pharmaceutical industry. This does not, in and of
itself, invalidate the study. However, one should never
lose sight of the fact that the study was put together by
parties with a vested interest in the outcome. If this were
the only concern, perhaps it could be ignored; however, as
we will now show, there are rather more serious flaws.
2. The study says at the outset that it is aimed at measuring
the predictive reliability of animal models. Later the
authors concede that their methods are not, as a matter of
fact, up to this task. This makes us wonder how many of
those who cite the study have actually read it in its entirety.
3. The authors of the study invented new statistical termi-
nology to describe the results. The crucial term here is
"true positive concordance rate" which sounds similar to
"true predictive value" (which is what should have been
measured, but was not). A Google search on "true positive
concordance rate" yielded twelve results (counting
repeats), all of which referred to the Olson Study (see fig-
ure 5). At least seven of the twelve Google hits qualified
the term "true positive concordance rate" with the term
"sensitivity" – a well-known statistical concept. In effect,
these two terms are synonyms. Presumably the authors of
the study must have known that "sensitivity" does not
measure "true predictive value." In addition, you would
need information on "specificity" and so on to nail down
this latter quantity. If all the Olson Study measured was
sensitivity, its conclusions are largely irrelevant to the
great prediction debate.
4. Any animal giving the same response as a human was
counted as a positive result. So if six species were tested
and one of the six mimicked humans, that was counted as
a positive. The Olson Study was concerned primarily not
with prediction, but with retroactive simulation of
antecedently known human results.
5. Only drugs in clinical trials were studied. Many drugs
tested do not actually get that far because they fail in ani-
mal studies.
6. "...the myriad of lesser "side effects" that always accom-
pany new drug development but are not sufficient to
restrict development were excluded." A lesser side effect is
one that affects someone else. While hepatotoxicity is a
major side effect, lesser side effects (which matter greatly
to patients) include profound nausea, tinnitus, pleuritis,
headaches and so forth. We are also left wondering
whether there was any independent scientific justification
for the criteria used to divide side effects into major and
lesser ones.
7. Even if all the data are good – and they may well be – a
sensitivity (i.e., a true positive concordance rate) of 70%
does not settle the prediction question. Sensitivity is not
synonymous with prediction, and even if a positive
predictive value of 70% is assumed, that figure is
inadequate for predicting human response. In
carcinogenicity studies, the sensitivity using rodents may
well be 100%; the specificity, however, is another story.
That is the reason rodents cannot be said to predict
human outcomes in that particular biomedical context.
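The statistical point in items 3 and 7 can be made concrete with a short sketch. The counts below are hypothetical, invented purely for illustration (they are not Olson data); they fill in all four boxes of the 2 × 2 table, whereas the Olson survey collected only the left column (true positives and false negatives):

```python
# Hypothetical counts for the four boxes of a 2x2 table comparing an
# animal test result against the human outcome. All numbers are
# illustrative assumptions, not data from the Olson study.
TP = 70   # animal positive, human toxicity present   (left column)
FN = 30   # animal negative, human toxicity present   (left column)
FP = 540  # animal positive, human toxicity absent    (right column)
TN = 360  # animal negative, human toxicity absent    (right column)

sensitivity = TP / (TP + FN)  # computable from the left column alone
specificity = TN / (TN + FP)  # requires the right column
ppv = TP / (TP + FP)          # positive predictive value: requires FP

print(f"sensitivity = {sensitivity:.2f}")  # 0.70, the Olson-style figure
print(f"specificity = {specificity:.2f}")  # 0.40
print(f"PPV         = {ppv:.2f}")          # 0.11: most positives are false
```

With these assumed numbers the animal tests "catch" 70% of human toxicities, yet a positive result is correct only about 11% of the time; even with perfect sensitivity (FN = 0), the PPV here would still fall below 16%. Sensitivity alone, which is all a "true positive concordance rate" reports, cannot settle the prediction question.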
The Olson Study is certainly interesting, but even in its
own terms it does not support the notion that animal
models are predictive for humans. We think it should be
cited with caution. A citation search (also performed with
Google on 7/23/08) led us to 114 citations for the Olson
paper. We question whether caution is being used in all
these citations.

[Figure 5: Google results for "true positive concordance rate" – 12 results, all referring to the Olson study.]
Mark Kac stated, "A proof is that which convinces a rea-
sonable man." Even though the burden of proof is not on
us to prove animal models are not predictive, we believe
we have presented a proof that would convince a reason-
able man.
There are both quantitative and qualitative differences
between species. This is not surprising considering our
current level of knowledge vis-à-vis evo devo, gene regula-
tion and expression, epigenetics, complexity theory, and
comparative genomics. Hypotheses generate predictions,
which can then be proven true or false. Predict has a very
distinct meaning in science and according to some is the
foundation of science itself. Prediction does not mean ret-
rospectively finding one animal that responded to stimuli
like humans and therefore saying that the animal
predicted human response, nor does it mean cherry-picking
data, nor does it mean occasionally getting the right
answer.
When a concept such as "Animal models can predict
human response" is accepted as true, it is not functioning
as a hypothesis. We have referred to this as an overarching
hypothesis but could have easily referred to it as an
unfounded assumption. An assumption or overarching
hypothesis might in fact be true but its truth must be
proven. If a modality such as animal testing or using ani-
mals to predict pathophysiology in human disease is said
to be a predictive modality, then any data generated from
said modality should have a very high probability of being
true in humans. Animal models of disease and drug
response fail this criterion.
In medicine, even positive predictive values of 0.99 may be
inadequate for some tests, and animal models do not even
roughly approximate that. Therefore, animal models are
not predictors of human response. Some animals do occa-
sionally respond to stimuli as do humans. However, how
are we to know prospectively which animal will mimic
humans? Advocates who maintain animals are predictive
confuse sensitivity with prediction. Animals as a group are
extremely sensitive for carcinogenicity or other biological
phenomena. Test one hundred different strains or species
and one is very likely to react like humans. But the specif-
icity is very low; likewise the positive and negative predic-
tive values. (Even if science did decide to abandon the
historically correct use of the word predict, every time an
animal-model advocate said animal species X predicted
human response Y, she would also have to admit that
animal species A, B, C, D, E and so forth predicted
incorrectly. Thus the justification that animal models per
se make our drug supply safer, or predict facts about
human disease, would not hold.)
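The "test many species, keep the one that matched" problem can also be quantified. Assuming, purely for illustration, that each species independently agrees with the human outcome with probability 0.3, the chance that at least one species in a panel retrospectively "predicted" the human response climbs rapidly with the number of species tested:

```python
def p_some_species_matches(p_match: float, n_species: int) -> float:
    """P(at least one of n independent species mimics the human result)."""
    return 1 - (1 - p_match) ** n_species

P_MATCH = 0.3  # assumed per-species agreement rate (illustrative only)
for n in (1, 3, 10):
    p = p_some_species_matches(P_MATCH, n)
    print(f"{n:2d} species tested -> P(some species 'predicted') = {p:.2f}")
```

Reporting the one concordant species after the fact thus makes animal testing look predictive as a whole, even though a species chosen in advance would, on these assumptions, be wrong 70% of the time.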
Some have suggested we should not criticize animal mod-
els unless we have better suggestions for research and test-
ing [27]. It is not incumbent upon us to postpone
criticizing animal models as not being predictive until
predictive models such as in silico, in vitro or in vivo are
available. Nor is it incumbent upon us to invent such
modalities. Astrology is not predictive for foretelling the
future therefore we criticize such use even though we have
no notion of how to go about inventing such a future-tell-
ing device.
Some have also suggested that animal models may some-
day be predictive and that we should so acknowledge.
While this is true in the sense that anything is possible, it
seems very unlikely, as genetically modified organisms
have been seen to suffer the same prediction problems we
have addressed [16,81-87] and, as mentioned, different
humans have very different responses to drugs and
disease. Considering our current understanding of complex
systems and evolution, it would be surprising if one
species could be used to predict outcomes in another at
the fine-grained level at which disease and drug response
are studied today, and with the accuracy that society
demands from medical science.
There are direct and indirect consequences to this misun-
derstanding of what prediction means. If we did not allow
on the market any chemical or drug that causes cancer, or
is teratogenic, or causes severe side effects in any species,
then we would have no chemicals or drugs at all. Further-
more, there is a cost to keeping otherwise good chemicals
off the market. We lose treatments, perhaps even cures;
the income that could have been generated; and new
knowledge that could have been gained from learning
more about the chemical. These are not insignificant
downsides. Since we now understand vis-à-vis personal-
ized medicine that even humans differ in their response to
drugs and disease and hence one human cannot predict
what a drug will do to another human, it seems illogical
to seek predictive models in species completely different
from humans. If we truly want predictive tests and
research methods (and we do), it would seem logical to
start looking intraspecies, not interspecies.
Competing interests
The authors declare that they have no competing interests.
Authors' contributions
All authors contributed equally and have read and
approved the manuscript.
About the authors
Niall Shanks, PhD is the Curtis D. Gridley Professor in the
History and Philosophy of Science at Wichita State Uni-
versity where he is also Professor of Philosophy, Adjunct
Professor of Biological Sciences, Adjunct Professor of
Physics, and Associate Member of the Graduate Faculty.
He is the President (2008–9) of the Southwest and Rocky
Mountain Division of the American Association for the
Advancement of Science. He received a PhD in Philoso-
phy from the University of Alberta.
Ray Greek, MD completed medical school at the Univer-
sity of Alabama in Birmingham and a residency in
anesthesiology at the University of Wisconsin-Madison.
He taught at the University of Wisconsin and Thomas Jef-
ferson University in Philadelphia. He is now the president
of the not for profit Americans For Medical Advancement.
Jean Greek, DVM completed veterinary school at the Uni-
versity of Wisconsin-Madison and a residency in
dermatology at the University of Pennsylvania. She taught at the
University of Missouri and is now in private practice.
Acknowledgements
This paper has been made possible in part by a grant from
the National Anti-Vivisection Society of Chicago, IL to
Americans For Medical Advancement. Americans For Medical
Advancement is a not for profit educational organization
whose position on the use of animals is summarized
nicely in this article. While NAVS opposes all animal-
based research, AFMA does not. All authors are officers or
directors of Americans For Medical Advancement.
References
1. Sarewitz D, Pielke RA Jr: Prediction in Science and Policy. In
Prediction: Science, Decision Making, and the Future of Nature Edited by:
Sarewitz D, Pielke RA Jr, Byerly R Jr. Island Press; 2000:11-22.
2. Quine W: Quiddities: An Intermittently Philosophical Dictionary
Cambridge: The Belknap Press of Harvard University Press; 2005.
3. LaFollette H, Shanks N: Brute Science: Dilemmas of animal experimenta-
tion London and New York: Routledge; 1996.
4. Greek R, Greek J: Specious Science New York: Continuum Int; 2002.
5. Xceleron [
6. Altman L: Who Goes First? The Story of Self-Experimentation in Medicine
University of California Press; 1998.
7. Bruder CE, Piotrowski A, Gijsbers AA, Andersson R, Erickson S, de
Stahl TD, Menzel U, Sandgren J, von Tell D, Poplawski A, Crowley M,
Crasto C, Partridge EC, Tiwari H, Allison DB, Komorowski J, van
Ommen GJ, Boomsma DI, Pedersen NL, den Dunnen JT, Wirdefeldt
K, Dumanski JP: Phenotypically concordant and discordant
monozygotic twins display different DNA copy-number-var-
iation profiles. Am J Hum Genet 2008, 82:763-771.
8. Fraga MF, Ballestar E, Paz MF, Ropero S, Setien F, Ballestar ML, Heine-
Suner D, Cigudosa JC, Urioste M, Benitez J, Boix-Chornet M,
Sanchez-Aguilera A, Ling C, Carlsson E, Poulsen P, Vaag A, Stephan Z,
Spector TD, Wu YZ, Plass C, Esteller M: Epigenetic differences
arise during the lifetime of monozygotic twins. Proc Natl Acad
Sci USA 2005, 102:10604-10609.
9. Weiss ST, McLeod HL, Flockhart DA, Dolan ME, Benowitz NL, John-
son JA, Ratain MJ, Giacomini KM: Creating and evaluating
genetic tests predictive of drug response. Nat Rev Drug Discov
2008, 7:568-574.
10. Kaiser J: Gender in the pharmacy: does it matter? Science 2005,
11. Willyard C: Blue's clues. Nat Med 2007, 13:1272-1273.
12. Couzin J: Cancer research. Probing the roots of race and can-
cer. Science 2007, 315:592-594.
13. Holden C: Sex and the suffering brain. Science 2005, 308:1574.
14. Salmon W: Rational Prediction. Philosophy of Science
15. Butcher EC: Can cell systems biology rescue drug discovery?
Nat Rev Drug Discov 2005, 4:461-467.
16. Horrobin DF: Modern biomedical research: an internally self-
consistent universe with little contact with medical reality?
Nat Rev Drug Discov 2003, 2:151-154.
17. Pound P, Ebrahim S, Sandercock P, Bracken MB, Roberts I: Where is
the evidence that animal research benefits humans? BMJ
2004, 328:514-517.
18. Editorial: The time is now. Nat Rev Drug Discov 2005, 4:613.
19. Littman BH, Williams SA: The ultimate model organism:
progress in experimental medicine. Nat Rev Drug Discov 2005,
20. Uehling M: Microdoses of Excitement over AMS, 'Phase 0' Tri-
als. Bio-IT World 2006, 2006:.
21. Dixit R, Boelsterli U: Healthy animals and animal models of
human disease(s) in safety assessment of human pharmaceu-
ticals, including therapeutic antibodies. Drug Discovery Today
2007, 12:336-342.
22. Greek R, Greek J: Sacred Cows and Golden Geese: The Human Cost of
Experiments on Animals New York: Continuum Int; 2000.
23. Greek J, Greek R: What Will We Do if We Don't Experiment on Animals.
Continuum 2004.
24. FDA Issues Advice to Make Earliest Stages Of Clinical Drug
Development More Efficient [
25. Knight A: The beginning of the end for chimpanzee experi-
ments? Philos Ethics Humanit Med 2008, 3:16.
26. Shanks N, Pyles RA: Evolution and medicine: the long reach of
"Dr. Darwin". Philos Ethics Humanit Med 2007, 2:4.
27. Vineis P, Melnick R: A Darwinian perspective: right premises,
questionable conclusion. A commentary on Niall Shanks and
Rebecca Pyles' "evolution and medicine: the long reach of
"Dr. Darwin"". Philos Ethics Humanit Med 2008, 3:6.
28. Debate title: Animals are predictive for humans
8464924004908818871&q=madison+debate+animal&total=5&start=0&num=30&so=0&type=search&plindex=0]
29. Gad S: Preface. In Animal Models in Toxicology Edited by: Gad S. CRC
Press; 2007:1-18.
30. Fomchenko EI, Holland EC: Mouse models of brain tumors and
their applications in preclinical trials. Clin Cancer Res 2006,
31. Hau J: Animal Models. In Handbook of Laboratory Animal Science Ani-
mal Models Volume II. 2nd edition. Edited by: Hau J, van Hoosier GK
Jr. CRC Press; 2003:2-8.
32. Staff: Of Mice...and Humans. Drug Discovery and Development 2008,
33. FDA panel recommends continued use of controversial dia-
betes drug []
34. Masubuchi Y: Metabolic and non-metabolic factors determin-
ing troglitazone hepatotoxicity: a review. Drug Metab Pharma-
cokinet 2006, 21:347-356.
35. Topol EJ: Failing the public health – rofecoxib, Merck, and the
FDA. N Engl J Med 2004, 351:1707-1709.
36. Heywood R: Clinical Toxicity – Could it have been predicted?
Post-marketing experience. Animal Toxicity Studies: Their Rele-
vance for Man 1990:57-67.
37. Spriet-Pourra C, Auriche M: Drug Withdrawal from Sale. New
York 2nd edition. 1994.
38. Igarashi T: The duration of toxicity studies required to support
repeated dosing in clinical investigation – A toxicologist's
opinion. In CMR Workshop: The Timing of Toxicological Studies to Sup-
port Clinical Trials Edited by: Parkinson CNM, Lumley C, Walker SR.
Boston/UK: Kluwer; 1994:67-74.
39. Sankar U: The Delicate Toxicity Balance in Drug Discovery.
The Scientist 2005, 19:32.
40. Wilbourn J, Haroun L, Heseltine E, Kaldor J, Partensky C, Vainio H:
Response of experimental animals to human carcinogens: an
analysis based upon the IARC Monographs programme. Car-
cinogenesis 1986, 7:1853-1863.
41. Rall DP: Laboratory animal tests and human cancer. Drug
Metab Rev 2000, 32:119-128.
42. Tomatis L, Aitio A, Wilbourn J, Shuker L: Human carcinogens so
far identified. Jpn J Cancer Res 1989, 80:795-807.
43. Knight A, Bailey J, Balcombe J: Animal carcinogenicity studies: 1.
Poor human predictivity. Altern Lab Anim 2006, 34:19-27.
44. Tomatis L, Wilbourn L: Evaluation of carcinogenic risk to
humans: the experience of IARC. In New Frontiers in Cancer Cau-
sation Edited by: Iversen. Washington, DC: Taylor and Francis;
45. IARC: IARC Monographs on the Evaluation of Carcinogenic Risks to
Humans Lyon: IARC; 1972.
46. IARC monographs programme on the evaluation of carcino-
genic risks to humans [
47. Haseman JK: Using the NTP database to assess the value of
rodent carcinogenicity studies for determining human can-
cer risk. Drug Metab Rev 2000, 32:169-186.
48. Gold LS, Slone TH, Ames BN: What do animal cancer tests tell
us about human cancer risk?: Overview of analyses of the
carcinogenic potency database. Drug Metab Rev 1998,
49. Freeman M, St Johnston D: Wherefore DMM? Disease Models &
Mechanisms 2008, 1:6-7.
50. Schardein J: Drugs as Teratogens CRC Press; 1976.
51. Schardein J: Chemically Induced Birth Defects. Marcel Dekker 1985.
52. Manson J, Wise D: Teratogens. Casarett and Doull's Toxicology 4th
edition. 1993:228.
53. Caldwell J: Comparative Aspects of Detoxification in Mam-
mals. In Enzymatic Basis of Detoxification Volume 1. Edited by: Jakoby
W. New York: Academic Press; 1980.
54. Runner MN: Comparative pharmacology in relation to tera-
togenesis. Fed Proc 1967, 26:1131-1136.
55. Keller SJ, Smith MK: Animal virus screens for potential tera-
togens. I. Poxvirus morphogenesis. Teratog Carcinog Mutagen
1982, 2:361-374.
56. Staples RE, Holtkamp DE: Effects of Parental Thalidomide
Treatment on Gestation and Fetal Development. Exp Mol
Pathol 1963, 26:81-106.
57. Caldwell J: Problems and opportunities in toxicity testing aris-
ing from species differences in xenobiotic metabolism. Toxi-
col Lett 1992, 64–65(Spec No):651-659.
58. Wall RJ, Shani M: Are animal models as good as we think? The-
riogenology 2008, 69:2-9.
59. Curry SH: Why have so many drugs with stellar results in lab-
oratory stroke models failed in clinical trials? A theory based
on allometric relationships. Ann N Y Acad Sci 2003, 993:69-74.
discussion 79–81
60. Shubick P: Statement of the Problem. In Human Epidemiology and
Animal Laboratory Correlations in Chemical Carcinogenesis Edited by:
Coulston F, Shubick P. Ablex Pub; 1980:5-17.
61. Coulston F: Final Discussion. In Human Epidemiology and Animal
Laboratory Correlations in Chemical Carcinogenesis Edited by: Coulston
F, Shubick P. Ablex; 1980:407.
62. Council_on_Scientific_Affairs: Carcinogen regulation. JAMA 1981,
63. Salsburg D: The lifetime feeding study in mice and rats – an
examination of its validity as a bioassay for human carcino-
gens. Fundam Appl Toxicol 1983, 3:63-67.
64. IARC: IARC Working group on the evaluation of carcino-
genic risks to humans. Lyon 1972, 1–78:.
65. Sloan DA, Fleiszer DM, Richards GK, Murray D, Brown RA:
Increased incidence of experimental colon cancer associated
with long-term metronidazole therapy. Am J Surg 1983,
66. Clemmensen J, Hjalgrim-Jensen S: On the absence of
carcinogenicity to man of phenobarbital. In Human Epidemiology and
Animal Laboratory Correlations in Chemical Carcinogenesis Edited by:
Coulston F, Shubick P. Ablex Pub; 1980:251-265.
67. Clayson D: The carcinogenic action of drugs in man and ani-
mals. In Human Epidemiology and Animal Laboratory Correlations in
Chemical Carcinogenesis Edited by: Coulston F, Shubick P. Ablex Pub;
68. Anisimov V: Carcinogenesis and Aging Boca Rotan: CRC Press; 1987.
69. Anisimov V: Molecular and Physiological Mechanisms of Aging St Peters-
burg: Nauka; 2003.
70. Dilman VM, Anisimov VN: Effect of treatment with phenformin,
diphenylhydantoin or L-dopa on life span and tumour inci-
dence in C3H/Sn mice. Gerontology 1980, 26:241-246.
71. IARC: Some aromatic amines, hydrazine and related sub-
stances, n-nitroso compounds and miscellaneous alkylating
IARC monograph on the evaluation of carcinogenic risks to
humans, Lyon 1974, 4:.
72. Anisimov VN, Ukraintseva SV, Yashin AI: Cancer in rodents: does
it tell us about cancer in humans? Nat Rev Cancer 2005,
73. Hahn WC, Weinberg RA: Modelling the molecular circuitry of
cancer. Nat Rev Cancer 2002, 2:331-341.
74. Rangarajan A, Weinberg RA: Opinion: Comparative biology of
mouse versus human cells: modelling human cancer in mice.
Nat Rev Cancer 2003, 3:952-959.
75. Anisimov VN: Age as a risk in multistage carcinogenesis. In
Comprehensive Geriatric Oncology 2nd edition. Edited by: Balducci L,
Ershler WB, Lyman GH. M E: Informa Healthcare. Taylor and Francis
group; 2004:75-101. 157–178
76. Overmier JB, Carroll ME: Basic Issues in the Use of Animals in
Health Research. In Animal Research and Human Health Edited by:
Carroll ME, Overmier JB. American Psychological Association;
77. Kolata G: 2 Top Diet Drugs Are Recalled Amid Reports of
Heart Defects. New York Times. New York 1997.
78. Olson H, Betton G, Robinson D, Thomas K, Monro A, Kolaja G, Lilly
P, Sanders J, Sipes G, Bracken W, Dorato M, Van Deun K, Smith P,
Berger B, Heller A: Concordance of the toxicity of pharmaceu-
ticals in humans and in animals. Regul Toxicol Pharmacol 2000,
79. Shanks N, Greek R, Nobis N, Greek J: Animals and Medicine: Do
Animal Experiments Predict Human Response? Skeptic 2007,
80. Conn P, Parker J: Letter. Animal research wars. Skeptic 2007,
81. Van Regenmortel MH: Reductionism and complexity in
molecular biology. Scientists now have the tools to unravel
biological complexity and overcome the limitations of reductionism. EMBO Rep
2004, 5:1016-1020.
82. Morange M: A successful form for reductionism. The Biochemist
2001, 23:37-39.
83. Morange M: The misunderstood gene Cambridge: Harvard University
Press; 2001.
84. Mepham TB, Combes RD, Balls M, Barbieri O, Blokhuis HJ, Costa P,
Crilly RE, de Cock Buning T, Delpire VC, O'Hare MJ, Houdebine LM,
van Kreijl CF, Meer M van der, Reinhardt CA, Wolf E, van Zeller AM:
The Use of Transgenic Animals in the European Union: The
Report and Recommendations of ECVAM Workshop 28.
Altern Lab Anim 1998, 26:21-43.
85. Liu Z, Maas K, Aune TM: Comparison of differentially expressed
genes in T lymphocytes between human autoimmune dis-
ease and murine models of autoimmune disease. Clin Immunol
2004, 112:225-230.
86. Dennis C: Cancer: off by a whisker. Nature 2006, 442:739-741.
87. Houdebine LM: Transgenic animal models in biomedical
research. Methods Mol Biol 2007, 360:163-202.
... The proposal to use new animal models to predict what will happen in humans (Shanks et al., 2009) makes sense when the analogies between the entities involved are taken into account. Currently high protein and low carbohydrate diets are considered to have positive effects on functional capacity in patients with heart damage, although studies are believed to be necessary and since they are not carried out in animal models, their realization is not very feasible. ...
... The use of animals by the biomedical sciences as models to help understand and predict responses in humans, in toxicology and pharmacology in particular, remains the primary tool for biomedical advancements and a source of significant controversy. In general, animals have performed exceptionally well as predictive models for humans when used correctly (Shanks et al., 2009). ...
Proposing animal models that allow predicting results in humans becomes critical when the analogies in physiology between both entities are reviewed. About heart disease, the heart rate in humans is more similar to that of chickens than that of the mouse, rat or other mammalian models generally used to study this disease. In the present work, the ethology on the attraction of chickens to earthworms as a food source was reviewed, in addition hematological, organ and urological parameters were measured in chickens fed with double and triple the protein percentage supplied with Eisenia foetida live added to the feed. commercial for the Cobb500 line. The results show a marked attraction depending on the nutritional status of the birds for Eisenia foetida and differences in hematological parameters, but not for urological parameters. The morphological characteristics of the heart showed a clear association between three times the protein load in the food and cardiac damage in 2 of 7 animals fed during 7 weeks of study. The present work represents the first contribution with the animal model approach in chickens to study cardiac damage and its possible prediction for humans.
... Traditional chemical risk assessment methodology relies on the use of animal models, which is now discouraged, not only due to ethical reasons but also from concerns over the predictivity of animal data for human responses [1][2][3]. In order to replace traditional animal assessment, alternative testing strategies are needed with improved quality, efficiency and speed of chemical hazard and risk assessments [4]. ...
Background Toxicity assessment for regulatory purposes is starting to move away from traditional in vivo methods and towards new approach methodologies (NAM) such as high-throughput in vitro models and computational tools. For materials with limited hazard information, utilising quantitative Adverse Outcome Pathways (AOPs) in a testing strategy involving NAM can produce information relevant for risk assessment. The aim of this work was to determine the feasibility of linking in vitro endpoints to in vivo events, and moreover to key events associated with the onset of a chosen adverse outcome, to aid in the development of NAM testing strategies. To do this, we focussed on the AOP relating to the onset of pulmonary fibrosis. Results We extracted in vivo and in vitro dose–response information for particles known to induce pulmonary fibrosis (crystalline silica, specifically α-quartz). To test the in vitro–in vivo extrapolation (IVIVE) determined for crystalline silica, cerium dioxide nanoparticles (nano-CeO2) were used as a case study, allowing us to evaluate our findings with a less studied substance. The IVIVE methodology outlined in this paper consists of five steps, which can be more generally summarised into two categories: (i) aligning the in vivo and in vitro dosimetry, and (ii) comparing the dose–response curves and deriving conversion factors. Conclusion Our analysis shows promising results with regard to the correlation of in vitro cytokine secretion with in vivo acute pulmonary inflammation assessed by polymorphonuclear leukocyte influx. Most notable is the potential of using IL-6 and IL-1β cytokine secretion from simple in vitro submerged models as a screening tool to assess the likelihood of lung inflammation at an early stage in product development, hence allowing a more targeted investigation using either a smaller, more targeted in vivo study or, in the future, a more complex in vitro protocol.
This paper also highlights the strengths and limitations as well as the current difficulties in performing IVIVE assessment and suggestions for overcoming these issues.
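The second category of the IVIVE workflow described above, comparing dose–response curves and deriving conversion factors, can be sketched in a few lines. This is a hypothetical illustration, not the cited study's actual procedure or data: the `ed50` helper and all dose/response values are invented, and a simple linear interpolation stands in for formal dose–response modelling.

```python
# Hypothetical sketch of deriving an IVIVE conversion factor by comparing
# dose-response curves. All numbers are illustrative, not study data.

def ed50(doses, responses):
    """Linearly interpolate the dose giving 50% of the maximal response."""
    target = 0.5 * max(responses)
    for (d0, r0), (d1, r1) in zip(
        zip(doses, responses), zip(doses[1:], responses[1:])
    ):
        if r0 <= target <= r1:
            return d0 + (target - r0) * (d1 - d0) / (r1 - r0)
    raise ValueError("target response not bracketed by the data")

# Illustrative in vitro (cytokine secretion) and in vivo (PMN influx)
# curves, each on its own dose metric (e.g. ug/cm^2 vs ug per lung).
in_vitro_doses, in_vitro_resp = [1, 5, 10, 20, 40], [0.05, 0.2, 0.45, 0.8, 1.0]
in_vivo_doses, in_vivo_resp = [10, 50, 100, 200], [0.1, 0.4, 0.7, 1.0]

# Conversion factor: ratio of doses producing the same relative response.
cf = ed50(in_vivo_doses, in_vivo_resp) / ed50(in_vitro_doses, in_vitro_resp)
print(f"conversion factor (in vivo dose per unit in vitro dose): {cf:.2f}")
```

In practice each curve would be fitted with a proper dose–response model and the comparison made at a regulatory benchmark response rather than the ED50, but the ratio-of-equipotent-doses idea is the same.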
... Commonly, animal studies measure a set of behavioral outcomes considered analogous to those used in clinical settings, often deployed to quantify a treatment effect [4,5]. However, due to the complex recovery of the injured brain and the difficulty of scaling animal models to human TBI, the outcomes measured may not be optimally selected [4]. Limited experimental resources may lead to studies with either few animals but large batteries of behavioral outcomes, or many animals but few behavioral tests, thereby risking under- or overpowered experiments, respectively. ...
... Parameters from the smallest unit of each test were grouped in level 3, e.g., Morris water maze (hidden trials 1-5), open field test (squared arena), or forced swim test (days 1-3). For every week and level 3 test, the best logistic classification model in terms of area under the receiver operating characteristic curve (AUC) was identified. ...
Repetitive mild traumatic brain injury (rmTBI) is a potentially debilitating condition with long-term sequelae. Animal models are used to study rmTBI in a controlled environment, but there is currently no established standard battery of behavioral tests. Primarily, we aimed to identify the best combination and timing of behavioral tests to distinguish injured from non-injured animals in rmTBI studies, and secondarily, to determine whether combinations of independent experiments have better behavioral outcome prediction accuracy. Data from 1,203 mice across 58 rmTBI experiments, some of which have already been published, were used. In total, 11 types of behavioral tests were measured by 37 parameters at 13 timepoints during the first 6 months after injury. Univariate regression analyses were used to identify optimal combinations of behavioral tests and whether the inclusion of multiple heterogeneous experiments improved accuracy. k-means clustering was used to determine whether a combination of multiple tests could discriminate mice with rmTBI from non-injured mice. We found that a combination of behavioral tests outperformed individual tests alone when discriminating animals with rmTBI from non-injured animals. The best timing for most individual behavioral tests was 3-4 months after the first injury. Overall, the Morris water maze (MWM; hidden and probe frequency) was the behavioral test with the best capability of detecting injury effects (AUC=0.98). Combinations of open field tests and elevated plus mazes also performed well (AUC=0.92), as did the forced swim test alone (AUC=0.90). In summary, multiple heterogeneous experiments tended to predict outcome better than individual experiments, and the MWM at 3-4 months after injury was the optimal test, though several combinations performed well too. To aid the design of future preclinical rmTBI trials, we provide an interactive online application utilizing the data from this study.
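The AUC values quoted above measure how well a behavioral score separates injured from sham animals. As a hedged illustration of the underlying metric (not the study's own pipeline), the AUC can be computed directly from two groups of scores via its Mann-Whitney formulation; the latency values below are synthetic.

```python
# Illustrative sketch: AUC of a single behavioral parameter as an
# injury discriminator. Scores are invented, not data from the study.

def auc(pos_scores, neg_scores):
    """Probability that a randomly chosen injured animal scores above a
    randomly chosen sham animal (Mann-Whitney AUC); ties count as 0.5."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))

# e.g. Morris water maze escape latencies (s): injured mice tend to be slower
injured = [52, 48, 60, 32, 55]
sham = [30, 35, 28, 44, 33]
print(f"AUC = {auc(injured, sham):.2f}")  # 1.0 = perfect separation, 0.5 = chance
```

An AUC of 0.98, as reported for the MWM, means an injured mouse outscores a sham mouse on that parameter in 98% of random pairings.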
... Evidence suggests that the microenvironment plays a crucial role in tissue remodeling and regeneration. Given the genetic interspecies differences, findings in animal models often do not translate into effective interventions for human patients [19]. ...
The urothelium covers the inner surface of the urinary tract, forming a urinary tract barrier. Impairment of the integrity and dysfunction of the urinary tract barrier is associated with the occurrence and development of various diseases. The development of a three-dimensional model of the urothelium is critical for pathophysiological studies of this site, especially under physiological fluid shear stress stimulated by the urinary flow. In this study, a urothelium on-chip is fabricated with micromilling and replica molding techniques, which contains a microfluidic chip for cell culture and a pump-based fluid perfusion system. The mechanical properties of the human urinary tract are simulated by adjusting the concentration and degree of amino substitution of the gelatin methacrylate hydrogel. The matrix stiffness is similar to the natural urinary tract. Pulsatile flow and periodic flow are provided to simulate the fluid environment of the upper and lower urinary tracts, respectively. The results show that the physiological fluid shear stress could promote the differentiation and maturation of urothelial cells. The model could simulate the three-dimensional structure of urothelium and urinary flow microenvironment, showing morphological structure close to the natural urothelium, specific differentiation and maturation markers (uroplakin 2, cytokeratin 20), and urothelial barrier function.
... with regard to CYP expression and expression levels) do not always allow the effects of a tested substance to be transferred from the test animal to humans [221]. In vitro experiments can thus be used as a supplement to the toxicological testing of drugs, as such systems can detect undesirable adverse effects that only occur in human cells in advance. ...
The conversion of many drugs into toxic or less toxic metabolites depends on the first-pass effect of the liver. Toxic metabolites are associated with drug-induced liver injury, often causing liver failure. Thus, therapies often have to be interrupted despite a promising drug effect. An improved drug metabolism model consisting of liver-derived CYP3A4-overexpressing cells and tumour cells was developed to study drugs in vitro for their efficacy on tumour cells while taking the first-pass effect into account. Two different co-culture approaches were tested (transwell insert-based and liver cell supernatant transfer) with HepG2 CYP3A4 cells, originally derived from a tumour characterized by genetic instability, serving as the in vitro liver cell model for establishing the system. PANC-1 cells (pancreatic cancer) or MCF-7 cells (breast cancer) were used as prototypical tumour cell lines. For cell treatment, three drugs active against PANC-1 or MCF-7 cells, MG-132 (MG), Taxol (TX), and Tamoxifen (TAM), were used, as these drugs are inactivated (MG and TX) or activated (TAM) by the liver first-pass effect. Three widely used cytotoxicity assays (XTT, CellTiter-Glo™ 2.0, and trypan blue exclusion) were used to determine acute cytotoxicity to both liver and tumour cells. For co-culture validation, HepG2 CYP3A4 cells were replaced by CYP3A4-overexpressing immortalized primary-like hepatocytes, the FH3 CYP3A4 cells, which represent the human liver cell model more closely. Finally, in a pilot project, primary colon carcinoma cells (pCC-cells) were tested along with HepG2 CYP3A4 and MG. The use of co-cultures with HepG2 CYP3A4 showed that first-pass metabolism of MG resulted in reduced cytostatic effects in PANC-1 cells. The TX-induced cytotoxic effect on PANC-1 and MCF-7 cell lines was only slightly attenuated by the first-pass effect.
Activation of TAM by FH3 CYP3A4 liver cells in indirect co-culture with transwell inserts enhanced the cytotoxic effect on MCF-7 cells most efficiently. In contrast, the use of HepG2 CYP3A4 cells proved insufficient. pCC-cells were affected by MG in MC in a dose-dependent manner. Using co-culture systems with HepG2 CYP3A4 cells, colon cancer cells were clearly protected from MG up to a concentration of 2 μM. In conclusion, the first-pass effect could be simulated in vitro in MG- and TX-treated PANC-1 cells and TX-treated MCF-7 cells, as well as in MG-treated pCC-cells, using HepG2 CYP3A4 cells. The results obtained with PANC-1 and MCF-7 cells could be validated with FH3 CYP3A4 cells. However, demonstrating the first-pass effect using different end-point measurement methods yielded different results, and further studies are needed for definitive conclusions.
... Another study used MRI (magnetic resonance imaging) to show a shift in bone marrow from saturated to unsaturated fat content [29]. However, animal models may not always predict human responses [158], and clinical investigations indicating increased bone marrow adiposity in diabetic patients have not ruled out obesity as a confounding factor. Pathophysiologically, T2DM is linked to insulin resistance. ...
Through a number of biochemical and structural processes, long-term exposure to a diabetic environment causes alterations in bone metabolism and poor bone micro-architecture. These modifications make the bone more prone to fractures and impede osseous healing. In clinical practice, management of diabetes mellitus plays an important role in preventing bone health complications. To effectively identify fracture risk in individuals with diabetes mellitus, alternative fracture risk assessment techniques may be required. There is currently no definitive model describing how diabetes mellitus affects bone health, particularly with respect to progenitor cells. The best available information on the influence of diabetes mellitus on bone health in vitro and in vivo is summarised in this review, with a focus on future translational research prospects.
... models based on mouse BBB and human BBB responsiveness to analogous US energies and mechanical indices. However, as with any research involving animal models, translation from mice to humans is unpredictable [56]. The translational gap between mouse and human safety data should also be evaluated [57,58]. ...
Objective: It is unknown whether ultrasound-induced blood-brain barrier (BBB) disruption can promote epileptogenesis and how BBB integrity changes over time after sonication. Methods: To gain more insight into the safety profile of ultrasound (US)-induced BBB opening, we determined BBB permeability as well as histological modifications in C57BL/6 adult control mice and in the kainate (KA) model of mesial temporal lobe epilepsy in mice after sonication with low-intensity pulsed ultrasound (LIPU). Microglial and astroglial changes in the ipsilateral hippocampus were examined at different time points following BBB disruption by analyzing Iba1 and glial fibrillary acidic protein immunoreactivity, respectively. Using intracerebral EEG recordings, we further studied the possible electrophysiological repercussions of a repeatedly disrupted BBB for seizure generation in nine non-epileptic mice. Results: LIPU-induced BBB opening led to transient albumin extravasation and reversible mild astrogliosis, but not to microglial activation, in the hippocampus of non-epileptic mice. In KA mice, the transient albumin extravasation into the hippocampus mediated by LIPU-induced BBB opening did not aggravate the inflammatory processes and histologic changes that characterize hippocampal sclerosis. Three LIPU-induced BBB openings did not induce epileptogenicity in non-epileptic mice implanted with depth EEG electrodes. Conclusion: Our experiments in mice provide persuasive evidence of the safety of LIPU-induced BBB opening as a therapeutic modality for neurological diseases.
... Animals are distinct from humans in terms of genetics, epigenetics, and physiology. Moreover, the hypothesis that animal findings can be translated to humans has not yet been validated [1,2]. These issues raise the need for physiologically relevant human in vitro models to investigate cancer biology and therapeutic development. ...
The evolution of preclinical in vitro cancer models has led to the emergence of human cancer-on-chip or microphysiological analysis platforms (MAPs). Although it has numerous advantages compared to other models, cancer-on-chip technology still faces several challenges, such as recapitulating the complexity of the tumor microenvironment and integrating multiple organs, before it can be widely accepted in cancer research and therapeutics. In this review, we highlight the advancements in cancer-on-chip technology in recapitulating the vital biological features of various cancer types and their applications in life sciences and high-throughput drug screening. We present advances in reconstituting the tumor microenvironment and modeling cancer stages in breast, brain, and other types of cancer. We also discuss the relevance of MAPs to cancer modeling and precision medicine, such as the effect of flow on cancer growth and the short culture period compared with clinical timescales. Advanced MAPs provide high-throughput platforms with integrated biosensors to monitor real-time cellular responses applied in drug development. We envision that integrated cancer MAPs have a promising future with regard to cancer research, including cancer biology, drug discovery, and personalized medicine.
... This no-flow technique can yield data that are less predictive of how a compound will behave, or be responded to, in the in vivo environment. Likewise, even in vivo animal testing can fail to predict human physiology due to phenotypic differences in cell types and organ systems [20][21][22]. Strategically, benchtop µfluidic environments that better mimic in vivo flow networks are gaining research and development interest [3,12]. In this context, we present the SsWaterfall fluidic culture system (Figure 1), an exposure platform with two unique traits: (1) unidirectional, non-recirculating fluid flow, and (2) a sequential assembly line of cell cultures. ...
In vitro microfluidic screening tools are evolving across varied scientific disciplines. One emergent technique, simultaneously assessing cell toxicity from a primary compound and the ensuing cell-generated metabolites (dual-toxicity screening), entails in-line systems having sequentially aligned culture chambers. To explore dual-tox screens, we probe the dissemination of nutrients involving one-way transport with upstream compound dosing, midstream cascading flows, and downstream cessation. Distribution of flow gives rise to broad concentration ranges of the dosing compound (0→ICcompound100) and wide-ranging concentrations of generated cell metabolites (0→ICmetabolites100). Innately, single-pass unidirectional flow retains first-pass informative traits across the network, composed of nine interconnected culture wells, preserving both compound and cell-secreted byproducts as data indicators in each adjacent culture chamber. Thereafter, to assess effective compound hepatotoxicity (0→ECcompound100) and simultaneously classify cell-metabolite toxicity (0→ECmetabolite100), we demonstrate utility by analyzing culture viability against ramping exposures of acetaminophen (APAP) and nefazodone (NEF), compounds of hepatic significance. We then discern metabolite generation with an emphasis on amplification across µchannel multiwell sites. Lastly, using conventional cell functions as indicator tools to assess dual toxicity, we investigate a non-drug-induced liver injury (non-DILI) compound and a DILI compound. The technology is intended for predictive evaluation of new compound formulations, new chemical entities (NCE), or drugs that have previously failed testing for unresolved reasons.
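The claim above that single-pass flow spreads one upstream dose into a broad concentration range across nine sequential wells can be illustrated with a toy model. This is not the SsWaterfall design itself: the fixed per-well carryover fraction (0.7) and the dose are assumptions chosen only to show the shape of such a cascade.

```python
# Toy sketch (assumed parameters, not the published device) of how a
# single-pass cascade spreads one upstream dose across nine wells.

def cascade(dose, wells=9, carryover=0.7):
    """Concentration reaching each sequential well when every well
    passes a fixed fraction of what it receives downstream."""
    concs = []
    c = dose
    for _ in range(wells):
        concs.append(c)
        c *= carryover  # downstream wells see progressively less compound
    return concs

profile = cascade(dose=100.0)
print([round(c, 1) for c in profile])
```

A single dosing event thus exposes each chamber to a different point on the 0→IC100 range, which is what lets one run probe an entire dose-response curve.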
Piceatannol (PCN), a SIRT1 activator, regulates multiple oxidative stress mechanisms and has anti-inflammatory potential in various inflammatory conditions. However, its role in diabetes-induced peripheral neuropathy (DN) remains unknown. Oxidative stress and mitochondrial dysfunction are major contributing factors to DN. Numerous studies have shown that sirtuin 1 (SIRT1) stimulation improves nerve function by activating mitochondrial processes such as mitochondrial biogenesis and mitophagy. DN was provoked by injecting streptozotocin (STZ) at a dose of 55 mg/kg, i.p., in male Sprague Dawley (SD) rats. Mechanical and thermal hyperalgesia were evaluated using water immersion, a von Frey aesthesiometer, and Randall-Selitto calipers. Motor and sensory nerve conduction velocities were measured using the PowerLab 4sp system, and a laser Doppler system was used to evaluate nerve blood flow. To induce hyperglycemia for the in vitro investigations, high glucose (HG, 30 mM) conditions were applied to Neuro2a (N2A) cells. At doses of 5 and 10 µM, PCN was examined for its role in SIRT1 and Nrf2 activation. In HG-induced N2A cells, PCN exposure restored reactive oxygen species levels, mitochondrial superoxide, and mitochondrial membrane potential, and enhanced neurite outgrowth. Peroxisome proliferator-activated receptor-gamma coactivator-1α (PGC-1α)-directed mitochondrial biogenesis was induced by increased SIRT1 activation by piceatannol. SIRT1 activation also enhanced Nrf2-mediated antioxidant signalling. Our results indicate that PCN administration can counteract the decline in mitochondrial function
Drug toxicity testing is one of the last hurdles a new drug faces, often in late-stage clinical trials, and many drugs fail at this stage. Toxicity tests are performed in rats or dogs, whose toxicity tolerance is similar, but not identical, to that of humans. New ways of testing drugs are being developed; these include structure-based computational toxicity prediction and toxicogenomics, which uses gene expression data.
In recent months, policy decisions have been made in the federal regulatory agencies that outline new programs for the identification of carcinogens with the aim of eliminating them, if possible, from our environment. While there can be no objection to this goal, it is true that once a political-public-policy process of such magnitude is mounted, it may be extremely difficult to change its course, even though evidence may become overwhelming that its cost may far outstrip any possible benefit. News reports have claimed that 80% to 90% of all cancer may be due to environmental factors,¹ and this has often been translated to mean the occupational environment. With this assertion in mind, the Council on Scientific Affairs, with the help of several expert consultants, has reviewed the basis for the progress of the federal carcinogen regulation initiative. The earnest desire of the federal regulators to prevent as many cancers
1. Animal data are essential for the safety assessment of new drugs, particularly when initiating clinical trials where little or no human data are available. However, the duration of toxicity studies required to support repeated dosing in humans should be discussed on a more scientific basis, taking into consideration the relationship between the duration of repeated dosing and new toxicity findings in animals, the reliability of animal studies in predicting safety in clinical use, and the relationship of the time-scale in different animal species to man. 2. It is proposed that four weeks of repeated dosing in animals is sufficient to support studies in humans of up to four weeks. Beyond this, 3-month repeated-dose toxicity studies should be recommended as the minimum duration to support repeated dosing of longer than four weeks in all phases of clinical investigation. 3. Long-term dosing in humans, such as that over six months, can only be supported by human data accumulated through the step-wise process of clinical investigation.