Animal experiments scrutinised: Systematic reviews demonstrate poor human clinical and toxicological utility

Abstract

The assumption that animal models are reasonably predictive of human outcomes provides the basis for their widespread use in toxicity testing and in biomedical research aimed at developing cures for human diseases. To investigate the validity of this assumption, the comprehensive "Scopus" biomedical bibliographic databases were searched for published systematic reviews of the human clinical or toxicological utility of animal experiments. Of 20 reviews examining clinical utility, authors concluded that the animal models were substantially consistent with or useful in advancing clinical outcomes in only two cases, and the conclusion in one case was contentious. Included were reviews of the clinical utility of experiments expected by ethics committees to lead to medical advances, of highly-cited experiments published in major journals, and of chimpanzee experiments - the species most likely to be predictive of human outcomes. Seven additional reviews failed to clearly demonstrate utility in predicting human toxicological outcomes such as carcinogenicity and teratogenicity. Consequently, animal data may not generally be assumed to be substantially useful for these purposes. Possible causes include interspecies differences, the distortion of experimental outcomes arising from experimental environments and protocols, and the poor methodological quality of many animal experiments evident in at least 11 reviews. No reviews existed in which a majority of animal experiments were of good quality. While the latter problems might be minimised with concerted effort, given their widespread nature, the interspecies limitations are likely to be technically and theoretically impossible to overcome. Yet, unlike non-animal models, animal models are not normally subjected to formal scientific validation. 
Instead of simply assuming they are predictive of human outcomes, the consistent application of formal validation studies to all test models is clearly warranted, regardless of their animal, non-animal, historical, contemporary or possible future status. Expected benefits would include greater selection of models truly predictive of human outcomes, increased safety of people exposed to chemicals that have passed toxicity tests, increased efficiency during the development of human pharmaceuticals, and decreased wastage of animal, personnel and financial resources. The poor human clinical and toxicological utility of most animal models for which data exists, in conjunction with their generally substantial animal welfare and economic costs, justify a ban on animal models lacking scientific data clearly establishing their human predictivity or utility.
Introduction
Trends in laboratory animal use
Standards for the reporting of laboratory animal use
vary internationally, with many countries failing to
record or publicise statistics on animal use at all. Of
those that do, most record only live animal use, and
fail to record the substantial numbers of animals that
may be killed prior to certain procedures, such as dis-
section or the collection of organs, tissues or cells.
Hence, making realistic annual estimates of animal
use within biomedical research and toxicity testing is
difficult. Despite these limitations, it remains clear
from consideration of the European Union (EU) and
United States alone, that many millions of animals
are used worldwide, and that certain trends are
resulting in an increase in laboratory animal use.
EU
European Commission (EC) statistics on laboratory
animal use in 25 EU Member States revealed that
12,117,583 animals were used in 2005, the latest
reporting period (except for France, which provided
figures for 2004). The majority of these were mice
(53.1%), rats (19.3%), cold-blooded animals (15.1%,
consisting of fish [primarily], amphibians and rep-
tiles), and birds (5.4%). As in previous years,
France, Germany and the UK reported the greatest
animal use (1).
Systematic Reviews of Animal Experiments Demonstrate
Poor Human Clinical and Toxicological Utility
Andrew Knight
Animal Consultants International, London, UK
Summary — The assumption that animal models are reasonably predictive of human outcomes provides
the basis for their widespread use in toxicity testing and in biomedical research aimed at developing cures
for human diseases. To investigate the validity of this assumption, the comprehensive Scopus biomedical
bibliographic databases were searched for published systematic reviews of the human clinical or toxico-
logical utility of animal experiments. In 20 reviews in which clinical utility was examined, the authors con-
cluded that animal models were either significantly useful in contributing to the development of clinical
interventions, or were substantially consistent with clinical outcomes, in only two cases, one of which was
contentious. These included reviews of the clinical utility of experiments expected by ethics committees to
lead to medical advances, of highly-cited experiments published in major journals, and of chimpanzee
experiments — those involving the species considered most likely to be predictive of human outcomes.
Seven additional reviews failed to clearly demonstrate utility in predicting human toxicological outcomes,
such as carcinogenicity and teratogenicity. Consequently, animal data may not generally be assumed to be
substantially useful for these purposes. Possible causes include interspecies differences, the distortion of
outcomes arising from experimental environments and protocols, and the poor methodological quality of
many animal experiments, which was evident in at least 11 reviews. No reviews existed in which the major-
ity of animal experiments were of good methodological quality. Whilst the effects of some of these prob-
lems might be minimised with concerted effort (given their widespread prevalence), the limitations
resulting from interspecies differences are likely to be technically and theoretically impossible to overcome.
Non-animal models are generally required to pass formal scientific validation prior to their regulatory
acceptance. In contrast, animal models are simply assumed to be predictive of human outcomes. These
results demonstrate the invalidity of such assumptions. The consistent application of formal validation
studies to all test models is clearly warranted, regardless of their animal, non-animal, historical, contemporary
or possible future status. Likely benefits would include the greater selection of models truly predictive
of human outcomes, increased safety of people exposed to chemicals that have passed toxicity tests,
increased efficiency during the development of human pharmaceuticals and other therapeutic interven-
tions, and decreased wastage of animal, personnel and financial resources. The poor human clinical and
toxicological utility of most animal models for which data exists, in conjunction with their generally sub-
stantial animal welfare and economic costs, justify a ban on animal models lacking scientific data clearly
establishing their human predictivity or utility.
Key words: animal experiment, animal study, clinical trial, human outcome, systematic review.
Address for correspondence: Andrew Knight, Animal Consultants International, 91 Vanbrugh Court,
Wincott Street, London SE11 4NR, UK.
E-mail: info@AnimalConsultants.org
ATLA 35, 641–659, 2007 641
United States
In the USA, laboratory animal use is federally reg-
ulated by the Animal Welfare Act 1966 (amended in
1985), which excludes laboratory-bred mice and
rats, as well as non-mammals, from consideration
or protection (2, 3), despite the fact that mice and
rats comprise the overwhelming majority of all lab-
oratory subjects. This impedes the accurate estima-
tion of laboratory animal use in the USA. For
example, although 1,012,713 regulated animals
were used in the Fiscal Year 2006 (4), the latest
reporting period, Carbone (5) estimated that in
excess of 100 million mice are used annually. This
represents a dramatic increase from the 17–22 mil-
lion vertebrates used in the mid-1980s (6).
Genetically-modified animal use
In recent years, the previous steady decreases in
laboratory animal use have been reversed, in some
countries, mostly as a result of dramatic increases
in the use of genetically-modified (GM) animals.
The production of these GM animals requires sub-
stantial breeding, which serves to further increase
the numbers of animals used. Within the UK, for
example, a steady and significant reduction since
1976 stabilised during the early 1990s, and then
reversed. 3,012,032 procedures on living, regulated
animals (vertebrates and one species of octopus,
Octopus vulgaris) were conducted in 2006, the high-
est number for around 15 years (7). Greater breed-
ing and use of GM animals have contributed to
these increasing numbers (8, 9). In 1995, GM ani-
mals were used in 8% of regulated procedures. In
2004, the total was 32%, and in 2006 it was 34%
(1,035,343 regulated procedures; 7). Increased GM
animal use has also been recorded in Germany (10)
and Switzerland (11), where total animal use is also
increasing (11, 12).
Chemical testing programmes
Recently-initiated, large-scale chemical testing pro-
grammes are also important drivers of the recent
and probable substantial future increases in labora-
tory animal use (13, 14). These programmes are
intended to rectify existing knowledge gaps with
regard to the toxicity of chemicals that are pro-
duced or imported into the EU and the USA in par-
ticularly high quantities (or that otherwise give rise
to special concerns), and are likely to result in the
use of unprecedented numbers of animals in toxic-
ity testing. Included are three programmes initiated
by the US government, and managed by the
Environmental Protection Agency (EPA) since
1998: the High Production Volume (HPV) Challenge
Program, the Endocrine Disruptor Screening
Program, and the Voluntary Children’s Chemical
Evaluation Program. The 2003 EC proposal for the
Registration, Evaluation and Authorisation of
Chemicals (REACH) similarly aims to assess the
toxicity of chemicals produced or imported in high
quantities (15–20). It is reported that the HPV pro-
gramme, for example, has already subjected over
150,000 animals to chemical tests (21).
Claims supporting laboratory animal use
Biomedical research using laboratory animals is
highly controversial. Advocates frequently claim
that such research is vital for preventing, curing or
alleviating human diseases (e.g. 22, 23), that the
greatest achievements of medicine have been possi-
ble only due to the use of animals (e.g. 24), and that
the complexity of humans requires nothing less
than the complexity of laboratory animals to serve
as an effective model during biomedical investiga-
tions (e.g. 25). They even claim that medical
progress would be “severely maimed by prohibition
or severe curtailing of animal experiments,” and
that “catastrophic consequences would ensue” (26).
However, such claims are hotly contested (e.g.
27), and the right of humans to experiment on ani-
mals has also been strongly contested philosophi-
cally (e.g. 28, 29). A growing body of empirical
evidence also casts doubt upon the scientific utility
of animals as experimental models of humans.
Clinical utility of animal models: case studies
Within the field of pharmaceutical development,
case studies exemplifying differing human and ani-
mal outcomes — sometimes with severe adverse
consequences for human patients — are sufficiently
numerous to fill entire book chapters (e.g. 30, 31).
A recent notorious example was TGN1412 (also
known as CD28-SuperMAB), a fully-humanised
monoclonal antibody (i.e. a product developed in a
non-human species and protein-engineered to pos-
sess specifically-human characteristics) that was
undergoing development for the treatment of
conditions such as leukaemia and
rheumatoid arthritis (32, 33). During a Phase I clin-
ical trial in the UK in 2006, TGN1412 caused severe
adverse reactions, culminating in organ failure
requiring intensive care, in all six volunteers given
the drug, with one volunteer suffering permanent
damage. These effects occurred despite the admin-
istration of an expected sub-clinical dose of 0.1
mg/kg — 500 times lower than the 50 mg/kg dose
found not to cause adverse effects in cynomolgus
monkeys. Tests on rhesus macaques, rats and mice
also failed to reveal adverse effects (34, 35).
Another recent notorious example was the arthri-
tis drug, Vioxx, which appeared to be safe, and even
beneficial to the heart, in animal studies. However,
Vioxx was withdrawn from the global market in
2004, after causing as many as 140,000 heart
attacks and strokes, and over 60,000 deaths, in the
USA alone (36).
Since their commercial introduction in the early
1980s, non-steroidal anti-inflammatory drugs
(NSAIDs) have also had a problematic clinical his-
tory. Although apparently safe in year-long studies
in rhesus monkeys, benoxaprofen (Oraflex) pro-
duced thousands of serious adverse reactions in
humans, including dozens of deaths, within three
months of its initial marketing (37). Fenclofenac
(Flenac) revealed no toxicity in ten animal species,
yet produced severe liver toxicity in humans, and
was subsequently withdrawn (38). Similar fates
befell some other NSAIDs, including zomepirac
(Zomax; 39), bromfenac (Duract; 40), and phenylbutazone
(Butazolidin; 41), which produced adverse
human effects undetected in animal studies.
Numerous other pharmaceuticals have also been
marketed after passing limited clinical trials and
more rigorous animal testing, only to subsequently
be found to cause serious side-effects or death in
human patients. Examples include various antibiotics
(e.g. chloramphenicol, clindamycin, temafloxacin),
antidepressants (e.g. nomifensine), antivirals
(e.g. idoxuridine), cardiovascular medications (e.g.
amrinone, cerivastatin, mibefradil, ticrynafen), and
many others (e.g. 30, 42–44).
Although 92% of new drugs that pass preclinical
testing, which routinely includes animal tests, fail
to reach the market because of safety or efficacy
failures in human clinical trials (45), adverse drug
reactions detected after drugs have been approved
for clinical use nevertheless remain common. They
are, indeed, sufficiently common to have been
recently recorded as the 4th–6th leading cause of
death in US hospitals (based on a 95% confidence
interval; 46), a rate considered by investigators to
be “extremely high”.
There are also cases of safe and efficacious
human pharmaceuticals that would not pass rigor-
ous animal testing, because of severe or lethal toxi-
city in some laboratory animal species. Notable
examples include penicillin (e.g. 47), paracetamol
(acetaminophen; e.g. 48), and aspirin (acetylsali-
cylic acid; e.g. 49). More rigorous animal testing
may well have delayed or prevented the use of these
highly beneficial drugs in human patients.
The large number of examples of apparent differ-
ences between outcomes in laboratory animals and in
human patients may be the result of several factors.
Flaws may occur during the pharmaceutical develop-
ment and testing process, in which the design, con-
duct or interpretation of experiments may fail to
adequately highlight the risks to human patients.
Such flaws are more likely in animal studies than in
human clinical trials, because the experimental quality
of the former is usually significantly lower (see
Results and Discussion). True discordance in results
may also arise from interspecies differences.
Finally, the limited predictivity for wider human
outcomes of human clinical trials may result from
their focus on small groups of healthy young men,
or from insufficient study durations. Particularly in
Phases I–II, small cohorts of young men (20–300)
are typically used, to minimise experimental vari-
ability and to avoid the possibility of endocrinologi-
cal disruption or other risks to women of
reproductive age. Although 1,000–3,000 volunteers
may be used in Phase III trials, the final phase
before marketing (50), it is nevertheless clear that
cohort numbers, study durations or other aspects of
protocol design, conduct or interpretation, are inad-
equate to detect the adverse side-effects of the large
number of pharmaceuticals that are found to harm
patients after marketing. Longer studies of more-
broadly representative human populations would
be more predictive, but would increase the time and
cost of pharmaceutical development, and are resis-
ted by pharmaceutical companies.
The necessity of systematic reviews
The premise that laboratory animal models are gen-
erally predictive of human outcomes is the basis for
their widespread use in human toxicity testing, and
in the safety and efficacy testing of putative
chemotherapeutic agents and other clinical inter-
ventions. However, the numerous cases of discor-
dance between laboratory animal and human
outcomes suggest that this premise may well be
incorrect, and that the utility of animal experi-
ments for these purposes may not be assured. On
the other hand, only small numbers of experiments
are normally reviewed in case studies, and their
selection may be subject to bias. To provide more-
definitive conclusions, systematic reviews of the
human clinical or toxicological utility of large num-
bers of animal experiments are necessary.
Experiments included in such reviews should be
selected without bias, via randomisation, or simi-
larly methodical and impartial means.
In support of this concept, Pound and colleagues
(51) commented that clinicians and the public often
consider it axiomatic that animal research has con-
tributed to human clinical knowledge, on the basis of
anecdotal evidence or unsupported claims. These
constitute an inadequate form of evidence, they
asserted, for such a controversial area of research,
particularly given increasing competition for scarce
research resources. Hence, they called for systematic
reviews to examine the human clinical utility of ani-
mal experiments, and commenced by examining six
existing reviews, which did not demonstrate the clin-
ical utility expected of the experiments in question.
Soon afterwards, the UK Nuffield Council on
Bioethics stated that, “It would… be desirable to
undertake further systematic reviews and meta-analyses
to evaluate more fully the predictability
and transferability of animal models.” They called
for these to be undertaken by the UK Home Office,
in collaboration with the major funders of research,
industry associations and animal protection groups
(52).
Since then, several such reviews and meta-analy-
ses have been published, which collectively provide
important insights into the human clinical and tox-
icological utility of animal models. Their identifica-
tion and examination was the purpose of this
review.
Methods
The Scopus biomedical bibliographic databases
were searched for systematic reviews of the human
clinical or toxicological utility of animal experi-
ments published in the peer-reviewed biomedical
literature. Among the world’s most comprehensive
databases, they include over 12,850 academic jour-
nals, 500 open access journals, 700 conference pro-
ceedings, and a total of 29 million abstracts (53).
The Life Sciences database includes over 3,400
titles, and the Health Sciences database includes
over 5,300 titles, including all of Medline, the lead-
ing medical and allied health profession database,
which itself contains over 15 million citations from
the mid-1950s onwards, sourced from more than
5,000 biomedical journals from over 80 countries
(54).
All abstracts, titles and key words were searched
for (animal experiment OR animal model OR ani-
mal study OR animal trial) AND (clinical trial OR
human outcome OR human relevance OR human
result). The results were limited to articles or
reviews, but no chronological, language or other
limitations were applied. Titles and, where neces-
sary, abstracts or complete papers, were examined,
in order to locate relevant papers. Additional rele-
vant studies were obtained by examination of the
reference lists of the papers retrieved, and by con-
sultation with colleagues working in this field.
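The search expression above is a conjunction of two OR-groups: a record qualifies only if it contains at least one animal-related term and at least one human-related term. A minimal Python sketch of that Boolean logic follows. The function name and the case-insensitive substring matching are illustrative assumptions; the Scopus search engine may tokenise and match terms differently.

```python
# Hypothetical sketch of the Boolean search logic; not the Scopus implementation.
ANIMAL_TERMS = ("animal experiment", "animal model", "animal study", "animal trial")
HUMAN_TERMS = ("clinical trial", "human outcome", "human relevance", "human result")


def matches_search(text: str) -> bool:
    """Return True if text contains at least one term from each OR-group."""
    lowered = text.lower()
    has_animal_term = any(term in lowered for term in ANIMAL_TERMS)
    has_human_term = any(term in lowered for term in HUMAN_TERMS)
    # The two groups are joined by AND, so both must be satisfied.
    return has_animal_term and has_human_term
```

Applied to the combined title, abstract and key words of a record, the function returns True only when both groups are represented, which mirrors the structure of the query but not necessarily the exact set of records retrieved.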
To minimise bias, reviews were included only
when they had been conducted systematically, by
using randomisation or similarly methodical and
impartial means to select animal studies. For exam-
ple, in some cases, all the animal studies within rel-
evant subsets of toxic chemical databases were
examined, without exclusion.
The examination covered only reviews which con-
sidered the human toxicological predictivity or util-
ity of animal experiments, their contributions
toward the development of prophylactic, diagnostic
or therapeutic interventions with clear potential for
combating human diseases or injuries, or their con-
sistency with human clinical outcomes. Reviews
which focused, for example, only on the contribu-
tions of animal experiments toward increased
understanding of the aetiological, pathogenetic or
other aspects of human diseases, or on the clinical
utility of animal experiments in non-human
species, were excluded from consideration.
Results and Discussion
Bibliographic databases are constantly updated.
2,274 articles or reviews were retrieved, by using
the specified search terms, on 1 March 2007. In
total, 27 systematic reviews which examined the
utility of animal experiments during the development
of human clinical interventions (20 reviews), or in
deriving human toxicity classifications (seven reviews),
were located. Three different approaches that
sought to determine the maximum human clinical
utility that may be achieved by animal experiments
were of particular interest.
Clinical utility of experiments expected to
lead to medical advances
Lindl and colleagues (55, 56) examined animal exper-
iments conducted at three German universities
between 1991 and 1993, that had been approved by
animal ethics committees, at least partly on the basis
of claims by researchers that the experiments might
lead to concrete advances toward the cure of human
diseases. Experiments were only included where pre-
vious studies had shown that the applications of
related animal research had confirmed the hypothe-
ses of the researchers, and where the experiments
had achieved publication in biomedical journals.
For 17 experiments meeting these inclusion crite-
ria, citations were analysed over at least 12 years.
Citation frequencies and types of citing papers were
recorded: whether they were reviews or animal-
based, in vitro, or clinical studies. 1,183 citations
were evident, but only 8.2% (97 citations) were in
clinical publications, and only 0.3% (4 citations)
demonstrated a direct correlation between the
results of animal experiments and human out-
comes. However, even in these four cases, the
hypotheses that had been verified successfully in
the animal experiment failed in every respect when
applied to humans. None of these 17 experiments
led to any new therapies, or had any beneficial clin-
ical impact during the period examined.
As a result of their analysis, Lindl and colleagues
called for serious, rather than cursory, evaluations
of the likely benefits of animal experiments by ani-
mal ethics committees and related authorities, and
for a reversal of the current paradigm, in which ani-
mal experiments are routinely approved. Instead of
approving experiments because of the possibility
that benefits might accrue, Lindl and colleagues
suggested that where significant doubt exists, laboratory
animals should receive the benefit of that
doubt, and that such experiments should not, in
fact, be approved.
Clinical utility of highly-cited animal
experiments
Hackam and Redelmeier (57) also used a citation
analysis, but without geographical limitations.
Based on the assumption that findings from highly-
cited animal experiments would be most likely to be
subsequently tested in clinical trials, they searched
for experiments with more than 500 citations and
published in the seven leading scientific journals, as
ranked by citation impact factor.
Of 76 animal studies located, with a median cita-
tion count of 889 (range: 639–2,233), only 36.8%
(28/76) were replicated in randomised human trials.
18.4% (14/76) were contradicted by randomised tri-
als, and 44.7% (34/76) had not translated to clinical
trials. Ultimately, only 10.5% (8/76) of these medical
interventions were subsequently approved for use in
patients, and, as stated previously, even in these
cases, human benefit cannot be assumed, because
adverse reactions to approved interventions are com-
mon, and a leading cause of death (46).
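The proportions reported above can be recomputed directly from the stated counts. The short Python sketch below does so; the variable and function names are illustrative, and the counts are taken from the text rather than from the original data set.

```python
# Counts reported in the text for the 76 highly-cited animal studies (57).
TOTAL_STUDIES = 76
outcome_counts = {
    "replicated in randomised trials": 28,
    "contradicted by randomised trials": 14,
    "not translated to clinical trials": 34,
}
APPROVED_FOR_PATIENTS = 8


def percentage(count: int, total: int) -> float:
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


outcome_percentages = {
    label: percentage(count, TOTAL_STUDIES)
    for label, count in outcome_counts.items()
}
approved_percentage = percentage(APPROVED_FOR_PATIENTS, TOTAL_STUDIES)
```

The three outcome categories are mutually exclusive and account for all 76 studies, so their counts sum to the total.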
A low rate of translation to clinical trials of even
these highly-cited animal experiments was appar-
ent, despite 1992 being the median publication
year, allowing a median of 14 years for potential
translation. For studies that did translate to clinical
trials, the median time for translation was seven
years (range 1–15). The frequency of translation
was not affected by the laboratory animal species
used, the type of disease or therapy under examina-
tion, the journal, year of publication, methodologi-
cal quality, and even, surprisingly, the citation rate.
However, animal studies incorporating dose–
response gradients were more likely to be trans-
lated to clinical trials (odds ratio [OR] = 3.3; 95%
confidence interval [CI] = 1.1–10.1).
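As a rough consistency check on the reported odds ratio, the Python sketch below assumes that the interval is symmetric on the logarithmic scale (a Wald-type interval). This is an assumption, because the method used by Hackam and Redelmeier is not described here. Under that assumption, the geometric midpoint of the limits should lie close to the point estimate, and the width of the interval gives the standard error of the log odds ratio.

```python
import math


def log_scale_summary(lower: float, upper: float, z: float = 1.96):
    """Return (geometric midpoint, standard error of log OR) for a CI.

    Assumes the interval is symmetric on the log scale, i.e. that it was
    computed as exp(log(OR) +/- z * SE). This is an illustrative assumption.
    """
    midpoint = math.sqrt(lower * upper)
    standard_error = (math.log(upper) - math.log(lower)) / (2 * z)
    return midpoint, standard_error


midpoint, standard_error = log_scale_summary(1.1, 10.1)
```

The wide interval reflects the small number of studies involved, and its lower limit of 1.1 lies only just above the null value of 1.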
Although the rate of translation of these animal
studies to clinical trials was low, as Hackam and
Redelmeier stated, it is nevertheless higher than
that of most published animal experiments, which
are considerably less likely to be translated than
these highly-cited animal studies published in lead-
ing journals. Furthermore, the selective focus on
positive animal data, whilst ignoring negative
results (optimism bias), was one of several factors
proposed that may have increased the likelihood of
translation beyond that which was scientifically
merited. As Hackam (58) stated, the rigorous meta-
analysis of all relevant animal experimental data
would probably significantly decrease the transla-
tion rate to clinical trials.
In addition, only 48.7% (37/76) of these highly-
cited animal studies were considered to be of good
methodological quality. Despite their publication in
leading scientific journals, few included the random
allocation of animals to test groups, any adjustment
for multiple hypothesis testing, or the blinded
assessment of outcomes. Accordingly, Hackam and
Redelmeier cautioned patients and physicians
about the extrapolation of the findings of even
highly-cited animal research to cases of human dis-
ease.
Clinical utility of chimpanzee experiments
Chimpanzees are the species most closely related to
humans, and consequently, are considered to be the
laboratory animals most likely to provide results
which are predictive of human outcomes. Hence, in
2005, I conducted a citation analysis of the human
clinical utility of chimpanzee experiments (59).
I searched three major biomedical bibliographic
databases, and located 749 papers published
between 1995 and 2004, which described experi-
ments on captive chimpanzees or their tissues.
Although published in the international scientific
literature, the vast majority of these experiments
were conducted within the USA (60). To obtain 95%
CIs with an accuracy of at least plus or minus 10%,
when estimating the proportion of chimpanzee
studies subsequently cited by other published
papers, a subset of at least 86 chimpanzee studies
was required (61–63).
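The figure of 86 is consistent with a standard sample size calculation for estimating a proportion from a finite population. The Python sketch below assumes a normal approximation, the most conservative proportion (p = 0.5) and a finite population correction; these are assumptions for illustration, and the method described in references 61–63 may differ in detail.

```python
import math


def required_sample_size(population: int, margin: float,
                         z: float = 1.96, p: float = 0.5) -> int:
    """Smallest sample giving a CI half-width of at most `margin`.

    Uses the normal approximation for a proportion, then applies the
    finite population correction for a population of known size.
    """
    # Sample size for an effectively infinite population.
    n_infinite = (z ** 2) * p * (1 - p) / margin ** 2
    # Finite population correction reduces the required number.
    n_corrected = n_infinite / (1 + (n_infinite - 1) / population)
    return math.ceil(n_corrected)


chimpanzee_sample = required_sample_size(population=749, margin=0.10)
```

With 749 papers and a margin of plus or minus 10%, the uncorrected requirement is about 96 studies, and the correction for the finite population reduces this to 86.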
Of 95 published randomly-selected studies on
chimpanzees, 49.5% (47/95) were not cited by any
subsequent papers, demonstrating minimal contri-
butions toward the advancement of biomedical
knowledge. This is of particular concern, because it
can be assumed that research judged to be of lesser
value was not published. Hence, it appears that the
majority of chimpanzee research generates data of
questionable value, which make little obvious con-
tribution toward the advancement of biomedical
knowledge.
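The precision of the 49.5% estimate can be checked against the stated target of plus or minus 10%. The Python sketch below computes the half-width of an approximate 95% CI for a proportion, assuming a normal approximation with a finite population correction; the function name and the choice of method are illustrative assumptions.

```python
import math


def ci_half_width(successes: int, sample: int, population: int,
                  z: float = 1.96) -> float:
    """Half-width of an approximate CI for a proportion.

    Normal approximation, with a finite population correction because
    the sample was drawn from a known, limited number of papers.
    """
    p = successes / sample
    correction = (population - sample) / (population - 1)
    return z * math.sqrt(p * (1 - p) / sample * correction)


# 47 of 95 sampled studies were uncited; 749 papers were located in total.
uncited_half_width = ci_half_width(successes=47, sample=95, population=749)
```

Under these assumptions the half-width is a little over nine percentage points, which is within the intended accuracy.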
35.8% (34/95) of the 95 published chimpanzee
studies were cited by 116 papers that clearly did not
describe well-developed methods for combating
human diseases. Only 14.7% (14/95) of them were
cited by 27 papers that had abstracts which indi-
cated well-developed prophylactic, diagnostic or
therapeutic methods for combating human dis-
eases. However, a detailed examination of these 27
medically-oriented papers revealed that in vitro
studies, human clinical and epidemiological studies,
molecular assays and methods, and genomic stud-
ies, contributed most to their development. 63.0%
(17/27) were wide-ranging reviews of 26–300
(median 104) references, to which these cited chim-
panzee studies made very small contributions.
Duplication of human outcomes, inconsistency with
other human or primate data, and other causes,
resulted in the absence of any chimpanzee study
able to demonstrate an essential contribution, or, in
most cases, a significant contribution of any kind,
toward the development of the medical method
described.
Despite the low utility of chimpanzee experi-
ments in advancing human health which was indi-
cated by these results, it remains true that
chimpanzees are the species most closely related to
human beings. Hence, it is highly likely that other laboratory
species are even less useful as experimental
models of humans in biomedical research and toxi-
city testing.
Clinical utility of stroke and head injury
models
Despite the existence of literature on the efficacy of
more than 700 drugs in treating experimental mod-
els of stroke (artificially-induced focal cerebral
ischaemias; 64), only recombinant tissue plasmino-
gen activator (rt-PA) and aspirin have convincingly
demonstrated efficacy in human clinical trials of
treatments for acute ischaemic stroke (65–67).
Hence, Macleod and colleagues (64) stated that,
“This failure of putative neuroprotective drugs in
clinical trials represents a major challenge to the
doctrine that animals provide a scientifically-valid
model for human stroke.” At least 10 published systematic
reviews have described the poor human
clinical utility of animal experimental models of
stroke and head injuries (64, 68–76).
In some cases, clinical trials proceeded, despite
equivocal evidence of efficacy in animal studies. For
example, Horn and colleagues (68) systematically
reviewed 20 animal studies on the efficacy of
nimodipine, of which only 50% showed beneficial
effects following treatment. They concluded that,
“...the results of this review did not show convincing
evidence to substantiate the decision to perform trials
with nimodipine in large numbers of patients.”
These clinical trials also demonstrated equivocal
evidence of efficacy, and furthermore, proceeded
concurrently with the animal studies, despite the
fact that the latter are intended to be conducted
prior to clinical trials, to facilitate the detection of
potential human toxicity.
O’Collins and colleagues (69) conducted a very
large review of 1,026 experimental drugs for acute
stroke that had been tested in animal models. They
found that the effectiveness in animals of 114 drugs
chosen for human clinical use was no greater than
that of the remaining 912 drugs not chosen for clin-
ical use, thereby demonstrating that effectiveness
in animal models had no measurable effect on
whether or not these drugs were selected for human
clinical use. Accordingly, O’Collins and colleagues
questioned whether the most efficacious drugs are,
in fact, being selected for clinical trials, and called
for greater rigour in the conduct, reporting, and
analysis of animal experiments.
In many cases, animal models did indicate efficacy,
but this did not translate to humans. In a few
reviews, the authors speculated on the possible
causes. For example, Jonas and colleagues (70)
hypothesised that the poor clinical efficacy of
neuroprotectants that had been successful in animal
models was due to differences in the timing of
the initiation of treatment. Curry (71) hypothesised
that the human clinical failure of 14 neuroprotective
agents that were successful in animal models
was due to the antagonism of glutamate (which
may be associated with neuroprotection) by drug
treatment in clinically-normal individuals. He therefore
proposed that clinical trials should be restricted
to real stroke patients, who experience elevated
plasma glutamate levels. However, such speculation
has not resulted in improvements in the poor clinical
record of neuroprotectants which were previously
found to be successful in animal models.
The utility of the majority of these animal studies
also appears to have been impeded by their poor
methodological quality. Examples include: animal
studies on the efficacy of melatonin (64); 20 animal
studies on the efficacy of nimodipine (68); 29 animal
studies on the efficacy of FK506 (72); 45 animal stud-
ies on five compounds from different classes of
alleged neuroprotective agents — clomethiazole,
gavestinel, lubeluzole, selfotel, and tirilazad mesylate
(73); 25 animal studies on the efficacy of nitric oxide
(NO) donors and L-arginine (74); and 73 animal stud-
ies of the efficacy of NO synthase inhibitors (75).
The methodological quality of animal studies was
typically scored on the basis of the presence of char-
acteristics such as: appropriate animal models (aged,
diabetic or hypertensive animals are considered to
more-closely model human stroke patients); power
calculations of sample sizes; random allocation to
treatment and control groups; use of a clinically-rel-
evant time window for commencement of treatment;
blinded drug administration; use of anaesthetics
without significant intrinsic neuroprotective activity
(ketamine, for example, may alter neuroprotective
activity); blinded induction of ischaemia (given that
the severity of induced infarcts may be subtly
affected by knowledge of treatment allocation);
blinded outcome assessment; assessment of both
infarct volume and functional outcome; adequate
monitoring of physiological parameters; assessment
during both the acute (e.g. one to six days) and
chronic (e.g. seven to 30 days) phases; statement of
temperature control; compliance with animal wel-
fare regulations; peer-reviewed publication; and con-
flict of interest statements. Typically, one point was
given for the presence of each characteristic. For
example, the Stroke Therapy Academic Industry
Roundtable recommendations for standards with
regard to preclinical and restorative drug development
involve an eight-point scale (68, 77).
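The one-point-per-characteristic scoring described above can be sketched as follows. This is a minimal illustration in Python, not the checklist of any particular review: the ten-item subset, the item names, the function name quality_score and the example study are all hypothetical.

```python
# Hypothetical ten-item checklist, drawn from the characteristics listed in
# the text. The reviews discussed used their own (differing) item sets.
CHECKLIST = (
    "appropriate_animal_model",
    "sample_size_calculation",
    "random_allocation",
    "clinically_relevant_time_window",
    "blinded_induction_of_ischaemia",
    "blinded_outcome_assessment",
    "anaesthetic_without_neuroprotective_activity",
    "temperature_control_stated",
    "peer_reviewed_publication",
    "conflict_of_interest_statement",
)


def quality_score(reported):
    """Give one point for each checklist item that the study reports."""
    return sum(1 for item in CHECKLIST if item in reported)


# Made-up study, for illustration only (not taken from any review).
example_study = {
    "random_allocation",
    "blinded_outcome_assessment",
    "temperature_control_stated",
    "peer_reviewed_publication",
}
example_score = quality_score(example_study)
print(f"{example_score} out of {len(CHECKLIST)}")
```

A score computed in this way is only as meaningful as the checklist behind it, which is one reason why scores from reviews using different scales are not directly comparable.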
Median quality scores were: four out of 10 (13
studies; range zero to six [64]); four out of 10 (29
studies; range zero to seven [72]); three out of 10
(45 studies [73]); and three out of eight (73 studies;
range one to six [75]). Common deficiencies
included lack of: sample size calculations, aged ani-
mals or those with appropriate co-morbidities, ran-
domised treatment allocation, blinded drug
administration, blinded induction of ischaemia,
blinded outcome assessment, and conflict of inter-
est statements. Some studies also used ketamine
anaesthesia, and there was also substantial varia-
tion in the parameters assessed.
van der Worp and colleagues (73), for example, con-
cluded that the collective evidence for neuroprotec-
tive efficacy which formed the basis for 21 clinical
trials, was obtained in animal studies with a method-
ological quality that would not, in retrospect, justify
such a decision. Wilmot and colleagues (74) also
found considerable variations in animal experiment
protocols, which concerned: animal species; physio-
logical parameters (such as blood pressure); drug
administration (timing, dosage, and route); surgical
methodology; and duration of ischaemia. Statistical
analysis (Egger’s test) also revealed the likely exis-
tence of publication bias (an increased tendency to
publish studies in which a treatment effect is appar-
ent, or a decreased tendency to so publish, e.g. result-
ing from commercial pressures, particularly in the
case of patented drugs under development). Macleod
and colleagues (64) commented that, “These deficiencies
apply to most, if not all, of the animal literature.”
This is of particular concern, because Macleod and
colleagues (72) reported that efficacy was apparently
lower in higher quality studies, which raised concerns
that the apparent efficacy may have been artificially
elevated by factors such as poor methodological qual-
ity and publication bias.
A related review, not limited solely to stroke, exem-
plified some of these issues. Perel and colleagues (76)
examined therapeutic interventions with unambigu-
ous evidence of a treatment effect (benefit or harm),
in clinical trials related to the following: corticos-
teroidal treatment for head injury; anti-fibrinolytics
for the treatment of haemorrhage; thrombolysis, and
also tirilazad, for the treatment of acute ischaemic
stroke; antenatal corticosteroids in the prevention of
neonatal respiratory distress syndrome; and bisphos-
phonates in the treatment of osteoporosis. They
found that three interventions had similar outcomes
in animal models, whilst three did not, suggesting
that the animal studies did not reliably predict the
human outcomes. Perel and colleagues reported that
the animal studies varied in methodological quality
and sample sizes, that randomisation and blinding
were rarely reported, and that publication bias was
evident.
Clinical utility of other animal experiments
Of seven systematic reviews on the utility of animal
models in other clinical fields identified by this
review (78–85, of which 79 and 80 described a sin-
gle review), in only two cases — one of which was
contentious — did the animal models appear to be
clearly useful in the development of human clinical
interventions, or substantially consistent with
human clinical outcomes.
As in the case of stroke, some clinical trials pro-
ceeded, despite equivocal evidence of efficacy in ani-
mal studies. Upon systematically reviewing the
effects of Low Level Laser Therapy (LLLT) on
wound healing in 36 cell or animal studies, Lucas
and colleagues (78) found that an in-depth analysis
of studies with the highest methodological quality
showed no significant pooled treatment effect.
Despite this, the clinical trials proceeded. Furthermore,
almost from the beginning of LLLT investigations,
animal experiments and clinical studies
occurred simultaneously, rather than sequentially.
The human trials also failed to demonstrate signif-
icant benefits.
Roberts and colleagues (79), and Mapstone and
colleagues (80), all systematically reviewed a group
of 44 randomised, controlled animal studies on the
efficacy of fluid resuscitation in bleeding animals. A
previous systematic review by some of these inves-
tigators of clinical trials of fluid resuscitation had
found no evidence that the practice improved out-
comes, and had even identified the possibility that
it might be harmful (86). In this later review
(79–80), they found that fluid resuscitation reduced
mortality in animal models of severe haemorrhage,
but increased mortality in those with less severe
haemorrhage.
After clinical trials in humans failed to provide
evidence of benefit, Lee and colleagues (81) con-
ducted a systematic review and meta-analysis of
controlled trials of endothelin receptor blockade in
animal models of heart failure. Meta-analysis failed
to provide evidence of overall benefit, and indicated
increased mortality with early administration.
In their investigation of the contributions of
human clinical trial results and analogous experi-
mental studies to asthma research — one of the
most common and heavily-investigated of modern
diseases — Corry and Kheradmand (82) demon-
strated that failure to conduct and analyse the
results of animal studies before proceeding to clinical
trials is not uncommon: “Research along two
fronts, involving experimental models of asthma
and human clinical trials, proceeds in parallel,
often with investigators unaware of their counterpart’s
findings.”
The clinical utility of animal models is clearly
questionable in such cases, in which clinical trials
proceed concurrently with, or prior to, animal stud-
ies, or continue, despite equivocal evidence of effi-
cacy in the animals.
As in the case of stroke, the clinical utility of the
majority of these animal studies also appears to
have been limited by their poor methodological
quality. Examples include: 36 cell or animal studies
on the effects of LLLT on wound healing (78); 44
studies on the efficacy of fluid resuscitation in
bleeding animals (79–80); and studies on the effi-
cacy of endothelin receptor blockade in animal mod-
els of heart failure (81). Common flaws included
inadequate sample sizes, leaving studies underpow-
ered, and lack of randomisation and blinding.
In some cases, obvious deficiencies within the
animal models were identified. In commenting on
the clinical relevance of animal models for testing
the effects of LLLT on wound healing, Lucas and
colleagues (78) noted that the animal models
excluded common problems associated with wound
healing in humans, such as ischaemia, infection,
and necrotic debris.
In at least one case, difficulties were also apparent
in translating animal outcomes to human clinical
protocols. Lazzarini and colleagues (83)
reviewed experimental studies on osteomyelitis, to
ascertain their impacts on the systemic antibiotic
treatment of human osteomyelitis. Although they
found that most of the animal models reviewed
were reproducible and dependable, they also found
that the human predictivity of these studies was
unclear, and was possibly undermined by difficul-
ties in establishing the right dose regimen in the
animals. Although they considered that the use of
antibiotic combinations was associated with better
outcomes in the majority of animal studies, and
that these studies did provide indications of appro-
priate minimum treatment durations, they con-
cluded that these studies had limited relevance to
clinical practice.
In two cases, reviewers reported that animal and
human outcomes were substantially consistent,
although in one case this conclusion was con-
tentious. While reviewing therapeutic approaches
to streptococcal endocarditis, Scheld (84) reported
good overall correlations among results obtained by
in vitro susceptibility testing (especially killing
kinetics in broth), in animal experiments, and in
clinical trials on different antimicrobial regimens in
humans with streptococcal endocarditis.
To investigate the efficacy of rodent models of
carcinogenesis in predicting treatment outcomes in
humans, Corpet and Pierre (85) conducted a sys-
tematic review and meta-analysis of colon cancer
chemoprevention studies involving the use of
aspirin, β-carotene, calcium, and wheat bran, in
rats, mice and humans. Controlled intervention
studies on the recurrence of adenomas in human
volunteers were compared with chemoprevention
studies of carcinogen-induced tumours in rats, and
of polyps in Min (Apc[+/–]) mice. The meta-analyses
included 6,714 humans, 3,911 rats and 458 mice.
Corpet and Pierre found that comparable
results were achieved in rats and humans with
aspirin, calcium, β-carotene, and wheat bran.
Comparable results were found in Min mice and
humans with aspirin, but discordant results were
obtained with calcium and wheat bran (the equiva-
lent β-carotene results were not available). Corpet
and Pierre concluded that these results suggest that
the use of the rodent models can roughly predict
treatment effects in humans, but that the predic-
tion is not accurate for all agents, and that the car-
cinogen-induced rat model is more predictive than
the Min mouse model. However, relatively few
agents were tested, and two of the three agents
tested in mice produced different outcomes in
humans, so the conclusion that rodents are predic-
tive of human treatment effects, albeit only
roughly, is itself contentious.
Toxicological utility: carcinogenicity
Due to the limited availability of data on human
exposure, the identification and regulation of expo-
sure to potential human toxins has traditionally
relied heavily on animal studies. However, system-
atic reviews have indicated that the utility of ani-
mal studies for these purposes is lacking in the
fields of carcinogenicity (at least five reviews:
87–91) and teratology (one review: 92). No system-
atic review demonstrated a contrary result. The
sensitivity of animal models to a range of human
toxicities (i.e. their ability to identify them), highlighted
by one review (93), generally appears to be
accompanied by poor human specificity (i.e. the
ability to correctly identify human non-toxins),
resulting in a high incidence of false-positive
results.
EPA survey
The regulation of human exposure to potentially
carcinogenic chemicals constitutes society’s most
important use of animal carcinogenicity data. In
2004, to examine the utility of animal carcinogenic-
ity data in protecting public health, I surveyed the
EPA’s Integrated Risk Information System (IRIS)
chemicals database. This database contains the
environmental contaminants of greatest concern in
the USA, together with their animal, and, in a small
minority of cases, human toxicity data, along with
the human toxicity assessments based on this
pooled data. However, of the 160 IRIS chemicals
lacking even limited human exposure data, but pos-
sessing animal data, for which human toxicity
assessments existed, the EPA considered the ani-
mal carcinogenicity data to be inadequate to sup-
port a classification of probable human carcinogen
or non-carcinogen in the majority of cases (58.1%,
93/160; 95% CI: 50.4–65.5; 87).
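The proportion and its interval can be checked with a short calculation. The sketch below, in Python, assumes a Wilson score interval. The source does not state which method was used, so this choice, and the function name wilson_interval, are assumptions made for illustration. Under that assumption, the result comes out close to the reported bounds.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(successes, n, confidence=0.95):
    """Wilson score interval for a binomial proportion.

    Assumption: the source does not say how its interval was derived;
    the Wilson method is used here only as one plausible choice.
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half_width = z * sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half_width, centre + half_width


# Figures from the text: animal data judged inadequate for 93 of 160 chemicals.
proportion = 93 / 160
lower, upper = wilson_interval(93, 160)
print(f"{proportion:.1%} (95% CI: {lower:.1%} to {upper:.1%})")
```

Other interval methods, such as the simple normal approximation, give slightly different bounds for the same counts.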
Furthermore, data from the World Health
Organisation’s International Agency for Research
on Cancer (IARC) indicated that the true utility of
animal carcinogenicity data for deriving human car-
cinogenicity assessments is actually substantially
lower than that indicated solely by EPA assess-
ments. Of 128 chemicals with human or animal
data assessed by both the EPA and the IARC,
human carcinogenicity classifications were consis-
tent between the two agencies only for the 17 chem-
icals for which at least limited human data were
available. For those 111 chemicals for which the
classification was primarily reliant on animal data,
the EPA was much more likely than the IARC to
assign carcinogenicity classifications indicative of
greater human risk (p < 0.0001; 87).
The IARC is a leading international authority on
carcinogenicity assessments, and the significant dif-
ferences between its human carcinogenicity classifi-
cations and those of the EPA, for identical
chemicals, indicate that: i) in the absence of signifi-
cant human data, the EPA is over-reliant on animal
carcinogenicity data; ii) as a result, the EPA tends
to over-predict carcinogenic risk; and iii) the true
predictivity for human carcinogenicity of animal
data is even poorer than that indicated by EPA figures
alone. The EPA policy of erroneously assuming that
tumours in animals are indicative of human carcinogenicity
was implicated as a primary cause of
these errors, which have substantial US public
health implications concerning the regulation of
human exposures to environmental contaminants
(87).
IARC Monographs survey
The poor human predictivity of animal carcino-
genicity studies was also demonstrated in 1993 by
Tomatis and Wilbourn (88), who surveyed the 780
chemical agents or exposure circumstances evalu-
ated and listed within Volumes 1–55 of the IARC
Monographs series (94). Of these, 502 (64.4%) had
definite or limited evidence of animal carcinogenic-
ity, and 104 (13.3%) were assessed as definite or
probable human carcinogens. Virtually all of the
latter group would, of course, have been members
of the former; so at least 398 animal carcinogens
were assessed and considered not to be definite or
probable human carcinogens.
The positive predictivity of a test is the propor-
tion of positive outcomes that are truly positive for
the characteristic being tested for, while the false-
positive rate refers to the proportion that are not.
Hence, based on these IARC figures, the positive
predictivity of the animal bioassay for definite or
probable human carcinogens was, at best, only
20.7% (104/502), while the false-positive rate was at
least 79.3% (398/502).
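The arithmetic behind these two figures can be laid out explicitly. The short Python sketch below uses only the IARC counts quoted above; the variable names are illustrative. It follows the definitions given in the text, under which the false-positive rate is the share of positive animal results that are not true human positives.

```python
# Counts quoted in the text (IARC Monographs, Volumes 1-55).
animal_positive = 502    # definite or limited evidence of animal carcinogenicity
human_positive = 104     # definite or probable human carcinogens

# Best case: every human carcinogen is also among the animal positives.
animal_only = animal_positive - human_positive

# Definitions as used in the text: both are proportions of positive results.
positive_predictivity = human_positive / animal_positive
false_positive_rate = animal_only / animal_positive

print(f"Animal positives not classed as human carcinogens: {animal_only}")
print(f"Positive predictivity (at best): {positive_predictivity:.1%}")
print(f"False-positive rate (at least): {false_positive_rate:.1%}")
```

Because the calculation assumes that all 104 human carcinogens were also animal-positive, any exception would lower the positive predictivity further, which is why the text describes the figures as "at best" and "at least".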
More-recent IARC classifications indicate little
improvement in the positive predictivity of the ani-
mal bioassay for human carcinogens. By 1 January
2004, a decade later, only 105 additional agents had
been added to the 1993 number, yielding a total of
885 agents or exposure circumstances listed in the
IARC Monographs (95). The proportion of definite
or probable human carcinogens had increased only
slightly, from 13.3% in 1993 to 17.1% in 2004.
The NTP and other surveys
Surveys by other investigators have also demon-
strated the poor human predictivity of animal car-
cinogenicity data. After examining the studies on
471 substances contained within the US National
Toxicology Program (NTP) carcinogenicity data-
base as of 1 July 1998, Haseman (89) concluded
that, although 250 (53.1%) produced carcinogenic
effects in at least one sex–species group, the actual
proportion which posed a significant carcinogenic
risk to humans was probably far lower, for reasons
such as interspecies differences in mechanisms of
carcinogenicity.
Similarly, around half of all chemicals tested on
animals and included in the comprehensive
Berkeley-based carcinogenic potency database,
whether natural or synthetic, gave positive results
(89). Rall (96) estimated that only around 10% of
chemicals are truly carcinogenic to humans. Ashby
and Purchase (97) speculated that all chemicals
would eventually display some carcinogenic activ-
ity, if tested in sufficient rodent strains. Even com-
mon table salt has been classified as a tumour
promoter in rats (98).
Fung and colleagues (99) estimated that, if all the
75,000 chemicals in use were tested for carcino-
genicity via the standard NTP bioassay, signifi-
cantly less than 50% would prove carcinogenic in
animals, and less than 5–10% would warrant fur-
ther investigation. They suggested that the higher
positivity rate recorded is due to chemical selection
based on a priori suspicion of carcinogenicity.
However, examination of the carcinogenicity litera-
ture reveals that chemicals are selected for study
for many reasons other than a priori suspicion,
including production volumes, occupational and
environmental exposure risks, and investigations of
mechanisms of carcinogenesis (100). Despite this,
the positivity rate of the carcinogenicity bioassay in
the general literature remains around 50% (101).
Huff (90) demonstrated a significant variation in
carcinogenicity test results between two major car-
cinogenicity testing programmes, at the NTP
(Research Triangle Park, NC, USA) and the Ramazzini
Foundation (RF; Bentivoglio, Italy). Both laboratories
had carried out several hundred chemical
carcinogenesis bioassays: around 500 at the NTP,
and 200 at the RF. Of these, 21 chemicals were eval-
uated by both laboratories, of which published
results were available for 14. The results were
inconsistent for 3 of these 14 chemicals (21.4%),
which had been declared carcinogenic by one labo-
ratory but not the other, questioning the reliability
of these assays. Of the remaining 11 chemicals,
both laboratories found nine to be carcinogenic, and
two not to be carcinogenic.
Possible causes for such different toxicity results
between laboratories include differences in: the test
species, strain, age or gender; the quantity, dura-
tion and consistency of dosing; the route and
method of administration; diet and laboratory envi-
ronmental conditions; and the criteria used for the
assessment of toxicity.
Ennever and Lave (91) demonstrated that nei-
ther of the two commonly-used interpretations of
rodent carcinogenicity data provides valid conclusions
about human carcinogenicity. If a risk avoidance
interpretation is used, in which any positive
result in male or female mice or rats is considered
positive, then nine of the 10 known human carcino-
gens among the hundreds of chemicals tested by the
NTP are positive (102), but so are an implausible
22% of all chemicals tested (99). If a less risk-sensi-
tive interpretation is used, whereby only chemicals
positive in both mice and rats are considered posi-
tive, then only three of the six known human car-
cinogens tested in both species are positive (102).
The former interpretation could result in the need-
less denial of potentially useful chemicals to society,
while the latter could result in widespread exposure
to undetected human carcinogens.
Toxicological utility: teratogenicity
In 2005, my colleagues and I published an extensive
survey examining the human predictivity of animal
teratogenicity testing (92). We examined nearly
every putative teratogen tested in more than one
species, including 1,396 studies. Data for 11 groups
of known human teratogens tested in 12 animal
species were analysed. Discordance between species
was apparent in just under 30% of these 1,396
reports. Almost a quarter of all the outcomes in the
six main species used (mouse, rat, rabbit, hamster,
primate and dog) were equivocal. For known human
teratogens, there was high variability in positive pre-
dictivity between species, the mean of which was
only 51% — hardly better than tossing a coin. Some
species exhibited a high false-negative rate. Only
around half of these known human teratogens were
teratogenic in more than one primate species. Fewer
than one in 40 of the substances designated as poten-
tial teratogens from animal studies, were conclu-
sively linked to human birth defects.
We concluded that the poor human predictivity of
animal-based teratology warrants the cessation of
animal testing, and that resources should be reallo-
cated into the further development and implemen-
tation of quicker, cheaper and more reliable,
scientifically validated alternatives, such as the
embryonic stem cell test.
Toxicological utility: various
Under the auspices of the International Life
Sciences Institute’s Health and Environmental
Sciences Institute, Olsen and colleagues (93) sought
to determine the extent to which various types of
human toxicities evident during clinical trials could
be predicted from standard toxicology studies.
Based on a multi-company database of 131 pharma-
ceutical agents with one or more human toxicities
identified during clinical trials, they reported a
true-positive prediction rate of animal models for
human toxicity of 69%, and also that study results
from non-rodent (dog, primate) species have good
potential to identify human toxicities from many
therapeutic classes.
These results concur with those of the other tox-
icity reviews described. Animal studies are often
reasonably sensitive for human toxins. However,
their human predictivity and toxicological utility
are limited by their poor human specificity, which
results in high false-positive rates.
Causes of the poor human utility of animal models
When evaluated overall, these 27 systematic reviews
clearly do not support the widely-held assumptions of
animal ethics committees and the opinions of advo-
cates of animal experimentation, that laboratory ani-
mal use is generally beneficial in the development of
human therapeutic interventions and the assessment
of human toxicity. On the contrary, they frequently
demonstrate that animal experiments are of low util-
ity for these purposes. This appears to result both
from limitations of the animal models themselves,
and also from the poor methodological quality and
statistical design of many animal experiments.
Biomedical research
Chimpanzees are our closest living relatives, but
despite great similarities between the structural
regions of chimpanzee DNA and human DNA, impor-
tant differences between the regulatory regions exert
an “avalanche” effect on large numbers of structural
genes (103). Despite a nucleotide difference between
chimpanzees and humans of only 1–2%, this effect
results in differences of around 20% in protein
expression (104), representing marked phenotypic
differences between the species. These
differences manifest as: altered susceptibility to the
aetiology and progression of various diseases; differ-
ences in the absorption, tissue distribution, metabo-
lism, and excretion of chemotherapeutic agents; and
differences in the toxicity and efficacy of pharmaceu-
ticals and other agents (59, 103). Such effects appear
to be responsible for the demonstrated inability of
most chimpanzee research to contribute substan-
tially to the development of methods which are effi-
cacious in combating human diseases (59).
Other laboratory animal species are much less
similar to humans, both genetically and phenotypi-
cally, and are therefore less likely to be useful for
accurately modelling the progression of human dis-
eases or of human responses to chemicals and puta-
tive chemotherapeutic agents.
Toxicity testing
Rodents are by far the most common laboratory
animal species used in toxicity studies. Several fac-
tors contribute to the demonstrated inability of
rodent bioassays to reliably predict human toxicity.
The stresses incurred during handling, restraint,
other routine laboratory procedures, and particu-
larly, the stressful routes of dose administration
common to toxicity tests, alter immune status and
disease predisposition in ways which are very diffi-
cult to accurately predict, and which distort the pro-
gression of diseases and responses to chemicals and
putative chemotherapeutic agents (105, 106).
In addition, animals have a broad range of physi-
ological defences against general toxic insults, such
as epithelial shedding and inducible enzymes,
which commonly prove effective at environmentally
relevant doses, but which may be overwhelmed at
the high doses commonly applied in routine toxicity
testing (101). Carcinogenicity assays, in particular,
involve chronic, high level dosing. This may result,
inter alia, in insufficient rest intervals between
doses for the effective operation of DNA and tissue
repair mechanisms, which, with the unnatural ele-
vation of cell division rates during ad libitum feed-
ing, may predispose the animals to mutagenesis and
carcinogenesis. Lower doses, greater intervals
between exposures, shorter total periods of expo-
sure, and intermittent feeding, which represent a
more realistic approach to the environmental expo-
sure of humans to most potential toxins, might not
result in toxic changes at all (106).
Finally, differences in rates of absorption and
transport mechanisms between test routes of
administration and other important human routes
of exposure, and the considerable variability of
organ systems in response to toxic insults, between
and within species, strains and genders, render pro-
foundly difficult any attempt to accurately predict
human hazard on the basis of animal toxicity data
(106).
Methodological quality
At least 11 systematic reviews (57, 64, 68, 72–76,
78–81 [of which, 79 and 80 described a single
review]) demonstrated the poor methodological
quality of many of the animal studies examined,
and none of the reviews demonstrated good
methodological quality in a majority of studies.
While the omission of study details due to publica-
tion space constraints may artificially lower appar-
ent quality, the prevalence of such deficiencies
exceeds that which might reasonably be expected,
and is, accordingly, grounds for considerable con-
cern.
Common deficiencies included lack of: sample
size calculations, sufficient sample sizes, appropri-
ate animal models (e.g. aged animals or those with
appropriate comorbidities), randomised treatment
allocation, blinded drug administration, blinded
induction of ischaemia in the case of stroke models,
blinded outcome assessment, and conflict of inter-
est statements. Some studies also used anaesthetics
that may have altered the experimental outcomes,
and substantial variation was evident in the param-
eters assessed.
These deficiencies limited the clinical utility of
these studies in various significant ways. For exam-
ple, it is well established that studies lacking ran-
domisation or blinding often over-estimate the
magnitude of the effects of treatments (107–109).
Bebarta and colleagues (110) described the impacts
of lack of randomisation or blinding on estimations
of the significance of treatment effects in 389 ani-
mal studies and in 2,203 cell line studies. They
found that studies lacking randomisation or blind-
ing, but not both, were more likely to report a treat-
ment response than studies that used these
measures (OR = 3.4; 95% CI = 1.7 to 6.9, and OR =
3.2; 95% CI = 1.3 to 7.7, respectively), and that
studies lacking both randomisation and blinding
were even more likely to report a treatment
response (OR = 5.2; 95% CI = 2.0 to 13.5).
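A quick consistency check can be run on odds ratios reported in this form. Confidence intervals for odds ratios are usually constructed on the logarithmic scale, so the point estimate should lie near the geometric mean of the two bounds. The Python sketch below applies that check to the figures quoted above; the assumption that these particular intervals were built on the log scale is ours and is not stated in the source.

```python
from math import sqrt

# (reported odds ratio, lower bound, upper bound), as quoted in the text.
reported = {
    "lacking randomisation": (3.4, 1.7, 6.9),
    "lacking blinding": (3.2, 1.3, 7.7),
    "lacking both": (5.2, 2.0, 13.5),
}


def geometric_midpoint(lower, upper):
    """Midpoint of an interval on the log scale."""
    return sqrt(lower * upper)


midpoints = {
    label: geometric_midpoint(lower, upper)
    for label, (_, lower, upper) in reported.items()
}

for label, (odds_ratio, lower, upper) in reported.items():
    print(f"{label}: reported OR {odds_ratio}, "
          f"log-scale midpoint {midpoints[label]:.2f}")
```

In each case the midpoint agrees with the reported odds ratio to within rounding, so the quoted estimates and intervals are internally consistent.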
Statistical design
Insufficient sample sizes left many studies under-
powered, limiting the statistical validity of the
study conclusions. Animal lives and other resources
may also be wasted, if experiments subsequently
require repetition as a result. As stated by the UK
Medical Research Council (111), “The number of animals
used… must be the minimum sufficient to create
adequate statistical power to answer the question
posed.”
According to Balls and colleagues (112), however,
“…surveys of published papers, as well as more anecdotal
information, suggest that more than half of the
published papers in biomedical research have statistical
mistakes, many seem to use excessive numbers
of animals, and a proportion are poorly designed.”
Festing (113) similarly stated that, “Surveys of published
papers show that there are many errors, both
in the design of the experiments and in the statistical
analysis of the resulting data. This must result
in a waste of animals and scientific resources, and it
is surely unethical.” De Boo and Hendriksen (114)
noted the tendency to alter animal numbers based
on scientifically irrelevant issues, such as availabil-
ity or cost.
Factors that should be considered when calculat-
ing appropriate sample sizes include: detectability
threshold (the size of the difference between treat-
ment groups considered significant); known or
expected data variation; the required significance level of
the test (‘p’ or ‘α’: the probability of a Type I error,
i.e. assuming a difference where none exists); the
acceptable probability of assuming no difference
where one does exist (‘β’: the probability of a Type II
error; the ‘power’ of an experiment is 1–β, and 0.8 is
the usual choice); and the type of statistical analysis to which
the data will be subjected. Smaller thresholds,
greater data variation, smaller acceptable error
probabilities (greater power), and certain statistical
tests for differences, all require larger samples.
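The way these factors interact can be illustrated with a simple power calculation. The Python sketch below uses the standard normal-approximation formula for a two-sided comparison of two group means, n per group = 2 × ((z for 1–α/2 + z for power) × SD / difference)². This formula is one common choice and is not taken from the source; the function name and the example values are hypothetical. The approximation slightly underestimates the numbers required for small groups, for which calculations based on the t distribution are preferable.

```python
from math import ceil
from statistics import NormalDist


def animals_per_group(difference, sd, alpha=0.05, power=0.8):
    """Approximate animals per group for comparing two means (two-sided).

    Normal approximation only; an illustrative sketch, not a substitute
    for dedicated power-analysis software.
    """
    z = NormalDist().inv_cdf
    z_alpha = z(1 - alpha / 2)   # controls the Type I error probability
    z_beta = z(power)            # power = 1 - beta
    return ceil(2 * ((z_alpha + z_beta) * sd / difference) ** 2)


# Hypothetical inputs, in units of one standard deviation.
n_baseline = animals_per_group(difference=1.0, sd=1.0)
n_smaller_threshold = animals_per_group(difference=0.5, sd=1.0)
n_more_variation = animals_per_group(difference=1.0, sd=2.0)
n_higher_power = animals_per_group(difference=1.0, sd=1.0, power=0.9)

print(n_baseline, n_smaller_threshold, n_more_variation, n_higher_power)
```

Halving the detectability threshold, or doubling the data variation, roughly quadruples the number of animals required, while raising the power from 0.8 to 0.9 increases it more modestly.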
No universal rule for calculating correct sample
sizes exists (114). Festing (115), for example,
describes two methods, the preferred ‘power calcu-
lation,’ and the ‘resource equation.’ Power calcula-
tions use formulae which are available in
interactive computer programmes (e.g. 116, 117),
and calculate the minimum sample sizes required to
detect treatment effects with specified degrees of
certainty. Mead’s ‘resource equation’ (118) calcu-
lates sample sizes by using degrees of freedom, and
incorporates statistical parameters, such as treat-
ment effects, block effects and error degrees of free-
dom.
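As a rough illustration of the resource equation approach, the sketch below computes the error degrees of freedom as E = N – T – B, where N, T and B are the total, treatment and block degrees of freedom. The commonly quoted guideline that E should lie between about 10 and 20 is stated here as an assumption, as are the function names; the cited works should be consulted for the authoritative description.

```python
def error_degrees_of_freedom(n_animals, n_treatments, n_blocks=1):
    """Error degrees of freedom, E = N - T - B.

    N = n_animals - 1     (total degrees of freedom)
    T = n_treatments - 1  (treatment degrees of freedom)
    B = n_blocks - 1      (block degrees of freedom; 0 when unblocked)
    """
    total_df = n_animals - 1
    treatment_df = n_treatments - 1
    block_df = n_blocks - 1
    return total_df - treatment_df - block_df


def within_guideline(error_df, low=10, high=20):
    """True if E falls inside the assumed 10 to 20 guideline range."""
    return low <= error_df <= high


if __name__ == "__main__":
    # Hypothetical designs: 4 treatments with 6 or 3 animals per treatment.
    for animals in (24, 12):
        e = error_degrees_of_freedom(animals, n_treatments=4)
        print(animals, e, within_guideline(e))
```

In this hypothetical case, 24 animals across four treatments give E = 20, at the upper edge of the assumed range, whereas 12 animals give E = 8, which suggests that the experiment may be too small.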
Strategies should also be considered for minimis-
ing animal numbers without unacceptably compro-
mising statistical power. Several of these strategies
aim to decrease data variability by minimising het-
erogeneity in experimental environments and pro-
tocols. This can be achieved by: i) the appropriate
use of environmental enrichment, aimed at decreas-
ing physiological variation resulting from barren
laboratory housing and stressful procedures; ii)
choosing, where possible, to measure variables with
relatively low inherent variability; iii) the use of
genetically homogeneous (isogenic or inbred) or
specified pathogen-free animal strains; and iv)
screening raw data for obvious errors or outliers
(105, 114, 119–122).
Meta-analysis involves the aggregation and sta-
tistical analysis of suitable data from multiple
experiments. For some purposes, treatment and
control groups can be combined, permitting group
numbers to be minimised. Although new informa-
tion can be derived through meta-analysis, more
frequently, the results allow the refinement of
existing knowledge. By designing experiments and
reporting protocols to maximise their utility for
later meta-analyses, the benefit of individual ran-
domised controlled experiments can be maximised
(123). Strategies such as these, aimed at maximis-
ing the statistical power of small samples, are par-
ticularly appropriate when marked ethical, cost or
practical constraints limit the number of animals
that may be used (e.g. in experiments involving
non-human primates).
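One simple way of aggregating results across experiments is fixed-effect, inverse-variance pooling, sketched below. This is only one of several meta-analytic methods and is not necessarily the one used in the reviews cited here; the function name and the example numbers are hypothetical.

```python
import math


def fixed_effect_pool(effects, std_errors):
    """Pool effect estimates, weighting each by the inverse of its variance.

    effects:    effect size estimated in each experiment
    std_errors: standard error of each estimate (same order as effects)
    Returns (pooled_effect, pooled_standard_error).
    """
    if len(effects) != len(std_errors) or not effects:
        raise ValueError("need one standard error per effect estimate")
    weights = [1.0 / se ** 2 for se in std_errors]  # precise studies count more
    total_weight = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, effects)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)       # shrinks as studies are added
    return pooled, pooled_se


if __name__ == "__main__":
    # Hypothetical effect sizes from three small experiments.
    pooled, se = fixed_effect_pool([0.40, 0.10, 0.25], [0.20, 0.15, 0.30])
    print(round(pooled, 3), round(se, 3))
```

The pooled standard error is smaller than that of any single contributing experiment, which is why consistent reporting of effect sizes and their variability makes small studies more useful for later meta-analyses.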
Finally, the appropriate statistical analysis of the
resultant data should be closely linked to the exper-
imental design, and to the type of data produced
(124). The relatively poor statistical knowledge of
many animal researchers may be the cause of the
high prevalence of poor sample size choices in ani-
mal studies. Solutions could include the training of
researchers in statistics, and the direct input of
statisticians in experimental design and data analy-
sis (114, 125).
Raising standards: evidence-based medicine
Evidence-based medicine (EBM) bases clinical deci-
sions on methodologically-sound, prospective, ran-
domised, blinded, and controlled clinical trials. The
gold standard for EBM is large prospective epidemio-
logical studies, or meta-analyses of randomised and
blinded, controlled clinical trials (126). The applica-
tion to animal experiments of the EBM standards
which are currently applied to human clinical trials,
would make the results more robust and would
increase their applicability (76, 127–130). However,
mechanisms would be needed to ensure compliance
with such standards. Compliance could, for example,
be made a prerequisite for research funding, ethics
committee approval, and the publication of results.
These measures would require the education and co-
operation of funding agencies, ethics committees and
journal editors.
The UK Medical Research Council requires
researchers who are planning clinical trials, to ref-
erence systematic reviews of related previous work
before they are permitted to proceed (51). To facili-
tate the detection of toxicity and of potentially effi-
cacious drugs, such reviews should also include all
relevant animal research (76). A similar require-
ment to reference, or where necessary, conduct, sys-
tematic reviews of relevant animal studies, prior to
the commencement of further animal studies,
would encourage a more complete and impartial
assessment of the existing evidence (51).
Mechanisms are also needed to encourage the
reporting of negative results. The negative results
of preclinical studies are much more likely to
remain unpublished than are the negative results of
clinical trials (131). In a systematic review of stud-
ies on the efficacy of nicotinamide in combating
experimentally-induced stroke, comparisons pub-
lished only in abstract form gave a significantly
lower estimate of effect size than those published in
full, demonstrating publication bias (132). van der
Worp and colleagues (73) commented on the pressure
to obtain and publish positive results: “It is
therefore conceivable that the career of a preclinical
investigator is more dependent on obtaining positive
results, than that of a clinical trialist.”
Fundamental constraints on the human utility of animal models
Strategies designed to increase the full and impar-
tial examination of existing data before conducting
animal studies, to improve their methodological
quality, and to decrease bias during the publication
of results, would minimise the consumption of ani-
mal, financial and other resources within studies of
questionable merit and quality, and would increase
the potential utility of animal data in addressing
human situations and problems. However, the poor
human clinical or toxicological utility of many ani-
mal experiments is unlikely to result solely from
their poor methodological quality, or from publica-
tion bias. As stated by Perel et al. (76), the failure of
animal models to adequately represent human dis-
ease may be another fundamental cause, which, in
contrast, could be technically and theoretically
impossible to correct.
The genetic modification of animal models
through the addition of foreign genes (transgenic
animals) or the inactivation or deletion of genes
(knockout animals) is being attempted, to make
them more closely model humans. However, as well
as being technically very difficult to achieve, such
modification may not permit clear conclusions, due
to a large number of factors, including those reflect-
ing the intrinsic complexity of living organisms,
such as the variable redundancy of some metabolic
pathways between species (133). Furthermore, the
animal welfare burdens incurred during the cre-
ation and use of GM animals are particularly high
(134).
Implications for scientific validation of experimental models
Proposed non-animal test models are generally
required to pass formal scientific validation before
their use is widely or officially accepted.
Pharmaceutical licensing agencies, for example, are
generally unwilling to accept non-animal test data
as evidence of the human safety of proposed new
pharmaceuticals, until the test models used have
been scientifically validated.
Scientific validation has traditionally involved
the demonstration, in multiple independent labora-
tories, that the test in question is relevant and reli-
able for its specified purpose (practical validation;
135), such as the prediction of a certain in vivo out-
come. It should also be preceded by an evaluation of
the necessity for the test and of the adequacy of its
development (136, 137). A three-stage prevalidation
process should be utilised to improve the efficiency
of the formal validation process, by ensuring satis-
factory protocol refinement and transferability, and
test performance (138).
However, it is not always scientifically necessary,
or even logistically possible, to conduct multi-centre
practical studies. Hence weight-of-evidence valida-
tion, also known as validation by retrospective
analysis (139, 140), may be conducted, based on the
assessment of existing data in a structured, system-
atic and transparent manner, provided that data of
sufficient quantity and quality are available (141).
Regardless of the approach taken, the criteria
required for formal validation are comprehensive
(136, 141). Key objectives include: establishing the
role and necessity of the test model; ensuring clar-
ity of the defined goals; defining a prediction model,
i.e. an algorithm for converting the test data into
meaningful predictions of in vivo toxicity; examin-
ing the mechanistic relevance and credibility of the
model with respect to those goals; and providing a
description of the limitations of the model.
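To make the idea of a prediction model concrete, the sketch below converts a hypothetical test score into a toxic or non-toxic prediction using a fixed threshold, and then compares the predictions with known in vivo outcomes. The threshold, scores, outcomes and function names are all invented for illustration; real prediction models are defined test by test during validation.

```python
def predict_toxic(score, threshold=0.5):
    """Hypothetical prediction model: scores at or above threshold are 'toxic'."""
    return score >= threshold


def evaluate_prediction_model(scores, known_toxic, threshold=0.5):
    """Compare predictions with known outcomes for a set of reference chemicals."""
    tp = fp = tn = fn = 0
    for score, toxic in zip(scores, known_toxic):
        predicted = predict_toxic(score, threshold)
        if predicted and toxic:
            tp += 1   # toxic chemical correctly flagged
        elif predicted and not toxic:
            fp += 1   # non-toxic chemical wrongly flagged
        elif not predicted and toxic:
            fn += 1   # toxic chemical missed
        else:
            tn += 1   # non-toxic chemical correctly cleared
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one toxic and one non-toxic reference")
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "concordance": (tp + tn) / (tp + fp + tn + fn),
    }


if __name__ == "__main__":
    scores = [0.9, 0.7, 0.4, 0.2, 0.6]             # hypothetical test results
    known_toxic = [True, True, True, False, False]  # hypothetical known outcomes
    print(evaluate_prediction_model(scores, known_toxic))
```

Summary measures of this kind are one way of expressing how relevant a model is for its specified purpose, and they can be calculated in the same way for animal and non-animal models.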
Where practical validation studies do occur, these
should adhere to best practice standards, designed to
ensure good methodological quality, including, for
example, statistical justifications of sample sizes, ran-
domised allocation to test groups, and blinded treat-
ment and assessment of results. Where possible,
inter-laboratory reproducibility should be demon-
strated (136).
Whether validation studies are conducted by prac-
tical or weight-of-evidence approaches, experience
has shown that transparency and independence from
commercial, political or other interests should be
maximised through the use of independent experts
and the peer-reviewed publication of outcomes (136).
Scientific validation should lead to the reasoned
overall assessment that sufficient evidence exists to
demonstrate that a model is, or is not, relevant and
reliable for the specified purpose, or that insuffi-
cient evidence exists to be reasonably certain either
way. In some cases, an interim assessment can be
made, until further evidence becomes available
(141).
The European Centre for the Validation of
Alternative Methods (ECVAM) was created by the
EC in 1991, to fulfil the requirements of Directive
86/609/EEC on the protection of animals used for
experimental and other scientific purposes. These
requirements state that the EC and its Member
States should actively support the development,
validation and acceptance of methods which could
replace, refine or reduce the use of laboratory ani-
mals (142). The US equivalent is the Interagency
Coordinating Committee on the Validation of
Alternative Methods (ICCVAM), which has similar
goals. Despite the high standards required for suc-
cessful validation, between 1998 and 2007, 21 dis-
tinct tests or categories of test methods that could
replace, reduce or refine laboratory animal use,
had been validated and registered with ECVAM,
and nine had achieved regulatory acceptance
(143).
However, unlike non-animal models, animal mod-
els are generally assumed to be reasonably predictive
of human outcomes in preclinical drug development,
toxicity testing, and other fields of biomedical
research, without the need to undergo formal validation
studies. Yet the 27 systematic reviews examined
in this study demonstrate that it is insufficient to
assume that animal models, even those in use for long
periods, are reliably predictive of human outcomes
without subjecting them to critical assessment.
Clearly, formal validation should be consistently
applied to all proposed experimental models,
regardless of their animal, non-animal, historical,
contemporary or possible future status, and models
should be chosen on the basis of critical scientific
review, with appropriate consideration also given to
animal welfare, ethical, legal, economic, and any
other relevant factors.
The Heads of ECVAM and the European
Chemicals Bureau, the EC agencies responsible for
technical aspects of validation and for EU chemicals
regulations, respectively, at that time, made a simi-
lar call in 1995, in which they urged that prevalida-
tion and independent assessment be applied with
equal force to all new or modified animal and non-
animal test guidelines (144).
Conclusions
The historical and contemporary paradigm, that ani-
mal models are generally reasonably predictive of
human outcomes, provides the basis for their wide-
spread use in toxicity testing and biomedical
research aimed at preventing or developing cures for
human diseases. However, their use persists for his-
torical and cultural reasons, rather than because
they have been demonstrated to be scientifically
valid. For example, many regulatory officials “feel
more comfortable” with animal data (145), and some
even believe that animal tests are inherently valid,
simply because they are conducted in animals (146).
However, most existing systematic reviews have
demonstrated that animal experiments are insuffi-
ciently predictive of human outcomes to provide
substantial benefits during the development of
human clinical interventions, or in deriving human
toxicity assessments. In only two of 20 reviews in
which clinical utility was examined, did the authors
conclude that the animal models were either signif-
icantly useful in contributing to the development of
clinical interventions, or were substantially consis-
tent with clinical outcomes (84, 85), and one of
these conclusions was contentious. Seven additional
reviews also failed to clearly demonstrate utility in
predicting human toxicological outcomes, such as
carcinogenicity and teratogenicity. Consequently,
animal data may not generally be assumed to be substantially
useful for these purposes.
Likely causes of this inadequacy include inherent
genotypic and phenotypic differences between
human and non-human species, the distortion of
experimental outcomes arising from experimental
environments and protocols, and the poor method-
ological quality of many animal experiments, as was
apparent in at least 11 reviews. There were no
reviews in which a majority of animal experiments
were of good methodological quality. Some of these
problems might be minimised with concerted effort
(given their widespread prevalence), but the limita-
tions resulting from interspecies differences are
likely to be technically and theoretically impossible
to overcome.
Despite the fact that they have not passed and,
indeed, could not pass, the formal scientific validation
process required of non-animal models prior to regu-
latory acceptance, most animal models are incorrectly
assumed to be predictive of human outcomes. The
consistent application of formal validation studies to
all test models is clearly warranted, regardless of
their animal, non-animal, historical, contemporary or
possible future status. Experimental model choices
should be based on such critical scientific review,
with appropriate consideration also given to animal
welfare, ethical, legal, economic and other relevant
factors.
Likely benefits would include greater selection of
models truly predictive for human outcomes,
increased safety of people exposed to chemicals that
have passed toxicity tests, increased efficiency during
the development of human pharmaceuticals and
other therapeutic interventions, and decreased
wastage of animal, personnel and financial resources.
In addition, the poor human clinical and toxicological
utility of most animal models for which data
exist, in conjunction with their generally substantial
animal welfare and economic costs, justifies a ban on
the use of animal models lacking scientific data
clearly establishing their human predictivity or utility.
Received 02.03.07; received in final form 10.07.07;
accepted for publication 11.07.07.
References
1. Anon. (2007). Annex to the Fifth Report on the Statistics
on the Number of Animals Used for Experimental
and other Scientific Purposes in the Member
States of the European Union (COM(2007)675 final),
277pp. Brussels, Belgium: European Commission.
2. Goldberg, A.M. (2002). Use of animals in research:
a science–society controversy? The American per-
spective: animal welfare issues. ALTEX 19,
137–139.
3. Stephens, M.L., Alvino, G.M. & Branson, J.B. (2002).
Animal pain and distress in vaccine testing in the
United States. Developments in Biologicals 111,
213–216.
4. Anon. (2007). FY 2006 AWA Inspections, 11pp.
Riverdale, MD, USA: United States Department of
Agriculture Animal and Plant Health Inspection
Service (USDA APHIS). Available at: http://www.
aphis.usda.gov/animal_welfare/downloads/
awreports/awreport2006.pdf (Accessed 12.12.07).
5. Carbone, L. (2004). What Animals Want: Expertise
and Advocacy in Laboratory Animal Welfare Policy,
291pp. Oxford, UK: Oxford University Press.
6. Office of Technology Assessment, US Congress
(1986). Alternatives to Animal Use in Research, Testing
and Education, OTA-BA-273, 437pp. Washington,
DC, USA: US Government Printing Office.
7. Home Office (2007). Statistics of Scientific Procedures
on Living Animals: Great Britain 2006, 49pp.
London, UK: The Stationery Office.
8. O’Shea, D. (2000). Johns Hopkins enters suit over
lab animal regulations. Press Release, 22 September,
2000. Baltimore, MD, USA: Johns Hopkins
University.
9. Fishbein, E.A. (2001). What price mice? Journal of
the American Medical Association 285, 939–941.
10. Sauer, U.G., Kolar, R. & Rusche, B. (2005). The use
of transgenic animals in biomedical research in
Germany. Part 1: Status Report 2001–2003. [Die
Verwendung transgener Tiere in der biomedizinischen
Forschung in Deutschland. Teil 1: Sachstandsbericht
2001–2003.] ALTEX 22, 233–246.
11. Anon. (2007). Swiss animal use statistics for 2005.
Pain & Distress Report 7, 2. Available at: http://www.
hsus.org/pain_distress_report (Accessed 12.12.07).
12. Rusche, B. (2003). The 3Rs and animal welfare —
conflict or the way forward? ALTEX 20 Suppl. 1,
63–76.
13. Combes, R.D., Balls, M., Bansil, L., Barratt, M., Bell,
D., Botham, P., Broadhead, C., Clothier, R., George,
E., Fentem, J., Jackson, M., Indans, I., Loizou, G.,
Navaratnam, V., Pentreath, V., Phillips, B., Stemplewski,
H. & Stewart, J. (2004). The Third FRAME
Toxicity Committee: Working toward greater imple-
mentation of alternatives in toxicity testing. ATLA
32 Suppl. 1B, 635–642.
14. Green, S. & Goldberg, A.M. (2004). TestSmart and
toxic ignorance. ATLA 32 Suppl. 1A, 359–363.
15. Fenner-Crisp, P.A., Maciorowski, A.F. & Timm,
G.E. (2000). The endocrine disruptor screening program
developed by the US Environmental Protection
Agency. Ecotoxicology 9, 85–91.
16. Green, S., Goldberg, A.M. & Zurlo, J. (2001). The
TestSmart-HPV program — Development of an
integrated approach for testing high production volume
chemicals. Regulatory Toxicology & Pharmacology
33, 105–109.
17. Armstrong, T.W., Zaleski, R.T., Konkel, W.J. & Parkerton,
T.J. (2002). A tiered approach to assessing children’s
exposure: a review of methods and data.
Toxicology Letters 127, 111–119.
18. Charles, G.D. (2004). In vitro models in endocrine
disruptor screening. ILAR Journal 45, 494–501.
19. Stokes, W.S. (2004). Selecting appropriate animal
models and experimental designs for endocrine dis-
ruptor research and testing studies. ILAR Journal
45, 387–393.
20. Louekari, K., Sihvonen, K., Kuittinen, M. & Sømnes,
V. (2006). In vitro tests within the REACH informa-
tion strategies. ATLA 34, 377–386.
21. Sandusky, C., Even, M., Stoick, K. & Sandler, J.
(2006). Strategies to reduce animal testing in US
EPA’s HPV program. ALTEX 23 Special Issue,
150–152.
22. Brom, F.W. (2002). Science and society: different
bioethical approaches towards animal experimenta-
tion. ALTEX 19, 78–82.
23. Festing, M.F.W. (2004). Is the use of animals in biomedical
research still necessary in 2002? Unfortunately,
“Yes”. ATLA 32 Suppl. 1B, 733–739.
24. Pawlik, W.W. (1998). The significance of animals in
biomedical research. [Znaczenie zwierzat w badani-
ach biomedycznych.] Folia Medica Cracoviensia 39,
175–182.
25. Kjellmer, I. (2002). Animal experiments are neces-
sary. Coordinated control functions are difficult to
study without the use of nature’s most complex sys-
tems: mammals and human beings. [Djurförsök är
nödvändiga. Samordnade kontrollfunktioner låter
sig svårligen studeras utan tillgång till naturens
mest komplexa system: däggdjur och människa.]
Lakartidningen 99, 1172–1173.
26. Osswald, W. (1992). Ethics of animal research and
application to humans. [Etica da investigação no
animal e aplicação ao homem.] Acta Medica Portuguesa
5, 222–225.
27. Greek, C.R. & Greek, J.S. (2002). 4th World Congress
Point/Counterpoint: Is Animal Research Necessary in
2002?, 54pp. Los Angeles, CA, USA: Americans for
Medical Advancement.
28. Singer, P. (1990). Animal Liberation: A New Ethics
for our Treatment of Animals, 2nd edn, 320pp. New
York, NY, USA: New York Review/Random House.
29. La Follette, H. & Shanks, N. (1994). Animal experimentation:
the legacy of Claude Bernard. International
Studies in the Philosophy of Science 8,
195–210.
30. Greek, C.R. & Greek, J.S. (2000). Sacred Cows and
Golden Geese, 242pp. New York, NY, USA: Continuum.
31. Greek, C.R. & Greek, J.S. (2002). Specious Science,
288pp. New York, NY, USA: Continuum.
32. Anon. (2006). Statement re: TGN1412. Available at:
http://www.tegenero.com/news/statement_re_tgn
1412/index.php (Accessed 18.04.06).
33. Anon. (2006). Frequently asked questions regarding
TGN1412. Available at: http://www.tegenero.com/
news/faqs_re_tgn1412/index.php (Accessed 18.04.06).
34. Bhogal, N. & Combes, R. (2006). TGN1412: time to
change the paradigm for the testing of new phar-
maceuticals. ATLA 34, 225–229.
35. Coghlan, A. (2006). Mystery over drug trial debacle
deepens. NewScientist.com news service, 14 August,
2006. Available at: http://www.newscientist.com/
article.ns?id=dn9734 (Accessed 12.12.07).
36. Graham, D.J., Campen, D., Hui, R., Spence, M.,
Cheetham, C., Levy, G., Shoor, S. & Ray, W.A.
(2005). Risk of acute myocardial infarction and sud-
den cardiac death in patients treated with cyclo-oxy-
genase 2 selective and non-selective non-steroidal
anti-inflammatory drugs: nested case-control study.
Lancet 365, 475–481.
37. Dahl, S.L. & Ward, J.R. (1982). Pharmacology, clin-
ical efficacy, and adverse effects of the nonsteroidal
anti-inflammatory agent benoxaprofen. Pharmacotherapy
2, 354–366.
38. Gad, S.C. (1990). Model selection in toxicology: principles
and practice. Journal of the American College of
Toxicology 9, 291–302.
39. Ross-Degnan, D., Soumerai, S.B., Fortess, E.E. &
Gurwitz, J.H. (1993). Examining product risk in
context. Market withdrawal of zomepirac as a case
study. Journal of the American Medical Association
270, 1937–1942.
40. Peters, T.S. (2005). Do preclinical testing strategies
help predict human hepatotoxic potentials? Toxicologic
Pathology 33, 146–154.
41. Venning, G.R. (1983). Identification of adverse reac-
tions to new drugs. I: What have been the important
adverse reactions since thalidomide? British Medical
Journal 286, 199–202.
42. Wallenstein, L. & Snyder, J. (1952). Neurotoxic reaction
to chloromycetin. Annals of Internal Medicine
36, 1526–1528.
43. Blum, M.D., Graham, D.J. & McCloskey, C.A.
(1994). Temafloxacin syndrome: review of 95 cases.
Clinical Infectious Diseases 18, 946–950.
44. Mulder, P., Richard, V. & Thuillez, C. (1998). Different
effects of calcium antagonists in a rat model of
heart failure. Cardiology 89 Suppl. 1, 33–37.
45. Food and Drug Administration, US Department of
Health and Human Services (2004). Innovation or
Stagnation: Challenge and Opportunity on the Critical
Path to New Medical Products, 31pp. Available
at: http://www.fda.gov/oc/initiatives/criticalpath/
whitepaper.pdf (Accessed 12.12.07).
46. Lazarou, J. & Pomeranz, B. (1998). Incidence of
adverse drug reactions in hospitalized patients: a
meta-analysis of prospective studies. Journal of the
American Medical Association 279, 1200–1205.
47. Koppanyi, T. & Avery, M.A. (1966). Species differ-
ences and the clinical trial of new drugs: a review.
Clinical Pharmacology & Therapeutics 7, 250–270.
48. Villar, D., Buck, W.B. & Gonzalez, J.M. (1998).
Ibuprofen, aspirin and acetaminophen toxicosis and
treatment in dogs and cats. Veterinary & Human
Toxicology 40, 156–162.
49. Wilson, J.G., Ritter, E.J., Scott, W.J. & Fradkin, R.
(1977). Comparative distribution and embryotoxic-
ity of acetylsalicylic acid in pregnant rats and rhe-
sus monkeys. Toxicology & Applied Pharmacology
41, 67–78.
50. National Institutes of Health (2006). Information on
Clinical Trials and Human Research Studies.
Available at: http://clinicaltrials.gov/ct/info/whatis;
jsessionid=B9D601AD55432DBDD59314931CA8385
C#phases (Accessed 17.04.07).
51. Pound, P., Ebrahim, S., Sandercock, P., Bracken,
M. & Roberts, I. (2004). Where is the evidence that
animal research benefits humans? British Medical
Journal 328, 514–517.
52. Nuffield Council on Bioethics (2005). The Ethics of
Research Involving Animals, 376pp. London, UK:
Nuffield Council on Bioethics.
53. Anon. (2006). Scopus in detail: what does it cover?
Available at: http://www.info.scopus.com/detail/
what/ (Accessed 01.03.07).
54. National Center for Biotechnology Information
(2006). PubMed overview. Available at: http://www.
ncbi.nlm.nih.gov/entrez/query/static/overview.html
(Accessed 14.04.07).
55. Lindl, T., Völkel, M. & Kolar, R. (2005). [Animal
experiments in biomedical research. An evaluation
of the clinical relevance of approved animal experi-
mental projects.] [German.] ALTEX 22, 143–151.
56. Lindl, T., Völkel, M. & Kolar, R. (2006). Animal
experiments in biomedical research. An evaluation
of the clinical relevance of approved animal experi-
mental projects: No evident implementation in
human medicine within more than 10 years.
[Lecture abstract.] ALTEX 23, 111.
57. Hackam, D.G. & Redelmeier, D.A. (2006). Translation
of research evidence from animals to humans.
Journal of the American Medical Association 296,
1731–1732.
58. Hackam, D.G. (2007). Translating animal research
into clinical benefit: poor methodological standards
in animal studies mean that positive results may
not translate to the clinical domain. British Medical
Journal 334, 163–164.
59. Knight, A. (2007). The poor contribution of chim-
panzee experiments to biomedical progress. Journal
of Applied Animal Welfare Science 10, 281–308.
60. Conlee, K.M., Hoffeld, E.H. & Stephens, M.L.
(2004). A demographic analysis of primate research
in the United States. ATLA 32 Suppl. 1A, 315–322.
61. Morris, E. (Undated). Sampling from Small Populations.
Available at: http://uregina.ca/~morrisev/
Sociology/Sampling%20from%20small%20
populations.htm (Accessed 12.12.07).
62. Guenther, W.C. (1973). A sample size formula for
the hypergeometric. Journal of Quality Technology
5, 167–170.
63. Green, J. (1982). Asymptotic sample size for given
confidence interval length. Applied Statistics 31,
298–300.
64. Macleod, M.R., O’Collins, T., Horky, L.L., Howells,
D.W. & Donnan, G.A. (2005). Systematic review and
meta-analysis of the efficacy of melatonin in experi-
mental stroke. Journal of Pineal Research 38,
35–41.
65. The National Institute of Neurological Disorders and
Stroke rt-PA Stroke Study Group (1995). Tissue plas-
minogen activator for acute ischemic stroke. New
England Journal of Medicine 333, 1581–1588.
66. Chinese Acute Stroke Trial (CAST) Collaborative
Group (1997). Randomised placebo-controlled trial
of early aspirin use in 20,000 patients with acute
ischaemic stroke. Lancet 349, 1641–1649.
67. International Stroke Trial Collaborative Group
(1997). The International Stroke Trial (IST): a ran-
domised trial of aspirin, subcutaneous heparin, or
both, or neither, among 19,435 patients with acute
ischaemic stroke. Lancet 349, 1569–1581.
68. Horn, J., de Haan, R.J., Vermeulen, M., Luiten,
P.G.M. & Limburg, M. (2001). Nimodipine in ani-
mal model experiments of focal cerebral ischemia: a
systematic review. Stroke 32, 2433–2438.
69. O’Collins, V.E., Macleod, M.R., Donnan, G.A., Horky,
L.L., van der Worp, B.H. & Howells, D.W. (2006).
1026 experimental treatments in acute stroke.
Annals of Neurology 59, 467–477.
70. Jonas, S., Aiyagari, V., Vieira, D. & Figueroa, M.
(2001). The failure of neuronal protective agents ver-
sus the success of thrombolysis in the treatment of
ischemic stroke: the predictive value of animal mod-
els. Annals of the New York Academy of Sciences 939,
257–267.
71. Curry, S.H. (2003). Why have so many drugs with
stellar results in laboratory stroke models failed in
clinical trials? A theory based on allometric rela-
tionships. Annals of the New York Academy of
Sciences 993, 69–74.
72. Macleod, M.R., O’Collins, T., Horky, L.L., Howells,
D.W. & Donnan, G.A. (2005). Systematic review and
meta-analysis of the efficacy of FK506 in experi-
mental stroke. Journal of Cerebral Blood Flow &
Metabolism 25, 1–9.
73. van der Worp, H.B., de Haan, P., Morrema, E. &
Kalkman, C.J. (2005). Methodological quality of animal
studies on neuroprotection in focal cerebral
ischaemia. Journal of Neurology 252, 1108–1114.
74. Willmot, M., Gray, L., Gibson, C., Murphy, S. &
Bath, P.M. (2005). A systematic review of nitric
oxide donors and L-arginine in experimental stroke;
effects on infarct size and cerebral blood flow. Nitric
Oxide 12, 141–149.
75. Willmot, M., Gibson, C., Gray, L., Murphy, S. &
Bath, P. (2005). Nitric oxide synthase inhibitors in
experimental ischemic stroke and their effects on
infarct size and cerebral blood flow: a systematic
review. Free Radical Biology & Medicine 39,
412–425.
76. Perel, P., Roberts, I., Sena, E., Wheble, P., Briscoe,
C., Sandercock, P., Macleod, M., Mignini, L.E.,
Jayaram, P. & Khan, K.S. (2007). Comparison of
treatment effects between animal experiments and
clinical trials: systematic review. British Medical
Journal 334, 197–200.
77. Stroke Therapy Academic Industry Roundtable
(1999). Recommendations for standards regarding
preclinical neuroprotective and restorative drug
development. Stroke 30, 2752–2758.
78. Lucas, C., Criens-Poublon, L.J., Cockrell, C.T. & De
Haan, R.J. (2002). Wound healing in cell studies
and animal model experiments by Low Level Laser
Therapy; were clinical studies justified? A system-
atic review. Lasers in Medical Science 17, 110–134.
79. Roberts, I., Kwan, I., Evans, P. & Haig, S. (2002).
Does animal experimentation inform human healthcare?
Observations from a systematic review of international
animal experiments on fluid resuscitation.
British Medical Journal 324, 474–476.
80. Mapstone, J., Roberts, I. & Evans, P. (2003). Fluid
resuscitation strategies: a systematic review of ani-
mal trials. Journal of Trauma 55, 571–589.
81. Lee, D.S., Nguyen, Q.T., Lapointe, N., Austin, P.C.,
Ohlsson, A., Tu, J.V., Stewart, D.J. & Rouleau, J.L.
(2003). Meta-analysis of the effects of endothelin
receptor blockade on survival in experimental heart
failure. Journal of Cardiac Failure 9, 368–374.
82. Corry, D.B. & Kheradmand, F. (2005). The future of
asthma therapy: integrating clinical and experimen-
tal studies. Immunologic Research 33, 35–51.
83. Lazzarini, L., Overgaard, K.A., Conti, E. & Shirtliff,
M.E. (2006). Experimental osteomyelitis: What have
we learned from animal studies about the systemic
treatment of osteomyelitis? Journal of Chemotherapy
18, 451–460.
84. Scheld, W.M. (1987). Therapy of streptococcal endo-
carditis: correlation of animal model and clinical
studies. Journal of Antimicrobial Chemotherapy 20
Suppl. A, 71–85.
85. Corpet, D.E. & Pierre, F. (2005). How good are
rodent models of carcinogenesis in predicting effi-
cacy in humans? A systematic review and meta-
analysis of colon chemoprevention in rats, mice and
men. European Journal of Cancer 41, 1911–1922.
86. Roberts, I., Evans, A., Bunn, F., Kwan, I. & Crowhurst,
E. (2001). Normalising the blood pressure in
bleeding trauma patients may be harmful. Lancet
357, 385–387.
87. Knight, A., Bailey, J. & Balcombe, J. (2006). Animal
carcinogenicity studies: 1. Poor human predictivity.
ATLA 34, 19–27.
88. Tomatis, L. & Wilbourn, J. (1993). Evaluation of carcinogenic
risk to humans: the experience of IARC. In
New Frontiers in Cancer Causation (ed. O. Iversen),
pp. 371–387. Washington, DC, USA: Taylor and
Francis.
89. Haseman, K. (2000). Using the NTP database to
assess the value of rodent carcinogenicity studies
for determining human cancer risk. Drug Metabolism
Reviews 32, 169–186.
90. Huff, J. (2002). Chemicals studied and evaluated in
long-term carcinogenesis bioassays by both the
Ramazzini Foundation and the National Toxicology
Program. Annals of the New York Academy of
Sciences 982, 208–230.
91. Ennever, F.K. & Lave, L.B. (2003). Implications of
the lack of accuracy of the lifetime rodent bioassay
for predicting human carcinogenicity. Regulatory
Toxicology & Pharmacology 38, 52–57.
92. Bailey, J., Knight, A. & Balcombe, J. (2005). The
future of teratology research is in vitro. Biogenic
Amines 19, 97–145.
93. Olson, H., Betton, G., Stritar, J. & Robinson, D.
(1998). The predictivity of the toxicity of pharma-
ceuticals in humans from animal data — an interim
assessment. Toxicology Letters 102–103, 535–538.
94. International Agency for Research on Cancer (IARC)
(1972–1992). IARC Monographs on the Evaluation of
Carcinogenic Risks to Humans, Volumes 1–55. Lyon,
France: IARC.
95. International Agency for Research on Cancer
(IARC) (undated). IARC Monographs Programme
on the Evaluation of Carcinogenic Risks to Humans.
Available at: http://monographs.iarc.fr (Accessed
01.01.04).
96. Rall, D.P. (2000). Laboratory animal tests and human
cancer. Drug Metabolism Reviews 2, 119–128.
97. Ashby, J. & Purchase, I.F.H. (1993). Will all chemi-
cals be carcinogenic to rodents when adequately
evaluated? Carcinogenesis 8, 489–495.
98. Shirai, T., Fukushima, S., Ohshima, M. & Ito, N.
(1984). Effects of butylated hydroxyanisole, buty-
lated hydroxytoluene, and NaCl on gastric car-
cinogenesis initiated with N-methyl-N-nitro-N-
nitrosoguanidine in F344 rats. Journal of the
National Cancer Institute 72, 1189–1198.
99. Fung, V., Barrett, J. & Huff, J. (1995). The carcino-
genesis bioassay in perspective: application in iden-
tifying human hazards. Environmental Health
Perspectives 103, 680–683.
100. Gold, L.S., Bernstein, L., Magaw, R. & Slone, T.H.
(1989). Interspecies extrapolation in carcinogenesis:
prediction between rats and mice. Environmental
Health Perspectives 81, 211–219.
101. Gold, L.S., Slone, T.H. & Ames, B.N. (1998). What
do animal cancer tests tell us about human cancer
risk? Overview of analyses of the carcinogenic
potency database. Drug Metabolism Reviews 30,
359–404.
102. Johnson, F.M. (2001). Response to Tennant et al.:
Attempts to replace the NTP rodent bioassay with
transgenic alternatives are unlikely to succeed.
Environmental Molecular Mutagenesis 37, 89–92.
103. Bailey, J. (2005). Non-human primates in medical
research and drug development: a critical review.
Biogenic Amines 19, 235–255.
104. Glazko, G., Veeramachaneni, V., Nei, M. &
Makalowski, W. (2005). Eighty percent of proteins are
different between humans and chimpanzees. Gene
346, 215–219.
105. Balcombe, J., Barnard, N. & Sandusky, C. (2004).
Laboratory routines cause animal stress. Contemporary
Topics in Laboratory Animal Science 43,
42–51.
106. Knight, A., Bailey, J. & Balcombe, J. (2006). Animal
carcinogenicity studies: 2. Obstacles to extrapola-
tion of data to humans. ATLA 34, 29–38.
107. Poignet, H., Nowicki, J.P. & Scatton, B. (1992).
Lack of neuroprotective effect of some sigma ligands
in a model of focal cerebral ischemia in the mouse.
Brain Research 596, 320–324.
108. Aronowski, J., Strong, R. & Grotta, J.C. (1996).
Treatment of experimental focal ischemia in rats
with lubeluzole. Neuropharmacology 35, 689–693.
109. Marshall, J.W., Cross, A.J., Jackson, D.M., Green,
A.R., Baker, H.F. & Ridley, R.M. (2000).
Clomethiazole protects against hemineglect in a primate
model of stroke. Brain Research Bulletin 52, 21–29.
110. Bebarta, V., Luyten, D. & Heard, K. (2003).
Emergency medicine animal research: does use of
randomisation and blinding affect the results?
Academic Emergency Medicine 10, 684–687.
111. Medical Research Council (MRC) (1993).
Responsibility in the Use of Animals in Medical Research,
12pp. London, UK: MRC.
112. Balls, M., Festing, M.F.W. & Vaughan, S. (eds)
(2004). Reducing the use of experimental animals
where no replacement is yet available. ATLA 32
Suppl. 2, 1–104.
113. Festing, M.F.W. (2004). Good experimental design
and statistics can save animals, but how can it be
promoted? ATLA 32 Suppl. 1A, 133–135.
114. De Boo, J. & Hendriksen, C. (2005). Reduction
strategies in animal research: a review of scientific
approaches at the intra-experimental, supra-experi-
mental and extra-experimental levels. ATLA 33,
369–377.
115. Festing, M.F.W. (1997). Experimental design and
husbandry. Experimental Gerontology 32, 39–47.
116. van Wilgenburg, H., van Schaick Zillesen, P.G. &
Krulichova, I. (2003). Sample power and ExpDesign:
tools for improving design of animal experiments.
Laboratory Animals 32, 39–43.
117. van Wilgenburg, H., van Schaick Zillesen, P.G. &
Krulichova, I. (2004). Experimental design: com-
puter simulation for improving the precision of an
experiment. ATLA 32 Suppl. 1B, 607–611.
118. Mead, R. (1988). The Design of Experiments, 634pp.
New York, NY, USA: Cambridge University Press.
119. Balcombe, J. (2006). Laboratory environments and
rodents’ behavioural needs: a review. Laboratory
Animals 40, 217–235.
120. Eskola, S., Lauhikari, M., Voipio, H., Laitinen, M. &
Nevalainen, T. (1999). Environmental enrichment
may alter the number of rats needed to achieve sta-
tistical significance. Scandinavian Journal of
Laboratory Animal Science 26, 134–144.
121. Schauber, E.M. & Edge, W.D. (1999). Statistical
power to detect main and interactive effects on the
attributes of small-mammal populations. Canadian
Journal of Zoology 77, 68–73.
122. Festing, M.F.W. & Altman, D.G. (2002). Guidelines
for the design and statistical analysis of experi-
ments using laboratory animals. ILAR Journal 43,
244–257.
123. Phillips, C.J.C. (2005). Meta-analysis — A system-
atic and quantitative review of animal experiments
to maximise the information derived. Animal
Welfare 14, 333–338.
124. Festing, M.F.W., Baumans, V., Combes, R.D., Halder,
M., Hendriksen, C.F.M., Howard, B.R., Lovell, D.P.,
Moore, G.J., Overend, P. & Wilson, M.S. (1998).
Reducing the use of laboratory animals in biomedical
research: problems and possible solutions. ATLA 26,
283–301.
125. Balls, M., Goldberg, A.M., Fentem, J.H., Broadhead,
C.L., Burch, R.L., Festing, M.F.W., Frazier, J.M.,
Hendriksen, C.F., Jennings, M., van der Kamp, M.D.,
Morton, D.B., Rowan, A.N., Russell, C., Russell,
W.M.S., Spielmann, H., Stephens, M.L., Stokes, W.S.,
Straughan, D.W., Yager, J.D., Zurlo, J. & Van
Zutphen, B.F. (1995). The Three Rs: the way forward:
The report and recommendations of ECVAM
Workshop 11. ATLA 23, 838–866.
126. Evidence-Based Medicine Working Group (1992).
Evidence-based medicine. A new approach to teach-
ing the practice of medicine. Journal of the
American Medical Association 268, 2420–2425.
127. Watters, M.P.R. & Goodman, N.W. (1999).
Comparison of basic methods in clinical studies and in
vitro tissue and cell culture studies in three anaes-
thesia journals. British Journal of Anaesthesia 82,
295–298.
128. Moher, D., Schulz, K.F. & Altman, D.G. (2001). The
CONSORT statement: revised recommendations
for improving the quality of reports of parallel-
group randomised trials. Lancet 357, 1191–1194.
129. Arlt, S. & Heuwieser, W. (2005). [Evidence based
veterinary medicine.] [German.] Deutsche
Tierärztliche Wochenschrift 112, 146–148.
130. Schulz, K.F. (2005). Assessing allocation concealment
and blinding in randomised controlled trials: why
bother? Equine Veterinary Journal 37, 394–395.
131. Brown, C.M., Calder, C., Linton, C., Small, C.,
Kenny, B.A., Spedding, M. & Patmore, L. (1995).
Neuroprotective properties of lifarizine compared
with those of other agents in a mouse model of focal
cerebral ischaemia. British Journal of
Pharmacology 115, 1425–1432.
132. Oktem, I.S., Menku, A., Akdemir, H., Kontas, O.,
Kurtsoy, A. & Koc, R.K. (2000). Therapeutic effect of
tirilazad mesylate (U-74006F), mannitol, and their
combination, on experimental ischemia. Research in
Experimental Medicine 199, 231–242.
133. Houdebine, L.M. (2007). Transgenic animal models
in biomedical research. Methods in Molecular
Biology 360, 163–202.
134. Sauer, U.G., Kolar, R. & Rusche, B. (2006). [The use
of transgenic animals in biomedical research in
Germany. Part 2: Ethical evaluation of the use of
transgenic animals in biomedical research and per-
spectives for the changeover in research to research
animal-free methods.] [German.] ALTEX 23, 3–16.
135. Balls, M., Blaauboer, B.J., Fentem, J.H., Bruner, L.,
Combes, R.D., Ekwall, B., Fielder, R.J., Guillouzo,
A., Lewis, R.W., Lovell, D.P., Reinhardt, C.A.,
Repetto, G., Sladowski, D., Spielmann, H. & Zucco, F.
(1995). Practical aspects of the validation of toxicity
test procedures: The report and recommendations
of ECVAM Workshop 5. ATLA 23, 129–147.
136. Balls, M. & Combes, R. (2005). The need for a for-
mal invalidation process for animal and non-animal
tests. ATLA 33, 299–308.
137. Hoffmann, S. & Hartung, T. (2006). Toward an evi-
dence-based toxicology. Human & Experimental
Toxicology 25, 497–513.
138. Curren, R.D., Southee, J.A., Spielmann, H., Liebsch,
M., Fentem, J.H. & Balls, M. (1995). The role of
prevalidation in the development, validation and
acceptance of alternative methods. ATLA 23, 211–217.
139. US Interagency Coordinating Committee on the
Validation of Alternative Methods in National
Institutes of Health (1997). Validation and
Regulatory Acceptance of Toxicological Test Methods. A
Report of the ad hoc Interagency Coordinating
Committee on the Validation of Alternative Methods,
123pp. Research Triangle Park, NC, USA: National
Institute of Environmental Health Sciences.
140. Organisation for Economic Cooperation and
Development (OECD) (2003). OECD Series on
Testing and Assessment: No. 34: Guidance Document on
the Validation and International Acceptance of New
and Updated Test Methods for Hazard Assessment,
Environment Directorate, 96pp. Paris, France:
OECD.
141. Balls, M. & Combes, R. (2006). Validation via weight-
of-evidence approaches. ALTEX 23, 332–335.
142. European Centre for the Validation of Alternative
Methods (ECVAM), Joint Research Centre,
European Commission Directorate General (Undated).
About ECVAM. Available at:
http://ecvam.jrc.cec.eu.int/index.htm (Accessed 12.12.07).
143. European Centre for the Validation of Alternative
Methods, Joint Research Centre, European
Commission Directorate General (Undated). Validated
methods. Available at:
http://ecvam.jrc.cec.eu.int/index.htm (Accessed 12.12.07).
144. Balls, M. & Karcher, W. (1995). The validation of
alternative test methods. ATLA 23, 884–886.
145. O’Connor, A.M. (1997). Barriers to regulatory
acceptance. In Animal Alternatives, Welfare and Ethics (ed.
L.F.M. van Zutphen & M. Balls), pp. 1173–1176.
Amsterdam, The Netherlands: Elsevier Science B.V.
146. Balls, M. (2004). Are animal tests inherently valid?
ATLA 32 Suppl. 1B, 755–758.
... The use of animals to predict human response to drugs, chemicals, or foods (including probiotics) remains a contentious issue. While some advocate for a ban on animal experimentation due to a perceived lack of scientific evidence for human predictivity [179], the relevance of animal disease models, such as mice, for studying human conditions has been positively evaluated [180]. ...
Article
This review provides a comprehensive overview of the current state of probiotic research, covering a wide range of topics, including strain identification, functional characterization, preclinical and clinical evaluations, mechanisms of action, therapeutic applications, manufacturing considerations, and future directions. The screening process for potential probiotics involves phenotypic and genomic analysis to identify strains with health-promoting properties while excluding those with any factor that could be harmful to the host. In vitro assays for evaluating probiotic traits such as acid tolerance, bile metabolism, adhesion properties, and antimicrobial effects are described. The review highlights promising findings from in vivo studies on probiotic mitigation of inflammatory bowel diseases, chemotherapy-induced mucositis, dysbiosis, obesity, diabetes, and bone health, primarily through immunomodulation and modulation of the local microbiota in human and animal models. Clinical studies demonstrating beneficial modulation of metabolic diseases and human central nervous system function are also presented. Manufacturing processes significantly impact the growth, viability, and properties of probiotics, and the composition of the product matrix and supplementation with prebiotics or other strains can modify their effects. The lack of regulatory oversight raises concerns about the quality, safety, and labeling accuracy of commercial probiotics, particularly for vulnerable populations. Advancements in multi-omics approaches, especially probiogenomics, will provide a deeper understanding of the mechanisms behind probiotic functionality, allowing for personalized and targeted probiotic therapies. However, it is crucial to simultaneously focus on improving manufacturing practices, implementing quality control standards, and establishing regulatory oversight to ensure the safety and efficacy of probiotic products in the face of increasing therapeutic applications.
... These toxicities, ranging from cytokine release syndrome to organ damage, often lead to the termination of promising new clinical programs and limit the broad application of approved therapies [9][10][11]. However, most of these challenges are typically unforeseen by traditional preclinical toxicology models, including cell lines and animals, which either fall short in capturing the complexity of native organs, or lack human-specific tissue features and immunological responses [12,13]. ...
Article
Predicting the toxicity of cancer immunotherapies preclinically is challenging because models of tumours and healthy organs do not typically fully recapitulate the expression of relevant human antigens. Here we show that patient-derived intestinal organoids and tumouroids supplemented with immune cells can be used to study the on-target off-tumour toxicities of T-cell-engaging bispecific antibodies (TCBs), and to capture clinical toxicities not predicted by conventional tissue-based models as well as inter-patient variabilities in TCB responses. We analysed the mechanisms of T-cell-mediated damage of neoplastic and donor-matched healthy epithelia at a single-cell resolution using multiplexed immunofluorescence. We found that TCBs that target the epithelial cell-adhesion molecule led to apoptosis in healthy organoids in accordance with clinical observations, and that apoptosis is associated with T-cell activation, cytokine release and intra-epithelial T-cell infiltration. Conversely, tumour organoids were more resistant to damage, probably owing to a reduced efficiency of T-cell infiltration within the epithelium. Patient-derived intestinal organoids can aid the study of immune–epithelial interactions as well as the preclinical and clinical development of cancer immunotherapies.
... Lethal concentration (LC50), lethal dose (LD50), effective concentration (EC50) and effective dose (ED50) are some of the terms frequently encountered in toxicity testing [10]. LC50 for liquid and LD50 for solid are defined as concentration or dose of a toxicant that kills 50% of test organisms within a particular period of exposure [11]. However, if the end point is not mortality, EC50 or ED50 is determined, i.e., the concentration or dose that can cause effects in 50% of test organisms [12]. ...
Preprint
Caenorhabditis elegans is increasingly attractive as a toxicity test organism, particularly as a model system for studying mechanisms of toxicity at the molecular level and the way these lead to whole-organism and population-level effects. In this study, lethal concentration (LC50) values of methiocarb for the nematode Caenorhabditis elegans were investigated. The experimental setup consisted of groups of 30 worms (300 worms in total, including 30 control worms) placed in three replicates. Methiocarb was added to NGM at concentrations from 1 to 20 mg/l (1, 2.5, 5, 7.5, 10, 12.5, 15, 17.5 and 20 mg/l), and the percentage mortality of exposed worms was calculated at each concentration; mortality was observed in all treatments. The results suggest that lethargy and lack of respiration in the medium, caused by methiocarb at all concentrations, may have been the cause of death. Regression analysis indicated that the mortality rate (Y) is positively correlated with concentration (X), with regression coefficient (R); after 24 hours, the LC50 value (with 95% confidence limits) was estimated at 4.805 mg/l.
... Barriers that once limited the use of iPSCs, such as costly reagents, complicated culture protocols, or restricted access to high quality, well-characterized iPSC lines, have diminished over the last decade. It has also become increasingly apparent that human biology can diverge significantly from rodent and even non-human primate biology, thus necessitating the use of human cells [3,4]. ...
Article
Induced pluripotent stem cell (iPSC) derived cell types are increasingly employed as in vitro model systems for drug discovery. For these studies to be meaningful, it is important to understand the reproducibility of the iPSC-derived cultures and their similarity to equivalent endogenous cell types. Single-cell and single-nucleus RNA sequencing (RNA-seq) are useful to gain such understanding, but they are expensive and time consuming, while bulk RNA-seq data can be generated quicker and at lower cost. In silico cell type decomposition is an efficient, inexpensive, and convenient alternative that can leverage bulk RNA-seq to derive more fine-grained information about these cultures. We developed CellMap, a computational tool that derives cell type profiles from publicly available single-cell and single-nucleus datasets to infer cell types in bulk RNA-seq data from iPSC-derived cell lines.
... Although multiple possible mechanisms have been demonstrated through animal experiments, small sample size effects and individual differences still exist, making it hard to draw a specific and reliable conclusion. Quality requirements have long been higher in clinical research than in animal experiments, so quality control and assessment of clinical interventions are comparatively comprehensive, whereas animal experiments show poor human clinical and toxicological utility (Knight, 2007). A systematic review is conducive to assembling overall evidence, enlarging sample size, and lowering various risks of bias; thus systematic evaluations of high-quality randomized controlled trials (RCTs) are acknowledged to be one of the most reliable pieces of evidence when the efficacy of interventions needs proving. ...
Article
Background: Fuzi’s compatibilities with other medicines are effective treatments for chronic heart failure. Pre-clinical animal experiments have indicated many possible synergistic compatibility mechanisms of it, but the results were not reliable and reproducible enough. Therefore, we performed this systematic review and meta-analysis of pre-clinical animal studies to integrate evidence, conducted both qualitative and quantitative evaluations of the compatibility and summarized potential synergistic mechanisms. Method: An exhaustive search was conducted for potentially relevant studies in nine online databases. The selection criteria were based on the Participants, Interventions, Control, Outcomes, and Study designs strategy. The SYRCLE risk of bias tool for animal trials was used to perform the methodological quality assessment. RevMan V.5.3 and STATA/SE 15.1 were used to perform the meta-analysis following the Cochrane Handbook for Systematic Reviews of Interventions. Result: 24 studies were included in the systematic review and meta-analysis. 12 outcomes were evaluated in the meta-analysis, including BNP, HR, HWI, ALD, LVEDP, LVSP, EF, FS, +dP/dtmax, −dP/dtmax, TNF-α and the activity of Na⁺-K⁺-ATPase. Subgroup analyses were performed depending on the modeling methods and duration. Conclusion: The synergistic Fuzi compatibility therapeutic effects against CHF animals were superior to those of Fuzi alone, as shown by improvements in cardiac function, resistance to ventricular remodeling and cardiac damage, regulation of myocardial energy metabolism disorder and RAAS, alleviation of inflammation, the metabolic process in vivo, and inhibition of cardiomyocyte apoptosis. Variations in CHF modeling methods and medication duration brought out possible model–effect and time-effect relationships.
... Regarding oxidative stress, higher protection against ROS-mediated toxicity in rodents compared to humans was reported for arsenite-exposed embryonic mouse brains (Allan et al. 2015), thalidomide-mediated toxicity in rat and rabbit whole embryo cultures (Hansen et al. 1999), embryonic fibroblasts in vitro, and adult heart tissue in vivo (Janssen et al. 1993;Knobloch et al. 2008). It is increasingly recognized that the physiology of laboratory animals often differs from human physiology (Knight 2007;Leist and Hartung 2013). For example, Olson et al. (2000) demonstrated that rodents identified only 43% of 150 pharmaceuticals known to be toxic in humans. ...
Article
Adverse outcome pathways (AOPs) are organized sequences of key events (KEs) that are triggered by a xenobiotic-induced molecular initiating event (MIE) and summit in an adverse outcome (AO) relevant to human or ecological health. The AOP framework causally connects toxicological mechanistic information with apical endpoints for application in regulatory sciences. AOPs are very useful to link endophenotypic, cellular endpoints in vitro to adverse health effects in vivo. In the field of in vitro developmental neurotoxicity (DNT), such cellular endpoints can be assessed using the human “Neurosphere Assay,” which depicts different endophenotypes for a broad variety of neurodevelopmental KEs. Combining this model with large-scale transcriptomics, we evaluated DNT hazards of two selected Chinese herbal medicines (CHMs) Lei Gong Teng (LGT) and Tian Ma (TM), and provided further insight into their modes-of-action (MoA). LGT disrupted hNPC migration eliciting an exceptional migration endophenotype. Time-lapse microscopy and intervention studies indicated that LGT disturbs laminin-dependent cell adhesion. TM impaired oligodendrocyte differentiation in human but not rat NPCs and activated a gene expression network related to oxidative stress. The LGT results supported a previously published AOP on radial glia cell adhesion due to interference with integrin-laminin binding, while the results of TM exposure were incorporated into a novel putative, stressor-based AOP. This study demonstrates that the combination of phenotypic and transcriptomic analyses is a powerful tool to elucidate compounds’ MoA and incorporate the results into novel or existing AOPs for a better perception of the DNT hazard in a regulatory context.
... cancer stem-like cells [CSCs]) are widely used [7][8][9]. However, they are unable to mimic genuine osteosarcoma histological structure because of the lack of ECM [10], while animal models face ethical issues and so-called '3R' principle disadvantages [11,12]. Therefore, novel in vitro osteosarcoma models that properly recapitulate the biologically native interactions between ECM and osteosarcoma cells are urgently needed. ...
Article
Current in vitro models for osteosarcoma investigation and drug screening, including two-dimensional (2D) cell culture and tumour spheroids (i.e. cancer stem-like cells), lack extracellular matrix (ECM). Therefore, results from traditional models may not reflect real pathological processes in genuine osteosarcoma histological structures. Here, we report a three-dimensional (3D) bioprinted osteosarcoma model (3DBPO) that contains osteosarcoma cells and shrouding ECM analogue in a 3D frame. Photo-crosslinkable bioinks composed of gelatine methacrylamide and hyaluronic acid methacrylate mimicked tumour ECM. We performed multi-omics analysis, including transcriptomics and DNA methylomics, to determine differences between the 3DBPO model and traditional models. Compared with 2D models and tumour spheroids, our 3DBPO model showed significant changes in cell cycle, metabolism, adherens junctions, and other pathways associated with epigenetic regulation. The 3DBPO model was more sensitive to therapies targeted to the autophagy pathway. We showed that simulating ECM yielded different osteosarcoma cell metabolic characteristics and drug sensitivity in the 3DBPO model compared with classical models. We suggest 3D printed osteosarcoma models can be used in osteosarcoma fundamental and translational research, which may contribute to novel therapeutic strategy discovery.
Article
The present study aimed to estimate the median lethal concentration of some of the most extensively used pest-control pesticides, the pyrethroids transfluthrin and cyfluthrin and the carbamates methiocarb and propoxur, using the free-living nematode Caenorhabditis elegans as a model organism. The median lethal concentration (LC50) was calculated by the log-dose/probit regression line method; on NGM, worms showed 24-hour lethality at concentrations of 37 mg/l, 61 mg/l, 63 mg/l and 48 mg/l for transfluthrin, cyfluthrin, methiocarb and propoxur, respectively. Structural differences and differences in toxicity between the compounds may account for the differences in median lethal concentration. Mammalian oral LD50 values were compared with the LC50 values calculated using C. elegans. C. elegans was found to be convenient for generating LC50 values analogous to mammalian LD50 values, and therefore shows great promise in toxicological research.
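The log-dose/probit regression line method named in this abstract can be sketched in a few lines. This is a minimal illustration only, assuming an unweighted least-squares fit of probit-transformed mortality against log10 concentration; the function name `probit_lc50` and the example values are hypothetical and are not taken from the study, which does not report its raw counts or fitting details here.

```python
import math
from statistics import NormalDist


def probit_lc50(concentrations, fractions_dead):
    """Estimate LC50 from a straight-line fit of probit(mortality) on log10(concentration).

    Hypothetical helper for illustration; uses ordinary (unweighted) least squares.
    """
    nd = NormalDist()
    # The probit transform is undefined at 0% and 100% mortality, so skip those points.
    pts = [(math.log10(c), nd.inv_cdf(p))
           for c, p in zip(concentrations, fractions_dead)
           if 0.0 < p < 1.0]
    if len(pts) < 2:
        raise ValueError("need at least two partial-mortality points")
    n = len(pts)
    mean_x = sum(x for x, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pts)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pts)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # A probit of 0 corresponds to 50% mortality; solve the line for that point.
    return 10 ** (-intercept / slope)


# Hypothetical mortality fractions (illustrative only, not data from the study).
concs = [10, 20, 40, 80, 160]      # mg/l
dead = [0.05, 0.20, 0.50, 0.80, 0.95]
print(round(probit_lc50(concs, dead), 1))
```

A full probit analysis would normally weight each point by its group size and iterate the fit, and would also report confidence limits; the unweighted line above only shows the core idea of reading the LC50 off the fitted line.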
Chapter
The convergence of humanized drug discovery and the human specificity of new therapeutic modalities have placed induced pluripotent stem cells (iPSCs) at the center of patient-driven research. In this chapter, we provided guidance on the careful consideration of whether an iPSC-derived model is best suited to the problem being asked, followed by specifics related to handling iPSC-derived neurons in a high-throughput screen.
Article
Experience has shown that the outcome of large and expensive validation studies on alternative methods can be compromised if their managers do not insist that optimised test protocols and proof of their performance are submitted before the start of the formal validation study. One way for the sponsors of validation studies to confirm both the likely relevance of a method for its stated purpose and its readiness for validation would be to require a prevalidation study before formal validation was contemplated. This process would involve the developers (or other proponents of the method) and selected independent laboratories in protocol refinement (Phase I) and protocol transfer (Phase II). The optimised protocol would then be assessed in a protocol performance phase (Phase III), which would involve the testing of a relevant set of coded test materials and an evaluation of a proposed prediction model. In certain circumstances, a successful outcome of Phase III might be sufficient for promotion of the regulatory acceptance of the method. Normally, however, the method would proceed to a formal validation study. The European Centre for the Validation of Alternative Methods, a recognised validation authority, now proposes to introduce this prevalidation scheme into its validation strategy.
Article
When using attribute sampling without replacement on lots of size N, a plan (n, c) may be desired which satisfies two conditions on the operating characteristic. This paper gives an approximate procedure which is very accurate and simple to use for solving the problem.
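The two conditions on the operating characteristic mentioned in this abstract can be made concrete with the hypergeometric distribution, which governs attribute sampling without replacement. The sketch below is not the paper's approximate procedure, which the abstract does not give; it is a brute-force exact search offered only as an illustration. The function names, the two-point formulation (a "good" lot accepted with probability at least 1 - alpha, a "bad" lot accepted with probability at most beta) and the example lot parameters are all assumptions.

```python
from math import comb


def accept_prob(N, D, n, c):
    """P(accept) = P(X <= c) for X ~ Hypergeometric(lot size N, D defectives, sample n)."""
    total = comb(N, n)
    # comb(a, b) is 0 when b > a, so impossible draws contribute nothing.
    hits = sum(comb(D, x) * comb(N - D, n - x) for x in range(min(c, D, n) + 1))
    return hits / total


def find_plan(N, d_good, alpha, d_bad, beta):
    """Smallest sample size n, with acceptance number c, meeting both conditions.

    Hypothetical helper: accept lots with d_good defectives with prob >= 1 - alpha,
    and lots with d_bad defectives with prob <= beta.
    """
    for n in range(1, N + 1):
        for c in range(n + 1):
            if accept_prob(N, d_good, n, c) >= 1 - alpha:
                # Raising c further only increases acceptance of bad lots,
                # so the smallest qualifying c is the only one worth checking.
                if accept_prob(N, d_bad, n, c) <= beta:
                    return n, c
                break
    return None


# Hypothetical example: lot of 100, "good" = 2 defectives, "bad" = 20 defectives.
print(find_plan(100, 2, 0.05, 20, 0.10))
```

Exhaustive search is practical only for modest lot sizes; an approximate closed-form procedure of the kind the abstract describes would be the natural choice for large N.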
Article
According to the German Animal Welfare Act, scientists in Germany must provide an ethical and scientific justification for their application to the licensing authority prior to undertaking an animal experiment. Such justifications commonly include lack of knowledge on the development of human diseases or the need for better or new therapies for humans. The present literature research is based on applications to perform animal experiments from biomedical study groups of three universities in Bavaria (Germany) between 1991 and 1993. These applications were classified as successful in the animal model in the respective publications (Lindl et al. ALTEX, 18, 171-178, 2001). We investigated the frequency of citations, the course of citations, and in which type of research the primary publications were cited: subsequent animal-based studies, in vitro studies, review articles or clinical studies. The criterion we applied was whether the scientists succeeded in reaching the goal they postulated in their applications, i.e. to contribute to new therapies or to gain results with direct clinical impact. The outcome was unambiguous: even though 97 clinically orientated publications containing citations of the above-mentioned publications were found (8% of all citations), only 4 publications evidenced a direct correlation between the results from animal experiments and observations in humans (0.3%). However, even in these 4 cases the hypotheses that had been verified successfully in the animal experiment failed in every respect. The implications of our findings may lead to demands concerning improvement of the licensing practice in Germany.
Article
Part 1: Status Report 2001-2003 (published in ALTEX-21, 4 2005). While the German Federal Government has set itself the goal to make an active contribution to reducing animal experiments, the use of transgenic animals in biomedical research continuously increases every year. It is against this background that the study at hand aimed at providing an overview of the goals and the contents of research projects performed in Germany, in the course of which transgenic animals were produced or used in experimental procedures. Specifically, it was envisaged to spell out those specific areas of research for which transgenic animals mainly were being used. Subsequently, it was evaluated whether the research goals revealed might also be pursued with non-animal test methods. In a literature survey, a total of 577 scientific publications relevant for the purposes of the study were collected. This material enables conclusions on those scientific areas in which transgenic animals are used, applying to fundamental research, but not on their use in routine procedures in applied research or for the maintenance of transgenic breeds, since such purposes do not tend to be the subject of publications in scientific journals. According to the topics covered by the publications, main areas of biomedical research with transgenic animals can be found in the fields of neurobiology, immunology, cardiology, embryology and oncology. However, their use can be discerned in all other areas of fundamental biomedical research as well. In accordance with the official German laboratory animal statistics, the vast majority of transgenic animals used were mice, followed by rats and pigs. Additionally, individual research projects with fish, rabbits and chickens were recorded. (In the official German laboratory animal statistics, very small numbers of transgenic hamsters, sheep and amphibians were also recorded in past years.)
A high percentage of the rats were used in cardiovascular research, whereas transgenic pigs as a rule were produced and bred as organ donors in xenotransplantation research. The majority of research projects either dealt with the experimental use of already established transgenic animal lines, or they described that transgenic animals specifically were produced for the purpose of the respective research project. Mostly, transgenesis was initiated by inserting the foreign gene into the germ cell genome. In some research projects, it was reported that the transgenic material was inserted into normally bred animals some time after parturition. Part 2: Perspectives to change biomedical research to non-animal test methods. As a rule, transgenic animals are being used in in vivo experiments to examine gene functions, their regulation or the contribution of genetic alterations to the development of diseases. Many transgenic animals already are affected in their wellbeing due to the genetic modification alone, regardless of the procedures performed with them. Moreover, it is to be questioned whether the experimental use of transgenic animals led to results that were of such outstanding scientific relevance that they legitimated the suffering of the animals. In order to point to possible approaches to avoiding the use of transgenic animals in the areas of research identified, subsequent investigations aimed at collecting information on non-animal test methods that might be applied in pursuing the aforesaid questions. In particular, these were non-animal test methods that make use of genetic techniques. Amongst these are in vitro cell culture methods with genetically modified cells, such as the so-called Transfected Cell Array, as well as in vitro test methods in which specifically targeted genes can be turned on or off selectively, for example by the so-called RNA interference technique or by antisense oligonucleotide genes.
Since such technologies can also be applied to cultures of human cells, investigations with these methods yield direct information on the function of human genes. Even though a one-to-one replacement of experiments on transgenic animals by non-animal test methods is considered unlikely, from the point of view of animal welfare the broad spectrum of already available non-animal test methods for studying the function of genes and genetically caused pathophysiological reactions shows that animal tests with transgenic animals can be waived without impeding biomedical research. Even if it cannot be entirely ruled out that some very specific questions linked to a given animal experiment could not be pursued for the time being, research restricted to modern and ethically acceptable in vitro test methods would certainly develop its own questions to pursue in order to solve the problems currently faced by biomedical research. Against this background, it is to be welcomed that the German Federal Government is currently actively promoting the further development of gene-technological non-animal test methods. To ensure that these funding measures make an effective contribution to reducing animal experiments, as spelled out by the government itself, the conversion of gene-technological research, like biomedical research as a whole, to non-animal test methods should be supported by concrete political actions. From the point of view of the German Animal Welfare Federation, the following measures are called for: To enable a fast and comprehensive advancement of promising gene-technological non-animal test methods, it should be ensured that public funding is provided with an adequate budget and over a sufficiently long period of time.
The legislator should initiate broad discussions on the question of whether society would be willing to dispense with certain pieces of knowledge if they could only be gained at the expense of a certain degree of animal suffering. Where appropriate, the German Animal Welfare Act should stipulate that certain procedures are not acceptable as such. As long as experiments with transgenic animals continue to be performed, concrete legal measures should be laid down in the German Animal Welfare Act to ensure that the distress of the animals (taking into account all factors relevant for transgenic animals) and the expected benefit of the research project are determined objectively, so that the outcome of the ethical evaluation process becomes comprehensible. The legislator should provide the authorities responsible for licensing research projects with concrete instructions to ensure that all aspects relevant to the welfare of the animals are fully taken into account when evaluating the ethical acceptability and scientific indispensability of projects, and that special attention is given to research projects with transgenic animals. The German Decree on the Reporting of Laboratory Animals should be amended to ensure that all individual transgenic animals are included in the official statistical reports, regardless of whether they end up being used in scientific procedures or not. From the point of view of animal welfare, it is possible to redesign biomedical research to do without transgenic animals without impeding necessary scientific progress. The survey at hand sought to contribute a scientifically sound background for initiating these discussions.