Psychological Bulletin
1979, Vol. 86, No. 3, 638-641
The "File Drawer Problem" and Tolerance for Null Results
Robert Rosenthal
Harvard University
For any given research area, one cannot tell how many studies have been conducted but never reported. The extreme view of the "file drawer problem" is that journals are filled with the 5% of the studies that show Type I errors, while the file drawers are filled with the 95% of the studies that show nonsignificant results. Quantitative procedures for computing the tolerance for filed and future null results are reported and illustrated, and the implications are discussed.
Both behavioral researchers and statisticians have long suspected that the studies published in the behavioral sciences are a biased sample of the studies that are actually carried out (Bakan, 1967; McNemar, 1960; Smart, 1964; Sterling, 1959). The extreme view of this problem, the "file drawer problem," is that the journals are filled with the 5% of the studies that show Type I errors, while the file drawers back at the lab are filled with the 95% of the studies that show nonsignificant (e.g., p > .05) results.
In the past there was very little one could do to assess the net effect of studies, tucked away in file drawers, that did not make the magic .05 level (Rosenthal & Gaito, 1963, 1964). Now, however, although no definitive solution to the problem is available, one can establish reasonable boundaries on the problem and estimate the degree of damage to any research conclusion that could be done by the file drawer problem.
This advance in our ability to cope with the file drawer is an outgrowth of the increasing interest of behavioral scientists in summarizing bodies of research literature systematically and quantitatively, both with respect to significance levels (Rosenthal, 1969, 1976, 1978) and with respect to effect-size estimation (Hall, 1978; Rosenthal, 1969, 1976; Rosenthal & Rosnow, 1975; Smith & Glass, 1977; Glass, Note 1). One hopes that this interest in summarizing entire research domains will lead to an improvement in bookkeeping so that eventually all results will be recorded both with an estimate of effect size (e.g., r or d; Cohen, 1977) and with the level of significance obtained, or more practically, with the standard normal deviate (Z) that corresponds to the obtained p (Rosenthal, 1978).¹ Future appraisals of research domains of the type found in Psychological Bulletin should give estimates of overall effect sizes and significance levels; these estimates of overall significance can provide a basis for coping with the file drawer problem.

Preparation of this article was supported in part by the National Science Foundation. I would like to thank Judith A. Hall and Donald B. Rubin for their valuable improvements of an earlier version of this article.
Requests for reprints should be sent to Robert Rosenthal, Department of Psychology and Social Relations, Harvard University, 33 Kirkland Street, Cambridge, Massachusetts 02138.
Tolerance for Future Null Results
Given any systematic quantitative review of the literature bearing on a particular hy-
¹ Standard normal deviates (Z) can be found by various methods, of which the following three are most often useful: (a) Obtain the exact p associated with the test statistic (e.g., t, F, or $\chi^2$) and find the Z associated with that p in tables of the normal distribution; (b) if the effect size r or phi is given or can be computed, Z can be estimated by $r(N)^{1/2}$; (c) if the effect size d is given or can be computed, Z can be estimated by $[d^2/(d^2 + 4)]^{1/2}(N)^{1/2}$.
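The three conversions in this footnote can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the author's procedure: the function names are hypothetical, the p passed to method (a) is assumed to be an exact one-tailed p, and the standard library's `statistics.NormalDist` stands in for the printed tables of the normal distribution.

```python
from math import sqrt
from statistics import NormalDist


def z_from_p(p_one_tailed: float) -> float:
    """Method (a): the Z whose upper-tail normal area equals the exact p.

    Assumption: p is one-tailed; this is not stated in the footnote itself.
    """
    return NormalDist().inv_cdf(1.0 - p_one_tailed)


def z_from_r(r: float, n: int) -> float:
    """Method (b): Z estimated as r * N^(1/2), for r or phi."""
    return r * sqrt(n)


def z_from_d(d: float, n: int) -> float:
    """Method (c): Z estimated as [d^2 / (d^2 + 4)]^(1/2) * N^(1/2).

    The formula yields a magnitude only; the direction of the effect
    has to be attached separately.
    """
    return sqrt(d * d / (d * d + 4.0)) * sqrt(n)


if __name__ == "__main__":
    # Hypothetical inputs chosen only to exercise the three functions.
    print(z_from_p(0.05))
    print(z_from_r(0.30, 100))
    print(z_from_d(0.50, 64))
```

Methods (b) and (c) agree with each other whenever r and d are related by r = d / (d^2 + 4)^(1/2), which is why the expression in (c) is simply the r implied by d multiplied by N^(1/2).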
Copyright 1979 by the American Psychological Association, Inc. 0033-2909/79/8603-0638$00.75