Psychological Bulletin
1991, Vol. 109, No. 1, 147-151
Copyright 1991 by the American Psychological Association, Inc.
0033-2909/91/$3.00

Adjusting for Differential Base Rates: Lord's Paradox Again

Howard Wainer
Educational Testing Service, Princeton, New Jersey, and Princeton University

When
the
responses
of 2 or
more groups
to the
relative
effects
of
some stimulus
are
compared,
it is
often
important
to
adjust statistically
the
estimates
of
those
effects
for
baseline
differences
among
those groups. This
is
often
the
case
in
experiments
on
heart rate
for
animals
of
different
ages.
How
should such adjustment
be
done? Among
the
competing methodologies
are (a)
subtract
the
base
rate,
(b)
divide
by the
base
rate,
and (c)
covary
out the
base rate. Because each
can
give
a
different
answer,
the
choice
is
crucial. This article shows
that
this
is an
example
of
Lord's Paradox
and
that
Rubin's Model
for the
measurement
of
causal
effects
allows researchers
to
understand what
the
assumptions
are
underlying
the
validity
of
each adjustment strategy.
The
answer
for
heart rate
data
is
almost
surely
Methodology
(a).
Twenty-five years ago, as a new graduate student, I first entered Green Hall, the home of Princeton University's Psychology Department. Today, I returned to teach a course there. Despite the wisdom painfully acquired over those years, I still felt the apprehension associated with the angst of the so-many-years-ago oral exams. This was intensified when Byron Campbell, who once tried to teach me about learning and motivation, showed up at my door with two postdocs in tow. He had a problem and wanted my advice. I couldn't help thinking, "Does this count toward my final grade?"
Introduction
to the
Problem
Campbell
and his
associates
had run an
experiment
in
which
they
recorded
the
baseline heart rates
of
rats
of
different
ages;
they
then introduced
a
stimulus
(a
noise)
and
kept track
of how
it
affected
the
rats'
heart rates. They
felt
that
the
effect
of the
stimulus
was
different
for
young
rats
(16
days old) than
for
older
rats
(60
days
old),
but
their baseline heart rates were also quite
different.
They needed
to
disentangle
the
causal
effect
of the
stimulus
from
the
baseline
differences
associated with development.
Reduced
to its
foundation, their question was, "How
are
we
to
adjust heart-rate data obtained
after
an
experimental
treatment,
for
differences
among animals
in
their base rates?"
Campbell pointed out that there was lots of advice on what has been called "The Law of Initial Values" (Wilder, 1950). This law, which bears at least a superficial resemblance to Weber's Law, states that the magnitude of response to a stimulus is related to the prestimulus level. The challenge is to uncover the "true" physiological response to an experimental stimulus in the face of what may be substantial individual differences in initial values. Some experts (Graham & Jackson, 1970) had suggested subtracting the base rate from the observed heart rate; others (Benjamin, 1963, 1967) said that one ought to use the base rate as a covariate. Still others advised dividing by the base rate and looking at percentage change. One authoritative review (Richards, 1980, p. 155) concluded that "analysis of covariance . . . has the widest acceptance by cardiac researchers as the answer to the problem." Since the results are affected by the method of adjustment, Campbell wanted to know what the right way was.

This presentation benefited from the stimulation and advice of a number of my colleagues. Prominent among these are Byron A. Campbell, Paul W. Holland, George A. Miller, Rick Richardson, and Donald B. Rubin.
Correspondence concerning this article should be addressed to Howard Wainer, Research Statistics Group, Educational Testing Service, Princeton, New Jersey.

This problem
was
described
by
Lord
(1967, 1969, 1975),
who
carefully
laid
out all of the
issues
but,
in
what must
have
been
frustrating
to
both
the
journal editors
and the
readers, chose
to
omit
the
solution. Thus
the
phenomenon commonly observed
of
these related techniques
yielding
different
results
has
come
to be
called
Lord's
Paradox. This apparent paradox
was
only
unraveled
recently (Holland
&
Rubin, 1983)
and
relies
on a
careful
explication
of the
specific
question being asked.
The explanation I gave to Byron Campbell and his associates, which I reproduce here, strongly follows Holland and Rubin's development in both logic and notation. Throughout this article I frequently quote from Holland and Rubin. Sections of this article are not direct quotes but are pretty close paraphrasing (especially in the third and fourth sections). To the extent that what I say is accurate, it is a tribute to them; any errors incurred when I deviate from their prose are my own.
Introduction
to the
Solution
Statistics can be used for a variety of purposes. In this instance two purposes come to the fore: description and causal inference. If all we are interested in is description, we can use any of the adjustment methods proposed, although each one yields a different descriptive statement. For example, if we subtract the baseline from the treatment we can make such descriptive statements as

    On average the heart rate decreases 62 beats/min for young animals after stimulation, whereas it decreases 10 beats/min for older ones. (A)

Figure 1. A framework for causal inference: a population U of units; a treatment indicator S (t or c); a subpopulation indicator G (1 or 2); outcome variables $Y_t$ and $Y_c$; and a concomitant variable X. (From "On Lord's Paradox" [p. 5] by P. W. Holland & D. B. Rubin, 1983, in H. Wainer & S. Messick [Eds.], Principals of modern psychological measurement, Hillsdale, NJ: Erlbaum. Copyright 1983 by Lawrence Erlbaum Associates. Adapted by permission.)

If we divide by the baseline, such statements as

    On average the heart rate decreases 12% for young animals after stimulation, whereas it decreases 3% for older ones (B)

become sensible, whereas a descriptive statement associated with a covariance adjustment might be

    On average a young animal's heart rate decreases as much after stimulation as that of an older one with the same baseline. (C)

Although
all of
these descriptive
statements
are
correct,
the
purpose
of the
investigation
was to be
able
to
make causal
statements
like
The
stimulus
had
a
greater
effect
on
young animals than
it did on
old
animals
(D)
As
I
will
show,
the
validity
of the
causal
inferences that naturally
follow
from
each
of
these descriptive statements
will
all
depend
on
different
untestable assumptions. Which
is the
"correct"
adjustment scheme
depends
crucially
on the
relative
plausibility
of
these assumptions.
Rubin's
Model
for
Causal
Inference
The
structure used
to
unravel this mystery involves Rubin's
model
(Rubin, 1974, 1977, 1978, 1980; Holland, 1986a, 1986b)
for
the
analysis
of
causal
effects.
This model allows absolute
explicitness
about certain distinctions
and
elements that
are
often
left
implicit
in
other accounts. This model
is not
meant
to
find the
cause
of an
effect;
rather
it
tells
how to
measure
the
effect
of a
cause. This purpose
is
made
explicit
in
Equation
1.
The
basic
elements
of the
model
are as
follows:
1. A population of units, U
2. An "experimental manipulation," with levels t or c, and its associated indicator variable, S
3. A subpopulation indicator variable, G
4. An outcome variable, Y
5. A concomitant variable, X.

This framework is summarized in Figure 1.
Holland and Rubin (1983, p. 8) define, "the causal effect of $t$ on $Y$ (relative to $c$) for each unit in $U$ is given by the difference, $Y_t - Y_c$. This is the amount that $t$ has increased (or decreased) the value of $Y$ (relative to $c$) on each unit. The expected value $E(Y_t - Y_c)$ is the average causal effect of $t$ versus $c$ on $Y$ in $U$." This can be restated as

$$E(Y_t - Y_c) = E(Y_t) - E(Y_c), \tag{1}$$

which explicitly shows how the unconditional means of $Y_t$ and $Y_c$ over $U$ have direct causal interpretations.
"In any study, what is observed on each unit is $Y_S$, so that when $S = t$, we observe $Y_t$ and when $S = c$, we observe $Y_c$. Thus the observed mean of the treatment group is

$$\text{treatment group mean} = E(Y_t \mid S = t). \tag{2}$$

The mean of $Y$ for the control group is

$$\text{control group mean} = E(Y_c \mid S = c). \tag{3}$$

In general, there is no reason why $E(Y_t)$ and $E(Y_t \mid S = t)$ should be equal. Similarly for $E(Y_c)$ and $E(Y_c \mid S = c)$. Hence, in general, neither $E(Y_c \mid S = c)$ nor $E(Y_t \mid S = t)$ has a direct causal interpretation" (Holland & Rubin, 1983, p. 8).
We can connect $E(Y_t)$ and $E(Y_t \mid S = t)$ by recognizing that $t$ and $c$ are mutually exclusive and exhaustive conditions and using the definition of conditional expectation to yield the basic equation

$$E(Y_t) = E(Y_t \mid S = t)\,P(S = t) + E(Y_t \mid S = c)\,P(S = c). \tag{4}$$

Similarly,

$$E(Y_c) = E(Y_c \mid S = c)\,P(S = c) + E(Y_c \mid S = t)\,P(S = t). \tag{5}$$

Note that both equations involve variables that can never be directly observed. Specifically, they require the average value of $Y_t$ among those animals exposed to $c$ and the average value of $Y_c$ among those animals exposed to $t$. This is the fundamental problem of causal inference (see particularly Holland, 1986a, 1986b; more distantly, Lewis, 1973, 1986).
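
The decomposition in Equation 4 can be checked numerically with a minimal sketch. The table of potential outcomes below is entirely hypothetical (invented heart rates, not data from the study), and it lists both outcomes for every unit, which no real experiment can do; the names `units`, `cond_mean`, and `prob` are illustrative only.

```python
from statistics import mean

# Hypothetical table of potential outcomes (beats/min). Every unit has BOTH
# Y_t and Y_c listed; in a real study only the one matching S is observed.
units = [
    {"Y_t": 450.0, "Y_c": 500.0, "S": "t"},
    {"Y_t": 460.0, "Y_c": 520.0, "S": "t"},
    {"Y_t": 440.0, "Y_c": 510.0, "S": "t"},
    {"Y_t": 330.0, "Y_c": 340.0, "S": "c"},
    {"Y_t": 320.0, "Y_c": 330.0, "S": "c"},
]

def cond_mean(var, s):
    """Mean of a potential outcome among units with treatment indicator S = s."""
    return mean(u[var] for u in units if u["S"] == s)

def prob(s):
    """Proportion of units with S = s."""
    return sum(u["S"] == s for u in units) / len(units)

e_yt = mean(u["Y_t"] for u in units)           # E(Y_t), unconditional
observed_treated_mean = cond_mean("Y_t", "t")  # E(Y_t | S = t), observable
unobservable_term = cond_mean("Y_t", "c")      # E(Y_t | S = c), never observed

# Right-hand side of Equation 4
eq4_rhs = observed_treated_mean * prob("t") + unobservable_term * prob("c")

print(e_yt, observed_treated_mean, eq4_rhs)
```

In this invented table the observed treatment-group mean differs from the unconditional mean of the treated outcome, and the two can be reconciled only through the term that is never observed.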
Using
Rubin's
Model
to
Adjust
for
Base
Rate
The identification of elements in this application is summarized in Table 1.

Table 1
Identification of Elements Used in Rubin's Model for Causal Inference

Element    Identification
Study design
  U        All of the animals in the study
  t        The stimulus (noise) condition
  c        ?
  S        S = t for all units
Variables measured
  G        Intact age groups of rats (1 = young, 2 = old)
  X        The baseline heart rate, taken prior to time = 0
  Y        The average heart rate after time = 0

The reason for a question mark next to c is that the manipulation c (control) was, in some sense, not performed. It is the subjunctive situation of what would have been the heart rate after time = 0 had t not occurred. The statistical machinery and notation are now set up so that we can proceed to the solution.
The key question (Statement D) involves making a comparison of the size of the average causal effect for young rats with that of older rats. Thus we need to separately estimate these causal effects.
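These are

$$\Delta_i = E(Y_t - Y_c \mid G = i), \qquad i = 1, 2, \tag{6}$$

and the difference of average causal effects

$$\Delta = \Delta_1 - \Delta_2. \tag{7}$$

In terms of the individual subpopulation averages, $\Delta$ may be expressed either as

$$\Delta = [E(Y_t \mid G = 1) - E(Y_c \mid G = 1)] - [E(Y_t \mid G = 2) - E(Y_c \mid G = 2)] \tag{8}$$

or as

$$\Delta = [E(Y_t \mid G = 1) - E(Y_t \mid G = 2)] - [E(Y_c \mid G = 1) - E(Y_c \mid G = 2)]. \tag{9}$$

This second form (Equation 9) is especially useful because it separates the observed $Y_t$ from the unobserved $Y_c$.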
At
this point
let us
consider
two
methods
of
adjustment
individually.
The first is to
subtract
the
baseline rate
from
observed
heart
rate;
the
second
is to
covary
out the
baseline rate.
Method
1:
Subtract
Out the
Baseline
If we subtract the baseline we obtain

$$D_i = E(Y_t - X \mid G = i), \qquad i = 1, 2. \tag{10}$$

The quantity $D_i$ is the mean change in heart rate in subpopulation $i$. The difference of the changes is

$$D = D_1 - D_2. \tag{11}$$

We can interpret Equation 11 directly as the observation that (Statement A) there is a difference of 52 beats/min between the young and old animals. But the causal conclusion that "the effect of the stimulus was a 52 beat/min greater reduction in heart rate for young rats than for older ones" is not true without making an additional assumption. The $D_i$ in Equation 10 are not the average causal effect parameters depicted in Equation 6. To draw this conclusion we must make an assumption about the values of the unobserved variable $Y_c$. Specifically, we must assume that the animal's heart rate, if there had been no stimulation, would have been the same as its base rate; that is,

$$Y_c = X. \tag{12}$$

Under this untestable assumption, the $D_i$ in Equation 10 are equal to the average causal effect parameters $\Delta_i$ in Equation 6.
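
The step from the descriptive mean change to the causal parameter can be written out explicitly. The lines below are a sketch that simply substitutes Assumption 12 into Equation 6; here $\Delta_i$ denotes the average causal effect of Equation 6, $D_i$ the mean change of Equation 10, and $D$ the difference of Equation 11.

```latex
\begin{align*}
\Delta_i &= E(Y_t - Y_c \mid G = i) && \text{(Equation 6)} \\
         &= E(Y_t - X \mid G = i)   && \text{(Assumption 12: } Y_c = X\text{)} \\
         &= D_i, \qquad i = 1, 2    && \text{(Equation 10)} \\
\Delta   &= \Delta_1 - \Delta_2 = D_1 - D_2 = D.
\end{align*}
```

Without Assumption 12 the second line does not follow, and $D$ remains a purely descriptive quantity.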
Method
2:
Covary
Out the
Baseline
If we construct a linear covariate from the baseline heart rate and adjust the observed heart rates using it, we obtain the following conditional expectations:

$$E(Y_t \mid X, G = i), \qquad i = 1, 2. \tag{13}$$

Thus the mean, conditional heart rate change in group $i$ at $X$ is

$$D_i(X) = E(Y_t - X \mid X, G = i), \qquad i = 1, 2. \tag{14}$$

The difference in these mean, conditional changes in heart rate at $X$ is

$$D(X) = D_1(X) - D_2(X). \tag{15}$$

If we assume that the conditional expectations in Equation 13 are both linear and parallel, we can write

$$E(Y_t \mid X, G = i) = a_i + bX, \qquad i = 1, 2. \tag{16}$$

Substituting this into Equation 14 yields

$$D_i(X) = a_i + (b - 1)X, \tag{17}$$

and so we can write $D(X)$ as

$$D(X) = a_1 - a_2. \tag{18}$$

Thus the covariance-adjusted difference between the two groups, $D(X)$, is independent of the value of $X$.
Because the intercepts of the best-fitting straight lines within each of the two groups are about the same, we are led to descriptive Statement C.
Although this statement about $D(X)$ is correct, it bears no direct relevance to the differential causal effect described in Equation 7. To connect the values of $D_i(X)$ to their analogous causal parallels $\Delta_i$, we must make an untestable assumption relating the covariate $X$ to the value of the observed heart rate under the control condition, $Y_c$. This is akin to, but different from, Assumption 12, which allowed us to make a causal interpretation of Equation 11. Suppose we generalize Assumption 12 to

$$Y_c = \alpha + \beta X. \tag{19}$$

This asserts that the heart rate under the control condition is a deterministic function of the baseline rate. Under Method 1, $\alpha = 0$ and $\beta = 1$. Under Method 2, $\alpha = 0$ and $\beta = b$, the common slope of the two within-group regression lines. If we make this latter assumption, we can interpret $D(X)$ in Equation 18 as the difference in causal effects $\Delta$ defined in Equation 7.
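
The claim for Method 2 can be sketched in the same way as for Method 1. The derivation below assumes the linear, parallel form of Equation 16 together with Assumption 19 with $\beta = b$, and uses iterated expectation over $X$ within each group; $\Delta_i$, $a_i$, $b$, and $D(X)$ are the quantities of Equations 6, 16, and 18.

```latex
\begin{align*}
\Delta_i &= E(Y_t - Y_c \mid G = i) \\
         &= E\bigl[\,E(Y_t \mid X, G = i) - (\alpha + bX) \,\bigm|\, G = i\,\bigr]
            && \text{(Assumption 19 with } \beta = b\text{)} \\
         &= E\bigl[\,a_i + bX - \alpha - bX \,\bigm|\, G = i\,\bigr]
            && \text{(Equation 16)} \\
         &= a_i - \alpha, \\
\Delta   &= \Delta_1 - \Delta_2 = a_1 - a_2 = D(X).
\end{align*}
```

In this sketch the intercept $\alpha$ cancels in the difference, so the causal reading of $D(X)$ rests on the slope assumption $\beta = b$ and on $\alpha$ being the same in both groups.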
Discussion
What
is the
Correct Untestable
Assumption?
The
very nature
of
untestable assumptions means
that
there
is
no
statistical procedure that
can be
counted
on to
settle
this
issue.
The
answer
to
this question must come from other
sources.
In
this instance, however,
we can be
guided
by
some
very
strong intuition. Although
we can
never
know
for
sure
what
each animal's heart rate would have been had we not intervened, it is not unreasonable to believe that it would have continued after time = 0 at about the same rate as it had before.
Figure 2. Panel A depicts a condition under which subtracting the baseline is a reasonable adjustment strategy; Panel B depicts a situation in which a covariance adjustment seems appropriate. (Labels in the figure: prestimulus condition; assumed heart rate under control condition; Assumption 2.)

Thus the untestable assumption underlying Method 1 (subtracting the baseline) seems like a good bet. Thus the causal interpretation based on the kind of covariance analysis described here would lead us astray.
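
How far apart the two adjustments can be is easy to see in a small sketch. The data below are invented for illustration (they are not the study's measurements, only chosen so that the group mean changes are 62 and 10 beats/min), and the helper names `mean_change`, `pooled_within_slope`, and `intercept` are hypothetical.

```python
from statistics import mean

# Invented data (beats/min), NOT from the study:
# X = baseline heart rate, Y = heart rate after the stimulus.
groups = {
    "young": {"X": [497, 507, 517, 527, 537], "Y": [442, 446, 457, 460, 470]},
    "old":   {"X": [313, 323, 333, 343, 353], "Y": [310, 314, 325, 328, 338]},
}

def mean_change(g):
    """Method 1: mean of Y - X within a group (the D_i of Equation 10)."""
    return mean(y - x for x, y in zip(g["X"], g["Y"]))

def pooled_within_slope(gs):
    """Common slope b of parallel within-group regression lines of Y on X."""
    sxy = sxx = 0.0
    for g in gs.values():
        mx, my = mean(g["X"]), mean(g["Y"])
        sxy += sum((x - mx) * (y - my) for x, y in zip(g["X"], g["Y"]))
        sxx += sum((x - mx) ** 2 for x in g["X"])
    return sxy / sxx

def intercept(g, b):
    """Intercept a_i of a group's line when the slope is fixed at b."""
    return mean(g["Y"]) - b * mean(g["X"])

b = pooled_within_slope(groups)
# Method 1: difference of mean changes, D = D_1 - D_2 (Equation 11)
D = mean_change(groups["young"]) - mean_change(groups["old"])
# Method 2: covariance-adjusted difference, a_1 - a_2 (Equation 18)
D_adj = intercept(groups["young"], b) - intercept(groups["old"], b)

print(f"Method 1: D = {D:.1f}; Method 2: a_1 - a_2 = {D_adj:.1f}; slope b = {b:.2f}")
```

With these invented numbers the Method 1 difference is -52 beats/min, whereas the covariance-adjusted difference is close to 3 beats/min. Both are correct descriptions of the same data; which one deserves a causal reading depends on what is assumed about the heart rate under the control condition.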
Is
this
always
true?
The
current situation
is
depicted
in
Panel
A
of
Figure
2. The
dotted line
is the
assumed
value
of the
heart
rate under
the
control condition.
It is
under this condition that
subtracting
out the
baseline
is the
natural methodology
to
make
causal
inferences.
But
suppose
the
baseline condition
looked more
like
that
shown
in
Panel
B of
Figure
2.
Which
of
the
two
assumed values
of the
control condition
would
we be
more
likely
to
believe?
In this situation we would be considerably more likely to believe that $\beta \neq 1$ in Equation 19, and consequently a covariance adjustment would be more reasonable.
What
About
Percentages?
Initially, we mentioned the possibility of adjusting for differential baselines by dividing by the base rate. This strategy was never discussed, although it may be perfectly plausible. We can fit this adjustment scheme, as well as any others not discussed, into Rubin's model. To accomplish this, we merely consider two new dependent variables, say $Y_t^*$ and $Y_c^*$, which are defined as

$$Y_t^* = Y_t / X \tag{20a}$$

and

$$Y_c^* = Y_c / X. \tag{20b}$$

After this, everything follows as before. All other variables are as they have always been.
The relationship of the covariate, X, with these transformed variables will be different, but the logic of the analysis remains. The choice of which untestable assumption we prefer remains the same. There is a minor change in computational strategy. Because of the well-known instability of ratios, we would take the ratio of sums rather than the sum of ratios. This would imply calculating the average heart rate per group and dividing by the average baseline rather than doing the transformation implied by Equations 20 first and summing afterward.
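
The computational point about ratios can be illustrated with a short sketch. The numbers are deliberately exaggerated and hypothetical (one unit is given a very small baseline so that its individual ratio dominates); the function names `ratio_of_means` and `mean_of_ratios` are illustrative only.

```python
from statistics import mean

def ratio_of_means(after, baseline):
    """Average the outcome and the baseline first, then divide (ratio of sums)."""
    return mean(after) / mean(baseline)

def mean_of_ratios(after, baseline):
    """Divide unit by unit first, then average (sum of ratios)."""
    return mean(y / x for x, y in zip(baseline, after))

# Hypothetical values; the first unit's small baseline makes its ratio unstable.
baseline = [10.0, 200.0, 300.0]
after = [30.0, 190.0, 290.0]

print(ratio_of_means(after, baseline))
print(mean_of_ratios(after, baseline))
```

In this invented example a single unit with a small baseline pulls the mean of ratios well above the ratio of means, which is the instability the text refers to.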
Are
There
Any
Other
Problems?
Sure there are problems. We have not touched on the very difficult problems associated with restriction of range, although we suspect that traditional approaches might work well here (arcsine or logistic transformation of the percentages). Nor have we considered appropriate scaling. As a first guess, it would appear that trying to find the transformation of the animals-by-time periods matrix that yields an additive decomposition would be fruitful.
Our
only
purpose
in
this
article
was to
try
to
settle
the
initial
question
of
what
is the
"right"
adjustment
scheme. There
is no
automatic answer
to
this question,
but a
variety
of
different
ones
emerge
if one is
careful
and
explicit
about
the
goals
of the
study.
References

Benjamin, L. S. (1963). Statistical treatment of the law of initial values (LIV) in autonomic research: A review and recommendation. Psychosomatic Medicine, 25, 556-566.
Benjamin, L. S. (1967). Facts and artifacts in using analysis of covariance to "undo" the law of initial values. Psychophysiology, 4, 187-206.
Graham, F. K., & Jackson, J. C. (1970). Arousal systems and infant heart rate responses. In H. W. Reese & L. P. Lipsitt (Eds.), Advances in Child Development and Behavior [special issue], 5, 59-117.
Holland, P. W. (1986a). Statistics and causal inference. Journal of the American Statistical Association, 81, 945-970.
Holland, P. W. (1986b). Which comes first, cause or effect? The New York Statistician, 38, 1-6.
Holland, P. W., & Rubin, D. B. (1983). On Lord's paradox. In H. Wainer & S. Messick (Eds.), Principals of modern psychological measurement (pp. 3-35). Hillsdale, NJ: Erlbaum.
Lewis, D. (1973). Counterfactuals. Cambridge, MA: Harvard University Press.
Lewis, D. (1986). Philosophical papers: II. New York: Oxford University Press.
Lord, F. M. (1967). A paradox in the interpretation of group comparisons. Psychological Bulletin, 68, 304-305.
Lord, F. M. (1969). Statistical adjustments when comparing preexisting groups. Psychological Bulletin, 72, 336-337.
Lord, F. M. (1975). Lord's paradox. In S. B. Anderson, S. Ball, R. T. Murphy, & Associates, Encyclopedia of Educational Evaluation (pp. 232-236). San Francisco, CA: Jossey-Bass.
Richards, J. E. (1980). The statistical analysis of heart rate: A review emphasizing infancy data. Psychophysiology, 17, 153-166.
Rubin, D. B. (1974). Estimating causal effects of treatments in randomized and non-randomized studies. Journal of Educational Psychology, 66, 688-701.
Rubin, D. B. (1977). Assignment to treatment group on the basis of a covariate. Journal of Educational Statistics, 2, 1-26.
Rubin, D. B. (1978). Bayesian inference for causal effects: The role of randomization. The Annals of Statistics, 6, 34-58.
Rubin, D. B. (1980). Discussion of "Randomization of experimental data in the Fisher randomization test," by Basu. Journal of the American Statistical Association, 75, 591-593.
Wilder, J. (1950). The law of initial values. Psychosomatic Medicine, 12, 392.

Received November 20, 1989
Revision received April 3, 1990
Accepted April 14, 1990