Wainer H. Adjusting for differential base rates: Lord's paradox again. Psychol Bull 109: 147-151

Authors:
  • Independent Statistician and Author

Abstract

When the responses of 2 or more groups to the relative effects of some stimulus are compared, it is often important to adjust statistically the estimates of those effects for baseline differences among those groups. This is often the case in experiments on heart rate for animals of different ages. How should such adjustment be done? Among the competing methodologies are (a) subtract the base rate, (b) divide by the base rate, and (c) covary out the base rate. Because each can give a different answer, the choice is crucial. This article shows that this is an example of Lord's Paradox and that Rubin's Model for the measurement of causal effects allows researchers to understand what the assumptions are underlying the validity of each adjustment strategy. The answer for heart rate data is almost surely Methodology (a).

Psychological Bulletin, 1991, Vol. 109, No. 1, 147-151
Copyright 1991 by the American Psychological Association, Inc. 0033-2909/91/$3.00

Adjusting for Differential Base Rates: Lord's Paradox Again

Howard Wainer
Educational Testing Service, Princeton, New Jersey, and Princeton University

Twenty-five years ago, as a new graduate student, I first entered Green Hall, the home of Princeton University's Psychology Department. Today, I returned to teach a course there. Despite the wisdom painfully acquired over those years, I still felt the apprehension associated with the angst of the so-many-years-ago oral exams. This was intensified when Byron Campbell, who once tried to teach me about learning and motivation, showed up at my door with two postdocs in tow. He had a problem and wanted my advice. I couldn't help thinking, "Does this count toward my final grade?"

Introduction to the Problem

Campbell and his associates had run an experiment in which they recorded the baseline heart rates of rats of different ages; they then introduced a stimulus (a noise) and kept track of how it affected the rats' heart rates. They felt that the effect of the stimulus was different for young rats (16 days old) than for older rats (60 days old), but their baseline heart rates were also quite different. They needed to disentangle the causal effect of the stimulus from the baseline differences associated with development. Reduced to its foundation, their question was, "How are we to adjust heart-rate data obtained after an experimental treatment, for differences among animals in their base rates?"

Campbell pointed out that there was lots of advice on what has been called "The Law of Initial Values" (Wilder, 1950). This law, which bears at least a superficial resemblance to Weber's Law, states that the magnitude of response to a stimulus is related to the prestimulus level. The challenge is to uncover the "true" physiological response to an experimental stimulus in the face of what may be substantial individual differences in initial values. Some experts (Graham & Jackson, 1970) had suggested subtracting the base rate from the observed heart rate; others (Benjamin, 1963, 1967) said that one ought to use the base rate as a covariate. Still others advised dividing by the base rate and looking at percentage change. One authoritative review (Richards, 1980, p. 155) concluded that "analysis of covariance . . . has the widest acceptance by cardiac researchers as the answer to the problem." Since the results are affected by the method of adjustment, Campbell wanted to know what the right way was.

This presentation benefited from the stimulation and advice of a number of my colleagues. Prominent among these are Byron A. Campbell, Paul W. Holland, George A. Miller, Rick Richardson, and Donald B. Rubin.
Correspondence concerning this article should be addressed to Howard Wainer, Research Statistics Group, Educational Testing Service, Princeton, New Jersey

This problem was described by Lord (1967, 1969, 1975), who carefully laid out all of the issues but, in what must have been frustrating to both the journal editors and the readers, chose to omit the solution. Thus the commonly observed phenomenon of these related techniques yielding different results has come to be called Lord's Paradox. This apparent paradox was only unraveled recently (Holland & Rubin, 1983) and relies on a careful explication of the specific question being asked.

The explanation I gave to Byron Campbell and his associates, which I reproduce here, strongly follows Holland and Rubin's development in both logic and notation. Throughout this article I frequently quote from Holland and Rubin. Sections of this article are not direct quotes but are pretty close paraphrasing (especially in the third and fourth sections). The extent to which what I say is accurate is a tribute to them; any errors incurred when I deviate from their prose are my own.

Introduction to the Solution

Statistics can be used for a variety of purposes. In this instance two purposes come to the fore: description and causal inference. If all we are interested in is description, we can use any of the adjustment methods proposed, although each one yields a different descriptive statement. For example, if we subtract the baseline from the treatment we can make such descriptive statements as

    On average the heart rate decreases 62 beats/min for young animals after stimulation, whereas it decreases 10 beats/min for older ones. (A)

[Figure 1: diagram relating the population U of units, the treatment indicator S (t or c), the subpopulation indicator G (1 or 2), the outcome variables $Y_t$ and $Y_c$, and the concomitant variable X.]

Figure 1. A framework for causal inference. (From "On Lord's Paradox" [p. 5] by P. W. Holland & D. B. Rubin, 1983, in H. Wainer & S. Messick [Eds.], Principals of modern psychological measurement, Hillsdale, NJ: Erlbaum. Copyright 1983 by Lawrence Erlbaum Associates. Adapted by permission.)

If we divide by the baseline, such statements as

    On average the heart rate decreases 12% for young animals after stimulation, whereas it decreases 3% for older ones (B)

become sensible, whereas a descriptive statement associated with a covariance adjustment might be

    On average a young animal's heart rate decreases as much after stimulation as that of an older one with the same baseline. (C)

Although all of these descriptive statements are correct, the purpose of the investigation was to be able to make causal statements like

    The stimulus had a greater effect on young animals than it did on old animals (D)

As I will show, the validity of the causal inferences that naturally follow from each of these descriptive statements will all depend on different untestable assumptions. Which is the "correct" adjustment scheme depends crucially on the relative plausibility of these assumptions.
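
To make the arithmetic behind Statements A and B concrete, here is a minimal sketch in Python. The heart rates are invented for illustration (they are not the data from Campbell's experiment) and were chosen only so that the results land near the magnitudes quoted in the statements; the function name `describe_group` is likewise hypothetical.

```python
from statistics import mean

def describe_group(baseline, after):
    """Return (mean change in beats/min, percentage change) for one group."""
    # Subtracting the baseline: average of the per-animal differences.
    change = mean([a - b for a, b in zip(after, baseline)])
    # Dividing by the baseline: change relative to the group's mean baseline.
    percent = 100.0 * (mean(after) - mean(baseline)) / mean(baseline)
    return change, percent

# Invented heart rates (beats/min), NOT data from the study.
young_base, young_after = [500, 520, 540], [438, 458, 478]
old_base, old_after = [320, 340, 360], [310, 330, 350]

young_change, young_pct = describe_group(young_base, young_after)
old_change, old_pct = describe_group(old_base, old_after)
print(f"young: {young_change:.0f} beats/min ({young_pct:.1f}%)")
print(f"old:   {old_change:.0f} beats/min ({old_pct:.1f}%)")
```

Both summaries are computed from the same numbers, yet they describe the gap between the groups on different scales, which is why the choice among them cannot be made on descriptive grounds alone.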

Rubin's Model for Causal Inference

The structure used to unravel this mystery involves Rubin's model (Rubin, 1974, 1977, 1978, 1980; Holland, 1986a, 1986b) for the analysis of causal effects. This model allows absolute explicitness about certain distinctions and elements that are often left implicit in other accounts. This model is not meant to find the cause of an effect; rather it tells how to measure the effect of a cause. This purpose is made explicit in Equation 1. The basic elements of the model are as follows:

1. A population of units, U
2. An "experimental manipulation," with levels t or c, and its associated indicator variable, S
3. A subpopulation indicator variable, G
4. An outcome variable, Y
5. A concomitant variable, X.

This framework is summarized in Figure 1.

Holland and Rubin (1983, p. 8) define, "the causal effect of t on Y (relative to c) for each unit in U is given by the difference, $Y_t - Y_c$. This is the amount that t has increased (or decreased) the value of Y (relative to c) on each unit. The expected value $E(Y_t - Y_c)$ is the average causal effect of t versus c on Y in U. This can be restated as

$$E(Y_t - Y_c) = E(Y_t) - E(Y_c), \tag{1}$$

which explicitly shows how the unconditional means of $Y_t$ and $Y_c$ over U have direct causal interpretations.

"In any study, what is observed on each unit is $Y_S$, so that when S = t, we observe $Y_t$ and when S = c, we observe $Y_c$. Thus the observed mean of the treatment group is

$$\text{treatment group mean} = E(Y_t \mid S = t). \tag{2}$$

The mean of Y for the control group is

$$\text{control group mean} = E(Y_c \mid S = c). \tag{3}$$

In general, there is no reason why $E(Y_t)$ and $E(Y_t \mid S = t)$ should be equal. Similarly for $E(Y_c)$ and $E(Y_c \mid S = c)$. Hence, in general, neither $E(Y_c \mid S = c)$ nor $E(Y_t \mid S = t)$ has a direct causal interpretation" (Holland & Rubin, 1983, p. 8).

We can connect $E(Y_t)$ and $E(Y_t \mid S = t)$ by recognizing that t and c are mutually exclusive and exhaustive conditions and use the definition of conditional expectation to yield the basic equation

$$E(Y_t) = E(Y_t \mid S = t)\,P(S = t) + E(Y_t \mid S = c)\,P(S = c). \tag{4}$$

Similarly,

$$E(Y_c) = E(Y_c \mid S = c)\,P(S = c) + E(Y_c \mid S = t)\,P(S = t). \tag{5}$$

Note that both equations involve variables that can never be directly observed. Specifically, they require the average value of $Y_t$ among those animals exposed to c and the average value of $Y_c$ among those animals exposed to t. This is the fundamental problem of causal inference (see particularly Holland, 1986a, 1986b; more distantly, Lewis, 1973, 1986).
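
The decomposition in Equation 4 can be checked numerically in a simulation, where both potential outcomes are visible for every unit (something no real study allows). The sketch below is illustrative only: the population, the constant effect of -30, and the exposure rule are all assumptions made up for the example, and the function name `simulate` is hypothetical.

```python
import numpy as np

def simulate(n=100_000, effect=-30.0, seed=1):
    """Simulate potential outcomes with non-random exposure to t."""
    rng = np.random.default_rng(seed)
    y_c = rng.normal(400.0, 50.0, n)   # potential outcome under c (assumed)
    y_t = y_c + effect                 # potential outcome under t (assumed)
    # Non-random exposure: units with a high Y_c are more likely to get t.
    p_t = 1.0 / (1.0 + np.exp(-(y_c - 400.0) / 50.0))
    got_t = rng.random(n) < p_t
    got_c = ~got_t
    # Right-hand side of Equation 4. Its second term uses Y_t for units
    # exposed to c, which only a simulation can see.
    eq4_rhs = (y_t[got_t].mean() * got_t.mean()
               + y_t[got_c].mean() * got_c.mean())
    return {
        "true_effect": (y_t - y_c).mean(),                # E(Y_t - Y_c)
        "E_Yt": y_t.mean(),                               # E(Y_t)
        "eq4_rhs": eq4_rhs,
        "naive": y_t[got_t].mean() - y_c[got_c].mean(),   # Eq. 2 minus Eq. 3
    }

result = simulate()
for name, value in result.items():
    print(f"{name}: {value:.2f}")
```

Under these assumptions the two sides of Equation 4 agree, while the difference between the observed group means (Equation 2 minus Equation 3) is far from the true average causal effect, because exposure was related to the outcome.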

Using Rubin's Model to Adjust for Base Rate

The identification of elements in this application is summarized in Table 1.

Table 1
Identification of Elements Used in Rubin's Model for Causal Inference

Element   Identification
Study design
  U       All of the animals in the study
  t       The stimulus (noise) condition
  c       ?
  S       S = t for all units
Variables measured
  G       Intact age groups of rats (1 = young, 2 = old)
  X       The baseline heart rate, taken prior to time = 0
  Y       The average heart rate after time = 0

The reason for a question mark next to c is that the manipulation c (control) was, in some sense, not performed. It is the subjunctive situation of what would have been the heart rate after time = 0 had t not occurred. The statistical machinery and notation are now set up so that we can proceed to the solution.

The key question (Statement D) involves making a comparison of the size of the average causal effect for young rats with that of older rats. Thus we need to separately estimate these causal effects. These are

$$\Delta_i = E(Y_t - Y_c \mid G = i), \quad i = 1, 2, \tag{6}$$

and the difference of average causal effects

$$\Delta = \Delta_1 - \Delta_2. \tag{7}$$

In terms of the individual subpopulation averages, $\Delta$ may be expressed either as

$$\Delta = [E(Y_t \mid G = 1) - E(Y_c \mid G = 1)] - [E(Y_t \mid G = 2) - E(Y_c \mid G = 2)] \tag{8}$$

or as

$$\Delta = [E(Y_t \mid G = 1) - E(Y_t \mid G = 2)] - [E(Y_c \mid G = 1) - E(Y_c \mid G = 2)]. \tag{9}$$

This second form (Equation 9) is especially useful because it separates the observed $Y_t$ from the unobserved $Y_c$.

At this point let us consider two methods of adjustment individually. The first is to subtract the baseline rate from observed heart rate; the second is to covary out the baseline rate.

Method 1: Subtract Out the Baseline

If we subtract the baseline we obtain

$$D_i = E(Y_t - X \mid G = i), \quad i = 1, 2. \tag{10}$$

The quantity $D_i$ is the mean change in heart rate in subpopulation i. The difference of the changes is

$$D = D_1 - D_2. \tag{11}$$

We can interpret Equation 11 directly as the observation that (Statement A) there is a difference of 52 beats/min between the young and old animals. But the causal conclusion that "the effect of the stimulus was a 52 beat/min greater reduction in heart rate for young rats than for older ones" is not true without making an additional assumption. The $D_i$ in Equation 10 are not the average causal effect parameters depicted in Equation 6. To draw this conclusion we must make an assumption about the values of the unobserved variable $Y_c$. Specifically, we must assume that the animal's heart rate, if there had been no stimulation, would have been the same as its base rate; that is,

$$Y_c = X. \tag{12}$$

Under this untestable assumption, the $D_i$ in Equation 10 are equal to the average causal effect parameters $\Delta_i$ in Equation 6.
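
The step from Equation 10 to Equation 6 can be spelled out by adding and subtracting $Y_c$ inside the expectation. The following short derivation is a sketch added for clarity and uses only the symbols already defined in Equations 6, 10, and 12.

```latex
\begin{align*}
D_i &= E(Y_t - X \mid G = i) \\
    &= E(Y_t - Y_c \mid G = i) + E(Y_c - X \mid G = i) \\
    &= \Delta_i + E(Y_c - X \mid G = i), \quad i = 1, 2.
\end{align*}
```

When $Y_c = X$ (Assumption 12), the last term is zero in both groups, so $D_i = \Delta_i$ and therefore $D = \Delta$. If the assumption fails, that term is the amount by which the mean change misstates the average causal effect in group i.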

Method 2: Covary Out the Baseline

If we construct a linear covariate from the baseline heart rate and adjust the observed heart rates using it, we obtain the following conditional expectations:

$$E(Y_t \mid X, G = i), \quad i = 1, 2. \tag{13}$$

Thus the mean, conditional heart rate change in group i at X is

$$D_i(X) = E(Y_t - X \mid X, G = i), \quad i = 1, 2. \tag{14}$$

The difference in these mean, conditional changes in heart rate at X is

$$D(X) = D_1(X) - D_2(X). \tag{15}$$

If we assume that the conditional expectations in Equation 13 are both linear and parallel, we can write

$$E(Y_t \mid X, G = i) = a_i + bX, \quad i = 1, 2. \tag{16}$$

Substituting this into Equation 14 yields

$$D_i(X) = a_i + (b - 1)X, \tag{17}$$

and so we can write $D(X)$ as

$$D(X) = a_1 - a_2. \tag{18}$$

Thus the covariance-adjusted difference between the two groups, $D(X)$, is independent of the value of X. Because the intercepts of the best fitting straight lines within each of the two groups are about the same, we are led to descriptive Statement C.

Although this statement about $D(X)$ is correct, it bears no direct relevance to the differential causal effect described in Equation 7. To connect the values of $D_i(X)$ to their analogous causal parallels $\Delta_i$, we must make an untestable assumption relating the covariate X to the value of the observed heart rate under the control condition, $Y_c$. This is akin to, but different from, Assumption 12, which allowed us to make a causal interpretation of Equation 11. Suppose we generalize Assumption 12 to

$$Y_c = \alpha + \beta X. \tag{19}$$

This asserts that the heart rate under the control condition is a deterministic function of the baseline rate. Under Method 1, $\alpha = 0$ and $\beta = 1$. Under Method 2, $\alpha = 0$ and $\beta = b$, the common slope of the two within-group regression lines. If we make this latter assumption, we can interpret $D(X)$ in Equation 18 as the difference in causal effects $\Delta$ defined in Equation 7.
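
The two adjustments can be compared side by side on simulated data. The sketch below is illustrative only: the group means, the common line with intercept 84 and slope 0.72, and the noise level are invented numbers, not estimates from the study, and `compare_methods` is a hypothetical helper. It fits the parallel-lines model of Equation 16 by ordinary least squares and contrasts the intercept difference (Equation 18) with the difference of mean changes (Equation 11).

```python
import numpy as np

def compare_methods(x, y, g):
    """Return Method 1 and Method 2 group differences (groups coded 1 and 2)."""
    x, y, g = np.asarray(x, float), np.asarray(y, float), np.asarray(g)
    in1, in2 = g == 1, g == 2
    # Method 1 (Equation 11): difference of mean changes Y_t - X.
    method1 = (y[in1] - x[in1]).mean() - (y[in2] - x[in2]).mean()
    # Method 2 (Equations 16-18): separate intercepts, common slope b.
    design = np.column_stack([in1.astype(float), in2.astype(float), x])
    (a1, a2, b), *_ = np.linalg.lstsq(design, y, rcond=None)
    return {
        "method1": method1,
        "method2": a1 - a2,
        "slope": b,
        "baseline_gap": x[in1].mean() - x[in2].mean(),
    }

# Invented data: both groups follow the same line, but baselines differ.
rng = np.random.default_rng(0)
n = 200
x = np.concatenate([rng.normal(520, 25, n), rng.normal(340, 25, n)])
g = np.repeat([1, 2], n)
y = 84 + 0.72 * x + rng.normal(0, 5, 2 * n)

res = compare_methods(x, y, g)
for name, value in res.items():
    print(f"{name}: {value:.2f}")
```

With a least-squares fit of this form, the two summaries are tied together by the sample identity method1 = method2 + (slope - 1) × baseline_gap, so they coincide only when the common slope is 1 or the groups share the same mean baseline. Which of the two deserves a causal reading still depends on the assumption made about $Y_c$ in Equation 19.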

Discussion

What is the Correct Untestable Assumption?

The very nature of untestable assumptions means that there is no statistical procedure that can be counted on to settle this issue. The answer to this question must come from other sources. In this instance, however, we can be guided by some very strong intuition. Although we can never know for sure what each animal's heart rate would have been had we not intervened, it is not unreasonable to believe that it would have continued after time = 0 at about the same rate as it had before.

[Figure 2: two panels, Panel A and Panel B. Recoverable labels: "Prestimulus condition," "Assumed heart rate under control condition," "Assumption 2."]

Figure 2. Panel A depicts a condition under which subtracting the baseline is a reasonable adjustment strategy; Panel B depicts a situation in which a covariance adjustment seems appropriate.

Thus the untestable assumption underlying Method 1 (subtracting the baseline) seems like a good bet. Thus the causal interpretation based on the kind of covariance analysis described here would lead us astray. Is this always true? The current situation is depicted in Panel A of Figure 2. The dotted line is the assumed value of the heart rate under the control condition. It is under this condition that subtracting out the baseline is the natural methodology to make causal inferences. But suppose the baseline condition looked more like that shown in Panel B of Figure 2. Which of the two assumed values of the control condition would we be more likely to believe? In this situation we would be considerably more likely to believe that $\beta \neq 1$ in Equation 19, and consequently a covariance adjustment would be more reasonable.

What About Percentages?

Initially, we mentioned the possibility of adjusting for differential baselines by dividing by the base rate. This strategy was never discussed, although it may be perfectly plausible. We can fit this adjustment scheme, as well as any others not discussed, into Rubin's model. To accomplish this, we merely consider two new dependent variables, say $Y_t^*$ and $Y_c^*$, which are defined as

$$Y_t^* = Y_t / X \tag{20a}$$

and

$$Y_c^* = Y_c / X. \tag{20b}$$

After this, everything follows as before. All other variables are as they have always been. The relationship of the covariate, X, with these transformed variables will be different, but the logic of the analysis remains. The choice of which untestable assumption we prefer remains the same. There is a minor change in computational strategy. Because of the well known instability of ratios, we would take the ratio of sums rather than the sum of ratios. This would imply calculating the average heart rate per group and dividing by the average baseline rather than doing the transformation implied by Equations 20 first and summing afterward.
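
The computational point about ratios can be illustrated with a few lines of Python. This is a minimal sketch with invented numbers (not heart rates from the study) and hypothetical function names; one unit is given a very small baseline to show how a single unstable ratio can dominate an average of ratios.

```python
def ratio_of_sums(after, baseline):
    """Group total after stimulation divided by group total at baseline."""
    return sum(after) / sum(baseline)

def mean_of_ratios(after, baseline):
    """Average of the per-unit ratios after/baseline."""
    ratios = [a / b for a, b in zip(after, baseline)]
    return sum(ratios) / len(ratios)

# Invented values; the third unit has a tiny baseline, so its ratio is 3.0.
baseline = [100, 400, 10]
after = [90, 380, 30]

print("ratio of sums: ", ratio_of_sums(after, baseline))
print("mean of ratios:", mean_of_ratios(after, baseline))
```

Dividing the totals is the same as dividing the group means, since the group size cancels. Here the ratio of sums stays just below 1 while the mean of ratios is pulled well above 1 by the single small denominator.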

Are There Any Other Problems?

Sure there are problems. We have not touched on the very difficult problems associated with restriction of range, although we suspect that traditional approaches might work well here (arcsine or logistic transformation of the percentages). Nor have we considered appropriate scaling. As a first guess, it would appear that trying to find the transformation of the animals-by-time periods matrix that yields an additive decomposition would be fruitful.

Our only purpose in this article was to try to settle the initial question of what is the "right" adjustment scheme. There is no automatic answer to this question, but a variety of different ones emerge if one is careful and explicit about the goals of the study.

References

Benjamin, L. S. (1963). Statistical treatment of the law of initial values (LIV) in autonomic research: A review and recommendation. Psychosomatic Medicine, 25, 556-566.

Benjamin, L. S. (1967). Facts and artifacts in using analysis of covariance to "undo" the law of initial values. Psychophysiology, 4, 487-202.

Graham, F. K., & Jackson, J. C. (1970). Arousal systems and infant heart rate responses. In H. W. Reese & L. P. Lipsitt (Eds.), Advances in Child Development and Behavior [special issue], 5, 59-117.

Holland, P. W. (1986a). Statistics and causal inference. Journal of the American Statistical Association, 81, 945-970.

Holland, P. W. (1986b). Which comes first, cause or effect? The New York Statistician, 38, 1-6.

Holland, P. W., & Rubin, D. B. (1983). On Lord's paradox. In H. Wainer & S. Messick (Eds.), Principals of modern psychological measurement (pp. 3-35). Hillsdale, NJ: Erlbaum.

Lewis, D. (1973). Counterfactuals. Cambridge, MA: Harvard University Press.

Lewis, D. (1986). Philosophical papers: II. New York: Oxford University Press.

Lord, F. M. (1967). A paradox in the interpretation of group comparisons. Psychological Bulletin, 68, 304-305.

Lord, F. M. (1969). Statistical adjustments when comparing preexisting groups. Psychological Bulletin, 72, 336-337.

Lord, F. M. (1975). Lord's paradox. In S. B. Anderson, S. Ball, R. T. Murphy, & Associates, Encyclopedia of Educational Evaluation (pp. 232-236). San Francisco, CA: Jossey-Bass.

Richards, J. E. (1980). The statistical analysis of heart rate: A review emphasizing infancy data. Psychophysiology, 17, 153-166.

Rubin, D. B. (1974). Estimating causal effects of treatments in randomized and non-randomized studies. Journal of Educational Psychology, 66, 688-701.

Rubin, D. B. (1977). Assignment to treatment group on the basis of a covariate. Journal of Educational Statistics, 2, 1-26.

Rubin, D. B. (1978). Bayesian inference for causal effects: The role of randomization. The Annals of Statistics, 7, 34-58.

Rubin, D. B. (1980). Discussion of "Randomization of experimental data in the Fisher randomization test," by Basu. Journal of the American Statistical Association, 75, 591-593.

Wilder, J. (1950). The law of initial values. Psychosomatic Medicine, 12, 392.

Received November 20, 1989
Revision received April 3, 1990
Accepted April 14, 1990
... This logical fact has proven difficult for psychopathologists to accept, perhaps because of the burden that preexisting group differences place on interpretation of experimental results. Despite numerous technical treatments in the literature (e.g., Chapman & Chapman, 1973;Cochran, 1957;Elashoff, 1969;Fleiss & Tanur, 1973;Huitema, 1980;Jin, 1992;Lord, 1967Lord, , 1969Maxwell & Delaney, 1990;Maxwell, Delaney, & Manheimer, 1985;Porter & Raudenbush, 1987;Wainer, 1991;Wildt & Ahtola, 1978) and more accessible statements (e.g., Neale & Oltmanns, 1980;Siddle & Turpin, 1980), together making an overwhelming case against inappropriate attempts to "control for" such group differences, they remain common in the research literature and, if anything, even more common in research grant applications. 1 Given the continuing popularity of inappropriate uses or interpretations of ANCOVA, the present article offers a relatively nontechnical critique, in hopes of helping to popularize the correct use of ANCOVA and helping researchers to avoid its more common abuses. ...
... Differ on the Covariate Beyond this basic point, there are some gray areas regarding the use and misuse of ANCOVA. Bock (1975) and Wainer (1991) noted that the appropriateness of an ANCOVA depends not only on meeting statistical assumptions but also on the nature of the question posed. Heckman (1989, p. 166) stated, "A decision about the appropriate statistical procedure requires information outside of statistics." ...
Article
Full-text available
Despite numerous technical treatments in many venues, analysis of covariance (ANCOVA) remains a widely misused approach to dealing with substantive group differences on potential covariates, particularly in psychopathology research. Published articles reach unfounded conclusions, and some statistics texts neglect the issue. The problem with ANCOVA in such cases is reviewed. In many cases, there is no means of achieving the superficially appealing goal of “correcting” or “controlling for” real group differences on a potential covariate. In hopes of curtailing misuse of ANCOVA and promoting appropriate use, a nontechnical discussion is provided, emphasizing a substantive confound rarely articulated in textbooks and other general presentations, to complement the mathematical critiques already available. Some alternatives are discussed for contexts in which ANCOVA is inappropriate or questionable.
... Eight promising measures of attention to advertising were identified from prior research, along with each measure's hypothesized signature for distinguishing high from low attention. Each signature, or hypothesized pattern, refers to a change from the person's resting baseline, which is a standard procedure with biological measures to control for differences between individuals (Wainer 1991). In theory and typical measurement, the resting baseline is a common state to which people return, so it should be identifiable by all measures. ...
... We also included the participant's pre-test scores on the same outcome measures in our regression modeling. The importance of disentangling the effect of an intervention on baseline differences in outcomes is long established, though there are competing theories on the most appropriate manner in which to do so (Wilder, 1950;Wainer, 1991). As we are primarily interested in controlling for the effect of baseline scores on our outcome, rather than the interpretation of those coefficients, we included their baseline pre-test score in our models as a covariate. ...
Article
Behavioral parent training (BPT) programs are needed to address disruptive behavior disorders among school-aged children. Given the prolonged COVID-19 pandemic and associated mental health consequences, adapting BPTs to telehealth modalities is necessary to ensure continued services to children and families. This pilot study evaluated the use of a telehealth vs in-person modality to deliver the Developing Our Children’s Skills K-5 (DOCS K-5) BPT. Participants were caregivers of children enrolled in elementary school exhibiting disruptive behaviors who participated in either in-person DOCS K-5 ( n = 21) or internet-DOCS K-5 (i-DOCS K-5; n = 34). Pre- and post-intervention outcome measures were collected for child disruptive behavior, parenting stress, and caregiver symptoms of depression while consumer satisfaction was assessed at post-test only. Multiple linear and Poisson regression models were performed to assess the effect of session modality on the outcomes. Child disruptive behavior, parenting stress and depression, and consumer satisfaction scores were not significantly different across groups, even after adjusting for baseline characteristics. The results of this study provide preliminary evidence that the i-DOCS K-5 modality is as effective as the in-person program. Study findings may be beneficial to practitioners treating school-age children and utilizing telehealth interventions during the COVID-19 pandemic and onward.
... Specifically, following the recommendations and practices of others (e.g., Bamberger et al., 2017;Petrou et al., 2018;Selig & Preacher, 2009;Taylor et al., 2017;Toker & Biron, 2012), justice change between Time 1 and Time 2 is represented as a distinct latent construct by a) holding the loadings of justice indicators equal at Time 1 and Time 2 to impose measurement invariance, b) fixing the loadings of the paths from justice change to Time 2 justice as 1, with the residual's variance set at 0, c) specifying the Time 2 justice as a function of the Time 1 justice with weightings fixed to 1 and the residual variance set at 0, and d) regressing justice change on Time 1 justice. The key reason for using a LDS model is that the change score is represented as a distinct latent construct, which avoids problems associated with difference scores (e.g., measurement error, regression-to-the-mean bias) (Taris, 2000;Wainer, 1991). There are two possible directions of justice change in our LDS model: an increase (above "0") or decrease (below "0"). ...
Article
Full-text available
The experience of justice is a dynamic phenomenon that changes over time, yet few studies have directly examined justice change. In this paper we integrate theories of self-regulation and group engagement to derive predictions about the consequences of justice change. We posit that justice change is an important factor because, as suggested by self-regulation theory, people are particularly sensitive to change. Also consistent with self-regulation, we posit that experiencing justice change will influence behavior via separate approach and avoidance systems. Across three multi-wave and multi-source field studies, we found that justice change predicts employees’ engagement in work via perceived insider status along an approach path, whereas it predicts employees’ withdrawal from work via exhaustion along an avoidance path, after controlling for the effects of static justice level. Moreover, these approach and avoidance effects are bounded by employees’ perception of their employment situation, consistent with a regulatory fit pattern. As expected, employees’ perceptions of employment opportunity, which correspond to gains, strengthen the effects along the approach path. Meanwhile, their perceptions of threat of job continuity, which correspond to losses, strengthen the effects along the avoidance path. Importantly, our set of studies highlight the unique influence of justice change incremental to static justice level.
... Da Selbsteinschätzungen auch in der Lehrkräftefortbildung nicht als aussagekräftige Indikatoren für einen tatsächlichen Lernerfolg gelten ( Campbell & Stanley, 1967;Dugard & Todman, 1995;Knapp & Schafer, 2009;van Breukelen, 2006;Wainer, 1991), bleibt in den meisten Studien aber ebenfalls aus. Nach Rost (2013, S. 120) können Veränderungen in randomisierten Experimenten inhaltlich interpretiert werden, wenn Personen mit niedrigen Eingangswerten in der Experimentalgruppe einen höheren Zuwachs erfahren haben als ...
Book
Full-text available
Vor dem Hintergrund der noch unzureichenden Verankerung von Fragen eines nachhaltigen Wirtschaftens in der kaufmännischen Lehrerbildung wird ein Blended Learning Konzept zur Förderung fachbezogener Kompetenzen im Nachhaltigkeitsmanagement mit Studierenden der Wirtschaftspädagogik sowie ausgebildeten Lehrkräften erprobt und evaluiert. Im theoretisch-konzeptuellen Teil der Arbeit erfolgt eine lernpsychologische und fachdidaktische Begründung der dem Aus- und Fortbildungsangebot zugrundeliegenden Konstruktionsprinzipien. Im empirischen Teil wird untersucht, wie die Teilnehmenden die Qualität des Angebots bewerten, welche Effekte auf ihre Kompetenzen erzielt werden und welche Faktoren den Lernerfolg erklären können. Die Wirksamkeit wird in Anlehnung an den Evaluationsansatz von Kirkpatrick (1998) bzw. Lipowsky (2010) im Rahmen eines experimentellen Forschungsdesigns mit zwei Untersuchungsgruppen und einer randomisierten Zuweisung der Studierenden als primäre Zielgruppe der Studie überprüft. Die Arbeit liefert einen innovativen Ansatz zur Förderung domänenspezifischer Nachhaltigkeitskompetenzen in der kaufmännischen Lehrerbildung und lässt evidenzbasierte Aussagen über die Wirksamkeit des Interventionsprogramms zu. Aus den Ergebnissen werden Vorschläge für die Weiterentwicklung des Konzepts und dessen Einsatz in der fachdidaktischen Ausbildung kaufmännischer Lehrkräfte abgeleitet.
Article
Full-text available
This study investigated the influence of media literacy skills and use of electronic resources by Library and Information Students in digital environment, University of Uyo. Three research questions and three research hypotheses were formulated to guide the study. The study involved 213 LIS undergraduates who were registered library users in 2021/2022 academic session. Census sampling technique were used for the study; the total population was used as sample size. A structured questionnaire tagged: media literacy skills and use of electronic resources by LIS in digital environment was used for data collection. The instrument was vetted by two experts in department of educational foundation, University of Uyo. The instrument was administered on the respondents by the researchers with help of research assistants in the three campus libraries in the University of Uyo. The data collected were analyzed using mean statistics and t-test to test the hypotheses. The result of the data analyzed revealed that a significant influence exists between each of the media literacy skills and use of electronic resources by LIS students in the University of Uyo. The study recommends among others that the University Library and Department of library and Information Science should collaborate in designing programme for teaching of media literacy skills in the University to equipped students with adequate basic media literacy skills.
Article
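In many areas of psychology, studies frequently examine differences in change between a treatment group and a control group, based on data measured repeatedly at pretest and posttest. The analysis models most widely used in this setting are the difference score model and the analysis of covariance (ANCOVA) model. However, because the two models sometimes produce different results, many researchers are unsure when to use which method. This study therefore reviews research comparing the two models theoretically and empirically and, on that basis, offers guidelines on when each model is appropriate. To this end, we first introduce each model and show with example data that the two can yield different results. Next, we examine the debate over the use of difference scores and confirm that the traditional criticisms of difference scores rest on oversimplified assumptions and mistaken beliefs. We then examine theoretically which hidden assumptions each method carries in the context of causal inference and, based on these assumptions and on results from simulation studies, offer guidelines on which method is appropriate depending on how participants are assigned to experimental groups and on the purpose of the analysis. We expect this study to help researchers choose more appropriate analysis methods and to carry out their analyses rigorously and effectively.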
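The disagreement between the two models can be seen in a minimal sketch. The data below are hypothetical and chosen only for illustration: two naturally occurring groups differ at baseline, neither group changes on average, and the within-group slope of post on pre is 0.5. The ANCOVA estimate is computed with the textbook common-slope formula (adjusted mean difference), which is an assumption of this sketch and not a method taken from the abstract.

```python
from statistics import mean

# Hypothetical pre/post scores for two non-randomized groups (illustration only).
pre = {"A": [50, 60, 70], "B": [70, 80, 90]}
post = {"A": [55, 60, 65], "B": [75, 80, 85]}

def mean_gain(g):
    """Difference score model: average of (post - pre) within group g."""
    return mean(y - x for x, y in zip(pre[g], post[g]))

# Group effect under the difference score model.
gain_effect = mean_gain("B") - mean_gain("A")

def within_sums(g):
    """Within-group cross-product and sum of squares of pre, centered at group means."""
    x_bar, y_bar = mean(pre[g]), mean(post[g])
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(pre[g], post[g]))
    sxx = sum((x - x_bar) ** 2 for x in pre[g])
    return sxy, sxx

# Pooled within-group slope of post on pre (common-slope ANCOVA assumption).
slope = sum(within_sums(g)[0] for g in pre) / sum(within_sums(g)[1] for g in pre)

# Group effect under ANCOVA: difference in post means adjusted for the pre difference.
ancova_effect = (mean(post["B"]) - mean(post["A"])) - slope * (mean(pre["B"]) - mean(pre["A"]))

print("difference score estimate:", gain_effect)
print("ANCOVA estimate:", ancova_effect)
```

With these numbers the difference score estimate is 0 while the ANCOVA estimate is 10, so the same data support "no group effect" or "a group effect" depending on the model. Which answer is appropriate depends on untestable assumptions about what would have happened without treatment.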
Article
Determining clinically meaningful change (CMC) in a patient-reported outcome (PRO) measure is central to its use in gauging how patients feel and function, especially for evaluating a treatment effect. Anchor-based approaches are recommended to estimate a CMC threshold on a PRO measure. Determination of CMC involves linking changes or differences in the target PRO measure to those in an external (anchor) measure that is easier to interpret than, and appreciably associated with, the PRO measure. One type of anchor-based approach for CMC is the "mean change method," where the mean change in score of the target PRO measure within a particular anchor transition level (e.g., one-category improvement) is subtracted from the mean change in score within an adjacent anchor category (e.g., the no-change category). In the literature, the mean change method has been applied with and without an adjustment for the baseline scores of the PRO of interest. This article provides the analytic rationale and conceptual justification for keeping the analysis unadjusted and not controlling for baseline PRO scores. Two illustrative examples are highlighted. The current research is essentially a variation of Lord's paradox (where whether to adjust for a baseline variable depends on the research question) placed in a new context. Once the adjustment is made, the resulting CMC estimate reflects an artificial case where the anchor transition levels are forced to have the same average baseline PRO score. The unadjusted estimate acknowledges that the anchor transition levels are naturally occurring (not randomized) groups and thus maintains external validity.
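As a rough illustration of the unadjusted mean change method described above, the sketch below uses hypothetical records for a symptom score on which lower values are better. The level names, scores, and the variable `cmc_unadjusted` are assumptions made for this example and do not come from the article.

```python
from statistics import mean

# Hypothetical records: (anchor transition level, baseline PRO score, follow-up PRO score).
# Lower scores are assumed to mean fewer symptoms.
records = [
    ("no_change", 40, 39),
    ("no_change", 50, 50),
    ("no_change", 60, 58),
    ("improved_1", 50, 42),
    ("improved_1", 60, 51),
    ("improved_1", 70, 60),
]

def mean_change(level):
    """Mean of (follow-up - baseline) among patients in one anchor transition level."""
    return mean(follow - base for lvl, base, follow in records if lvl == level)

# Unadjusted estimate: the mean change in the one-category-improvement level is
# subtracted from the mean change in the adjacent no-change level.
cmc_unadjusted = mean_change("no_change") - mean_change("improved_1")

print("mean change, no change:", mean_change("no_change"))
print("mean change, improved by one category:", mean_change("improved_1"))
print("unadjusted CMC estimate:", cmc_unadjusted)
```

The two levels keep their own, naturally different baseline averages (50 and 60 here). Adjusting for baseline would instead compare the levels as if they shared one baseline mean, which is the artificial case the abstract warns about.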
Book
Full-text available
This work focuses on employees' trust (and distrust) in collaborative robots (so-called cobots) in the industrial workplace. The empirical, interdisciplinary, and application-oriented research approach draws on theories from several disciplines and on both quantitative and qualitative research methods. The samples comprise employees at operational and managerial levels in manufacturing companies, as well as students. Among other things, the results highlight the significant influence of linguistic framing on the trust of production employees. Framing here refers, on the one hand, to the humanization of a cobot and, on the other, to the perceived human-cobot relation on a spectrum between cooperation and competition. Humanization promotes trust when employees see themselves in a cooperative relation to the cobot. In addition, person-specific and contextual factors, which require closer study, influence the strength of the framing effect. Trust and distrust ultimately appear as conceptually distinct, multidimensional constructs that develop dynamically over time. This yields implications for companies and for society, also with regard to similar technologies and application contexts, as well as a need for follow-up research that is application-oriented and theory-building.
Article
Full-text available
Gives an illustration to show why the analysis of covariance usually does not provide the appropriate adjustment to compensate for preexisting differences between nonexperimental groups.
Article
Full-text available
Presents a discussion of matching, randomization, random sampling, and other methods of controlling extraneous variation. The objective was to specify the benefits of randomization in estimating causal effects of treatments. It is concluded that randomization should be employed whenever possible but that the use of carefully controlled nonrandomized data to estimate causal effects is a reasonable and necessary procedure in many cases. (15 ref) (PsycINFO Database Record (c) 2006 APA, all rights reserved).
Article
Problems involving causal inference have dogged at the heels of Statistics since its earliest days. Correlation does not imply causation and yet causal conclusions drawn from a carefully designed experiment are often valid. What can a statistical model say about causation? This question is addressed by using a particular model for causal inference (Rubin, 1974; Holland and Rubin, 1983) to critique the discussions of other writers on causation and causal inference. These include selected philosophers, medical researchers, statisticians, econometricians, and proponents of causal modelling.
Article
This is a short introduction to Rubin's model for the study of causal inference in experiments and observational studies. More details, proofs of mathematical assertions, and expanded discussions can be found in the references.
Article
This book contains 15 papers by the influential American philosopher David Lewis. All previously published (between 1966 and 1980), these papers are divided into three groups: ontology, the philosophy of mind, and the philosophy of language. Lewis supplements eight of the fifteen papers with postscripts in which he amends claims, answers objections, and introduces later reflections. Topics discussed include possible worlds, counterpart theory, modality, personal identity, radical interpretation, language, propositional attitudes, the mind, and intensional semantics. Among the positions Lewis defends are modal realism, materialism, socially contextualized formal semantics, and functionalism of the mind. The volume begins with an introduction in which Lewis discusses his philosophical method.
Article
When assignment to treatment group is made solely on the basis of the value of a covariate, X, effort should be concentrated on estimating the conditional expectations of the dependent variable Y given X in the treatment and control groups. One then averages the difference between these conditional expectations over the distribution of X in the relevant population. There is no need for concern about "other" sources of bias, e.g., unreliability of X, unmeasured background variables. If the conditional expectations are parallel and linear, the proper regression adjustment is the simple covariance adjustment. However, since the quality of the resulting estimates may be sensitive to the adequacy of the underlying model, it is wise to search for nonparallelism and nonlinearity in these conditional expectations. Blocking on the values of X is also appropriate, although the quality of the resulting estimates may be sensitive to the coarseness of the blocking employed. In order for these techniques to be useful in practice, there must be either substantial overlap in the distribution of X in the treatment groups or strong prior information.
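The estimation strategy in this abstract (estimate the conditional expectation of Y given X in each group, then average their difference over the distribution of X) can be sketched as follows. The data are hypothetical and lie exactly on two non-parallel lines, the helper `fit_line` is written for this example, and the pooled sample stands in for the "relevant population"; all of these are assumptions of the sketch.

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x; returns (intercept, slope)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

# Hypothetical data with overlapping covariate ranges in the two groups.
x_treat = [2, 3, 4, 5]
y_treat = [15.0, 16.5, 18.0, 19.5]   # lies on y = 12 + 1.5 x
x_ctrl = [1, 2, 3, 4]
y_ctrl = [11.0, 12.0, 13.0, 14.0]    # lies on y = 10 + 1.0 x

# Estimate E[Y | X] separately in each group (linear but not parallel).
a_t, b_t = fit_line(x_treat, y_treat)
a_c, b_c = fit_line(x_ctrl, y_ctrl)

# Average the difference between the two conditional expectations over the
# covariate values of the population of interest (here: the pooled sample).
population_x = x_treat + x_ctrl
effect = mean((a_t + b_t * x) - (a_c + b_c * x) for x in population_x)

print("treated line:", a_t, b_t)
print("control line:", a_c, b_c)
print("average effect over X:", effect)
```

Because the fitted lines are not parallel, the estimated effect depends on which distribution of X is averaged over; the simple covariance adjustment with one common slope would only be appropriate if the two slopes agreed.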
Article
The writer has proposed (Benjamin, 1963) that the criterion for a score linearly independent of initial level be that the score have no correlation with initial level. The criterion makes analysis of covariance (anacova) the method of choice for undoing LIV. This paper reviews miscellaneous artifacts said to be associated with anacova and finds that none of them precludes this procedure. Some presumed artifacts considered are: that rxd is “… so complexly constituted that it does not allow simple interpretation” (Lacey and Lacey, 1962); that anacova can cause a loss of valuable information (Heath and Oken, 1965); that anacova introduces an artifactual association with final level (Heath and Oken, 1965); that LIV as measured by anacova techniques can be an artifact of whether resistance or conductance happens to be chosen (Hord, Johnson, and Lubin, 1964); that the use of anacova in (clinically important) instances where groups are defined by a fixed variable is sure to vitally violate its assumptions (Lubin, 1965).
Article
Heart rate is a dependent variable used widely in psychological and psychophysiological research. Several statistical problems arise in the analysis of heart rate data, many of them specific to infancy research. The present paper discusses the problems of a statistically appropriate cardiac measure, the Law of Initial Values, the problem of differential variability in heart rate scores, and the use of multivariate statistical methods in analyzing heart rate data. Special attention is given to those problems and solutions which have potential application to the analysis of infant heart rate data. A flowchart is presented which may guide the researcher in the appropriate use of the several statistical techniques reviewed in this paper.
Article
Problems involving causal inference have dogged at the heels of statistics since its earliest days. Correlation does not imply causation, and yet causal conclusions drawn from a carefully designed experiment are often valid. What can a statistical model say about causation? This question is addressed by using a particular model for causal inference (Holland and Rubin 1983; Rubin 1974) to critique the discussions of other writers on causation and causal inference. These include selected philosophers, medical researchers, statisticians, econometricians, and proponents of causal modeling.